Query lcl|NC_016566.1_cdsid_YP_004957446.1 [gene=EP23p10] [protein=phage structural protein] [protein_id=YP_004957446.1] [location=7389..8483] Match_columns 364 No_of_seqs 92 out of 109 Neff 5.5 Searched_HMMs 1612 Date Thu Nov 7 13:07:09 2013 Command /home/guerois/workspace/virfam/python/lib/hhsearch//hhsearch2 -i .//seq/seq_10 -d /home/guerois/workspace/virfam/python/profile_database/capsid_neck_tail.hhm -glob -cpu 7 -o .//seq/HHR/seq_10_vs_rec_db.hhr No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM 1 protein:vir:95131 Length: 325 100.0 3E-113 2E-116 637.2 28.1 316 1-326 4-325 (325) 2 protein:vir:96792 Length: 315 100.0 1E-107 7E-111 606.9 27.2 305 1-329 7-315 (315) 3 protein:vir:94989 Length: 349 100.0 7E-84 4.3E-87 476.5 26.4 309 1-329 7-349 (349) 4 protein:vir:78387 Length: 349 100.0 3.2E-83 2E-86 472.8 26.0 309 1-329 7-349 (349) 5 protein:vir:80446 Length: 367 100.0 1.4E-82 8.8E-86 469.3 24.8 306 1-327 4-367 (367) 6 protein:vir:1583 Length: 351 # 100.0 6.4E-75 4E-78 427.3 26.6 326 1-353 5-351 (351) 7 protein:vir:102944 Length: 330 100.0 6.2E-75 3.9E-78 427.4 24.2 312 1-329 1-330 (330) 8 protein:vir:5974 Length: 324 # 100.0 2.2E-74 1.4E-77 424.4 26.7 302 1-329 5-324 (324) 9 protein:vir:95107 Length: 270 100.0 1.1E-33 6.7E-37 201.3 22.5 260 1-302 1-270 (270) 10 protein:vir:105334 Length: 276 100.0 5.4E-31 3.3E-34 186.5 22.4 262 1-300 7-276 (276) 11 protein:vir:1239 Length: 274 # 99.9 4.1E-29 2.5E-32 176.2 21.6 264 1-312 7-274 (274) 12 protein:vir:3613 Length: 272 # 99.9 4.7E-29 2.9E-32 175.9 21.3 257 1-288 7-272 (272) 13 protein:vir:96262 Length: 274 99.9 1.2E-28 7.7E-32 173.6 21.5 260 1-312 7-274 (274) 14 protein:vir:95898 Length: 274 99.9 1.2E-28 7.7E-32 173.6 21.5 260 1-312 7-274 (274) 15 protein:vir:96833 Length: 275 99.9 1.8E-27 1.1E-30 167.1 22.6 260 1-304 8-275 (275) 16 protein:vir:94494 Length: 274 99.9 2.9E-27 1.8E-30 166.0 21.9 263 1-312 7-274 (274) 17 protein:vir:97433 Length: 274 99.9 2.9E-27 1.8E-30 166.0 21.9 263 1-312 7-274 (274) 18 protein:vir:96123 Length: 274 99.9 1.2E-25 7.2E-29 157.3 23.1 264 1-312 1-274 (274) 19 protein:vir:93742 Length: 274 99.9 6.3E-25 3.9E-28 153.3 22.9 263 1-304 7-274 (274) 20 protein:vir:80930 Length: 278 99.9 6.8E-24 4.2E-27 147.6 20.9 259 1-295 1-278 (278) 21 protein:vir:9820 Length: 272 # 99.8 6.4E-22 3.9E-25 136.8 23.2 256 1-296 1-272 (272) 22 protein:vir:3033 Length: 272 # 99.8 6.4E-22 3.9E-25 136.8 23.2 256 1-296 1-272 (272) 23 protein:vir:739 Length: 231 # 99.7 4.8E-18 2.9E-21 115.6 17.9 219 36-295 1-231 (231) 24 protein:vir:104256 Length: 458 98.5 1.9E-07 1.2E-10 57.4 20.4 269 1-296 169-458 (458) 25 protein:vir:79987 Length: 415 98.2 1.4E-06 8.5E-10 52.7 19.9 281 1-304 128-415 (415) 26 protein:vir:81100 Length: 415 98.2 1.4E-06 8.5E-10 52.7 19.9 281 1-304 128-415 (415) 27 protein:vir:98339 Length: 415 98.2 1.4E-06 8.5E-10 52.7 19.9 281 1-304 128-415 (415) 28 protein:vir:9410 Length: 415 # 98.0 6E-06 3.8E-09 49.2 20.2 281 1-304 97-415 (415) 29 protein:vir:1383 Length: 421 # 98.0 4.3E-06 2.7E-09 50.0 18.8 294 1-348 95-421 (421) 30 protein:vir:4600 Length: 415 # 98.0 6.5E-06 4.1E-09 49.0 20.3 279 1-304 109-415 (415) 31 protein:vir:4700 Length: 415 # 98.0 6.5E-06 4.1E-09 49.0 20.3 279 1-304 109-415 (415) 32 protein:vir:9759 Length: 303 # 98.0 7.6E-06 4.7E-09 48.6 19.9 271 1-288 6-303 (303) 33 protein:vir:4339 Length: 395 # 98.0 8.5E-06 5.3E-09 48.4 19.7 260 1-298 120-395 (395) 34 protein:vir:102605 Length: 273 98.0 8.8E-06 5.5E-09 48.3 20.4 259 1-296 1-273 (273) 35 protein:vir:105822 Length: 273 98.0 8.8E-06 5.5E-09 48.3 20.4 259 1-296 1-273 (273) 36 protein:vir:4953 Length: 397 # 98.0 9.3E-06 5.8E-09 48.2 20.1 269 1-301 116-397 (397) 37 protein:vir:7990 Length: 273 # 98.0 9.4E-06 5.8E-09 48.1 20.7 259 1-296 1-273 (273) 38 protein:vir:4856 Length: 293 # 97.8 1.6E-05 9.7E-09 46.9 19.0 269 1-304 12-293 (293) 39 protein:vir:41 Length: 299 # N 97.8 1.7E-05 1E-08 46.8 20.6 265 1-296 3-299 (299) 40 protein:vir:1638 Length: 298 # 97.8 1.8E-05 1.1E-08 46.6 19.9 270 1-296 1-298 (298) 41 protein:vir:100135 Length: 418 97.8 1.9E-05 1.2E-08 46.4 20.0 261 1-294 142-418 (418) 42 protein:vir:7771 Length: 330 # 97.8 2.2E-05 1.4E-08 46.1 18.3 277 1-300 1-330 (330) 43 protein:vir:191 Length: 385 # 97.6 3.3E-05 2E-08 45.2 19.6 262 1-296 96-385 (385) 44 protein:vir:1886 Length: 385 # 97.6 3.3E-05 2E-08 45.2 19.6 262 1-296 96-385 (385) 45 protein:vir:100247 Length: 425 97.6 4.2E-05 2.6E-08 44.6 18.3 271 1-290 117-425 (425) 46 protein:vir:97053 Length: 390 97.6 4.4E-05 2.7E-08 44.5 19.8 260 1-287 120-390 (390) 47 protein:vir:10364 Length: 390 97.5 5.4E-05 3.3E-08 44.0 19.7 255 1-287 120-390 (390) 48 protein:vir:81160 Length: 371 97.5 5.4E-05 3.4E-08 44.0 18.9 253 1-289 98-371 (371) 49 protein:vir:7409 Length: 408 # 97.5 5.6E-05 3.4E-08 43.9 20.5 275 1-343 97-408 (408) 50 protein:vir:3870 Length: 400 # 97.5 5.6E-05 3.5E-08 43.9 19.3 253 1-299 109-400 (400) 51 protein:vir:4830 Length: 397 # 97.5 5.7E-05 3.5E-08 43.8 20.4 269 1-301 109-397 (397) 52 protein:vir:105905 Length: 304 97.5 6E-05 3.7E-08 43.7 17.8 264 1-301 1-304 (304) 53 protein:vir:94142 Length: 304 97.5 6E-05 3.7E-08 43.7 17.8 264 1-301 1-304 (304) 54 protein:vir:81070 Length: 390 97.4 6.7E-05 4.1E-08 43.5 19.2 259 1-287 120-390 (390) 55 protein:vir:6212 Length: 434 # 97.4 8.7E-05 5.4E-08 42.8 17.1 271 1-300 114-434 (434) 56 protein:vir:9574 Length: 300 # 97.3 8.9E-05 5.5E-08 42.8 20.4 272 1-294 5-300 (300) 57 protein:vir:8420 Length: 477 # 97.3 7E-05 4.4E-08 43.4 16.0 278 1-300 133-477 (477) 58 protein:vir:485 Length: 407 # 97.3 9.3E-05 5.8E-08 42.7 19.1 276 1-335 93-407 (407) 59 protein:vir:6242 Length: 390 # 97.3 0.0001 6.2E-08 42.5 18.0 263 1-290 83-390 (390) 60 protein:vir:4456 Length: 401 # 97.3 0.00011 6.7E-08 42.3 18.7 267 1-296 111-401 (401) 61 protein:vir:94771 Length: 298 97.2 0.00012 7.2E-08 42.1 19.7 272 1-296 1-298 (298) 62 protein:vir:8102 Length: 543 # 97.2 0.00012 7.6E-08 42.0 20.1 262 1-329 245-543 (543) 63 protein:vir:78523 Length: 338 97.2 0.00014 8.8E-08 41.7 20.7 278 1-299 23-338 (338) 64 protein:vir:4997 Length: 397 # 97.1 0.00015 9.5E-08 41.5 21.0 269 1-340 116-397 (397) 65 protein:vir:4511 Length: 409 # 97.1 0.00016 1E-07 41.4 18.8 272 1-294 87-409 (409) 66 protein:vir:1328 Length: 392 # 97.1 0.00018 1.1E-07 41.1 18.3 263 1-290 117-392 (392) 67 protein:vir:3991 Length: 404 # 97.0 0.0002 1.2E-07 40.9 19.5 267 1-300 87-404 (404) 68 protein:vir:100172 Length: 394 96.8 0.00032 2E-07 39.7 21.4 263 1-306 85-394 (394) 69 protein:vir:78223 Length: 333 96.8 0.00035 2.2E-07 39.5 20.3 273 1-297 20-333 (333) 70 protein:vir:962 Length: 397 # 96.7 0.00037 2.3E-07 39.4 16.0 261 1-289 104-397 (397) 71 protein:vir:99749 Length: 324 96.7 0.00037 2.3E-07 39.4 20.7 269 1-302 9-324 (324) 72 protein:vir:101607 Length: 379 96.6 0.00044 2.8E-07 39.0 19.4 253 1-291 113-379 (379) 73 protein:vir:102119 Length: 404 96.6 0.0005 3.1E-07 38.7 20.3 270 1-298 100-404 (404) 74 protein:vir:80684 Length: 315 96.4 0.00071 4.4E-07 37.9 19.3 283 1-301 1-315 (315) 75 protein:vir:5739 Length: 366 # 96.3 0.00081 5E-07 37.5 17.0 265 1-308 52-366 (366) 76 protein:vir:103955 Length: 324 96.2 0.00085 5.3E-07 37.4 20.7 269 1-302 9-324 (324) 77 protein:vir:94673 Length: 419 96.2 0.00091 5.6E-07 37.3 19.9 269 1-298 121-419 (419) 78 protein:vir:95763 Length: 297 96.1 0.00096 6E-07 37.1 19.8 264 1-298 1-297 (297) 79 protein:vir:99075 Length: 392 96.1 0.00097 6E-07 37.1 19.1 311 1-364 1-334 (392) 80 protein:vir:96223 Length: 324 96.1 0.0011 6.6E-07 36.9 19.9 268 1-322 4-324 (324) 81 protein:vir:8187 Length: 311 # 95.9 0.0012 7.7E-07 36.5 18.0 270 1-290 5-311 (311) 82 protein:vir:81227 Length: 413 95.9 0.0013 7.9E-07 36.5 20.2 266 1-297 125-413 (413) 83 protein:vir:9309 Length: 324 # 95.8 0.0014 8.4E-07 36.3 20.1 269 1-302 24-324 (324) 84 protein:vir:2504 Length: 305 # 95.8 0.0014 8.6E-07 36.3 19.2 271 1-299 1-305 (305) 85 protein:vir:93881 Length: 387 95.8 0.0015 9E-07 36.1 15.2 257 1-298 86-387 (387) 86 protein:vir:9361 Length: 402 # 95.7 0.0017 1E-06 35.8 15.5 257 1-298 101-402 (402) 87 protein:vir:2685 Length: 387 # 95.6 0.0018 1.1E-06 35.7 14.6 265 1-298 86-387 (387) 88 protein:vir:96978 Length: 387 95.6 0.0018 1.1E-06 35.7 14.6 265 1-298 86-387 (387) 89 protein:vir:94424 Length: 387 95.6 0.0018 1.1E-06 35.7 14.6 265 1-298 86-387 (387) 90 protein:vir:100884 Length: 389 95.6 0.0018 1.1E-06 35.6 21.3 261 1-299 83-389 (389) 91 protein:vir:9704 Length: 394 # 95.5 0.0019 1.2E-06 35.5 18.9 252 1-302 106-394 (394) 92 protein:vir:1025 Length: 408 # 95.5 0.0019 1.2E-06 35.5 19.1 271 1-343 103-408 (408) 93 protein:vir:95376 Length: 425 95.5 0.002 1.2E-06 35.4 19.6 264 1-295 108-425 (425) 94 protein:vir:97148 Length: 324 95.4 0.0022 1.3E-06 35.2 21.0 269 1-302 1-324 (324) 95 protein:vir:93616 Length: 645 95.0 0.0029 1.8E-06 34.5 18.2 267 1-294 307-645 (645) 96 protein:vir:105004 Length: 392 95.0 0.0031 1.9E-06 34.4 19.8 258 1-299 84-392 (392) 97 protein:vir:107593 Length: 392 95.0 0.0031 1.9E-06 34.4 19.8 258 1-299 84-392 (392) 98 protein:vir:102082 Length: 392 95.0 0.0031 1.9E-06 34.4 19.8 258 1-299 84-392 (392) 99 protein:vir:102873 Length: 392 95.0 0.0031 1.9E-06 34.4 19.8 258 1-299 84-392 (392) 100 protein:vir:3845 Length: 395 # 94.9 0.0032 2E-06 34.3 19.4 270 1-306 79-395 (395) 101 protein:vir:96762 Length: 632 94.2 0.0051 3.2E-06 33.2 16.0 264 1-288 326-632 (632) 102 protein:vir:1268 Length: 397 # 93.6 0.0071 4.4E-06 32.4 18.5 253 1-295 130-397 (397) 103 protein:vir:94622 Length: 341 93.2 0.0085 5.3E-06 31.9 19.1 303 1-332 15-341 (341) 104 protein:vir:96392 Length: 324 93.1 0.0087 5.4E-06 31.9 20.1 269 1-322 9-324 (324) 105 protein:vir:78830 Length: 324 93.1 0.0087 5.4E-06 31.9 20.1 269 1-322 9-324 (324) 106 protein:vir:2344 Length: 397 # 92.8 0.0098 6.1E-06 31.6 19.0 328 1-364 3-390 (397) 107 protein:vir:1084 Length: 437 # 92.5 0.011 6.8E-06 31.3 18.8 264 1-300 139-437 (437) 108 protein:vir:80213 Length: 334 92.5 0.011 7E-06 31.3 11.8 264 1-297 23-334 (334) 109 protein:vir:105038 Length: 428 92.1 0.013 8E-06 30.9 15.3 269 1-308 113-428 (428) 110 protein:vir:80180 Length: 381 89.5 0.026 1.6E-05 29.3 20.2 313 1-364 15-347 (381) 111 protein:vir:102655 Length: 322 89.1 0.028 1.7E-05 29.1 17.6 277 1-287 7-322 (322) 112 protein:vir:78640 Length: 352 88.9 0.029 1.8E-05 29.0 14.9 257 1-298 56-352 (352) 113 protein:vir:4092 Length: 390 # 88.7 0.031 1.9E-05 28.9 18.6 274 1-301 68-390 (390) 114 protein:vir:80376 Length: 435 87.4 0.039 2.4E-05 28.3 19.1 267 1-310 119-435 (435) 115 protein:vir:4197 Length: 314 # 87.3 0.04 2.5E-05 28.3 19.9 263 1-320 1-314 (314) 116 protein:vir:99920 Length: 311 86.6 0.045 2.8E-05 28.0 18.7 267 1-289 4-311 (311) 117 protein:vir:1433 Length: 435 # 82.6 0.075 4.7E-05 26.7 19.0 268 1-310 124-435 (435) 118 protein:vir:80128 Length: 466 79.7 0.1 6.3E-05 26.0 11.4 297 1-321 105-466 (466) 119 protein:vir:108211 Length: 318 78.3 0.12 7.2E-05 25.7 15.0 261 1-297 18-318 (318) 120 protein:vir:4226 Length: 326 # 75.4 0.15 9.1E-05 25.1 18.5 269 1-296 26-326 (326) 121 protein:vir:104085 Length: 320 74.0 0.16 0.0001 24.9 18.8 264 1-299 7-320 (320) 122 protein:vir:4159 Length: 315 # 70.8 0.2 0.00013 24.4 18.3 265 1-311 19-315 (315) 123 protein:vir:3158 Length: 321 # 65.9 0.28 0.00017 23.6 19.0 273 1-307 1-321 (321) 124 protein:vir:2430 Length: 318 # 61.9 0.35 0.00021 23.1 18.7 267 1-294 7-318 (318) 125 protein:vir:101650 Length: 497 56.2 0.46 0.00029 22.4 18.3 271 1-296 159-497 (497) 126 protein:vir:7855 Length: 497 # 56.2 0.46 0.00029 22.4 18.3 271 1-296 159-497 (497) 127 protein:vir:100632 Length: 381 47.0 0.72 0.00044 21.4 10.6 272 1-306 82-381 (381) 128 protein:vir:105522 Length: 423 33.5 1.4 0.00084 19.9 18.1 318 1-364 1-347 (423) 129 protein:vir:6324 Length: 335 # 28.4 1.7 0.0011 19.3 11.8 267 1-299 22-335 (335) 130 protein:vir:93696 Length: 364 23.5 2.3 0.0014 18.6 11.0 286 1-337 1-364 (364) No 1 >protein:vir:95131 Length: 325 # NCBI annotation: hypothetical protein ORF010 # Family: family:all:47 # MgeID: mge:1552 # MgeName: PA73 # Cross-refs: genbank:acc:YP_001293417;genbank:gi:148912838;genbank:GeneID:5228206 Probab=100.00 E-value=3.3e-113 Score=637.23 Aligned_cols=316 Identities=25% Similarity=0.441 Sum_probs=293.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhh-cccccccccccCCCccccchhhhccc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLI-ANLVTDRNAYAPVGTPATAKVLARML 79 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i-~g~~~~~d~~~~~~~~~T~~kit~~~ 79 (364) |||+|||+++|++|+|+|+|++++||++|+|||+|++++++|||+++|||++| +|+++++++++.+ .++|+||++++ T Consensus 4 sD~~vfN~~~~~a~~e~~~q~~~~fn~as~gai~l~~~~~~Gd~~~~pf~~~l~g~~~~~~~~~~~~--~vt~~kitt~~ 81 (325) T protein:vir:95 4 SDLAVYSEYAYSAFSETLRQQVDLFNTATGGAIMLQSAAHQGDFSDVAFFAKVTGGLVRRRNAYGSG--TVAEKVLKHLV 81 (325) T ss_pred hhhhhhhhhhhhhhhhhhhhhHhhhhhcccceeEeccccccCceeeccccccccccccccccCCCCc--eeccceecccc Confidence 99999999999999999999999999999999999999999999999999986 6676777776543 46899999999 Q ss_pred eeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCccccccc Q lcl|NC_016566. 80 TNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTF 159 (364) Q Consensus 80 ~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~ 159 (364) +++||++|++||++|++++++|..++|.+++.+||+++++||++++|+.+|++|.++++++++|++|+|..++++. .. T Consensus 82 ~~av~~~r~~g~~~~d~~~~~~g~~~~~~~~~~Ig~~~a~~~~~~~l~~~~~~l~~a~~~~~~~v~dis~~~~~~~--~~ 159 (325) T protein:vir:95 82 DTSVKVAAGTPPVRLDPGQFRWIQQNPEVAGAAMGQQLAVDTMADMLNVGLGSVYSALSQVSDVVYDATANTDAAD--KL 159 (325) T ss_pred ceeeEEecccCcccccHHHHhhcCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccceeeeecccCccc--cc Confidence 9999999999999999999999999999999999999999999999999999999999999999999999887544 45 Q ss_pred ccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccccccccceeecccCCcEEEEeCCCCC----CCCceE Q lcl|NC_016566. 160 PTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMGDGLGRRFIISDAAAD----AMGAGK 235 (364) Q Consensus 160 ~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~~~lGrrVIVDD~~p~----~~~~Yt 235 (364) +++.+|++|+|||||++++|++|+|||+||++|++++ |.+.++++.+.++.+++++|||||||||+||+ ..++|+ T Consensus 160 ~s~~~l~~A~~klGD~~~~l~~~~MHS~v~~~L~~~~-L~~~~~~~~~~g~~~i~t~~G~~VIVdD~~p~~~~g~~~~yt 238 (325) T protein:vir:95 160 PTWNNLNNGQAKFGDQSSQIAAWIMHSTPMHKLYGSN-LTNGERLFTYGTVNVVRDPFGKLLVMTDSPNLFAAGTPNVYH 238 (325) T ss_pred ccHHHHHHHHHHhcccccceeEEEEchHHHHHHHHhh-ccccccccccCCcccccccCCcEEEEeCCCCCCCccCceeEE Confidence 6888999999999999999999999999999999965 55678889999999999999999999999997 345999 Q ss_pred EEEEecceeEEecC-CCCcceeeccCCCceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcCCccceeecCc Q lcl|NC_016566. 236 MLGLVPGAVAVTTN-GLDMLAQEKGGNENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQGQ 314 (364) Q Consensus 236 tylfg~GAi~~~~~-~~~~~~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s 314 (364) ||+||+|||+++++ ++++++++.+|+|++..++|.||+|+|||+||||+++. +|+|||++||++++||+||++| T Consensus 239 ty~lg~GAi~~~~~~~~~~~~~~~~~~~~~~~~~~~~~tf~lhp~G~sw~~s~-----~g~sPt~aeL~~~~NW~rv~~~ 313 (325) T protein:vir:95 239 ILGLVPGGVLIGQNNDFDANEETKNGDENIIRTYQAEWSYNIGVKGFAWDKAN-----GGKSPTDAALFTSTNWDKYATS 313 (325) T ss_pred EEEEecCeEEecCCCCccccccccCcccceeeeeeeeeeEEeecceeeeeccc-----ccCCcChHhhcCCcCcceecCC Confidence 99999999999986 55788999999999999999999999999999998753 5789999999999999999999 Q ss_pred CcCcceEEEEec Q lcl|NC_016566. 315 VDNAPATVQDVG 326 (364) Q Consensus 315 ~K~~pgv~~~~~ 326 (364) +|++|||++|-+ T Consensus 314 ~K~tagv~~~~~ 325 (325) T protein:vir:95 314 HKDLAGVVVKTN 325 (325) T ss_pred CccccceeEeeC Confidence 999999999955 No 2 >protein:vir:96792 Length: 315 # NCBI annotation: major capsid protein # Family: family:all:47 # MgeID: mge:1629 # MgeName: phiHSIC # Cross-refs: genbank:acc:YP_224246;genbank:gi:62362381;genbank:GeneID:3345731 Probab=100.00 E-value=1.1e-107 Score=606.89 Aligned_cols=305 Identities=22% Similarity=0.304 Sum_probs=282.2 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) =||+|||+++|++++||++|++++||++|+|||+|.+++++|||.++|||+ |+|++++||+|+.++ ++|+||+++++ T Consensus 7 sdl~vfn~~~~~a~~e~~~~~~~~Fnaas~Gai~l~~~~~~GDf~~~~ff~-i~~~~~~rnv~~~~~--~t~~kit~~~d 83 (315) T protein:vir:96 7 SDLVIYNDTAQTAYLERNMDNLAVFNENSRAAIGLNSELIEGDLKLRSFYK-VGGAIADRDVNSTAT--VAGTKIAADEM 83 (315) T ss_pred cceeeehhhhhhhHHhhhHHHHHHhhhhcCCcccccccccccccccccccc-cccchhhcccCCCcc--ccceecccccc Confidence 899999999999999999999999999999999999999999999999999 999999999997654 68999999999 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFP 160 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~ 160 (364) ++||++|++||++|++++|+|+|+||+++...|+++++++|++++++.+++.++++++++++++++ +....+ T Consensus 84 vaVk~~~~~~~~~~~~~~~a~~g~dp~~~~~~i~~~~~~~~l~~~l~~~l~~~~aai~~~t~~~~~--------~~~a~~ 155 (315) T protein:vir:96 84 VSVKVPWKYGPYETTEEAFKRRARSPEEFSMLIGQDMADATMAGWIGYALNALQGAIGSNAGMNVS--------GELATE 155 (315) T ss_pred eeEEEeecCCchhccHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhccccccccc--------cccccc Confidence 999999999999999999999999999999999999999999999999999999999999877653 223567 Q ss_pred cHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccccccccceeec---ccCCcEEEEeCCCCCCCCceEEE Q lcl|NC_016566. 161 TLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMG---DGLGRRFIISDAAADAMGAGKML 237 (364) Q Consensus 161 s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~---~~lGrrVIVDD~~p~~~~~Ytty 237 (364) +..+|++|+|||||++++|++|+|||+||++|++| +|+ +.++++.++.++. .+|||||||||+||+ |++| T Consensus 156 ~~~~l~dA~~klGD~~~~l~~~vMHS~v~~~L~~q-~L~--~~~~~~~~~~~~~~~~~~lGkrViVdD~~P~----~~~~ 228 (315) T protein:vir:96 156 GKKVLTKGLRTMGDKASSIAIWVMDSTSYFDIVDE-AID--NKLYEEAGVVVYGGTPGTLGKPVLVTDQCPA----TKIF 228 (315) T ss_pred CHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHh-hhh--hhcccccceeEecCcCcccccEEEEECCCCc----ceee Confidence 88899999999999999999999999999999994 554 4778888888874 468999999999994 8999 Q ss_pred EEecceeEEecCC-CCcceeeccCCCceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcCCccceeecCcCc Q lcl|NC_016566. 238 GLVPGAVAVTTNG-LDMLAQEKGGNENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQGQVD 316 (364) Q Consensus 238 lfg~GAi~~~~~~-~~~~~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K 316 (364) +|++|||+++++. |++++++.+|+|++.+++|.||+|++||+||||+++ +++|||++||++++||+||++|+| T Consensus 229 gl~~GAi~~~~~~~~~~~~~~~~g~e~l~~~~r~e~tf~l~p~G~sw~~~------~~~sPt~aeLat~~NWekV~~~~K 302 (315) T protein:vir:96 229 GLVAGAVMITESQAPGMRSYQIDDQENLAIGFRAEGTANVEVLGYKWKTK------TNVNPASATLATTTNWEKYATDDK 302 (315) T ss_pred eeecceeeecCCCccccccccCCCcceeEEEEeeeeEeeeeeeeEEeecC------CCcCCChHHhcCCcCcccccCCCc Confidence 9999999999865 578999999999999999999999999999999865 467999999999999999999999 Q ss_pred CcceEEEEecCcc Q lcl|NC_016566. 317 NAPATVQDVGSDS 329 (364) Q Consensus 317 ~~pgv~~~~~~~~ 329 (364) +++|||+|+..+. T Consensus 303 ~tagv~~~~~~~~ 315 (315) T protein:vir:96 303 ATAGFIITLTTTP 315 (315) T ss_pred ccceEEEEecCCC Confidence 9999999987766 No 3 >protein:vir:94989 Length: 349 # NCBI annotation: hypothetical protein # Family: family:all:1522 # MgeID: mge:1547 # MgeName: KS7 # Cross-refs: genbank:acc:YP_224029;genbank:gi:62327316;genbank:GeneID:5176817 Probab=100.00 E-value=7e-84 Score=476.48 Aligned_cols=309 Identities=13% Similarity=0.056 Sum_probs=252.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeE----ecc-CcccCceeeeehhhhhcccccccccccCCC-ccccchh Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVV----LGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVG-TPATAKV 74 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAiv----l~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~-~~~T~~k 74 (364) ||+++||.++|+.|++..+..++.|-+ .|+|+ |.. ....|+|+++|||++|.|-.+ .++++..+ ..+||.| T Consensus 7 ~D~iipe~~vf~~Yv~~~~~e~~~l~q--SGii~~d~~l~~~~~~gG~~~~iPf~~~l~g~~e-~n~~~dt~~~~~t~~k 83 (349) T protein:vir:94 7 GNIVTGNIPVLASYMTEDPVEKTAFFN--SGILTPTPYAAEIARGPSNIANLPFWKAIDTSIE-PNYSNDVYQDIATPRA 83 (349) T ss_pred eeeeccChHHHHHHHHHhHHHhhhhhh--ccceeccHHHHHHHhcCCCEEEeeeeecCCCCcc-cccCCCCccccccccc Confidence 999999999999999999987777776 36776 221 124599999999999976422 34444333 3578999 Q ss_pred hhccceeeEEeccccCchhcCHHHHHhh--cCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc----------c Q lcl|NC_016566. 75 LARMLTNSVNLSAKVGPVAITKAMMAKI--ETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA----------A 142 (364) Q Consensus 75 it~~~~vaVkl~~~~gpv~~t~~~~~~~--g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na----------~ 142 (364) |++++++++++.|+.+ |.+.+|++. |.||| ++|++||++||.|++|+.||++|+|+|+++. + T Consensus 84 it~~~~~a~~~~r~ka---w~~~Dla~~lsG~dpm---~~Ia~~va~yW~r~~q~~Lia~L~Gvf~~~~~~~~~~~~~~~ 157 (349) T protein:vir:94 84 IQTGEMMARVAYLNEG---FGQADLTVELTSQNPL---QSVASRLDNFWQRQAQRRLIATALGLYNDNVSATDAYHEQND 157 (349) T ss_pred ccccceeeeeeeeccc---cchhHHHHHhhCchHH---HHHHHHHHHHHhhHHHHHHHHHHHhhhcccccccccccccCc Confidence 9999999999999986 999999885 77885 5699999999999999999999999999752 3 Q ss_pred ceeecccccCcccccccccHHHHHHHHHHhccc-----ccCeeEEEEchHHHHHHHHhhcccccccccccccceeecccC Q lcl|NC_016566. 143 ANYTQPARVDGVGGRTFPTLADFPLAASKFGDQ-----AALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMGDGL 217 (364) Q Consensus 143 ~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~-----~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~~~l 217 (364) |++|++++ ++.+..+|++|+++|||+ +++|++++|||+||++|+++++|...........+.+|+ T Consensus 158 ~~~d~~~~-------a~~~~~~~~~A~~~~Gdaa~Gd~~~~lt~i~mHS~v~~~L~~~~li~~i~~s~~~~~i~ty~--- 227 (349) T protein:vir:94 158 MVVDVSAT-------SGFDAGAFIDATQTMGDALMGNGGEVLGAIAMHSFVYAQARKAQLIDFIRDAENNTMFATYQ--- 227 (349) T ss_pred eeEEeccc-------CCCChhhHHHHHHHHHHHhccccccceeEEEEchHHHHHHHhcchhhhccCcccCcccceec--- Confidence 45555543 335667899999998886 799999999999999999999997655555556666664 Q ss_pred CcEEEEeCCCCC----CCCceEEEEEecceeEEecCCCCcc----eeeccCCCceeeeEEeeEEEEeeeeeeeecccccc Q lcl|NC_016566. 218 GRRFIISDAAAD----AMGAGKMLGLVPGAVAVTTNGLDML----AQEKGGNENIERWWQGEFDFNVAVKGYRLKASART 289 (364) Q Consensus 218 GrrVIVDD~~p~----~~~~Yttylfg~GAi~~~~~~~~~~----~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~ 289 (364) ||||||||+||+ .+++|+|||||+|||+|+++.|+.+ +++..|+..++..++.|++|++||+||||+++.++ T Consensus 228 G~~VivDD~~Pv~~~g~~~~yttylfg~GAi~~~~~~~~~~~E~~rd~~~g~~~G~d~L~~R~~~~~hp~G~s~~~a~v~ 307 (349) T protein:vir:94 228 GYRVIVDDSMTVVGQDTSRKFISIIFGQGAIGYGEGNPEMPLEYEREASRANGGGVETLWTRKTWLLHPFGYSFTSAVIT 307 (349) T ss_pred CcEEEEeCCCccccCCCCceEEEEEeecceEEeecCCCCcceeeecccccCCcceeEEEEEeeEEEeeeeeeeecccccC Confidence 999999999997 3469999999999999999887633 33444556677788888899999999999998776 Q ss_pred ccc---CCCCcChhhhcCCccceeecCcCcCcceEEEEecCcc Q lcl|NC_016566. 290 PVE---GVRSFKLSDITDKANWELDQGQVDNAPATVQDVGSDS 329 (364) Q Consensus 290 ~~~---gg~SPT~aeLat~~NW~rV~~s~K~~pgv~~~~~~~~ 329 (364) +.+ +..|||++||++++||+||+ ++|++|.|.++.+-++ T Consensus 308 ~~~~~~~~~sPt~aeLa~~~NW~~v~-~~K~I~iv~~~~~~~a 349 (349) T protein:vir:94 308 GNGTETIARSASWQDLANAANWNRVV-DRKHVPIAFLVTGVGA 349 (349) T ss_pred CCccccccCCCChHHhcCCcCccccc-ChhhcceEEEEeccCC Confidence 432 23689999999999999999 7899999999988777 No 4 >protein:vir:78387 Length: 349 # NCBI annotation: putative coat protein # Family: family:all:1522 # MgeID: mge:1851 # MgeName: SETP3 # Cross-refs: genbank:acc:YP_001110837;genbank:gi:134288598;genbank:GeneID:5179650 Probab=100.00 E-value=3.2e-83 Score=472.82 Aligned_cols=309 Identities=13% Similarity=0.051 Sum_probs=252.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeE----ecc-CcccCceeeeehhhhhccccc-ccccccCCCccccchh Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVV----LGT-GEVLKDVVEKMSVGLIANLVT-DRNAYAPVGTPATAKV 74 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAiv----l~~-~~~~Gdf~~~~~f~~i~g~~~-~~d~~~~~~~~~T~~k 74 (364) +|+++||.++|+.|++..+..++.|-+ .|+|+ |.. ....|+|+++|||++|.|..+ ..+.++.. ..+||.| T Consensus 7 ~D~iipe~~vf~~Yv~~~~~e~~~l~q--SGii~~d~~l~~~~~~gG~~~~iPf~~~L~g~~e~nv~~D~~~-~~~t~~k 83 (349) T protein:vir:78 7 GDIVTGNIPVLASYMTEDPVEKTAFFD--SGILTSTPYAAEIANGPSNIANLPFWKAIDTSIEPNYSNDVYQ-DIATPRA 83 (349) T ss_pred eeeeccCHHHHHHHHHHhhHHhhhhhh--ccceeccHHHHHHhhcCCCEEEeeeeecCCCCcccccCCCCcc-ccccccc Confidence 999999999999999999987777766 36776 221 124599999999999987432 22223222 2568999 Q ss_pred hhccceeeEEeccccCchhcCHHHHHhh--cCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc----------c Q lcl|NC_016566. 75 LARMLTNSVNLSAKVGPVAITKAMMAKI--ETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA----------A 142 (364) Q Consensus 75 it~~~~vaVkl~~~~gpv~~t~~~~~~~--g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na----------~ 142 (364) |++++++++++.|+.+ |.+.+|++. |.||| .+|++||++||.|++|+.||++|+|+|+++. + T Consensus 84 itt~~~~a~~~~r~ka---w~~~Dla~~lsG~dpm---~~Ia~~va~yW~r~~q~~Lia~L~Gvf~~~~~a~~~~~~~~~ 157 (349) T protein:vir:78 84 IQTGEMMARVAYLNEG---FGQADLTVELTSQNPL---QSVASRLDNFWQRQAQRRLIATALGLYNDNVSATDAYHEQND 157 (349) T ss_pred ccccceeeeeeeeccc---cchhHHHHHhhCchHH---HHHHHHHHHHHhhHHHHHHHHHHHHhhcccccccchhhhccc Confidence 9999999999999986 999999875 77885 5699999999999999999999999998652 4 Q ss_pred ceeecccccCcccccccccHHHHHHHHHHhccc-----ccCeeEEEEchHHHHHHHHhhcccccccccccccceeecccC Q lcl|NC_016566. 143 ANYTQPARVDGVGGRTFPTLADFPLAASKFGDQ-----AALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMGDGL 217 (364) Q Consensus 143 ~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~-----~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~~~l 217 (364) |++|+++++ +.+...|++|+++|||+ +++|++++|||+||++|+++++|...........+.+|+ T Consensus 158 ~t~d~s~~a-------~~~~~~~~dA~~~lgda~~Gd~~~~lt~i~mHS~v~~~L~~~~li~~i~~s~~~~~i~ty~--- 227 (349) T protein:vir:78 158 MVVDVSATL-------GFDAGAFIDATQTMGDALMGNGGEVLGAIAMHSFVYAQARKAQLIDFIRDAENNTMFATYQ--- 227 (349) T ss_pred ceeeecccc-------CCChhhhhhhHHHHHHHhccccccceeEEEEchHHHHHHHhhhhhhhccCcccCcccceec--- Confidence 677776544 35666899999998886 799999999999999999999987655555555666664 Q ss_pred CcEEEEeCCCCCC----CCceEEEEEecceeEEecCCCCcc----eeeccCCCceeeeEEeeEEEEeeeeeeeecccccc Q lcl|NC_016566. 218 GRRFIISDAAADA----MGAGKMLGLVPGAVAVTTNGLDML----AQEKGGNENIERWWQGEFDFNVAVKGYRLKASART 289 (364) Q Consensus 218 GrrVIVDD~~p~~----~~~Yttylfg~GAi~~~~~~~~~~----~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~ 289 (364) ||||||||+||+. +++|+|||||+|||+|+++.|..+ +++..|++.++..++.|++|++||+||||+++.++ T Consensus 228 G~~VivDD~~Pv~~~g~~~~yttylfg~GAi~~~~~~~~~~~et~rd~~~g~~~G~d~l~~R~~~~~hp~G~s~~~a~v~ 307 (349) T protein:vir:78 228 GYRVIVDDSMTVVGQGAQRKFISIIFGQGAIGYGEGNPVMPLEYEREASRANGGGVETLWTRKTWLLHPFGYRFTSAVIT 307 (349) T ss_pred CeEEEEeCCCccccCCCCceEEEEEeecceEEEccCCCccceeeecccccCCcceeEEEEEeeEEEeeeeeeeecccccc Confidence 9999999999974 459999999999999999876533 33445666777888888999999999999998876 Q ss_pred cc---cCCCCcChhhhcCCccceeecCcCcCcceEEEEecCcc Q lcl|NC_016566. 290 PV---EGVRSFKLSDITDKANWELDQGQVDNAPATVQDVGSDS 329 (364) Q Consensus 290 ~~---~gg~SPT~aeLat~~NW~rV~~s~K~~pgv~~~~~~~~ 329 (364) +. .+..|||++||++++||+||+ ++|++|.|.++.+-++ T Consensus 308 ~~~~~~~~~sPt~aeLa~~~NW~~v~-~~K~I~iv~~~~~~~a 349 (349) T protein:vir:78 308 GNGTETIARSASWQDLANATNWNRVV-DRKHVPIAFLVTGVGA 349 (349) T ss_pred CCccccccCCCChHHhcCCcCccccc-ChhhcceEEEEeccCC Confidence 42 224799999999999999999 7899999999988777 No 5 >protein:vir:80446 Length: 367 # NCBI annotation: BcepGomrgp07 # Family: family:all:1522 # MgeID: mge:1882 # MgeName: BcepGomr # Cross-refs: genbank:acc:YP_001210227;genbank:gi:146329919;genbank:GeneID:5123555 Probab=100.00 E-value=1.4e-82 Score=469.30 Aligned_cols=306 Identities=12% Similarity=0.061 Sum_probs=245.4 Q ss_pred CCcc-----ccchhhhhhhhhhhHHHHHHHhhhh-cceeEeccC------cccCceeeeehhhhhcccccccccccCC-C Q lcl|NC_016566. 1 MSLT-----VFQRKLVTAVTQMIPDNLNVFNAAA-NGAVVLGTG------EVLKDVVEKMSVGLIANLVTDRNAYAPV-G 67 (364) Q Consensus 1 fd~~-----vfn~~~~~~~~e~i~q~~~~fn~as-~gAivl~~~------~~~Gdf~~~~~f~~i~g~~~~~d~~~~~-~ 67 (364) ||.+ +|+|++|+.|++..+- .-|+.- .|+|+ .+. ...|+++++|||++|.|.. .++.... . T Consensus 4 ~~~~T~l~Dii~pEvF~~Yv~~~~~---e~~~l~qSGiv~-~d~~l~~~~~~gG~~v~iPf~~~L~g~~--~n~~~d~~~ 77 (367) T protein:vir:80 4 FNNQVRLVDAVIPEVYTSYTAIDRP---ELTAFFLSGAVA-SNDFLSQFLSAPGRLINIPFWRDLDSLE--PNYGSDNPN 77 (367) T ss_pred hhhhhhhhhccchhhhhHHHhhhhh---hhhhhhhcceee-cCHHHHHHhhcCCCEEEeeeeccCCCCc--cccCCCCCc Confidence 8876 5999999999888763 235533 34444 222 3579999999999998843 2332222 2 Q ss_pred ccccchhhhccceeeEEeccccCchhcCHHHHHh--hcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccc----- Q lcl|NC_016566. 68 TPATAKVLARMLTNSVNLSAKVGPVAITKAMMAK--IETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESN----- 140 (364) Q Consensus 68 ~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~--~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~n----- 140 (364) ...||.||++++++++++.|+.+ |.+.+|+. .|.||| .+|++||++||.|++|+.||++|+|+|+++ T Consensus 78 ~~~t~~kittg~~~a~v~~r~ka---w~~~Dla~~lsG~dpm---~~Ia~qva~yW~r~~q~~Lla~L~Gvf~~~~a~~~ 151 (367) T protein:vir:80 78 VEAPIDGLGSGEMKTTKTWLNKA---YGAMDLTAELAGSNPM---TRIRNRFGVYWTRQWQRRIIAMAVGVYKSNLAGNF 151 (367) T ss_pred ccccccccccchheeeeehhccc---chhhhHHHHhhCchHH---HHHHHHHHHHhhhhhHHHHHHHHHHhhccccccch Confidence 35689999999999999999886 88888876 477885 569999999999999999999999999874 Q ss_pred -----------------ccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc Q lcl|NC_016566. 141 -----------------AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ 203 (364) Q Consensus 141 -----------------a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~ 203 (364) .+|++|||++++... +.++..+|++|+++|||++++|++++|||+||++|+++++|..... T Consensus 152 ~~~~~~~~~~a~~~~~~~~~~~Dis~~t~~~~--~~~s~~~~~~A~~~lGD~~~~l~~i~mHS~V~~~L~~~~li~~i~~ 229 (367) T protein:vir:80 152 ATIKTRGRVPAEVLGTAGDMVIDISGQTNPAD--AVFNREAFVDAAFTMGDHVGSIAAIAVHSMVYKRMTNNDEIEFIPD 229 (367) T ss_pred hhhhhhhccccccccccCceeeeeeccCCCcc--ceecHHHHHHHHHHhccccccccEEEEchHHHHHHHhccccccccC Confidence 469999998876443 4688899999999999999999999999999999999999865443 Q ss_pred ccccccceeecccCCcEEEEeCCCCC----CCCceEEEEEecceeEEecCCCCccee----ec---cCCCceeeeEEeeE Q lcl|NC_016566. 204 VFAIGDLQVMGDGLGRRFIISDAAAD----AMGAGKMLGLVPGAVAVTTNGLDMLAQ----EK---GGNENIERWWQGEF 272 (364) Q Consensus 204 ~~~~~~~~~~~~~lGrrVIVDD~~p~----~~~~Yttylfg~GAi~~~~~~~~~~~~----~~---~g~e~~~~~~~~~~ 272 (364) ......+.+|+ ||||||||+||+ .+++|||||||+|||+|+++.|..+.+ +. +|+++++++|+ T Consensus 230 sd~~~~i~ty~---G~~VIvDD~~Pv~~~~a~~~yttYlfg~GAi~~~~~~~~~~~E~~Rd~~~~~~gG~d~L~~Rr--- 303 (367) T protein:vir:80 230 SKGQLTIPTYM---GKVVIVDDGMPVFGTGADKTYLSILFGGAAFGYADGAPQVPVAVGRRELRGNGSGLEYILERK--- 303 (367) T ss_pred CCCccccceec---ceeEEEeCCCcccccCCCceEEEEEEecceeeecccCCccceecccchhhhcCCceEEEEeee--- Confidence 33345666665 999999999997 367999999999999999988764433 32 35556666555 Q ss_pred EEEeeeeeeeeccccccccc----------CCCCcChhhhcCCccceeecCcCcCcceEEEEecC Q lcl|NC_016566. 273 DFNVAVKGYRLKASARTPVE----------GVRSFKLSDITDKANWELDQGQVDNAPATVQDVGS 327 (364) Q Consensus 273 ~f~lhp~G~sw~~~~~~~~~----------gg~SPT~aeLat~~NW~rV~~s~K~~pgv~~~~~~ 327 (364) +|++||+|+||++++++.+. ...|||++||++++||+||+ ++|++|.|.++.+- T Consensus 304 ~~~~hP~G~s~~~~~v~~~~~~~~~~~~~~~~~sPt~~eLa~~~NW~~v~-d~K~I~iv~~it~g 367 (367) T protein:vir:80 304 EWIVHPGGFNWLDADVTIPDNTGSPSGITSGPPAITLANLANPDNWERVT-YRKNVPMAFLVTKG 367 (367) T ss_pred eEEeecceeeecccccccccccccccccccccCCCChHHhcCCccccccc-chhhcceEEEEecC Confidence 89999999999998875322 24689999999999999999 78999999999544 No 6 >protein:vir:1583 Length: 351 # NCBI annotation: minor capsid protein # Family: family:all:1522 # MgeID: mge:32 # MgeName: phig1e # Cross-refs: genbank:acc:NP_695165;swissprot:trembl:o03966;genbank:gi:23455804;uniprot:O03966;genbank:GeneID:955561 Probab=100.00 E-value=6.4e-75 Score=427.34 Aligned_cols=326 Identities=13% Similarity=0.067 Sum_probs=246.8 Q ss_pred CCccccchhhhhhhhhh-hHHHHHHHhhhhcceeE----ecc-CcccCceeeeehhhhhcccccccccccCCCccccchh Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQM-IPDNLNVFNAAANGAVV----LGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKV 74 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~-i~q~~~~fn~as~gAiv----l~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~k 74 (364) .-..+|||++|..|++. +++ .++|-+ .|+|+ |.+ ..-+|+++.+|||++|+|- ..|++. +...++.+ T Consensus 5 ~lsd~i~PEvf~~yv~~~~~~-~~~l~q--SG~i~~~~~l~~~~~~~G~~it~P~~~~l~Gd--~~~~~~--~~~i~~~k 77 (351) T protein:vir:15 5 HLSDLIVPEVFGNYVVNQIIK-TNRFVQ--SGILTPDPDLGPHLLEAGTRITVPFLNDLTGD--PDNWTD--SDDIDVNN 77 (351) T ss_pred eeeeeechhHHHHHHhhhhHH-hhhHhh--cccccccHHHHHHhhcCCCEEEecccccCCCc--ccccCC--Ccccchhe Confidence 66788999999999754 333 333332 26666 221 1136999999999999873 334433 34568999 Q ss_pred hhccceeeEEeccccCchhcCHHHHHh--hcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc---cceeeccc Q lcl|NC_016566. 75 LARMLTNSVNLSAKVGPVAITKAMMAK--IETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA---AANYTQPA 149 (364) Q Consensus 75 it~~~~vaVkl~~~~gpv~~t~~~~~~--~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na---~~v~dis~ 149 (364) |++++++++++.|+.| |+..+++. .+.||| .+|++||++||.|++|+.||++|+|+|++.. .|++|++. T Consensus 78 itt~~~~a~i~~~~kg---~~~tD~a~~~sg~dp~---~~i~~q~a~~w~~~~q~~lla~l~gv~~~~~~~~~~~~d~t~ 151 (351) T protein:vir:15 78 LTSGKQQGIKFYQTKA---YGYTDLGTMISGAPVQ---ETIGNRFAAFWQRADQKTLLSVLKGVMGVTKIANSKVYDQTK 151 (351) T ss_pred ecccceeEEEEeeccc---eehhhhhHhhccchHH---HHHHHHHHHHHHHHHHHHHHHHHHHHhhchhhcccceecccc Confidence 9999999999999988 66555543 466774 5699999999999999999999999998764 57788887 Q ss_pred ccCcccccccccHHHHHHHHHHhccc-ccCeeEEEEchHHHHHHHHhhcccccccccccccceeecccCCcEEEEeCCCC Q lcl|NC_016566. 150 RVDGVGGRTFPTLADFPLAASKFGDQ-AALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMGDGLGRRFIISDAAA 228 (364) Q Consensus 150 ~t~~~~~~~~~s~~~l~~A~~~lGD~-~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~~~lGrrVIVDD~~p 228 (364) .++. ...++..+|++|+++|||. .++|++|+|||.||++|+++++|...........+ ..++|+||||||+|| T Consensus 152 ~~~~---~~~is~~~l~~A~~~~GD~~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~i---~t~~G~~VivdD~~p 225 (351) T protein:vir:15 152 VSPS---EPMFGAKGFTGAIGLMGDLQDTAFGAIAVNSATYSLMKVQGLIETIQPQNGATPF---EAYNGLRIVLDDDIE 225 (351) T ss_pred cccc---ccccCHHHHHHHHHHhccccccceEEEEEChHHHHHHHhhhhhhhccccccCccc---ceecceEEEEcCCCc Confidence 6543 3457888999999999996 56799999999999999998887543332222334 445699999999999 Q ss_pred CC-----CCceEEEEEecceeEEecCCCCcc--eee-ccCCCceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChh Q lcl|NC_016566. 229 DA-----MGAGKMLGLVPGAVAVTTNGLDML--AQE-KGGNENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLS 300 (364) Q Consensus 229 ~~-----~~~Yttylfg~GAi~~~~~~~~~~--~~~-~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~a 300 (364) +. .++|+||+|++|||+|+++.+..+ |+. .+++++.++. |++|++||+||||+++.+. .++.|||++ T Consensus 226 ~~~~~~~~~~ytsyl~~~GAi~~~~~~~~ve~~rd~~~~~g~d~l~~---r~~~~~hp~G~s~~~~~~~--~~~~sPt~~ 300 (351) T protein:vir:15 226 IDLTDKTKPVSTSYIFAPGAVRYSTNMRSTETKYDPLINGGQDVIVQ---KRVGTIHVAGTSIKASFSP--SKASFPTID 300 (351) T ss_pred cccCCCCCceeEEEEEecceeeeecCCcCcceeecccCCCCceEEEE---eeeeeeeeeeeeecccccc--cCcCCcChH Confidence 73 468999999999999999877543 222 3455555554 6699999999999987644 357899999 Q ss_pred hhcCCccceeecC-cCcCcceEEEEecCccccccccccccccccccccchhhcc Q lcl|NC_016566. 301 DITDKANWELDQG-QVDNAPATVQDVGSDSDTKGRRRTQTAQAVPTRNIKETAG 353 (364) Q Consensus 301 eLat~~NW~rV~~-s~K~~pgv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 353 (364) ||++++||+||++ ++|++|.|.++... +..++..++++++ ||..-.-|.- T Consensus 301 ~L~~~~NW~~v~~~d~k~I~iv~~~~~~--~~~~~~~~~~~~~-~~~~~~~~~~ 351 (351) T protein:vir:15 301 ELAKSSTWEVVDGIDVRSIGVVAYTAQL--DPALTPGAQMPAA-DTSTDTGTTK 351 (351) T ss_pred HhcCCcccccccCCCccccceEEEEEec--CcccccCCcCcCC-CCccccCCCC Confidence 9999999999964 89999988888655 4566677778777 6654332222 No 7 >protein:vir:102944 Length: 330 # NCBI annotation: major head protein # Family: family:all:1522 # MgeID: mge:1461 # MgeName: EJ-1 # Cross-refs: genbank:acc:NP_945286;genbank:gi:39653721;uniprot:Q708M6;genbank:GeneID:2672858 Probab=100.00 E-value=6.2e-75 Score=427.41 Aligned_cols=312 Identities=12% Similarity=0.026 Sum_probs=242.2 Q ss_pred CC------ccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccC-----cccCceeeeehhhhhcccccccccccCCCcc Q lcl|NC_016566. 1 MS------LTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTG-----EVLKDVVEKMSVGLIANLVTDRNAYAPVGTP 69 (364) Q Consensus 1 fd------~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~-----~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~ 69 (364) |- ..+|||++|..|++..+...++|.++ |+|+-..+ .-+||++.+|||++|+|-.++ +..+..+ T Consensus 1 Ma~~~T~l~d~i~pevf~~yv~~~~~~~~~l~qS--G~i~~~~~i~~~~~~~G~~i~~P~~~~l~G~~~~---~~dg~~~ 75 (330) T protein:vir:10 1 MANELTKILDTITPQQYNAYMQQYTAAKSAFVQS--GIAVSDERVSKNITSGGLLVNMPFWNDLTGDSEV---LGNGDKA 75 (330) T ss_pred CCCCceEeeeeechhHHHHHHHHHhHHhhhhhhc--ccccccHHHHHHhhcCCCEEEecccccCCCcccc---cCCCccc Confidence 33 67899999999988888777777665 66652111 136999999999999884433 2223345 Q ss_pred ccchhhhccceeeEEeccccCchhcCHHHHHh--hcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccc---e Q lcl|NC_016566. 70 ATAKVLARMLTNSVNLSAKVGPVAITKAMMAK--IETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAA---N 144 (364) Q Consensus 70 ~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~--~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~---v 144 (364) .+|.+|++++++++++.|+.| |+..+++. .+.||| .+|++||++||.|++|+.||++|+|+|++..+. . T Consensus 76 i~~~ki~t~~~~a~i~~~~k~---~~~tD~a~~~~g~dp~---~~i~~q~a~~w~~~~q~~lla~l~gvf~~~~~~~~~~ 149 (330) T protein:vir:10 76 LETGKITAGADIACVLYRGRG---WAANELTGVVAGSDPV---RAILNRIGAYWLREDQKALIATLNGIFATGTAGEKGA 149 (330) T ss_pred cchhhcccceeEEEEEeecce---eeehhhhhhhcchhHH---HHHHHHHHHHhhhhHHHHHHHHHHhhhhhhhcccchh Confidence 789999999999999988886 66555543 477885 569999999999999999999999999865321 1 Q ss_pred eecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccccccccceeecccCCcEEEEe Q lcl|NC_016566. 145 YTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMGDGLGRRFIIS 224 (364) Q Consensus 145 ~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~~~lGrrVIVD 224 (364) ++.....+.....+.++...|++|+++|||++++|++|+|||++|++|+++++|.... +...+ ..+.+++||||||| T Consensus 150 ~~~~~~~~~~~~~a~~s~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~--~s~~~-~~i~~~~G~~Vivd 226 (330) T protein:vir:10 150 LEETHVSDQSKASTGIDAGMVLDAKQLLGDSADQVTAIAMHSAVYTKLQKDNLIQYIQ--PTTAT-INIPTYLGYRVIID 226 (330) T ss_pred hhhhheecccccccccCHHHHHHHHHHhccccccceEEEEcHHHHHHHHHhhhhhhhc--ccccC-cccccccceEEEEe Confidence 1122122223345567889999999999999999999999999999999987774322 22222 23455679999999 Q ss_pred CCCCCCCCceEEEEEecceeEEecCCC-CcceeeccCCCc-eeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhh Q lcl|NC_016566. 225 DAAADAMGAGKMLGLVPGAVAVTTNGL-DMLAQEKGGNEN-IERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDI 302 (364) Q Consensus 225 D~~p~~~~~Yttylfg~GAi~~~~~~~-~~~~~~~~g~e~-~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeL 302 (364) |+||+..++|++|+|++|||+++++.| ..+.+|.+|+.. ....+..|+||++||+||||+++.++ .+|.|||++|| T Consensus 227 D~~p~~~~~yt~yl~~~GAi~~~~~~~~~~v~~EtdRd~~~g~~~l~~r~~~~~hp~G~s~~~~~~~--~~~~sPt~~~L 304 (330) T protein:vir:10 227 DGIAPTGDIYTSYLFRTGSIGLNTGNPSGLTTFETSREAAKGNDMIYTRRALVMHPYGVKWTGAEVD--AGNITPSNADL 304 (330) T ss_pred CCCCCCCCceeEEEEecCceeeecccCCccccccccCCccccceEEEEeeEEEeeeeeeeecccccc--cCcCCcChHHh Confidence 999999999999999999999999765 456777766543 34467778899999999999988654 24789999999 Q ss_pred cCCccceeecCcCcCcceEEEEecCcc Q lcl|NC_016566. 303 TDKANWELDQGQVDNAPATVQDVGSDS 329 (364) Q Consensus 303 at~~NW~rV~~s~K~~pgv~~~~~~~~ 329 (364) ++++||+||+ ++|++|.|.++.+=+- T Consensus 305 ~~~~NW~~v~-~~k~i~iv~~~~~~~~ 330 (330) T protein:vir:10 305 AKFKNWKRVY-EPKNIGIIALKHKIGK 330 (330) T ss_pred cCCcCccccc-ChhhcceEEEEEecCC Confidence 9999999998 8899999999844332 No 8 >protein:vir:5974 Length: 324 # NCBI annotation: hypothetical protein # Family: family:all:1522 # MgeID: mge:125 # MgeName: SPP1 # Cross-refs: genbank:acc:NP_690674;genbank:geneid:6329212;genbank:gi:22855068;goa:Q38582;uniprot:Q38582;genbank:GeneID:955303 Probab=100.00 E-value=2.2e-74 Score=424.36 Aligned_cols=302 Identities=11% Similarity=0.047 Sum_probs=235.5 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhc-ceeE-------eccCcccCceeeeehhhhhcccccccccccCCCccccc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAAN-GAVV-------LGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATA 72 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~-gAiv-------l~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~ 72 (364) .-..+||||+|..|++.-.... |+.-. |+|+ +.++..+||++++|||++|+|- .+|++.. ...++ T Consensus 5 ~lsd~i~peVf~~yv~~~~~~~---~~l~qSg~i~~~a~i~~~l~~~~~G~~i~~P~~~~l~Gd--~~~v~~~--~~i~~ 77 (324) T protein:vir:59 5 KISDVIVPELFNPYVINTTTQL---SAFFQSGIAATDDELNALAKKAGGGSTLNMPYWNDLDGD--SQVLNDT--DDLVP 77 (324) T ss_pred eeeceechhHHHHHHHhhhHHH---HHHhhcccccccHHHHHHhhccCCCCEEEecccccCCCc--ccccCCC--cccch Confidence 4578889999998875422222 33222 2322 2234457999999999999884 4455433 45678 Q ss_pred hhhhccceeeEEeccccCchhcCHHHHHhh--cCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc--cceeecc Q lcl|NC_016566. 73 KVLARMLTNSVNLSAKVGPVAITKAMMAKI--ETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA--AANYTQP 148 (364) Q Consensus 73 ~kit~~~~vaVkl~~~~gpv~~t~~~~~~~--g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na--~~v~dis 148 (364) .+|++++++++++.++.| |+..++++. +.|| +.+|++|+++||.+++|+.||++|+|+|+++. .+++|+| T Consensus 78 ~~l~t~~~~a~i~~~~k~---~~~tD~a~~~sg~dp---~~~i~~q~a~~~~~~~~~~lia~l~g~~~~~~~~~~~~dvs 151 (324) T protein:vir:59 78 QKINAGQDKAVLILRGNA---WSSHDLAATLSGSDP---MQAIGSRVAAYWAREMQKIVFAELAGVFSNDDMKDNKLDIS 151 (324) T ss_pred hhcccceeeEEEEeecCc---eeehhhhhhhccchH---HHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccceeeee Confidence 999999999999988876 666666553 6677 45699999999999999999999999999764 5778888 Q ss_pred cccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccccccccceeecccCCcEEEEeCCCC Q lcl|NC_016566. 149 ARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMGDGLGRRFIISDAAA 228 (364) Q Consensus 149 ~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~~~lGrrVIVDD~~p 228 (364) +.++ ..++..+|++|+++|||++++|++|+|||+||++|++++++.. +....+-..+.+++||||||||+|| T Consensus 152 a~~~-----~~~s~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~---~~~s~~~~~i~~~~G~~VivdD~~p 223 (324) T protein:vir:59 152 GTAD-----GIYSAETFVDASYKLGDHESLLTAIGMHSATMASAVKQDLIEF---VKDSQSGIRFPTYMNKRVIVDDSMP 223 (324) T ss_pred cccc-----ceecHHHHHHHHHHhCCcccCcEEEEEchHHHHHHHHhhhhhh---ccccccCceeeeecccEEEEeCCCC Confidence 6432 3577889999999999999999999999999999999887743 2222333345677899999999999 Q ss_pred CC-----CCceEEEEEecceeEEecCCCCcceeeccCCCc-eeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhh Q lcl|NC_016566. 229 DA-----MGAGKMLGLVPGAVAVTTNGLDMLAQEKGGNEN-IERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDI 302 (364) Q Consensus 229 ~~-----~~~Yttylfg~GAi~~~~~~~~~~~~~~~g~e~-~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeL 302 (364) +. .++|++|+|++|||++.++.+.. +.|.+|+.+ ....+..|+||++||+||||+++.+ ++.|||++|| T Consensus 224 ~~~~~~~~~~y~s~l~~~GAi~~~~~~~~v-~vE~dRd~~~g~~~l~~r~~~~~~p~G~s~~~~~~----~~~sPt~~~L 298 (324) T protein:vir:59 224 VETLEDGTKVFTSYLFGAGALGYAEGQPEV-PTETARNALGSQDILINRKHFVLHPRGVKFTENAM----AGTTPTDEEL 298 (324) T ss_pred ccccCCCCceEEEEEEecCeEEEeecCCCc-ceecccCccccceEEEEeeEEEeEeeeEEeccccc----CCCCCChhhh Confidence 63 46999999999999999987764 445554432 3345667889999999999988764 3679999999 Q ss_pred cCCccceeecCcCcCcceEEEEecCcc Q lcl|NC_016566. 303 TDKANWELDQGQVDNAPATVQDVGSDS 329 (364) Q Consensus 303 at~~NW~rV~~s~K~~pgv~~~~~~~~ 329 (364) ++++||+||+ ++|++|-|.++-.-++ T Consensus 299 ~~~~NW~~v~-~~k~i~i~~~~~~~~~ 324 (324) T protein:vir:59 299 ANGANWQRVY-DPKKIRIVQFKHRLQA 324 (324) T ss_pred cCCccccccc-CccccceEEEEeeccC Confidence 9999999998 6799999999977777 No 9 >protein:vir:95107 Length: 270 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1549 # MgeName: X2 # Cross-refs: genbank:acc:YP_240822;genbank:gi:66394683;genbank:GeneID:5133901 Probab=100.00 E-value=1.1e-33 Score=201.31 Aligned_cols=260 Identities=10% Similarity=-0.001 Sum_probs=170.8 Q ss_pred CCcccc----chhhhhhhh-hhhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccchh Q lcl|NC_016566. 1 MSLTVF----QRKLVTAVT-QMIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKV 74 (364) Q Consensus 1 fd~~vf----n~~~~~~~~-e~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~k 74 (364) |---.+ +|++|..|+ |+++ ...+| ++.+.+... ....||.+.+|||+.||.+.. . ..+...+|.+ T Consensus 1 Ma~T~~~d~I~Pev~~~~V~e~~~-~~~~~---~~~~~~d~~L~g~~G~ti~~P~~~~igdae~----~-~eg~~i~~~~ 71 (270) T protein:vir:95 1 MTQTKKANLINPEVLANVVSAQMQ-NAIRF---TPYAVTDDTLVGQPGDTITRPKYAYIGAAED----L-QEGVAMDTTQ 71 (270) T ss_pred CCceehhhhcchHHHHHHHHHHHH-hHHhh---ccccccccccCCCCCCEEEeeeecCCCcccc----c-cCCCccchhh Confidence 322222 677777775 3333 33333 444444222 235799999999998887532 2 2244678999 Q ss_pred hhccceeeEEeccccCchhcCHHHHHh--hcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccC Q lcl|NC_016566. 75 LARMLTNSVNLSAKVGPVAITKAMMAK--IETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVD 152 (364) Q Consensus 75 it~~~~vaVkl~~~~gpv~~t~~~~~~--~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~ 152 (364) |++.++.+....++.+ |...+++. .+.||+ .++++|++.||.+..++.+|++|+|++.+.+ T Consensus 72 lt~~~~~a~i~~~gk~---~~itD~a~~~~~~dp~---~~~~~q~a~~~a~~~d~~li~~l~~a~~~~~----------- 134 (270) T protein:vir:95 72 MSMTTTKVTVKETGKA---VEVTQTAIITNVNGTL---QEASRQLAMSLADKVEIDYIAELNKSKQTAT----------- 134 (270) T ss_pred cccchheeeeehhhCc---ceecHHHHhhhccchH---HHHHHHHHHHHHHHHHHHHHHHhcccccccc----------- Confidence 9999999977665554 33333332 467994 5689999999999999999999999875431 Q ss_pred cccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccccccccceeecccCCcEEEEeCCCCCCCC Q lcl|NC_016566. 153 GVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMGDGLGRRFIISDAAADAMG 232 (364) Q Consensus 153 ~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~~~lGrrVIVDD~~p~~~~ 232 (364) ..++...|++|.++|||..+.++.++|||.+|+.|+|+..++..+.......-..+..++|.||||||++| + T Consensus 135 -----~~~t~~~~~dA~~~lgd~~~~~~~i~vhs~~~~~Lrk~~~~~~~~~~~~~~~~G~ig~~~G~~Viv~s~~~---~ 206 (270) T protein:vir:95 135 -----VSADATGILDAIEVFNSENDEDYVLYVNPKDYNKLVKSLFKVGGNVQDRAISKGDLVEIVGVSDIVKSKRV---S 206 (270) T ss_pred -----cccCHHHHHHHHHHhccccCCCcEEEEcHHHHHHHHhhhcccccccccchhcccccceecceeEEEeCCCC---C Confidence 12456689999999999999999999999999999987666543332222111234445799999999998 4 Q ss_pred ceEEEEEecceeEEecCCCCcceeeccCCCce-eeeEEeeEEEEeeeee-eeecccccccccCCCCcChhhh Q lcl|NC_016566. 233 AGKMLGLVPGAVAVTTNGLDMLAQEKGGNENI-ERWWQGEFDFNVAVKG-YRLKASARTPVEGVRSFKLSDI 302 (364) Q Consensus 233 ~Yttylfg~GAi~~~~~~~~~~~~~~~g~e~~-~~~~~~~~~f~lhp~G-~sw~~~~~~~~~gg~SPT~aeL 302 (364) +|++|+|++|||++.+.... ..|..|+... ...+..++||.+|+.. =++-..+.. .++|. |+ T Consensus 207 ~~~~~l~~~gAi~~~~~~~~--~vEtdRd~~~~~d~i~~~~~y~v~~~~~skvv~~t~~-----~a~~~-~~ 270 (270) T protein:vir:95 207 ENTAFLQRYGAMEIVNKKKP--EAYTDFDILKRTHLLSTNYHYSVNLKDETGVVKVTFK-----PSGSL-EM 270 (270) T ss_pred ceeEEEEeccceeeeecCCc--eeeeccchhhcccEEEeeeEEEEEEEccceEEEEEec-----CCCCc-CC Confidence 67999999999999886542 2344444322 2244456789888776 111111110 01111 11 No 10 >protein:vir:105334 Length: 276 # NCBI annotation: putative phage major capsid protein # Family: family:all:522 # MgeID: mge:1679 # MgeName: PH15 # Cross-refs: genbank:acc:YP_950669;genbank:gi:119967839;genbank:GeneID:4643213 Probab=99.96 E-value=5.4e-31 Score=186.54 Aligned_cols=262 Identities=13% Similarity=0.018 Sum_probs=173.1 Q ss_pred CCccccchhhhhhhh-hhhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccchhhhcc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVT-QMIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARM 78 (364) Q Consensus 1 fd~~vfn~~~~~~~~-e~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~ 78 (364) -...+++||+|..|+ |.+. ...+| ++-+.+... +...|+.+.+|+|+.|+.+.. +. .+...++.+|++. T Consensus 7 ~l~d~i~Pev~~~~v~~~~~-~~~~~---~~~~~~~~~l~g~~G~ti~iP~~~~igda~~----~~-eg~~i~~~~lt~~ 77 (276) T protein:vir:10 7 TKSTQIVPEVLAPMMQAELD-KKLRF---AQFADIDSTLVGQPGDTLTFPAFVYSGDATV----VP-EGQKIPVDKIETN 77 (276) T ss_pred ehhhhhchHHHHHHHHHHHH-hhhhh---cccceecccccCCCCCEEEeeeecCCCcccc----cc-CCCccCccccccc Confidence 334468899888884 3333 33334 555554221 235799999999999976432 22 2346789999999 Q ss_pred ceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccc Q lcl|NC_016566. 79 LTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRT 158 (364) Q Consensus 79 ~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~ 158 (364) ++.+....++.+ +..+..+....+.||+ .++.++++.||.+.+++.+++.|++..... + .. T Consensus 78 ~~~a~i~~~~k~-~~~tD~a~~~~~~dp~---~~~~~~~~~~~a~~~d~~~~~~l~~~~~~~-------~--------~~ 138 (276) T protein:vir:10 78 RREAKIHKIGKG-TDITDEALLSGYGDPQ---GEAVRQHGLAIANKVDNDVLEALRGTKLTV-------S--------AD 138 (276) T ss_pred eeeEEeehcccc-ccccHHHHHhhccchH---HHHHHHHHHHHHHHHHHHHHHHHhcccccc-------c--------cc Confidence 999987655544 2344444444567885 568899999999999999999998743221 1 12 Q ss_pred cccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccccccccc---eeecccCCcEEEEeCCCCCCCCceE Q lcl|NC_016566. 159 FPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDL---QVMGDGLGRRFIISDAAADAMGAGK 235 (364) Q Consensus 159 ~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~---~~~~~~lGrrVIVDD~~p~~~~~Yt 235 (364) .+++..+.+|.++|||+..+++.++|||.+|+.|+|++++..........++ ..+..++|.|||+||.+| +|+ T Consensus 139 ~~t~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~p----~~t 214 (276) T protein:vir:10 139 IGTLAGLEAAIDTFDDEDLEPMVLFINPKDAGKLRSSASDNFTRATELGDNIIVKGAFGEALGAVIVRSKKLD----EGE 214 (276) T ss_pred ccCHHHHHHHHHHhccccCcccEEEEcHHHHHHHHHhccccccccccccccceeccccceecceeEEEcCCCC----cce Confidence 2456789999999999999999999999999999986655322211111111 113345799999999998 789 Q ss_pred EEEEecceeEEecCCCCcceeeccCCCceeeeEEeeEEEEeeeeeeeecccc---cccccCCCCcChh Q lcl|NC_016566. 236 MLGLVPGAVAVTTNGLDMLAQEKGGNENIERWWQGEFDFNVAVKGYRLKASA---RTPVEGVRSFKLS 300 (364) Q Consensus 236 tylfg~GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~---~~~~~gg~SPT~a 300 (364) +|+|++||+++.+..+ +..|.+|+.+.. ++.-+..+.+|.+....+ +-...+|.+|+.| T Consensus 215 ~~l~~~gAi~~~~~~~--~~vE~dRd~~~~----~d~i~~~~~y~~~~~~~~~vv~~t~~~~~~~~~~ 276 (276) T protein:vir:10 215 AILAKRGAVKLITKRD--FFLETDRDPSTK----TTALYSDKHYVAYLYDESKAVKVTKGAGTTDSGA 276 (276) T ss_pred EEEEeccceeeeecCC--ceeecccchhhc----ccEEEEeeEEEEEEEcCcceEEEecCCcCCcCCC Confidence 9999999999988654 224555554322 222223333455554442 1122346678888 No 11 >protein:vir:1239 Length: 274 # NCBI annotation: similar to phage B1 major head protein # Family: family:all:522 # MgeID: mge:25 # MgeName: phi ETA # Cross-refs: genbank:acc:NP_510938;genbank:gi:17426272;genbank:GeneID:927376 Probab=99.94 E-value=4.1e-29 Score=176.23 Aligned_cols=264 Identities=15% Similarity=0.025 Sum_probs=170.2 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccchhhhccc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARML 79 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~ 79 (364) --..+++|+++..|+. +++..-.-.++.+.+... +...||.+.+|+|+.|+.+. + +. .+...++.++++.+ T Consensus 7 ~l~d~iiPev~~~~v~---~~~~~~l~~~~~~~~d~~l~g~~G~tv~iP~~~~ig~a~---~-~~-~g~~i~~~~lt~~~ 78 (274) T protein:vir:12 7 KTSNQIIPEVLAPMMQ---AQLEKKLRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQ---V-VA-EGEKIPTDILETKK 78 (274) T ss_pred ehhhhhchHHHHHHHH---HHHHhhhhhcccceecccccCCCCCEEEEeeecCCCccc---c-cc-CCCccchhhcccce Confidence 3445588999998862 223322334555555333 34579999999999987532 2 22 23466889999999 Q ss_pred eeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCccccccc Q lcl|NC_016566. 80 TNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTF 159 (364) Q Consensus 80 ~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~ 159 (364) +.+.+..++.+ +.++..+-...+.||+ .++.+|++.+|.+..++.+++.+.+.-. +.+ ... T Consensus 79 ~~~~i~~~~~~-~~i~D~~~~~~~~d~~---~~~~~q~~~~~a~~vd~~~l~~~~~a~~-------~~~--------~~a 139 (274) T protein:vir:12 79 REAKIRKIAKG-TSITDEALLSGYGDPQ---GEQVRQHGLAHANKVDNDVLEALMGAKL-------TVN--------ADI 139 (274) T ss_pred eeEEeeeecce-eeecHHHHHhcccchH---HHHHHHHHHHHHHHHHHHHHHHHhcccc-------ccc--------ccc Confidence 99887665554 3444444444567885 5689999999999999999998876321 111 123 Q ss_pred ccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccccccccc---eeecccCCcEEEEeCCCCCCCCceEE Q lcl|NC_016566. 160 PTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDL---QVMGDGLGRRFIISDAAADAMGAGKM 236 (364) Q Consensus 160 ~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~---~~~~~~lGrrVIVDD~~p~~~~~Ytt 236 (364) .++..+.+|.++|||+...++.++|||.+|+.|+|++++..........++ ..+..++|.|||+||.+| +|++ T Consensus 140 ~~~d~i~dA~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~fv~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~p----~~t~ 215 (274) T protein:vir:12 140 TKLNGLQSAIDKFNDEDLEPMVLFINPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAIIVRSNKLE----AGTA 215 (274) T ss_pred cCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhhhhhccccccccccceecccceeecCeeEEEeCCCC----cceE Confidence 567789999999999999999999999999999986554321111100111 123334699999999998 6899 Q ss_pred EEEecceeEEecCCCCcceeeccCCCceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcCCccceeec Q lcl|NC_016566. 237 LGLVPGAVAVTTNGLDMLAQEKGGNENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQ 312 (364) Q Consensus 237 ylfg~GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~ 312 (364) |+|++|||++....+ +..|..|++... .+.-+..+.+|.+... |+-.-.-+...|++-. T Consensus 216 ~l~~~gA~~~~~~~~--~~vE~~Rd~~~~----~d~i~~~~~y~~~~~~-----------~~~vv~~t~~~~~~~~ 274 (274) T protein:vir:12 216 ILAKKGAVKLILKRD--FFLEVARDASTK----TTALYSDKHYVAYLYD-----------ESKAVKITKGSGSLEM 274 (274) T ss_pred EEEeccceeeeecCC--ceeccccchhhc----ccEEEeeeEEEEEEEc-----------CCceEEEEcCCccccC Confidence 999999999988643 224555554321 1222333445666643 3333333334443322 No 12 >protein:vir:3613 Length: 272 # NCBI annotation: MHP # Family: family:all:522 # MgeID: mge:74 # MgeName: TP901-1 # Cross-refs: genbank:acc:NP_112699;genbank:gi:13786567;genbank:GeneID:921035 Probab=99.94 E-value=4.7e-29 Score=175.88 Aligned_cols=257 Identities=11% Similarity=-0.020 Sum_probs=166.8 Q ss_pred CCccccchhhhhhhh-hhhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccchhhhcc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVT-QMIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARM 78 (364) Q Consensus 1 fd~~vfn~~~~~~~~-e~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~ 78 (364) -...++.||++..|+ |.+. +-.-.++-+++... +...||.+++|+|+.|+.+. .+. .+...++.+++.. T Consensus 7 ~~~d~iiPev~~~~v~~~~~----~~~~~~~~~~~~~~l~g~~G~ti~iP~~~~~gda~----~~~-eg~~i~~~~lt~~ 77 (272) T protein:vir:36 7 TLADLVNPEVLAPIVSYELN----KALRFAPLAQVDTTLQGQPGNTLKFPAFTYIGDAA----DVA-EGGEISLDKIGTT 77 (272) T ss_pred ehhhhhchHHHHHHHHHHHH----hhhhhccccccccccccCCCCEEEEeeeccCcccc----ccC-CCCccChhhcCCc Confidence 334567788887774 3333 22222444444332 22459999999999996542 232 2346688999999 Q ss_pred ceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccc Q lcl|NC_016566. 79 LTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRT 158 (364) Q Consensus 79 ~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~ 158 (364) +..+....++.+ +..+..+..-.+.||+ .++.++++.||.+..++.+++.|.|+.... .. T Consensus 78 ~~~~~i~~~~k~-~~vtD~~~~~~~~d~~---~~~~~~~a~~~a~~~d~~i~~~l~~~~~~~----------------~~ 137 (272) T protein:vir:36 78 TKSVTIKKAAKG-TEITDEAALSGYGDPI---GESNKQLGLSLANKVDDDLLSAAKTTSQTV----------------ST 137 (272) T ss_pred ceeEeeehhhcc-ccccHHHHhhccchHH---HHHHHHHHHHHHHHHHHHHHHHhccccccc----------------cc Confidence 988876555543 3334433333456774 569999999999999999999988743221 12 Q ss_pred cccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccc--ccceeecccCCcEEEEeCCCCCCCCceEE Q lcl|NC_016566. 159 FPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAI--GDLQVMGDGLGRRFIISDAAADAMGAGKM 236 (364) Q Consensus 159 ~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~--~~~~~~~~~lGrrVIVDD~~p~~~~~Ytt 236 (364) ..++..+.+|.++|||....++.++|||.+|+.|+|+..+.+....... ..-..++.++|.|||+||.||...+.|++ T Consensus 138 ~~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~~~~~~~~G~ig~~~G~~Vv~s~~~p~~~~~~~~ 217 (272) T protein:vir:36 138 KANVDGVQAALDIFNDEDAQAYVLIVNPKDAAKIRKDANAKNIGSEVGANALINGTYADVLGAQIVRSKKLAEGSALMFK 217 (272) T ss_pred cccHHHHHHHHHHhhhcCCCceEEEEcHHHHHHHhcccccccccccccccceeeeccceecCeeEEEeCCCCCCceeEEE Confidence 2456679999999999999999999999999999886655332110000 00012334579999999999999999999 Q ss_pred EEEecceeEEecCCCCcceeeccCCCc-eeeeEEeeEEEE---eeeee-eeeccccc Q lcl|NC_016566. 237 LGLVPGAVAVTTNGLDMLAQEKGGNEN-IERWWQGEFDFN---VAVKG-YRLKASAR 288 (364) Q Consensus 237 ylfg~GAi~~~~~~~~~~~~~~~g~e~-~~~~~~~~~~f~---lhp~G-~sw~~~~~ 288 (364) |+|++||+++..... .. .|..|+++ ....+..++||. ++|.| ++.+.+-+ T Consensus 218 ~~~~~gA~~~~~~~~-~~-vE~~R~~~~~~d~i~~~~~y~~~v~~~~~vv~~t~~g~ 272 (272) T protein:vir:36 218 IVSNSPALKLVLKRG-VQ-VETDRDIVTKTTVITADEHYAAYLYDLTKVVNITFTGV 272 (272) T ss_pred EEecccceeeeecCC-cc-cccccchhhcCcEEEEEEEEEEEEEcCccEEEEeecCC Confidence 999999999876543 12 23333322 112344455664 55666 44444322 No 13 >protein:vir:96262 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1612 # MgeName: ROSA # Cross-refs: genbank:acc:YP_240311;genbank:gi:66395978;genbank:GeneID:5133339 Probab=99.94 E-value=1.2e-28 Score=173.57 Aligned_cols=260 Identities=15% Similarity=0.038 Sum_probs=170.9 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccchhhhccc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARML 79 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~ 79 (364) -=..+++|+++..|+..--....+| ++-+.+... +...||.+.+|+|+.|+.+. + +. .+...++.+|++.. T Consensus 7 ~l~d~i~Pev~~~~v~~~~~~~l~~---~~~~~~~~~l~g~~G~tv~iP~~~~ig~a~---~-~~-~g~~i~~~~lt~~~ 78 (274) T protein:vir:96 7 KLTNQIVPEVLAPMMQAELEKKLRF---ASFAEIDNTLVGQPGDTLTFPAFIYSGDAK---V-VA-EGEKIPTDILETKK 78 (274) T ss_pred ehhheechHHHHHHHHHHHHhhhhc---cccceecccccCCCCCEEEeeeecCCCccc---c-cc-CCCccchhhcccce Confidence 2234577888888853222222233 444444221 22459999999999997542 2 22 23466899999999 Q ss_pred eeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCccccccc Q lcl|NC_016566. 80 TNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTF 159 (364) Q Consensus 80 ~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~ 159 (364) +.+....++.+ +.++..+-.-.+.||+ .++.++++.+|.+..++.+++.|+++... ++ .+. T Consensus 79 ~~~~i~~~~~a-~~i~D~~~~~~~~d~~---~~~~~~~~~~~a~~vd~~i~~~l~~a~~~-------~~--------~~~ 139 (274) T protein:vir:96 79 REAKIRKIAKG-TSISDEALLSGYGDPQ---GEQVRQHGLAHANKVDDDVLEALKSAKLT-------VE--------ADI 139 (274) T ss_pred eEEEeeeeecc-eeehHHHHhhccchHH---HHHHHHHHHHHHHHHHHHHHHHHhccccc-------cc--------ccc Confidence 99887665554 2333333332345774 56899999999999999999999875321 11 123 Q ss_pred ccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc------ccccccceeecccCCcEEEEeCCCCCCCCc Q lcl|NC_016566. 160 PTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ------VFAIGDLQVMGDGLGRRFIISDAAADAMGA 233 (364) Q Consensus 160 ~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~------~~~~~~~~~~~~~lGrrVIVDD~~p~~~~~ 233 (364) .++..|.+|.++|||+...++.++|||.+|+.|+|+.++..... +...+.+. .++|.|||+||.+| + T Consensus 140 ~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig---~~~G~~Vi~s~~~~----~ 212 (274) T protein:vir:96 140 TKLTGLQTAIDKFNDEDLEPMVLFISPLDAGKLRGDATTNFTRATELGDDVIVKGAFG---EALGAVIVRSNKLE----A 212 (274) T ss_pred cCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhccccccccccccccceeccccc---eecCeEEEEeCCCC----C Confidence 56778999999999999999999999999999998655532111 11122233 34699999999997 7 Q ss_pred eEEEEEecceeEEecCCCCcceeeccCCCce-eeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcCCccceeec Q lcl|NC_016566. 234 GKMLGLVPGAVAVTTNGLDMLAQEKGGNENI-ERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQ 312 (364) Q Consensus 234 Yttylfg~GAi~~~~~~~~~~~~~~~g~e~~-~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~ 312 (364) |++|+|++|||++.+..+. ..|..|++.. ...+..++|| |++.. .|+-.-+.+..+|++-. T Consensus 213 ~t~~l~~~gA~~~~~~~~~--~vE~~Rd~~~~~d~i~~~~~y-----~~~~~-----------~~~~~v~~tk~~~~~~~ 274 (274) T protein:vir:96 213 GTAILAKKGAVKLITKRDF--FLETDRDPSTKTTALYSDKHY-----VAYLY-----------DESKAVKITKGSGSLEM 274 (274) T ss_pred ceEEEEeccceeeeecCCc--ccccccccccccCEEEEeEEE-----EEEEE-----------cCCcEEEEEcCCccccC Confidence 8999999999999876542 2345554432 1223333344 55543 35566666777777654 No 14 >protein:vir:95898 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1588 # MgeName: 71 # Cross-refs: genbank:acc:YP_240385;genbank:gi:66396054;genbank:GeneID:5133409 Probab=99.94 E-value=1.2e-28 Score=173.57 Aligned_cols=260 Identities=15% Similarity=0.038 Sum_probs=170.9 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccchhhhccc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARML 79 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~ 79 (364) -=..+++|+++..|+..--....+| ++-+.+... +...||.+.+|+|+.|+.+. + +. .+...++.+|++.. T Consensus 7 ~l~d~i~Pev~~~~v~~~~~~~l~~---~~~~~~~~~l~g~~G~tv~iP~~~~ig~a~---~-~~-~g~~i~~~~lt~~~ 78 (274) T protein:vir:95 7 KLTNQIVPEVLAPMMQAELEKKLRF---ASFAEIDNTLVGQPGDTLTFPAFIYSGDAK---V-VA-EGEKIPTDILETKK 78 (274) T ss_pred ehhheechHHHHHHHHHHHHhhhhc---cccceecccccCCCCCEEEeeeecCCCccc---c-cc-CCCccchhhcccce Confidence 2234577888888853222222233 444444221 22459999999999997542 2 22 23466899999999 Q ss_pred eeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCccccccc Q lcl|NC_016566. 80 TNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTF 159 (364) Q Consensus 80 ~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~ 159 (364) +.+....++.+ +.++..+-.-.+.||+ .++.++++.+|.+..++.+++.|+++... ++ .+. T Consensus 79 ~~~~i~~~~~a-~~i~D~~~~~~~~d~~---~~~~~~~~~~~a~~vd~~i~~~l~~a~~~-------~~--------~~~ 139 (274) T protein:vir:95 79 REAKIRKIAKG-TSISDEALLSGYGDPQ---GEQVRQHGLAHANKVDDDVLEALKSAKLT-------VE--------ADI 139 (274) T ss_pred eEEEeeeeecc-eeehHHHHhhccchHH---HHHHHHHHHHHHHHHHHHHHHHHhccccc-------cc--------ccc Confidence 99887665554 2333333332345774 56899999999999999999999875321 11 123 Q ss_pred ccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc------ccccccceeecccCCcEEEEeCCCCCCCCc Q lcl|NC_016566. 160 PTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ------VFAIGDLQVMGDGLGRRFIISDAAADAMGA 233 (364) Q Consensus 160 ~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~------~~~~~~~~~~~~~lGrrVIVDD~~p~~~~~ 233 (364) .++..|.+|.++|||+...++.++|||.+|+.|+|+.++..... +...+.+. .++|.|||+||.+| + T Consensus 140 ~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig---~~~G~~Vi~s~~~~----~ 212 (274) T protein:vir:95 140 TKLTGLQTAIDKFNDEDLEPMVLFISPLDAGKLRGDATTNFTRATELGDDVIVKGAFG---EALGAVIVRSNKLE----A 212 (274) T ss_pred cCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhccccccccccccccceeccccc---eecCeEEEEeCCCC----C Confidence 56778999999999999999999999999999998655532111 11122233 34699999999997 7 Q ss_pred eEEEEEecceeEEecCCCCcceeeccCCCce-eeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcCCccceeec Q lcl|NC_016566. 234 GKMLGLVPGAVAVTTNGLDMLAQEKGGNENI-ERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQ 312 (364) Q Consensus 234 Yttylfg~GAi~~~~~~~~~~~~~~~g~e~~-~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~ 312 (364) |++|+|++|||++.+..+. ..|..|++.. ...+..++|| |++.. .|+-.-+.+..+|++-. T Consensus 213 ~t~~l~~~gA~~~~~~~~~--~vE~~Rd~~~~~d~i~~~~~y-----~~~~~-----------~~~~~v~~tk~~~~~~~ 274 (274) T protein:vir:95 213 GTAILAKKGAVKLITKRDF--FLETDRDPSTKTTALYSDKHY-----VAYLY-----------DESKAVKITKGSGSLEM 274 (274) T ss_pred ceEEEEeccceeeeecCCc--ccccccccccccCEEEEeEEE-----EEEEE-----------cCCcEEEEEcCCccccC Confidence 8999999999999876542 2345554432 1223333344 55543 35566666777777654 No 15 >protein:vir:96833 Length: 275 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1642 # MgeName: EW # Cross-refs: genbank:acc:YP_240157;genbank:gi:66395822;genbank:GeneID:5133174 Probab=99.93 E-value=1.8e-27 Score=167.15 Aligned_cols=260 Identities=13% Similarity=0.036 Sum_probs=164.9 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccchhhhccc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARML 79 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~ 79 (364) --..+++||++..|+..--+...+| ++.+.+... +...||.+.+|+|+.|+.+. + +. .+...++.++++.+ T Consensus 8 ~l~d~i~PEv~~~~v~~~~~~~~~~---~~~~~~~~~l~g~~G~tv~iP~~~~ig~a~---~-~~-~g~~i~~~~lt~~~ 79 (275) T protein:vir:96 8 KLANMVNPEVLAPMMQAELDKKLKF---AQFADIDNTLVGQPGNTITFPAFVYSGDAK---V-VP-EGEEIPIDLIETKK 79 (275) T ss_pred hhhhhhchHHHHHHHHHHHHHhhhh---cccceecccccCCCCCEEEeeeeccCCccc---c-cc-CCCCcchhhcccce Confidence 3345778888888853322334444 555555332 23469999999999996532 2 22 23456889999999 Q ss_pred eeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCccccccc Q lcl|NC_016566. 80 TNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTF 159 (364) Q Consensus 80 ~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~ 159 (364) +.+.+..++.+ +..+..+....+.||+ .++.+|++.+|.+..++.+++.|.+.... ++ ... T Consensus 80 ~~~~i~~~~~~-~~i~D~~~~~~~~d~~---~~~~~~~a~~~a~~~d~~ll~~l~~a~~~-------~~--------~~~ 140 (275) T protein:vir:96 80 RQATIRKIGKG-TVLTDEALLSGYGDPK---GEAVRQHGLAIANKVDNDVLEALQGATLK-------VE--------ADI 140 (275) T ss_pred eeEEeehhccc-ccccHHHHHhhccchH---HHHHHHHHHHHHHHHHHHHHHHHhccccc-------cc--------ccc Confidence 98877665555 3444444444566884 56889999999999999999998874311 11 123 Q ss_pred ccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccccc---ceeecccCCcEEEEeCCCCCCCCceEE Q lcl|NC_016566. 160 PTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGD---LQVMGDGLGRRFIISDAAADAMGAGKM 236 (364) Q Consensus 160 ~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~---~~~~~~~lGrrVIVDD~~p~~~~~Ytt 236 (364) .++..|.+|.++|||....++.++|||.+|..|+|++.+..........+ -..+..++|.|||+||.+| +|++ T Consensus 141 ~~~d~i~dA~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~~~~~G~ig~~~G~~Vi~s~~~p----~~t~ 216 (275) T protein:vir:96 141 TKLAGLQTAIDKFNDEDLEPMVLFVNPLDAGKLRASATDNFTRATLLGDNVIVKGAFGEALGAIIVRSNKIK----EGEA 216 (275) T ss_pred cCHHHHHHHHHHhccccCCccEEEeCHHHHHHHHhcccccccccccccccceeccccceecCeeEEEeCCCC----cceE Confidence 56778999999999999999999999999999998754432211111111 1123345799999999998 6899 Q ss_pred EEEecceeEEecCCCCcceeeccCCCce-eeeEEeeEEEEee---eeeeeecccccccccCCCCcChhhhcC Q lcl|NC_016566. 237 LGLVPGAVAVTTNGLDMLAQEKGGNENI-ERWWQGEFDFNVA---VKGYRLKASARTPVEGVRSFKLSDITD 304 (364) Q Consensus 237 ylfg~GAi~~~~~~~~~~~~~~~g~e~~-~~~~~~~~~f~lh---p~G~sw~~~~~~~~~gg~SPT~aeLat 304 (364) |+|++||+++.+... +..|..|++.. ...+..+.||.+| |.|+-=-.. +| +-|.- T Consensus 217 ~i~~~gA~~~~~~~~--~~vE~~Rd~~~~~d~i~~~~~y~~~~~~~~~vv~~t~---------~~--~~~~~ 275 (275) T protein:vir:96 217 ILAKRGAVKLITKRD--FFLETERHASHKSTALFSDKHYVAYLYDESKVVKITK---------SA--SGLGV 275 (275) T ss_pred EEEeccceeeeecCC--cccccccchhhcCcEEEEeEEEEEEEEcCccEEEEEe---------cc--cccCC Confidence 999999999987643 22344444322 2233344455533 322211000 11 11111 No 16 >protein:vir:94494 Length: 274 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1508 # MgeName: 88 # Cross-refs: genbank:acc:YP_240676;genbank:gi:66396348;genbank:GeneID:5133758 Probab=99.93 E-value=2.9e-27 Score=166.05 Aligned_cols=263 Identities=14% Similarity=0.008 Sum_probs=165.6 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccchhhhccc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARML 79 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~ 79 (364) --..+++|+++..|+..=.+...+ .++.+.+... +...||.+..|+|+.|+... | +. .+...++.+++..+ T Consensus 7 ~~~d~iiPev~~~~v~~~~~~~l~---~~~~~~~d~~l~g~~G~tv~iP~~~~~g~a~---~-~~-~g~~i~~~~lt~~~ 78 (274) T protein:vir:94 7 KTSDQIIPEVLAPMMQAQLEKKLR---FASFAEVDSTLQGQPGDTLTFPAFVYSGDAQ---V-VA-EGEKIPTDILETKK 78 (274) T ss_pred ehhheechHHHHHHHHHhhhhhhh---hcccceecccccCCCCCEEEEeeecCCCccc---c-cc-CCCcccccccccce Confidence 334458899998885322222223 3555555322 23469999999999886532 2 22 23456889999999 Q ss_pred eeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCccccccc Q lcl|NC_016566. 80 TNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTF 159 (364) Q Consensus 80 ~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~ 159 (364) ..+....++.+ +.++..+-.-.+.||+ .++.++++.+|.+..++.+++.|.+.- .. .+ ... T Consensus 79 ~~~~i~~~~~~-~~i~D~~~~~~~~dp~---~~~~~~~a~a~a~~vd~~~~~~l~~a~-----~~--~~--------~~~ 139 (274) T protein:vir:94 79 REAKIRKIAKG-TSITDEALLSGYGDPQ---GEQVRQHGLAHANKVDNDVLEALMGAK-----LT--VN--------ADI 139 (274) T ss_pred eEEEeeeecce-ecccHHHHHhccchHH---HHHHHHHHHHHHHHHHHHHHHHHhccC-----cc--cc--------ccc Confidence 99887666554 3444444443456774 568899999999999999999887631 11 11 123 Q ss_pred ccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccccc---ceeecccCCcEEEEeCCCCCCCCceEE Q lcl|NC_016566. 160 PTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGD---LQVMGDGLGRRFIISDAAADAMGAGKM 236 (364) Q Consensus 160 ~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~---~~~~~~~lGrrVIVDD~~p~~~~~Ytt 236 (364) .++..|.+|.++|||+...++.++|||.+|..|+|++++..........+ -..+..++|.|||+||.+| +|++ T Consensus 140 ~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~p----~~t~ 215 (274) T protein:vir:94 140 TKLNGLQSAIDKFNDEDLEPMVLFVNPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAIIVRTNKLE----AGTA 215 (274) T ss_pred cCHHHHHHHHHHhhccCCCceEEEeCHHHHHHHHhhhhhhccccCcccccceeccccceecCeeEEEcCCCC----cceE Confidence 56778999999999999999999999999999998755422111111011 1113345799999999998 7899 Q ss_pred EEEecceeEEecCCCCcceeeccCCCce-eeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcCCccceeec Q lcl|NC_016566. 237 LGLVPGAVAVTTNGLDMLAQEKGGNENI-ERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQ 312 (364) Q Consensus 237 ylfg~GAi~~~~~~~~~~~~~~~g~e~~-~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~ 312 (364) |+|++|||++.+..+ +..|..|++.. ...+.. .|.+|.+....+ -.-.-+.+.|++-. T Consensus 216 ~l~~~gA~~~~~~~~--~~vE~~Rd~~~~~d~i~~-----~~~y~~~~~~~~-----------~vv~~t~~~~~~~~ 274 (274) T protein:vir:94 216 ILAKKGAVKLILKRD--FFLEVARDASTKTTALYS-----DKHYVAYLYDES-----------KAVKITKGSGSLEM 274 (274) T ss_pred EEEeCcceEeeecCC--ceeccccchhhcccEEEE-----EEEEEEEEEcCC-----------ceEEEecCcccccC Confidence 999999999988654 22345555432 122333 334455554331 11122222222211 No 17 >protein:vir:97433 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1676 # MgeName: 92 # Cross-refs: genbank:acc:YP_240749;genbank:gi:66396420;genbank:GeneID:5133789 Probab=99.93 E-value=2.9e-27 Score=166.05 Aligned_cols=263 Identities=14% Similarity=0.008 Sum_probs=165.6 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccchhhhccc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARML 79 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~ 79 (364) --..+++|+++..|+..=.+...+ .++.+.+... +...||.+..|+|+.|+... | +. .+...++.+++..+ T Consensus 7 ~~~d~iiPev~~~~v~~~~~~~l~---~~~~~~~d~~l~g~~G~tv~iP~~~~~g~a~---~-~~-~g~~i~~~~lt~~~ 78 (274) T protein:vir:97 7 KTSDQIIPEVLAPMMQAQLEKKLR---FASFAEVDSTLQGQPGDTLTFPAFVYSGDAQ---V-VA-EGEKIPTDILETKK 78 (274) T ss_pred ehhheechHHHHHHHHHhhhhhhh---hcccceecccccCCCCCEEEEeeecCCCccc---c-cc-CCCcccccccccce Confidence 334458899998885322222223 3555555322 23469999999999886532 2 22 23456889999999 Q ss_pred eeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCccccccc Q lcl|NC_016566. 80 TNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTF 159 (364) Q Consensus 80 ~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~ 159 (364) ..+....++.+ +.++..+-.-.+.||+ .++.++++.+|.+..++.+++.|.+.- .. .+ ... T Consensus 79 ~~~~i~~~~~~-~~i~D~~~~~~~~dp~---~~~~~~~a~a~a~~vd~~~~~~l~~a~-----~~--~~--------~~~ 139 (274) T protein:vir:97 79 REAKIRKIAKG-TSITDEALLSGYGDPQ---GEQVRQHGLAHANKVDNDVLEALMGAK-----LT--VN--------ADI 139 (274) T ss_pred eEEEeeeecce-ecccHHHHHhccchHH---HHHHHHHHHHHHHHHHHHHHHHHhccC-----cc--cc--------ccc Confidence 99887666554 3444444443456774 568899999999999999999887631 11 11 123 Q ss_pred ccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccccc---ceeecccCCcEEEEeCCCCCCCCceEE Q lcl|NC_016566. 160 PTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGD---LQVMGDGLGRRFIISDAAADAMGAGKM 236 (364) Q Consensus 160 ~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~---~~~~~~~lGrrVIVDD~~p~~~~~Ytt 236 (364) .++..|.+|.++|||+...++.++|||.+|..|+|++++..........+ -..+..++|.|||+||.+| +|++ T Consensus 140 ~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~p----~~t~ 215 (274) T protein:vir:97 140 TKLNGLQSAIDKFNDEDLEPMVLFVNPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAIIVRTNKLE----AGTA 215 (274) T ss_pred cCHHHHHHHHHHhhccCCCceEEEeCHHHHHHHHhhhhhhccccCcccccceeccccceecCeeEEEcCCCC----cceE Confidence 56778999999999999999999999999999998755422111111011 1113345799999999998 7899 Q ss_pred EEEecceeEEecCCCCcceeeccCCCce-eeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcCCccceeec Q lcl|NC_016566. 237 LGLVPGAVAVTTNGLDMLAQEKGGNENI-ERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQ 312 (364) Q Consensus 237 ylfg~GAi~~~~~~~~~~~~~~~g~e~~-~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~ 312 (364) |+|++|||++.+..+ +..|..|++.. ...+.. .|.+|.+....+ -.-.-+.+.|++-. T Consensus 216 ~l~~~gA~~~~~~~~--~~vE~~Rd~~~~~d~i~~-----~~~y~~~~~~~~-----------~vv~~t~~~~~~~~ 274 (274) T protein:vir:97 216 ILAKKGAVKLILKRD--FFLEVARDASTKTTALYS-----DKHYVAYLYDES-----------KAVKITKGSGSLEM 274 (274) T ss_pred EEEeCcceEeeecCC--ceeccccchhhcccEEEE-----EEEEEEEEEcCC-----------ceEEEecCcccccC Confidence 999999999988654 22345555432 122333 334455554331 11122222222211 No 18 >protein:vir:96123 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1602 # MgeName: 37 # Cross-refs: genbank:acc:YP_240078;genbank:gi:66395742;genbank:GeneID:5133103 Probab=99.91 E-value=1.2e-25 Score=157.28 Aligned_cols=264 Identities=15% Similarity=0.036 Sum_probs=159.9 Q ss_pred CC------ccccchhhhhhhhhhhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccch Q lcl|NC_016566. 1 MS------LTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAK 73 (364) Q Consensus 1 fd------~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~ 73 (364) |- ..++.|+++..|+..- +.+..-.++.+.+... +...||.+..|+|+.++... | +.. +...++. T Consensus 1 ma~~~T~~~d~i~Pev~s~~v~~~---~~~~~~~~~~~~~~~~l~g~~G~tv~ip~~~~~g~~~---~-~~~-g~~i~~~ 72 (274) T protein:vir:96 1 MAQGTTKVSNLIVPEVLAPMMQAE---LDKKLRFAQFADIDSTLVGQPGDTLTFPAFTYSGDAQ---V-IAE-GEKIPVD 72 (274) T ss_pred CCccccchhhhhhhHHHHHHHHHH---HHhhhhhcccccccccccCCCCCEEEEEeeccCCCcc---c-cCC-CCcCchh Confidence 33 4678888888885332 2222223444444221 33469999999999876532 2 222 3356788 Q ss_pred hhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCc Q lcl|NC_016566. 74 VLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDG 153 (364) Q Consensus 74 kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~ 153 (364) +++.....+....++.+ +.++..+-.-.+.||+ ..+.++++.||.+..++.+++.|.+.- .. .+ T Consensus 73 ~it~~~~~~~i~~~~~~-~~i~D~~~~~~~~d~~---~~~~~~~~~~~a~~~d~~i~~~l~~a~-----~~--~~----- 136 (274) T protein:vir:96 73 QIGTSKREAKVRKIGKG-TELTDEAVLSGFGDPQ---GEAVRQHGLAIANKVDNDVLEALKGAT-----LT--VE----- 136 (274) T ss_pred hcccceeEEEEEeeece-eeecHHHHHhhcchHH---HHHHHHHHHHHHHHHHHHHHHHHhcCC-----CC--cC----- Confidence 99998887765444443 3344444333456774 568999999999999999999987631 11 11 Q ss_pred ccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccccc---cccceeecccCCcEEEEeCCCCCC Q lcl|NC_016566. 154 VGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFA---IGDLQVMGDGLGRRFIISDAAADA 230 (364) Q Consensus 154 ~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~---~~~~~~~~~~lGrrVIVDD~~p~~ 230 (364) ....++..|.+|.++|||+...++.++|||.+|+.|+|++.+........ ...-..+..++|.+||+||.+| T Consensus 137 ---~~~~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~~~~~g~ig~~~G~~Vi~s~~~p-- 211 (274) T protein:vir:96 137 ---ADITKLDGLQTAIDKFNDEDLEPMVLFVNPLDAGGLRTSASDNFTRPTQLGDNIIVKGAFGEALGAVIVRSNKLN-- 211 (274) T ss_pred ---cccccHHHHHHHHHHhcccCCCceEEEeCHHHHHHHHhcccccccccccccccceeecccceecCeeEEEcCCCC-- Confidence 12346778999999999999999999999999999998655422111110 0011123344699999999998 Q ss_pred CCceEEEEEecceeEEecCCCCcceeeccCCCceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcCCcccee Q lcl|NC_016566. 231 MGAGKMLGLVPGAVAVTTNGLDMLAQEKGGNENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITDKANWEL 310 (364) Q Consensus 231 ~~~Yttylfg~GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~r 310 (364) +|++|+|+.|||++.+..+.. .|..|++.. +.+.-+..|.+|.+.......-. |.+ +-=++ T Consensus 212 --~~t~~l~~~gA~~~~~~~~~~--vE~~Rd~~~----~~d~i~~~~~yg~~~~~~~~vv~----------~t~-~~~~~ 272 (274) T protein:vir:96 212 --KGEALLAKKGAVKLITKRDFF--LEKDRDASR----KSTALYSDKHYVAYLYDESKVVK----------ITK-GAGDE 272 (274) T ss_pred --cceEEEEeCcceeeeecCCcc--cccccchhh----cccEEEEeeEEEEEEEcCccEEE----------EEc-Ccccc Confidence 689999999999998764321 233443321 12222334445555533211000 000 00001 Q ss_pred ec Q lcl|NC_016566. 311 DQ 312 (364) Q Consensus 311 V~ 312 (364) |. T Consensus 273 ~~ 274 (274) T protein:vir:96 273 VM 274 (274) T ss_pred cC Confidence 11 No 19 >protein:vir:93742 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1475 # MgeName: 55 # Cross-refs: genbank:acc:YP_240459;genbank:gi:66396126;genbank:GeneID:5133511 Probab=99.90 E-value=6.3e-25 Score=153.27 Aligned_cols=263 Identities=13% Similarity=0.010 Sum_probs=159.7 Q ss_pred CCccccchhhhhhhhh-hhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccchhhhcc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQ-MIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARM 78 (364) Q Consensus 1 fd~~vfn~~~~~~~~e-~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~ 78 (364) =-..++.|+++..|+. .+... .+ .++.+.+... +...||.++.|+|+.|+... + +.. +...++.+++.. T Consensus 7 ~~~~~iiPev~~~~v~~~~~~~-~~---~~~~~~~~~~l~g~~G~tv~ip~~~~~g~~~---~-~~e-g~~i~~~~it~~ 77 (274) T protein:vir:93 7 KTSNQIIPEVLAPMMQAQLEKK-LR---FASFAEVDSTLQGQPGDTLTFPAFVYSGDAQ---V-VAE-GEKIPTDILETK 77 (274) T ss_pred ehhheechHHHHHHHHHHHHhh-hh---hcccccccccccCCCCCEEEEEeeccCCCcc---c-ccC-CCcccccccccc Confidence 2234577888888753 33322 22 2333444222 34469999999999886532 2 222 335578899999 Q ss_pred ceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccc Q lcl|NC_016566. 79 LTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRT 158 (364) Q Consensus 79 ~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~ 158 (364) ...+....++.+ +..+..+..-.+.||+ ..+.++++.+|.+..++.+++.+.+.-. ..+ .. T Consensus 78 ~~~~~i~~~~~~-~~i~D~~~~~~~~d~~---~~~~~~~~~~~a~~~d~~~~~~~~~a~~-------~~~--------~~ 138 (274) T protein:vir:93 78 KREAKIRKIAKG-TSITDEALLSGYGDPQ---GEQVRQHGLAHANKVDNDVLEALMGAKL-------TVN--------AD 138 (274) T ss_pred eeEEEeeeeccc-ccccHHHHHhhccchH---HHHHHHHHHHHHHHHHHHHHHHHhcccc-------ccc--------cc Confidence 888876555544 3344444443566774 5688999999999999999998866321 111 12 Q ss_pred cccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccccc---ceeecccCCcEEEEeCCCCCCCCceE Q lcl|NC_016566. 159 FPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGD---LQVMGDGLGRRFIISDAAADAMGAGK 235 (364) Q Consensus 159 ~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~---~~~~~~~lGrrVIVDD~~p~~~~~Yt 235 (364) ..++..+.+|.++|||+...++.++|||.+|..|+|++.+..........+ -..+..++|.+||+||.+| +|+ T Consensus 139 ~~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~p----~~t 214 (274) T protein:vir:93 139 ITKLNGLQSAIDKFNDEDLEPMVLFINPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAIIVRTNKLE----AGT 214 (274) T ss_pred ccCHHHHHHHHHHhhhccCCccEEEeCHHHHHHHHhhhhhcccccccccccceeecccceecCeeEEEcCCCC----cce Confidence 346778999999999999999999999999999998655432111111011 1123345799999999998 689 Q ss_pred EEEEecceeEEecCCCCcceeeccCCCceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcC Q lcl|NC_016566. 236 MLGLVPGAVAVTTNGLDMLAQEKGGNENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITD 304 (364) Q Consensus 236 tylfg~GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat 304 (364) +|+|++|||++.+..+. ..|..+++... .+.-+..|.+|++.......-. ..+.-+-|+- T Consensus 215 ~~l~~~gai~~~~~~~~--~vE~~Rd~~~~----~d~i~~~~~y~~~~~~~~~~v~---~t~~~~s~~~ 274 (274) T protein:vir:93 215 AILAKKGAVKLILKRDF--FLEVARDASTK----TTALYSDKHYVAYLYDESKAVK---ITKGSGSLEM 274 (274) T ss_pred EEEEeCCeEEEEecCCc--ccccccchhhc----ccEEEEEEEEEEEEEcCCceEE---EeeCccccCC Confidence 99999999999876432 22333433211 1222233444555433311000 0011111111 No 20 >protein:vir:80930 Length: 278 # NCBI annotation: Cps # Family: family:all:522 # MgeID: mge:1886 # MgeName: A500 # Cross-refs: genbank:acc:YP_001468392;genbank:gi:157324966;genbank:GeneID:5601363 Probab=99.87 E-value=6.8e-24 Score=147.61 Aligned_cols=259 Identities=14% Similarity=0.044 Sum_probs=154.0 Q ss_pred CC------ccccchhhhhhhhh-hhHHHHHHHhhhhcceeEecc-CcccCceeeeehhhhhcccccccccccCCCccccc Q lcl|NC_016566. 1 MS------LTVFQRKLVTAVTQ-MIPDNLNVFNAAANGAVVLGT-GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATA 72 (364) Q Consensus 1 fd------~~vfn~~~~~~~~e-~i~q~~~~fn~as~gAivl~~-~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~ 72 (364) |- ..+|.|+++..|+. .+. ...+| +..+++... +...||.+.+|+|+.|+.. +.+.. +...++ T Consensus 1 Ma~~~T~~~~~iiPev~s~~v~~~~~-~~~v~---~~~~~~~~~l~g~~G~tv~ip~~~~~g~a----~~~~~-g~~i~~ 71 (278) T protein:vir:80 1 MADLTTKLANLIDPEVMGPMISAKLP-KAIKF---GKIAPIDNSLEGQPGSEITVPKYKYIGDA----QDVAE-GAAIDY 71 (278) T ss_pred CCCcceehhheecHHHHHHHHHHHHH-Hhhhh---cccceecccccCCCCCEEEEeeeccCCcc----eeecC-CCcCcc Confidence 32 24589999888852 222 22222 233333222 2346999999999999653 22322 345688 Q ss_pred hhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccC Q lcl|NC_016566. 73 KVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVD 152 (364) Q Consensus 73 ~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~ 152 (364) .+++..+..+....++++ +..+..+-.-.+.|| +.+++++++.||.++.++.+++.|+|+....+.. .+.. T Consensus 72 ~~lt~~~~~~~i~~~~~a-~~v~D~~~~~~~~d~---~~~~~~~~a~~~a~~~d~~l~~~l~~a~~~~~~~---~t~~-- 142 (278) T protein:vir:80 72 SALETESVKHGIKKAGKG-VKLTDESVLSGYGDP---VEEAQKQIRMAIASKVDNDILEEALTTTLEVKGA---INIG-- 142 (278) T ss_pred cccccceeeEeeehhhcc-ccccHHHHhhccccH---HHHHHHHHHHHHHHHHHHHHHHHHhccccccccc---cccc-- Confidence 999999998877666654 233333333345677 4679999999999999999999999865332211 0100 Q ss_pred cccccccccHHHHHHHHHHhcccccC-eeEEEEchHHHHHHHHhhcccccc------cccccccceeecccCCcEEEEeC Q lcl|NC_016566. 153 GVGGRTFPTLADFPLAASKFGDQAAL-IKSWFMDGVTWANFIAYQALPSAE------QVFAIGDLQVMGDGLGRRFIISD 225 (364) Q Consensus 153 ~~~~~~~~s~~~l~~A~~~lGD~~~~-l~~ivMHS~v~~~L~k~~~it~~~------~~~~~~~~~~~~~~lGrrVIVDD 225 (364) .....+..|.++..+|++.... -..++|||.+|+.|+|...+.... .+.+.+.+.. ++|.+||+|| T Consensus 143 ----~~~~~~~~~~da~~~l~~~~~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~~g~~~~~~G~ig~---~~G~~Vi~s~ 215 (278) T protein:vir:80 143 ----LIDKIENTFTDAPDAIEDESITTTGVLFLNYKDTAKLREEAAGSWTKASQLGDDLLVKGAFGE---LLGWEIVRTK 215 (278) T ss_pred ----hhhhHHHHHHHHHHhhcccCCCcccEEEECHHHHHHHHhhhhhhccccccccccceeecccee---ecceeEEEcC Confidence 0112245678888888775433 345899999999998865442211 1112222333 4699999999 Q ss_pred CCCCCCCceEEEEEecceeEEecCCCCcceeeccCCCceeeeEEeeEEEEeeeeeeeecccc----cccccCCC Q lcl|NC_016566. 226 AAADAMGAGKMLGLVPGAVAVTTNGLDMLAQEKGGNENIERWWQGEFDFNVAVKGYRLKASA----RTPVEGVR 295 (364) Q Consensus 226 ~~p~~~~~Yttylfg~GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~----~~~~~gg~ 295 (364) .+| .|++|+|++|||++.+..+. ..|..|++... .+.-+..|.+|.+..... ++.. .|. T Consensus 216 ~~p----~~t~~l~~~gAi~~~~~~~~--~vE~~Rd~~~~----~d~i~~~~~yg~~v~~~~~~v~it~~-a~~ 278 (278) T protein:vir:80 216 KLA----DGNALAVKAGALKTFLKRNL--LAESGRDMDHK----LTKFNADQHYAVALVDETKAVKVVPV-AGN 278 (278) T ss_pred CCC----cceEEEEeccceeeeecCCc--ccccccchhhc----cceeeeeeEEEEEEEcCcceEEEeec-cCC Confidence 998 57999999999999876542 23444443321 122223333444443221 0000 111 No 21 >protein:vir:9820 Length: 272 # NCBI annotation: putative major capsid/head protein # Family: family:all:522 # MgeID: mge:176 # MgeName: 315.4 # Cross-refs: genbank:acc:NP_795582;genbank:gi:28876339;genbank:GeneID:1257858 Probab=99.84 E-value=6.4e-22 Score=136.79 Aligned_cols=256 Identities=14% Similarity=0.041 Sum_probs=154.2 Q ss_pred CC------ccccchhhhhhhh-hhhHHHHHHHhhhhcceeEec-cCcccCceeeeehhhhhcccccccccccCCCccccc Q lcl|NC_016566. 1 MS------LTVFQRKLVTAVT-QMIPDNLNVFNAAANGAVVLG-TGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATA 72 (364) Q Consensus 1 fd------~~vfn~~~~~~~~-e~i~q~~~~fn~as~gAivl~-~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~ 72 (364) |- ..++.|+.+..|+ |.+++.+ +| ++-+.+.. -+...|+.++.|+|..++.+. .++.+ ...++ T Consensus 1 MA~~~T~~~~~~iPev~s~~v~~~~~~~~-~~---~~~~~~~~~~~g~~G~tv~iP~~~~~~~a~----~v~eg-~~i~~ 71 (272) T protein:vir:98 1 MAVGTTKMAQMLDPEVLADMIDAEVGKAI-RF---APLAEVDTTLEGQPGTTLTVPKWDYIGDAE----DVAEG-EAIPM 71 (272) T ss_pred CCCccccchheechHHHHHHHHHHHHHHh-hh---hccccccccccCCCCCEEEEEEecCCCCcc----cccCC-Ccccc Confidence 22 2368888888885 4444332 22 33333321 134569999999998776532 22222 34567 Q ss_pred hhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccC Q lcl|NC_016566. 73 KVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVD 152 (364) Q Consensus 73 ~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~ 152 (364) .+++.++........+. -+.++.........|+ +..+.++++.+|.+..++.+++.+.++.... T Consensus 72 ~~~~~~~~~~~~~~~~~-~~~itd~~~~~s~~d~---~~~~~~~~~~~~a~~~d~~i~~~~~~a~~~~------------ 135 (272) T protein:vir:98 72 TQLGFKKTTMTIKKAGK-GVEITDEAILSGYGDP---VGQAAKQIVEAIDHKVDADVLDALSKSTQTV------------ 135 (272) T ss_pred cccccceEEEEeeeeee-eeeecHHHHhhccccH---HHHHHHHHHHHHHHHHHHHHHHHhccccccc------------ Confidence 78877766554322222 1345555544444554 5679999999999999999999887753111 Q ss_pred cccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccc--ccc-ccccceeecccCCcEEEEeCCCCC Q lcl|NC_016566. 153 GVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAE--QVF-AIGDLQVMGDGLGRRFIISDAAAD 229 (364) Q Consensus 153 ~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~--~~~-~~~~~~~~~~~lGrrVIVDD~~p~ 229 (364) ....++..+.+|.++|||+...+..|+|||.+|..|++.++..... +.. ....-..+..++|.+||+++.+| T Consensus 136 ----~~~~t~d~i~da~~~l~~~~~~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~~~~~~~g~ig~i~G~~Vi~s~~~p- 210 (272) T protein:vir:98 136 ----EATATVDGVSKALDIFNDEDDAETVIVMNPADASTLRLDAAKEWLGATEVGANRVVSGVYGEVLGVQIVRSRKCP- 210 (272) T ss_pred ----ccccCHHHHHHHHHHHhccCCCccEEEEcHHHHHHHHHhccccccccccccccccccccchhhcCeeEEEcCCCC- Confidence 1223566899999999999999999999999999998765442111 110 00001123345799999999998 Q ss_pred CCCceEEEEEecceeEEecCCCCcceeeccCCC-ceeeeEEeeEEEEee---eee-eeecccccccccCCCC Q lcl|NC_016566. 230 AMGAGKMLGLVPGAVAVTTNGLDMLAQEKGGNE-NIERWWQGEFDFNVA---VKG-YRLKASARTPVEGVRS 296 (364) Q Consensus 230 ~~~~Yttylfg~GAi~~~~~~~~~~~~~~~g~e-~~~~~~~~~~~f~lh---p~G-~sw~~~~~~~~~gg~S 296 (364) +|++|+|++||+++....... .|..++. .....++.+++|.+| |.+ ++++-+.- ++- T Consensus 211 ---~~t~~~~~~~a~~~~~~~~~~--ve~~r~~~~~~~~i~~~~~~~~~v~~~~~vv~~t~~~a-----~~~ 272 (272) T protein:vir:98 211 ---KGTAYMVRKGALRIMLKRNTM--VETDRDITKAINQIVANKHYGVYLYKAEKAVKITLKDA-----AKK 272 (272) T ss_pred ---cceEEEEcCCeEEEEecCCce--eeeccccccceeEEEEEEEEEEEEEcCCceEEEEeccc-----ccC Confidence 788999999999998764422 1222222 122334444455443 332 33333211 111 No 22 >protein:vir:3033 Length: 272 # NCBI annotation: major capsid protein # Family: family:all:522 # MgeID: mge:61 # MgeName: PhiNIH1.1 # Cross-refs: genbank:acc:NP_438146;genbank:gi:16271809;genbank:GeneID:929235 Probab=99.84 E-value=6.4e-22 Score=136.79 Aligned_cols=256 Identities=14% Similarity=0.041 Sum_probs=154.2 Q ss_pred CC------ccccchhhhhhhh-hhhHHHHHHHhhhhcceeEec-cCcccCceeeeehhhhhcccccccccccCCCccccc Q lcl|NC_016566. 1 MS------LTVFQRKLVTAVT-QMIPDNLNVFNAAANGAVVLG-TGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATA 72 (364) Q Consensus 1 fd------~~vfn~~~~~~~~-e~i~q~~~~fn~as~gAivl~-~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~ 72 (364) |- ..++.|+.+..|+ |.+++.+ +| ++-+.+.. -+...|+.++.|+|..++.+. .++.+ ...++ T Consensus 1 MA~~~T~~~~~~iPev~s~~v~~~~~~~~-~~---~~~~~~~~~~~g~~G~tv~iP~~~~~~~a~----~v~eg-~~i~~ 71 (272) T protein:vir:30 1 MAVGTTKMAQMLDPEVLADMIDAEVGKAI-RF---APLAEVDTTLEGQPGTTLTVPKWDYIGDAE----DVAEG-EAIPM 71 (272) T ss_pred CCCccccchheechHHHHHHHHHHHHHHh-hh---hccccccccccCCCCCEEEEEEecCCCCcc----cccCC-Ccccc Confidence 22 2368888888885 4444332 22 33333321 134569999999998776532 22222 34567 Q ss_pred hhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccC Q lcl|NC_016566. 73 KVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVD 152 (364) Q Consensus 73 ~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~ 152 (364) .+++.++........+. -+.++.........|+ +..+.++++.+|.+..++.+++.+.++.... T Consensus 72 ~~~~~~~~~~~~~~~~~-~~~itd~~~~~s~~d~---~~~~~~~~~~~~a~~~d~~i~~~~~~a~~~~------------ 135 (272) T protein:vir:30 72 TQLGFKKTTMTIKKAGK-GVEITDEAILSGYGDP---VGQAAKQIVEAIDHKVDADVLDALSKSTQTV------------ 135 (272) T ss_pred cccccceEEEEeeeeee-eeeecHHHHhhccccH---HHHHHHHHHHHHHHHHHHHHHHHhccccccc------------ Confidence 78877766554322222 1345555544444554 5679999999999999999999887753111 Q ss_pred cccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccc--ccc-ccccceeecccCCcEEEEeCCCCC Q lcl|NC_016566. 153 GVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAE--QVF-AIGDLQVMGDGLGRRFIISDAAAD 229 (364) Q Consensus 153 ~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~--~~~-~~~~~~~~~~~lGrrVIVDD~~p~ 229 (364) ....++..+.+|.++|||+...+..|+|||.+|..|++.++..... +.. ....-..+..++|.+||+++.+| T Consensus 136 ----~~~~t~d~i~da~~~l~~~~~~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~~~~~~~g~ig~i~G~~Vi~s~~~p- 210 (272) T protein:vir:30 136 ----EATATVDGVSKALDIFNDEDDAETVIVMNPADASTLRLDAAKEWLGATEVGANRVVSGVYGEVLGVQIVRSRKCP- 210 (272) T ss_pred ----ccccCHHHHHHHHHHHhccCCCccEEEEcHHHHHHHHHhccccccccccccccccccccchhhcCeeEEEcCCCC- Confidence 1223566899999999999999999999999999998765442111 110 00001123345799999999998 Q ss_pred CCCceEEEEEecceeEEecCCCCcceeeccCCC-ceeeeEEeeEEEEee---eee-eeecccccccccCCCC Q lcl|NC_016566. 230 AMGAGKMLGLVPGAVAVTTNGLDMLAQEKGGNE-NIERWWQGEFDFNVA---VKG-YRLKASARTPVEGVRS 296 (364) Q Consensus 230 ~~~~Yttylfg~GAi~~~~~~~~~~~~~~~g~e-~~~~~~~~~~~f~lh---p~G-~sw~~~~~~~~~gg~S 296 (364) +|++|+|++||+++....... .|..++. .....++.+++|.+| |.+ ++++-+.- ++- T Consensus 211 ---~~t~~~~~~~a~~~~~~~~~~--ve~~r~~~~~~~~i~~~~~~~~~v~~~~~vv~~t~~~a-----~~~ 272 (272) T protein:vir:30 211 ---KGTAYMVRKGALRIMLKRNTM--VETDRDITKAINQIVANKHYGVYLYKAEKAVKITLKDA-----AKK 272 (272) T ss_pred ---cceEEEEcCCeEEEEecCCce--eeeccccccceeEEEEEEEEEEEEEcCCceEEEEeccc-----ccC Confidence 788999999999998764422 1222222 122334444455443 332 33333211 111 No 23 >protein:vir:739 Length: 231 # NCBI annotation: major structural protein 4 # Family: family:all:522 # MgeID: mge:14 # MgeName: Tuc2009 # Cross-refs: genbank:acc:NP_108716;genbank:gi:13487838;genbank:GeneID:920884 Probab=99.67 E-value=4.8e-18 Score=115.56 Aligned_cols=219 Identities=13% Similarity=0.003 Sum_probs=145.9 Q ss_pred ccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHH Q lcl|NC_016566. 36 GTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAA 115 (364) Q Consensus 36 ~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~ 115 (364) .|+..-||.+..|.| ||.+..- +-+.+.++.+|++.+..+.....+.+ +.++..+..-..-||+ .++.+ T Consensus 1 ~~~~~~Gdtit~P~~--iGda~~v-----~eG~~i~~~~l~~t~~~atIk~~gk~-~~itD~a~l~~~gDp~---~ea~~ 69 (231) T protein:vir:73 1 ENGINLANLCEYPND--IGDAADV-----AEGGEISLDKIGTTTKSVTIKKAAKG-TEITDEAALSGYGDPI---GESNK 69 (231) T ss_pred CccccCCceEEeccc--ccchhhh-----cCCCcCChhhccccceeeeEeeeccc-eeeeHHHHhhccCchH---HHHHH Confidence 688889999999987 8765322 22456688899888888765444444 3344433332445774 45889 Q ss_pred HHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHh Q lcl|NC_016566. 116 QATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAY 195 (364) Q Consensus 116 ~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~ 195 (364) |++.+..+...+.++++|.++.-.. .+.+++..+.+|.++|||..+....++|||..|++|+| T Consensus 70 Q~~~~iA~kvD~di~~~~~~a~l~~----------------~~~~t~d~i~~A~~~fgde~~~~~vivv~p~~~~~Lrk- 132 (231) T protein:vir:73 70 QLGLSLANKVDDDLLKAAKTTSQTV----------------STKANVDGVQAALDIFNDEDAQAYVLIVNPKDAAKIRK- 132 (231) T ss_pred HHHHHHHHhhhHHHHHhhccccccc----------------cccccHHHHHHHHHHhccccccceEEEEcchHHHhhhh- Confidence 9999999999999998888643111 13467788999999999999999999999999999976 Q ss_pred hcccccccccccc-----cceeecccCCcEEEEeCCCCCCCCceEEEEEecceeEEecCCCCcceeeccCCCc-eeeeEE Q lcl|NC_016566. 196 QALPSAEQVFAIG-----DLQVMGDGLGRRFIISDAAADAMGAGKMLGLVPGAVAVTTNGLDMLAQEKGGNEN-IERWWQ 269 (364) Q Consensus 196 ~~it~~~~~~~~~-----~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~GAi~~~~~~~~~~~~~~~g~e~-~~~~~~ 269 (364) .. +......+. .-..+..++|.+||++|.+|...+.+..|+..+||+.+....-. ..|..|+.. ....+. T Consensus 133 ~~--~~~~~~~~~g~~i~~~G~iG~i~G~~Vi~S~~~~~~~~~~~~~i~~~gAl~~~~k~~~--~vEtdRd~~~k~~~i~ 208 (231) T protein:vir:73 133 DA--NAKNIGSEVGANALINGTYADVLGAQIVRSKKLAEGSALMFKIVSNSPALKLVLKRGV--QVETDRDIVTKTTVIT 208 (231) T ss_pred cc--chhhhhhhhccceeeecccceEcceEEEEcCCCCCCceeeeeEEeeccceeeeecccc--eeeccccccccccEEE Confidence 11 111111111 11123344799999999999988889999999999999875321 122334332 223455 Q ss_pred eeEEEEeeee---ee---eecccccccccCCC Q lcl|NC_016566. 270 GEFDFNVAVK---GY---RLKASARTPVEGVR 295 (364) Q Consensus 270 ~~~~f~lhp~---G~---sw~~~~~~~~~gg~ 295 (364) +++||.+|.. |+ +|+ |. T Consensus 209 ~~~~y~v~l~~~~~vv~~t~~---------g~ 231 (231) T protein:vir:73 209 ADEHYAAYLYDLTKVVNITFT---------GV 231 (231) T ss_pred EeEEEEEEEEcCccEEEEEee---------cC Confidence 6678765532 21 111 11 No 24 >protein:vir:104256 Length: 458 # NCBI annotation: major head protein precursor # Family: family:all:27070 # MgeID: mge:1504 # MgeName: T5 # Cross-refs: genbank:acc:YP_006977;genbank:gi:46401878;genbank:GeneID:2777673 Probab=98.49 E-value=1.9e-07 Score=57.42 Aligned_cols=269 Identities=12% Similarity=0.037 Sum_probs=111.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCcccc----chhhh Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPAT----AKVLA 76 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T----~~kit 76 (364) --..+..+++...+++.+.+..-..+-+. ..++.|.....+-......+ .-+.......-..+ ..++. T Consensus 169 ~g~~~ip~~~~~~ii~~~~~~~~l~~~~~-------~~~~~~~~~~~~~~~~~~~a-~~v~e~~~~~~~~~~~~~~~~~~ 240 (458) T protein:vir:10 169 VSSESYETIFSQRIIRDLQKELVVGALFE-------ELPMSSKILTMLVEPDAGKA-TWVAASTYGTDTTTGEEVKGALK 240 (458) T ss_pred cccceehhhHhHHHHHHHHhhhhHHhhcc-------eeecCCcceEEEEecCCcce-eecccccccccccccccccccce Confidence 01112333344444444444322222211 11222221122211111100 00000000000000 00122 Q ss_pred ccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhhhhcccccceeeccccc Q lcl|NC_016566. 77 RMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA-----GKAAIESNAAANYTQPARV 151 (364) Q Consensus 77 ~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~-----L~Gv~~~na~~v~dis~~t 151 (364) +-.-...|++. -+.++...+. ..+..+..-|.+++++...+...+.+|.- ..|++............. T Consensus 241 ~i~~~~~k~~~---~v~is~ell~---ds~~~~~~~i~~~l~~~i~~~~d~~~l~G~G~~~p~Gi~~~~~~~~~~~~~~- 313 (458) T protein:vir:10 241 EIHFSTYKLAA---KSFITDETEE---DAIFSLLPLLRKRLIEAHAVSIEEAFMTGDGSGKPKGLLTLASEDSAKVVTE- 313 (458) T ss_pred eeEeeeeeEEe---eehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHHHhhcCCCCCccceeeecccccccceeec- Confidence 22222223332 2335554332 22234455677888888777666554421 112221111000000000 Q ss_pred CcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--cccc----c-cceeecccCCcEEEEe Q lcl|NC_016566. 152 DGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAI----G-DLQVMGDGLGRRFIIS 224 (364) Q Consensus 152 ~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~----~-~~~~~~~~lGrrVIVD 224 (364) .........++.++.++..++..+...-..|+||...|..|.+ +.+..+ +... . ........+|+||+++ T Consensus 314 ~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~l~~---lkd~~G~~i~~~~~~~~~~~~~~~~l~G~pv~~~ 390 (458) T protein:vir:10 314 AKADGSVLVTAKTISKLRRKLGRHGLKLSKLVLIVSMDAYYDL---LEDEEWQDVAQVGNDSVKLQGQVGRIYGLPVVVS 390 (458) T ss_pred ccccccccccHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHh---hcccCCceeeccccccccccCcCceecceeeEEc Confidence 0111233457788999999998777777889999999988854 433332 2111 1 1111113469999999 Q ss_pred CCCCCCCCceEE--EEEecceeEEecCCCCcceeeccCCCceeeeEEeeEEEE---eeeeeeeecccccccccCCCC Q lcl|NC_016566. 225 DAAADAMGAGKM--LGLVPGAVAVTTNGLDMLAQEKGGNENIERWWQGEFDFN---VAVKGYRLKASARTPVEGVRS 296 (364) Q Consensus 225 D~~p~~~~~Ytt--ylfg~GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f~---lhp~G~sw~~~~~~~~~gg~S 296 (364) |.||...+.... ..|+.+.+.....+..+.++...... ...+..+..+. ++|-||--. +.+. | T Consensus 391 ~~~p~~~~~~~~~~~~f~~~~~~~~~~~~~v~~d~~~~~~--~~~~~~~~r~~~~v~~~~a~v~~--~~aa-----~ 458 (458) T protein:vir:10 391 EYFPAKANSAEFAVIVYKDNFVMPRQRAVTVERERQAGKQ--RDAYYVTQRVNLQRYFANGVVSG--TYAA-----S 458 (458) T ss_pred cccccccCCcceEEEEecccEEEEEeeceEEEeecccCCC--ceEEEEEEEecceEecccceEEE--eecc-----C Confidence 999975543332 23455554444444443333332222 12233333332 677776432 2111 1 No 25 >protein:vir:79987 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:1875 # MgeName: tp310-3 # Cross-refs: genbank:acc:YP_001430002;genbank:gi:156604057;genbank:GeneID:5525447 Probab=98.25 E-value=1.4e-06 Score=52.73 Aligned_cols=281 Identities=9% Similarity=-0.028 Sum_probs=111.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) ---.+-.+++...+++.+.+...+.+-. .-.++.+.-...+.....++.. ...-+.+... ......+... T Consensus 128 ~gg~~iP~~~~~~ii~~~~~~~~l~~~~-------~~~~~~~~~~~~~~~~~~~~~~--~~~v~E~~~~-~~~~~~~~~~ 197 (415) T protein:vir:79 128 SGFVVIPEEIVTDILKLKEVEFNLDKYV-------TVKRVTNGSGKYPVVRQSEVAA--LEKVEELEEN-PELAVKPFFQ 197 (415) T ss_pred ccccccchHHHHHHHHHHHhhhhhhhhe-------eeeeccCCceeEEEEeecCCcc--ceeecccccc-Ccccccceee Confidence 0001122223333333333322221110 0011111111111111111110 0000011100 0011112222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFP 160 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~ 160 (364) +-....+-.+-+.++...+. ..+..+..-|.+++++...+-..+.+|..+.. +++.................... T Consensus 198 v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~--g~~~~~~~~~~~~~~~~~~~~~~ 272 (415) T protein:vir:79 198 LAYDINTHRGYFRISREAIE---DAKVNVLQELKLWMARTIAATRNKAIIDVITK--GSTGSTSSGFEKEGKKLEVKKAK 272 (415) T ss_pred EEeeeeeeEeeehhhHHHHh---hchHHHHHHHHHHHHHHHHHHHHHHHhhcccc--Ccccccccccccccccccccccc Confidence 22222221222345555443 22223445578888888877666655543311 11110000000011111233446 Q ss_pred cHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccCCcEEEEeCCCCCCCCceEEE Q lcl|NC_016566. 161 TLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGLGRRFIISDAAADAMGAGKML 237 (364) Q Consensus 161 s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~lGrrVIVDD~~p~~~~~Ytty 237 (364) ++.++.++..++.+....-..|+||+..+..|.+ +.+..+ ++... .-......+|++|++.|.+|.....-.+. T Consensus 273 ~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~~l~G~pV~~~~~~~~~~~~~~~~ 349 (415) T protein:vir:79 273 SLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDK---MKDKLGNYLIQPDVKEKTQQRLLGAKIEILPDEVLGQKGNNTL 349 (415) T ss_pred chhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHH---hhccCCceeeccCcCCCCCceecceeeEEecccccCCCCccEE Confidence 6778999999998888788899999999998864 333332 22211 00111123599999999998644322346 Q ss_pred EEec--ceeEEe-cCCCCcceeeccCC-CceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcC Q lcl|NC_016566. 238 GLVP--GAVAVT-TNGLDMLAQEKGGN-ENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITD 304 (364) Q Consensus 238 lfg~--GAi~~~-~~~~~~~~~~~~g~-e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat 304 (364) +||. .++.+. ..++.+........ ..+...++.+. -.+||..|..-.-+-+... |-+--|+. T Consensus 350 ~~Gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~-~v~~~~a~~~~~~~~~~~~----~~~~~~~~ 415 (415) T protein:vir:79 350 IIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDC-RILDYKSAIVIEYDDSERG----EGDLGLEA 415 (415) T ss_pred EEEehhccEEEEeecceEEEEeccccCceEEEEEEEecc-EEeccccEEEEEEeccCCC----CCccccCC Confidence 7773 344333 23333322221111 11111111221 1267777755322211111 22222222 No 26 >protein:vir:81100 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:1891 # MgeName: tp310-1 # Cross-refs: genbank:acc:YP_001429874;genbank:gi:156603927;genbank:GeneID:5525320 Probab=98.25 E-value=1.4e-06 Score=52.73 Aligned_cols=281 Identities=9% Similarity=-0.028 Sum_probs=111.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) ---.+-.+++...+++.+.+...+.+-. .-.++.+.-...+.....++.. ...-+.+... ......+... T Consensus 128 ~gg~~iP~~~~~~ii~~~~~~~~l~~~~-------~~~~~~~~~~~~~~~~~~~~~~--~~~v~E~~~~-~~~~~~~~~~ 197 (415) T protein:vir:81 128 SGFVVIPEEIVTDILKLKEVEFNLDKYV-------TVKRVTNGSGKYPVVRQSEVAA--LEKVEELEEN-PELAVKPFFQ 197 (415) T ss_pred ccccccchHHHHHHHHHHHhhhhhhhhe-------eeeeccCCceeEEEEeecCCcc--ceeecccccc-Ccccccceee Confidence 0001122223333333333322221110 0011111111111111111110 0000011100 0011112222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFP 160 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~ 160 (364) +-....+-.+-+.++...+. ..+..+..-|.+++++...+-..+.+|..+.. +++.................... T Consensus 198 v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~--g~~~~~~~~~~~~~~~~~~~~~~ 272 (415) T protein:vir:81 198 LAYDINTHRGYFRISREAIE---DAKVNVLQELKLWMARTIAATRNKAIIDVITK--GSTGSTSSGFEKEGKKLEVKKAK 272 (415) T ss_pred EEeeeeeeEeeehhhHHHHh---hchHHHHHHHHHHHHHHHHHHHHHHHhhcccc--Ccccccccccccccccccccccc Confidence 22222221222345555443 22223445578888888877666655543311 11110000000011111233446 Q ss_pred cHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccCCcEEEEeCCCCCCCCceEEE Q lcl|NC_016566. 161 TLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGLGRRFIISDAAADAMGAGKML 237 (364) Q Consensus 161 s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~lGrrVIVDD~~p~~~~~Ytty 237 (364) ++.++.++..++.+....-..|+||+..+..|.+ +.+..+ ++... .-......+|++|++.|.+|.....-.+. T Consensus 273 ~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~~l~G~pV~~~~~~~~~~~~~~~~ 349 (415) T protein:vir:81 273 SLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDK---MKDKLGNYLIQPDVKEKTQQRLLGAKIEILPDEVLGQKGNNTL 349 (415) T ss_pred chhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHH---hhccCCceeeccCcCCCCCceecceeeEEecccccCCCCccEE Confidence 6778999999998888788899999999998864 333332 22211 00111123599999999998644322346 Q ss_pred EEec--ceeEEe-cCCCCcceeeccCC-CceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcC Q lcl|NC_016566. 238 GLVP--GAVAVT-TNGLDMLAQEKGGN-ENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITD 304 (364) Q Consensus 238 lfg~--GAi~~~-~~~~~~~~~~~~g~-e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat 304 (364) +||. .++.+. ..++.+........ ..+...++.+. -.+||..|..-.-+-+... |-+--|+. T Consensus 350 ~~Gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~-~v~~~~a~~~~~~~~~~~~----~~~~~~~~ 415 (415) T protein:vir:81 350 IIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDC-RILDYKSAIVIEYDDSERG----EGDLGLEA 415 (415) T ss_pred EEEehhccEEEEeecceEEEEeccccCceEEEEEEEecc-EEeccccEEEEEEeccCCC----CCccccCC Confidence 7773 344333 23333322221111 11111111221 1267777755322211111 22222222 No 27 >protein:vir:98339 Length: 415 # NCBI annotation: putative capsid protein # Family: family:all:21 # MgeID: mge:1581 # MgeName: phiPVL(108) # Cross-refs: genbank:acc:YP_918931;genbank:gi:119443693;genbank:GeneID:4594501 Probab=98.25 E-value=1.4e-06 Score=52.73 Aligned_cols=281 Identities=9% Similarity=-0.028 Sum_probs=111.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) ---.+-.+++...+++.+.+...+.+-. .-.++.+.-...+.....++.. ...-+.+... ......+... T Consensus 128 ~gg~~iP~~~~~~ii~~~~~~~~l~~~~-------~~~~~~~~~~~~~~~~~~~~~~--~~~v~E~~~~-~~~~~~~~~~ 197 (415) T protein:vir:98 128 SGFVVIPEEIVTDILKLKEVEFNLDKYV-------TVKRVTNGSGKYPVVRQSEVAA--LEKVEELEEN-PELAVKPFFQ 197 (415) T ss_pred ccccccchHHHHHHHHHHHhhhhhhhhe-------eeeeccCCceeEEEEeecCCcc--ceeecccccc-Ccccccceee Confidence 0001122223333333333322221110 0011111111111111111110 0000011100 0011112222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFP 160 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~ 160 (364) +-....+-.+-+.++...+. ..+..+..-|.+++++...+-..+.+|..+.. +++.................... T Consensus 198 v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~--g~~~~~~~~~~~~~~~~~~~~~~ 272 (415) T protein:vir:98 198 LAYDINTHRGYFRISREAIE---DAKVNVLQELKLWMARTIAATRNKAIIDVITK--GSTGSTSSGFEKEGKKLEVKKAK 272 (415) T ss_pred EEeeeeeeEeeehhhHHHHh---hchHHHHHHHHHHHHHHHHHHHHHHHhhcccc--Ccccccccccccccccccccccc Confidence 22222221222345555443 22223445578888888877666655543311 11110000000011111233446 Q ss_pred cHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccCCcEEEEeCCCCCCCCceEEE Q lcl|NC_016566. 161 TLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGLGRRFIISDAAADAMGAGKML 237 (364) Q Consensus 161 s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~lGrrVIVDD~~p~~~~~Ytty 237 (364) ++.++.++..++.+....-..|+||+..+..|.+ +.+..+ ++... .-......+|++|++.|.+|.....-.+. T Consensus 273 ~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~~l~G~pV~~~~~~~~~~~~~~~~ 349 (415) T protein:vir:98 273 SLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDK---MKDKLGNYLIQPDVKEKTQQRLLGAKIEILPDEVLGQKGNNTL 349 (415) T ss_pred chhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHH---hhccCCceeeccCcCCCCCceecceeeEEecccccCCCCccEE Confidence 6778999999998888788899999999998864 333332 22211 00111123599999999998644322346 Q ss_pred EEec--ceeEEe-cCCCCcceeeccCC-CceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcC Q lcl|NC_016566. 238 GLVP--GAVAVT-TNGLDMLAQEKGGN-ENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITD 304 (364) Q Consensus 238 lfg~--GAi~~~-~~~~~~~~~~~~g~-e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat 304 (364) +||. .++.+. ..++.+........ ..+...++.+. -.+||..|..-.-+-+... |-+--|+. T Consensus 350 ~~Gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~-~v~~~~a~~~~~~~~~~~~----~~~~~~~~ 415 (415) T protein:vir:98 350 IIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDC-RILDYKSAIVIEYDDSERG----EGDLGLEA 415 (415) T ss_pred EEEehhccEEEEeecceEEEEeccccCceEEEEEEEecc-EEeccccEEEEEEeccCCC----CCccccCC Confidence 7773 344333 23333322221111 11111111221 1267777755322211111 22222222 No 28 >protein:vir:9410 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:167 # MgeName: phi 13 # Cross-refs: genbank:acc:NP_803388;genbank:gi:29028700;genbank:GeneID:1258136 Probab=98.05 E-value=6e-06 Score=49.19 Aligned_cols=281 Identities=9% Similarity=-0.015 Sum_probs=109.0 Q ss_pred CCc-------------------------------cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeeh Q lcl|NC_016566. 1 MSL-------------------------------TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMS 49 (364) Q Consensus 1 fd~-------------------------------~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~ 49 (364) ... ..-.+++...+++.+.+.....+-.. .+-+. ...+.+ ..+- T Consensus 97 ~~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~--~~~~~--~~~~~~-~~~~ 171 (415) T protein:vir:94 97 QNTKVTSQEVRDFTEYLETRNDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVT--VKRVT--NGSGKY-PVVR 171 (415) T ss_pred hhhhhhHHHHHHHHHHhhhhhhhhhhccccccccccCcHHHHHHHHHHHHhhhhhhhhcc--eeecc--CCceeE-EEEe Confidence 000 01112223333333333222222110 00000 011111 1111 Q ss_pred hhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 50 VGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAG 129 (364) Q Consensus 50 f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~l 129 (364) +.....+ ..-+.+... +.........+-.+..+-.+-+.++...+. ..+..+..-|.+++++...+-..+.+ T Consensus 172 ~~~~~~~----~~v~Eg~~~-~~~~~~~~~~i~~~~~k~~~~~~is~ell~---ds~~~~~~~i~~~l~~~~~~~~~~~i 243 (415) T protein:vir:94 172 QSEVAAL----EKVEELEEN-PELAVKPFFQLAYDINTHRGYFRISREAIE---DAKVNVLQELKLWMARTIAATRNKAI 243 (415) T ss_pred ecCCccc----eeccccccc-cccccccceeeEeeheeeeeechhhHHHHh---hchHHHHHHHHHHHHHHHHHHHHHHH Confidence 1111110 000011100 001111222333322222222345555443 22333445577888888777666655 Q ss_pred HHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--cccc Q lcl|NC_016566. 130 IGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAI 207 (364) Q Consensus 130 la~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~ 207 (364) |..+.. ++.....................++.++.++..++.+..-.-..|+||+..|..|.+ +.++.+ ++.. T Consensus 244 l~g~g~--g~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~---lkd~~G~~l~~~ 318 (415) T protein:vir:94 244 IDVITK--GSTGSTSSGFEKEGKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDK---MKDKLGNYLIQP 318 (415) T ss_pred hhcccc--CccccccccccccccccccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHH---hhccCCCeeecc Confidence 543221 000000000000111112223456778889998988877778899999999998864 433333 2211 Q ss_pred ccc-eeecccCCcEEEEeCCCCCCCCceEEEEEec--ceeEE-ecCCCCcceeeccC-CCceeeeEEeeEEEEeeeeeee Q lcl|NC_016566. 208 GDL-QVMGDGLGRRFIISDAAADAMGAGKMLGLVP--GAVAV-TTNGLDMLAQEKGG-NENIERWWQGEFDFNVAVKGYR 282 (364) Q Consensus 208 ~~~-~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~--GAi~~-~~~~~~~~~~~~~g-~e~~~~~~~~~~~f~lhp~G~s 282 (364) .-. ......+|++|++++.+|.....-..++||. -++.+ ...++.+....... ...+....+.+. -.+||..|. T Consensus 319 ~~~~~~~~~l~G~pV~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~r~~~r~d~-~~~~~~a~~ 397 (415) T protein:vir:94 319 DVKEKTQQRLLGAKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDC-RILDYKSAI 397 (415) T ss_pred CcCCCCCceecceeeEEecccccCCCCccEEEEEehhccEEEEeecceEEEEeccccCceEEEEEEEecc-EEeccccEE Confidence 100 0111235999999999986543334466663 33333 22333322221111 111111111111 125677666 Q ss_pred ecccccccccCCCCcChhhhcC Q lcl|NC_016566. 283 LKASARTPVEGVRSFKLSDITD 304 (364) Q Consensus 283 w~~~~~~~~~gg~SPT~aeLat 304 (364) .-.-+.+... |-+--|+. T Consensus 398 ~~~~~~~~~~----~~~~~~~~ 415 (415) T protein:vir:94 398 VIEYDDSERG----EGDLGLEA 415 (415) T ss_pred EEEEeccCCC----CCccccCC Confidence 6332211111 11222222 No 29 >protein:vir:1383 Length: 421 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:314 # MgeName: phi3626 # Cross-refs: genbank:acc:NP_612835;genbank:gi:20065969;genbank:GeneID:935826 Probab=98.04 E-value=4.3e-06 Score=49.99 Aligned_cols=294 Identities=9% Similarity=-0.015 Sum_probs=105.7 Q ss_pred CCcccc--------------------------chhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhh- Q lcl|NC_016566. 1 MSLTVF--------------------------QRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLI- 53 (364) Q Consensus 1 fd~~vf--------------------------n~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i- 53 (364) +..+-| .+++...+++.+.+....++-.. ..++.+.-...|.+... T Consensus 95 ~~~~~~~~~~~~~~~~~~~ra~~t~~~gg~liP~~~~~~Ii~~~~~~~~l~~l~~-------~~~~~~~~~~~~~~~~~~ 167 (421) T protein:vir:13 95 LQLSAMSKTIRGIQLSEEERDIMSSTNNGAVIPQEFVNEFEKLKEGYPSLKEHCH-------VIPVNRNAGKMPVRAGAS 167 (421) T ss_pred HHHHHHHHhhhccchhHHHhhccccCCcceecchhhHHHHHHHHHhhhhhhhhce-------eeeccCCceEEEEeecCC Confidence 111111 11222222333322221111110 01111111112211111 Q ss_pred -cccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 54 -ANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA 132 (364) Q Consensus 54 -~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~ 132 (364) .+.. ..+.... . +..++ +...+-.+.+.-.+-+.++...+. ..+..+..-|.+++++...+-.-...+.. T Consensus 168 ~~~~~-~~~E~~~--~--~~s~~-~f~~i~~~~~k~~~~v~iS~ell~---ds~~~l~~~i~~~la~~~~~~~~~~i~~~ 238 (421) T protein:vir:13 168 VDKLA-NLAKDTE--L--VKAML-KTQPMAYDIDDYGLLAPIDNSLLE---DSEINFLEFVNEEFAEFAVNTENAEIVKQ 238 (421) T ss_pred cccee-ecccccc--c--ccccc-ceeEEEeeeeeeEeehhhhHHHHh---hhHHHHHHHHHHHHHHHHHHHhhhhHhhh Confidence 0000 0000000 0 00111 111222222221222345555443 22223444566777766665554555555 Q ss_pred HhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccc Q lcl|NC_016566. 133 GKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDL 210 (364) Q Consensus 133 L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~ 210 (364) +.|++... ...++.++.+++.++-.+...-..|+||+..|..|.+ +.++.+ ++....- T Consensus 239 ~~g~~~~~-----------------~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~l~~---lkd~~G~~i~~~~~~ 298 (421) T protein:vir:13 239 AKAVLAEE-----------------TINDYAGLVKTINSLVPNARKRAIIVTNSDGRAYLDG---LMDKQGRPLLKELSD 298 (421) T ss_pred hhhccccc-----------------cccchHHHHHHHHHhhhhhcCCCEEEEcHHHHHHHHH---hhcCCCceeecCcCC Confidence 55544221 1224557888888887777777899999999999854 333332 3332111 Q ss_pred eeecccCCcEEEEeCCCCCCCCceEEEEEec-c-eeEEe-cCCCCcceeeccCCCceeeeEEeeEEEEeeeeeeeecccc Q lcl|NC_016566. 211 QVMGDGLGRRFIISDAAADAMGAGKMLGLVP-G-AVAVT-TNGLDMLAQEKGGNENIERWWQGEFDFNVAVKGYRLKASA 287 (364) Q Consensus 211 ~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~-G-Ai~~~-~~~~~~~~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~ 287 (364) ......+|++|+++|.+|...+.-...+||. . ++.+. .+++.+.......=+.-.+.++....|. |--|+..+ T Consensus 299 ~~~~tl~G~pV~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~~v~~~~~~~f~~~~~~~r~~~r~d----~~~~~~~a 374 (421) T protein:vir:13 299 GGDLVFKGRPVIELEESIFDVGDETKFIVSDFKTLIKFMDRKQYLIDQSKEAGYTKNETIARIIERFD----VNSPLDKS 374 (421) T ss_pred CCCceecceeeEEeccccccCCCceEEEEEeccccEEEEEecceEEEeecccccccCeeEEEEEeeec----ceeecchh Confidence 1111246999999999986544334455554 1 23333 2333332222222122222333322221 11111111 Q ss_pred cccccCCCCcChhhhcCCccceeecCcCcCcceEEEEecCccccccccccccccccccccc Q lcl|NC_016566. 288 RTPVEGVRSFKLSDITDKANWELDQGQVDNAPATVQDVGSDSDTKGRRRTQTAQAVPTRNI 348 (364) Q Consensus 288 ~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pgv~~~~~~~~~~~~~~~~~~~~~~~~~~~ 348 (364) ...-....|.-.+...+..++ -...+...|+|-++-|-.+-+ -+.|- T Consensus 375 ------------~~~~~~~~~~a~v~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~-~~~~~ 421 (421) T protein:vir:13 375 ------------SDAEKIRKFGVIVKLQEVLKS-SPRSGKNKNESKEEIKEEGEA-TQQNE 421 (421) T ss_pred ------------hheeeecccceeeccccccCC-CCcCCCCccccchheeecccc-ccCCC Confidence 000111111111111111111 011111122222222222222 11111 No 30 >protein:vir:4600 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:101 # MgeName: PVL # Cross-refs: genbank:acc:NP_058445;genbank:gi:9635171;genbank:GeneID:1262708 Probab=98.03 E-value=6.5e-06 Score=49.01 Aligned_cols=279 Identities=9% Similarity=-0.030 Sum_probs=108.7 Q ss_pred CC------------------c-cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccc Q lcl|NC_016566. 1 MS------------------L-TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRN 61 (364) Q Consensus 1 fd------------------~-~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d 61 (364) |. - .+-.+++...+++.+.+......-. ..+-..+ ..+++ |.....++.. .. T Consensus 109 ~~~~~~~~~~~~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~~--~~~~~~~--~~~~~---~~~~~~~~~~--~~ 179 (415) T protein:vir:46 109 FTEYLETRNDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYV--TVKRVTN--GSGKY---PVVRQSEVAA--LE 179 (415) T ss_pred HHHHHhhhhhhhhccccccCCcccccHHHHHHHHHHHHhhhhhhhhc--ceeeccC--CceeE---EEEEecCCcc--ee Confidence 00 0 0111222333333333322221110 0000000 11221 1111111100 00 Q ss_pred cccCCCccccch-hhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccc Q lcl|NC_016566. 62 AYAPVGTPATAK-VLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESN 140 (364) Q Consensus 62 ~~~~~~~~~T~~-kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~n 140 (364) .-+.+.. .|. .......+-....+-.+-+.++...+. ..+..+..-|.+++++...+-..+.+|..+.. +.+ T Consensus 180 ~v~Eg~~--~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~~i~~~~d~~il~g~g~--g~~ 252 (415) T protein:vir:46 180 KVEELEE--NPELAVKPFFQLAYDINTHRGYFRISREAIE---DAKVNVLQELKLWMARTIAATRNKAIIDVITK--GST 252 (415) T ss_pred ecccccc--cccccccceeeEEeeeeeeEeeehhhHHHHh---hchHHHHHHHHHHHHHHHHHHHHHHHhhcccc--CCc Confidence 0001110 010 111222222222222222345554443 22233445577888888877766655543311 001 Q ss_pred ccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccC Q lcl|NC_016566. 141 AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGL 217 (364) Q Consensus 141 a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~l 217 (364) .................+..++.++.++..++-+....-..|+||+..|..|.+ +.++.+ ++... .-......+ T Consensus 253 ~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~---lkd~~G~~i~~~~~~~~~~~~l~ 329 (415) T protein:vir:46 253 GSTSSGFEKEGKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDK---MKDKLGNYLIQPDVKEKTQQRLL 329 (415) T ss_pred cccccccccccceeccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHH---hhccCCCeeeccCcCCCCCcccc Confidence 000000001111112334566778889999988887778899999999998854 433332 22111 001111236 Q ss_pred CcEEEEeCCCCCCCCceEEEEEec--ceeEEe-cCCCCcceee-ccCCCceeeeEEeeEEEEeeeeeeeecc-ccccccc Q lcl|NC_016566. 218 GRRFIISDAAADAMGAGKMLGLVP--GAVAVT-TNGLDMLAQE-KGGNENIERWWQGEFDFNVAVKGYRLKA-SARTPVE 292 (364) Q Consensus 218 GrrVIVDD~~p~~~~~Yttylfg~--GAi~~~-~~~~~~~~~~-~~g~e~~~~~~~~~~~f~lhp~G~sw~~-~~~~~~~ 292 (364) |++|++.|.+|.....-.+++||. -++.+. ..+..+.... ......+...++.+. -.+||..|..-. ++.+.. T Consensus 330 G~pV~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~-~v~~~~a~~~~~~~~~~~~- 407 (415) T protein:vir:46 330 GAKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDC-RILDYKSAIVIEYDDSERG- 407 (415) T ss_pred ceeeEEeccccccCCCccEEEEEehhccEEEEeecceEEEeeccccCceEEEEEEEecc-EEeccccEEEEEeeccCCC- Confidence 999999999996443223467763 233332 2333222221 111111111111111 126666664422 111111 Q ss_pred CCCCcChhhhcC Q lcl|NC_016566. 293 GVRSFKLSDITD 304 (364) Q Consensus 293 gg~SPT~aeLat 304 (364) |-+--|+. T Consensus 408 ----~~~~~~~~ 415 (415) T protein:vir:46 408 ----EGDLGLEA 415 (415) T ss_pred ----CCCccCCC Confidence 21222222 No 31 >protein:vir:4700 Length: 415 # NCBI annotation: phi PVL ORF 7 homologue # Family: family:all:21 # MgeID: mge:102 # MgeName: phiPV83 # Cross-refs: genbank:acc:NP_061632;genbank:gi:9635719;genbank:GeneID:1262976 Probab=98.03 E-value=6.5e-06 Score=49.01 Aligned_cols=279 Identities=9% Similarity=-0.030 Sum_probs=108.7 Q ss_pred CC------------------c-cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccc Q lcl|NC_016566. 1 MS------------------L-TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRN 61 (364) Q Consensus 1 fd------------------~-~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d 61 (364) |. - .+-.+++...+++.+.+......-. ..+-..+ ..+++ |.....++.. .. T Consensus 109 ~~~~~~~~~~~~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~~--~~~~~~~--~~~~~---~~~~~~~~~~--~~ 179 (415) T protein:vir:47 109 FTEYLETRNDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYV--TVKRVTN--GSGKY---PVVRQSEVAA--LE 179 (415) T ss_pred HHHHHhhhhhhhhccccccCCcccccHHHHHHHHHHHHhhhhhhhhc--ceeeccC--CceeE---EEEEecCCcc--ee Confidence 00 0 0111222333333333322221110 0000000 11221 1111111100 00 Q ss_pred cccCCCccccch-hhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccc Q lcl|NC_016566. 62 AYAPVGTPATAK-VLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESN 140 (364) Q Consensus 62 ~~~~~~~~~T~~-kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~n 140 (364) .-+.+.. .|. .......+-....+-.+-+.++...+. ..+..+..-|.+++++...+-..+.+|..+.. +.+ T Consensus 180 ~v~Eg~~--~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~~i~~~~d~~il~g~g~--g~~ 252 (415) T protein:vir:47 180 KVEELEE--NPELAVKPFFQLAYDINTHRGYFRISREAIE---DAKVNVLQELKLWMARTIAATRNKAIIDVITK--GST 252 (415) T ss_pred ecccccc--cccccccceeeEEeeeeeeEeeehhhHHHHh---hchHHHHHHHHHHHHHHHHHHHHHHHhhcccc--CCc Confidence 0001110 010 111222222222222222345554443 22233445577888888877766655543311 001 Q ss_pred ccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccC Q lcl|NC_016566. 141 AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGL 217 (364) Q Consensus 141 a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~l 217 (364) .................+..++.++.++..++-+....-..|+||+..|..|.+ +.++.+ ++... .-......+ T Consensus 253 ~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~---lkd~~G~~i~~~~~~~~~~~~l~ 329 (415) T protein:vir:47 253 GSTSSGFEKEGKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDK---MKDKLGNYLIQPDVKEKTQQRLL 329 (415) T ss_pred cccccccccccceeccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHH---hhccCCCeeeccCcCCCCCcccc Confidence 000000001111112334566778889999988887778899999999998854 433332 22111 001111236 Q ss_pred CcEEEEeCCCCCCCCceEEEEEec--ceeEEe-cCCCCcceee-ccCCCceeeeEEeeEEEEeeeeeeeecc-ccccccc Q lcl|NC_016566. 218 GRRFIISDAAADAMGAGKMLGLVP--GAVAVT-TNGLDMLAQE-KGGNENIERWWQGEFDFNVAVKGYRLKA-SARTPVE 292 (364) Q Consensus 218 GrrVIVDD~~p~~~~~Yttylfg~--GAi~~~-~~~~~~~~~~-~~g~e~~~~~~~~~~~f~lhp~G~sw~~-~~~~~~~ 292 (364) |++|++.|.+|.....-.+++||. -++.+. ..+..+.... ......+...++.+. -.+||..|..-. ++.+.. T Consensus 330 G~pV~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~-~v~~~~a~~~~~~~~~~~~- 407 (415) T protein:vir:47 330 GAKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDC-RILDYKSAIVIEYDDSERG- 407 (415) T ss_pred ceeeEEeccccccCCCccEEEEEehhccEEEEeecceEEEeeccccCceEEEEEEEecc-EEeccccEEEEEeeccCCC- Confidence 999999999996443223467763 233332 2333222221 111111111111111 126666664422 111111 Q ss_pred CCCCcChhhhcC Q lcl|NC_016566. 293 GVRSFKLSDITD 304 (364) Q Consensus 293 gg~SPT~aeLat 304 (364) |-+--|+. T Consensus 408 ----~~~~~~~~ 415 (415) T protein:vir:47 408 ----EGDLGLEA 415 (415) T ss_pred ----CCCccCCC Confidence 21222222 No 32 >protein:vir:9759 Length: 303 # NCBI annotation: putative structural protein # Family: family:all:966 # MgeID: mge:175 # MgeName: 315.3 # Cross-refs: genbank:acc:NP_795521;genbank:gi:28876283;genbank:GeneID:1257824 Probab=98.00 E-value=7.6e-06 Score=48.64 Aligned_cols=271 Identities=6% Similarity=-0.008 Sum_probs=121.7 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) =--.+..+++...++|.+.+....+.-+. ..++.+.-.+.|.+.....+ .-+... ......++ ++++-.- T Consensus 6 ~gg~liP~~~~~~ii~~l~~~s~i~~l~~-------~~~~~~~~~~ip~~~~~~~a-~wv~E~-~~~~~s~~-~f~~v~l 75 (303) T protein:vir:97 6 SKASLFDKHLVSDLINKVKGHSSLAKLSS-------QKPIPFNGSKEFTFTLDSDI-DVVAEN-GKKTHGGL-SLEPVTI 75 (303) T ss_pred CCCeEcchhHHHHHHHHHHhhchhhhhcc-------eeecCCCceEEEEEecCcce-EEeecC-cccccccc-ceeeEEe Confidence 11123456666777777775443333221 11222222344444322111 111111 11111111 2333333 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccc----ccceeecccccCcccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESN----AAANYTQPARVDGVGG 156 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~n----a~~v~dis~~t~~~~~ 156 (364) ...|++. -+.++.+-+.....+...+...|.+++++...+...+.+|.-....-+.. ....+........... T Consensus 76 ~~~kl~~---~~~iS~ell~~~~d~~~~l~~~i~~~la~a~~~~ld~a~l~G~~~~~g~~~~~~~~~~~~~~~~~~~~~~ 152 (303) T protein:vir:97 76 VPIKVEY---GARLSDEFLYATEEEKIDILKAFNEGFAKKLARGIDLMAMHGINPRTKKASDVIGTNHFDSKVTQVVKFT 152 (303) T ss_pred eeEEEEE---eehhhHHHhhcCccchHHHHHHHHHHHHHHHHHHHHhhhhcccccCCccccccccccccccccccccccc Confidence 3334443 23455554432234444455668888888877766665543211000000 0001110000001111 Q ss_pred cccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccc-cccce-eecccCCcEEEEeCCCCCCCC Q lcl|NC_016566. 157 RTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFA-IGDLQ-VMGDGLGRRFIISDAAADAMG 232 (364) Q Consensus 157 ~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~-~~~~~-~~~~~lGrrVIVDD~~p~~~~ 232 (364) .....+.++.++..++-+...+...|+||+..+..|.+ +.+.++ ++. +.... .....+|+||++++.||...+ T Consensus 153 ~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~L~~---lkd~~g~~~~~~~~~~~~~~~~l~G~Pv~~s~~v~~~~~ 229 (303) T protein:vir:97 153 ESEDADANIEAAVNLIQGAEGVVTGLAMDTEFSTALAK---VTNGEMGPKMYPELAWGANPDSINGLKSSVNTTVGAGAD 229 (303) T ss_pred cccchHHHHHHHHHHHhhcCCCccEEEEcHHHHHHHHH---hhccCCCeEEecCccCCCCCceecceeeEEecccCCccc Confidence 23345678888998887777888899999999998864 333333 111 11111 111235999999999986432 Q ss_pred ---ceEEEEEec--ceeEEec-CCCCcceeec---cC------CCceeeeEEeeEE---EEeeeeeeeec-cccc Q lcl|NC_016566. 233 ---AGKMLGLVP--GAVAVTT-NGLDMLAQEK---GG------NENIERWWQGEFD---FNVAVKGYRLK-ASAR 288 (364) Q Consensus 233 ---~Yttylfg~--GAi~~~~-~~~~~~~~~~---~g------~e~~~~~~~~~~~---f~lhp~G~sw~-~~~~ 288 (364) .-..++||. .++.++. ..+.+..-+. ++ ..|. ..++.+.. -++||..|.-- .+.+ T Consensus 230 ~~~~~~~~~~Gdf~~~~~~~~~~~~~~~~~~~~~~d~~~~~~~~~n~-~~~r~~~r~~~~v~~p~af~~l~~~~~ 303 (303) T protein:vir:97 230 EAESKDLVIIGDFESMFKWGYAKQIPMEIIKYGDPDNSGKDLKGYNQ-IYLRAEAYIGWGILDAKSFARVTKGEV 303 (303) T ss_pred cCCCccEEEEeeccccEEEEEecCcEEEEeeccCCCCcchhhhhcCc-EEEEEEEEeccEeecccceEEeeCCCC Confidence 233566764 4444443 2222211111 11 1121 22333222 23778777553 3322 No 33 >protein:vir:4339 Length: 395 # NCBI annotation: major head protein # Family: family:all:585 # MgeID: mge:93 # MgeName: D3 # Cross-refs: genbank:acc:NP_061502;genbank:gi:9635591;genbank:GeneID:1262860 Probab=97.97 E-value=8.5e-06 Score=48.37 Aligned_cols=260 Identities=11% Similarity=0.007 Sum_probs=107.0 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) -.-.+..++.....++.+.+....++-.. ..++.|.-...|......+...-.... ......++ +... T Consensus 120 ~~g~~vp~~~~~~ii~~~~~~~~l~~l~~-------~~~~~~~~~~~~~~~~~~~~a~~v~E~-~~~~~~~~----~~~~ 187 (395) T protein:vir:43 120 SGGALVAPDRRPGVVAAPQRRLTIRDLVA-------PGTTESNSVEYVRETGFVNNAAPVSEG-TQKPYSDL----TFEL 187 (395) T ss_pred CCccccchhhHHHHHHHHHhhhhHHhhcc-------ceecCCCceEEEEEecCCCceeeecCC-cccccccc----ceeE Confidence 11123444555556666665444433322 111222211222221111110011111 11111111 2222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH------HhhhhcccccceeecccccCcc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA------GKAAIESNAAANYTQPARVDGV 154 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~------L~Gv~~~na~~v~dis~~t~~~ 154 (364) +-.+..+-.+-+.++..-+ +|...+..-|.+++++...+..-+.+|.- +.|++......+...+ . T Consensus 188 i~~~~~k~~~~~~is~ell----~d~~~l~~~v~~~la~a~~~~~d~~~l~G~g~~~~~~Gi~~~~~~~~~~~~-----~ 258 (395) T protein:vir:43 188 ENAPVRTIAHLFKASRQIL----DDASALQSYIDARARYGLMLVEECQLLYGNGTGANLHGIIPQAQAYAPPSG-----V 258 (395) T ss_pred EEEeeeeEEEeehhhHHHH----HhHHHHHHHHHHHHHHHHHHHHHHHHHhccCCCCccccccccccccccccc-----c Confidence 3333222223334565533 23333445567777777766555544421 1222221111111111 1 Q ss_pred cccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCCcEEEEeCCCCCCCC Q lcl|NC_016566. 155 GGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLGRRFIISDAAADAMG 232 (364) Q Consensus 155 ~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lGrrVIVDD~~p~~~~ 232 (364) .......+.++.++...+......-..|+||..+|..|.+ +.+..+ ++....-......+|+||+++|.||... T Consensus 259 ~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~---lkd~~G~~i~~~~~~~~~~~l~G~pVv~~~~~~~~~- 334 (395) T protein:vir:43 259 VVTAEQRIDRIRLAILQAQLAEFPASGIVLNPIDWALIEL---NKDAENRYIIGSPQNGTTPTLWRLPVVETQAITQDE- 334 (395) T ss_pred ccccchhHHHHHHHHHhhccccCCCcEEEEcHHHHHHHHH---hhccCCceeccccccCCCceecceeeEEcCCCCCCc- Confidence 1222334667888888887777778899999999988854 323332 2221100111123699999999999542 Q ss_pred ceEEEEEec---ceeEEecCCCCcceeeccCC--CceeeeEEeeEEE---EeeeeeeeecccccccccCCCCcC Q lcl|NC_016566. 233 AGKMLGLVP---GAVAVTTNGLDMLAQEKGGN--ENIERWWQGEFDF---NVAVKGYRLKASARTPVEGVRSFK 298 (364) Q Consensus 233 ~Yttylfg~---GAi~~~~~~~~~~~~~~~g~--e~~~~~~~~~~~f---~lhp~G~sw~~~~~~~~~gg~SPT 298 (364) .+||. +...+...+..+...+..+. ..-.+.++.+..| .++|..|..-.-+ ++ T Consensus 335 ----~~~gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~~~~t---------aa 395 (395) T protein:vir:43 335 ----FLTGAFSLGAQIFDRMDIEVLVSTENDKDFENNMVTIRAEERLAFAVYRPEAFVTGSLT---------AS 395 (395) T ss_pred ----EEEEeccceEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEec---------cC Confidence 33332 22222222222222222211 1112223332222 2556666543221 11 No 34 >protein:vir:102605 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1661 # MgeName: Llij # Cross-refs: genbank:acc:YP_655002;genbank:gi:109392192;genbank:GeneID:4157227 Probab=97.97 E-value=8.8e-06 Score=48.30 Aligned_cols=259 Identities=13% Similarity=0.023 Sum_probs=126.0 Q ss_pred CCccccchhhhhhhh-hhhHHHHHHHhhhhcceeEec---cCcccCceeeeehhhhhcccccccccccCCCccccchhhh Q lcl|NC_016566. 1 MSLTVFQRKLVTAVT-QMIPDNLNVFNAAANGAVVLG---TGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLA 76 (364) Q Consensus 1 fd~~vfn~~~~~~~~-e~i~q~~~~fn~as~gAivl~---~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit 76 (364) |-...|=++++...+ +.+...+ +|.. ++-. .+...||-+..|.+..++- .|+...++. .++..++ T Consensus 1 MA~~~~~pe~~~~~v~~~~~~~l-v~~~-----l~~~~~~~~~~~Gdtv~ip~~~~~~~----~d~~~~~~~-~~~~~~~ 69 (273) T protein:vir:10 1 MAFNNFIPELWSDMLLEEWTAQT-VFAN-----LVNREYEGTASKGNVVHIAGVVAPTV----KDYKAAGRQ-TSADAIS 69 (273) T ss_pred CcchhhhHHHHHHHHHHHHHhhh-ccch-----hhccccccccccCceEEEeecccccc----cccccCCCc-cCccccc Confidence 666666577776654 4444322 2322 2212 2345688888888876642 233222221 1223333 Q ss_pred ccceeeEEecc-ccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCccc Q lcl|NC_016566. 77 RMLTNSVNLSA-KVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVG 155 (364) Q Consensus 77 ~~~~vaVkl~~-~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~ 155 (364) .. .+-+++.+ ++-++.++..+-....-|.++++ ++.+....+..-+.+++.+.+.-..+. .+... T Consensus 70 ~~-~~~~tid~~~~~~~~i~d~d~~~~~~~~~~~~----~~~~~alA~~vD~~i~~~~~~a~~~~~-----~~~~~---- 135 (273) T protein:vir:10 70 DT-GVDLLIDQEKSIDFLVDDIDRVQVAGSLEAYT----RAGATALATDTDKFIADMLVDNGTALT-----GSAPT---- 135 (273) T ss_pred cc-eEEEEEeeeeecceEeecHHHhhhhccHHHHH----HHHHHHHHHHHHHHHHHHHhccccccc-----ccccc---- Confidence 32 22333321 12233444333332333444333 334444444444445555554322221 01100 Q ss_pred ccccccHHHHHHHHHHhccccc--CeeEEEEchHHHHHHHHhh-ccccccccccccc--ceeecccCCcEEEEeCCCCCC Q lcl|NC_016566. 156 GRTFPTLADFPLAASKFGDQAA--LIKSWFMDGVTWANFIAYQ-ALPSAEQVFAIGD--LQVMGDGLGRRFIISDAAADA 230 (364) Q Consensus 156 ~~~~~s~~~l~~A~~~lGD~~~--~l~~ivMHS~v~~~L~k~~-~it~~~~~~~~~~--~~~~~~~lGrrVIVDD~~p~~ 230 (364) .....+..|.+|+++|.++.- .=..+++|+.+|..|++.. .+.+....-.... -..+...+|-.|+.+..+|.. T Consensus 136 -~~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~L~~~~~~~~~~~~~~~~~~l~~G~ig~i~G~~v~~s~~lp~~ 214 (273) T protein:vir:10 136 -DADDAFDLIAKALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGLRAGTIGNLLGARIVESNNLRDT 214 (273) T ss_pred -chhHHHHHHHHHHHHhhhcCCCcCCCEEEECHHHHHHHhcchhhhhhhhccccccceeeeeeeEEeceEEEEecccccC Confidence 001124578899999988752 2256899999999997633 2433222111111 122333469999999999975 Q ss_pred CCceEEEEEecceeEEecCCCCcceeeccCCCc----eeeeEEeeEEEEeeeeeeeecccccccccCCCC Q lcl|NC_016566. 231 MGAGKMLGLVPGAVAVTTNGLDMLAQEKGGNEN----IERWWQGEFDFNVAVKGYRLKASARTPVEGVRS 296 (364) Q Consensus 231 ~~~Yttylfg~GAi~~~~~~~~~~~~~~~g~e~----~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~S 296 (364) .+ ++.+.|-.+|+++...-.. .|..+.++ ....+..=.+.+++|.|+-=-.++. | T Consensus 215 ~~-~~~~~~~~~A~~~a~q~~~---~e~~r~~~~~~~~v~~~~~yg~~v~~~~~~~~l~~~g-------~ 273 (273) T protein:vir:10 215 DD-EQFVAFHPSAAAYVSQIDT---VEALRDQDSFSDRIRALHVYGGKVVRPTGVVVFNKTG-------S 273 (273) T ss_pred Cc-cEEEEEeccceeeeeeeeh---hhcccCCCcceeeeeeeeeeeeeEeccceEEEEeccC-------C Confidence 54 6778888999887543211 12222221 2222222224557777765433321 1 No 35 >protein:vir:105822 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1636 # MgeName: PMC # Cross-refs: genbank:acc:YP_655767;genbank:gi:109522090;genbank:GeneID:4157630 Probab=97.97 E-value=8.8e-06 Score=48.30 Aligned_cols=259 Identities=13% Similarity=0.023 Sum_probs=126.0 Q ss_pred CCccccchhhhhhhh-hhhHHHHHHHhhhhcceeEec---cCcccCceeeeehhhhhcccccccccccCCCccccchhhh Q lcl|NC_016566. 1 MSLTVFQRKLVTAVT-QMIPDNLNVFNAAANGAVVLG---TGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLA 76 (364) Q Consensus 1 fd~~vfn~~~~~~~~-e~i~q~~~~fn~as~gAivl~---~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit 76 (364) |-...|=++++...+ +.+...+ +|.. ++-. .+...||-+..|.+..++- .|+...++. .++..++ T Consensus 1 MA~~~~~pe~~~~~v~~~~~~~l-v~~~-----l~~~~~~~~~~~Gdtv~ip~~~~~~~----~d~~~~~~~-~~~~~~~ 69 (273) T protein:vir:10 1 MAFNNFIPELWSDMLLEEWTAQT-VFAN-----LVNREYEGTASKGNVVHIAGVVAPTV----KDYKAAGRQ-TSADAIS 69 (273) T ss_pred CcchhhhHHHHHHHHHHHHHhhh-ccch-----hhccccccccccCceEEEeecccccc----cccccCCCc-cCccccc Confidence 666666577776654 4444322 2322 2212 2345688888888876642 233222221 1223333 Q ss_pred ccceeeEEecc-ccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCccc Q lcl|NC_016566. 77 RMLTNSVNLSA-KVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVG 155 (364) Q Consensus 77 ~~~~vaVkl~~-~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~ 155 (364) .. .+-+++.+ ++-++.++..+-....-|.++++ ++.+....+..-+.+++.+.+.-..+. .+... T Consensus 70 ~~-~~~~tid~~~~~~~~i~d~d~~~~~~~~~~~~----~~~~~alA~~vD~~i~~~~~~a~~~~~-----~~~~~---- 135 (273) T protein:vir:10 70 DT-GVDLLIDQEKSIDFLVDDIDRVQVAGSLEAYT----RAGATALATDTDKFIADMLVDNGTALT-----GSAPT---- 135 (273) T ss_pred cc-eEEEEEeeeeecceEeecHHHhhhhccHHHHH----HHHHHHHHHHHHHHHHHHHhccccccc-----ccccc---- Confidence 32 22333321 12233444333332333444333 334444444444445555554322221 01100 Q ss_pred ccccccHHHHHHHHHHhccccc--CeeEEEEchHHHHHHHHhh-ccccccccccccc--ceeecccCCcEEEEeCCCCCC Q lcl|NC_016566. 156 GRTFPTLADFPLAASKFGDQAA--LIKSWFMDGVTWANFIAYQ-ALPSAEQVFAIGD--LQVMGDGLGRRFIISDAAADA 230 (364) Q Consensus 156 ~~~~~s~~~l~~A~~~lGD~~~--~l~~ivMHS~v~~~L~k~~-~it~~~~~~~~~~--~~~~~~~lGrrVIVDD~~p~~ 230 (364) .....+..|.+|+++|.++.- .=..+++|+.+|..|++.. .+.+....-.... -..+...+|-.|+.+..+|.. T Consensus 136 -~~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~L~~~~~~~~~~~~~~~~~~l~~G~ig~i~G~~v~~s~~lp~~ 214 (273) T protein:vir:10 136 -DADDAFDLIAKALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGLRAGTIGNLLGARIVESNNLRDT 214 (273) T ss_pred -chhHHHHHHHHHHHHhhhcCCCcCCCEEEECHHHHHHHhcchhhhhhhhccccccceeeeeeeEEeceEEEEecccccC Confidence 001124578899999988752 2256899999999997633 2433222111111 122333469999999999975 Q ss_pred CCceEEEEEecceeEEecCCCCcceeeccCCCc----eeeeEEeeEEEEeeeeeeeecccccccccCCCC Q lcl|NC_016566. 231 MGAGKMLGLVPGAVAVTTNGLDMLAQEKGGNEN----IERWWQGEFDFNVAVKGYRLKASARTPVEGVRS 296 (364) Q Consensus 231 ~~~Yttylfg~GAi~~~~~~~~~~~~~~~g~e~----~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~S 296 (364) .+ ++.+.|-.+|+++...-.. .|..+.++ ....+..=.+.+++|.|+-=-.++. | T Consensus 215 ~~-~~~~~~~~~A~~~a~q~~~---~e~~r~~~~~~~~v~~~~~yg~~v~~~~~~~~l~~~g-------~ 273 (273) T protein:vir:10 215 DD-EQFVAFHPSAAAYVSQIDT---VEALRDQDSFSDRIRALHVYGGKVVRPTGVVVFNKTG-------S 273 (273) T ss_pred Cc-cEEEEEeccceeeeeeeeh---hhcccCCCcceeeeeeeeeeeeeEeccceEEEEeccC-------C Confidence 54 6778888999887543211 12222221 2222222224557777765433321 1 No 36 >protein:vir:4953 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:108 # MgeName: Sfi19 # Cross-refs: genbank:acc:NP_049929;genbank:gi:9632900;genbank:GeneID:1262076 Probab=97.95 E-value=9.3e-06 Score=48.16 Aligned_cols=269 Identities=11% Similarity=0.021 Sum_probs=108.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) ---.+..+++....++.+.+.....+-+. +..-....|.+...+. ....+....+.........+++ ++ .. T Consensus 116 ~gg~~vP~~~~~~ii~~~~~~~~l~~~~~----~~~~~~~~~~~~~~~~-~~~~~~a~~v~E~~~~~~~~~~-~~---~~ 186 (397) T protein:vir:49 116 DAGLTIPQDIQTAIHTLVSQYDSLQEYVN----VENVTTLTGSRVYEKW-TDITGLANIDDEAGKIADVDDP-KL---SL 186 (397) T ss_pred cCcccccHhHHHHHHHHHHhhhhHHhhhc----eeecccCccceEEEee-ccCCcceeeecCcccccccccc-ce---ee Confidence 00001122222233333333221111110 0000011233222111 1111111111111000000011 12 22 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFP 160 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~ 160 (364) +-....+-.+-+.++...+. .....+..-|.+++++...+...+.+|.... ... ...... T Consensus 187 i~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~~~~~~~d~ai~~G~g----~~~-------------~~~~~~ 246 (397) T protein:vir:49 187 IKYTIKRYAGISTVTNSLLA---DSAENILAWLSGWIAKKVVVTRNKAILEAIA----ALP-------------TKPTLT 246 (397) T ss_pred EEeeeeeEEeeehhHHHHHh---hhHHHHHHHHHHHHHHHHHHHHHHHHHhhcc----ccc-------------cccccc Confidence 22222222233346655443 2223344457777777777665554433211 110 011223 Q ss_pred cHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccCCcEEEEeCCC--CCCCCceE Q lcl|NC_016566. 161 TLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGLGRRFIISDAA--ADAMGAGK 235 (364) Q Consensus 161 s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~lGrrVIVDD~~--p~~~~~Yt 235 (364) ++.++.++..++-.+...-..|+||..+|..|.+ +.+..+ ++... .-......+|+||++.|+. |.....-. T Consensus 247 ~~d~i~~~~~~l~~~~~~~a~~vmn~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~~l~G~PV~~~~~~~~~~~~~~~~ 323 (397) T protein:vir:49 247 KWDDIIDLEAKVDPAIKQTSFFLTNTSGFTALKK---VKNALGDYLMERDVKSPTGYSIDGFAVKEVADRWLANGTGGAM 323 (397) T ss_pred cHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHH---hhcCCCceeeccCcCCCCCceecceeeEEecccccccccCCce Confidence 5567888888887777788999999999998864 333332 22211 1111112369999987764 33222223 Q ss_pred EEEEec--ceeEEe-cCCCCcceeeccCC--CceeeeEEeeEEE---EeeeeeeeecccccccccCCCCcChhh Q lcl|NC_016566. 236 MLGLVP--GAVAVT-TNGLDMLAQEKGGN--ENIERWWQGEFDF---NVAVKGYRLKASARTPVEGVRSFKLSD 301 (364) Q Consensus 236 tylfg~--GAi~~~-~~~~~~~~~~~~g~--e~~~~~~~~~~~f---~lhp~G~sw~~~~~~~~~gg~SPT~ae 301 (364) .++||. .++.+. ..+..+...+..++ ..-.+.+++...| .+||.+|..-.-+-+....+..|+-+- T Consensus 324 ~i~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~~~~~~~~~~~~ 397 (397) T protein:vir:49 324 PLYFGDLKQAVTLFDRQHMSLLSTNIGGGAFETDTTKVRVIDRFDVVATDTEAFVPASFKAIADQKGNLGSTAV 397 (397) T ss_pred eEEEeeccceEEEEeecceEEEEeccccchhhcCceeEEEEeeeCcEEecccceEEEEeecccCCCCCcccccC Confidence 466663 234333 34444333332221 1112233332222 377777765332222222344555554 No 37 >protein:vir:7990 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:151 # MgeName: Che8 # Cross-refs: genbank:acc:NP_817344;genbank:gi:29565772;genbank:GeneID:1258978 Probab=97.95 E-value=9.4e-06 Score=48.14 Aligned_cols=259 Identities=13% Similarity=0.028 Sum_probs=128.8 Q ss_pred CCccccchhhhhhhh-hhhHHHHHHHhhhhcceeEecc---CcccCceeeeehhhhhcccccccccccCCCccccchhhh Q lcl|NC_016566. 1 MSLTVFQRKLVTAVT-QMIPDNLNVFNAAANGAVVLGT---GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLA 76 (364) Q Consensus 1 fd~~vfn~~~~~~~~-e~i~q~~~~fn~as~gAivl~~---~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit 76 (364) |-...|=++++...+ +.++..+ +|- .++-.. ..-.||-+..|.+..++- .|+...+. ..++..++ T Consensus 1 MA~~~~~pei~~~~v~~~~~~~l-v~~-----~l~~~~~~~~~~~GdTv~ip~~~~~~~----~d~~~~~~-~~~~~~~~ 69 (273) T protein:vir:79 1 MAFNNFIPELWSDMLLEEWTAQT-VFA-----NLVNREYEGIASKGNVVHIAGVVAPTV----KDYKAAGR-QTSADAIS 69 (273) T ss_pred CcchhhhHHHHHHHHHHHHHhhc-cch-----hhhhccccccccCCcEEEEeecCcccc----cccccCCC-ccCccccc Confidence 666666577776653 4444332 222 222111 234588889998887653 23332222 11222333 Q ss_pred ccceeeEEecc-ccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCccc Q lcl|NC_016566. 77 RMLTNSVNLSA-KVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVG 155 (364) Q Consensus 77 ~~~~vaVkl~~-~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~ 155 (364) . ..+-+++.+ ++-++.++..+.....-|.+++++ +.+....+..-+.+++.+.+.-..++ ...... T Consensus 70 ~-~~~~~tid~~~~~~~~i~d~d~~~~~~~~~~~~~----~~~~ala~~vD~~i~~~~~~a~~~~~-----~~~~~~--- 136 (273) T protein:vir:79 70 D-TGVDLLIDQEKSIDFLVDDIDRVQVAGSLEAYTR----AGATALATDTDKFIADMLVDNGTALT-----GSAPSD--- 136 (273) T ss_pred c-ceEEEEEeeecccceeeccHHHHhhcccHHHHHH----HHHHHHHHHHHHHHHHHHhhcccccc-----cccccc--- Confidence 2 233344432 233445555554444455554444 44444444444445555554322211 010000 Q ss_pred ccccccHHHHHHHHHHhccccc--CeeEEEEchHHHHHHHHhh-ccccccccccccc--ceeecccCCcEEEEeCCCCCC Q lcl|NC_016566. 156 GRTFPTLADFPLAASKFGDQAA--LIKSWFMDGVTWANFIAYQ-ALPSAEQVFAIGD--LQVMGDGLGRRFIISDAAADA 230 (364) Q Consensus 156 ~~~~~s~~~l~~A~~~lGD~~~--~l~~ivMHS~v~~~L~k~~-~it~~~~~~~~~~--~~~~~~~lGrrVIVDD~~p~~ 230 (364) ....+..|.+|+.+|.++.- .=..+++++..|..|++.. .+.+....-.... -..+...+|-.|+.+..+|.. T Consensus 137 --~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~Ll~~~~~~~~~~~~~~~~~l~~G~ig~~~G~~i~~s~~lp~~ 214 (273) T protein:vir:79 137 --ADDAFDLIASALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGLRAGTIGNLLGARIVESNNLRDT 214 (273) T ss_pred --hhhHHHHHHHHHHHhhhccCCccCcEEEECHHHHHHHhhchhhhhhhhhcccccceeeeEeeEEeceEEEeccccccc Confidence 01123568889999988752 2256889999999987642 2433221111111 112223458999999999976 Q ss_pred CCceEEEEEecceeEEecCCCCcceeeccCCC----ceeeeEEeeEEEEeeeeeeeecccccccccCCCC Q lcl|NC_016566. 231 MGAGKMLGLVPGAVAVTTNGLDMLAQEKGGNE----NIERWWQGEFDFNVAVKGYRLKASARTPVEGVRS 296 (364) Q Consensus 231 ~~~Yttylfg~GAi~~~~~~~~~~~~~~~g~e----~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~S 296 (364) .+ |+.+.|-.+|+++......+ |..+.+ +....+..=.+.+++|.|+-=-.++. | T Consensus 215 ~~-~~~~a~~~~A~~~a~~~~~~---e~~r~~~~~~~~v~~~~~yg~~v~~p~~vv~~~~~g-------~ 273 (273) T protein:vir:79 215 DD-EQFVAFHPSAAAYVSQIDTV---EALRDQDSFSDRIRALHVYGGKVVRPTGVVVFNKTG-------S 273 (273) T ss_pred Cc-eEEEEEeccceeeeeehhhh---hcccCcccceeeeeeeeeeeeEEecCceEEEEeccC-------C Confidence 54 67788889998875432221 122222 12222222224567888865544331 1 No 38 >protein:vir:4856 Length: 293 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:106 # MgeName: DT1 # Cross-refs: genbank:acc:NP_049396;genbank:gi:9632424;genbank:GeneID:1258532 Probab=97.84 E-value=1.6e-05 Score=46.92 Aligned_cols=269 Identities=10% Similarity=0.022 Sum_probs=112.9 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) ---..-.+++...++|.+.+.....+- +-+.......|.+...+ +...++...-+........... .++..-.- T Consensus 12 ~gg~liP~~~~~~Ii~~~~~~~~l~~~----~~~~~~~~~~g~~~~~~-~~~~~~~a~~v~Eg~~~~~~~~-~~~~~i~l 85 (293) T protein:vir:48 12 DAGLTIPQDIRTAINTLVRQYDSLQEY----VNVENVTTLTGSRVYEK-WTDITGLANIDDEAGKIADIDD-PKLSLIKY 85 (293) T ss_pred cCceEechhHHHHHHHHHHhhhhhhhh----ceeeeccCCcceEEEEe-ecCCCcceeeecCCcccccccc-cceeEEEE Confidence 111122444555566666654332221 11111112223322222 1111111111111111100001 12333333 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFP 160 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~ 160 (364) ...|++.. +..+.+.+....-| +...|.+++++...+-..+.++..+.... ...... T Consensus 86 ~~~k~~~~---~~iS~ell~ds~~~---l~~~i~~~la~~~~~~~~~~i~~g~~~~~-----------------~~~~~~ 142 (293) T protein:vir:48 86 TIKRYAGI---STVTNSLLADSAEN---ILAWLSGWIAKKVVVTRNKAILGVVDKLP-----------------TKPTLT 142 (293) T ss_pred eeeEEEEe---ehhhHHHHhhhhHH---HHHHHHHHHHHHHHHHHHhHHhhcccccc-----------------cccccc Confidence 33344432 34666555422333 33447788888877766555443322111 112345 Q ss_pred cHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccCCcEEEEeCCCCCC--CCceE Q lcl|NC_016566. 161 TLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGLGRRFIISDAAADA--MGAGK 235 (364) Q Consensus 161 s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~lGrrVIVDD~~p~~--~~~Yt 235 (364) ++.++.++..++......-..|+||..++..|.+ +.+..+ ++... .-......+|++|++.|..+.. ...-. T Consensus 143 ~~d~i~~~~~~l~~~~~~~a~~vmn~~~~~~L~~---lkd~~g~~l~~~~~~~~~~~~l~G~Pv~~~~~~~~~~~~~~~~ 219 (293) T protein:vir:48 143 KWDDIIDLEAKVDPAIKQTSFFLTNTSGFTALKK---VKNALGDYLMERDVKSPTGYSIAGFAVKEISDRWLPNASSGVM 219 (293) T ss_pred CHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHH---hhccCCceEeecCcCCCCCceecceeeEEecccccCCccCCce Confidence 6678888999887777777899999999998854 322222 22211 1111112469999886665432 22223 Q ss_pred EEEEec--ceeEEec-CCCCcceeeccCC--CceeeeEEe--eEE-EEeeeeeeeecccccccccCCCCcChhhhcC Q lcl|NC_016566. 236 MLGLVP--GAVAVTT-NGLDMLAQEKGGN--ENIERWWQG--EFD-FNVAVKGYRLKASARTPVEGVRSFKLSDITD 304 (364) Q Consensus 236 tylfg~--GAi~~~~-~~~~~~~~~~~g~--e~~~~~~~~--~~~-f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat 304 (364) .++||. -++.+.+ .++.+...+..++ +.-.+.++. |+. -..||..|..-+-+.+....+ |..-.+- T Consensus 220 ~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~l~~~~~~~~~~---~~~~~~~ 293 (293) T protein:vir:48 220 PLYFGDLKQAVTLFDRQQMSLLSTNIGGGAFETDTTKVRVIDRFDVVATDTEAFVPASFKAIADQKG---NIGSTAV 293 (293) T ss_pred EEEEEeccceEEEEEecceEEEEecccchhhhcCeEEEEEEEeeCcEEecccceEEEEeeccccCCc---cccccCC Confidence 456664 2343333 3333333322211 111222222 222 236777776533111111111 1111111 No 39 >protein:vir:41 Length: 299 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:2 # MgeName: A118 # Cross-refs: genbank:acc:NP_463467;swissprot:trembl:q9t1b7;genbank:gi:16798789;uniprot:Q9T1B7;genbank:GeneID:922353 Probab=97.82 E-value=1.7e-05 Score=46.79 Aligned_cols=265 Identities=8% Similarity=-0.006 Sum_probs=115.6 Q ss_pred CCc----------cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccc Q lcl|NC_016566. 1 MSL----------TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPA 70 (364) Q Consensus 1 fd~----------~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~ 70 (364) |+. -.-.+++...++|.+.+......-.. ..++.|.-.+.|.+...... -++.. ..... T Consensus 3 ~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~~~-------~~~~~~~~~~~~~~~~~~a~--~v~E~--~~~~~ 71 (299) T protein:vir:41 3 FNPDTTTMQSAKTGSIPINISEQIITGVKNGSAAMKLAK-------AVPMTKPEEEFTFMSGVGAF--WVDEA--ERIQT 71 (299) T ss_pred cCCCcccccCCCceecchhHHHHHHHHHHhcchhhhhce-------eeecCCCcEEEEEEcCCcee--eeecC--ccccc Confidence 331 12345566666666665433322221 12223332344444322111 11111 11111 Q ss_pred cchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccc Q lcl|NC_016566. 71 TAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPAR 150 (364) Q Consensus 71 T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~ 150 (364) +..++++-.-...|++ +-+..+.+.+. .....+...|.+++++...+...+.+|. |--+......+..... T Consensus 72 ~~~~f~~v~l~~~k~~---~~~~is~ell~---ds~~~~~~~i~~~l~~a~~~~~d~a~l~---G~g~~~~~gil~~~~~ 142 (299) T protein:vir:41 72 SKPTFTKAKMRSKKMG---VIIPTTKENLN---YSVTNFFSLMQAEIVEAFYKKFDQAVFT---GVESPYNWNILKSATD 142 (299) T ss_pred cccceeEEEEeeEEEE---EeehhhHHHHh---cCHHHHHHHHHHHHHHHHHHHHHHHHhh---cccCcccccccccccc Confidence 1112222222222333 22346655443 3334455668888999888877775552 2111111111111111 Q ss_pred cCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCCcEEEEeCCCC Q lcl|NC_016566. 151 VDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLGRRFIISDAAA 228 (364) Q Consensus 151 t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lGrrVIVDD~~p 228 (364) ..........++.++.++..++-+...+-..|+||+..+..|.+ +.+..+ ++...........+|++|+++|.|| T Consensus 143 ~~~~~~~~~~~~~~l~~~~~~l~~~~~~~~~~v~n~~~~~~L~~---lkd~~G~~l~~~~~~~~~~~l~G~PV~~~~~~~ 219 (299) T protein:vir:41 143 ASNLVEETANKYDDLNEAIGLIEAEDLEPNGIATIRKQRVKYRS---TKDGNGMPIFNTATSNGVDDVLGLPIAYTPKYT 219 (299) T ss_pred cceeeccccccHHHHHHHHHhhhcccCCcCEEEEcHHHHHHHHH---hhccCCceeecCCcCCCCceecceeeEEecccC Confidence 11112223456788999999998877788899999999999975 333332 2222111111134699999999999 Q ss_pred CCCCceEEEEEec--ceeEEec-CCCCcceeec----c-CC---------CceeeeEEeeEE---EEeeeeeeeeccccc Q lcl|NC_016566. 229 DAMGAGKMLGLVP--GAVAVTT-NGLDMLAQEK----G-GN---------ENIERWWQGEFD---FNVAVKGYRLKASAR 288 (364) Q Consensus 229 ~~~~~Yttylfg~--GAi~~~~-~~~~~~~~~~----~-g~---------e~~~~~~~~~~~---f~lhp~G~sw~~~~~ 288 (364) ...++ ...+||. ..+ ++. .+..+.+.+. . .+ +.-...++.+.. -..||.-|.--...- T Consensus 220 ~~~~~-~~~~~gdfs~~~-i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~d~~v~~~~A~~~l~~~a 297 (299) T protein:vir:41 220 FGDKD-ISELVGDWNQAY-YGILRGVEYEILTEATLTTVADETGKPLNLAERDMAAIKATFEVGFMVVKDEAFSAVQPKA 297 (299) T ss_pred CCCCc-eEEEEEecccEE-EEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccceEEEEecc Confidence 65432 1233333 122 221 2221111110 0 00 011122222111 124555554432221 Q ss_pred ccccCCCC Q lcl|NC_016566. 289 TPVEGVRS 296 (364) Q Consensus 289 ~~~~gg~S 296 (364) + + T Consensus 298 a------~ 299 (299) T protein:vir:41 298 G------N 299 (299) T ss_pred C------C Confidence 1 0 No 40 >protein:vir:1638 Length: 298 # NCBI annotation: Structural protein # Family: family:all:966 # MgeID: mge:33 # MgeName: r1t # Cross-refs: genbank:acc:NP_695059;genbank:gi:23455750;genbank:GeneID:955469 Probab=97.81 E-value=1.8e-05 Score=46.63 Aligned_cols=270 Identities=8% Similarity=-0.008 Sum_probs=116.5 Q ss_pred CCc---cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhc Q lcl|NC_016566. 1 MSL---TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLAR 77 (364) Q Consensus 1 fd~---~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~ 77 (364) |-. .+..+++...++|.+.+..-+..-+. ..++.+.-...|.+..-+.+ .-.. .+......++ ++.+ T Consensus 1 ma~~gG~lvp~~~~~~ii~~~~~~s~i~~l~~-------~~~~~~~~~~ip~~~~~~~a-~~v~-E~~~~~~~~~-~f~~ 70 (298) T protein:vir:16 1 MVLNKGTLFDPTLVTDLISKVAGKSSIARLSA-------QKPIPFNGEKVFTFTMDSEI-DVVA-ESGKKTHGGV-TLAP 70 (298) T ss_pred CcccCcceechhHHHHHHHHHHhhhhhhhhcc-------eeeccCCceEEEEEecCcce-EEec-CCcccccccc-ceeE Confidence 111 13445555666666665433322211 11222221334433321111 1111 1111111111 2222 Q ss_pred cceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccccc------ceeeccccc Q lcl|NC_016566. 78 MLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAA------ANYTQPARV 151 (364) Q Consensus 78 ~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~------~v~dis~~t 151 (364) -.-...|++.. +..+.+-+.....+...+...|.+++++...+...+.++.... .-.+... .+....... T Consensus 71 v~l~~~k~a~~---~~iS~ell~~s~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~~-~~~g~~~~~~~~~~~~~~~~~~ 146 (298) T protein:vir:16 71 QTMVPIKVEYG---ARISDEFMYASDEEKINILQEFNDGFAKKVARGIDLMAFHGVN-PRLGTASAVIGTNHFDSKVTQK 146 (298) T ss_pred EEEeeeeEEEe---ehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHHhhcccc-CCCCcccccccccccccccccc Confidence 22233344432 3456554432334444555567888888887766665553210 0001110 111100000 Q ss_pred CcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccce-eecccCCcEEEEeCCCC Q lcl|NC_016566. 152 DGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQ-VMGDGLGRRFIISDAAA 228 (364) Q Consensus 152 ~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~-~~~~~lGrrVIVDD~~p 228 (364) ............++.++..++-.+..+...|+||+..+..|.+ +.+..+ ++.+.... .....+|+||+++|.+| T Consensus 147 ~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~---lkd~~G~~i~~~~~~~~~~~~l~G~PV~~~~~v~ 223 (298) T protein:vir:16 147 VEAPRGIADPNGAIENAVELLTGVDADVTGIAINPSFRSALAK---QKDLQDNALFPELKWGATPDTINGLPVDVNKTVS 223 (298) T ss_pred cccccccccHHHHHHHHHHHhhhcCCCccEEEEcHHHHHHHHH---hhccCCCeeecCcccCCCCceecceeeEEecccc Confidence 0001111112346788888887777788899999999999865 333333 22221111 01123699999999999 Q ss_pred CCC--CceEEEEEec--ceeEEec-CCCCcceeec---cCC-----CceeeeEEeeEEE---Eeeeeeeeeccccccccc Q lcl|NC_016566. 229 DAM--GAGKMLGLVP--GAVAVTT-NGLDMLAQEK---GGN-----ENIERWWQGEFDF---NVAVKGYRLKASARTPVE 292 (364) Q Consensus 229 ~~~--~~Yttylfg~--GAi~~~~-~~~~~~~~~~---~g~-----e~~~~~~~~~~~f---~lhp~G~sw~~~~~~~~~ 292 (364) ... +++ .++||. .++.++. ....+...+. ++. +.-.+.++++..| .+||..|..-+.+ T Consensus 224 ~~~~~~~~-~~~~GDfs~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~v~~ra~~r~d~~v~~~~a~~~l~~a----- 297 (298) T protein:vir:16 224 DMSLTQRD-RAIIGDFANGFKWGYAKEVPLEVIQYGDPDNSGLDLKGYNQVYIRAELFLGWGILDATKFARVTEA----- 297 (298) T ss_pred cccCCCcc-EEEEeeccceEEEEEecCceEEEeeccCCcCcchhhhhcCcEEEEEEEEEccEeecccceEEEeec----- Confidence 643 233 455553 4444442 2222211111 110 1112334443322 2677777664332 Q ss_pred CCCC Q lcl|NC_016566. 293 GVRS 296 (364) Q Consensus 293 gg~S 296 (364) + T Consensus 298 ---t 298 (298) T protein:vir:16 298 ---N 298 (298) T ss_pred ---C Confidence 1 No 41 >protein:vir:100135 Length: 418 # NCBI annotation: gp5 # Family: family:all:585 # MgeID: mge:1639 # MgeName: phi1026b # Cross-refs: genbank:acc:NP_945035;genbank:gi:38707895;genbank:GeneID:2744182 Probab=97.79 E-value=1.9e-05 Score=46.42 Aligned_cols=261 Identities=7% Similarity=-0.021 Sum_probs=109.6 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) -.-....+++...+++.+.+.....+-.. . .++.+.-...|.+...++...-.... ......++ ++.+-.- T Consensus 142 ~~g~lvp~~~~~~ii~~~~~~~~l~~~~~--~-----~~~~~~~~~~~~~~~~~~~a~~v~E~-~~~~~~~~-~f~~v~~ 212 (418) T protein:vir:10 142 GSNSLVVADRQAGIIAPPQRKMTIRDLLM--P-----GQTSSSSIEYTVETGFTNNAAAVAEG-AQKPTSDL-KFNLKNQ 212 (418) T ss_pred CCccccchhHHHHHHHHHhhhhhHHhhcc--e-----eeccCCceeEEEEecCCCceeeeccC-cccccccc-ceeeEEE Confidence 11113444555566666666555444322 1 11222211222222211111111111 11111111 2222222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH------HhhhhcccccceeecccccCcc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA------GKAAIESNAAANYTQPARVDGV 154 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~------L~Gv~~~na~~v~dis~~t~~~ 154 (364) ...+++. -+.++..-+ +|...+...|.+++++...+..-+.+|.- ..|++.... ..+... T Consensus 213 ~~~k~~~---~~~is~ell----~ds~~l~~~i~~~l~~a~~~~~d~a~l~G~g~~~~p~Gi~~~~~-------~~~~~~ 278 (418) T protein:vir:10 213 PVRTIAH---LFKASRQIL----DDAPALQSYIDGRARYGLQLTEEGQILKGDGTGANILGILPQAS-------AFMPSI 278 (418) T ss_pred eeeeEEE---eehhhHHHH----HhHHHHHHHHHHHHHHHHHHHHHHHHhccCCCCccccccccccc-------cccccc Confidence 2223332 233444432 23323444566666665555444433310 112221111 111111 Q ss_pred cccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCCcEEEEeCCCCCCCC Q lcl|NC_016566. 155 GGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLGRRFIISDAAADAMG 232 (364) Q Consensus 155 ~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lGrrVIVDD~~p~~~~ 232 (364) ......++.++.++..++-+....-..|+||..+|..|.+ +.+..+ ++....-......+|++|+++|.||... T Consensus 279 ~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~---lkd~~G~~i~~~~~~~~~~~l~G~pV~~~~~~p~~~- 354 (418) T protein:vir:10 279 TLANATPIDKIRLALLQAVLAEFPATGIVLNPIDWASIEL---TKDSQGRYIVGNPVNGTTPRLWNLPVVETQAMTANE- 354 (418) T ss_pred cccccccHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHH---hhcCCCceeccccccCCCceecceeeEEcCCCCCCc- Confidence 1223345667888999988888888899999999998854 322222 2221100111124699999999999542 Q ss_pred ceEEEEEecc--eeEE-ecCCCCcceeeccCC--CceeeeEEeeEE---EEeeeeeeeecccccccccCC Q lcl|NC_016566. 233 AGKMLGLVPG--AVAV-TTNGLDMLAQEKGGN--ENIERWWQGEFD---FNVAVKGYRLKASARTPVEGV 294 (364) Q Consensus 233 ~Yttylfg~G--Ai~~-~~~~~~~~~~~~~g~--e~~~~~~~~~~~---f~lhp~G~sw~~~~~~~~~gg 294 (364) ++||.- ++.+ ...++.+...+..+. ..-.+.++.+.. -.+||.+|.+-.-+.+ -+| T Consensus 355 ----~~~gd~s~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~~~d~~~~~~~a~~~~~~~~~--~~g 418 (418) T protein:vir:10 355 ----FLVGAFSMAAQIFDRMEIEVLLSTENVDDFEKNMVSIRAEERLALAVYRPESFVTGALVEQ--AGG 418 (418) T ss_pred ----EEEeeccceEEEEEecceEEEEecccchhhhcCceEEEEEEeeccEEecccceEEEEeccC--CCC Confidence 445531 2222 223333322222221 111222332222 1377888876333211 123 No 42 >protein:vir:7771 Length: 330 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:149 # MgeName: Bxz2 # Cross-refs: genbank:acc:NP_817605;genbank:gi:29566035;genbank:GeneID:1259229 Probab=97.76 E-value=2.2e-05 Score=46.14 Aligned_cols=277 Identities=12% Similarity=0.000 Sum_probs=113.8 Q ss_pred CCcc---------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccC Q lcl|NC_016566. 1 MSLT---------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAP 65 (364) Q Consensus 1 fd~~---------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~ 65 (364) |-.. +-.++....+++.+.+..-..+-. ...+..+.-...|.+.....+. -+... . T Consensus 1 m~~~~~~a~~~~~t~~~g~~i~~~~~~~ii~~~~~~s~l~~~~-------~~~~~~~~~~~~p~~~~~~~a~-~v~Eg-~ 71 (330) T protein:vir:77 1 MAGSTVPSTQVALTGDFSAFLTPEQSQDYFAEIEKTSIVQRIA-------RKVPMGPTGISIPHWTGAVSAS-WTGEA-E 71 (330) T ss_pred CcccccchhhccccCCCcceechhHHHHHHHHHHhccchhhhc-------ceeeccCCceEEEEEcCCccee-EecCC-C Confidence 1111 112333444455555432221111 1122223223344333222111 11111 1 Q ss_pred CCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHH------HHhhhhcc Q lcl|NC_016566. 66 VGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIG------AGKAAIES 139 (364) Q Consensus 66 ~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla------~L~Gv~~~ 139 (364) .-...++ ++.+-.-...|++ +-+..+.+.+.. ....+...|.+++++...+...+.+|. -+.|.+.. T Consensus 72 ~~~~~~~-~f~~i~~~~~k~~---~~~~is~ell~d---s~~~~~~~i~~~l~~ai~~~~~~~~l~G~g~~~~~~g~~~~ 144 (330) T protein:vir:77 72 RKPITKG-SFGKQELEPVKIT---TIFAESAEVVRL---NPLNYLNTMRTKIAEAIALKFDAAAIHGIDKPSAFKGYLAE 144 (330) T ss_pred ccccccc-eeeEEEEeEEEEE---EeehhhHHHHhc---chHHHHHHHHHHHHHHHHHHHHHHhhcccCCCCcccccccc Confidence 1111111 2222222223333 223455554432 223344568888888888877776651 11222221 Q ss_pred cccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc----ccee- Q lcl|NC_016566. 140 NAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG----DLQV- 212 (364) Q Consensus 140 na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~----~~~~- 212 (364) ......................+.++.+++.++..+...-..|+||..++..|.+ +.+..+ ++... +... T Consensus 145 ~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~vmn~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~~~ 221 (330) T protein:vir:77 145 TTKVVSLADTNLTTASGPQGNAYLAVNNALSLLVNSGKKWTGTLLDNVTEPILNT---AVDGNGRPLFVESTYTEQVGAI 221 (330) T ss_pred ccccceeecccccccccccchhHHHHHHHHHhhhhcCCCccEEEEcHHHHHHHHH---HhccCCceeecCcccccccccc Confidence 1111110111111112233344567888888888888888899999999998864 322222 22110 1111 Q ss_pred ec-ccCCcEEEEeCCCCCCC-CceEEEEEecc-eeEEec-CCCCcce-ee---ccCC--------------CceeeeEEe Q lcl|NC_016566. 213 MG-DGLGRRFIISDAAADAM-GAGKMLGLVPG-AVAVTT-NGLDMLA-QE---KGGN--------------ENIERWWQG 270 (364) Q Consensus 213 ~~-~~lGrrVIVDD~~p~~~-~~Yttylfg~G-Ai~~~~-~~~~~~~-~~---~~g~--------------e~~~~~~~~ 270 (364) .+ ..+|++|+++|.||... +.-...+||.- -+.++. .+..... .+ ..+. +.-...++. T Consensus 222 ~~~~l~G~PV~~~~~~p~~~~~~~~~~~~gd~s~~~i~~~~~~~i~~~~e~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~ 301 (330) T protein:vir:77 222 REGRILGRPTYVADNVVNGTVGNRVVGVMGDFSQVIWGQIGGLSFDVTDQATLDFGEEQGGVWVPKLISLWQHNMVAVRC 301 (330) T ss_pred CCceecceeeEEeccccCCCCCCccEEEEEecceEEEEEecCcEEEEeecceeeecccccccccccccchhhcCcEEEEE Confidence 11 23699999999998532 11222334431 111222 1111110 00 0000 111223333 Q ss_pred eEE--E-EeeeeeeeecccccccccCCCCcChh Q lcl|NC_016566. 271 EFD--F-NVAVKGYRLKASARTPVEGVRSFKLS 300 (364) Q Consensus 271 ~~~--f-~lhp~G~sw~~~~~~~~~gg~SPT~a 300 (364) +.. | ..||..|.--... ..|..|.-+ T Consensus 302 ~~r~d~~v~~~~a~~~i~~~----~~~~~~~~~ 330 (330) T protein:vir:77 302 EAEFAFMVNDKDAFVKLTDQ----VAGTDPEEE 330 (330) T ss_pred EEEeccEEecccceEEEEec----cCCcCCCCC Confidence 221 1 2677776553222 124445444 No 43 >protein:vir:191 Length: 385 # NCBI annotation: major head subunit precursor # Family: family:all:585 # MgeID: mge:6 # MgeName: HK97 # Cross-refs: genbank:acc:NP_037701;genbank:gi:9634158;genbank:GeneID:1262530 Probab=97.65 E-value=3.3e-05 Score=45.16 Aligned_cols=262 Identities=9% Similarity=0.018 Sum_probs=105.6 Q ss_pred CC---------------ccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccC Q lcl|NC_016566. 1 MS---------------LTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAP 65 (364) Q Consensus 1 fd---------------~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~ 65 (364) +. -.+.-++.....++.+.+....++-.. ..++.+.-...|.+...++...-+... . T Consensus 96 ~~~~~~~~~~~~~~~~~g~~i~~~~~~~ii~~~~~~~~l~~~~~-------~~~~~~~~~~~~~~~~~~~~a~~v~E~-~ 167 (385) T protein:vir:19 96 FGAKTFNKSLGSDADSAGSLIQPMQIPGIIMPGLRRLTIRDLLA-------QGRTSSNALEYVREEVFTNNADVVAEK-A 167 (385) T ss_pred chhhHHHhhhccccccCCceecchhhhHHHHHhhhccchhhhcc-------eecccCcceEEEEEecCCcceeeeccC-c Confidence 11 111223333444444444333322211 112222212233222111111111111 1 Q ss_pred CCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc-cce Q lcl|NC_016566. 66 VGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA-AAN 144 (364) Q Consensus 66 ~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na-~~v 144 (364) .....+| ++.+-.-...+++. -+.++...+ +|...+..-|.+++++...+..-+.+| .|.-.++. ..+ T Consensus 168 ~~~~~~~-~~~~~~~~~~k~~~---~~~is~ell----~d~~~l~~~i~~~la~a~~~~~d~~~l---~G~g~~~~~~Gi 236 (385) T protein:vir:19 168 LKPESDI-TFSKQTANVKTIAH---WVQASRQVM----DDAPMLQSYINNRLMYGLALKEEGQLL---NGDGTGDNLEGL 236 (385) T ss_pred ccccccc-ceeEEEEeeeeEEE---eehhhHHHH----hhHHHHHHHHHHHHHHHHHHHHHHHHH---hccCCCCccccc Confidence 1111111 22222222223332 234555433 233345556777777776665554443 22111111 111 Q ss_pred eeccc-ccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCCcEE Q lcl|NC_016566. 145 YTQPA-RVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLGRRF 221 (364) Q Consensus 145 ~dis~-~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lGrrV 221 (364) ...+. .+..........+..+.++..++-.+...-..|+||..+|..|.+ +.+..+ ++....-..-...+|+|| T Consensus 237 ~~~~~~~~~~~~~~~~~~~d~i~~~~~~l~~~~~~~~~~~~~~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~l~G~pV 313 (385) T protein:vir:19 237 NKVATAYDTSLNATGDTRADIIAHAIYQVTESEFSASGIVLNPRDWHNIAL---LKDNEGRYIFGGPQAFTSNIMWGLPV 313 (385) T ss_pred ccccccccccccccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHH---hhcCCCceeccCcccCCCceecceee Confidence 11111 000011112235667889999998888888899999999998865 322222 222111111112369999 Q ss_pred EEeCCCCCCCCceEEEEEec--ceeEE-ecCCCCcceeeccC---CCceeeeEEe--eEEE-Eeeeeeeeeccccccccc Q lcl|NC_016566. 222 IISDAAADAMGAGKMLGLVP--GAVAV-TTNGLDMLAQEKGG---NENIERWWQG--EFDF-NVAVKGYRLKASARTPVE 292 (364) Q Consensus 222 IVDD~~p~~~~~Yttylfg~--GAi~~-~~~~~~~~~~~~~g---~e~~~~~~~~--~~~f-~lhp~G~sw~~~~~~~~~ 292 (364) +++|.||... .+||. .++.+ ...++.+......+ ..+ .+.++. |+.+ ..+|..|..-.-+-+ T Consensus 314 ~~~~~~p~~~-----~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~-~~~~~~~~r~~~~v~~~~a~~~~~~~aa--- 384 (385) T protein:vir:19 314 VPTKAQAAGT-----FTVGGFDMASQVWDRMDATVEVSREDRDNFVKN-MLTILCEERLALAHYRPTAIIKGTFSSG--- 384 (385) T ss_pred EEcCcCCCCc-----EEEeecccEEEEEEecceEEEEeccccchhhcC-cEEEEEEEeeccEEecccceEEEEeccC--- Confidence 9999999532 33332 12222 22222221111111 122 222222 2222 367777755322211 Q ss_pred CCCC Q lcl|NC_016566. 293 GVRS 296 (364) Q Consensus 293 gg~S 296 (364) + T Consensus 385 ---~ 385 (385) T protein:vir:19 385 ---S 385 (385) T ss_pred ---C Confidence 1 No 44 >protein:vir:1886 Length: 385 # NCBI annotation: major capsid subunit precursor # Family: family:all:585 # MgeID: mge:41 # MgeName: HK022 # Cross-refs: genbank:acc:NP_037666;genbank:gi:9634124;genbank:GeneID:1262513 Probab=97.65 E-value=3.3e-05 Score=45.16 Aligned_cols=262 Identities=9% Similarity=0.018 Sum_probs=105.6 Q ss_pred CC---------------ccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccC Q lcl|NC_016566. 1 MS---------------LTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAP 65 (364) Q Consensus 1 fd---------------~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~ 65 (364) +. -.+.-++.....++.+.+....++-.. ..++.+.-...|.+...++...-+... . T Consensus 96 ~~~~~~~~~~~~~~~~~g~~i~~~~~~~ii~~~~~~~~l~~~~~-------~~~~~~~~~~~~~~~~~~~~a~~v~E~-~ 167 (385) T protein:vir:18 96 FGAKTFNKSLGSDADSAGSLIQPMQIPGIIMPGLRRLTIRDLLA-------QGRTSSNALEYVREEVFTNNADVVAEK-A 167 (385) T ss_pred chhhHHHhhhccccccCCceecchhhhHHHHHhhhccchhhhcc-------eecccCcceEEEEEecCCcceeeeccC-c Confidence 11 111223333444444444333322211 112222212233222111111111111 1 Q ss_pred CCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc-cce Q lcl|NC_016566. 66 VGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA-AAN 144 (364) Q Consensus 66 ~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na-~~v 144 (364) .....+| ++.+-.-...+++. -+.++...+ +|...+..-|.+++++...+..-+.+| .|.-.++. ..+ T Consensus 168 ~~~~~~~-~~~~~~~~~~k~~~---~~~is~ell----~d~~~l~~~i~~~la~a~~~~~d~~~l---~G~g~~~~~~Gi 236 (385) T protein:vir:18 168 LKPESDI-TFSKQTANVKTIAH---WVQASRQVM----DDAPMLQSYINNRLMYGLALKEEGQLL---NGDGTGDNLEGL 236 (385) T ss_pred ccccccc-ceeEEEEeeeeEEE---eehhhHHHH----hhHHHHHHHHHHHHHHHHHHHHHHHHH---hccCCCCccccc Confidence 1111111 22222222223332 234555433 233345556777777776665554443 22111111 111 Q ss_pred eeccc-ccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCCcEE Q lcl|NC_016566. 145 YTQPA-RVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLGRRF 221 (364) Q Consensus 145 ~dis~-~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lGrrV 221 (364) ...+. .+..........+..+.++..++-.+...-..|+||..+|..|.+ +.+..+ ++....-..-...+|+|| T Consensus 237 ~~~~~~~~~~~~~~~~~~~d~i~~~~~~l~~~~~~~~~~~~~~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~l~G~pV 313 (385) T protein:vir:18 237 NKVATAYDTSLNATGDTRADIIAHAIYQVTESEFSASGIVLNPRDWHNIAL---LKDNEGRYIFGGPQAFTSNIMWGLPV 313 (385) T ss_pred ccccccccccccccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHH---hhcCCCceeccCcccCCCceecceee Confidence 11111 000011112235667889999998888888899999999998865 322222 222111111112369999 Q ss_pred EEeCCCCCCCCceEEEEEec--ceeEE-ecCCCCcceeeccC---CCceeeeEEe--eEEE-Eeeeeeeeeccccccccc Q lcl|NC_016566. 222 IISDAAADAMGAGKMLGLVP--GAVAV-TTNGLDMLAQEKGG---NENIERWWQG--EFDF-NVAVKGYRLKASARTPVE 292 (364) Q Consensus 222 IVDD~~p~~~~~Yttylfg~--GAi~~-~~~~~~~~~~~~~g---~e~~~~~~~~--~~~f-~lhp~G~sw~~~~~~~~~ 292 (364) +++|.||... .+||. .++.+ ...++.+......+ ..+ .+.++. |+.+ ..+|..|..-.-+-+ T Consensus 314 ~~~~~~p~~~-----~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~-~~~~~~~~r~~~~v~~~~a~~~~~~~aa--- 384 (385) T protein:vir:18 314 VPTKAQAAGT-----FTVGGFDMASQVWDRMDATVEVSREDRDNFVKN-MLTILCEERLALAHYRPTAIIKGTFSSG--- 384 (385) T ss_pred EEcCcCCCCc-----EEEeecccEEEEEEecceEEEEeccccchhhcC-cEEEEEEEeeccEEecccceEEEEeccC--- Confidence 9999999532 33332 12222 22222221111111 122 222222 2222 367777755322211 Q ss_pred CCCC Q lcl|NC_016566. 293 GVRS 296 (364) Q Consensus 293 gg~S 296 (364) + T Consensus 385 ---~ 385 (385) T protein:vir:18 385 ---S 385 (385) T ss_pred ---C Confidence 1 No 45 >protein:vir:100247 Length: 425 # NCBI annotation: gp76 # Family: family:all:21 # MgeID: mge:1619 # MgeName: Bcep176 # Cross-refs: genbank:acc:YP_355412;genbank:gi:77864702;genbank:GeneID:3725969 Probab=97.58 E-value=4.2e-05 Score=44.57 Aligned_cols=271 Identities=8% Similarity=-0.058 Sum_probs=105.1 Q ss_pred CCcc--------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhccccccc Q lcl|NC_016566. 1 MSLT--------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDR 60 (364) Q Consensus 1 fd~~--------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~ 60 (364) |+-. +-.+++...++|.+.+.....+-. .++ +-. .|++ ..|... ++..- T Consensus 117 f~~~l~~~e~~~al~~~t~~~gG~lvP~~~~~~ii~~~~~~s~l~~l~---~~~--~~~-~~~~-~~~~~~--~~~~a-- 185 (425) T protein:vir:10 117 FKAHVKRGDVQAALNKGEDSEGGYLTPIEWDRTITNKLVLISPMRQLC---RVQ--PVS-KAGF-SKLFNM--GGTTS-- 185 (425) T ss_pred HHHHhhhhhhHHHhhcCcCCCCceeccHhHHHHHHHHHHhhhhhhhhc---eee--ecc-CCce-EEEEEc--CCcce-- Confidence 1000 223333344444444322221111 111 000 0111 111110 11000 Q ss_pred ccccCCCccccc-hhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH-----Hh Q lcl|NC_016566. 61 NAYAPVGTPATA-KVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA-----GK 134 (364) Q Consensus 61 d~~~~~~~~~T~-~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~-----L~ 134 (364) .--+.+.. -| ........+.....+-.+-+.++...+. ...-.+-..|.+++++...+..-+.+|.- .. T Consensus 186 ~wv~E~~~--~~~~~~~~f~~v~~~~~k~~~~i~iS~ell~---ds~~~l~~~i~~~la~ai~~~~d~~~l~G~G~~~p~ 260 (425) T protein:vir:10 186 GWVGEASQ--RPQTNAATFQPLSFASGEIYANPAATQQILD---DAEIDLESWLATEVQTEFAKQEGKAFLAGDGTNKPN 260 (425) T ss_pred eeeccccc--cccccccccceeeeeheeeEeehHhHHHHHh---cchhHHHHHHHHHHHHHHHHHHHhhhhcccCCCCcc Confidence 00000000 01 0111122222222222223345555443 11223344577777777776655544321 01 Q ss_pred hhhcccccceeec---cc--ccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--cccc Q lcl|NC_016566. 135 AAIESNAAANYTQ---PA--RVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAI 207 (364) Q Consensus 135 Gv~~~na~~v~di---s~--~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~ 207 (364) |++..-+...... .. ..-.......+++.++.+....+......=..|+||...+..|.+ +.+.++ ++.. T Consensus 261 Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~l~~l~~~l~~~~~~~a~~vmn~~~~~~L~~---lkD~~G~~l~~~ 337 (425) T protein:vir:10 261 GLLTYIAGGANAAKHPFGAIEVVNSGAAADITSDGIIDLVYDLPSAFTGNARFAMNRNTQRQVRK---LKDGQGNYLWQP 337 (425) T ss_pred eeeeccccccccccccccccccccccccccccHHHHHHHHhhhhhhhccCCEEEEchHHHHHHHH---hhcCCCceeecc Confidence 1221111000000 00 000111233456677888888887777677789999999998854 334343 2222 Q ss_pred c-cceeecccCCcEEEEeCCCCCCCCceEEEEEec--ceeEEec-CCCCcceeeccCCCceeeeEEeeEE-EEeeeeeee Q lcl|NC_016566. 208 G-DLQVMGDGLGRRFIISDAAADAMGAGKMLGLVP--GAVAVTT-NGLDMLAQEKGGNENIERWWQGEFD-FNVAVKGYR 282 (364) Q Consensus 208 ~-~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~--GAi~~~~-~~~~~~~~~~~g~e~~~~~~~~~~~-f~lhp~G~s 282 (364) . .-......+|++|+++|.||.....-...+||. .++.+.+ ....+..+.........+....|+. =.+||..|+ T Consensus 338 ~~~~g~~~~l~G~PV~~~~~~p~~~~~~~~i~~Gd~~~~~~i~~~~~~~v~~d~~~~~~~~~~~~~~r~d~~v~~~~A~~ 417 (425) T protein:vir:10 338 SYVAGQPATLAGYPVTEVPDMPDVAANSTPILFGDFQQTYLIIDRIGVRVLRDPYTAKPYVLFYTTKRVGGGLLNPEPMR 417 (425) T ss_pred CccCCCCceecceeeEEecCcCCccCCccEEEEEehhccEEEEEecceEEEecccccCCcEEEEEEEEeccEeecccceE Confidence 1 111111246999999999996543334556653 3333332 3333222222222212111111211 125666655 Q ss_pred eccccccc Q lcl|NC_016566. 283 LKASARTP 290 (364) Q Consensus 283 w~~~~~~~ 290 (364) --.-..+. T Consensus 418 ~l~~~as~ 425 (425) T protein:vir:10 418 AMKVAASE 425 (425) T ss_pred EEEeeccC Confidence 53222110 No 46 >protein:vir:97053 Length: 390 # NCBI annotation: putative head protein # Family: family:all:585 # MgeID: mge:1653 # MgeName: OP1 # Cross-refs: genbank:acc:YP_453565;genbank:gi:84662600;genbank:GeneID:5142468 Probab=97.57 E-value=4.4e-05 Score=44.47 Aligned_cols=260 Identities=8% Similarity=-0.014 Sum_probs=106.7 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) ---.+-.++....+++.+.+...+.+-.. ..++.+.-...|.+....+...-+... ......++ ++.+-.- T Consensus 120 ~~g~lip~~~~~~ii~~~~~~~~i~~~~~-------~~~~~~~~~~~~~~~~~~~~a~~v~Eg-~~~~~~~~-~~~~i~~ 190 (390) T protein:vir:97 120 SAGALTTPNRLPGFITPPDARLTVRDLIG-------SGRTDSALIEYVQETGFVNNAAIVAEG-ALKPESSL-KFAKKTD 190 (390) T ss_pred ccccccchhhhHHHHHHHhhhhhhHhhcc-------eeeccCCceEEEEEecCCcceeeecCC-cccccccc-ceeEEEE Confidence 00112233444555555555444433221 111222212223222111111111111 11111111 1222222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccc-ccceeecccccCc-ccccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESN-AAANYTQPARVDG-VGGRT 158 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~n-a~~v~dis~~t~~-~~~~~ 158 (364) ...++.. -+.++...+. |...+..-|.+++++...+..-+.+|. |--.++ .......+..... ..... T Consensus 191 ~~~k~~~---~~~is~ell~----ds~~l~~~i~~~la~a~~~~~d~a~l~---G~g~~~~p~Gi~~~~~~~~~~~~~~~ 260 (390) T protein:vir:97 191 TTHVIAH---TMKATRQILS----DAPQLASYMNNRLIRGLKVKEDAEILR---GTGANDGLLGLIPQATTYAAPTTIAG 260 (390) T ss_pred eeeeEEE---eehhhHHHHH----hHHHHHHHHHHHHHHHHHHHHHHHHhh---cCCCCccccceeeccccccccccccc Confidence 2222322 2345554332 333455567777887777766554442 211111 0111111111111 11122 Q ss_pred cccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCCcEEEEeCCCCCCCCceEE Q lcl|NC_016566. 159 FPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLGRRFIISDAAADAMGAGKM 236 (364) Q Consensus 159 ~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lGrrVIVDD~~p~~~~~Ytt 236 (364) ...+.++.++..++.+..-.-..|+||+.++..|.+ +.+..+ ++....-..-...+|++|+++|.||.. + T Consensus 261 ~~~~d~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~---lkd~~G~~l~~~~~~~~~~~l~G~pV~~~~~~~~~-----~ 332 (390) T protein:vir:97 261 ATRVDQLRLAMLQASLAEYPASGIVINPIDWAAIEL---AKDANNQYLIGNARGTLTPTLWGLPVVATQAMAPG-----E 332 (390) T ss_pred cchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHH---hhcCCCceeecCccCCCCceecceeeEEcCCCCCC-----c Confidence 334556788899998888888999999999999864 333333 222211111112369999999999854 2 Q ss_pred EEEec--ceeEE-ecCCCCcceeec-cCCCceeeeEEee--EEE-Eeeeeeeeecccc Q lcl|NC_016566. 237 LGLVP--GAVAV-TTNGLDMLAQEK-GGNENIERWWQGE--FDF-NVAVKGYRLKASA 287 (364) Q Consensus 237 ylfg~--GAi~~-~~~~~~~~~~~~-~g~e~~~~~~~~~--~~f-~lhp~G~sw~~~~ 287 (364) .+||. .++.+ ...++....... ..-..-.+.++.+ +.+ .+||..|..-.-+ T Consensus 333 ~~~gd~~~~~~~~~~~~~~i~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~v~~~~a 390 (390) T protein:vir:97 333 FLVGAFDLAAQIFDQWDARVEIGYVNDDFQRNMVTVLAEERLALVVYRPEALITGSFA 390 (390) T ss_pred EEEEeccceEEEEEecceEEEEeecccccccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 34442 22322 223332222111 1111112223332 222 2556666542221 No 47 >protein:vir:10364 Length: 390 # NCBI annotation: head protein; major capsid subunit precursor # Family: family:all:585 # MgeID: mge:183 # MgeName: Xp10 # Cross-refs: genbank:acc:NP_858956;genbank:gi:32128421;genbank:GeneID:2648357 Probab=97.51 E-value=5.4e-05 Score=44.00 Aligned_cols=255 Identities=8% Similarity=-0.030 Sum_probs=108.9 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) =+-.+..++....+++.+.+....++-.. ..++.+.-...|.+....+.......... .... +. +... T Consensus 120 ~~g~~~~~~~~~~ii~~~~~~~~l~~~~~-------~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~-~~~~---~~-~~~~ 187 (390) T protein:vir:10 120 SAGALTTPNRLPGFITQPDARLTVRDLIG-------SGRTDSALIEYVQETGFVNNAAIVAEGAL-KPES---SL-KFAK 187 (390) T ss_pred ccccccchhHHHHHHHHHHhhchhhhhcc-------eeeccCCceEEEEEecCCcceeeecCCcc-cccc---cc-ceeE Confidence 01113445555566666655444333211 11122221233333222221111121111 1111 11 1233 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH------HhhhhcccccceeecccccCcc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA------GKAAIESNAAANYTQPARVDGV 154 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~------L~Gv~~~na~~v~dis~~t~~~ 154 (364) +-.+...-.+-+.++...+ +|...+..-|.+++++...+..-+.+|.- ..|++... ...... T Consensus 188 i~~~~~k~~~~~~is~ell----~d~~~l~~~i~~~l~~~~~~~~~~~il~G~G~~~~p~Gi~~~~--------~~~~~~ 255 (390) T protein:vir:10 188 KTDTTHVIAHTMKATRQIL----SDAPQLASYMNNRLIRGLKVKEDAEILRGTGANDGLLGLIPQA--------TTYAAP 255 (390) T ss_pred EEEeeEEEEEeehhhHHHH----HhHHHHHHHHHHHHHHHHHHHHHHHHhhcCCCCcccccccccc--------cccccc Confidence 3333322223344565433 23334556677777777666555544421 12222111 101001 Q ss_pred c-ccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCCcEEEEeCCCCCCC Q lcl|NC_016566. 155 G-GRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLGRRFIISDAAADAM 231 (364) Q Consensus 155 ~-~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lGrrVIVDD~~p~~~ 231 (364) . ......+..+.++...+-+....-..|+||...|..|.+ +.+..+ ++....-..-...+|+||+++|.||... T Consensus 256 ~~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~L~~---lkd~~g~~l~~~~~~~~~~~l~G~pv~~~~~~p~~~ 332 (390) T protein:vir:10 256 TTIAGATRVDQLRLAMLQASLAEYPASGIVINPIDWAAIEL---AKDANNQYLIGNARGTLTPTLWGLPVVATQAMAPGE 332 (390) T ss_pred ccccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHH---hhcCCCceeecCCcCcCCceecceeeEEcCCCCCCc Confidence 1 112234456788998998888889999999999998865 323332 2222111111134699999999999532 Q ss_pred CceEEEEEec---ceeEEecCCCCcceeec-cCCCceeeeEEeeEEE---Eeeeeeeeecccc Q lcl|NC_016566. 232 GAGKMLGLVP---GAVAVTTNGLDMLAQEK-GGNENIERWWQGEFDF---NVAVKGYRLKASA 287 (364) Q Consensus 232 ~~Yttylfg~---GAi~~~~~~~~~~~~~~-~g~e~~~~~~~~~~~f---~lhp~G~sw~~~~ 287 (364) ++||. +...+....+....... ..-..-.+.++.+..| .++|..|..-.-+ T Consensus 333 -----~~~gdf~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~~~~a 390 (390) T protein:vir:10 333 -----FLVGAFDLAAQIFDQWDARVEIGYVNDDFQRNMVTVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred -----EEEEeccceEEEEEecceEEEEeecccccccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 33332 22222223332221111 1111112223222221 2566665442221 No 48 >protein:vir:81160 Length: 371 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:1892 # MgeName: Geobacillus virus E2 # Cross-refs: genbank:acc:YP_001285811;genbank:gi:148747732;genbank:GeneID:5247203 Probab=97.50 E-value=5.4e-05 Score=43.96 Aligned_cols=253 Identities=6% Similarity=-0.031 Sum_probs=96.5 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccc--cchhhhcc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPA--TAKVLARM 78 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~--T~~kit~~ 78 (364) ---.+..+++....++.+.+..-..+-.. . .++.+.-...++....++.. ...-+.+.... +...+++- T Consensus 98 ~gg~~vP~~~~~~ii~~~~~~s~i~~~~~--~-----~~~~~~~~~~~~~~~~~~~~--a~~v~Eg~~~~~~~~~~f~~i 168 (371) T protein:vir:81 98 DGGYTVPQDIQTRINELRESKDALQNLIT--V-----EPVTTLSGSRVFKKRSQQTG--FVEVAEGAAIGEKATPQFTLL 168 (371) T ss_pred cCceeecHhHHHHHHHHHHhhhhhhhhce--e-----eeccCCceeEEEEeecCCcc--eeeeccccccccccccceeeE Confidence 01112222333333444443322222111 0 11111111111111111100 00011111100 00122222 Q ss_pred ceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccc Q lcl|NC_016566. 79 LTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRT 158 (364) Q Consensus 79 ~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~ 158 (364) .-...|++.. +.++...+. ...-.+..-|.+++++...+-..+.++.... .. .... T Consensus 169 ~~~~~k~~~~---~~iS~ell~---ds~~~l~~~i~~~l~~a~~~~~~~~i~~g~g-~~-----------------~~~~ 224 (371) T protein:vir:81 169 QYQVKKYAGF---FRVTNELLN---DSTEAIVNTLVRWIGDESRVTRNGLIINVLN-TK-----------------AKTA 224 (371) T ss_pred EeeeeEEEEe---ehhhHHHHh---hhhHHHHHHHHHHHHHHHHHHHHHHHHhhcc-cc-----------------cccc Confidence 2223334332 345555443 1122334446666666665544443333211 00 0111 Q ss_pred cccHHHHHHHHH-HhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccCCcEEEEeCCCCCCC--- Q lcl|NC_016566. 159 FPTLADFPLAAS-KFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGLGRRFIISDAAADAM--- 231 (364) Q Consensus 159 ~~s~~~l~~A~~-~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~lGrrVIVDD~~p~~~--- 231 (364) ..++.++..+.. .+-.....-..|+||...|..|.+ +.+..+ ++... .-......+|++|+++|.+|... T Consensus 225 ~~~~~~i~~~~~~~l~~~~~~~a~~vmn~~~~~~L~~---lkd~~g~~l~~~~~~~~~~~~l~G~pV~~~~~~~~~~~~~ 301 (371) T protein:vir:81 225 IADLDGLKQIINVQLDPVFRSTSSVIVNQDAFNWLDT---LKDQNGQYLLQPSISSPTGRQLLGLPVVIVSNKVLANRVD 301 (371) T ss_pred cccHHHHHHHHHhhcchhhhcCCEEEEcHHHHHHHHH---hhccCCCeeeecccCCCCCceecceeEEEecccccCcccc Confidence 233445665554 344444455789999999999864 333332 32211 11111123699999999998432 Q ss_pred ----CceEEEEEec---ceeEEecCCCCcceeeccC--CCceeeeEEeeEEE---Eeeeeeeeecccccc Q lcl|NC_016566. 232 ----GAGKMLGLVP---GAVAVTTNGLDMLAQEKGG--NENIERWWQGEFDF---NVAVKGYRLKASART 289 (364) Q Consensus 232 ----~~Yttylfg~---GAi~~~~~~~~~~~~~~~g--~e~~~~~~~~~~~f---~lhp~G~sw~~~~~~ 289 (364) .....++||. +............+.+..+ -+.-...++.+..| .+||..|..-.-+.+ T Consensus 302 ~~~~~~~~~i~~Gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~v~~~~~~r~d~~~~~~~a~~~~~~~~A 371 (371) T protein:vir:81 302 GGTGAQFAPIIVGDLKEAVVMFDRQRTEIMSSNVAMDAFETDATLWRAIERMDVKMRDDEAFVFGEVQLA 371 (371) T ss_pred ccccCCcceEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEEecC Confidence 1233455553 2222222222222222211 11222333333322 277888766433211 No 49 >protein:vir:7409 Length: 408 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:146 # MgeName: P335 # Cross-refs: genbank:acc:NP_839926;genbank:gi:30089896;genbank:GeneID:1260683 Probab=97.50 E-value=5.6e-05 Score=43.92 Aligned_cols=275 Identities=11% Similarity=0.064 Sum_probs=104.6 Q ss_pred CC------c--------------------cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhc Q lcl|NC_016566. 1 MS------L--------------------TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIA 54 (364) Q Consensus 1 fd------~--------------------~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~ 54 (364) |- . .+-.+++....++.+.+.....+-.. .+ .-....|++.... +...+ T Consensus 97 ~~~~~~~~~~~~~~~~~~a~~~~~~~~gg~~vP~~~~~~Ii~~~~~~~~l~~~~~--~~--~~~~~~~~~~~~~-~~~~~ 171 (408) T protein:vir:74 97 FVNMVRNPMAFLNTVSSKTETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQYVR--VE--SVSTSSGSRVYEK-WTDVT 171 (408) T ss_pred HHHHHhcchhhhhhhhhhhhcccccCCCceeechhHhhHHHHHHhhhcchhhhcc--ee--eccCCcceEEEEe-ecCCc Confidence 00 0 00111222222222222211111110 00 0001112221111 11111 Q ss_pred ccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_016566. 55 NLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGK 134 (364) Q Consensus 55 g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~ 134 (364) +........+.. . ....-+...+-.++.+-.+-+.++...+. ..+..+...|.+++++...+-..+.+|. T Consensus 172 ~~~~~v~E~~~~--~--~~~~~~~~~i~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~~~~~~~d~~il~--- 241 (408) T protein:vir:74 172 PLKAMDEEDGKI--P--DLDNPRLTIIKYLIKRYAGIITATNTLLK---DTAENILAWLSSWIAKKVVVTRNQAIIA--- 241 (408) T ss_pred cccccccccccc--c--cccccceeeEEeeeeeEEeeehhHHHHHh---hchHHHHHHHHHHHHHHHHHHHHHHHhh--- Confidence 100010100000 0 00001122233333322233446665543 2232344556677777776655543332 Q ss_pred hhhcccccceeecccccCcccccccccHHHHHHHHH-HhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cc Q lcl|NC_016566. 135 AAIESNAAANYTQPARVDGVGGRTFPTLADFPLAAS-KFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DL 210 (364) Q Consensus 135 Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~-~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~ 210 (364) |- +.+. ......++.++.++.+ .+-.....-..|+||...|..|.+ +.+..+ ++... .- T Consensus 242 G~-G~~~-------------~~~~~~~~~~i~~~~~~~l~~~~~~~a~~v~n~~~~~~l~~---lkd~~G~~l~~~~~~~ 304 (408) T protein:vir:74 242 AM-GTVP-------------KKPTIANFDDVITMINTSVDPAIIATSSLLTNQSGLNKLAL---VKTAEGKYLLEPDPTK 304 (408) T ss_pred cc-cccc-------------cccccccHHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHH---hhcCCCceEeccCcCC Confidence 20 0110 1122344556777664 344444444679999999998864 333222 22211 11 Q ss_pred eeecccCCcEEEEeCC--CCCCCCceEEEEEec--ceeEEe-cCCCCcceeecc--CCCceeeeEEeeEEEEeeeeeeee Q lcl|NC_016566. 211 QVMGDGLGRRFIISDA--AADAMGAGKMLGLVP--GAVAVT-TNGLDMLAQEKG--GNENIERWWQGEFDFNVAVKGYRL 283 (364) Q Consensus 211 ~~~~~~lGrrVIVDD~--~p~~~~~Yttylfg~--GAi~~~-~~~~~~~~~~~~--g~e~~~~~~~~~~~f~lhp~G~sw 283 (364) ......+|++|++.|. ||.....-.+++||. .++.+. ..+..+...... +-..-.+.++.+..|. T Consensus 305 ~~~~~l~G~pV~~~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d-------- 376 (408) T protein:vir:74 305 PNSYLIKGKQVIVVADRWLPNSGSTVYPLYYGDMSQAITLFDRENMSLLPTNIGAGAFETDTTKIRVIDRFD-------- 376 (408) T ss_pred CCCceecceeeEEecCcccccccCCcceEEEEehhccEEEEEecceEEEEeccccchhhcceeeEEEEEeeC-------- Confidence 1111236999998654 665443334456654 233332 233332222221 1122222222222111 Q ss_pred cccccccccCCCCcChhhhcCCccceeecCcCcCcceEEEEecCcccccccccccccccc Q lcl|NC_016566. 284 KASARTPVEGVRSFKLSDITDKANWELDQGQVDNAPATVQDVGSDSDTKGRRRTQTAQAV 343 (364) Q Consensus 284 ~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pgv~~~~~~~~~~~~~~~~~~~~~~ 343 (364) | .+. ..-+-|.+++.+.++..+.-.++|+|+| T Consensus 377 ------------------------~-~~~---~~~a~~~~~~~~~~~~~~~~~~~~~~~~ 408 (408) T protein:vir:74 377 ------------------------V-KAT---DSEALVAGSFTAIADQVGNFKTTTSTAV 408 (408) T ss_pred ------------------------c-EEe---cccceEEEEeecccCCCCCCCCCccccC Confidence 1 111 1223456666666666666666777776 No 50 >protein:vir:3870 Length: 400 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:82 # MgeName: A2 # Cross-refs: genbank:acc:NP_680487;swissprot:trembl:q8ltc0;genbank:gi:22296527;interpro:IPR006444;uniprot:Q8LTC0;genbank:GeneID:951713 Probab=97.49 E-value=5.6e-05 Score=43.89 Aligned_cols=253 Identities=6% Similarity=-0.043 Sum_probs=88.9 Q ss_pred CCc--------------------------------cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeee Q lcl|NC_016566. 1 MSL--------------------------------TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKM 48 (364) Q Consensus 1 fd~--------------------------------~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~ 48 (364) +.. .+=.+++...+++.+.+.....+-+. -.+..+.-...| T Consensus 109 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~-------~~~~~~~~~~~~ 181 (400) T protein:vir:38 109 VNFEKTDVGTFAVLRAVPTDASDAVNAGVKAADAASTIPETISNTPQRELQTVVDLKPFTN-------VFQASTQKGTYP 181 (400) T ss_pred HHHHHHHHHHHhhhhhhhHHHHHHHhhcccccCCcccccHHHHHHHHHHHHhhhhhhhcce-------eEeccCcceEEE Confidence 000 01112222223333332211111110 011111112233 Q ss_pred hhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 49 SVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKA 128 (364) Q Consensus 49 ~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~ 128 (364) ....-++......-.+......+| ++ ..+.....+-.+-+.++...+. .....+...|.+.+ .+. T Consensus 182 ~~~~~~~~~~~~~E~~~~~~~~~~-~f---~~i~~~~~k~~~~~~is~ell~---ds~~~~~~~i~~~l--------~~~ 246 (400) T protein:vir:38 182 TVANATTKMVTVAELEKNPAMAKP-EF---KPVNWSVETYRQALPVSQESID---DSAIDLVGLIAQNG--------QQI 246 (400) T ss_pred EEecCCCccccccccccccccccc-cc---eeeEeehhheeeehhhHHHHHh---hhHHHHHHHHHHHH--------HHH Confidence 322222211111101000000001 11 1222222111122234443332 11111222233333 333 Q ss_pred HHHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccc Q lcl|NC_016566. 129 GIGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFA 206 (364) Q Consensus 129 lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~ 206 (364) +...+..+|-.... ........++.++.++....=|.... ..|+||+.+|..|.+ +.+..+ ++. T Consensus 247 ~~~~~~~~i~~~~~----------~~~~~~~~~~~~~~~~~~~~~~~~~~-a~~v~~~~~~~~l~~---lkd~~G~~i~~ 312 (400) T protein:vir:38 247 KVNTTNGAVATLLK----------GFTAKTISSVDDLKHINNVDLDPAYS-RVIIASQSFYNFLDT---VKDGNGRYLLQ 312 (400) T ss_pred HHHHHHHhhhhccc----------cccccccccHHHHHHHHHhhhhhhhC-cEEEEcHHHHHHHHH---hhccCCCeeee Confidence 33333333311100 00111223455666666654343332 689999999998854 433333 222 Q ss_pred c-ccceeecccCCcEEEEeCCCCCCCCceEEEEEec-c-eeEEe-cCCCCcceeec-cCCCceeeeEEeeEEEEeeeeee Q lcl|NC_016566. 207 I-GDLQVMGDGLGRRFIISDAAADAMGAGKMLGLVP-G-AVAVT-TNGLDMLAQEK-GGNENIERWWQGEFDFNVAVKGY 281 (364) Q Consensus 207 ~-~~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~-G-Ai~~~-~~~~~~~~~~~-~g~e~~~~~~~~~~~f~lhp~G~ 281 (364) . ..-......+|++|+++|.+|.....-...+||. . ++.+. .....+..... .....+...++.+. -.+||.+| T Consensus 313 ~~~~~~~~~~l~G~pv~~~~~~~~~~~g~~~~~~gd~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~d~-~~~~~~a~ 391 (400) T protein:vir:38 313 DSILTPSGKSVLGMPIAVVSDDTLGAAGEAHAFLGDIKRAILFANRADFMVRWVDDQIYGQFLQAGMRFGV-SVADEKAG 391 (400) T ss_pred cCcCCCCccccccceeEEecccccCCCCceEEEEEeccccEEEEeecceEEEEecccccceeEEEEEEecc-EEecccce Confidence 1 1111111246999999999986443234556655 2 23332 22222222111 11122211221121 23778888 Q ss_pred eecccccccccCCCCcCh Q lcl|NC_016566. 282 RLKASARTPVEGVRSFKL 299 (364) Q Consensus 282 sw~~~~~~~~~gg~SPT~ 299 (364) .+-.-+.+ - T Consensus 392 ~~l~~~~~---------a 400 (400) T protein:vir:38 392 YFLTYTPK---------A 400 (400) T ss_pred EEEEeecC---------C Confidence 87444321 1 No 51 >protein:vir:4830 Length: 397 # NCBI annotation: MPL-7201 # Family: family:all:21 # MgeID: mge:105 # MgeName: 7201 # Cross-refs: genbank:acc:NP_038327;genbank:gi:9634653;genbank:GeneID:1262632 Probab=97.49 E-value=5.7e-05 Score=43.85 Aligned_cols=269 Identities=12% Similarity=0.036 Sum_probs=108.6 Q ss_pred CCc-------cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccch Q lcl|NC_016566. 1 MSL-------TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAK 73 (364) Q Consensus 1 fd~-------~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~ 73 (364) +.. .+..+++....++.+.+.....+-. ..+ .-....|.+...++.. ..+...............++ T Consensus 109 ~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~---~~~-~~~~~~~~~~~~~~~~-~~~~a~~v~E~~~~~~~~~~- 182 (397) T protein:vir:48 109 KTDASGSDAGLTIPQDIQTAIHTLVRQYDSLQEYV---NVE-NVTTLTGSRVYEKWAD-ITGLAKLDDEAGSIGTNDDP- 182 (397) T ss_pred hhccCCccccccccHHHHHHHHHHHHHHHHHHhhh---cee-eccCCcceEEEEeecC-CCcceeeecccccccccccc- Confidence 100 1223333344444444432222211 111 0111122222222111 11111111111110000001 Q ss_pred hhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCc Q lcl|NC_016566. 74 VLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDG 153 (364) Q Consensus 74 kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~ 153 (364) ++.+-.-...+++ +-+.++...+. .....+...|.+++++...+...+.+|. | .... T Consensus 183 ~~~~v~~~~~k~~---~~~~iS~ell~---ds~~~l~~~v~~~l~~~~~~~~d~~il~---G----~g~~---------- 239 (397) T protein:vir:48 183 KLYPIRYAIKRYA---GISTVTNSLLA---DSAENILAWLSGWIAKKVVVTRNKAILE---A----IATL---------- 239 (397) T ss_pred ceeeEEeeheeee---eehhhHHHHHh---hchHHHHHHHHHHHHHHHHHHHHHHHhh---c----cccc---------- Confidence 1222222222232 22345555443 2222344456677777776655554432 2 1100 Q ss_pred ccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccCCcEEEEeCCCCC- Q lcl|NC_016566. 154 VGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGLGRRFIISDAAAD- 229 (364) Q Consensus 154 ~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~lGrrVIVDD~~p~- 229 (364) .......++.++.++..++......-..|+||...|..|.+ +.+..+ ++... .-......+|++|++.|..+. T Consensus 240 ~~~~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~L~~---lkd~~G~~i~~~~~~~~~~~~l~G~PV~~~~~~~~~ 316 (397) T protein:vir:48 240 PTKPTLTKWDDIIDLQAKVDPAIKQTSFFLTNTSGFTALKK---VKNAFGDYLMERDVKSPTGYSIDGFAVKEVADRWLA 316 (397) T ss_pred ccccccccHHHHHHHHHHhhhhhcCCCEEEECHHHHHHHHH---hhcCCCceeeccCcCCCCCceeccceeEEecccccC Confidence 01112345667888888887777788999999999998864 433332 22211 111111236999998776432 Q ss_pred -CCCceEEEEEec--ceeEEe-cCCCCcceeeccCC--CceeeeEEee--EE-EEeeeeeeeecccccccccCCCCcChh Q lcl|NC_016566. 230 -AMGAGKMLGLVP--GAVAVT-TNGLDMLAQEKGGN--ENIERWWQGE--FD-FNVAVKGYRLKASARTPVEGVRSFKLS 300 (364) Q Consensus 230 -~~~~Yttylfg~--GAi~~~-~~~~~~~~~~~~g~--e~~~~~~~~~--~~-f~lhp~G~sw~~~~~~~~~gg~SPT~a 300 (364) ....-.+++||. .++.+. .+.+.+...+..++ ..-.+.++.. +. -.+||.+|.+-.-+-+....+..|+.+ T Consensus 317 ~~~~~~~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~~~~~~~~~~~ 396 (397) T protein:vir:48 317 NASSGAMPLYFGDLKQAVTLFDRQQMSLLSTNIGGGAFETDTTKIRVIDRFDVVATDTESFVPASFKAIADQKGNLGSTA 396 (397) T ss_pred CcCCCceEEEEEeccceEEEEeecceEEEEeccchhhhhcCceeEEEEeeeccEEecccceEEEEecccccCCCCccccC Confidence 222234566764 244433 33333332222211 1111222221 21 237888887743222222223334433 Q ss_pred h Q lcl|NC_016566. 301 D 301 (364) Q Consensus 301 e 301 (364) - T Consensus 397 ~ 397 (397) T protein:vir:48 397 V 397 (397) T ss_pred C Confidence 3 No 52 >protein:vir:105905 Length: 304 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:1514 # MgeName: phiETA3 # Cross-refs: genbank:acc:YP_001004375;genbank:gi:122891830;genbank:GeneID:4712376 Probab=97.47 E-value=6e-05 Score=43.73 Aligned_cols=264 Identities=9% Similarity=0.006 Sum_probs=110.9 Q ss_pred CCcc---------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccC Q lcl|NC_016566. 1 MSLT---------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAP 65 (364) Q Consensus 1 fd~~---------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~ 65 (364) |-.. ...+++...+++.+.+..-..+. .. ..++.+.-...|.+..-..+. -...... T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~~~l~~~---~~----~~~~~~~~~~ip~~~~~~~a~-~v~E~~~ 72 (304) T protein:vir:10 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMANSAIMKL---AK----NEPMTAQKKKFTYLAKGVGAY-WVSETER 72 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhccchhhh---cc----eeeccCCceEEEEEeCCcceE-EeecCcc Confidence 2222 23344444455555443222111 11 112222223444443211111 1111111 Q ss_pred CCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc---c Q lcl|NC_016566. 66 VGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA---A 142 (364) Q Consensus 66 ~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na---~ 142 (364) ...+..++.+-.-...|+.. -+.++.+.+. ....++...|.+++++...+..-+.+|.- .|.-..+. . T Consensus 73 --~~~~~~~~~~i~~~~~k~~~---~~~iS~ell~---ds~~~l~~~i~~~l~~~ia~~~d~~~l~G-~g~~~~~~~~~~ 143 (304) T protein:vir:10 73 --IQTSKPEYAQAEMEAKKIGV---IIPLSKEFLK---WTAKDFFNEVKPLIAEAFYKAFDQAVIFG-TKSPYNTSTSGK 143 (304) T ss_pred --cccccceeeEEEEEEEEEEE---eehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHhhheec-cCCCcccccccc Confidence 11111123333222333332 2345555443 22333445577777777776655544321 11000000 0 Q ss_pred ceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccc-ccceeecccCCcEE Q lcl|NC_016566. 143 ANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAI-GDLQVMGDGLGRRF 221 (364) Q Consensus 143 ~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~-~~~~~~~~~lGrrV 221 (364) ...................+.++.++..++.++...-..|+||...|..|.+ +.+..+-+-. .+.. ..+|++| T Consensus 144 ~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~---lkd~~G~~l~~~~~~---~l~G~PV 217 (304) T protein:vir:10 144 PLVEGAEEKGNVVTDTNNLYVDLSALMATIEDEELDPNGVLTTRSFRSKMRN---ALDANDRPLFDANGN---EIMGLPL 217 (304) T ss_pred cccccccccccccccccchHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHH---hhccCCcEeecCCCc---cccceee Confidence 0111111111112233456778999999998888888899999999999864 3333332111 1111 2369999 Q ss_pred EEeCCCCCCCCceEEEEEec-ceeEEec-CCCCccee----------e-ccCC-----CceeeeEEeeEEE---Eeeeee Q lcl|NC_016566. 222 IISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQ----------E-KGGN-----ENIERWWQGEFDF---NVAVKG 280 (364) Q Consensus 222 IVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~----------~-~~g~-----e~~~~~~~~~~~f---~lhp~G 280 (364) +++|.+|...++.. .+||. --+.++. +++...+. + ..|. +.-.+.++.+..| .+||.. T Consensus 218 ~~~~~~~~~~~~~~-~~~gd~~~~~~~~~~~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~~~~~~r~~~r~~~~v~~~~a 296 (304) T protein:vir:10 218 SYTGADVYDKKKSL-ALMGDWDYARYGILQGIEYAISEDATLTTLQASDASGQPVSLFERDMFALRATMHIAYMNVKPEA 296 (304) T ss_pred EEecccccCCCCcE-EEEEehhhEEEEEecceEEEEeecceeeeecccccCccchhhhhcCcEEEEEEEEeccEeecccc Confidence 99999997665444 33332 1111222 12211110 0 0110 1111223332222 266776 Q ss_pred eeecccccccccCCCCcChhh Q lcl|NC_016566. 281 YRLKASARTPVEGVRSFKLSD 301 (364) Q Consensus 281 ~sw~~~~~~~~~gg~SPT~ae 301 (364) |.--..+ | T Consensus 297 ~~~l~~a-------------~ 304 (304) T protein:vir:10 297 FATLKPT-------------E 304 (304) T ss_pred eEEEEec-------------C Confidence 6553322 1 No 53 >protein:vir:94142 Length: 304 # NCBI annotation: ORF013 # Family: family:all:507 # MgeID: mge:1494 # MgeName: 96 # Cross-refs: genbank:acc:YP_240234;genbank:gi:66395898;genbank:GeneID:5133311 Probab=97.47 E-value=6e-05 Score=43.73 Aligned_cols=264 Identities=9% Similarity=0.006 Sum_probs=110.9 Q ss_pred CCcc---------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccC Q lcl|NC_016566. 1 MSLT---------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAP 65 (364) Q Consensus 1 fd~~---------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~ 65 (364) |-.. ...+++...+++.+.+..-..+. .. ..++.+.-...|.+..-..+. -...... T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~~~l~~~---~~----~~~~~~~~~~ip~~~~~~~a~-~v~E~~~ 72 (304) T protein:vir:94 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMANSAIMKL---AK----NEPMTAQKKKFTYLAKGVGAY-WVSETER 72 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhccchhhh---cc----eeeccCCceEEEEEeCCcceE-EeecCcc Confidence 2222 23344444455555443222111 11 112222223444443211111 1111111 Q ss_pred CCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc---c Q lcl|NC_016566. 66 VGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA---A 142 (364) Q Consensus 66 ~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na---~ 142 (364) ...+..++.+-.-...|+.. -+.++.+.+. ....++...|.+++++...+..-+.+|.- .|.-..+. . T Consensus 73 --~~~~~~~~~~i~~~~~k~~~---~~~iS~ell~---ds~~~l~~~i~~~l~~~ia~~~d~~~l~G-~g~~~~~~~~~~ 143 (304) T protein:vir:94 73 --IQTSKPEYAQAEMEAKKIGV---IIPLSKEFLK---WTAKDFFNEVKPLIAEAFYKAFDQAVIFG-TKSPYNTSTSGK 143 (304) T ss_pred --cccccceeeEEEEEEEEEEE---eehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHhhheec-cCCCcccccccc Confidence 11111123333222333332 2345555443 22333445577777777776655544321 11000000 0 Q ss_pred ceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccc-ccceeecccCCcEE Q lcl|NC_016566. 143 ANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAI-GDLQVMGDGLGRRF 221 (364) Q Consensus 143 ~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~-~~~~~~~~~lGrrV 221 (364) ...................+.++.++..++.++...-..|+||...|..|.+ +.+..+-+-. .+.. ..+|++| T Consensus 144 ~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~---lkd~~G~~l~~~~~~---~l~G~PV 217 (304) T protein:vir:94 144 PLVEGAEEKGNVVTDTNNLYVDLSALMATIEDEELDPNGVLTTRSFRSKMRN---ALDANDRPLFDANGN---EIMGLPL 217 (304) T ss_pred cccccccccccccccccchHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHH---hhccCCcEeecCCCc---cccceee Confidence 0111111111112233456778999999998888888899999999999864 3333332111 1111 2369999 Q ss_pred EEeCCCCCCCCceEEEEEec-ceeEEec-CCCCccee----------e-ccCC-----CceeeeEEeeEEE---Eeeeee Q lcl|NC_016566. 222 IISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQ----------E-KGGN-----ENIERWWQGEFDF---NVAVKG 280 (364) Q Consensus 222 IVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~----------~-~~g~-----e~~~~~~~~~~~f---~lhp~G 280 (364) +++|.+|...++.. .+||. --+.++. +++...+. + ..|. +.-.+.++.+..| .+||.. T Consensus 218 ~~~~~~~~~~~~~~-~~~gd~~~~~~~~~~~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~~~~~~r~~~r~~~~v~~~~a 296 (304) T protein:vir:94 218 SYTGADVYDKKKSL-ALMGDWDYARYGILQGIEYAISEDATLTTLQASDASGQPVSLFERDMFALRATMHIAYMNVKPEA 296 (304) T ss_pred EEecccccCCCCcE-EEEEehhhEEEEEecceEEEEeecceeeeecccccCccchhhhhcCcEEEEEEEEeccEeecccc Confidence 99999997665444 33332 1111222 12211110 0 0110 1111223332222 266776 Q ss_pred eeecccccccccCCCCcChhh Q lcl|NC_016566. 281 YRLKASARTPVEGVRSFKLSD 301 (364) Q Consensus 281 ~sw~~~~~~~~~gg~SPT~ae 301 (364) |.--..+ | T Consensus 297 ~~~l~~a-------------~ 304 (304) T protein:vir:94 297 FATLKPT-------------E 304 (304) T ss_pred eEEEEec-------------C Confidence 6553322 1 No 54 >protein:vir:81070 Length: 390 # NCBI annotation: p09 # Family: family:all:585 # MgeID: mge:1889 # MgeName: Xop411 # Cross-refs: genbank:acc:YP_001285679;genbank:gi:148727187;genbank:GeneID:5247115 Probab=97.44 E-value=6.7e-05 Score=43.48 Aligned_cols=259 Identities=8% Similarity=-0.022 Sum_probs=108.2 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) -.-.+-.++....+++.+.+.....+-.. + .++.+.-...|.+..-.+...-+..... .....+ ++.+-.- T Consensus 120 ~~g~~~~~~~~~~ii~~~~~~~~l~~~~~----~---~~~~~~~~~~~~~~~~~~~a~~v~Eg~~-~~~~~~-~~~~i~~ 190 (390) T protein:vir:81 120 SAGALTTPNRLPGFITPPDARLTVRDLIG----S---GRTDSALIEYVQETGFVNNAAIVAEGAL-KPESSL-KFAKKTD 190 (390) T ss_pred CCcceechhhhHHHHHHHhhhhhhhhhcc----e---eeccCCceEEEEEecCCcceeeecCCcc-cccccc-eeeEEEE Confidence 01112233444455555555433332211 1 1122222233322221111111111111 111111 2333222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccc-ccceeecccccCc-ccccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESN-AAANYTQPARVDG-VGGRT 158 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~n-a~~v~dis~~t~~-~~~~~ 158 (364) ...+++. -+.++...+. |...+...|.+++++...+...+.+|. |--.++ ....+..+..... ..... T Consensus 191 ~~~k~~~---~~~is~ell~----d~~~~~~~i~~~l~~~~~~~~d~a~l~---G~g~~~~~~Gi~~~~~~~~~~~~~~~ 260 (390) T protein:vir:81 191 TTHVIAH---TMKATRQILS----DAPQLASYMNNRLIRGLKVKEDAEILR---GTGANDGLLGLIPQATTYAAPTTIAG 260 (390) T ss_pred eeeEEEE---eehhhHHHHH----hHHHHHHHHHHHHHHHHHHHHHHHHHh---cCCCCCcccceeeccccccccccccc Confidence 2333332 2345554332 222355567788888877766664442 211111 1111111111000 01122 Q ss_pred cccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCCcEEEEeCCCCCCCCceEE Q lcl|NC_016566. 159 FPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLGRRFIISDAAADAMGAGKM 236 (364) Q Consensus 159 ~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lGrrVIVDD~~p~~~~~Ytt 236 (364) ...+.++.++..++......-..|+||..+|..|.+ +.++.+ ++....-......+|++|+++|.||... T Consensus 261 ~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~l~G~pv~~~~~~p~~~----- 332 (390) T protein:vir:81 261 ATRVDQLRLAMLQASLAEYNPSGIVINPIDWAAIEL---AKDANNQYLIGNARGTLTPTLWGLPVVATQAMAPGE----- 332 (390) T ss_pred chhHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHH---hhcCCCceeecCcccccCceecceeeEEcCCCCCCc----- Confidence 344567888999998888888899999999998864 333332 2222111111123699999999999542 Q ss_pred EEEec--ceeEE-ecCCCCcceeec-cC-CCceeeeEEeeEEE---Eeeeeeeeecccc Q lcl|NC_016566. 237 LGLVP--GAVAV-TTNGLDMLAQEK-GG-NENIERWWQGEFDF---NVAVKGYRLKASA 287 (364) Q Consensus 237 ylfg~--GAi~~-~~~~~~~~~~~~-~g-~e~~~~~~~~~~~f---~lhp~G~sw~~~~ 287 (364) .+||. .++.+ ...++.+..... .. ..+ .+.++....| .+||..|..-.-+ T Consensus 333 ~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~-~v~~r~~~r~d~~v~~~~a~v~~t~a 390 (390) T protein:vir:81 333 FLVGAFDLAAQIFDQWDARVEIGYVGEDFQRN-MITVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred EEEEehhceEEEEEecceEEEEecccchhhcC-cEEEEEEEeeccEEecccceEEEEeC Confidence 34443 12222 223332222111 11 112 2223332222 2566666543221 No 55 >protein:vir:6212 Length: 434 # NCBI annotation: prohead protease # Family: family:all:21 # MgeID: mge:128 # MgeName: phBC6A52 # Cross-refs: genbank:acc:NP_852592;genbank:gi:31415852;genbank:GeneID:1489210 Probab=97.35 E-value=8.7e-05 Score=42.84 Aligned_cols=271 Identities=7% Similarity=-0.038 Sum_probs=94.5 Q ss_pred CCccc--------c---------------------------chhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCcee Q lcl|NC_016566. 1 MSLTV--------F---------------------------QRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVV 45 (364) Q Consensus 1 fd~~v--------f---------------------------n~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~ 45 (364) ..... | .+++...+++.+.+..-+.+- +.++ +..|+ . T Consensus 114 ~~~~~~~~e~r~a~~~~l~~~~~~~e~~a~~~~t~~GG~lvP~~~~~~Ii~~l~~~~~i~~~---~~~~----~~~~~-~ 185 (434) T protein:vir:62 114 GHRTNKETEIRSVFANYIVGNIDEKEARALGLVTGNGSVTIPDFLSKEIITYAQEENFLRRL---GTGV----KTKEN-I 185 (434) T ss_pred cccchHHHHHHHHHHHHhccccchhhhhhhcccccccceecchhhHHHHHHhhhhhhhhhhh---ccee----ccCCc-e Confidence 11111 1 111122222222221111111 0111 11122 1 Q ss_pred eeehhhhhcccccccccc-cCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 46 EKMSVGLIANLVTDRNAY-APVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLH 124 (364) Q Consensus 46 ~~~~f~~i~g~~~~~d~~-~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~ 124 (364) ..|.+-.-+.+....... +......+| ++.+-.-...+++. -+.++...+... +-++..-|.+++++...+. T Consensus 186 ~~p~~~~~~~a~~~~~~~e~~~~~~~~~-~f~~v~~~~~k~~~---~~~iS~ell~ds---~~~l~~~i~~~la~~~~~~ 258 (434) T protein:vir:62 186 KYPVLVKKAEAQGHKNERTNNEMPETDI-EFDEIELSPTEFDA---LATVTKKLLART---GLPIEQIVMDELKKAYVRK 258 (434) T ss_pred EEEEEecCCcccceeccccccccccccc-ceeeEEeeheeeEe---ehhhHHHHHhcc---hHHHHHHHHHHHHHHHHHH Confidence 222111111100000000 000000011 12222222222322 234555544322 2223344666666666665 Q ss_pred HHHHHHHHHhhhhccc-ccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc Q lcl|NC_016566. 125 YLKAGIGAGKAAIESN-AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ 203 (364) Q Consensus 125 ~qk~lla~L~Gv~~~n-a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~ 203 (364) .-+.+|. |-=.++ .......+..+ .......++.++.+....+-.....=..|+||..++..|.+ +.+.++ T Consensus 259 ~d~~~l~---G~G~~~~~~g~~~~~~~~--~~~~~~~~~d~l~~l~~~l~~~~~~~a~~v~n~~~~~~L~~---lkd~~G 330 (434) T protein:vir:62 259 ETQYMVN---GDEANNINDGALAKKAVE--FKTDEKNLYDALVKMKNTPVKEVRKKARWVLNTAALTKIET---MKTDDG 330 (434) T ss_pred HHHHHhc---cCCCCccccceeeccccc--ccccccchhhHHHHHHhhcchhhhcCCEEEEcHHHHHHHHH---hhccCC Confidence 5554441 111111 00011001001 01122345667888888887777677789999999998854 444433 Q ss_pred --cccc-ccc--eeecccCCcEEEEeCCCCCCC-CceEEEEEec---ceeEEecCCCCcceeeccCCCceeeeEEeeEEE Q lcl|NC_016566. 204 --VFAI-GDL--QVMGDGLGRRFIISDAAADAM-GAGKMLGLVP---GAVAVTTNGLDMLAQEKGGNENIERWWQGEFDF 274 (364) Q Consensus 204 --~~~~-~~~--~~~~~~lGrrVIVDD~~p~~~-~~Yttylfg~---GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f 274 (364) ++.. .+. ......+|+||+++|.||... +.-..++||. ..|.--.+...+.+.....-.+-.+.+++...+ T Consensus 331 ~~l~~~~~~~~~g~~~tl~G~pV~~~~~~~~~~~~~~~~i~~Gdfs~~~i~~~~g~~~i~~~~~~~~~~~~v~~~~~~r~ 410 (434) T protein:vir:62 331 FPLLRPFNQAEGGIGYTLLGFPVEEEDAIDIPDSPDTPVFYFGDFSKFYIQDVIGSLEVQKLVELFSRTNRVGFRIWNLL 410 (434) T ss_pred CEeeccCCCccCCCCceecceeeEEecCccCccCCCceEEEEeeccceEEEEeeceeEEEeehhhhcccCceEEEEEeee Confidence 2221 111 111124699999999998643 2223455542 222211122111111111101111222222111 Q ss_pred E---ee-eeeeeecccccccccCCCCcChh Q lcl|NC_016566. 275 N---VA-VKGYRLKASARTPVEGVRSFKLS 300 (364) Q Consensus 275 ~---lh-p~G~sw~~~~~~~~~gg~SPT~a 300 (364) - +| |.=...-.- .++.||-+ T Consensus 411 Dgk~i~~~~~~~~~~~------~~~~~~~~ 434 (434) T protein:vir:62 411 DAQLIHSPFEVPVYKY------VLKAPTGA 434 (434) T ss_pred cceeecCcccceEEEE------EeccCCCC Confidence 1 22 322221110 12233333 No 56 >protein:vir:9574 Length: 300 # NCBI annotation: gp40 # Family: family:all:966 # MgeID: mge:171 # MgeName: SM1 # Cross-refs: genbank:acc:NP_862879;genbank:gi:32469471;genbank:GeneID:1461316 Probab=97.35 E-value=8.9e-05 Score=42.79 Aligned_cols=272 Identities=8% Similarity=-0.024 Sum_probs=119.0 Q ss_pred CCc--cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhcc Q lcl|NC_016566. 1 MSL--TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARM 78 (364) Q Consensus 1 fd~--~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~ 78 (364) .+. .+..+++....+|.+.+..-...- ...+.+.. .-.+.|.+..-..+ .-+. .+.....++ .++.+- T Consensus 5 t~~~G~lip~~~~~~ii~~l~~~s~i~~l--~~~~~~~~-----~~~~~p~~~~~~~a-~wv~-Eg~~~~~s~-~~f~~v 74 (300) T protein:vir:95 5 QLSKGNLFNPELVTKVINKVKGHSSIAKL--SPQKPIPF-----NGQREFVFDFDSDI-DIVA-ENGKKTHGG-VSLDPV 74 (300) T ss_pred ccCCcceechhhHHHHHHHHHhhhhhhhh--cceeeccC-----CceEEEEEecCcce-EEee-CCccccccc-ccceee Confidence 222 245677777788877764333221 12222221 11233432211111 0111 111111111 123333 Q ss_pred ceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecc---cc-cCcc Q lcl|NC_016566. 79 LTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQP---AR-VDGV 154 (364) Q Consensus 79 ~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis---~~-t~~~ 154 (364) .-...|++.- +.++.+-+.....+...+...|.+++++...+..-+.+|.-... -.++........ .. +... T Consensus 75 ~l~~~k~~~~---~~iS~ell~~~~d~~~~l~~~i~~~l~~aia~~~d~~~l~G~~~-~~g~~~~~~~~~~~~~~~~~~~ 150 (300) T protein:vir:95 75 TIVPLKVEYG---ARVSDEFLHASEEAKVDMLTDFVEGFSKKLARGLDIMSIHGINP-RTKQASTIIGDNCFDKKVTQTV 150 (300) T ss_pred EeeeEEEEEe---ehhhHHHhccCCCCHHHHHHHHHHHHHHHHHHHHHHhhhhcccC-CCCCCcccccccccccccceee Confidence 3333344432 34555544323344455556677777777776666555522100 000100000000 00 0000 Q ss_pred cccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccc--cccccce-eecccCCcEEEEeCCCCCCC Q lcl|NC_016566. 155 GGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQV--FAIGDLQ-VMGDGLGRRFIISDAAADAM 231 (364) Q Consensus 155 ~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~--~~~~~~~-~~~~~lGrrVIVDD~~p~~~ 231 (364) .........++.++..++.+...+...|+||+..+..|.+ +.+.++- +...... .....+|+||++++.+|... T Consensus 151 ~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~L~~---lkd~~G~~i~~~~~~~~~~~~l~G~Pv~~s~~v~~~~ 227 (300) T protein:vir:95 151 PFKDTNPDESMEDAVGMIDGSERDITGAILDPIFTTALSK---MKNAEGGKLYPELAWGGVPDAINGLAVDKNRTVSYSQ 227 (300) T ss_pred cccccchHHHHHHHHHHhhhcCCCccEEEECHHHHHHHHH---hhccCCCeeccCccccCCCceecceeeEEecCCCCCC Confidence 1111233467888999998888888899999999998854 4444432 2221111 12224699999999999754 Q ss_pred C-ceEEEEEec--ceeEEec-CCCCcceee---ccCC-----CceeeeEEeeEE--E-EeeeeeeeecccccccccCC Q lcl|NC_016566. 232 G-AGKMLGLVP--GAVAVTT-NGLDMLAQE---KGGN-----ENIERWWQGEFD--F-NVAVKGYRLKASARTPVEGV 294 (364) Q Consensus 232 ~-~Yttylfg~--GAi~~~~-~~~~~~~~~---~~g~-----e~~~~~~~~~~~--f-~lhp~G~sw~~~~~~~~~gg 294 (364) + .-...+||. .++.++. ....+...+ .++. +.-.+.++++.. | ++||.-|.--... +| T Consensus 228 ~~~~~~~~~GDf~~~~~~~~~~~~~~~v~~~~~~d~~~~~~f~~~~v~~r~~~r~d~~v~~~~a~~~l~~~-----~g 300 (300) T protein:vir:95 228 TDPKNTAIVGDFETMFKWGYAKEVPMEIIKYGDPDNSGRDLKGYNQIYIRCEAYIGWGIMDAASFARIVKT-----GG 300 (300) T ss_pred CCCccEEEEeeccceEEEEEecccEEEEeeccCCCCcchhhhhcCcEEEEEEEeecceeecccceEEEecC-----CC Confidence 3 223344453 3333332 222221111 1111 111123333222 2 2566666553321 23 No 57 >protein:vir:8420 Length: 477 # NCBI annotation: gp15 # Family: family:all:21 # MgeID: mge:155 # MgeName: Omega # Cross-refs: genbank:acc:NP_818316;genbank:gi:29566752;genbank:GeneID:1260033 Probab=97.34 E-value=7e-05 Score=43.36 Aligned_cols=278 Identities=10% Similarity=0.051 Sum_probs=95.7 Q ss_pred CCccc------------------------------cchhh-hhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeeh Q lcl|NC_016566. 1 MSLTV------------------------------FQRKL-VTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMS 49 (364) Q Consensus 1 fd~~v------------------------------fn~~~-~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~ 49 (364) +++.. --++. ..-.+|.+.+..-..+-. +.+.+.. -.|+ ...|. T Consensus 133 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~lv~~~~~~~~ii~~l~~~~~i~~~~--~~~~~~~--~~~~-~~ip~ 207 (477) T protein:vir:84 133 RHMVDVESDKEIRKIAKVGEEYRDLDRNGGTGGYAVPPLWMMNRFIELARAGRTYANLC--PTEPLPG--GTSS-INIPK 207 (477) T ss_pred HHHhhhhhhhhHHHHHHhhhhhccccccCCCcceeeccchhHHHHHHHhhhcchHHHhh--ceeeecC--Ccce-eEEEE Confidence 00000 00111 111223222211111100 0110000 0111 12221 Q ss_pred hhhhcccccccccccCC-Ccccc-c-hhhh--ccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 50 VGLIANLVTDRNAYAPV-GTPAT-A-KVLA--RMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLH 124 (364) Q Consensus 50 f~~i~g~~~~~d~~~~~-~~~~T-~-~kit--~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~ 124 (364) .. +|.. ..+-.. +...+ . +..+ +...+..+..+-.+-+.++...+....- .+..-|.++++....+. T Consensus 208 ~~--~~~~---~a~~~~Eg~~~~~~~~~~s~~~f~~i~~~~~k~~~~~~iS~ell~ds~~---~l~~~i~~~l~~~~~~~ 279 (477) T protein:vir:84 208 IL--TGTS---TAIQAADNAALTAPSAHEVDLTDGFVQANVKTIAGQQGIAIQLLDQAAV---SVDEFVFRDLAADYANK 279 (477) T ss_pred Ee--cCcc---eeeeeccCcccccccccccccceeeEEEeeeeEEeeeHHHHHHHhccch---hHHHHHHHHHHHHHHHH Confidence 10 1100 000000 00000 0 0000 1112222222212223355554432122 23344667777766665 Q ss_pred HHHHHHH------HHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhccccc-CeeEEEEchHHHHHHHHhhc Q lcl|NC_016566. 125 YLKAGIG------AGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAA-LIKSWFMDGVTWANFIAYQA 197 (364) Q Consensus 125 ~qk~lla------~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~-~l~~ivMHS~v~~~L~k~~~ 197 (364) .-+.+|. -..|++............. .............+.++...+..... .-..|+||+..+..|.+ T Consensus 280 ~d~~~l~G~Gt~~~p~Gi~~~~~~~~~~~~~~-~~t~~~~~~~~~~i~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~--- 355 (477) T protein:vir:84 280 LNVQVISGTGSNNQVVGVRATAGITQVTATSA-GSALEKHQIIYQKIADAIQRVHTSRFLEPEVIVMHPRRWASFHA--- 355 (477) T ss_pred HHHHHhccCCCCCccceeeecccccccccccc-ccchhhHHHHHHHHHHHHhhccccccCCccEEEEcHHHHHHHHH--- Confidence 5544441 1122221111100111000 00000001122334556555544433 34579999999988754 Q ss_pred cccccc--ccccc--c----------c--eeecccCCcEEEEeCCCCCCCC---ceEEEEEec-ceeEEecCCCCcceee Q lcl|NC_016566. 198 LPSAEQ--VFAIG--D----------L--QVMGDGLGRRFIISDAAADAMG---AGKMLGLVP-GAVAVTTNGLDMLAQE 257 (364) Q Consensus 198 it~~~~--~~~~~--~----------~--~~~~~~lGrrVIVDD~~p~~~~---~Yttylfg~-GAi~~~~~~~~~~~~~ 257 (364) +.+..+ ++... + + ......+|+||++++.||...+ .-..++||. +.+.+.++...+...+ T Consensus 356 lkd~~G~~l~~~~~~~~~~~~~~~~~~~~~~~~~l~G~pVv~s~~~p~~~~~~~d~~~i~~gd~~~~~i~~~~~~~~~~~ 435 (477) T protein:vir:84 356 IFAGDDRPLIVPSGPGFNNLGVLTEVASQRVVGQMHGLPVVTDPTLPTTLGTGTDQDVIHVLRASDLALFESSVRMRALQ 435 (477) T ss_pred hhccCCCeeeecCcccccccccccccccccccchhcccceEecCcccccccccCCcceEEEEEeceEEEEeeceeEEecc Confidence 323222 22111 0 0 0111235999999999996332 234555554 3344444433333333 Q ss_pred ccCCCceeeeEEe--eEEEE--eeeeeeeecccccccccCCCCcChh Q lcl|NC_016566. 258 KGGNENIERWWQG--EFDFN--VAVKGYRLKASARTPVEGVRSFKLS 300 (364) Q Consensus 258 ~~g~e~~~~~~~~--~~~f~--lhp~G~sw~~~~~~~~~gg~SPT~a 300 (364) ..........++. .+.|. -||.-|.= -+.+ +--.||.+ T Consensus 436 ~~~~~~~~~~~~v~~~~~~~~~r~~~afv~--~t~~---~~~~~~~~ 477 (477) T protein:vir:84 436 ETRAENLSVLLQVYGYLAFTAARFPQSVVE--IGGT---ALTAPTFA 477 (477) T ss_pred ccccccceeeeeehhhhhhhhhccccceEE--eecc---cccccccC Confidence 2222233333322 11111 26666542 1111 12258888 No 58 >protein:vir:485 Length: 407 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:11 # MgeName: P27 # Cross-refs: genbank:acc:NP_543092;swissprot:trembl:q8w627;genbank:gi:18249904;uniprot:Q8W627;genbank:GeneID:929693 Probab=97.33 E-value=9.3e-05 Score=42.68 Aligned_cols=276 Identities=9% Similarity=0.023 Sum_probs=101.3 Q ss_pred CCcc--------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhccccccc Q lcl|NC_016566. 1 MSLT--------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDR 60 (364) Q Consensus 1 fd~~--------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~ 60 (364) ..-. +-.+++...+++.+.+..-..+- +-+ -+..+.-...| -..++..-. T Consensus 93 g~~~~~~~~e~~a~~~~t~~~gG~~iP~~~~~~I~~~~~~~~~l~~~----~~~---~~~~~~~~~~~--~~~~~~~a~- 162 (407) T protein:vir:48 93 GREDGLRELERKALQVGNDEDGGYAIPEELDRTILTLLKDEVVMRQE----ATV---ITLGGSDYKKL--VNLGGTTSG- 162 (407) T ss_pred cchhhhhHHHHHhhhcccCCCCcccccHhHHHHHHHHHHhhhhhhhh----cee---eecCCCceEEE--EecCCccee- Confidence 0000 11222222233333322211111 100 01111101111 111111100 Q ss_pred ccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhh Q lcl|NC_016566. 61 NAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA-----GKA 135 (364) Q Consensus 61 d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~-----L~G 135 (364) .-+.+... +..+......+....++=.+-+.++.+.+. ..+..+-..|.+++++...+..-+.+|.- ..| T Consensus 163 -~v~E~~~~-~~~~~~~f~~i~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~~i~~~~~~a~l~G~G~~~p~G 237 (407) T protein:vir:48 163 -WVGETDAR-PETATSKLGLIEPFMGEIYGNPQATQKMLD---DAFFNVEDWINSELALEFAEQEEIAFTSGDGSKKPKG 237 (407) T ss_pred -eecccccc-cccccccceeEEeeeeeeEeehhhHHHHHh---cchHHHHHHHHHHHHHHHHHHHHhhhhccCCCCccce Confidence 00011100 001111122222222221222345555443 22223444566777777665544433210 112 Q ss_pred hhcccccceeecccc-----cCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc Q lcl|NC_016566. 136 AIESNAAANYTQPAR-----VDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG 208 (364) Q Consensus 136 v~~~na~~v~dis~~-----t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~ 208 (364) ++...+....+.... .........+++.++.+....|......=..|+||..++..|.+ +.+.++ ++... T Consensus 238 il~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~i~~l~~~l~~~~~~~a~~v~n~~~~~~L~~---lkD~~Gr~l~~~~ 314 (407) T protein:vir:48 238 FLAYESTDEDDKTRAFGKLQHIASGAASGVTADAIIKLIYTLRKAHRSGAKFMMNNSSLFAIRL---LKDNDGNYLWRPG 314 (407) T ss_pred eeecccccccccccccccccccccccccccChHHHHHHHHhhchhhhcCCEEEEcHHHHHHHHH---hhccCCceeeccC Confidence 222111100000000 00112223456778888888886666666789999999988854 433333 32221 Q ss_pred -cceeecccCCcEEEEeCCCCCCCCceEEEEEec--ceeEEe-cCCCCcceeeccCCCceeeeEEee--EE-EEeeeeee Q lcl|NC_016566. 209 -DLQVMGDGLGRRFIISDAAADAMGAGKMLGLVP--GAVAVT-TNGLDMLAQEKGGNENIERWWQGE--FD-FNVAVKGY 281 (364) Q Consensus 209 -~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~--GAi~~~-~~~~~~~~~~~~g~e~~~~~~~~~--~~-f~lhp~G~ 281 (364) .-......+|++|+++|.||.....-...+||. .++.+. .....+..+.-....- +.++.. +. =.++|..| T Consensus 315 ~~~g~~~~l~G~PV~~~~~~p~~~~~~~~i~~Gd~~~~~~i~~~~~~~i~~d~~~~~~~--~~~~~~~r~d~~v~~~~a~ 392 (407) T protein:vir:48 315 IELGQPSSLAGYGIVENEQMPDIAADAKAIAFGNFKRGYTIVDRIGTRILRDPYTNKPF--VGFYTTKRTGGMLVDSQAI 392 (407) T ss_pred cCCCCCceecceeeEEecCcCCccCCccEEEEEeccccEEEEEeeceEEEeeccccCCc--EEEEEEEEeccEEecccce Confidence 111111246999999999996443223445553 223322 2333222222211111 122221 11 12667766 Q ss_pred eecccccccccCCCCcChhhhcCCccceeecCcCcCcceEEEEecCcccccccc Q lcl|NC_016566. 282 RLKASARTPVEGVRSFKLSDITDKANWELDQGQVDNAPATVQDVGSDSDTKGRR 335 (364) Q Consensus 282 sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pgv~~~~~~~~~~~~~~ 335 (364) .--.-+-+.. ..... T Consensus 393 ~~l~~~aa~~---------------------------------------~~~~~ 407 (407) T protein:vir:48 393 KLMKIGAATR---------------------------------------QKAAA 407 (407) T ss_pred EEEEeeccCC---------------------------------------CCCCC Confidence 5522211111 00000 No 59 >protein:vir:6242 Length: 390 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:131 # MgeName: phi-BT1 # Cross-refs: genbank:acc:NP_813696;swissprot:trembl:q859c1;genbank:gi:29366756;interpro:IPR006444;uniprot:Q859C1;genbank:GeneID:1258897 Probab=97.30 E-value=0.0001 Score=42.51 Aligned_cols=263 Identities=9% Similarity=-0.018 Sum_probs=93.5 Q ss_pred CCcc----------------------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceee Q lcl|NC_016566. 1 MSLT----------------------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVE 46 (364) Q Consensus 1 fd~~----------------------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~ 46 (364) -... +..+..+...|. +.+...+-....+-+... -.|.... T Consensus 83 ~~~~~~~~~~~r~~~~~~~r~~~~~~~~~~~t~~~~g~~~~~~~~~~~i~---~~~~~~~~l~~~~~~~~~--~~~~~~~ 157 (390) T protein:vir:62 83 RSADVDDDATLRAGNLGEARSFEFAPEKRDGTKAGNPNVLSRTLYGQLIA---QAVERSAIMRGGATTFTT--SDANPLD 157 (390) T ss_pred hhcchHHHHHHhhhhhhhhHHHHhhhhhhcccccCCCccccccchHHHHH---HHHhhhhhhhhcceeeec--CCCceeE Confidence 0000 001111111111 111111111111111110 0111122 Q ss_pred eehhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 47 KMSVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYL 126 (364) Q Consensus 47 ~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~q 126 (364) .|.+..-..+ .-+.-.... ...++ ++.+-.-...|++. -+..+...+....- .+...|.+++++...+..- T Consensus 158 ~p~~~~~~~a-~wv~E~~~~-~~~~~-~f~~i~~~~~k~~~---~~~iS~ell~ds~~---~l~~~i~~~l~~~i~~~~d 228 (390) T protein:vir:62 158 FTVITGRSSA-SIVGETAEI-PESYP-ATAQRSMGGFKYGF---ASVVSYEFATDQVL---DLVGFLVSDAGPAIGDAMG 228 (390) T ss_pred EEEEcCCcce-eeecccccc-ccccc-ceeeeEeeeeeEEe---ehHHHHHHHhhhhH---HHHHHHHHHHHHHHHHHHH Confidence 3322211000 000100000 01112 22222222233332 22344444432122 2334466667776665554 Q ss_pred HHHHH---HHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc Q lcl|NC_016566. 127 KAGIG---AGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ 203 (364) Q Consensus 127 k~lla---~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~ 203 (364) +.+|. .-.|++...+.....+ ........++.++.+....|......=..|+||+..+..|.+ |.+.++ T Consensus 229 ~~~l~G~G~p~Gi~~~~~~~~~~~-----~~~~~~~~~~~~l~~~~~~l~~~~~~~a~~vmn~~~~~~L~~---lkd~~g 300 (390) T protein:vir:62 229 RHFITGTGQPRGILTDASPATATF-----LATDTDSKVSDALIDLFHEVPSAYRANAKYVVNDLRAAQMRK---LKDANG 300 (390) T ss_pred hhhhccCCccccccccccccccce-----ecccccccchHHHHHHHHhhhhhhhcCCEEEEchHHHHHHHH---hhccCC Confidence 44332 1122332221111000 011223456677888777776655555679999999998854 443333 Q ss_pred --ccccc-cceeecccCCcEEEEeCCCCCCCCceEEEEEec--ceeEEecCCCCcceeeccCCCceeeeEEeeEEE---E Q lcl|NC_016566. 204 --VFAIG-DLQVMGDGLGRRFIISDAAADAMGAGKMLGLVP--GAVAVTTNGLDMLAQEKGGNENIERWWQGEFDF---N 275 (364) Q Consensus 204 --~~~~~-~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~--GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f---~ 275 (364) ++... ........+|++|+++|.+|... .+||. ..+....+++.........=+.-.+.+++.+.| . T Consensus 301 ~~l~~~~~~~g~~~~l~G~Pv~~~~~~p~~~-----i~~gd~s~~~i~~~~~~~v~~~~~~~~~~~~~~~~~~~r~d~~~ 375 (390) T protein:vir:62 301 QYLWQSGLTVGAPSLFNGKVVETDDGMPADK-----ILFADLSKYRVRFAGSLRVDRSVDAKFSTDQIVYRFLQRADGLL 375 (390) T ss_pred CeeecCCcCCCccceecccceEEecCCCCcc-----EEEeeccceeEEeecceEEEeeccccccCCcEEEEEEEEeCcEe Confidence 22211 11111124699999999999642 34443 111111222221111111101111122222111 2 Q ss_pred eeeeeeeeccccccc Q lcl|NC_016566. 276 VAVKGYRLKASARTP 290 (364) Q Consensus 276 lhp~G~sw~~~~~~~ 290 (364) +||..++--.-+-+. T Consensus 376 ~~~~A~~~l~~~~~a 390 (390) T protein:vir:62 376 VDARGAKVLTVTPGA 390 (390) T ss_pred echhheEEEEeecCC Confidence 566666553322110 No 60 >protein:vir:4456 Length: 401 # NCBI annotation: Major capsid protein precursor # Family: family:all:21 # MgeID: mge:96 # MgeName: ST64B # Cross-refs: genbank:acc:NP_700379;genbank:gi:23505451;genbank:GeneID:955658 Probab=97.28 E-value=0.00011 Score=42.34 Aligned_cols=267 Identities=11% Similarity=0.027 Sum_probs=106.3 Q ss_pred CCc---cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCc-eeeeehhhhhccc-ccccccccCCCccccchhh Q lcl|NC_016566. 1 MSL---TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKD-VVEKMSVGLIANL-VTDRNAYAPVGTPATAKVL 75 (364) Q Consensus 1 fd~---~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gd-f~~~~~f~~i~g~-~~~~d~~~~~~~~~T~~ki 75 (364) .+. ..-.+++...++|.+.+.....+-+. + .++.|. +...... ++. ..-..-... ... ... T Consensus 111 ~~~~GG~~iP~~~~~~ii~~~~~~~~l~~~~~----~---~~~~~~~~~~~~~~---~~~~a~wv~E~~~--~~~--~~~ 176 (401) T protein:vir:44 111 TDEDGGYAVPEELDRSILSLLKDEVVMRQEAT----V---ITVGGSDYKKLVNL---GGTASGWVGETDT--RSQ--TAT 176 (401) T ss_pred CCCCCceeccHhHHHHHHHHHHhhhhhhhhce----e---eecCCCceEEEEec---CCccceeeccccc--cCc--ccc Confidence 110 01122233333333333222211111 0 111111 1111111 110 000000000 000 011 Q ss_pred hccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhhhhcccccceeecc-- Q lcl|NC_016566. 76 ARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA-----GKAAIESNAAANYTQP-- 148 (364) Q Consensus 76 t~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~-----L~Gv~~~na~~v~dis-- 148 (364) .....+-...++-.+-+.++.+.+. ..+..+-..|.+++++...+...+.+|.- ..|++........+.. T Consensus 177 ~~~~~v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~la~ai~~~~~~~~l~G~G~~~p~Gil~~~~~~~~~~~~~ 253 (401) T protein:vir:44 177 SRLGLIEPFMGEIYGNPQATQKMLD---DAFFNVEAWINSELATEFAEQEEIAFTTGDGTKKPKGFLAYESTEESDKARA 253 (401) T ss_pred ccceeeeeehhheeeehhhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHhhhhccCCCCccceeeccccccccccccc Confidence 1122222222222223345555443 22334445577778887776655555421 1233322111000000 Q ss_pred -ccc--CcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccCCcEEE Q lcl|NC_016566. 149 -ARV--DGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGLGRRFI 222 (364) Q Consensus 149 -~~t--~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~lGrrVI 222 (364) ... .-......+++.++.++...|......=..|+||...|..|.+ +.+.++ ++... ........+|++|+ T Consensus 254 ~~~~~~~~t~~~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~L~~---lkd~~G~~l~~~~~~~g~~~~l~G~PVv 330 (401) T protein:vir:44 254 FGKLQHIVSGEATAVTADAIIKLIYTLRKAHRTGAKFMMNNNSLFAIRL---LKDTEGNYLWRPGLELGQPSSLAGYGIA 330 (401) T ss_pred cccccccccccccccCHHHHHHHHHhcchhhhcCCEEEEcHHHHHHHHH---hhccCCceeecCCcCCCCCceecceeeE Confidence 000 0011233456778888888887666666779999999998854 444333 22211 11111124699999 Q ss_pred EeCCCCCCCCceEEEEEec--ceeEEec-CCCCcceeeccCCCceeeeEEeeEEE---EeeeeeeeecccccccccCCCC Q lcl|NC_016566. 223 ISDAAADAMGAGKMLGLVP--GAVAVTT-NGLDMLAQEKGGNENIERWWQGEFDF---NVAVKGYRLKASARTPVEGVRS 296 (364) Q Consensus 223 VDD~~p~~~~~Yttylfg~--GAi~~~~-~~~~~~~~~~~g~e~~~~~~~~~~~f---~lhp~G~sw~~~~~~~~~gg~S 296 (364) ++|.||.....-...+||. -++.+.+ .+..+.++..... + ...+++...| .++|..|..-.-+- + T Consensus 331 ~~~~~p~~~~~~~~i~~Gd~~~~~~i~~~~~~~~~~~~~~~~-~-~v~~~a~~r~d~~~~~~~a~~~l~~~a-------a 401 (401) T protein:vir:44 331 ENEQMPDIAADAKAIAFGNFKRGYTIVDRIGTRILRDPYTNK-P-FVGFYTTKRTGGMLVDSQAIKLLKIAA-------A 401 (401) T ss_pred EecCcCCccCCccEEEEeehhccEEEEEecceEEeeeccccC-C-cEEEEEEEEeccEEecccceEEEEeec-------C Confidence 9999997544334455553 2343332 2333222222211 1 1222322111 26666666533211 1 No 61 >protein:vir:94771 Length: 298 # NCBI annotation: major head protein # Family: family:all:966 # MgeID: mge:1529 # MgeName: phi LC3 # Cross-refs: genbank:acc:NP_996706;genbank:gi:45597421;genbank:GeneID:2769044 Probab=97.25 E-value=0.00012 Score=42.15 Aligned_cols=272 Identities=7% Similarity=-0.000 Sum_probs=114.2 Q ss_pred CCcc---ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhc Q lcl|NC_016566. 1 MSLT---VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLAR 77 (364) Q Consensus 1 fd~~---vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~ 77 (364) |-.. +-.+++...++|.+.+..-...-+. . .+..+.-.+.|.+..-..+ .-...... ....++ ++.+ T Consensus 1 ma~~gG~lip~~~~~~ii~~~~~~s~i~~~~~--~-----~~~~~~~~~~p~~~~~~~a-~~v~Eg~~-~~~~~~-~f~~ 70 (298) T protein:vir:94 1 MVLNKGTLFDPELVTDLISKVAGKSSIARLSA--Q-----KPIPFNGEKVFTFTMDSEI-DVVAESGK-KTHGGV-TLAP 70 (298) T ss_pred CeeccccccChhHHHHHHHHHHhhchhhhhcc--e-----eeccCCceEEEEEecCcce-EEeeCCcc-cccccc-ceeE Confidence 2111 2344445555555554322211111 1 1122111234433211111 11111111 111112 2333 Q ss_pred cceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccccc-----ceeecccccC Q lcl|NC_016566. 78 MLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAA-----ANYTQPARVD 152 (364) Q Consensus 78 ~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~-----~v~dis~~t~ 152 (364) -+-...|++.. +..+.+-+.-...+...+...|.+++++.+.+...+.+|.....--+.+.. .......... T Consensus 71 v~l~~~k~~~~---~~iS~ell~~~~~~~~~l~~~i~~~la~ai~~~~d~~~l~G~~~~~g~~~~~~~~~~~~~~~~~~~ 147 (298) T protein:vir:94 71 QTMVPIKVEYG---ARISDEFMYASDEEKINILQAFNDGFAKKVARGIDLMAFHGVNPRLGTASAVIGTNHFDSKVTQKV 147 (298) T ss_pred EEEeeeEEEEe---eehhHHHhccCCccHHHHHHHHHHHHHHHHHHHHHHHhhcccccCCCccccccccccccccccccc Confidence 33334445432 235544332223344455566778888888877666655321000000000 0000000000 Q ss_pred cccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccc-eeecccCCcEEEEeCCCCC Q lcl|NC_016566. 153 GVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDL-QVMGDGLGRRFIISDAAAD 229 (364) Q Consensus 153 ~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~-~~~~~~lGrrVIVDD~~p~ 229 (364) ...........++.++..++-+...+...|+||+..+..|.+ +.+.++ ++..... ......+|+||+++|.+|. T Consensus 148 ~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~tl~G~PV~~~~~v~~ 224 (298) T protein:vir:94 148 EAPRGIADPNGAIENAVELLTGVDADVTGIAINPSFRSALAK---QKDLQGNALFPELKWGATPDTINGLPVDVNKTVSD 224 (298) T ss_pred ccccccccHHHHHHHHHHhhhhcCCCccEEEEcHHHHHHHHH---hhccCCCeeecCcccCCCCceecceeeEEeccccc Confidence 001111122457888988997777788899999999999865 333333 2222111 1111246999999999996 Q ss_pred CCCce-EEEEEec--ceeEEec-CCCCcceee---ccCC-----CceeeeEEeeEEEE---eeeeeeeecccccccccCC Q lcl|NC_016566. 230 AMGAG-KMLGLVP--GAVAVTT-NGLDMLAQE---KGGN-----ENIERWWQGEFDFN---VAVKGYRLKASARTPVEGV 294 (364) Q Consensus 230 ~~~~Y-ttylfg~--GAi~~~~-~~~~~~~~~---~~g~-----e~~~~~~~~~~~f~---lhp~G~sw~~~~~~~~~gg 294 (364) ..+.- ...+||. .++.++. .+..+...+ .++. +.-...++++..|. .||..|.--+.. T Consensus 225 ~~~~~~~~~~~Gdfs~~~~~~~~~~~~~~~~~~~~~d~~~~~~f~~~~v~~r~~~r~~~~~~~~~a~~~l~~~------- 297 (298) T protein:vir:94 225 MSLTQRDRAIIGDFANGFKWGYAKEVPLEVIQYGDPDNSGLDLKGYNQVYIRAELFLGWGILDATKFARVTEA------- 297 (298) T ss_pred ccCCCccEEEEeeccceEEEEEecCceEEEeecCCCcCcchhhhhcCcEEEEEEEEeccEeecccceEEEEec------- Confidence 43221 2455563 3344443 222221111 1111 11122333433322 566665553221 Q ss_pred CC Q lcl|NC_016566. 295 RS 296 (364) Q Consensus 295 ~S 296 (364) + T Consensus 298 -t 298 (298) T protein:vir:94 298 -N 298 (298) T ss_pred -C Confidence 1 No 62 >protein:vir:8102 Length: 543 # NCBI annotation: gp6 # Family: family:all:21 # MgeID: mge:152 # MgeName: Che9c # Cross-refs: genbank:acc:NP_817683;genbank:gi:29566114;genbank:GeneID:1259308 Probab=97.23 E-value=0.00012 Score=42.03 Aligned_cols=262 Identities=13% Similarity=0.035 Sum_probs=102.8 Q ss_pred CC------c------cccchhhhhhhh-hhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhccc-ccccccccCC Q lcl|NC_016566. 1 MS------L------TVFQRKLVTAVT-QMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANL-VTDRNAYAPV 66 (364) Q Consensus 1 fd------~------~vfn~~~~~~~~-e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~-~~~~d~~~~~ 66 (364) |. . .+-.+++....+ +.+.+. +....-+ .-.+..|++. .|... ++. ..-+. .+.. T Consensus 245 ~~~~~~~~~t~~~gg~lip~~~~~~ii~~~~~~~----~~l~~~~---~~~~~~g~~~-~~~~~--~~~~a~~v~-Eg~~ 313 (543) T protein:vir:81 245 INEVRAMGLTKADGGYLVPFQLDPTVIITSNGSL----NDIRRFA---RQVVATGDVW-HGVSS--AAVQWSWDA-EFEE 313 (543) T ss_pred hhhhhhcccccccCcccCchhhhhHHHHHHHhhh----chhhhhc---ccccCCcceE-EEEec--CCcceeecc-cCcc Confidence 00 0 000011111111 111110 0000000 0011234432 22111 111 00111 1111 Q ss_pred CccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHH------HHhhhhccc Q lcl|NC_016566. 67 GTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIG------AGKAAIESN 140 (364) Q Consensus 67 ~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla------~L~Gv~~~n 140 (364) -...++ ++.+ +-+...+-.+-+.++...+ ..++ .+...|.+++++.+.+..-+.+|. -..|++... T Consensus 314 ~~~~~~-~~~~---i~~~~~k~~~~~~is~ell---~d~~-~~~~~i~~~l~~~~~~~~d~ail~G~Gt~~~p~Gi~~~~ 385 (543) T protein:vir:81 314 VSDDSP-EFGQ---PEIPVKKAQGFVPISIEAL---QDEA-NVTETVALLFAEGKDELEAVTLTTGTGQGNQPTGIVTAL 385 (543) T ss_pred cccccc-ccce---eeeeeeeeEeeehhhHHHH---hccH-HHHHHHHHHHHHHHHHHHHHHHhccCCCCcccccchhhc Confidence 111111 1222 2222222222234555433 2333 566778888999888766665542 233444322 Q ss_pred ccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCC Q lcl|NC_016566. 141 AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLG 218 (364) Q Consensus 141 a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lG 218 (364) +.....+. ......+++.++.++...+-.....-..|+||..+|..|.+ +.+..+ ++....-......+| T Consensus 386 ~~~~~~~~-----~~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~l~~---lkd~~G~~l~~~~~~g~~~~l~G 457 (543) T protein:vir:81 386 AGTAAEIA-----PVTAETFALADVYAVYEQLAARHRRQGAWLANNLIYNKIRQ---FDTQGGAGLWTTIGNGEPSQLLG 457 (543) T ss_pred cccccccc-----ccccccccHHHHHHHHHhhhccccCCcEEEEcHHHHHHHHH---hhcCCCceeccCcCCCCCccccc Confidence 21111111 11223456677888888876666666789999999999864 333332 332211111112469 Q ss_pred cEEEEeCCCCCCC------CceEEEEEec--ceeEEecCCCCcceeecc-CCC---ceeeeEEe--eEEE-Eeeeeeeee Q lcl|NC_016566. 219 RRFIISDAAADAM------GAGKMLGLVP--GAVAVTTNGLDMLAQEKG-GNE---NIERWWQG--EFDF-NVAVKGYRL 283 (364) Q Consensus 219 rrVIVDD~~p~~~------~~Yttylfg~--GAi~~~~~~~~~~~~~~~-g~e---~~~~~~~~--~~~f-~lhp~G~sw 283 (364) ++|+++|.||... +.+ .++||. +.+.....++.+...+.. ... .....++. ++.| .++|..|.. T Consensus 458 ~pv~~~~~~~~~~~~~~~~~~~-~i~~gd~~~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~r~d~~v~~~~A~~~ 536 (543) T protein:vir:81 458 RPVGEAEAMDANWNTSASADNF-VLLYGNFQNYVIADRIGMTVEFIPHLFGTNRRPNGSRGWFAYYRMGADVVNPNAFRL 536 (543) T ss_pred eeeEEeccccccccccccCCcc-eEEEeeccceeEEeecccEEEEeccccccchhhcCceEEEEEEeeccEeecccceEE Confidence 9999999998632 233 344443 322222233322222211 111 11122222 1111 145555544 Q ss_pred cccccccccCCCCcChhhhcCCccceeecCcCcCcceEEEEecCcc Q lcl|NC_016566. 284 KASARTPVEGVRSFKLSDITDKANWELDQGQVDNAPATVQDVGSDS 329 (364) Q Consensus 284 ~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pgv~~~~~~~~ 329 (364) -.-+ +.+ T Consensus 537 l~~~---------------------------------------~~a 543 (543) T protein:vir:81 537 LNVE---------------------------------------TAS 543 (543) T ss_pred EEec---------------------------------------ccC Confidence 2221 111 No 63 >protein:vir:78523 Length: 338 # NCBI annotation: Putative head structural protein # Family: family:all:507 # MgeID: mge:1853 # MgeName: U2 # Cross-refs: genbank:acc:YP_001491585;genbank:gi:157786408;genbank:GeneID:5625675 Probab=97.18 E-value=0.00014 Score=41.69 Aligned_cols=278 Identities=13% Similarity=0.020 Sum_probs=114.1 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhh------cccccccccccCCCccccchh Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLI------ANLVTDRNAYAPVGTPATAKV 74 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i------~g~~~~~d~~~~~~~~~T~~k 74 (364) =-..+..+++...++|.+.+..-+.+-.. ..++.+.-.+.|.+..- ++..- ..-+.+.. .+..+ T Consensus 23 ~~~~liP~~~~~~ii~~~~~~s~l~~l~~-------~~~~~~~~~~ip~~~~~~~a~~v~~~~~--~~~~Eg~~-~~~~~ 92 (338) T protein:vir:78 23 VPSDLLPKEIVGPIFDKAQESSLVLRLGE-------NIPISYGETIIPTTVKRPEVGQVGVGTS--NEQREGGT-KPLSG 92 (338) T ss_pred ccccccchHHHHHHHHHHHhhchhhhhcc-------eeeccCCceEEEEEecCccceeeccccc--cccccccc-ccccc Confidence 11225667777777777776433322211 12333433444433211 11000 00000000 00111 Q ss_pred hhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHh-----hhhcccccceeeccc Q lcl|NC_016566. 75 LARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGK-----AAIESNAAANYTQPA 149 (364) Q Consensus 75 it~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~-----Gv~~~na~~v~dis~ 149 (364) . +...+-.+..+-.+-+..+.+.+. .....+...|.+++++...+...+.+|.--. +..+..+........ T Consensus 93 ~-~f~~v~l~~~k~~~~~~is~ell~---ds~~~~~~~i~~~la~a~~~~~d~~~l~G~g~~~~~~~~gi~~~~~~~~~~ 168 (338) T protein:vir:78 93 T-AWDTRSVAPIKLATIVTVSEEFAR---MNPSGLYTKLQADLAYAIGRGIDLAVFHGKSPLTGSALQGIDTNNVIVNTT 168 (338) T ss_pred c-ceeEEEEEEEEEEEeehhhHHHHh---cCHHHHHHHHHHHHHHHHHHHHHHHhhcccCCCcccccccccccccccccc Confidence 1 122222222222223345555443 2333445568888888888777766553111 111111111111111 Q ss_pred ccCcccccccccHHHHHHHHHHh-cccccCeeEEEEchHHHHHHHHhhcccccccc--ccccccee-ecccCCcEEEEeC Q lcl|NC_016566. 150 RVDGVGGRTFPTLADFPLAASKF-GDQAALIKSWFMDGVTWANFIAYQALPSAEQV--FAIGDLQV-MGDGLGRRFIISD 225 (364) Q Consensus 150 ~t~~~~~~~~~s~~~l~~A~~~l-GD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~--~~~~~~~~-~~~~lGrrVIVDD 225 (364) ..+.........+..+.++..++ .+....-..|+||...+..|.+...+.+..+- +......- ....+|++|+++| T Consensus 169 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~m~~~~~~~L~~~~~l~d~~g~~l~~~~~~~~~~~~l~G~PV~~~~ 248 (338) T protein:vir:78 169 NVDYLQTGTTPLLDRFLDGYDLVSANTDVDFNGWAADPRYRARLLRSQAYRDANGNVDPTRINLAASAGDLLGLPVQFGK 248 (338) T ss_pred ccccccccchhhHHHHHHHHHHhhhhccccceEEEEchHHHHHHHHHhhhccCCCceeecccccCCCCceeeeeeEEEcc Confidence 11111112223445677777776 44455667899999999999765556554432 22211111 1123699999999 Q ss_pred CCCCCCC----ceEEEEEecce-eEEec-CCCCcceeecc----C-C---------CceeeeEEeeEEE---Eeeeeeee Q lcl|NC_016566. 226 AAADAMG----AGKMLGLVPGA-VAVTT-NGLDMLAQEKG----G-N---------ENIERWWQGEFDF---NVAVKGYR 282 (364) Q Consensus 226 ~~p~~~~----~Yttylfg~GA-i~~~~-~~~~~~~~~~~----g-~---------e~~~~~~~~~~~f---~lhp~G~s 282 (364) .||...+ .-...+||.=. +.++. .+..+.+.+.. + . +.-.+.++.+..+ .+||..|. T Consensus 249 ~ip~~~~~~~~~~~~~~~gdfs~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~ 328 (338) T protein:vir:78 249 AVGGDLGAATDSKVRVVGGDFSQLKYGFADEIRVKMSDTATLTDNTSPTPQTVSMWQTNQIAILIEVTFGWLLGDKQAFV 328 (338) T ss_pred ccCccccccCCcccEEEEEecceEEEEeecccEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEeecccceE Confidence 9985321 11223344311 22222 12211111110 0 0 0111222222211 26666664 Q ss_pred ecccccccccCCCCcCh Q lcl|NC_016566. 283 LKASARTPVEGVRSFKL 299 (364) Q Consensus 283 w~~~~~~~~~gg~SPT~ 299 (364) --... .-|.- T Consensus 329 ~l~~~-------~~~~~ 338 (338) T protein:vir:78 329 KFVDD-------EDPDA 338 (338) T ss_pred EEecc-------cCCCC Confidence 42221 01221 No 64 >protein:vir:4997 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:109 # MgeName: Sfi21 # Cross-refs: genbank:acc:NP_049971;genbank:gi:9632943;genbank:GeneID:1262106 Probab=97.14 E-value=0.00015 Score=41.50 Aligned_cols=269 Identities=12% Similarity=0.046 Sum_probs=105.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) ---.+..+++....++.+.+..-+.+-+ ..+-+. ...|.+.. +-+....+...-... ++...+ ....+... T Consensus 116 ~gg~~iP~~~~~~ii~~~~~~~~l~~~~--~~~~~~--~~~~~~~~-~~~~~~~~~a~~v~E---~~~~~~-~~~~~~~~ 186 (397) T protein:vir:49 116 DAGLTIPQDIRTAINTLVRQFDSLQEYV--NVENVT--TLTGSRVY-EKWADITGLAKLDDE---GGQIGQ-NDDPKLSL 186 (397) T ss_pred cCcceecHHHHHHHHHHHHhhhhHhhhc--ceeecc--CCcceEEE-EeeccCCcceeeecc---cccccc-ccccceee Confidence 0001123333333444444322211110 000000 11122211 111111111111111 111000 01111122 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFP 160 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~ 160 (364) +-....+-.+-+.++...+. ..+..+..-|.+.+++...+...+.+| .| ... +....... T Consensus 187 v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~~~~~~~d~ail---~G----~g~----------~~~~~~~~ 246 (397) T protein:vir:49 187 IRYAIKRYAGISTVTNSLLA---DSAENILAWLSGWIAKKVVVTRNKAIL---EA----IGT----------LPNKPTLA 246 (397) T ss_pred eEeeeeeeEeehhhHHHHHh---hhhHHHHHHHHHHHHHHHHHHHHHHHH---hc----ccc----------cccccccc Confidence 22222222222345555443 223334445667777766665444332 22 110 00122345 Q ss_pred cHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccc--cccc-cceeecccCCcEEEEeCCC--CCCCCceE Q lcl|NC_016566. 161 TLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQV--FAIG-DLQVMGDGLGRRFIISDAA--ADAMGAGK 235 (364) Q Consensus 161 s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~--~~~~-~~~~~~~~lGrrVIVDD~~--p~~~~~Yt 235 (364) ++.++.++..++-.....-..|+||+..+..|.+ +.+..+- +... .-......+|++|++.+++ |.....-. T Consensus 247 ~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~l~~---lkd~~g~~l~~~~~~~g~~~~l~G~pV~~~~~~~~~~~~~~~~ 323 (397) T protein:vir:49 247 KWDDIIDLQAKVDPAIKQTSLFLTNTSGFTALKK---VKNAMGDYLMERDVKSPTGYSIDGFVVKEISDRFLPNGTGGAM 323 (397) T ss_pred CHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHH---hhccCCceeecccccCCCCceecceeeEEecccccccccCCce Confidence 6678888988888888888999999999998864 4333332 2111 1111112469999987765 43333334 Q ss_pred EEEEec--ceeEEec-CCCCcceeeccCC--CceeeeEEeeEEE---EeeeeeeeecccccccccCCCCcChhhhcCCcc Q lcl|NC_016566. 236 MLGLVP--GAVAVTT-NGLDMLAQEKGGN--ENIERWWQGEFDF---NVAVKGYRLKASARTPVEGVRSFKLSDITDKAN 307 (364) Q Consensus 236 tylfg~--GAi~~~~-~~~~~~~~~~~g~--e~~~~~~~~~~~f---~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~N 307 (364) .++||. -++.+.+ .++.+...+..+. ..-.+.++.+..| .++|..|..-. T Consensus 324 ~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~r~d~~~~~~~a~~~~~---------------------- 381 (397) T protein:vir:49 324 PLYFGDLKQAVTLFDRQHLSLLSTNIGGGAFETDTTKVRVIDRFDVVSTDTEAFVPAS---------------------- 381 (397) T ss_pred eEEEeeccceEEEEeecccEEEEeccccchhhcCeeeEEEEEeeccEEecccceEEEE---------------------- Confidence 566773 3444443 3333333222221 1112222222111 25666665522 Q ss_pred ceeecCcCcCcceEEEEecCccccccccccccc Q lcl|NC_016566. 308 WELDQGQVDNAPATVQDVGSDSDTKGRRRTQTA 340 (364) Q Consensus 308 W~rV~~s~K~~pgv~~~~~~~~~~~~~~~~~~~ 340 (364) +.+.++......+++| T Consensus 382 -----------------~~~~~~~~~~~~~~~~ 397 (397) T protein:vir:49 382 -----------------FKAIADQKAKLSTAGA 397 (397) T ss_pred -----------------ecccccccCcccccCC Confidence 2222222222222222 No 65 >protein:vir:4511 Length: 409 # NCBI annotation: capsid # Family: family:all:21 # MgeID: mge:97 # MgeName: V # Cross-refs: genbank:acc:NP_599037;genbank:gi:19548995;genbank:GeneID:935211 Probab=97.12 E-value=0.00016 Score=41.37 Aligned_cols=272 Identities=7% Similarity=-0.018 Sum_probs=95.9 Q ss_pred CCccccchhh-------------------------------------hhhhhhhhHHHHHHHhhhhcceeEeccCccc-C Q lcl|NC_016566. 1 MSLTVFQRKL-------------------------------------VTAVTQMIPDNLNVFNAAANGAVVLGTGEVL-K 42 (364) Q Consensus 1 fd~~vfn~~~-------------------------------------~~~~~e~i~q~~~~fn~as~gAivl~~~~~~-G 42 (364) --.+.|..++ ...+++.+.+....++-. .+ .+.. + T Consensus 87 ~~~~a~~~~l~~~~~~~~~~e~~~~~~~~a~~~~~~~~gg~liP~~~~~~ii~~~~~~~~l~~~~---~~----~~~~~~ 159 (409) T protein:vir:45 87 KRAQVFDKWMRHGASELTSEERKALRELRAQGVAQDEKGGYTVPETFLAKVVEKMKSYGGIASVA---QI----LTTSDG 159 (409) T ss_pred HHHHHHHHHHHhhhhhccHHHHHHHHHHhhccCccCcCCceeccHhHHHHHHHHHHhhhhhhhhc---ee----eecCCC Confidence 1111122222 222222222211111100 00 0110 1 Q ss_pred ceeeeehhhhhcccccccccccCCCc-cccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHH Q lcl|NC_016566. 43 DVVEKMSVGLIANLVTDRNAYAPVGT-PATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAI 121 (364) Q Consensus 43 df~~~~~f~~i~g~~~~~d~~~~~~~-~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw 121 (364) .....+....... .-..-+.+.. ..+...+.+..-.++|++.. .+.++...+.. .++ .+...|.+++++.. T Consensus 160 ~~~~~~~~~~~~~---~~~~v~E~~~~~~~~~~f~~~~l~~~k~~~~--~i~is~ell~d--s~~-~l~~~i~~~la~a~ 231 (409) T protein:vir:45 160 RTMEWATADGTSE---VGVLLGENEEAGEEDTDFGMGSLGALKMTSK--IIRVSNELLQD--SAI-DMEAYLARRIAERI 231 (409) T ss_pred ceEEEEeeccCcc---ccccccccccccccccccceeeeeeeeeeee--ehhhhHHHHhc--cHH-HHHHHHHHHHHHHH Confidence 1111111110000 0000000000 00101122222223344332 23355554431 122 33445667777777 Q ss_pred HHHHHHHHHHHHhhhhcc-cccceeecccccCcccccccccHHHHHHHHHHhcccccCee--EEEEchHHHHHHHHhhcc Q lcl|NC_016566. 122 MLHYLKAGIGAGKAAIES-NAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIK--SWFMDGVTWANFIAYQAL 198 (364) Q Consensus 122 ~~~~qk~lla~L~Gv~~~-na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~--~ivMHS~v~~~L~k~~~i 198 (364) .+...+.+|. ..|.-.. +...+...+...........+++.++.++...|......-. .|+||+.++..|.+ | T Consensus 232 ~~~~~~a~l~-G~G~~~~~~p~Gil~~~~~~~~~~~~~~~~~d~i~~l~~~l~~~~~~~a~~~~~~n~~~~~~l~~---l 307 (409) T protein:vir:45 232 GRGEARYLIQ-GTGAGTPKQPKGLAASVTGTTQTAAANAVKWQEILALKHSIDPAYRRGPKFRLAFNDNTLKLISE---M 307 (409) T ss_pred HHHHHHHhhc-cCCCCCccccceeeeccccccccccccccchHHHHHHHHhhhhhhccCCeEEEEECHHHHHHHHH---h Confidence 7655554442 1111000 00001111111111223345667788888888866655544 45789999988854 4 Q ss_pred ccccc--ccccc-cceeecccCCcEEEEeCCCCCCC-CceEEEEEec-ceeEEec-CCCCcceeeccCCCceeeeEEeeE Q lcl|NC_016566. 199 PSAEQ--VFAIG-DLQVMGDGLGRRFIISDAAADAM-GAGKMLGLVP-GAVAVTT-NGLDMLAQEKGGNENIERWWQGEF 272 (364) Q Consensus 199 t~~~~--~~~~~-~~~~~~~~lGrrVIVDD~~p~~~-~~Yttylfg~-GAi~~~~-~~~~~~~~~~~g~e~~~~~~~~~~ 272 (364) .+.++ ++... .-......+|+||+++|.||... +.++ .+||. .-+.+.. +...........-+...+.++... T Consensus 308 kd~~G~~i~~~~~~~~~~~~l~G~PV~~~~~~p~~~~~~~~-i~~Gd~~~~~i~~~~~~~~~~~~d~~~~~~~~~~~~~~ 386 (409) T protein:vir:45 308 EDGQGRPLWLPDIVGVAPASVLNVPYVIDQEIDDIGAGKKF-MFCGDFDRFIIRRVRYMILKRLVERYAEYDQTGFLAFH 386 (409) T ss_pred hcCCCceeeccCcCCCCCceecceeeEEecCcCCccCCccE-EEEeehhhhheeeccceEEEEeecccccCCcEEEEEEE Confidence 33333 22211 00111123699999999999633 3443 44544 1122222 221111111001111122222222 Q ss_pred EE---EeeeeeeeecccccccccCC Q lcl|NC_016566. 273 DF---NVAVKGYRLKASARTPVEGV 294 (364) Q Consensus 273 ~f---~lhp~G~sw~~~~~~~~~gg 294 (364) .| .++|..|+--... ...|| T Consensus 387 r~d~~~~~~~A~~~l~~k--~s~~~ 409 (409) T protein:vir:45 387 RFDCILEDTSAIKALVGK--GSVGG 409 (409) T ss_pred EeccEeechhheEEEEec--cCCCC Confidence 22 2445544432211 11122 No 66 >protein:vir:1328 Length: 392 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:28 # MgeName: phi-C31 # Cross-refs: genbank:acc:NP_047927;swissprot:trembl:q9zwv6;genbank:gi:9631145;uniprot:Q9ZWV6;genbank:GeneID:2715889 Probab=97.07 E-value=0.00018 Score=41.08 Aligned_cols=263 Identities=9% Similarity=-0.008 Sum_probs=101.7 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) -+-.+..+..+...++.+. ..++-....+-+.... .|.-...|....-..+ .-+.-.... ...+| ++.+-.- T Consensus 117 ~~g~~~~~~~~~~~i~~~~---~~~~~l~~~~~~~~~~--~~~~~~~~~~~~~~~a-~~v~E~~~~-~~~~~-~f~~v~~ 188 (392) T protein:vir:13 117 GNPNVLSRTLYGQLIAQAV---ERSAIMRGGASTFTTS--DANPMDFTVITGRATA-GIVGETAEI-PESYP-ATTQRSM 188 (392) T ss_pred CCCccccccchHHHHHHHH---hhhhhhhhcceeeecC--CCceeEEEEEcCCcce-eeecccccc-ccccc-ceeeEEe Confidence 1111222333333333222 2222222222221110 1111222222211110 000101000 00111 2222222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH-----HhhhhcccccceeecccccCccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA-----GKAAIESNAAANYTQPARVDGVG 155 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~-----L~Gv~~~na~~v~dis~~t~~~~ 155 (364) ...|++. -+.++..-+. .....+...|.+++++...+..-+.+|.- ..|++...+... .. -... T Consensus 189 ~~~k~~~---~~~iS~ell~---ds~~~l~~~i~~~l~~~i~~~~d~~~l~G~Gt~~p~Gil~~~~~~~--~~---~~~~ 257 (392) T protein:vir:13 189 GGFKYGF---ASVVSYEFAT---DQVLDLVGFLVSDAGPAIGDAMGRHFLTGTGTGQPRGILTDATGAN--AA---FGEA 257 (392) T ss_pred eeeeEEe---eehhHHHHHh---cchHHHHHHHHHHHHHHHHHHHHHHHhcccCCcccccccccccccc--cc---cccc Confidence 2233332 2345555443 11223344577777777776655555421 112222211100 00 0111 Q ss_pred ccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecccCCcEEEEeCCCCCCCC Q lcl|NC_016566. 156 GRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGDGLGRRFIISDAAADAMG 232 (364) Q Consensus 156 ~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~~lGrrVIVDD~~p~~~~ 232 (364) .....++.++.+....|......=..|+||+..+..|.+ +.+..+ ++... ........+|+||+++|.||.. T Consensus 258 ~~~~~~~d~l~~~~~~l~~~~~~~a~~v~n~~~~~~l~~---lkd~~G~~l~~~~~~~g~~~~l~G~Pv~~~~~~~~~-- 332 (392) T protein:vir:13 258 DADSKVSDALIDLFHEVPSAYRKNAKFVVNDLRAAQMRK---LKDANGQYLWQSALTVGAPDTFNGKVVETDDGMPAD-- 332 (392) T ss_pred ccccccHHHHHHHHHhhhhhhhcCCEEEEcHHHHHHHHH---hhccCCceeecCCcCCCCCceecceeeEEcCCCCCC-- Confidence 223456778888888886666566789999999998854 444333 22211 1111112369999999999953 Q ss_pred ceEEEEEec-ceeEEec-CCCCcceeeccCCCceeeeEEeeEEE---Eeeeeeeeeccccccc Q lcl|NC_016566. 233 AGKMLGLVP-GAVAVTT-NGLDMLAQEKGGNENIERWWQGEFDF---NVAVKGYRLKASARTP 290 (364) Q Consensus 233 ~Yttylfg~-GAi~~~~-~~~~~~~~~~~g~e~~~~~~~~~~~f---~lhp~G~sw~~~~~~~ 290 (364) +.+||. ..+.++. ++..+.+.....-+.-.+.++....| .+||..|.--.-+.+. T Consensus 333 ---~i~~Gdf~~~~i~~~~~~~i~~~~~~~~~~~~~~~r~~~r~d~~~~~~~A~~~~~~~~aa 392 (392) T protein:vir:13 333 ---KVLFADLSKYRVRFAGSLRVDRSVDAKFSTDQIVYRFLQRADGLLVDARGAKVLTVTPAA 392 (392) T ss_pred ---cEEEeeccceeEEeecceEEEeeccccccCCcEEEEEEEEeccEEecccceEEEEeeccC Confidence 234444 2222222 22211111111111111222222221 2556666553332111 No 67 >protein:vir:3991 Length: 404 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:319 # MgeName: BK5-T # Cross-refs: genbank:acc:NP_116499;genbank:gi:14251132;genbank:GeneID:921252 Probab=97.04 E-value=0.0002 Score=40.91 Aligned_cols=267 Identities=9% Similarity=-0.005 Sum_probs=99.0 Q ss_pred CCc------------------------------------cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCce Q lcl|NC_016566. 1 MSL------------------------------------TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDV 44 (364) Q Consensus 1 fd~------------------------------------~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf 44 (364) ... .+-.+++...+++.+.+.....+-. .-.++.+.- T Consensus 87 ~~~~~~~~~~~~~~~~~~~~~~~~~e~~a~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~-------~~~~~~~~~ 159 (404) T protein:vir:39 87 YELKDKFVKEFVNMVRNPMAFLNTVSSKTETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQYV-------RVESVSTSN 159 (404) T ss_pred hhhHHHHHHHHHHHHhcchhhhhhhhhhhhhcccccCCceeccHHHHHHHHHHHHhhhhHHhhc-------ceeeccCCc Confidence 000 0112222222233333222211111 011111111 Q ss_pred eeeehhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 45 VEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLH 124 (364) Q Consensus 45 ~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~ 124 (364) ...+++..-++... -...+.+..... ...-....+.....+-.+-+.++...+.. ....+...|.+++++...+. T Consensus 160 ~~~~~~~~~~~~~~-a~~v~Eg~~~~~-~~~~~f~~i~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~~~~~~~ 234 (404) T protein:vir:39 160 GSRVYEKWTDVTPL-TVMDAEDGKIPD-LDNPRLTIIKYLIKRYAGIITATNTLLKD---TAENILAWLSSWIAKKVVVT 234 (404) T ss_pred ceEEEEeecCCccc-eeeecCcccccc-ccccceeeEEeeeeeEEeeehhHHHHHhh---chHHHHHHHHHHHHHHHHHH Confidence 11122221111000 000101111000 01112223332222222333456554432 22233445667777777765 Q ss_pred HHHHHHHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHH-hcccccCeeEEEEchHHHHHHHHhhccccccc Q lcl|NC_016566. 125 YLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASK-FGDQAALIKSWFMDGVTWANFIAYQALPSAEQ 203 (364) Q Consensus 125 ~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~-lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~ 203 (364) .-+.+|... +.+. ......++.++.++... +......-..|+||...|..|.+ +.+..+ T Consensus 235 ~d~~il~g~----g~~~-------------~~~~~~~~~~i~~~~~~~~~~~~~~~a~~v~n~~~~~~L~~---lkd~~G 294 (404) T protein:vir:39 235 RNQAIIAAM----GTVP-------------KKPTIAKFDDVITMINTSVDPAIIATSSLLTNQSGLNKLAL---VKTAEG 294 (404) T ss_pred HHHHHHhcc----cccc-------------cccccccHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHH---hhccCC Confidence 555443221 1111 11123345567777654 44444555689999999999864 333322 Q ss_pred --ccccc-cceeecccCCcEEEEeCCC--CCCCCceEEEEEec---ceeEEecCCCCcceeeccC--CCceeeeEEeeEE Q lcl|NC_016566. 204 --VFAIG-DLQVMGDGLGRRFIISDAA--ADAMGAGKMLGLVP---GAVAVTTNGLDMLAQEKGG--NENIERWWQGEFD 273 (364) Q Consensus 204 --~~~~~-~~~~~~~~lGrrVIVDD~~--p~~~~~Yttylfg~---GAi~~~~~~~~~~~~~~~g--~e~~~~~~~~~~~ 273 (364) ++... .-......+|++|++.|.+ |.....-..++||. +.......+..+...+... -+.-.+.++.+.. T Consensus 295 ~~l~~~~~~~~~~~~l~G~pV~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r 374 (404) T protein:vir:39 295 KYLLEPDPTKPNSYLIKGKKVIVVADRWLPNSGSTVYPLYYGDMSQAITLFDRENMSLLPTNIGAGAFETDTTKIRVIDR 374 (404) T ss_pred ceeeccCcCCCCcceecceeEEEecccccCccCCCccEEEEEeccccEEEEeecceEEEEeccchhhhhhceeeEEEEee Confidence 22111 0011112369999998764 43322223455553 3333333444333333221 1222233333222 Q ss_pred E---Eeeeeeeeecc-cccccccCCCCcChh Q lcl|NC_016566. 274 F---NVAVKGYRLKA-SARTPVEGVRSFKLS 300 (364) Q Consensus 274 f---~lhp~G~sw~~-~~~~~~~gg~SPT~a 300 (364) | .+||..|..-. ++.+.. +|.+|+=- T Consensus 375 ~d~~~~~~~a~~~~~~~~~a~~-~~~~~~~~ 404 (404) T protein:vir:39 375 FDVKTTDSEALVAGSFTAIADQ-VGNFTAGK 404 (404) T ss_pred eccEEecccceEEEEeeccccC-CCCCCCCC Confidence 2 26676665522 222221 22333322 No 68 >protein:vir:100172 Length: 394 # NCBI annotation: putative major head protein # Family: family:all:21 # MgeID: mge:1524 # MgeName: phi AT3 # Cross-refs: genbank:acc:YP_025031;genbank:gi:48697264;genbank:GeneID:2948270 Probab=96.81 E-value=0.00032 Score=39.71 Aligned_cols=263 Identities=7% Similarity=0.002 Sum_probs=93.8 Q ss_pred CCc---------------------------------cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeee Q lcl|NC_016566. 1 MSL---------------------------------TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEK 47 (364) Q Consensus 1 fd~---------------------------------~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~ 47 (364) +.. ..-.+++...+++.+.+....++-.. ..++.+.-... T Consensus 85 ~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~-------~~~~~~~~~~~ 157 (394) T protein:vir:10 85 KKKPIDAKKKAINDFIHSHGKVIDNAAGHVTSTEAGVLIPEEIIYDPTAEVNSVVDLSTLVT-------KTPVTTPKGTY 157 (394) T ss_pred hhhHHHHHHHHHHHHHhccchhhhhhhcccccccCceeccHHHHHHHHHHHHhhhhhhhhce-------eeeccCCceEE Confidence 000 01122233333333333222211111 01111111122 Q ss_pred ehhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 48 MSVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLK 127 (364) Q Consensus 48 ~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk 127 (364) |....-++...-....+......+| ....+..+...-.+-+.++...+. .....+...|.+.+++...+-..+ T Consensus 158 ~~~~~~~~~~~~~~E~~~~~~~~~~----~~~~v~l~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~la~~~~~~~~~ 230 (394) T protein:vir:10 158 PILKRATDRFSSVAELAENPALAEP----EFEQVDWSVSTYRGAIPLSEEAIA---DSAVDLTSLVGQSINEKSVNTYNA 230 (394) T ss_pred EEEecCCCccccccccccccccccc----cceeEEeeeeeeEeeehhHHHHHh---hhhHHHHHHHHHHHHHHHHHHHHH Confidence 2111111111000000000000001 112222222222222345555443 122233344556666555544333 Q ss_pred HHHHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHH-hcccccCeeEEEEchHHHHHHHHhhccccccc--c Q lcl|NC_016566. 128 AGIGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASK-FGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--V 204 (364) Q Consensus 128 ~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~-lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~ 204 (364) .++..+ +... .....+..++.++.++.+. +..... ..|+||..+|..|.+ +.+..+ + T Consensus 231 ~il~g~-------g~~~--------~~~~~~~~~~d~l~~~~~~~~~~~~~--a~~vmn~~~~~~l~~---lkd~~G~~i 290 (394) T protein:vir:10 231 MIAPVL-------QSFT--------AKATTTDTLVDSLKHILNVDLDPAYS--RALVVTQSLFNTLDT---LKDKNGRYL 290 (394) T ss_pred HHhhcc-------cccc--------cccccccccHHHHHHHHHhhhhhhcc--CEEEecHHHHHHHHH---hhccCCCee Confidence 322211 1100 0011223344566666654 344332 689999999998864 444443 2 Q ss_pred ccccc-----ceeecccCCcEEEEeCCC--CCCCCceEEEEEec---ceeEEecCCCCcceee-ccCCCceeeeEEeeEE Q lcl|NC_016566. 205 FAIGD-----LQVMGDGLGRRFIISDAA--ADAMGAGKMLGLVP---GAVAVTTNGLDMLAQE-KGGNENIERWWQGEFD 273 (364) Q Consensus 205 ~~~~~-----~~~~~~~lGrrVIVDD~~--p~~~~~Yttylfg~---GAi~~~~~~~~~~~~~-~~g~e~~~~~~~~~~~ 273 (364) +...- .......+|+||++.|.+ |...+. ..++||. +.+.+......+.... ......+...++.+. T Consensus 291 ~~~~~~~~~~~~~~~~L~G~PV~~~~~~~~~~~~~~-~~i~~gd~s~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~- 368 (394) T protein:vir:10 291 LHDASDSITDGTAKGTVLGVPVYVVGDALLGSAAGD-QKAFVGDLKRGVLFADRQQVTLAWEDSKIYGRYLGAAFRFGV- 368 (394) T ss_pred eeccccccccCCcccccccceeEEecccccCCCCCc-eEEEEeeccccEEEEeecceEEEEecccccceeEEEEEEecc- Confidence 22111 111113479999887654 333333 3455553 3333322332221111 112222222222222 Q ss_pred EEeeeeeeeecccccccccCCCCcChhhhcCCc Q lcl|NC_016566. 274 FNVAVKGYRLKASARTPVEGVRSFKLSDITDKA 306 (364) Q Consensus 274 f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~ 306 (364) -.+||..|.|-.-+.+.. -|+. +++. T Consensus 369 ~~~~~~ai~~~~~~~~~~----~~~~---~~~~ 394 (394) T protein:vir:10 369 KQADSNAGYFVTNTDAAS----GSTS---GTGK 394 (394) T ss_pred EEeccccEEEEEeecccC----CCCC---CCCC Confidence 238899998843321111 1221 2222 No 69 >protein:vir:78223 Length: 333 # NCBI annotation: Putative major head protein # Family: family:all:966 # MgeID: mge:1849 # MgeName: Bethlehem # Cross-refs: genbank:acc:YP_001491666;genbank:gi:157786490;genbank:GeneID:5625701 Probab=96.77 E-value=0.00035 Score=39.55 Aligned_cols=273 Identities=11% Similarity=0.035 Sum_probs=112.3 Q ss_pred CC---ccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhccc--cccccccc--CCC-ccccc Q lcl|NC_016566. 1 MS---LTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANL--VTDRNAYA--PVG-TPATA 72 (364) Q Consensus 1 fd---~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~--~~~~d~~~--~~~-~~~T~ 72 (364) |. .....+++...++|.+.+..-...- +.. .++.+.-.+.|.+.....+ +..-.... ... ...+. T Consensus 20 ~~~~~~~liP~~~~~~ii~~l~~~s~l~~~---~~~----~~~~~~~~~~p~~~~~~~a~~v~eg~~~~~~e~~~~~~~~ 92 (333) T protein:vir:78 20 LAHVPSDLLPKEIVGPIFDKAQESSLVLRM---GEQ----IPISYGETIIPTTVKRPEVGQVGVGTSNEQREGGLKPLSG 92 (333) T ss_pred eecCCccccchhHHHHHHHHHHhhchhhhh---cce----eeccCCceEEEEEeCCceeEeecCcccccccccccccccc Confidence 11 1144555666666666543222111 111 1122222233333211100 00000000 000 00000 Q ss_pred hhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH--------Hhhhhcccccce Q lcl|NC_016566. 73 KVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA--------GKAAIESNAAAN 144 (364) Q Consensus 73 ~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~--------L~Gv~~~na~~v 144 (364) .++.+-.-...|++. -+..+.+.+. .++..+...|.+++++...+...+.+|.. +.|+..... . T Consensus 93 ~~f~~i~l~~~kl~~---~~~is~ell~---~s~~~~~~~i~~~la~ai~~~~d~~~l~G~g~~~~~~~~g~~~~~~--~ 164 (333) T protein:vir:78 93 TAWDTRSVSPIKLAT---IVTVSEEFAR---MNPSGLYTKLQGDLAYAIGRGIDLAVFHGKSPLTGSALQGIDTDNV--I 164 (333) T ss_pred cceeEEEEeeEEEEE---eehhhHHHHh---cCHHHHHHHHHHHHHHHHHHHHHHHHhcccCCCCCccccccccccc--c Confidence 112222222223332 2334544332 34445556788888888888777766521 112111110 0 Q ss_pred eecccccCcccccccccHHHHHHHHHHhcc-cccCeeEEEEchHHHHHHHHhhccccccc--ccccccc-eeecccCCcE Q lcl|NC_016566. 145 YTQPARVDGVGGRTFPTLADFPLAASKFGD-QAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDL-QVMGDGLGRR 220 (364) Q Consensus 145 ~dis~~t~~~~~~~~~s~~~l~~A~~~lGD-~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~-~~~~~~lGrr 220 (364) ...+ ...........++.++.++..++.. ....-..|+||+..+..|.+...+.+..+ ++..... ......+|++ T Consensus 165 ~~~~-~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~~~~~~d~~G~~i~~~~~~~~~~~~l~G~P 243 (333) T protein:vir:78 165 ANTT-NVDYLQETGDPLLDRLLDGYDLVSANTDVEFNGWAVDPRFRAHLLRAQAYRDANGNVDPSRINLAAQTGDVLGLP 243 (333) T ss_pred cccc-cccccccccchhHHHHHHHHHhhccccccCceEEEEcchHHHHHHHHhhhcCCCCceeecCccccCCCceeecee Confidence 0000 0011112223456678888877643 34445689999999999977665654433 2222111 1111246999 Q ss_pred EEEeCCCCCCC-----CceEEEEEecc-eeEEec-CCCCcc--eee--ccCC-------CceeeeEEeeEEE---Eeeee Q lcl|NC_016566. 221 FIISDAAADAM-----GAGKMLGLVPG-AVAVTT-NGLDML--AQE--KGGN-------ENIERWWQGEFDF---NVAVK 279 (364) Q Consensus 221 VIVDD~~p~~~-----~~Yttylfg~G-Ai~~~~-~~~~~~--~~~--~~g~-------e~~~~~~~~~~~f---~lhp~ 279 (364) |+++|.+|... +++. .+||.- -+.++. +.+.+. ++. .+.+ +.-.+.++.+..| ++||. T Consensus 244 v~~~~~i~~~~~~~~~~~~~-~~~gD~~~~~~g~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~v~~r~~~r~d~~v~~~~ 322 (333) T protein:vir:78 244 AQFGRAVGGDLGAAVDSKTR-IIGGDFSQLKFGFADEIRIKMSDTATLTDSGSATVSMWQTNQIAILIEVTFGWLLGDKQ 322 (333) T ss_pred eEEccccCCCccccCCCccE-EEEEecccEEEEEeeccEEEEeccccccccccceeehhhcCcEEEEEEEEEccEEeccc Confidence 99999998642 1222 333321 122222 122111 111 1111 1112223333222 26677 Q ss_pred eeeecccccccccCCCCc Q lcl|NC_016566. 280 GYRLKASARTPVEGVRSF 297 (364) Q Consensus 280 G~sw~~~~~~~~~gg~SP 297 (364) .|.--... ..| T Consensus 323 a~~~l~~~-------~a~ 333 (333) T protein:vir:78 323 AFVKFVDD-------EQP 333 (333) T ss_pred ceEEEecc-------CCC Confidence 76653321 246 No 70 >protein:vir:962 Length: 397 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:19 # MgeName: bIL285 # Cross-refs: genbank:acc:NP_076616;genbank:gi:13095724;genbank:GeneID:920264 Probab=96.75 E-value=0.00037 Score=39.42 Aligned_cols=261 Identities=9% Similarity=0.042 Sum_probs=77.7 Q ss_pred CCccccchh------hhhhhhhhhHH-HHHHHhhhhcceeEeccCcccCceeeeehhhhhccccccccccc--------- Q lcl|NC_016566. 1 MSLTVFQRK------LVTAVTQMIPD-NLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYA--------- 64 (364) Q Consensus 1 fd~~vfn~~------~~~~~~e~i~q-~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~--------- 64 (364) ...+..++. .+..++..... .........+|..+ +..+.....+..-+..|....+...... T Consensus 104 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v--p~~~~~~i~~~~~~~~l~~~~~~~~~~~~~~~~~~~~ 181 (397) T protein:vir:96 104 KKFKVTEEELAEKRSAINAFVKSKGAEKRDGFTSVEGGALI--PQELLQPQLEPKDIVDLSKYVRSVPVNSASGKFPVIS 181 (397) T ss_pred HHHhhhhHHHHHHHHHHHHHHHhhhhhhhhcccccccccch--hHHHHHHHHHhhhhhhHHHhhhhccccccceeEEEEe Confidence 000000000 00000000000 00000001111110 0000000000000000000000000000 Q ss_pred --CC--C-c---cccch-hhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_016566. 65 --PV--G-T---PATAK-VLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKA 135 (364) Q Consensus 65 --~~--~-~---~~T~~-kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~G 135 (364) +. . + ...|. .-.+...+-..+..-.+-+.++...+. ..+..+...|.+++++...+-....++ T Consensus 182 ~~~~~~~~~~E~~~~~~~~~~~~~~i~~~~~~~~~~~~~s~ell~---ds~~~l~~~i~~~l~~~~~~~~~~~i~----- 253 (397) T protein:vir:96 182 KSGSKMATVQQLEKNPQLANPKMVEIDYSVATRRGYIPISQEMID---DASYDVTGLIADEIQDQSLNTKNADIA----- 253 (397) T ss_pred ccCCccccccccccccccccccccceeecHhHhhcchhhHHHHHh---hhHHHHHHHHHHHHHHHHHHHHHHHHh----- Confidence 00 0 0 00000 000111111111111111122222221 112222223444444333332222211 Q ss_pred hhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-ccee Q lcl|NC_016566. 136 AIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQV 212 (364) Q Consensus 136 v~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~ 212 (364) .+... .......++.++.++.+.+=+.... ..|+||+.+|..|.+ +.+..+ ++... .-.. T Consensus 254 --~g~g~-----------~~~~~~~~~d~~~~~~~~~~~~~~~-a~~v~n~~~~~~l~~---lkd~~G~~~~~~~~~~~~ 316 (397) T protein:vir:96 254 --AVLKT-----------ATAKSVVGVDGLKDLINKEIKKVYD-VKLFISASMYSELDK---LKDKNGRYLLQDSITAAS 316 (397) T ss_pred --hcccc-----------cccccccchHHHHHHHHHhhhhhcC-cEEEEcHHHHHHHHH---hhccCCCeEeccCccCCC Confidence 11110 0111233455666666654333222 579999999999864 433333 22211 1111 Q ss_pred ecccCCcEEEEeCCC-CCCCCceEEEEEec--ceeEEec-CCCCcceeeccC-CCceeeeEEeeEEEEeeeeeeeecccc Q lcl|NC_016566. 213 MGDGLGRRFIISDAA-ADAMGAGKMLGLVP--GAVAVTT-NGLDMLAQEKGG-NENIERWWQGEFDFNVAVKGYRLKASA 287 (364) Q Consensus 213 ~~~~lGrrVIVDD~~-p~~~~~Yttylfg~--GAi~~~~-~~~~~~~~~~~g-~e~~~~~~~~~~~f~lhp~G~sw~~~~ 287 (364) ....+|+||++.|.+ |.....-.+++||. .++.+.+ .+..+....... ...+...++.++ -.+||..|..-.-+ T Consensus 317 ~~~l~G~pv~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~d~-~~~~~~a~~~~~~~ 395 (397) T protein:vir:96 317 GKQLLGKEVVVLDDDVIGKSVGNVVGFIGDAKAFASFFDRKQVSVSWVDNNIYGQLLAGIIRYDV-KATDKKAGFYVTFT 395 (397) T ss_pred cccccccceEEecccccCCCCCceEEEEeehhcceEeEeecceEEEEecccccceeEEEEEEEcc-EEecccceEEEEee Confidence 112469999876654 43332224455664 1222322 222222211111 112211111121 23788888875543 Q ss_pred cc Q lcl|NC_016566. 288 RT 289 (364) Q Consensus 288 ~~ 289 (364) .+ T Consensus 396 ~a 397 (397) T protein:vir:96 396 IG 397 (397) T ss_pred cC Confidence 32 No 71 >protein:vir:99749 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1497 # MgeName: phiETA2 # Cross-refs: genbank:acc:YP_001004307;genbank:gi:122891761;genbank:GeneID:4712304 Probab=96.75 E-value=0.00037 Score=39.42 Aligned_cols=269 Identities=8% Similarity=-0.026 Sum_probs=112.9 Q ss_pred CCcc-------------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcc Q lcl|NC_016566. 1 MSLT-------------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIAN 55 (364) Q Consensus 1 fd~~-------------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g 55 (364) ++.+ ...+++...+++.+.+..-++.- + ...++.+.-...|.+..... T Consensus 9 ~~~~~~~~~~~~~~~~~a~~~~~~~~~~~lip~~~~~~ii~~~~~~s~l~~~----~---~~~~~~~~~~~~p~~~~~~~ 81 (324) T protein:vir:99 9 LNLQHFASNNVKPQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMRL----G---KYEPMEGTEKKFTFWADKPG 81 (324) T ss_pred HHHHHHHHHhhhhhhccccceeccCCCcceechhHHHHHHHHHHhhchhhhh----c---ceeeccCCceEEEEEecCcc Confidence 1111 12233333334444332221111 1 11122222123333321111 Q ss_pred cccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_016566. 56 LVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKA 135 (364) Q Consensus 56 ~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~G 135 (364) + .-+. .+......+ .++..-.-...|+.. -+.++.+.+. ....++...|.+++++.+.+..-+.+|..- | T Consensus 82 a-~~v~-Eg~~~~~~~-~~~~~v~~~~~k~~~---~~~iS~ell~---ds~~~l~~~i~~~l~~ai~~~~d~~~l~G~-g 151 (324) T protein:vir:99 82 A-YWVG-EGQKIETSK-ATWVNATMRAFKLGV---ILPVTKEFLN---YTYSQFFEEMKPMIAEAFYKKFDEAGILNQ-G 151 (324) T ss_pred e-eEec-cCccccccc-cceeEEEEeeEEEEE---eehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHHHhhhcC-C Confidence 1 1111 111100111 122222222233332 2345555443 223345567888888888887766555311 1 Q ss_pred hhccc-ccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--cccccccee Q lcl|NC_016566. 136 AIESN-AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQV 212 (364) Q Consensus 136 v~~~n-a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~ 212 (364) .++ ........ ...........++.++.++..++.++...-..|+||...|..|.+ +.+..+ ++....- T Consensus 152 --~~~~~~~~~~~~-~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~---l~d~~g~~~~~~~~~-- 223 (324) T protein:vir:99 152 --NNPFGKSIAQSI-EKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRK---IVDPETKERIYDRNS-- 223 (324) T ss_pred --CCccCccccccc-cccceeccccCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHH---hhcCCCceeecCCCC-- Confidence 110 00011100 111122234567788999999998888888899999999998864 333332 2222111 Q ss_pred ecccCCcEEEEeCCCCCCCCceEEEEEec-ceeEEec-CCCCcceeec--------cCC------CceeeeEEeeEEEE- Q lcl|NC_016566. 213 MGDGLGRRFIISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQEK--------GGN------ENIERWWQGEFDFN- 275 (364) Q Consensus 213 ~~~~lGrrVIVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~~~--------~g~------e~~~~~~~~~~~f~- 275 (364) +..+|++|++++.++...+ ..+||. .-+.++. +++.+.+.+. .++ +.-.+.++.+..|. T Consensus 224 -~~l~G~PVv~~~~~~~~~~---~~i~gd~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~ 299 (324) T protein:vir:99 224 -DTLDGLPVVNLKSSNLKRG---ELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVAL 299 (324) T ss_pred -ccccceeEEeecCCCCCcc---eEEEEecccEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEcc Confidence 2246999999998886543 223322 1112222 1111111100 000 11123333333332 Q ss_pred --eeeeeeeecccccccccCCCCcChhhh Q lcl|NC_016566. 276 --VAVKGYRLKASARTPVEGVRSFKLSDI 302 (364) Q Consensus 276 --lhp~G~sw~~~~~~~~~gg~SPT~aeL 302 (364) +||..|.--..+ ..+..++-+|. T Consensus 300 ~v~~~~a~~~lt~a----~~~~~~~~~~~ 324 (324) T protein:vir:99 300 HIADDKAFAKLVPA----DKKTDSVPGEV 324 (324) T ss_pred EEecccceEEEEec----cCCCCCCCCCC Confidence 566665442221 12334566666 No 72 >protein:vir:101607 Length: 379 # NCBI annotation: major capsid protein precursor # Family: family:all:585 # MgeID: mge:1646 # MgeName: 11b # Cross-refs: genbank:acc:YP_112497;genbank:gi:53793597;uniprot:Q5ZGF6;genbank:GeneID:3101715 Probab=96.64 E-value=0.00044 Score=38.96 Aligned_cols=253 Identities=9% Similarity=-0.095 Sum_probs=97.9 Q ss_pred CCcc-ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccc-cccccccCCCccccchhhhcc Q lcl|NC_016566. 1 MSLT-VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLV-TDRNAYAPVGTPATAKVLARM 78 (364) Q Consensus 1 fd~~-vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~-~~~d~~~~~~~~~T~~kit~~ 78 (364) -+.. .-.+......++.+.+.....+-++.. ++.+.-...|.....++.. .-.. .+......+| ++.+- T Consensus 113 ~~~~~~ip~~~~~~ii~~~~~~~~i~~~~~~~-------~~~~~~~~~~~~~~~~~~~~~~v~-Eg~~~~~~~~-~f~~i 183 (379) T protein:vir:10 113 VNLTGAQPKDYNFDVVLNPSQMLNVSDIVGAV-------SISGGTYTFVRENGAGEGAIGAQV-EGATKGQKDY-DISMI 183 (379) T ss_pred CCCccccchhhhhHHHHhHHhhhhHHhhceee-------eccCCceEEEEeecCCCccccccc-CCcccccccc-ceeee Confidence 1111 112233444455554443333332211 1112111222111111100 0000 1111111112 23333 Q ss_pred ceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccc Q lcl|NC_016566. 79 LTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRT 158 (364) Q Consensus 79 ~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~ 158 (364) .-..-|++. -+.++...+ +|...+..-|.+++++...+..-..++.. .+ +....... ...+ T Consensus 184 ~~~~~k~~~---~~~iS~ell----~D~~~l~~~i~~~la~~~~~~~~~~~~~g----~~--~~~~~~~~------~~~~ 244 (379) T protein:vir:10 184 DVNTDFIAG---FTRYSKKMA----NNLPFLTSFIPNALRRDYAKAENAAFNAV----LA--ANATASTE------IITN 244 (379) T ss_pred EeeeeeEEe---eehhhHHHH----hhHHHHHHHHHHHHHHHHHHHHHHHHhcc----cc--cccccccc------cccC Confidence 222233332 234555443 23323444555555543322222222211 11 11111111 1223 Q ss_pred cccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccc-ccce----eecccCCcEEEEeCCCCCCCCc Q lcl|NC_016566. 159 FPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAI-GDLQ----VMGDGLGRRFIISDAAADAMGA 233 (364) Q Consensus 159 ~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~-~~~~----~~~~~lGrrVIVDD~~p~~~~~ 233 (364) ..++.++.++..++-+..-.-..|+||...|..|.+ +.++.+-+-. .+.. .....+|+||++++.||-. T Consensus 245 ~~~~d~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~~~~l~G~pvv~s~~~~ag--- 318 (379) T protein:vir:10 245 KNKVEMLINEIAKQENLDFPVTAIVLRPTDYYDILV---TQKSVGAGYGLPGVVTQDNGVLRINGIPLFRATWLAAN--- 318 (379) T ss_pred cccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHH---hhccCCceeccCCccCCCCCcceecceeeEecCCCCCC--- Confidence 344567889999888888888899999999998854 3333332110 1111 0012359999999999842 Q ss_pred eEEEEEe---cceeEEecC-CCCcceeeccCCCceeeeEEeeEEE---Eeeeeeeeecccccccc Q lcl|NC_016566. 234 GKMLGLV---PGAVAVTTN-GLDMLAQEKGGNENIERWWQGEFDF---NVAVKGYRLKASARTPV 291 (364) Q Consensus 234 Yttylfg---~GAi~~~~~-~~~~~~~~~~g~e~~~~~~~~~~~f---~lhp~G~sw~~~~~~~~ 291 (364) +++|| .+++.+.++ ...+..+....=+.-.+.++.+.+| ..||..|-.-.- +.+ T Consensus 319 --~~~~gdf~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~R~~~~v~~p~a~v~~~~--~~~ 379 (379) T protein:vir:10 319 --KYYVGDWTRVTKVTTEGLSLEFSEVEGTNFVKNNITARIEAQVALAVEQPAALIFGDF--TAV 379 (379) T ss_pred --ceEEeecccEEEEEEeceEEEEeecccccccCCcEEEEEEEEeccEEecCccEEEEEe--cCC Confidence 13332 333433332 1111111111111112333333232 256666544211 111 No 73 >protein:vir:102119 Length: 404 # NCBI annotation: phage major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1641 # MgeName: phiSM101 # Cross-refs: genbank:acc:YP_699941;genbank:gi:110804052;genbank:GeneID:4206662 Probab=96.58 E-value=0.0005 Score=38.70 Aligned_cols=270 Identities=10% Similarity=0.026 Sum_probs=98.1 Q ss_pred CCc-----------------cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcc---cCceeeeehhhhhccccccc Q lcl|NC_016566. 1 MSL-----------------TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEV---LKDVVEKMSVGLIANLVTDR 60 (364) Q Consensus 1 fd~-----------------~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~---~Gdf~~~~~f~~i~g~~~~~ 60 (364) +.. .+-.+.+...+++.+.+.....+-. .-.++ .|.+....... ..++. .. T Consensus 100 ~~~~~~e~~a~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~l~-------~~~~~~~~~g~~~~~~~~~-~~~~~-~v 170 (404) T protein:vir:10 100 LNLSEKEINAISENIDEDGGYAVPEDIQTKINTRLKDTTDLYNMV-------DYEPVFTRSGSRTYEKRSK-QKPMK-PL 170 (404) T ss_pred hcchhhHHhhhccccCCCCceeechhHHHHHHHHHhhhhhHhhhh-------ceeeccCCccceEEEEecC-Cccee-ec Confidence 100 0011122222222222221111110 01111 12221111111 11100 00 Q ss_pred ccccCCC-ccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcc Q lcl|NC_016566. 61 NAYAPVG-TPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIES 139 (364) Q Consensus 61 d~~~~~~-~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~ 139 (364) ....... ...++ ++.+-.-...|++. -+.++...+. ..+..+..-|.+++++...+...+.+| .|.=.+ T Consensus 171 ~e~~~~~~~~~~~-~f~~i~~~~~k~~~---~~~iS~ell~---ds~~~l~~~i~~~la~~~~~~~~~~il---~G~g~~ 240 (404) T protein:vir:10 171 SENQQIPTNGDNG-KLERFNFKLKDLAD---FMSIPNDLLK---FADKSLEDWIINWFVDKVRITRNAEIL---YGAGGD 240 (404) T ss_pred ccccccccccccc-ceeeeEeeheeeEe---eehhhHHHHh---hcHHHHHHHHHHHHHHHHHHHHHHHHh---hcCCCC Confidence 0000000 00011 11111111122322 2345554432 223345556778888888876666544 332111 Q ss_pred cc-cceeecccccCcccccccccHHHHHHHHHH-hcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeec- Q lcl|NC_016566. 140 NA-AANYTQPARVDGVGGRTFPTLADFPLAASK-FGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMG- 214 (364) Q Consensus 140 na-~~v~dis~~t~~~~~~~~~s~~~l~~A~~~-lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~- 214 (364) +. ....... ...........++.++.+++.. +-.....=..|+||...|..|.+ +.+..+ ++...-..-.. T Consensus 241 ~~~~gi~~~~-~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~L~~---lkd~~G~~l~~~~~~~~~~~ 316 (404) T protein:vir:10 241 EHATGIMTAN-KFKKITLPKSPALKDFKKCKNVELLNVFKATSSWIVNQDGFNYLDS---LEDKTGRPYLQPDPKDPTQY 316 (404) T ss_pred Ccccceeecc-ccceeeccccccHHHHHHHHHhhhhccccCCCEEEEcHHHHHHHHH---hhccCCceeeccCcCCCCCc Confidence 11 1111100 0001112233455667666653 32223333569999999998865 333232 22211011111 Q ss_pred ccCCcEEEE-eCCCCCCCCceEEEEEec--ceeEEec-CCCCcceeec--cCCCceeeeEEeeEE---EEeeeeeeeecc Q lcl|NC_016566. 215 DGLGRRFII-SDAAADAMGAGKMLGLVP--GAVAVTT-NGLDMLAQEK--GGNENIERWWQGEFD---FNVAVKGYRLKA 285 (364) Q Consensus 215 ~~lGrrVIV-DD~~p~~~~~Yttylfg~--GAi~~~~-~~~~~~~~~~--~g~e~~~~~~~~~~~---f~lhp~G~sw~~ 285 (364) ..+|++|++ ++.+|.....-.+++||. .++.+.. .+........ ..-+.-.+.++.+.. =.+||.+|..-. T Consensus 317 ~l~G~PV~~~~~~~~~~~~~~~~~~~gd~s~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~r~d~~v~~~~a~~~~~ 396 (404) T protein:vir:10 317 RFLGLPVIELPNDLLLSTESAIPVLLGDTKEAYKYVSDGAYELATTNIGAGAFETNTTKARIIMRIDGNVKDSEALLIAE 396 (404) T ss_pred cccceeeEEecccccCCCCCccEEEEEeccccEEEEEecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEE Confidence 236999985 455554433334567773 3344332 3333222111 111111222222222 237888886633 Q ss_pred cccccccCCCCcC Q lcl|NC_016566. 286 SARTPVEGVRSFK 298 (364) Q Consensus 286 ~~~~~~~gg~SPT 298 (364) -+.+ .+|. T Consensus 397 ~~~a-----a~~~ 404 (404) T protein:vir:10 397 IPVE-----SVQA 404 (404) T ss_pred eecc-----cCCC Confidence 3322 2454 No 74 >protein:vir:80684 Length: 315 # NCBI annotation: gp6 # Family: family:all:966 # MgeID: mge:1884 # MgeName: PA6 # Cross-refs: genbank:acc:YP_001285582;genbank:gi:148727088;genbank:GeneID:5247055 Probab=96.36 E-value=0.00071 Score=37.86 Aligned_cols=283 Identities=9% Similarity=0.002 Sum_probs=108.3 Q ss_pred CC--c-----cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccch Q lcl|NC_016566. 1 MS--L-----TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAK 73 (364) Q Consensus 1 fd--~-----~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~ 73 (364) |- . -+-.+++...++|.+.+..-...-+ ..+ +..+.-...|.+..-..+ .-+... ......+ . T Consensus 1 Ma~~~~~~gg~~vP~~~~~~ii~~l~~~s~i~~l~---~~i----~~~~~~~~ip~~~~~~~a-~wv~Eg-~~~~~s~-~ 70 (315) T protein:vir:80 1 MADDFLSAGKLELPGSMIGAVRDRAIDSGVLAKLS---PEQ----PTIFGPVKGAVFSGVPRA-KIVGEG-EVKPSAS-V 70 (315) T ss_pred CCCCcCCcCceEcchHHHHHHHHHHHhhchhhhhc---cee----ecCCCceEEEEEeCCcce-EEeeCC-ccccccc-c Confidence 11 0 1224455566777776643332221 111 122221233433211111 011111 1111111 1 Q ss_pred hhhccceeeEEeccccCchhcCHHHHHhhcCCHHH-HHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccce-eeccccc Q lcl|NC_016566. 74 VLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNS-VAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAAN-YTQPARV 151 (364) Q Consensus 74 kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~-~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v-~dis~~t 151 (364) ++.+-.-...|++.- +..+.+-+.....+... +-..|.+++++...+...+.+|.-. +..++..... ......+ T Consensus 71 ~f~~v~l~~~kl~~~---~~iS~ell~~s~~~~~~~l~~~i~~~la~ai~~~~d~a~~~G~-~~~~~~~~~~~~~~~~~~ 146 (315) T protein:vir:80 71 DVSAFTAQPIKVVTQ---QRVSDEFMWADADYRLGVLQDLISPALGASIGRAVDLIAFHGI-DPATGKAASAVHTSLNKT 146 (315) T ss_pred ceeeeEeeeeeEEee---ehhhHHHhhcCchhHHHHHHHHHHHHHHHHHHHHHhhheeecc-CCCCCccccccccccccc Confidence 233333333444322 23555433211222111 2244666777766665554444211 1111111100 0000011 Q ss_pred CcccccccccHHHHHHHHHHh-cccccCeeEEEEchHHHHHHHHhhcc--c--ccccccccccceeecccCCcEEEEeCC Q lcl|NC_016566. 152 DGVGGRTFPTLADFPLAASKF-GDQAALIKSWFMDGVTWANFIAYQAL--P--SAEQVFAIGDLQVMGDGLGRRFIISDA 226 (364) Q Consensus 152 ~~~~~~~~~s~~~l~~A~~~l-GD~~~~l~~ivMHS~v~~~L~k~~~i--t--~~~~~~~~~~~~~~~~~lGrrVIVDD~ 226 (364) ............++.++..++ +.....-..|+||+.++..|.+...- . +...++...........+|++|+++|. T Consensus 147 ~~~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~imn~~~~~~L~~l~~~~g~~~~g~~~~~~~~~g~~~tl~G~PV~~~~~ 226 (315) T protein:vir:80 147 KNIVDATDSATADLVKAVGLIAGAGLQVPNGVALDPAFSFALSTEVYPKGSPLAGQPMYPAAGFAGLDNWRGLNVGASST 226 (315) T ss_pred cceeeccccchHHHHHHHHHHhhccCccceEEEEcHHHHHHHHHHhhccCCcccccccccccccCCCceecceeeEecCc Confidence 111112233455777787776 55555567899999999998653110 0 111112111111122346999999999 Q ss_pred CCCCCC-----ceEEEEEec---ceeEEecC-CCCcceeec-cCC-----CceeeeEEeeEEE---Eeeeeeeeeccccc Q lcl|NC_016566. 227 AADAMG-----AGKMLGLVP---GAVAVTTN-GLDMLAQEK-GGN-----ENIERWWQGEFDF---NVAVKGYRLKASAR 288 (364) Q Consensus 227 ~p~~~~-----~Yttylfg~---GAi~~~~~-~~~~~~~~~-~g~-----e~~~~~~~~~~~f---~lhp~G~sw~~~~~ 288 (364) ||.... +. ..+||. -.+++-.+ ..+...+.. ++. +.-.+.++.+..+ +.||..|.--+... T Consensus 227 ~~~~~~~~~~~~~-~~~~GDfs~~~~g~~~~~~i~i~~~~~~~~~~~~~~~~~~v~~r~~~r~~~~v~~~~a~~~l~~~~ 305 (315) T protein:vir:80 227 VSGAPEMSPASGV-KAIVGDFSRVHWGFQRNFPIELIEYGDPDQTGRDLKGHNEVMVRAEAVLYVAIESLDSFAVVKEKA 305 (315) T ss_pred CCccccccccccc-EEEEeecccEEEEEecCeeEEEeccccccCcccchhhcCcEEEEEEEEecceeecccceEEEeecc Confidence 985321 12 233342 12222111 111111111 110 1112233333222 26777777643322 Q ss_pred ccccCCCCcChhh Q lcl|NC_016566. 289 TPVEGVRSFKLSD 301 (364) Q Consensus 289 ~~~~gg~SPT~ae 301 (364) +.. .+|-.+. T Consensus 306 a~~---~~~~~~~ 315 (315) T protein:vir:80 306 APK---PNPPAEN 315 (315) T ss_pred CCC---CCCCCCC Confidence 211 1222222 No 75 >protein:vir:5739 Length: 366 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:122 # MgeName: PY54 # Cross-refs: genbank:acc:NP_892050;genbank:gi:33770513;interpro:IPR006444;uniprot:Q7Y410;genbank:GeneID:1732928 Probab=96.26 E-value=0.00081 Score=37.54 Aligned_cols=265 Identities=12% Similarity=0.117 Sum_probs=94.6 Q ss_pred CCcc--------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhccccccc Q lcl|NC_016566. 1 MSLT--------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDR 60 (364) Q Consensus 1 fd~~--------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~ 60 (364) |=.+ .-.+++...++|.+.+..-...- |+-++.. ..|+ .+.|-+..-..+ .-. T Consensus 52 ~a~~~~~~~~~~~a~~~~~~~Gg~lvP~~~~~~ii~~l~~~s~l~~l---g~~~v~~--~~g~-~~~p~~t~~~~a-~wv 124 (366) T protein:vir:57 52 FAATELGDTGLSMAISTAAGSGGALIPQNMQNEVIELLRDRTVVRIL---GARSIPL--PNGN-LSMPRLSGGATA-GYV 124 (366) T ss_pred HHHHhhcchhhhhhccccccCCccccchhHHHHHHHHHhhhcchhhh---ceeeeec--CCCc-eEEEEEeCCcce-eee Confidence 0000 01223334444444432111100 1211110 1233 233322211000 001 Q ss_pred ccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHH------HHh Q lcl|NC_016566. 61 NAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIG------AGK 134 (364) Q Consensus 61 d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla------~L~ 134 (364) ... . ....+..++.+-.-...|++ +-+.++.+.+.... + .+...|.+++++...+..-+.+|. -.. T Consensus 125 ~E~-~-~~~~s~~~f~~i~~~~~k~~---~~~~iS~ell~ds~--~-~~~~~i~~~l~~a~~~~~d~a~l~G~G~~~~p~ 196 (366) T protein:vir:57 125 GEG-K-DVVATGATFDDVKLSAKTMI---ALVPVSNQLIGRAG--F-NVEQLLLGDILSAIATREDKAFLRDDGTGDTPK 196 (366) T ss_pred ccC-c-cccccccceeEEEEeeEEEE---EeehhhHHHHhhhh--H-HHHHHHHHHHHHHHHHHHHHHhhccCCCCcccc Confidence 111 1 11111112333222223333 33346665443222 2 233457788888777766554442 122 Q ss_pred hhhcccccceeecccccCcccccccccHHHHHHHH---HHhcccccCeeEEEEchHHHHHHHHhhccccccc--cccccc Q lcl|NC_016566. 135 AAIESNAAANYTQPARVDGVGGRTFPTLADFPLAA---SKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGD 209 (364) Q Consensus 135 Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~---~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~ 209 (364) |++........... ..+. ..+......+.+.+ ...-.....-..|+||...+..|.+ +.+.++ ++.... T Consensus 197 Gi~~~~~~~~~~~~--~~~t-~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~vmn~~~~~~L~~---lkd~~G~~l~~~~~ 270 (366) T protein:vir:57 197 GMKAVATAANRLVA--WTGT-AINLTTIDEYLDSLILKHMDSNSNMIRCGWGLSNRTYMTLFG---LRDGNGNKVYPEMS 270 (366) T ss_pred ceeeccccccceee--cccc-ccchhhHHHHHHHHHHhhhccccccccCEEEecHHHHHHHHh---hhccCCceeccCCC Confidence 22221111000000 0000 00111111222222 2222233446789999999998865 433333 222222 Q ss_pred ceeecccCCcEEEEeCCCCCCC---CceEEEEEec-ceeEEec-CCCCc--ceee--ccCCCce-------eeeEEeeEE Q lcl|NC_016566. 210 LQVMGDGLGRRFIISDAAADAM---GAGKMLGLVP-GAVAVTT-NGLDM--LAQE--KGGNENI-------ERWWQGEFD 273 (364) Q Consensus 210 ~~~~~~~lGrrVIVDD~~p~~~---~~Yttylfg~-GAi~~~~-~~~~~--~~~~--~~g~e~~-------~~~~~~~~~ 273 (364) -.+ .+|+||++++.||... +.-..++||. .-+.++. ++... .++. .++.+.+ ...++.+.. T Consensus 271 ~g~---l~G~Pvv~s~~ip~~~~~~~~~~~i~~gdfs~~~i~~~~~i~i~~~~ea~~~~~~g~~~~~f~~~~~~iR~~~~ 347 (366) T protein:vir:57 271 QGI---LKGYPIQRTSAIPANLGDDGNESEIYFCDFNDVVIGEDGMMKVDFSTEATYKDADGQLVSAFARNQSLIRVVTE 347 (366) T ss_pred CCe---ecceeeEEccccccccccCCCccEEEEEecceEEEEEecceEEEEeeccccccccccchhhhhcCceeEEeeee Confidence 222 3599999999999632 1222344443 2222332 22211 1111 1111111 122333222 Q ss_pred EE---eeeeeeeecccccccccCCCCcChhhhcCCccc Q lcl|NC_016566. 274 FN---VAVKGYRLKASARTPVEGVRSFKLSDITDKANW 308 (364) Q Consensus 274 f~---lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW 308 (364) |. .||..| .+-++.+| T Consensus 348 ~d~~v~~~~a~-------------------~~lt~~~~ 366 (366) T protein:vir:57 348 HDIGFRHPEGL-------------------VLGTGVIW 366 (366) T ss_pred eCcEeeccccE-------------------EEEecccC Confidence 21 233333 33445556 No 76 >protein:vir:103955 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1662 # MgeName: phiNM # Cross-refs: genbank:acc:YP_873992;genbank:gi:118430767;genbank:GeneID:4525449 Probab=96.23 E-value=0.00085 Score=37.43 Aligned_cols=269 Identities=9% Similarity=-0.023 Sum_probs=110.4 Q ss_pred CCcc-------------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcc Q lcl|NC_016566. 1 MSLT-------------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIAN 55 (364) Q Consensus 1 fd~~-------------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g 55 (364) ++++ .-.+++...+++.+.+..-++.- + ...++.+.-...|.+..... T Consensus 9 ~~~~~f~~~~~~~~~~~a~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~~----~---~~~~~~~~~~~~p~~~~~~~ 81 (324) T protein:vir:10 9 LNLQHFASNNVKPQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQL----G---KYEPMEGTEKKFTFWADKPG 81 (324) T ss_pred HHHHHHHHHhhccceecccceeccCCCcceechhHHHHHHHHHHhhchhhhh----c---ceeeccCCceEEEEEeCCcc Confidence 1111 11222223333333322111111 0 11222222233443332111 Q ss_pred cccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_016566. 56 LVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKA 135 (364) Q Consensus 56 ~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~G 135 (364) + .-.... ......++ ++..-.-...|+.. -+.++.+.+. ..+.++...|.+++++.+.+..-+.+|. | T Consensus 82 a-~~v~Eg-~~~~~~~~-~~~~v~~~~~k~~~---~~~iS~ell~---ds~~~l~~~i~~~l~~ai~~~~d~a~l~---G 149 (324) T protein:vir:10 82 A-YWVGEG-QKIETSKA-TWVNATMRAFKLGV---ILPVTKEFLN---YTYSQFFEEMKPMIAEAFYKKFDEAGIL---N 149 (324) T ss_pred e-eEeccC-cccccccc-ceeEEEEeeEEEEE---eehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHHHhhh---c Confidence 1 111111 11111111 22222222233332 2345555443 3334455668888888888766665442 2 Q ss_pred hhccc-ccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--cccccccee Q lcl|NC_016566. 136 AIESN-AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQV 212 (364) Q Consensus 136 v~~~n-a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~ 212 (364) --.++ ....... ............++.++.++..++.++...-..|+||...|..|.+ +.+..+ ++....-. T Consensus 150 ~g~~~~~~~i~~~-~~~~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~---l~d~~g~~~~~~~~~~- 224 (324) T protein:vir:10 150 QGNNPFGKSIAQS-IEKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRK---IVDPETKERIYDRNSD- 224 (324) T ss_pred CCCCccCcccccc-ccccceeccccCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHH---hhccCCceeecCCCCc- Confidence 11111 0111111 0111122234567788999999998888888899999999998864 333332 22222211 Q ss_pred ecccCCcEEEEeCCCCCCCCceEEEEEec-ceeEEec-CCCCcceee---------ccCC-----CceeeeEEeeEEE-- Q lcl|NC_016566. 213 MGDGLGRRFIISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQE---------KGGN-----ENIERWWQGEFDF-- 274 (364) Q Consensus 213 ~~~~lGrrVIVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~~---------~~g~-----e~~~~~~~~~~~f-- 274 (364) ..+|++|++++.++...+ .++||. .-+.++. ++..+.+.+ ..+. +.-...++.+..| T Consensus 225 --~l~G~PV~~~~~~~~~~~---~~~~gd~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~r~d~ 299 (324) T protein:vir:10 225 --TLDGLPVVNLKSSNLKRG---ELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVAL 299 (324) T ss_pred --cccceeEEeecCCCCCcc---eEEEEecccEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEcc Confidence 246999999988876543 223332 1122322 122211111 0110 1112333333332 Q ss_pred -EeeeeeeeecccccccccCCCCcChhhh Q lcl|NC_016566. 275 -NVAVKGYRLKASARTPVEGVRSFKLSDI 302 (364) Q Consensus 275 -~lhp~G~sw~~~~~~~~~gg~SPT~aeL 302 (364) .++|..|.--..+. .+..++-+|. T Consensus 300 ~v~~~~A~~~l~~a~----~~~~~~~~~~ 324 (324) T protein:vir:10 300 HIADDKAFAKLVPAD----KKTDSVPGEV 324 (324) T ss_pred EEecccceEEEEecc----CCCCCCCCCC Confidence 25566554422111 1222344444 No 77 >protein:vir:94673 Length: 419 # NCBI annotation: major capsid protein # Family: family:all:585 # MgeID: mge:1527 # MgeName: mu1/6 # Cross-refs: genbank:acc:YP_579208;genbank:gi:93007444;genbank:GeneID:5076792 Probab=96.18 E-value=0.00091 Score=37.26 Aligned_cols=269 Identities=13% Similarity=0.045 Sum_probs=98.9 Q ss_pred CCcc---------ccchhhhhhhhhhhH-HHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccc-c----ccC Q lcl|NC_016566. 1 MSLT---------VFQRKLVTAVTQMIP-DNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRN-A----YAP 65 (364) Q Consensus 1 fd~~---------vfn~~~~~~~~e~i~-q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d-~----~~~ 65 (364) .+.. .--++.+...+..++ +.+...+-.. ++....-...|..+..+. +.. ....+ . .+. T Consensus 121 ~~~~~~~~~~~~~~~~p~~~~~~i~~~~~~~~~i~~~~~----~~~~~~~~~~~~~~~~~~-~~~-~~~~~~a~~v~Eg~ 194 (419) T protein:vir:94 121 RDAPAGTITNPNVPHLPQLVPGIVPTTPDLPLLVADLLD----QQNADYNVLEYIRDTSGT-AGA-GSTWNKAAVVPEGT 194 (419) T ss_pred cccccccccCCcccccchhhhHHHHHHHhhhhhhhhcce----eeeccCCceeeeeecccc-ccc-cccCcccceecCCc Confidence 0000 011222222222221 1111111111 000000001121211111 000 00000 0 000 Q ss_pred CCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHH-----HHhhhhccc Q lcl|NC_016566. 66 VGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIG-----AGKAAIESN 140 (364) Q Consensus 66 ~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla-----~L~Gv~~~n 140 (364) .....++ ++.+-.-...+++ +-+.++...+. |...+...|.+++++...+..-+.+|. -..|++... T Consensus 195 ~~~~~~~-~~~~i~~~~~k~~---~~~~is~ell~----d~~~l~~~i~~~la~a~~~~~d~aii~G~G~~~p~Gi~~~~ 266 (419) T protein:vir:94 195 AKPQSTL-SFDTITTTLKTVA---HWLPITRQAAD----DNSQLMGYIQGRLTYGLRFLRDRQLLNGNGSTEMQGILTTP 266 (419) T ss_pred ccccccc-ceeeEEeeeeeEE---EeehhhHHHHH----hHHHHHHHHHHHHHHHHHHHHHHHHHhccCcccccceeccc Confidence 0011111 1222222222233 22345544332 322344556666776666655554442 122222211 Q ss_pred ccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCC Q lcl|NC_016566. 141 AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLG 218 (364) Q Consensus 141 a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lG 218 (364) ... .+..............+.++.++..++-....+-..|+||...+..|.+. .-++... ++....-......+| T Consensus 267 ~~~--~~~~~~~~~~~t~~~~~~~l~~~~~~~~~~~~~~~~~v~n~~~~~~l~~~-k~~~~~~~~~~~~~~~~~~~~l~G 343 (419) T protein:vir:94 267 GIG--TYQQPKPTAPATDEPPLVDIRRAKTVAEIAGFPPDGVVVHPQDWESIELD-QAPGSGVFRVIANVQGEATPRIWG 343 (419) T ss_pred ccc--cccccccccccccchhHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHH-hhcCCCceeecCCcccCCCccccc Confidence 110 01111111112233346678889888866666777999999999988542 1111111 111111111223469 Q ss_pred cEEEEeCCCCCCCCceEEEEEe---cceeEEecCCCCcceeeccCC--CceeeeEEeeEEE---Eeeeeeeeeccccccc Q lcl|NC_016566. 219 RRFIISDAAADAMGAGKMLGLV---PGAVAVTTNGLDMLAQEKGGN--ENIERWWQGEFDF---NVAVKGYRLKASARTP 290 (364) Q Consensus 219 rrVIVDD~~p~~~~~Yttylfg---~GAi~~~~~~~~~~~~~~~g~--e~~~~~~~~~~~f---~lhp~G~sw~~~~~~~ 290 (364) ++|+++|.||... .+|| .+...+...++........++ ..-.+.++.+..| .++|.+|.--.-+ T Consensus 344 ~pV~~~~~~~~~~-----~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~~~~~--- 415 (419) T protein:vir:94 344 LNVVSTVAIAQGT-----ALVGGFRQGATLWSRQGITVLMTDSHADFFTANTLVILAEFRANLAVYQPKAFVRVTFA--- 415 (419) T ss_pred eeeEEcCCCCCcc-----EEEeeccceEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEeccccEEEEEec--- Confidence 9999999998532 2332 222233323332222222221 1112223332222 2667776652221 Q ss_pred ccCCCCcC Q lcl|NC_016566. 291 VEGVRSFK 298 (364) Q Consensus 291 ~~gg~SPT 298 (364) ..|| T Consensus 416 ----aa~~ 419 (419) T protein:vir:94 416 ----AATT 419 (419) T ss_pred ----cCCC Confidence 1355 No 78 >protein:vir:95763 Length: 297 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1578 # MgeName: SMP # Cross-refs: genbank:acc:YP_950590;genbank:gi:119953785;genbank:GeneID:5076833 Probab=96.13 E-value=0.00096 Score=37.12 Aligned_cols=264 Identities=8% Similarity=-0.044 Sum_probs=110.4 Q ss_pred CCccccchhhh-----------hhhhhhhHHHHHHHhhhhcceeEeccCcccCc-eeeeehhhhhcccccccccccCCCc Q lcl|NC_016566. 1 MSLTVFQRKLV-----------TAVTQMIPDNLNVFNAAANGAVVLGTGEVLKD-VVEKMSVGLIANLVTDRNAYAPVGT 68 (364) Q Consensus 1 fd~~vfn~~~~-----------~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gd-f~~~~~f~~i~g~~~~~d~~~~~~~ 68 (364) |+.+.|+...- ..+.+.|-+.+...+....-+-+ .+..+. ....|-......+ .-.. .+.. . T Consensus 1 m~~~~~~~~~~~~t~~~~~lvP~~~~~~ii~~~~~~s~l~~~~~~---~~~~~~~~~~~~~~~~~~~a-~~v~-Eg~~-~ 74 (297) T protein:vir:95 1 MTVQTFNPENVLVSQKKDGTLHKEFTDIIMKEVAQNSLVMQLGQY---QEMEGEQEKTVYVQTDGISA-YWVN-ETEK-I 74 (297) T ss_pred CCccccccccccccCCCcceechhHHHHHHHHHHhhchhhhhcce---eecCCCccEEEEEEcCCcee-EEee-cCcc-c Confidence 88887776622 11112222222222221111101 112221 1122211110000 0111 1111 1 Q ss_pred cccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecc Q lcl|NC_016566. 69 PATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQP 148 (364) Q Consensus 69 ~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis 148 (364) ..+ + .+...+-.+..+-.+-+..+.+.+. .....+.+.|.+++++.+.+...+.+|. |.=++.....+... T Consensus 75 ~~~--~-~~f~~v~l~~~k~~~~~~is~ell~---ds~~~l~~~i~~~la~ai~~~~d~a~l~---G~g~~~~~gi~~~~ 145 (297) T protein:vir:95 75 KTD--K-PEVVPVTLKAHKLGIILVTSREALN---YTWKKFFEDMKPQIVEAFYKKIDEAGLL---GHDTPFANSVAKAA 145 (297) T ss_pred ccc--c-cceeEEEEeeEEEEEeehhhHHHHh---cCHHHHHHHHHHHHHHHHHHHHHHHHhc---ccCCcccccccccc Confidence 101 1 1122222222222222345554443 2223455678899999999988887763 21111111111111 Q ss_pred cccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccceeecccCCcEEEEeCC Q lcl|NC_016566. 149 ARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGDGLGRRFIISDA 226 (364) Q Consensus 149 ~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~~lGrrVIVDD~ 226 (364) ... .......+++.++.++..++.++......|+||+..+..|.+ +.+..+ ++... ....+|++|++..+ T Consensus 146 ~~~-~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~---l~d~~G~~i~~~~----~~~l~G~Pv~~~~~ 217 (297) T protein:vir:95 146 KDA-NKVIGGPINYDNILKLQDALYDADVEPNAFVSKIQNRSALRE---ARDGNKVSIYDKA----ANTIDGITTVDLKS 217 (297) T ss_pred ccc-ceecccccCHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHH---hhccCCceeecCC----CCcccceeeEeecC Confidence 111 112233567888999999999888888999999999999864 333322 22111 11235999999887 Q ss_pred CCCCCCceEEEEEec-ceeEEec-CCCCcceeec---------cCC-----CceeeeEEeeEEE---Eeeeeeeeecccc Q lcl|NC_016566. 227 AADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQEK---------GGN-----ENIERWWQGEFDF---NVAVKGYRLKASA 287 (364) Q Consensus 227 ~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~~~---------~g~-----e~~~~~~~~~~~f---~lhp~G~sw~~~~ 287 (364) ++...+ ..+||. --+.++. ++..+...+. .+. +.-...++.+..| .++|..|.=-..+ T Consensus 218 ~~~~~~---~~~~gd~s~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~d~~v~~~~a~~~l~~a 294 (297) T protein:vir:95 218 ARFEKG---DLLAGDFDNLIYGVPYNITYKISEEGQISTITNADGTPINLFEQEMIAIRATMDIAVMITKTDAFAKLTPA 294 (297) T ss_pred CCCCCc---eEEEEecccEEEEEecCeEEEEeeccccccccccCccchhhhhcCcEEEEEEEEeccEeecccceEEEeec Confidence 776544 233433 1111222 1221111110 000 1111222222211 2555555442221 Q ss_pred cccccCCCCcC Q lcl|NC_016566. 288 RTPVEGVRSFK 298 (364) Q Consensus 288 ~~~~~gg~SPT 298 (364) +|- T Consensus 295 --------t~~ 297 (297) T protein:vir:95 295 --------ERV 297 (297) T ss_pred --------CCC Confidence 222 No 79 >protein:vir:99075 Length: 392 # NCBI annotation: gp30 # Family: family:all:10837 # MgeID: mge:1671 # MgeName: Wildcat # Cross-refs: genbank:acc:YP_655895;genbank:gi:109521467;genbank:GeneID:4158040 Probab=96.13 E-value=0.00097 Score=37.10 Aligned_cols=311 Identities=13% Similarity=0.027 Sum_probs=130.2 Q ss_pred CCccccchhhhhhh-hhhhHHHHHHHhhhhcceeEecc--Cc---ccCceeeeehhhhhcccccccccccCCCccccchh Q lcl|NC_016566. 1 MSLTVFQRKLVTAV-TQMIPDNLNVFNAAANGAVVLGT--GE---VLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKV 74 (364) Q Consensus 1 fd~~vfn~~~~~~~-~e~i~q~~~~fn~as~gAivl~~--~~---~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~k 74 (364) |--.+|.|++..+. ++.+.+.|- | +.++.+. .. ..||-+..|.++.+.......+.+..+ ...+++. T Consensus 1 Ma~~~~~p~~~a~~~l~~l~~~lv-~-----~~lv~~~~~~~~~~~~GdtV~i~~~~~~~~~~~~~~~~~~~-~~~~~~~ 73 (392) T protein:vir:99 1 MANAFSKPTAVVDTAIQMLQNELI-L-----TNLVWLNGIGDFAHKFNDTITVRVPAPSRGHTRKLRGAGAE-RNLTVSD 73 (392) T ss_pred CccccccHHHHHHHHHHHHHhhcc-c-----hhhhccccccccccCCCCeEEEeecccccceeeeccccccC-Ccccccc Confidence 88889999988765 777765432 2 2233222 11 357777887777654321111111111 1223334 Q ss_pred hhccceeeEEe-ccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCc Q lcl|NC_016566. 75 LARMLTNSVNL-SAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDG 153 (364) Q Consensus 75 it~~~~vaVkl-~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~ 153 (364) ++.... -+++ -.++-++.++..+......|+ .+.+.++.+....+..-+.+++.+.++-..+... .. T Consensus 74 ~~~~~~-~~~id~~k~~~~~i~d~e~~~~~~~~---~~~~~~~a~~ala~~vd~~i~~~~~~a~~~~~~~-------~~- 141 (392) T protein:vir:99 74 FTEDSF-PVTLTDVAYHLGVLTDEELTFDLESF---ATQILPRQVRGVADILEEGVRDMIVGAPYEAAGA-------VH- 141 (392) T ss_pred cccceE-EEEEeeeeecceeechHHHhhhhhhh---HHHHHHHHHHHHHHHHHHHHHHHHhccccccccc-------cc- Confidence 433322 2333 223335667777765434444 3445455555555555555666655532211110 00 Q ss_pred ccccccccHHHHHHHHHHhcccc--cCeeEEEEchHHHHHHHHhhcccccccc-------cccccceeecccCCcEEEEe Q lcl|NC_016566. 154 VGGRTFPTLADFPLAASKFGDQA--ALIKSWFMDGVTWANFIAYQALPSAEQV-------FAIGDLQVMGDGLGRRFIIS 224 (364) Q Consensus 154 ~~~~~~~s~~~l~~A~~~lGD~~--~~l~~ivMHS~v~~~L~k~~~it~~~~~-------~~~~~~~~~~~~lGrrVIVD 224 (364) .......+..|.+|.++|.++. .. +-+++.+..|..|++...+.+.... .+.+.+. ..+|-.|+++ T Consensus 142 -~~~~~~~~~~i~~a~~~L~~~~vP~~-R~~vv~p~~~~~l~~~~~~~~~~~~g~~~~~~l~~G~vg---~i~G~~v~~s 216 (392) T protein:vir:99 142 -EVAPDEFFKGVNGARRALNELYIPQG-RVLVVGTAVTEQILNDDRFIKYESQGQSAVSALQEARLG---RIYGYEIVES 216 (392) T ss_pred -ccChhhhHHHHHHHHHHHhhcCCCCC-CEEEEcHHHHHHHhcccceeecccccchhhhhhhcceee---eeeeeEEEee Confidence 0011123557889999997743 23 5788999999999765444322211 1223332 2348899999 Q ss_pred CCCCCCCCceEEEEEecceeEEecCCCCcceee-----ccCCCceeeeEEeeEEEEeeeeeeeecccccccccCCCCcCh Q lcl|NC_016566. 225 DAAADAMGAGKMLGLVPGAVAVTTNGLDMLAQE-----KGGNENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKL 299 (364) Q Consensus 225 D~~p~~~~~Yttylfg~GAi~~~~~~~~~~~~~-----~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~ 299 (364) -.+|... .+.|-..++.+....|..+... ..+...+..++-..+. ..|..... +- T Consensus 217 ~~~~~~t----~~a~~~~a~~~at~a~v~~~~~~~~~s~s~~~~v~~~~~~~~~-------~t~~s~~~---------~v 276 (392) T protein:vir:99 217 TLIPHGD----AYLYHPTAFIMATRAPAPPMGAVRSTAISGDQRIAMRWLVDYD-------STITSNRS---------LI 276 (392) T ss_pred ccccccc----ceeeeccccccccccccccccccceeEEecccceecceeeccc-------ceeecccc---------cc Confidence 9888653 3455555655554443221111 1111211111111110 11111000 00 Q ss_pred hhhcCCccceeecCcCcCcceEEEEecCcc--ccccccccccccccccccchhhcceeEEEeeeecC Q lcl|NC_016566. 300 SDITDKANWELDQGQVDNAPATVQDVGSDS--DTKGRRRTQTAQAVPTRNIKETAGVLVTLTATTAS 364 (364) Q Consensus 300 aeLat~~NW~rV~~s~K~~pgv~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 364 (364) .-+.-........ .......+.+.+.++. -..+...+.+ +.-..|-..+|.+|... T Consensus 277 ~~~~g~~~v~~~~-~~~~~~~~~~~~~~~~v~v~~v~~~~~~--------~~~~~~~~~~~~~t~~~ 334 (392) T protein:vir:99 277 DTYFGLKVVEDPN-GVGFVRARKIHLIPGSIEVAPEAGANAT--------ITAAAGEDHTVQLKVTD 334 (392) T ss_pred ceeEEEEEEeecc-ccceeeeeeeeeecceeeeeeeecccce--------eEeeeccceeEEEEEEe Confidence 0000000000000 0000001111111100 0011111111 11112333344444322 No 80 >protein:vir:96223 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1607 # MgeName: 69 # Cross-refs: genbank:acc:YP_239571;genbank:gi:66395304;genbank:GeneID:5132771 Probab=96.06 E-value=0.0011 Score=36.90 Aligned_cols=268 Identities=9% Similarity=-0.019 Sum_probs=106.2 Q ss_pred CCcc------------------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehh Q lcl|NC_016566. 1 MSLT------------------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSV 50 (364) Q Consensus 1 fd~~------------------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f 50 (364) ++.. +..+++...+++.+.+..-...-+ ...++.|.-.+.|.+ T Consensus 4 ~~~~~~~~~~f~~~~~~~~~~~a~~~~~~~~~~~lip~~~~~~ii~~~~~~s~l~~l~-------~~~~~~~~~~~~p~~ 76 (324) T protein:vir:96 4 TQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQLG-------KYEPMEGTEKKFTFW 76 (324) T ss_pred chhhhHHHHHHHHhhhhhhhcccccccccCCCcceechhHHHHHHHHHHhhchhhhhc-------ceeeccCCceEEEEE Confidence 1111 222333333333333321111110 111122221223322 Q ss_pred hhhcccccccccccCCCcc-ccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 51 GLIANLVTDRNAYAPVGTP-ATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAG 129 (364) Q Consensus 51 ~~i~g~~~~~d~~~~~~~~-~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~l 129 (364) ..-..+ ...+.+... .+...+.+-.-...|++ +-+.++.+.+. .+..++...|.+++++.+.+...+.+ T Consensus 77 ~~~~~a----~~v~Eg~~~~~~~~~f~~v~~~~~k~~---~~~~is~ell~---ds~~~l~~~i~~~l~~aia~~~d~~~ 146 (324) T protein:vir:96 77 ADKPGA----YWVGEGQKIETSKATWVNATMRAFKLG---VILPVTKEFLN---YTYSQFFEEMKPMIAEAFYKKFDEAG 146 (324) T ss_pred ecCcce----eeecCCccccccccceeEEEEEeEEEE---EeehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHHHh Confidence 111100 001111111 00011222222222333 22345554443 22334556788889988888776655 Q ss_pred HHHHhhhhccc-ccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccc--cc Q lcl|NC_016566. 130 IGAGKAAIESN-AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQV--FA 206 (364) Q Consensus 130 la~L~Gv~~~n-a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~--~~ 206 (364) |. |.-+++ .......... .........++.++.++..++.++......|+||+..+..|.+ +.+..+- +. T Consensus 147 l~---G~g~~~~~~~~~~~~~~-~~~~~~~~~~~~~i~~~~~~i~~~~~~~~~~i~n~~~~~~L~~---lkd~~G~~~~~ 219 (324) T protein:vir:96 147 IL---NQGNNPFGKSIAQSIKK-TNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRK---IVDPETKERIY 219 (324) T ss_pred hh---cCCCCCcCccccccccc-cceecccccchHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHH---hhCCCCCeeec Confidence 53 211111 1111111110 1111223456778999999998888888899999999998864 3333332 22 Q ss_pred cccceeecccCCcEEEEeCCCCCCCCceEEEEEec-ceeEEec-CCCCccee-e------c-cCC------CceeeeEEe Q lcl|NC_016566. 207 IGDLQVMGDGLGRRFIISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQ-E------K-GGN------ENIERWWQG 270 (364) Q Consensus 207 ~~~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~-~------~-~g~------e~~~~~~~~ 270 (364) ..... ..+|++|++++.++...+ ..+||. .-+.++. +++.+... + . .++ +.-...++. T Consensus 220 ~~~~~---~l~G~PV~~~~~~~~~~~---~~~~gd~s~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~n~v~~r~ 293 (324) T protein:vir:96 220 DRNSD---SLDGLPVVNLKSSNLKRG---ELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRA 293 (324) T ss_pred CCCCC---cccceeeEeecCCCCCcc---eEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEE Confidence 22222 246999999887775543 222322 1122222 11111110 0 0 000 111233444 Q ss_pred eEEEE---eeeeeeeecccccccccCCCCcChhhhcCCccceeecCcCcCcceEE Q lcl|NC_016566. 271 EFDFN---VAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQGQVDNAPATV 322 (364) Q Consensus 271 ~~~f~---lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pgv~ 322 (364) +..|. ++|..|.--..+ +|-- ..+||=+ T Consensus 294 ~~r~d~~v~~~~a~~~l~~a-------------------~~~~-----~~~~~~~ 324 (324) T protein:vir:96 294 TMHVALHIADDKAFAKLVPA-------------------DKRT-----DSVPGEV 324 (324) T ss_pred EEEeccEEecccceEEEecc-------------------cccC-----CCCCCCC Confidence 33332 445554432211 1111 1122211 No 81 >protein:vir:8187 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:153 # MgeName: Che9d # Cross-refs: genbank:acc:NP_817980;genbank:gi:29566414;genbank:GeneID:2700968 Probab=95.93 E-value=0.0012 Score=36.51 Aligned_cols=270 Identities=9% Similarity=-0.058 Sum_probs=108.1 Q ss_pred CCcccc-chhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccc Q lcl|NC_016566. 1 MSLTVF-QRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARML 79 (364) Q Consensus 1 fd~~vf-n~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~ 79 (364) =+-.++ .+++...++|.+.+..-...-+ .. .++.+.=...|.+.....+ .-.. .+......+| ++.+-. T Consensus 5 ~~gg~lvP~~~~~~ii~~~~~~s~i~~~~---~~----i~~~~~~~~~p~~~~~~~a-~wv~-Eg~~~~~~~~-~f~~v~ 74 (311) T protein:vir:81 5 ATGTFQLPKHLVPGVWQKAQGQSVLARLS---MA----EPQEFGEQQYMTLTAPPRG-EVVG-EGAQKSESTA-TFAPVT 74 (311) T ss_pred cCCceEcchhHHHHHHHHHHhcchhhhhc---ce----eecCCCceEEEEEeCCcee-EEee-cCcccccccc-eeeEEE Confidence 011122 2334455666666543322211 11 1122111344443322111 1111 1111111122 233333 Q ss_pred eeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH--------hhhhcccccceeeccccc Q lcl|NC_016566. 80 TNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAG--------KAAIESNAAANYTQPARV 151 (364) Q Consensus 80 ~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L--------~Gv~~~na~~v~dis~~t 151 (364) -...|++.. +..+.+-+.....+...+...|.+++++...+...+.+|.-. .|+.... .+..... T Consensus 75 l~~~kl~~~---~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~a~l~G~~~~~~~~~~gi~~~~----~~~~~~~ 147 (311) T protein:vir:81 75 AIPRKVQVT---QRFSQEVKWADESRQLGVLQTMADLSGVALGRALDLIGIHGINPLTGAALSGSPAKI----LDTTNIV 147 (311) T ss_pred EeeEEEEEe---ehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHhhhccccCCCCcccccccccc----cccceee Confidence 333444432 235544332223444445566888888888777666555321 1111110 0000000 Q ss_pred CcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccccce-eecccCCcEEEEeCCCC Q lcl|NC_016566. 152 DGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQ-VMGDGLGRRFIISDAAA 228 (364) Q Consensus 152 ~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~-~~~~~lGrrVIVDD~~p 228 (364) .............+.++..++-+...+-.+|+||+..+..|.+ +.+..+ ++...... .....+|++|++++.|| T Consensus 148 ~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~tl~G~Pv~~~~~i~ 224 (311) T protein:vir:81 148 ELTTGTSATPDLAVEAAVGLVLGDNLSPDGVALDNTFSFMLAT---QRDSQGRKLYPELGFGTDVASFAGLNAAVSDTVR 224 (311) T ss_pred eecccccchHHHHHHHHHHHhhhcCCCceEEEEcHHHHHHHHh---hhccCCCeeecCccccCCCceecceeEEeccccc Confidence 0001111111223455666665655666789999999999865 433333 22221111 11123699999999998 Q ss_pred CCCC----ceE---------EEEEecc-eeEEec-CCCCccee-eccCC------CceeeeEEeeEEE---Eeeeeeeee Q lcl|NC_016566. 229 DAMG----AGK---------MLGLVPG-AVAVTT-NGLDMLAQ-EKGGN------ENIERWWQGEFDF---NVAVKGYRL 283 (364) Q Consensus 229 ~~~~----~Yt---------tylfg~G-Ai~~~~-~~~~~~~~-~~~g~------e~~~~~~~~~~~f---~lhp~G~sw 283 (364) .... ... .++||.= -+.++. .++.+..- +.+.+ +.-.+.++.+..+ .+||..|.- T Consensus 225 ~~~~~~~~~~~~~~~~~~~~~~~~gDfs~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~r~~~r~d~~v~~~~a~~~ 304 (311) T protein:vir:81 225 GGPEAVTASTGVYRTTNPNVKAIAGDFSAFRWGVQVSIPLELIEFGDPDGLGDLKRQNQIAIRAEVVYGIGIMSTDAFAV 304 (311) T ss_pred ccccccccccchhcccCCccEEEEEecccEEEEEeccceEEEeccCCCCcchhhhhcCcEEEEEEEEeccEeecccceEE Confidence 5321 111 1233331 112221 11111111 11110 1122334333222 267777665 Q ss_pred ccccccc Q lcl|NC_016566. 284 KASARTP 290 (364) Q Consensus 284 ~~~~~~~ 290 (364) -..+... T Consensus 305 l~~a~~~ 311 (311) T protein:vir:81 305 VRDADES 311 (311) T ss_pred EEeeccC Confidence 3332211 No 82 >protein:vir:81227 Length: 413 # NCBI annotation: gp6, major capsid protein # Family: family:all:585 # MgeID: mge:1893 # MgeName: BFK20 # Cross-refs: genbank:acc:YP_001456736;genbank:gi:157168379;hssp:P49861;interpro:IPR006444;uniprot:Q9MBJ9;genbank:GeneID:5580350 Probab=95.91 E-value=0.0013 Score=36.47 Aligned_cols=266 Identities=10% Similarity=-0.016 Sum_probs=93.1 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccc--cCCCccccc-hhhhc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAY--APVGTPATA-KVLAR 77 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~--~~~~~~~T~-~kit~ 77 (364) ----...+++...+++.+.+...+.+-.. ..++.|.-...|....... ...+.. +.++. .| ..+.. T Consensus 125 ~~~~~vp~~~~~~ii~~~~~~~~l~~~~~-------~~~~~~~~~~~~~~~~~~~--~~~~a~~v~Eg~~--~~~~~~~~ 193 (413) T protein:vir:81 125 EFQGGYGTTWNRNIIYRRREKLVVADLMD-------NLTMTNTTIKYLMEKANRV--VEGGFKTVAEGGK--KPYMRFAD 193 (413) T ss_pred ccccccchhhHHHHHHHHhhhhhHHhhcc-------eeeccCCceeEEEeccccc--cccccceecCccc--ccccCccc Confidence 00111233344444555444333222211 1122222111121111100 000000 00100 01 01111 Q ss_pred cceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc-cceeecccccCcccc Q lcl|NC_016566. 78 MLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA-AANYTQPARVDGVGG 156 (364) Q Consensus 78 ~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na-~~v~dis~~t~~~~~ 156 (364) ...+-....+-.+-+..+...+. |...+..-|.+.+++...+-.-+.+|. |-=.++. ..++..+........ T Consensus 194 f~~i~~~~~k~~~~~~iS~ell~----ds~~l~~~i~~~la~~~~~~~d~~~l~---G~G~~~~~~Gi~~~~~~~~~~~~ 266 (413) T protein:vir:81 194 FDIVTESLSKIAGLTKITDEMIE----DYDFLVSYINARLLEELAIEEERQLLL---GDGTGNNLTGLLKRDGIQTLAVS 266 (413) T ss_pred ceeeEeeeeeEEEeehhhHHHHH----HHHHHHHHHHHHHHHHHHHHHHHHHhc---cCCCCCccccccccccccccccc Confidence 22222222111122334544332 222355566676776666655544432 2100000 011111111000011 Q ss_pred cccccHHHHHHHHHHh-cccccCeeEEEEchHHHHHHHHhhccccccc--cccc--------ccceeecccCCcEEEEeC Q lcl|NC_016566. 157 RTFPTLADFPLAASKF-GDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAI--------GDLQVMGDGLGRRFIISD 225 (364) Q Consensus 157 ~~~~s~~~l~~A~~~l-GD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~--------~~~~~~~~~lGrrVIVDD 225 (364) .......++.++..++ ....-+-.+|+||...|..|.+ +.+..+ ++.. .........+|+||+++| T Consensus 267 ~~~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~l~~---lkd~~G~~l~~~~~~~~~~~~~~~~~~~l~G~pv~~s~ 343 (413) T protein:vir:81 267 NKDELADSIYKAMTNISLATPFQADALVINPLDYQELRL---AKDANGQYYGGGVFQGQYGSGGIMLDPAPWGLRTVQSQ 343 (413) T ss_pred ccchhHHHHHHHHHHhhhhccCCCcEEEEcHHHHHHHHH---hhccCCceeccccccccccccccccCceecceeeEEcC Confidence 1111233455666554 2222233569999999998854 322222 1111 111111124699999999 Q ss_pred CCCCCCCceEEEEEec--ceeE-EecCCCCcceeeccCC--CceeeeEEe--eEEE-EeeeeeeeecccccccccCCCCc Q lcl|NC_016566. 226 AAADAMGAGKMLGLVP--GAVA-VTTNGLDMLAQEKGGN--ENIERWWQG--EFDF-NVAVKGYRLKASARTPVEGVRSF 297 (364) Q Consensus 226 ~~p~~~~~Yttylfg~--GAi~-~~~~~~~~~~~~~~g~--e~~~~~~~~--~~~f-~lhp~G~sw~~~~~~~~~gg~SP 297 (364) .||... .+||. .++. +.-..+.....+..+. ..-.+.++. |+.+ ..||..|.--.-+ ...+| T Consensus 344 ~~~~~~-----~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~l~~~-----~~~~p 413 (413) T protein:vir:81 344 VVPVGK-----PVVGAFRSAASVLRKGGVRIDSTNTNVDDFENNLITVRAEERVGLMVTFPEAIVQLDVA-----EVVTP 413 (413) T ss_pred CCCccc-----EEEEecccEEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEec-----CCCCC Confidence 999532 33332 1222 2222222222222110 111223333 2222 2677777442211 12357 No 83 >protein:vir:9309 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:165 # MgeName: phi 11 # Cross-refs: genbank:acc:NP_803287;genbank:gi:29028597;genbank:GeneID:1258044 Probab=95.85 E-value=0.0014 Score=36.30 Aligned_cols=269 Identities=8% Similarity=-0.041 Sum_probs=111.1 Q ss_pred CCc---------c-ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccc Q lcl|NC_016566. 1 MSL---------T-VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPA 70 (364) Q Consensus 1 fd~---------~-vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~ 70 (364) |.. . ...+++...+++.+.+..-...-+ ...++.|.-.+.|.+..-..+ .-+ +.+... T Consensus 24 ~~a~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~l~-------~~~~~~~~~~~ip~~~~~~~a-~~v---~Eg~~~- 91 (324) T protein:vir:93 24 FNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQLG-------KYEPMEGTEKKFTFWADKPGA-YWV---GEGQKI- 91 (324) T ss_pred cccccccccCCCcceechhHHHHHHHHHHhhchhhhhc-------ceeeccCCceEEEEEecCcce-eee---cCCccc- Confidence 110 0 223334444444444322211111 112233322233333211111 001 111111 Q ss_pred cchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccc-ccceeeccc Q lcl|NC_016566. 71 TAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESN-AAANYTQPA 149 (364) Q Consensus 71 T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~n-a~~v~dis~ 149 (364) ...+. +...+-++..+-.+-+.++.+.+. ...-++...|.+++++.+.+...+.+|. |.-+++ ......... T Consensus 92 ~~~~~-~f~~i~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~aia~~~d~a~l~---G~g~~~~~~~~~~~~~ 164 (324) T protein:vir:93 92 ETSKA-TWVNATMRAFKLGVILPVTKEFLN---YTYSQFFEEMKPMIAEAFYKKFDEAGIL---NQGNNPFGKSIAQSIE 164 (324) T ss_pred ccccc-ceeEEEEEeEEEEEeehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHHHHhc---CCCCCCcCcccccccc Confidence 11111 122222222222223345655443 2223455678899999998887776653 211111 111111111 Q ss_pred ccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccc--cccccceeecccCCcEEEEeCCC Q lcl|NC_016566. 150 RVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQV--FAIGDLQVMGDGLGRRFIISDAA 227 (364) Q Consensus 150 ~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~--~~~~~~~~~~~~lGrrVIVDD~~ 227 (364) . .........++.++.++..++.++......|+||..+|..|.+ +.+.++- +...... ..+|++|++++.+ T Consensus 165 ~-~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~---l~d~~G~~~~~~~~~~---~l~G~PVv~~~~~ 237 (324) T protein:vir:93 165 K-TNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRK---IVDPETKERIYDRNSD---SLDGLPVVNLKSS 237 (324) T ss_pred c-cceeccccccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHH---hhCCCCCeeecCCCCC---cccceeeEeecCC Confidence 1 1111223456788999999998888888999999999999864 3333332 2222211 2469999998877 Q ss_pred CCCCCceEEEEEec-ceeEEec-CCCCcceeec---------cCC-----CceeeeEEeeEEE---Eeeeeeeeeccccc Q lcl|NC_016566. 228 ADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQEK---------GGN-----ENIERWWQGEFDF---NVAVKGYRLKASAR 288 (364) Q Consensus 228 p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~~~---------~g~-----e~~~~~~~~~~~f---~lhp~G~sw~~~~~ 288 (364) +...+ . .+||. .-+.++. +++.+...+. .+. +.-...++.+..| .+||..|.--..+ T Consensus 238 ~~~~~--~-i~~gdfs~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~n~~~~r~~~r~d~~v~~~~a~~~l~~a- 313 (324) T protein:vir:93 238 NLKRG--E-LITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLVPA- 313 (324) T ss_pred CCCcc--e-EEEEecceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccceEEEecc- Confidence 65433 1 22222 1122222 1111111110 000 1112233333332 2555555442211 Q ss_pred ccccCCCCcChhhh Q lcl|NC_016566. 289 TPVEGVRSFKLSDI 302 (364) Q Consensus 289 ~~~~gg~SPT~aeL 302 (364) ..+..+|-.|. T Consensus 314 ---~~~~~~~~~~~ 324 (324) T protein:vir:93 314 ---DKRTDSVPGEV 324 (324) T ss_pred ---cccCCCCCCCC Confidence 11222333333 No 84 >protein:vir:2504 Length: 305 # NCBI annotation: major capsid subunit gp9 # Family: family:all:507 # MgeID: mge:53 # MgeName: TM4 # Cross-refs: genbank:acc:NP_569745;genbank:gi:18496895;genbank:GeneID:932268 Probab=95.84 E-value=0.0014 Score=36.27 Aligned_cols=271 Identities=11% Similarity=-0.000 Sum_probs=108.9 Q ss_pred CC-------ccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccC---CCccc Q lcl|NC_016566. 1 MS-------LTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAP---VGTPA 70 (364) Q Consensus 1 fd-------~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~---~~~~~ 70 (364) |. -..-.+++...++|.+.+..-...... ..++.+.....|.+..-..+. -...... +.... T Consensus 1 ma~~t~~~gg~liP~~~~~~Ii~~~~~~s~l~~l~~-------~~~~~~~~~~~p~~~~~~~a~-wv~E~~~~~~~~~~~ 72 (305) T protein:vir:25 1 MADISRAEVASLIQEAYSDTLLAAAKQGSTVLSAFQ-------NVNMGTKTTHLPVLATLPEAD-WVGESATDPKGVKPT 72 (305) T ss_pred CCCccCCccceecCHHHHHHHHHHHHhhchhhhhcc-------eeeccCCcEEEEEEeCCcceE-Eeecccccccccccc Confidence 11 011244455555666655433322211 122223323344333211110 0000000 00000 Q ss_pred cchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH---hhhhcccccceeec Q lcl|NC_016566. 71 TAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAG---KAAIESNAAANYTQ 147 (364) Q Consensus 71 T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L---~Gv~~~na~~v~di 147 (364) +..++.+-.-...|++. -+..+.+.+. ..+..+...|.+++++.+.+...+.+|.-- .+.+.......... T Consensus 73 s~~~f~~i~~~~~k~~~---~~~is~ell~---ds~~~~~~~i~~~l~~~~a~~~d~a~~~G~g~~~~~~~~~~~~~~~~ 146 (305) T protein:vir:25 73 SKVTWANRTLVAEEIAV---IIPVHENVID---DATVAVLTEVAELGGQAIGKKLDQAVIFGTDKPASWVSPALIPAAVT 146 (305) T ss_pred cccceeeEEeeeEEEEE---eehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHhhhheeccCCCCCcccccccccccc Confidence 11123333333334432 2345655443 333445667888899998888877776311 11111000000000 Q ss_pred ccccCcccccccccHHH----HHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccccccccceeec--ccCCcEE Q lcl|NC_016566. 148 PARVDGVGGRTFPTLAD----FPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMG--DGLGRRF 221 (364) Q Consensus 148 s~~t~~~~~~~~~s~~~----l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~--~~lGrrV 221 (364) ... ............+ +.++...+-+..-....|+||...+..|.+ +.++++- .+|+ ..+|++| T Consensus 147 ~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~---lkd~~G~------~i~~~~~l~G~Pv 216 (305) T protein:vir:25 147 AGQ-AVEVVGGVANESDIVGATNRAAKAVASAGWAPDTLLSSLALRYEVAN---IRDANGN------PVFRDDSFAGFRT 216 (305) T ss_pred ccc-cccccccchhhhHHHHHHHHHHHhhhhcccccceeEecHHHHHHHHH---hhccCCc------eeecCCcccccce Confidence 000 0001111112222 334444444445556679999999998854 4443332 2333 2469999 Q ss_pred EEeCCCCCCCCceEEEEEec-ceeEEec-CCCCcceeec----cCC------CceeeeEEe--eEEE-Eeeeeeeeeccc Q lcl|NC_016566. 222 IISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQEK----GGN------ENIERWWQG--EFDF-NVAVKGYRLKAS 286 (364) Q Consensus 222 IVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~~~----~g~------e~~~~~~~~--~~~f-~lhp~G~sw~~~ 286 (364) +++|.+|...++...| ||. --+.++. ++..+.+.+. .++ ++-...+++ |+.| +++|..+-.-.. T Consensus 217 ~~~~~~~~~~~~~~~~-~gd~s~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~R~~~r~~~~v~~p~a~v~~~~ 295 (305) T protein:vir:25 217 FFNRNGAWDADAAIEV-IADSSRVKIGVRQDITVKFLDQATLGTGENQINLAERDMVALRLKARFAYVLGVSATAQGANK 295 (305) T ss_pred EEcCccCCCCCccEEE-EEecceEEEEEecCeEEEEeeeeeeecCCceeeeeecCcEEEEEEEeecceeeCcccEEEEcc Confidence 9999999766654433 332 2222232 2222211110 111 111122222 2232 346666655332 Q ss_pred ccccccCCCCcCh Q lcl|NC_016566. 287 ARTPVEGVRSFKL 299 (364) Q Consensus 287 ~~~~~~gg~SPT~ 299 (364) .... -..|+- T Consensus 296 ~~~~---~~~pa~ 305 (305) T protein:vir:25 296 TPVA---VVAPAA 305 (305) T ss_pred cccc---ccCCCC Confidence 2110 012332 No 85 >protein:vir:93881 Length: 387 # NCBI annotation: ORF011 # Family: family:all:658 # MgeID: mge:1485 # MgeName: 3A # Cross-refs: genbank:acc:YP_239938;genbank:gi:66395599;genbank:GeneID:5130947 Probab=95.79 E-value=0.0015 Score=36.14 Aligned_cols=257 Identities=9% Similarity=-0.009 Sum_probs=91.5 Q ss_pred CCccccc---------------------------------------hhhhhhhhhhhHHHHHHHhhhhcceeEeccCccc Q lcl|NC_016566. 1 MSLTVFQ---------------------------------------RKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVL 41 (364) Q Consensus 1 fd~~vfn---------------------------------------~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~ 41 (364) -+.+-|. ++++..+++.+.+.....+-+. |. +.. T Consensus 86 ~~~~~~~~~~r~~~~~~~~~~~~~~~~~~~~al~~~t~s~gG~~IP~~~~~~Ii~~~~~~~~l~~~~~----v~---~~~ 158 (387) T protein:vir:93 86 KMVKAKAEFYRHAILPNEFEKPSMEAQRLLHALPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREKAR----LT---NIK 158 (387) T ss_pred HHHHHHHHHHHHHhhhhhhhhhhhhhHHHHHhhccCcCCCCceeechhHHHHHHHHHHhhchhhhhee----ee---ecC Confidence 1111111 1122222222222111111000 00 011 Q ss_pred CceeeeehhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHH Q lcl|NC_016566. 42 KDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAI 121 (364) Q Consensus 42 Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw 121 (364) |. ..|....-.+...-.. .+.....+.| ++.+-.-...+++. -+.++...+....- ++-.-|.+++++.. T Consensus 159 ~~--~~p~~~~~~~~a~~v~-E~~~~~~~~~-~f~~v~~~~~k~~~---~~~iS~ell~Ds~~---~l~~~i~~~la~~~ 228 (387) T protein:vir:93 159 GL--EIPRVSYTLDDDDFIT-DVETAKELKL-KGDTVKFTTNKFKV---FAAISDTVIHGSDV---DLVNWVENALQSGL 228 (387) T ss_pred Cc--eEEEEeecCCcccccc-Cccccccccc-ccceeeeeheeeee---echhhHHHHhhhHH---HHHHHHHHHHHHHH Confidence 10 0110000000000001 0000011111 22222222222322 22344443321122 23344666676665 Q ss_pred HHHHHHHHHHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccc Q lcl|NC_016566. 122 MLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSA 201 (364) Q Consensus 122 ~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~ 201 (364) .+-..+.++....|.- ....... ............+.++.++...+.....+=..|+||+.++.+|++ ++.+. T Consensus 229 ~~~e~~~~~~~g~g~g--~p~g~l~---~~~~~~v~~~~~~d~i~~~~~~l~~~~~~~a~~~mn~~t~~~~~~--~~~d~ 301 (387) T protein:vir:93 229 AAKERKDALAVSPKSG--LDHMSFY---NGSVKEVEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIIS--VLSNG 301 (387) T ss_pred HHHHHHhHhhcCCCcc--ccceeee---ccccccccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHH--HHhcC Confidence 5543444443332211 0001110 000001111223567888888887777777789999999988765 34443 Q ss_pred ccccccccceeecccCCcEEEEeCCCCCCCCceEEEEEec---ceeEEecCCCCcceeeccCCCceeeeEEe--eEEE-E Q lcl|NC_016566. 202 EQVFAIGDLQVMGDGLGRRFIISDAAADAMGAGKMLGLVP---GAVAVTTNGLDMLAQEKGGNENIERWWQG--EFDF-N 275 (364) Q Consensus 202 ~~~~~~~~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~---GAi~~~~~~~~~~~~~~~g~e~~~~~~~~--~~~f-~ 275 (364) .+..-.+ .....+|+||+++|++|.. +||. +-+.+.........+...| ...+.. |+.. . T Consensus 302 ~~~~~~~---~~~~llG~PV~~~~~~~~~-------~~GDf~~~~~~~~~~~~~~~~~~~~~----~~~~~~~~r~d~~v 367 (387) T protein:vir:93 302 TTNFFDT---PAEKVFGKPVVFTDAAVKP-------IVGDFNYFGINYDGTTYDTDKDVKKG----EYLFVLTAWYDQQR 367 (387) T ss_pred CCccccc---CCccccccceEEecCCCce-------eeeehhhhheehhhheeeecccccCC----ceeEEEEeeeCcee Confidence 3322111 1123479999999998742 2221 1111111101101111112 222222 2211 1 Q ss_pred eeeeeeeecccccccccCCCCcC Q lcl|NC_016566. 276 VAVKGYRLKASARTPVEGVRSFK 298 (364) Q Consensus 276 lhp~G~sw~~~~~~~~~gg~SPT 298 (364) +.|.-|..-+-.- ..+..|+ T Consensus 368 ~~~eA~~~l~~k~---~~~~~~~ 387 (387) T protein:vir:93 368 TLDSAFRIAKAKE---NTGSLPS 387 (387) T ss_pred echhheEEEEeec---CCCCCCC Confidence 4466665522211 1244577 No 86 >protein:vir:9361 Length: 402 # NCBI annotation: SLT orf 37-like protein # Family: family:all:658 # MgeID: mge:166 # MgeName: phi 12 # Cross-refs: genbank:acc:NP_803339;genbank:gi:29028650;genbank:GeneID:1258088 Probab=95.66 E-value=0.0017 Score=35.82 Aligned_cols=257 Identities=9% Similarity=-0.020 Sum_probs=91.1 Q ss_pred CCcccc---------------------------------------chhhhhhhhhhhHHHHHHHhhhhcceeEeccCccc Q lcl|NC_016566. 1 MSLTVF---------------------------------------QRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVL 41 (364) Q Consensus 1 fd~~vf---------------------------------------n~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~ 41 (364) =..+-| .+++...+++.+.+....++-+. ++ +.. T Consensus 101 ~~~~~~~~~~r~~~~~~~~~~~~~~~~~~~~a~~~~t~~~GG~lIP~~~~~~Ii~~~~~~~~l~~~~~---v~----~~~ 173 (402) T protein:vir:93 101 KMVKAKAEFYRHAILPNEFEKPSMEAQRLLHALPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREKAR---LT----NIK 173 (402) T ss_pred HHHHHHHHHHHHHHhhhhHHHHHHhHHHHHhhhccCCCcCCccccchhHHHHHHHhHHhhhhhhhhce---ee----ecC Confidence 000000 11111122222221111100000 00 001 Q ss_pred CceeeeehhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHH Q lcl|NC_016566. 42 KDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAI 121 (364) Q Consensus 42 Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw 121 (364) |. ..|-...-.+...-.. .+......+| ++.+ +....++=.+-+.++...+....- .+..-|.+++++.. T Consensus 174 ~~--~~p~~~~~~~~a~~v~-Eg~~~~~~~~-~f~~---i~~~~~k~~~~i~iS~ell~Ds~~---~l~~~i~~~la~~~ 243 (402) T protein:vir:93 174 GL--EIPRVSYTLDDDDFIT-DVETAKELKA-KGDT---VKFTTNKFKVFAAISDTVIHGSDV---DLVNWVENALQSGL 243 (402) T ss_pred Cc--eeeeeeccCCcccccc-cccccccccc-ccce---eeecceeeeeechhhHHHHhhhHH---HHHHHHHHHHHHHH Confidence 10 0010000000000000 0000000111 1222 222222111223344443321122 23344667777766 Q ss_pred HHHHHHHHHHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccc Q lcl|NC_016566. 122 MLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSA 201 (364) Q Consensus 122 ~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~ 201 (364) .+-..+.++....|. +........ ...........+.++.++...+-.....=..|+||+.++.+|++ ++.+. T Consensus 244 ~~~e~~~~~~~g~g~--g~p~g~~~~---~~~~~~~~~~~~d~l~~~~~~l~~~y~~na~~imn~~t~~~~~~--~~~d~ 316 (402) T protein:vir:93 244 AAKERKDALAVSPKS--GLEHMSFYN---GSVKEVEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIIS--VLSNG 316 (402) T ss_pred HHHHHHhHhhcCCCc--cccceeeec---cccccccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHH--HHhcC Confidence 654344444333321 111111100 00011111123456888888887766666789999999988865 24443 Q ss_pred ccccccccceeecccCCcEEEEeCCCCCCCCceEEEEEec---ceeEEecCCCCcceeeccCCCceeeeEEeeEEE---E Q lcl|NC_016566. 202 EQVFAIGDLQVMGDGLGRRFIISDAAADAMGAGKMLGLVP---GAVAVTTNGLDMLAQEKGGNENIERWWQGEFDF---N 275 (364) Q Consensus 202 ~~~~~~~~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~---GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f---~ 275 (364) .+..-.+ .....||++|+++|++|. .+||. .-+.+.....+..++...| .+.++....| . T Consensus 317 ~~~~~~~---~~~~llG~PV~~t~~~~~-------i~~GDf~~~~~~~~~~~~~~~~~~~~~----~~~~~~~~r~Dg~v 382 (402) T protein:vir:93 317 TTNFFDT---PAEKVFGKPVVFTDAAVK-------PIVGDFNYFGINYDGTTYDTDKDVKKG----EYLFVLTAWYDQQR 382 (402) T ss_pred CCccccc---CCccccccceEEecCCCc-------eeeechhhhhhhhhhhhhhhhhcccCC----ceEEEEEEEeCcEE Confidence 3322111 112347999999999874 23322 1111111111111111122 2223222111 2 Q ss_pred eeeeeeeecccccccccCCCCcC Q lcl|NC_016566. 276 VAVKGYRLKASARTPVEGVRSFK 298 (364) Q Consensus 276 lhp~G~sw~~~~~~~~~gg~SPT 298 (364) +.|..|+.-.-.- ..+..|| T Consensus 383 ~~~~A~~~l~ik~---~~~~~~~ 402 (402) T protein:vir:93 383 TLDSAFRIAKAKE---NTGPLPS 402 (402) T ss_pred echhheEEEEeec---CCCCCCC Confidence 5577776532211 1345677 No 87 >protein:vir:2685 Length: 387 # NCBI annotation: hypothetical protein # Family: family:all:658 # MgeID: mge:57 # MgeName: phiSLT # Cross-refs: genbank:acc:NP_075504;genbank:gi:12719433;genbank:GeneID:920169 Probab=95.59 E-value=0.0018 Score=35.65 Aligned_cols=265 Identities=8% Similarity=0.004 Sum_probs=90.5 Q ss_pred CCccccchhhhh----hhhhhh-HHHHHHHhh------hhcceeEeccCcccCceeeee-hhh---------hhcccccc Q lcl|NC_016566. 1 MSLTVFQRKLVT----AVTQMI-PDNLNVFNA------AANGAVVLGTGEVLKDVVEKM-SVG---------LIANLVTD 59 (364) Q Consensus 1 fd~~vfn~~~~~----~~~e~i-~q~~~~fn~------as~gAivl~~~~~~Gdf~~~~-~f~---------~i~g~~~~ 59 (364) -..+-|.+++-. ...... .......|+ ..||.+| +..+...+++.. -++ .+++.... T Consensus 86 ~~~~~~~~~~r~~~~~~~~~~~~~~~~~~~~a~~~~~~~~gG~lI--P~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~p 163 (387) T protein:vir:26 86 KMVKAKAEFYRHAILPNEFEKPSMEAQRLLHALPTGNDSGGDKLL--PKTLSKEIVSEPFAKNQLREKARLTNIKGLEIP 163 (387) T ss_pred HHHHHHHHHHHHHHhhhhHHHHHHHHHHHHhhhccCCCCCCceee--chhHHHHHHHHHHhhchhhhhceeeecCCceee Confidence 000111111100 000000 000011111 1122222 111111111000 000 01110000 Q ss_pred c-ccc---------cCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 60 R-NAY---------APVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAG 129 (364) Q Consensus 60 ~-d~~---------~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~l 129 (364) + ... +......+| +..+-.-...+++ +-+.++..-+....- .+..-|.+++++...+-..+.+ T Consensus 164 ~~~~~~~~a~~v~Eg~~~~~~~~-~f~~v~l~~~k~~---~~i~iS~ell~ds~~---~l~~~i~~~la~~~~~~e~~~~ 236 (387) T protein:vir:26 164 RVSYTLDDDDFITDVETAKELKA-KGDTVKFTTNKFK---VFAAISDTVIHGSDV---DLVNWVENALQSGLAAKERKDA 236 (387) T ss_pred eeeccCCcccccccccccccccc-ccceeeechheee---eechhhHHHHhhhHH---HHHHHHHHHHHHHHHHHHHHhH Confidence 0 000 000000011 1111111111222 112344433321122 2333466777776655444444 Q ss_pred HHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccccc Q lcl|NC_016566. 130 IGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGD 209 (364) Q Consensus 130 la~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~ 209 (364) +....|. +....... ..+.........+.++.++...+-....+=..|+||+.+|..|++ ++.+..+..-.+ T Consensus 237 ~~~g~g~--g~~~g~~~---~~~~~~~~~~~~~d~i~~~~~~l~~~y~~na~~imn~~t~~~~~~--~~~~~~~~~~~~- 308 (387) T protein:vir:26 237 LAVSPKS--GLEHMSFY---NGSVKEVEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIIS--VLSNGTTNFFDT- 308 (387) T ss_pred hhcCCCc--cccceeee---ccccccccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHH--HHhcCCCccccc- Confidence 4333321 11111110 000011111223567888888887766666789999999988865 344433322111 Q ss_pred ceeecccCCcEEEEeCCCCCCCCceEEEEEec---ceeEEecCCCCcceeeccCCCceeeeEEeeEEE---Eeeeeeeee Q lcl|NC_016566. 210 LQVMGDGLGRRFIISDAAADAMGAGKMLGLVP---GAVAVTTNGLDMLAQEKGGNENIERWWQGEFDF---NVAVKGYRL 283 (364) Q Consensus 210 ~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~---GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f---~lhp~G~sw 283 (364) .....||++|+++|++|. .+||. .-+.+.. +.....+.-....+.++....| .+.|..|.. T Consensus 309 --~~~~llG~PV~~~~~~~~-------~~~GDf~~~~~~~~~----~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~ 375 (387) T protein:vir:26 309 --PAEKVFGKPVVFTDAAVK-------PIVGDFNYFGINYDG----TTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRI 375 (387) T ss_pred --CCccccccceEEecCCCc-------eeeechhhhhhhhhh----hhheecccccCCceEEEEEEEeCcEeechhheEE Confidence 112347999999999874 22322 1111110 1111111111122333332211 245777766 Q ss_pred cccccccccCCCCcC Q lcl|NC_016566. 284 KASARTPVEGVRSFK 298 (364) Q Consensus 284 ~~~~~~~~~gg~SPT 298 (364) -.-+- ..+..|| T Consensus 376 l~~ka---~~~~~~~ 387 (387) T protein:vir:26 376 AKAKE---NTGPLPS 387 (387) T ss_pred EEeec---CCCCCCC Confidence 33221 2344577 No 88 >protein:vir:96978 Length: 387 # NCBI annotation: ORF009 # Family: family:all:658 # MgeID: mge:1643 # MgeName: 42e # Cross-refs: genbank:acc:YP_239859;genbank:gi:66395517;genbank:GeneID:5133011 Probab=95.59 E-value=0.0018 Score=35.65 Aligned_cols=265 Identities=8% Similarity=0.004 Sum_probs=90.5 Q ss_pred CCccccchhhhh----hhhhhh-HHHHHHHhh------hhcceeEeccCcccCceeeee-hhh---------hhcccccc Q lcl|NC_016566. 1 MSLTVFQRKLVT----AVTQMI-PDNLNVFNA------AANGAVVLGTGEVLKDVVEKM-SVG---------LIANLVTD 59 (364) Q Consensus 1 fd~~vfn~~~~~----~~~e~i-~q~~~~fn~------as~gAivl~~~~~~Gdf~~~~-~f~---------~i~g~~~~ 59 (364) -..+-|.+++-. ...... .......|+ ..||.+| +..+...+++.. -++ .+++.... T Consensus 86 ~~~~~~~~~~r~~~~~~~~~~~~~~~~~~~~a~~~~~~~~gG~lI--P~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~p 163 (387) T protein:vir:96 86 KMVKAKAEFYRHAILPNEFEKPSMEAQRLLHALPTGNDSGGDKLL--PKTLSKEIVSEPFAKNQLREKARLTNIKGLEIP 163 (387) T ss_pred HHHHHHHHHHHHHHhhhhHHHHHHHHHHHHhhhccCCCCCCceee--chhHHHHHHHHHHhhchhhhhceeeecCCceee Confidence 000111111100 000000 000011111 1122222 111111111000 000 01110000 Q ss_pred c-ccc---------cCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 60 R-NAY---------APVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAG 129 (364) Q Consensus 60 ~-d~~---------~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~l 129 (364) + ... +......+| +..+-.-...+++ +-+.++..-+....- .+..-|.+++++...+-..+.+ T Consensus 164 ~~~~~~~~a~~v~Eg~~~~~~~~-~f~~v~l~~~k~~---~~i~iS~ell~ds~~---~l~~~i~~~la~~~~~~e~~~~ 236 (387) T protein:vir:96 164 RVSYTLDDDDFITDVETAKELKA-KGDTVKFTTNKFK---VFAAISDTVIHGSDV---DLVNWVENALQSGLAAKERKDA 236 (387) T ss_pred eeeccCCcccccccccccccccc-ccceeeechheee---eechhhHHHHhhhHH---HHHHHHHHHHHHHHHHHHHHhH Confidence 0 000 000000011 1111111111222 112344433321122 2333466777776655444444 Q ss_pred HHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccccc Q lcl|NC_016566. 130 IGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGD 209 (364) Q Consensus 130 la~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~ 209 (364) +....|. +....... ..+.........+.++.++...+-....+=..|+||+.+|..|++ ++.+..+..-.+ T Consensus 237 ~~~g~g~--g~~~g~~~---~~~~~~~~~~~~~d~i~~~~~~l~~~y~~na~~imn~~t~~~~~~--~~~~~~~~~~~~- 308 (387) T protein:vir:96 237 LAVSPKS--GLEHMSFY---NGSVKEVEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIIS--VLSNGTTNFFDT- 308 (387) T ss_pred hhcCCCc--cccceeee---ccccccccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHH--HHhcCCCccccc- Confidence 4333321 11111110 000011111223567888888887766666789999999988865 344433322111 Q ss_pred ceeecccCCcEEEEeCCCCCCCCceEEEEEec---ceeEEecCCCCcceeeccCCCceeeeEEeeEEE---Eeeeeeeee Q lcl|NC_016566. 210 LQVMGDGLGRRFIISDAAADAMGAGKMLGLVP---GAVAVTTNGLDMLAQEKGGNENIERWWQGEFDF---NVAVKGYRL 283 (364) Q Consensus 210 ~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~---GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f---~lhp~G~sw 283 (364) .....||++|+++|++|. .+||. .-+.+.. +.....+.-....+.++....| .+.|..|.. T Consensus 309 --~~~~llG~PV~~~~~~~~-------~~~GDf~~~~~~~~~----~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~ 375 (387) T protein:vir:96 309 --PAEKVFGKPVVFTDAAVK-------PIVGDFNYFGINYDG----TTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRI 375 (387) T ss_pred --CCccccccceEEecCCCc-------eeeechhhhhhhhhh----hhheecccccCCceEEEEEEEeCcEeechhheEE Confidence 112347999999999874 22322 1111110 1111111111122333332211 245777766 Q ss_pred cccccccccCCCCcC Q lcl|NC_016566. 284 KASARTPVEGVRSFK 298 (364) Q Consensus 284 ~~~~~~~~~gg~SPT 298 (364) -.-+- ..+..|| T Consensus 376 l~~ka---~~~~~~~ 387 (387) T protein:vir:96 376 AKAKE---NTGPLPS 387 (387) T ss_pred EEeec---CCCCCCC Confidence 33221 2344577 No 89 >protein:vir:94424 Length: 387 # NCBI annotation: ORF010 # Family: family:all:658 # MgeID: mge:1506 # MgeName: 47 # Cross-refs: genbank:acc:YP_240005;genbank:gi:66395666;genbank:GeneID:5133084 Probab=95.59 E-value=0.0018 Score=35.65 Aligned_cols=265 Identities=8% Similarity=0.004 Sum_probs=90.5 Q ss_pred CCccccchhhhh----hhhhhh-HHHHHHHhh------hhcceeEeccCcccCceeeee-hhh---------hhcccccc Q lcl|NC_016566. 1 MSLTVFQRKLVT----AVTQMI-PDNLNVFNA------AANGAVVLGTGEVLKDVVEKM-SVG---------LIANLVTD 59 (364) Q Consensus 1 fd~~vfn~~~~~----~~~e~i-~q~~~~fn~------as~gAivl~~~~~~Gdf~~~~-~f~---------~i~g~~~~ 59 (364) -..+-|.+++-. ...... .......|+ ..||.+| +..+...+++.. -++ .+++.... T Consensus 86 ~~~~~~~~~~r~~~~~~~~~~~~~~~~~~~~a~~~~~~~~gG~lI--P~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~p 163 (387) T protein:vir:94 86 KMVKAKAEFYRHAILPNEFEKPSMEAQRLLHALPTGNDSGGDKLL--PKTLSKEIVSEPFAKNQLREKARLTNIKGLEIP 163 (387) T ss_pred HHHHHHHHHHHHHHhhhhHHHHHHHHHHHHhhhccCCCCCCceee--chhHHHHHHHHHHhhchhhhhceeeecCCceee Confidence 000111111100 000000 000011111 1122222 111111111000 000 01110000 Q ss_pred c-ccc---------cCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 60 R-NAY---------APVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAG 129 (364) Q Consensus 60 ~-d~~---------~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~l 129 (364) + ... +......+| +..+-.-...+++ +-+.++..-+....- .+..-|.+++++...+-..+.+ T Consensus 164 ~~~~~~~~a~~v~Eg~~~~~~~~-~f~~v~l~~~k~~---~~i~iS~ell~ds~~---~l~~~i~~~la~~~~~~e~~~~ 236 (387) T protein:vir:94 164 RVSYTLDDDDFITDVETAKELKA-KGDTVKFTTNKFK---VFAAISDTVIHGSDV---DLVNWVENALQSGLAAKERKDA 236 (387) T ss_pred eeeccCCcccccccccccccccc-ccceeeechheee---eechhhHHHHhhhHH---HHHHHHHHHHHHHHHHHHHHhH Confidence 0 000 000000011 1111111111222 112344433321122 2333466777776655444444 Q ss_pred HHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccccccccc Q lcl|NC_016566. 130 IGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFAIGD 209 (364) Q Consensus 130 la~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~ 209 (364) +....|. +....... ..+.........+.++.++...+-....+=..|+||+.+|..|++ ++.+..+..-.+ T Consensus 237 ~~~g~g~--g~~~g~~~---~~~~~~~~~~~~~d~i~~~~~~l~~~y~~na~~imn~~t~~~~~~--~~~~~~~~~~~~- 308 (387) T protein:vir:94 237 LAVSPKS--GLEHMSFY---NGSVKEVEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIIS--VLSNGTTNFFDT- 308 (387) T ss_pred hhcCCCc--cccceeee---ccccccccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHH--HHhcCCCccccc- Confidence 4333321 11111110 000011111223567888888887766666789999999988865 344433322111 Q ss_pred ceeecccCCcEEEEeCCCCCCCCceEEEEEec---ceeEEecCCCCcceeeccCCCceeeeEEeeEEE---Eeeeeeeee Q lcl|NC_016566. 210 LQVMGDGLGRRFIISDAAADAMGAGKMLGLVP---GAVAVTTNGLDMLAQEKGGNENIERWWQGEFDF---NVAVKGYRL 283 (364) Q Consensus 210 ~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~---GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f---~lhp~G~sw 283 (364) .....||++|+++|++|. .+||. .-+.+.. +.....+.-....+.++....| .+.|..|.. T Consensus 309 --~~~~llG~PV~~~~~~~~-------~~~GDf~~~~~~~~~----~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~ 375 (387) T protein:vir:94 309 --PAEKVFGKPVVFTDAAVK-------PIVGDFNYFGINYDG----TTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRI 375 (387) T ss_pred --CCccccccceEEecCCCc-------eeeechhhhhhhhhh----hhheecccccCCceEEEEEEEeCcEeechhheEE Confidence 112347999999999874 22322 1111110 1111111111122333332211 245777766 Q ss_pred cccccccccCCCCcC Q lcl|NC_016566. 284 KASARTPVEGVRSFK 298 (364) Q Consensus 284 ~~~~~~~~~gg~SPT 298 (364) -.-+- ..+..|| T Consensus 376 l~~ka---~~~~~~~ 387 (387) T protein:vir:94 376 AKAKE---NTGPLPS 387 (387) T ss_pred EEeec---CCCCCCC Confidence 33221 2344577 No 90 >protein:vir:100884 Length: 389 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1473 # MgeName: Lc-Nu # Cross-refs: genbank:acc:YP_358764;genbank:gi:78000028;genbank:GeneID:3726155 Probab=95.56 E-value=0.0018 Score=35.59 Aligned_cols=261 Identities=7% Similarity=0.012 Sum_probs=92.3 Q ss_pred CCc-------ccc--------------------------chhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeee Q lcl|NC_016566. 1 MSL-------TVF--------------------------QRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEK 47 (364) Q Consensus 1 fd~-------~vf--------------------------n~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~ 47 (364) ... +.| .+++...+++.+.+....++-+. -.++.+.-... T Consensus 83 ~~~~~~~~~~~~~~~~lr~~~~~~~~~~~~t~~~gg~~vP~~~~~~i~~~~~~~~~l~~~~~-------~~~~~~~~~~~ 155 (389) T protein:vir:10 83 LSKKPIDAKKKAINDFIHSHGKVIDATSKVTSTEAGVLIPEEIIYDPTAEVNSVVDLSTLVT-------KTPVTTPKGTY 155 (389) T ss_pred cchhHHHHHHHHHHHHhhcchhhhhhhcccccCCcceeehHHHHHHHHHHHHhhhhHHhhcc-------eeeccCCeeEE Confidence 000 000 11222233333333222221111 01111111111 Q ss_pred ehhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 48 MSVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLK 127 (364) Q Consensus 48 ~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk 127 (364) |....-++.....-..+......+| ++..-.-...+++ +-+.++...+. ....++...|.+.+++...+..-. T Consensus 156 ~~~~~~~~~~~~~~E~~~~~~~~~~-~~~~i~~~~~k~~---~~~~iS~ell~---ds~~~l~~~i~~~la~~~~~~~~~ 228 (389) T protein:vir:10 156 PILKRATDRFSSVAELAENPKLAEP-EFNKVDWSVATYR---GAIPLSEEAIA---DSAVDLTALVGQSIKEKSVNTYNA 228 (389) T ss_pred EEEecCCCccccccccccccccccc-cceeeeeeheeeE---eeehhhHHHHh---hhhHHHHHHHHHHHHHHHHHHHHH Confidence 1111111110000000111000111 1222222222333 22335544433 222234444666666665554444 Q ss_pred HHHHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHH-hcccccCeeEEEEchHHHHHHHHhhccccccc--c Q lcl|NC_016566. 128 AGIGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASK-FGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--V 204 (364) Q Consensus 128 ~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~-lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~ 204 (364) .++..+.+.- + .......++.++.++.+. +-.... ..|+||..++..|.+ +.+..+ + T Consensus 229 ~i~~g~~~~~---~------------~~~~~~~~~d~l~~~~~~~~~~~~~--a~~~~n~~~~~~L~~---lkd~~G~~i 288 (389) T protein:vir:10 229 MIAPVLQSFT---A------------KKTTTDTLVDSLKHILNVDLDPAYS--RALVVTQSLFNTLDT---LKDKNGRYL 288 (389) T ss_pred HHhhhhcccc---c------------ccccccccHHHHHHHHHhhhhhhhC--cEEEecHHHHHHHHH---hhccCCCee Confidence 4433322110 0 011122344456665543 322222 589999999998864 333332 2 Q ss_pred cccccce----e-ecccCCcEEEEeCCC-CCCCCceEEEEEec--ceeEEec-CCCCcceeecc-CCCceeeeEEeeEEE Q lcl|NC_016566. 205 FAIGDLQ----V-MGDGLGRRFIISDAA-ADAMGAGKMLGLVP--GAVAVTT-NGLDMLAQEKG-GNENIERWWQGEFDF 274 (364) Q Consensus 205 ~~~~~~~----~-~~~~lGrrVIVDD~~-p~~~~~Yttylfg~--GAi~~~~-~~~~~~~~~~~-g~e~~~~~~~~~~~f 274 (364) +...-.. . ....+|++|+|.|++ |...+--..++||. -++.+.. ++....+.... -...+...++.+. - T Consensus 289 ~~~~~~~~~~~~~~~~l~G~pV~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~r~d~-~ 367 (389) T protein:vir:10 289 LHDASDSITDGTAKGTILGVPVYVVGDTLLGSLAGDQKAFVGDLKRGVLFTDRQQVTLAWEDSKIYGKYLGAAFRFGV-Q 367 (389) T ss_pred eecCcccccccccccccccceeEEecccccCCCCCceEEEEeeccccEEEEeecceEEEeeccccccceEEEEEEecc-E Confidence 2211100 0 112369999876554 43322223566664 2333332 33322222111 1122222222222 2 Q ss_pred EeeeeeeeecccccccccCCCCcCh Q lcl|NC_016566. 275 NVAVKGYRLKASARTPVEGVRSFKL 299 (364) Q Consensus 275 ~lhp~G~sw~~~~~~~~~gg~SPT~ 299 (364) .+||..|.+-.-+ ... +.+|+- T Consensus 368 ~~~~~a~~~~~~~--~~~-~~~~~~ 389 (389) T protein:vir:10 368 KADSKAGYFVTNT--DVP-GSALGK 389 (389) T ss_pred EecccceEEEEee--ccC-CCCCCC Confidence 3788887763321 111 123433 No 91 >protein:vir:9704 Length: 394 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:174 # MgeName: 315.2 # Cross-refs: genbank:acc:NP_795466;genbank:gi:28876225;genbank:GeneID:1257769 Probab=95.54 E-value=0.0019 Score=35.53 Aligned_cols=252 Identities=8% Similarity=-0.012 Sum_probs=86.7 Q ss_pred CCcc-----------------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhh Q lcl|NC_016566. 1 MSLT-----------------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVG 51 (364) Q Consensus 1 fd~~-----------------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~ 51 (364) -... +-.+++....++.+.+.....+- +=+ .+..+.-...|.+. T Consensus 106 ~~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~liP~~~~~~ii~~~~~~~~l~~~----~~~---~~~~~~~~~~~~~~ 178 (394) T protein:vir:97 106 LRFEGKDEVLMPINETTPVEPQKDGIKKENAKPVSSEEILYTPAREVKTVVDLKPF----TTV---YQAKKASGKYPVLQ 178 (394) T ss_pred hhhhhHHHHHHHHHhhhhhhhhccccccccccccChHHHHHHHHHHhhhhhhhhhh----cee---eeccCcceEEEEEe Confidence 0000 01111222222222221111111 100 01111111223222 Q ss_pred hhcccccccccccCCCccccch-hhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 52 LIANLVTDRNAYAPVGTPATAK-VLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGI 130 (364) Q Consensus 52 ~i~g~~~~~d~~~~~~~~~T~~-kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~ll 130 (364) .-++...-.. .+. ..|. .--....+-...+.-.+-+.++...+....- .+...|.+++++...+-.-..+| T Consensus 179 ~~~~~~~~v~---E~~--~~~~~~~~~~~~v~l~~~k~~~~i~is~ell~ds~~---~~~~~i~~~la~~~~~~~~~~i~ 250 (394) T protein:vir:97 179 RATTKMVTVA---ELE--KNPALAKPDFKDVAWNIDTYRGAIPLSQESIDDADV---DLVGIVSESISQIKVNTTNDAIA 250 (394) T ss_pred cCCCccceec---ccc--cccccccccceeEEeehhheeeehhhHHHHHhhhhH---HHHHHHHHHHHHHHHHHHHHHHh Confidence 1111100110 000 0110 0011122222222222233455443331111 23334555555544443222221 Q ss_pred HHHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc Q lcl|NC_016566. 131 GAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG 208 (364) Q Consensus 131 a~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~ 208 (364) . |. .+ ....+..++.++.++....-|.... ..|+||..+|..|.+ +.+..+ ++... T Consensus 251 ~---g~---~~------------~~~~~~~~~~~~~~~~~~~~~~~~~-a~~v~n~~~~~~l~~---lkd~~G~~i~~~~ 308 (394) T protein:vir:97 251 K---VL---KS------------FTTKTVKNLDEIKALLNGGFDPAYN-VSLIVSQSFYQTLDT---LKDGNGRYLLQDD 308 (394) T ss_pred h---cc---cc------------ccccccccHHHHHHHHHhhhhhhhC-CEEEEcHHHHHHHHH---hhccCCCeeeecC Confidence 1 10 00 0112234556677777654443322 679999999998854 433333 22211 Q ss_pred -cceeecccCCcEEEEeCCCCCCCCceEEEEEec--ceeEEec-CCCCcceeeccC-CCceeeeEEeeEEEEeeeeeeee Q lcl|NC_016566. 209 -DLQVMGDGLGRRFIISDAAADAMGAGKMLGLVP--GAVAVTT-NGLDMLAQEKGG-NENIERWWQGEFDFNVAVKGYRL 283 (364) Q Consensus 209 -~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~--GAi~~~~-~~~~~~~~~~~g-~e~~~~~~~~~~~f~lhp~G~sw 283 (364) .-..-...+|++|++.|+++...+ +++||. -++.+.+ .+..+....... ...+...++.+. -..||..|.. T Consensus 309 ~~~~~~~~l~G~pv~~~~~~~~~~~---~~~~gd~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~d~-~v~~~~a~~~ 384 (394) T protein:vir:97 309 ITAVSGKVLLGKPVFVLSDEVLGAN---KAFIGDFKRGVLFADRKDLGLRWADNEIYGQYLQAVLRFGV-SKVDDKAGYY 384 (394) T ss_pred cCCCCCceeccceeEEecccccCCc---cEEEeeccccEEEEEecceEEEEecccccceeEEEEEEEcc-EEecccceEE Confidence 111111236999999877665443 244543 2222222 222111111111 111111111111 2367777776 Q ss_pred cccccccccCCCCcChhhh Q lcl|NC_016566. 284 KASARTPVEGVRSFKLSDI 302 (364) Q Consensus 284 ~~~~~~~~~gg~SPT~aeL 302 (364) -.- +|+-+-| T Consensus 385 ~~~---------~~~~~p~ 394 (394) T protein:vir:97 385 VTF---------TPEPLPL 394 (394) T ss_pred EEe---------cccccCC Confidence 443 2333333 No 92 >protein:vir:1025 Length: 408 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:20 # MgeName: bIL286 # Cross-refs: genbank:acc:NP_076679;genbank:gi:13095788;genbank:GeneID:920362 Probab=95.53 E-value=0.0019 Score=35.51 Aligned_cols=271 Identities=11% Similarity=0.052 Sum_probs=98.5 Q ss_pred CCccc--------------------cchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhccccccc Q lcl|NC_016566. 1 MSLTV--------------------FQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDR 60 (364) Q Consensus 1 fd~~v--------------------fn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~ 60 (364) ..... =.+++....++.+.+.....+- +=++......|.+.... +....+...-. T Consensus 103 ~~~~~~~~~~~~a~~~~t~~~gg~~vP~~~~~~Ii~~~~~~~~l~~~----~~~~~~~~~~~~~~~~~-~~~~~~~a~~v 177 (408) T protein:vir:10 103 NPMAFMNTVSSKTETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQY----VRVESVSTSNGSRVYEK-WTDVTPLTVMD 177 (408) T ss_pred cchhhhhhhhhhhhhcccccCCceeccHhHHHHHHHHHHhhchhhhh----cceeeccCCcceEEEee-ccccccceeee Confidence 00000 0111112222222221111110 00000001112221111 11110100000 Q ss_pred ccccCCCccccc-hhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcc Q lcl|NC_016566. 61 NAYAPVGTPATA-KVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIES 139 (364) Q Consensus 61 d~~~~~~~~~T~-~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~ 139 (364) . .+.. -| ....+...+-....+-.+-+.++...+.. .+..+...|.+.+++...+-..+.+|....+ + T Consensus 178 ~---E~~~--~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~~~~~~~~~~~il~g~g~---~ 246 (408) T protein:vir:10 178 A---EDGK--IPDLDNPQLTIIKYLIKRYAGIITATNTSLKD---TAENILAWLSSWIAKKVVVTRNQAIIEVMKA---A 246 (408) T ss_pred c---Cccc--cccccCcceeeEEeeeeeEEeeehhHHHHHhh---chHHHHHHHHHHHHHHHHHHHHHHHhhcccc---c Confidence 0 0100 01 01112233333222222223455554432 2223344466666666665544433322111 0 Q ss_pred cccceeecccccCcccccccccHHHHHHHHH-HhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc-cceeecc Q lcl|NC_016566. 140 NAAANYTQPARVDGVGGRTFPTLADFPLAAS-KFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG-DLQVMGD 215 (364) Q Consensus 140 na~~v~dis~~t~~~~~~~~~s~~~l~~A~~-~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~-~~~~~~~ 215 (364) .......++.++.++.. .+-.....=..|+||...|..|.+ +.+.++ ++... .-..... T Consensus 247 --------------~~~~~~~~~~~l~~~~~~~~~~~~~~~a~~v~n~~~~~~l~~---lkd~~G~~i~~~~~~~~~~~~ 309 (408) T protein:vir:10 247 --------------PKKPTIAKFDDVITMINTAVDPAIIATSSLLTNQSGLNKLAL---VKTAEGKYLLEPDPTKPNSYL 309 (408) T ss_pred --------------ccccccccHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHH---hhccCCceEeccCcCCCCCce Confidence 01112334556777764 343333333579999999998865 332222 22211 1111112 Q ss_pred cCCcEEEEeCC--CCCCCCceEEEEEec--ceeEEe-cCCCCcceeeccCC--CceeeeEEeeEEE---Eeeeeeeeecc Q lcl|NC_016566. 216 GLGRRFIISDA--AADAMGAGKMLGLVP--GAVAVT-TNGLDMLAQEKGGN--ENIERWWQGEFDF---NVAVKGYRLKA 285 (364) Q Consensus 216 ~lGrrVIVDD~--~p~~~~~Yttylfg~--GAi~~~-~~~~~~~~~~~~g~--e~~~~~~~~~~~f---~lhp~G~sw~~ 285 (364) .+|+||++.+. +|.....-..++||. .++.+. .....+......+. +.-.+.++.+..| .+||.+| T Consensus 310 l~G~PV~~~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~~~~~v~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~---- 385 (408) T protein:vir:10 310 IKGKQVIVVADRWLPNTGSTVYPLYYGDMSQAITLFDRENMSLLPTNIGAGAFETDTTKIRVIDRFDVKATDSEAL---- 385 (408) T ss_pred ecceeeEEecccccCccCCCceEEEEEehhccEEEEEecceEEEEcccccchhhcCceEEEEEEeeccEEeccccE---- Confidence 46999999664 554332223355664 223332 23333322222111 1112222222211 1334333 Q ss_pred cccccccCCCCcChhhhcCCccceeecCcCcCcceEEEEecCcccccccccccccccc Q lcl|NC_016566. 286 SARTPVEGVRSFKLSDITDKANWELDQGQVDNAPATVQDVGSDSDTKGRRRTQTAQAV 343 (364) Q Consensus 286 ~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pgv~~~~~~~~~~~~~~~~~~~~~~ 343 (364) |.+++.+.++..+...++++++| T Consensus 386 -----------------------------------~~~~~~~~~~~~~~~~~~~~~~~ 408 (408) T protein:vir:10 386 -----------------------------------VAGSFSAIADQVGNFKTTTSTAV 408 (408) T ss_pred -----------------------------------EEEEeeccccCCCCCCCCCcccC Confidence 44555555555555555555555 No 93 >protein:vir:95376 Length: 425 # NCBI annotation: phage major capsid protein # Family: family:all:635 # MgeID: mge:1567 # MgeName: GBSV1 # Cross-refs: genbank:acc:YP_764476;genbank:gi:115334630;genbank:GeneID:5179263 Probab=95.49 E-value=0.002 Score=35.41 Aligned_cols=264 Identities=10% Similarity=0.033 Sum_probs=97.3 Q ss_pred CCc-------------------------------------cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCc Q lcl|NC_016566. 1 MSL-------------------------------------TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKD 43 (364) Q Consensus 1 fd~-------------------------------------~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gd 43 (364) +++ .+-.+++...+++.+.+....++... .-+..|+ T Consensus 108 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~vP~~~~~~Ii~~l~~~~~i~~~~~-------~~~~~g~ 180 (425) T protein:vir:95 108 VEMNRLQVREMLKTGEYYKRSEVVEFYEKFRNLRAVAGGELTIPEVVVNRIMDIMGDYTTLYPLVD-------KIRVKGT 180 (425) T ss_pred HHHHHHHHHHHHhhhhhhhhhHHHHHHHHHHhhcccccCceeccHHHHHHHHHHHHhhhhHHHhhc-------eeecCce Confidence 110 02233344444455444333333211 1123344 Q ss_pred eeeeehhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 44 VVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIML 123 (364) Q Consensus 44 f~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~ 123 (364) + ..|-....+.+ ...+.+.. .......+...+....++=.+-+..+...+. ..+..+-.-|.+++++...+ T Consensus 181 ~-~ip~~~~~~~a----~~v~E~~~-~~~~~~~~f~~i~l~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~l~~~i~~ 251 (425) T protein:vir:95 181 T-RILVDTDTSPA----TWIEQSGA-LPTGDVGTIASIDFDGFKVGKVTFVDNYLLQ---DSIINLDDYVTKKIARAIAK 251 (425) T ss_pred e-EEEEecCCccc----cccccccc-cccccccccceeeeeheeeeeeehhhHHHHh---ccHHHHHHHHHHHHHHHHHH Confidence 3 22211111000 00000000 0001111222222222222222334554443 22333344455666665555 Q ss_pred HHHHHHHHH-------HhhhhcccccceeecccccCcccccccccHHHHHHHHHHhccccc--CeeEEEEchHHHHH-HH Q lcl|NC_016566. 124 HYLKAGIGA-------GKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAA--LIKSWFMDGVTWAN-FI 193 (364) Q Consensus 124 ~~qk~lla~-------L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~--~l~~ivMHS~v~~~-L~ 193 (364) -.-+.+|.- -.|++.+ ++....-.......++.++.++..++.-... .=..|+||...|.+ |. T Consensus 252 ~~d~~il~G~G~~~~~p~Gil~~-------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~~l~ 324 (425) T protein:vir:95 252 ALDLAIVKGTGAANKQPLGIIPS-------LPPENQVTVEADNNLLKNLVKQIGLIDTGDDSVGEIVAVMKRSTYYNRLV 324 (425) T ss_pred HHHHHhhccCCCCccccceeecc-------cccccccccccccchHHHHHHHHHhhhhhccccCceEEEEeChHHHHHHH Confidence 444433321 1122221 1111111112234456677777766543332 33468999887654 44 Q ss_pred Hhhcccccccc--cccccceeecccCCcEEEEeCCCCCCCCceEEEEEecce-eEEec-CCCCcceeeccCCCceeeeEE Q lcl|NC_016566. 194 AYQALPSAEQV--FAIGDLQVMGDGLGRRFIISDAAADAMGAGKMLGLVPGA-VAVTT-NGLDMLAQEKGGNENIERWWQ 269 (364) Q Consensus 194 k~~~it~~~~~--~~~~~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~GA-i~~~~-~~~~~~~~~~~g~e~~~~~~~ 269 (364) +...+.+..+- ....+.... ..+|+||+++|.||... .+||.-. ..++. ++..........-....+.++ T Consensus 325 ~l~~~kd~~g~~i~~~~~~~~~-~l~G~pvv~~~~~~~~~-----i~~Gd~~~~~~~~~~~~~i~~~~~~~f~~~~~~~~ 398 (425) T protein:vir:95 325 EFSIQVDSNGNVVGKLPNLRTP-DLLGLRVVFNNFLDDDT-----VLFGEFEQYTLVERENITIDSSTHVKFTEDQTAFR 398 (425) T ss_pred HHHhhcCCCCceeeccCCCCCc-cccceeeEEcCcCCCcc-----EEEEecccEEEEeecceEEEeecccccccCceEEE Confidence 43444444432 111111111 34699999999999642 3443311 11111 111111111111111122233 Q ss_pred e--eEEE-EeeeeeeeecccccccccCCC Q lcl|NC_016566. 270 G--EFDF-NVAVKGYRLKASARTPVEGVR 295 (364) Q Consensus 270 ~--~~~f-~lhp~G~sw~~~~~~~~~gg~ 295 (364) + |..+ .+||..|..-.- +.+..|. T Consensus 399 ~~~r~d~~~~~~~a~~~~~i--~~~~~g~ 425 (425) T protein:vir:95 399 GKGRFDGKPVKPEAFVLVTI--TDPVQGA 425 (425) T ss_pred EEEeeCcEeecccceEEEEe--cCcCCCC Confidence 2 2222 267888776322 2211122 No 94 >protein:vir:97148 Length: 324 # NCBI annotation: ORF010 # Family: family:all:507 # MgeID: mge:1654 # MgeName: 85 # Cross-refs: genbank:acc:YP_239726;genbank:gi:66394880;genbank:GeneID:5130881 Probab=95.39 E-value=0.0022 Score=35.20 Aligned_cols=269 Identities=9% Similarity=-0.027 Sum_probs=106.2 Q ss_pred CCccccchh---------------------------------hhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeee Q lcl|NC_016566. 1 MSLTVFQRK---------------------------------LVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEK 47 (364) Q Consensus 1 fd~~vfn~~---------------------------------~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~ 47 (364) |--.-++.+ +...++|.+.+..-.+..+ ...+..+.-... T Consensus 1 ~~~~~~~~~~~~~f~~~~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~~-------~~~~~~~~~~~i 73 (324) T protein:vir:97 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLG-------KYEPMEGTEKKF 73 (324) T ss_pred CccchhHHHHHHHHHHhhhhhhhhccccccccCCCcceechhHHHHHHHHHHhhcchhhhc-------ceeeccCCceEE Confidence 211122221 2222222222211111100 111122221233 Q ss_pred ehhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 48 MSVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLK 127 (364) Q Consensus 48 ~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk 127 (364) |-+..-..+ .-.. .+. ....+..++..-.-...|++ +-+..+.+.+. ...-++...|.+++++...+..-+ T Consensus 74 p~~~~~~~a-~~v~-Eg~-~~~~~~~~f~~v~~~~~k~~---~~~~is~ell~---ds~~~l~~~i~~~l~~aia~~~d~ 144 (324) T protein:vir:97 74 TFWADKPGA-YWVG-EGQ-KIETSKATWVNATMRAFKLG---VILPVTKEFLN---YTYSQFFEEMKPMIAEAFYKKFDE 144 (324) T ss_pred EEEecCcce-eEec-cCc-cccccccceeEEEEeeEEEE---EeehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHH Confidence 322211110 0000 000 01111111222222222333 22345554443 223344566888888888887766 Q ss_pred HHHHHHhhhhccc-ccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--c Q lcl|NC_016566. 128 AGIGAGKAAIESN-AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--V 204 (364) Q Consensus 128 ~lla~L~Gv~~~n-a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~ 204 (364) .+|..- | .++ ........ ...........++.++.++..++.+....-..|+||..++..|.+ +.+..+ + T Consensus 145 a~l~G~-g--~~~~~~gi~~~~-~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~---lkd~~g~~~ 217 (324) T protein:vir:97 145 AGILNQ-G--NNPFGKSIAQSI-EKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRK---IVDPETKER 217 (324) T ss_pred HhhccC-C--CCccCccccccc-cccceeccccCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHH---hhcCCCcee Confidence 555321 1 111 01111100 011112234567788999999998888888899999999998854 333332 2 Q ss_pred cccccceeecccCCcEEEEeCCCCCCCCceEEEEEec-ceeEEec-CCCCcceeec---------cCC-----CceeeeE Q lcl|NC_016566. 205 FAIGDLQVMGDGLGRRFIISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQEK---------GGN-----ENIERWW 268 (364) Q Consensus 205 ~~~~~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~~~---------~g~-----e~~~~~~ 268 (364) +....- +..+|++|++++..+...+ ..+||. .-+.++. +++.+.+.+. .+. +.-...+ T Consensus 218 ~~~~~~---~tl~G~PV~~~~~~~~~~~---~~~~gd~~~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~ 291 (324) T protein:vir:97 218 IYDRNS---DTLDGLPVVNLKSSNLKRG---ELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVAL 291 (324) T ss_pred ecCCCC---ccccceeeEeecCCCCCcc---eEEEEecccEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEE Confidence 222211 2246999999988776443 233332 1112222 1222111110 000 1112233 Q ss_pred EeeEEEE---eeeeeeeecccccccccCCCCcChhhh Q lcl|NC_016566. 269 QGEFDFN---VAVKGYRLKASARTPVEGVRSFKLSDI 302 (364) Q Consensus 269 ~~~~~f~---lhp~G~sw~~~~~~~~~gg~SPT~aeL 302 (364) +.+..|. .||..|.--..+ .++..-|-+|. T Consensus 292 r~~~r~d~~v~~~~a~~~l~~~----~~~~~~~~~~~ 324 (324) T protein:vir:97 292 RATMHVALHIADDKAFAKLVPA----DKKTDSVPGEV 324 (324) T ss_pred EEEEEeccEEecccceEEEEec----cCCCCCCCCCC Confidence 3322222 556665542221 11112233343 No 95 >protein:vir:93616 Length: 645 # NCBI annotation: putative major head protein/prohead protease # Family: family:all:21 # MgeID: mge:157 # MgeName: phi 4795 # Cross-refs: genbank:acc:YP_001449293;genbank:gi:157166041;goa:Q6H9U8;interpro:IPR006433;uniprot:Q6H9U8;genbank:GeneID:5580438 Probab=95.03 E-value=0.0029 Score=34.48 Aligned_cols=267 Identities=13% Similarity=0.078 Sum_probs=92.0 Q ss_pred CCcccc--------------------------------------chhhhhhhhhhhHHHHHHHhhhhcceeEec-cCccc Q lcl|NC_016566. 1 MSLTVF--------------------------------------QRKLVTAVTQMIPDNLNVFNAAANGAVVLG-TGEVL 41 (364) Q Consensus 1 fd~~vf--------------------------------------n~~~~~~~~e~i~q~~~~fn~as~gAivl~-~~~~~ 41 (364) .+.... .++...-++|.+.+..-. ... |+.++. ....+ T Consensus 307 g~~~~a~e~a~~~~~~~~~~~~~~~~a~~~~~~~~~~~~Gg~~vp~~~~~~ii~~l~~~svv-~~l--~~~~~~~~~~~~ 383 (645) T protein:vir:93 307 GVRSEALEVARRQYPDDSRLHHVLKSAVGAGTTTDPQWAGSLSEYQEYAQDFIDYLRPQTII-GRF--GQGGIPALRQVP 383 (645) T ss_pred cchhHHHHHHHhhcccchhhhhhhhhhhhccccccccccCCccCchhhHHHHHHhhhhhhhH-Hhh--cccccccccccc Confidence 111111 111122234444332111 111 011100 01112 Q ss_pred Cceeeeehhhhhcccc-cccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHH Q lcl|NC_016566. 42 KDVVEKMSVGLIANLV-TDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQA 120 (364) Q Consensus 42 Gdf~~~~~f~~i~g~~-~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~y 120 (364) |+. ..|-+ .+|.. .-+. .+.. ...+..++.+-.-...|++. -+.++..-+....-+. ...|.+++++. T Consensus 384 ~~~-~ip~~--t~~~~a~wv~-Eg~~-~~~s~~~f~~v~l~~~kla~---~~~iS~ell~ds~~~~---~~~i~~~l~~a 452 (645) T protein:vir:93 384 FNI-RVHAQ--VSGGAAGWVG-EGKT-KPLTKFDFESITFSHAKVSA---IAVLTEELIRFSSPAA---DALVRNALAEA 452 (645) T ss_pred Cce-eeeee--ecCcceEEec-cCcc-ccccccceeEEEEeeEEEEE---eehhHHHHHhhchHHH---HHHHHHHHHHH Confidence 221 11211 01110 0000 0000 11111123332223333432 2335554443222232 33466666666 Q ss_pred HHHHHHHHHHHHHhhhhcc-cccce-eecccccCcccccccccHHHHHHHHHHhcccccCe--eEEEEchHHHHHHHHhh Q lcl|NC_016566. 121 IMLHYLKAGIGAGKAAIES-NAAAN-YTQPARVDGVGGRTFPTLADFPLAASKFGDQAALI--KSWFMDGVTWANFIAYQ 196 (364) Q Consensus 121 w~~~~qk~lla~L~Gv~~~-na~~v-~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l--~~ivMHS~v~~~L~k~~ 196 (364) ..+..-+.+|....+...+ ....+ ++... ....+....++..+..++-++...+ ..|+||+.++..|.+ T Consensus 453 ia~~~d~a~l~g~g~~~~~~~p~gi~~~~~~-----~~~~~~~~~d~~~~~~~~~~a~~~~~~a~~vmn~~~~~~L~~-- 525 (645) T protein:vir:93 453 VVARLDTDFVDPKKAAVADVSPASITHDVKG-----TASSGNPDADAEAAFGQFVAANLQPTGAVWLMSSTNALALSM-- 525 (645) T ss_pred HHHHHHHHhhcCCCcccCCccccceeccccc-----cccccchHHHHHHHHHHHHhcCCCccccEEEEcHHHHHHHHh-- Confidence 6665555444322111111 11111 11111 0111223345555655554433333 469999999998855 Q ss_pred cccccccccccccceeec-ccCCcEEEEeCCCCCC--CCceEEEEEec-ceeEEecC---CCCcceeec----cCCC--- Q lcl|NC_016566. 197 ALPSAEQVFAIGDLQVMG-DGLGRRFIISDAAADA--MGAGKMLGLVP-GAVAVTTN---GLDMLAQEK----GGNE--- 262 (364) Q Consensus 197 ~it~~~~~~~~~~~~~~~-~~lGrrVIVDD~~p~~--~~~Yttylfg~-GAi~~~~~---~~~~~~~~~----~g~e--- 262 (364) +.+..+.+-..++..-+ ..+|+||++++.||.. -+-...++||. |.+.+.-+ ...+-.+.. .+.+ T Consensus 526 -lkd~~G~~~~~~~~~~~~tL~G~PV~~s~~vp~~~~~gd~s~~~ig~~~~v~i~~s~~a~~~~~~~~~~~~~~~~~~~~ 604 (645) T protein:vir:93 526 -RKNALGQKEYPDMTLLGGSFQGLPVIVSQYVGDQLVLVNAPDIYLADDGGVAVDMSREASLEMQSEPTGDSTTPSPVEL 604 (645) T ss_pred -ccccCCceeecCCCCCCceeeceeeEEeccCCcceeEeccccEEEEEecceEEEeecceeEEEeecccccccccccccc Confidence 43333322111111111 2369999999999852 11222233332 33333321 111000000 0000 Q ss_pred -----ceeeeEEe--eEEEE-eeee------eeeecccccccccCC Q lcl|NC_016566. 263 -----NIERWWQG--EFDFN-VAVK------GYRLKASARTPVEGV 294 (364) Q Consensus 263 -----~~~~~~~~--~~~f~-lhp~------G~sw~~~~~~~~~gg 294 (364) .-.+.+++ +..|. .||. |++|=.+. || T Consensus 605 v~lf~~d~vaira~~r~d~~~~~p~a~~~lt~~~~g~~~-----~~ 645 (645) T protein:vir:93 605 VSMFQTGSVAIRAERWINWRRRRTAAVAVITGVNYGSAS-----GG 645 (645) T ss_pred hhHhhcCceEEEEEEEEcceeeCccceEEEecccCCccc-----CC Confidence 01122222 23333 3554 55554432 33 No 96 >protein:vir:105004 Length: 392 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:1490 # MgeName: W Beta # Cross-refs: genbank:acc:YP_459969;genbank:gi:85701384;genbank:GeneID:3882145 Probab=94.97 E-value=0.0031 Score=34.36 Aligned_cols=258 Identities=7% Similarity=0.018 Sum_probs=93.1 Q ss_pred CCcc-----------------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeeh-- Q lcl|NC_016566. 1 MSLT-----------------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMS-- 49 (364) Q Consensus 1 fd~~-----------------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~-- 49 (364) +.-+ +-.+++...+++.+.+..-+.+-+ ...++.+.-...++ T Consensus 84 l~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~-------~~~~~~~~~~~~~~~~ 156 (392) T protein:vir:10 84 LRNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYV-------TVEPVRTRSGSRVLEK 156 (392) T ss_pred HhcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhc-------eeeeccCCceeEEEEe Confidence 0000 111222222333333221111110 11112221111111 Q ss_pred hhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 50 VGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAG 129 (364) Q Consensus 50 f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~l 129 (364) +.....+ .-....+.. ...+..++..-.-...|++ +-+.++...+.- +++ .+...|.+++++...+...+.+ T Consensus 157 ~~~~~~a-~~v~E~~~~-~~~~~~~~~~v~l~~~k~~---~~~~iS~ell~d--s~~-~l~~~i~~~l~~~i~~~~d~~~ 228 (392) T protein:vir:10 157 NSDMIPF-AEITEMGEI-PETDNPKFSNVQYAVKDRA---GILPLSRSLLQD--SDQ-NILKYVTKWLGKKSKVTRNVLI 228 (392) T ss_pred ecCCccc-eeecccccc-cccccccceeEEeeeeeEE---EeehhhHHHHhh--hHH-HHHHHHHHHHHHHHHHHHHHHH Confidence 1111111 001101000 0000012222222223332 333466554431 122 2334455666655554443333 Q ss_pred HHHHhhhhcccccceeecccccCcccccccccHHHHHHHHH-HhcccccCeeEEEEchHHHHHHHHhhccccccc--ccc Q lcl|NC_016566. 130 IGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAAS-KFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFA 206 (364) Q Consensus 130 la~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~-~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~ 206 (364) +.. ... .......++.++.++.. .+-.....-..|+||...|..|.+ +.++++ ++. T Consensus 229 ~~g-------~g~-----------~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~---lkd~~G~~l~~ 287 (392) T protein:vir:10 229 LGV-------IEK-----------LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDK---LKDKDGKYILQ 287 (392) T ss_pred hhc-------ccc-----------ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHH---hhccCCCeEee Confidence 221 110 01122345567777764 555555555779999999999965 433332 221 Q ss_pred cc-cceeecccCCcEEEE-eCCCCC-CC---CceEEEEEec--ceeEEe-cCCCCcceeeccCC--CceeeeEEeeE--E Q lcl|NC_016566. 207 IG-DLQVMGDGLGRRFII-SDAAAD-AM---GAGKMLGLVP--GAVAVT-TNGLDMLAQEKGGN--ENIERWWQGEF--D 273 (364) Q Consensus 207 ~~-~~~~~~~~lGrrVIV-DD~~p~-~~---~~Yttylfg~--GAi~~~-~~~~~~~~~~~~g~--e~~~~~~~~~~--~ 273 (364) .. .-......+|+|+|+ +|.++. .. ..-..++||. -++.+. ..++.+...+..+. +.-.+.++.+. . T Consensus 288 ~~~~~~~~~tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d 367 (392) T protein:vir:10 288 SDPTQKNKKLFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDD 367 (392) T ss_pred cCccCCccccccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeec Confidence 11 111111246976555 455432 11 1223456654 233333 23343333322211 11122233322 2 Q ss_pred E-Eeeeeeeee---cccccccccCCCCcCh Q lcl|NC_016566. 274 F-NVAVKGYRL---KASARTPVEGVRSFKL 299 (364) Q Consensus 274 f-~lhp~G~sw---~~~~~~~~~gg~SPT~ 299 (364) + .+||.+|.- +.++++. +|-= T Consensus 368 ~~v~~~~a~~~l~~~~~a~~~-----~~~~ 392 (392) T protein:vir:10 368 VQMWDNEAAVYGEIDLSAPVE-----QPQG 392 (392) T ss_pred cEEecccceEEEEeccccccc-----CCCC Confidence 1 277888765 3333332 2222 No 97 >protein:vir:107593 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1491 # MgeName: Gamma # Cross-refs: genbank:acc:YP_338188;genbank:gi:77020144;genbank:GeneID:3703724 Probab=94.97 E-value=0.0031 Score=34.36 Aligned_cols=258 Identities=7% Similarity=0.018 Sum_probs=93.1 Q ss_pred CCcc-----------------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeeh-- Q lcl|NC_016566. 1 MSLT-----------------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMS-- 49 (364) Q Consensus 1 fd~~-----------------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~-- 49 (364) +.-+ +-.+++...+++.+.+..-+.+-+ ...++.+.-...++ T Consensus 84 l~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~-------~~~~~~~~~~~~~~~~ 156 (392) T protein:vir:10 84 LRNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYV-------TVEPVRTRSGSRVLEK 156 (392) T ss_pred HhcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhc-------eeeeccCCceeEEEEe Confidence 0000 111222222333333221111110 11112221111111 Q ss_pred hhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 50 VGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAG 129 (364) Q Consensus 50 f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~l 129 (364) +.....+ .-....+.. ...+..++..-.-...|++ +-+.++...+.- +++ .+...|.+++++...+...+.+ T Consensus 157 ~~~~~~a-~~v~E~~~~-~~~~~~~~~~v~l~~~k~~---~~~~iS~ell~d--s~~-~l~~~i~~~l~~~i~~~~d~~~ 228 (392) T protein:vir:10 157 NSDMIPF-AEITEMGEI-PETDNPKFSNVQYAVKDRA---GILPLSRSLLQD--SDQ-NILKYVTKWLGKKSKVTRNVLI 228 (392) T ss_pred ecCCccc-eeecccccc-cccccccceeEEeeeeeEE---EeehhhHHHHhh--hHH-HHHHHHHHHHHHHHHHHHHHHH Confidence 1111111 001101000 0000012222222223332 333466554431 122 2334455666655554443333 Q ss_pred HHHHhhhhcccccceeecccccCcccccccccHHHHHHHHH-HhcccccCeeEEEEchHHHHHHHHhhccccccc--ccc Q lcl|NC_016566. 130 IGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAAS-KFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFA 206 (364) Q Consensus 130 la~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~-~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~ 206 (364) +.. ... .......++.++.++.. .+-.....-..|+||...|..|.+ +.++++ ++. T Consensus 229 ~~g-------~g~-----------~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~---lkd~~G~~l~~ 287 (392) T protein:vir:10 229 LGV-------IEK-----------LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDK---LKDKDGKYILQ 287 (392) T ss_pred hhc-------ccc-----------ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHH---hhccCCCeEee Confidence 221 110 01122345567777764 555555555779999999999965 433332 221 Q ss_pred cc-cceeecccCCcEEEE-eCCCCC-CC---CceEEEEEec--ceeEEe-cCCCCcceeeccCC--CceeeeEEeeE--E Q lcl|NC_016566. 207 IG-DLQVMGDGLGRRFII-SDAAAD-AM---GAGKMLGLVP--GAVAVT-TNGLDMLAQEKGGN--ENIERWWQGEF--D 273 (364) Q Consensus 207 ~~-~~~~~~~~lGrrVIV-DD~~p~-~~---~~Yttylfg~--GAi~~~-~~~~~~~~~~~~g~--e~~~~~~~~~~--~ 273 (364) .. .-......+|+|+|+ +|.++. .. ..-..++||. -++.+. ..++.+...+..+. +.-.+.++.+. . T Consensus 288 ~~~~~~~~~tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d 367 (392) T protein:vir:10 288 SDPTQKNKKLFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDD 367 (392) T ss_pred cCccCCccccccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeec Confidence 11 111111246976555 455432 11 1223456654 233333 23343333322211 11122233322 2 Q ss_pred E-Eeeeeeeee---cccccccccCCCCcCh Q lcl|NC_016566. 274 F-NVAVKGYRL---KASARTPVEGVRSFKL 299 (364) Q Consensus 274 f-~lhp~G~sw---~~~~~~~~~gg~SPT~ 299 (364) + .+||.+|.- +.++++. +|-= T Consensus 368 ~~v~~~~a~~~l~~~~~a~~~-----~~~~ 392 (392) T protein:vir:10 368 VQMWDNEAAVYGEIDLSAPVE-----QPQG 392 (392) T ss_pred cEEecccceEEEEeccccccc-----CCCC Confidence 1 277888765 3333332 2222 No 98 >protein:vir:102082 Length: 392 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1503 # MgeName: Fah # Cross-refs: genbank:acc:YP_512315;genbank:gi:89152484;genbank:GeneID:3953075 Probab=94.97 E-value=0.0031 Score=34.36 Aligned_cols=258 Identities=7% Similarity=0.018 Sum_probs=93.1 Q ss_pred CCcc-----------------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeeh-- Q lcl|NC_016566. 1 MSLT-----------------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMS-- 49 (364) Q Consensus 1 fd~~-----------------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~-- 49 (364) +.-+ +-.+++...+++.+.+..-+.+-+ ...++.+.-...++ T Consensus 84 l~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~-------~~~~~~~~~~~~~~~~ 156 (392) T protein:vir:10 84 LRNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYV-------TVEPVRTRSGSRVLEK 156 (392) T ss_pred HhcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhc-------eeeeccCCceeEEEEe Confidence 0000 111222222333333221111110 11112221111111 Q ss_pred hhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 50 VGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAG 129 (364) Q Consensus 50 f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~l 129 (364) +.....+ .-....+.. ...+..++..-.-...|++ +-+.++...+.- +++ .+...|.+++++...+...+.+ T Consensus 157 ~~~~~~a-~~v~E~~~~-~~~~~~~~~~v~l~~~k~~---~~~~iS~ell~d--s~~-~l~~~i~~~l~~~i~~~~d~~~ 228 (392) T protein:vir:10 157 NSDMIPF-AEITEMGEI-PETDNPKFSNVQYAVKDRA---GILPLSRSLLQD--SDQ-NILKYVTKWLGKKSKVTRNVLI 228 (392) T ss_pred ecCCccc-eeecccccc-cccccccceeEEeeeeeEE---EeehhhHHHHhh--hHH-HHHHHHHHHHHHHHHHHHHHHH Confidence 1111111 001101000 0000012222222223332 333466554431 122 2334455666655554443333 Q ss_pred HHHHhhhhcccccceeecccccCcccccccccHHHHHHHHH-HhcccccCeeEEEEchHHHHHHHHhhccccccc--ccc Q lcl|NC_016566. 130 IGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAAS-KFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFA 206 (364) Q Consensus 130 la~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~-~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~ 206 (364) +.. ... .......++.++.++.. .+-.....-..|+||...|..|.+ +.++++ ++. T Consensus 229 ~~g-------~g~-----------~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~---lkd~~G~~l~~ 287 (392) T protein:vir:10 229 LGV-------IEK-----------LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDK---LKDKDGKYILQ 287 (392) T ss_pred hhc-------ccc-----------ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHH---hhccCCCeEee Confidence 221 110 01122345567777764 555555555779999999999965 433332 221 Q ss_pred cc-cceeecccCCcEEEE-eCCCCC-CC---CceEEEEEec--ceeEEe-cCCCCcceeeccCC--CceeeeEEeeE--E Q lcl|NC_016566. 207 IG-DLQVMGDGLGRRFII-SDAAAD-AM---GAGKMLGLVP--GAVAVT-TNGLDMLAQEKGGN--ENIERWWQGEF--D 273 (364) Q Consensus 207 ~~-~~~~~~~~lGrrVIV-DD~~p~-~~---~~Yttylfg~--GAi~~~-~~~~~~~~~~~~g~--e~~~~~~~~~~--~ 273 (364) .. .-......+|+|+|+ +|.++. .. ..-..++||. -++.+. ..++.+...+..+. +.-.+.++.+. . T Consensus 288 ~~~~~~~~~tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d 367 (392) T protein:vir:10 288 SDPTQKNKKLFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDD 367 (392) T ss_pred cCccCCccccccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeec Confidence 11 111111246976555 455432 11 1223456654 233333 23343333322211 11122233322 2 Q ss_pred E-Eeeeeeeee---cccccccccCCCCcCh Q lcl|NC_016566. 274 F-NVAVKGYRL---KASARTPVEGVRSFKL 299 (364) Q Consensus 274 f-~lhp~G~sw---~~~~~~~~~gg~SPT~ 299 (364) + .+||.+|.- +.++++. +|-= T Consensus 368 ~~v~~~~a~~~l~~~~~a~~~-----~~~~ 392 (392) T protein:vir:10 368 VQMWDNEAAVYGEIDLSAPVE-----QPQG 392 (392) T ss_pred cEEecccceEEEEeccccccc-----CCCC Confidence 1 277888765 3333332 2222 No 99 >protein:vir:102873 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1492 # MgeName: Cherry # Cross-refs: genbank:acc:YP_338137;genbank:gi:77020198;genbank:GeneID:3703782 Probab=94.97 E-value=0.0031 Score=34.36 Aligned_cols=258 Identities=7% Similarity=0.018 Sum_probs=93.1 Q ss_pred CCcc-----------------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeeh-- Q lcl|NC_016566. 1 MSLT-----------------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMS-- 49 (364) Q Consensus 1 fd~~-----------------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~-- 49 (364) +.-+ +-.+++...+++.+.+..-+.+-+ ...++.+.-...++ T Consensus 84 l~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~-------~~~~~~~~~~~~~~~~ 156 (392) T protein:vir:10 84 LRNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYV-------TVEPVRTRSGSRVLEK 156 (392) T ss_pred HhcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhc-------eeeeccCCceeEEEEe Confidence 0000 111222222333333221111110 11112221111111 Q ss_pred hhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 50 VGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAG 129 (364) Q Consensus 50 f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~l 129 (364) +.....+ .-....+.. ...+..++..-.-...|++ +-+.++...+.- +++ .+...|.+++++...+...+.+ T Consensus 157 ~~~~~~a-~~v~E~~~~-~~~~~~~~~~v~l~~~k~~---~~~~iS~ell~d--s~~-~l~~~i~~~l~~~i~~~~d~~~ 228 (392) T protein:vir:10 157 NSDMIPF-AEITEMGEI-PETDNPKFSNVQYAVKDRA---GILPLSRSLLQD--SDQ-NILKYVTKWLGKKSKVTRNVLI 228 (392) T ss_pred ecCCccc-eeecccccc-cccccccceeEEeeeeeEE---EeehhhHHHHhh--hHH-HHHHHHHHHHHHHHHHHHHHHH Confidence 1111111 001101000 0000012222222223332 333466554431 122 2334455666655554443333 Q ss_pred HHHHhhhhcccccceeecccccCcccccccccHHHHHHHHH-HhcccccCeeEEEEchHHHHHHHHhhccccccc--ccc Q lcl|NC_016566. 130 IGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAAS-KFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFA 206 (364) Q Consensus 130 la~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~-~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~ 206 (364) +.. ... .......++.++.++.. .+-.....-..|+||...|..|.+ +.++++ ++. T Consensus 229 ~~g-------~g~-----------~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~---lkd~~G~~l~~ 287 (392) T protein:vir:10 229 LGV-------IEK-----------LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDK---LKDKDGKYILQ 287 (392) T ss_pred hhc-------ccc-----------ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHH---hhccCCCeEee Confidence 221 110 01122345567777764 555555555779999999999965 433332 221 Q ss_pred cc-cceeecccCCcEEEE-eCCCCC-CC---CceEEEEEec--ceeEEe-cCCCCcceeeccCC--CceeeeEEeeE--E Q lcl|NC_016566. 207 IG-DLQVMGDGLGRRFII-SDAAAD-AM---GAGKMLGLVP--GAVAVT-TNGLDMLAQEKGGN--ENIERWWQGEF--D 273 (364) Q Consensus 207 ~~-~~~~~~~~lGrrVIV-DD~~p~-~~---~~Yttylfg~--GAi~~~-~~~~~~~~~~~~g~--e~~~~~~~~~~--~ 273 (364) .. .-......+|+|+|+ +|.++. .. ..-..++||. -++.+. ..++.+...+..+. +.-.+.++.+. . T Consensus 288 ~~~~~~~~~tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d 367 (392) T protein:vir:10 288 SDPTQKNKKLFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDD 367 (392) T ss_pred cCccCCccccccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeec Confidence 11 111111246976555 455432 11 1223456654 233333 23343333322211 11122233322 2 Q ss_pred E-Eeeeeeeee---cccccccccCCCCcCh Q lcl|NC_016566. 274 F-NVAVKGYRL---KASARTPVEGVRSFKL 299 (364) Q Consensus 274 f-~lhp~G~sw---~~~~~~~~~gg~SPT~ 299 (364) + .+||.+|.- +.++++. +|-= T Consensus 368 ~~v~~~~a~~~l~~~~~a~~~-----~~~~ 392 (392) T protein:vir:10 368 VQMWDNEAAVYGEIDLSAPVE-----QPQG 392 (392) T ss_pred cEEecccceEEEEeccccccc-----CCCC Confidence 1 277888765 3333332 2222 No 100 >protein:vir:3845 Length: 395 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:322 # MgeName: phi adh # Cross-refs: genbank:acc:NP_050151;swissprot:trembl:q9t1f6;genbank:gi:9633043;uniprot:Q9T1F6;genbank:GeneID:1262163 Probab=94.91 E-value=0.0032 Score=34.26 Aligned_cols=270 Identities=7% Similarity=0.014 Sum_probs=94.4 Q ss_pred CCccccchh-----hhhhhhhhhHHHHHH--HhhhhcceeEeccCcccCce-----------------------eeeehh Q lcl|NC_016566. 1 MSLTVFQRK-----LVTAVTQMIPDNLNV--FNAAANGAVVLGTGEVLKDV-----------------------VEKMSV 50 (364) Q Consensus 1 fd~~vfn~~-----~~~~~~e~i~q~~~~--fn~as~gAivl~~~~~~Gdf-----------------------~~~~~f 50 (364) ......+.. .....+..+.+.... -...+||.++ +..+..++ ...+++ T Consensus 79 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~v--P~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~ 156 (395) T protein:vir:38 79 KPLPVKDGKPDAQAMKNQFVKDFKNLVTSGTTGTGNAGLTI--PEDIQLQIRTLTRSFTSLESLANVENVTTSHGSRVYE 156 (395) T ss_pred cccchhhhhHHHHHHHHHHHHHHHHHHhhccCccCCCceec--chhHhhHHHHHHHhhcchhhhcceeeccCCcceEEEE Confidence 111111111 111111111100000 0001122211 11111110 011111 Q ss_pred hhhcccccccccccCC--Cccccch-hhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 51 GLIANLVTDRNAYAPV--GTPATAK-VLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLK 127 (364) Q Consensus 51 ~~i~g~~~~~d~~~~~--~~~~T~~-kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk 127 (364) ..-+. ...+.+ .....|. ..-....+-.+.++-.+-+.++...+. .....+...|.+++++...+...+ T Consensus 157 ~~~~~-----~~~a~~v~E~~~~~~~~~~~f~~v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~la~~~~~~~~~ 228 (395) T protein:vir:38 157 KLADI-----TPLKDLDDESALIGDNDDPELTVVKYLIHRYAGITTVTNTLLK---DTVDNIIQWLVNWAAKKDVVTRNA 228 (395) T ss_pred eeccC-----CccccccccccccccccccceeeEEeeeeeeEeehhhHHHHHh---hhHHHHHHHHHHHHHHHHHHHHHH Confidence 10000 000000 0000010 001122333333332233345555443 122234455777777777766555 Q ss_pred HHHHHHhhhhcccccceeecccccCcccccccccHHHHHHHHH-HhcccccCeeEEEEchHHHHHHHHhhccccccc--c Q lcl|NC_016566. 128 AGIGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAAS-KFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--V 204 (364) Q Consensus 128 ~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~-~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~ 204 (364) .+|.... .... .....++.++.++.. .+......-..|+||..+|..|.+ +.+..+ + T Consensus 229 ~il~g~g----~~~~-------------~~~~~~~~~i~~~~~~~l~~~~~~~a~~v~n~~~~~~L~~---lkd~~G~~l 288 (395) T protein:vir:38 229 KILEVMG----KAPK-------------KPTISQFDNIKDLENNTLDPAIESTSSFITNQSGYNILSK---VKDADGRYL 288 (395) T ss_pred HHhhccc----cccc-------------ccccccHHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHH---hhccCCcee Confidence 4432111 0000 011223445666654 344455555779999999998854 333333 2 Q ss_pred cccc-cceeecccCCcEEEEeCCCCCC--CCceEEEEEec--ceeEEec-CCCCcceeeccC--CCceeeeEEee--EE- Q lcl|NC_016566. 205 FAIG-DLQVMGDGLGRRFIISDAAADA--MGAGKMLGLVP--GAVAVTT-NGLDMLAQEKGG--NENIERWWQGE--FD- 273 (364) Q Consensus 205 ~~~~-~~~~~~~~lGrrVIVDD~~p~~--~~~Yttylfg~--GAi~~~~-~~~~~~~~~~~g--~e~~~~~~~~~--~~- 273 (364) +... .-......+|++|++.|.|+.. .+. .+++||. .++.+.. .+......+..+ -+.-.+.++.. +. T Consensus 289 ~~~~~~~~~~~~l~G~pV~~~~~~~~~~~~~~-~~i~~gd~~~~~~i~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~ 367 (395) T protein:vir:38 289 MQPDVTSPDKYLIDGKPVIRIADKWLPDVSGS-HPLYFGDLKQGITLFDRQQMQIDTTNVGAGSFEHDTTKLRFIDRFDV 367 (395) T ss_pred eccCcCCCCcceeccceeEEecccccCcCCCc-ceEEEEeccccEEEEEecceEEEEeccccchhhcCceEEEEEEeecc Confidence 2110 0011112369999999987543 233 3456663 3343332 333222222111 12222233332 22 Q ss_pred EEeeeeeeeecccccccccCCCCcChhhhcCCc Q lcl|NC_016566. 274 FNVAVKGYRLKASARTPVEGVRSFKLSDITDKA 306 (364) Q Consensus 274 f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~ 306 (364) -.+||..|.--.-+.+.. ..|.-. .+|. T Consensus 368 ~~~~~~a~~~~~~~~~~~---~~~~~~--~~~~ 395 (395) T protein:vir:38 368 QLIDDGAFAAASFKTVAN---QAQGTA--GTGK 395 (395) T ss_pred EEecccceEEEEeecccC---CCCCcc--CCCC Confidence 226677766522211110 011111 1122 No 101 >protein:vir:96762 Length: 632 # NCBI annotation: putative phage-related protein # Family: family:all:21 # MgeID: mge:1628 # MgeName: VP882 # Cross-refs: genbank:acc:YP_001039818;genbank:gi:126010917;genbank:GeneID:5076272 Probab=94.20 E-value=0.0051 Score=33.16 Aligned_cols=264 Identities=8% Similarity=-0.012 Sum_probs=91.8 Q ss_pred CCccc-------cchhhhhhhhh--hh-HHHHHHHhhhhcceeEeccCcccCceeee----ehhhhh------------- Q lcl|NC_016566. 1 MSLTV-------FQRKLVTAVTQ--MI-PDNLNVFNAAANGAVVLGTGEVLKDVVEK----MSVGLI------------- 53 (364) Q Consensus 1 fd~~v-------fn~~~~~~~~e--~i-~q~~~~fn~as~gAivl~~~~~~Gdf~~~----~~f~~i------------- 53 (364) |.... +.....+.++. .+ ...+..-...+||.+| ..+.....|++. +...++ T Consensus 326 ~~~e~a~~~a~~~G~~arg~~~~~~~l~~ra~~~~t~~~gg~lv-p~~~~~~~iie~lr~~s~i~~l~~~~~~~~~g~~~ 404 (632) T protein:vir:96 326 FEREVSLAIADASGKEARGFYMPHEVLVQRQLEKKTAGKGGELV-ATELLSEEFIDILRNKAIIGQMGARMLPGLVGDVD 404 (632) T ss_pred hhhHHHHHHHHhhhhhhhhhhhhHHHHHHhhhhccccccccccc-ccccchHHHHHHHhhcchhhhhcceEeecCCcceE Confidence 11111 11111111100 00 0011111112222222 111111111110 000011 Q ss_pred -----cccc-cccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 54 -----ANLV-TDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLK 127 (364) Q Consensus 54 -----~g~~-~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk 127 (364) +|.. .-..-... -..+++ ++.+-.-..-+++. -+..+...+.. .+| .+-..|.+.++....+..-+ T Consensus 405 ip~~~~~~~a~wv~E~~~-~~~s~~-~f~~i~l~~~k~~~---~v~iS~ell~d--s~~-~~~~~i~~~l~~a~~~~~d~ 476 (632) T protein:vir:96 405 IPKKTSGANFYWIGEDED-VQDSDF-DFTTLSFSPKTIAG---AVPVTRKLRKQ--SSI-HVENLIREDLIEGIGVALDL 476 (632) T ss_pred EEEEeCCceeEeecCCcc-cccccc-ceeeEEeeeeEEEE---ehhhHHHHHhc--cch-HHHHHHHHHHHHHHHHHHHH Confidence 0000 00000000 000011 11111111122322 22344443331 223 22233555555544443333 Q ss_pred HHHHHHhhhhc-ccccceeecccccCcccccccccHHHHHHHHHHhcccccC--eeEEEEchHHHHHHHHhhcccccccc Q lcl|NC_016566. 128 AGIGAGKAAIE-SNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAAL--IKSWFMDGVTWANFIAYQALPSAEQV 204 (364) Q Consensus 128 ~lla~L~Gv~~-~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~--l~~ivMHS~v~~~L~k~~~it~~~~~ 204 (364) .+| .|--. ++...+...+....-.....++++.++.++..++...... =..|+||+.++..|.+ ..+.+..+- T Consensus 477 a~l---~G~G~~~~p~Gi~~~~~~~~~~~~~~~~~~~~i~~~~~~i~~~~~~~~~~~~~~~~~~~~~l~~-~~l~d~~G~ 552 (632) T protein:vir:96 477 AML---TGTGLANDPVGLLNMTGVPALTYPAGGVDWASVVDMETKISTFNADAGRLAYLTSVTQRGAAKK-AQVFDNTGE 552 (632) T ss_pred Hhh---cccCCCCccceeeecccccceecccccCCHHHHHHHHHHHhhcccccCccEEEEchhHHHHHHH-HhccCCCCc Confidence 332 22111 1111111111100011122345677788887776444332 3479999999988865 234333332 Q ss_pred cccccceeecc--cCCcEEEEeCCCCCCCCceEEEEEecc-eeEEec-CCCCcceeeccC--CCceeeeEEeeEEE-Eee Q lcl|NC_016566. 205 FAIGDLQVMGD--GLGRRFIISDAAADAMGAGKMLGLVPG-AVAVTT-NGLDMLAQEKGG--NENIERWWQGEFDF-NVA 277 (364) Q Consensus 205 ~~~~~~~~~~~--~lGrrVIVDD~~p~~~~~Yttylfg~G-Ai~~~~-~~~~~~~~~~~g--~e~~~~~~~~~~~f-~lh 277 (364) + +|++ .+|++|++++.+|... .+||.- -+.+++ +...+.+.+... ...+......++.| ..| T Consensus 553 ~------i~~~~~l~G~pv~~s~~ip~~~-----~~~gd~s~~~i~~~~~~~i~~~~~~~~~~~~v~~~~~~~~d~~v~~ 621 (632) T protein:vir:96 553 R------IWQNNEVNGYRAEASNQIPADT-----WIFGDWSQIVIAMWGVLDLKVDPYTKAASDGLVLRVFQDVDAGVRR 621 (632) T ss_pred e------eecCCeecccceEeccccccCc-----EEEeecceEEEEEecceEEEEccccccccCceEEEEEeecCceeec Confidence 2 3332 3599999999999542 333332 122232 222222222222 12222222222222 389 Q ss_pred eeeeeeccccc Q lcl|NC_016566. 278 VKGYRLKASAR 288 (364) Q Consensus 278 p~G~sw~~~~~ 288 (364) |..|.|-..+= T Consensus 622 ~~af~~~k~~A 632 (632) T protein:vir:96 622 KEAFCIAKKGA 632 (632) T ss_pred hhhhhheeecC Confidence 99999955430 No 102 >protein:vir:1268 Length: 397 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:329 # MgeName: phi-105 # Cross-refs: genbank:acc:NP_690760;genbank:gi:22855000;genbank:GeneID:955203 Probab=93.57 E-value=0.0071 Score=32.37 Aligned_cols=253 Identities=6% Similarity=-0.055 Sum_probs=90.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCcc--ccchhhhcc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTP--ATAKVLARM 78 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~--~T~~kit~~ 78 (364) .=-.+..+++....++.+.+..-.+.-.. -+-+. ...|.+ ++....++.. -...+.+... .+-.++..- T Consensus 130 ~gg~lvP~~~~~~ii~~~~~~~~l~~~~~--~~~~~--~~~~~~---~~~~~~~~~~--a~~v~Eg~~~~~~~~~~~~~v 200 (397) T protein:vir:12 130 DGGILIPEDIGRQIHEFKRQFEPLEQYVT--VEPVT--TRSGTR---LLEKNADMVP--FSPVEELGNLPEIDQPRFTKV 200 (397) T ss_pred cCcccCchhHHHHHHHhhhhhhhHHhhcc--eeecc--CCceeE---EEEEecCCcc--eeeecccccccccccccceeE Confidence 00001122222333333333222211100 00000 011211 1111111100 0000001000 000012221 Q ss_pred ceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCcccccc Q lcl|NC_016566. 79 LTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDGVGGRT 158 (364) Q Consensus 79 ~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~~~~~~ 158 (364) .-...|++. -+.++...+. ..+..+...|.+++++...+-....+|.... .+ . ... T Consensus 201 ~~~~~k~~~---~~~is~e~l~---ds~~~l~~~i~~~l~~~~~~~~d~~il~G~g----~~--~------------~~g 256 (397) T protein:vir:12 201 SYSIIDYGG---IMTLSNSMLN---DSDQAIMTYVAKWFAKKSVVTRNNLILAAIA----SL--K------------KVD 256 (397) T ss_pred EeeheeeEe---eehhhHHHHh---hchHHHHHHHHHHHHHHHHHHHHHHHHhccc----cc--c------------ccc Confidence 222223332 2345555443 2222334446666666666654443332111 00 0 011 Q ss_pred cccHHHHHHHHH-HhcccccCeeEEEEchHHHHHHHHhhcccccccc--cccc-cceeecccCCcEEEEeCCC-CCCCCc Q lcl|NC_016566. 159 FPTLADFPLAAS-KFGDQAALIKSWFMDGVTWANFIAYQALPSAEQV--FAIG-DLQVMGDGLGRRFIISDAA-ADAMGA 233 (364) Q Consensus 159 ~~s~~~l~~A~~-~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~--~~~~-~~~~~~~~lGrrVIVDD~~-p~~~~~ 233 (364) ..++.++.++.. .+=.....=..|+||...|..|.+ +.++++- +... .-......+|+||++.+.+ |..... T Consensus 257 ~~~~~~i~~~~~~~l~~~~~~~a~~~~n~~~~~~L~~---lkd~~G~~l~~~~~~~g~~~~l~G~pv~~~~~~~~~~~~~ 333 (397) T protein:vir:12 257 IDGLDGIKKALNVTLDPMVAPGSIVLTNQDGYDWLDT---LKDGTGRYLLQPDPTNPTKKLLDGRPVVPFTNRVLKTQKG 333 (397) T ss_pred cccHHHHHHHHhhccchhhhCCCEEEEcHHHHHHHHH---hhccCCceeecccccCCCCccccceeeEEecccccccCCC Confidence 233456666664 342333344679999999998854 4333332 2111 0011112369999876654 443322 Q ss_pred eEEEEEec--cee-EEecCCCCcceeeccCC--CceeeeEEeeEE---EEeeeeeeeecccccccccCCC Q lcl|NC_016566. 234 GKMLGLVP--GAV-AVTTNGLDMLAQEKGGN--ENIERWWQGEFD---FNVAVKGYRLKASARTPVEGVR 295 (364) Q Consensus 234 Yttylfg~--GAi-~~~~~~~~~~~~~~~g~--e~~~~~~~~~~~---f~lhp~G~sw~~~~~~~~~gg~ 295 (364) -..++||. .++ .....++.+...+.... +.-.+.++.... -.++|..|..-.-+. + T Consensus 334 ~~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~t~------~ 397 (397) T protein:vir:12 334 KAPLIIGNLKEAIVLFDREQQSIASTDTGAGAFETNSTKVRGIEREDVRKWDEDAVVFGQITV------E 397 (397) T ss_pred ccEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEEee------C Confidence 33466764 233 33223333322222211 111222222221 126676666633321 1 No 103 >protein:vir:94622 Length: 341 # NCBI annotation: PfWMP4_37 # Family: family:all:2203 # MgeID: mge:1525 # MgeName: Pf-WMP4 # Cross-refs: genbank:acc:YP_762667;genbank:gi:115304375;genbank:GeneID:5142322 Probab=93.17 E-value=0.0085 Score=31.93 Aligned_cols=303 Identities=10% Similarity=0.033 Sum_probs=119.9 Q ss_pred CCccccchhhhhhhh-hhhHHHHHHHhhhhcceeEe-ccCcccCceeeeehhhhhcccccccccccCCCccccchhhhcc Q lcl|NC_016566. 1 MSLTVFQRKLVTAVT-QMIPDNLNVFNAAANGAVVL-GTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARM 78 (364) Q Consensus 1 fd~~vfn~~~~~~~~-e~i~q~~~~fn~as~gAivl-~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~ 78 (364) =.-+.|=|+++...+ +.++++ .+|=. .+.- ..+...||-+..|..++.. + .|+.... ..++..++.. T Consensus 15 ~~v~~fipei~s~~i~~~l~~~-~v~~~----~~~d~~~~~~~Gdtv~ip~~g~~~--~--~d~~~~~--~i~~~~~~~~ 83 (341) T protein:vir:94 15 QRGQQFIPEQWLSEVQMFRKAK-MLDTS----VVKTWGAQVKKGDTFHVPRISELG--V--EDKATDV--PVGVQPVNDT 83 (341) T ss_pred hhHHHHHHHHHHHHHHHHHHhh-cchhh----ccccccccccCCceEEEeccCcce--e--eeecCCC--ccccccccCc Confidence 112335566666653 233222 12211 1110 1112348878888665432 2 2322111 2234445443 Q ss_pred ceeeEEe-ccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc-cceeecccccCcccc Q lcl|NC_016566. 79 LTNSVNL-SAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA-AANYTQPARVDGVGG 156 (364) Q Consensus 79 ~~vaVkl-~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na-~~v~dis~~t~~~~~ 156 (364) + +-+.+ -.++-++.++..+.....-|++ ..+.++.+....+..-+.+++.+.+. ...+ ..... +........ T Consensus 84 ~-~~itiD~~~~~~~~i~d~d~~~~~~d~~---~~~~~~~~~aLA~~~D~~i~~~~a~~-~~~~~~~~~~-~~~~~~t~~ 157 (341) T protein:vir:94 84 D-FVITVDTDRTTAVALDDLLEIQASYDLR---APYLEAMGYALAKDMTGSILGLRAAV-QNTASQNVFS-SSNGAITGN 157 (341) T ss_pred e-EEEEEeeeeecceeechHHHHhhccchH---HHHHHHHHHHHHHHHHHHHHHHhhhc-cccccCcccc-CccccccCc Confidence 3 33444 2233445566655555555663 33555555555555555555544432 1111 11100 000101111 Q ss_pred cccccHHHHHHHHHHhccccc--CeeEEEEchHHHHHHHHhhccccccccccc-ccceeecccCCcEEEEeCCCCCCCCc Q lcl|NC_016566. 157 RTFPTLADFPLAASKFGDQAA--LIKSWFMDGVTWANFIAYQALPSAEQVFAI-GDLQVMGDGLGRRFIISDAAADAMGA 233 (364) Q Consensus 157 ~~~~s~~~l~~A~~~lGD~~~--~l~~ivMHS~v~~~L~k~~~it~~~~~~~~-~~~~~~~~~lGrrVIVDD~~p~~~~~ 233 (364) ....++..+.+|+++|.++.- .=..+++++.+|..|++...+.+....-.. ..-..+...+|-.|+++..+|...+ T Consensus 158 ~~~~~~~~i~~a~~~Lde~~VP~~gR~lvv~P~~~~~Ll~~~~~~~~~~~g~~~l~~G~ig~i~G~~V~~Sn~lp~~~~- 236 (341) T protein:vir:94 158 GQAFSFAVFLAARRLLLEADVPEEKIVLLISPGQESALFTIPQFISKDFINNAPIAQGQIGSLMGVRVIRTSLIGNNSA- 236 (341) T ss_pred hhhhhHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHhhchhhhhhhccccchhheeeeeeEeceEEEEecccccccc- Confidence 223455678889999988642 115578899999999764444433222110 1111122235999999999996542 Q ss_pred eEEEEEecceeEEecCCCCcceeeccCCCceeeeEEeeEEEEeeeeeeeecccccccccCCCCc------ChhhhcCCcc Q lcl|NC_016566. 234 GKMLGLVPGAVAVTTNGLDMLAQEKGGNENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSF------KLSDITDKAN 307 (364) Q Consensus 234 Yttylfg~GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SP------T~aeLat~~N 307 (364) +.|.-+.+-+......+...-.+.-+.++ .|.--+.|+-|....+.... ..-| ...++.+..- T Consensus 237 -~~~~~~~~~~~~~~~~~~i~~~~~~~~~~---------~~~~~~~gl~~~~~av~~~k-~~~~~~~~~~~~~~~~~~~~ 305 (341) T protein:vir:94 237 -TGWRNGAPTIAPAEATPGFTGSRYLPKQD---------SFTSLPATFTGNSRPVHTAV-MCHMDWAAAVVSKAPRVTQS 305 (341) T ss_pred -ccccccccceecccccccccccccccccc---------cccccEEEEEEeccccccee-eecchhhhcccccccccccc Confidence 12222222222222222111111111110 11112222222111110000 0001 1111222222 Q ss_pred ce-----------eecCcCcCcceEEEEecCccccc Q lcl|NC_016566. 308 WE-----------LDQGQVDNAPATVQDVGSDSDTK 332 (364) Q Consensus 308 W~-----------rV~~s~K~~pgv~~~~~~~~~~~ 332 (364) |+ .++...=.=|=+.+.+.++.+|- T Consensus 306 ~~~~~~~~~i~~~~~~G~~~lrp~~~v~~~~~~~~~ 341 (341) T protein:vir:94 306 FENREQVWLMVGRQAYGARLYRPLHAVNIHTTGDTV 341 (341) T ss_pred chhhhhhhhhhhhhhhcccccCcceeEEEecCcCCC Confidence 21 02222112222333333333332 No 104 >protein:vir:96392 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1613 # MgeName: 53 # Cross-refs: genbank:acc:YP_239648;genbank:gi:66395381;genbank:GeneID:5132868 Probab=93.11 E-value=0.0087 Score=31.87 Aligned_cols=269 Identities=10% Similarity=-0.015 Sum_probs=104.3 Q ss_pred CCccccchhh-------------------------hhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcc Q lcl|NC_016566. 1 MSLTVFQRKL-------------------------VTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIAN 55 (364) Q Consensus 1 fd~~vfn~~~-------------------------~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g 55 (364) |+++.|.... ...+++.+.+..-...- ....++.|.-.+.|-+..-.. T Consensus 9 ~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l-------~~~~~~~~~~~~~p~~~~~~~ 81 (324) T protein:vir:96 9 LNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQL-------GKYEPMEGTEKKFTFWADKPG 81 (324) T ss_pred HHHHHHHHHhhhhhhhccccccccCcCccccchhHHHHHHHHHHhhchhhhh-------cceeeccCCceEEEEEecCcc Confidence 3333332221 12222222221111000 011122222122332211100 Q ss_pred cccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_016566. 56 LVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKA 135 (364) Q Consensus 56 ~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~G 135 (364) + .-+. .+. ....+..++.+-.-...|++ +-+..+.+.+. ....++...|.+++++.+.+..-+.+|. | T Consensus 82 a-~~v~-Eg~-~~~~~~~~~~~v~~~~~k~~---~~~~is~ell~---ds~~~l~~~i~~~la~ai~~~~d~a~l~---G 149 (324) T protein:vir:96 82 A-YWVG-EGQ-KIETSKATWVNATMRAFKLG---VILPVTKEFLN---YTYSQFFEEMKPMIAEAFYKKFDEAGIL---N 149 (324) T ss_pred e-eEec-CCc-cccccccceeEEEEeeEEEE---EeehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHHHHhc---c Confidence 0 0000 000 01111111222222222333 22345555443 2233455668888888888876665542 2 Q ss_pred hhcccc-cceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccc--ccccccee Q lcl|NC_016566. 136 AIESNA-AANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQV--FAIGDLQV 212 (364) Q Consensus 136 v~~~na-~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~--~~~~~~~~ 212 (364) .-.++. ........ ..........++.++.++..++.++......|+||+.++..|.+ +.+.++- +...... T Consensus 150 ~g~~~~~~gi~~~~~-~~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~vmn~~~~~~L~~---l~d~~G~~~~~~~~~~- 224 (324) T protein:vir:96 150 QGNNPFGKSIAQSIE-KTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRK---IVDPETKERIYDRNSD- 224 (324) T ss_pred CCCCCcCcccccccc-ccceeccccccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHH---hhccCCCeeecCCCCC- Confidence 111111 11111111 11111223456788999999998888888899999999998864 3333332 2212211 Q ss_pred ecccCCcEEEEeCCCCCCCCceEEEEEec-ceeEEec-CCCCcceeec---------cCC-----CceeeeEEeeEEEE- Q lcl|NC_016566. 213 MGDGLGRRFIISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQEK---------GGN-----ENIERWWQGEFDFN- 275 (364) Q Consensus 213 ~~~~lGrrVIVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~~~---------~g~-----e~~~~~~~~~~~f~- 275 (364) ..+|++|++++.++...+ ..+||. .-+.++. +++.+...+. .+. +.-.+.++.+..|. T Consensus 225 --~l~G~PV~~~~~~~~~~~---~~~~gd~~~~~~g~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~ 299 (324) T protein:vir:96 225 --SLDGLPVVNLKSSNLKRG---ELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVAL 299 (324) T ss_pred --cccceeeEeeCCCCCCcc---eEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEcc Confidence 246999999888765433 223332 1122322 2222111110 000 11122333333222 Q ss_pred --eeeeeeeecccccccccCCCCcChhhhcCCccceeecCcCcCcceEE Q lcl|NC_016566. 276 --VAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQGQVDNAPATV 322 (364) Q Consensus 276 --lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pgv~ 322 (364) +||..|.-- ++++|.- ..|||=+ T Consensus 300 ~v~~~~A~~~l-------------------~~a~~~~-----~~~~~~~ 324 (324) T protein:vir:96 300 HIADDKAFAKL-------------------VPADKRT-----DSVPGEV 324 (324) T ss_pred EEecccceEEE-------------------ecccccC-----CCCCCCC Confidence 334443221 1122200 1122211 No 105 >protein:vir:78830 Length: 324 # NCBI annotation: major head protein # Family: family:all:507 # MgeID: mge:1858 # MgeName: 80alpha # Cross-refs: genbank:acc:YP_001285361;genbank:gi:148717889;genbank:GeneID:5246961 Probab=93.11 E-value=0.0087 Score=31.87 Aligned_cols=269 Identities=10% Similarity=-0.015 Sum_probs=104.3 Q ss_pred CCccccchhh-------------------------hhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcc Q lcl|NC_016566. 1 MSLTVFQRKL-------------------------VTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIAN 55 (364) Q Consensus 1 fd~~vfn~~~-------------------------~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g 55 (364) |+++.|.... ...+++.+.+..-...- ....++.|.-.+.|-+..-.. T Consensus 9 ~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l-------~~~~~~~~~~~~~p~~~~~~~ 81 (324) T protein:vir:78 9 LNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQL-------GKYEPMEGTEKKFTFWADKPG 81 (324) T ss_pred HHHHHHHHHhhhhhhhccccccccCcCccccchhHHHHHHHHHHhhchhhhh-------cceeeccCCceEEEEEecCcc Confidence 3333332221 12222222221111000 011122222122332211100 Q ss_pred cccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_016566. 56 LVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKA 135 (364) Q Consensus 56 ~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~G 135 (364) + .-+. .+. ....+..++.+-.-...|++ +-+..+.+.+. ....++...|.+++++.+.+..-+.+|. | T Consensus 82 a-~~v~-Eg~-~~~~~~~~~~~v~~~~~k~~---~~~~is~ell~---ds~~~l~~~i~~~la~ai~~~~d~a~l~---G 149 (324) T protein:vir:78 82 A-YWVG-EGQ-KIETSKATWVNATMRAFKLG---VILPVTKEFLN---YTYSQFFEEMKPMIAEAFYKKFDEAGIL---N 149 (324) T ss_pred e-eEec-CCc-cccccccceeEEEEeeEEEE---EeehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHHHHhc---c Confidence 0 0000 000 01111111222222222333 22345555443 2233455668888888888876665542 2 Q ss_pred hhcccc-cceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccc--ccccccee Q lcl|NC_016566. 136 AIESNA-AANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQV--FAIGDLQV 212 (364) Q Consensus 136 v~~~na-~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~--~~~~~~~~ 212 (364) .-.++. ........ ..........++.++.++..++.++......|+||+.++..|.+ +.+.++- +...... T Consensus 150 ~g~~~~~~gi~~~~~-~~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~vmn~~~~~~L~~---l~d~~G~~~~~~~~~~- 224 (324) T protein:vir:78 150 QGNNPFGKSIAQSIE-KTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRK---IVDPETKERIYDRNSD- 224 (324) T ss_pred CCCCCcCcccccccc-ccceeccccccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHH---hhccCCCeeecCCCCC- Confidence 111111 11111111 11111223456788999999998888888899999999998864 3333332 2212211 Q ss_pred ecccCCcEEEEeCCCCCCCCceEEEEEec-ceeEEec-CCCCcceeec---------cCC-----CceeeeEEeeEEEE- Q lcl|NC_016566. 213 MGDGLGRRFIISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQEK---------GGN-----ENIERWWQGEFDFN- 275 (364) Q Consensus 213 ~~~~lGrrVIVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~~~---------~g~-----e~~~~~~~~~~~f~- 275 (364) ..+|++|++++.++...+ ..+||. .-+.++. +++.+...+. .+. +.-.+.++.+..|. T Consensus 225 --~l~G~PV~~~~~~~~~~~---~~~~gd~~~~~~g~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~ 299 (324) T protein:vir:78 225 --SLDGLPVVNLKSSNLKRG---ELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVAL 299 (324) T ss_pred --cccceeeEeeCCCCCCcc---eEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEcc Confidence 246999999888765433 223332 1122322 2222111110 000 11122333333222 Q ss_pred --eeeeeeeecccccccccCCCCcChhhhcCCccceeecCcCcCcceEE Q lcl|NC_016566. 276 --VAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQGQVDNAPATV 322 (364) Q Consensus 276 --lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pgv~ 322 (364) +||..|.-- ++++|.- ..|||=+ T Consensus 300 ~v~~~~A~~~l-------------------~~a~~~~-----~~~~~~~ 324 (324) T protein:vir:78 300 HIADDKAFAKL-------------------VPADKRT-----DSVPGEV 324 (324) T ss_pred EEecccceEEE-------------------ecccccC-----CCCCCCC Confidence 334443221 1122200 1122211 No 106 >protein:vir:2344 Length: 397 # NCBI annotation: gp14 # Family: family:all:507 # MgeID: mge:51 # MgeName: Bxb1 # Cross-refs: genbank:acc:NP_075281;genbank:gi:12657868;genbank:GeneID:920118 Probab=92.82 E-value=0.0098 Score=31.59 Aligned_cols=328 Identities=11% Similarity=-0.035 Sum_probs=115.3 Q ss_pred CC--------------ccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCC Q lcl|NC_016566. 1 MS--------------LTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPV 66 (364) Q Consensus 1 fd--------------~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~ 66 (364) |+ -.+-.++....+++.+.+..-+.+- ....++.+.-...|.+..-..+ .-++.... T Consensus 3 ~~~e~~~~~~~~t~~~~g~l~~~~~~~ii~~l~~~s~i~~l-------~~~~~~~~~~~~ip~~~~~~~a-~wv~Eg~~- 73 (397) T protein:vir:23 3 FSADHSQIAQTKDTMFTGYLDPVQAKDYFAEAEKTSIVQRV-------AQKIPMGATGIVIPHWTGDVSA-QWIGEGDM- 73 (397) T ss_pred cCHHHHHHhhccCCCCccccchhHHHHHHHHHHhccchhhh-------cceeeccCCceEEEEEcCCcce-EEecCCcc- Confidence 11 1123344444455554433222111 1122333333445544432221 11111111 Q ss_pred CccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc-ccee Q lcl|NC_016566. 67 GTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA-AANY 145 (364) Q Consensus 67 ~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na-~~v~ 145 (364) ....++ ++ ..+-++.++-.+-+..+.+.+. .....+...|.+++++...+..-+.+|. |--+... .... T Consensus 74 ~~~s~~-~f---~~v~l~~~k~~~~v~iS~ell~---ds~~~l~~~i~~~l~~aia~~~d~a~l~---G~gt~~~~~~~~ 143 (397) T protein:vir:23 74 KPITKG-NM---TKRDVHPAKIATIFVASAETVR---ANPANYLGTMRTKVATAIAMAFDNAALH---GTNAPSAFQGYL 143 (397) T ss_pred cccccc-ce---eEEEEeeEEEEEeehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHHHHhh---cccCCccccccc Confidence 111111 22 3333333332233446655443 2223344567788888888877665552 2111111 0001 Q ss_pred ecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc---c--ceee-cccC Q lcl|NC_016566. 146 TQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG---D--LQVM-GDGL 217 (364) Q Consensus 146 dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~---~--~~~~-~~~l 217 (364) +.... ...........++.++..++-+....-..|+||...+..|.+ +.+.++ ++... + .... ...+ T Consensus 144 ~~~~~--~~~~~~~~~~~~~~~~~~~l~~~~~~~a~~vmn~~~~~~L~~---lkd~~G~~i~~~~~~~~~~~~~~~~tl~ 218 (397) T protein:vir:23 144 DQSNK--TQSISPNAYQGLGVSGLTKLVTDGKKWTHTLLDDTVEPVLNG---SVDANGRPLFVESTYESLTTPFREGRIL 218 (397) T ss_pred ccccc--eeeecccchhHHHHHHHHhhhhcccCCCEEEEcHHHHHHHHH---hhccCCceeecccccccccccccCceee Confidence 11100 001112234456778888888888888999999999999865 333222 22111 0 0000 1236 Q ss_pred CcEEEEeCCCCCCCCceEEEEEec-ceeEEec-CCCCccee-e---c-----cCC-----CceeeeEEeeE--EE-Eeee Q lcl|NC_016566. 218 GRRFIISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQ-E---K-----GGN-----ENIERWWQGEF--DF-NVAV 278 (364) Q Consensus 218 GrrVIVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~-~---~-----~g~-----e~~~~~~~~~~--~f-~lhp 278 (364) |++|+++|.||.... ..+||. .-+.++. ++..+... + . .+. +.-...++++. .| .+|| T Consensus 219 G~Pv~~s~~~~~g~~---~~~~gDfs~~~i~~~~~i~i~~~~e~~~~~~~~~~~~~~~lf~~d~v~~ra~~r~d~~v~~~ 295 (397) T protein:vir:23 219 GRPTILSDHVAEGDV---VGYAGDFSQIIWGQVGGLSFDVTDQATLNLGSQESPNFVSLWQHNLVAVRVEAEYGLLINDV 295 (397) T ss_pred eeeEEEeCCCCCCce---EEEEeecceEEEEEEeceEEEEeeeeeeeeccccccceeeeeeccceeEEEEeeeccceecc Confidence 999999999985321 122221 1111221 11111100 0 0 000 11112233222 22 3566 Q ss_pred eeeeecccccccccCCCCcChhhh---cCCccceeecCcCcCcceEEEEecCcc---------c--cccccccccccccc Q lcl|NC_016566. 279 KGYRLKASARTPVEGVRSFKLSDI---TDKANWELDQGQVDNAPATVQDVGSDS---------D--TKGRRRTQTAQAVP 344 (364) Q Consensus 279 ~G~sw~~~~~~~~~gg~SPT~aeL---at~~NW~rV~~s~K~~pgv~~~~~~~~---------~--~~~~~~~~~~~~~~ 344 (364) ..|..-..... ..+...+ .++.+..+-+. .+.+.+.-..+++.. + ..+-..-++ ..-| T Consensus 296 ~a~~~~~~~~~------~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~ 367 (397) T protein:vir:23 296 NAFVKLTFDPV------LTTYALDLDGASAGNFTLSLD-GKTSANIAYNASTATVKSAIVAIDDGVSADDVTVTG-SAGD 367 (397) T ss_pred cceEEEeeccc------cceeeecccccCcceEEEEec-CccccCcccccchhhhHHHhhhcccccccceeeeec-CCce Confidence 66555332110 0111111 01233333321 122222111100000 0 000000000 0001 Q ss_pred cccchh----hcceeEEEeeeecC Q lcl|NC_016566. 345 TRNIKE----TAGVLVTLTATTAS 364 (364) Q Consensus 345 ~~~~~~----~~~~~~~~~~~~~~ 364 (364) .. |+- +++....+++...+ T Consensus 368 ~~-~~~~~~~~~~~~~~~~~~~~~ 390 (397) T protein:vir:23 368 YT-ITVPGTLTADFSGLTDGEGAS 390 (397) T ss_pred eE-EEeccccccCccccccCcccc Confidence 00 000 00011111111111 No 107 >protein:vir:1084 Length: 437 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:21 # MgeName: bIL309 # Cross-refs: genbank:acc:NP_076738;genbank:gi:13095848;genbank:GeneID:920418 Probab=92.54 E-value=0.011 Score=31.33 Aligned_cols=264 Identities=14% Similarity=0.045 Sum_probs=84.2 Q ss_pred CCccccchhhhh-------------------hh-hhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhccccccc Q lcl|NC_016566. 1 MSLTVFQRKLVT-------------------AV-TQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDR 60 (364) Q Consensus 1 fd~~vfn~~~~~-------------------~~-~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~ 60 (364) -...-|...... .. ...|. .+..++.....+-+.......+ ..|.....++..... T Consensus 139 ~~~~~~~~~~~~~e~~~~~~~~~~~~g~lvp~~~~~~i~-~~~~~~~l~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~ 214 (437) T protein:vir:10 139 KKVTAFADYLKTGEVRDVTGIALKDGKVIIPETILTPEK-EVHQFPRLGSLVRTESVTTTTG---KLPIFNNSTDLLTAH 214 (437) T ss_pred hhhhhhHHHHHhhhhhhhhhcccccccccchHHHHHHHH-HhhhhhhhhhcceeEeeccCce---eeEEeeccccccccc Confidence 000001110000 00 00000 0111111111111111001111 122111111111111 Q ss_pred ccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccc Q lcl|NC_016566. 61 NAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESN 140 (364) Q Consensus 61 d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~n 140 (364) ...+......+| ++.+-.-...+++ +-+.++...+. ..+..+...|.+.+++...+-.-..+|..+.. .. T Consensus 215 ~e~~~~~e~~~~-~~~~v~~~~~k~~---~~~~is~ell~---ds~~~~~~~i~~~l~~~~~~~~~~~i~~g~g~---~~ 284 (437) T protein:vir:10 215 TEYGQTTKNATP-VITPILWDLKTYT---GGYVFSQELIS---DSSYDWQAELQSRLIELRDNTDDSLIITALTD---GI 284 (437) T ss_pred cccccccccccc-cceeeeeehhhee---eehhhhHHHHh---hhHHHHHHHHHHHHHHHHHHHHHHHHhhhhcc---cc Confidence 111111101111 1222111112222 22334444332 22223334455555555544333332222111 00 Q ss_pred ccceeecccccCcccccccccHHHHHHHHH-HhcccccCeeEEEEchHHHHHHHHhhccccccc--cccc-ccceeeccc Q lcl|NC_016566. 141 AAANYTQPARVDGVGGRTFPTLADFPLAAS-KFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAI-GDLQVMGDG 216 (364) Q Consensus 141 a~~v~dis~~t~~~~~~~~~s~~~l~~A~~-~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~-~~~~~~~~~ 216 (364) . . .....+..++.++.. .+-.....=..|+||..++..|.+ +.++.+ ++.. ..-...... T Consensus 285 ~----------~---~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~l~~---lkd~~g~~~~~~~~~~~~~~~l 348 (437) T protein:vir:10 285 K----------K---TTSTYLLGDLKKVLNVTLKPQDSAAASIVMSQSAYNLFDM---ATDAMGRPLLQPNVTAATGYTL 348 (437) T ss_pred c----------c---cccccchhhHHHHHHhhhhhhhhcCCEEEEcHHHHHHHHH---hhccCCCeeeccCccCCCCccc Confidence 0 0 111122234555443 232223333579999999998854 443333 3321 111111124 Q ss_pred CCcEEEEeCCCC--C-CCCceEEEEEec--ceeEEec-CCCCcceeec--cCCCceeeeEEeeEEEEeeeeeeeecc--- Q lcl|NC_016566. 217 LGRRFIISDAAA--D-AMGAGKMLGLVP--GAVAVTT-NGLDMLAQEK--GGNENIERWWQGEFDFNVAVKGYRLKA--- 285 (364) Q Consensus 217 lGrrVIVDD~~p--~-~~~~Yttylfg~--GAi~~~~-~~~~~~~~~~--~g~e~~~~~~~~~~~f~lhp~G~sw~~--- 285 (364) +|+||++.|+|+ . ..+.+ +.+||. -++.+.. .+..+...+. .....+...++.++ =.+||..|..-. T Consensus 349 ~G~pv~~~~~~~~~~~~~~~~-~~~~gd~~~~~~~~~r~~~~~~~~~~~~~~~~~~~~~~r~d~-~~~~~~a~~~l~~~~ 426 (437) T protein:vir:10 349 LGKTVVIVDDKLFPSASAGDV-NIVVAPLKKAVINFKLTEITGQFQDTYDIWYKQLGIFLRQNV-VQASKDLIVNLTGKL 426 (437) T ss_pred ccceeEEecccccCCcCCCce-EEEEeeccccEEEEeeeceEEEEecccccccceeeEEEEEcc-EEecccceEEEEeec Confidence 699999988864 2 22333 344553 2333332 2222111110 01111111111111 237888877532 Q ss_pred cccccccCCCCcChh Q lcl|NC_016566. 286 SARTPVEGVRSFKLS 300 (364) Q Consensus 286 ~~~~~~~gg~SPT~a 300 (364) .+++.. .|+-. T Consensus 427 ~~~~~~----~~~~~ 437 (437) T protein:vir:10 427 KAVTVV----QSTAV 437 (437) T ss_pred cccccC----CCCCC Confidence 222211 23322 No 108 >protein:vir:80213 Length: 334 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1879 # MgeName: LKA1 # Cross-refs: genbank:acc:YP_001522884;genbank:gi:158345177;genbank:GeneID:5687476 Probab=92.46 E-value=0.011 Score=31.26 Aligned_cols=264 Identities=10% Similarity=-0.088 Sum_probs=114.2 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) .=+++|.-+++++|-+. .--+++.+. .....|+-++.+..++.. -..... +...+++.+...+. T Consensus 23 l~le~~~geV~~af~~~-s~~~~~~~~---------r~i~~G~s~~~~~iG~~~----~~~~~~--g~~l~~~~~~~~~~ 86 (334) T protein:vir:80 23 LHIEEHLGLVDASFMYS-SKFASWMNV---------RSLRGTNQLRVDRVGAST----IAGRKA--GEELVVQKNVSDKL 86 (334) T ss_pred ehhhhhhhHHHHHHHHh-hhhhcccee---------eeccccceEEEeeeccee----eeeecC--CCCCCCCCcccCce Confidence 22688888888888655 212222221 222346655555444331 111111 22223333433332 Q ss_pred -eeEE--eccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHH-HHHHHHHHHhhhhccccc---------ceeec Q lcl|NC_016566. 81 -NSVN--LSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLH-YLKAGIGAGKAAIESNAA---------ANYTQ 147 (364) Q Consensus 81 -vaVk--l~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~-~qk~lla~L~Gv~~~na~---------~v~di 147 (364) +-|- +..++--...++.+..+.- ...++++.+....+. +|..++.+++|+...... ....+ T Consensus 87 ~l~ID~~l~~~~~VddiD~~q~~~D~------rse~~~~~G~aLA~~~D~~~~~~l~kaa~~~~~~~~~~~~~~G~~~~~ 160 (334) T protein:vir:80 87 NLTVDTVLYARHFFDKFDEWTSNLDV------RKETAREDGIALARQYDQACIIQLQKCGDFLAPAHLKPAFHDGILLPS 160 (334) T ss_pred EEEEeeeeehhhhHhhHHHHhcCcch------HHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccccccccCCcceee Confidence 2221 2222222223343333222 233555555555443 344555555555432111 11111 Q ss_pred ccccCcccccccccH----HHHHHHHHHhcccccC-----eeEEEEchHHHHHHHHhhcccccccc-------cccccce Q lcl|NC_016566. 148 PARVDGVGGRTFPTL----ADFPLAASKFGDQAAL-----IKSWFMDGVTWANFIAYQALPSAEQV-------FAIGDLQ 211 (364) Q Consensus 148 s~~t~~~~~~~~~s~----~~l~~A~~~lGD~~~~-----l~~ivMHS~v~~~L~k~~~it~~~~~-------~~~~~~~ 211 (364) . .++ .......+. ..+.+|++.|..+.-. =..++|-...|..|++-..+.|.+-. .....+. T Consensus 161 ~-~~g-~~~~~~~~~~~l~~a~~~a~~~L~e~dvp~~~~~~R~~vv~P~~y~~Ll~~~r~~n~d~~~s~~~~~~~~g~i~ 238 (334) T protein:vir:80 161 T-ISG-LAADAAADADVLVAAHRQGVEAMVFRDLGDQLMSEGVTLLDPVIFSFLLEHDRLMNVEFGAKEGGNSFVGGRIA 238 (334) T ss_pred c-ccc-cccchhhhHHHHHHHHHHHHHHHHhcCCCCCcCCceEEEeChHHHHHHhcccccccceeccccccccccceeEE Confidence 1 011 110011111 2345677778776444 27889999999999875556543211 1111122 Q ss_pred eecccCCcEEEEeCCCCCC-------CCceEE----------EEEecceeEEecCCC-Ccce-eeccCCCceeeeEEeeE Q lcl|NC_016566. 212 VMGDGLGRRFIISDAAADA-------MGAGKM----------LGLVPGAVAVTTNGL-DMLA-QEKGGNENIERWWQGEF 272 (364) Q Consensus 212 ~~~~~lGrrVIVDD~~p~~-------~~~Ytt----------ylfg~GAi~~~~~~~-~~~~-~~~~g~e~~~~~~~~~~ 272 (364) . .+|-+|+.+=.+|.. ++.|.. +.|...|++..+-.+ .... ++..+..+...++++-. T Consensus 239 ~---v~G~~V~~Sn~~P~~~~t~~~~g~~~~~~agd~t~~~~~~~~~~Al~t~~~~~~~~e~~~~~~~~~d~i~~~~a~G 315 (334) T protein:vir:80 239 M---LNGVRVVETPRFPQSAITANALGADFNVTDAEVRRKMITFIPSMALISAQVHPVSAQFWEEKKDFGHYLDTFQSYN 315 (334) T ss_pred E---EeceEEEeecCCCCccccccccccccccccccccceEEEEEeCceEEEEEEeecceeeeechhhHHHHHHHHHHcC Confidence 2 359999999999953 223332 455668877665322 1111 11112222222222222 Q ss_pred EEEeeeeeeeecccccccccCCCCc Q lcl|NC_016566. 273 DFNVAVKGYRLKASARTPVEGVRSF 297 (364) Q Consensus 273 ~f~lhp~G~sw~~~~~~~~~gg~SP 297 (364) +=.+.|.+..--+-+. .+| T Consensus 316 ~g~lRPeaa~vv~~~~------~~~ 334 (334) T protein:vir:80 316 IGQRRPDAVAVHDITV------TNP 334 (334) T ss_pred CceeccceEEEEEEee------ecC Confidence 2235554443333322 245 No 109 >protein:vir:105038 Length: 428 # NCBI annotation: major capsid head protein precursor # Family: family:all:21 # MgeID: mge:1465 # MgeName: phiKO2 # Cross-refs: genbank:acc:YP_006586;genbank:gi:46402092;genbank:GeneID:2777903 Probab=92.09 E-value=0.013 Score=30.94 Aligned_cols=269 Identities=11% Similarity=0.077 Sum_probs=93.4 Q ss_pred CCcc-------------------c-cchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhccccccc Q lcl|NC_016566. 1 MSLT-------------------V-FQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDR 60 (364) Q Consensus 1 fd~~-------------------v-fn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~ 60 (364) |-.. + =.++++..++|.+++..-.++- |+-++.. -.|+ ...|-+..-.. ..-+ T Consensus 113 ~~~~~~~~~~~~~~~~~~~~~gg~liP~~~~~~ii~~l~~~~~l~~~---~~~~~~~--~~g~-~~~p~~~~~~~-a~~v 185 (428) T protein:vir:10 113 FASDELNDQSVSMAISTAAGSGGVLIPQNIHSEVIELLRDRTIVRKL---GARSIPL--PNGN-MSLPRLAGGAT-ASYT 185 (428) T ss_pred HhhhhhhhhhHhhhhcccccCCccccchhHHHHHHHHHhhhchhhhh---cceeeec--CCcc-eEEEEEeCCcc-eeee Confidence 0000 0 0122233334444332211111 1111100 1122 22232211000 0001 Q ss_pred ccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHH------HHh Q lcl|NC_016566. 61 NAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIG------AGK 134 (364) Q Consensus 61 d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla------~L~ 134 (364) ... ......++ ++.+-.-...+++. -+.++.+.+.. +++ .+..-|.+++++...+..-+.+|. ... T Consensus 186 ~Eg-~~~~~~~~-~f~~i~~~~~k~~~---~v~is~ell~d--s~~-~l~~~i~~~l~~ai~~~~d~~~l~G~G~~~~p~ 257 (428) T protein:vir:10 186 GEN-QDAKVSEA-RFDDVKLTAKTMIA---MVPISNALIGR--AGF-NVEQLVLQDILTAISVREDKAFMRDDGTGDTPI 257 (428) T ss_pred ccC-cccccccc-ceeeEEeeeEEEEE---eehhhHHHHhh--hhH-HHHHHHHHHHHHHHHHHHHHHHhccCCCCcccc Confidence 101 10111111 22222222223332 23455554431 222 344457777777777666554442 111 Q ss_pred hhhcccccceeecccccCcccccccccHHHHHHHH---HHhcccccCeeEEEEchHHHHHHHHhhccccccc--cccccc Q lcl|NC_016566. 135 AAIESNAAANYTQPARVDGVGGRTFPTLADFPLAA---SKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGD 209 (364) Q Consensus 135 Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~---~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~ 209 (364) |++..-.. ...+ ..+......+...+.+..++. ...+.....-..|+||...|..|.+ +.+.++ ++.... T Consensus 258 Gi~~~~~~-~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~---lkd~~G~~i~~~~~ 332 (428) T protein:vir:10 258 GMKARATQ-WNRL-LPWAADAAVNLDTIDTYLDSIILMSMDGNSNMISSGWGMSNRTYMKLFG---LRDGNGNKVYPEMA 332 (428) T ss_pred cccccccc-cccc-ccccccccccHHHHHHHHHHHHHhhhccccccccCEEEEcHHHHHHHHH---hhccCCceeccCCC Confidence 22211100 0000 000001111111222223332 2234444445689999999998854 333333 332222 Q ss_pred ceeecccCCcEEEEeCCCCCCCC---ceEEEEEec-ceeEEec-CCCCcc--eee--ccCCCce-------eeeEEeeEE Q lcl|NC_016566. 210 LQVMGDGLGRRFIISDAAADAMG---AGKMLGLVP-GAVAVTT-NGLDML--AQE--KGGNENI-------ERWWQGEFD 273 (364) Q Consensus 210 ~~~~~~~lGrrVIVDD~~p~~~~---~Yttylfg~-GAi~~~~-~~~~~~--~~~--~~g~e~~-------~~~~~~~~~ 273 (364) -.+ .+|+||+++|.||...+ .-+.++||. .-+.++. ++..+. ++. ......+ ...++.+.. T Consensus 333 ~g~---l~G~pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~~~~i~i~~~~~~~~~~~~~~~~~~f~~~~~~~R~~~r 409 (428) T protein:vir:10 333 QGM---LKGYPIQRTSAIPANLGEGGKESEIYFADFNDVVIGEDGNMKVDFSKEASYIDTDGKLVSAFSRNQSLIRVVTE 409 (428) T ss_pred CCe---eeceeeEEeccccccccCCCccceEEEEecceEEEEEecceEEEeecccccccccccccchhhcchhheeeeee Confidence 222 35999999999986321 223344443 1122222 222111 110 0111111 112222211 Q ss_pred EEeeeeeeeecccccccccCCCCcChhhhcCCccc Q lcl|NC_016566. 274 FNVAVKGYRLKASARTPVEGVRSFKLSDITDKANW 308 (364) Q Consensus 274 f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW 308 (364) | |+.. ..|.-=-+-++.+| T Consensus 410 ~-----d~~v-----------~~p~a~~~~t~~~~ 428 (428) T protein:vir:10 410 H-----DIGF-----------RHPEGLVLGTGVLF 428 (428) T ss_pred e-----Ccee-----------eccceEEEEeccCC Confidence 1 1111 13445555666777 No 110 >protein:vir:80180 Length: 381 # NCBI annotation: capsid protein # Family: family:all:2203 # MgeID: mge:1878 # MgeName: Pf-WMP3 # Cross-refs: genbank:acc:YP_001285797;genbank:gi:148747831;genbank:GeneID:5220456 Probab=89.46 E-value=0.026 Score=29.25 Aligned_cols=313 Identities=11% Similarity=0.003 Sum_probs=122.2 Q ss_pred CCc---cccchhhhhhhhh-hhHHHHHHHhhhhcceeEecc--CcccCceeeeehhhhhcccccccccccCCCccccchh Q lcl|NC_016566. 1 MSL---TVFQRKLVTAVTQ-MIPDNLNVFNAAANGAVVLGT--GEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKV 74 (364) Q Consensus 1 fd~---~vfn~~~~~~~~e-~i~q~~~~fn~as~gAivl~~--~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~k 74 (364) +++ +.|=|+++...+. .+++. ..| ..++-.+ +...||-+..|..+... ..|+... ...++.. T Consensus 15 ~~~t~~~~fiPev~s~~v~~~l~~~-lv~-----~~l~~~~~~~~~~GdTV~ip~~g~~~----a~d~~~g--~~i~~~~ 82 (381) T protein:vir:80 15 VDLSNVQVFIPEVWSSEVRMFRDQK-FAA-----LEATKKIPFEGKKGDLIHIPNISRAA----VYDKQPQ--TPVNLQA 82 (381) T ss_pred cchhhHHhhhhHHHHHHHHHHHHHh-hhh-----hhccccccceeecCceEEeeccCcce----eeeecCC--Ccccccc Confidence 332 2222344443322 22221 122 2222111 22458888888776542 2232222 2334555 Q ss_pred hhccceeeEEecc-ccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccc-------eee Q lcl|NC_016566. 75 LARMLTNSVNLSA-KVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAA-------NYT 146 (364) Q Consensus 75 it~~~~vaVkl~~-~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~-------v~d 146 (364) ++..+. -+.+.. ++-+..++..+.....-|+. ..+.++.+....+.+=+.+++.+..+-...+.. +-+ T Consensus 83 ~~~~~~-~itID~~~~~~~~Idd~D~~~~~~D~~---~~~~~~~~~aLA~~~D~~i~~~~~~~~~~~~~~~~t~~~~i~~ 158 (381) T protein:vir:80 83 RTDSEF-TFTVTKYKESSFMIEDIVNTQASYTLR---QYYTKEAGYALARDMDNFALAHRAVINAFPSQRIYSYDTTLGD 158 (381) T ss_pred cCCceE-EEEEeeeeecceeechHHHHhhccChH---HHHHHHHHHHHHHHHHHHHHHHHhhcccccccccccccccccc Confidence 555543 344422 22234455544443445663 335555555555554444444443332221110 111 Q ss_pred cccccCcccccccccHHHHHHHHHHhcccccC--eeEEEEchHHHHHHHHhhcccccccc----cccccceeecccCCcE Q lcl|NC_016566. 147 QPARVDGVGGRTFPTLADFPLAASKFGDQAAL--IKSWFMDGVTWANFIAYQALPSAEQV----FAIGDLQVMGDGLGRR 220 (364) Q Consensus 147 is~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~--l~~ivMHS~v~~~L~k~~~it~~~~~----~~~~~~~~~~~~lGrr 220 (364) .+............++..|.+|+++|.++.-- =..+++++.+|..|++...+.+.+.. -+.+.+. ..+|-+ T Consensus 159 ~~~~~~~t~~~~~~t~~~i~~a~~~Lde~~VP~egR~lvv~P~~~~~Ll~~~~~~~ad~~~~~~l~~G~Ig---~i~G~~ 235 (381) T protein:vir:80 159 GTVNAHLTGTPAPLTYAALLLAKQKLDEADVPQEGRIVMVSPAQYIDLLSINQFISVDFSQVKPVTSGVVG---TILGME 235 (381) T ss_pred cccccccccchhhHHHHHHHHHHHHHhhcCCCcCCcEEEeCHHHHHHHhhchhhhhhhhccchhhhceeee---EEcceE Confidence 11111111222344667889999999887521 14688999999999865444433221 1122222 335999 Q ss_pred EEEeCCCCCCCCceEEEEEecceeEEecCCCCcceeeccCCCceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChh Q lcl|NC_016566. 221 FIISDAAADAMGAGKMLGLVPGAVAVTTNGLDMLAQEKGGNENIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLS 300 (364) Q Consensus 221 VIVDD~~p~~~~~Yttylfg~GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~a 300 (364) |+++-.+|.... +-|.++.|+-.... |.+......+.+. ..+.-.+.+|.++.+-...- +-= T Consensus 236 Vv~Sn~lp~~~~--t~~~~~agap~~~~--~~~~~~~~~g~~s----~~a~av~~~k~yd~~~~~~~----------~~~ 297 (381) T protein:vir:80 236 VIVTTQIGINSL--TGYVNGQGAPTQPT--PGVLGSPYLPDQA----GTANVVNTGSASDLAVSLSY----------FGL 297 (381) T ss_pred EEeecccccccc--cceeeecccccccc--ccccccccccccc----cceeeeeeeeeeceeeeeee----------ccc Confidence 999999997543 23334444422211 1110011111111 11112233444444432110 000 Q ss_pred hhcCCccceeecCcCcCcceEEEEecCccccccccccccccccccccchhhcceeEEEeeeecC Q lcl|NC_016566. 301 DITDKANWELDQGQVDNAPATVQDVGSDSDTKGRRRTQTAQAVPTRNIKETAGVLVTLTATTAS 364 (364) Q Consensus 301 eLat~~NW~rV~~s~K~~pgv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 364 (364) .-.+++-|. +++.|-+=|. ++ ..+.-+++|-.+-=-|-+|+..-.+- --+ T Consensus 298 ~~~~g~~~~--~~~~~~~~~~-~~----------~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~ 347 (381) T protein:vir:80 298 PVFSGAGAT--AADGGQTLGS-FG----------GANRWATAVVCHPDWLAVGVQQNVKS-ESS 347 (381) T ss_pred eeeecceee--ecCCCceeee-eh----------hhhhhhhhcccccccccccceeEeec-ccc Confidence 001111111 1111111110 00 01111111111111122333332110 001 No 111 >protein:vir:102655 Length: 322 # NCBI annotation: Hypothetical protein # Family: family:all:6384 # MgeID: mge:1624 # MgeName: VP2 # Cross-refs: genbank:acc:YP_052979;genbank:gi:50282923;genbank:GeneID:2948122 Probab=89.15 E-value=0.028 Score=29.10 Aligned_cols=277 Identities=11% Similarity=0.042 Sum_probs=122.5 Q ss_pred CCc-cccchhhhhhhhhhhHHHHHHH----hhhhcceeEeccCcccCceeeeeh---hhhhcccccccccccCCCccccc Q lcl|NC_016566. 1 MSL-TVFQRKLVTAVTQMIPDNLNVF----NAAANGAVVLGTGEVLKDVVEKMS---VGLIANLVTDRNAYAPVGTPATA 72 (364) Q Consensus 1 fd~-~vfn~~~~~~~~e~i~q~~~~f----n~as~gAivl~~~~~~Gdf~~~~~---f~~i~g~~~~~d~~~~~~~~~T~ 72 (364) +.+ .+-+.++=.++++.-.+++..+ .+-=.+++...++...++|...+- |..++.- +......++...|| T Consensus 7 ~~~~~~Ms~~i~~~fv~qy~~~v~~~~qq~~s~L~~tV~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~d~~~dtp 84 (322) T protein:vir:10 7 MSMLPLIAGDIDQAFVQTYETTLRILSQQKSAKLKQYCQHKNESSESHNWETLASMDPDAVKRK--RSRQQSADGTYPTP 84 (322) T ss_pred eeeeeeeechhhhHHHHHHHHHHHHHHHHhhhhhhcccccccccccccceeecccccccccccc--cccccccCcccCCC Confidence 222 1111122222222222222222 223334555555555555433221 1111100 00111111122233 Q ss_pred hhhhccceeeEEe-ccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcc--cccceeeccc Q lcl|NC_016566. 73 KVLARMLTNSVNL-SAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIES--NAAANYTQPA 149 (364) Q Consensus 73 ~kit~~~~vaVkl-~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~--na~~v~dis~ 149 (364) .-.......-|-+ .+.++ ...+..+..+..-||... ..+.-+.++.|.+=+.+++.+.|.-.- ....+..-+. T Consensus 85 ~~~~~~~~r~~~~~d~~~~-~~VDd~D~~k~~~D~~~~---~~~~~a~AL~R~~D~~I~~a~~g~a~~~~~gt~v~~~ss 160 (322) T protein:vir:10 85 VNNKPFAKRRTNVDTYDTG-HVVEQEDISQMLLDPNSA---LITSQAYAMARKTDDLIIAGAWKPASIKGTGQPVEFLAT 160 (322) T ss_pred ccccccceEEEeecccccc-eecchHHHHHhhcCchHH---HHHHHHHHhhhHHHHHHHhhhhccccccccccccccCCC Confidence 2222222222332 23332 234455555566788543 557777778777777777666553321 1111111111 Q ss_pred ccCcccccccccHHHHHHHHHHhcccc--cCe-eEEEEchHHHHHHHHhhccccccc-----ccccccceeecccCCcEE Q lcl|NC_016566. 150 RVDGVGGRTFPTLADFPLAASKFGDQA--ALI-KSWFMDGVTWANFIAYQALPSAEQ-----VFAIGDLQVMGDGLGRRF 221 (364) Q Consensus 150 ~t~~~~~~~~~s~~~l~~A~~~lGD~~--~~l-~~ivMHS~v~~~L~k~~~it~~~~-----~~~~~~~~~~~~~lGrrV 221 (364) . .-..+..++++..|.+|.++|..+. ++. +-+++-+.-|.+|++-.-+++..- ++.. .+++..||-.| T Consensus 161 ~-~i~~g~~g~t~~kl~~a~~~l~~~dvp~d~~R~~vv~p~~~~~LL~d~~~ts~D~~~~~~l~~~---G~ig~~lGf~~ 236 (322) T protein:vir:10 161 Q-EIGDGTKPISFDYVTEITERFLENEIEPEVSKVIVIGPTQARKLLQITEATSADYTSAMDLQSK---GIITNWMGYTW 236 (322) T ss_pred c-ccccCccchhHHHHHHHHHHHHhcCCCCCCCeEEEeCHHHHHHHhcchhhhhhhcccchhhhhc---CeeeeeeeEEE Confidence 1 0122345778888999999996644 233 456666677777765444443221 1121 23444589999 Q ss_pred EEeCCCCCC--------------CCceEEEEEecceeEEecCC-CCcceeeccCCCc-eeee-EEeeEEEEeeeee---e Q lcl|NC_016566. 222 IISDAAADA--------------MGAGKMLGLVPGAVAVTTNG-LDMLAQEKGGNEN-IERW-WQGEFDFNVAVKG---Y 281 (364) Q Consensus 222 IVDD~~p~~--------------~~~Yttylfg~GAi~~~~~~-~~~~~~~~~g~e~-~~~~-~~~~~~f~lhp~G---~ 281 (364) |+.-.+|.. .++..++.+-..||+++.+. ......+..+... ..+. ...-..-.+.|.| + T Consensus 237 i~s~~lp~~~~t~~~~~~~~~~~~~~~~~~a~~k~Av~~a~~~dv~~~i~~~~~~~~a~~I~~~~~~Ga~ri~~~gVv~i 316 (322) T protein:vir:10 237 IVSTRLDKFDPTQWGMAAEDGPQGDEIWCIAMTDMALGYHSCKDIWTKVAEDPSASFAWRIYSAFTADCVRVEDEHIFKL 316 (322) T ss_pred EEeccCCccccccccccccCCCCccceeEEEEecCceeEEEeeeeeEEeeccCCcchhhhhhhhhhhCceEeccCcEEEE Confidence 999988842 23677899999999998753 2222211111111 1010 0000011133333 3 Q ss_pred eecccc Q lcl|NC_016566. 282 RLKASA 287 (364) Q Consensus 282 sw~~~~ 287 (364) ..+++- T Consensus 317 ~~~e~~ 322 (322) T protein:vir:10 317 RLKNSL 322 (322) T ss_pred EEeccC Confidence 444432 No 112 >protein:vir:78640 Length: 352 # NCBI annotation: phage capsid # Family: family:all:658 # MgeID: mge:1855 # MgeName: tp310-2 # Cross-refs: genbank:acc:YP_001429943;genbank:gi:156603997;genbank:GeneID:5525386 Probab=88.94 E-value=0.029 Score=28.99 Aligned_cols=257 Identities=8% Similarity=-0.004 Sum_probs=87.1 Q ss_pred CC---cc-------------------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceee Q lcl|NC_016566. 1 MS---LT-------------------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVE 46 (364) Q Consensus 1 fd---~~-------------------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~ 46 (364) |. .. +=.+++...+++.+.+. +...+=+-+ . +..|. . T Consensus 56 ~~~~~r~~~~~~~~~~~~~~~~~~~~al~~~~~~~gG~lIP~~~~~~Ii~~l~~~----s~l~~~~~v-~--~~~~~--~ 126 (352) T protein:vir:78 56 KAEFYRHAILPNEFEKPSMEAQRLLHALPTGNDSGGDKLLPKTLSKEIVSEPFAK----NQLREKARL-T--NIKGL--E 126 (352) T ss_pred HHHHHHHHhhhhHHHHHHhhHHHHHHHhccCCCCCCceeccHhHHHHHHHHHHhh----cchhhheee-E--ecCCc--e Confidence 00 00 00111111222222111 111000000 0 01111 0 Q ss_pred eehhhhhcccccccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_016566. 47 KMSVGLIANLVTDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYL 126 (364) Q Consensus 47 ~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~q 126 (364) .|-...-.+...-.. .+..-..++| +..+-.-...+++. -+..+...+....-|. ..-|.+++++...+-.. T Consensus 127 ~p~~~~~~~~a~~v~-E~~~~~~~~~-~f~~v~~~~~k~~~---~i~is~ell~Ds~~~l---~~~i~~~la~~~~~~e~ 198 (352) T protein:vir:78 127 IPRVSYTLDDDDFIT-DVETAKELKL-KGDTVKFTTNKFKV---FAAISDTVIHGSDVDL---VNWVENALQSGLAAKER 198 (352) T ss_pred EEEEecCCCcccccc-cccccccccc-cceeeeecceeEEe---echhhHHHHhhhhHHH---HHHHHHHHHHHHHHHHH Confidence 110000000000000 0000000111 12222211222222 2234444333222233 33466777776655434 Q ss_pred HHHHHHHhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhcccccccccc Q lcl|NC_016566. 127 KAGIGAGKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQVFA 206 (364) Q Consensus 127 k~lla~L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~~~~ 206 (364) +.++....|. +......... ..........+.++.++...|-.....=..|+||+.++..|++ ++.+..+..- T Consensus 199 ~~~~~~g~g~--~~~~g~l~~~---~~~~~t~~~~~d~i~~~~~~l~~~~~~~a~~~mn~~t~~~l~~--~~~~~~~~~~ 271 (352) T protein:vir:78 199 KDALAVSPKS--GLEHMSFYNG---SVKEVEGANMYDAIINALADLHEDYRDNATIYMRYADYVKIIS--VLSNGTTNFF 271 (352) T ss_pred HhhhhcCCCC--cccccceecc---ccccccccchHHHHHHHHhccChhhhcCCEEEEehHHHHHHHH--HHhccCCccc Confidence 4444332221 1111110000 0001111123557888888886666666789999999988865 2333222211 Q ss_pred cccceeecccCCcEEEEeCCCCCCCCceEEEEEec---ceeEEecCCCCcceeeccCCCceeeeEEeeEEE---Eeeeee Q lcl|NC_016566. 207 IGDLQVMGDGLGRRFIISDAAADAMGAGKMLGLVP---GAVAVTTNGLDMLAQEKGGNENIERWWQGEFDF---NVAVKG 280 (364) Q Consensus 207 ~~~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~---GAi~~~~~~~~~~~~~~~g~e~~~~~~~~~~~f---~lhp~G 280 (364) .. .....+|+||+++|+++.. +||. +-+.+. ...+.+. ..-....+.+..+..| .++|.- T Consensus 272 ~~---~~~~llG~PV~~~~~~~~~-------~~Gdf~~~~~~~~--~~~~~~~--~~~~~g~~~f~~~~r~Dg~~~~~eA 337 (352) T protein:vir:78 272 DT---PAEKVFGKPVVFTDAAVKP-------IVGDFNYFGINYD--GTTYDTD--KDVKKGEYLFVLTAWYDQQRTLDSA 337 (352) T ss_pred cc---CCccccccceEEecCCCce-------eEeehhhhhhhhh--hheeeee--ccccCCeeEEEEEeeeCceeechhh Confidence 11 1113479999999998742 2321 111111 0011111 1111122333332222 255665 Q ss_pred eeecccccccccCCCCcC Q lcl|NC_016566. 281 YRLKASARTPVEGVRSFK 298 (364) Q Consensus 281 ~sw~~~~~~~~~gg~SPT 298 (364) |+--..+-+ .+.-|+ T Consensus 338 ~~~l~~~a~---~~~~~~ 352 (352) T protein:vir:78 338 FRIAKAKES---TGSLPS 352 (352) T ss_pred eEEEEeecc---cCCCCC Confidence 554221111 122466 No 113 >protein:vir:4092 Length: 390 # NCBI annotation: major capsid protein a # Family: family:all:635 # MgeID: mge:86 # MgeName: 2389 # Cross-refs: genbank:acc:NP_510986;swissprot:trembl:q8w604;genbank:gi:17488508;uniprot:Q8W604;genbank:GeneID:1260361 Probab=88.66 E-value=0.031 Score=28.86 Aligned_cols=274 Identities=10% Similarity=-0.029 Sum_probs=89.6 Q ss_pred CCc-------c----------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccc Q lcl|NC_016566. 1 MSL-------T----------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLV 57 (364) Q Consensus 1 fd~-------~----------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~ 57 (364) .+. + +-.++++..+++.+.+.....+.. .-.+..+.....|-....+. . T Consensus 68 ~~~~l~~~~r~~~~~~~~~~~~~~gg~lvP~~~~~~I~~~~~~~s~i~~~~-------~~~~~~~~~~~i~~~~~~~~-a 139 (390) T protein:vir:40 68 GANALTSDESKYYNEVIAGNGFAGVTALLPPTVFERVFEDLTVEHPLLSKI-------NFVNTTATTEWIISVGDVAT-A 139 (390) T ss_pred CchhccHHHHHHHHHHHhccCcccCcccccHHHHHHHHHHHHhhhhhhhhc-------eeeecCCceeEEEEEcCCcc-e Confidence 000 0 001222222333332221111110 01111111111121110000 0 Q ss_pred cccccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH----- Q lcl|NC_016566. 58 TDRNAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA----- 132 (364) Q Consensus 58 ~~~d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~----- 132 (364) .-.+..+......+| ++.+-.-...+++ +-+..+...+. ..+..+..-|.+++++.+.+..-+.+|.- T Consensus 140 ~~~~E~~~~~~~~~~-~f~~i~l~~~k~~---~~i~iS~ell~---ds~~~l~~~i~~~la~~i~~~~~~a~l~G~G~~~ 212 (390) T protein:vir:40 140 WWGPLCAEIKEVLDN-GFDKIQTGMYKLS---AYIPVCNAMLD---LGPSWLDQYVRTILGEAMALGLEAGIVNGSGKDQ 212 (390) T ss_pred eeeccccccCccccc-cceeeEeeeeeEE---EeehhhHHHHh---cchHHHHHHHHHHHHHHHHHHHHhhhhcccCCCc Confidence 000000000000111 2333322233333 22345544433 33334455577888888877666554421 Q ss_pred HhhhhcccccceeecccccCcccccccccHHHHH-HHHHHhccc---ccCeeEEEEchHHHHHHHH-hhccccccccccc Q lcl|NC_016566. 133 GKAAIESNAAANYTQPARVDGVGGRTFPTLADFP-LAASKFGDQ---AALIKSWFMDGVTWANFIA-YQALPSAEQVFAI 207 (364) Q Consensus 133 L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~-~A~~~lGD~---~~~l~~ivMHS~v~~~L~k-~~~it~~~~~~~~ 207 (364) -.|++...+.......... .....+..+..++. +-...+++. ...-..|+||+..+.++++ ...+.+..+- T Consensus 213 P~Gil~~~~~~~~~~~~~~-~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~a~~i~n~~t~~~~l~~~~~~~d~~G~--- 288 (390) T protein:vir:40 213 PIGMMRDLNNVTAGEHPVK-TATPLTDLTPATLATKVMLPLTDNGKKSVSDAILVINPADYWSKIYAATSYMTPQGV--- 288 (390) T ss_pred cceeeeccccccccccccc-cccccchhhHHHHHHHHHHHhhcchhhhhcCceEEEcchhHHHHHHHHhhccCCCCc--- Confidence 0112211110000000000 00111111111111 112223333 2334679999988765543 2223332221 Q ss_pred ccceeec-ccCCcEEEEeCCCCCCCCceEEEEEecc-eeEEec-CCCCcceeeccCCCceeeeEEeeEEEE---eeeeee Q lcl|NC_016566. 208 GDLQVMG-DGLGRRFIISDAAADAMGAGKMLGLVPG-AVAVTT-NGLDMLAQEKGGNENIERWWQGEFDFN---VAVKGY 281 (364) Q Consensus 208 ~~~~~~~-~~lGrrVIVDD~~p~~~~~Yttylfg~G-Ai~~~~-~~~~~~~~~~~g~e~~~~~~~~~~~f~---lhp~G~ 281 (364) .+++ .++|++||++|.||... .+||.- -+.++. +++.+.+.....-+...+.+++...|- ++|..| T Consensus 289 ---~v~~~~~~g~pvv~~~~~p~~~-----i~~Gd~s~~~i~~~~~~~v~~~~~~~f~~~~~~~r~~~r~dg~v~~~~A~ 360 (390) T protein:vir:40 289 ---WVTGILPVPLEIVQSVAVPVGK-----AVAGRAKDYFMGIGSEQVIRTSTEYRLLDDETLYYAKQYANGRPKDNSSF 360 (390) T ss_pred ---cccccCCCceeEEEcCCCCCCc-----EEEEeeceEEEEeecceEEEecchhhhhcCcEEEEEEEEeCCEEecccce Confidence 2233 24699999999999532 333321 111111 222222222111112222233222221 445444 Q ss_pred ee---cccccc---cc----cCCCCcChhh Q lcl|NC_016566. 282 RL---KASART---PV----EGVRSFKLSD 301 (364) Q Consensus 282 sw---~~~~~~---~~----~gg~SPT~ae 301 (364) .- +....+ ++ ..+.+|+-+| T Consensus 361 ~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~ 390 (390) T protein:vir:40 361 LVFDITGLEGSPAIDVNVVNNATPSETPAE 390 (390) T ss_pred EEEEeeccCCCCCCCcceeeCCCCCCCCCC Confidence 32 222110 00 0122344444 No 114 >protein:vir:80376 Length: 435 # NCBI annotation: gp6, major capsid head protein # Family: family:all:21 # MgeID: mge:1881 # MgeName: phi644-2 # Cross-refs: genbank:acc:YP_001111085;genbank:gi:134288639;genbank:GeneID:4960624 Probab=87.43 E-value=0.039 Score=28.32 Aligned_cols=267 Identities=15% Similarity=0.078 Sum_probs=94.2 Q ss_pred CCcc--------------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhccccccc Q lcl|NC_016566. 1 MSLT--------------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDR 60 (364) Q Consensus 1 fd~~--------------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~ 60 (364) +.-. +-.+++...++|.+.+.....+- ++-++.. ..|. ...|.+..-.. ..-. T Consensus 119 ~~~~~~~~~~~~~~~~~~~~~gg~lvP~~~~~~ii~~l~~~~~i~~~---~~~~v~~--~~~~-~~~p~~~~~~~-a~~v 191 (435) T protein:vir:80 119 AIERGFGEEVAMSLNTLSPGAGGVLVPENLSSEVIELLRPKSVVRKL---GARTLPL--SNGN-ITIPRLKGGAI-VGYI 191 (435) T ss_pred HHhhhhhhhhhhhhcccCCCCCccccchhHHHHHHHHHhhhchhhhc---cceeeec--CCCc-eEEEEEeCCcc-eeee Confidence 0000 11222333344444432111110 1211111 0122 23343321110 0011 Q ss_pred ccccCCCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH------Hh Q lcl|NC_016566. 61 NAYAPVGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA------GK 134 (364) Q Consensus 61 d~~~~~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~------L~ 134 (364) ... . ....+..++.+-.-...|++ +-+..+..-+.-.+.+|. +-.-|.+++++...+..-+.+|.- .. T Consensus 192 ~E~-~-~~~~~~~~f~~i~~~~~k~~---~~~~is~ell~ds~~~~~-l~~~i~~~l~~a~~~~~d~a~l~G~G~~~~p~ 265 (435) T protein:vir:80 192 GAD-T-DIPTTQQQFDDLKLTAKKMA---ALVPIANDLIKYAGVNPN-VDQIVVGDLTAAIGAREDKAFIRDDGTANTPK 265 (435) T ss_pred ccC-c-cccccccceeeEEEeeEEEE---EeehhhHHHHHhhcccHH-HHHHHHHHHHHHHHHHHHHHhhccCCCCCccc Confidence 111 0 11111111222222222333 223455554443444543 223467777777776665544421 11 Q ss_pred hhhccccc-ceeecccccCcccccccccHHHHHHHHHHhcc--cccCeeEEEEchHHHHHHHHhhccccccc--cccccc Q lcl|NC_016566. 135 AAIESNAA-ANYTQPARVDGVGGRTFPTLADFPLAASKFGD--QAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGD 209 (364) Q Consensus 135 Gv~~~na~-~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD--~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~ 209 (364) |++..... .+...+ ..........++.++...+-. ..-.=..|+||+.++..|.+ +.+.++ ++.... T Consensus 266 Gi~~~~~~~~~~~~~-----~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~---lkd~~G~~l~~~~~ 337 (435) T protein:vir:80 266 GLRFWALPGNVITAS-----DGSTLQKIETDLGKAILALENADANLTQPGWIMAPRTFRFLEG---LRDGNGNKVYPELA 337 (435) T ss_pred ceeecccccceeecc-----cccchhhHHHHHHHHHHHhhccccccccCEEEEcHHHHHHHHh---hhccCCceeccCCC Confidence 22211100 000000 000111112244555444422 12223569999999988854 333332 332222 Q ss_pred ceeecccCCcEEEEeCCCCCCC---CceEEEEEec-ceeEEec-CCCCcceee----ccCCCce-------eeeEEeeEE Q lcl|NC_016566. 210 LQVMGDGLGRRFIISDAAADAM---GAGKMLGLVP-GAVAVTT-NGLDMLAQE----KGGNENI-------ERWWQGEFD 273 (364) Q Consensus 210 ~~~~~~~lGrrVIVDD~~p~~~---~~Yttylfg~-GAi~~~~-~~~~~~~~~----~~g~e~~-------~~~~~~~~~ 273 (364) -.+ .+|+||+++|.||... +.-...+||. .-+.++. ++..+.... .++...+ ...++.+.. T Consensus 338 ~~~---l~G~pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~f~~n~~~~r~~~r 414 (435) T protein:vir:80 338 NGM---LKGYPVGKTTQVPINLGEAGKESEIYFTDFGDVFIGEEETLEIDYSKEATYKDADGHMVSAFQRDQTLIRVIAK 414 (435) T ss_pred CCe---EeeeeeEEeccccccccCCCCcceEEEEEcccEEEEeecceEEEEeccccccccccchhhhhhcCcceeeeeee Confidence 222 3599999999998632 2223344442 1122222 222221111 1121111 122233322 Q ss_pred E---EeeeeeeeecccccccccCCCCcChhhhcCCcccee Q lcl|NC_016566. 274 F---NVAVKGYRLKASARTPVEGVRSFKLSDITDKANWEL 310 (364) Q Consensus 274 f---~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~r 310 (364) | ..||..|..-... +|-- T Consensus 415 ~d~~~~~~~a~~~l~~~-------------------~~~~ 435 (435) T protein:vir:80 415 NDFGPRHVESIAVLSGV-------------------AWGA 435 (435) T ss_pred eCcEeecccceEEEecc-------------------CCCC Confidence 2 1455555443221 1111 No 115 >protein:vir:4197 Length: 314 # NCBI annotation: putative structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:88 # MgeName: psiM100 # Cross-refs: genbank:acc:NP_071822;genbank:gi:11863105;genbank:GeneID:1257607 Probab=87.32 E-value=0.04 Score=28.28 Aligned_cols=263 Identities=10% Similarity=0.011 Sum_probs=107.3 Q ss_pred CC--ccc----------------cchhhhhhhhhhhHHHHHHHhhhhcceeEec-cCcccCceeeeehhhhhcccccccc Q lcl|NC_016566. 1 MS--LTV----------------FQRKLVTAVTQMIPDNLNVFNAAANGAVVLG-TGEVLKDVVEKMSVGLIANLVTDRN 61 (364) Q Consensus 1 fd--~~v----------------fn~~~~~~~~e~i~q~~~~fn~as~gAivl~-~~~~~Gdf~~~~~f~~i~g~~~~~d 61 (364) || -+. -+|+.+..+++.+.+.-...+. |-|+. .....+++....+=..+ .+.++ T Consensus 1 ~~~~~~~~~~~k~it~~d~~gG~L~P~~~~~~i~~l~e~s~i~~~----a~vi~t~~s~~~~i~~i~~g~~~---~~~~~ 73 (314) T protein:vir:41 1 MDFLNKPFQITPKIDVPDLGKGILAVQRFGEFVREVRENSAIIKD----ARVLNALKSYEVDISRISLGVEL---EPGRN 73 (314) T ss_pred CchhhhHHHhhcccccccCCCceeChHHHHHHHHHHHhccchhhh----eeeecccCccceeecccccCccc---ccccc Confidence 11 122 2444455566666553222222 22322 23333433222110011 11122 Q ss_pred cccCCC--ccccchhhhccceeeEEeccccCchhcCHHHHHhh--cCCHHHHHHHHHHHHHHHHHHHHHHHHHHH----- Q lcl|NC_016566. 62 AYAPVG--TPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKI--ETNVNSVAAEIAAQATQAIMLHYLKAGIGA----- 132 (364) Q Consensus 62 ~~~~~~--~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~--g~dp~~~~~~Ig~~va~yw~~~~qk~lla~----- 132 (364) -.+... ..++| ..++..-.+-|+... +..+..-+.-. +.|.++ .|.+++|+-+.++.+..++.- T Consensus 74 ~~~~~~~~~~~~~-tf~~~~l~~~kl~~~---v~is~e~L~D~a~~~~le~---~i~~~~Ae~~g~~~~~~~~nGdg~~~ 146 (314) T protein:vir:41 74 TSGTKVAPTADEV-TVSTNTLEMKELVTK---VVLEDEALEDNIEQSAFEQ---TITSLLASGVTYDLECFFLHADSSLT 146 (314) T ss_pred cccCCccCCcccc-cccceeeeeEEEEEe---ecccHHHHHhhhchhhHHH---HHHHHHHHHHHHHHHHHhhccccCCc Confidence 111111 11222 245554444455432 34555555322 335544 477889999999888877642 Q ss_pred --------HhhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccC---eeEEEEchHHHHHHHHhhccccc Q lcl|NC_016566. 133 --------GKAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAAL---IKSWFMDGVTWANFIAYQALPSA 201 (364) Q Consensus 133 --------L~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~---l~~ivMHS~v~~~L~k~~~it~~ 201 (364) ..|.+..-..++.+.+.. ....+...|.+....|-+..-+ =-.|+||..++..+.+. +.+. T Consensus 147 s~~~~~~~p~G~l~~a~~~~~~~~~~------~~~~~~~~~~~l~~sl~~~yr~~~~~~~~~m~~~t~~~~r~~--l~~~ 218 (314) T protein:vir:41 147 TGRELYRINDGWMKLAGNQYTDAEPE------DENWPLNLFDGMMDELDTRYLQLKPRMKFYVSNEIYNGYRKQ--LLVR 218 (314) T ss_pred CcccchhcchhhhhhcccceeecCcc------ccccHHHHHHHHHHhcCchhhcCCCceEEEecHHHHHHHHHH--Hhcc Confidence 234442222233322211 1223345577888888886533 34799999998776542 2111 Q ss_pred cc-c----cccccceeecccCCcEEEEeCCCCCCCCceEEEEEecceeEEecCC-CCc-ceeecc--CCCceeeeEEeeE Q lcl|NC_016566. 202 EQ-V----FAIGDLQVMGDGLGRRFIISDAAADAMGAGKMLGLVPGAVAVTTNG-LDM-LAQEKG--GNENIERWWQGEF 272 (364) Q Consensus 202 ~~-~----~~~~~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~GAi~~~~~~-~~~-~~~~~~--g~e~~~~~~~~~~ 272 (364) .. + ...+.-. ..+||+|+....||... ++.++|.+++-. +.. ...+.. ...+. .. ..+ T Consensus 219 ~~~l~~~~~~~~~~~---~l~G~PV~~~~~~~~~~-------~~~~~i~fgd~~nlv~~~~~~ir~~~~~~a-~~--~~~ 285 (314) T protein:vir:41 219 ETGLGDSALIGATGL---QYDGIPIQYVPALDALG-------DDKARALLTVPTNLVYGFWRNIRIEPKRDA-AM--RRT 285 (314) T ss_pred CCcccchhhhCCCCc---eecceeeEecccccccC-------CCCceEEEechhheEEEeeceeEEeecccC-cC--CeE Confidence 11 1 0111111 23699999999998533 334444444311 100 000000 00000 00 111 Q ss_pred EEEeeee---eeeecccccccccCCCCcChhhhcCCccceeecCcCcCcce Q lcl|NC_016566. 273 DFNVAVK---GYRLKASARTPVEGVRSFKLSDITDKANWELDQGQVDNAPA 320 (364) Q Consensus 273 ~f~lhp~---G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pg 320 (364) .|..+-| +|.|.++.+-.. =.|.-+| T Consensus 286 ~~~~~~r~d~~~~~~~aa~~~~----------------------~~~~~~~ 314 (314) T protein:vir:41 286 EYIASLRADCNYEDENAAVAAV----------------------IDMSSGG 314 (314) T ss_pred EEEEEEEeceEEEEcCcEEEEE----------------------eeccCCC Confidence 2222221 233332221111 0111111 No 116 >protein:vir:99920 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:1611 # MgeName: Halo # Cross-refs: genbank:acc:YP_655524;genbank:gi:109392294;genbank:GeneID:4157089 Probab=86.55 E-value=0.045 Score=27.98 Aligned_cols=267 Identities=7% Similarity=-0.059 Sum_probs=101.5 Q ss_pred CCcc---ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCccc-CceeeeehhhhhcccccccccccCCCccccchhhh Q lcl|NC_016566. 1 MSLT---VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVL-KDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLA 76 (364) Q Consensus 1 fd~~---vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~-Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit 76 (364) .+.. .-.+++...++|.+.+......- +..+ +.. |. .+.|.+..-..+ .-+. .+..-...++ ++. T Consensus 4 ~tt~~g~~vP~~~~~~ii~~~~~~s~l~~~---~~~i----~~~~~~-~~~p~~~~~~~a-~wv~-Eg~~~~~~~~-~f~ 72 (311) T protein:vir:99 4 FGTGNLKNLPRNIADGMVKDVVQGSTVAVL---SARK----PQRFGN-EDIITFNGRPKA-EFVG-EGQQKSSTTG-EFD 72 (311) T ss_pred ecCCCceeccHHHHHHHHHHHHhhchhhhh---ccee----eccCCc-eEEEEEeCCcee-EEee-cCcccccccc-eee Confidence 1110 12344455566665543222111 1111 111 22 234433211110 0000 0110001111 122 Q ss_pred ccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHh-----hhhcccc---cceeecc Q lcl|NC_016566. 77 RMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGK-----AAIESNA---AANYTQP 148 (364) Q Consensus 77 ~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~-----Gv~~~na---~~v~dis 148 (364) +-.-...|++. -+..+.+-+.-...+-..+...|.+++++.+.+...+.+|.-.. +..+... .....+. T Consensus 73 ~v~l~~~k~~~---~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~g~~~g~~~~g~~~~~~~~~~~~~ 149 (311) T protein:vir:99 73 FVTSTPKKAQV---TMRFNEEVQWADEDYQLGVLQTLSEAGAEALARALDLGLYHRINPLTGTVIPGWSNYLGAASKRVE 149 (311) T ss_pred EEEEeeEEEEE---eehhhHHHhhcccccHHHHHHHHHHHHHHHHHHHHHHHhhcccCcccCccccccccccccccceee Confidence 22222233332 23355443321123334456678899999999888777764321 1111000 0000000 Q ss_pred cccCcccccccccHHHHHHHHHHhcccccCe--eEEEEchHHHHHHHHhhccccccc--cccccccee-ecccCCcEEEE Q lcl|NC_016566. 149 ARVDGVGGRTFPTLADFPLAASKFGDQAALI--KSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQV-MGDGLGRRFII 223 (364) Q Consensus 149 ~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l--~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~-~~~~lGrrVIV 223 (364) ..+. .......++.++..++-....+. .+|+||+..+..|.+ +.+.++ ++......- ....+|+||++ T Consensus 150 ~~~~----~~~~~~~~i~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~---lkd~~G~~l~~~~~~~~~~~~l~G~Pv~~ 222 (311) T protein:vir:99 150 LTAD----TIANPDLAIEAAVGLLVANGHPTPVNGLALHPSIAWGLST---ARYTDGRKKFPELGLGIGVSSFEGIDASV 222 (311) T ss_pred cccc----ccchhHHHHHHHHHHHhhhccCCCccEEEEcHHHHHHHHh---hhccCCCeeecCcccCCCCceecceeeEe Confidence 0000 00111234556666654433332 359999999998854 443332 222211110 11235999999 Q ss_pred eCCCCCCCC-----------ceEEEEEec--ceeEEecC-CCCcceeeccC-C------CceeeeEEee--EEE-Eeeee Q lcl|NC_016566. 224 SDAAADAMG-----------AGKMLGLVP--GAVAVTTN-GLDMLAQEKGG-N------ENIERWWQGE--FDF-NVAVK 279 (364) Q Consensus 224 DD~~p~~~~-----------~Yttylfg~--GAi~~~~~-~~~~~~~~~~g-~------e~~~~~~~~~--~~f-~lhp~ 279 (364) ++.+|.... -+..+++|. ..+.++.. +..+...+... + +.-.+.++.+ +.| ++||. T Consensus 223 s~~i~~~~~~~~~~~~~~~~~~~~~~~Gdf~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~~~~r~~~r~d~~v~~~~ 302 (311) T protein:vir:99 223 SDTVNGGDEADPDDEDLDAARAVRGIVGDFANGIHWGVQRDIPVELIKYGDPDGQGDLKRHNQIALRLEIVYGWYVFTDR 302 (311) T ss_pred ecccccccccccccchhhccCcceEEEeeccccEEEEEecCceEEEeecCCCCcchhhhhcCcEEEEEEEeecceecChh Confidence 999874221 122223332 23333221 11111111110 1 0111222222 221 35666 Q ss_pred eeeecccccc Q lcl|NC_016566. 280 GYRLKASART 289 (364) Q Consensus 280 G~sw~~~~~~ 289 (364) -++.+.++ + T Consensus 303 ~v~~~~~~-A 311 (311) T protein:vir:99 303 FVVIENAV-A 311 (311) T ss_pred Heeeeccc-C Confidence 66665543 0 No 117 >protein:vir:1433 Length: 435 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:30 # MgeName: phiE125 # Cross-refs: genbank:acc:NP_536362;genbank:gi:17975167;genbank:GeneID:929171 Probab=82.62 E-value=0.075 Score=26.74 Aligned_cols=268 Identities=13% Similarity=0.050 Sum_probs=94.5 Q ss_pred CCc---------------cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccC Q lcl|NC_016566. 1 MSL---------------TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAP 65 (364) Q Consensus 1 fd~---------------~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~ 65 (364) +.. .+-.+++...+++.+.+..-...- ++-++.. ..|. ...|.+..-+.+ .-+... . T Consensus 124 ~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~l~~~~~i~~~---~~~~~~~--~~~~-~~~p~~~~~~~a-~~v~E~-~ 195 (435) T protein:vir:14 124 FGEEVAMSLNTLSPGAGGVLVPENLSSEVIELLRPKSVVRKL---GARTLPL--SNGN-ITIPRLKGGAIV-GYIGAD-T 195 (435) T ss_pred hhhhhhhhcccCCcCCCccccchhHHHHHHHHHhhhchhhhh---cceeeec--CCCc-eEEEEEeCCcce-eeeccC-c Confidence 000 012333444455555433221111 2211111 1122 244433311110 011111 1 Q ss_pred CCccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHH------HHhhhhcc Q lcl|NC_016566. 66 VGTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIG------AGKAAIES 139 (364) Q Consensus 66 ~~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla------~L~Gv~~~ 139 (364) .....++ ++.+-.-...+++ +-+..+...+.-.+.+|. +-..|.+++++...+...+.+|. ...|++.. T Consensus 196 ~~~~~~~-~f~~i~~~~~k~~---~~~~iS~ell~ds~~~~~-l~~~i~~~l~~ai~~~~d~a~l~G~G~~~~p~Gi~~~ 270 (435) T protein:vir:14 196 DIPTTQQ-QFDDLKLTAKKMA---ALVPIANDLIKYAGVNPN-VDQIVVGDLTAAIGAREDKAFIRDDGTANTPKGLRFW 270 (435) T ss_pred ccccccc-ceeEEEeeeEEEE---EeehhhHHHHHhhccCHH-HHHHHHHHHHHHHHHHHHHHhhccCCCCccccceeec Confidence 1111111 1222222222333 223455554443455653 33457777888777776665541 11222110 Q ss_pred cccceeecccccCcccccccccHHHHHHHHHHhccccc--CeeEEEEchHHHHHHHHhhccccccc--ccccccceeecc Q lcl|NC_016566. 140 NAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAA--LIKSWFMDGVTWANFIAYQALPSAEQ--VFAIGDLQVMGD 215 (364) Q Consensus 140 na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~--~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~~~~~~~~ 215 (364) .. ...+...++. ........++.+....+-.... .=..|+||...+..|.+ +.+.++ ++....-.+ T Consensus 271 ~~--~~~~~~~~~~--~~~~~~~~~~~~l~~~~~~~~~~~~~~~~v~n~~~~~~L~~---lkd~~G~~l~~~~~~g~--- 340 (435) T protein:vir:14 271 AL--PSNVITASDA--STLQKIETDLGKVILALENADANLTQPGWIMAPRTFRFLEG---LRDGNGNKVYPELANGM--- 340 (435) T ss_pred cc--ccceeccccc--cchhhHHHHHHHHHHHhhhccccccCCEEEEcHHHHHHHHH---hhccCCceeccCCCCCe--- Confidence 00 0000000000 0001111233333333322111 12469999999998854 333332 332222122 Q ss_pred cCCcEEEEeCCCCCC---CCceEEEEEec-ceeEEec-CCCCcceeec----cCCCce-------eeeEEeeEEEE---e Q lcl|NC_016566. 216 GLGRRFIISDAAADA---MGAGKMLGLVP-GAVAVTT-NGLDMLAQEK----GGNENI-------ERWWQGEFDFN---V 276 (364) Q Consensus 216 ~lGrrVIVDD~~p~~---~~~Yttylfg~-GAi~~~~-~~~~~~~~~~----~g~e~~-------~~~~~~~~~f~---l 276 (364) .+|+||+++|.||.. .+.-...+||. .-+.++. ++........ ++...+ ...++....|- . T Consensus 341 l~G~Pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~~~ 420 (435) T protein:vir:14 341 LKGYPVGKTTQVPINLGETGKESEIYFTDFGDVFIGEEETLEIDYSKEATYKDADGHMVSAFQRDQTLIRVIAKNDFGPR 420 (435) T ss_pred eecceeEeeccccccccCCCccceEEEeecccEEEEEecccEEEEeccccccccccchhhhhhcChhheeeeeeeCceee Confidence 359999999999863 22333455553 1122222 2222211111 111111 11222221111 2 Q ss_pred eeeeeeecccccccccCCCCcChhhhcCCcccee Q lcl|NC_016566. 277 AVKGYRLKASARTPVEGVRSFKLSDITDKANWEL 310 (364) Q Consensus 277 hp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~r 310 (364) +|.-|. .-++.+|-- T Consensus 421 ~~~a~~-------------------~l~~~~~~~ 435 (435) T protein:vir:14 421 HVESIA-------------------VLAGVAWGA 435 (435) T ss_pred cccceE-------------------EEecCCCCC Confidence 333222 223333322 No 118 >protein:vir:80128 Length: 466 # NCBI annotation: Phage capsid protein # Family: family:all:635 # MgeID: mge:1877 # MgeName: bacteriophage bv1 # Cross-refs: genbank:acc:YP_001425603;genbank:gi:155042936;genbank:GeneID:5469556 Probab=79.73 E-value=0.1 Score=26.03 Aligned_cols=297 Identities=14% Similarity=0.065 Sum_probs=73.9 Q ss_pred CCccc-----------------------cchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCce-eeeehhhhhccc Q lcl|NC_016566. 1 MSLTV-----------------------FQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDV-VEKMSVGLIANL 56 (364) Q Consensus 1 fd~~v-----------------------fn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf-~~~~~f~~i~g~ 56 (364) +.... .++.. ..+.+.+.+....--..++|.++ .++.+.... ....-++.|-.+ T Consensus 105 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~g~~~~-vP~~~~~~i~~~l~~~~~l~~~ 182 (466) T protein:vir:80 105 RTQQFVGGETRMKGFFRNMPYEQRAALIARSEV-KEFLAQVRTLAQQKRAVSGAELT-IPDVMLELLRDNMHRYSKLISK 182 (466) T ss_pred hhhHHhhHHHHHHHHHHhhhhhhHHHHHHHHHH-HHHHHHHHHHhhhhhhhcccccc-ccHHHHHHHHHhhhhhhhhhhh Confidence 00000 00000 00000111100000111222111 111110000 000000111111 Q ss_pred cccccccc--------C---CCccccchhhhc----cceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHH Q lcl|NC_016566. 57 VTDRNAYA--------P---VGTPATAKVLAR----MLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAI 121 (364) Q Consensus 57 ~~~~d~~~--------~---~~~~~T~~kit~----~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw 121 (364) ++-....+ . ..+......+.+ ...+....+.=.+-+..+...+.-...+ +-.-|.+.+++.. T Consensus 183 ~~v~~~~g~~~~~~~~~~~~a~wv~E~~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~---l~~~i~~~la~~~ 259 (466) T protein:vir:80 183 VRLRPLKGTARQNIAGAIPEGVWTEAVANLNELSLSFSQIEVDGYKVGGFIPIPNSTLEDSDLN---LADEILDAIGQAI 259 (466) T ss_pred eeeeecCceeEeeeecCCcceeecccccccccccccccceeecceeeeeehhhhHHHHhcchHH---HHHHHHHHHHHHH Confidence 10000000 0 000000000000 1111111111112223343333211222 2233445555544 Q ss_pred HHHHHHHHHHH-----Hhhhhcccccce------------eeccccc----CcccccccccHHHHHHHHHH-hcccccCe Q lcl|NC_016566. 122 MLHYLKAGIGA-----GKAAIESNAAAN------------YTQPARV----DGVGGRTFPTLADFPLAASK-FGDQAALI 179 (364) Q Consensus 122 ~~~~qk~lla~-----L~Gv~~~na~~v------------~dis~~t----~~~~~~~~~s~~~l~~A~~~-lGD~~~~l 179 (364) ..-.-+.+|.- -.|++...+..+ .+++... ..........+..+..+... ..-..... T Consensus 260 ~~~~~~ail~G~G~~~P~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 339 (466) T protein:vir:80 260 GFALDKAILYGTGTKMPVGIVTRLAQTTQPPNWGTKAPAWTNLSTTNLLKIDPTGKSAEEFFSELVLKLSKARANYSNGM 339 (466) T ss_pred HHHHhhheeeccCCCCcceeeecccccccccccccccccccccchhhhhhhhhhccchhhHHHHHHHHHHhhhccccCCc Confidence 43333332210 012221110000 0000000 00000000001111111111 12223345 Q ss_pred eEEEEchHHHHHHHHhhcccccccccccccceeecccCCcEEEEeCCCCCCCCceEEEEEecceeEEecCCCCcceeecc Q lcl|NC_016566. 180 KSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMGDGLGRRFIISDAAADAMGAGKMLGLVPGAVAVTTNGLDMLAQEKG 259 (364) Q Consensus 180 ~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~GAi~~~~~~~~~~~~~~~ 259 (364) ..|+||+.++..|++.....+..+...... ..-...+|++|++++.||... -.+++....+.....+..+.+.... T Consensus 340 ~~w~~~~~~~~~l~~~~~~~~~~g~~~~~~-~~~~~i~G~pvv~s~~~~~~~---~~~g~~~~y~i~~r~~~~i~~~~~~ 415 (466) T protein:vir:80 340 KFWAMSSNTHAVLMSKAITFNSAGALVASL-NNTMPIVGGDIVILDFIPDND---IIGGYGSLYLLAERADIKLAQSEHV 415 (466) T ss_pred eeEEecchhHHHhhcccccccCCccccccC-CCcccccccceeecCccCccc---eeeeccccEEEEeecceEEEechhh Confidence 569999999988865322112222211110 001124699999999999642 1111112212111122222111111 Q ss_pred C-CCceeeeEEeeEEEE---eeeeeeeecccccccccCCCCcChhhhcCCccceeecCcCcCcceE Q lcl|NC_016566. 260 G-NENIERWWQGEFDFN---VAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQGQVDNAPAT 321 (364) Q Consensus 260 g-~e~~~~~~~~~~~f~---lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pgv 321 (364) . .++ .+.+++...+. ++|..|.-- .++.. +|.-. =+- ..+--++|-| T Consensus 416 ~f~~d-~~~~r~~~r~dg~~~~~~afv~~--~~~~~----~~~~~-------~~~-~~~~~~~~~~ 466 (466) T protein:vir:80 416 RFIED-QTVFKGTARYDGKPVFGEGFVAV--NIANA----NPTTS-------ITF-APDEANVPEV 466 (466) T ss_pred hhhcC-cEEEEEEEEEccEEeccCceEEE--EecCC----Ccccc-------eee-ecCcCcCCCC Confidence 1 011 12233322222 333333321 11100 11000 000 1122222222 No 119 >protein:vir:108211 Length: 318 # NCBI annotation: gp9 # Family: family:all:6420 # MgeID: mge:2004 # MgeName: Giles # Cross-refs: genbank:acc:YP_001552338;genbank:gi:160700658;genbank:GeneID:5758931 Probab=78.26 E-value=0.12 Score=25.71 Aligned_cols=261 Identities=11% Similarity=0.002 Sum_probs=114.7 Q ss_pred CCccccchhhhhhhhhhhHHHH---HHHhh---hhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchh Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNL---NVFNA---AANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKV 74 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~---~~fn~---as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~k 74 (364) -|.=+=||.+.+..+..+.+++ +++.. +..+.+| .. +-..|+|- .| +..++..-+-...+... T Consensus 18 v~~ll~~P~~I~~~i~e~~~~~~iad~lf~~~~a~~~~~v-~f------~~~~p~~~--~~--d~e~VaEggEiP~~~~~ 86 (318) T protein:vir:10 18 VRELVGNPLWIPTALKKMMVNQFISESLFRNGGANPNGVV-AY------NEGNPSFL--ED--DVADVAEFGEIPVSAGA 86 (318) T ss_pred hHHhhCCchhHHHHHHHHHhccchhhhhhhccccccccee-EE------Eecccccc--cC--cHhhccCcccccccCCC Confidence 3333556777766655444432 22222 2222222 22 11223332 11 11121111111111111 Q ss_pred hhcccee--eEEeccccCchhcCHHHHHhhcCCHHH-HHHHHHHHHHHHHHHHHHHHHHHHHhhhh-ccc---------c Q lcl|NC_016566. 75 LARMLTN--SVNLSAKVGPVAITKAMMAKIETNVNS-VAAEIAAQATQAIMLHYLKAGIGAGKAAI-ESN---------A 141 (364) Q Consensus 75 it~~~~v--aVkl~~~~gpv~~t~~~~~~~g~dp~~-~~~~Ig~~va~yw~~~~qk~lla~L~Gv~-~~n---------a 141 (364) . ....+ ++|...+ +..+.++.+|-.-++.+ .+.++++.++.+-.+ ..+.+|..+. -.- + T Consensus 87 ~-G~~~ia~~~K~G~~---~~vS~Em~~~n~~~~v~r~~~~l~Nti~r~~d~----~a~dal~sa~t~~~~~s~~w~~~~ 158 (318) T protein:vir:10 87 R-GLPRTAFAVKKALG---VRVSKEMIDENRVGAVNDQMLQLRNTFIRANDR----SAKALLQSPIVPTLAVPTAWDNGG 158 (318) T ss_pred C-Cchhhhhhehhccc---eeccHHHHhhcChhHHHHHHHHHHHHHHHHHHH----HHHHHHhccccccccCCcCCCCcc Confidence 1 11122 2244444 45889999887777744 334455555555444 3444442221 110 0 Q ss_pred cceeecccccCcccccccccHHHHHHHHHHhccccc----CeeEEEEchHHHHHHHHhhcc-----ccccccccccc-ce Q lcl|NC_016566. 142 AANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAA----LIKSWFMDGVTWANFIAYQAL-----PSAEQVFAIGD-LQ 211 (364) Q Consensus 142 ~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~----~l~~ivMHS~v~~~L~k~~~i-----t~~~~~~~~~~-~~ 211 (364) ....|+..+.+..++ ...+++.| .+|.... ....|+||...+..|.+...+ .+++..+...+ .. T Consensus 159 ~~~~d~~~A~e~v~~----a~~~~~~a--~~~~~~~~~GY~pdtIVlhP~~~~~l~~n~~~~~~y~~~a~~~~~~~~~tg 232 (318) T protein:vir:10 159 KVRTDIAIAIEQIST----AAPTAYPA--GVGSSDEYFGFIPDTIVMHYALLPILMDNENFMKVYERNANYVSTAPDWTG 232 (318) T ss_pred cccccchhhhhhhhh----hhhhhhhh--hhhhhhhccCccceeeEECHHHHHHHhcchhhhhhhhccchhhhhcccccc Confidence 111122111110000 00011111 1122222 237999999999998653332 12222322211 11 Q ss_pred ee-cccCCcEEEEeCCCCCCCCceEEEEEecceeEEe-cCCCCc--ceeec----cCCCceeeeEEe-e--EEEEeeeee Q lcl|NC_016566. 212 VM-GDGLGRRFIISDAAADAMGAGKMLGLVPGAVAVT-TNGLDM--LAQEK----GGNENIERWWQG-E--FDFNVAVKG 280 (364) Q Consensus 212 ~~-~~~lGrrVIVDD~~p~~~~~Yttylfg~GAi~~~-~~~~~~--~~~~~----~g~e~~~~~~~~-~--~~f~lhp~G 280 (364) .+ .-+||.+||++=.+|.. +-|+|-.|.+++- +..|-. ..+.. +|+++.-+..++ | --.+..|+. T Consensus 233 ~~~g~~lGl~vi~s~~~p~~----~alvlq~g~vG~~~d~~pl~~t~~~~egg~~~g~~~~s~~~~~~~~~~~~V~~PkA 308 (318) T protein:vir:10 233 NFPGSVMGLNVIRSRTFPID----RVLIMERGTVGFYSDTRPLQFTALYPEGNGPNGGPTESYRADASHKRALAVDQPKA 308 (318) T ss_pred cccceeeceEEeecCccCCC----eeEEEecCCcceeeccccceeeecccCCCCCCCCcchhhheehheeeeeeeeCcce Confidence 11 23479999999999953 3699999999864 333211 12222 444444333222 1 125578999 Q ss_pred eeecccccccccCCCCc Q lcl|NC_016566. 281 YRLKASARTPVEGVRSF 297 (364) Q Consensus 281 ~sw~~~~~~~~~gg~SP 297 (364) +-|-.- -.|| T Consensus 309 ~~~itg-------i~~~ 318 (318) T protein:vir:10 309 ALWLTG-------IVTP 318 (318) T ss_pred eEEEee-------ccCC Confidence 999543 3478 No 120 >protein:vir:4226 Length: 326 # NCBI annotation: observed 35.2Kd protein # Family: family:all:507 # MgeID: mge:89 # MgeName: L5 # Cross-refs: genbank:acc:NP_039681;swissprot:sw:q05223;genbank:gi:9625447;uniprot:Q05223;genbank:GeneID:2942929 Probab=75.37 E-value=0.15 Score=25.14 Aligned_cols=269 Identities=9% Similarity=-0.008 Sum_probs=98.0 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) =.-.+-.+++...++|.+.+..-...- ...-++.+.-.+.|.+..-..+. .+... ......++ ++.+ T Consensus 26 ~~g~~ip~~~~~~ii~~~~~~s~i~~~-------~~~~~~~~~~~~~p~~~~~~~a~-~v~Eg-~~~~~~~~-~f~~--- 92 (326) T protein:vir:42 26 MFEGYLEPEQAQDYFAEAEKISIVQQF-------AQKIPMGTTGQKIPHWTGDVSAS-WIGEG-DMKPITKG-NMTS--- 92 (326) T ss_pred CCcceechhhHHHHHHHHHhcchhhhh-------cceeeccCCceEEEEEeCCcceE-EecCC-cccccccc-ceeE--- Confidence 111123444444455554432211110 01122223223444433222211 11111 11111111 2222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH-----hhhhcccccceeecccccCccc Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAG-----KAAIESNAAANYTQPARVDGVG 155 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L-----~Gv~~~na~~v~dis~~t~~~~ 155 (364) +-.+..+-.+-+..+.+.+. .....+...|.+++++...+...+.+|.-- .|++............ ... T Consensus 93 i~~~~~k~~~~v~iS~ell~---~s~~~~~~~i~~~l~~a~~~~~d~a~l~G~gs~~p~gi~~~~~~~~~~~~~---~~~ 166 (326) T protein:vir:42 93 QTIAPHKIATIFVASAETVR---ANPANYLGTMRTKVATAFAMAFDNAAINGTDSPFPTFLAQTTKEVSLVDPD---GTG 166 (326) T ss_pred EEEeeEEEEEeehhhHHHHh---cCHHHHHHHHHHHHHHHHHHHHHHHhhcccCCCccccccccccccceeecc---ccc Confidence 22322222222345554443 233445566888888888887766554211 1111111110000000 011 Q ss_pred ccccccHHH--HHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccccc----cce-e-ecccCCcEEEEeC Q lcl|NC_016566. 156 GRTFPTLAD--FPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAIG----DLQ-V-MGDGLGRRFIISD 225 (364) Q Consensus 156 ~~~~~s~~~--l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~~----~~~-~-~~~~lGrrVIVDD 225 (364) .....+..+ +.++..++......-..|+||...+..|.+ +.+.++ ++... ... . ....+|++|+++| T Consensus 167 ~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~n~~~~~~L~~---lkd~~G~~l~~~~~~~~~~~~~~~~~l~G~pv~~~~ 243 (326) T protein:vir:42 167 SNADLTVYDAVAVNALSLLVNAGKKWTHTLLDDITEPILNG---AKDKSGRPLFIESTYTEENSPFRLGRIVARPTILSD 243 (326) T ss_pred ccccchhHHHHHHHHHhhhhhhccCccEEEEeHHHHHHHHH---hhccCCceeeccccccCccccccCceeeeeeEEEcC Confidence 111112212 234555555555666789999999999864 332222 22211 111 1 1124699999999 Q ss_pred CCCCCCC-----ceEEEEEe-cceeEEecCC-C--CcceeeccC----CCceeeeEEeeEEE---Eeeeeeeee-ccccc Q lcl|NC_016566. 226 AAADAMG-----AGKMLGLV-PGAVAVTTNG-L--DMLAQEKGG----NENIERWWQGEFDF---NVAVKGYRL-KASAR 288 (364) Q Consensus 226 ~~p~~~~-----~Yttylfg-~GAi~~~~~~-~--~~~~~~~~g----~e~~~~~~~~~~~f---~lhp~G~sw-~~~~~ 288 (364) .+|.... -+.-|.|+ .+.+.+.-.. . .....+... -+.-...++.+..| .+||.-|.- +..+ T Consensus 244 ~~~~~~~~~~~Gd~s~~~~~~~~~~~v~~~~e~~~~~~~~~~~~~~~~~~~d~~~~r~~~~~d~~v~~~~a~~~l~~~~- 322 (326) T protein:vir:42 244 HVASGTVVGYQGDFRQLVWGQVGGLSFDVTDQATLNLGTPQAPNFVSLWQHNLVAVRVEAEYAFHCNDKDAFVKLTNVD- 322 (326) T ss_pred CCCCCceEEEEeecceEEEEEecceEEEEeecceeeecccccccchhhhhcCcEEEEEEEEeccEEecccceEEEeecc- Confidence 9986321 11111121 1222222110 0 000000000 00011222222111 155555522 1111 Q ss_pred ccccCCCC Q lcl|NC_016566. 289 TPVEGVRS 296 (364) Q Consensus 289 ~~~~gg~S 296 (364) .+++ T Consensus 323 ----~~~~ 326 (326) T protein:vir:42 323 ----ATEA 326 (326) T ss_pred ----ccCC Confidence 1112 No 121 >protein:vir:104085 Length: 320 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:1656 # MgeName: Che12 # Cross-refs: genbank:acc:YP_655596;genbank:gi:109392467;genbank:GeneID:4156953 Probab=74.00 E-value=0.16 Score=24.90 Aligned_cols=264 Identities=8% Similarity=-0.047 Sum_probs=100.4 Q ss_pred CCcccc--------------chhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCC Q lcl|NC_016566. 1 MSLTVF--------------QRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPV 66 (364) Q Consensus 1 fd~~vf--------------n~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~ 66 (364) ||.... .++....+++.+.+..-+..- . ...++.+.=...|.+.....+ .-......- T Consensus 7 ~~~~~~~~~~t~~~~~~~~ip~~~~~~ii~~~~~~s~l~~~---~----~~~~~~~~~~~~p~~~~~~~a-~~v~E~~~~ 78 (320) T protein:vir:10 7 FQVDHAQIAQTGDTMFKGYLEPEQAKDYFAEAEKTSIVQQF---A----QKVPMGTTGQKIPHWIGDVSA-QWIGEGDMK 78 (320) T ss_pred CCHHHHHhhccccccccccccHHHHHHHHHHHHhccchhhh---c----ceeeccCCceEEEEEeCCcce-EEecCCccc Confidence 332221 222223333333322111111 0 111222322344444322211 111111110 Q ss_pred CccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc---c- Q lcl|NC_016566. 67 GTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA---A- 142 (364) Q Consensus 67 ~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na---~- 142 (364) ...++ +++ .+..+..+-.+-+..+.+.+. .....+...|.+++++.+.+...+.+|. |--+... . T Consensus 79 -~~~~~-~f~---~v~~~~~k~~~~~~is~ell~---ds~~~l~~~i~~~l~~a~a~~~d~a~l~---G~g~~~~~~~~~ 147 (320) T protein:vir:10 79 -PITKG-NMT---SQNIAPHKIATIFVASAETVR---ANPANYLGTMRTKVATAFAMAFDSAALN---GTDSPFPTYLAQ 147 (320) T ss_pred -ccccc-cee---EEEEeeEEEEEeehhhHHHHh---cChHHHHHHHHHHHHHHHHHHHHHHhhc---ccCCCCCccccc Confidence 11111 222 233333322233346655443 2333455668899999988876666542 2111110 0 Q ss_pred --ceeecccccCcccccccccH--HHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--cccc----cccee Q lcl|NC_016566. 143 --ANYTQPARVDGVGGRTFPTL--ADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFAI----GDLQV 212 (364) Q Consensus 143 --~v~dis~~t~~~~~~~~~s~--~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~~----~~~~~ 212 (364) ......... ... ....+. ..+.++..++-+....-..|+||...+..|.+ +.+.++ ++.. ..-.. T Consensus 148 ~~~~~~~~~~~-~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~---lkd~~G~~l~~~~~~~~~~~~ 222 (320) T protein:vir:10 148 TTKSVSLADPG-GAT-ASDLTAYDAVAVNGLSLLVNAKKKWTHTLLDDIVEPILNG---AKDKNGRPLFIESTYTDENSP 222 (320) T ss_pred ccccccceecc-ccc-ccccccHHHHHHHHHhhhhcccCCCcEEEEcHHHHHHHHH---hhccCCceeeccccccCcccc Confidence 000111100 000 111111 13556677777777778899999999999954 332222 2111 01111 Q ss_pred ec--ccCCcEEEEeCCCCCCCCceEEEEEec--ceeEEec-CCCCcceee-----ccCC---------CceeeeEEeeEE Q lcl|NC_016566. 213 MG--DGLGRRFIISDAAADAMGAGKMLGLVP--GAVAVTT-NGLDMLAQE-----KGGN---------ENIERWWQGEFD 273 (364) Q Consensus 213 ~~--~~lGrrVIVDD~~p~~~~~Yttylfg~--GAi~~~~-~~~~~~~~~-----~~g~---------e~~~~~~~~~~~ 273 (364) ++ ..+|++|++++.+|.... ..+||. .++ ++. +++.+.+.+ .... +.-...++.+.. T Consensus 223 ~~~~~i~g~pv~~~~~~~~~~~---~~~~gd~~~~~-~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~~ 298 (320) T protein:vir:10 223 FRAGRIVSRPTILSDHVADGTT---VGYMGDFRNVI-WGQVGGLSFDVTDQATLNLGTPTEPNFVSLWQHNLVAVRVEAE 298 (320) T ss_pred ccCceeeeeeeEecCCCCCCce---EEEEeecceEE-EEEecCeEEEEeecceeeeccccccccchhhhcCcEEEEEEEe Confidence 11 236999999999986431 112221 111 221 112111110 0000 111122333222 Q ss_pred E---EeeeeeeeecccccccccCCCCcCh Q lcl|NC_016566. 274 F---NVAVKGYRLKASARTPVEGVRSFKL 299 (364) Q Consensus 274 f---~lhp~G~sw~~~~~~~~~gg~SPT~ 299 (364) | .+||.-|.--.... .|.- T Consensus 299 ~d~~v~~~~a~~~l~~~~-------ap~~ 320 (320) T protein:vir:10 299 YAFHNNDKDAFVKLTNVV-------TPDA 320 (320) T ss_pred eccEEecccceEEEEecc-------CCCC Confidence 2 15566654433221 1221 No 122 >protein:vir:4159 Length: 315 # NCBI annotation: structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:87 # MgeName: psiM2 # Cross-refs: genbank:acc:NP_046968;genbank:gi:9630538;genbank:GeneID:1261712 Probab=70.76 E-value=0.2 Score=24.36 Aligned_cols=265 Identities=9% Similarity=-0.014 Sum_probs=101.6 Q ss_pred CCc-----cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCC--ccccch Q lcl|NC_016566. 1 MSL-----TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVG--TPATAK 73 (364) Q Consensus 1 fd~-----~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~--~~~T~~ 73 (364) +.. =+-.|+.+..+++++.+..-....+ .++-......+.+....+=.++. +.++..+... ..++| T Consensus 19 ~t~~d~~Gg~l~P~~~~~~i~~~~e~s~~l~~~---~vi~~~~~~~~~i~~~g~~~~~~---~g~~~~~~~~~~~~~~~- 91 (315) T protein:vir:41 19 IDVPDLGRGVLSVDRFGEFVKAVRDSAVIIPEA---RIDNALKSYEKDISRLSLVLDVG---PGRDETGQKLAPPESTA- 91 (315) T ss_pred cCCcCCCCceechHHHHHHHHHHHhhhhhhhhc---eeeeccccccccccccccCcccc---cccccccCcCCCCCCcc- Confidence 111 1123555556666666532211111 11111111222211111100010 0111111111 11222 Q ss_pred hhhccceeeEEeccccCchhcCHHHHHhh--cCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH-----------hhhhccc Q lcl|NC_016566. 74 VLARMLTNSVNLSAKVGPVAITKAMMAKI--ETNVNSVAAEIAAQATQAIMLHYLKAGIGAG-----------KAAIESN 140 (364) Q Consensus 74 kit~~~~vaVkl~~~~gpv~~t~~~~~~~--g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L-----------~Gv~~~n 140 (364) .+.+..-.+-++... +..+.+.+.-. +.|.+ ..+.+++++.+.++.+..++.-= +|.+..- T Consensus 92 ~f~~~~l~~~~l~~~---~~it~elL~D~~~~~~~e---~~l~~~~a~~~a~~~~~~~~nGdg~s~~p~~~~~~G~l~~a 165 (315) T protein:vir:41 92 EVKTNTLYMREMVTK---VVIHEDAIEDNIEGKAFE---QKIVTLLGEGISYVLEKYYLHGDTSSSDPLLRMSDGWLKLA 165 (315) T ss_pred ccceeeeceeeeeee---ccccHHHHHhhhccccHH---HHHHHHHHHHHHHHHHHHhhccCCcCcCccccccccceecc Confidence 344443333344322 34555544322 33544 44778899999998888766431 1322111 Q ss_pred ccceeecccccCcccccccccHHHHHHHHHHhcccccC---eeEEEEchHHHHHHHHhhccccccc--cc----ccccce Q lcl|NC_016566. 141 AAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAAL---IKSWFMDGVTWANFIAYQALPSAEQ--VF----AIGDLQ 211 (364) Q Consensus 141 a~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~---l~~ivMHS~v~~~L~k~~~it~~~~--~~----~~~~~~ 211 (364) ..++.. ++........+...|.+..+.|-...-+ =..|+||..++..+.|. + +..+ +- ..+.-. T Consensus 166 ~~~~~~----~~~~~~a~~~~~d~l~~l~~sl~~~yr~~~~~~~~imn~~t~~~~rkl--k-~~~g~~lw~~~~~~g~~~ 238 (315) T protein:vir:41 166 SEKLTE----SDVDPEAEDWPMNLFDTMIESLPTPYRNNLPNMKFYVTWDIYRAYRDA--L-KGRETGLGDQALTGANSI 238 (315) T ss_pred cccccc----cccccccccccHHHHHHHHHhcChHHhhcCCceEEEEcHHHHHHHHHH--h-ccCCCccccchhhcCCCc Confidence 111110 0111111223344566666667664432 34799999999877551 2 1111 11 111222 Q ss_pred eecccCCcEEEEeCCCCCCCCceEEEEEecce-eEEecC-CCCcceeeccCCCceeeeEEeeEEEEeeee-eeeeccccc Q lcl|NC_016566. 212 VMGDGLGRRFIISDAAADAMGAGKMLGLVPGA-VAVTTN-GLDMLAQEKGGNENIERWWQGEFDFNVAVK-GYRLKASAR 288 (364) Q Consensus 212 ~~~~~lGrrVIVDD~~p~~~~~Yttylfg~GA-i~~~~~-~~~~~~~~~~g~e~~~~~~~~~~~f~lhp~-G~sw~~~~~ 288 (364) + .+|++|...+.||.....-..++||.=. +.++.. +....++. +-.+ .+.++.+.+.+- +|-|.+..+ T Consensus 239 t---l~G~PV~~~~~m~~~~~~~~~ilf~d~~nl~~~~~~~i~i~~~~-~a~~-----~~~~~~~~~r~d~~~~~~~~~a 309 (315) T protein:vir:41 239 L---YDGRPVQYVPALEALNDGKSRALFVVPTQLVYGFWRNIKVVPDY-DAEM-----RLTKYVASLRTDNHYEDEEGAV 309 (315) T ss_pred e---ecccceEecccccccCCCCccEEEecccceEEEeccccEEEeee-cCCC-----CceEEEEEEEeceeEEecccee Confidence 2 3699999999998644222334555421 222221 11111111 0000 001111112221 223333322 Q ss_pred ccccCCCCcChhhhcCCccceee Q lcl|NC_016566. 289 TPVEGVRSFKLSDITDKANWELD 311 (364) Q Consensus 289 ~~~~gg~SPT~aeLat~~NW~rV 311 (364) .... +| T Consensus 310 ~~~~-----------------~v 315 (315) T protein:vir:41 310 SATI-----------------TV 315 (315) T ss_pred Eeee-----------------eC Confidence 2111 11 No 123 >protein:vir:3158 Length: 321 # NCBI annotation: capsid protein gpE # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:316 # MgeName: PhiCh1 # Cross-refs: genbank:acc:NP_665929;genbank:gi:22091115;genbank:GeneID:951342 Probab=65.91 E-value=0.28 Score=23.65 Aligned_cols=273 Identities=8% Similarity=0.030 Sum_probs=97.3 Q ss_pred CCccccchhh------------------------hhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhccc Q lcl|NC_016566. 1 MSLTVFQRKL------------------------VTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANL 56 (364) Q Consensus 1 fd~~vfn~~~------------------------~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~ 56 (364) |.-|.||.+. ...+++++.+...... .-.+ ..-....|. ++.++ +++. T Consensus 1 ~~~k~~~~~l~~~~~~~~~~~~~~~~g~~v~~~~~~~l~~~i~e~s~~l~---~i~v-~~v~~~~~~---i~~~~-~~~~ 72 (321) T protein:vir:31 1 MASRTINNDLSRITEKNALTVDDLDAGGTLPDPLWDEFWTDMIEETPLLD---AIRT-ETVGAKKTR---IPTLN-IGER 72 (321) T ss_pred CchHHHHHHHHHHHHhccccccccCCcceeCHHHHHHHHHHHHHhhhhhh---hcee-eeccCccee---eeeec-cCCc Confidence 5555555542 2223333322211111 1111 111122332 22222 1222 Q ss_pred ccccccccCCCc-cccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHH-- Q lcl|NC_016566. 57 VTDRNAYAPVGT-PATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAG-- 133 (364) Q Consensus 57 ~~~~d~~~~~~~-~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L-- 133 (364) ..+....+.... ..+| ++.+.+--..++.. .+..+...+......| ++...|.+++++-+..+++..++.-- T Consensus 73 ~~~~~~e~~~~~~~~~~-~~~~~~~~~~k~~~---~~~it~e~L~d~a~~~-d~e~~i~~~ia~~~a~~~~~~~~nGd~~ 147 (321) T protein:vir:31 73 HRRPQDEGEWNENESDV-STGTIDISTEKATV---AWDLPREVVQENPEGE-ALADRILNLMTDAWSADVEDLAANGDED 147 (321) T ss_pred ccccccccccccccccc-eeeeeeeeeEEEEe---ehhccHHHHHhhhcch-hHHHHHHHHHHHHHHHHHHhheeecccc Confidence 222221111111 1122 23333322333332 2334544443212212 24455788888888888877665321 Q ss_pred ---------hhhhcccccceeecccccCcccccccccHHHHHHHHHHhcccccCe--eEEEEchHHHHHHHHhhcccccc Q lcl|NC_016566. 134 ---------KAAIESNAAANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALI--KSWFMDGVTWANFIAYQALPSAE 202 (364) Q Consensus 134 ---------~Gv~~~na~~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l--~~ivMHS~v~~~L~k~~~it~~~ 202 (364) +|.+..-..+... .+ .+....+...|.+..+.+-....+- -.|+||..++..+++. +.+.+ T Consensus 148 ~~~~~~~~n~G~l~~a~~~~~~----~~--~~~~~~~~d~l~~l~~~l~~~yr~~~~~v~im~~~~~~~~~~~--l~~~~ 219 (321) T protein:vir:31 148 AEDSFENQNDGFITVAEGDVET----ID--AADDILDNDLVIRTIAGLDSKYRARMNPALIVSEDQLLSYHYT--LTDRD 219 (321) T ss_pred CCCcccccchhhhhhhcccccc----cc--ccccccCHHHHHHHHHhccHhHhcCCCeEEEechHHHHHHHHH--HhcCC Confidence 2222110111100 00 1122345567788888886644321 2689999998766442 22222 Q ss_pred cccccccc--eeecccCCcEEEEeCCCCCCCCceEEEEEec---ceeEEecCCCCcceeeccC---CCc--eeeeEEeeE Q lcl|NC_016566. 203 QVFAIGDL--QVMGDGLGRRFIISDAAADAMGAGKMLGLVP---GAVAVTTNGLDMLAQEKGG---NEN--IERWWQGEF 272 (364) Q Consensus 203 ~~~~~~~~--~~~~~~lGrrVIVDD~~p~~~~~Yttylfg~---GAi~~~~~~~~~~~~~~~g---~e~--~~~~~~~~~ 272 (364) ..-....+ .....++|++|++++.||... .+|+. =.+++..+ ....+..... .+. .......+. T Consensus 220 ~~~~~~~l~~~~~~tl~G~pvv~~~~mP~~~-----il~t~~~nl~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~ 293 (321) T protein:vir:31 220 TPLGDNVIMGEADVNPFSFPIIGSGLWPDDK-----AMFTDPQNLIYALYRD-LEIDVLTESDKVSERDLHARYFMRGDD 293 (321) T ss_pred CccccchhhccccccccceeEEEcCCCCCCc-----EEEeccccEEEEEeec-cEEEEeecCccccccceeeEeeeeeec Confidence 21000000 001124699999999999643 22222 11111111 1111111111 111 111111222 Q ss_pred EEEeeeeeeeecccccccccCCCCcChhhhcCCcc Q lcl|NC_016566. 273 DFNVAVKGYRLKASARTPVEGVRSFKLSDITDKAN 307 (364) Q Consensus 273 ~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~N 307 (364) .|.+.-.+ .++-.++-+-|- +-|+-.+. T Consensus 294 ~~~ve~~~------a~a~~~~i~~~~-~~~~~~~~ 321 (321) T protein:vir:31 294 DFAIENTE------AVVLAEGLGDPL-EHLEEETS 321 (321) T ss_pred ceeEeccc------cEEEEecCCcch-hcccCCCC Confidence 23322211 111111101111 11111111 No 124 >protein:vir:2430 Length: 318 # NCBI annotation: major head subunit # Family: family:all:507 # MgeID: mge:52 # MgeName: D29 # Cross-refs: genbank:acc:NP_046832;genbank:gi:9630400;genbank:GeneID:1261582 Probab=61.85 E-value=0.35 Score=23.11 Aligned_cols=267 Identities=7% Similarity=-0.064 Sum_probs=102.7 Q ss_pred CCcc--------------ccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCC Q lcl|NC_016566. 1 MSLT--------------VFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPV 66 (364) Q Consensus 1 fd~~--------------vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~ 66 (364) |+.. +-.++....+++.+.+..-.++-. ...++.+.-...|.+.....+ .-...... T Consensus 7 ~~~e~~~~~~~~~~~~~~~ip~~~~~~ii~~~~~~~~l~~~~-------~~~~~~~~~~~ip~~~~~~~a-~~v~Eg~~- 77 (318) T protein:vir:24 7 FAVDHAQIAQTGDTMFKGYLEPEQAKDYFAEAEKTSIVQQFA-------QKVPMGTTGQKIPHWVGDVSA-QWIGEGDM- 77 (318) T ss_pred CCHHHHHhhcccCcccceeechhHHHHHHHHHHhhchhhhhc-------ceeeccCCceEEEEEeCCcce-EEecCCcc- Confidence 2221 234445555555555433222221 111222222334433322111 11111111 Q ss_pred CccccchhhhccceeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc----c Q lcl|NC_016566. 67 GTPATAKVLARMLTNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA----A 142 (364) Q Consensus 67 ~~~~T~~kit~~~~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na----~ 142 (364) -...++ ++.+-.-...|+. +-+..+.+.+. ....++...|.+++++.+.+...+.+|. |.-.... . T Consensus 78 ~~~~~~-~f~~i~~~~~k~~---~~~~iS~e~l~---ds~~~~~~~i~~~l~~~~~~~~d~a~l~---G~g~~~~~~~~~ 147 (318) T protein:vir:24 78 KPITKG-NMTSQTIAPHKIA---TIFVASAETVR---ANPANYLGTMRTKVATAFAMAFDGAAMH---GTDSPFPTYIGQ 147 (318) T ss_pred cccccc-ceeEEEEeeEEEE---EeehhhHHHhh---cChHHHHHHHHHHHHHHHHHHHHHhhhc---ccCCCCCccccc Confidence 011111 2222222222333 22335554333 3333456678899999988877776542 2111110 0 Q ss_pred ceeecccccCcccccccccHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhccccccc--ccc----cccceeec-- Q lcl|NC_016566. 143 ANYTQPARVDGVGGRTFPTLADFPLAASKFGDQAALIKSWFMDGVTWANFIAYQALPSAEQ--VFA----IGDLQVMG-- 214 (364) Q Consensus 143 ~v~dis~~t~~~~~~~~~s~~~l~~A~~~lGD~~~~l~~ivMHS~v~~~L~k~~~it~~~~--~~~----~~~~~~~~-- 214 (364) ....++.. ...+........+.++..++-.....-..|+||...+..|.+ +.+.++ ++. ..+...+. T Consensus 148 ~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~---lkd~~G~~l~~~~~~~~~~~~~~~~ 222 (318) T protein:vir:24 148 TTKAISIA--DTTGATTVYDQVAVNGLSLLVNDGKKWTHTLLDDITEPILNG---AKDQNGRPLFIESTYGEAASPFRSG 222 (318) T ss_pred cccccccc--ccccccchHHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHH---hhccCCceeecCccccCccccccCc Confidence 01111100 001111111123456666666666666789999999998864 333222 111 11111111 Q ss_pred ccCCcEEEEeCCCCCCCCceEEEEEec-ceeEEec-CCCCcceee-------c-cCC------CceeeeEEe--eEEE-E Q lcl|NC_016566. 215 DGLGRRFIISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQE-------K-GGN------ENIERWWQG--EFDF-N 275 (364) Q Consensus 215 ~~lGrrVIVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~~-------~-~g~------e~~~~~~~~--~~~f-~ 275 (364) ..+|.+|+++|.+|.... ..+||. .-+.++. ++......+ . .+. +.-...++. |+.| . T Consensus 223 ~i~g~pv~~~~~~~~~~~---~~~~gdfs~~~~~~~~~l~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v 299 (318) T protein:vir:24 223 RIVARPTILSDHVVEGTT---VGFMGDFSQLIWGQIGGLSFDVTDQATLNLGTVESPNFVSLWQHNLVAVRVEAEYAFHC 299 (318) T ss_pred eEEEEeeEEeCCCCCCcc---EEEEeecceEEEEEecCeEEEEeeccceeccccccccchhhhhcCcEEEEEEEEEccEE Confidence 235899999999985432 122322 1122222 111111100 0 000 001122222 2222 2 Q ss_pred eeeeeeeecccccccccCC Q lcl|NC_016566. 276 VAVKGYRLKASARTPVEGV 294 (364) Q Consensus 276 lhp~G~sw~~~~~~~~~gg 294 (364) ++|..|.--....+...-| T Consensus 300 ~~~~a~~~i~~~~a~~~~~ 318 (318) T protein:vir:24 300 NDAEAFVALTNVVSGGGEG 318 (318) T ss_pred ecccceEEEEeeccCCCCC Confidence 6666664422221111111 No 125 >protein:vir:101650 Length: 497 # NCBI annotation: gp13 # Family: family:all:585 # MgeID: mge:1515 # MgeName: 244 # Cross-refs: genbank:acc:YP_654768;genbank:gi:109302766;genbank:GeneID:4156084 Probab=56.17 E-value=0.46 Score=22.42 Aligned_cols=271 Identities=9% Similarity=0.011 Sum_probs=88.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) +- -.-.++....+++.+.+...+.+-... + +..+.-+..|....-.+...-.... ......+| ++..-.- T Consensus 159 gg-~~vp~~~~~~ii~~~~~~~~i~~l~~~--~-----~~~~~~~~~~~~~~~~~~a~wv~E~-~~~~~s~~-~f~~i~~ 228 (497) T protein:vir:10 159 FA-PGILPTFLPGIVEQLFYELSLADLISS--R-----PVTSPNLSYLTESAAHNNAAAVAEA-GTYPFSSE-EFARVYE 228 (497) T ss_pred cc-cccchhhhHHHHHHHHhhhhHHhhccc--c-----ccCCCceEEEEEcCCCCcceeeccC-cccccccc-cceeeEe Confidence 00 012233333444444433322222111 0 0111101111110000000000000 00001111 2222222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhhhhcccccceeecccc----- Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA-----GKAAIESNAAANYTQPAR----- 150 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~-----L~Gv~~~na~~v~dis~~----- 150 (364) ...|++. -+.++...+. ..| .+-.-|.+++++...+..-+.+|.- ..|++...+......... T Consensus 229 ~~~k~a~---~~~iS~ell~---d~~-~l~~~i~~~l~~~i~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~ 301 (497) T protein:vir:10 229 QVGKVAN---ALTITDEGLR---DAP-ELFNFVQGRLLEGIQRKEEVQLLAGGGYPGVNGLLQRSTGFTASSASSLFGAT 301 (497) T ss_pred eeeeeEe---ecHhHHHHHH---hHH-HHHHHHHHHHHHHHHHHHHHHhhcCCCcccccccccccccccccccccchhhh Confidence 2233332 2234443321 223 2344455666655554444333321 111221110000000000 Q ss_pred ----------cCccc--ccc------------------------------cccHHHHHHHHHHhcc-cccCeeEEEEchH Q lcl|NC_016566. 151 ----------VDGVG--GRT------------------------------FPTLADFPLAASKFGD-QAALIKSWFMDGV 187 (364) Q Consensus 151 ----------t~~~~--~~~------------------------------~~s~~~l~~A~~~lGD-~~~~l~~ivMHS~ 187 (364) +.+.. ..+ ...+..+.++...+-. ....-.+|+||.. T Consensus 302 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~vmn~~ 381 (497) T protein:vir:10 302 SATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAFVDIQLTLFQTPNAVVMNPR 381 (497) T ss_pred hhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHHhhhhhhcccCCCeEEEchH Confidence 00000 000 0000011112211111 1112247999999 Q ss_pred HHHHHHHhhcccccccc--ccc-----ccceeec--ccCCcEEEEeCCCCCCCCceEEEEEecceeEEec-CCCCcceee Q lcl|NC_016566. 188 TWANFIAYQALPSAEQV--FAI-----GDLQVMG--DGLGRRFIISDAAADAMGAGKMLGLVPGAVAVTT-NGLDMLAQE 257 (364) Q Consensus 188 v~~~L~k~~~it~~~~~--~~~-----~~~~~~~--~~lGrrVIVDD~~p~~~~~Yttylfg~GAi~~~~-~~~~~~~~~ 257 (364) .+..|.+ +.+..+- +.. .+..+.. ..+|+||++++.||... +..=-|.++++.+.. ..+.+...+ T Consensus 382 ~~~~l~~---lkd~~G~~i~~~~~~~~~~~~~~~~~~l~G~pV~~t~~~~~~~--~~~Gd~~~~~~~i~~r~~~~v~~~~ 456 (497) T protein:vir:10 382 DWELLRL---TKDANGQYMGGNFFGNAYGNPVNGGKNIWGVPVVTTPLIPLGT--ILVGHFAPSVIQTARREGVTMQMTN 456 (497) T ss_pred HHHHHHH---hhcCCCceeccCcccccccccccCCceeeceeeEecCCCCCCc--eEEeecccceEEEEEecccEEEeec Confidence 9998854 4333331 111 1111111 23599999999999532 211113456665543 344333332 Q ss_pred ccCC--CceeeeEEeeEEE---EeeeeeeeecccccccccCCCC Q lcl|NC_016566. 258 KGGN--ENIERWWQGEFDF---NVAVKGYRLKASARTPVEGVRS 296 (364) Q Consensus 258 ~~g~--e~~~~~~~~~~~f---~lhp~G~sw~~~~~~~~~gg~S 296 (364) .++. +.-.+.++.+..| +.||..|.--.-+-+. . + | T Consensus 457 ~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~~~-~-~-~ 497 (497) T protein:vir:10 457 SNGTDFVDGKVTVRAEERLGLLVYRPSAFQLIQLKKGA-T-G-S 497 (497) T ss_pred ccchhhhcCcEEEEEEEeecceeeccccEEEEEecCCc-c-C-C Confidence 2221 1113334443333 3678777663221100 0 1 2 No 126 >protein:vir:7855 Length: 497 # NCBI annotation: gp12 # Family: family:all:585 # MgeID: mge:150 # MgeName: CJW1 # Cross-refs: genbank:acc:NP_817462;genbank:gi:29565891;genbank:GeneID:1259081 Probab=56.17 E-value=0.46 Score=22.42 Aligned_cols=271 Identities=9% Similarity=0.011 Sum_probs=88.4 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccce Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARMLT 80 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~~ 80 (364) +- -.-.++....+++.+.+...+.+-... + +..+.-+..|....-.+...-.... ......+| ++..-.- T Consensus 159 gg-~~vp~~~~~~ii~~~~~~~~i~~l~~~--~-----~~~~~~~~~~~~~~~~~~a~wv~E~-~~~~~s~~-~f~~i~~ 228 (497) T protein:vir:78 159 FA-PGILPTFLPGIVEQLFYELSLADLISS--R-----PVTSPNLSYLTESAAHNNAAAVAEA-GTYPFSSE-EFARVYE 228 (497) T ss_pred cc-cccchhhhHHHHHHHHhhhhHHhhccc--c-----ccCCCceEEEEEcCCCCcceeeccC-cccccccc-cceeeEe Confidence 00 012233333444444433322222111 0 0111101111110000000000000 00001111 2222222 Q ss_pred eeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhhhhcccccceeecccc----- Q lcl|NC_016566. 81 NSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGA-----GKAAIESNAAANYTQPAR----- 150 (364) Q Consensus 81 vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~-----L~Gv~~~na~~v~dis~~----- 150 (364) ...|++. -+.++...+. ..| .+-.-|.+++++...+..-+.+|.- ..|++...+......... T Consensus 229 ~~~k~a~---~~~iS~ell~---d~~-~l~~~i~~~l~~~i~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~ 301 (497) T protein:vir:78 229 QVGKVAN---ALTITDEGLR---DAP-ELFNFVQGRLLEGIQRKEEVQLLAGGGYPGVNGLLQRSTGFTASSASSLFGAT 301 (497) T ss_pred eeeeeEe---ecHhHHHHHH---hHH-HHHHHHHHHHHHHHHHHHHHHhhcCCCcccccccccccccccccccccchhhh Confidence 2233332 2234443321 223 2344455666655554444333321 111221110000000000 Q ss_pred ----------cCccc--ccc------------------------------cccHHHHHHHHHHhcc-cccCeeEEEEchH Q lcl|NC_016566. 151 ----------VDGVG--GRT------------------------------FPTLADFPLAASKFGD-QAALIKSWFMDGV 187 (364) Q Consensus 151 ----------t~~~~--~~~------------------------------~~s~~~l~~A~~~lGD-~~~~l~~ivMHS~ 187 (364) +.+.. ..+ ...+..+.++...+-. ....-.+|+||.. T Consensus 302 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~vmn~~ 381 (497) T protein:vir:78 302 SATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAFVDIQLTLFQTPNAVVMNPR 381 (497) T ss_pred hhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHHhhhhhhcccCCCeEEEchH Confidence 00000 000 0000011112211111 1112247999999 Q ss_pred HHHHHHHhhcccccccc--ccc-----ccceeec--ccCCcEEEEeCCCCCCCCceEEEEEecceeEEec-CCCCcceee Q lcl|NC_016566. 188 TWANFIAYQALPSAEQV--FAI-----GDLQVMG--DGLGRRFIISDAAADAMGAGKMLGLVPGAVAVTT-NGLDMLAQE 257 (364) Q Consensus 188 v~~~L~k~~~it~~~~~--~~~-----~~~~~~~--~~lGrrVIVDD~~p~~~~~Yttylfg~GAi~~~~-~~~~~~~~~ 257 (364) .+..|.+ +.+..+- +.. .+..+.. ..+|+||++++.||... +..=-|.++++.+.. ..+.+...+ T Consensus 382 ~~~~l~~---lkd~~G~~i~~~~~~~~~~~~~~~~~~l~G~pV~~t~~~~~~~--~~~Gd~~~~~~~i~~r~~~~v~~~~ 456 (497) T protein:vir:78 382 DWELLRL---TKDANGQYMGGNFFGNAYGNPVNGGKNIWGVPVVTTPLIPLGT--ILVGHFAPSVIQTARREGVTMQMTN 456 (497) T ss_pred HHHHHHH---hhcCCCceeccCcccccccccccCCceeeceeeEecCCCCCCc--eEEeecccceEEEEEecccEEEeec Confidence 9998854 4333331 111 1111111 23599999999999532 211113456665543 344333332 Q ss_pred ccCC--CceeeeEEeeEEE---EeeeeeeeecccccccccCCCC Q lcl|NC_016566. 258 KGGN--ENIERWWQGEFDF---NVAVKGYRLKASARTPVEGVRS 296 (364) Q Consensus 258 ~~g~--e~~~~~~~~~~~f---~lhp~G~sw~~~~~~~~~gg~S 296 (364) .++. +.-.+.++.+..| +.||..|.--.-+-+. . + | T Consensus 457 ~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~~~-~-~-~ 497 (497) T protein:vir:78 457 SNGTDFVDGKVTVRAEERLGLLVYRPSAFQLIQLKKGA-T-G-S 497 (497) T ss_pred ccchhhhcCcEEEEEEEeecceeeccccEEEEEecCCc-c-C-C Confidence 2221 1113334443333 3678777663221100 0 1 2 No 127 >protein:vir:100632 Length: 381 # NCBI annotation: 77ORF006 # Family: family:all:635 # MgeID: mge:1476 # MgeName: 77 # Cross-refs: genbank:acc:NP_958606;genbank:gi:41189521;genbank:GeneID:2743778 Probab=47.02 E-value=0.72 Score=21.38 Aligned_cols=272 Identities=10% Similarity=-0.020 Sum_probs=82.6 Q ss_pred CCc-cccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccc Q lcl|NC_016566. 1 MSL-TVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARML 79 (364) Q Consensus 1 fd~-~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~ 79 (364) ++- -+-.+++...++|.+.+.--.++. +- +.+ ..|.. +.+.-...+.+. -....+......+| ++.+-. T Consensus 82 ~~Gg~lvP~~~~~~I~~~l~~~spir~~----a~-v~~--~~~~~-~i~~~~~~~~a~-W~~e~~~~~~~~~~-~f~~i~ 151 (381) T protein:vir:10 82 YKEEKLLPEETIDRIFEDLTTNHPLLAD----LG-IKN--AGLRL-KFLKSETSGVAV-WGKIYGEIKGQLDA-AFSEET 151 (381) T ss_pred CCCceecCHHHHHHHHHHHHhhcceeee----ee-eEe--cCcce-EEEeecCCcceE-EeecccccccccCc-cceeEe Confidence 110 011222333333333322111111 10 111 12221 222111111000 00000010111122 344444 Q ss_pred eeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc-cceeecccc---cCc-- Q lcl|NC_016566. 80 TNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNA-AANYTQPAR---VDG-- 153 (364) Q Consensus 80 ~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na-~~v~dis~~---t~~-- 153 (364) -.+.|++. + +..+..-+. ..+-.+..-|.+++++...+..-+.+| ...|. +.. ....+.+.. +++ T Consensus 152 l~~~kl~a-~--i~is~elL~---Ds~~~le~~i~~~la~~~a~~~~~afi-~GdG~--~qP~Gil~~~~~~~~~~~g~~ 222 (381) T protein:vir:10 152 AIQNKLTA-F--VVLPKDLND---FGPAWIERFVRVQIEEAFAVALETAFL-KGTGK--DQPIGLNRQVQKGVSVTDGAY 222 (381) T ss_pred ecceeEEe-e--ccccHHHHh---ccHHHHHHHHHHHHHHHHHHHhhceeE-ecccC--CCceeeeecCCcccccccccc Confidence 44445542 2 234433332 222222223444444443333222111 11111 000 011111100 000 Q ss_pred -----ccccccccHHH-------HHHHHHHhcccc----cCeeEEEEchHHHHHHHHhhcccccccccccccceeecccC Q lcl|NC_016566. 154 -----VGGRTFPTLAD-------FPLAASKFGDQA----ALIKSWFMDGVTWANFIAYQALPSAEQVFAIGDLQVMGDGL 217 (364) Q Consensus 154 -----~~~~~~~s~~~-------l~~A~~~lGD~~----~~l~~ivMHS~v~~~L~k~~~it~~~~~~~~~~~~~~~~~l 217 (364) ....+..+... +.......+... ..=..|+||..+|.+|++.-.+-+ ..+..++.-++ T Consensus 223 ~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~vmn~~t~~~l~~~~~~~~------~~G~~v~~lp~ 296 (381) T protein:vir:10 223 PEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYTHLN------ANGVYVTALPF 296 (381) T ss_pred ccccccccccccchhhHHHHHHHHHHhhhhhhccccccccCceEEEEchhhHHhhccccccCC------CCCceeecCCC Confidence 00001111111 111222222111 112368999999998865222212 22333444557 Q ss_pred CcEEEEeCCCCCCCCceEEEEEec-ceeEEec-CCCCcceeeccCCCceeeeEEe--eEEE-Eeeeeeeeeccccccccc Q lcl|NC_016566. 218 GRRFIISDAAADAMGAGKMLGLVP-GAVAVTT-NGLDMLAQEKGGNENIERWWQG--EFDF-NVAVKGYRLKASARTPVE 292 (364) Q Consensus 218 GrrVIVDD~~p~~~~~Yttylfg~-GAi~~~~-~~~~~~~~~~~g~e~~~~~~~~--~~~f-~lhp~G~sw~~~~~~~~~ 292 (364) |.+|++++.||... .+||. +-..+.+ .++.+.+.....=..-.+.+++ |+.. .+||..+.--.-+.. T Consensus 297 g~~vv~~~~~p~~~-----i~fGDfs~Y~i~~r~~~~i~~~~~~~~~~d~~~f~a~~r~dG~~~~~~A~~v~~l~~~--- 368 (381) T protein:vir:10 297 NLNVIESTVQEAGK-----VLTYVKGLYDGYLAGGINVQKFKETLALDDMDLYTAKQFAYGKAKDNKVAAVWKLDLK--- 368 (381) T ss_pred CceeEEcCCCCcCc-----EEEEEcccEEEEEecccEEEeechhhhhcCceEEEEEEEEcCEEecCCcEEEEEEeec--- Confidence 99999999999532 34443 1122222 2222221111110111122222 2111 366666554221111 Q ss_pred CCCCcChhhhcCCc Q lcl|NC_016566. 293 GVRSFKLSDITDKA 306 (364) Q Consensus 293 gg~SPT~aeLat~~ 306 (364) |..|.-++-+-.- T Consensus 369 -~~~~~~~~~~~~~ 381 (381) T protein:vir:10 369 -GHKPALEDTEETL 381 (381) T ss_pred -CCccccccccccC Confidence 1123222211111 No 128 >protein:vir:105522 Length: 423 # NCBI annotation: phage major head protein # Family: family:all:1412 # MgeID: mge:1463 # MgeName: phiSG1 # Cross-refs: genbank:acc:YP_516191;genbank:gi:89885994;genbank:GeneID:3964382 Probab=33.52 E-value=1.4 Score=19.87 Aligned_cols=318 Identities=12% Similarity=-0.001 Sum_probs=108.8 Q ss_pred CCccc--cchhhhhhh-hhhhHHHHHHHhhhhcceeEeccC---cccCceeeeehhhhhcccccccccccCCCccccchh Q lcl|NC_016566. 1 MSLTV--FQRKLVTAV-TQMIPDNLNVFNAAANGAVVLGTG---EVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKV 74 (364) Q Consensus 1 fd~~v--fn~~~~~~~-~e~i~q~~~~fn~as~gAivl~~~---~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~k 74 (364) |-..+ |.+++..+. ++.+.++|-.=|.-.+.. ..+ ...||-+.+|.=... .+++. .. .....-++.. T Consensus 1 MANsl~~l~p~iia~~al~~l~~~lV~~~lV~r~y---~~ef~~ak~GDTV~I~~P~~~--~~~d~-~~-~~~t~~~~~~ 73 (423) T protein:vir:10 1 MANNLDANVSQIVLKKFLPGFMSDLVLCKTVDRQL---LAGEINSSTGDSVSFKRPHQF--KSERT-MD-GDITGKSKNS 73 (423) T ss_pred CccccccccHHHHHHHHHHHHHhhcccchhhccCC---CccccccccCCEEEEeeCCce--eeecc-cC-cccCcccccc Confidence 76446 888876554 666765443211111111 001 113554444322211 11110 00 0000112334 Q ss_pred hhccceeeEEeccc-cCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccceeecccccCc Q lcl|NC_016566. 75 LARMLTNSVNLSAK-VGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAAANYTQPARVDG 153 (364) Q Consensus 75 it~~~~vaVkl~~~-~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~~v~dis~~t~~ 153 (364) +.+++ +-+++... +=++.|+..++.....+.++++..-.+.+|+.. .+.|...+.. .+.+ ...-.+. T Consensus 74 l~e~~-v~l~id~~k~~a~~v~d~E~~l~i~~~~~~l~~A~~aLA~~v----d~~ia~~~~~-~~~~---~vgt~~t--- 141 (423) T protein:vir:10 74 LISAK-ATGEVGNYITVAVEYRQIEEALKLNQLDQILVPINERMVTDL----ETELALFMMK-HGAL---SLGSPNT--- 141 (423) T ss_pred cccce-EEEEecceeeeeeeeChHHHhcChhHHHHHHHHHHHHHHHHH----HHHHHHHhhh-cccc---ccccccc--- Confidence 44443 44444322 225667777776333333332222222222222 2222222211 1111 1111111 Q ss_pred ccccccccHHHHHHHHHHhcccccC--eeEEEEchHHHHHHHHh-hccccccccc----ccccceeecccCCcEEEEeCC Q lcl|NC_016566. 154 VGGRTFPTLADFPLAASKFGDQAAL--IKSWFMDGVTWANFIAY-QALPSAEQVF----AIGDLQVMGDGLGRRFIISDA 226 (364) Q Consensus 154 ~~~~~~~s~~~l~~A~~~lGD~~~~--l~~ivMHS~v~~~L~k~-~~it~~~~~~----~~~~~~~~~~~lGrrVIVDD~ 226 (364) .-..+.++.+|..+|-++.-- =.-++|-+..|..|++. ..+....... +.++ +.+..+|-.|+++.+ T Consensus 142 ----~~~a~~~~a~a~~~L~~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~~~~~~~~alr~~~--i~G~~~GFdi~~Sn~ 215 (423) T protein:vir:10 142 ----PIKKWSDVAQTASFLKDLGINSGENYAVMDPWAAQRLADAQSGLHVSEQLVRTAWENAQ--ISGNFGGIRALMSNG 215 (423) T ss_pred ----ccccHHHHHHHHHHHhhccCCcCCCEEEeCHHHHHHHhhhhhhhccccccchHHHHhcc--cceeecceEEEEecC Confidence 112356788898888664321 14569999999999742 2233222221 1222 223345899999999 Q ss_pred CCC-CCCceEEEEEecceeEEecCCC------Ccc----eeeccCCCceeeeEEeeEEEEeeeeeeeecccccccccCCC Q lcl|NC_016566. 227 AAD-AMGAGKMLGLVPGAVAVTTNGL------DML----AQEKGGNENIERWWQGEFDFNVAVKGYRLKASARTPVEGVR 295 (364) Q Consensus 227 ~p~-~~~~Yttylfg~GAi~~~~~~~------~~~----~~~~~g~e~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~ 295 (364) +|. ..+.+.-..-..|+..+..... ... .....|--..+.....-..|.+||.=.. .. ++ +. T Consensus 216 vp~~T~g~~~ga~~~~~~~~vt~a~~~~~~~~~~~~~~~T~s~~g~l~~GD~~t~aGv~~v~~~tk~----~l--~~-~~ 288 (423) T protein:vir:10 216 LASRTQGAFGGKLTVKGTPEVNYDSVKDSYAFTATLTGATASKKGFLKVGDQLQFDDTHWLNQQSKQ----TL--YN-GA 288 (423) T ss_pred CcccccccccceeeeeeeeEEEecccccccccccceeeccceeceeEEecceEeecceeeecccccc----ee--ec-cc Confidence 994 3443332223344444322111 000 0001110000000000011223432210 00 00 00 Q ss_pred CcChhhhcCCccceeecCcCc-CcceEEEEecCc-cccccccccccccccccccchh--hcceeEEEeeeecC Q lcl|NC_016566. 296 SFKLSDITDKANWELDQGQVD-NAPATVQDVGSD-SDTKGRRRTQTAQAVPTRNIKE--TAGVLVTLTATTAS 364 (364) Q Consensus 296 SPT~aeLat~~NW~rV~~s~K-~~pgv~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~ 364 (364) .|..-+ ...+.+..- ...++-|++.+- ....... |-.|++- ..|..+|.+.+..+ T Consensus 289 ~~~~~~------~~V~~~~~~~a~~~~tv~i~p~~~~~~~~~--------~~~~V~a~~a~~~~vT~~~~~~~ 347 (423) T protein:vir:10 289 SALSFT------ATVMEDANAHSSGDVTVKISGVPIFDAGYP--------QYNAVDRLLAEGDTVSVIGTSKQ 347 (423) T ss_pred CCcceE------EEEEecccccccCceEEEeccccccccCcc--------cccceeccccCCceeEEeeccCC Confidence 000000 000000000 000000000000 0000000 0111111 13445555444333 No 129 >protein:vir:6324 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:132 # MgeName: phiKMV # Cross-refs: genbank:acc:NP_877471;genbank:gi:33300843;uniprot:Q7Y2D3;genbank:GeneID:1482613 Probab=28.44 E-value=1.7 Score=19.25 Aligned_cols=267 Identities=10% Similarity=-0.074 Sum_probs=106.6 Q ss_pred CCccccchhhhhhhhhhhHHHHHHHhhhhcceeEeccCcccCceeeeehhhhhcccccccccccCCCccccchhhhccc- Q lcl|NC_016566. 1 MSLTVFQRKLVTAVTQMIPDNLNVFNAAANGAVVLGTGEVLKDVVEKMSVGLIANLVTDRNAYAPVGTPATAKVLARML- 79 (364) Q Consensus 1 fd~~vfn~~~~~~~~e~i~q~~~~fn~as~gAivl~~~~~~Gdf~~~~~f~~i~g~~~~~d~~~~~~~~~T~~kit~~~- 79 (364) | +++|+-+++++|-+.-- -+++++..+ ...|+-.+.|+-++..= ....+ +.....+.+...+ T Consensus 22 ~-le~f~geV~~af~~~s~-~~~~~~~rt---------i~~g~s~~~~~iG~~~~----~~~~p--G~~l~~~~~~~~k~ 84 (335) T protein:vir:63 22 H-LEEHLGIVDKHFAYTSK-FAPLMNIRD---------LRGSNVVRLDRLGNVEA----KGRRA--GEELERSRVVNDKW 84 (335) T ss_pred e-hhhhhhhHHHHHHhhhh-hccccceee---------eccceeEEEeeeeeeee----ecccC--CcCcCCCCccccce Confidence 4 78888888888765322 233333222 23465555555444321 01010 1111111222222 Q ss_pred eeeEE--eccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcc-----------cccceee Q lcl|NC_016566. 80 TNSVN--LSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIES-----------NAAANYT 146 (364) Q Consensus 80 ~vaVk--l~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~-----------na~~v~d 146 (364) .+-|- |..+.--...++.+-.+.-+.+ ++.++|..+|+.- +|..++.+.+++-.. ....... T Consensus 85 ~itVD~ll~a~~~I~dlDe~~~~yDvRse--~s~e~G~aLA~~~---D~~~~~~i~~aa~~~a~~~~~~~~~~G~~~~~~ 159 (335) T protein:vir:63 85 NLTVDTLLYLRHQFDHQDEWTQSFDMRKE--VAELDGQELARKF---DQACLIQVIKAAAMDAPVDLEDAFSPGVLEKLD 159 (335) T ss_pred EEEecceeechhhhhhHHHHhcCchhHHH--HHHHHHHHHHHHH---HHHHHHHHHhhccccCccccCCCcCCCcceeee Confidence 22221 2222111123333222222222 3344444444433 344444444443321 1011111 Q ss_pred cccccCcccccccccH--HHHHHHHHHhcccc--cCe---eEEEEchHHHHHHHHhhcccccccc-------ccccccee Q lcl|NC_016566. 147 QPARVDGVGGRTFPTL--ADFPLAASKFGDQA--ALI---KSWFMDGVTWANFIAYQALPSAEQV-------FAIGDLQV 212 (364) Q Consensus 147 is~~t~~~~~~~~~s~--~~l~~A~~~lGD~~--~~l---~~ivMHS~v~~~L~k~~~it~~~~~-------~~~~~~~~ 212 (364) +++. ...+.+.. ..+-+|.++|-++. +.- ..++|...+|..|++...+.|.+-. +....+.. T Consensus 160 ~tg~----~~~~~~~~l~~a~~~a~~~L~e~dVP~~~~~dr~~vv~P~~y~~Ll~~~~l~n~~~~~s~~~~~~~~g~v~~ 235 (335) T protein:vir:63 160 LTGL----TAKQAADKIVRMHRRVVETFIDRDLGDAVYSEGLTPMSPRVFSLLLEHDKLMNVEYQATGATNDYVKSRVAI 235 (335) T ss_pred eccC----cccccHHHHHHHHHHHHHHHHhccCCCcccCceEEEeChHHHHHHhccccccccccccccccccccCceeEE Confidence 2211 11112222 23557888887665 222 6789999999999876555543211 11122232 Q ss_pred ecccCCcEEEEeCCCCCCCC-----------------ceEEEEEecceeEEecCCC-Ccceee-ccCCCceeeeEEeeEE Q lcl|NC_016566. 213 MGDGLGRRFIISDAAADAMG-----------------AGKMLGLVPGAVAVTTNGL-DMLAQE-KGGNENIERWWQGEFD 273 (364) Q Consensus 213 ~~~~lGrrVIVDD~~p~~~~-----------------~Yttylfg~GAi~~~~~~~-~~~~~~-~~g~e~~~~~~~~~~~ 273 (364) + +|-+|+.+=.+|.... +...++|-+.|++..+-.+ ...... ..+..+....+++-.+ T Consensus 236 v---~Gv~V~~sn~lP~~~~t~~~lg~a~n~~~~d~~~~~~~~~~~~Al~t~~~~~vt~e~~~~~~~~~~~i~~~~a~G~ 312 (335) T protein:vir:63 236 L---NGVKVLETPRFATKAIAAHPLGRHFNVSAEESERQIALFLPSKTLITAQVAPVQAKLWEDNEKFSWVLDTFQMYNI 312 (335) T ss_pred e---eceEEEeeccCCCCCcccccccccCCccccccceeEEEEEecceEEEEEEeecccceeeccchhhHHhHHHHHcCC Confidence 3 4888888888884211 2345566777777665432 111111 1122222222221111 Q ss_pred EEeeeeeeeecccccccccCCCCcCh Q lcl|NC_016566. 274 FNVAVKGYRLKASARTPVEGVRSFKL 299 (364) Q Consensus 274 f~lhp~G~sw~~~~~~~~~gg~SPT~ 299 (364) =.+.|...-=-+.+. -|.-+-|- T Consensus 313 g~lRPe~a~~i~~tg---~~~~~~~~ 335 (335) T protein:vir:63 313 GARRPDTAGAIELKG---IGAFDITA 335 (335) T ss_pred cccccceEEEEEEcC---CCceeecC Confidence 112222211111100 00001111 No 130 >protein:vir:93696 Length: 364 # NCBI annotation: Bcep22gp55 # Family: family:all:974 # MgeID: mge:1470 # MgeName: Bcep22 # Cross-refs: genbank:acc:NP_944284;genbank:gi:38640361;genbank:GeneID:2658350 Probab=23.47 E-value=2.3 Score=18.60 Aligned_cols=286 Identities=13% Similarity=0.030 Sum_probs=114.6 Q ss_pred CCcccc---chhhh--------hhhhhhhHHHHHHH-hhhhcceeEeccC--cccCceeeeehhhhhcccccccccccCC Q lcl|NC_016566. 1 MSLTVF---QRKLV--------TAVTQMIPDNLNVF-NAAANGAVVLGTG--EVLKDVVEKMSVGLIANLVTDRNAYAPV 66 (364) Q Consensus 1 fd~~vf---n~~~~--------~~~~e~i~q~~~~f-n~as~gAivl~~~--~~~Gdf~~~~~f~~i~g~~~~~d~~~~~ 66 (364) |--..| +|++. ....+..+ ...+| -..++..|+..++ .-.||-+....-..|.|. -+.+ T Consensus 1 Ma~T~~~~~~p~a~~~ws~~l~~~~~~~s~-f~~~l~G~~~~~~I~~~~dL~k~~Gd~v~f~L~~~L~g~----gv~G-- 73 (364) T protein:vir:93 1 MSQTVIPFGDPKAVKRWSADLAVDVRKKSY-FEQRFIGTSENAVIQRKTELESDAGDRITFDLSVHLRGK----PTYG-- 73 (364) T ss_pred CceeccCcCCHHHHHHHHHHHHHHHHhhCc-cccccccCCCCCcEEEeeecCCCCCceEEeeeeeecccC----Cccc-- Confidence 111111 11211 11111111 01111 1244556665543 345776666655566552 1111 Q ss_pred Cccccchhhhccc--eeeEEeccccCchhcCHHHHHhhcCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhccccc-- Q lcl|NC_016566. 67 GTPATAKVLARML--TNSVNLSAKVGPVAITKAMMAKIETNVNSVAAEIAAQATQAIMLHYLKAGIGAGKAAIESNAA-- 142 (364) Q Consensus 67 ~~~~T~~kit~~~--~vaVkl~~~~gpv~~t~~~~~~~g~dp~~~~~~Ig~~va~yw~~~~qk~lla~L~Gv~~~na~-- 142 (364) +..++ ++-...+ .-.|++.....||...- .|. ..+.|-++-..--+.+++||.+.+...++--|.|+-+.+.+ T Consensus 74 d~~le-Gnee~L~~~~~~i~idq~r~~V~~~g-~ms-~qRt~~dlr~~ar~~L~~w~~~~~d~~~f~~laGarg~~~~~~ 150 (364) T protein:vir:93 74 DARVE-GKEESLRFYQDEVRIDQVRHSVSAGG-RMS-RKRTVHNIRRIARDRLGDYFYKFTDELLFIYLSGARGINLDFI 150 (364) T ss_pred Cceee-ccccceeEEeeEEEEeeccccccccC-chh-hhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccccccc Confidence 11121 1111111 12355566666776432 222 36677777666667888899888888888888885433211 Q ss_pred --------ceeeccc-----------ccC--cccccccccHHHHHHHHHHhcccc----------------cCeeEEEEc Q lcl|NC_016566. 143 --------ANYTQPA-----------RVD--GVGGRTFPTLADFPLAASKFGDQA----------------ALIKSWFMD 185 (364) Q Consensus 143 --------~v~dis~-----------~t~--~~~~~~~~s~~~l~~A~~~lGD~~----------------~~l~~ivMH 185 (364) ++.++.+ .+. .-...-.+++..+-+|..++-.+. ++.=.++|| T Consensus 151 ~~~~~~~~~~N~v~aPt~~r~~~~~~at~~~~l~stD~~sl~~id~a~~~a~~~~~~~~~~~~~~Pv~~~g~~~yV~~l~ 230 (364) T protein:vir:93 151 ETPDFTGYAGNPLDAPDVDHLLYGGVATSKASLAATDIMAPLVIEKAVEKAAMMQAENPDVANMVPVSIDGDDHYVCVMS 230 (364) T ss_pred cccCcccccccccCCCCCCcEEeccccCchhhccccccccHHHHHHHHHHHHHhCCCCCCCcccceeEecCcceeEEEEc Confidence 1111110 011 111122355555555544321111 123388999 Q ss_pred hHHHHHHHH--------hhccccc-----ccccccccceeecccC---CcEEEEeCCCCCCCCc--eEEEEEecceeEEe Q lcl|NC_016566. 186 GVTWANFIA--------YQALPSA-----EQVFAIGDLQVMGDGL---GRRFIISDAAADAMGA--GKMLGLVPGAVAVT 247 (364) Q Consensus 186 S~v~~~L~k--------~~~it~~-----~~~~~~~~~~~~~~~l---GrrVIVDD~~p~~~~~--Yttylfg~GAi~~~ 247 (364) +..+++|.. .|...-. .-+|+ +++..|++.+ =.+||-....+....+ -..+|||.-|.++. T Consensus 231 p~q~~~Lr~~t~~~w~d~qk~A~~~~g~~nPlF~-G~~gm~ngvii~~~~~vi~~~~~~~~~~v~~~ralllGaQA~~~a 309 (364) T protein:vir:93 231 EYQATDMRTAAGGTWIDFQKAAAAAEGRNNPIFK-GGLGMINNVVLHKHRNVIRFNDYGAGANVEAARALFMGRQAGVIA 309 (364) T ss_pred chhhhhhhhcCCHHHHHHHHHhhhcccccCCcee-cCeeeEcCeEEeccCCcccccccccCccccchhhheecceeeEEE Confidence 999998851 1111000 11333 3444454210 1122222212222223 34689998775554 Q ss_pred cCC----CCcceeeccCCC-ceeeeEEeeEEEEeeeeeeeecccccccccCCCCcChhhhcCCccceeecCcCcCcceEE Q lcl|NC_016566. 248 TNG----LDMLAQEKGGNE-NIERWWQGEFDFNVAVKGYRLKASARTPVEGVRSFKLSDITDKANWELDQGQVDNAPATV 322 (364) Q Consensus 248 ~~~----~~~~~~~~~g~e-~~~~~~~~~~~f~lhp~G~sw~~~~~~~~~gg~SPT~aeLat~~NW~rV~~s~K~~pgv~ 322 (364) .+. +..-.|+.-+-+ ..++... +++...-.+|... + =||+ T Consensus 310 ~g~~~g~~~~w~Ee~~D~gn~~~i~~~----~i~G~kK~rF~~~------------------------------D-fGvi 354 (364) T protein:vir:93 310 YGTANGLRFDWEETVKDYGNEPAIAAG----FIAGMKKARFNNK------------------------------D-FGVI 354 (364) T ss_pred eecCCCCCceeeecccCCCCchhhhhh----hHhhhhhcccCCc------------------------------c-ceEE Confidence 322 111233332222 2222211 1233333333211 1 1333 Q ss_pred EEecCcccccccccc Q lcl|NC_016566. 323 QDVGSDSDTKGRRRT 337 (364) Q Consensus 323 ~~~~~~~~~~~~~~~ 337 (364) ++ |+.-++-+ T Consensus 355 ~i-----dtaa~~~~ 364 (364) T protein:vir:93 355 SI-----DTAAKKHS 364 (364) T ss_pred Ee-----cccccccC Confidence 33 11111111 Done!