Query lcl|NC_011142.1_cdsid_YP_002128463.1 [gene=phiPLPE_29] [protein=gp29 major capsid protein] [protein_id=YP_002128463.1] [location=15568..16599]
Match_columns 343
No_of_seqs 124 out of 181
Neff 7.8
Searched_HMMs 1612
Date Thu Nov 7 13:38:00 2013
Command /home/guerois/workspace/virfam/python/lib/hhsearch//hhsearch2 -i .//seq/seq_29 -d /home/guerois/workspace/virfam/python/profile_database/capsid_neck_tail.hhm -glob -cpu 7 -o .//seq/HHR/seq_29_vs_rec_db.hhr

No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 protein:vir:79642 Length: 329 100.0 2.2E-93 1.4E-96 528.5 30.4 315 1-343 8-326 (329)
2 protein:vir:104342 Length: 314 100.0 4.1E-93 2.6E-96 527.1 29.4 311 1-343 1-311 (314)
3 protein:vir:107687 Length: 319 100.0 2.1E-91 1.3E-94 517.7 30.4 315 1-343 2-319 (319)
4 protein:vir:103285 Length: 296 100.0 7E-91 4.3E-94 514.9 29.3 292 34-343 1-293 (296)
5 protein:vir:80068 Length: 301 100.0 1.6E-89 1E-92 507.4 29.9 294 34-343 1-301 (301)
6 protein:vir:5255 Length: 304 # 100.0 5E-88 3.1E-91 499.2 28.4 295 39-342 1-304 (304)
7 protein:vir:94070 Length: 339 100.0 2.4E-84 1.5E-87 479.0 27.2 322 1-343 1-339 (339)
8 protein:vir:3643 Length: 336 # 100.0 2E-78 1.2E-81 446.5 24.5 319 7-343 1-336 (336)
9 protein:vir:78558 Length: 336 100.0 2.5E-78 1.5E-81 446.0 23.6 316 7-343 1-336 (336)
10 protein:vir:101557 Length: 336 100.0 6.5E-78 4E-81 443.8 25.3 319 7-343 1-336 (336)
11 protein:vir:106734 Length: 336 100.0 3.7E-78 2.3E-81 445.1 23.3 316 7-343 1-336 (336)
12 protein:vir:99576 Length: 388 100.0 1.6E-77 1E-80 441.6 25.8 326 1-343 21-388 (388)
13 protein:vir:107732 Length: 379 100.0 3.9E-76 2.4E-79 434.0 27.1 325 1-343 11-379 (379)
14 protein:vir:96079 Length: 382 100.0 2.3E-75 1.4E-78 429.8 26.2 325 1-343 19-382 (382)
15 protein:vir:105778 Length: 358 99.9 4.5E-25 2.8E-28 154.0 10.7 312 6-343 1-357 (358)
16 protein:vir:108211 Length: 318 99.1 2.8E-12 1.7E-15 84.0 11.6 280 29-343 1-315 (318)
17 protein:vir:9574 Length: 300 # 99.0 8.4E-11 5.2E-14 75.8 16.6 282 34-343 1-298 (300)
18 protein:vir:1433 Length: 435 # 99.0 2.4E-10 1.5E-13 73.4 18.9 317 1-343 91-431 (435)
19 protein:vir:8187 Length: 311 # 98.9 1.9E-10 1.2E-13 73.9 16.0 292 34-343 1-308 (311)
20 protein:vir:80376 Length: 435 98.9 3E-10 1.9E-13 72.8 16.9 317 1-343 91-431 (435)
21 protein:vir:99920 Length: 311 98.8 5.7E-10 3.6E-13 71.3 16.4 296 26-343 1-310 (311)
22 protein:vir:41 Length: 299 # N 98.8 8.8E-10 5.5E-13 70.2 16.9 278 29-343 1-296 (299)
23 protein:vir:9759 Length: 303 # 98.8 2E-09 1.2E-12 68.3 18.5 284 34-343 1-301 (303)
24 protein:vir:2504 Length: 305 # 98.8 2E-09 1.2E-12 68.3 18.0 274 26-343 1-296 (305)
25 protein:vir:5739 Length: 366 # 98.8 3.3E-09 2.1E-12 67.1 19.1 315 1-343 21-364 (366)
26 protein:vir:8420 Length: 477 # 98.8 3.4E-10 2.1E-13 72.5 13.6 321 1-343 90-469 (477)
27 protein:vir:80684 Length: 315 98.8 1.5E-09 9.3E-13 69.0 16.5 286 34-343 1-304 (315)
28 protein:vir:105905 Length: 304 98.8 3.8E-09 2.4E-12 66.7 18.1 284 21-343 1-303 (304)
29 protein:vir:94142 Length: 304 98.8 3.8E-09 2.4E-12 66.7 18.1 284 21-343 1-303 (304)
30 protein:vir:78223 Length: 333 98.7 3.2E-09 2E-12 67.2 17.4 304 13-343 1-330 (333)
31 protein:vir:1638 Length: 298 # 98.7 3.1E-09 1.9E-12 67.2 16.6 280 34-343 1-297 (298)
32 protein:vir:7771 Length: 330 # 98.7 5.6E-09 3.5E-12 65.8 18.0 291 21-343 1-321 (330)
33 protein:vir:105038 Length: 428 98.7 8.3E-09 5.2E-12 64.9 18.7 316 1-343 74-426 (428)
34 protein:vir:94673 Length: 419 98.7 1E-08 6.4E-12 64.4 18.9 311 1-343 70-415 (419)
35 protein:vir:94771 Length: 298 98.7 3.5E-09 2.2E-12 66.9 16.3 280 34-343 1-297 (298)
36 protein:vir:104085 Length: 320 98.6 1.3E-08 8E-12 63.8 17.9 290 1-343 1-315 (320)
37 protein:vir:95763 Length: 297 98.6 4.1E-08 2.5E-11 61.1 19.0 278 21-343 1-294 (297)
38 protein:vir:104256 Length: 458 98.6 2.8E-08 1.7E-11 62.0 18.0 316 1-343 125-456 (458)
39 protein:vir:78830 Length: 324 98.6 1.8E-08 1.1E-11 63.0 16.9 293 1-343 1-313 (324)
40 protein:vir:96392 Length: 324 98.6 1.8E-08 1.1E-11 63.0 16.9 293 1-343 1-313 (324)
41 protein:vir:103955 Length: 324 98.5 3E-08 1.8E-11 61.9 17.3 291 1-343 4-313 (324)
42 protein:vir:7855 Length: 497 # 98.5 2.9E-08 1.8E-11 61.9 16.8 316 1-343 98-491 (497)
43 protein:vir:101650 Length: 497 98.5 2.9E-08 1.8E-11 61.9 16.8 316 1-343 98-491 (497)
44 protein:vir:97148 Length: 324 98.5 4.4E-08 2.7E-11 60.9 17.8 295 1-343 1-313 (324)
45 protein:vir:99749 Length: 324 98.5 4.2E-08 2.6E-11 61.0 17.5 291 1-343 4-313 (324)
46 protein:vir:9309 Length: 324 # 98.5 5.2E-08 3.2E-11 60.5 17.7 293 1-343 1-313 (324)
47 protein:vir:96223 Length: 324 98.5 6.9E-08 4.3E-11 59.9 18.0 291 1-343 4-313 (324)
48 protein:vir:78523 Length: 338 98.5 7.3E-08 4.5E-11 59.7 17.3 302 13-343 1-333 (338)
49 protein:vir:100135 Length: 418 98.4 1.3E-07 7.8E-11 58.4 17.7 302 1-343 87-413 (418)
50 protein:vir:2430 Length: 318 # 98.4 1.2E-07 7.2E-11 58.6 17.4 287 2-343 1-311 (318)
51 protein:vir:81227 Length: 413 98.4 1.5E-07 9.3E-11 58.0 16.8 303 1-343 80-408 (413)
52 protein:vir:4226 Length: 326 # 98.3 1.6E-07 9.6E-11 57.9 16.7 297 1-343 1-321 (326)
53 protein:vir:4700 Length: 415 # 98.3 1.5E-07 9.5E-11 57.9 15.8 313 1-343 71-402 (415)
54 protein:vir:4600 Length: 415 # 98.3 1.5E-07 9.5E-11 57.9 15.8 313 1-343 71-402 (415)
55 protein:vir:191 Length: 385 # 98.3 1.5E-07 9.4E-11 58.0 15.6 302 1-343 64-382 (385)
56 protein:vir:1886 Length: 385 # 98.3 1.5E-07 9.4E-11 58.0 15.6 302 1-343 64-382 (385)
57 protein:vir:100247 Length: 425 98.2 4E-07 2.5E-10 55.7 15.9 311 1-343 100-422 (425)
58 protein:vir:8102 Length: 543 # 98.2 6E-07 3.7E-10 54.7 16.6 308 1-343 219-540 (543)
59 protein:vir:4456 Length: 401 # 98.2 2E-07 1.3E-10 57.3 13.6 315 1-343 74-399 (401)
60 protein:vir:4339 Length: 395 # 98.2 1E-06 6.3E-10 53.5 17.3 310 1-343 76-393 (395)
61 protein:vir:10364 Length: 390 98.1 1.9E-06 1.2E-09 51.9 18.1 302 1-343 78-390 (390)
62 protein:vir:9410 Length: 415 # 98.1 8.1E-07 5E-10 54.0 15.3 306 1-343 84-402 (415)
63 protein:vir:93616 Length: 645 98.0 2.7E-06 1.7E-09 51.1 17.4 306 1-343 286-637 (645)
64 protein:vir:97053 Length: 390 98.0 3.1E-06 1.9E-09 50.8 17.5 300 1-343 72-390 (390)
65 protein:vir:98339 Length: 415 98.0 1.8E-06 1.1E-09 52.1 16.1 307 1-343 78-402 (415)
66 protein:vir:81100 Length: 415 98.0 1.8E-06 1.1E-09 52.1 16.1 307 1-343 78-402 (415)
67 protein:vir:79987 Length: 415 98.0 1.8E-06 1.1E-09 52.1 16.1 307 1-343 78-402 (415)
68 protein:vir:3158 Length: 321 # 98.0 3.6E-06 2.2E-09 50.4 17.7 295 1-343 1-309 (321)
69 protein:vir:485 Length: 407 # 98.0 3.6E-06 2.2E-09 50.5 17.5 315 1-343 65-398 (407)
70 protein:vir:2344 Length: 397 # 97.9 3.7E-06 2.3E-09 50.4 16.8 282 1-343 1-304 (397)
71 protein:vir:81070 Length: 390 97.9 5.3E-06 3.3E-09 49.5 17.4 303 1-343 78-390 (390)
72 protein:vir:96762 Length: 632 97.9 1.8E-06 1.1E-09 52.0 14.8 304 1-343 288-631 (632)
73 protein:vir:102119 Length: 404 97.9 2.7E-06 1.7E-09 51.1 15.6 312 1-343 60-398 (404)
74 protein:vir:1328 Length: 392 # 97.9 4.3E-06 2.7E-09 50.0 15.8 307 1-343 73-389 (392)
75 protein:vir:6212 Length: 434 # 97.9 2.8E-06 1.7E-09 51.0 14.7 309 1-343 95-427 (434)
76 protein:vir:4197 Length: 314 # 97.8 2E-05 1.2E-08 46.4 19.1 295 1-343 1-310 (314)
77 protein:vir:8843 Length: 317 # 97.8 2.2E-06 1.3E-09 51.6 12.9 287 29-343 1-313 (317)
78 protein:vir:4159 Length: 315 # 97.7 2E-05 1.3E-08 46.3 17.9 301 5-343 1-315 (315)
79 protein:vir:3991 Length: 404 # 97.6 1.9E-05 1.2E-08 46.5 16.0 302 1-343 75-391 (404)
80 protein:vir:4511 Length: 409 # 97.6 3.2E-05 2E-08 45.3 16.7 305 1-343 85-404 (409)
81 protein:vir:3613 Length: 272 # 97.5 2.9E-05 1.8E-08 45.5 16.1 267 26-343 1-272 (272)
82 protein:vir:101607 Length: 379 97.5 4.1E-05 2.6E-08 44.6 16.7 292 1-343 75-379 (379)
83 protein:vir:4953 Length: 397 # 97.5 3.2E-05 2E-08 45.2 16.2 300 1-343 73-383 (397)
84 protein:vir:6242 Length: 390 # 97.5 2.3E-05 1.4E-08 46.1 15.0 301 1-343 80-387 (390)
85 protein:vir:107593 Length: 392 97.4 5E-05 3.1E-08 44.2 15.8 300 1-343 64-382 (392)
86 protein:vir:105004 Length: 392 97.4 5E-05 3.1E-08 44.2 15.8 300 1-343 64-382 (392)
87 protein:vir:102082 Length: 392 97.4 5E-05 3.1E-08 44.2 15.8 300 1-343 64-382 (392)
88 protein:vir:102873 Length: 392 97.4 5E-05 3.1E-08 44.2 15.8 300 1-343 64-382 (392)
89 protein:vir:80930 Length: 278 97.4 5.7E-05 3.6E-08 43.8 16.1 271 26-343 1-275 (278)
90 protein:vir:7409 Length: 408 # 97.4 4.8E-05 3E-08 44.3 15.4 301 1-343 76-391 (408)
91 protein:vir:4856 Length: 293 # 97.4 8E-05 5E-08 43.0 16.4 272 22-343 1-279 (293)
92 protein:vir:9820 Length: 272 # 97.3 9.2E-05 5.7E-08 42.7 17.4 263 26-343 1-267 (272)
93 protein:vir:3033 Length: 272 # 97.3 9.2E-05 5.7E-08 42.7 17.4 263 26-343 1-267 (272)
94 protein:vir:102655 Length: 322 97.3 8.8E-05 5.5E-08 42.8 16.4 301 21-343 1-319 (322)
95 protein:vir:81160 Length: 371 97.3 6.8E-05 4.2E-08 43.4 15.8 293 1-343 60-368 (371)
96 protein:vir:1025 Length: 408 # 97.3 9.2E-05 5.7E-08 42.7 16.3 301 1-343 76-391 (408)
97 protein:vir:1268 Length: 397 # 97.3 5.6E-05 3.5E-08 43.9 14.8 298 1-343 77-395 (397)
98 protein:vir:4997 Length: 397 # 97.2 9.3E-05 5.8E-08 42.7 15.3 300 1-343 73-383 (397)
99 protein:vir:3845 Length: 395 # 97.2 8.3E-05 5.1E-08 43.0 14.8 294 1-343 81-381 (395)
100 protein:vir:3870 Length: 400 # 97.1 3.4E-05 2.1E-08 45.1 12.2 293 1-343 103-397 (400)
101 protein:vir:100172 Length: 394 97.1 0.00018 1.1E-07 41.1 15.7 294 1-343 75-382 (394)
102 protein:vir:4830 Length: 397 # 97.1 0.00018 1.1E-07 41.1 15.6 299 1-343 73-385 (397)
103 protein:vir:93742 Length: 274 97.0 0.0002 1.2E-07 40.9 17.2 264 26-343 1-268 (274)
104 protein:vir:97255 Length: 310 97.0 0.00023 1.4E-07 40.5 15.6 285 31-343 1-308 (310)
105 protein:vir:1383 Length: 421 # 96.8 0.00029 1.8E-07 40.0 15.0 294 1-343 79-392 (421)
106 protein:vir:78640 Length: 352 96.7 0.00016 9.6E-08 41.5 12.8 291 1-343 41-344 (352)
107 protein:vir:96123 Length: 274 96.7 0.00039 2.4E-07 39.3 18.0 264 26-343 1-268 (274)
108 protein:vir:94494 Length: 274 96.7 0.00043 2.7E-07 39.1 17.2 264 26-343 1-268 (274)
109 protein:vir:97433 Length: 274 96.7 0.00043 2.7E-07 39.1 17.2 264 26-343 1-268 (274)
110 protein:vir:739 Length: 231 # 96.5 0.00043 2.7E-07 39.1 13.7 225 73-343 1-231 (231)
111 protein:vir:105334 Length: 276 96.4 0.0007 4.4E-07 37.9 16.0 265 26-343 1-268 (276)
112 protein:vir:4092 Length: 390 # 96.3 0.00082 5.1E-07 37.5 16.1 301 1-343 47-368 (390)
113 protein:vir:96833 Length: 275 96.2 0.00094 5.8E-07 37.2 15.7 265 24-343 1-275 (275)
114 protein:vir:80128 Length: 466 96.1 0.001 6.2E-07 37.0 17.4 313 1-343 102-446 (466)
115 protein:vir:94933 Length: 330 95.8 0.0013 8.3E-07 36.3 13.4 307 1-343 1-327 (330)
116 protein:vir:95376 Length: 425 95.7 0.0015 9.6E-07 36.0 14.2 305 1-343 101-419 (425)
117 protein:vir:9704 Length: 394 # 95.6 0.0014 8.8E-07 36.2 12.6 289 1-343 85-388 (394)
118 protein:vir:99888 Length: 309 95.5 0.002 1.2E-06 35.4 13.0 270 29-343 1-301 (309)
119 protein:vir:93881 Length: 387 95.4 0.0021 1.3E-06 35.3 13.7 291 1-343 81-379 (387)
120 protein:vir:9361 Length: 402 # 95.2 0.0022 1.3E-06 35.2 12.5 290 1-343 96-396 (402)
121 protein:vir:1239 Length: 274 # 95.2 0.0025 1.5E-06 34.9 16.8 265 26-343 1-268 (274)
122 protein:vir:962 Length: 397 # 95.1 0.0016 9.9E-07 35.9 11.3 292 1-343 87-395 (397)
123 protein:vir:96262 Length: 274 95.0 0.0031 1.9E-06 34.4 16.6 264 26-343 1-268 (274)
124 protein:vir:95898 Length: 274 95.0 0.0031 1.9E-06 34.4 16.6 264 26-343 1-268 (274)
125 protein:vir:100884 Length: 389 94.9 0.0032 2E-06 34.3 17.2 288 1-343 83-382 (389)
126 protein:vir:9643 Length: 377 # 94.8 0.0034 2.1E-06 34.1 17.6 302 1-343 38-377 (377)
127 protein:vir:9509 Length: 381 # 94.8 0.0036 2.2E-06 34.0 19.2 302 1-343 25-365 (381)
128 protein:vir:101291 Length: 381 94.8 0.0036 2.2E-06 34.0 19.2 302 1-343 25-365 (381)
129 protein:vir:94424 Length: 387 94.2 0.0037 2.3E-06 33.9 11.2 291 1-343 81-379 (387)
130 protein:vir:96978 Length: 387 94.2 0.0037 2.3E-06 33.9 11.2 291 1-343 81-379 (387)
131 protein:vir:2685 Length: 387 # 94.2 0.0037 2.3E-06 33.9 11.2 291 1-343 81-379 (387)
132 protein:vir:100632 Length: 381 93.9 0.0061 3.8E-06 32.7 16.0 302 1-343 36-370 (381)
133 protein:vir:95107 Length: 270 93.5 0.0075 4.6E-06 32.2 14.9 260 21-343 1-263 (270)
134 protein:vir:78739 Length: 332 91.6 0.015 9.5E-06 30.5 13.3 291 23-343 1-332 (332)
135 protein:vir:102823 Length: 470 90.8 0.0015 9.5E-07 36.0 4.5 293 1-343 1-345 (470)
136 protein:vir:97031 Length: 402 90.2 0.022 1.4E-05 29.7 12.9 300 21-343 1-342 (402)
137 protein:vir:98480 Length: 348 89.9 0.023 1.5E-05 29.5 12.0 272 29-343 1-291 (348)
138 protein:vir:2736 Length: 348 # 89.0 0.029 1.8E-05 29.0 14.2 277 26-343 1-330 (348)
139 protein:vir:1084 Length: 437 # 88.3 0.033 2.1E-05 28.7 14.1 297 1-343 120-425 (437)
140 protein:vir:78350 Length: 383 87.8 0.036 2.3E-05 28.5 16.1 301 1-343 43-372 (383)
141 protein:vir:95963 Length: 395 87.3 0.04 2.5E-05 28.3 18.6 302 1-343 45-373 (395)
142 protein:vir:107882 Length: 307 86.8 0.043 2.7E-05 28.1 13.0 271 26-343 1-300 (307)
143 protein:vir:98635 Length: 377 86.2 0.047 2.9E-05 27.8 17.1 302 1-342 51-377 (377)
144 protein:vir:94576 Length: 347 85.9 0.05 3.1E-05 27.7 14.3 308 10-343 1-347 (347)
145 protein:vir:105645 Length: 400 85.6 0.051 3.2E-05 27.6 13.0 302 1-343 1-331 (400)
146 protein:vir:4902 Length: 348 # 85.5 0.053 3.3E-05 27.6 14.9 280 26-343 1-330 (348)
147 protein:vir:99675 Length: 324 85.2 0.055 3.4E-05 27.5 11.5 260 69-343 1-301 (324)
148 protein:vir:79078 Length: 307 84.9 0.057 3.5E-05 27.4 10.9 274 26-343 1-300 (307)
149 protein:vir:94622 Length: 341 84.6 0.059 3.7E-05 27.3 18.7 296 21-343 1-337 (341)
150 protein:vir:6378 Length: 346 # 81.6 0.084 5.2E-05 26.5 18.1 287 36-343 1-346 (346)
151 protein:vir:8885 Length: 347 # 81.0 0.09 5.6E-05 26.3 15.1 310 10-343 1-345 (347)
152 protein:vir:79928 Length: 393 80.5 0.095 5.9E-05 26.2 10.6 311 1-343 28-376 (393)
153 protein:vir:96490 Length: 348 80.1 0.098 6.1E-05 26.1 16.2 279 26-343 1-330 (348)
154 protein:vir:6324 Length: 335 # 79.9 0.1 6.2E-05 26.1 14.9 296 1-343 1-328 (335)
155 protein:vir:94800 Length: 319 76.6 0.13 8.3E-05 25.4 15.6 287 1-343 1-292 (319)
156 protein:vir:97331 Length: 319 76.6 0.13 8.3E-05 25.4 15.6 287 1-343 1-292 (319)
157 protein:vir:80213 Length: 334 75.8 0.14 8.9E-05 25.2 15.8 303 10-343 1-332 (334)
158 protein:vir:96666 Length: 462 71.8 0.19 0.00012 24.5 9.6 294 1-343 1-310 (462)
159 protein:vir:99311 Length: 463 69.6 0.22 0.00014 24.2 9.1 284 1-343 3-299 (463)
160 protein:vir:95603 Length: 463 69.6 0.22 0.00014 24.2 9.1 284 1-343 3-299 (463)
161 protein:vir:80180 Length: 381 61.7 0.35 0.00022 23.1 17.7 282 10-343 1-297 (381)
162 protein:vir:7019 Length: 401 # 59.2 0.4 0.00025 22.8 11.1 301 1-343 1-335 (401)
163 protein:vir:78935 Length: 335 58.2 0.42 0.00026 22.7 15.1 299 1-343 1-328 (335)
164 protein:vir:7990 Length: 273 # 51.8 0.57 0.00036 21.9 16.9 264 29-343 1-271 (273)
165 protein:vir:1541 Length: 347 # 49.8 0.63 0.00039 21.7 16.0 305 10-343 1-345 (347)
166 protein:vir:10450 Length: 344 47.3 0.71 0.00044 21.4 12.6 310 10-343 1-342 (344)
167 protein:vir:103323 Length: 364 44.9 0.79 0.00049 21.1 16.8 299 21-343 1-337 (364)
168 protein:vir:97397 Length: 517 44.0 0.82 0.00051 21.0 12.0 303 1-343 168-512 (517)
169 protein:vir:106590 Length: 349 38.0 1.1 0.00068 20.4 15.8 280 29-343 1-334 (349)
170 protein:vir:102605 Length: 273 37.9 1.1 0.00068 20.4 19.4 264 29-343 1-271 (273)
171 protein:vir:105822 Length: 273 37.9 1.1 0.00068 20.4 19.4 264 29-343 1-271 (273)
172 protein:vir:8324 Length: 410 # 37.6 1.1 0.00069 20.3 6.3 292 1-343 89-410 (410)
173 protein:vir:3364 Length: 347 # 34.7 1.3 0.00079 20.0 14.9 309 10-343 1-345 (347)
174 protein:vir:94711 Length: 347 29.5 1.7 0.001 19.4 12.9 304 1-343 1-345 (347)
175 protein:vir:100057 Length: 375 25.9 2 0.0012 18.9 18.2 308 1-343 1-368 (375)
176 protein:vir:107120 Length: 329 23.6 2.3 0.0014 18.6 15.8 289 8-343 1-303 (329)

No 1
>protein:vir:79642 Length: 329 # NCBI annotation: HsbB # Family: family:all:463 # MgeID: mge:1872 # MgeName: TLS # Cross-refs:
genbank:acc:YP_001285525;genbank:gi:148734508;genbank:GeneID:5220000 Probab=100.00 E-value=2.2e-93 Score=528.53 Aligned_cols=315 Identities=18% Similarity=0.262 Sum_probs=287.5 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) =|.+|+..||..++. +++ -+.+.+|+++.++|+.+||++||++|||+++++++++++||+.++++||+ T Consensus 8 ~~~~~d~~~~~~~a~--------~~~----~~~~~~~~~~~~~f~~~ql~~id~~v~e~~~~~l~~~~~i~i~~~~~~~~ 75 (329) T protein:vir:79 8 KEMKYDEFEANVIAN--------HMQ----LRGAKNDASDMGIWTSQELHKIKAQAYEKEYPAGSALRVFPVTSELSDTD 75 (329) T ss_pred hhhccchhhhhhHhh--------hcc----cccceeccchhhHHHHHHHHHHHHHHHhhhhcccchhhhcccccCCCCce Confidence 244555555555444 222 23447788899999999999999999999999999999999999999999 Q ss_pred eEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhh Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQR 160 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 160 (343) ++++|+++|.+|++++|+++++|+|+++++++++.+|++.|+.+|+|+++||+++++.|+||+++|+.+|++++++++|+ T Consensus 76 ~~~t~~~~~~~G~a~~~~d~~~dip~vd~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 155 (329) T protein:vir:79 76 KTFEYQTFDKVGHAKIIADYTDDLSTVDALMTSEFGKVFRLGNAFLISIDEIKAGQRTGKSLSTRKANAAQNAHDQLVNH 155 (329) T ss_pred eEEEeeeeecceeeeeecCcccccceeecccceeEEEEEEEEEEEEecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhcc Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred eeeeeehhhcceeeeecCCccccccC----cCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhcccc Q lcl|NC_011142. 
161 VAYFGDTNRNMSGLLNNPNVTKTSAT----VNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLM 236 (343) Q Consensus 161 ~~f~G~~~~g~~GLlN~p~v~~~~~~----~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~ 236 (343) ++|||++++|++||||+||++....+ ++|++||++||++||++++++++.+++|++.|++|+|||++|.+|++++ T Consensus 156 i~f~G~~~~g~~GLlN~p~v~~~~~~~~~~~~w~~kt~~ei~~di~~~~~~l~~~s~g~~~p~~L~Lpp~~~~~L~~~~- 234 (329) T protein:vir:79 156 LVFKGSKPHKIISVFEHPNLTTINSAGWNNAAGTGKKPETAQDELEQAIEKIETLTNGQHRANMILIPPSMRKVLMVRM- 234 (329) T ss_pred EEEeecccccceeeecCCCccccccCCCCCccccccCHHHHHHHHHHHHHHHHHhcCceecccEEEecHHHHHHhhccc- Confidence 99999999999999999999875443 4699999999999999999999999999999999999999999998754 Q ss_pred CCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCc Q lcl|NC_011142. 237 TGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGL 316 (343) Q Consensus 237 ~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~ 316 (343) .++++|+++||++|||+..|.+.| ++.+++.+|+|||++|+++++++++++||||++||+|++++ T Consensus 235 -~~~~~tvl~~lk~~~~~l~I~~~~--------------el~~ag~~g~~~~v~y~~~~~~~~~~vp~~~~~l~~q~~~~ 299 (329) T protein:vir:79 235 -PETTMSYLDYFKQQNGGITIESIS--------------ELEDIDGAGTKAALVYEKDPMNMSIEIPEAFNMLTAQPKDL 299 (329) T ss_pred -CCCCccHHHHHHHhCCCcEEEEcc--------------cccccCCCCceEEEEEecCCceEEEecCcceeeeeceecCc Confidence 357999999999999988776544 23456677899999999999999999999999999999999 Q ss_pred eeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
317 GITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 317 ~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +|++||++|+|||+||||.||+|+||| T Consensus 300 ~~~v~~~~r~~Gv~i~~P~ai~~~dGI 326 (329) T protein:vir:79 300 HFKVPCTSKCTGLTIYRPLTLVLIKGL 326 (329) T ss_pred eEEEceeeeEEEEEEECcceeeeeeee Confidence 999999999999999999999999999 No 2 >protein:vir:104342 Length: 314 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1593 # MgeName: RTP # Cross-refs: genbank:acc:YP_398971;genbank:gi:81343955;genbank:GeneID:3778874 Probab=100.00 E-value=4.1e-93 Score=527.07 Aligned_cols=311 Identities=22% Similarity=0.293 Sum_probs=287.1 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |.=+|. .|+..++. ....++ .+|+|++++|+.+||++||++|||+++++++++++||+.++++||+ T Consensus 1 ~~~~~~-~~~~~~~~---------~~~~~~----~~~~d~~~~fl~~ql~~id~~v~e~~~~~~~~~~~i~v~~~~~~~~ 66 (314) T protein:vir:10 1 MAIKFD-AEQAKITT---------HLEQMG----VEKADAAGIWAVSQLTAALNRAYEKEYAENSVVNIFPVTNEIPGHA 66 (314) T ss_pred CccchH-HHHHHHHH---------HHHhhc----ccchhhhHHHHHHHHHHHHHHHhhhhccccccceeeccccCCCCce Confidence 766676 46666654 111112 4677899999999999999999999999999999999999999999 Q ss_pred eEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhh Q lcl|NC_011142. 
81 THWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQR 160 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 160 (343) ++++|.++|.+|++++|+++++|+|+++++++++..|++.|+.+|+|+++||+++++.|+||+++|+.+|++++++++|+ T Consensus 67 et~~~~~~e~~G~a~~~~d~~~dip~vd~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 146 (314) T protein:vir:10 67 KYFEYPEFDGVGIAQIIADYSDDLPLVDAFMTEKQGKVFRFGNAFLISTDEIKAGAATGQSLSARKQALAFEAHDNLLDK 146 (314) T ss_pred eEEEeeeeccccceeeeCCcccccceeecccceeEEEEEEEEeeEEecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhce Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred eeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCC Q lcl|NC_011142. 161 VAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYT 240 (343) Q Consensus 161 ~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~ 240 (343) ++|+|++++|++||||+||++..+++++|+ |++||++||++++++++++|+|.+.|++|+|||+.|.+|+++ ++++ T Consensus 147 i~f~G~~~~g~~GLlN~p~v~~~~~~~~Wa--T~~ei~~Di~~~~~~l~~~s~g~~~p~~l~Lpp~~~~~L~~~--~~~~ 222 (314) T protein:vir:10 147 LVWSGSAPHGIVSVFDQPNINNVVATPNWS--VPQNAIDDVTAMIDAVESSTQGLHHVTDILLPASARRVMQGL--VPQT 222 (314) T ss_pred EEEeecccccceeEeecCCCccccCCCCcc--cHHHHHHHHHHHHHHHHHhcCccccceeEEecHHHHHhhccc--ccCC Confidence 999999999999999999999888888995 799999999999999999999999999999999999999764 3568 Q ss_pred CccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCceeEe Q lcl|NC_011142. 241 DRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLGITV 320 (343) Q Consensus 241 ~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~~~ 320 (343) ++|+++||++|||+..|.+.|. 
+.+++.+|++||++|+++++++++++||||++||+|+++++|++ T Consensus 223 ~~tvl~~l~~n~~~l~I~~~~e--------------l~~ag~~g~~~~v~y~~~~~~~~~~vp~~~~~l~~e~~~~~~~~ 288 (314) T protein:vir:10 223 NLSYGELFTRNNPGLTIRFLQF--------------LDNYDGAGGKAALAFEKSPLNMSIEIPEVTNVLPAQPKDLHFRY 288 (314) T ss_pred CccHHHHHHHhCCCcEEEEccc--------------ccccCCCcceEEEEEecCCcEEEEecCccceeecceecCceEEE Confidence 9999999999999888766542 33566778999999999999999999999999999999999999 Q ss_pred eeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 321 PAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 321 ~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ||++|+|||+||||.+|+|+||| T Consensus 289 ~~~~r~~Gv~i~~P~ai~~~dGI 311 (314) T protein:vir:10 289 PVTSKATGLIVYRPLTMAVIKGI 311 (314) T ss_pred cceeeeEEEEEECcceeEeeeee Confidence 99999999999999999999999 No 3 >protein:vir:107687 Length: 319 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1518 # MgeName: T1 # Cross-refs: genbank:acc:YP_003898;genbank:gi:45686314;genbank:GeneID:2773027 Probab=100.00 E-value=2.1e-91 Score=517.74 Aligned_cols=315 Identities=18% Similarity=0.270 Sum_probs=287.9 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecch-hhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcc Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDA-DGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEY 79 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA-~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~ 79 (343) =|.+|++.|+.-|+.. ++ .++ +.-|| ++.|+|+++||++||++|||+++++++++|+||+.++++|| T Consensus 2 ~~~~~~~~~~~~~~~~--------~~-~~~---~~~da~~~~g~~~~~ql~~id~~v~e~~~~~l~~~~~i~v~~~~~~~ 69 (319) T protein:vir:10 2 TTKKFDEADKSNVEMY--------LI-QAG---VKQDAAATMGIWTAQELHRIKSQSYEEDYPVGSALRVFPVTTELSPT 69 (319) T ss_pred CCcchhHHhhHHHHHH--------Hh-hcc---chhhhhhhhhhHHHHHHHHHHHHHHhhhhcceechhhcccccCCCCc Confidence 2457888998888762 22 223 23344 35679999999999999999999999999999999999999 Q ss_pred eeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhh Q lcl|NC_011142. 
80 ATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQ 159 (343) Q Consensus 80 ~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n 159 (343) +++++|.++|.+|++++|+++++|+|+++++++++.+|++.++.+|+|+++||+++++.|+||+++|+.+|++++++++| T Consensus 70 ~~~~~~~~~~~~G~a~~~~d~~~dip~v~~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n 149 (319) T protein:vir:10 70 DKTFEYMTFDKVGTAQIIADYTDDLPLVDALGTSEFGKVFRLGNAYLISIDEIKAGQATGRPLSTRKASACQLAHDQLVN 149 (319) T ss_pred eEEEEeeeeccccceeeecCccccccceeccceeeEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhc Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred heeeeeehhhcceeeeecCCccccccC--cCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccC Q lcl|NC_011142. 160 RVAYFGDTNRNMSGLLNNPNVTKTSAT--VNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMT 237 (343) Q Consensus 160 ~~~f~G~~~~g~~GLlN~p~v~~~~~~--~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~ 237 (343) +++|+|++++|++||||+||++..+++ ++|+++|+++|++||++++++++.+++|++.|++|+|||++|.+|++++ T Consensus 150 ~i~f~G~~~~g~~GLlN~p~~~~~~~~~~~~~~t~t~~~i~~di~~~~~~l~~~s~g~~~p~~L~L~p~~~~~L~~~~-- 227 (319) T protein:vir:10 150 RLVFKGSAPHKIVSVFNHPNITKITSGKWIDVSTMKPETAEAELTQAIETIETITRGQHRATNILIPPSMRKVLAIRM-- 227 (319) T ss_pred eEEEeecccccceeEEeCCCceeeecCCCCCccccCHHHHHHHHHHHHHHHHHhcCceeeceEEEecHHHHHhhhccc-- Confidence 999999999999999999999887665 3568899999999999999999999999999999999999999998654 Q ss_pred CCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCce Q lcl|NC_011142. 238 GYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLG 317 (343) Q Consensus 238 ~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~ 317 (343) +++++|+++||++||++..|.+.|. 
+.+++.+|+|||++|+++++++++++||||++||+|+++++ T Consensus 228 ~~~~~t~l~~lk~~~~~l~I~~~pe--------------l~~ag~~g~~~~v~y~~~~~~~~~~v~~~~~~~~~e~~~l~ 293 (319) T protein:vir:10 228 PETTMSYLDYFKSQNSGIEIDSIAE--------------LEDIDGAGTKGVLVYEKNPMNMSIEIPEAFNMLPAQPKDLH 293 (319) T ss_pred CCCCeeHHHHHHHhcCCceEEEeee--------------ecccCCCcceEEEEEecCCceEEEecCcceeeeeeeecCce Confidence 4679999999999999887766542 23456678999999999999999999999999999999999 Q ss_pred eEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 318 ITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 318 ~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) |++||++|+|||+||||.||+|+||| T Consensus 294 ~~~~~~~r~~Gv~i~~P~ai~~~dGI 319 (319) T protein:vir:10 294 FKVPCTSKCTGLTIYRPMTIVLITGV 319 (319) T ss_pred EEEeeeeeeEEEEEEccceeEeeecC Confidence 99999999999999999999999999 No 4 >protein:vir:103285 Length: 296 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1605 # MgeName: JK06 # Cross-refs: genbank:acc:YP_277465;genbank:gi:71834107;genbank:GeneID:3562396 Probab=100.00 E-value=7e-91 Score=514.85 Aligned_cols=292 Identities=19% Similarity=0.286 Sum_probs=276.5 Q ss_pred hecc-hhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccc Q lcl|NC_011142. 
34 VVND-ADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAK 112 (343) Q Consensus 34 ~~~d-A~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~ 112 (343) |.+| ||++++|+++||++||++|||+++++++++++||+.++++||+++++|+++|.+|++++|+++++|+|+++++++ T Consensus 1 ~~~~~a~~~~~f~~~ql~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~ 80 (296) T protein:vir:10 1 MGVDKADAAGIWTVKQLTASLNKAYETEYDQNSVVNLFPVSNEIPGYAKYFEYPVFDGVGIAQIVADYTDDLPLVDALAT 80 (296) T ss_pred CcccchhhhHHHHHHHHHHHHHHHHhhhhcccccceecccccCCCCceeEEEeeeeeccCceeEeCCCccccceeeccce Confidence 5555 788999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred eeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCcccc Q lcl|NC_011142. 113 LHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATC 192 (343) Q Consensus 113 ~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~ 192 (343) ++..|++.++.+|+|+++||++|++.|+||+++|+.+|++++++++|+++|+|++++|++||||+||++..+++++|+++ T Consensus 81 ~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~ka~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~W~~~ 160 (296) T protein:vir:10 81 ERQGKVFRFGNAFLISIDEIKVGQATGQSLSTRKQSLAFEAHDKLLDKLVWSGSTAHGIPSVFDYPNINNVVSGGSWSQP 160 (296) T ss_pred eEEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEEeecccccceeEeecCCCccccccCCccCH Confidence 99999999999999999999999999999999999999999999999999999999999999999999998888899874 Q ss_pred CHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeee Q lcl|NC_011142. 193 TGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMA 272 (343) Q Consensus 193 t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~ 272 (343) .+|++||++++++++.+++|++.|++|+|||++|.+|++++ +++++|+++||++||++..+.+.|. 
T Consensus 161 --t~i~~Di~~~~~~l~~~s~g~~~p~~l~L~p~~~~~L~~~~--~~~~~t~l~~ik~~~~~l~i~~~~~---------- 226 (296) T protein:vir:10 161 --TTAVSDITSLLDIIETSTNGQHRATHLLLPTTARRIMQNLV--PGTSVSYGEFFRQNNSGVTVEFVQY---------- 226 (296) T ss_pred --HHHHHHHHHHHHHHHHhhCceecceeEEeCHHHHHHHhhcc--CCCCccHHHHHHHhcCCceEEEeee---------- Confidence 59999999999999999999999999999999999998654 5689999999999999887765442 Q ss_pred chhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 273 TELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 273 ~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +.+++.+|+++|++|+++++++++++||||++||+|+++++|++||++++|||+||||.||+|+||| T Consensus 227 ----l~~a~~~g~~~~v~~~~~~~~~~~~v~~~~~~~~~e~~~l~~~~~~~~~~~Gv~i~~P~ai~~~dGI 293 (296) T protein:vir:10 227 ----LNDYNGTGTSAAIAYEKDPNNMAIEIPEATNALPAQPKDLHFKIPVTSKATGLIVYRPLTMAVMKGI 293 (296) T ss_pred ----eccCCCCcceEEEEEEcCCceEEEEcCcceeeecccccCceEEEeeEeeEEEEEEECCceeEEEeee Confidence 2345567899999999999999999999999999999999999999999999999999999999999 No 5 >protein:vir:80068 Length: 301 # NCBI annotation: gp8 # Family: family:all:463 # MgeID: mge:1876 # MgeName: B054 # Cross-refs: genbank:acc:YP_001468712;genbank:gi:157325292;genbank:GeneID:5601759 Probab=100.00 E-value=1.6e-89 Score=507.38 Aligned_cols=294 Identities=17% Similarity=0.262 Sum_probs=278.6 Q ss_pred hecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccce Q lcl|NC_011142. 34 VVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKL 113 (343) Q Consensus 34 ~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~ 113 (343) |.. 
++.|+|+++||++||++|+|++++++.+|+++|+.++++||+++++|+++|.+|++++|+++++|+|++++++++ T Consensus 1 ~~~--~~~g~f~~~~l~~id~~v~e~~~~~l~~r~l~~v~~~~~~~~~~~~~~~~~~~G~~~~~~~~~~dip~~~~~~~~ 78 (301) T protein:vir:80 1 MQG--KITATIEARDLQAIDNVIYEPKQEELTARSVFPQKFDVNEGAESYSFDVMTRSGAAKIIANGADDLPLVDVDMVR 78 (301) T ss_pred CCc--cccchhhHHHHHHHHHHHHHhhhhhhhhhhhcccccCCCCceEEEEEeeeccceeEEEecCccccccccccccee Confidence 344 456899999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred eEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccc-------cC Q lcl|NC_011142. 114 HQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTS-------AT 186 (343) Q Consensus 114 ~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~-------~~ 186 (343) +..|++.++.+|+|+|+||+++++.|+||+++|+.+|++++++++|+++|+|++++|++||||+||++... .. T Consensus 79 ~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aa~~~~~~~~n~~~f~G~~~~g~~GLlN~p~~~~~~~~~~~~~~~ 158 (301) T protein:vir:80 79 KSVPIYSIGIGLSYTIQDLRAARMQGTTVDAAKATTVRRAIAEKENSIAFRGEKKYAIKGAFEATGIQIDVSPTTGVGNV 158 (301) T ss_pred EEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEeeecccccceeeecCCCcccccccCcccccc Confidence 99999999999999999999999999999999999999999999999999999999999999999986532 23 Q ss_pred cCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccc Q lcl|NC_011142. 
187 VNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKI 266 (343) Q Consensus 187 ~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~ 266 (343) ++|++||++||++||++++++++.+++|++.|++|+|||++|.+|+++++++++++|+++||++|+++..|.+.| T Consensus 159 ~~w~~~t~~ei~~di~~~~~~l~~~s~g~~~p~~L~L~p~~~~~L~~~~~~~~~~~tvl~~l~~~~~~~~I~~~p----- 233 (301) T protein:vir:80 159 SKWEKKTAEQIIDEIGEAHTKITVLPGYGTASLKLCLPPKQFELINKKRYSNEDSRSVLKVLQDNAWFSAIVRVP----- 233 (301) T ss_pred cccccCCHHHHHHHHHHHHHHHHHhcCceecccEEEecHHHHHhhhhccccCCCCeeHHHHHHHHcCcceEEEcc----- Confidence 579999999999999999999999999999999999999999999999999999999999999999998877655 Q ss_pred cceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 267 RFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 267 ~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++.+++.+|+|||++|+++++++++++||||++||+|+++++|++||++|+|||+||||.||+|+||| T Consensus 234 ---------~L~~~g~~g~~~~v~~~~~~d~~~~~v~~~~~~~~~e~~~~~~~~~~~~r~~Gv~i~~P~ai~~~~GI 301 (301) T protein:vir:80 234 ---------DLAGMGTAGSDSFAVIHDSNETAELIIPMDITRHPEEYSFPRTKVPFEERTAGVVVRFPAAIVRVDGI 301 (301) T ss_pred ---------eeccCCCCcccEEEEEecCCcEEEEEecCceeeecceecCceeEeeeeeeeEEEEEEccceEEEEecC Confidence 23355667899999999999999999999999999999999999999999999999999999999999 No 6 >protein:vir:5255 Length: 304 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:117 # MgeName: Aaphi23 # Cross-refs: genbank:acc:NP_852760;genbank:gi:31544035;uniprot:Q7Y5U0;genbank:GeneID:2753552 Probab=100.00 E-value=5e-88 Score=499.20 Aligned_cols=295 Identities=18% Similarity=0.195 Sum_probs=276.3 Q ss_pred hhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccce--eecCCcCccceeeeccceeEE Q lcl|NC_011142. 
39 DGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGK--FISANASDLPRVAQSAKLHQV 116 (343) Q Consensus 39 ~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~--~~~~~~~dip~v~~~~~~~~~ 116 (343) -++++|+.+||++||++|||.++++++++++||+.++++||+++++|.++|.+|+++ +++++++|||+++++++++.. T Consensus 1 ~~~lafl~~qL~~id~~vye~~~~~~~~~~lipv~t~~~~~~~~~~~~~~d~~G~a~~~~i~~~a~dip~vd~~~~~~~~ 80 (304) T protein:vir:52 1 MSLLAYVKNGLTAVSKDIAETKYPEIVFPQFVYVDQQTAVGITEKLHYGADEHGSLDDGLITVGTSTLDQVEVGFTPTRS 80 (304) T ss_pred CchHHHHHHHHHHHhhhhhccccccchhhhhccccCCCCcccceEEEeeeeccCcccccccCCcCCccceeecccceeEE Confidence 367999999999999999999999999999999999999999999999999999999 889999999999999999999 Q ss_pred EEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehh-hcceeeeecCCccccc-----cCcCcc Q lcl|NC_011142. 117 ELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTN-RNMSGLLNNPNVTKTS-----ATVNYA 190 (343) Q Consensus 117 ~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~-~g~~GLlN~p~v~~~~-----~~~~w~ 190 (343) ||+.|+.+|+|+++||++|++.|++|+++|+.+|++++++++|+++|+|++. .|++||||+|+++... ++++|+ T Consensus 81 ~i~~~~~~~~y~~~El~~a~~~g~~l~~~ka~aa~~a~~~~~n~v~~~Gd~~~~g~~GllN~p~v~~~~~~~~~a~~~w~ 160 (304) T protein:vir:52 81 YIVPWAKSVTWTKPELEQGKLLGLALNTAKIMALNKNAQQTLQKVAFLGHAKDSRLTGLLNNKSVEVYAIKGAAQNTKVQ 160 (304) T ss_pred EEEEEeeeeeecHHHHHHHHHhCCCcHHHHHHHHHHHHHhhhceEEEEeeccccceEEEEeCCCcceeeecCCccCCccc Confidence 9999999999999999999999999999999999999999999999999985 7999999999998543 346899 Q ss_pred ccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeecccccccccccee Q lcl|NC_011142. 191 TCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQL 270 (343) Q Consensus 191 ~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l 270 (343) +||++||++||++++++++.+++|++.|++|+|||+.|.+|++++++ ++++|+|+||++||++. .+.|++|..+.+. 
T Consensus 161 ~~T~~eI~~di~~~~~~i~~~s~~~~~p~tl~Lpp~~~~~l~~~~~~-~~~~Tvl~~l~~n~~~~--~g~~l~I~~v~~~ 237 (304) T protein:vir:52 161 AMDFDKAVAFFKEIFLKGMEKTKRIEAPNTFAIDSLDLAHLALVQRA-NTDTTALEFLTKHLSAA--AGRQVAIKALPSN 237 (304) T ss_pred cCCHHHHHHHHHHHHHHHHhccCceecCceEEeCHHHHHHHhhccCC-CCCchHHHHHHHhcccc--cCCcceEEEeccc Confidence 99999999999999999999999999999999999999999987765 58899999999999874 5788888766542 Q ss_pred eechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCc-eeEeeeeeeeeeEEEECcceeeeecc Q lcl|NC_011142. 271 MATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGL-GITVPAEYKISGTEYRYPLCAQYVDM 342 (343) Q Consensus 271 ~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~-~~~~~~~~~~gGv~i~~P~ai~~~dG 342 (343) . .++|.+|+|||++|++|+++++|++|||+++||+|++++ .|++||++|+|||+||||.+++|+|= T Consensus 238 ~------~~~g~~g~~r~vvY~~d~~~~~~~vP~p~~~l~~q~~~~~~~~vp~~~r~gGv~v~~P~a~~y~D~ 304 (304) T protein:vir:52 238 Y------GTRVTDGKTRAMVYVNSKEHVIFDVPMSPTVLDAQPKGLLAFESGLRMAFGGVTFMEPDSALYVDY 304 (304) T ss_pred c------cccCCCCceEEEEEecChhheEEecCccccccchhhcCCceEEecceeeeeeEEEEccceeeeecC Confidence 2 356778999999999999999999999999999999986 79999999999999999999999999 No 7 >protein:vir:94070 Length: 339 # NCBI annotation: putative structural protein # Family: family:all:1653 # MgeID: mge:1493 # MgeName: OP2 # Cross-refs: genbank:acc:YP_453625;genbank:gi:84662661;genbank:GeneID:5142580 Probab=100.00 E-value=2.4e-84 Score=479.03 Aligned_cols=322 Identities=12% Similarity=0.008 Sum_probs=287.3 Q ss_pred CCcceeccchhhhhchhhhchhccccccc----Ccchhecchhhhhh---------hhHHHHHHHHHHHHhhhhhcccch Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATI----GVPSVVNDADGGAA---------YYISQLASLETTVYEVPYADITYL 67 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~----~~~~~~~dA~~~~~---------f~~~~l~~id~~v~e~~~~~l~~~ 67 (343) |+ .-.|++.++..++++-.+|+-... 
.+-.+.|||+..++ +.+.++++||++|||+++++++++ T Consensus 1 ~~---~~~~~~~~~~l~~~g~~~~~~~~~~~~~~~~~~a~d~~~~~~~~~~~~~~~i~a~~~~~i~~~vy~~~~~~~~~~ 77 (339) T protein:vir:94 1 MS---INNDRTDIKQLEKVGIIFDGYSPKSISSEVSAYAMDAVNLTPTLQTTANAGIPAWMTTFVDRRVIDIQLAPMAAA 77 (339) T ss_pred Cc---eechHHHHHHHHhhceeeccchhhhcchhhHhhhccccccccccccccccchhhhhhhhhchhheeecccccchh Confidence 44 446888888888887766543321 34567888876553 667899999999999999999999 Q ss_pred hhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHH Q lcl|NC_011142. 68 EDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQA 147 (343) Q Consensus 68 ~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~ 147 (343) ++||+.+.++|++++++|+++|.+|++++|+|++++ |+++++++++.++++.++.||+|+++|+++|++.|++|+++|+ T Consensus 78 ~l~pv~t~g~w~~~t~~y~~~e~~G~a~~ygd~ad~-Pl~~~~v~~~~~~v~~~~~g~~y~~~E~~~A~~~g~~l~~~Ka 156 (339) T protein:vir:94 78 KIFPEVKKGDWTTTYGVFIIAEPVGQVATYSDWSAN-GMSKANVNFESRQNYRYQTWTEYGDLEMATYGEAGIDYVARQE 156 (339) T ss_pred hhcccccCCCCcccEEEEeeeecccceEEcccccCC-CcccccceeeEEeEEEEEEEEeecHHHHHHHHhhCCChHHHHH Confidence 999999999999999999999999999999999865 9999999999999999999999999999999999999999999 Q ss_pred HHHHHHHHHhhhheeeeeehhhcceeeeecCCccc-cccCcCccccCHHHHHHHHHHHHHHHHHhcCCee---cccEEEe Q lcl|NC_011142. 148 ELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTK-TSATVNYATCTGQELFDLLNNPVFAVVKASKRFH---TPNTVLM 223 (343) Q Consensus 148 ~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~-~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~---~p~tL~l 223 (343) .+|++++++++|+++|+|++++|++||||||+++. 
.+++++|++||++||++||++++++++.+|+|.+ .|++|+| T Consensus 157 ~aA~~al~~~~N~i~~~Gd~~~~~~GLlN~P~l~~~v~~s~~Wa~kT~~eI~~Di~~~~~~l~~~s~g~~~~~~~~~L~L 236 (339) T protein:vir:94 157 ISASLVMAKFANSSYLLGVAGIANYGLMNDPSLPAPVAATVNWATAAPEDIANDVVAMVGRLISQSGGLITGQERMVMAL 236 (339) T ss_pred HHHHHHHHHhhceEEeeeecccceEEEEeCCCccccccCCCCcccCCHHHHHHHHHHHHHHHHHhcCCeeeeccCcEEEe Confidence 99999999999999999999999999999999965 4566899999999999999999999999999875 5779999 Q ss_pred cHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeec Q lcl|NC_011142. 224 FPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKP 303 (343) Q Consensus 224 ~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~ 303 (343) ||+.+.+|+++ +.+++|+++||++|+|+..+...|. +. +++.++..+|+.|.+++++.++++| T Consensus 237 P~~~~~~L~~~---n~~~~Tvl~~lk~n~pnl~i~~~~e------------l~--~a~g~~~~~~~~~~~~~~~~~~~~p 299 (339) T protein:vir:94 237 APSALNNVNRT---NNFGLSAGAKIAQTYPNIQFVAVPE------------FD--TASGRLVQLWVPEVNGQPTGEVAFA 299 (339) T ss_pred cHHHHHhcccC---CcCCccHHHHHHHhcCCcEEEEccc------------cc--cCCCceEEEEEEeccCCcceEEEcc Confidence 99999999875 4578999999999999987765442 22 3344566778888889999999999 Q ss_pred cchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
304 IPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 304 ~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) |||++||+|+++++|++||++|||||+||||+||+|++|| T Consensus 300 ~~~~~lpvq~~~~~~~v~~~~rt~Gv~i~~P~ai~~~~GI 339 (339) T protein:vir:94 300 EKLRSHSIERYSTTTRQKHSGATFGAVIYQPWAVTQELGV 339 (339) T ss_pred hhhhccccEEcCceEEecceeeeeeEEEEccceeeeeecC Confidence 9999999999999999999999999999999999999999 No 8 >protein:vir:3643 Length: 336 # NCBI annotation: gp12 # Family: family:all:1653 # MgeID: mge:75 # MgeName: Bcep781 # Cross-refs: genbank:acc:NP_705638;genbank:gi:23752323;genbank:GeneID:955719 Probab=100.00 E-value=2e-78 Score=446.54 Aligned_cols=319 Identities=11% Similarity=-0.010 Sum_probs=271.5 Q ss_pred ccchhhhhchhhhchhccccc-ccC--cchhecchhhhhh-hh-------HHHH-HHHHHHHHhhhhhcccchhhccccC Q lcl|NC_011142. 7 VIDAQTIAGNRWLNKFLDSNA-TIG--VPSVVNDADGGAA-YY-------ISQL-ASLETTVYEVPYADITYLEDVPVLA 74 (343) Q Consensus 7 ~~~~~~~~~~~~~~~~~~~~~-~~~--~~~~~~dA~~~~~-f~-------~~~l-~~id~~v~e~~~~~l~~~~~i~v~~ 74 (343) .=|++.++..++++-.++.-+ .++ +-.+.|||.+.++ .. .+.| ++|||++||++++++.+.+++|+.+ T Consensus 1 ~~~~~~~~~l~~~gi~~~~~~~~~~~~~~~~~~da~d~~~~~~~~~~~~~~~~l~~~i~p~~~~~~~~~~~~~~l~pv~t 80 (336) T protein:vir:36 1 MRDAQRIQNLARAGVILPRSVQNVSTPLTEYAMDAADLSPHLSSTGSSGIPNYLTTYVDPSVIDILVAPMKAAELVGESK 80 (336) T ss_pred CchHHHHHHHhhcCeeecchhhhhhhHHHHhhhhhhhccCccccCCCcchHHHHHHhhccceEeeecchhhhhhhccccc Confidence 347788887777776655332 222 2344566642221 10 1233 6999999999999999999999999 Q ss_pred CCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 
75 NIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 75 ~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) .++|.+++++|.++|.+|++++|||++| +|+++++.++++++++.++.+|+|+++|+++|+++|++|..+|+.+|++++ T Consensus 81 ~g~W~~~~~~~~~~e~~G~a~~ygd~~D-~P~~d~~~~~~~~~v~~~~~g~~yg~~E~~~Aa~~~~~l~~~Ka~aA~~al 159 (336) T protein:vir:36 81 KGDWTTLVAAFITAEPTTKVATYGDYSS-DGDSGANINYPQRQSYFFQTWTRWGERELEMAGAGRVDLASELNYSSALGL 159 (336) T ss_pred cCCccceeEEEeeeeceeeEEEeeccCC-CceeecccceeeeeEEEEEeeeeeCHHHHHHHHHhCCCcHHHHHHHHHHHH Confidence 8887789999999999999999999865 599999999999999999999999999999999999999999999999999 Q ss_pred HHhhhheeeeeehhhcceeeeecCCccc-cc-cCcCccccCHHHHHHHHHHHHHHHHHhcCCe---ecccEEEecHHHHH Q lcl|NC_011142. 155 EEHSQRVAYFGDTNRNMSGLLNNPNVTK-TS-ATVNYATCTGQELFDLLNNPVFAVVKASKRF---HTPNTVLMFPDLWK 229 (343) Q Consensus 155 ~~~~n~~~f~G~~~~g~~GLlN~p~v~~-~~-~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~---~~p~tL~l~p~~~~ 229 (343) ++++|+++|+|++++++|||||||+++. .+ +++.|.++|++||++||++++++|+.+|+|. +.|++|+|||+++. T Consensus 160 e~~~N~i~~~Gd~~~~~yGllNdP~l~a~~t~~t~~~~~~t~~ei~~Di~~~~~~l~~qt~G~i~~~~~~tL~LP~~~~~ 239 (336) T protein:vir:36 160 AKFLNGSYLFGVAGLENYGLINDPSLSAPITATTPWSGSPAVEAVVNEVVALFQVLQTQSQGIITQEDVLRMGLPPTAMS 239 (336) T ss_pred HHhhCcEEEEeccccceEEEEecCCCccccccCCCcccccCHHHHHHHHHHHHHHHHHhcCCeeeeccccEEEechHHHH Confidence 9999999999999999999999999975 33 3345688899999999999999999999986 67999999999999 Q ss_pred HHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc Q lcl|NC_011142. 
230 RASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML 309 (343) Q Consensus 230 ~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~ 309 (343) +|+++ +.+++|+++||++|+|+..+...|+ + .+++.++..+|+-+..+++..++++|++|++| T Consensus 240 ~Ls~~---n~~g~Tvl~~lk~n~Pnl~i~t~pE------------l--~~a~g~~~~l~~~~~~~~~t~~~~~p~~~~~l 302 (336) T protein:vir:36 240 DLSKT---NQYGLAAAAKLKDIFPKLEFVTIPE------------Y--DTASGRLVQLWAPRVEGKDTATCGFTEKMRAH 302 (336) T ss_pred hccCC---CccCccHHHHHHHhcCccEEEEccc------------c--ccCCCceEEEEEEecCCCcceeeecchhhhcc Confidence 99865 5678999999999999988776552 2 23333344445555678899999999999999 Q ss_pred cceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 310 APQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 310 ~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) |+|+++++|++||++|||||+||||++|+|++|| T Consensus 303 ~vq~~~~~~~v~~~~rt~Gv~i~~P~ai~~~~GI 336 (336) T protein:vir:36 303 SIERYSSYFRQKKSAGTWGAVIFRPFAVAQMIGV 336 (336) T ss_pred ceeecCceeEeccccceeeeeeeccchheeeecC Confidence 9999999999999999999999999999999999 No 9 >protein:vir:78558 Length: 336 # NCBI annotation: major capsid protein # Family: family:all:1653 # MgeID: mge:1854 # MgeName: BcepNY3 # Cross-refs: genbank:acc:YP_001294848;genbank:gi:149882911;genbank:GeneID:5291029 Probab=100.00 E-value=2.5e-78 Score=446.04 Aligned_cols=316 Identities=11% Similarity=0.011 Sum_probs=269.4 Q ss_pred ccchhhhhchhhhchhccccc-ccC--cchhecchhhhhh-hh-------HHHH-HHHHHHHHhhhhhcccchhhccccC Q lcl|NC_011142. 7 VIDAQTIAGNRWLNKFLDSNA-TIG--VPSVVNDADGGAA-YY-------ISQL-ASLETTVYEVPYADITYLEDVPVLA 74 (343) Q Consensus 7 ~~~~~~~~~~~~~~~~~~~~~-~~~--~~~~~~dA~~~~~-f~-------~~~l-~~id~~v~e~~~~~l~~~~~i~v~~ 74 (343) .=|++.++..++++-.|+.-. .+. +-.+.|||++.++ .. 
.++| ++|||++||++++++.+.+++|+.+ T Consensus 1 ~~~~~~~~~l~~~gi~~~~~~~~~~~~~~~~a~da~d~~~~~~t~~~~g~~~~l~~~i~p~~~~~~~~~~~~~~l~~v~t 80 (336) T protein:vir:78 1 MRDAQRIQNLARAGVILPRSVKNVSTPLAEYAMDAADLSPHLSSTGSSGIPNYLTTYVDPSVIDILVAPMKAAELVGESK 80 (336) T ss_pred CchHHHHHHHhccCeecchhhhhhhHHHHHHHHhhhhhccccccCCCcchHHHHHHhcccceeeehhhhhhhhhhccccc Confidence 346777777777776665432 222 2245677543222 11 1222 6999999999999999999999999 Q ss_pred CCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 75 NIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 75 ~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) .++|.+++++|.++|.+|++++|||++|+ |++++++++++++++.++.+|+|+++|+++|+++|++|+.+|+.+|++++ T Consensus 81 ~g~W~~~~~~~~~~e~~G~a~~ygd~~D~-P~vd~~~~~~~~~v~~~~~g~~yg~~El~~A~~~g~~l~~~Ka~aA~~al 159 (336) T protein:vir:78 81 KGDWTTLVAAFITAEPTTTVATYGDYSSD-GDSGTNINYPQRQSYFFQTWTRWGERELEMAGAGRVDLASELNYSSALGL 159 (336) T ss_pred CCCccccEEEEeeeecceeeEEeecccCC-CeeecceeeEEEEEEEEEeeeeecHHHHHHHHHhCCCcHHHHHHHHHHHH Confidence 87766789999999999999999998655 99999999999999999999999999999999999999999999999999 Q ss_pred HHhhhheeeeeehhhcceeeeecCCcccc--ccCcCccccCHHHHHHHHHHHHHHHHHhcCCe---ecccEEEecHHHHH Q lcl|NC_011142. 155 EEHSQRVAYFGDTNRNMSGLLNNPNVTKT--SATVNYATCTGQELFDLLNNPVFAVVKASKRF---HTPNTVLMFPDLWK 229 (343) Q Consensus 155 ~~~~n~~~f~G~~~~g~~GLlN~p~v~~~--~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~---~~p~tL~l~p~~~~ 229 (343) ++++|+++|+|++++|++||||||+++.. .+++.|++||++||++||++++++|+.+|+|. +.|++|+|||+.+. 
T Consensus 160 e~~~N~~~~~Gd~~~~~~GllN~P~l~a~~t~~~~~w~~~T~~~I~~Di~~~~~~l~~qt~g~~~~~~~~tL~Lp~~~~~ 239 (336) T protein:vir:78 160 AKFLNGSYLFGVAGLENYGLINDPSLSAPITATTPWSGSPAVEAVVNEVVTLFQVLQTQSQGIITQEAVLHMGLPPTAMS 239 (336) T ss_pred HHhhCeEEEEeccccceEEEEeCCCCCcccccCcCcccccCHHHHHHHHHHHHHHHHHhcCCeeeeccceEEEechHHHH Confidence 99999999999999999999999999753 34456899999999999999999999999987 46889999999999 Q ss_pred HHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEE---cccceEEEeeccch Q lcl|NC_011142. 230 RASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYD---KSERNLALAKPIPF 306 (343) Q Consensus 230 ~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~---~~~~~~~~~v~~~~ 306 (343) +|+++ +.+++|+++||++|+|+..+...|. +. ++ |++++.+|. .++++.++++|++| T Consensus 240 ~L~~~---n~~g~tv~~~lk~n~Pnl~i~t~pe------------l~--~A---gg~~~~~~~~~~~~~~t~~~~~p~~f 299 (336) T protein:vir:78 240 DLSKT---NQYGLSAAAKLKEIFPKLEFVTIPE------------YD--TA---SGRLVQLWAPRVEGKDTATCGFTEKM 299 (336) T ss_pred hccCC---CccCccHHHHHHHhcCccEEEEccc------------cc--cc---CcceEEEEEeeccCCcceeeecchhh Confidence 99865 5688999999999999988766542 21 22 235566664 45789999999999 Q ss_pred hcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
307 RMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 307 ~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++||+|+++++|++||++|||||+||||++|+|++|| T Consensus 300 ~~lpvq~~~~~~~v~~~~rt~Gv~i~~P~ai~~~~GI 336 (336) T protein:vir:78 300 RAHSIERYSSYFRQKKSAGTWGAVIFRPFAVAQMIGV 336 (336) T ss_pred hccceeecCceeEeccccceeeeeeeccchheeeccC Confidence 9999999999999999999999999999999999999 No 10 >protein:vir:101557 Length: 336 # NCBI annotation: gp12 # Family: family:all:1653 # MgeID: mge:1477 # MgeName: Bcep43 # Cross-refs: genbank:acc:NP_958117;genbank:gi:41057663;genbank:GeneID:2716814 Probab=100.00 E-value=6.5e-78 Score=443.75 Aligned_cols=319 Identities=10% Similarity=-0.022 Sum_probs=269.4 Q ss_pred ccchhhhhchhhhchhccccc-ccC--cchhecchh-hhh--------hhhHHHHHHHHHHHHhhhhhcccchhhccccC Q lcl|NC_011142. 7 VIDAQTIAGNRWLNKFLDSNA-TIG--VPSVVNDAD-GGA--------AYYISQLASLETTVYEVPYADITYLEDVPVLA 74 (343) Q Consensus 7 ~~~~~~~~~~~~~~~~~~~~~-~~~--~~~~~~dA~-~~~--------~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~ 74 (343) .=|++.++..++++-.++.-+ -+. +-.+.|||. .++ .+-+-..++|||++|+++++++.+.+++|+.+ T Consensus 1 ~~~~~~~~~l~~~gi~~~~~~~~~~~~~~~~~~da~d~~~~~~~~~~~~i~~~l~~~i~p~~~~~~~~p~~a~~l~pv~t 80 (336) T protein:vir:10 1 MRDAQRIQNLARAGVILPRSVQNVSTPLTEYAMDAADLSPHLSSTGSSGIPNYLTTYVDPAVIDILVAPMKAAELVGESK 80 (336) T ss_pred CchHHHHHHHhhcCeeecchhhhhhhhHHHhhhhhhhccCccccCCCchhHHHHHhhcccceeeehhhhhhhhhhccccc Confidence 346777777777765554322 111 123345542 111 11122237999999999999999999999999 Q ss_pred CCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 
75 NIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 75 ~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) .++|.+++++|.++|.+|++++|||++| +|+++++.++++++++.++.+|+|+++|+++|+++|++|+.+|+.+|++++ T Consensus 81 ~g~W~~~~~~~~~~e~~G~a~~ygd~~D-~P~~d~~~~~~~~~v~~~~~g~~yg~~El~~A~~~g~~l~~~Ka~aA~~al 159 (336) T protein:vir:10 81 KGDWTTLVAAFITAEPTTKVATYGDYSS-DGDSGANINYPQRQSYFFQTWTRWGERELEMAGAGRVDLASELNYSSALGL 159 (336) T ss_pred cCCccceeEEEeeeeceeeEEEeeccCC-CceeecccceeeeeEEEEEeeeeeCHHHHHHHHHhCCCcHHHHHHHHHHHH Confidence 8887789999999999999999999865 599999999999999999999999999999999999999999999999999 Q ss_pred HHhhhheeeeeehhhcceeeeecCCccc-cc-cCcCccccCHHHHHHHHHHHHHHHHHhcCCe---ecccEEEecHHHHH Q lcl|NC_011142. 155 EEHSQRVAYFGDTNRNMSGLLNNPNVTK-TS-ATVNYATCTGQELFDLLNNPVFAVVKASKRF---HTPNTVLMFPDLWK 229 (343) Q Consensus 155 ~~~~n~~~f~G~~~~g~~GLlN~p~v~~-~~-~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~---~~p~tL~l~p~~~~ 229 (343) ++++|+++|+|++++++|||||||+++. .+ +++.|.++|++||++||++++++|+.|++|. +.|++|+|||+++. T Consensus 160 e~~~N~i~~~Gd~~~~~yGllN~P~l~a~~t~~t~~~~~~t~eei~~Di~~~~~~l~~qs~G~i~~~~~~tL~LP~~~~~ 239 (336) T protein:vir:10 160 AKFLNGSYLFGVAGLENYGLINDPSLSAPITATTPWSGSPAVEAVVNEVVALFQVLQTQSQGIITQEDVLRMGLPPTAMS 239 (336) T ss_pred HHhhCcEEEEeccccceEEEEeCCCCccccccCCCcccccCHHHHHHHHHHHHHHHHHhcCCeecccCcceEEecHHHHH Confidence 9999999999999999999999999974 33 3344678899999999999999999999997 67999999999999 Q ss_pred HHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc Q lcl|NC_011142. 
230 RASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML 309 (343) Q Consensus 230 ~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~ 309 (343) +|+++ +.+++|+++||++|+|+..+...|+ + .+++.++..+|+-+..+++..++++|++|++| T Consensus 240 ~Ls~~---n~~g~Tvl~~lk~n~Pnl~i~t~pE------------l--~~a~G~~~~l~~~~~~~~~t~~~~~p~~~~~l 302 (336) T protein:vir:10 240 DLSKT---NQYGLAAAAKLKDIFPKLEFVTIPE------------Y--DTASGRLVQLWAPRVEGKDTATCGFTEKMRAH 302 (336) T ss_pred hccCC---CccCccHHHHHHHhcCccEEEEccc------------c--ccCCCceEEEEEEecCCCcceeeecchhhhcc Confidence 99865 5678999999999999988776552 2 23333344455555678899999999999999 Q ss_pred cceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 310 APQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 310 ~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) |+|+++++|++||++|||||+||||++|+|++|| T Consensus 303 ~vq~~~~~~~v~~~~rt~Gv~i~~P~ai~~~~GI 336 (336) T protein:vir:10 303 SIERYSSYFRQKKSAGTWGAVIFRPFAVAQMIGV 336 (336) T ss_pred ceeecCceeEeccccceeeeeeeccchheeeecC Confidence 9999999999999999999999999999999999 No 11 >protein:vir:106734 Length: 336 # NCBI annotation: gp13 # Family: family:all:1653 # MgeID: mge:1599 # MgeName: Bcep1 # Cross-refs: genbank:acc:NP_944321;genbank:gi:38638620;genbank:GeneID:2657363 Probab=100.00 E-value=3.7e-78 Score=445.09 Aligned_cols=316 Identities=10% Similarity=-0.001 Sum_probs=272.0 Q ss_pred ccchhhhhchhhhchhccccc-ccC--cchhecchhhhhh-hh-------HHHH-HHHHHHHHhhhhhcccchhhccccC Q lcl|NC_011142. 7 VIDAQTIAGNRWLNKFLDSNA-TIG--VPSVVNDADGGAA-YY-------ISQL-ASLETTVYEVPYADITYLEDVPVLA 74 (343) Q Consensus 7 ~~~~~~~~~~~~~~~~~~~~~-~~~--~~~~~~dA~~~~~-f~-------~~~l-~~id~~v~e~~~~~l~~~~~i~v~~ 74 (343) .=|++.++..++++-.|+.-. .+. +-.+.|||++.++ .. 
.++| ++|||++||++++++.+.+++|+.+ T Consensus 1 ~~~~~~~~~l~~~gi~~~~~~~~~~~~~~~~a~da~d~~~~~~t~~~~g~~~~l~~~i~p~~~~~~~~~~~~~~l~~v~t 80 (336) T protein:vir:10 1 MRDAQRIQNLARAGVILPRSVKNVSTPLAEYAMDAADLSPHLSSTGSSGIPNYLTTYVDPSVIDILVAPMKAAELVGESK 80 (336) T ss_pred CchHHHHHHHhccCeecchhhhhhhHHHHHHHHhhhhhccccccCCCcchHHHHHhhcCcceeeeeechhchhhhccccc Confidence 346777877777776665432 222 2245677543222 11 1222 6999999999999999999999999 Q ss_pred CCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 75 NIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 75 ~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) .++||+++++|.++|.+|+++.|||+ +|+|++++++++.+++++.++.+|+|+.+|+++|+++|++|+.+|+.+|++++ T Consensus 81 ~g~w~~~~~~~~~~e~~G~a~~ygd~-~d~P~~d~~~~~~~~~v~~~~~g~~yg~~El~~A~~~g~~l~~~Ka~aA~~al 159 (336) T protein:vir:10 81 KGDWTTLVAAFITAEPTTKVATYGDY-SSDGDSGTNINYPQRQSYFFQTWTRWGERELEMAGAGRVDLASELNYSSALGL 159 (336) T ss_pred CCCcceeeEEEEeeeeeeeEEEcccc-CCCcceeeeeeeeeeeEEEEEEEEeeCHHHHHHHHHhCCCcHHHHHHHHHHHH Confidence 99999999999999999999999997 67999999999999999999999999999999999999999999999999999 Q ss_pred HHhhhheeeeeehhhcceeeeecCCcccc--ccCcCccccCHHHHHHHHHHHHHHHHHhcCCe---ecccEEEecHHHHH Q lcl|NC_011142. 155 EEHSQRVAYFGDTNRNMSGLLNNPNVTKT--SATVNYATCTGQELFDLLNNPVFAVVKASKRF---HTPNTVLMFPDLWK 229 (343) Q Consensus 155 ~~~~n~~~f~G~~~~g~~GLlN~p~v~~~--~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~---~~p~tL~l~p~~~~ 229 (343) ++++|+++|+|++++|+|||||||+++.. .+++.|++||++||++||++++++|+.+++|. +.|++|+|||+++. 
T Consensus 160 e~~~N~~~~~Gd~~~~~~GllN~P~l~a~~t~~~~~w~~~T~~eI~~Di~~~~~~l~~qt~g~i~~~~~~tL~Lp~~~~~ 239 (336) T protein:vir:10 160 AKFLNGSYLFGVAGLENYGLINDPSLSAPITATTPWSGSPAVEAVVNEVVTLFQVLQTQSQGIITQEAVLHMGLPPTAMS 239 (336) T ss_pred HHhhCeEEEEeecccceEEEeecCCCCcccccCcCcccccCHHHHHHHHHHHHHHHHHhcCCeeeeccceEEEechHHHH Confidence 99999999999999999999999999753 34456899999999999999999999999987 46889999999999 Q ss_pred HHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEc---ccceEEEeeccch Q lcl|NC_011142. 230 RASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDK---SERNLALAKPIPF 306 (343) Q Consensus 230 ~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~---~~~~~~~~v~~~~ 306 (343) +|+++ +.+++|+++||++|+|+..+...|. +. ++ |++++.+|.+ ++++.++++|++| T Consensus 240 ~L~~~---n~~g~tv~~~lk~n~Pnl~i~t~pe------------l~--~A---gg~~~~~~~~~~~~~~t~~~~~P~~f 299 (336) T protein:vir:10 240 DLSKT---NQYGLSAAAKLKEIFPKLEFVTIPE------------YD--TA---SGRLVQLWAPRVEGKDTATCGFTEKM 299 (336) T ss_pred hccCC---CccCccHHHHHHHhCCccEEEEccc------------cc--cc---CCceEEEEEecccCCcceeeecChhh Confidence 99865 5688999999999999988766542 21 22 2356667654 4789999999999 Q ss_pred hcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
307 RMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 307 ~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++||+|+++++|++||++|||||+||||++|+|++|| T Consensus 300 ~~lpvq~~~~~~~v~~~~rt~Gv~i~rP~ai~~~~GI 336 (336) T protein:vir:10 300 RAHSIERYSSYFRQKKSAGTWGAVIFRPFAVAQMLGV 336 (336) T ss_pred hccceeecCceeEeccccceeeeeeeccchheeeccC Confidence 9999999999999999999999999999999999999 No 12 >protein:vir:99576 Length: 388 # NCBI annotation: hypothetical protein # Family: family:all:1653 # MgeID: mge:1544 # MgeName: BcepF1 # Cross-refs: genbank:acc:YP_001039801;genbank:gi:126011051;genbank:GeneID:4818271 Probab=100.00 E-value=1.6e-77 Score=441.55 Aligned_cols=326 Identities=13% Similarity=0.038 Sum_probs=277.1 Q ss_pred CCcce--eccchhhhhchhhhchhcccc---------cccCcchhecchhh-------hhhhhHHHHHHHHHHHHhhhhh Q lcl|NC_011142. 1 MSEKR--VVIDAQTIAGNRWLNKFLDSN---------ATIGVPSVVNDADG-------GAAYYISQLASLETTVYEVPYA 62 (343) Q Consensus 1 ~~~~~--~~~~~~~~~~~~~~~~~~~~~---------~~~~~~~~~~dA~~-------~~~f~~~~l~~id~~v~e~~~~ 62 (343) |.+.. +-.|...++..++++-.++.- ...++-.+.|||+. ...+.+..++++||++|++.++ T Consensus 21 ~~~~~~~~~~~~~~~~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~a~da~~~~~~t~~~~gip~~~~~~~~p~~~~~~~~ 100 (388) T protein:vir:99 21 MANGKADYRLTDMAVRELKKFGLVFDHATVKRQIELLHEGGVATQAFDSAYVAPTTQASIPTPIQFLQQWLPGFVKVLTS 100 (388) T ss_pred hhcCCcceeeechhhHhhhhcceeccCccchhhhhhhhhhhhhhcccCcccccccccCcccHHHHHhhhhccceeeeeec Confidence 33322 335666677777766555441 12334456788752 3457889999999999999999 Q ss_pred cccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCc Q lcl|NC_011142. 
63 DITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPI 142 (343) Q Consensus 63 ~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l 142 (343) ++.+.+|||+.+.++|.+++++|.++|.+|++++|+|++| +|++++++++.+++++.++.+|+|+++|+++|++.|++| T Consensus 101 p~~~~~l~pv~t~g~W~~~~~~f~v~e~~G~A~~ygd~~D-~Pl~d~~~~~~~r~v~~~~~g~~yg~~El~~A~~~g~~l 179 (388) T protein:vir:99 101 ARKIDEILGVKTVGSWEDQEIVQGIVEPAGTAMEYGDLTN-IPLSSWNVNFERRTIVRGEMGIQVGLLEEGRASAMRINS 179 (388) T ss_pred hhhhhhhccccccCCccceeEEEeeeecceeEEEeecccC-CCceeccceeeeeeEEEEEeeeeecHHHHHHHHhhCCCc Confidence 9999999999998877789999999999999999999865 599999999999999999999999999999999999999 Q ss_pred cHHHHHHHHHHHHHhhhheeeeeehh---hcceeeeecCCcccc------ccCcCccccCHHHHHHHHHHHHHHHHHhcC Q lcl|NC_011142. 143 DSMQAELAFRGSEEHSQRVAYFGDTN---RNMSGLLNNPNVTKT------SATVNYATCTGQELFDLLNNPVFAVVKASK 213 (343) Q Consensus 143 ~~~k~~aA~~~~~~~~n~~~f~G~~~---~g~~GLlN~p~v~~~------~~~~~w~~~t~~~i~~di~~~~~~l~~~s~ 213 (343) +.+|+.+|++++++++|+++|||+++ .++|||||||+++.. .+++.|++||++||++||++++++|+.+++ T Consensus 180 ~~~Ka~AA~~ale~~~N~i~f~G~~g~~~~~~yGllNdP~l~a~v~at~~~~~~~Wa~kT~~eI~~Di~~~~~~i~~qs~ 259 (388) T protein:vir:99 180 AEVKRQGAAVQLEIMRNAIGFYGWEGKNGNRTFGFLNDPSLLPAIASTTPGGWVSGGANAFQGIVGDLRLMLITLRVQSE 259 (388) T ss_pred HHHHHHHHHHHHHhhhceEEEEeecCCCccceEEEeeCCCcccccccccCCcCcccccCCHHHHHHHHHHHHHHHHHhcC Confidence 99999999999999999999999875 479999999998753 234579999999999999999999999999 Q ss_pred Ceecc----cEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEE Q lcl|NC_011142. 
214 RFHTP----NTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYV 289 (343) Q Consensus 214 g~~~p----~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v 289 (343) |.+.| .+|+|||+.+.+|+++ +.+++|+++||++|+|+..++..|+ +.+++ +++|.+.++ T Consensus 260 g~~~~~~~~~tL~LP~~~~~~Ls~~---n~~g~Tvl~~lk~n~Pnl~i~t~pE------------l~~a~-~tgg~~~~~ 323 (388) T protein:vir:99 260 DNIDPEDVDITLVLPMNKVDMLSVV---TDLGISVRDWLKQTYPRVRVMSAPE------------LQGGN-PDDGKDIAY 323 (388) T ss_pred CeeeecccceEEEechHHHHhcccc---CcCCccHHHHHHHhcCCcEEEEecc------------ccccc-ccCCceeEE Confidence 98765 4899999999999865 4578999999999999988876553 22221 244667788 Q ss_pred EEEcc-----------cceEEEeeccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 290 VYDKS-----------ERNLALAKPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 290 ~y~~~-----------~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .|.++ .+...+++|+||++||+|+++++|++||++|||||+||||+||+|++|| T Consensus 324 ~~~~~~~~~~~~~~~~~~t~~~~~p~~~~~l~vq~~~~~~~~~~~~rt~Gv~ir~P~Ai~~~~GI 388 (388) T protein:vir:99 324 MFLDSVDTAVDGSTDGGDTWAQLVQSKFVTLGVEKRVKNYVEAYSNATAGVMLKRPWAVVRLIGL 388 (388) T ss_pred EEecccccccccCccCcceeEEecccccccccceecCceeEeccccceeeeEEeccchhheeccC Confidence 88654 3467788999999999999999999999999999999999999999999 No 13 >protein:vir:107732 Length: 379 # NCBI annotation: gp23 # Family: family:all:1653 # MgeID: mge:1520 # MgeName: BcepB1A # Cross-refs: genbank:acc:YP_024871;genbank:gi:48697513;genbank:GeneID:2948349 Probab=100.00 E-value=3.9e-76 Score=433.99 Aligned_cols=325 Identities=13% Similarity=0.067 Sum_probs=266.4 Q ss_pred CCcceecc---chh-----hhhchhhhchhcccccccCcc--hhecchhhhhh--------hh------HHHHHHHHHHH Q lcl|NC_011142. 1 MSEKRVVI---DAQ-----TIAGNRWLNKFLDSNATIGVP--SVVNDADGGAA--------YY------ISQLASLETTV 56 (343) Q Consensus 1 ~~~~~~~~---~~~-----~~~~~~~~~~~~~~~~~~~~~--~~~~dA~~~~~--------f~------~~~l~~id~~v 56 (343) |+-|.+.. 
|++ .+.+.++++-.|+.-...-.+ .+.|||+..++ .+ ...|+.--|.+ T Consensus 11 ~~~~~~~~~~~~~~~~~~~~~~~l~~~gi~~~~~~~~~~~~~~~amd~~~~~~~~~~~~~l~~~~~~g~~~~l~~~~p~~ 90 (379) T protein:vir:10 11 LNARQMTQMVMDSADVTLDNLKHLESYGIHLNGRKNKLFELMQFAMDSNDIGPIPTPLSPLSPVSIPGLIQFLQNWLPGH 90 (379) T ss_pred cCccccchhhhccccccHHHHHHHHhcCccccchhhhhhhhhhhhhccccccccccccCccccccccchHHHHHhhcchH Confidence 44444433 322 444555555545432222222 34788763221 11 13444333899 Q ss_pred HhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHH Q lcl|NC_011142. 57 YEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTA 136 (343) Q Consensus 57 ~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~ 136 (343) +++..+++++.+|||+.+.++|++++++|.++|.+|++++|+|+++ +|+++++++++.++++.++.+|+|+++|+++|+ T Consensus 91 i~~~tap~~a~~l~pv~t~g~W~~~~~~~~v~e~~G~A~~ygd~~d-~pl~d~~~~~~~r~v~~~~~g~~yg~~El~~Aa 169 (379) T protein:vir:10 91 VRILTAVREADEFLGLSTVGQWDDEQIVQRVLEGLGTAQPYTDGGN-MALMSWTPTFETRTVVRFEAGLQVAPLEEARSS 169 (379) T ss_pred HHHHhhhhhhhhhcccccCCCceeeeEEEeeeeeeeeeEEeccccC-CCeeeeeeeeeeeeeEEEEEEEeecHHHHHHHH Confidence 9999999999999999999999999999999999999999999865 599999999999999999999999999999999 Q ss_pred HhCCCccHHHHHHHHHHHHHhhhheeeee--ehhhcceeeeecCCccccc-------cCcCccccCHHHHHHHHHHHHHH Q lcl|NC_011142. 137 AVNMPIDSMQAELAFRGSEEHSQRVAYFG--DTNRNMSGLLNNPNVTKTS-------ATVNYATCTGQELFDLLNNPVFA 207 (343) Q Consensus 137 ~~g~~l~~~k~~aA~~~~~~~~n~~~f~G--~~~~g~~GLlN~p~v~~~~-------~~~~w~~~t~~~i~~di~~~~~~ 207 (343) +.|++|+++|+.+|++++++++|+++||| ++++++|||||||+++... 
++++|++||++||++||++++.+ T Consensus 170 ~~g~~l~~~Ka~aA~~ale~~~N~i~f~G~~d~~~~~yGllNdP~l~a~~t~atg~~~~t~Wa~kT~~eI~~Di~~~~~~ 249 (379) T protein:vir:10 170 RVQVSSADEKRAMVGEALEVQRNRVAFYGYNDGSGRTFGFLNDPNLPAYVAVPNGAGGSPLWAQKTTLEIIADLRNGLTA 249 (379) T ss_pred HhCCChHHHHHHHHHHHHHHhhceEEEEeecCCCcceEEEEeCCCCcccccccCCcccccccccCCHHHHHHHHHHHHHH Confidence 99999999999999999999999999999 5688999999999987532 23569999999999999999999 Q ss_pred HHHhcCCeec----ccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCC Q lcl|NC_011142. 208 VVKASKRFHT----PNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNG 283 (343) Q Consensus 208 l~~~s~g~~~----p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~ 283 (343) ++.+|+|.+. |++|+|||+.+.+|+++ +.+++|+++||++|+|+..+.+.|. + .+++.+ T Consensus 250 l~~qs~g~~~~~~~~~tL~LP~~~~~~L~~~---n~~g~Tvl~~lk~n~Pnl~i~t~pE------------L--~~aggg 312 (379) T protein:vir:10 250 LQVQSMGRIKSNKTPITIGIPNAYENYITTP---TELGYSVAQYMRESYPNVTFVSAPE------------L--NDANGG 312 (379) T ss_pred HHHhhCCeecccccceeEEecHHHHHhhccc---cccCccHHHHHHHhcCCcEEEEccc------------c--cccCCC Confidence 9999999864 55999999999999865 5678999999999999988776552 2 244444 Q ss_pred ccceEEEEEc-------ccceEEEeeccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
284 NKDRYVVYDK-------SERNLALAKPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 284 g~dr~v~y~~-------~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++++++.+++ +++.+.+++||+|++||+|+++++|++||++|||||+||||+||+|++|- T Consensus 313 ~~~~~~~~~~~~~~~t~~~~~~~~~~p~k~~~l~ve~~~~~~~~~~~~rt~Gv~ir~P~Ai~~~~G~ 379 (379) T protein:vir:10 313 SSAIYYYADAVENNGTDDGRTWLQVVPTKMFTLGVEKKIKGYAEGYTNATAGAMLKRPFATYRQTGA 379 (379) T ss_pred ccEEEEEeeccCCCccCCcceEEEecchhhhhccceecCceeEeccccceeeeeeecchhhheecCC Confidence 5555444432 33468899999999999999999999999999999999999999999999 No 14 >protein:vir:96079 Length: 382 # NCBI annotation: hypothetical protein ORF023 # Family: family:all:1653 # MgeID: mge:1597 # MgeName: F8 # Cross-refs: genbank:acc:YP_001294440;genbank:gi:149408337;genbank:GeneID:5237198 Probab=100.00 E-value=2.3e-75 Score=429.81 Aligned_cols=325 Identities=13% Similarity=0.051 Sum_probs=268.8 Q ss_pred CCcceeccchhhhhchhhhchhccc---------ccccC--cchhecchh-h------hhhhhHHHHHHHHHHHHhhhhh Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDS---------NATIG--VPSVVNDAD-G------GAAYYISQLASLETTVYEVPYA 62 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~---------~~~~~--~~~~~~dA~-~------~~~f~~~~l~~id~~v~e~~~~ 62 (343) |..++| --+.++..++++-.+|. ....+ ...+.|||+ . ...+.+..|+++||++|+++++ T Consensus 19 ~~~~~~--~~~~~~~l~~~gi~~~~~~~~~~~~~~~~~~~~~~~~amDa~~~~~~t~~~~g~p~~~l~~~~p~~~~~~~~ 96 (382) T protein:vir:96 19 FDLKNV--THEAVAALGRIGLVFDHAVVQDQIKALAKAGAFRSGSAMDSNFTAPVTTPSIPTPIQFLQTWLPGFVKVMTA 96 (382) T ss_pred hhhhcc--cHHHHHHHhccccccCcccchhHhhhhhhhhhhhhhcccccccCCccccCCccHHHHHHhhhhhhhhhhhhh Confidence 333332 11334555555544432 11112 233578876 2 2346888999999999999999 Q ss_pred cccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCc Q lcl|NC_011142. 
63 DITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPI 142 (343) Q Consensus 63 ~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l 142 (343) ++.++++||+.+.++|.+++++|.++|.+|++++|+|++|+ |++++++++..++++.++.+|+|+.+|+.+|+++|++| T Consensus 97 p~~~~~l~pv~t~g~W~~~t~ty~~~e~~G~A~~ygd~~D~-Pl~d~~~~~~~r~v~~~~~g~~yg~lE~~rAa~~~~~l 175 (382) T protein:vir:96 97 ARKIDEIIGIDTVGSWEDQEIVQGIVEPAGTAVEYGDHTNI-PLTSWNANFERRTIVRGELGLLVGTLEEGRASAIRLNS 175 (382) T ss_pred hhhhhhhccccccCCccceEEEEeeeecccceEEeecccCC-CccccccceeEEEEEEEEEeeeecHHHHHHHHhhCCCc Confidence 99999999999987776899999999999999999998654 99999999999999999999999999999999999999 Q ss_pred cHHHHHHHHHHHHHhhhheeeeeeh---hhcceeeeecCCcccc--ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeec Q lcl|NC_011142. 143 DSMQAELAFRGSEEHSQRVAYFGDT---NRNMSGLLNNPNVTKT--SATVNYATCTGQELFDLLNNPVFAVVKASKRFHT 217 (343) Q Consensus 143 ~~~k~~aA~~~~~~~~n~~~f~G~~---~~g~~GLlN~p~v~~~--~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~ 217 (343) ..+|+.+|++++++++|+++|||+. +.|+|||||||++++. .++++|++||++||++||++++++++.+|+|.+. T Consensus 176 ~~~Ka~aA~~ale~~~N~i~f~G~~~g~~~~~yGllNdP~l~a~~t~a~~~Wa~kT~~eI~~Di~~l~~~i~~qt~G~~~ 255 (382) T protein:vir:96 176 AETKRQQAAIGLEIFRNAIGFYGWQSGLGNRTYGFLNDPNLPPFQTPPSQGWATADWAGIIGDIREAVRQLRIQSQDQID 255 (382) T ss_pred HHHHHHHHHHHHHHhhceEEEEeeecCcCcceEEEEeCCCcccccccCCCCcccccHHHHHHHHHHHHHHHHhccCCeee Confidence 9999999999999999999999973 4689999999999853 4567899999999999999999999999999886 Q ss_pred ----ccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhcc-ccCCccceEEEEE Q lcl|NC_011142. 218 ----PNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAG-VSNGNKDRYVVYD 292 (343) Q Consensus 218 ----p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g-~~~~g~dr~v~y~ 292 (343) |.+|+|||+.+.+|+++ +.+++|+++||++|+|+..+++.|+ +..++ .+.+++++++.|. 
T Consensus 256 ~~~~~~~L~LP~~~~~~Ls~~---n~~g~Tvl~~lk~n~Pnl~i~t~pe------------L~~a~~~g~g~~~~~~~~~ 320 (382) T protein:vir:96 256 PKAEKITMALATSKVDYLSVT---TPYGISVSDWIEQTYPKMRIVSAPE------------LSGVQMQGKTPEDALVLFV 320 (382) T ss_pred ecccceEEeechHHHhhcccc---CccCccHHHHHHHhcCCcEEEEccc------------cccccCCCccceeEEEEec Confidence 45899999999999864 5678999999999999998877653 22222 2335789999998 Q ss_pred cccc-----------eEEEeeccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 293 KSER-----------NLALAKPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 293 ~~~~-----------~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++.+ .+...+|.+++++++|+++++|++||+++||||+||||++|+|++|| T Consensus 321 ~e~~~~~~~s~~~p~~f~q~~p~~~~~l~ve~~~~~~~~~~s~~t~Gv~i~~P~ai~~~~GI 382 (382) T protein:vir:96 321 EEVDASVDGSTDGGSVFSQLVQSKFITLGVEKRAKSYVEDFSNGTAGALCKRPWAVVRYLGI 382 (382) T ss_pred chhhhhcccccccCcceeccccceeeeccceeecceeEeccccceeeeEEEcchhhhhccCC Confidence 8744 33444455566789999999999999999999999999999999999 No 15 >protein:vir:105778 Length: 358 # NCBI annotation: gp9 # Family: family:all:10995 # MgeID: mge:1501 # MgeName: ES18 # Cross-refs: genbank:acc:YP_224147;genbank:gi:62362222;genbank:GeneID:3342531 Probab=99.86 E-value=4.5e-25 Score=154.04 Aligned_cols=312 Identities=11% Similarity=0.108 Sum_probs=224.4 Q ss_pred eccchhhhhchhhhc----------hhccc--------c-cccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhc--c Q lcl|NC_011142. 6 VVIDAQTIAGNRWLN----------KFLDS--------N-ATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYAD--I 64 (343) Q Consensus 6 ~~~~~~~~~~~~~~~----------~~~~~--------~-~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~--l 64 (343) .+|.=+.++++.+.. -+... . +++ .|. 
.|-.++.+.|...-|..+|.++++.-.++ + T Consensus 1 ~~f~K~~~an~~~~~~qw~~L~~~Rna~n~~~~a~maan~a~~-~~~-~~~~NAv~~v~~D~wr~~D~~~~q~fr~e~~~ 78 (358) T protein:vir:10 1 MYFSKETLATNSRLGGHWNELWANRNMWNAQHDAMIAANRSNM-TPE-WLAVNAVGGFTRDFWAEIDRQVLQLRDQEVGM 78 (358) T ss_pred CeechhhhhhHHHHHHHHHHHHHHHHHhhhhhhhHHhhhHHHh-hhh-hheecccccCCHHHHHHHhhhhhhhcccchhH Confidence 334334444433321 00000 0 001 111 12234567788889999999999977775 3 Q ss_pred -cchhhccccCCCCcceeEEEEeeecc-cccceeecC--CcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCC Q lcl|NC_011142. 65 -TYLEDVPVLANIPEYATHWNYRSYDG-AAMGKFISA--NASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNM 140 (343) Q Consensus 65 -~~~~~i~v~~~~~~~~~~~~~~~~~~-~G~a~~~~~--~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~ 140 (343) ..-+|+++.+.++.|.....|.+... .|++...-+ .+.++. .+.+.....||+.+..||+.+|||++..+-.|+ T Consensus 79 ~l~NDLm~ls~sv~Igktv~~y~~~gd~~~~v~~SmsGQ~~~~lD--~~~y~~dGtpiPIfdsg~~f~WR~~~~~~~~g~ 156 (358) T protein:vir:10 79 EIVNDLIGVQTVLPVGKTAKLYNVIGDIADDVSVSIDGQAPFSFD--HTEYASDGDPIPVFTAGYGVNWRHAAGLNSLGI 156 (358) T ss_pred HHHhhhhhccccccHHHHHHHHhhhcCCCceEEEEecccCccccc--ceeeeccCCEeeeeccCccccccchhhcCcccc Confidence 34668899999999999998987755 787764322 233343 444677778999999999999999999999999 Q ss_pred CccHHHHHHHHHHHHHhhhheeeeeehh-----hcceeeeecCCcccc-------ccCcCccccCHHHHHHHH-HHHHHH Q lcl|NC_011142. 141 PIDSMQAELAFRGSEEHSQRVAYFGDTN-----RNMSGLLNNPNVTKT-------SATVNYATCTGQELFDLL-NNPVFA 207 (343) Q Consensus 141 ~l~~~k~~aA~~~~~~~~n~~~f~G~~~-----~g~~GLlN~p~v~~~-------~~~~~w~~~t~~~i~~di-~~~~~~ 207 (343) ++.++.++...++++++.-+++|+|+.+ +-.+||-|||++... 
..+-|.+++|+++++..+ .+++.+ T Consensus 157 d~~~daQ~~~~~kv~~~~vdy~lNG~~~I~v~g~t~~Glrn~~n~~qv~l~~~s~g~NiDlttat~~a~~~~f~~~l~~~ 236 (358) T protein:vir:10 157 DLVLDSQMAKMRKFNQKRVNYYLNGDPNIQVQSYPAQGIKNHRNTKKINLGSGSGGANIDLTTADMTALFAFFGKGAFGT 236 (358) T ss_pred chhHHHHHHHHHHHHHHHHhhhhccCCceeecCcccccccCCcceeEEEeccCCCcceeeeccCCHHHHHHHHHHHHHHH Confidence 9999999999999999999999999875 446999999997632 233578999999999888 566777 Q ss_pred HHHhcCCeecccEEEecHHHHHHHhccccCCCC-CccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccc Q lcl|NC_011142. 208 VVKASKRFHTPNTVLMFPDLWKRASSLLMTGYT-DRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKD 286 (343) Q Consensus 208 l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~-~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~d 286 (343) +..+ +....-.+++++|+.+..|.++|..+++ .-|||+++++-.+..+|. +...+. .+ T Consensus 237 ~~~~-N~~~~~~~~~vs~ei~~n~~r~Y~~~~~~~gTIl~~vl~~~~va~I~--------------~~~~Ls------gN 295 (358) T protein:vir:10 237 LARA-NKVAQYDVMWVSPEIWANLAQPYVVNGVVSGNVLNAVLPFAPVREIR--------------QTFALS------GN 295 (358) T ss_pred HHhh-cccceeeEEEEcHHHHhhhhcccccccccchhhHHHhhcccCccccc--------------ccccCC------Cc Confidence 7654 5566678999999999999988876654 569999999754433332 222232 27 Q ss_pred eEEEEEcccceEEEeeccchhcccceecC--ceeEeeeeeeeeeEEEECcc----eeeeeccC Q lcl|NC_011142. 287 RYVVYDKSERNLALAKPIPFRMLAPQLLG--LGITVPAEYKISGTEYRYPL----CAQYVDML 343 (343) Q Consensus 287 r~v~y~~~~~~~~~~v~~~~~~~~~~~~~--~~~~~~~~~~~gGv~i~~P~----ai~~~dGI 343 (343) .+++|.+..+++...+.||+-..|.-..+ -+|.+..++.. |++||.-. 
.+.|..-+ T Consensus 296 eii~~~~~~~vi~plvG~~~gt~~~pR~~p~ddY~f~vwsA~-glqik~D~~Gks~Vv~~~~~ 357 (358) T protein:vir:10 296 EFIAYVRRQDIISPLVGMAVGVVPLPRPLPNVNYNFQIMSAE-GLQITADDQGLSGVVYGANL 357 (358) T ss_pred cEEEEEeCCceeeeeecceeeeecCCCCCCCcchhhhhhhhh-ceeeeeccccceeeEeeccc Confidence 89999999999999999999877654433 35666667765 78888764 34444444 No 16 >protein:vir:108211 Length: 318 # NCBI annotation: gp9 # Family: family:all:6420 # MgeID: mge:2004 # MgeName: Giles # Cross-refs: genbank:acc:YP_001552338;genbank:gi:160700658;genbank:GeneID:5758931 Probab=99.07 E-value=2.8e-12 Score=83.96 Aligned_cols=280 Identities=11% Similarity=0.033 Sum_probs=163.1 Q ss_pred cCcchhecchhhhhhhhHHHH----HHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecc---cccceeecCCc Q lcl|NC_011142. 29 IGVPSVVNDADGGAAYYISQL----ASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDG---AAMGKFISANA 101 (343) Q Consensus 29 ~~~~~~~~dA~~~~~f~~~~l----~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~---~G~a~~~~~~~ 101 (343) |..|.=..-+..++.++.+++ +.|+.++.+.....+.+..||--. +.....++.|.-... .|.+..+..++ T Consensus 1 ~~~~~~i~s~~~~~~itv~~ll~~P~~I~~~i~e~~~~~~iad~lf~~~--~a~~~~~v~f~~~~p~~~~~d~e~VaEgg 78 (318) T protein:vir:10 1 MTAPTGIVSVSDGPAITVRELVGNPLWIPTALKKMMVNQFISESLFRNG--GANPNGVVAYNEGNPSFLEDDVADVAEFG 78 (318) T ss_pred CCCCCcceeeecCCceehHHhhCCchhHHHHHHHHHhccchhhhhhhcc--cccccceeEEEecccccccCcHhhccCcc Confidence 333333333445577777776 678888888888888888888532 222344566654333 35666565554 Q ss_pred CccceeeeccceeEE-EEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCc Q lcl|NC_011142. 102 SDLPRVAQSAKLHQV-ELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNV 180 (343) Q Consensus 102 ~dip~v~~~~~~~~~-~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v 180 (343) + +|.++...+.... 
.+..++.+++++.+.+.+ .+.+.-.+...++++...++.|+.++ ..|.++++ T Consensus 79 E-iP~~~~~~G~~~ia~~~K~G~~~~vS~Em~~~---n~~~~v~r~~~~l~Nti~r~~d~~a~---------dal~sa~t 145 (318) T protein:vir:10 79 E-IPVSAGARGLPRTAFAVKKALGVRVSKEMIDE---NRVGAVNDQMLQLRNTFIRANDRSAK---------ALLQSPIV 145 (318) T ss_pred c-ccccCCCCCchhhhhhehhccceeccHHHHhh---cChhHHHHHHHHHHHHHHHHHHHHHH---------HHHhcccc Confidence 4 7888777755544 556899999998766554 46667777888888888888887744 34666666 Q ss_pred cccccCcCccccCHHHHHHHHHHHHHHH-------------HHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHH Q lcl|NC_011142. 181 TKTSATVNYATCTGQELFDLLNNPVFAV-------------VKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEH 247 (343) Q Consensus 181 ~~~~~~~~w~~~t~~~i~~di~~~~~~l-------------~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~ 247 (343) +...++..|.+. .....|+..++..+ ....+.-+.|++|+|+|..+..|.+- ..+.++ T Consensus 146 ~~~~~s~~w~~~--~~~~~d~~~A~e~v~~a~~~~~~a~~~~~~~~~GY~pdtIVlhP~~~~~l~~n-------~~~~~~ 216 (318) T protein:vir:10 146 PTLAVPTAWDNG--GKVRTDIAIAIEQISTAAPTAYPAGVGSSDEYFGFIPDTIVMHYALLPILMDN-------ENFMKV 216 (318) T ss_pred ccccCCcCCCCc--ccccccchhhhhhhhhhhhhhhhhhhhhhhhccCccceeeEECHHHHHHHhcc-------hhhhhh Confidence 776677777642 11122333232111 11124456899999999999999631 222333 Q ss_pred HHhc-Ccce-eeccccc---cccccceeeechhhhccccCCccceEEEEEcccceEEE-eeccchhccccee-------- Q lcl|NC_011142. 248 FQIN-NAYT-LLTRNPI---DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLAL-AKPIPFRMLAPQL-------- 313 (343) Q Consensus 248 l~~n-~~~~-~~~~~p~---~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~-~v~~~~~~~~~~~-------- 313 (343) +.+| ++.. .+..... .+-....+..+.+ -.|+..++. 
..++.+ ..++|+++.+..+ T Consensus 217 y~~~a~~~~~~~~~tg~~~g~~lGl~vi~s~~~--------p~~~alvlq--~g~vG~~~d~~pl~~t~~~~egg~~~g~ 286 (318) T protein:vir:10 217 YERNANYVSTAPDWTGNFPGSVMGLNVIRSRTF--------PIDRVLIME--RGTVGFYSDTRPLQFTALYPEGNGPNGG 286 (318) T ss_pred hhccchhhhhcccccccccceeeceEEeecCcc--------CCCeeEEEe--cCCcceeeccccceeeecccCCCCCCCC Confidence 3322 2211 1111111 0001111111111 123334433 333332 3556666655433 Q ss_pred cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 314 LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 314 ~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .+.+|...+...+ ..-|.+|+|++.++|| T Consensus 287 ~~~s~~~~~~~~~-~~~V~~PkA~~~itgi 315 (318) T protein:vir:10 287 PTESYRADASHKR-ALAVDQPKAALWLTGI 315 (318) T ss_pred cchhhheehheee-eeeeeCcceeEEEeec Confidence 5678888888776 6999999999999999 No 17 >protein:vir:9574 Length: 300 # NCBI annotation: gp40 # Family: family:all:966 # MgeID: mge:171 # MgeName: SM1 # Cross-refs: genbank:acc:NP_862879;genbank:gi:32469471;genbank:GeneID:1461316 Probab=98.99 E-value=8.4e-11 Score=75.81 Aligned_cols=282 Identities=9% Similarity=-0.035 Sum_probs=159.4 Q ss_pred hecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccce Q lcl|NC_011142. 34 VVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKL 113 (343) Q Consensus 34 ~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~ 113 (343) |.....++|.+..+ .+.+.+++..+..-..+++.++.. .+.+ ...+.+.+..+.+.+++.. ..+|..+..+++ T Consensus 1 ma~~t~~~G~lip~---~~~~~ii~~l~~~s~i~~l~~~~~-~~~~--~~~~p~~~~~~~a~wv~Eg-~~~~~s~~~f~~ 73 (300) T protein:vir:95 1 MSEAQLSKGNLFNP---ELVTKVINKVKGHSSIAKLSPQKP-IPFN--GQREFVFDFDSDIDIVAEN-GKKTHGGVSLDP 73 (300) T ss_pred CcccccCCcceech---hhHHHHHHHHHhhhhhhhhcceee-ccCC--ceEEEEEecCcceEEeeCC-ccccccccccee Confidence 33333444444433 345567777766666666665432 2222 3456666666788888865 568888888999 Q ss_pred eEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeeh-----hhcceeeeecCCccccccCcC Q lcl|NC_011142. 
114 HQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDT-----NRNMSGLLNNPNVTKTSATVN 188 (343) Q Consensus 114 ~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~-----~~g~~GLlN~p~v~~~~~~~~ 188 (343) ...+.+.++.-..+|.+=+........+|...-....++++++.+++.+|+|+. ..+..|..+..+.....+..+ T Consensus 74 v~l~~~k~~~~~~iS~ell~~~~d~~~~l~~~i~~~l~~aia~~~d~~~l~G~~~~~g~~~~~~~~~~~~~~~~~~~~~~ 153 (300) T protein:vir:95 74 VTIVPLKVEYGARVSDEFLHASEEAKVDMLTDFVEGFSKKLARGLDIMSIHGINPRTKQASTIIGDNCFDKKVTQTVPFK 153 (300) T ss_pred eEeeeEEEEEeehhhHHHhccCCCCHHHHHHHHHHHHHHHHHHHHHHhhhhcccCCCCCCcccccccccccccceeeccc Confidence 999999999888887553322223456677778888999999999999999953 233566666555444333222 Q ss_pred ccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccc Q lcl|NC_011142. 189 YATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRF 268 (343) Q Consensus 189 w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~ 268 (343) . ...+++|.+++..+... ...|..++|+|+.+..|.+.. +..|..++.-.........+-|.|+..... T Consensus 154 ~-----~~~~~~i~~~~~~~~~~---~~~~~~~vmn~~~~~~L~~lk--d~~G~~i~~~~~~~~~~~~l~G~Pv~~s~~- 222 (300) T protein:vir:95 154 D-----TNPDESMEDAVGMIDGS---ERDITGAILDPIFTTALSKMK--NAEGGKLYPELAWGGVPDAINGLAVDKNRT- 222 (300) T ss_pred c-----cchHHHHHHHHHHhhhc---CCCccEEEECHHHHHHHHHhh--ccCCCeeccCccccCCCceecceeeEEecC- Confidence 1 12357888888777542 234678999999999996533 444433321111111122334444322211 Q ss_pred eeeechhhhccccCCccceEEEEEcccceEEEeeccc--hhcccc-eec----Cc----eeEeeeeeeeeeEEEECccee Q lcl|NC_011142. 269 QLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIP--FRMLAP-QLL----GL----GITVPAEYKISGTEYRYPLCA 337 (343) Q Consensus 269 ~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~--~~~~~~-~~~----~~----~~~~~~~~~~gGv~i~~P~ai 337 (343) . ....++.++.+++-+.+ +.+.+.+-+. +.+.+- ... 
++ ..-+.++.++ |+.+++|.|+ T Consensus 223 ------v--~~~~~~~~~~~~~GDf~-~~~~~~~~~~~~~~v~~~~~~d~~~~~~f~~~~v~~r~~~r~-d~~v~~~~a~ 292 (300) T protein:vir:95 223 ------V--SYSQTDPKNTAIVGDFE-TMFKWGYAKEVPMEIIKYGDPDNSGRDLKGYNQIYIRCEAYI-GWGIMDAASF 292 (300) T ss_pred ------C--CCCCCCCccEEEEeecc-ceEEEEEecccEEEEeeccCCCCcchhhhhcCcEEEEEEEee-cceeecccce Confidence 1 11112233333332222 1111111111 111110 111 11 1445567776 6888999999 Q ss_pred eeeccC Q lcl|NC_011142. 338 QYVDML 343 (343) Q Consensus 338 ~~~dGI 343 (343) +.+.|. T Consensus 293 ~~l~~~ 298 (300) T protein:vir:95 293 ARIVKT 298 (300) T ss_pred EEEecC Confidence 999999 No 18 >protein:vir:1433 Length: 435 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:30 # MgeName: phiE125 # Cross-refs: genbank:acc:NP_536362;genbank:gi:17975167;genbank:GeneID:929171 Probab=98.98 E-value=2.4e-10 Score=73.36 Aligned_cols=317 Identities=13% Similarity=0.089 Sum_probs=167.5 Q ss_pred CCcceeccch--hhhh---chhhh--chhcccccccCcch-hecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccc Q lcl|NC_011142. 1 MSEKRVVIDA--QTIA---GNRWL--NKFLDSNATIGVPS-VVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPV 72 (343) Q Consensus 1 ~~~~~~~~~~--~~~~---~~~~~--~~~~~~~~~~~~~~-~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v 72 (343) +.++...+.. +.+. ++.+. ...........+.. +.......|.++.. +.+...|++..++....+.+..- T Consensus 91 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP--~~~~~~ii~~l~~~~~i~~~~~~ 168 (435) T protein:vir:14 91 LEVKGAKMARMVRALAAARGDAQLASKLAIERGFGEEVAMSLNTLSPGAGGVLVP--ENLSSEVIELLRPKSVVRKLGAR 168 (435) T ss_pred hhhhHHHHHHHHHHHHhhcchhhHHHHHHHhhhhhhhhhhhcccCCcCCCccccc--hhHHHHHHHHHhhhchhhhhcce Confidence 1111111100 0000 00000 00000000000111 11112233444554 34566788866665554444211 Q ss_pred cCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHH Q lcl|NC_011142. 
73 LANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFR 152 (343) Q Consensus 73 ~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~ 152 (343) ..+.....+.+.+.+..+.+.+++.. ..+|..+..++......+.++.-+.+|.+=|+.+ ..+.+|..--....+. T Consensus 169 --~~~~~~~~~~~p~~~~~~~a~~v~E~-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds-~~~~~l~~~i~~~l~~ 244 (435) T protein:vir:14 169 --TLPLSNGNITIPRLKGGAIVGYIGAD-TDIPTTQQQFDDLKLTAKKMAALVPIANDLIKYA-GVNPNVDQIVVGDLTA 244 (435) T ss_pred --eeecCCCceEEEEEeCCcceeeeccC-ccccccccceeEEEeeeEEEEEeehhhHHHHHhh-ccCHHHHHHHHHHHHH Confidence 12222224566666666777777654 4578888888888889999998888775433332 2233476777888899 Q ss_pred HHHHhhhheeeeeehh-hcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHH Q lcl|NC_011142. 153 GSEEHSQRVAYFGDTN-RNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRA 231 (343) Q Consensus 153 ~~~~~~n~~~f~G~~~-~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L 231 (343) ++.+.+|+.+++|+.. ....||++....+......++. +.+.+..|+.+++..+.....+ ..+..++|+|..+..| T Consensus 245 ai~~~~d~a~l~G~G~~~~p~Gi~~~~~~~~~~~~~~~~--~~~~~~~~~~~l~~~~~~~~~~-~~~~~~v~n~~~~~~L 321 (435) T protein:vir:14 245 AIGAREDKAFIRDDGTANTPKGLRFWALPSNVITASDAS--TLQKIETDLGKVILALENADAN-LTQPGWIMAPRTFRFL 321 (435) T ss_pred HHHHHHHHHhhccCCCCccccceeecccccceecccccc--chhhHHHHHHHHHHHhhhcccc-ccCCEEEEcHHHHHHH Confidence 9999999999999864 3579999887765554444443 4667888999998887754222 3456789999999998 Q ss_pred hccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc Q lcl|NC_011142. 232 SSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP 311 (343) Q Consensus 232 ~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~ 311 (343) .... +..|.-++. ....-.+-|.|+..... .. ...+.+++.-.++|-+=.+++ +..-.+++..-. 
T Consensus 322 ~~lk--d~~G~~l~~----~~~~g~l~G~Pv~~~~~-------~p-~~~~~~~~~~~i~~gd~s~~~-i~~~~~~~~~~~ 386 (435) T protein:vir:14 322 EGLR--DGNGNKVYP----ELANGMLKGYPVGKTTQ-------VP-INLGETGKESEIYFTDFGDVF-IGEEETLEIDYS 386 (435) T ss_pred HHhh--ccCCceecc----CCCCCeeecceeEeecc-------cc-ccccCCCccceEEEeecccEE-EEEecccEEEEe Confidence 6533 333332221 11111344555433211 10 111122222223333222222 222233322111 Q ss_pred ---------------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 ---------------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 ---------------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.+++ ..+.+..++ ++.+.+|.|+++++|+ T Consensus 387 ~~~~~~~~~~~~~~~f~~~~-~~~r~~~r~-d~~~~~~~a~~~l~~~ 431 (435) T protein:vir:14 387 KEATYKDADGHMVSAFQRDQ-TLIRVIAKN-DFGPRHVESIAVLAGV 431 (435) T ss_pred ccccccccccchhhhhhcCh-hheeeeeee-CceeecccceEEEecC Confidence 11221 345567777 4699999999999999 No 19 >protein:vir:8187 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:153 # MgeName: Che9d # Cross-refs: genbank:acc:NP_817980;genbank:gi:29566414;genbank:GeneID:2700968 Probab=98.91 E-value=1.9e-10 Score=73.91 Aligned_cols=292 Identities=11% Similarity=-0.008 Sum_probs=156.4 Q ss_pred hecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccce Q lcl|NC_011142. 34 VVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKL 113 (343) Q Consensus 34 ~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~ 113 (343) |++ -+.|.++.. +.+...|++..++.-..+++..+. +.+.+ ...+.+.+..+.+.+++.. 
..+|..+..+++ T Consensus 1 mat--~~~gg~lvP--~~~~~~ii~~~~~~s~i~~~~~~i-~~~~~--~~~~p~~~~~~~a~wv~Eg-~~~~~~~~~f~~ 72 (311) T protein:vir:81 1 MVA--LATGTFQLP--KHLVPGVWQKAQGQSVLARLSMAE-PQEFG--EQQYMTLTAPPRGEVVGEG-AQKSESTATFAP 72 (311) T ss_pred Cce--ecCCceEcc--hhHHHHHHHHHHhcchhhhhccee-ecCCC--ceEEEEEeCCceeEEeecC-cccccccceeeE Confidence 111 222334433 344567888777776677776653 22223 3556667777788888754 568888888888 Q ss_pred eEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeeh---hhcceeeeecCCccccccCcCcc Q lcl|NC_011142. 114 HQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDT---NRNMSGLLNNPNVTKTSATVNYA 190 (343) Q Consensus 114 ~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~---~~g~~GLlN~p~v~~~~~~~~w~ 190 (343) .....+.++.-..+|.+=|+.......+|...-+...++++++.+|+.+++|+. +.+..|+++... ......... T Consensus 73 v~l~~~kl~~~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~a~l~G~~~~~~~~~~gi~~~~~--~~~~~~~~~ 150 (311) T protein:vir:81 73 VTAIPRKVQVTQRFSQEVKWADESRQLGVLQTMADLSGVALGRALDLIGIHGINPLTGAALSGSPAKIL--DTTNIVELT 150 (311) T ss_pred EEEeeEEEEEeehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHhhhccccCCCCccccccccccc--ccceeeeec Confidence 888888888777766542332233455677778888999999999999999974 234556665421 111111122 Q ss_pred ccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHH-HHHhcCcceeeccccccccccce Q lcl|NC_011142. 191 TCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIE-HFQINNAYTLLTRNPIDIKIRFQ 269 (343) Q Consensus 191 ~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle-~l~~n~~~~~~~~~p~~i~~~~~ 269 (343) +.+...+..+|.+++..+.. ....|..++|+|..+..|.+-. +..+.-++. ......+ ..+-|.|+.+..... 
T Consensus 151 ~~~~~~~~~~i~~~~~~~~~---~~~~~~~~vmn~~~~~~l~~lk--d~~G~~l~~~~~~~~~~-~tl~G~Pv~~~~~i~ 224 (311) T protein:vir:81 151 TGTSATPDLAVEAAVGLVLG---DNLSPDGVALDNTFSFMLATQR--DSQGRKLYPELGFGTDV-ASFAGLNAAVSDTVR 224 (311) T ss_pred ccccchHHHHHHHHHHHhhh---cCCCceEEEEcHHHHHHHHhhh--ccCCCeeecCccccCCC-ceecceeEEeccccc Confidence 22233345667777776643 2345678999999999996422 323322221 0111111 123344443221110 Q ss_pred eee---chhhhccccCCccceEEEEEcccceEEEeeccchhcccc----e-----ecCceeEeeeeeeeeeEEEECccee Q lcl|NC_011142. 270 LMA---TELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP----Q-----LLGLGITVPAEYKISGTEYRYPLCA 337 (343) Q Consensus 270 l~~---~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~----~-----~~~~~~~~~~~~~~gGv~i~~P~ai 337 (343) ... ...........++++++..+.+.=.+.+.-.+.+...+. . .++ ...+.+..|+ |..+.+|.|+ T Consensus 225 ~~~~~~~~~~~~~~~~~~~~~~~~gDfs~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~-~v~~r~~~r~-d~~v~~~~a~ 302 (311) T protein:vir:81 225 GGPEAVTASTGVYRTTNPNVKAIAGDFSAFRWGVQVSIPLELIEFGDPDGLGDLKRQN-QIAIRAEVVY-GIGIMSTDAF 302 (311) T ss_pred ccccccccccchhcccCCccEEEEEecccEEEEEeccceEEEeccCCCCcchhhhhcC-cEEEEEEEEe-ccEeecccce Confidence 000 000000011123344444443321222211111211111 1 112 2456667776 6888899999 Q ss_pred eeeccC Q lcl|NC_011142. 338 QYVDML 343 (343) Q Consensus 338 ~~~dGI 343 (343) +++.|- T Consensus 303 ~~l~~a 308 (311) T protein:vir:81 303 AVVRDA 308 (311) T ss_pred EEEEee Confidence 999999 No 20 >protein:vir:80376 Length: 435 # NCBI annotation: gp6, major capsid head protein # Family: family:all:21 # MgeID: mge:1881 # MgeName: phi644-2 # Cross-refs: genbank:acc:YP_001111085;genbank:gi:134288639;genbank:GeneID:4960624 Probab=98.91 E-value=3e-10 Score=72.76 Aligned_cols=317 Identities=13% Similarity=0.091 Sum_probs=165.0 Q ss_pred CCcceeccc--hhhhhchh---hh--chhcccccccCcc-hhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccc Q lcl|NC_011142. 
1 MSEKRVVID--AQTIAGNR---WL--NKFLDSNATIGVP-SVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPV 72 (343) Q Consensus 1 ~~~~~~~~~--~~~~~~~~---~~--~~~~~~~~~~~~~-~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v 72 (343) .++|...+. ++.+...+ +. ..+...-..-... .+.......|.++.. +.+...|++...+....+++-.. T Consensus 91 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~lvP--~~~~~~ii~~l~~~~~i~~~~~~ 168 (435) T protein:vir:80 91 PEVKGAKMARMVRALAAARGDAQLASKLAIERGFGEEVAMSLNTLSPGAGGVLVP--ENLSSEVIELLRPKSVVRKLGAR 168 (435) T ss_pred hhhhHHHHHHHHHHHHhccchhHHHHHHHHhhhhhhhhhhhhcccCCCCCccccc--hhHHHHHHHHHhhhchhhhccce Confidence 111100000 00000000 00 0000000000000 011111122344444 34556777766555444444211 Q ss_pred cCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHH Q lcl|NC_011142. 73 LANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFR 152 (343) Q Consensus 73 ~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~ 152 (343) ..+.....+.|.+.+..+.+.+++.. ..+|..+..+++.....+.++.-+.+|.+=|+. ...+.+|..--....+. T Consensus 169 --~v~~~~~~~~~p~~~~~~~a~~v~E~-~~~~~~~~~f~~i~~~~~k~~~~~~is~ell~d-s~~~~~l~~~i~~~l~~ 244 (435) T protein:vir:80 169 --TLPLSNGNITIPRLKGGAIVGYIGAD-TDIPTTQQQFDDLKLTAKKMAALVPIANDLIKY-AGVNPNVDQIVVGDLTA 244 (435) T ss_pred --eeecCCCceEEEEEeCCcceeeeccC-ccccccccceeeEEEeeEEEEEeehhhHHHHHh-hcccHHHHHHHHHHHHH Confidence 12222224566666666777777655 458888888999999999999888887554443 33344577778888999 Q ss_pred HHHHhhhheeeeeehh-hcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHH Q lcl|NC_011142. 153 GSEEHSQRVAYFGDTN-RNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRA 231 (343) Q Consensus 153 ~~~~~~n~~~f~G~~~-~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L 231 (343) ++.+.+++.+|+|+.. 
....||+++...........+ .+.+.+..|+.+++..+.....+ ..+..++|+|..+..| T Consensus 245 a~~~~~d~a~l~G~G~~~~p~Gi~~~~~~~~~~~~~~~--~~~~~~~~d~~~~~~~~~~~~~~-~~~~~~vmn~~~~~~L 321 (435) T protein:vir:80 245 AIGAREDKAFIRDDGTANTPKGLRFWALPGNVITASDG--STLQKIETDLGKAILALENADAN-LTQPGWIMAPRTFRFL 321 (435) T ss_pred HHHHHHHHHhhccCCCCCcccceeecccccceeecccc--cchhhHHHHHHHHHHHhhccccc-cccCEEEEcHHHHHHH Confidence 9999999999999864 357899998876554443333 34667788999998887654222 3456789999999999 Q ss_pred hccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc Q lcl|NC_011142. 232 SSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP 311 (343) Q Consensus 232 ~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~ 311 (343) .... +..|.-++.-+ ....+-|.|+..... + + ...+.++....++|-+=.+++ +..-..++.... T Consensus 322 ~~lk--d~~G~~l~~~~----~~~~l~G~pv~~~~~--~--p----~~~~~~~~~~~i~~gd~s~~~-i~~~~~~~i~~~ 386 (435) T protein:vir:80 322 EGLR--DGNGNKVYPEL----ANGMLKGYPVGKTTQ--V--P----INLGEAGKESEIYFTDFGDVF-IGEEETLEIDYS 386 (435) T ss_pred Hhhh--ccCCceeccCC----CCCeEeeeeeEEecc--c--c----ccccCCCCcceEEEEEcccEE-EEeecceEEEEe Confidence 7533 33443332111 111344555433211 1 0 011112222223332211122 211112221100 Q ss_pred ---------------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
312 ---------------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 ---------------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.+| ...+.+..++ ++.+.+|.|++++.|+ T Consensus 387 ~~~~~~~~~~~~~~~f~~n-~~~~r~~~r~-d~~~~~~~a~~~l~~~ 431 (435) T protein:vir:80 387 KEATYKDADGHMVSAFQRD-QTLIRVIAKN-DFGPRHVESIAVLSGV 431 (435) T ss_pred ccccccccccchhhhhhcC-cceeeeeeee-CcEeecccceEEEecc Confidence 1122 2455667776 6899999999999999 No 21 >protein:vir:99920 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:1611 # MgeName: Halo # Cross-refs: genbank:acc:YP_655524;genbank:gi:109392294;genbank:GeneID:4157089 Probab=98.85 E-value=5.7e-10 Score=71.25 Aligned_cols=296 Identities=8% Similarity=-0.048 Sum_probs=156.1 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLP 105 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip 105 (343) |+ +++++.+.. .. +.+..+|++..++....+++..+. +.+. ....|.+....+.+.+++.. ..+| T Consensus 1 Ma-------t~tt~~g~~--vP--~~~~~~ii~~~~~~s~l~~~~~~i-~~~~--~~~~~p~~~~~~~a~wv~Eg-~~~~ 65 (311) T protein:vir:99 1 MA-------TFGTGNLKN--LP--RNIADGMVKDVVQGSTVAVLSARK-PQRF--GNEDIITFNGRPKAEFVGEG-QQKS 65 (311) T ss_pred Cc-------eecCCCcee--cc--HHHHHHHHHHHHhhchhhhhccee-eccC--CceEEEEEeCCceeEEeecC-cccc Confidence 22 233332222 22 234456777776666666665542 2222 23456666677788888765 4688 Q ss_pred eeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehh---hcceeeeecCCccc Q lcl|NC_011142. 106 RVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTN---RNMSGLLNNPNVTK 182 (343) Q Consensus 106 ~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~---~g~~GLlN~p~v~~ 182 (343) ..+..++......+.++.-+.+|.+=++.......+|...-....++++++.+|+.+|+|+.. .+..|+.+..+... 
T Consensus 66 ~~~~~f~~v~l~~~k~~~~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~g~~~g~~~~g~~~~~~~~~ 145 (311) T protein:vir:99 66 STTGEFDFVTSTPKKAQVTMRFNEEVQWADEDYQLGVLQTLSEAGAEALARALDLGLYHRINPLTGTVIPGWSNYLGAAS 145 (311) T ss_pred cccceeeEEEEeeEEEEEeehhhHHHhhcccccHHHHHHHHHHHHHHHHHHHHHHHhhcccCcccCcccccccccccccc Confidence 888888898888898888887775533333345567888888899999999999999999753 34555544433322 Q ss_pred cccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccc Q lcl|NC_011142. 183 TSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPI 262 (343) Q Consensus 183 ~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~ 262 (343) .......+ +......||.+++..+... +....++.++|+|..+..|.+-. +..|.-++.-......-..+-|.|+ T Consensus 146 ~~~~~~~~--~~~~~~~~i~~~~~~~~~~-~~~~~~~~~vmn~~~~~~L~~lk--d~~G~~l~~~~~~~~~~~~l~G~Pv 220 (311) T protein:vir:99 146 KRVELTAD--TIANPDLAIEAAVGLLVAN-GHPTPVNGLALHPSIAWGLSTAR--YTDGRKKFPELGLGIGVSSFEGIDA 220 (311) T ss_pred ceeecccc--ccchhHHHHHHHHHHHhhh-ccCCCccEEEEcHHHHHHHHhhh--ccCCCeeecCcccCCCCceecceee Confidence 22221122 2334456777777666543 33345677999999999996532 3333222111100100112334444 Q ss_pred ccccccee-eechhhhccccCCccceEEEEEcccceEEEeeccchhcccce---ec---Cc----eeEeeeeeeeeeEEE Q lcl|NC_011142. 263 DIKIRFQL-MATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQ---LL---GL----GITVPAEYKISGTEY 331 (343) Q Consensus 263 ~i~~~~~l-~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~---~~---~~----~~~~~~~~~~gGv~i 331 (343) .......- .....+......+.++.+++-+. .+.+.+.+-..++..-.. .. ++ -.-+.+..++++. 
+ T Consensus 221 ~~s~~i~~~~~~~~~~~~~~~~~~~~~~~Gdf-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~~~~r~~~r~d~~-v 298 (311) T protein:vir:99 221 SVSDTVNGGDEADPDDEDLDAARAVRGIVGDF-ANGIHWGVQRDIPVELIKYGDPDGQGDLKRHNQIALRLEIVYGWY-V 298 (311) T ss_pred EeecccccccccccccchhhccCcceEEEeec-cccEEEEEecCceEEEeecCCCCcchhhhhcCcEEEEEEEeecce-e Confidence 32211000 00000000011112222222111 122333332222211110 01 11 1345678888775 6 Q ss_pred ECcceeeeeccC Q lcl|NC_011142. 332 RYPLCAQYVDML 343 (343) Q Consensus 332 ~~P~ai~~~dGI 343 (343) ++|.++...++. T Consensus 299 ~~~~~v~~~~~~ 310 (311) T protein:vir:99 299 FTDRFVVIENAV 310 (311) T ss_pred cChhHeeeeccc Confidence 789999988888 No 22 >protein:vir:41 Length: 299 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:2 # MgeName: A118 # Cross-refs: genbank:acc:NP_463467;swissprot:trembl:q9t1b7;genbank:gi:16798789;uniprot:Q9T1B7;genbank:GeneID:922353 Probab=98.83 E-value=8.8e-10 Score=70.23 Aligned_cols=278 Identities=9% Similarity=0.009 Sum_probs=156.3 Q ss_pred cCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceee Q lcl|NC_011142. 29 IGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA 108 (343) Q Consensus 29 ~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~ 108 (343) ||..++.....+.+..+.. +.+..+|++.....-..+++..+. +.+.+. ..+.+.+. ..+.+++.. ..+|..+ T Consensus 1 ~g~~a~~~~~~~~~~~~iP--~~~~~~ii~~~~~~s~l~~~~~~~-~~~~~~--~~~~~~~~-~~a~~v~E~-~~~~~~~ 73 (299) T protein:vir:41 1 MGFNPDTTTMQSAKTGSIP--INISEQIITGVKNGSAAMKLAKAV-PMTKPE--EEFTFMSG-VGAFWVDEA-ERIQTSK 73 (299) T ss_pred CCcCCCcccccCCCceecc--hhHHHHHHHHHHhcchhhhhceee-ecCCCc--EEEEEEcC-CceeeeecC-ccccccc Confidence 5544443222222222222 345556777666666666655442 222232 33444443 446777654 5688888 Q ss_pred eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcC Q lcl|NC_011142. 
109 QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVN 188 (343) Q Consensus 109 ~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~ 188 (343) ..++........++.-+.++.+=++. ...++...-....++++++.+|+.+++|+....-.|+++.......++.. T Consensus 74 ~~f~~v~l~~~k~~~~~~is~ell~d---s~~~~~~~i~~~l~~a~~~~~d~a~l~G~g~~~~~gil~~~~~~~~~~~~- 149 (299) T protein:vir:41 74 PTFTKAKMRSKKMGVIIPTTKENLNY---SVTNFFSLMQAEIVEAFYKKFDQAVFTGVESPYNWNILKSATDASNLVEE- 149 (299) T ss_pred cceeEEEEeeEEEEEeehhhHHHHhc---CHHHHHHHHHHHHHHHHHHHHHHHHhhcccCcccccccccccccceeecc- Confidence 88999999999999999988754443 23568888889999999999999999999876667888765433222211 Q ss_pred ccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccc Q lcl|NC_011142. 189 YATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRF 268 (343) Q Consensus 189 w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~ 268 (343) .+. -++||.+++.++... + ..+..++++|+.+..|.+.. +..+.-++.=-..+... .+-|.|+.+.. T Consensus 150 -~~~----~~~~l~~~~~~l~~~--~-~~~~~~v~n~~~~~~L~~lk--d~~G~~l~~~~~~~~~~-~l~G~PV~~~~-- 216 (299) T protein:vir:41 150 -TAN----KYDDLNEAIGLIEAE--D-LEPNGIATIRKQRVKYRSTK--DGNGMPIFNTATSNGVD-DVLGLPIAYTP-- 216 (299) T ss_pred -ccc----cHHHHHHHHHhhhcc--c-CCcCEEEEcHHHHHHHHHhh--ccCCceeecCCcCCCCc-eecceeeEEec-- Confidence 111 267888888887642 2 35678999999999997533 33332221100001111 12344432221 Q ss_pred eeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc------------------eecCceeEeeeeeeeeeEE Q lcl|NC_011142. 269 QLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP------------------QLLGLGITVPAEYKISGTE 330 (343) Q Consensus 269 ~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~------------------~~~~~~~~~~~~~~~gGv~ 330 (343) ... . +.++..+++-+= ..+-+..-+.++.... ..++ ...+.+..++ |.. 
T Consensus 217 -----~~~---~--~~~~~~~~~gdf-s~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~r~~~~~-d~~ 283 (299) T protein:vir:41 217 -----KYT---F--GDKDISELVGDW-NQAYYGILRGVEYEILTEATLTTVADETGKPLNLAERD-MAAIKATFEV-GFM 283 (299) T ss_pred -----ccC---C--CCCceEEEEEec-ccEEEEEecCcEEEEeecccccccccccccchhhhhcC-cEEEEEEEEe-ccE Confidence 111 1 112222222221 1122222222222111 1122 2455677777 678 Q ss_pred EECcceeeeeccC Q lcl|NC_011142. 331 YRYPLCAQYVDML 343 (343) Q Consensus 331 i~~P~ai~~~dGI 343 (343) +++|.|++.+.+- T Consensus 284 v~~~~A~~~l~~~ 296 (299) T protein:vir:41 284 VVKDEAFSAVQPK 296 (299) T ss_pred EecccceEEEEec Confidence 8889999999999 No 23 >protein:vir:9759 Length: 303 # NCBI annotation: putative structural protein # Family: family:all:966 # MgeID: mge:175 # MgeName: 315.3 # Cross-refs: genbank:acc:NP_795521;genbank:gi:28876283;genbank:GeneID:1257824 Probab=98.82 E-value=2e-09 Score=68.28 Aligned_cols=284 Identities=10% Similarity=-0.060 Sum_probs=152.7 Q ss_pred hecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccce Q lcl|NC_011142. 34 VVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKL 113 (343) Q Consensus 34 ~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~ 113 (343) |.++...+ ++.. +.+...|++...+.-..+++..+. +.+ .....+.+....+.+.+++.. ..+|..+..++. T Consensus 1 m~t~t~gg--~liP--~~~~~~ii~~l~~~s~i~~l~~~~-~~~--~~~~~ip~~~~~~~a~wv~E~-~~~~~s~~~f~~ 72 (303) T protein:vir:97 1 MGTETSKA--SLFD--KHLVSDLINKVKGHSSLAKLSSQK-PIP--FNGSKEFTFTLDSDIDVVAEN-GKKTHGGLSLEP 72 (303) T ss_pred CcccCCCC--eEcc--hhHHHHHHHHHHhhchhhhhccee-ecC--CCceEEEEEecCcceEEeecC-ccccccccceee Confidence 33333322 2333 344566788777776667766553 222 223455666667788888865 558888888999 Q ss_pred eEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhh-c----ceeeeecCCccccccCcC Q lcl|NC_011142. 
114 HQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNR-N----MSGLLNNPNVTKTSATVN 188 (343) Q Consensus 114 ~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~-g----~~GLlN~p~v~~~~~~~~ 188 (343) ...+.+.++.-+.+|.+=+........+|...-....++++++.+|+.+++|+... | ..|..+..+...... T Consensus 73 v~l~~~kl~~~~~iS~ell~~~~d~~~~l~~~i~~~la~a~~~~ld~a~l~G~~~~~g~~~~~~~~~~~~~~~~~~~--- 149 (303) T protein:vir:97 73 VTIVPIKVEYGARLSDEFLYATEEEKIDILKAFNEGFAKKLARGIDLMAMHGINPRTKKASDVIGTNHFDSKVTQVV--- 149 (303) T ss_pred EEeeeEEEEEeehhhHHHhhcCccchHHHHHHHHHHHHHHHHHHHHhhhhcccccCCcccccccccccccccccccc--- Confidence 99999999988888755333333445667778888999999999999999996432 2 122222112111111 Q ss_pred ccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHH-HHHHhcCcceeecccccccccc Q lcl|NC_011142. 189 YATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVI-EHFQINNAYTLLTRNPIDIKIR 267 (343) Q Consensus 189 w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvl-e~l~~n~~~~~~~~~p~~i~~~ 267 (343) ...+.+..++||.+++..+... ...|..++|+|+.+..|.+.. +..+.-++ .-+........+-|.|+..... T Consensus 150 -~~~~~~~~~~~i~~~~~~~~~~---~~~~~~~vmn~~~~~~L~~lk--d~~g~~~~~~~~~~~~~~~~l~G~Pv~~s~~ 223 (303) T protein:vir:97 150 -KFTESEDADANIEAAVNLIQGA---EGVVTGLAMDTEFSTALAKVT--NGEMGPKMYPELAWGANPDSINGLKSSVNTT 223 (303) T ss_pred -ccccccchHHHHHHHHHHHhhc---CCCccEEEEcHHHHHHHHHhh--ccCCCeEEecCccCCCCCceecceeeEEecc Confidence 1112334578999998887542 245678999999999886432 22221111 0001111111234555433211 Q ss_pred ceeeechhhhccccCCccceEEEEEc-------ccceEEEeeccchhccc----ceecCceeEeeeeeeeeeEEEECcce Q lcl|NC_011142. 268 FQLMATELAAAGVSNGNKDRYVVYDK-------SERNLALAKPIPFRMLA----PQLLGLGITVPAEYKISGTEYRYPLC 336 (343) Q Consensus 268 ~~l~~~~~~~~g~~~~g~dr~v~y~~-------~~~~~~~~v~~~~~~~~----~~~~~~~~~~~~~~~~gGv~i~~P~a 336 (343) ....+....+++..++-+. ..+.+++.+........ 
.-.+++ .-+.++.++ +..+++|.| T Consensus 224 -------v~~~~~~~~~~~~~~~Gdf~~~~~~~~~~~~~~~~~~~~~~d~~~~~~~~~n~-~~~r~~~r~-~~~v~~p~a 294 (303) T protein:vir:97 224 -------VGAGADEAESKDLVIIGDFESMFKWGYAKQIPMEIIKYGDPDNSGKDLKGYNQ-IYLRAEAYI-GWGILDAKS 294 (303) T ss_pred -------cCCccccCCCccEEEEeeccccEEEEEecCcEEEEeeccCCCCcchhhhhcCc-EEEEEEEEe-ccEeecccc Confidence 1111111112222221111 12233333211000000 011221 334557776 678899999 Q ss_pred eeeeccC Q lcl|NC_011142. 337 AQYVDML 343 (343) Q Consensus 337 i~~~dGI 343 (343) ++++... T Consensus 295 f~~l~~~ 301 (303) T protein:vir:97 295 FARVTKG 301 (303) T ss_pred eEEeeCC Confidence 9999999 No 24 >protein:vir:2504 Length: 305 # NCBI annotation: major capsid subunit gp9 # Family: family:all:507 # MgeID: mge:53 # MgeName: TM4 # Cross-refs: genbank:acc:NP_569745;genbank:gi:18496895;genbank:GeneID:932268 Probab=98.80 E-value=2e-09 Score=68.26 Aligned_cols=274 Identities=10% Similarity=0.017 Sum_probs=148.8 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcC--- Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANAS--- 102 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~--- 102 (343) |+. .++++ ++.+.. +.+.+.|++..+..-..+++..+.. ....+..+.+......+.+++..+. T Consensus 1 ma~------~t~~~-gg~liP---~~~~~~Ii~~~~~~s~l~~l~~~~~---~~~~~~~~p~~~~~~~a~wv~E~~~~~~ 67 (305) T protein:vir:25 1 MAD------ISRAE-VASLIQ---EAYSDTLLAAAKQGSTVLSAFQNVN---MGTKTTHLPVLATLPEADWVGESATDPK 67 (305) T ss_pred CCC------ccCCc-cceecC---HHHHHHHHHHHHhhchhhhhcceee---ccCCcEEEEEEeCCcceEEeeccccccc Confidence 222 11222 222222 3345667777777766666665532 2222455666666677788766542 Q ss_pred -ccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhh---cceeeeecC Q lcl|NC_011142. 
103 -DLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNR---NMSGLLNNP 178 (343) Q Consensus 103 -dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~---g~~GLlN~p 178 (343) ++|..+..+++.....+.++.-..++.+=++. ...++..--....++++++.+|+.+|+|+..- +..+.++.- T Consensus 68 ~~~~~s~~~f~~i~~~~~k~~~~~~is~ell~d---s~~~~~~~i~~~l~~~~a~~~d~a~~~G~g~~~~~~~~~~~~~~ 144 (305) T protein:vir:25 68 GVKPTSKVTWANRTLVAEEIAVIIPVHENVIDD---ATVAVLTEVAELGGQAIGKKLDQAVIFGTDKPASWVSPALIPAA 144 (305) T ss_pred ccccccccceeeEEeeeEEEEEeehhhHHHHhc---chHHHHHHHHHHHHHHHHHHHhhhheeccCCCCCcccccccccc Confidence 46766777888888899999888887644432 34568888889999999999999999998631 122222221 Q ss_pred CccccccCcCcc-ccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceee Q lcl|NC_011142. 179 NVTKTSATVNYA-TCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLL 257 (343) Q Consensus 179 ~v~~~~~~~~w~-~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~ 257 (343) ..... ....+. ..+.+++++++.++...+.. ....+..++|+|..+..|.+.. +..+.-++ .. ..+ T Consensus 145 ~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~v~~~~~~~~l~~lk--d~~G~~i~----~~---~~l 211 (305) T protein:vir:25 145 VTAGQ-AVEVVGGVANESDIVGATNRAAKAVAS---AGWAPDTLLSSLALRYEVANIR--DANGNPVF----RD---DSF 211 (305) T ss_pred ccccc-cccccccchhhhHHHHHHHHHHHhhhh---cccccceeEecHHHHHHHHHhh--ccCCceee----cC---Ccc Confidence 11111 111121 22345567777777665543 2234567999999999986432 43343222 11 123 Q ss_pred ccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc--------------eecCceeEeeee Q lcl|NC_011142. 258 TRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP--------------QLLGLGITVPAE 323 (343) Q Consensus 258 ~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~--------------~~~~~~~~~~~~ 323 (343) -|.|+.+.... ....++. .+++ -|.+.+.+.....++.... 
-.++ .+.+.++ T Consensus 212 ~G~Pv~~~~~~-----------~~~~~~~-~~~~-gd~s~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~-~~~~R~~ 277 (305) T protein:vir:25 212 AGFRTFFNRNG-----------AWDADAA-IEVI-ADSSRVKIGVRQDITVKFLDQATLGTGENQINLAERD-MVALRLK 277 (305) T ss_pred cccceEEcCcc-----------CCCCCcc-EEEE-EecceEEEEEecCeEEEEeeeeeeecCCceeeeeecC-cEEEEEE Confidence 45554332110 0011111 1222 1222222222222211100 0111 2445667 Q ss_pred eeeeeEEEECcceeeeeccC Q lcl|NC_011142. 324 YKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 324 ~~~gGv~i~~P~ai~~~dGI 343 (343) .|+ |+.+.+|.+++.++|+ T Consensus 278 ~r~-~~~v~~p~a~v~~~~~ 296 (305) T protein:vir:25 278 ARF-AYVLGVSATAQGANKT 296 (305) T ss_pred Eee-cceeeCcccEEEEccc Confidence 777 5778999999999999 No 25 >protein:vir:5739 Length: 366 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:122 # MgeName: PY54 # Cross-refs: genbank:acc:NP_892050;genbank:gi:33770513;interpro:IPR006444;uniprot:Q7Y410;genbank:GeneID:1732928 Probab=98.80 E-value=3.3e-09 Score=67.07 Aligned_cols=315 Identities=8% Similarity=0.033 Sum_probs=156.0 Q ss_pred CCcceeccc------hhhhh---chhhhchhcccccccCcc----hhecchhhhhhhhHHHHHHHHHHHHhhhhhcccch Q lcl|NC_011142. 1 MSEKRVVID------AQTIA---GNRWLNKFLDSNATIGVP----SVVNDADGGAAYYISQLASLETTVYEVPYADITYL 67 (343) Q Consensus 1 ~~~~~~~~~------~~~~~---~~~~~~~~~~~~~~~~~~----~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~ 67 (343) -+|....-. +..++ |+.. ..+..+....+-. ++.+.+.+ |.++.. +.+..+|++..++....+ T Consensus 21 ~~~~~~~kg~~~~~~~~a~a~~~g~~~-~a~~~a~~~~~~~~~~~a~~~~~~~-Gg~lvP--~~~~~~ii~~l~~~s~l~ 96 (366) T protein:vir:57 21 KEELQQYKGAGMTRMVMSIAAGKGNLA-DAAKFAATELGDTGLSMAISTAAGS-GGALIP--QNMQNEVIELLRDRTVVR 96 (366) T ss_pred ccccccccchhHHHHHHHHHhcccchh-HHHHHHHHhhcchhhhhhccccccC-Cccccc--hhHHHHHHHHHhhhcchh Confidence 000000000 00000 1100 0000000000000 11222333 444444 334566777666554444 Q ss_pred hhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHH Q lcl|NC_011142. 
68 EDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQA 147 (343) Q Consensus 68 ~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~ 147 (343) ++ .. +..+.....+.+.+.+....+.+++.. .++|..+..+++...+.+.++.-..++.+=|+.+ ..+++.--. T Consensus 97 ~l-g~-~~v~~~~g~~~~p~~t~~~~a~wv~E~-~~~~~s~~~f~~i~~~~~k~~~~~~iS~ell~ds---~~~~~~~i~ 170 (366) T protein:vir:57 97 IL-GA-RSIPLPNGNLSMPRLSGGATAGYVGEG-KDVVATGATFDDVKLSAKTMIALVPVSNQLIGRA---GFNVEQLLL 170 (366) T ss_pred hh-ce-eeeecCCCceEEEEEeCCcceeeeccC-ccccccccceeEEEEeeEEEEEeehhhHHHHhhh---hHHHHHHHH Confidence 43 11 111222224556666666677777664 5688888889999999999998888874434333 456777788 Q ss_pred HHHHHHHHHhhhheeeeeehh-hcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHH Q lcl|NC_011142. 148 ELAFRGSEEHSQRVAYFGDTN-RNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPD 226 (343) Q Consensus 148 ~aA~~~~~~~~n~~~f~G~~~-~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~ 226 (343) ...++++.+.+|+.+++|+.. ..-.||+|..+.........-...+...+..++..+....... +........+|+|. T Consensus 171 ~~l~~a~~~~~d~a~l~G~G~~~~p~Gi~~~~~~~~~~~~~~~t~~~~~~~~~~~~~~~~~~~~~-~~~~~~a~~vmn~~ 249 (366) T protein:vir:57 171 GDILSAIATREDKAFLRDDGTGDTPKGMKAVATAANRLVAWTGTAINLTTIDEYLDSLILKHMDS-NSNMIRCGWGLSNR 249 (366) T ss_pred HHHHHHHHHHHHHHhhccCCCCccccceeeccccccceeeccccccchhhHHHHHHHHHHhhhcc-ccccccCEEEecHH Confidence 889999999999999999863 4678999988754432221111233444444444333332221 22223456899999 Q ss_pred HHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccch Q lcl|NC_011142. 227 LWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPF 306 (343) Q Consensus 227 ~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~ 306 (343) .+..|.+.+ +..|..++.-+. ...+-|.|+..... + +. 
..+.++...-++|-+=.+ +-+..-..+ T Consensus 250 ~~~~L~~lk--d~~G~~l~~~~~----~g~l~G~Pvv~s~~--i--p~----~~~~~~~~~~i~~gdfs~-~~i~~~~~i 314 (366) T protein:vir:57 250 TYMTLFGLR--DGNGNKVYPEMS----QGILKGYPIQRTSA--I--PA----NLGDDGNESEIYFCDFND-VVIGEDGMM 314 (366) T ss_pred HHHHHHhhh--ccCCceeccCCC----CCeecceeeEEccc--c--cc----ccccCCCccEEEEEecce-EEEEEecce Confidence 999997543 444443331111 11234555433211 0 10 111111112233322112 222222222 Q ss_pred hccc---------------ceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 307 RMLA---------------PQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 307 ~~~~---------------~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +... ...++ ...+.+..++ ++.+++|.|+++++|| T Consensus 315 ~i~~~~ea~~~~~~g~~~~~f~~~-~~~iR~~~~~-d~~v~~~~a~~~lt~~ 364 (366) T protein:vir:57 315 KVDFSTEATYKDADGQLVSAFARN-QSLIRVVTEH-DIGFRHPEGLVLGTGV 364 (366) T ss_pred EEEEeeccccccccccchhhhhcC-ceeEEeeeee-CcEeeccccEEEEecc Confidence 2211 00111 2456667776 6788999999999999 No 26 >protein:vir:8420 Length: 477 # NCBI annotation: gp15 # Family: family:all:21 # MgeID: mge:155 # MgeName: Omega # Cross-refs: genbank:acc:NP_818316;genbank:gi:29566752;genbank:GeneID:1260033 Probab=98.79 E-value=3.4e-10 Score=72.47 Aligned_cols=321 Identities=13% Similarity=0.148 Sum_probs=155.6 Q ss_pred CCcceec-----------------------------cchhhhh-------chhhhchhcccccccCcchhecchhhhhhh Q lcl|NC_011142. 1 MSEKRVV-----------------------------IDAQTIA-------GNRWLNKFLDSNATIGVPSVVNDADGGAAY 44 (343) Q Consensus 1 ~~~~~~~-----------------------------~~~~~~~-------~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f 44 (343) +.+..+. ..++.+. .+........... 
-..++.....++|.+ T Consensus 90 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~gg~l 167 (477) T protein:vir:84 90 VRKATVEVNEALTYEKGNGQSYFRDLAMQTVGMADEPAKERLRRHMVDVESDKEIRKIAKVGE--EYRDLDRNGGTGGYA 167 (477) T ss_pred hcccccccccchhhhhhHHHHHHHHHHHHHhhhhhhHHHHHHHHHHhhhhhhhhHHHHHHhhh--hhccccccCCCccee Confidence 1111110 0000000 0000000000000 000111111223333 Q ss_pred hHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccc-cceeecCCc----CccceeeeccceeEEEEE Q lcl|NC_011142. 45 YISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAA-MGKFISANA----SDLPRVAQSAKLHQVELG 119 (343) Q Consensus 45 ~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G-~a~~~~~~~----~dip~v~~~~~~~~~~v~ 119 (343) ... +.+...+++...+....++++... ..+.+...+.+...+... .+.+.+..+ ...|..+..++....+.+ T Consensus 168 v~~--~~~~~~ii~~l~~~~~i~~~~~~~-~~~~~~~~~~ip~~~~~~~~a~~~~Eg~~~~~~~~~~s~~~f~~i~~~~~ 244 (477) T protein:vir:84 168 VPP--LWMMNRFIELARAGRTYANLCPTE-PLPGGTSSINIPKILTGTSTAIQAADNAALTAPSAHEVDLTDGFVQANVK 244 (477) T ss_pred ecc--chhHHHHHHHhhhcchHHHhhcee-eecCCcceeEEEEEecCcceeeeeccCcccccccccccccceeeEEEeee Confidence 333 234456777666665556665542 122233334555443322 233444432 245666777888888888 Q ss_pred EEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehh-hcceeeeecCCccccccC---cCccccCHH Q lcl|NC_011142. 120 YAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTN-RNMSGLLNNPNVTKTSAT---VNYATCTGQ 195 (343) Q Consensus 120 ~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~-~g~~GLlN~p~v~~~~~~---~~w~~~t~~ 195 (343) .++.-+.+|.+=|+. ...++..--....+.+++..+|..+++|+.. ....||+|.+++...+.+ ..|.. 
.+ T Consensus 245 k~~~~~~iS~ell~d---s~~~l~~~i~~~l~~~~~~~~d~~~l~G~Gt~~~p~Gi~~~~~~~~~~~~~~~~t~~~--~~ 319 (477) T protein:vir:84 245 TIAGQQGIAIQLLDQ---AAVSVDEFVFRDLAADYANKLNVQVISGTGSNNQVVGVRATAGITQVTATSAGSALEK--HQ 319 (477) T ss_pred eEEeeeHHHHHHHhc---cchhHHHHHHHHHHHHHHHHHHHHHhccCCCCCccceeeeccccccccccccccchhh--HH Confidence 888777776544443 3557888888889999999999999999864 457999999987654432 33433 44 Q ss_pred HHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHH----------HH---HhcCcceeeccccc Q lcl|NC_011142. 196 ELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIE----------HF---QINNAYTLLTRNPI 262 (343) Q Consensus 196 ~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle----------~l---~~n~~~~~~~~~p~ 262 (343) ..+.+|.+++..+.. ++...+...+|+|..+..|..-. +..+.-++. ++ -.+.+.-.+-|.|+ T Consensus 320 ~~~~~i~~~~~~~~~--~~~~~~~~~v~~~~~~~~l~~lk--d~~G~~l~~~~~~~~~~~~~~~~~~~~~~~~~l~G~pV 395 (477) T protein:vir:84 320 IIYQKIADAIQRVHT--SRFLEPEVIVMHPRRWASFHAIF--AGDDRPLIVPSGPGFNNLGVLTEVASQRVVGQMHGLPV 395 (477) T ss_pred HHHHHHHHHHhhccc--cccCCccEEEEcHHHHHHHHHhh--ccCCCeeeecCcccccccccccccccccccchhcccce Confidence 566667777665532 34445667999999999886422 322221110 00 00011112234443 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecC-ceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLG-LGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~-~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) .+.. ... .+.+.++....++|-+-.+++-..-.+.+...+-.+.+ ....+.........-+|+|.|++.++ T Consensus 396 v~s~-------~~p-~~~~~~~d~~~i~~gd~~~~~i~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~~r~~~afv~~t 467 (477) T protein:vir:84 396 VTDP-------TLP-TTLGTGTDQDVIHVLRASDLALFESSVRMRALQETRAENLSVLLQVYGYLAFTAARFPQSVVEIG 467 (477) T ss_pred EecC-------ccc-ccccccCCcceEEEEEeceEEEEeeceeEEeccccccccceeeeeehhhhhhhhhccccceEEee Confidence 2221 111 11222222223444443444433323333333322222 22222222223335678899999999 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) |. T Consensus 468 ~~ 469 (477) T protein:vir:84 468 GT 469 (477) T ss_pred cc Confidence 99 No 27 >protein:vir:80684 Length: 315 # NCBI annotation: gp6 # Family: family:all:966 # MgeID: mge:1884 # MgeName: PA6 # Cross-refs: genbank:acc:YP_001285582;genbank:gi:148727088;genbank:GeneID:5247055 Probab=98.78 E-value=1.5e-09 Score=68.97 Aligned_cols=286 Identities=13% Similarity=0.008 Sum_probs=151.6 Q ss_pred hecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccce Q lcl|NC_011142. 34 VVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKL 113 (343) Q Consensus 34 ~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~ 113 (343) |...+.+.+.++.. +.+...|++..+..-..+++..+. +.......+.+....+.+.+++.. ..+|..+..++. T Consensus 1 Ma~~~~~~gg~~vP--~~~~~~ii~~l~~~s~i~~l~~~i---~~~~~~~~ip~~~~~~~a~wv~Eg-~~~~~s~~~f~~ 74 (315) T protein:vir:80 1 MADDFLSAGKLELP--GSMIGAVRDRAIDSGVLAKLSPEQ---PTIFGPVKGAVFSGVPRAKIVGEG-EVKPSASVDVSA 74 (315) T ss_pred CCCCcCCcCceEcc--hHHHHHHHHHHHhhchhhhhccee---ecCCCceEEEEEeCCcceEEeeCC-ccccccccceee Confidence 33334444444444 345566777776666666655442 222234566777777788888875 468888888888 Q ss_pred eEEEEEEEEEEEeecHHHHHHHHHhC-CCccHHHHHHHHHHHHHhhhheeeeeehhh---cceeeeecCCccccccCcCc Q lcl|NC_011142. 114 HQVELGYAGVECHYSLDELRTTAAVN-MPIDSMQAELAFRGSEEHSQRVAYFGDTNR---NMSGLLNNPNVTKTSATVNY 189 (343) Q Consensus 114 ~~~~v~~~~~~~~~~~~El~~a~~~g-~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~---g~~GLlN~p~v~~~~~~~~w 189 (343) .....+.++.-..+|.+=++.....- -.|...-....++++++.+|+.+|+|+... +..|+.+.-+... . 
T Consensus 75 v~l~~~kl~~~~~iS~ell~~s~~~~~~~l~~~i~~~la~ai~~~~d~a~~~G~~~~~~~~~~~~~~~~~~~~-----~- 148 (315) T protein:vir:80 75 FTAQPIKVVTQQRVSDEFMWADADYRLGVLQDLISPALGASIGRAVDLIAFHGIDPATGKAASAVHTSLNKTK-----N- 148 (315) T ss_pred eEeeeeeEEeeehhhHHHhhcCchhHHHHHHHHHHHHHHHHHHHHHhhheeeccCCCCCcccccccccccccc-----c- Confidence 88888888887777654332211111 115566677889999999999999997532 2333332211100 0 Q ss_pred cccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHH----hcCcceeecccccccc Q lcl|NC_011142. 190 ATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQ----INNAYTLLTRNPIDIK 265 (343) Q Consensus 190 ~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~----~n~~~~~~~~~p~~i~ 265 (343) ........++|+.+++..+... ....+...+|+|..+..|.+-+..+..+ +...++. ...+ ..+-|.|+... T Consensus 149 ~~~~~~~~~~d~~~~~~~~~~~--~~~~~~~~imn~~~~~~L~~l~~~~g~~-~~g~~~~~~~~~g~~-~tl~G~PV~~~ 224 (315) T protein:vir:80 149 IVDATDSATADLVKAVGLIAGA--GLQVPNGVALDPAFSFALSTEVYPKGSP-LAGQPMYPAAGFAGL-DNWRGLNVGAS 224 (315) T ss_pred eeeccccchHHHHHHHHHHhhc--cCccceEEEEcHHHHHHHHHHhhccCCc-ccccccccccccCCC-ceecceeeEec Confidence 1112334567888888776532 3344567999999999986543221111 1111111 1111 13445554332 Q ss_pred ccceeeechhhhccccCCccceEEEEEcc------cceEEEeeccchhcccceecCc----eeEeeeeeeeeeEEEECcc Q lcl|NC_011142. 266 IRFQLMATELAAAGVSNGNKDRYVVYDKS------ERNLALAKPIPFRMLAPQLLGL----GITVPAEYKISGTEYRYPL 335 (343) Q Consensus 266 ~~~~l~~~~~~~~g~~~~g~dr~v~y~~~------~~~~~~~v~~~~~~~~~~~~~~----~~~~~~~~~~gGv~i~~P~ 335 (343) ... + .....+.+.+..++.-+.+ .+.+++.+-..-... ....++ ...+.++.++ |..+.+|. T Consensus 225 ~~~----~--~~~~~~~~~~~~~~~GDfs~~~~g~~~~~~i~i~~~~~~~-~~~~~~~~~~~v~~r~~~r~-~~~v~~~~ 296 (315) T protein:vir:80 225 STV----S--GAPEMSPASGVKAIVGDFSRVHWGFQRNFPIELIEYGDPD-QTGRDLKGHNEVMVRAEAVL-YVAIESLD 296 (315) T ss_pred CcC----C--cccccccccccEEEEeecccEEEEEecCeeEEEecccccc-CcccchhhcCcEEEEEEEEe-cceeeccc Confidence 111 0 0111111222223222222 122222221100000 001111 2455667776 68899999 Q ss_pred eeeeeccC Q lcl|NC_011142. 
336 CAQYVDML 343 (343) Q Consensus 336 ai~~~dGI 343 (343) |++++.+. T Consensus 297 a~~~l~~~ 304 (315) T protein:vir:80 297 SFAVVKEK 304 (315) T ss_pred ceEEEeec Confidence 99999999 No 28 >protein:vir:105905 Length: 304 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:1514 # MgeName: phiETA3 # Cross-refs: genbank:acc:YP_001004375;genbank:gi:122891830;genbank:GeneID:4712376 Probab=98.75 E-value=3.8e-09 Score=66.73 Aligned_cols=284 Identities=12% Similarity=0.014 Sum_probs=156.0 Q ss_pred hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCC Q lcl|NC_011142. 21 KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISAN 100 (343) Q Consensus 21 ~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~ 100 (343) |+.+...+.. .++.+.++....+ .+.+.+++........+++..+.. .+.....+.+.+..+.+.+++.. T Consensus 1 ma~~~~~~~~----~~~t~~gg~lip~---~~~~~ii~~~~~~~~l~~~~~~~~---~~~~~~~ip~~~~~~~a~~v~E~ 70 (304) T protein:vir:10 1 MATPTYTPGN----VILSDFKNGVIPA---EQGTLIMKDIMANSAIMKLAKNEP---MTAQKKKFTYLAKGVGAYWVSET 70 (304) T ss_pred Cccccccccc----ccccCCCceecch---hHHHHHHHHHHhccchhhhcceee---ccCCceEEEEEeCCcceEEeecC Confidence 3333332222 1222333333333 344567777766666666655432 22233456666667778888765 Q ss_pred cCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCc Q lcl|NC_011142. 101 ASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNV 180 (343) Q Consensus 101 ~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v 180 (343) ..+|..+..++........++.-+.++.+=++. 
...++...-....++++++.+|+.+++|+...+-.|.+....+ T Consensus 71 -~~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~~~ia~~~d~~~l~G~g~~~~~~~~~~~~~ 146 (304) T protein:vir:10 71 -ERIQTSKPEYAQAEMEAKKIGVIIPLSKEFLKW---TAKDFFNEVKPLIAEAFYKAFDQAVIFGTKSPYNTSTSGKPLV 146 (304) T ss_pred -cccccccceeeEEEEEEEEEEEeehhhHHHHhc---chHHHHHHHHHHHHHHHHHHHHhhheeccCCCccccccccccc Confidence 457888888899999999999888887644433 3466888888889999999999999999876554554444333 Q ss_pred cccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccc Q lcl|NC_011142. 181 TKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRN 260 (343) Q Consensus 181 ~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~ 260 (343) +......... .+....++||.+++.++... + ..+..++|+|+.|..|.+.. +..+.-++ ..++. .+-|. T Consensus 147 ~~~~~~~~~~-~~~~~~~~~i~~~~~~l~~~--~-~~~~~~v~~~~~~~~L~~lk--d~~G~~l~----~~~~~-~l~G~ 215 (304) T protein:vir:10 147 EGAEEKGNVV-TDTNNLYVDLSALMATIEDE--E-LDPNGVLTTRSFRSKMRNAL--DANDRPLF----DANGN-EIMGL 215 (304) T ss_pred cccccccccc-ccccchHHHHHHHHHHhhhc--c-CCcCEEEEcHHHHHHHHHhh--ccCCcEee----cCCCc-cccce Confidence 3222111111 12334588889998887642 2 34568999999999996432 33332211 11111 23345 Q ss_pred cccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc----c-------ceec----C-c---eeEee Q lcl|NC_011142. 261 PIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML----A-------PQLL----G-L---GITVP 321 (343) Q Consensus 261 p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~----~-------~~~~----~-~---~~~~~ 321 (343) |+..... . .. . .++-.+++- |.+.+-+..-..++.- + .... + + ...+. 
T Consensus 216 PV~~~~~-------~--~~--~-~~~~~~~~g-d~~~~~~~~~~~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~~~~~~r 282 (304) T protein:vir:10 216 PLSYTGA-------D--VY--D-KKKSLALMG-DWDYARYGILQGIEYAISEDATLTTLQASDASGQPVSLFERDMFALR 282 (304) T ss_pred eeEEecc-------c--cc--C-CCCcEEEEE-ehhhEEEEEecceEEEEeecceeeeecccccCccchhhhhcCcEEEE Confidence 5432211 0 00 0 111112221 1122212211111110 0 0000 0 0 14455 Q ss_pred eeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 322 AEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 322 ~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++.|+ |..+.+|.|++.+..- T Consensus 283 ~~~r~-~~~v~~~~a~~~l~~a 303 (304) T protein:vir:10 283 ATMHI-AYMNVKPEAFATLKPT 303 (304) T ss_pred EEEEe-ccEeecccceEEEEec Confidence 67777 5677779999999888 No 29 >protein:vir:94142 Length: 304 # NCBI annotation: ORF013 # Family: family:all:507 # MgeID: mge:1494 # MgeName: 96 # Cross-refs: genbank:acc:YP_240234;genbank:gi:66395898;genbank:GeneID:5133311 Probab=98.75 E-value=3.8e-09 Score=66.73 Aligned_cols=284 Identities=12% Similarity=0.014 Sum_probs=156.0 Q ss_pred hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCC Q lcl|NC_011142. 21 KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISAN 100 (343) Q Consensus 21 ~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~ 100 (343) |+.+...+.. .++.+.++....+ .+.+.+++........+++..+.. .+.....+.+.+..+.+.+++.. T Consensus 1 ma~~~~~~~~----~~~t~~gg~lip~---~~~~~ii~~~~~~~~l~~~~~~~~---~~~~~~~ip~~~~~~~a~~v~E~ 70 (304) T protein:vir:94 1 MATPTYTPGN----VILSDFKNGVIPA---EQGTLIMKDIMANSAIMKLAKNEP---MTAQKKKFTYLAKGVGAYWVSET 70 (304) T ss_pred Cccccccccc----ccccCCCceecch---hHHHHHHHHHHhccchhhhcceee---ccCCceEEEEEeCCcceEEeecC Confidence 3333332222 1222333333333 344567777766666666655432 22233456666667778888765 Q ss_pred cCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCc Q lcl|NC_011142. 
101 ASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNV 180 (343) Q Consensus 101 ~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v 180 (343) ..+|..+..++........++.-+.++.+=++. ...++...-....++++++.+|+.+++|+...+-.|.+....+ T Consensus 71 -~~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~~~ia~~~d~~~l~G~g~~~~~~~~~~~~~ 146 (304) T protein:vir:94 71 -ERIQTSKPEYAQAEMEAKKIGVIIPLSKEFLKW---TAKDFFNEVKPLIAEAFYKAFDQAVIFGTKSPYNTSTSGKPLV 146 (304) T ss_pred -cccccccceeeEEEEEEEEEEEeehhhHHHHhc---chHHHHHHHHHHHHHHHHHHHHhhheeccCCCccccccccccc Confidence 457888888899999999999888887644433 3466888888889999999999999999876554554444333 Q ss_pred cccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccc Q lcl|NC_011142. 181 TKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRN 260 (343) Q Consensus 181 ~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~ 260 (343) +......... .+....++||.+++.++... + ..+..++|+|+.|..|.+.. +..+.-++ ..++. .+-|. T Consensus 147 ~~~~~~~~~~-~~~~~~~~~i~~~~~~l~~~--~-~~~~~~v~~~~~~~~L~~lk--d~~G~~l~----~~~~~-~l~G~ 215 (304) T protein:vir:94 147 EGAEEKGNVV-TDTNNLYVDLSALMATIEDE--E-LDPNGVLTTRSFRSKMRNAL--DANDRPLF----DANGN-EIMGL 215 (304) T ss_pred cccccccccc-ccccchHHHHHHHHHHhhhc--c-CCcCEEEEcHHHHHHHHHhh--ccCCcEee----cCCCc-cccce Confidence 3222111111 12334588889998887642 2 34568999999999996432 33332211 11111 23345 Q ss_pred cccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc----c-------ceec----C-c---eeEee Q lcl|NC_011142. 261 PIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML----A-------PQLL----G-L---GITVP 321 (343) Q Consensus 261 p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~----~-------~~~~----~-~---~~~~~ 321 (343) |+..... . .. . .++-.+++- |.+.+-+..-..++.- + .... + + ...+. 
T Consensus 216 PV~~~~~-------~--~~--~-~~~~~~~~g-d~~~~~~~~~~~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~~~~~~r 282 (304) T protein:vir:94 216 PLSYTGA-------D--VY--D-KKKSLALMG-DWDYARYGILQGIEYAISEDATLTTLQASDASGQPVSLFERDMFALR 282 (304) T ss_pred eeEEecc-------c--cc--C-CCCcEEEEE-ehhhEEEEEecceEEEEeecceeeeecccccCccchhhhhcCcEEEE Confidence 5432211 0 00 0 111112221 1122212211111110 0 0000 0 0 14455 Q ss_pred eeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 322 AEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 322 ~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++.|+ |..+.+|.|++.+..- T Consensus 283 ~~~r~-~~~v~~~~a~~~l~~a 303 (304) T protein:vir:94 283 ATMHI-AYMNVKPEAFATLKPT 303 (304) T ss_pred EEEEe-ccEeecccceEEEEec Confidence 67777 5677779999999888 No 30 >protein:vir:78223 Length: 333 # NCBI annotation: Putative major head protein # Family: family:all:966 # MgeID: mge:1849 # MgeName: Bethlehem # Cross-refs: genbank:acc:YP_001491666;genbank:gi:157786490;genbank:GeneID:5625701 Probab=98.75 E-value=3.2e-09 Score=67.19 Aligned_cols=304 Identities=10% Similarity=-0.042 Sum_probs=158.7 Q ss_pred hhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccc Q lcl|NC_011142. 13 IAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAA 92 (343) Q Consensus 13 ~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G 92 (343) +|.-.-+... .+...+.-.+.+...+ +.. +.+..+|++.....-..+++..+.. .+. ....+.+..... T Consensus 1 ~a~l~el~~~----~~~~~~~g~~~~~~~~--liP--~~~~~~ii~~l~~~s~l~~~~~~~~-~~~--~~~~~p~~~~~~ 69 (333) T protein:vir:78 1 MATLNELLPN----SAGSNHQGRLAHVPSD--LLP--KEIVGPIFDKAQESSLVLRMGEQIP-ISY--GETIIPTTVKRP 69 (333) T ss_pred CchhHHhhhh----cccccccCceecCCcc--ccc--hhHHHHHHHHHHhhchhhhhcceee-ccC--CceEEEEEeCCc Confidence 3333333211 0000111111111111 222 3445667777777766677665532 222 223444444444 Q ss_pred cceeecCC-------cCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeee Q lcl|NC_011142. 
93 MGKFISAN-------ASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFG 165 (343) Q Consensus 93 ~a~~~~~~-------~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G 165 (343) .+.+++.. +..+|..+..+++.....+.++.-..++.+=++. ...++..--....++++++.+|+-+|+| T Consensus 70 ~a~~v~eg~~~~~~e~~~~~~~~~~f~~i~l~~~kl~~~~~is~ell~~---s~~~~~~~i~~~la~ai~~~~d~~~l~G 146 (333) T protein:vir:78 70 EVGQVGVGTSNEQREGGLKPLSGTAWDTRSVSPIKLATIVTVSEEFARM---NPSGLYTKLQGDLAYAIGRGIDLAVFHG 146 (333) T ss_pred eeEeecCcccccccccccccccccceeEEEEeeEEEEEeehhhHHHHhc---CHHHHHHHHHHHHHHHHHHHHHHHHhcc Confidence 44443322 2446777788888888889999888887643332 2456778888889999999999999999 Q ss_pred ehh---hcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccc-cCCCCC Q lcl|NC_011142. 166 DTN---RNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLL-MTGYTD 241 (343) Q Consensus 166 ~~~---~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~-~~~~~~ 241 (343) +.. .+..|+++..++...+.. .....+.+..+++|.+++..+.. ++...+..++|+|..+..|.+-. ..+..+ T Consensus 147 ~g~~~~~~~~g~~~~~~~~~~~~~-~~~~~~~~~~~~~i~~~~~~~~~--~~~~~~~~~vmn~~~~~~L~~~~~~~d~~G 223 (333) T protein:vir:78 147 KSPLTGSALQGIDTDNVIANTTNV-DYLQETGDPLLDRLLDGYDLVSA--NTDVEFNGWAVDPRFRAHLLRAQAYRDANG 223 (333) T ss_pred cCCCCCcccccccccccccccccc-cccccccchhHHHHHHHHHhhcc--ccccCceEEEEcchHHHHHHHHhhhcCCCC Confidence 864 567888887765543322 12222344457888888877653 34556778999999988775421 223333 Q ss_pred ccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccce--------- Q lcl|NC_011142. 242 RTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQ--------- 312 (343) Q Consensus 242 ~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~--------- 312 (343) .-++...........+-|.|+...... + ...+.+.+++...++-+.+. +-+.+...++..... 
T Consensus 224 ~~i~~~~~~~~~~~~l~G~Pv~~~~~i----~--~~~~~~~~~~~~~~~gD~~~--~~~g~~~~~~i~~~~~~~~~~~~~ 295 (333) T protein:vir:78 224 NVDPSRINLAAQTGDVLGLPAQFGRAV----G--GDLGAAVDSKTRIIGGDFSQ--LKFGFADEIRIKMSDTATLTDSGS 295 (333) T ss_pred ceeecCccccCCCceeeceeeEEcccc----C--CCccccCCCccEEEEEeccc--EEEEEeeccEEEEecccccccccc Confidence 333222111111123345554322110 0 01112222233333223222 222222222221100 Q ss_pred ------ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 313 ------LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 313 ------~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .++ ...+.++.++ ++.+++|.|++++.+- T Consensus 296 ~~~~~~~~~-~v~~r~~~r~-d~~v~~~~a~~~l~~~ 330 (333) T protein:vir:78 296 ATVSMWQTN-QIAILIEVTF-GWLLGDKQAFVKFVDD 330 (333) T ss_pred ceeehhhcC-cEEEEEEEEE-ccEEecccceEEEecc Confidence 111 1234566766 6778999999999988 No 31 >protein:vir:1638 Length: 298 # NCBI annotation: Structural protein # Family: family:all:966 # MgeID: mge:33 # MgeName: r1t # Cross-refs: genbank:acc:NP_695059;genbank:gi:23455750;genbank:GeneID:955469 Probab=98.72 E-value=3.1e-09 Score=67.24 Aligned_cols=280 Identities=9% Similarity=0.001 Sum_probs=156.9 Q ss_pred hecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccce Q lcl|NC_011142. 34 VVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKL 113 (343) Q Consensus 34 ~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~ 113 (343) |. .++|.+..+ .+..++++.....-..+++.++.. .+.+ ...+.+.+..+.+.+++.. .++|..+..++. 
T Consensus 1 ma---~~gG~lvp~---~~~~~ii~~~~~~s~i~~l~~~~~-~~~~--~~~ip~~~~~~~a~~v~E~-~~~~~~~~~f~~ 70 (298) T protein:vir:16 1 MV---LNKGTLFDP---TLVTDLISKVAGKSSIARLSAQKP-IPFN--GEKVFTFTMDSEIDVVAES-GKKTHGGVTLAP 70 (298) T ss_pred Cc---ccCcceech---hHHHHHHHHHHhhhhhhhhcceee-ccCC--ceEEEEEecCcceEEecCC-ccccccccceeE Confidence 22 223333333 334567777776666677666432 2222 2445566677888888765 568988888999 Q ss_pred eEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehh-----hcceeeeecCCccccccCcC Q lcl|NC_011142. 114 HQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTN-----RNMSGLLNNPNVTKTSATVN 188 (343) Q Consensus 114 ~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~-----~g~~GLlN~p~v~~~~~~~~ 188 (343) .....+.++.-..+|.+=|..+.....++...-+...++++++.+++.+++|... .+..|+....+....... T Consensus 71 v~l~~~k~a~~~~iS~ell~~s~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~~~~~g~~~~~~~~~~~~~~~~~~~~-- 148 (298) T protein:vir:16 71 QTMVPIKVEYGARISDEFMYASDEEKINILQEFNDGFAKKVARGIDLMAFHGVNPRLGTASAVIGTNHFDSKVTQKVE-- 148 (298) T ss_pred EEEeeeeEEEeehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHHhhccccCCCCcccccccccccccccccccc-- Confidence 9999999998888876555544445566777788889999999999999999531 233443333322211111 Q ss_pred ccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccc Q lcl|NC_011142. 189 YATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRF 268 (343) Q Consensus 189 w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~ 268 (343) .. ......+.||.+++.++... ...+..++|+|+.+..|.+.. +..+.-++.-...+.....+-|.|+.+... 
T Consensus 149 ~~-~~~~~~~~~i~~~~~~~~~~---~~~~~~~vmn~~~~~~l~~lk--d~~G~~i~~~~~~~~~~~~l~G~PV~~~~~- 221 (298) T protein:vir:16 149 AP-RGIADPNGAIENAVELLTGV---DADVTGIAINPSFRSALAKQK--DLQDNALFPELKWGATPDTINGLPVDVNKT- 221 (298) T ss_pred cc-cccccHHHHHHHHHHHhhhc---CCCccEEEEcHHHHHHHHHhh--ccCCCeeecCcccCCCCceecceeeEEecc- Confidence 11 12234578899999887652 234567999999999986432 444433321111111112334555433211 Q ss_pred eeeechhhhccccCCccceEEEEEcccceEEEeeccchh--cccc----------eecCceeEeeeeeeeeeEEEECcce Q lcl|NC_011142. 269 QLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFR--MLAP----------QLLGLGITVPAEYKISGTEYRYPLC 336 (343) Q Consensus 269 ~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~--~~~~----------~~~~~~~~~~~~~~~gGv~i~~P~a 336 (343) .. .....+++.+++-+.+ +.+.+.+...++ .... -++++ ..+.++.++ |..+.+|.+ T Consensus 222 ------v~--~~~~~~~~~~~~GDfs-~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~-v~~ra~~r~-d~~v~~~~a 290 (298) T protein:vir:16 222 ------VS--DMSLTQRDRAIIGDFA-NGFKWGYAKEVPLEVIQYGDPDNSGLDLKGYNQ-VYIRAELFL-GWGILDATK 290 (298) T ss_pred ------cc--cccCCCccEEEEeecc-ceEEEEEecCceEEEeeccCCcCcchhhhhcCc-EEEEEEEEE-ccEeecccc Confidence 11 1112234444432222 112222111111 1110 01121 334556666 688999999 Q ss_pred eeeeccC Q lcl|NC_011142. 337 AQYVDML 343 (343) Q Consensus 337 i~~~dGI 343 (343) ++++.|. T Consensus 291 ~~~l~~a 297 (298) T protein:vir:16 291 FARVTEA 297 (298) T ss_pred eEEEeec Confidence 9999999 No 32 >protein:vir:7771 Length: 330 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:149 # MgeName: Bxz2 # Cross-refs: genbank:acc:NP_817605;genbank:gi:29566035;genbank:GeneID:1259229 Probab=98.72 E-value=5.6e-09 Score=65.84 Aligned_cols=291 Identities=10% Similarity=-0.033 Sum_probs=161.4 Q ss_pred hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCC Q lcl|NC_011142. 
21 KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISAN 100 (343) Q Consensus 21 ~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~ 100 (343) |+.+...+..+ ..+ .+++.+...+ .. .++++..+.....+++.++.. .......|.+.+..+.+.+++.. T Consensus 1 m~~~~~~a~~~--~~t--~~~g~~i~~~--~~-~~ii~~~~~~s~l~~~~~~~~---~~~~~~~~p~~~~~~~a~~v~Eg 70 (330) T protein:vir:77 1 MAGSTVPSTQV--ALT--GDFSAFLTPE--QS-QDYFAEIEKTSIVQRIARKVP---MGPTGISIPHWTGAVSASWTGEA 70 (330) T ss_pred Ccccccchhhc--ccc--CCCcceechh--HH-HHHHHHHHhccchhhhcceee---ccCCceEEEEEcCCcceeEecCC Confidence 22221111110 011 2234444443 22 356666666666666666532 22233556677777788888764 Q ss_pred cCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehh-hcceeeeecCC Q lcl|NC_011142. 101 ASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTN-RNMSGLLNNPN 179 (343) Q Consensus 101 ~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~-~g~~GLlN~p~ 179 (343) ..+|..+..+++.....+.++.-..++.+=|+. ...++...-....++++++.+|+.+|+|+.. .+..|+++... T Consensus 71 -~~~~~~~~~f~~i~~~~~k~~~~~~is~ell~d---s~~~~~~~i~~~l~~ai~~~~~~~~l~G~g~~~~~~g~~~~~~ 146 (330) T protein:vir:77 71 -ERKPITKGSFGKQELEPVKITTIFAESAEVVRL---NPLNYLNTMRTKIAEAIALKFDAAAIHGIDKPSAFKGYLAETT 146 (330) T ss_pred -CccccccceeeEEEEeEEEEEEeehhhHHHHhc---chHHHHHHHHHHHHHHHHHHHHHHhhcccCCCCcccccccccc Confidence 568888888889999999999888887754443 3567888888899999999999999999874 45689998764 Q ss_pred ccccccCcCc--cccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHH-HHHhcCc--- Q lcl|NC_011142. 180 VTKTSATVNY--ATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIE-HFQINNA--- 253 (343) Q Consensus 180 v~~~~~~~~w--~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle-~l~~n~~--- 253 (343) .......... .+.+....++|+.+++..+... + ..+..++|+|+.+..|.+-. +..+.-++. 
-+....+ T Consensus 147 ~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~--~-~~~~~~vmn~~~~~~l~~lk--d~~G~~l~~~~~~~~~~~~~ 221 (330) T protein:vir:77 147 KVVSLADTNLTTASGPQGNAYLAVNNALSLLVNS--G-KKWTGTLLDNVTEPILNTAV--DGNGRPLFVESTYTEQVGAI 221 (330) T ss_pred ccceeecccccccccccchhHHHHHHHHHhhhhc--C-CCccEEEEcHHHHHHHHHHh--ccCCceeecCcccccccccc Confidence 3222111111 1234556788999998887643 2 34568999999999987432 323322211 0100111 Q ss_pred -ceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc---------------------- Q lcl|NC_011142. 254 -YTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA---------------------- 310 (343) Q Consensus 254 -~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~---------------------- 310 (343) ...+-|.|+.... .. .....+.+..++.-+.+.=. +.....++... T Consensus 222 ~~~~l~G~PV~~~~-------~~--p~~~~~~~~~~~~gd~s~~~--i~~~~~~~i~~~~e~~~~~~~~~~~~~~~~~~~ 290 (330) T protein:vir:77 222 REGRILGRPTYVAD-------NV--VNGTVGNRVVGVMGDFSQVI--WGQIGGLSFDVTDQATLDFGEEQGGVWVPKLIS 290 (330) T ss_pred CCceecceeeEEec-------cc--cCCCCCCccEEEEEecceEE--EEEecCcEEEEeecceeeecccccccccccccc Confidence 1122344432221 11 11111122222222222211 22212211110 Q ss_pred ceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 311 PQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 311 ~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .-.++ ...+.++.++ ++.+.+|.|++.+.+. T Consensus 291 ~f~~~-~~~~r~~~r~-d~~v~~~~a~~~i~~~ 321 (330) T protein:vir:77 291 LWQHN-MVAVRCEAEF-AFMVNDKDAFVKLTDQ 321 (330) T ss_pred hhhcC-cEEEEEEEEe-ccEEecccceEEEEec Confidence 01122 2566778887 4667889999999999 No 33 >protein:vir:105038 Length: 428 # NCBI annotation: major capsid head protein precursor # Family: family:all:21 # MgeID: mge:1465 # MgeName: phiKO2 # Cross-refs: genbank:acc:YP_006586;genbank:gi:46402092;genbank:GeneID:2777903 Probab=98.71 E-value=8.3e-09 Score=64.88 Aligned_cols=316 Identities=10% Similarity=0.026 Sum_probs=154.2 Q ss_pred CCccee----ccchhhhhchhhh------------c---hhcccccccCcc-hhecchhhhhhhhHHHHHHHHHHHHhhh Q lcl|NC_011142. 
1 MSEKRV----VIDAQTIAGNRWL------------N---KFLDSNATIGVP-SVVNDADGGAAYYISQLASLETTVYEVP 60 (343) Q Consensus 1 ~~~~~~----~~~~~~~~~~~~~------------~---~~~~~~~~~~~~-~~~~dA~~~~~f~~~~l~~id~~v~e~~ 60 (343) ..++.+ +.....-++..+. + .+......-... ++.+++.. |.++.. +.+.++|++.. T Consensus 74 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-gg~liP--~~~~~~ii~~l 150 (428) T protein:vir:10 74 QHGPAVIVKAEPKQYTGAGMTRMVMSIAAAQGNLQDAAKFASDELNDQSVSMAISTAAGS-GGVLIP--QNIHSEVIELL 150 (428) T ss_pred hhccccccccccchhhhHHHHHHHHHHHHhhhhHHHHHHHhhhhhhhhhHhhhhcccccC-Cccccc--hhHHHHHHHHH Confidence 000000 0000000000000 0 000000000000 11222222 334444 34456677766 Q ss_pred hhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCC Q lcl|NC_011142. 61 YADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNM 140 (343) Q Consensus 61 ~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~ 140 (343) ......+++..-..+..-| .+.+......+.+.+++.. ..+|..+..++........++.-+.+|.+=|+.+ .. T Consensus 151 ~~~~~l~~~~~~~~~~~~g--~~~~p~~~~~~~a~~v~Eg-~~~~~~~~~f~~i~~~~~k~~~~v~is~ell~ds---~~ 224 (428) T protein:vir:10 151 RDRTIVRKLGARSIPLPNG--NMSLPRLAGGATASYTGEN-QDAKVSEARFDDVKLTAKTMIAMVPISNALIGRA---GF 224 (428) T ss_pred hhhchhhhhcceeeecCCc--ceEEEEEeCCcceeeeccC-ccccccccceeeEEeeeEEEEEeehhhHHHHhhh---hH Confidence 6655555542111112222 2455555555667777655 5578888888888888999998888887655543 34 Q ss_pred CccHHHHHHHHHHHHHhhhheeeeeehh-hcceeeeecCCccccccCcCc-cccCHHHHHHHHHHHHHHHHHhcCCeecc Q lcl|NC_011142. 141 PIDSMQAELAFRGSEEHSQRVAYFGDTN-RNMSGLLNNPNVTKTSATVNY-ATCTGQELFDLLNNPVFAVVKASKRFHTP 218 (343) Q Consensus 141 ~l~~~k~~aA~~~~~~~~n~~~f~G~~~-~g~~GLlN~p~v~~~~~~~~w-~~~t~~~i~~di~~~~~~l~~~s~g~~~p 218 (343) ++..--....+.++.+.+|+.+++|+.. ....|++|..........+.- +..+.+.+-..+..+. ............ 
T Consensus 225 ~l~~~i~~~l~~ai~~~~d~~~l~G~G~~~~p~Gi~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~ 303 (428) T protein:vir:10 225 NVEQLVLQDILTAISVREDKAFMRDDGTGDTPIGMKARATQWNRLLPWAADAAVNLDTIDTYLDSII-LMSMDGNSNMIS 303 (428) T ss_pred HHHHHHHHHHHHHHHHHHHHHHhccCCCCccccccccccccccccccccccccccHHHHHHHHHHHH-Hhhhcccccccc Confidence 5777788889999999999999999864 346899987664432221111 1223333222222222 121111222334 Q ss_pred cEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceE Q lcl|NC_011142. 219 NTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNL 298 (343) Q Consensus 219 ~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~ 298 (343) ...+|+|..|..|.... +..|.-++. .... ..+-|.|+.+... . ..+.+.+++...++|-+ ...+ T Consensus 304 ~~~v~n~~~~~~L~~lk--d~~G~~i~~---~~~~-g~l~G~pv~~~~~-------~-p~~~~~~~~~~~i~~gd-~s~~ 368 (428) T protein:vir:10 304 SGWGMSNRTYMKLFGLR--DGNGNKVYP---EMAQ-GMLKGYPIQRTSA-------I-PANLGEGGKESEIYFAD-FNDV 368 (428) T ss_pred CEEEEcHHHHHHHHHhh--ccCCceecc---CCCC-CeeeceeeEEecc-------c-cccccCCCccceEEEEe-cceE Confidence 67899999999987532 444433321 1111 1344555433211 1 01122223333333322 2222 Q ss_pred EEeeccchhcccce---------------ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 299 ALAKPIPFRMLAPQ---------------LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 299 ~~~v~~~~~~~~~~---------------~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -+..-..++..... 
.++ ...+.+..++ ++.+++|.|+++++|| T Consensus 369 ~i~~~~~i~i~~~~~~~~~~~~~~~~~~f~~~-~~~~R~~~r~-d~~v~~p~a~~~~t~~ 426 (428) T protein:vir:10 369 VIGEDGNMKVDFSKEASYIDTDGKLVSAFSRN-QSLIRVVTEH-DIGFRHPEGLVLGTGV 426 (428) T ss_pred EEEEecceEEEeecccccccccccccchhhcc-hhheeeeeee-CceeeccceEEEEecc Confidence 23332333322111 112 1344567776 6899999999999999 No 34 >protein:vir:94673 Length: 419 # NCBI annotation: major capsid protein # Family: family:all:585 # MgeID: mge:1527 # MgeName: mu1/6 # Cross-refs: genbank:acc:YP_579208;genbank:gi:93007444;genbank:GeneID:5076792 Probab=98.70 E-value=1e-08 Score=64.37 Aligned_cols=311 Identities=12% Similarity=0.041 Sum_probs=159.4 Q ss_pred CCcce--eccc----hhhhhchhhhchhccccc---------ccCcchh-ecchh-----hhhhhhHHHHHHHHHHHHhh Q lcl|NC_011142. 1 MSEKR--VVID----AQTIAGNRWLNKFLDSNA---------TIGVPSV-VNDAD-----GGAAYYISQLASLETTVYEV 59 (343) Q Consensus 1 ~~~~~--~~~~----~~~~~~~~~~~~~~~~~~---------~~~~~~~-~~dA~-----~~~~f~~~~l~~id~~v~e~ 59 (343) +.++. -.-+ .+......+++....... ....... .+++. .++..... +.+...+.+. T Consensus 70 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~p--~~~~~~i~~~ 147 (419) T protein:vir:94 70 GTPLTPAEAGTFRSLAQRFADSDGLREYRARDKRGQFQVEMRDIDPNRLLSRDAPAGTITNPNVPHLP--QLVPGIVPTT 147 (419) T ss_pred hccccccccccccchhhhhhhHHHHHHHHHhhhhhhhhHHHHHHHHHHhhccccccccccCCcccccc--hhhhHHHHHH Confidence 00000 0000 000011111110000000 0000000 01100 11111111 2334445555 Q ss_pred hhhcccchhhccccCCCCcceeEEEEee--------ecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHH Q lcl|NC_011142. 60 PYADITYLEDVPVLANIPEYATHWNYRS--------YDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDE 131 (343) Q Consensus 60 ~~~~l~~~~~i~v~~~~~~~~~~~~~~~--------~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~E 131 (343) .......+.++.+..-.. ..+.|.. 
....+.+.+++..+ .+|..+..+++....++.++.-+.++.+= T Consensus 148 ~~~~~~i~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~-~~~~~~~~~~~i~~~~~k~~~~~~is~el 223 (419) T protein:vir:94 148 PDLPLLVADLLDQQNADY---NVLEYIRDTSGTAGAGSTWNKAAVVPEGT-AKPQSTLSFDTITTTLKTVAHWLPITRQA 223 (419) T ss_pred HhhhhhhhhcceeeeccC---CceeeeeeccccccccccCcccceecCCc-cccccccceeeEEeeeeeEEEeehhhHHH Confidence 555555566665432221 1222222 12234456777654 47878888899999999999999988765 Q ss_pred HHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHh Q lcl|NC_011142. 132 LRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKA 211 (343) Q Consensus 132 l~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~ 211 (343) ++.+ . ++..--....++++...+|+.+++|+......|++|.+++........+...|....++||.+++..+... T Consensus 224 l~d~---~-~l~~~i~~~la~a~~~~~d~aii~G~G~~~p~Gi~~~~~~~~~~~~~~~~~~t~~~~~~~l~~~~~~~~~~ 299 (419) T protein:vir:94 224 ADDN---S-QLMGYIQGRLTYGLRFLRDRQLLNGNGSTEMQGILTTPGIGTYQQPKPTAPATDEPPLVDIRRAKTVAEIA 299 (419) T ss_pred HHhH---H-HHHHHHHHHHHHHHHHHHHHHHHhccCcccccceecccccccccccccccccccchhHHHHHHHHHhhhhc Confidence 5433 1 47777777789999999999999999988899999999988776665666677888899999999888642 Q ss_pred cCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEE Q lcl|NC_011142. 212 SKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVY 291 (343) Q Consensus 212 s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y 291 (343) + ..+..++|+|+.|..|.....++ ++.-++.--..+.....+-|.|+..... . +.+ ..++. 
T Consensus 300 --~-~~~~~~v~n~~~~~~l~~~k~~~-~~~~~~~~~~~~~~~~~l~G~pV~~~~~-------~------~~~--~~~~g 360 (419) T protein:vir:94 300 --G-FPPDGVVVHPQDWESIELDQAPG-SGVFRVIANVQGEATPRIWGLNVVSTVA-------I------AQG--TALVG 360 (419) T ss_pred --c-CCCCEEEEcHHHHHHHHHHhhcC-CCceeecCCcccCCCccccceeeEEcCC-------C------CCc--cEEEe Confidence 2 35678999999999987543322 2211111000011111233444322211 0 001 11111 Q ss_pred EcccceEEEeeccchhccccee------cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 292 DKSERNLALAKPIPFRMLAPQL------LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 292 ~~~~~~~~~~v~~~~~~~~~~~------~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +-+ +.+.+..-+.++...... ++ ...+.++.++ ++.+++|.++++++.- T Consensus 361 d~~-~~~~~~~~~~~~v~~~~~~~~~~~~~-~~~~r~~~r~-d~~v~~~~a~~~~~~~ 415 (419) T protein:vir:94 361 GFR-QGATLWSRQGITVLMTDSHADFFTAN-TLVILAEFRA-NLAVYQPKAFVRVTFA 415 (419) T ss_pred ecc-ceEEEEEecceEEEEeccccchhhcC-cEEEEEEEee-ccEEeccccEEEEEec Confidence 111 111111111222211111 12 2345567776 4667889999998888 No 35 >protein:vir:94771 Length: 298 # NCBI annotation: major head protein # Family: family:all:966 # MgeID: mge:1529 # MgeName: phi LC3 # Cross-refs: genbank:acc:NP_996706;genbank:gi:45597421;genbank:GeneID:2769044 Probab=98.70 E-value=3.5e-09 Score=66.94 Aligned_cols=280 Identities=9% Similarity=0.005 Sum_probs=152.1 Q ss_pred hecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccce Q lcl|NC_011142. 34 VVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKL 113 (343) Q Consensus 34 ~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~ 113 (343) |++ ++|.... +.+...+++...+.-..+++.++.. .+.+ ...+.+....+.+.+++.. ..+|..+..++. 
T Consensus 1 ma~---~gG~lip---~~~~~~ii~~~~~~s~i~~~~~~~~-~~~~--~~~~p~~~~~~~a~~v~Eg-~~~~~~~~~f~~ 70 (298) T protein:vir:94 1 MVL---NKGTLFD---PELVTDLISKVAGKSSIARLSAQKP-IPFN--GEKVFTFTMDSEIDVVAES-GKKTHGGVTLAP 70 (298) T ss_pred Cee---ccccccC---hhHHHHHHHHHHhhchhhhhcceee-ccCC--ceEEEEEecCcceEEeeCC-ccccccccceeE Confidence 333 2232222 3345567777777666677666532 2222 3456666667778888765 568888888899 Q ss_pred eEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehh-hc----ceeeeecCCccccccCcC Q lcl|NC_011142. 114 HQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTN-RN----MSGLLNNPNVTKTSATVN 188 (343) Q Consensus 114 ~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~-~g----~~GLlN~p~v~~~~~~~~ 188 (343) .....+.++.-..+|.+=|....-...+|...-+...++++++.+++.+++|... .| ..|..+..+...... . T Consensus 71 v~l~~~k~~~~~~iS~ell~~~~~~~~~l~~~i~~~la~ai~~~~d~~~l~G~~~~~g~~~~~~~~~~~~~~~~~~~--~ 148 (298) T protein:vir:94 71 QTMVPIKVEYGARISDEFMYASDEEKINILQAFNDGFAKKVARGIDLMAFHGVNPRLGTASAVIGTNHFDSKVTQKV--E 148 (298) T ss_pred EEEeeeEEEEeeehhHHHhccCCccHHHHHHHHHHHHHHHHHHHHHHHhhcccccCCCccccccccccccccccccc--c Confidence 9999999998888775544332333455777788889999999999999999532 11 222211111111000 1 Q ss_pred ccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccc Q lcl|NC_011142. 189 YATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRF 268 (343) Q Consensus 189 w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~ 268 (343) .......+++||.+++.++... ...+..++|+|+.+..|.+.. +..|.-++.-...+...-.+-|.|+... 
T Consensus 149 -~~~~~~~~~~~i~~~~~~~~~~---~~~~~~~vmn~~~~~~l~~lk--d~~G~~l~~~~~~~~~~~tl~G~PV~~~--- 219 (298) T protein:vir:94 149 -APRGIADPNGAIENAVELLTGV---DADVTGIAINPSFRSALAKQK--DLQGNALFPELKWGATPDTINGLPVDVN--- 219 (298) T ss_pred -cccccccHHHHHHHHHHhhhhc---CCCccEEEEcHHHHHHHHHhh--ccCCCeeecCcccCCCCceecceeeEEe--- Confidence 1122345688999999888653 245678999999999996532 3333222110001111112334443222 Q ss_pred eeeechhhhccccCCccceEEEEEcccceEEEeeccchh--cccc----------eecCceeEeeeeeeeeeEEEECcce Q lcl|NC_011142. 269 QLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFR--MLAP----------QLLGLGITVPAEYKISGTEYRYPLC 336 (343) Q Consensus 269 ~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~--~~~~----------~~~~~~~~~~~~~~~gGv~i~~P~a 336 (343) .... ....+.++.+++-+.+ +.+.+.+-..++ ..+- -.++ ...+.++.++ |+.+.+|.| T Consensus 220 ----~~v~--~~~~~~~~~~~~Gdfs-~~~~~~~~~~~~~~~~~~~~~d~~~~~~f~~~-~v~~r~~~r~-~~~~~~~~a 290 (298) T protein:vir:94 220 ----KTVS--DMSLTQRDRAIIGDFA-NGFKWGYAKEVPLEVIQYGDPDNSGLDLKGYN-QVYIRAELFL-GWGILDATK 290 (298) T ss_pred ----cccc--cccCCCccEEEEeecc-ceEEEEEecCceEEEeecCCCcCcchhhhhcC-cEEEEEEEEe-ccEeecccc Confidence 1111 1112233433332222 112221112221 1110 1112 1234556766 688899999 Q ss_pred eeeeccC Q lcl|NC_011142. 337 AQYVDML 343 (343) Q Consensus 337 i~~~dGI 343 (343) ++++.|. T Consensus 291 ~~~l~~~ 297 (298) T protein:vir:94 291 FARVTEA 297 (298) T ss_pred eEEEEec Confidence 9999999 No 36 >protein:vir:104085 Length: 320 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:1656 # MgeName: Che12 # Cross-refs: genbank:acc:YP_655596;genbank:gi:109392467;genbank:GeneID:4156953 Probab=98.64 E-value=1.3e-08 Score=63.85 Aligned_cols=290 Identities=8% Similarity=-0.074 Sum_probs=148.4 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 
1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |-|..- +|++..+- +-+++.++++.+. . .+-.++++........+++.++.. ... T Consensus 1 ~~~~~~-~~~~~~~~-----------------~~t~~~~~~~~ip-~---~~~~~ii~~~~~~s~l~~~~~~~~---~~~ 55 (320) T protein:vir:10 1 MAAGTA-FQVDHAQI-----------------AQTGDTMFKGYLE-P---EQAKDYFAEAEKTSIVQQFAQKVP---MGT 55 (320) T ss_pred CCCCcc-CCHHHHHh-----------------hcccccccccccc-H---HHHHHHHHHHHhccchhhhcceee---ccC Confidence 332221 11111110 0122333333332 2 233556666666666666665532 222 Q ss_pred eEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhh Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQR 160 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 160 (343) ....+.+.+..+.+.+++.. ..+|..+..+++...+.+.++..+.++.+=|+.+ ..++...-....++++++.+|+ T Consensus 56 ~~~~~p~~~~~~~a~~v~E~-~~~~~~~~~f~~v~~~~~k~~~~~~is~ell~ds---~~~l~~~i~~~l~~a~a~~~d~ 131 (320) T protein:vir:10 56 TGQKIPHWIGDVSAQWIGEG-DMKPITKGNMTSQNIAPHKIATIFVASAETVRAN---PANYLGTMRTKVATAFAMAFDS 131 (320) T ss_pred CceEEEEEeCCcceEEecCC-ccccccccceeEEEEeeEEEEEeehhhHHHHhcC---hHHHHHHHHHHHHHHHHHHHHH Confidence 34556666667778888764 5689999999999999999999999887655543 3568888888899999999999 Q ss_pred eeeeeehhhcceeeeecCC-ccccccC-cCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCC Q lcl|NC_011142. 161 VAYFGDTNRNMSGLLNNPN-VTKTSAT-VNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTG 238 (343) Q Consensus 161 ~~f~G~~~~g~~GLlN~p~-v~~~~~~-~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~ 238 (343) .+|+|+..-.-.|++.... ++....+ ..++..+ ..-+++.+++..+. .....+..++++|+.+..|.+.. 
+ T Consensus 132 a~l~G~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~---~~~~~~~~~v~n~~~~~~L~~lk--d 204 (320) T protein:vir:10 132 AALNGTDSPFPTYLAQTTKSVSLADPGGATASDLT--AYDAVAVNGLSLLV---NAKKKWTHTLLDDIVEPILNGAK--D 204 (320) T ss_pred HhhcccCCCCCcccccccccccceecccccccccc--cHHHHHHHHHhhhh---cccCCCcEEEEcHHHHHHHHHhh--c Confidence 9999987433333333221 1111111 1111111 11223444444443 23456789999999999997533 3 Q ss_pred CCCccHHH-HHHhcCc----ceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-- Q lcl|NC_011142. 239 YTDRTVIE-HFQINNA----YTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-- 311 (343) Q Consensus 239 ~~~~tvle-~l~~n~~----~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-- 311 (343) ..+.-++. -+....+ ...+.|.|+.... ....++.. ++|- |...+-+.....++.... T Consensus 205 ~~G~~l~~~~~~~~~~~~~~~~~i~g~pv~~~~-------------~~~~~~~~-~~~g-d~~~~~~~~~~~~~i~~~~~ 269 (320) T protein:vir:10 205 KNGRPLFIESTYTDENSPFRAGRIVSRPTILSD-------------HVADGTTV-GYMG-DFRNVIWGQVGGLSFDVTDQ 269 (320) T ss_pred cCCceeeccccccCccccccCceeeeeeeEecC-------------CCCCCceE-EEEe-ecceEEEEEecCeEEEEeec Confidence 22322211 0010000 1112233321110 01112211 1221 111111222222211100 Q ss_pred ----------------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
312 ----------------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 ----------------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ..++ ...+.+..++ ++.+.+|.|++.+.|+ T Consensus 270 ~~~~~~~~~~~~~~~~f~~~-~~~~r~~~~~-d~~v~~~~a~~~l~~~ 315 (320) T protein:vir:10 270 ATLNLGTPTEPNFVSLWQHN-LVAVRVEAEY-AFHNNDKDAFVKLTNV 315 (320) T ss_pred ceeeeccccccccchhhhcC-cEEEEEEEee-ccEEecccceEEEEec Confidence 1112 1344556765 6888999999999999 No 37 >protein:vir:95763 Length: 297 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1578 # MgeName: SMP # Cross-refs: genbank:acc:YP_950590;genbank:gi:119953785;genbank:GeneID:5076833 Probab=98.58 E-value=4.1e-08 Score=61.10 Aligned_cols=278 Identities=6% Similarity=-0.089 Sum_probs=155.3 Q ss_pred hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCC Q lcl|NC_011142. 21 KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISAN 100 (343) Q Consensus 21 ~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~ 100 (343) |+++...+..+ |...+++....+ .+..++++.....-..+++.++..-.+ .....+.+......+.+++.. T Consensus 1 m~~~~~~~~~~----~~t~~~~~lvP~---~~~~~ii~~~~~~s~l~~~~~~~~~~~--~~~~~~~~~~~~~~a~~v~Eg 71 (297) T protein:vir:95 1 MTVQTFNPENV----LVSQKKDGTLHK---EFTDIIMKEVAQNSLVMQLGQYQEMEG--EQEKTVYVQTDGISAYWVNET 71 (297) T ss_pred CCccccccccc----cccCCCcceech---hHHHHHHHHHHhhchhhhhcceeecCC--CccEEEEEEcCCceeEEeecC Confidence 43333333221 112233333333 334567776666666666665532211 222344555566677888765 Q ss_pred cCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCc Q lcl|NC_011142. 101 ASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNV 180 (343) Q Consensus 101 ~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v 180 (343) ..+|..+..++........++....++.+-++.+ ..++...-....++++++.+|+.+++|+...+-.|+++.... 
T Consensus 72 -~~~~~~~~~f~~v~l~~~k~~~~~~is~ell~ds---~~~l~~~i~~~la~ai~~~~d~a~l~G~g~~~~~gi~~~~~~ 147 (297) T protein:vir:95 72 -EKIKTDKPEVVPVTLKAHKLGIILVTSREALNYT---WKKFFEDMKPQIVEAFYKKIDEAGLLGHDTPFANSVAKAAKD 147 (297) T ss_pred -ccccccccceeEEEEeeEEEEEeehhhHHHHhcC---HHHHHHHHHHHHHHHHHHHHHHHHhcccCCcccccccccccc Confidence 4588888889999999999999999887655544 246888888899999999999999999987777888876543 Q ss_pred cccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccc Q lcl|NC_011142. 181 TKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRN 260 (343) Q Consensus 181 ~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~ 260 (343) ...... +. -| ++||.+++.++... + ..+..++|+|+.+..|.+.. +..+.-++ .... ..+-|. T Consensus 148 ~~~~~~-~~--~t----~~~i~~~~~~l~~~--~-~~~~~~v~~~~~~~~L~~l~--d~~G~~i~----~~~~-~~l~G~ 210 (297) T protein:vir:95 148 ANKVIG-GP--IN----YDNILKLQDALYDA--D-VEPNAFVSKIQNRSALREAR--DGNKVSIY----DKAA-NTIDGI 210 (297) T ss_pred cceecc-cc--cC----HHHHHHHHHHhhhc--c-CCcCEEEEcHHHHHHHHHhh--ccCCceee----cCCC-Ccccce Confidence 222111 11 12 56777788777643 2 34678999999999996432 33332111 1111 122233 Q ss_pred cccccccceeeechhhhccccCCccceEEEEEccc------ceEEEeeccchhcccc----------eecCceeEeeeee Q lcl|NC_011142. 261 PIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSE------RNLALAKPIPFRMLAP----------QLLGLGITVPAEY 324 (343) Q Consensus 261 p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~------~~~~~~v~~~~~~~~~----------~~~~~~~~~~~~~ 324 (343) |+... .. ....++..+.-+.+. +.+++.+-.+...... ..++ ...+.+.+ T Consensus 211 Pv~~~-------~~------~~~~~~~~~~gd~s~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~r~~~ 276 (297) T protein:vir:95 211 TTVDL-------KS------ARFEKGDLLAGDFDNLIYGVPYNITYKISEEGQISTITNADGTPINLFEQE-MIAIRATM 276 (297) T ss_pred eeEee-------cC------CCCCCceEEEEecccEEEEEecCeEEEEeeccccccccccCccchhhhhcC-cEEEEEEE Confidence 43111 00 001112222222121 1122222111111000 1112 24455667 Q ss_pred eeeeEEEECcceeeeeccC Q lcl|NC_011142. 
325 KISGTEYRYPLCAQYVDML 343 (343) Q Consensus 325 ~~gGv~i~~P~ai~~~dGI 343 (343) ++ |..+.+|.|++.+..- T Consensus 277 ~~-d~~v~~~~a~~~l~~a 294 (297) T protein:vir:95 277 DI-AVMITKTDAFAKLTPA 294 (297) T ss_pred Ee-ccEeecccceEEEeec Confidence 76 6778889999998877 No 38 >protein:vir:104256 Length: 458 # NCBI annotation: major head protein precursor # Family: family:all:27070 # MgeID: mge:1504 # MgeName: T5 # Cross-refs: genbank:acc:YP_006977;genbank:gi:46401878;genbank:GeneID:2777673 Probab=98.57 E-value=2.8e-08 Score=61.99 Aligned_cols=316 Identities=5% Similarity=-0.044 Sum_probs=149.0 Q ss_pred CCcceeccchhhhhchhhh-c-hhccccccc-Ccchhec-chhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWL-N-KFLDSNATI-GVPSVVN-DADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANI 76 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~-~-~~~~~~~~~-~~~~~~~-dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~ 76 (343) -.++...-+++.-+-.+++ + ........+ ..++... -....+..+.. +.+.+.|++........+++..+. +. T Consensus 125 ~~~~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~g~~~ip--~~~~~~ii~~~~~~~~l~~~~~~~-~~ 201 (458) T protein:vir:10 125 GTQENFEDEVEKLVLLSYVMEKGVFETEHGQRHLKAVNQSSSVEVSSESYE--TIFSQRIIRDLQKELVVGALFEEL-PM 201 (458) T ss_pred hhhhhHHHHHHHHHHHHHHHhhccchhhhhhhhhhhhhhcccCccccceeh--hhHhHHHHHHHHhhhhHHhhccee-ec Confidence 0000000000000000000 0 000000000 0001000 01122333333 456677888777766666665542 12 Q ss_pred CcceeEEEEeeecccccceeecCCcCccc------eeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHH Q lcl|NC_011142. 77 PEYATHWNYRSYDGAAMGKFISANASDLP------RVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELA 150 (343) Q Consensus 77 ~~~~~~~~~~~~~~~G~a~~~~~~~~dip------~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA 150 (343) + .....|.+....+.+.+++.... .| ..+..++......+.++.-+.+|.+=|.. ...++..--.... 
T Consensus 202 ~--~~~~~~~~~~~~~~a~~v~e~~~-~~~~~~~~~~~~~~~~i~~~~~k~~~~v~is~ell~d---s~~~~~~~i~~~l 275 (458) T protein:vir:10 202 S--SKILTMLVEPDAGKATWVAASTY-GTDTTTGEEVKGALKEIHFSTYKLAAKSFITDETEED---AIFSLLPLLRKRL 275 (458) T ss_pred C--CcceEEEEecCCcceeecccccc-cccccccccccccceeeEeeeeeEEeeehhhHHHHhc---chHHHHHHHHHHH Confidence 2 22344444455566666654421 22 22334666777777888777777653333 2345777788888 Q ss_pred HHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHH-HHHHHHHHHHHHHHhcCCeecccEEEecHHHHH Q lcl|NC_011142. 151 FRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQE-LFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWK 229 (343) Q Consensus 151 ~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~-i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~ 229 (343) +.++...+|+-+++|+......|++|+++.........++...+.. -+++|.+++..+... + ..+..++|+|..|. T Consensus 276 ~~~i~~~~d~~~l~G~G~~~p~Gi~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~--~-~~~~~~v~~~~~~~ 352 (458) T protein:vir:10 276 IEAHAVSIEEAFMTGDGSGKPKGLLTLASEDSAKVVTEAKADGSVLVTAKTISKLRRKLGRH--G-LKLSKLVLIVSMDA 352 (458) T ss_pred HHHHHHHHHHHhhcCCCCCccceeeecccccccceeecccccccccccHHHHHHHHHhhhhh--h-cCCCEEEEcHHHHH Confidence 9999999999999999776789999999866543332322211111 256677777776532 2 34577999999999 Q ss_pred HHhccccCCCCCccHHHHHHhcCcc----eeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccc Q lcl|NC_011142. 230 RASSLLMTGYTDRTVIEHFQINNAY----TLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIP 305 (343) Q Consensus 230 ~L~~~~~~~~~~~tvle~l~~n~~~----~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~ 305 (343) .|..-. +..|.-++..-..+.+. ..+-|.|+.... .. ..+.+.++ ++|-+-.+.+.+..-.. 
T Consensus 353 ~l~~lk--d~~G~~i~~~~~~~~~~~~~~~~l~G~pv~~~~-------~~---p~~~~~~~--~~~~~f~~~~~~~~~~~ 418 (458) T protein:vir:10 353 YYDLLE--DEEWQDVAQVGNDSVKLQGQVGRIYGLPVVVSE-------YF---PAKANSAE--FAVIVYKDNFVMPRQRA 418 (458) T ss_pred HHHhhc--ccCCceeeccccccccccCcCceecceeeEEcc-------cc---ccccCCcc--eEEEEecccEEEEEeec Confidence 886432 33332121111111111 123344432221 11 11111122 22222223333322222 Q ss_pred hhcccceecCc-eeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 306 FRMLAPQLLGL-GITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 306 ~~~~~~~~~~~-~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++...-..... ...+-...|+ |..+++|.+++..+== T Consensus 419 ~~v~~d~~~~~~~~~~~~~~r~-~~~v~~~~a~v~~~~a 456 (458) T protein:vir:10 419 VTVERERQAGKQRDAYYVTQRV-NLQRYFANGVVSGTYA 456 (458) T ss_pred eEEEeecccCCCceEEEEEEEe-cceEecccceEEEeec Confidence 33221111111 2345557776 6888999988772111 No 39 >protein:vir:78830 Length: 324 # NCBI annotation: major head protein # Family: family:all:507 # MgeID: mge:1858 # MgeName: 80alpha # Cross-refs: genbank:acc:YP_001285361;genbank:gi:148717889;genbank:GeneID:5246961 Probab=98.57 E-value=1.8e-08 Score=63.02 Aligned_cols=293 Identities=8% Similarity=-0.007 Sum_probs=160.3 Q ss_pred CCccee-ccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcc Q lcl|NC_011142. 1 MSEKRV-VIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEY 79 (343) Q Consensus 1 ~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~ 79 (343) |.++.- -.+.+..+++......+.+... +....++..... .+...+++.....-..+.++++.. .. 
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~a~~~-------~~~~~~~~~iP~---~~~~~ii~~~~~~s~l~~l~~~~~---~~ 67 (324) T protein:vir:78 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNV-------MMHEKKDGTLMN---EFTTPILQEVMENSKIMQLGKYEP---ME 67 (324) T ss_pred CCcchhhhHHHHHHHHHhhhhhhhccccc-------cccCcCccccch---hHHHHHHHHHHhhchhhhhcceee---cc Confidence 544322 2334444544443333333311 222223332222 334566666666666666655432 22 Q ss_pred eeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhh Q lcl|NC_011142. 80 ATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQ 159 (343) Q Consensus 80 ~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n 159 (343) ...+.+.+.+..+.+.+++.. ..+|..+..+++.......++.-..++.+=++.+ ..++...-....++++++.+| T Consensus 68 ~~~~~~p~~~~~~~a~~v~Eg-~~~~~~~~~~~~v~~~~~k~~~~~~is~ell~ds---~~~l~~~i~~~la~ai~~~~d 143 (324) T protein:vir:78 68 GTEKKFTFWADKPGAYWVGEG-QKIETSKATWVNATMRAFKLGVILPVTKEFLNYT---YSQFFEEMKPMIAEAFYKKFD 143 (324) T ss_pred CCceEEEEEecCcceeEecCC-ccccccccceeEEEEeeEEEEEeehhhHHHHhcc---hHHHHHHHHHHHHHHHHHHHH Confidence 234566777777788888774 5688888889999999999998888877545533 356888888889999999999 Q ss_pred heeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCC Q lcl|NC_011142. 160 RVAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTG 238 (343) Q Consensus 160 ~~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~ 238 (343) +.+|+|+...+ ..|+++..+....... ...-++||.+++.++... + ..+..++|+|+.+..|.+.. 
+ T Consensus 144 ~a~l~G~g~~~~~~gi~~~~~~~~~~~~-------~~~t~~~i~~~~~~l~~~--~-~~~~~~vmn~~~~~~L~~l~--d 211 (324) T protein:vir:78 144 EAGILNQGNNPFGKSIAQSIEKTNKVIK-------GDFTQDNIIDLEALLEDD--E-LEANAFISKTQNRSLLRKIV--D 211 (324) T ss_pred HHHhccCCCCCcCccccccccccceecc-------ccccHHHHHHHHHhhhhc--c-CCCCEEEEcHHHHHHHHHhh--c Confidence 99999976432 3455554442221111 111267778888777542 2 45678999999999987533 2 Q ss_pred CCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc------- Q lcl|NC_011142. 239 YTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP------- 311 (343) Q Consensus 239 ~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~------- 311 (343) ..+.-++ .......+-|.|+... . ....++..++.-+.+ .+-+.....++.-.. T Consensus 212 ~~G~~~~----~~~~~~~l~G~PV~~~-------~------~~~~~~~~~~~gd~~--~~~~g~~~~~~i~~~~~~~~~~ 272 (324) T protein:vir:78 212 PETKERI----YDRNSDSLDGLPVVNL-------K------SSNLKRGELITGDFD--KLIYGIPQLIEYKIDETAQLST 272 (324) T ss_pred cCCCeee----cCCCCCcccceeeEee-------C------CCCCCcceEEEEecc--eEEEEEecCcEEEEeecccccc Confidence 2232211 1111122334443211 0 011122222222211 121222222221110 Q ss_pred -----------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 -----------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 -----------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.++ ...+.+..++ |+.+.+|.|++++.|. 
T Consensus 273 ~~~~~~~~~~~f~~d-~~~~r~~~r~-d~~v~~~~A~~~l~~a 313 (324) T protein:vir:78 273 VKNEDGTPVNLFEQD-MVALRATMHV-ALHIADDKAFAKLVPA 313 (324) T ss_pred cccccccchhhhhcC-cEEEEEEEEE-ccEEecccceEEEecc Confidence 0111 2444556766 6778889999999999 No 40 >protein:vir:96392 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1613 # MgeName: 53 # Cross-refs: genbank:acc:YP_239648;genbank:gi:66395381;genbank:GeneID:5132868 Probab=98.57 E-value=1.8e-08 Score=63.02 Aligned_cols=293 Identities=8% Similarity=-0.007 Sum_probs=160.3 Q ss_pred CCccee-ccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcc Q lcl|NC_011142. 1 MSEKRV-VIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEY 79 (343) Q Consensus 1 ~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~ 79 (343) |.++.- -.+.+..+++......+.+... +....++..... .+...+++.....-..+.++++.. .. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~a~~~-------~~~~~~~~~iP~---~~~~~ii~~~~~~s~l~~l~~~~~---~~ 67 (324) T protein:vir:96 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNV-------MMHEKKDGTLMN---EFTTPILQEVMENSKIMQLGKYEP---ME 67 (324) T ss_pred CCcchhhhHHHHHHHHHhhhhhhhccccc-------cccCcCccccch---hHHHHHHHHHHhhchhhhhcceee---cc Confidence 544322 2334444544443333333311 222223332222 334566666666666666655432 22 Q ss_pred eeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhh Q lcl|NC_011142. 80 ATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQ 159 (343) Q Consensus 80 ~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n 159 (343) ...+.+.+.+..+.+.+++.. 
..+|..+..+++.......++.-..++.+=++.+ ..++...-....++++++.+| T Consensus 68 ~~~~~~p~~~~~~~a~~v~Eg-~~~~~~~~~~~~v~~~~~k~~~~~~is~ell~ds---~~~l~~~i~~~la~ai~~~~d 143 (324) T protein:vir:96 68 GTEKKFTFWADKPGAYWVGEG-QKIETSKATWVNATMRAFKLGVILPVTKEFLNYT---YSQFFEEMKPMIAEAFYKKFD 143 (324) T ss_pred CCceEEEEEecCcceeEecCC-ccccccccceeEEEEeeEEEEEeehhhHHHHhcc---hHHHHHHHHHHHHHHHHHHHH Confidence 234566777777788888774 5688888889999999999998888877545533 356888888889999999999 Q ss_pred heeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCC Q lcl|NC_011142. 160 RVAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTG 238 (343) Q Consensus 160 ~~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~ 238 (343) +.+|+|+...+ ..|+++..+....... ...-++||.+++.++... + ..+..++|+|+.+..|.+.. + T Consensus 144 ~a~l~G~g~~~~~~gi~~~~~~~~~~~~-------~~~t~~~i~~~~~~l~~~--~-~~~~~~vmn~~~~~~L~~l~--d 211 (324) T protein:vir:96 144 EAGILNQGNNPFGKSIAQSIEKTNKVIK-------GDFTQDNIIDLEALLEDD--E-LEANAFISKTQNRSLLRKIV--D 211 (324) T ss_pred HHHhccCCCCCcCccccccccccceecc-------ccccHHHHHHHHHhhhhc--c-CCCCEEEEcHHHHHHHHHhh--c Confidence 99999976432 3455554442221111 111267778888777542 2 45678999999999987533 2 Q ss_pred CCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc------- Q lcl|NC_011142. 239 YTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP------- 311 (343) Q Consensus 239 ~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~------- 311 (343) ..+.-++ .......+-|.|+... . ....++..++.-+.+ .+-+.....++.-.. 
T Consensus 212 ~~G~~~~----~~~~~~~l~G~PV~~~-------~------~~~~~~~~~~~gd~~--~~~~g~~~~~~i~~~~~~~~~~ 272 (324) T protein:vir:96 212 PETKERI----YDRNSDSLDGLPVVNL-------K------SSNLKRGELITGDFD--KLIYGIPQLIEYKIDETAQLST 272 (324) T ss_pred cCCCeee----cCCCCCcccceeeEee-------C------CCCCCcceEEEEecc--eEEEEEecCcEEEEeecccccc Confidence 2232211 1111122334443211 0 011122222222211 121222222221110 Q ss_pred -----------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 -----------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 -----------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.++ ...+.+..++ |+.+.+|.|++++.|. T Consensus 273 ~~~~~~~~~~~f~~d-~~~~r~~~r~-d~~v~~~~A~~~l~~a 313 (324) T protein:vir:96 273 VKNEDGTPVNLFEQD-MVALRATMHV-ALHIADDKAFAKLVPA 313 (324) T ss_pred cccccccchhhhhcC-cEEEEEEEEE-ccEEecccceEEEecc Confidence 0111 2444556766 6778889999999999 No 41 >protein:vir:103955 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1662 # MgeName: phiNM # Cross-refs: genbank:acc:YP_873992;genbank:gi:118430767;genbank:GeneID:4525449 Probab=98.55 E-value=3e-08 Score=61.86 Aligned_cols=291 Identities=9% Similarity=-0.015 Sum_probs=157.9 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |++.+ .+++..+++..-...+.++- .|..+.++..... .+...|++.....-..+++.++.. ... 
T Consensus 4 ~~~~~--~~~~~f~~~~~~~~~~~a~~-------~~~~~~~~~liP~---~~~~~ii~~~~~~s~l~~~~~~~~---~~~ 68 (324) T protein:vir:10 4 TQKLK--LNLQHFASNNVKPQVFNPDN-------VMMHEKKDGTLLN---DFTTPILQEVMENSKIMQLGKYEP---MEG 68 (324) T ss_pred chHHH--HHHHHHHHHhhccceecccc-------eeccCCCcceech---hHHHHHHHHHHhhchhhhhcceee---ccC Confidence 33333 33555555444444333321 1222222322222 334556665555555566555432 222 Q ss_pred eEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhh Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQR 160 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 160 (343) .++.+.+.+..+.+.+++.. ..+|..+..++........++.-..++.+=++.+ ..++...-.....+++++.+|+ T Consensus 69 ~~~~~p~~~~~~~a~~v~Eg-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds---~~~l~~~i~~~l~~ai~~~~d~ 144 (324) T protein:vir:10 69 TEKKFTFWADKPGAYWVGEG-QKIETSKATWVNATMRAFKLGVILPVTKEFLNYT---YSQFFEEMKPMIAEAFYKKFDE 144 (324) T ss_pred CceEEEEEeCCcceeEeccC-ccccccccceeEEEEeeEEEEEeehhhHHHHhcc---hHHHHHHHHHHHHHHHHHHHHH Confidence 34566677777788888765 5588888889999999999999888887655543 3567788888899999999999 Q ss_pred eeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCC Q lcl|NC_011142. 161 VAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGY 239 (343) Q Consensus 161 ~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~ 239 (343) .+++|+...+ -.|+++......... . ...-++||.+++..+... ...+..++|+|+.+..|.+.. +. 
T Consensus 145 a~l~G~g~~~~~~~i~~~~~~~~~~~----~---~~~t~~~i~~~~~~l~~~---~~~~~~~v~n~~~~~~L~~l~--d~ 212 (324) T protein:vir:10 145 AGILNQGNNPFGKSIAQSIEKTNKVI----K---GDFTQDNIIDLEALLEDD---ELEANAFISKTQNRSLLRKIV--DP 212 (324) T ss_pred HhhhcCCCCccCccccccccccceec----c---ccCCHHHHHHHHHhhhhc---cCCCCEEEEcHHHHHHHHHhh--cc Confidence 9999976432 344544332111111 1 112267778888777542 345678999999999987533 22 Q ss_pred CCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-------- Q lcl|NC_011142. 240 TDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-------- 311 (343) Q Consensus 240 ~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-------- 311 (343) .+.-++ ....+. .+-|.|+.+. . ....++..+++-+ ...+-+.+..+++.-.. T Consensus 213 ~g~~~~---~~~~~~-~l~G~PV~~~-------~------~~~~~~~~~~~gd--~~~~~~~~~~~~~i~~~~~~~~~~~ 273 (324) T protein:vir:10 213 ETKERI---YDRNSD-TLDGLPVVNL-------K------SSNLKRGELITGD--FDKLIYGIPQLIEYKIDETAQLSTV 273 (324) T ss_pred CCceee---cCCCCc-cccceeEEee-------c------CCCCCcceEEEEe--cccEEEEEecCcEEEEeeccccccc Confidence 222111 111111 2334443211 1 1111222222211 12222222222221100 Q ss_pred ----------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 ----------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 ----------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.++ ...+.++.++ |..+.+|.|++.+.|. 
T Consensus 274 ~~~~~~~~~~~~~~-~~~~r~~~r~-d~~v~~~~A~~~l~~a 313 (324) T protein:vir:10 274 KNEDGTPVNLFEQD-MVALRATMHV-ALHIADDKAFAKLVPA 313 (324) T ss_pred ccccccchhhhhcC-cEEEEEEEEE-ccEEecccceEEEEec Confidence 0112 2445566776 5666789999999999 No 42 >protein:vir:7855 Length: 497 # NCBI annotation: gp12 # Family: family:all:585 # MgeID: mge:150 # MgeName: CJW1 # Cross-refs: genbank:acc:NP_817462;genbank:gi:29565891;genbank:GeneID:1259081 Probab=98.53 E-value=2.9e-08 Score=61.92 Aligned_cols=316 Identities=13% Similarity=0.062 Sum_probs=159.3 Q ss_pred CC---------cceeccchhhhh---------chhhhchh----cccccccCcchhecchhhhhhhhHHHHHHHHHHHHh Q lcl|NC_011142. 1 MS---------EKRVVIDAQTIA---------GNRWLNKF----LDSNATIGVPSVVNDADGGAAYYISQLASLETTVYE 58 (343) Q Consensus 1 ~~---------~~~~~~~~~~~~---------~~~~~~~~----~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e 58 (343) +. +....++...-. .......+ ..++... ..+.+-..+.+.++.. +.+.+.|++ T Consensus 98 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~gg~~vp--~~~~~~ii~ 173 (497) T protein:vir:78 98 MNPELKNATSFEKGTKFDVSFNVSAKAADPGTAAAELMGAFADGETAPAAI--GQNPFGSTGTFAPGIL--PTFLPGIVE 173 (497) T ss_pred hhHHHHhhhhhhhhhhhhhhhhhhhhhhhhHHHHHHHHHHHhhhhhhHHHH--HhhhcccCcccccccc--hhhhHHHHH Confidence 00 011111100000 00000000 0000000 0111111223333333 345567888 Q ss_pred hhhhcccchhhccccCCCCcceeEEEEeeec-ccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHH Q lcl|NC_011142. 59 VPYADITYLEDVPVLANIPEYATHWNYRSYD-GAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAA 137 (343) Q Consensus 59 ~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~-~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~ 137 (343) ..++....+.++++..-.+ ..+.|.... ..+.+.+++.. ..+|..+..++......+.++.-..+|.+ |..-. 
T Consensus 174 ~~~~~~~i~~l~~~~~~~~---~~~~~~~~~~~~~~a~wv~E~-~~~~~s~~~f~~i~~~~~k~a~~~~iS~e-ll~d~- 247 (497) T protein:vir:78 174 QLFYELSLADLISSRPVTS---PNLSYLTESAAHNNAAAVAEA-GTYPFSSEEFARVYEQVGKVANALTITDE-GLRDA- 247 (497) T ss_pred HHHhhhhHHhhccccccCC---CceEEEEEcCCCCcceeeccC-cccccccccceeeEeeeeeeEeecHhHHH-HHHhH- Confidence 8888888788876533222 234555433 34567788765 45888888899999999999987777654 44322 Q ss_pred hCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCcc--------------------------- Q lcl|NC_011142. 138 VNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYA--------------------------- 190 (343) Q Consensus 138 ~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~--------------------------- 190 (343) . .|..--....++++++.+|+-+++|+...+..||++.++.........+. T Consensus 248 -~-~l~~~i~~~l~~~i~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 325 (497) T protein:vir:78 248 -P-ELFNFVQGRLLEGIQRKEEVQLLAGGGYPGVNGLLQRSTGFTASSASSLFGATSATVSNVKFPADGTNGAFVGQDTV 325 (497) T ss_pred -H-HHHHHHHHHHHHHHHHHHHHHhhcCCCcccccccccccccccccccccchhhhhhhhhhhhhhcccccchhhhhhHH Confidence 2 37777888899999999999999999877889999998754322211110 Q ss_pred ----------------------ccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHH- Q lcl|NC_011142. 191 ----------------------TCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEH- 247 (343) Q Consensus 191 ----------------------~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~- 247 (343) ..+..+.+.++..++..+.. .+...|..++|+|..|..|.+- -+..|.-++.- T Consensus 326 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~vmn~~~~~~l~~l--kd~~G~~i~~~~ 401 (497) T protein:vir:78 326 ASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAFVDIQL--TLFQTPNAVVMNPRDWELLRLT--KDANGQYMGGNF 401 (497) T ss_pred HHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHHhhhhh--hcccCCCeEEEchHHHHHHHHh--hcCCCceeccCc Confidence 01233455666666666654 3455678899999999988642 24444322210 Q ss_pred -----HHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCceeEeee Q lcl|NC_011142. 
248 -----FQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLGITVPA 322 (343) Q Consensus 248 -----l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~ 322 (343) -.-.+....+-|.|+....... .... -.|.-+.-...++ +...+++.+..- ....-.+| ...+.+ T Consensus 402 ~~~~~~~~~~~~~~l~G~pV~~t~~~~--~~~~---~~Gd~~~~~~~i~--~r~~~~v~~~~~--~~~~f~~n-~v~~r~ 471 (497) T protein:vir:78 402 FGNAYGNPVNGGKNIWGVPVVTTPLIP--LGTI---LVGHFAPSVIQTA--RREGVTMQMTNS--NGTDFVDG-KVTVRA 471 (497) T ss_pred ccccccccccCCceeeceeeEecCCCC--CCce---EEeecccceEEEE--EecccEEEeecc--cchhhhcC-cEEEEE Confidence 0000000122244432221100 0000 0000000001111 112222221100 00001123 356667 Q ss_pred eeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 323 EYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 323 ~~~~gGv~i~~P~ai~~~dGI 343 (343) +.|++ +.+++|.||++++-. T Consensus 472 ~~r~~-~~v~~p~A~~~l~~~ 491 (497) T protein:vir:78 472 EERLG-LLVYRPSAFQLIQLK 491 (497) T ss_pred EEeec-ceeeccccEEEEEec Confidence 88875 588899999999988 No 43 >protein:vir:101650 Length: 497 # NCBI annotation: gp13 # Family: family:all:585 # MgeID: mge:1515 # MgeName: 244 # Cross-refs: genbank:acc:YP_654768;genbank:gi:109302766;genbank:GeneID:4156084 Probab=98.53 E-value=2.9e-08 Score=61.92 Aligned_cols=316 Identities=13% Similarity=0.062 Sum_probs=159.3 Q ss_pred CC---------cceeccchhhhh---------chhhhchh----cccccccCcchhecchhhhhhhhHHHHHHHHHHHHh Q lcl|NC_011142. 1 MS---------EKRVVIDAQTIA---------GNRWLNKF----LDSNATIGVPSVVNDADGGAAYYISQLASLETTVYE 58 (343) Q Consensus 1 ~~---------~~~~~~~~~~~~---------~~~~~~~~----~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e 58 (343) +. +....++...-. .......+ ..++... ..+.+-..+.+.++.. 
+.+.+.|++ T Consensus 98 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~gg~~vp--~~~~~~ii~ 173 (497) T protein:vir:10 98 MNPELKNATSFEKGTKFDVSFNVSAKAADPGTAAAELMGAFADGETAPAAI--GQNPFGSTGTFAPGIL--PTFLPGIVE 173 (497) T ss_pred hhHHHHhhhhhhhhhhhhhhhhhhhhhhhhHHHHHHHHHHHhhhhhhHHHH--HhhhcccCcccccccc--hhhhHHHHH Confidence 00 011111100000 00000000 0000000 0111111223333333 345567888 Q ss_pred hhhhcccchhhccccCCCCcceeEEEEeeec-ccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHH Q lcl|NC_011142. 59 VPYADITYLEDVPVLANIPEYATHWNYRSYD-GAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAA 137 (343) Q Consensus 59 ~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~-~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~ 137 (343) ..++....+.++++..-.+ ..+.|.... ..+.+.+++.. ..+|..+..++......+.++.-..+|.+ |..-. T Consensus 174 ~~~~~~~i~~l~~~~~~~~---~~~~~~~~~~~~~~a~wv~E~-~~~~~s~~~f~~i~~~~~k~a~~~~iS~e-ll~d~- 247 (497) T protein:vir:10 174 QLFYELSLADLISSRPVTS---PNLSYLTESAAHNNAAAVAEA-GTYPFSSEEFARVYEQVGKVANALTITDE-GLRDA- 247 (497) T ss_pred HHHhhhhHHhhccccccCC---CceEEEEEcCCCCcceeeccC-cccccccccceeeEeeeeeeEeecHhHHH-HHHhH- Confidence 8888888788876533222 234555433 34567788765 45888888899999999999987777654 44322 Q ss_pred hCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCcc--------------------------- Q lcl|NC_011142. 138 VNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYA--------------------------- 190 (343) Q Consensus 138 ~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~--------------------------- 190 (343) . .|..--....++++++.+|+-+++|+...+..||++.++.........+. 
T Consensus 248 -~-~l~~~i~~~l~~~i~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 325 (497) T protein:vir:10 248 -P-ELFNFVQGRLLEGIQRKEEVQLLAGGGYPGVNGLLQRSTGFTASSASSLFGATSATVSNVKFPADGTNGAFVGQDTV 325 (497) T ss_pred -H-HHHHHHHHHHHHHHHHHHHHHhhcCCCcccccccccccccccccccccchhhhhhhhhhhhhhcccccchhhhhhHH Confidence 2 37777888899999999999999999877889999998754322211110 Q ss_pred ----------------------ccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHH- Q lcl|NC_011142. 191 ----------------------TCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEH- 247 (343) Q Consensus 191 ----------------------~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~- 247 (343) ..+..+.+.++..++..+.. .+...|..++|+|..|..|.+- -+..|.-++.- T Consensus 326 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~vmn~~~~~~l~~l--kd~~G~~i~~~~ 401 (497) T protein:vir:10 326 ASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAFVDIQL--TLFQTPNAVVMNPRDWELLRLT--KDANGQYMGGNF 401 (497) T ss_pred HHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHHhhhhh--hcccCCCeEEEchHHHHHHHHh--hcCCCceeccCc Confidence 01233455666666666654 3455678899999999988642 24444322210 Q ss_pred -----HHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCceeEeee Q lcl|NC_011142. 248 -----FQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLGITVPA 322 (343) Q Consensus 248 -----l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~ 322 (343) -.-.+....+-|.|+....... .... -.|.-+.-...++ +...+++.+..- ....-.+| ...+.+ T Consensus 402 ~~~~~~~~~~~~~~l~G~pV~~t~~~~--~~~~---~~Gd~~~~~~~i~--~r~~~~v~~~~~--~~~~f~~n-~v~~r~ 471 (497) T protein:vir:10 402 FGNAYGNPVNGGKNIWGVPVVTTPLIP--LGTI---LVGHFAPSVIQTA--RREGVTMQMTNS--NGTDFVDG-KVTVRA 471 (497) T ss_pred ccccccccccCCceeeceeeEecCCCC--CCce---EEeecccceEEEE--EecccEEEeecc--cchhhhcC-cEEEEE Confidence 0000000122244432221100 0000 0000000001111 112222221100 00001123 356667 Q ss_pred eeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
323 EYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 323 ~~~~gGv~i~~P~ai~~~dGI 343 (343) +.|++ +.+++|.||++++-. T Consensus 472 ~~r~~-~~v~~p~A~~~l~~~ 491 (497) T protein:vir:10 472 EERLG-LLVYRPSAFQLIQLK 491 (497) T ss_pred EEeec-ceeeccccEEEEEec Confidence 88875 588899999999988 No 44 >protein:vir:97148 Length: 324 # NCBI annotation: ORF010 # Family: family:all:507 # MgeID: mge:1654 # MgeName: 85 # Cross-refs: genbank:acc:YP_239726;genbank:gi:66394880;genbank:GeneID:5130881 Probab=98.52 E-value=4.4e-08 Score=60.92 Aligned_cols=295 Identities=8% Similarity=-0.012 Sum_probs=158.9 Q ss_pred CCcc-eeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcc Q lcl|NC_011142. 1 MSEK-RVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEY 79 (343) Q Consensus 1 ~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~ 79 (343) |..+ +-..+++..+.+......+.+.. .+.++.++....+ .+...+++........+.+..+. +.. T Consensus 1 ~~~~~~~~~~~~~f~~~~~~~~~~~a~~-------~~~~~~~~~~iP~---~~~~~ii~~~~~~s~l~~~~~~~---~~~ 67 (324) T protein:vir:97 1 MEQTQKLKLNLQHFASNNVKPQVFNPDN-------VMMHEKKDGTLMN---EFTTPILQEVMENSKIMQLGKYE---PME 67 (324) T ss_pred CccchhHHHHHHHHHHhhhhhhhhcccc-------ccccCCCcceech---hHHHHHHHHHHhhcchhhhccee---ecc Confidence 4332 11233444444433333332221 1122333333333 33456667666666666665543 222 Q ss_pred eeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhh Q lcl|NC_011142. 80 ATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQ 159 (343) Q Consensus 80 ~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n 159 (343) ..+..+.+....+.+.+++.. 
..+|..+..++........++.-..++.+=++.+ ..++...-....++++++.+| T Consensus 68 ~~~~~ip~~~~~~~a~~v~Eg-~~~~~~~~~f~~v~~~~~k~~~~~~is~ell~ds---~~~l~~~i~~~l~~aia~~~d 143 (324) T protein:vir:97 68 GTEKKFTFWADKPGAYWVGEG-QKIETSKATWVNATMRAFKLGVILPVTKEFLNYT---YSQFFEEMKPMIAEAFYKKFD 143 (324) T ss_pred CCceEEEEEecCcceeEeccC-ccccccccceeEEEEeeEEEEEeehhhHHHHhcc---hHHHHHHHHHHHHHHHHHHHH Confidence 334567777777888888876 4688889999999999999999988887545443 456888888889999999999 Q ss_pred heeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCC Q lcl|NC_011142. 160 RVAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTG 238 (343) Q Consensus 160 ~~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~ 238 (343) +.+++|+...+ ..|+++..+.......+ + .-++||.+++.++... + ..+.+++|+|..+..|.+.. + T Consensus 144 ~a~l~G~g~~~~~~gi~~~~~~~~~~~~~---~----~~~~~i~~~~~~l~~~--~-~~~~~~v~n~~~~~~L~~lk--d 211 (324) T protein:vir:97 144 EAGILNQGNNPFGKSIAQSIEKTNKVIKG---D----FTQDNIIDLEALLEDD--E-LEANAFISKTQNRSLLRKIV--D 211 (324) T ss_pred HHhhccCCCCccCccccccccccceeccc---c----CCHHHHHHHHHhhhhc--c-CCCCEEEEcHHHHHHHHHhh--c Confidence 99999986543 35555554322211111 1 1256777888777542 2 35678999999999987532 2 Q ss_pred CCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcc------cceEEEeeccchhcccc- Q lcl|NC_011142. 239 YTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKS------ERNLALAKPIPFRMLAP- 311 (343) Q Consensus 239 ~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~------~~~~~~~v~~~~~~~~~- 311 (343) ..+..++ .......+-|.|+... . ....++..++.-+.+ .+.+.+.+-........ 
T Consensus 212 ~~g~~~~----~~~~~~tl~G~PV~~~-------~------~~~~~~~~~~~gd~~~~~i~~~~~~~i~~~~~~~~~~~~ 274 (324) T protein:vir:97 212 PETKERI----YDRNSDTLDGLPVVNL-------K------SSNLKRGELITGDFDKLIYGIPQLIEYKIDETAQLSTVK 274 (324) T ss_pred CCCceee----cCCCCccccceeeEee-------c------CCCCCcceEEEEecccEEEEEecCcEEEEeecccccccc Confidence 2222111 1111112334443211 0 001111112211111 11122222111100000 Q ss_pred ---------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 ---------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 ---------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.++ ...+.+..++ ++.+.+|.|++.+.+. T Consensus 275 ~~~~~~~~~f~~d-~~~~r~~~r~-d~~v~~~~a~~~l~~~ 313 (324) T protein:vir:97 275 NEDGTPVNLFEQD-MVALRATMHV-ALHIADDKAFAKLVPA 313 (324) T ss_pred cccccchhhhhcC-cEEEEEEEEe-ccEEecccceEEEEec Confidence 0112 2344556776 5666789999999999 No 45 >protein:vir:99749 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1497 # MgeName: phiETA2 # Cross-refs: genbank:acc:YP_001004307;genbank:gi:122891761;genbank:GeneID:4712304 Probab=98.52 E-value=4.2e-08 Score=61.05 Aligned_cols=291 Identities=9% Similarity=-0.014 Sum_probs=158.0 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |++.+ .+++...++..-...+.++- .+..+.++..... .+...+++.....-..+++..+.. ... 
T Consensus 4 ~~~~~--~~~~~~~~~~~~~~~~~a~~-------~~~~~~~~~lip~---~~~~~ii~~~~~~s~l~~~~~~~~---~~~ 68 (324) T protein:vir:99 4 TQKLK--LNLQHFASNNVKPQVFNPDN-------VMMHEKKDGTLLN---DFTTPILQEVMENSKIMRLGKYEP---MEG 68 (324) T ss_pred chHhh--HHHHHHHHHhhhhhhccccc-------eeccCCCcceech---hHHHHHHHHHHhhchhhhhcceee---ccC Confidence 33332 33555554444443333321 1222333332323 334556665555555566555432 222 Q ss_pred eEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhh Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQR 160 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 160 (343) .+..+.+.+..+.+.+++.. ..+|..+..++........++.-..++.+-++.+ ..++...-.....+++++.+++ T Consensus 69 ~~~~~p~~~~~~~a~~v~Eg-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds---~~~l~~~i~~~l~~ai~~~~d~ 144 (324) T protein:vir:99 69 TEKKFTFWADKPGAYWVGEG-QKIETSKATWVNATMRAFKLGVILPVTKEFLNYT---YSQFFEEMKPMIAEAFYKKFDE 144 (324) T ss_pred CceEEEEEecCcceeEeccC-ccccccccceeEEEEeeEEEEEeehhhHHHHhcc---hHHHHHHHHHHHHHHHHHHHHH Confidence 34566667777788888764 5688889999999999999999999887655544 3467788888899999999999 Q ss_pred eeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCC Q lcl|NC_011142. 161 VAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGY 239 (343) Q Consensus 161 ~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~ 239 (343) .+++|+...+ ..|+++........+ . ...-++||.+++..+.. ....+..++++|+.|..|.+-. +. 
T Consensus 145 ~~l~G~g~~~~~~~~~~~~~~~~~~~----~---~~~~~~~i~~~~~~l~~---~~~~~~~~v~n~~~~~~L~~l~--d~ 212 (324) T protein:vir:99 145 AGILNQGNNPFGKSIAQSIEKTNKVI----K---GDFTQDNIIDLEALLED---DELEANAFISKTQNRSLLRKIV--DP 212 (324) T ss_pred HhhhcCCCCccCccccccccccceec----c---ccCCHHHHHHHHHhhhh---ccCCCCEEEEcHHHHHHHHHhh--cC Confidence 9999976432 244444332111111 1 11226777888877754 2345678999999999987432 22 Q ss_pred CCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-------- Q lcl|NC_011142. 240 TDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-------- 311 (343) Q Consensus 240 ~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-------- 311 (343) .+.-++ ....+ -.+-|.|+... . ....++..+++-+ ...+-+.+..+++.-.. T Consensus 213 ~g~~~~---~~~~~-~~l~G~PVv~~-------~------~~~~~~~~~i~gd--~~~~~~~~~~~~~i~~~~~~~~~~~ 273 (324) T protein:vir:99 213 ETKERI---YDRNS-DTLDGLPVVNL-------K------SSNLKRGELITGD--FDKLIYGIPQLIEYKIDETAQLSTV 273 (324) T ss_pred CCceee---cCCCC-ccccceeEEee-------c------CCCCCcceEEEEe--cccEEEEEecCcEEEEeeccccccc Confidence 222111 11111 12234443211 1 1111222222212 22222322232222110 Q ss_pred ----------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 ----------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 ----------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.++ ...+.++.++ |+.+.+|.|++.+.|. 
T Consensus 274 ~~~~~~~~~~f~~~-~~~~r~~~r~-d~~v~~~~a~~~lt~a 313 (324) T protein:vir:99 274 KNEDGTPVNLFEQD-MVALRATMHV-ALHIADDKAFAKLVPA 313 (324) T ss_pred ccccccchhhhhcC-cEEEEEEEEE-ccEEecccceEEEEec Confidence 0112 2445567776 5667789999999999 No 46 >protein:vir:9309 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:165 # MgeName: phi 11 # Cross-refs: genbank:acc:NP_803287;genbank:gi:29028597;genbank:GeneID:1258044 Probab=98.51 E-value=5.2e-08 Score=60.52 Aligned_cols=293 Identities=9% Similarity=-0.010 Sum_probs=158.0 Q ss_pred CCccee-ccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcc Q lcl|NC_011142. 1 MSEKRV-VIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEY 79 (343) Q Consensus 1 ~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~ 79 (343) |.|++- -.+++..+.+......+.++- .+..+.++....+ .+...+++.....-..+++..+.. .. T Consensus 1 ~~~~~~~~~~~~~f~~~~~~~~~~~a~~-------~~~~~~~~~liP~---~~~~~ii~~~~~~s~l~~l~~~~~---~~ 67 (324) T protein:vir:93 1 MEQTQKLKLNLQHFASNNVKPQVFNPDN-------VMMHEKKDGTLLN---DFTTPILQEVMENSKIMQLGKYEP---ME 67 (324) T ss_pred CchhHHHHHHHHHHHHhhhhhhhccccc-------ccccCCCcceech---hHHHHHHHHHHhhchhhhhcceee---cc Confidence 433321 123444444433333332221 1112222323333 233556665555555555554422 22 Q ss_pred eeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhh Q lcl|NC_011142. 80 ATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQ 159 (343) Q Consensus 80 ~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n 159 (343) ...+.|.+.+..+.+.+++.. 
..+|..+..++........++.-+.++.+=++.+ ..++...-....++++++.+| T Consensus 68 ~~~~~ip~~~~~~~a~~v~Eg-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds---~~~l~~~i~~~l~~aia~~~d 143 (324) T protein:vir:93 68 GTEKKFTFWADKPGAYWVGEG-QKIETSKATWVNATMRAFKLGVILPVTKEFLNYT---YSQFFEEMKPMIAEAFYKKFD 143 (324) T ss_pred CCceEEEEEecCcceeeecCC-ccccccccceeEEEEEeEEEEEeehhhHHHHhcc---hHHHHHHHHHHHHHHHHHHHH Confidence 233566677777778888765 5688888889999999999998888887555543 346778888888999999999 Q ss_pred heeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCC Q lcl|NC_011142. 160 RVAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTG 238 (343) Q Consensus 160 ~~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~ 238 (343) +.+++|+...+ ..|+++.......... ...-++||.+++.++... ...+..++++|+.+..|.+.. + T Consensus 144 ~a~l~G~g~~~~~~~~~~~~~~~~~~~~-------~~~~~~~i~~~~~~l~~~---~~~~~~~v~n~~~~~~L~~l~--d 211 (324) T protein:vir:93 144 EAGILNQGNNPFGKSIAQSIEKTNKVIK-------GDFTQDNIIDLEALLEDD---ELEANAFISKTQNRSLLRKIV--D 211 (324) T ss_pred HHHhcCCCCCCcCccccccccccceecc-------ccccHHHHHHHHHhhhhc---cCCCCEEEEcHHHHHHHHHhh--C Confidence 99999976432 3455544332211111 112267788888877542 235678999999999996432 3 Q ss_pred CCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc------- Q lcl|NC_011142. 239 YTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP------- 311 (343) Q Consensus 239 ~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~------- 311 (343) ..+.-++ .......+-|.|+... .....++..+++-+. ..+-+....+++.... 
T Consensus 212 ~~G~~~~----~~~~~~~l~G~PVv~~-------------~~~~~~~~~i~~gdf--s~~~~~~~~~~~i~~~~~~~~~~ 272 (324) T protein:vir:93 212 PETKERI----YDRNSDSLDGLPVVNL-------------KSSNLKRGELITGDF--DKLIYGIPQLIEYKIDETAQLST 272 (324) T ss_pred CCCCeee----cCCCCCcccceeeEee-------------cCCCCCcceEEEEec--ceEEEEEecCcEEEEeecccccc Confidence 3332111 1111122334443211 011112222222222 2222322233222111 Q ss_pred -----------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 -----------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 -----------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.++ ...+.+..++ |+.+.+|.|++++.+. T Consensus 273 ~~~~~~~~~~~f~~n-~~~~r~~~r~-d~~v~~~~a~~~l~~a 313 (324) T protein:vir:93 273 VKNEDGTPVNLFEQD-MVALRATMHV-ALHIADDKAFAKLVPA 313 (324) T ss_pred cccccccchhhhhcC-cEEEEEEEEe-ccEEecccceEEEecc Confidence 0112 2455567776 6778899999999998 No 47 >protein:vir:96223 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1607 # MgeName: 69 # Cross-refs: genbank:acc:YP_239571;genbank:gi:66395304;genbank:GeneID:5132771 Probab=98.49 E-value=6.9e-08 Score=59.85 Aligned_cols=291 Identities=8% Similarity=-0.017 Sum_probs=154.4 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |++.+ .|.+-..++-.-...+.++- .|..+.++.+..++ +-.+|++.....-..++++++.. .+ . 
T Consensus 4 ~~~~~--~~~~~f~~~~~~~~~~~a~~-------~~~~~~~~~lip~~---~~~~ii~~~~~~s~l~~l~~~~~-~~--~ 68 (324) T protein:vir:96 4 TQKLK--LNLQHFASNNVKPQVFNPDN-------VMMHEKKDGTLLND---FTTPILQEVMENSKIMQLGKYEP-ME--G 68 (324) T ss_pred chhhh--HHHHHHHHhhhhhhhccccc-------ccccCCCcceechh---HHHHHHHHHHhhchhhhhcceee-cc--C Confidence 44333 23333333222222221111 22233334333333 33456665555555556555432 22 2 Q ss_pred eEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhh Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQR 160 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 160 (343) ..+.|.+.+..+.+.+++.. ..+|..+..+++.....+.++.-..++.+=++.+ ..++...-....++++++.+|+ T Consensus 69 ~~~~~p~~~~~~~a~~v~Eg-~~~~~~~~~f~~v~~~~~k~~~~~~is~ell~ds---~~~l~~~i~~~l~~aia~~~d~ 144 (324) T protein:vir:96 69 TEKKFTFWADKPGAYWVGEG-QKIETSKATWVNATMRAFKLGVILPVTKEFLNYT---YSQFFEEMKPMIAEAFYKKFDE 144 (324) T ss_pred CceEEEEEecCcceeeecCC-ccccccccceeEEEEEeEEEEEeehhhHHHHhcc---hHHHHHHHHHHHHHHHHHHHHH Confidence 34566677767778888765 5588888889999999999998888887555543 3568888888899999999999 Q ss_pred eeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCC Q lcl|NC_011142. 161 VAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGY 239 (343) Q Consensus 161 ~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~ 239 (343) .+|+|+...+ ..|+++.... ...+...+ .-++||.+++.++... ...+..++++|+.+..|.+.. +. 
T Consensus 145 ~~l~G~g~~~~~~~~~~~~~~-----~~~~~~~~--~~~~~i~~~~~~i~~~---~~~~~~~i~n~~~~~~L~~lk--d~ 212 (324) T protein:vir:96 145 AGILNQGNNPFGKSIAQSIKK-----TNKVIKGD--FTQDNIIDLEALLEDD---ELEANAFISKTQNRSLLRKIV--DP 212 (324) T ss_pred HhhhcCCCCCcCccccccccc-----cceecccc--cchHHHHHHHHhhhhc---cCCCCEEEEcHHHHHHHHHhh--CC Confidence 9999976432 2333332211 11111111 1256777777776542 245678999999999987532 32 Q ss_pred CCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-------- Q lcl|NC_011142. 240 TDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-------- 311 (343) Q Consensus 240 ~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-------- 311 (343) .+.-++ .+.....+-|.|+... . ....++..++.-+.+ .+-+.+..+++.-.. T Consensus 213 ~G~~~~----~~~~~~~l~G~PV~~~-------~------~~~~~~~~~~~gd~s--~~~~~~~~~~~i~~~~~~~~~~~ 273 (324) T protein:vir:96 213 ETKERI----YDRNSDSLDGLPVVNL-------K------SSNLKRGELITGDFD--KLIYGIPQLIEYKIDETAQLSTV 273 (324) T ss_pred CCCeee----cCCCCCcccceeeEee-------c------CCCCCcceEEEEecc--eEEEEEecCcEEEEeeccccccc Confidence 332211 1111122334443211 0 111112222222211 222222233222110 Q ss_pred ----------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
312 ----------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 ----------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ..+| ...+.+..++ |+.+.+|.|++++.+- T Consensus 274 ~~~~~~~~~~~~~n-~v~~r~~~r~-d~~v~~~~a~~~l~~a 313 (324) T protein:vir:96 274 KNEDGTPVNLFEQD-MVALRATMHV-ALHIADDKAFAKLVPA 313 (324) T ss_pred ccccccchhhhhcC-cEEEEEEEEe-ccEEecccceEEEecc Confidence 1112 1345566766 6778889999999988 No 48 >protein:vir:78523 Length: 338 # NCBI annotation: Putative head structural protein # Family: family:all:507 # MgeID: mge:1853 # MgeName: U2 # Cross-refs: genbank:acc:YP_001491585;genbank:gi:157786408;genbank:GeneID:5625675 Probab=98.45 E-value=7.3e-08 Score=59.73 Aligned_cols=302 Identities=11% Similarity=-0.051 Sum_probs=153.0 Q ss_pred hhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecc-- Q lcl|NC_011142. 13 IAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDG-- 90 (343) Q Consensus 13 ~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~-- 90 (343) .+.-..++. +.....+.-...+.. +.+.. +.+-.+|++.....-..+++.++. +.+.+. ..+.+... T Consensus 1 ~~~~~e~~~----~~~~~~~~~~~~~~~-~~liP---~~~~~~ii~~~~~~s~l~~l~~~~-~~~~~~--~~ip~~~~~~ 69 (338) T protein:vir:78 1 MATLNELAP----NTAGSNHQGRLAHVP-SDLLP---KEIVGPIFDKAQESSLVLRLGENI-PISYGE--TIIPTTVKRP 69 (338) T ss_pred CcchHHhhh----hhcccccccceeccc-ccccc---hHHHHHHHHHHHhhchhhhhccee-eccCCc--eEEEEEecCc Confidence 111111110 000000000000111 11222 334456777777777777777653 233332 23333321 Q ss_pred ------cccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeee Q lcl|NC_011142. 91 ------AAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYF 164 (343) Q Consensus 91 ------~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~ 164 (343) .+.+.+++.. ..+|..+..++......+.++.-..++.+=++. 
...++..--....++++++.+|+.+++ T Consensus 70 ~a~~v~~~~~~~~~Eg-~~~~~~~~~f~~v~l~~~k~~~~~~is~ell~d---s~~~~~~~i~~~la~a~~~~~d~~~l~ 145 (338) T protein:vir:78 70 EVGQVGVGTSNEQREG-GTKPLSGTAWDTRSVAPIKLATIVTVSEEFARM---NPSGLYTKLQADLAYAIGRGIDLAVFH 145 (338) T ss_pred cceeeccccccccccc-ccccccccceeEEEEEEEEEEEeehhhHHHHhc---CHHHHHHHHHHHHHHHHHHHHHHHhhc Confidence 2334444443 457777788888888888888888887643333 335677777888999999999999999 Q ss_pred eehh---hcceeeeecCCccccc-cCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccc-cCCC Q lcl|NC_011142. 165 GDTN---RNMSGLLNNPNVTKTS-ATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLL-MTGY 239 (343) Q Consensus 165 G~~~---~g~~GLlN~p~v~~~~-~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~-~~~~ 239 (343) |+.. .+..|++++......+ ....++ .....++++.+++..+... ....+..++|+|..+..|...+ ..+. T Consensus 146 G~g~~~~~~~~gi~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~--~~~~~~~~~m~~~~~~~L~~~~~l~d~ 221 (338) T protein:vir:78 146 GKSPLTGSALQGIDTNNVIVNTTNVDYLQT--GTTPLLDRFLDGYDLVSAN--TDVDFNGWAADPRYRARLLRSQAYRDA 221 (338) T ss_pred ccCCCccccccccccccccccccccccccc--cchhhHHHHHHHHHHhhhh--ccccceEEEEchHHHHHHHHHhhhccC Confidence 9874 3567887776654332 222222 2445688888888777543 3345678999999998885421 2233 Q ss_pred CCccHHHHHHhcCcceeeccccccccccceeeechhhh-ccccCCccceEEEEEcccceEEEeeccchhcccc------- Q lcl|NC_011142. 240 TDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAA-AGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP------- 311 (343) Q Consensus 240 ~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~-~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~------- 311 (343) .+.-++.-......-..+-|.|+.+.. .+.. .+...+.+..+++-+.+ .+.+.....++.... 
T Consensus 222 ~g~~l~~~~~~~~~~~~l~G~PV~~~~-------~ip~~~~~~~~~~~~~~~gdfs--~~~~~~~~~~~i~~~~~~~~~~ 292 (338) T protein:vir:78 222 NGNVDPTRINLAASAGDLLGLPVQFGK-------AVGGDLGAATDSKVRVVGGDFS--QLKYGFADEIRVKMSDTATLTD 292 (338) T ss_pred CCceeecccccCCCCceeeeeeEEEcc-------ccCccccccCCcccEEEEEecc--eEEEEeecccEEEEeecccccc Confidence 332221111111111123344443221 1111 11122222222221222 122222222221110 Q ss_pred ------eecCc----eeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 ------QLLGL----GITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 ------~~~~~----~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +..++ ...+.++.++ |..+.+|.|++++... T Consensus 293 ~~~~~~~~~~~~~~~~~~~r~~~r~-d~~v~~~~a~~~l~~~ 333 (338) T protein:vir:78 293 NTSPTPQTVSMWQTNQIAILIEVTF-GWLLGDKQAFVKFVDD 333 (338) T ss_pred cccccccchhhhhcCcEEEEEEEEe-ccEeecccceEEEecc Confidence 11111 1345567776 6889999999999999 No 49 >protein:vir:100135 Length: 418 # NCBI annotation: gp5 # Family: family:all:585 # MgeID: mge:1639 # MgeName: phi1026b # Cross-refs: genbank:acc:NP_945035;genbank:gi:38707895;genbank:GeneID:2744182 Probab=98.42 E-value=1.3e-07 Score=58.40 Aligned_cols=302 Identities=12% Similarity=0.077 Sum_probs=151.0 Q ss_pred CCcceec----cchhhhhchhhhchhc---------c----cccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhc Q lcl|NC_011142. 1 MSEKRVV----IDAQTIAGNRWLNKFL---------D----SNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYAD 63 (343) Q Consensus 1 ~~~~~~~----~~~~~~~~~~~~~~~~---------~----~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~ 63 (343) +.+.... ...+-...+..++... . ........ +..+...++... . +.+.+.|++..... 
T Consensus 87 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~g~lv-p--~~~~~~ii~~~~~~ 162 (418) T protein:vir:10 87 GGGSAELETPKTLGQLVTESEEMKGMDGSARKSVRVRVDRKSIMNVPAT-VGSGVSGSNSLV-V--ADRQAGIIAPPQRK 162 (418) T ss_pred cccccccchhhhhhHHhhhHHHHHHHHHHHhhhhhhhhHHHHHHHhhhh-ccCCCCCCcccc-c--hhHHHHHHHHHhhh Confidence 0000000 0000000111110000 0 00000000 011111223222 2 23455677777777 Q ss_pred ccchhhccccCCCCcceeEEEEeeecc-cccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCc Q lcl|NC_011142. 64 ITYLEDVPVLANIPEYATHWNYRSYDG-AAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPI 142 (343) Q Consensus 64 l~~~~~i~v~~~~~~~~~~~~~~~~~~-~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l 142 (343) ...+.++++.. .+..+..+..... .+.+.+++.. ..+|..+..++......+.++.-+.+|.+ +.... . ++ T Consensus 163 ~~l~~~~~~~~---~~~~~~~~~~~~~~~~~a~~v~E~-~~~~~~~~~f~~v~~~~~k~~~~~~is~e-ll~ds--~-~l 234 (418) T protein:vir:10 163 MTIRDLLMPGQ---TSSSSIEYTVETGFTNNAAAVAEG-AQKPTSDLKFNLKNQPVRTIAHLFKASRQ-ILDDA--P-AL 234 (418) T ss_pred hhHHhhcceee---ccCCceeEEEEecCCCceeeeccC-ccccccccceeeEEEeeeeEEEeehhhHH-HHHhH--H-HH Confidence 66677665432 1222344444433 3556677655 45788888889999999999988888865 44322 2 57 Q ss_pred cHHHHHHHHHHHHHhhhheeeeeehhh-cceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEE Q lcl|NC_011142. 143 DSMQAELAFRGSEEHSQRVAYFGDTNR-NMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTV 221 (343) Q Consensus 143 ~~~k~~aA~~~~~~~~n~~~f~G~~~~-g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL 221 (343) ..--.....+++++.+|+.+|+|+..- ...|++|..++...+.+.. ...-++||.+++..+.. 
....+..+ T Consensus 235 ~~~i~~~l~~a~~~~~d~a~l~G~g~~~~p~Gi~~~~~~~~~~~~~~-----~~~~~~~i~~~~~~~~~---~~~~~~~~ 306 (418) T protein:vir:10 235 QSYIDGRARYGLQLTEEGQILKGDGTGANILGILPQASAFMPSITLA-----NATPIDKIRLALLQAVL---AEFPATGI 306 (418) T ss_pred HHHHHHHHHHHHHHHHHHHHhccCCCCcccccccccccccccccccc-----ccccHHHHHHHHHhhcc---ccCCCCEE Confidence 777788889999999999999998644 3789999987654433211 11225677777766643 22345679 Q ss_pred EecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEe Q lcl|NC_011142. 222 LMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALA 301 (343) Q Consensus 222 ~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~ 301 (343) +|+|..|..|.... +..|.-++.-... .....+-|.|+..... + ..++ +++-+=.+.+.+. T Consensus 307 v~n~~~~~~L~~lk--d~~G~~i~~~~~~-~~~~~l~G~pV~~~~~--~-----------p~~~---~~~gd~s~~~~~~ 367 (418) T protein:vir:10 307 VLNPIDWASIELTK--DSQGRYIVGNPVN-GTTPRLWNLPVVETQA--M-----------TANE---FLVGAFSMAAQIF 367 (418) T ss_pred EEcHHHHHHHHHhh--cCCCceecccccc-CCCceecceeeEEcCC--C-----------CCCc---EEEeeccceEEEE Confidence 99999999987533 3334332211111 1112233444432211 0 0011 1111111112121 Q ss_pred eccchhcccc-ee-----cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 302 KPIPFRMLAP-QL-----LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 302 v~~~~~~~~~-~~-----~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .-+.++...- +. 
++ ...+.++.+++ +.+++|.|+++++.- T Consensus 368 ~~~~~~i~~~~~~~~~f~~~-~~~~r~~~~~d-~~~~~~~a~~~~~~~ 413 (418) T protein:vir:10 368 DRMEIEVLLSTENVDDFEKN-MVSIRAEERLA-LAVYRPESFVTGALV 413 (418) T ss_pred EecceEEEEecccchhhhcC-ceEEEEEEeec-cEEecccceEEEEec Confidence 1122222111 11 11 23444567765 579999999999988 No 50 >protein:vir:2430 Length: 318 # NCBI annotation: major head subunit # Family: family:all:507 # MgeID: mge:52 # MgeName: D29 # Cross-refs: genbank:acc:NP_046832;genbank:gi:9630400;genbank:GeneID:1261582 Probab=98.41 E-value=1.2e-07 Score=58.59 Aligned_cols=287 Identities=8% Similarity=-0.077 Sum_probs=146.3 Q ss_pred CcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCccee Q lcl|NC_011142. 2 SEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYAT 81 (343) Q Consensus 2 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~ 81 (343) --|-..+|++... + .. ..+.+.++.+. +.+...|++..++.-..+++..+. +.... T Consensus 1 ~~~~~~~~~e~~~----~-------~~------~~~~~~~~~ip----~~~~~~ii~~~~~~~~l~~~~~~~---~~~~~ 56 (318) T protein:vir:24 1 MAAGTAFAVDHAQ----I-------AQ------TGDTMFKGYLE----PEQAKDYFAEAEKTSIVQQFAQKV---PMGTT 56 (318) T ss_pred CCCCCCCCHHHHH----h-------hc------ccCcccceeec----hhHHHHHHHHHHhhchhhhhccee---eccCC Confidence 0011112221100 0 00 11222233222 233455666666555555655442 22223 Q ss_pred EEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhhe Q lcl|NC_011142. 82 HWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRV 161 (343) Q Consensus 82 ~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~ 161 (343) ...+.+....+.+.+++.. ..+|..+..+++.......++....++.+=|+. ...++...-....++++++.+|+. 
T Consensus 57 ~~~ip~~~~~~~a~~v~Eg-~~~~~~~~~f~~i~~~~~k~~~~~~iS~e~l~d---s~~~~~~~i~~~l~~~~~~~~d~a 132 (318) T protein:vir:24 57 GQKIPHWVGDVSAQWIGEG-DMKPITKGNMTSQTIAPHKIATIFVASAETVRA---NPANYLGTMRTKVATAFAMAFDGA 132 (318) T ss_pred ceEEEEEeCCcceEEecCC-ccccccccceeEEEEeeEEEEEeehhhHHHhhc---ChHHHHHHHHHHHHHHHHHHHHHh Confidence 4556666666778888764 568888888888888899999888887754443 235687888888999999999999 Q ss_pred eeeeehhhcceeeeecCCc-cccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCC Q lcl|NC_011142. 162 AYFGDTNRNMSGLLNNPNV-TKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYT 240 (343) Q Consensus 162 ~f~G~~~~g~~GLlN~p~v-~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~ 240 (343) +++|+..-.-.|+++.... ........ ......++.+++..+.. ....+..++|+|+.+..|.+.. +.. T Consensus 133 ~l~G~g~~~~~~~~~~~~~~~~~~~~~~-----~~~~~~~~~~~~~~~~~---~~~~~~~~v~n~~~~~~L~~lk--d~~ 202 (318) T protein:vir:24 133 AMHGTDSPFPTYIGQTTKAISIADTTGA-----TTVYDQVAVNGLSLLVN---DGKKWTHTLLDDITEPILNGAK--DQN 202 (318) T ss_pred hhcccCCCCCcccccccccccccccccc-----cchHHHHHHHHHHhhcc---ccCCCCEEEEcHHHHHHHHHhh--ccC Confidence 9999865444555554331 11111111 01112334455544432 2345678999999999997533 333 Q ss_pred CccHHHHHHhc-Ccc----eeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc---- Q lcl|NC_011142. 241 DRTVIEHFQIN-NAY----TLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP---- 311 (343) Q Consensus 241 ~~tvle~l~~n-~~~----~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~---- 311 (343) +..++.-...+ .+. ..+.|.|..+. ..-..++..++.-+.+ .+-+....+++.... 
T Consensus 203 G~~l~~~~~~~~~~~~~~~~~i~g~pv~~~-------------~~~~~~~~~~~~gdfs--~~~~~~~~~l~i~~~~~~~ 267 (318) T protein:vir:24 203 GRPLFIESTYGEAASPFRSGRIVARPTILS-------------DHVVEGTTVGFMGDFS--QLIWGQIGGLSFDVTDQAT 267 (318) T ss_pred CceeecCccccCccccccCceEEEEeeEEe-------------CCCCCCccEEEEeecc--eEEEEEecCeEEEEeeccc Confidence 33322111111 110 01112221111 0111233332221211 121222222221110 Q ss_pred --------------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 --------------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 --------------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ..++ ...+.+..++ ++.+.+|.|++.+.++ T Consensus 268 ~~~~~~~~~~~~~~f~~~-~~~~r~~~r~-d~~v~~~~a~~~i~~~ 311 (318) T protein:vir:24 268 LNLGTVESPNFVSLWQHN-LVAVRVEAEY-AFHCNDAEAFVALTNV 311 (318) T ss_pred eeccccccccchhhhhcC-cEEEEEEEEE-ccEEecccceEEEEee Confidence 1112 2455667777 6778999999999999 No 51 >protein:vir:81227 Length: 413 # NCBI annotation: gp6, major capsid protein # Family: family:all:585 # MgeID: mge:1893 # MgeName: BFK20 # Cross-refs: genbank:acc:YP_001456736;genbank:gi:157168379;hssp:P49861;interpro:IPR006444;uniprot:Q9MBJ9;genbank:GeneID:5580350 Probab=98.35 E-value=1.5e-07 Score=57.99 Aligned_cols=303 Identities=12% Similarity=0.034 Sum_probs=150.6 Q ss_pred CCcceeccchhhhhchhhhchhccc-----ccccCcchh--ecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhcccc Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDS-----NATIGVPSV--VNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVL 73 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~-----~~~~~~~~~--~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~ 73 (343) ..++...-......+..+.+..... ...++.... .++.+.+.. .. +.+.+.+++........++++.+. 
T Consensus 80 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--vp--~~~~~~ii~~~~~~~~l~~~~~~~ 155 (413) T protein:vir:81 80 FFAKRAGDQIKQQAGGAQLNYSVGEYVAPRVKAASDPASTATLTDEFQGG--YG--TTWNRNIIYRRREKLVVADLMDNL 155 (413) T ss_pred hhhhhhhhHHHHHHHHHHhhhhhhhhhhhHHHhhhhhhhhcccccccccc--cc--hhhHHHHHHHHhhhhhHHhhccee Confidence 0000000000111111111000000 000000000 111122222 22 456677888888888878877653 Q ss_pred CCCCcceeEEEEeeec----ccccceeecCCcCccceeee-ccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHH Q lcl|NC_011142. 74 ANIPEYATHWNYRSYD----GAAMGKFISANASDLPRVAQ-SAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAE 148 (343) Q Consensus 74 ~~~~~~~~~~~~~~~~----~~G~a~~~~~~~~dip~v~~-~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~ 148 (343) .-.+ .+..|.+.. ..+.+.+++.. ..+|..+. .++.....++.++.-+.+|.+=|+.+ . .|..--.. T Consensus 156 ~~~~---~~~~~~~~~~~~~~~~~a~~v~Eg-~~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds---~-~l~~~i~~ 227 (413) T protein:vir:81 156 TMTN---TTIKYLMEKANRVVEGGFKTVAEG-GKKPYMRFADFDIVTESLSKIAGLTKITDEMIEDY---D-FLVSYINA 227 (413) T ss_pred eccC---CceeEEEeccccccccccceecCc-ccccccCcccceeeEeeeeeEEEeehhhHHHHHHH---H-HHHHHHHH Confidence 3222 223333222 23456677654 34666553 57888899999998888886533332 2 27777777 Q ss_pred HHHHHHHHhhhheeeeeehhh-cceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHH Q lcl|NC_011142. 149 LAFRGSEEHSQRVAYFGDTNR-NMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDL 227 (343) Q Consensus 149 aA~~~~~~~~n~~~f~G~~~~-g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~ 227 (343) ..+.++++.+|+.+++|+... ...||++.+++...... +.+..++++.+++..+... ....+..++|+|+. 
T Consensus 228 ~la~~~~~~~d~~~l~G~G~~~~~~Gi~~~~~~~~~~~~------~~~~~~~~i~~~~~~~~~~--~~~~~~~~vmn~~~ 299 (413) T protein:vir:81 228 RLLEELAIEEERQLLLGDGTGNNLTGLLKRDGIQTLAVS------NKDELADSIYKAMTNISLA--TPFQADALVINPLD 299 (413) T ss_pred HHHHHHHHHHHHHHhccCCCCCccccccccccccccccc------ccchhHHHHHHHHHHhhhh--ccCCCcEEEEcHHH Confidence 788899999999999998533 35799999987654433 2334567777777666543 33356789999999 Q ss_pred HHHHhccccCCCCCccHHHHHHhcCc-------ceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEE Q lcl|NC_011142. 228 WKRASSLLMTGYTDRTVIEHFQINNA-------YTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLAL 300 (343) Q Consensus 228 ~~~L~~~~~~~~~~~tvle~l~~n~~-------~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~ 300 (343) |..|..-. +..|.-++.-.....+ ...+-|.|+..... . ..+ . ++|-+=.+.+.+ T Consensus 300 ~~~l~~lk--d~~G~~l~~~~~~~~~~~~~~~~~~~l~G~pv~~s~~-------~------~~~--~-~~~gd~~~~~~~ 361 (413) T protein:vir:81 300 YQELRLAK--DANGQYYGGGVFQGQYGSGGIMLDPAPWGLRTVQSQV-------V------PVG--K-PVVGAFRSAASV 361 (413) T ss_pred HHHHHHhh--ccCCceeccccccccccccccccCceecceeeEEcCC-------C------Ccc--c-EEEEecccEEEE Confidence 99986433 3333222110000000 00122444322110 0 001 1 111111112222 Q ss_pred eeccchhcccc-e-----ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 301 AKPIPFRMLAP-Q-----LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 301 ~v~~~~~~~~~-~-----~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .....++..-. 
+ .++ ...+.++.++ ++.+++|.++++++.= T Consensus 362 ~~~~~~~v~~~~~~~~~~~~~-~~~~r~~~r~-d~~~~~~~a~~~l~~~ 408 (413) T protein:vir:81 362 LRKGGVRIDSTNTNVDDFENN-LITVRAEERV-GLMVTFPEAIVQLDVA 408 (413) T ss_pred EEecceEEEEeccccchhhcC-cEEEEEEEee-ccEEecccceEEEEec Confidence 22122221111 1 122 2455567776 5777899999998866 No 52 >protein:vir:4226 Length: 326 # NCBI annotation: observed 35.2Kd protein # Family: family:all:507 # MgeID: mge:89 # MgeName: L5 # Cross-refs: genbank:acc:NP_039681;swissprot:sw:q05223;genbank:gi:9625447;uniprot:Q05223;genbank:GeneID:2942929 Probab=98.35 E-value=1.6e-07 Score=57.91 Aligned_cols=297 Identities=8% Similarity=-0.074 Sum_probs=149.4 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) ||=-.+. +.+++... ...+ +.+....++....++ +-..+++.....-..+++..+. +.+. T Consensus 1 ~~~~~~r-------~~~~~~~~--e~~a-----~~~~~~~~g~~ip~~---~~~~ii~~~~~~s~i~~~~~~~---~~~~ 60 (326) T protein:vir:42 1 MAVNPDR-------TTPFLGVN--DPKV-----AQTGDSMFEGYLEPE---QAQDYFAEAEKISIVQQFAQKI---PMGT 60 (326) T ss_pred CCCCccc-------hhhhcCcc--hhhh-----eeccccCCcceechh---hHHHHHHHHHhcchhhhhccee---eccC Confidence 4421111 11111110 0111 111112223333332 3345677666665555555442 2223 Q ss_pred eEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhh Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQR 160 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 160 (343) ....+.+.+..+.+.+++. +..+|..+..+++.....+.++..+.+|.+=++. 
...++..--....++++.+.+|+ T Consensus 61 ~~~~~p~~~~~~~a~~v~E-g~~~~~~~~~f~~i~~~~~k~~~~v~iS~ell~~---s~~~~~~~i~~~l~~a~~~~~d~ 136 (326) T protein:vir:42 61 TGQKIPHWTGDVSASWIGE-GDMKPITKGNMTSQTIAPHKIATIFVASAETVRA---NPANYLGTMRTKVATAFAMAFDN 136 (326) T ss_pred CceEEEEEeCCcceEEecC-CccccccccceeEEEEeeEEEEEeehhhHHHHhc---CHHHHHHHHHHHHHHHHHHHHHH Confidence 3455666667777788865 4678988999999999999999998888754443 34568788888889999999999 Q ss_pred eeeeeehhhcceeeeecCCcccc-ccCcCcc--ccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccC Q lcl|NC_011142. 161 VAYFGDTNRNMSGLLNNPNVTKT-SATVNYA--TCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMT 237 (343) Q Consensus 161 ~~f~G~~~~g~~GLlN~p~v~~~-~~~~~w~--~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~ 237 (343) .+|+|+...+-.|++|.+..... ....... ..+..++ ++..++..+. ........++|+|..+..|.+-. T Consensus 137 a~l~G~gs~~p~gi~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~---~~~~~~a~~v~n~~~~~~L~~lk-- 209 (326) T protein:vir:42 137 AAINGTDSPFPTFLAQTTKEVSLVDPDGTGSNADLTVYDA--VAVNALSLLV---NAGKKWTHTLLDDITEPILNGAK-- 209 (326) T ss_pred HhhcccCCCccccccccccccceeecccccccccchhHHH--HHHHHHhhhh---hhccCccEEEEeHHHHHHHHHhh-- Confidence 99999886666788877653221 1111222 1122221 1222332222 22334567899999999997532 Q ss_pred CCCCccHHH-HHHhcC----cceeeccccccccccceeeechhhhccccCCccceEE------EEEcccceEEEeeccch Q lcl|NC_011142. 238 GYTDRTVIE-HFQINN----AYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYV------VYDKSERNLALAKPIPF 306 (343) Q Consensus 238 ~~~~~tvle-~l~~n~----~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v------~y~~~~~~~~~~v~~~~ 306 (343) +..+.-+.. -+.... ....+-|.|+.+.. .. . .++..++ +|-.+.+-+.+.+-... 
T Consensus 210 d~~G~~l~~~~~~~~~~~~~~~~~l~G~pv~~~~-------~~---~---~~~~~~~~Gd~s~~~~~~~~~~~v~~~~e~ 276 (326) T protein:vir:42 210 DKSGRPLFIESTYTEENSPFRLGRIVARPTILSD-------HV---A---SGTVVGYQGDFRQLVWGQVGGLSFDVTDQA 276 (326) T ss_pred ccCCceeeccccccCccccccCceeeeeeEEEcC-------CC---C---CCceEEEEeecceEEEEEecceEEEEeecc Confidence 322321111 000000 01112233332211 00 0 1111111 01111222222221111 Q ss_pred hcc--cc--------eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 307 RML--AP--------QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 307 ~~~--~~--------~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ... .. ..++ ...+.+..++ ++.+.+|.|++++.++ T Consensus 277 ~~~~~~~~~~~~~~~~~~d-~~~~r~~~~~-d~~v~~~~a~~~l~~~ 321 (326) T protein:vir:42 277 TLNLGTPQAPNFVSLWQHN-LVAVRVEAEY-AFHCNDKDAFVKLTNV 321 (326) T ss_pred eeeecccccccchhhhhcC-cEEEEEEEEe-ccEEecccceEEEeec Confidence 100 00 0111 2455667776 6788999999999999 No 53 >protein:vir:4700 Length: 415 # NCBI annotation: phi PVL ORF 7 homologue # Family: family:all:21 # MgeID: mge:102 # MgeName: phiPV83 # Cross-refs: genbank:acc:NP_061632;genbank:gi:9635719;genbank:GeneID:1262976 Probab=98.30 E-value=1.5e-07 Score=57.94 Aligned_cols=313 Identities=7% Similarity=-0.008 Sum_probs=146.0 Q ss_pred CCcceeccchh----------------hhhchhhhchhcccccccCcc--hhecchhhhhhhhHHHHHHHHHHHHhhhhh Q lcl|NC_011142. 1 MSEKRVVIDAQ----------------TIAGNRWLNKFLDSNATIGVP--SVVNDADGGAAYYISQLASLETTVYEVPYA 62 (343) Q Consensus 1 ~~~~~~~~~~~----------------~~~~~~~~~~~~~~~~~~~~~--~~~~dA~~~~~f~~~~l~~id~~v~e~~~~ 62 (343) ........+.+ ...........+......+.. ...... +++..+.. +.+.+.|++.... 
T Consensus 71 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~t-~~g~~~iP--~~~~~~ii~~~~~ 147 (415) T protein:vir:47 71 NQQSVEVNEARTYRNQANINDLGISIQNTKVTSQEVRDFTEYLETRNDIQGGSLKT-DSGFVVIP--EEIVTDILKLKEV 147 (415) T ss_pred cccccccchhhhhHHHHHHHHHHHhhhhhhhhHHHHHHHHHHHhhhhhhhhccccc-cCCccccc--HHHHHHHHHHHHh Confidence 00000000000 000000000000000000000 001111 22333444 4556678887777 Q ss_pred cccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCC Q lcl|NC_011142. 63 DITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMP 141 (343) Q Consensus 63 ~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~ 141 (343) ....+.++.+.. .+.+...+.+........+.+++.++ .+|..+ ..++......+.++.-+.+|.+=++. ...+ T Consensus 148 ~~~l~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~v~Eg~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~d---s~~~ 222 (415) T protein:vir:47 148 EFNLDKYVTVKR-VTNGSGKYPVVRQSEVAALEKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIED---AKVN 222 (415) T ss_pred hhhhhhhcceee-ccCCceeEEEEEecCCcceeeccccc-ccccccccceeeEEeeeeeeEeeehhhHHHHhh---chHH Confidence 777666665421 22222222222223334556666553 456443 46788888999999888887654433 3457 Q ss_pred ccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEE Q lcl|NC_011142. 142 IDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTV 221 (343) Q Consensus 142 l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL 221 (343) |..--....++++.+.+|+.+++|+......+......... ..+... ...-++||.+++.++... 
+ ..+..+ T Consensus 223 l~~~i~~~l~~~i~~~~d~~il~g~g~g~~~~~~~~~~~~~----~~~~~~-~~~~~~~i~~~~~~~~~~--~-~~~~~~ 294 (415) T protein:vir:47 223 VLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEG----KKLEVK-KAKSLDDIKDAINLNVKP--N-YEHNVA 294 (415) T ss_pred HHHHHHHHHHHHHHHHHHHHHhhccccCCcccccccccccc----ceeccc-cccchHHHHHHHHhhhhh--c-cCCCEE Confidence 88888888999999999999999976433333222211111 011111 112256777777776542 2 356789 Q ss_pred EecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEe Q lcl|NC_011142. 222 LMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALA 301 (343) Q Consensus 222 ~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~ 301 (343) +|+|+.|..|.... +..|.-++.--..+.....+-|.|+.+. .... .+..|+. .++|-+=.+.+.+. T Consensus 295 v~n~~~~~~L~~lk--d~~G~~i~~~~~~~~~~~~l~G~pV~~~-------~~~~---~~~~~~~-~~~~gd~~~~~~~~ 361 (415) T protein:vir:47 295 IVSQTMFAKLDKMK--DKLGNYLIQPDVKEKTQQRLLGAKIEIL-------PDEV---LGQKGNN-TLIIGNLKDAIVLF 361 (415) T ss_pred EEcHHHHHHHHHhh--ccCCCeeeccCcCCCCCccccceeeEEe-------cccc---ccCCCcc-EEEEEehhccEEEE Confidence 99999999996532 3333322110000111112334443221 1111 1122222 23333222323332 Q ss_pred eccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
302 KPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 302 v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .-+.++...............+.|+ ++.+.+|.++++++-= T Consensus 362 ~~~~~~v~~~~~~~~~~~~~~~~r~-d~~v~~~~a~~~~~~~ 402 (415) T protein:vir:47 362 DRSQYQASWTDYMHFGECLMIAVRQ-DCRILDYKSAIVIEYD 402 (415) T ss_pred eecceEEEeeccccCceEEEEEEEe-ccEEeccccEEEEEee Confidence 2233333222222222334556776 6788899999998754 No 54 >protein:vir:4600 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:101 # MgeName: PVL # Cross-refs: genbank:acc:NP_058445;genbank:gi:9635171;genbank:GeneID:1262708 Probab=98.30 E-value=1.5e-07 Score=57.94 Aligned_cols=313 Identities=7% Similarity=-0.008 Sum_probs=146.0 Q ss_pred CCcceeccchh----------------hhhchhhhchhcccccccCcc--hhecchhhhhhhhHHHHHHHHHHHHhhhhh Q lcl|NC_011142. 1 MSEKRVVIDAQ----------------TIAGNRWLNKFLDSNATIGVP--SVVNDADGGAAYYISQLASLETTVYEVPYA 62 (343) Q Consensus 1 ~~~~~~~~~~~----------------~~~~~~~~~~~~~~~~~~~~~--~~~~dA~~~~~f~~~~l~~id~~v~e~~~~ 62 (343) ........+.+ ...........+......+.. ...... +++..+.. +.+.+.|++.... T Consensus 71 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~t-~~g~~~iP--~~~~~~ii~~~~~ 147 (415) T protein:vir:46 71 NQQSVEVNEARTYRNQANINDLGISIQNTKVTSQEVRDFTEYLETRNDIQGGSLKT-DSGFVVIP--EEIVTDILKLKEV 147 (415) T ss_pred cccccccchhhhhHHHHHHHHHHHhhhhhhhhHHHHHHHHHHHhhhhhhhhccccc-cCCccccc--HHHHHHHHHHHHh Confidence 00000000000 000000000000000000000 001111 22333444 4556678887777 Q ss_pred cccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCC Q lcl|NC_011142. 63 DITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMP 141 (343) Q Consensus 63 ~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~ 141 (343) ....+.++.+.. .+.+...+.+........+.+++.++ .+|..+ ..++......+.++.-+.+|.+=++. 
...+ T Consensus 148 ~~~l~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~v~Eg~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~d---s~~~ 222 (415) T protein:vir:46 148 EFNLDKYVTVKR-VTNGSGKYPVVRQSEVAALEKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIED---AKVN 222 (415) T ss_pred hhhhhhhcceee-ccCCceeEEEEEecCCcceeeccccc-ccccccccceeeEEeeeeeeEeeehhhHHHHhh---chHH Confidence 777666665421 22222222222223334556666553 456443 46788888999999888887654433 3457 Q ss_pred ccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEE Q lcl|NC_011142. 142 IDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTV 221 (343) Q Consensus 142 l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL 221 (343) |..--....++++.+.+|+.+++|+......+......... ..+... ...-++||.+++.++... + ..+..+ T Consensus 223 l~~~i~~~l~~~i~~~~d~~il~g~g~g~~~~~~~~~~~~~----~~~~~~-~~~~~~~i~~~~~~~~~~--~-~~~~~~ 294 (415) T protein:vir:46 223 VLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEG----KKLEVK-KAKSLDDIKDAINLNVKP--N-YEHNVA 294 (415) T ss_pred HHHHHHHHHHHHHHHHHHHHHhhccccCCcccccccccccc----ceeccc-cccchHHHHHHHHhhhhh--c-cCCCEE Confidence 88888888999999999999999976433333222211111 011111 112256777777776542 2 356789 Q ss_pred EecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEe Q lcl|NC_011142. 222 LMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALA 301 (343) Q Consensus 222 ~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~ 301 (343) +|+|+.|..|.... +..|.-++.--..+.....+-|.|+.+. .... .+..|+. .++|-+=.+.+.+. 
T Consensus 295 v~n~~~~~~L~~lk--d~~G~~i~~~~~~~~~~~~l~G~pV~~~-------~~~~---~~~~~~~-~~~~gd~~~~~~~~ 361 (415) T protein:vir:46 295 IVSQTMFAKLDKMK--DKLGNYLIQPDVKEKTQQRLLGAKIEIL-------PDEV---LGQKGNN-TLIIGNLKDAIVLF 361 (415) T ss_pred EEcHHHHHHHHHhh--ccCCCeeeccCcCCCCCccccceeeEEe-------cccc---ccCCCcc-EEEEEehhccEEEE Confidence 99999999996532 3333322110000111112334443221 1111 1122222 23333222323332 Q ss_pred eccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 302 KPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 302 v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .-+.++...............+.|+ ++.+.+|.++++++-= T Consensus 362 ~~~~~~v~~~~~~~~~~~~~~~~r~-d~~v~~~~a~~~~~~~ 402 (415) T protein:vir:46 362 DRSQYQASWTDYMHFGECLMIAVRQ-DCRILDYKSAIVIEYD 402 (415) T ss_pred eecceEEEeeccccCceEEEEEEEe-ccEEeccccEEEEEee Confidence 2233333222222222334556776 6788899999998754 No 55 >protein:vir:191 Length: 385 # NCBI annotation: major head subunit precursor # Family: family:all:585 # MgeID: mge:6 # MgeName: HK97 # Cross-refs: genbank:acc:NP_037701;genbank:gi:9634158;genbank:GeneID:1262530 Probab=98.30 E-value=1.5e-07 Score=57.98 Aligned_cols=302 Identities=12% Similarity=0.095 Sum_probs=159.5 Q ss_pred CCcceeccchh----hhhchhhhchhccccc-ccC----cchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhcc Q lcl|NC_011142. 1 MSEKRVVIDAQ----TIAGNRWLNKFLDSNA-TIG----VPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVP 71 (343) Q Consensus 1 ~~~~~~~~~~~----~~~~~~~~~~~~~~~~-~~~----~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~ 71 (343) ..+.....+.. .-+.++..+ .+.... ... 
.-++...++.+|.+...+ +.+.+++........+.+++ T Consensus 64 ~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~g~~i~~~---~~~~ii~~~~~~~~l~~~~~ 139 (385) T protein:vir:19 64 LASGAENPGEKKSFSERAAEELIK-SWDGKQGTFGAKTFNKSLGSDADSAGSLIQPM---QIPGIIMPGLRRLTIRDLLA 139 (385) T ss_pred hhccccccchhhhhHHHHHHHHHH-HHHHhhccchhhHHHhhhccccccCCceecch---hhhHHHHHhhhccchhhhcc Confidence 11111111110 001111110 000000 000 011122233444444442 34567777777777777777 Q ss_pred ccCCCCcceeEEEEeeecc-cccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHH Q lcl|NC_011142. 72 VLANIPEYATHWNYRSYDG-AAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELA 150 (343) Q Consensus 72 v~~~~~~~~~~~~~~~~~~-~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA 150 (343) +.. .+ ...+.|...+. .+.+.+++.+ ..+|..+..++........++..+.++. |+.... ..+...-.... T Consensus 140 ~~~-~~--~~~~~~~~~~~~~~~a~~v~E~-~~~~~~~~~~~~~~~~~~k~~~~~~is~-ell~d~---~~l~~~i~~~l 211 (385) T protein:vir:19 140 QGR-TS--SNALEYVREEVFTNNADVVAEK-ALKPESDITFSKQTANVKTIAHWVQASR-QVMDDA---PMLQSYINNRL 211 (385) T ss_pred eec-cc--CcceEEEEEecCCcceeeeccC-ccccccccceeEEEEeeeeEEEeehhhH-HHHhhH---HHHHHHHHHHH Confidence 643 22 22455665544 4566677664 5688888889999999999999999885 454322 24777777788 Q ss_pred HHHHHHhhhheeeeeehhh-cceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHH Q lcl|NC_011142. 151 FRGSEEHSQRVAYFGDTNR-NMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWK 229 (343) Q Consensus 151 ~~~~~~~~n~~~f~G~~~~-g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~ 229 (343) ++++++.+|+.+++|+... ...||++.++....+... +.+..+++|.+++.++... ...+..++|+|+.|. 
T Consensus 212 a~a~~~~~d~~~l~G~g~~~~~~Gi~~~~~~~~~~~~~-----~~~~~~d~i~~~~~~l~~~---~~~~~~~~~~~~~~~ 283 (385) T protein:vir:19 212 MYGLALKEEGQLLNGDGTGDNLEGLNKVATAYDTSLNA-----TGDTRADIIAHAIYQVTES---EFSASGIVLNPRDWH 283 (385) T ss_pred HHHHHHHHHHHHHhccCCCCcccccccccccccccccc-----cccchHHHHHHHHHhhccc---cCCCCEEEEcHHHHH Confidence 8899999999999998543 468999988765544321 2233577888888777532 234678999999999 Q ss_pred HHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc Q lcl|NC_011142. 230 RASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML 309 (343) Q Consensus 230 ~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~ 309 (343) .|..-. +..|.-++.-.....+ ..+-|.|+.... .. +. +..++.+. .+.+.+..-..++.. T Consensus 284 ~l~~lk--d~~G~~l~~~~~~~~~-~~l~G~pV~~~~-------~~------p~--~~~~~gd~-~~~~~~~~~~~~~v~ 344 (385) T protein:vir:19 284 NIALLK--DNEGRYIFGGPQAFTS-NIMWGLPVVPTK-------AQ------AA--GTFTVGGF-DMASQVWDRMDATVE 344 (385) T ss_pred HHHHhh--cCCCceeccCcccCCC-ceecceeeEEcC-------cC------CC--CcEEEeec-ccEEEEEEecceEEE Confidence 986533 4344333211111111 122344432211 10 00 11121111 112222222222221 Q ss_pred cc-e-----ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 310 AP-Q-----LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 310 ~~-~-----~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .. 
+ .++ .+.+.++.+++ +.+++|.+++.++.- T Consensus 345 ~~~~~~~~~~~~-~~~~~~~~r~~-~~v~~~~a~~~~~~~ 382 (385) T protein:vir:19 345 VSREDRDNFVKN-MLTILCEERLA-LAHYRPTAIIKGTFS 382 (385) T ss_pred EeccccchhhcC-cEEEEEEEeec-cEEecccceEEEEec Confidence 11 1 122 24555677774 677899999999988 No 56 >protein:vir:1886 Length: 385 # NCBI annotation: major capsid subunit precursor # Family: family:all:585 # MgeID: mge:41 # MgeName: HK022 # Cross-refs: genbank:acc:NP_037666;genbank:gi:9634124;genbank:GeneID:1262513 Probab=98.30 E-value=1.5e-07 Score=57.98 Aligned_cols=302 Identities=12% Similarity=0.095 Sum_probs=159.5 Q ss_pred CCcceeccchh----hhhchhhhchhccccc-ccC----cchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhcc Q lcl|NC_011142. 1 MSEKRVVIDAQ----TIAGNRWLNKFLDSNA-TIG----VPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVP 71 (343) Q Consensus 1 ~~~~~~~~~~~----~~~~~~~~~~~~~~~~-~~~----~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~ 71 (343) ..+.....+.. .-+.++..+ .+.... ... .-++...++.+|.+...+ +.+.+++........+.+++ T Consensus 64 ~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~g~~i~~~---~~~~ii~~~~~~~~l~~~~~ 139 (385) T protein:vir:18 64 LASGAENPGEKKSFSERAAEELIK-SWDGKQGTFGAKTFNKSLGSDADSAGSLIQPM---QIPGIIMPGLRRLTIRDLLA 139 (385) T ss_pred hhccccccchhhhhHHHHHHHHHH-HHHHhhccchhhHHHhhhccccccCCceecch---hhhHHHHHhhhccchhhhcc Confidence 11111111110 001111110 000000 000 011122233444444442 34567777777777777777 Q ss_pred ccCCCCcceeEEEEeeecc-cccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHH Q lcl|NC_011142. 72 VLANIPEYATHWNYRSYDG-AAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELA 150 (343) Q Consensus 72 v~~~~~~~~~~~~~~~~~~-~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA 150 (343) +.. .+ ...+.|...+. .+.+.+++.+ ..+|..+..++........++..+.++. |+.... ..+...-.... 
T Consensus 140 ~~~-~~--~~~~~~~~~~~~~~~a~~v~E~-~~~~~~~~~~~~~~~~~~k~~~~~~is~-ell~d~---~~l~~~i~~~l 211 (385) T protein:vir:18 140 QGR-TS--SNALEYVREEVFTNNADVVAEK-ALKPESDITFSKQTANVKTIAHWVQASR-QVMDDA---PMLQSYINNRL 211 (385) T ss_pred eec-cc--CcceEEEEEecCCcceeeeccC-ccccccccceeEEEEeeeeEEEeehhhH-HHHhhH---HHHHHHHHHHH Confidence 643 22 22455665544 4566677664 5688888889999999999999999885 454322 24777777788 Q ss_pred HHHHHHhhhheeeeeehhh-cceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHH Q lcl|NC_011142. 151 FRGSEEHSQRVAYFGDTNR-NMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWK 229 (343) Q Consensus 151 ~~~~~~~~n~~~f~G~~~~-g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~ 229 (343) ++++++.+|+.+++|+... ...||++.++....+... +.+..+++|.+++.++... ...+..++|+|+.|. T Consensus 212 a~a~~~~~d~~~l~G~g~~~~~~Gi~~~~~~~~~~~~~-----~~~~~~d~i~~~~~~l~~~---~~~~~~~~~~~~~~~ 283 (385) T protein:vir:18 212 MYGLALKEEGQLLNGDGTGDNLEGLNKVATAYDTSLNA-----TGDTRADIIAHAIYQVTES---EFSASGIVLNPRDWH 283 (385) T ss_pred HHHHHHHHHHHHHhccCCCCcccccccccccccccccc-----cccchHHHHHHHHHhhccc---cCCCCEEEEcHHHHH Confidence 8899999999999998543 468999988765544321 2233577888888777532 234678999999999 Q ss_pred HHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc Q lcl|NC_011142. 230 RASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML 309 (343) Q Consensus 230 ~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~ 309 (343) .|..-. +..|.-++.-.....+ ..+-|.|+.... .. +. +..++.+. .+.+.+..-..++.. 
T Consensus 284 ~l~~lk--d~~G~~l~~~~~~~~~-~~l~G~pV~~~~-------~~------p~--~~~~~gd~-~~~~~~~~~~~~~v~ 344 (385) T protein:vir:18 284 NIALLK--DNEGRYIFGGPQAFTS-NIMWGLPVVPTK-------AQ------AA--GTFTVGGF-DMASQVWDRMDATVE 344 (385) T ss_pred HHHHhh--cCCCceeccCcccCCC-ceecceeeEEcC-------cC------CC--CcEEEeec-ccEEEEEEecceEEE Confidence 986533 4344333211111111 122344432211 10 00 11121111 112222222222221 Q ss_pred cc-e-----ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 310 AP-Q-----LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 310 ~~-~-----~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .. + .++ .+.+.++.+++ +.+++|.+++.++.- T Consensus 345 ~~~~~~~~~~~~-~~~~~~~~r~~-~~v~~~~a~~~~~~~ 382 (385) T protein:vir:18 345 VSREDRDNFVKN-MLTILCEERLA-LAHYRPTAIIKGTFS 382 (385) T ss_pred EeccccchhhcC-cEEEEEEEeec-cEEecccceEEEEec Confidence 11 1 122 24555677774 677899999999988 No 57 >protein:vir:100247 Length: 425 # NCBI annotation: gp76 # Family: family:all:21 # MgeID: mge:1619 # MgeName: Bcep176 # Cross-refs: genbank:acc:YP_355412;genbank:gi:77864702;genbank:GeneID:3725969 Probab=98.19 E-value=4e-07 Score=55.66 Aligned_cols=311 Identities=11% Similarity=0.069 Sum_probs=154.4 Q ss_pred CCcceeccchhhhhchhhhch---hcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNK---FLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIP 77 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~ 77 (343) +..+........ .-.+.+.. ..+...+ +.....+.|.++.. +.+.+.|++.....-..+++..+.. 
.+ T Consensus 100 ~~~~~~~~~~~~-~~~~af~~~l~~~e~~~a-----l~~~t~~~gG~lvP--~~~~~~ii~~~~~~s~l~~l~~~~~-~~ 170 (425) T protein:vir:10 100 MGANGVKPLRDP-EYTEAFKAHVKRGDVQAA-----LNKGEDSEGGYLTP--IEWDRTITNKLVLISPMRQLCRVQP-VS 170 (425) T ss_pred cccccccccccH-HHHHHHHHHhhhhhhHHH-----hhcCcCCCCceecc--HhHHHHHHHHHHhhhhhhhhceeee-cc Confidence 111111000000 00000000 0011111 11111233445554 4566678887777666666665432 22 Q ss_pred cceeEEEEeeecccccceeecCCcCccceeee-ccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHH Q lcl|NC_011142. 78 EYATHWNYRSYDGAAMGKFISANASDLPRVAQ-SAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEE 156 (343) Q Consensus 78 ~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~-~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~ 156 (343) .+ ...+.+......+.+++.. ..+|..+. .+++.....+.++.-..+|.+=|+ ....+|...-....+.++++ T Consensus 171 ~~--~~~~~~~~~~~~a~wv~E~-~~~~~~~~~~f~~v~~~~~k~~~~i~iS~ell~---ds~~~l~~~i~~~la~ai~~ 244 (425) T protein:vir:10 171 KA--GFSKLFNMGGTTSGWVGEA-SQRPQTNAATFQPLSFASGEIYANPAATQQILD---DAEIDLESWLATEVQTEFAK 244 (425) T ss_pred CC--ceEEEEEcCCcceeeeccc-cccccccccccceeeeeheeeEeehHhHHHHHh---cchhHHHHHHHHHHHHHHHH Confidence 22 2334444445566676655 44665553 577888888888877777654343 33567888888999999999 Q ss_pred hhhheeeeeehhhcceeeeecCCccccccCcCcc-------ccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHH Q lcl|NC_011142. 157 HSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYA-------TCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWK 229 (343) Q Consensus 157 ~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~-------~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~ 229 (343) .+|+.+++|+......|+||++..........|. ..+..--++||.+++..|.. .+ ...-.++|+|..|. 
T Consensus 245 ~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~l~~l~~~l~~--~~-~~~a~~vmn~~~~~ 321 (425) T protein:vir:10 245 QEGKAFLAGDGTNKPNGLLTYIAGGANAAKHPFGAIEVVNSGAAADITSDGIIDLVYDLPS--AF-TGNARFAMNRNTQR 321 (425) T ss_pred HHHhhhhcccCCCCcceeeeccccccccccccccccccccccccccccHHHHHHHHhhhhh--hh-ccCCEEEEchHHHH Confidence 9999999999877889999988755433322211 11222345667777766643 22 23457899999999 Q ss_pred HHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc Q lcl|NC_011142. 230 RASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML 309 (343) Q Consensus 230 ~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~ 309 (343) .|..-. +..|.-++.-=-.++....+-|.|+.+.. ... ..+. +.+. ++|-+=.+.+.+.--.-++.+ T Consensus 322 ~L~~lk--D~~G~~l~~~~~~~g~~~~l~G~PV~~~~-------~~p--~~~~-~~~~-i~~Gd~~~~~~i~~~~~~~v~ 388 (425) T protein:vir:10 322 QVRKLK--DGQGNYLWQPSYVAGQPATLAGYPVTEVP-------DMP--DVAA-NSTP-ILFGDFQQTYLIIDRIGVRVL 388 (425) T ss_pred HHHHhh--cCCCceeeccCccCCCCceecceeeEEec-------CcC--CccC-CccE-EEEEehhccEEEEEecceEEE Confidence 987432 43332221100001111223344443321 111 1111 2232 333221222222111222222 Q ss_pred cceecC-ceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 310 APQLLG-LGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 310 ~~~~~~-~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .-.... 
-...+..+.|+ ++.+.+|.|++.+..= T Consensus 389 ~d~~~~~~~~~~~~~~r~-d~~v~~~~A~~~l~~~ 422 (425) T protein:vir:10 389 RDPYTAKPYVLFYTTKRV-GGGLLNPEPMRAMKVA 422 (425) T ss_pred ecccccCCcEEEEEEEEe-ccEeecccceEEEEee Confidence 111111 12344556776 5667779998775544 No 58 >protein:vir:8102 Length: 543 # NCBI annotation: gp6 # Family: family:all:21 # MgeID: mge:152 # MgeName: Che9c # Cross-refs: genbank:acc:NP_817683;genbank:gi:29566114;genbank:GeneID:1259308 Probab=98.18 E-value=6e-07 Score=54.70 Aligned_cols=308 Identities=10% Similarity=0.046 Sum_probs=148.7 Q ss_pred CCcceeccchhhhhchhhhc----hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhc-ccchhhccccCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLN----KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYAD-ITYLEDVPVLAN 75 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~----~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~-l~~~~~i~v~~~ 75 (343) ..+++......+=....++. ..+......+ .++++ +..+.. +.+.+.++.....+ -..+.+..+.. T Consensus 219 ~~~~~a~~~~~~~~~~~~l~~~e~~~~~~~~~~~----~t~~~--gg~lip--~~~~~~ii~~~~~~~~~l~~~~~~~~- 289 (543) T protein:vir:81 219 PAYLRAWSKMARNPHAAILTEEEKRAINEVRAMG----LTKAD--GGYLVP--FQLDPTVIITSNGSLNDIRRFARQVV- 289 (543) T ss_pred hhhhhHHHHHHHhhHHHHhhhhhhhhhhhhhhcc----ccccc--CcccCc--hhhhhHHHHHHHhhhchhhhhccccc- Confidence 00000000000000000000 0000000001 12222 222322 23334444333333 23344443321 Q ss_pred CCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHH Q lcl|NC_011142. 76 IPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSE 155 (343) Q Consensus 76 ~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~ 155 (343) + .| .+.+.+....+.+.+++.+ ..+|..+..++.....+..++.-+.+|.+ +... 
..++...-....+.+++ T Consensus 290 ~-~g--~~~~~~~~~~~~a~~v~Eg-~~~~~~~~~~~~i~~~~~k~~~~~~is~e-ll~d---~~~~~~~i~~~l~~~~~ 361 (543) T protein:vir:81 290 A-TG--DVWHGVSSAAVQWSWDAEF-EEVSDDSPEFGQPEIPVKKAQGFVPISIE-ALQD---EANVTETVALLFAEGKD 361 (543) T ss_pred C-Cc--ceEEEEecCCcceeecccC-ccccccccccceeeeeeeeeEeeehhhHH-HHhc---cHHHHHHHHHHHHHHHH Confidence 1 22 2344555666777777765 45788888899999999999999998874 4432 24788888888999999 Q ss_pred Hhhhheeeeeehhh-cceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhcc Q lcl|NC_011142. 156 EHSQRVAYFGDTNR-NMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSL 234 (343) Q Consensus 156 ~~~n~~~f~G~~~~-g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~ 234 (343) +.+|+.+|+|+... ...|+++.+........ + ..+..-.++|+.+++..+.. .+ .....++|+|..|..|... T Consensus 362 ~~~d~ail~G~Gt~~~p~Gi~~~~~~~~~~~~--~-~~~~~~~~~~~~~~~~~l~~--~~-~~~~~~v~n~~~~~~l~~l 435 (543) T protein:vir:81 362 ELEAVTLTTGTGQGNQPTGIVTALAGTAAEIA--P-VTAETFALADVYAVYEQLAA--RH-RRQGAWLANNLIYNKIRQF 435 (543) T ss_pred HHHHHHHhccCCCCcccccchhhccccccccc--c-cccccccHHHHHHHHHhhhc--cc-cCCcEEEEcHHHHHHHHHh Confidence 99999999998643 57899988764332221 1 11222346788888877643 22 2335799999999999754 Q ss_pred ccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhc--ccce Q lcl|NC_011142. 235 LMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRM--LAPQ 312 (343) Q Consensus 235 ~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~--~~~~ 312 (343) . +..|.-++.-+....+ ..+-|.|+.+.... . .....+ ...+ +-.++|-+ -..+.+..-..++. 
.+-- T Consensus 436 k--d~~G~~l~~~~~~g~~-~~l~G~pv~~~~~~--~--~~~~~~-~~~~-~~~i~~gd-~~~~~i~~~~~~~i~~~~~~ 505 (543) T protein:vir:81 436 D--TQGGAGLWTTIGNGEP-SQLLGRPVGEAEAM--D--ANWNTS-ASAD-NFVLLYGN-FQNYVIADRIGMTVEFIPHL 505 (543) T ss_pred h--cCCCceeccCcCCCCC-ccccceeeEEeccc--c--cccccc-ccCC-cceEEEee-ccceeEEeecccEEEEeccc Confidence 3 3233212111111111 12334443332110 0 000000 1111 22233322 22333332223222 1110 Q ss_pred ------ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 313 ------LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 313 ------~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .++ .+.+..+.++ |+.+++|.|++.+.-- T Consensus 506 ~~~~~~~~~-~~~~~~~~r~-d~~v~~~~A~~~l~~~ 540 (543) T protein:vir:81 506 FGTNRRPNG-SRGWFAYYRM-GADVVNPNAFRLLNVE 540 (543) T ss_pred cccchhhcC-ceEEEEEEee-ccEeecccceEEEEec Confidence 011 2344456666 5677889999888877 No 59 >protein:vir:4456 Length: 401 # NCBI annotation: Major capsid protein precursor # Family: family:all:21 # MgeID: mge:96 # MgeName: ST64B # Cross-refs: genbank:acc:NP_700379;genbank:gi:23505451;genbank:GeneID:955658 Probab=98.16 E-value=2e-07 Score=57.29 Aligned_cols=315 Identities=9% Similarity=0.073 Sum_probs=154.0 Q ss_pred CCcceeccchhhhhchhhhchh-cccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcc Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKF-LDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEY 79 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~ 79 (343) ..+.....+-+.-. ...++.. .+.....-.-++..-.++.|.++.+ +.+.+.|++........+.+..+.. 
-+ T Consensus 74 ~~~~~~~~e~~~a~-~~~lr~~~~~~~~~~e~~a~~~~~~~~GG~~iP--~~~~~~ii~~~~~~~~l~~~~~~~~---~~ 147 (401) T protein:vir:44 74 GAQNKVAAEHKDAF-VGFLRKGREDGLRDLERKALQVGTDEDGGYAVP--EELDRSILSLLKDEVVMRQEATVIT---VG 147 (401) T ss_pred ccccchhHHHHHHH-HHHHhhhhhhhhHHHHHHHhhcCCCCCCceecc--HhHHHHHHHHHHhhhhhhhhceeee---cC Confidence 11111100001000 0011000 0000000000111111233445554 4566778887766666666554422 12 Q ss_pred eeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhh Q lcl|NC_011142. 80 ATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHS 158 (343) Q Consensus 80 ~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~ 158 (343) .....+.+......+.+.+.. ...|..+ ..+++....++.++.-+.+|.+=|+. ...+|...-....+.++.+.+ T Consensus 148 ~~~~~~~~~~~~~~a~wv~E~-~~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~la~ai~~~~ 223 (401) T protein:vir:44 148 GSDYKKLVNLGGTASGWVGET-DTRSQTATSRLGLIEPFMGEIYGNPQATQKMLDD---AFFNVEAWINSELATEFAEQE 223 (401) T ss_pred CCceEEEEecCCccceeeccc-cccCccccccceeeeeehhheeeehhhhHHHHhc---chHHHHHHHHHHHHHHHHHHH Confidence 223445555555556666554 3355444 35777778888888777777654443 355788888888899999999 Q ss_pred hheeeeeehhhcceeeeecCCccccccCcCcc------ccCHH-HHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHH Q lcl|NC_011142. 159 QRVAYFGDTNRNMSGLLNNPNVTKTSATVNYA------TCTGQ-ELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRA 231 (343) Q Consensus 159 n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~------~~t~~-~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L 231 (343) +..+++|+......|+|+.+..........|. +.+.. --+++|.+++..|.. .+. 
..-.++|+++.|..| T Consensus 224 ~~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~~t~~~~~~~~d~i~~~~~~l~~--~~~-~~a~~v~n~~~~~~L 300 (401) T protein:vir:44 224 EIAFTTGDGTKKPKGFLAYESTEESDKARAFGKLQHIVSGEATAVTADAIIKLIYTLRK--AHR-TGAKFMMNNNSLFAI 300 (401) T ss_pred HhhhhccCCCCccceeeccccccccccccccccccccccccccccCHHHHHHHHHhcch--hhh-cCCEEEEcHHHHHHH Confidence 99999999877789999998865433222221 11111 226777777777643 222 234689999999999 Q ss_pred hccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc Q lcl|NC_011142. 232 SSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP 311 (343) Q Consensus 232 ~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~ 311 (343) ..- -+..|.-++.--..++....+-|.|+.... .... .++ +.+ .++|-+=.+.+.+.--+.++.+-- T Consensus 301 ~~l--kd~~G~~l~~~~~~~g~~~~l~G~PVv~~~-------~~p~--~~~-~~~-~i~~Gd~~~~~~i~~~~~~~~~~~ 367 (401) T protein:vir:44 301 RLL--KDTEGNYLWRPGLELGQPSSLAGYGIAENE-------QMPD--IAA-DAK-AIAFGNFKRGYTIVDRIGTRILRD 367 (401) T ss_pred HHh--hccCCceeecCCcCCCCCceecceeeEEec-------CcCC--ccC-Ccc-EEEEeehhccEEEEEecceEEeee Confidence 643 243343222100011221234455543221 1111 111 222 233322122222221122222211 Q ss_pred e--ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 Q--LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 ~--~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) . 
.++ ...+.++.|++ +.+..|.|++.+..= T Consensus 368 ~~~~~~-~v~~~a~~r~d-~~~~~~~a~~~l~~~ 399 (401) T protein:vir:44 368 PYTNKP-FVGFYTTKRTG-GMLVDSQAIKLLKIA 399 (401) T ss_pred ccccCC-cEEEEEEEEec-cEEecccceEEEEee Confidence 1 112 24456677874 556669999876555 No 60 >protein:vir:4339 Length: 395 # NCBI annotation: major head protein # Family: family:all:585 # MgeID: mge:93 # MgeName: D3 # Cross-refs: genbank:acc:NP_061502;genbank:gi:9635591;genbank:GeneID:1262860 Probab=98.15 E-value=1e-06 Score=53.45 Aligned_cols=310 Identities=13% Similarity=0.071 Sum_probs=154.1 Q ss_pred CCcceeccchhhhhchhhhchhcccccc---cC--cchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNAT---IG--VPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN 75 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~--~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~ 75 (343) ..+.....-.+...-....+........ +. .-++.....++|..... .+.+.|++........+.++++..- T Consensus 76 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~vp~---~~~~~ii~~~~~~~~l~~l~~~~~~ 152 (395) T protein:vir:43 76 GGEEAPKTAGQMVAESLKEQGVTSSLRGSHRVSMPRSAITSIDGSGGALVAP---DRRPGVVAAPQRRLTIRDLVAPGTT 152 (395) T ss_pred cccchhhhHHHHHHHHHHHHHHHHHhhhhhhhhhhhhhhcccCCCCccccch---hhHHHHHHHHHhhhhHHhhccceec Confidence 0000000000000000000000000000 00 00001111123333333 2345688877777666666665332 Q ss_pred CCcceeEEEEeeec-ccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 76 IPEYATHWNYRSYD-GAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 76 ~~~~~~~~~~~~~~-~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) +.....|.... ..+.+.+++..+ ..|..+..++......+.++..+.++.+ +.... . 
.+..--....++++ T Consensus 153 ---~~~~~~~~~~~~~~~~a~~v~E~~-~~~~~~~~~~~i~~~~~k~~~~~~is~e-ll~d~--~-~l~~~v~~~la~a~ 224 (395) T protein:vir:43 153 ---ESNSVEYVRETGFVNNAAPVSEGT-QKPYSDLTFELENAPVRTIAHLFKASRQ-ILDDA--S-ALQSYIDARARYGL 224 (395) T ss_pred ---CCCceEEEEEecCCCceeeecCCc-cccccccceeEEEEeeeeEEEeehhhHH-HHHhH--H-HHHHHHHHHHHHHH Confidence 22345555543 346777877654 5888888899999999999999998865 44322 2 47777788889999 Q ss_pred HHhhhheeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhc Q lcl|NC_011142. 155 EEHSQRVAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASS 233 (343) Q Consensus 155 ~~~~n~~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~ 233 (343) +..+|+.+++|+...+ ..|+++..++....... ..+.+..+++|.+++..+... + ..+..++|+|..|..|.. T Consensus 225 ~~~~d~~~l~G~g~~~~~~Gi~~~~~~~~~~~~~---~~~~~~~~~~i~~~~~~~~~~--~-~~~~~~vmn~~~~~~l~~ 298 (395) T protein:vir:43 225 MLVEECQLLYGNGTGANLHGIIPQAQAYAPPSGV---VVTAEQRIDRIRLAILQAQLA--E-FPASGIVLNPIDWALIEL 298 (395) T ss_pred HHHHHHHHHhccCCCCcccccccccccccccccc---ccccchhHHHHHHHHHhhccc--c-CCCcEEEEcHHHHHHHHH Confidence 9999999999986433 57999988865544332 123455788888888877542 2 245789999999999865 Q ss_pred cccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCc-cceEEEEEcccceEEEeeccchhcccce Q lcl|NC_011142. 234 LLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGN-KDRYVVYDKSERNLALAKPIPFRMLAPQ 312 (343) Q Consensus 234 ~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g-~dr~v~y~~~~~~~~~~v~~~~~~~~~~ 312 (343) .+ +..|.-+..-... .....+-|.|+..... ......-.|. ++.+.++++. -+.+.+... 
....- T Consensus 299 lk--d~~G~~i~~~~~~-~~~~~l~G~pVv~~~~-------~~~~~~~~gd~~~~~~~~~~~--~~~i~~~~~--~~~~f 364 (395) T protein:vir:43 299 NK--DAENRYIIGSPQN-GTTPTLWRLPVVETQA-------ITQDEFLTGAFSLGAQIFDRM--DIEVLVSTE--NDKDF 364 (395) T ss_pred hh--ccCCceecccccc-CCCceecceeeEEcCC-------CCCCcEEEEeccceEEEEEec--ceEEEEecc--ccchh Confidence 33 3334323211111 1111223444322211 0000000000 1111122211 111111000 00000 Q ss_pred ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 313 LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 313 ~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .++ .+.+.++.++ ++.+++|.|+++++-= T Consensus 365 ~~~-~~~~r~~~r~-d~~v~~~~a~~~~~~t 393 (395) T protein:vir:43 365 ENN-MVTIRAEERL-AFAVYRPEAFVTGSLT 393 (395) T ss_pred hcC-cEEEEEEEee-ccEEecccceEEEEec Confidence 111 2234445665 6778999999998655 No 61 >protein:vir:10364 Length: 390 # NCBI annotation: head protein; major capsid subunit precursor # Family: family:all:585 # MgeID: mge:183 # MgeName: Xp10 # Cross-refs: genbank:acc:NP_858956;genbank:gi:32128421;genbank:GeneID:2648357 Probab=98.11 E-value=1.9e-06 Score=51.92 Aligned_cols=302 Identities=13% Similarity=0.056 Sum_probs=153.2 Q ss_pred CCcceecc-chhhhhchhhhc---hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCC Q lcl|NC_011142. 1 MSEKRVVI-DAQTIAGNRWLN---KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANI 76 (343) Q Consensus 1 ~~~~~~~~-~~~~~~~~~~~~---~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~ 76 (343) +.+....- +.+...+..... ..+..... ..........++|.++.++ .+ +++++........+.++.+.+ . 
T Consensus 78 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~g~~~~~~--~~-~~ii~~~~~~~~l~~~~~~~~-~ 152 (390) T protein:vir:10 78 VGDLFVASEQFQASAGRWNDRSARATMNIKAA-LNTASTDAAGSAGALTTPN--RL-PGFITQPDARLTVRDLIGSGR-T 152 (390) T ss_pred hhhhhhhhHHHHHHHHhhhhhhhhhhhHHHHH-HHhhhcccccccccccchh--HH-HHHHHHHHhhchhhhhcceee-c Confidence 00000000 000000000000 00000000 0001111123344455443 23 467777777666666666532 2 Q ss_pred CcceeEEEEeeecc-cccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHH Q lcl|NC_011142. 77 PEYATHWNYRSYDG-AAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSE 155 (343) Q Consensus 77 ~~~~~~~~~~~~~~-~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~ 155 (343) ....+.|...+. .+.+.+++.. ..+|..+..++.....++.++.-+.+|.+ +.... .+|..--....+++++ T Consensus 153 --~~~~~~~~~~~~~~~~a~~v~Eg-~~~~~~~~~~~~i~~~~~k~~~~~~is~e-ll~d~---~~l~~~i~~~l~~~~~ 225 (390) T protein:vir:10 153 --DSALIEYVQETGFVNNAAIVAEG-ALKPESSLKFAKKTDTTHVIAHTMKATRQ-ILSDA---PQLASYMNNRLIRGLK 225 (390) T ss_pred --cCCceEEEEEecCCcceeeecCC-ccccccccceeEEEEeeEEEEEeehhhHH-HHHhH---HHHHHHHHHHHHHHHH Confidence 222345554443 4667777665 45888888899999999999998888875 43322 2577777888899999 Q ss_pred Hhhhheeeeeehhh-cceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhcc Q lcl|NC_011142. 156 EHSQRVAYFGDTNR-NMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSL 234 (343) Q Consensus 156 ~~~n~~~f~G~~~~-g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~ 234 (343) +.+|+.+++|+... +..||+|.++....+... + ....++++.+++..+... ...+..++|+|+.|..|.+. 
T Consensus 226 ~~~~~~il~G~G~~~~p~Gi~~~~~~~~~~~~~--~---~~~~~~~~~~~~~~l~~~---~~~~~~~v~n~~~~~~L~~l 297 (390) T protein:vir:10 226 VKEDAEILRGTGANDGLLGLIPQATTYAAPTTI--A---GATRVDQLRLAMLQASLA---EYPASGIVINPIDWAAIELA 297 (390) T ss_pred HHHHHHHhhcCCCCccccccccccccccccccc--c---ccchHHHHHHHHHhhccc---cCCCCEEEEcHHHHHHHHHh Confidence 99999999998643 479999998865543321 1 222356777777777542 23467899999999999753 Q ss_pred ccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc----- Q lcl|NC_011142. 235 LMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML----- 309 (343) Q Consensus 235 ~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~----- 309 (343) . +..|.-++.-.....+. .+-|.|+..... . +.+ ..++-+.+ +.+.+.....++.. T Consensus 298 k--d~~g~~l~~~~~~~~~~-~l~G~pv~~~~~-------~------p~~--~~~~gdf~-~~~~~~~~~~~~i~~~~~~ 358 (390) T protein:vir:10 298 K--DANNQYLIGNARGTLTP-TLWGLPVVATQA-------M------APG--EFLVGAFD-LAAQIFDQWDARVEIGYVN 358 (390) T ss_pred h--cCCCceeecCCcCcCCc-eecceeeEEcCC-------C------CCC--cEEEEecc-ceEEEEEecceEEEEeecc Confidence 3 43343222111111111 233444322111 0 011 11111111 11222111222211 Q ss_pred cceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 310 APQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 310 ~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ..-.++ ...+.+..++ ++.+++|.|+++++== T Consensus 359 ~~~~~~-~~~~r~~~r~-d~~v~~~~a~~~~~~a 390 (390) T protein:vir:10 359 DDFQRN-MVTVLAEERL-ALVVYRPEALISGSFA 390 (390) T ss_pred cccccC-cEEEEEEEee-ccEEeccccEEEEEeC Confidence 111122 2455567776 5789999999876533 No 62 >protein:vir:9410 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:167 # MgeName: phi 13 # Cross-refs: genbank:acc:NP_803388;genbank:gi:29028700;genbank:GeneID:1258136 Probab=98.07 E-value=8.1e-07 Score=53.99 Aligned_cols=306 Identities=8% Similarity=0.046 Sum_probs=145.5 Q ss_pred CCccee-----ccchhhhhchhh--hchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhcccc Q lcl|NC_011142. 
1 MSEKRV-----VIDAQTIAGNRW--LNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVL 73 (343) Q Consensus 1 ~~~~~~-----~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~ 73 (343) .++... -.....+.+.+. +...+...... .+...+ .+++.++.. +.+.+.+++........+.++.+. T Consensus 84 ~~~~~~~~~~~~~~~~~~~~~e~~~~~~~~~~~~~~--~~~~~~-~~~g~~~iP--~~~~~~ii~~~~~~~~l~~~~~~~ 158 (415) T protein:vir:94 84 RNQANINDLGISIQNTKVTSQEVRDFTEYLETRNDI--QGGSLK-TDSGFVVIP--EEIVTDILKLKEVEFNLDKYVTVK 158 (415) T ss_pred HHHHHHHHHHhhhhhhhhhHHHHHHHHHHhhhhhhh--hhhccc-cccccccCc--HHHHHHHHHHHHhhhhhhhhccee Confidence 000000 000000000000 00000000000 000011 122344444 456677888777777777776553 Q ss_pred CCCCcceeEEEEeeecccccceeecCCcCcccee-eeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHH Q lcl|NC_011142. 74 ANIPEYATHWNYRSYDGAAMGKFISANASDLPRV-AQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFR 152 (343) Q Consensus 74 ~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v-~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~ 152 (343) . .+-+...+.+......+.+.+++..+ .+|-. ...++.....++.++.-+.+|.+=++ ....++..--....++ T Consensus 159 ~-~~~~~~~~~~~~~~~~~~~~~v~Eg~-~~~~~~~~~~~~i~~~~~k~~~~~~is~ell~---ds~~~~~~~i~~~l~~ 233 (415) T protein:vir:94 159 R-VTNGSGKYPVVRQSEVAALEKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIE---DAKVNVLQELKLWMAR 233 (415) T ss_pred e-ccCCceeEEEEeecCCccceeccccc-cccccccccceeeEeeheeeeeechhhHHHHh---hchHHHHHHHHHHHHH Confidence 2 22233334444444455666666553 45643 34678888889999888888765333 2345777778888999 Q ss_pred HHHHhhhheeeeeehhhcceee-eecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHH Q lcl|NC_011142. 153 GSEEHSQRVAYFGDTNRNMSGL-LNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRA 231 (343) Q Consensus 153 ~~~~~~n~~~f~G~~~~g~~GL-lN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L 231 (343) ++.+.+|+.+++|+......+. .+....... +... ...-++||.+++..+... 
+ ..+..++|+|+.|..| T Consensus 234 ~~~~~~~~~il~g~g~g~~~~~~~~~~~~~~~-----~~~~-~~~~~~~i~~~~~~~~~~--~-~~~~~~vmn~~~~~~l 304 (415) T protein:vir:94 234 TIAATRNKAIIDVITKGSTGSTSSGFEKEGKK-----LEVK-KAKSLDDIKDAINLNVKP--N-YEHNVAIVSQTMFAKL 304 (415) T ss_pred HHHHHHHHHHhhccccCccccccccccccccc-----cccc-cccchHHHHHHHHhhhhh--c-cCCCEEEEcHHHHHHH Confidence 9999999999999764322222 211111111 1111 112256777777776432 2 3477899999999999 Q ss_pred hccccCCCCCccHHHHHHhcCcc----eeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchh Q lcl|NC_011142. 232 SSLLMTGYTDRTVIEHFQINNAY----TLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFR 307 (343) Q Consensus 232 ~~~~~~~~~~~tvle~l~~n~~~----~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~ 307 (343) .... +..|.-++ ..++. ..+-|.|+.... ... .+..++.. ++|-+=.+.+.+..-..++ T Consensus 305 ~~lk--d~~G~~l~----~~~~~~~~~~~l~G~pV~~~~-------~~~---~~~~~~~~-i~~gd~~~~~~~~~~~~~~ 367 (415) T protein:vir:94 305 DKMK--DKLGNYLI----QPDVKEKTQQRLLGAKIEILP-------DEV---LGQKGNNT-LIIGNLKDAIVLFDRSQYQ 367 (415) T ss_pred HHhh--ccCCCeee----ccCcCCCCCceecceeeEEec-------ccc---cCCCCccE-EEEEehhccEEEEeecceE Confidence 7532 33333221 11111 123344432211 111 11112222 2332212222222223333 Q ss_pred cccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
308 MLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 308 ~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ...............+.++ ++.+.+|.|+++++-- T Consensus 368 v~~~~~~~~~~~~r~~~r~-d~~~~~~~a~~~~~~~ 402 (415) T protein:vir:94 368 ASWTDYMHFGECLMIAVRQ-DCRILDYKSAIVIEYD 402 (415) T ss_pred EEEeccccCceEEEEEEEe-ccEEeccccEEEEEEe Confidence 3222222222334456666 5777889999998755 No 63 >protein:vir:93616 Length: 645 # NCBI annotation: putative major head protein/prohead protease # Family: family:all:21 # MgeID: mge:157 # MgeName: phi 4795 # Cross-refs: genbank:acc:YP_001449293;genbank:gi:157166041;goa:Q6H9U8;interpro:IPR006433;uniprot:Q6H9U8;genbank:GeneID:5580438 Probab=98.03 E-value=2.7e-06 Score=51.09 Aligned_cols=306 Identities=9% Similarity=0.011 Sum_probs=148.9 Q ss_pred CCc-------ce----------eccchhhhhchhh-----hchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHh Q lcl|NC_011142. 1 MSE-------KR----------VVIDAQTIAGNRW-----LNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYE 58 (343) Q Consensus 1 ~~~-------~~----------~~~~~~~~~~~~~-----~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e 58 (343) .++ .+ ..-.+..++-..+ ....+. .++. ..+.+++.++|.++.. +.+...+++ T Consensus 286 ~~~~~kg~~f~~~~~al~~~~g~~~~a~e~a~~~~~~~~~~~~~~~--~a~~-~~~~~~~~~~Gg~~vp--~~~~~~ii~ 360 (645) T protein:vir:93 286 EQKLDKGIGFARFAKSLAAAKGVRSEALEVARRQYPDDSRLHHVLK--SAVG-AGTTTDPQWAGSLSEY--QEYAQDFID 360 (645) T ss_pred hhhhhhhhhHHHHHHHHHhcccchhHHHHHHHhhcccchhhhhhhh--hhhh-ccccccccccCCccCc--hhhHHHHHH Confidence 000 00 0000000000000 000000 0111 1224555666777766 344466888 Q ss_pred hhhhcccchhhccccCCCCcce-eEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHH Q lcl|NC_011142. 59 VPYADITYLEDVPVLANIPEYA-THWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAA 137 (343) Q Consensus 59 ~~~~~l~~~~~i~v~~~~~~~~-~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~ 137 (343) ..++....+++-.....+-.+. -.+.......-+.+.+++.. 
.++|..+..++......+.++.-..+|.+=|+.+ T Consensus 361 ~l~~~svv~~l~~~~~~~~~~~~~~~~ip~~t~~~~a~wv~Eg-~~~~~s~~~f~~v~l~~~kla~~~~iS~ell~ds-- 437 (645) T protein:vir:93 361 YLRPQTIIGRFGQGGIPALRQVPFNIRVHAQVSGGAAGWVGEG-KTKPLTKFDFESITFSHAKVSAIAVLTEELIRFS-- 437 (645) T ss_pred hhhhhhhHHhhccccccccccccCceeeeeeecCcceEEeccC-ccccccccceeEEEEeeEEEEEeehhHHHHHhhc-- Confidence 7777666665543221111111 11233344445667777654 5688888889998888888888777754333433 Q ss_pred hCCCccHHHHHHHHHHHHHhhhheeeeeehhhc----ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcC Q lcl|NC_011142. 138 VNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRN----MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASK 213 (343) Q Consensus 138 ~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g----~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~ 213 (343) ..+++.--....+.++++.+|..+|+|+...+ -.|++|. +... . +......|+.+++.++..+ T Consensus 438 -~~~~~~~i~~~l~~aia~~~d~a~l~g~g~~~~~~~p~gi~~~--~~~~----~----~~~~~~~d~~~~~~~~~~a-- 504 (645) T protein:vir:93 438 -SPAADALVRNALAEAVVARLDTDFVDPKKAAVADVSPASITHD--VKGT----A----SSGNPDADAEAAFGQFVAA-- 504 (645) T ss_pred -hHHHHHHHHHHHHHHHHHHHHHHhhcCCCcccCCccccceecc--cccc----c----cccchHHHHHHHHHHHHhc-- Confidence 55677777788899999999999999875421 2344431 1111 1 1112346777888777654 Q ss_pred Ceecc-cEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEE Q lcl|NC_011142. 214 RFHTP-NTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYD 292 (343) Q Consensus 214 g~~~p-~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~ 292 (343) +...+ -..+|+|..+..|...+ +..|.-++.-+...+ ..+-|.|+...... ...+. 
.+ .-++..++- T Consensus 505 ~~~~~~a~~vmn~~~~~~L~~lk--d~~G~~~~~~~~~~~--~tL~G~PV~~s~~v---p~~~~---~g-d~s~~~ig~- 572 (645) T protein:vir:93 505 NLQPTGAVWLMSSTNALALSMRK--NALGQKEYPDMTLLG--GSFQGLPVIVSQYV---GDQLV---LV-NAPDIYLAD- 572 (645) T ss_pred CCCccccEEEEcHHHHHHHHhcc--ccCCceeecCCCCCC--ceeeceeeEEeccC---Cccee---Ee-ccccEEEEE- Confidence 22222 34789999999997543 222222210010011 12345554332110 00000 00 011222111 Q ss_pred cccceEEEeeccchhc------------------ccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 293 KSERNLALAKPIPFRM------------------LAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 293 ~~~~~~~~~v~~~~~~------------------~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ...+.+.+....+. +-.-+++ .+-+.++.++ +..+++|.|+++++|+ T Consensus 573 --~~~v~i~~s~~a~~~~~~~~~~~~~~~~~~~~v~lf~~d-~vaira~~r~-d~~~~~p~a~~~lt~~ 637 (645) T protein:vir:93 573 --DGGVAVDMSREASLEMQSEPTGDSTTPSPVELVSMFQTG-SVAIRAERWI-NWRRRRTAAVAVITGV 637 (645) T ss_pred --ecceEEEeecceeEEEeecccccccccccccchhHhhcC-ceEEEEEEEE-cceeeCccceEEEecc Confidence 11222222111110 0011223 2456677877 5778999999999999 No 64 >protein:vir:97053 Length: 390 # NCBI annotation: putative head protein # Family: family:all:585 # MgeID: mge:1653 # MgeName: OP1 # Cross-refs: genbank:acc:YP_453565;genbank:gi:84662600;genbank:GeneID:5142468 Probab=98.01 E-value=3.1e-06 Score=50.76 Aligned_cols=300 Identities=11% Similarity=0.057 Sum_probs=151.3 Q ss_pred CCcceeccchhhhhchhhhchhc------------ccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchh Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFL------------DSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLE 68 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~------------~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~ 68 (343) ..+...+ .+....++.++..+ ....... ........++|....++ +.+.+++.....-..+. 
T Consensus 72 ~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~g~lip~~---~~~~ii~~~~~~~~i~~ 145 (390) T protein:vir:97 72 DVQHVSV--GDMFVASEQFQASTGRWNDRSARATMNIKAALN-TASTDAAGSAGALTTPN---RLPGFITPPDARLTVRD 145 (390) T ss_pred ccccccc--hhhhhhhHHHHHHHHHhhhhhhhhhhHHHHHHH-hhhcccccccccccchh---hhHHHHHHHhhhhhhHh Confidence 0000000 00001111111000 0000000 00011112333333332 23457776666666666 Q ss_pred hccccCCCCcceeEEEEeeecc-cccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHH Q lcl|NC_011142. 69 DVPVLANIPEYATHWNYRSYDG-AAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQA 147 (343) Q Consensus 69 ~i~v~~~~~~~~~~~~~~~~~~-~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~ 147 (343) ++++.. . ......|...+. .+.+.+++.. ..+|..+..++........++.-..++.+ +.... .++..--. T Consensus 146 ~~~~~~-~--~~~~~~~~~~~~~~~~a~~v~Eg-~~~~~~~~~~~~i~~~~~k~~~~~~is~e-ll~ds---~~l~~~i~ 217 (390) T protein:vir:97 146 LIGSGR-T--DSALIEYVQETGFVNNAAIVAEG-ALKPESSLKFAKKTDTTHVIAHTMKATRQ-ILSDA---PQLASYMN 217 (390) T ss_pred hcceee-c--cCCceEEEEEecCCcceeeecCC-ccccccccceeEEEEeeeeEEEeehhhHH-HHHhH---HHHHHHHH Confidence 655432 2 223345555543 4567777654 45888888889999999999988888774 54322 25777778 Q ss_pred HHHHHHHHHhhhheeeeeehhh-cceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHH Q lcl|NC_011142. 148 ELAFRGSEEHSQRVAYFGDTNR-NMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPD 226 (343) Q Consensus 148 ~aA~~~~~~~~n~~~f~G~~~~-g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~ 226 (343) ...++++++.+|+.+|+|+... ...||+|.++....... .+.+..++++.+++..+.. 
....+..++|+|+ T Consensus 218 ~~la~a~~~~~d~a~l~G~g~~~~p~Gi~~~~~~~~~~~~-----~~~~~~~d~~~~~~~~~~~---~~~~~~~~v~n~~ 289 (390) T protein:vir:97 218 NRLIRGLKVKEDAEILRGTGANDGLLGLIPQATTYAAPTT-----IAGATRVDQLRLAMLQASL---AEYPASGIVINPI 289 (390) T ss_pred HHHHHHHHHHHHHHHhhcCCCCccccceeecccccccccc-----ccccchHHHHHHHHHhhcc---ccCCCCEEEEcHH Confidence 8899999999999999998643 37899998875543322 1233446778888877653 2235678999999 Q ss_pred HHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccch Q lcl|NC_011142. 227 LWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPF 306 (343) Q Consensus 227 ~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~ 306 (343) .|..|.+.. +..|.-++.-.. +.....+-|.|+.... .. +.+ ..++.+.+ +.+.+...+.+ T Consensus 290 ~~~~L~~lk--d~~G~~l~~~~~-~~~~~~l~G~pV~~~~-------~~------~~~--~~~~gd~~-~~~~~~~~~~~ 350 (390) T protein:vir:97 290 DWAAIELAK--DANNQYLIGNAR-GTLTPTLWGLPVVATQ-------AM------APG--EFLVGAFD-LAAQIFDQWDA 350 (390) T ss_pred HHHHHHHhh--cCCCceeecCcc-CCCCceecceeeEEcC-------CC------CCC--cEEEEecc-ceEEEEEecce Confidence 999997533 333332211000 0111123344432221 00 001 11211111 12222222222 Q ss_pred hcccc-----eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 307 RMLAP-----QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 307 ~~~~~-----~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +.... 
-.+++ ..+.+..++ |..+++|.|+++++== T Consensus 351 ~i~~~~~~~~f~~~~-~~~r~~~r~-d~~v~~~~a~v~~~~a 390 (390) T protein:vir:97 351 RVEIGYVNDDFQRNM-VTVLAEERL-ALVVYRPEALITGSFA 390 (390) T ss_pred EEEEeecccccccCc-EEEEEEEee-ccEEeccccEEEEEeC Confidence 22111 11232 234455665 6788999999887633 No 65 >protein:vir:98339 Length: 415 # NCBI annotation: putative capsid protein # Family: family:all:21 # MgeID: mge:1581 # MgeName: phiPVL(108) # Cross-refs: genbank:acc:YP_918931;genbank:gi:119443693;genbank:GeneID:4594501 Probab=98.01 E-value=1.8e-06 Score=52.10 Aligned_cols=307 Identities=8% Similarity=0.017 Sum_probs=144.9 Q ss_pred CCcceec-----------cchhhhhchhhhchhcccccc-cCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchh Q lcl|NC_011142. 1 MSEKRVV-----------IDAQTIAGNRWLNKFLDSNAT-IGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLE 68 (343) Q Consensus 1 ~~~~~~~-----------~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~ 68 (343) ..++... .....+.+.+ .+........ ......... .++|.++.. +.+.+.|++..+.....+. T Consensus 78 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~-~~~gg~~iP--~~~~~~ii~~~~~~~~l~~ 153 (415) T protein:vir:98 78 NEARTYRNQANINDLGISIQNTKVTSQE-VRDFTEYLETRNDIQGGSLK-TDSGFVVIP--EEIVTDILKLKEVEFNLDK 153 (415) T ss_pred chhhhHHHHHHHHHHhhhhhhhhhHHHH-HHHHHHHHhhhhhhhhcccc-ccccccccc--hHHHHHHHHHHHhhhhhhh Confidence 1111110 0000011100 0000000000 000001111 123444444 4556678777777666666 Q ss_pred hccccCCCCcceeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHH Q lcl|NC_011142. 69 DVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQA 147 (343) Q Consensus 69 ~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~ 147 (343) ++.+.. .+-+...+.+........+.+++..+ ++|-.+ ..++.....++.++.-+.+|.+=++ ....++..--. 
T Consensus 154 ~~~~~~-~~~~~~~~~~~~~~~~~~~~~v~E~~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~ 228 (415) T protein:vir:98 154 YVTVKR-VTNGSGKYPVVRQSEVAALEKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIE---DAKVNVLQELK 228 (415) T ss_pred heeeee-ccCCceeEEEEeecCCccceeecccc-ccCcccccceeeEEeeeeeeEeeehhhHHHHh---hchHHHHHHHH Confidence 655422 22222233333334445556665543 456443 4678888888889888887755333 23556777788 Q ss_pred HHHHHHHHHhhhheeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHH Q lcl|NC_011142. 148 ELAFRGSEEHSQRVAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPD 226 (343) Q Consensus 148 ~aA~~~~~~~~n~~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~ 226 (343) ...+.++.+.+|+.+++|+.... ..++.+........... ...-++||.+++.++... ...+..++|+|+ T Consensus 229 ~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~~~~~~~~~~~------~~~~~~~i~~~~~~~~~~---~~~~~~~v~n~~ 299 (415) T protein:vir:98 229 LWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEGKKLEVK------KAKSLDDIKDAINLNVKP---NYEHNVAIVSQT 299 (415) T ss_pred HHHHHHHHHHHHHHHhhccccCccccccccccccccccccc------cccchhHHHHHHHhhhhh---ccCCCEEEEcHH Confidence 88899999999999999985432 22222222211111111 112256777777776432 235678999999 Q ss_pred HHHHHhccccCCCCCccHHHHHHhcCc----ceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEee Q lcl|NC_011142. 227 LWKRASSLLMTGYTDRTVIEHFQINNA----YTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAK 302 (343) Q Consensus 227 ~~~~L~~~~~~~~~~~tvle~l~~n~~----~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v 302 (343) .|..|.+.. +..|. +|-..++ ...+-|.|+.... ... .+..+... ++|-+=.+.+.+.. 
T Consensus 300 ~~~~l~~lk--d~~G~----~l~~~~~~~~~~~~l~G~pV~~~~-------~~~---~~~~~~~~-~~~Gd~~~~~~~~~ 362 (415) T protein:vir:98 300 MFAKLDKMK--DKLGN----YLIQPDVKEKTQQRLLGAKIEILP-------DEV---LGQKGNNT-LIIGNLKDAIVLFD 362 (415) T ss_pred HHHHHHHhh--ccCCc----eeeccCcCCCCCceecceeeEEec-------ccc---cCCCCccE-EEEEehhccEEEEe Confidence 999997532 33332 1211111 1123344432211 111 11122222 33322112222222 Q ss_pred ccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 303 PIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 303 ~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -..+++................++ ++.+.+|.|+++++-- T Consensus 363 ~~~~~v~~~~~~~~~~~~~~~~r~-d~~v~~~~a~~~~~~~ 402 (415) T protein:vir:98 363 RSQYQASWTDYMHFGECLMIAVRQ-DCRILDYKSAIVIEYD 402 (415) T ss_pred ecceEEEEeccccCceEEEEEEEe-ccEEeccccEEEEEEe Confidence 223333222222223334456776 5777889999998766 No 66 >protein:vir:81100 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:1891 # MgeName: tp310-1 # Cross-refs: genbank:acc:YP_001429874;genbank:gi:156603927;genbank:GeneID:5525320 Probab=98.01 E-value=1.8e-06 Score=52.10 Aligned_cols=307 Identities=8% Similarity=0.017 Sum_probs=144.9 Q ss_pred CCcceec-----------cchhhhhchhhhchhcccccc-cCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchh Q lcl|NC_011142. 1 MSEKRVV-----------IDAQTIAGNRWLNKFLDSNAT-IGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLE 68 (343) Q Consensus 1 ~~~~~~~-----------~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~ 68 (343) ..++... .....+.+.+ .+........ ......... .++|.++.. +.+.+.|++..+.....+. 
T Consensus 78 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~-~~~gg~~iP--~~~~~~ii~~~~~~~~l~~ 153 (415) T protein:vir:81 78 NEARTYRNQANINDLGISIQNTKVTSQE-VRDFTEYLETRNDIQGGSLK-TDSGFVVIP--EEIVTDILKLKEVEFNLDK 153 (415) T ss_pred chhhhHHHHHHHHHHhhhhhhhhhHHHH-HHHHHHHHhhhhhhhhcccc-ccccccccc--hHHHHHHHHHHHhhhhhhh Confidence 1111110 0000011100 0000000000 000001111 123444444 4556678777777666666 Q ss_pred hccccCCCCcceeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHH Q lcl|NC_011142. 69 DVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQA 147 (343) Q Consensus 69 ~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~ 147 (343) ++.+.. .+-+...+.+........+.+++..+ ++|-.+ ..++.....++.++.-+.+|.+=++ ....++..--. T Consensus 154 ~~~~~~-~~~~~~~~~~~~~~~~~~~~~v~E~~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~ 228 (415) T protein:vir:81 154 YVTVKR-VTNGSGKYPVVRQSEVAALEKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIE---DAKVNVLQELK 228 (415) T ss_pred heeeee-ccCCceeEEEEeecCCccceeecccc-ccCcccccceeeEEeeeeeeEeeehhhHHHHh---hchHHHHHHHH Confidence 655422 22222233333334445556665543 456443 4678888888889888887755333 23556777788 Q ss_pred HHHHHHHHHhhhheeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHH Q lcl|NC_011142. 148 ELAFRGSEEHSQRVAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPD 226 (343) Q Consensus 148 ~aA~~~~~~~~n~~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~ 226 (343) ...+.++.+.+|+.+++|+.... ..++.+........... ...-++||.+++.++... 
...+..++|+|+ T Consensus 229 ~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~~~~~~~~~~~------~~~~~~~i~~~~~~~~~~---~~~~~~~v~n~~ 299 (415) T protein:vir:81 229 LWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEGKKLEVK------KAKSLDDIKDAINLNVKP---NYEHNVAIVSQT 299 (415) T ss_pred HHHHHHHHHHHHHHHhhccccCccccccccccccccccccc------cccchhHHHHHHHhhhhh---ccCCCEEEEcHH Confidence 88899999999999999985432 22222222211111111 112256777777776432 235678999999 Q ss_pred HHHHHhccccCCCCCccHHHHHHhcCc----ceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEee Q lcl|NC_011142. 227 LWKRASSLLMTGYTDRTVIEHFQINNA----YTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAK 302 (343) Q Consensus 227 ~~~~L~~~~~~~~~~~tvle~l~~n~~----~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v 302 (343) .|..|.+.. +..|. +|-..++ ...+-|.|+.... ... .+..+... ++|-+=.+.+.+.. T Consensus 300 ~~~~l~~lk--d~~G~----~l~~~~~~~~~~~~l~G~pV~~~~-------~~~---~~~~~~~~-~~~Gd~~~~~~~~~ 362 (415) T protein:vir:81 300 MFAKLDKMK--DKLGN----YLIQPDVKEKTQQRLLGAKIEILP-------DEV---LGQKGNNT-LIIGNLKDAIVLFD 362 (415) T ss_pred HHHHHHHhh--ccCCc----eeeccCcCCCCCceecceeeEEec-------ccc---cCCCCccE-EEEEehhccEEEEe Confidence 999997532 33332 1211111 1123344432211 111 11122222 33322112222222 Q ss_pred ccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
303 PIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 303 ~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -..+++................++ ++.+.+|.|+++++-- T Consensus 363 ~~~~~v~~~~~~~~~~~~~~~~r~-d~~v~~~~a~~~~~~~ 402 (415) T protein:vir:81 363 RSQYQASWTDYMHFGECLMIAVRQ-DCRILDYKSAIVIEYD 402 (415) T ss_pred ecceEEEEeccccCceEEEEEEEe-ccEEeccccEEEEEEe Confidence 223333222222223334456776 5777889999998766 No 67 >protein:vir:79987 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:1875 # MgeName: tp310-3 # Cross-refs: genbank:acc:YP_001430002;genbank:gi:156604057;genbank:GeneID:5525447 Probab=98.01 E-value=1.8e-06 Score=52.10 Aligned_cols=307 Identities=8% Similarity=0.017 Sum_probs=144.9 Q ss_pred CCcceec-----------cchhhhhchhhhchhcccccc-cCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchh Q lcl|NC_011142. 1 MSEKRVV-----------IDAQTIAGNRWLNKFLDSNAT-IGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLE 68 (343) Q Consensus 1 ~~~~~~~-----------~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~ 68 (343) ..++... .....+.+.+ .+........ ......... .++|.++.. +.+.+.|++..+.....+. T Consensus 78 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~-~~~gg~~iP--~~~~~~ii~~~~~~~~l~~ 153 (415) T protein:vir:79 78 NEARTYRNQANINDLGISIQNTKVTSQE-VRDFTEYLETRNDIQGGSLK-TDSGFVVIP--EEIVTDILKLKEVEFNLDK 153 (415) T ss_pred chhhhHHHHHHHHHHhhhhhhhhhHHHH-HHHHHHHHhhhhhhhhcccc-ccccccccc--hHHHHHHHHHHHhhhhhhh Confidence 1111110 0000011100 0000000000 000001111 123444444 4556678777777666666 Q ss_pred hccccCCCCcceeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHH Q lcl|NC_011142. 69 DVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQA 147 (343) Q Consensus 69 ~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~ 147 (343) ++.+.. .+-+...+.+........+.+++..+ ++|-.+ ..++.....++.++.-+.+|.+=++ ....++..--. 
T Consensus 154 ~~~~~~-~~~~~~~~~~~~~~~~~~~~~v~E~~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~ 228 (415) T protein:vir:79 154 YVTVKR-VTNGSGKYPVVRQSEVAALEKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIE---DAKVNVLQELK 228 (415) T ss_pred heeeee-ccCCceeEEEEeecCCccceeecccc-ccCcccccceeeEEeeeeeeEeeehhhHHHHh---hchHHHHHHHH Confidence 655422 22222233333334445556665543 456443 4678888888889888887755333 23556777788 Q ss_pred HHHHHHHHHhhhheeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHH Q lcl|NC_011142. 148 ELAFRGSEEHSQRVAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPD 226 (343) Q Consensus 148 ~aA~~~~~~~~n~~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~ 226 (343) ...+.++.+.+|+.+++|+.... ..++.+........... ...-++||.+++.++... ...+..++|+|+ T Consensus 229 ~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~~~~~~~~~~~------~~~~~~~i~~~~~~~~~~---~~~~~~~v~n~~ 299 (415) T protein:vir:79 229 LWMARTIAATRNKAIIDVITKGSTGSTSSGFEKEGKKLEVK------KAKSLDDIKDAINLNVKP---NYEHNVAIVSQT 299 (415) T ss_pred HHHHHHHHHHHHHHHhhccccCccccccccccccccccccc------cccchhHHHHHHHhhhhh---ccCCCEEEEcHH Confidence 88899999999999999985432 22222222211111111 112256777777776432 235678999999 Q ss_pred HHHHHhccccCCCCCccHHHHHHhcCc----ceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEee Q lcl|NC_011142. 227 LWKRASSLLMTGYTDRTVIEHFQINNA----YTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAK 302 (343) Q Consensus 227 ~~~~L~~~~~~~~~~~tvle~l~~n~~----~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v 302 (343) .|..|.+.. +..|. +|-..++ ...+-|.|+.... ... .+..+... ++|-+=.+.+.+.. 
T Consensus 300 ~~~~l~~lk--d~~G~----~l~~~~~~~~~~~~l~G~pV~~~~-------~~~---~~~~~~~~-~~~Gd~~~~~~~~~ 362 (415) T protein:vir:79 300 MFAKLDKMK--DKLGN----YLIQPDVKEKTQQRLLGAKIEILP-------DEV---LGQKGNNT-LIIGNLKDAIVLFD 362 (415) T ss_pred HHHHHHHhh--ccCCc----eeeccCcCCCCCceecceeeEEec-------ccc---cCCCCccE-EEEEehhccEEEEe Confidence 999997532 33332 1211111 1123344432211 111 11122222 33322112222222 Q ss_pred ccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 303 PIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 303 ~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -..+++................++ ++.+.+|.|+++++-- T Consensus 363 ~~~~~v~~~~~~~~~~~~~~~~r~-d~~v~~~~a~~~~~~~ 402 (415) T protein:vir:79 363 RSQYQASWTDYMHFGECLMIAVRQ-DCRILDYKSAIVIEYD 402 (415) T ss_pred ecceEEEEeccccCceEEEEEEEe-ccEEeccccEEEEEEe Confidence 223333222222223334456776 5777889999998766 No 68 >protein:vir:3158 Length: 321 # NCBI annotation: capsid protein gpE # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:316 # MgeName: PhiCh1 # Cross-refs: genbank:acc:NP_665929;genbank:gi:22091115;genbank:GeneID:951342 Probab=98.00 E-value=3.6e-06 Score=50.42 Aligned_cols=295 Identities=13% Similarity=0.029 Sum_probs=141.4 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |+.|...=.+++|+....++ ..|++.+.....+..+.|-..+.+. -+-+...+.+++.... 
T Consensus 1 ~~~k~~~~~l~~~~~~~~~~--------------~~~~~~g~~v~~~~~~~l~~~i~e~-s~~l~~i~v~~v~~~~---- 61 (321) T protein:vir:31 1 MASRTINNDLSRITEKNALT--------------VDDLDAGGTLPDPLWDEFWTDMIEE-TPLLDAIRTETVGAKK---- 61 (321) T ss_pred CchHHHHHHHHHHHHhcccc--------------ccccCCcceeCHHHHHHHHHHHHHh-hhhhhhceeeeccCcc---- Confidence 77766544444444321111 1123333333334444454555542 2333334444442211 Q ss_pred eEEEEeeecccccceeecCC-cCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhh Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFISAN-ASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQ 159 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~~~~-~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n 159 (343) ..+ ......+...+.+.. ....+..+..++..............++++-|+.. ..+.++...-....+++++..++ T Consensus 62 ~~i--~~~~~~~~~~~~~~e~~~~~~~~~~~~~~~~~~~~k~~~~~~it~e~L~d~-a~~~d~e~~i~~~ia~~~a~~~~ 138 (321) T protein:vir:31 62 TRI--PTLNIGERHRRPQDEGEWNENESDVSTGTIDISTEKATVAWDLPREVVQEN-PEGEALADRILNLMTDAWSADVE 138 (321) T ss_pred eee--eeeccCCcccccccccccccccccceeeeeeeeeEEEEeehhccHHHHHhh-hcchhHHHHHHHHHHHHHHHHHH Confidence 111 111111222222221 12233334456666777788887778877766653 34667888888999999999999 Q ss_pred heeeeeehhhcc------eeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeeccc-EEEecHHHHHHHh Q lcl|NC_011142. 160 RVAYFGDTNRNM------SGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPN-TVLMFPDLWKRAS 232 (343) Q Consensus 160 ~~~f~G~~~~g~------~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~-tL~l~p~~~~~L~ 232 (343) +++|+|+....- .|+++...-...+ .++.+.+. -++++.+++..|-. .+...+. ..+|+++.+..+. 
T Consensus 139 ~~~~nGd~~~~~~~~~~n~G~l~~a~~~~~~--~~~~~~~~--~~d~l~~l~~~l~~--~yr~~~~~v~im~~~~~~~~~ 212 (321) T protein:vir:31 139 DLAANGDEDAEDSFENQNDGFITVAEGDVET--IDAADDIL--DNDLVIRTIAGLDS--KYRARMNPALIVSEDQLLSYH 212 (321) T ss_pred hheeeccccCCCcccccchhhhhhhcccccc--cccccccc--CHHHHHHHHHhccH--hHhcCCCeEEEechHHHHHHH Confidence 999999875443 4555543211111 11221111 12344455555432 2333343 5679988876654 Q ss_pred ccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc--c Q lcl|NC_011142. 233 SLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML--A 310 (343) Q Consensus 233 ~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~--~ 310 (343) ......+++ +++-...+.....+.|.|+.+.... + .+.+++ -+.+++.+-+....+.. . T Consensus 213 ~~l~~~~~~--~~~~~l~~~~~~tl~G~pvv~~~~m----P-----------~~~il~--t~~~nl~~~~~~~~~~~~~~ 273 (321) T protein:vir:31 213 YTLTDRDTP--LGDNVIMGEADVNPFSFPIIGSGLW----P-----------DDKAMF--TDPQNLIYALYRDLEIDVLT 273 (321) T ss_pred HHHhcCCCc--cccchhhccccccccceeEEEcCCC----C-----------CCcEEE--eccccEEEEEeeccEEEEee Confidence 433222222 1221122222233556665332211 0 111222 23555544443433322 1 Q ss_pred c--ee--cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 311 P--QL--LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 311 ~--~~--~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) - +. +...+..-.+.. 
-+..|..+.+++.+.|| T Consensus 274 ~~~~~~~~~~~~~~~~~~~-~~~~ve~~~a~a~~~~i 309 (321) T protein:vir:31 274 ESDKVSERDLHARYFMRGD-DDFAIENTEAVVLAEGL 309 (321) T ss_pred cCccccccceeeEeeeeee-cceeEeccccEEEEecC Confidence 1 11 122344333444 46888999999999999 No 69 >protein:vir:485 Length: 407 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:11 # MgeName: P27 # Cross-refs: genbank:acc:NP_543092;swissprot:trembl:q8w627;genbank:gi:18249904;uniprot:Q8W627;genbank:GeneID:929693 Probab=98.00 E-value=3.6e-06 Score=50.45 Aligned_cols=315 Identities=11% Similarity=0.084 Sum_probs=153.4 Q ss_pred CCccee------ccchhhh-hchhhhchh-cccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccc Q lcl|NC_011142. 1 MSEKRV------VIDAQTI-AGNRWLNKF-LDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPV 72 (343) Q Consensus 1 ~~~~~~------~~~~~~~-~~~~~~~~~-~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v 72 (343) ..+++- ...++.. +-...++.. .+.....-.-++.+-....|.++.+ +.+.+.|++........+.+..+ T Consensus 65 ~~~~~~~~~~~~~~~~e~~~a~~~~l~~g~~~~~~~~e~~a~~~~t~~~gG~~iP--~~~~~~I~~~~~~~~~l~~~~~~ 142 (407) T protein:vir:48 65 AEVKRPAGGTQNKVASEHKEAFIGFMRKGREDGLRELERKALQVGNDEDGGYAIP--EELDRTILTLLKDEVVMRQEATV 142 (407) T ss_pred HHhhccccccccchhhHHHHHHHHHHhccchhhhhHHHHHhhhcccCCCCccccc--HhHHHHHHHHHHhhhhhhhhcee Confidence 000000 0000000 000111100 0000000000111111223444444 45677888877766666665543 Q ss_pred cCCCCcceeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHH Q lcl|NC_011142. 73 LANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAF 151 (343) Q Consensus 73 ~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~ 151 (343) . +-......+.+......+.+++... ..|-.+ ..++.....++.++.-+.+|.+=|+. 
...++..--....+ T Consensus 143 ~---~~~~~~~~~~~~~~~~~a~~v~E~~-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~ 215 (407) T protein:vir:48 143 I---TLGGSDYKKLVNLGGTTSGWVGETD-ARPETATSKLGLIEPFMGEIYGNPQATQKMLDD---AFFNVEDWINSELA 215 (407) T ss_pred e---ecCCCceEEEEecCCcceeeecccc-cccccccccceeEEeeeeeeEeehhhHHHHHhc---chHHHHHHHHHHHH Confidence 2 2222344555555556667776543 455443 35777788888888877777654443 34568788888889 Q ss_pred HHHHHhhhheeeeeehhhcceeeeecCCccccccCcCcc------ccCHH-HHHHHHHHHHHHHHHhcCCeecccEEEec Q lcl|NC_011142. 152 RGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYA------TCTGQ-ELFDLLNNPVFAVVKASKRFHTPNTVLMF 224 (343) Q Consensus 152 ~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~------~~t~~-~i~~di~~~~~~l~~~s~g~~~p~tL~l~ 224 (343) .++.+.+++.+++|+......|+|+++.+........|. +.++. --++||.+++..|... +.. .-.++++ T Consensus 216 ~~i~~~~~~a~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~i~~l~~~l~~~--~~~-~a~~v~n 292 (407) T protein:vir:48 216 LEFAEQEEIAFTSGDGSKKPKGFLAYESTDEDDKTRAFGKLQHIASGAASGVTADAIIKLIYTLRKA--HRS-GAKFMMN 292 (407) T ss_pred HHHHHHHHhhhhccCCCCccceeeecccccccccccccccccccccccccccChHHHHHHHHhhchh--hhc-CCEEEEc Confidence 999999999999999877789999998865433222221 11111 1267777777776432 332 2358999 Q ss_pred HHHHHHHhccccCCCCCccHHH-HHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeec Q lcl|NC_011142. 225 PDLWKRASSLLMTGYTDRTVIE-HFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKP 303 (343) Q Consensus 225 p~~~~~L~~~~~~~~~~~tvle-~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~ 303 (343) +..|..|..- -+..|.-++. =+. ++....+-|.|+.+. .... ..+. +++. 
++|-+=.+.+.+.-- T Consensus 293 ~~~~~~L~~l--kD~~Gr~l~~~~~~-~g~~~~l~G~PV~~~-------~~~p--~~~~-~~~~-i~~Gd~~~~~~i~~~ 358 (407) T protein:vir:48 293 NSSLFAIRLL--KDNDGNYLWRPGIE-LGQPSSLAGYGIVEN-------EQMP--DIAA-DAKA-IAFGNFKRGYTIVDR 358 (407) T ss_pred HHHHHHHHHh--hccCCceeeccCcC-CCCCceecceeeEEe-------cCcC--CccC-CccE-EEEEeccccEEEEEe Confidence 9999998642 2433322210 011 111112334443222 1111 1111 2232 333211111211111 Q ss_pred cchhccccee--cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 304 IPFRMLAPQL--LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 304 ~~~~~~~~~~--~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +.++..--.. ++ ...+.++.|++ +.+.+|.+++.+..= T Consensus 359 ~~~~i~~d~~~~~~-~~~~~~~~r~d-~~v~~~~a~~~l~~~ 398 (407) T protein:vir:48 359 IGTRILRDPYTNKP-FVGFYTTKRTG-GMLVDSQAIKLMKIG 398 (407) T ss_pred eceEEEeeccccCC-cEEEEEEEEec-cEEecccceEEEEee Confidence 1222221111 22 23455677874 567779999876554 No 70 >protein:vir:2344 Length: 397 # NCBI annotation: gp14 # Family: family:all:507 # MgeID: mge:51 # MgeName: Bxb1 # Cross-refs: genbank:acc:NP_075281;genbank:gi:12657868;genbank:GeneID:920118 Probab=97.95 E-value=3.7e-06 Score=50.39 Aligned_cols=282 Identities=10% Similarity=-0.037 Sum_probs=145.2 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |. ++++.... +...+...++ ++..+. + .++++..+..-..+++..+.. ... 
T Consensus 1 ~g-----~~~e~~~~-----------------~~~~t~~~~g-~l~~~~--~-~~ii~~l~~~s~i~~l~~~~~---~~~ 51 (397) T protein:vir:23 1 MG-----FSADHSQI-----------------AQTKDTMFTG-YLDPVQ--A-KDYFAEAEKTSIVQRVAQKIP---MGA 51 (397) T ss_pred CC-----cCHHHHHH-----------------hhccCCCCcc-ccchhH--H-HHHHHHHHhccchhhhcceee---ccC Confidence 10 11110000 0012222333 333322 2 345555555545555554422 222 Q ss_pred eEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhh Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQR 160 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 160 (343) ....+.+.+....+.+++.. ..+|..+..++.....++.++..+.++.+=++.+ ..++...-....++++++.+|+ T Consensus 52 ~~~~ip~~~~~~~a~wv~Eg-~~~~~s~~~f~~v~l~~~k~~~~v~iS~ell~ds---~~~l~~~i~~~l~~aia~~~d~ 127 (397) T protein:vir:23 52 TGIVIPHWTGDVSAQWIGEG-DMKPITKGNMTKRDVHPAKIATIFVASAETVRAN---PANYLGTMRTKVATAIAMAFDN 127 (397) T ss_pred CceEEEEEcCCcceEEecCC-ccccccccceeEEEEeeEEEEEeehhhHHHHhcc---hHHHHHHHHHHHHHHHHHHHHH Confidence 34566667777778888664 5688888889999999999999988876655533 4668888889999999999999 Q ss_pred eeeeeehh-hcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCC Q lcl|NC_011142. 161 VAYFGDTN-RNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGY 239 (343) Q Consensus 161 ~~f~G~~~-~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~ 239 (343) .+++|+.. .+..|+.+..+......+. -.++++.+++.++... + ..+..++|+|+.+..|.+.. +. 
T Consensus 128 a~l~G~gt~~~~~~~~~~~~~~~~~~~~--------~~~~~~~~~~~~l~~~--~-~~~a~~vmn~~~~~~L~~lk--d~ 194 (397) T protein:vir:23 128 AALHGTNAPSAFQGYLDQSNKTQSISPN--------AYQGLGVSGLTKLVTD--G-KKWTHTLLDDTVEPVLNGSV--DA 194 (397) T ss_pred HHhhcccCCcccccccccccceeeeccc--------chhHHHHHHHHhhhhc--c-cCCCEEEEcHHHHHHHHHhh--cc Confidence 99999864 3455555544433222211 1234445555555432 2 34578999999999997532 33 Q ss_pred CCccHHHH-HHhcCcc----eeeccccccccccceeeechhhhccccCCccceEEEEEc------ccceEEEeeccchhc Q lcl|NC_011142. 240 TDRTVIEH-FQINNAY----TLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDK------SERNLALAKPIPFRM 308 (343) Q Consensus 240 ~~~tvle~-l~~n~~~----~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~------~~~~~~~~v~~~~~~ 308 (343) .+.-++.- .....+. ..+.|.|+.+... ...++..++.-+. ..+.+.+.+-..... T Consensus 195 ~G~~i~~~~~~~~~~~~~~~~tl~G~Pv~~s~~-------------~~~g~~~~~~gDfs~~~i~~~~~i~i~~~~e~~~ 261 (397) T protein:vir:23 195 NGRPLFVESTYESLTTPFREGRILGRPTILSDH-------------VAEGDVVGYAGDFSQIIWGQVGGLSFDVTDQATL 261 (397) T ss_pred CCceeecccccccccccccCceeeeeeEEEeCC-------------CCCCceEEEEeecceEEEEEEeceEEEEeeeeee Confidence 33322110 0000011 1233444322211 0111111111111 111222222111110 Q ss_pred cc----------ceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 309 LA----------PQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 309 ~~----------~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .. 
.-.++ ...+.+..++ ++.+++|.++++.++- T Consensus 262 ~~~~~~~~~~~~lf~~d-~v~~ra~~r~-d~~v~~~~a~~~~~~~ 304 (397) T protein:vir:23 262 NLGSQESPNFVSLWQHN-LVAVRVEAEY-GLLINDVNAFVKLTFD 304 (397) T ss_pred eeccccccceeeeeecc-ceeEEEEeee-ccceecccceEEEeec Confidence 00 01112 1344556776 5799999999999986 No 71 >protein:vir:81070 Length: 390 # NCBI annotation: p09 # Family: family:all:585 # MgeID: mge:1889 # MgeName: Xop411 # Cross-refs: genbank:acc:YP_001285679;genbank:gi:148727187;genbank:GeneID:5247115 Probab=97.93 E-value=5.3e-06 Score=49.51 Aligned_cols=303 Identities=13% Similarity=0.046 Sum_probs=153.3 Q ss_pred CCcceecc-chhhhhchhhhchhccccc--ccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCC Q lcl|NC_011142. 1 MSEKRVVI-DAQTIAGNRWLNKFLDSNA--TIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIP 77 (343) Q Consensus 1 ~~~~~~~~-~~~~~~~~~~~~~~~~~~~--~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~ 77 (343) +.+..... ..+.+.+............ ............++|.+..++ ++ +.+++........+.++.+.. .+ T Consensus 78 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~--~~-~~ii~~~~~~~~l~~~~~~~~-~~ 153 (390) T protein:vir:81 78 VGDMFVASEQFQASAGRWNDRSARATMNIKAALNTASTDAAGSAGALTTPN--RL-PGFITPPDARLTVRDLIGSGR-TD 153 (390) T ss_pred chhhhhhhHHHHHHHHHHhhhhhhhhhHHHHHHHhhccccccCCcceechh--hh-HHHHHHHhhhhhhhhhcceee-cc Confidence 11111110 0111111111100000000 000001111123344455443 22 457776766666666665432 22 Q ss_pred cceeEEEEeeec-ccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHH Q lcl|NC_011142. 78 EYATHWNYRSYD-GAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEE 156 (343) Q Consensus 78 ~~~~~~~~~~~~-~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~ 156 (343) .....+.... ..+.+.+++.+ ..+|..+..++.....+..++.-+.++.+ +.... 
.++..--....++++++ T Consensus 154 --~~~~~~~~~~~~~~~a~~v~Eg-~~~~~~~~~~~~i~~~~~k~~~~~~is~e-ll~d~---~~~~~~i~~~l~~~~~~ 226 (390) T protein:vir:81 154 --SALIEYVQETGFVNNAAIVAEG-ALKPESSLKFAKKTDTTHVIAHTMKATRQ-ILSDA---PQLASYMNNRLIRGLKV 226 (390) T ss_pred --CCceEEEEEecCCcceeeecCC-cccccccceeeEEEEeeeEEEEeehhhHH-HHHhH---HHHHHHHHHHHHHHHHH Confidence 2234444443 34667777765 45888888899999999999999888874 44322 25777788889999999 Q ss_pred hhhheeeeeehhh-cceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccc Q lcl|NC_011142. 157 HSQRVAYFGDTNR-NMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLL 235 (343) Q Consensus 157 ~~n~~~f~G~~~~-g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~ 235 (343) .+|+.+++|+... ...|++|..+....+... +....++||.+++.++... ...+..++|+|+.|..|.+.. T Consensus 227 ~~d~a~l~G~g~~~~~~Gi~~~~~~~~~~~~~-----~~~~~~~~~~~~~~~~~~~---~~~~~~~v~~~~~~~~l~~lk 298 (390) T protein:vir:81 227 KEDAEILRGTGANDGLLGLIPQATTYAAPTTI-----AGATRVDQLRLAMLQASLA---EYNPSGIVINPIDWAAIELAK 298 (390) T ss_pred HHHHHHHhcCCCCCcccceeeccccccccccc-----ccchhHHHHHHHHHhhccc---cCCCCEEEEcHHHHHHHHHhh Confidence 9999999998643 489999988765443221 1222356778888777542 245678999999999987532 Q ss_pred cCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc----c Q lcl|NC_011142. 236 MTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA----P 311 (343) Q Consensus 236 ~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~----~ 311 (343) +..|.-++.-...... ..+-|.|+.... .. +.+ .+++.+.+ +.+.+..-..++... . 
T Consensus 299 --d~~G~~l~~~~~~~~~-~~l~G~pv~~~~-------~~------p~~--~~~~gd~~-~~~~~~~~~~~~v~~~~~~~ 359 (390) T protein:vir:81 299 --DANNQYLIGNARGTLT-PTLWGLPVVATQ-------AM------APG--EFLVGAFD-LAAQIFDQWDARVEIGYVGE 359 (390) T ss_pred --cCCCceeecCcccccC-ceecceeeEEcC-------CC------CCC--cEEEEehh-ceEEEEEecceEEEEecccc Confidence 3333322211111111 122344432211 10 001 11211211 112222112222211 1 Q ss_pred -eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 -QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 -~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.++ ...+.+..++ +..++.|.|++.++== T Consensus 360 ~~~~~-~v~~r~~~r~-d~~v~~~~a~v~~t~a 390 (390) T protein:vir:81 360 DFQRN-MITVLAEERL-ALVVYRPEALISGSFA 390 (390) T ss_pred hhhcC-cEEEEEEEee-ccEEecccceEEEEeC Confidence 1122 2345567776 5688899998776522 No 72 >protein:vir:96762 Length: 632 # NCBI annotation: putative phage-related protein # Family: family:all:21 # MgeID: mge:1628 # MgeName: VP882 # Cross-refs: genbank:acc:YP_001039818;genbank:gi:126010917;genbank:GeneID:5076272 Probab=97.93 E-value=1.8e-06 Score=52.04 Aligned_cols=304 Identities=9% Similarity=-0.009 Sum_probs=146.9 Q ss_pred CCccee-------------------------------------ccchhhhhchhhhc--hhcccccccCcchhecchhhh Q lcl|NC_011142. 1 MSEKRV-------------------------------------VIDAQTIAGNRWLN--KFLDSNATIGVPSVVNDADGG 41 (343) Q Consensus 1 ~~~~~~-------------------------------------~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~dA~~~ 41 (343) +.+... ...+..-.|.+.-+ ...+.... -++..-..+. T Consensus 288 ~~~~~~~~~i~~~~re~~~~~l~rai~a~a~~~~~~a~~~~e~a~~~a~~~G~~arg~~~~~~~l~~---ra~~~~t~~~ 364 (632) T protein:vir:96 288 KPAIHSARDLGIQHKELQQYSLMRAINAAATGDWSKAGFEREVSLAIADASGKEARGFYMPHEVLVQ---RQLEKKTAGK 364 (632) T ss_pred hhhhhhhhhhhhhHHHHHHHHHHHHHHhhhccchhhhhhhhHHHHHHHHhhhhhhhhhhhhHHHHHH---hhhhcccccc Confidence 100000 00000000100000 00000000 0000001122 Q ss_pred hhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEE Q lcl|NC_011142. 
42 AAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYA 121 (343) Q Consensus 42 ~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~ 121 (343) |.++... +.....+++.+++....+++ +.. ..+.....+.+......+.+.+++.. ..+|..+..++........+ T Consensus 365 gg~lvp~-~~~~~~iie~lr~~s~i~~l-~~~-~~~~~~g~~~ip~~~~~~~a~wv~E~-~~~~~s~~~f~~i~l~~~k~ 440 (632) T protein:vir:96 365 GGELVAT-ELLSEEFIDILRNKAIIGQM-GAR-MLPGLVGDVDIPKKTSGANFYWIGED-EDVQDSDFDFTTLSFSPKTI 440 (632) T ss_pred ccccccc-ccchHHHHHHHhhcchhhhh-cce-EeecCCcceEEEEEeCCceeEeecCC-ccccccccceeeEEeeeeEE Confidence 3333331 22334677766665555554 211 11111223556666666666776654 45777788888888888888 Q ss_pred EEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehh-hcceeeeecCCccccccCcCccccCHHHHHHH Q lcl|NC_011142. 122 GVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTN-RNMSGLLNNPNVTKTSATVNYATCTGQELFDL 200 (343) Q Consensus 122 ~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~-~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~d 200 (343) +.-+.+|.+=|.. ...+++.--......++++.+|+.+++|+.. ....|++|..+++..+.+. ...+ +++ T Consensus 441 ~~~v~iS~ell~d---s~~~~~~~i~~~l~~a~~~~~d~a~l~G~G~~~~p~Gi~~~~~~~~~~~~~--~~~~----~~~ 511 (632) T protein:vir:96 441 AGAVPVTRKLRKQ---SSIHVENLIREDLIEGIGVALDLAMLTGTGLANDPVGLLNMTGVPALTYPA--GGVD----WAS 511 (632) T ss_pred EEehhhHHHHHhc---cchHHHHHHHHHHHHHHHHHHHHHhhcccCCCCccceeeecccccceeccc--ccCC----HHH Confidence 8887776544443 3556777777888999999999999999863 3478999998876544322 1222 456 Q ss_pred HHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccc Q lcl|NC_011142. 201 LNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGV 280 (343) Q Consensus 201 i~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~ 280 (343) |.++..++... +....+...+++|..+..|......+..+.-+ -.++ .+.|.|..+..... ... . 
T Consensus 512 i~~~~~~i~~~-~~~~~~~~~~~~~~~~~~l~~~~l~d~~G~~i----~~~~---~l~G~pv~~s~~ip--~~~-----~ 576 (632) T protein:vir:96 512 VVDMETKISTF-NADAGRLAYLTSVTQRGAAKKAQVFDNTGERI----WQNN---EVNGYRAEASNQIP--ADT-----W 576 (632) T ss_pred HHHHHHHHhhc-ccccCccEEEEchhHHHHHHHHhccCCCCcee----ecCC---eecccceEeccccc--cCc-----E Confidence 66666666543 22233456889998887776543434444322 2222 34466654432110 000 0 Q ss_pred cCCccceEEEEEcccceEEEeeccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 281 SNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 281 ~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.|.-..+++.. ..-+.+.+ -+.+ ....-...+.++.++ ++.+++|.++++..-= T Consensus 577 ~~gd~s~~~i~~--~~~~~i~~-~~~~----~~~~~~v~~~~~~~~-d~~v~~~~af~~~k~~ 631 (632) T protein:vir:96 577 IFGDWSQIVIAM--WGVLDLKV-DPYT----KAASDGLVLRVFQDV-DAGVRRKEAFCIAKKG 631 (632) T ss_pred EEeecceEEEEE--ecceEEEE-cccc----ccccCceEEEEEeec-Cceeechhhhhheeec Confidence 000001111111 11122221 0111 111123455566665 5788899888754433 No 73 >protein:vir:102119 Length: 404 # NCBI annotation: phage major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1641 # MgeName: phiSM101 # Cross-refs: genbank:acc:YP_699941;genbank:gi:110804052;genbank:GeneID:4206662 Probab=97.92 E-value=2.7e-06 Score=51.13 Aligned_cols=312 Identities=9% Similarity=0.068 Sum_probs=147.2 Q ss_pred CCcc-----------eeccchhhh---hchhhhchhcccccccC---cchhecchhhhhhhhHHHHHHHHHHHHhhhhhc Q lcl|NC_011142. 1 MSEK-----------RVVIDAQTI---AGNRWLNKFLDSNATIG---VPSVVNDADGGAAYYISQLASLETTVYEVPYAD 63 (343) Q Consensus 1 ~~~~-----------~~~~~~~~~---~~~~~~~~~~~~~~~~~---~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~ 63 (343) +.+. ...-+.+.. ....+++..-....... ..++.....+.|.++.+ +.+.++|++..... 
T Consensus 60 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e~~a~~~~~~~~gg~~vP--~~~~~~ii~~~~~~ 137 (404) T protein:vir:10 60 FNEDNVKSLNTGKEENVIYNGALFVRAIADNLLKQKNQRGLNLSEKEINAISENIDEDGGYAVP--EDIQTKINTRLKDT 137 (404) T ss_pred HhhhhccccccccchhhHHHHHHHHHHHHHHHHHHHHhhhhcchhhHHhhhccccCCCCceeec--hhHHHHHHHHHhhh Confidence 0000 000000000 00000000000000000 00111111233445544 45667788877766 Q ss_pred ccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccce--eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCC Q lcl|NC_011142. 64 ITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPR--VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMP 141 (343) Q Consensus 64 l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~--v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~ 141 (343) ...+.++++.. .+...-.+.|........+.+++..+. .|. .+..++........++.-+.+|.+=|+ ....+ T Consensus 138 ~~l~~l~~~~~-~~~~~g~~~~~~~~~~~~~~~v~e~~~-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~---ds~~~ 212 (404) T protein:vir:10 138 TDLYNMVDYEP-VFTRSGSRTYEKRSKQKPMKPLSENQQ-IPTNGDNGKLERFNFKLKDLADFMSIPNDLLK---FADKS 212 (404) T ss_pred hhHhhhhceee-ccCCccceEEEEecCCcceeecccccc-ccccccccceeeeEeeheeeEeeehhhHHHHh---hcHHH Confidence 66666665532 222222334444444455566655432 333 234467777778888887777764333 23446 Q ss_pred ccHHHHHHHHHHHHHhhhheeeeeehh-hcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccE Q lcl|NC_011142. 142 IDSMQAELAFRGSEEHSQRVAYFGDTN-RNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNT 220 (343) Q Consensus 142 l~~~k~~aA~~~~~~~~n~~~f~G~~~-~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~t 220 (343) |..--....++++++.+|+.+++|+.. ....|+++..++...+.++. .+ ++|+.++++.... .+...... 
T Consensus 213 l~~~i~~~la~~~~~~~~~~il~G~g~~~~~~gi~~~~~~~~~~~~~~---~~----~~~~~~~~~~~l~--~~~~~~~~ 283 (404) T protein:vir:10 213 LEDWIINWFVDKVRITRNAEILYGAGGDEHATGIMTANKFKKITLPKS---PA----LKDFKKCKNVELL--NVFKATSS 283 (404) T ss_pred HHHHHHHHHHHHHHHHHHHHHhhcCCCCCcccceeeccccceeecccc---cc----HHHHHHHHHhhhh--ccccCCCE Confidence 777788889999999999999999874 34689988887665443311 12 3455555543221 23333457 Q ss_pred EEecHHHHHHHhccccCCCCCccHHH-HHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEE Q lcl|NC_011142. 221 VLMFPDLWKRASSLLMTGYTDRTVIE-HFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLA 299 (343) Q Consensus 221 L~l~p~~~~~L~~~~~~~~~~~tvle-~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~ 299 (343) ++|+|..|..|.+.. +..|.-++. -+. +.....+-|.|+.. .... ...++.+.. -++|-+=.+.+. T Consensus 284 ~v~n~~~~~~L~~lk--d~~G~~l~~~~~~-~~~~~~l~G~PV~~-------~~~~--~~~~~~~~~-~~~~gd~s~~~~ 350 (404) T protein:vir:10 284 WIVNQDGFNYLDSLE--DKTGRPYLQPDPK-DPTQYRFLGLPVIE-------LPND--LLLSTESAI-PVLLGDTKEAYK 350 (404) T ss_pred EEEcHHHHHHHHHhh--ccCCceeeccCcC-CCCCccccceeeEE-------eccc--ccCCCCCcc-EEEEEeccccEE Confidence 899999999987532 333322211 001 11111233444321 1110 011122222 233433223333 Q ss_pred Eeeccchhcccc-ee-----cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 300 LAKPIPFRMLAP-QL-----LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 300 ~~v~~~~~~~~~-~~-----~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +..-..++.... +. 
++ ...+.++.++ |+.+.+|.+++.++=- T Consensus 351 ~~~~~~~~i~~~~~~~~~~~~~-~~~~~~~~r~-d~~v~~~~a~~~~~~~ 398 (404) T protein:vir:10 351 YVSDGAYELATTNIGAGAFETN-TTKARIIMRI-DGNVKDSEALLIAEIP 398 (404) T ss_pred EEEecceEEEEeccccchhhcC-ceEEEEEEee-ccEEecccceEEEEee Confidence 322222222111 11 12 2345567776 6789999999887765 No 74 >protein:vir:1328 Length: 392 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:28 # MgeName: phi-C31 # Cross-refs: genbank:acc:NP_047927;swissprot:trembl:q9zwv6;genbank:gi:9631145;uniprot:Q9ZWV6;genbank:GeneID:2715889 Probab=97.86 E-value=4.3e-06 Score=49.99 Aligned_cols=307 Identities=9% Similarity=-0.002 Sum_probs=145.3 Q ss_pred CCcceeccchhhh---hchhhhchhcccc-cccC-cchh--ecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhcccc Q lcl|NC_011142. 1 MSEKRVVIDAQTI---AGNRWLNKFLDSN-ATIG-VPSV--VNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVL 73 (343) Q Consensus 1 ~~~~~~~~~~~~~---~~~~~~~~~~~~~-~~~~-~~~~--~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~ 73 (343) ..++... +++.- .....++...... .+.. .+.. .+.+ +++.++.++ .+.+.+.+........+.+..+. T Consensus 73 ~~~~~~~-~~~~~~~~~~~~~~r~g~~~~~~~~~~~~~~~~~t~~-~~g~~~~~~--~~~~~i~~~~~~~~~l~~~~~~~ 148 (392) T protein:vir:13 73 GLQGSGS-GAQRSADHDDDAVLRAGNLGEARSFEFAPEKRDGTKA-GNPNVLSRT--LYGQLIAQAVERSAIMRGGASTF 148 (392) T ss_pred ccCCccc-chhhhhhHHHHHHHhccchhhhHHHHhhhhhhccccc-CCCcccccc--chHHHHHHHHhhhhhhhhcceee Confidence 0000000 00000 0000111000000 0000 0000 0111 122232221 12222323222221223222221 Q ss_pred CCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHH Q lcl|NC_011142. 74 ANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRG 153 (343) Q Consensus 74 ~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~ 153 (343) . .. +...+.+.+.+..+.+.+++.. ..+|..+..++......+.++.-..+|.+=|+. 
...++..--....+.+ T Consensus 149 ~-~~-~~~~~~~~~~~~~~~a~~v~E~-~~~~~~~~~f~~v~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~~~ 222 (392) T protein:vir:13 149 T-TS-DANPMDFTVITGRATAGIVGET-AEIPESYPATTQRSMGGFKYGFASVVSYEFATD---QVLDLVGFLVSDAGPA 222 (392) T ss_pred e-cC-CCceeEEEEEcCCcceeeeccc-ccccccccceeeEEeeeeeEEeeehhHHHHHhc---chHHHHHHHHHHHHHH Confidence 1 11 1223455566677777887665 457888888889889999999888877654443 3556777777888999 Q ss_pred HHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhc Q lcl|NC_011142. 154 SEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASS 233 (343) Q Consensus 154 ~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~ 233 (343) +++.+|..+++|+....-.|+++++....... .|.+.+ .-.++++.+++..|... .-.+-.++|+|..+..|.. T Consensus 223 i~~~~d~~~l~G~Gt~~p~Gil~~~~~~~~~~--~~~~~~-~~~~d~l~~~~~~l~~~---~~~~a~~v~n~~~~~~l~~ 296 (392) T protein:vir:13 223 IGDAMGRHFLTGTGTGQPRGILTDATGANAAF--GEADAD-SKVSDALIDLFHEVPSA---YRKNAKFVVNDLRAAQMRK 296 (392) T ss_pred HHHHHHHHHhcccCCccccccccccccccccc--cccccc-cccHHHHHHHHHhhhhh---hhcCCEEEEcHHHHHHHHH Confidence 99999999999987666789999876443322 222211 11255666666666432 1223468999999999875 Q ss_pred cccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc-ce Q lcl|NC_011142. 234 LLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA-PQ 312 (343) Q Consensus 234 ~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~-~~ 312 (343) -. +..|.-++.-=...+....+-|.|+.+... .. .+.+ +|- |...+.+..-..++... 
.+ T Consensus 297 lk--d~~G~~l~~~~~~~g~~~~l~G~Pv~~~~~----------~~-----~~~i-~~G-df~~~~i~~~~~~~i~~~~~ 357 (392) T protein:vir:13 297 LK--DANGQYLWQSALTVGAPDTFNGKVVETDDG----------MP-----ADKV-LFA-DLSKYRVRFAGSLRVDRSVD 357 (392) T ss_pred hh--ccCCceeecCCcCCCCCceecceeeEEcCC----------CC-----CCcE-EEe-eccceeEEeecceEEEeecc Confidence 32 433432221000001111234455432211 00 1112 221 11122222223333221 11 Q ss_pred ec--CceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 313 LL--GLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 313 ~~--~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +. .-...+.++.|++ +.+.+|.|+..+..= T Consensus 358 ~~~~~~~~~~r~~~r~d-~~~~~~~A~~~~~~~ 389 (392) T protein:vir:13 358 AKFSTDQIVYRFLQRAD-GLLVDARGAKVLTVT 389 (392) T ss_pred ccccCCcEEEEEEEEec-cEEecccceEEEEee Confidence 11 1124556778875 668999998877766 No 75 >protein:vir:6212 Length: 434 # NCBI annotation: prohead protease # Family: family:all:21 # MgeID: mge:128 # MgeName: phBC6A52 # Cross-refs: genbank:acc:NP_852592;genbank:gi:31415852;genbank:GeneID:1489210 Probab=97.85 E-value=2.8e-06 Score=51.02 Aligned_cols=309 Identities=10% Similarity=0.063 Sum_probs=155.7 Q ss_pred CCcceeccchhhhh------ch-----hhhchhcccccccCc-----chhecchhhhhhhhHHHHHHHHHHHHhhhhhcc Q lcl|NC_011142. 1 MSEKRVVIDAQTIA------GN-----RWLNKFLDSNATIGV-----PSVVNDADGGAAYYISQLASLETTVYEVPYADI 64 (343) Q Consensus 1 ~~~~~~~~~~~~~~------~~-----~~~~~~~~~~~~~~~-----~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l 64 (343) +.|.+-.++. .+. ++ ..++.++......+. -++.. ..+.|.++.+ +.+.+.|++...... T Consensus 95 ~~e~~~~~~~-~~~~~~~~~~~~~~~~~e~r~a~~~~l~~~~~~~e~~a~~~-~t~~GG~lvP--~~~~~~Ii~~l~~~~ 170 (434) T protein:vir:62 95 SEEQRSAISA-SIAAALSTKGHRTNKETEIRSVFANYIVGNIDEKEARALGL-VTGNGSVTIP--DFLSKEIITYAQEEN 170 (434) T ss_pred HHHHHHHHHH-HHHhhhhhccccchHHHHHHHHHHHHhccccchhhhhhhcc-cccccceecc--hhhHHHHHHhhhhhh Confidence 0111100000 000 00 000101000000000 00011 1133556665 456677888777666 Q ss_pred cchhhccccCCCCcceeEEEEeeecccccceeec--CCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCc Q lcl|NC_011142. 
65 TYLEDVPVLANIPEYATHWNYRSYDGAAMGKFIS--ANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPI 142 (343) Q Consensus 65 ~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~--~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l 142 (343) ..+.+..+.. .+ ....|.+....+.+.+.. ..+.++|..+..++......+.++.-+.+|.+=|+. ...+| T Consensus 171 ~i~~~~~~~~-~~---~~~~~p~~~~~~~a~~~~~~~e~~~~~~~~~~f~~v~~~~~k~~~~~~iS~ell~d---s~~~l 243 (434) T protein:vir:62 171 FLRRLGTGVK-TK---ENIKYPVLVKKAEAQGHKNERTNNEMPETDIEFDEIELSPTEFDALATVTKKLLAR---TGLPI 243 (434) T ss_pred hhhhhcceec-cC---CceEEEEEecCCcccceecccccccccccccceeeEEeeheeeEeehhhHHHHHhc---chHHH Confidence 6666554422 11 124455555445555442 334567888888888889999998877776554443 35678 Q ss_pred cHHHHHHHHHHHHHhhhheeeeeehhhc-ceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEE Q lcl|NC_011142. 143 DSMQAELAFRGSEEHSQRVAYFGDTNRN-MSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTV 221 (343) Q Consensus 143 ~~~k~~aA~~~~~~~~n~~~f~G~~~~g-~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL 221 (343) ..--....+.++...+++.+++|+...+ ..|+++.++++..+.+ ...+++|.+++.++... +. ..-.. T Consensus 244 ~~~i~~~la~~~~~~~d~~~l~G~G~~~~~~g~~~~~~~~~~~~~--------~~~~d~l~~l~~~l~~~--~~-~~a~~ 312 (434) T protein:vir:62 244 EQIVMDELKKAYVRKETQYMVNGDEANNINDGALAKKAVEFKTDE--------KNLYDALVKMKNTPVKE--VR-KKARW 312 (434) T ss_pred HHHHHHHHHHHHHHHHHHHHhccCCCCccccceeecccccccccc--------cchhhHHHHHHhhcchh--hh-cCCEE Confidence 8888888999999999999999997554 5677877776543221 12356777777766532 22 22357 Q ss_pred EecHHHHHHHhccccCCCCCccHHH-HHH-hcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceE- Q lcl|NC_011142. 222 LMFPDLWKRASSLLMTGYTDRTVIE-HFQ-INNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNL- 298 (343) Q Consensus 222 ~l~p~~~~~L~~~~~~~~~~~tvle-~l~-~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~- 298 (343) +|+|..|..|..-. +..|.-++. ... .......+-|.|+.+. .... 
.+.++....++|-+=.+++ T Consensus 313 v~n~~~~~~L~~lk--d~~G~~l~~~~~~~~~g~~~tl~G~pV~~~-------~~~~---~~~~~~~~~i~~Gdfs~~~i 380 (434) T protein:vir:62 313 VLNTAALTKIETMK--TDDGFPLLRPFNQAEGGIGYTLLGFPVEEE-------DAID---IPDSPDTPVFYFGDFSKFYI 380 (434) T ss_pred EEcHHHHHHHHHhh--ccCCCEeeccCCCccCCCCceecceeeEEe-------cCcc---CccCCCceEEEEeeccceEE Confidence 99999999986432 333322111 000 0011112344454332 1111 1222323333342212222 Q ss_pred EEee-ccchhccccee-cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 299 ALAK-PIPFRMLAPQL-LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 299 ~~~v-~~~~~~~~~~~-~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .... ++.+++...-. ..-..-+..+.|..|-.|+.|++++.+.+- T Consensus 381 ~~~~g~~~i~~~~~~~~~~~~v~~~~~~r~Dgk~i~~~~~~~~~~~~ 427 (434) T protein:vir:62 381 QDVIGSLEVQKLVELFSRTNRVGFRIWNLLDAQLIHSPFEVPVYKYV 427 (434) T ss_pred EEeeceeEEEeehhhhcccCceEEEEEeeecceeecCcccceEEEEE Confidence 1111 11122221111 112234667889988889999999988665 No 76 >protein:vir:4197 Length: 314 # NCBI annotation: putative structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:88 # MgeName: psiM100 # Cross-refs: genbank:acc:NP_071822;genbank:gi:11863105;genbank:GeneID:1257607 Probab=97.78 E-value=2e-05 Score=46.40 Aligned_cols=295 Identities=10% Similarity=-0.004 Sum_probs=146.2 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |.|-+-.+++ --++. . .+.+|..+..+. .+ ++++.....-..+++..+....+... 
T Consensus 1 ~~~~~~~~~~---------------~k~it-----~-~d~~gG~L~P~~--~~-~~i~~l~e~s~i~~~a~vi~t~~s~~ 56 (314) T protein:vir:41 1 MDFLNKPFQI---------------TPKID-----V-PDLGKGILAVQR--FG-EFVREVRENSAIIKDARVLNALKSYE 56 (314) T ss_pred CchhhhHHHh---------------hcccc-----c-ccCCCceeChHH--HH-HHHHHHHhccchhhheeeecccCccc Confidence 5554433331 11122 1 233344555432 23 35554444444555555432222222 Q ss_pred eEEEEeeecc----cccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHH Q lcl|NC_011142. 81 THWNYRSYDG----AAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEE 156 (343) Q Consensus 81 ~~~~~~~~~~----~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~ 156 (343) ..+ ..... .....+.+. ....|..+..++......+.+..-+.++.+-|+.. ..|.+|...-....++.+.+ T Consensus 57 ~~i--~~i~~g~~~~~~~~~~~~-~~~~~~~~~tf~~~~l~~~kl~~~v~is~e~L~D~-a~~~~le~~i~~~~Ae~~g~ 132 (314) T protein:vir:41 57 VDI--SRISLGVELEPGRNTSGT-KVAPTADEVTVSTNTLEMKELVTKVVLEDEALEDN-IEQSAFEQTITSLLASGVTY 132 (314) T ss_pred eee--cccccCcccccccccccC-CccCCcccccccceeeeeEEEEEeecccHHHHHhh-hchhhHHHHHHHHHHHHHHH Confidence 211 11111 011112222 23356666778888888888888888887777764 35678988888899999999 Q ss_pred hhhheeeeeehhh--------cceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCee-c-ccEEEecHH Q lcl|NC_011142. 157 HSQRVAYFGDTNR--------NMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFH-T-PNTVLMFPD 226 (343) Q Consensus 157 ~~n~~~f~G~~~~--------g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~-~-p~tL~l~p~ 226 (343) .+..+.|+|+... ...|+|+.........+. -..+.+++ .+.+++..|.. .|.. . 
.-..+|+++ T Consensus 133 ~~~~~~~nGdg~~~s~~~~~~~p~G~l~~a~~~~~~~~~-~~~~~~~~---~~~~l~~sl~~--~yr~~~~~~~~~m~~~ 206 (314) T protein:vir:41 133 DLECFFLHADSSLTTGRELYRINDGWMKLAGNQYTDAEP-EDENWPLN---LFDGMMDELDT--RYLQLKPRMKFYVSNE 206 (314) T ss_pred HHHHHhhccccCCcCcccchhcchhhhhhcccceeecCc-cccccHHH---HHHHHHHhcCc--hhhcCCCceEEEecHH Confidence 9999999998642 346777765433322211 11123333 34455544432 2322 2 246889999 Q ss_pred HHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccch Q lcl|NC_011142. 227 LWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPF 306 (343) Q Consensus 227 ~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~ 306 (343) .+..+.+. ..+. +..+.+-.....-...+-|.|+. .++.....+. +. ..++| -+++++-+.+...+ T Consensus 207 t~~~~r~~-l~~~-~~~l~~~~~~~~~~~~l~G~PV~-------~~~~~~~~~~--~~--~~i~f-gd~~nlv~~~~~~i 272 (314) T protein:vir:41 207 IYNGYRKQ-LLVR-ETGLGDSALIGATGLQYDGIPIQ-------YVPALDALGD--DK--ARALL-TVPTNLVYGFWRNI 272 (314) T ss_pred HHHHHHHH-Hhcc-CCcccchhhhCCCCceecceeeE-------ecccccccCC--CC--ceEEE-echhheEEEeecee Confidence 88766432 2221 11122212222222333444432 2233222221 11 22333 34677777777777 Q ss_pred hccccee-cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 307 RMLAPQL-LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 307 ~~~~~~~-~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +..+-.. 
+.-.+.+-...|++....-.+.++..+-+= T Consensus 273 r~~~~~~a~~~~~~~~~~~r~d~~~~~~~aa~~~~~~~ 310 (314) T protein:vir:41 273 RIEPKRDAAMRRTEYIASLRADCNYEDENAAVAAVIDM 310 (314) T ss_pred EEeecccCcCCeEEEEEEEEeceEEEEcCcEEEEEeec Confidence 7664322 222455555566643333455555555444 No 77 >protein:vir:8843 Length: 317 # NCBI annotation: major head protein # Family: family:all:3919 # MgeID: mge:158 # MgeName: PaP3 # Cross-refs: genbank:acc:NP_775251;genbank:gi:27476049;genbank:GeneID:2700597 Probab=97.77 E-value=2.2e-06 Score=51.64 Aligned_cols=287 Identities=11% Similarity=0.031 Sum_probs=136.3 Q ss_pred cCcchhe---cchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccc Q lcl|NC_011142. 29 IGVPSVV---NDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLP 105 (343) Q Consensus 29 ~~~~~~~---~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip 105 (343) |..|+.+ .++...- +.+...|+..--.. ..|+..-.+.......+.|...+.....+..-..+.|.| T Consensus 1 ma~~~~~~~t~~~~g~~-------~dl~~~I~~isp~d---TPf~S~i~~~~a~~~~~~W~~d~l~~~~~~~~~EG~da~ 70 (317) T protein:vir:88 1 MATPTNAVSTVEINGKR-------EDLIDIIYNIAPYD---TPFMSAIGKGVATAITHEWQTDELRQPGKNTRVEGEDAT 70 (317) T ss_pred CCccccceEeeeeeeee-------echhhhheecCCcc---CcceeeecCceecccEEEEEeeecCCccccccccCcccc Confidence 3334332 2222111 11112222211111 112222223334444555555444333322222333333 Q ss_pred eeeeccce---eEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehh---------hccee Q lcl|NC_011142. 106 RVAQSAKL---HQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTN---------RNMSG 173 (343) Q Consensus 106 ~v~~~~~~---~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~---------~g~~G 173 (343) ........ -..+|+.-...++.+.+-...+ ..-+........+..++.+.++..+++|.+. 
..+-| T Consensus 71 ~~~~~~r~~~~N~tQIf~k~v~VSgTa~av~~~--G~~~ela~q~~kk~~EikrdmE~~li~g~~a~~~~~~t~~r~~~G 148 (317) T protein:vir:88 71 IKAGSFTTMLNNYCQISDETLQVTGTADRVKKA--GRKNELAYQLAKKSKELKLDMEYALVGAPQAKVQRNTTTPGQMAN 148 (317) T ss_pred cccccCCEEeccEEEEEEeEEEEeehhhhhhhc--CccchhHHHHHHHHHHHHHHHHHHHhcCeeeccCCCCccchhhhh Confidence 32222221 1234555555555555554333 2245566677778888999999999999642 23456 Q ss_pred eeec---------CCc-cccccCcCccccCHHHH-HHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCc Q lcl|NC_011142. 174 LLNN---------PNV-TKTSATVNYATCTGQEL-FDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDR 242 (343) Q Consensus 174 LlN~---------p~v-~~~~~~~~w~~~t~~~i-~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~ 242 (343) |++. +|. +....+..|...|+..+ -++|++++.++|...+ .|..+.++|.+-..|+.- +.++. T Consensus 149 l~~~i~t~~~~~~~g~~~~~~~~~~~t~~t~~~lte~~l~~~l~~i~~~Gg---~~~~i~v~a~~k~~i~~~-~~~~~-- 222 (317) T protein:vir:88 149 IFAYYKTNGSLGANGVAPVGDGSNTGTAGDLRLLTEDMLLNASESIWRNGG---QANSIQTSSSIKKAISKN-MKGRA-- 222 (317) T ss_pred HHHHhccCceeccCccccccCCCccccccccccccHHHHHHHHHHHHhcCC---CCCEEEeChHHHHHHHHH-hcCCc-- Confidence 6543 221 11223334544333322 2568899999998532 578899999988888643 11111 Q ss_pred cHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCceeEeee Q lcl|NC_011142. 243 TVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLGITVPA 322 (343) Q Consensus 243 tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~ 322 (343) +........+.+...+..-+.-.....++. .... ..|.++++ |++.+++..-.|+...+.-..+-+-+.-. T Consensus 223 ~~i~~~~~~~~~g~~v~~~~tdfG~v~ii~-~r~l------p~~~~~~~--D~~~~~l~~Lr~~~~e~laKtGd~~k~~i 293 (317) T protein:vir:88 223 TEITLDASDNRIAQTVDVYESDFGKYTIRA-NRWF------HENTLFVF--DPKMHSLCYLRPFFQHELAKTGDSEKRQL 293 (317) T ss_pred eeEEEcccCeEEEEEEEEEEeCCeEEEEEe-CCCC------CCCeEEEE--cccccceeecccceeeccCCCcccceeEE Confidence 100000000111111100001111111111 1111 12444444 68888887766665555554454444444 Q ss_pred eeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
323 EYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 323 ~~~~gGv~i~~P~ai~~~dGI 343 (343) +.. .|++++-|.+.+...|| T Consensus 294 ~~E-~tLe~~N~~a~a~i~~l 313 (317) T protein:vir:88 294 LVE-YTFRVNNEKSGALIRDV 313 (317) T ss_pred EEE-EEEEEcCccceeEEEEe Confidence 554 57999999999999999 No 78 >protein:vir:4159 Length: 315 # NCBI annotation: structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:87 # MgeName: psiM2 # Cross-refs: genbank:acc:NP_046968;genbank:gi:9630538;genbank:GeneID:1261712 Probab=97.75 E-value=2e-05 Score=46.29 Aligned_cols=301 Identities=11% Similarity=0.049 Sum_probs=140.5 Q ss_pred eeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhH-HHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEE Q lcl|NC_011142. 5 RVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYI-SQLASLETTVYEVPYADITYLEDVPVLANIPEYATHW 83 (343) Q Consensus 5 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~-~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~ 83 (343) -.-|| .+-+++ ... ...++++ ++.+|.++. +++.++-..+.| .-..+++..+.+........+ T Consensus 1 ~~~~~--~~~~~~-~~~---~~k~~t~------~d~~Gg~l~P~~~~~~i~~~~e----~s~~l~~~~vi~~~~~~~~~i 64 (315) T protein:vir:41 1 MLTIE--DIRGGK-PFE---IVPKIDV------PDLGRGVLSVDRFGEFVKAVRD----SAVIIPEARIDNALKSYEKDI 64 (315) T ss_pred Ccccc--hhhcCC-hhh---hhhhcCC------cCCCCceechHHHHHHHHHHHh----hhhhhhhceeeeccccccccc Confidence 00011 111111 111 1112221 123344444 334433334444 223344443322111111110 Q ss_pred ---EEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhh Q lcl|NC_011142. 84 ---NYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQR 160 (343) Q Consensus 84 ---~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~ 160 (343) .+.+....| ..+.+. ..+.+..+..++....+...+..-..++.+-|+. ...|.++...-....++++++.++. 
T Consensus 65 ~~~g~~~~~~~g-~~~~~~-~~~~~~~~~~f~~~~l~~~~l~~~~~it~elL~D-~~~~~~~e~~l~~~~a~~~a~~~~~ 141 (315) T protein:vir:41 65 SRLSLVLDVGPG-RDETGQ-KLAPPESTAEVKTNTLYMREMVTKVVIHEDAIED-NIEGKAFEQKIVTLLGEGISYVLEK 141 (315) T ss_pred cccccCcccccc-cccccC-cCCCCCCccccceeeeceeeeeeeccccHHHHHh-hhccccHHHHHHHHHHHHHHHHHHH Confidence 000000001 112222 2234444555677777777777777777777775 4456789999999999999999999 Q ss_pred eeeeeehhh------cceeeeecCCccccccCcCcccc-CHHHHHHHHHHHHHHHHHhcCCee--cccEEEecHHHHHHH Q lcl|NC_011142. 161 VAYFGDTNR------NMSGLLNNPNVTKTSATVNYATC-TGQELFDLLNNPVFAVVKASKRFH--TPNTVLMFPDLWKRA 231 (343) Q Consensus 161 ~~f~G~~~~------g~~GLlN~p~v~~~~~~~~w~~~-t~~~i~~di~~~~~~l~~~s~g~~--~p~tL~l~p~~~~~L 231 (343) ..|+|+... ...|+|+...........++++. .+.+.+.|+...+.. .+.. ..-.++|+++.+..+ T Consensus 142 ~~~nGdg~s~~p~~~~~~G~l~~a~~~~~~~~~~~~a~~~~~d~l~~l~~sl~~-----~yr~~~~~~~~imn~~t~~~~ 216 (315) T protein:vir:41 142 YYLHGDTSSSDPLLRMSDGWLKLASEKLTESDVDPEAEDWPMNLFDTMIESLPT-----PYRNNLPNMKFYVTWDIYRAY 216 (315) T ss_pred HhhccCCcCcCccccccccceecccccccccccccccccccHHHHHHHHHhcCh-----HHhhcCCceEEEEcHHHHHHH Confidence 999998753 45788887765444333344432 223334343333322 2222 123688999988776 Q ss_pred hccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc Q lcl|NC_011142. 232 SSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP 311 (343) Q Consensus 232 ~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~ 311 (343) .+-. ++.+.-+++-.....-...+-|.|+. ..+.....+. + +-.+++. 
+.+++-+.+...++..+- T Consensus 217 rklk--~~~g~~lw~~~~~~g~~~tl~G~PV~-------~~~~m~~~~~---~-~~~ilf~-d~~nl~~~~~~~i~i~~~ 282 (315) T protein:vir:41 217 RDAL--KGRETGLGDQALTGANSILYDGRPVQ-------YVPALEALND---G-KSRALFV-VPTQLVYGFWRNIKVVPD 282 (315) T ss_pred HHHh--ccCCCccccchhhcCCCceecccceE-------ecccccccCC---C-CccEEEe-cccceEEEeccccEEEee Confidence 4322 22222222111112222233344432 2222111111 1 1123333 345555555455555433 Q ss_pred ee-cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 QL-LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 ~~-~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .. +.-.+.+-...|+++-.+-...+++.+-.| T Consensus 283 ~~a~~~~~~~~~~~r~d~~~~~~~~~a~~~~~v 315 (315) T protein:vir:41 283 YDAEMRLTKYVASLRTDNHYEDEEGAVSATITV 315 (315) T ss_pred ecCCCCceEEEEEEEeceeEEeccceeEeeeeC Confidence 22 222344545567766556677888888888 No 79 >protein:vir:3991 Length: 404 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:319 # MgeName: BK5-T # Cross-refs: genbank:acc:NP_116499;genbank:gi:14251132;genbank:GeneID:921252 Probab=97.62 E-value=1.9e-05 Score=46.46 Aligned_cols=302 Identities=8% Similarity=-0.035 Sum_probs=138.9 Q ss_pred CCcceeccchhhhhchhhhchhccccc--------ccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccc Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNA--------TIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPV 72 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~--------~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v 72 (343) ..+.......+..........++-... 
....-++..-..+.|.++.+ +.+.+.|++..+.....+.++.+ T Consensus 75 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e~~a~~~~t~~~gg~~iP--~~~~~~ii~~~~~~~~l~~~~~~ 152 (404) T protein:vir:39 75 REEEKGPLNKSEYELKDKFVKEFVNMVRNPMAFLNTVSSKTETSGSDSAAGLTIP--QDIRTMINTLVRQYDSLQQYVRV 152 (404) T ss_pred ccccccccccchhhhHHHHHHHHHHHHhcchhhhhhhhhhhhhcccccCCceecc--HHHHHHHHHHHHhhhhHHhhcce Confidence 000000000000000000000000000 00000011111233444444 45556788877777666666654 Q ss_pred cCCCCcceeEEEEe-eecccccceeecCCcCccce-eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHH Q lcl|NC_011142. 73 LANIPEYATHWNYR-SYDGAAMGKFISANASDLPR-VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELA 150 (343) Q Consensus 73 ~~~~~~~~~~~~~~-~~~~~G~a~~~~~~~~dip~-v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA 150 (343) . +.+.+...+.+. ..+..+.+.+++..+ .+|- ....++.....+..++.-+.+|..=++. ...+|..--.... T Consensus 153 ~-~~~~~~~~~~~~~~~~~~~~a~~v~Eg~-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l 227 (404) T protein:vir:39 153 E-SVSTSNGSRVYEKWTDVTPLTVMDAEDG-KIPDLDNPRLTIIKYLIKRYAGIITATNTLLKD---TAENILAWLSSWI 227 (404) T ss_pred e-eccCCcceEEEEeecCCccceeeecCcc-ccccccccceeeEEeeeeeEEeeehhHHHHHhh---chHHHHHHHHHHH Confidence 2 222233333333 224446667777653 3564 3467888889999999888877654433 3456777788889 Q ss_pred HHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHH Q lcl|NC_011142. 151 FRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKR 230 (343) Q Consensus 151 ~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~ 230 (343) ++++.+.+|+.+++|+.... +... ..+.+ |+.+++...... .......++|+|+.|.. 
T Consensus 228 ~~~~~~~~d~~il~g~g~~~----------~~~~------~~~~~----~i~~~~~~~~~~--~~~~~a~~v~n~~~~~~ 285 (404) T protein:vir:39 228 AKKVVVTRNQAIIAAMGTVP----------KKPT------IAKFD----DVITMINTSVDP--AIIATSSLLTNQSGLNK 285 (404) T ss_pred HHHHHHHHHHHHHhcccccc----------cccc------cccHH----HHHHHHHHhhhh--hhccCCEEEEcHHHHHH Confidence 99999999999999975321 1111 12233 344443321111 11223469999999999 Q ss_pred HhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc Q lcl|NC_011142. 231 ASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA 310 (343) Q Consensus 231 L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~ 310 (343) |..-. +..|.-++.--..+.....+-|.|+.+... ...+....+...+++.+.+ +.+.+..-..++... T Consensus 286 L~~lk--d~~G~~l~~~~~~~~~~~~l~G~pV~~~~~--------~~~~~~~~~~~~~~~gd~~-~~~~~~~~~~~~i~~ 354 (404) T protein:vir:39 286 LALVK--TAEGKYLLEPDPTKPNSYLIKGKKVIVVAD--------RWLPNSGSTVYPLYYGDMS-QAITLFDRENMSLLP 354 (404) T ss_pred HHHhh--ccCCceeeccCcCCCCcceecceeEEEecc--------cccCccCCCccEEEEEecc-ccEEEEeecceEEEE Confidence 97532 333332221000011111233444322110 0111111122222222222 333333223333221 Q ss_pred ce-ec----CceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 311 PQ-LL----GLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 311 ~~-~~----~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .. .. 
.-...+.+..++ |+.+++|.+++.++.- T Consensus 355 ~~~~~~~~~~~~~~~r~~~r~-d~~~~~~~a~~~~~~~ 391 (404) T protein:vir:39 355 TNIGAGAFETDTTKIRVIDRF-DVKTTDSEALVAGSFT 391 (404) T ss_pred eccchhhhhhceeeEEEEeee-ccEEecccceEEEEee Confidence 11 11 112455667877 5789999999999877 No 80 >protein:vir:4511 Length: 409 # NCBI annotation: capsid # Family: family:all:21 # MgeID: mge:97 # MgeName: V # Cross-refs: genbank:acc:NP_599037;genbank:gi:19548995;genbank:GeneID:935211 Probab=97.58 E-value=3.2e-05 Score=45.25 Aligned_cols=305 Identities=12% Similarity=0.076 Sum_probs=140.1 Q ss_pred CCcceeccchhhhhc-hhhh----chhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAG-NRWL----NKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN 75 (343) Q Consensus 1 ~~~~~~~~~~~~~~~-~~~~----~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~ 75 (343) +.+++-.+. +-+-+ ...+ ...+....+++. ..| ..|.++.. +.+...|++........+.+..+..- T Consensus 85 ~~~~~~a~~-~~l~~~~~~~~~~e~~~~~~~~a~~~---~~~--~~gg~liP--~~~~~~ii~~~~~~~~l~~~~~~~~~ 156 (409) T protein:vir:45 85 DEKRAQVFD-KWMRHGASELTSEERKALRELRAQGV---AQD--EKGGYTVP--ETFLAKVVEKMKSYGGIASVAQILTT 156 (409) T ss_pred hHHHHHHHH-HHHHhhhhhccHHHHHHHHHHhhccC---ccC--cCCceecc--HhHHHHHHHHHHhhhhhhhhceeeec Confidence 111111111 00100 0000 000000001110 122 22444444 34456777777666666655544221 Q ss_pred CCcceeEEEEeeecccc-cceeecCCcCccceeeeccceeEEEEEEEEEE-EeecHHHHHHHHHhCCCccHHHHHHHHHH Q lcl|NC_011142. 76 IPEYATHWNYRSYDGAA-MGKFISANASDLPRVAQSAKLHQVELGYAGVE-CHYSLDELRTTAAVNMPIDSMQAELAFRG 153 (343) Q Consensus 76 ~~~~~~~~~~~~~~~~G-~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~-~~~~~~El~~a~~~g~~l~~~k~~aA~~~ 153 (343) .+ .....+...+..+ .+.+++.. ..+|..+..+.......+..... 
..+|.+=++.+ ..+|..--....+.+ T Consensus 157 ~~--~~~~~~~~~~~~~~~~~~v~E~-~~~~~~~~~f~~~~l~~~k~~~~~i~is~ell~ds---~~~l~~~i~~~la~a 230 (409) T protein:vir:45 157 SD--GRTMEWATADGTSEVGVLLGEN-EEAGEEDTDFGMGSLGALKMTSKIIRVSNELLQDS---AIDMEAYLARRIAER 230 (409) T ss_pred CC--CceEEEEeeccCcccccccccc-ccccccccccceeeeeeeeeeeeehhhhHHHHhcc---HHHHHHHHHHHHHHH Confidence 11 1223334444332 34455554 34676676677666655554433 34554433332 446777777788999 Q ss_pred HHHhhhheeeeeehh---hcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeeccc-EEEecHHHHH Q lcl|NC_011142. 154 SEEHSQRVAYFGDTN---RNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPN-TVLMFPDLWK 229 (343) Q Consensus 154 ~~~~~n~~~f~G~~~---~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~-tL~l~p~~~~ 229 (343) +.+.+|+.+++|+.. .+..|+++.......++.. .+.+ ++||.+++..|.. .+...+. .+++++..+. T Consensus 231 ~~~~~~~a~l~G~G~~~~~~p~Gil~~~~~~~~~~~~--~~~~----~d~i~~l~~~l~~--~~~~~a~~~~~~n~~~~~ 302 (409) T protein:vir:45 231 IGRGEARYLIQGTGAGTPKQPKGLAASVTGTTQTAAA--NAVK----WQEILALKHSIDP--AYRRGPKFRLAFNDNTLK 302 (409) T ss_pred HHHHHHHHhhccCCCCCccccceeeeccccccccccc--cccc----hHHHHHHHHhhhh--hhccCCeEEEEECHHHHH Confidence 999999999999864 3678999887643322211 1123 4566666666643 2333333 4678999998 Q ss_pred HHhccccCCCCCccHHH-HHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhc Q lcl|NC_011142. 230 RASSLLMTGYTDRTVIE-HFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRM 308 (343) Q Consensus 230 ~L~~~~~~~~~~~tvle-~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~ 308 (343) .|..- .+..|.-++. -+....+ ..+-|.|+.+. .... ..+ .+.+ .++|-+=.+++ +..-..++. 
T Consensus 303 ~l~~l--kd~~G~~i~~~~~~~~~~-~~l~G~PV~~~-------~~~p--~~~-~~~~-~i~~Gd~~~~~-i~~~~~~~~ 367 (409) T protein:vir:45 303 LISEM--EDGQGRPLWLPDIVGVAP-ASVLNVPYVID-------QEID--DIG-AGKK-FMFCGDFDRFI-IRRVRYMIL 367 (409) T ss_pred HHHHh--hcCCCceeeccCcCCCCC-ceecceeeEEe-------cCcC--Ccc-CCcc-EEEEeehhhhh-eeeccceEE Confidence 88642 2434432221 0111111 12334444322 1111 111 1223 24442211222 111112211 Q ss_pred --cccee-cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 309 --LAPQL-LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 309 --~~~~~-~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ..... ..-...+.+..|+ |..+.+|.|++.+.+= T Consensus 368 ~~~~d~~~~~~~~~~~~~~r~-d~~~~~~~A~~~l~~k 404 (409) T protein:vir:45 368 KRLVERYAEYDQTGFLAFHRF-DCILEDTSAIKALVGK 404 (409) T ss_pred EEeecccccCCcEEEEEEEEe-ccEeechhheEEEEec Confidence 11101 1112345667777 5668999999987765 No 81 >protein:vir:3613 Length: 272 # NCBI annotation: MHP # Family: family:all:522 # MgeID: mge:74 # MgeName: TP901-1 # Cross-refs: genbank:acc:NP_112699;genbank:gi:13786567;genbank:GeneID:921035 Probab=97.55 E-value=2.9e-05 Score=45.50 Aligned_cols=267 Identities=8% Similarity=-0.064 Sum_probs=126.2 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~~d 103 (343) |++. .+..++ .+-.+-|.. -|.+.....+....+..+... 
+.+| .++.++.+...|.+..+.++ ++ T Consensus 1 ma~~----~T~~~d---~iiPev~~~---~v~~~~~~~~~~~~~~~~~~~l~g~~G-~ti~iP~~~~~gda~~~~eg-~~ 68 (272) T protein:vir:36 1 MSKQ----KTTLAD---LVNPEVLAP---IVSYELNKALRFAPLAQVDTTLQGQPG-NTLKFPAFTYIGDAADVAEG-GE 68 (272) T ss_pred CCCc----ceehhh---hhchHHHHH---HHHHHHHhhhhhccccccccccccCCC-CEEEEeeeccCccccccCCC-Cc Confidence 2210 011111 111222321 233333334444454444333 2233 35677777778888877776 46 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccc Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKT 183 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~ 183 (343) ++..+...+.....+.+.+.+|.+ .|++.++. +-++-..-...++..+++..|+-++-.-. | .... T Consensus 69 i~~~~lt~~~~~~~i~~~~k~~~v--tD~~~~~~-~~d~~~~~~~~~a~~~a~~~d~~i~~~l~-----~------~~~~ 134 (272) T protein:vir:36 69 ISLDKIGTTTKSVTIKKAAKGTEI--TDEAALSG-YGDPIGESNKQLGLSLANKVDDDLLSAAK-----T------TSQT 134 (272) T ss_pred cChhhcCCcceeEeeehhhccccc--cHHHHhhc-cchHHHHHHHHHHHHHHHHHHHHHHHHhc-----c------cccc Confidence 888888888888888887766555 55555443 44555556666777788888765542111 1 0111 Q ss_pred ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeecccccc Q lcl|NC_011142. 184 SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPID 263 (343) Q Consensus 184 ~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~ 263 (343) ..+ ..+ +++|.+++..+-.. ...+..++++|..+..|.+-...........+-+..|+..-.+.|.++. 
T Consensus 135 ~~~----~~~----~d~i~~A~~~lgd~---~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~~~~~~~~G~ig~~~G~~Vv 203 (272) T protein:vir:36 135 VST----KAN----VDGVQAALDIFNDE---DAQAYVLIVNPKDAAKIRKDANAKNIGSEVGANALINGTYADVLGAQIV 203 (272) T ss_pred ccc----ccc----HHHHHHHHHHhhhc---CCCceEEEEcHHHHHHHhcccccccccccccccceeeeccceecCeeEE Confidence 111 112 45667777776543 2346889999999999864211111111111111122222223333321 Q ss_pred ccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeee-- Q lcl|NC_011142. 264 IKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYV-- 340 (343) Q Consensus 264 i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~-- 340 (343) .+.. + +.++.....|..-+.-+.+....+++.-.- .+....-.+.... ..|+.+.+|.+++.+ T Consensus 204 ~s~~-------~------p~~~~~~~~~~~~~gA~~~~~~~~~~vE~~R~~~~~~d~i~~~~-~y~~~v~~~~~vv~~t~ 269 (272) T protein:vir:36 204 RSKK-------L------AEGSALMFKIVSNSPALKLVLKRGVQVETDRDIVTKTTVITADE-HYAAYLYDLTKVVNITF 269 (272) T ss_pred EeCC-------C------CCCceeEEEEEecccceeeeecCCcccccccchhhcCcEEEEEE-EEEEEEEcCccEEEEee Confidence 1111 0 011122222222222222222222222111 1111223333333 468999999976664 Q ss_pred ccC Q lcl|NC_011142. 341 DML 343 (343) Q Consensus 341 dGI 343 (343) .|+ T Consensus 270 ~g~ 272 (272) T protein:vir:36 270 TGV 272 (272) T ss_pred cCC Confidence 688 No 82 >protein:vir:101607 Length: 379 # NCBI annotation: major capsid protein precursor # Family: family:all:585 # MgeID: mge:1646 # MgeName: 11b # Cross-refs: genbank:acc:YP_112497;genbank:gi:53793597;uniprot:Q5ZGF6;genbank:GeneID:3101715 Probab=97.53 E-value=4.1e-05 Score=44.63 Aligned_cols=292 Identities=12% Similarity=0.032 Sum_probs=136.4 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcc---hhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVP---SVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIP 77 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~ 77 (343) .++..... 
+.+..+.............+.+ .+....+.++.+. +.+.+.|++........+.++.+.+ T Consensus 75 ~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ip----~~~~~~ii~~~~~~~~i~~~~~~~~--- 145 (379) T protein:vir:10 75 EDKSDSLV--KSITENFNDIKEVRNGKSIQVKAVGDMTLPVNLTGAQP----KDYNFDVVLNPSQMLNVSDIVGAVS--- 145 (379) T ss_pred cccchhHH--HHHHHHHHhHHHHHhhhhhhhhhhcccccCCCCccccc----hhhhhHHHHhHHhhhhHHhhceeee--- Confidence 11111111 1111110000000000000000 1111222223322 3445567776666666666665432 Q ss_pred cceeEEEEeeecccccc--eeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHH Q lcl|NC_011142. 78 EYATHWNYRSYDGAAMG--KFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSE 155 (343) Q Consensus 78 ~~~~~~~~~~~~~~G~a--~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~ 155 (343) ....++.|......+.+ .+++. +...|..+..++.....++.++.-+.+|.+=|+.+. .|..--....++.++ T Consensus 146 ~~~~~~~~~~~~~~~~~~~~~v~E-g~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~D~~----~l~~~i~~~la~~~~ 220 (379) T protein:vir:10 146 ISGGTYTFVRENGAGEGAIGAQVE-GATKGQKDYDISMIDVNTDFIAGFTRYSKKMANNLP----FLTSFIPNALRRDYA 220 (379) T ss_pred ccCCceEEEEeecCCCcccccccC-CccccccccceeeeEeeeeeEEeeehhhHHHHhhHH----HHHHHHHHHHHHHHH Confidence 22334555554444333 33443 456888888899999999999998888765444331 266666666778888 Q ss_pred HhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccc Q lcl|NC_011142. 156 EHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLL 235 (343) Q Consensus 156 ~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~ 235 (343) +.+|.-++.|....+..+... .++. .-+++|.+++..+... ...+..++|+|..|..|.... 
T Consensus 221 ~~~~~~~~~g~~~~~~~~~~~---------~~~~------~~~d~i~~~~~~~~~~---~~~~~~~vmn~~~~~~l~~lk 282 (379) T protein:vir:10 221 KAENAAFNAVLAANATASTEI---------ITNK------NKVEMLINEIAKQENL---DFPVTAIVLRPTDYYDILVTQ 282 (379) T ss_pred HHHHHHHhccccccccccccc---------ccCc------ccHHHHHHHHHhhhhc---cCCCCEEEEcHHHHHHHHHhh Confidence 888888777765433222111 1111 1145677777666532 235678999999999986533 Q ss_pred cCCCCCccHHH--HHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEE--eeccchhcccc Q lcl|NC_011142. 236 MTGYTDRTVIE--HFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLAL--AKPIPFRMLAP 311 (343) Q Consensus 236 ~~~~~~~tvle--~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~--~v~~~~~~~~~ 311 (343) +..|.-++. ....+.....+-|.|+... ... +.|+ +++-+.+.=.+.+ .+...+..... T Consensus 283 --d~~G~~l~~~~~~~~~~~~~~l~G~pvv~s-------~~~------~ag~--~~~gdf~~~~~~~~~~~~i~~~~~~~ 345 (379) T protein:vir:10 283 --KSVGAGYGLPGVVTQDNGVLRINGIPLFRA-------TWL------AANK--YYVGDWTRVTKVTTEGLSLEFSEVEG 345 (379) T ss_pred --ccCCceeccCCccCCCCCcceecceeeEec-------CCC------CCCc--eEEeecccEEEEEEeceEEEEeeccc Confidence 333322111 0000111012224443222 111 1111 1111111101111 01111111111 Q ss_pred --eecCceeEeeeeeeeeeEEEECcceeee--eccC Q lcl|NC_011142. 
312 --QLLGLGITVPAEYKISGTEYRYPLCAQY--VDML 343 (343) Q Consensus 312 --~~~~~~~~~~~~~~~gGv~i~~P~ai~~--~dGI 343 (343) -.++ ...+.+..|+ |+.+++|.|+++ +.+| T Consensus 346 ~~f~~~-~~~~r~~~R~-~~~v~~p~a~v~~~~~~~ 379 (379) T protein:vir:10 346 TNFVKN-NITARIEAQV-ALAVEQPAALIFGDFTAV 379 (379) T ss_pred ccccCC-cEEEEEEEEe-ccEEecCccEEEEEecCC Confidence 1122 3456677887 688889999999 7788 No 83 >protein:vir:4953 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:108 # MgeName: Sfi19 # Cross-refs: genbank:acc:NP_049929;genbank:gi:9632900;genbank:GeneID:1262076 Probab=97.53 E-value=3.2e-05 Score=45.19 Aligned_cols=300 Identities=8% Similarity=-0.033 Sum_probs=142.8 Q ss_pred CCcceeccchhhhhchhhhchhcccccccC---cchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIG---VPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIP 77 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~ 77 (343) ..+++...+...-..+.+.+.......+-. ..++..-....|.++.+ +.+.+.|++........++++.+.. .+ T Consensus 73 ~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~t~~~gg~~vP--~~~~~~ii~~~~~~~~l~~~~~~~~-~~ 149 (397) T protein:vir:49 73 EEEKKPLTKSEEEVKAGFVKDFKNLVRGRYQNLLDSKTDASGSDAGLTIP--QDIQTAIHTLVSQYDSLQEYVNVEN-VT 149 (397) T ss_pred cccccccccchhHHHHHHHHHHHHHHhcchhHHHHHhhccccccCccccc--HhHHHHHHHHHHhhhhHHhhhceee-cc Confidence 111222222221112222111111100000 00011111123444444 3455678887777766666655432 11 Q ss_pred cceeEEEEee-ecccccceeecCCcCccce-eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHH Q lcl|NC_011142. 78 EYATHWNYRS-YDGAAMGKFISANASDLPR-VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSE 155 (343) Q Consensus 78 ~~~~~~~~~~-~~~~G~a~~~~~~~~dip~-v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~ 155 (343) .....+.|.. .+..+.+.+++.++ .+|- ....++.....++.++.-+.+|.+=++. 
...++..--....+++++ T Consensus 150 ~~~~~~~~~~~~~~~~~a~~v~E~~-~~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~~~~~ 225 (397) T protein:vir:49 150 TLTGSRVYEKWTDITGLANIDDEAG-KIADVDDPKLSLIKYTIKRYAGISTVTNSLLAD---SAENILAWLSGWIAKKVV 225 (397) T ss_pred cCccceEEEeeccCCcceeeecCcc-ccccccccceeeEEeeeeeEEeeehhHHHHHhh---hHHHHHHHHHHHHHHHHH Confidence 1222233333 23456677877654 3553 4567888888999999888876543333 245677777888899999 Q ss_pred HhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccc Q lcl|NC_011142. 156 EHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLL 235 (343) Q Consensus 156 ~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~ 235 (343) +.+|+-+++|+......+ . . .+ +++|.+++..+... + .....++|+|+.|..|..- T Consensus 226 ~~~d~ai~~G~g~~~~~~----------~-~-----~~----~d~i~~~~~~l~~~--~-~~~a~~vmn~~~~~~l~~l- 281 (397) T protein:vir:49 226 VTRNKAILEAIAALPTKP----------T-L-----TK----WDDIIDLEAKVDPA--I-KQTSFFLTNTSGFTALKKV- 281 (397) T ss_pred HHHHHHHHhhcccccccc----------c-c-----cc----HHHHHHHHHhhhhh--h-cCCCEEEEcHHHHHHHHHh- Confidence 999999999975422110 0 0 11 35666677666542 2 3346799999999999643 Q ss_pred cCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc--cc-- Q lcl|NC_011142. 236 MTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML--AP-- 311 (343) Q Consensus 236 ~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~--~~-- 311 (343) -+..|.-++.--..++....+-|.|+.+... .....+..+ +..++|-+=.+.+.+..-..++.. +. 
T Consensus 282 -kd~~G~~l~~~~~~~~~~~~l~G~PV~~~~~--------~~~~~~~~~-~~~i~~gd~~~~~~~~~~~~~~i~~~~~~~ 351 (397) T protein:vir:49 282 -KNALGDYLMERDVKSPTGYSIDGFAVKEVAD--------RWLANGTGG-AMPLYFGDLKQAVTLFDRQHMSLLSTNIGG 351 (397) T ss_pred -hcCCCceeeccCcCCCCCceecceeeEEecc--------cccccccCC-ceeEEEeeccceEEEEeecceEEEEecccc Confidence 2333332221000011111233444322110 011111112 223444332233333221222221 11 Q ss_pred --eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 312 --QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 312 --~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.++ ...+.++.++ ++.+++|.+++.++.= T Consensus 352 ~~~~~~-~~~~r~~~r~-d~~~~~~~a~~~~~~~ 383 (397) T protein:vir:49 352 GAFETD-TTKVRVIDRF-DVVATDTEAFVPASFK 383 (397) T ss_pred chhhcC-ceeEEEEeee-CcEEecccceEEEEee Confidence 0111 2344556676 5688999999887644 No 84 >protein:vir:6242 Length: 390 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:131 # MgeName: phi-BT1 # Cross-refs: genbank:acc:NP_813696;swissprot:trembl:q859c1;genbank:gi:29366756;interpro:IPR006444;uniprot:Q859C1;genbank:GeneID:1258897 Probab=97.50 E-value=2.3e-05 Score=46.06 Aligned_cols=301 Identities=9% Similarity=0.002 Sum_probs=144.7 Q ss_pred CCcceecc-chhhh-hchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCc Q lcl|NC_011142. 1 MSEKRVVI-DAQTI-AGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPE 78 (343) Q Consensus 1 ~~~~~~~~-~~~~~-~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~ 78 (343) =+++.... +.+.+ .|.......... ...... 
.+.+.+++.+...+ ....+.+........+.+..+..-.+ T Consensus 80 ~~~~~~~~~~~~~~r~~~~~~~r~~~~--~~~~~~-~t~~~~g~~~~~~~---~~~~i~~~~~~~~~l~~~~~~~~~~~- 152 (390) T protein:vir:62 80 GAQRSADVDDDATLRAGNLGEARSFEF--APEKRD-GTKAGNPNVLSRTL---YGQLIAQAVERSAIMRGGATTFTTSD- 152 (390) T ss_pred cchhhcchHHHHHHhhhhhhhhHHHHh--hhhhhc-ccccCCCccccccc---hHHHHHHHHhhhhhhhhcceeeecCC- Confidence 00010000 11111 111000000000 000000 12222223322222 23334443333333333333321111 Q ss_pred ceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhh Q lcl|NC_011142. 79 YATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHS 158 (343) Q Consensus 79 ~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~ 158 (343) ...+.+.+....+.+.+++.. ..+|..+..++........++.-+.+|.+=|+. ...++..--....+.+++..+ T Consensus 153 -~~~~~~p~~~~~~~a~wv~E~-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~~~i~~~~ 227 (390) T protein:vir:62 153 -ANPLDFTVITGRSSASIVGET-AEIPESYPATAQRSMGGFKYGFASVVSYEFATD---QVLDLVGFLVSDAGPAIGDAM 227 (390) T ss_pred -CceeEEEEEcCCcceeeeccc-ccccccccceeeeEeeeeeEEeehHHHHHHHhh---hhHHHHHHHHHHHHHHHHHHH Confidence 123556667777788887754 458888888999999999999888887655554 355677778888899999999 Q ss_pred hheeeeeehhhcceeeeecCCccccccCcCcc-ccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccC Q lcl|NC_011142. 159 QRVAYFGDTNRNMSGLLNNPNVTKTSATVNYA-TCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMT 237 (343) Q Consensus 159 n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~-~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~ 237 (343) |+-+++|+.. -.|++|+++....+.....+ ..+ +++|.+++.+|.. .+.. .-..+|+|+.+..|..-. 
T Consensus 228 d~~~l~G~G~--p~Gi~~~~~~~~~~~~~~~~~~~~----~~~l~~~~~~l~~--~~~~-~a~~vmn~~~~~~L~~lk-- 296 (390) T protein:vir:62 228 GRHFITGTGQ--PRGILTDASPATATFLATDTDSKV----SDALIDLFHEVPS--AYRA-NAKYVVNDLRAAQMRKLK-- 296 (390) T ss_pred HhhhhccCCc--cccccccccccccceecccccccc----hHHHHHHHHhhhh--hhhc-CCEEEEchHHHHHHHHhh-- Confidence 9999999853 37999988754433222222 123 4566666666543 2222 235899999999996432 Q ss_pred CCCCccHH-HHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc-ceec- Q lcl|NC_011142. 238 GYTDRTVI-EHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA-PQLL- 314 (343) Q Consensus 238 ~~~~~tvl-e~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~-~~~~- 314 (343) +..+.=++ .-+. +.....+-|.|+.+... .. .+. ++|-+ -..+-+..-..++... .+.. T Consensus 297 d~~g~~l~~~~~~-~g~~~~l~G~Pv~~~~~----------~p-----~~~-i~~gd-~s~~~i~~~~~~~v~~~~~~~~ 358 (390) T protein:vir:62 297 DANGQYLWQSGLT-VGAPSLFNGKVVETDDG----------MP-----ADK-ILFAD-LSKYRVRFAGSLRVDRSVDAKF 358 (390) T ss_pred ccCCCeeecCCcC-CCccceecccceEEecC----------CC-----Ccc-EEEee-ccceeEEeecceEEEeeccccc Confidence 33332111 0011 11112233444332111 11 111 22211 1111121112222211 1111 Q ss_pred -CceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 315 -GLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 315 -~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .-...+..+.|++ +.+.+|.|++.+..= T Consensus 359 ~~~~~~~~~~~r~d-~~~~~~~A~~~l~~~ 387 (390) T protein:vir:62 359 STDQIVYRFLQRAD-GLLVDARGAKVLTVT 387 (390) T ss_pred cCCcEEEEEEEEeC-cEeechhheEEEEee Confidence 1124455678875 679999998888855 No 85 >protein:vir:107593 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1491 # MgeName: Gamma # Cross-refs: genbank:acc:YP_338188;genbank:gi:77020144;genbank:GeneID:3703724 Probab=97.40 E-value=5e-05 Score=44.16 Aligned_cols=300 Identities=9% Similarity=-0.005 Sum_probs=133.9 Q ss_pred CCcceeccch--------hhhhchh---hhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhh Q lcl|NC_011142. 
1 MSEKRVVIDA--------QTIAGNR---WLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLED 69 (343) Q Consensus 1 ~~~~~~~~~~--------~~~~~~~---~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~ 69 (343) .-|.+..-.. +.+.+.. .....+..... ...+..-..++|.++.. +.+.+.|++.....-..+++ T Consensus 64 ~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~--~~~~~~~t~~~gg~~vP--~~~~~~ii~~~~~~s~l~~~ 139 (392) T protein:vir:10 64 EVETRNVDGEMEYRDVFMKALRNKPLNAEEREFLEDDLE--QRAMSGLTGEDGGLVIP--QDIQTQINELARSFDALEQY 139 (392) T ss_pred cccccCccchHHHHHHHHHHHhcccccHHHHHHHhhhhh--hhhccccccCCCceecc--hhHHHHHHHHHHhhhhhhhh Confidence 0000000000 0000000 00000000000 00000001123444444 35556677766666555555 Q ss_pred ccccCCCCcceeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHH Q lcl|NC_011142. 70 VPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAE 148 (343) Q Consensus 70 i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~ 148 (343) ..+. +.+.+...+.+......+.+.+++..+. .|-.+ ..++......+.++.-+.+|.+=|+.+ ..+|..--.. T Consensus 140 ~~~~-~~~~~~~~~~~~~~~~~~~a~~v~E~~~-~~~~~~~~~~~v~l~~~k~~~~~~iS~ell~ds---~~~l~~~i~~ 214 (392) T protein:vir:10 140 VTVE-PVRTRSGSRVLEKNSDMIPFAEITEMGE-IPETDNPKFSNVQYAVKDRAGILPLSRSLLQDS---DQNILKYVTK 214 (392) T ss_pred ceee-eccCCceeEEEEeecCCccceeeccccc-ccccccccceeEEeeeeeEEEeehhhHHHHhhh---HHHHHHHHHH Confidence 5432 1111222233333344456677766543 45433 467888888888988888887655543 4567888888 Q ss_pred HHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHH Q lcl|NC_011142. 149 LAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLW 228 (343) Q Consensus 149 aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~ 228 (343) ..++++++.+|..+++|+......| ..+.+ ||.+++..... . 
.....-.++|+|+.| T Consensus 215 ~l~~~i~~~~d~~~~~g~g~~~~~~-----------------~~~~d----~i~~~~~~~l~-~-~~~~~a~~vm~~~~~ 271 (392) T protein:vir:10 215 WLGKKSKVTRNVLILGVIEKLTKQA-----------------IKSLD----DIKDVLNVKLD-P-AISPNAILLTNQDGF 271 (392) T ss_pred HHHHHHHHHHHHHHhhccccccccC-----------------ccCHH----HHHHHHHHhhh-h-hhccCCEEEEcHHHH Confidence 8899999999999999876432111 11233 33433321111 1 222335699999999 Q ss_pred HHHhccccCCCCCccHHHH-HHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchh Q lcl|NC_011142. 229 KRASSLLMTGYTDRTVIEH-FQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFR 307 (343) Q Consensus 229 ~~L~~~~~~~~~~~tvle~-l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~ 307 (343) ..|.+-. +..|.-++.- +....+ ..+-|.|+.+.. ........+.+..+--++|-+=.+.+.+..-..++ T Consensus 272 ~~L~~lk--d~~G~~l~~~~~~~~~~-~tllG~~~v~~~------~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~~~~~~ 342 (392) T protein:vir:10 272 NYLDKLK--DKDGKYILQSDPTQKNK-KLFAGTNPVVVV------SNRFLKSKGTTAKKAPLIIGDLKEAIVLFKREDME 342 (392) T ss_pred HHHHHhh--ccCCCeEeecCccCCcc-ccccCcccEEEe------cccccCCCcccCCceEEEEEehhceEEEEeecceE Confidence 9996532 3233211100 011111 112222221110 00000011111112223332212222222212222 Q ss_pred --cccce----ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 308 --MLAPQ----LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 308 --~~~~~----~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +.+.. 
.++ ...+.++.++ |+.+++|.+|+.++.= T Consensus 343 ~~~~~~~~~~f~~~-~~~~r~~~r~-d~~v~~~~a~~~l~~~ 382 (392) T protein:vir:10 343 LASTDVGGKAFTRN-TLDLRAIQRD-DVQMWDNEAAVYGEID 382 (392) T ss_pred EEEeccccchhhcC-ceEEEEEEee-ccEEecccceEEEEec Confidence 22111 112 3456678887 4688999999998775 No 86 >protein:vir:105004 Length: 392 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:1490 # MgeName: W Beta # Cross-refs: genbank:acc:YP_459969;genbank:gi:85701384;genbank:GeneID:3882145 Probab=97.40 E-value=5e-05 Score=44.16 Aligned_cols=300 Identities=9% Similarity=-0.005 Sum_probs=133.9 Q ss_pred CCcceeccch--------hhhhchh---hhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhh Q lcl|NC_011142. 1 MSEKRVVIDA--------QTIAGNR---WLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLED 69 (343) Q Consensus 1 ~~~~~~~~~~--------~~~~~~~---~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~ 69 (343) .-|.+..-.. +.+.+.. .....+..... ...+..-..++|.++.. +.+.+.|++.....-..+++ T Consensus 64 ~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~--~~~~~~~t~~~gg~~vP--~~~~~~ii~~~~~~s~l~~~ 139 (392) T protein:vir:10 64 EVETRNVDGEMEYRDVFMKALRNKPLNAEEREFLEDDLE--QRAMSGLTGEDGGLVIP--QDIQTQINELARSFDALEQY 139 (392) T ss_pred cccccCccchHHHHHHHHHHHhcccccHHHHHHHhhhhh--hhhccccccCCCceecc--hhHHHHHHHHHHhhhhhhhh Confidence 0000000000 0000000 00000000000 00000001123444444 35556677766666555555 Q ss_pred ccccCCCCcceeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHH Q lcl|NC_011142. 70 VPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAE 148 (343) Q Consensus 70 i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~ 148 (343) ..+. +.+.+...+.+......+.+.+++..+. .|-.+ ..++......+.++.-+.+|.+=|+.+ ..+|..--.. 
T Consensus 140 ~~~~-~~~~~~~~~~~~~~~~~~~a~~v~E~~~-~~~~~~~~~~~v~l~~~k~~~~~~iS~ell~ds---~~~l~~~i~~ 214 (392) T protein:vir:10 140 VTVE-PVRTRSGSRVLEKNSDMIPFAEITEMGE-IPETDNPKFSNVQYAVKDRAGILPLSRSLLQDS---DQNILKYVTK 214 (392) T ss_pred ceee-eccCCceeEEEEeecCCccceeeccccc-ccccccccceeEEeeeeeEEEeehhhHHHHhhh---HHHHHHHHHH Confidence 5432 1111222233333344456677766543 45433 467888888888988888887655543 4567888888 Q ss_pred HHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHH Q lcl|NC_011142. 149 LAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLW 228 (343) Q Consensus 149 aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~ 228 (343) ..++++++.+|..+++|+......| ..+.+ ||.+++..... . .....-.++|+|+.| T Consensus 215 ~l~~~i~~~~d~~~~~g~g~~~~~~-----------------~~~~d----~i~~~~~~~l~-~-~~~~~a~~vm~~~~~ 271 (392) T protein:vir:10 215 WLGKKSKVTRNVLILGVIEKLTKQA-----------------IKSLD----DIKDVLNVKLD-P-AISPNAILLTNQDGF 271 (392) T ss_pred HHHHHHHHHHHHHHhhccccccccC-----------------ccCHH----HHHHHHHHhhh-h-hhccCCEEEEcHHHH Confidence 8899999999999999876432111 11233 33433321111 1 222335699999999 Q ss_pred HHHhccccCCCCCccHHHH-HHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchh Q lcl|NC_011142. 229 KRASSLLMTGYTDRTVIEH-FQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFR 307 (343) Q Consensus 229 ~~L~~~~~~~~~~~tvle~-l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~ 307 (343) ..|.+-. +..|.-++.- +....+ ..+-|.|+.+.. 
........+.+..+--++|-+=.+.+.+..-..++ T Consensus 272 ~~L~~lk--d~~G~~l~~~~~~~~~~-~tllG~~~v~~~------~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~~~~~~ 342 (392) T protein:vir:10 272 NYLDKLK--DKDGKYILQSDPTQKNK-KLFAGTNPVVVV------SNRFLKSKGTTAKKAPLIIGDLKEAIVLFKREDME 342 (392) T ss_pred HHHHHhh--ccCCCeEeecCccCCcc-ccccCcccEEEe------cccccCCCcccCCceEEEEEehhceEEEEeecceE Confidence 9996532 3233211100 011111 112222221110 00000011111112223332212222222212222 Q ss_pred --cccce----ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 308 --MLAPQ----LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 308 --~~~~~----~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +.+.. .++ ...+.++.++ |+.+++|.+|+.++.= T Consensus 343 ~~~~~~~~~~f~~~-~~~~r~~~r~-d~~v~~~~a~~~l~~~ 382 (392) T protein:vir:10 343 LASTDVGGKAFTRN-TLDLRAIQRD-DVQMWDNEAAVYGEID 382 (392) T ss_pred EEEeccccchhhcC-ceEEEEEEee-ccEEecccceEEEEec Confidence 22111 112 3456678887 4688999999998775 No 87 >protein:vir:102082 Length: 392 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1503 # MgeName: Fah # Cross-refs: genbank:acc:YP_512315;genbank:gi:89152484;genbank:GeneID:3953075 Probab=97.40 E-value=5e-05 Score=44.16 Aligned_cols=300 Identities=9% Similarity=-0.005 Sum_probs=133.9 Q ss_pred CCcceeccch--------hhhhchh---hhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhh Q lcl|NC_011142. 1 MSEKRVVIDA--------QTIAGNR---WLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLED 69 (343) Q Consensus 1 ~~~~~~~~~~--------~~~~~~~---~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~ 69 (343) .-|.+..-.. +.+.+.. .....+..... ...+..-..++|.++.. 
+.+.+.|++.....-..+++ T Consensus 64 ~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~--~~~~~~~t~~~gg~~vP--~~~~~~ii~~~~~~s~l~~~ 139 (392) T protein:vir:10 64 EVETRNVDGEMEYRDVFMKALRNKPLNAEEREFLEDDLE--QRAMSGLTGEDGGLVIP--QDIQTQINELARSFDALEQY 139 (392) T ss_pred cccccCccchHHHHHHHHHHHhcccccHHHHHHHhhhhh--hhhccccccCCCceecc--hhHHHHHHHHHHhhhhhhhh Confidence 0000000000 0000000 00000000000 00000001123444444 35556677766666555555 Q ss_pred ccccCCCCcceeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHH Q lcl|NC_011142. 70 VPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAE 148 (343) Q Consensus 70 i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~ 148 (343) ..+. +.+.+...+.+......+.+.+++..+. .|-.+ ..++......+.++.-+.+|.+=|+.+ ..+|..--.. T Consensus 140 ~~~~-~~~~~~~~~~~~~~~~~~~a~~v~E~~~-~~~~~~~~~~~v~l~~~k~~~~~~iS~ell~ds---~~~l~~~i~~ 214 (392) T protein:vir:10 140 VTVE-PVRTRSGSRVLEKNSDMIPFAEITEMGE-IPETDNPKFSNVQYAVKDRAGILPLSRSLLQDS---DQNILKYVTK 214 (392) T ss_pred ceee-eccCCceeEEEEeecCCccceeeccccc-ccccccccceeEEeeeeeEEEeehhhHHHHhhh---HHHHHHHHHH Confidence 5432 1111222233333344456677766543 45433 467888888888988888887655543 4567888888 Q ss_pred HHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHH Q lcl|NC_011142. 149 LAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLW 228 (343) Q Consensus 149 aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~ 228 (343) ..++++++.+|..+++|+......| ..+.+ ||.+++..... . 
.....-.++|+|+.| T Consensus 215 ~l~~~i~~~~d~~~~~g~g~~~~~~-----------------~~~~d----~i~~~~~~~l~-~-~~~~~a~~vm~~~~~ 271 (392) T protein:vir:10 215 WLGKKSKVTRNVLILGVIEKLTKQA-----------------IKSLD----DIKDVLNVKLD-P-AISPNAILLTNQDGF 271 (392) T ss_pred HHHHHHHHHHHHHHhhccccccccC-----------------ccCHH----HHHHHHHHhhh-h-hhccCCEEEEcHHHH Confidence 8899999999999999876432111 11233 33433321111 1 222335699999999 Q ss_pred HHHhccccCCCCCccHHHH-HHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchh Q lcl|NC_011142. 229 KRASSLLMTGYTDRTVIEH-FQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFR 307 (343) Q Consensus 229 ~~L~~~~~~~~~~~tvle~-l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~ 307 (343) ..|.+-. +..|.-++.- +....+ ..+-|.|+.+.. ........+.+..+--++|-+=.+.+.+..-..++ T Consensus 272 ~~L~~lk--d~~G~~l~~~~~~~~~~-~tllG~~~v~~~------~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~~~~~~ 342 (392) T protein:vir:10 272 NYLDKLK--DKDGKYILQSDPTQKNK-KLFAGTNPVVVV------SNRFLKSKGTTAKKAPLIIGDLKEAIVLFKREDME 342 (392) T ss_pred HHHHHhh--ccCCCeEeecCccCCcc-ccccCcccEEEe------cccccCCCcccCCceEEEEEehhceEEEEeecceE Confidence 9996532 3233211100 011111 112222221110 00000011111112223332212222222212222 Q ss_pred --cccce----ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 308 --MLAPQ----LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 308 --~~~~~----~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +.+.. 
.++ ...+.++.++ |+.+++|.+|+.++.= T Consensus 343 ~~~~~~~~~~f~~~-~~~~r~~~r~-d~~v~~~~a~~~l~~~ 382 (392) T protein:vir:10 343 LASTDVGGKAFTRN-TLDLRAIQRD-DVQMWDNEAAVYGEID 382 (392) T ss_pred EEEeccccchhhcC-ceEEEEEEee-ccEEecccceEEEEec Confidence 22111 112 3456678887 4688999999998775 No 88 >protein:vir:102873 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1492 # MgeName: Cherry # Cross-refs: genbank:acc:YP_338137;genbank:gi:77020198;genbank:GeneID:3703782 Probab=97.40 E-value=5e-05 Score=44.16 Aligned_cols=300 Identities=9% Similarity=-0.005 Sum_probs=133.9 Q ss_pred CCcceeccch--------hhhhchh---hhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhh Q lcl|NC_011142. 1 MSEKRVVIDA--------QTIAGNR---WLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLED 69 (343) Q Consensus 1 ~~~~~~~~~~--------~~~~~~~---~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~ 69 (343) .-|.+..-.. +.+.+.. .....+..... ...+..-..++|.++.. +.+.+.|++.....-..+++ T Consensus 64 ~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~--~~~~~~~t~~~gg~~vP--~~~~~~ii~~~~~~s~l~~~ 139 (392) T protein:vir:10 64 EVETRNVDGEMEYRDVFMKALRNKPLNAEEREFLEDDLE--QRAMSGLTGEDGGLVIP--QDIQTQINELARSFDALEQY 139 (392) T ss_pred cccccCccchHHHHHHHHHHHhcccccHHHHHHHhhhhh--hhhccccccCCCceecc--hhHHHHHHHHHHhhhhhhhh Confidence 0000000000 0000000 00000000000 00000001123444444 35556677766666555555 Q ss_pred ccccCCCCcceeEEEEeeecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHH Q lcl|NC_011142. 70 VPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAE 148 (343) Q Consensus 70 i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~ 148 (343) ..+. +.+.+...+.+......+.+.+++..+. .|-.+ ..++......+.++.-+.+|.+=|+.+ ..+|..--.. 
T Consensus 140 ~~~~-~~~~~~~~~~~~~~~~~~~a~~v~E~~~-~~~~~~~~~~~v~l~~~k~~~~~~iS~ell~ds---~~~l~~~i~~ 214 (392) T protein:vir:10 140 VTVE-PVRTRSGSRVLEKNSDMIPFAEITEMGE-IPETDNPKFSNVQYAVKDRAGILPLSRSLLQDS---DQNILKYVTK 214 (392) T ss_pred ceee-eccCCceeEEEEeecCCccceeeccccc-ccccccccceeEEeeeeeEEEeehhhHHHHhhh---HHHHHHHHHH Confidence 5432 1111222233333344456677766543 45433 467888888888988888887655543 4567888888 Q ss_pred HHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHH Q lcl|NC_011142. 149 LAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLW 228 (343) Q Consensus 149 aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~ 228 (343) ..++++++.+|..+++|+......| ..+.+ ||.+++..... . .....-.++|+|+.| T Consensus 215 ~l~~~i~~~~d~~~~~g~g~~~~~~-----------------~~~~d----~i~~~~~~~l~-~-~~~~~a~~vm~~~~~ 271 (392) T protein:vir:10 215 WLGKKSKVTRNVLILGVIEKLTKQA-----------------IKSLD----DIKDVLNVKLD-P-AISPNAILLTNQDGF 271 (392) T ss_pred HHHHHHHHHHHHHHhhccccccccC-----------------ccCHH----HHHHHHHHhhh-h-hhccCCEEEEcHHHH Confidence 8899999999999999876432111 11233 33433321111 1 222335699999999 Q ss_pred HHHhccccCCCCCccHHHH-HHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchh Q lcl|NC_011142. 229 KRASSLLMTGYTDRTVIEH-FQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFR 307 (343) Q Consensus 229 ~~L~~~~~~~~~~~tvle~-l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~ 307 (343) ..|.+-. +..|.-++.- +....+ ..+-|.|+.+.. 
........+.+..+--++|-+=.+.+.+..-..++ T Consensus 272 ~~L~~lk--d~~G~~l~~~~~~~~~~-~tllG~~~v~~~------~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~~~~~~ 342 (392) T protein:vir:10 272 NYLDKLK--DKDGKYILQSDPTQKNK-KLFAGTNPVVVV------SNRFLKSKGTTAKKAPLIIGDLKEAIVLFKREDME 342 (392) T ss_pred HHHHHhh--ccCCCeEeecCccCCcc-ccccCcccEEEe------cccccCCCcccCCceEEEEEehhceEEEEeecceE Confidence 9996532 3233211100 011111 112222221110 00000011111112223332212222222212222 Q ss_pred --cccce----ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 308 --MLAPQ----LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 308 --~~~~~----~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +.+.. .++ ...+.++.++ |+.+++|.+|+.++.= T Consensus 343 ~~~~~~~~~~f~~~-~~~~r~~~r~-d~~v~~~~a~~~l~~~ 382 (392) T protein:vir:10 343 LASTDVGGKAFTRN-TLDLRAIQRD-DVQMWDNEAAVYGEID 382 (392) T ss_pred EEEeccccchhhcC-ceEEEEEEee-ccEEecccceEEEEec Confidence 22111 112 3456678887 4688999999998775 No 89 >protein:vir:80930 Length: 278 # NCBI annotation: Cps # Family: family:all:522 # MgeID: mge:1886 # MgeName: A500 # Cross-refs: genbank:acc:YP_001468392;genbank:gi:157324966;genbank:GeneID:5601363 Probab=97.40 E-value=5.7e-05 Score=43.83 Aligned_cols=271 Identities=7% Similarity=-0.090 Sum_probs=137.6 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~~d 103 (343) |+.+. | ...-.|..+.|. +.|.+.....+....+..+... +.+| .++.++.+...|.+..+.+. 
++ T Consensus 1 Ma~~~-----T--~~~~~iiPev~s---~~v~~~~~~~~v~~~~~~~~~~l~g~~G-~tv~ip~~~~~g~a~~~~~g-~~ 68 (278) T protein:vir:80 1 MADLT-----T--KLANLIDPEVMG---PMISAKLPKAIKFGKIAPIDNSLEGQPG-SEITVPKYKYIGDAQDVAEG-AA 68 (278) T ss_pred CCCcc-----e--ehhheecHHHHH---HHHHHHHHHhhhhcccceecccccCCCC-CEEEEeeeccCCcceeecCC-Cc Confidence 22211 1 111233333343 2233333444444554443322 2334 45677777777888877775 46 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccc Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKT 183 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~ 183 (343) ++..+...+.....+...+.+|. +.|+++ ...+.++-..-...++..+++..|+.++..-.+. .+. T Consensus 69 i~~~~lt~~~~~~~i~~~~~a~~--v~D~~~-~~~~~d~~~~~~~~~a~~~a~~~d~~l~~~l~~a-----~~~------ 134 (278) T protein:vir:80 69 IDYSALETESVKHGIKKAGKGVK--LTDESV-LSGYGDPVEEAQKQIRMAIASKVDNDILEEALTT-----TLE------ 134 (278) T ss_pred CcccccccceeeEeeehhhcccc--ccHHHH-hhccccHHHHHHHHHHHHHHHHHHHHHHHHHhcc-----ccc------ Confidence 88888888888888888776655 555554 3346667778888899999999998777443221 110 Q ss_pred ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccH-HHHHHhcCcceeeccccc Q lcl|NC_011142. 184 SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTV-IEHFQINNAYTLLTRNPI 262 (343) Q Consensus 184 ~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tv-le~l~~n~~~~~~~~~p~ 262 (343) .+......+.+..++.+.++..++... 
+...+..|+++|..|..|.+-...+....+- .+=+..|+..-.+.|.++ T Consensus 135 -~~~~~t~~~~~~~~~~~~da~~~l~~~--~~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~~g~~~~~~G~ig~~~G~~V 211 (278) T protein:vir:80 135 -VKGAINIGLIDKIENTFTDAPDAIEDE--SITTTGVLFLNYKDTAKLREEAAGSWTKASQLGDDLLVKGAFGELLGWEI 211 (278) T ss_pred -cccccccchhhhHHHHHHHHHHhhccc--CCCcccEEEECHHHHHHHHhhhhhhccccccccccceeeccceeecceeE Confidence 111112223555677777777666432 3334456999999998885421111111110 010112222223333332 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ..+... +.+ ..+++ .+.-+.+....+++.-.- .++.....+.... ..|+.+.+|.+++.+. T Consensus 212 i~s~~~-------------p~~--t~~l~--~~gAi~~~~~~~~~vE~~Rd~~~~~d~i~~~~-~yg~~v~~~~~~v~it 273 (278) T protein:vir:80 212 VRTKKL-------------ADG--NALAV--KAGALKTFLKRNLLAESGRDMDHKLTKFNADQ-HYAVALVDETKAVKVV 273 (278) T ss_pred EEcCCC-------------Ccc--eEEEE--eccceeeeecCCcccccccchhhccceeeeee-EEEEEEEcCcceEEEe Confidence 222110 111 11222 233344333333332111 1111233333333 3589999999999988 Q ss_pred cC Q lcl|NC_011142. 342 ML 343 (343) Q Consensus 342 GI 343 (343) -- T Consensus 274 ~~ 275 (278) T protein:vir:80 274 PV 275 (278) T ss_pred ec Confidence 77 No 90 >protein:vir:7409 Length: 408 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:146 # MgeName: P335 # Cross-refs: genbank:acc:NP_839926;genbank:gi:30089896;genbank:GeneID:1260683 Probab=97.38 E-value=4.8e-05 Score=44.28 Aligned_cols=301 Identities=8% Similarity=0.011 Sum_probs=137.4 Q ss_pred CCcceeccchhhhhchhhhchhc-------ccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhcccc Q lcl|NC_011142. 
1 MSEKRVVIDAQTIAGNRWLNKFL-------DSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVL 73 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~-------~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~ 73 (343) ..++...-..+.-...+..+... .........++.....+.|.++.+ +.+.+.|++..+.....+.++.+. T Consensus 76 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~gg~~vP--~~~~~~Ii~~~~~~~~l~~~~~~~ 153 (408) T protein:vir:74 76 EEEKGPLNKSENELKDKFVKDFVNMVRNPMAFLNTVSSKTETSGSDSAAGLTIP--QDIRTMINTLVRQYDSLQQYVRVE 153 (408) T ss_pred ccccccccchhhhhHHHHHHHHHHHHhcchhhhhhhhhhhhcccccCCCceeec--hhHhhHHHHHHhhhcchhhhccee Confidence 00000000000000000000000 000001111111122233445444 455667888777777777776542 Q ss_pred CCCCcceeEEEEeeecccccc-eeecCCcCccce-eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHH Q lcl|NC_011142. 74 ANIPEYATHWNYRSYDGAAMG-KFISANASDLPR-VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAF 151 (343) Q Consensus 74 ~~~~~~~~~~~~~~~~~~G~a-~~~~~~~~dip~-v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~ 151 (343) +.+.+...+.+......+.. .+++.. .++|- .+..++......+.++.-+.+|.+=++. ...+|..--....+ T Consensus 154 -~~~~~~~~~~~~~~~~~~~~~~~v~E~-~~~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~ 228 (408) T protein:vir:74 154 -SVSTSSGSRVYEKWTDVTPLKAMDEED-GKIPDLDNPRLTIIKYLIKRYAGIITATNTLLKD---TAENILAWLSSWIA 228 (408) T ss_pred -eccCCcceEEEEeecCCcccccccccc-cccccccccceeeEEeeeeeEEeeehhHHHHHhh---chHHHHHHHHHHHH Confidence 22223333444444333333 344443 44564 4467888889999999888887654433 35567777888889 Q ss_pred HHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHH Q lcl|NC_011142. 152 RGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRA 231 (343) Q Consensus 152 ~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L 231 (343) +++...+|+.+++|+.... +.. ...+.+++.+.++. .+.. .+. 
..-.++|+|..|..| T Consensus 229 ~~~~~~~d~~il~G~G~~~----------~~~------~~~~~~~i~~~~~~---~l~~--~~~-~~a~~v~n~~~~~~l 286 (408) T protein:vir:74 229 KKVVVTRNQAIIAAMGTVP----------KKP------TIANFDDVITMINT---SVDP--AII-ATSSLLTNQSGLNKL 286 (408) T ss_pred HHHHHHHHHHHhhcccccc----------ccc------ccccHHHHHHHHHH---hhhh--hhc-CCCEEEEcHHHHHHH Confidence 9999999999999975321 110 01233444333332 2221 222 234689999999999 Q ss_pred hccccCCCCCccHHHH-HHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhc-- Q lcl|NC_011142. 232 SSLLMTGYTDRTVIEH-FQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRM-- 308 (343) Q Consensus 232 ~~~~~~~~~~~tvle~-l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~-- 308 (343) .+-. +..|.-++.- +....+ ..+-|.|+.+... ..+ ....+++.. ++|-+-.+.+.+..-..++. T Consensus 287 ~~lk--d~~G~~l~~~~~~~~~~-~~l~G~pV~~~~~-------~~~-~~~~~~~~~-i~~gd~~~~~~~~~~~~~~i~~ 354 (408) T protein:vir:74 287 ALVK--TAEGKYLLEPDPTKPNS-YLIKGKQVIVVAD-------RWL-PNSGSTVYP-LYYGDMSQAITLFDRENMSLLP 354 (408) T ss_pred HHhh--cCCCceEeccCcCCCCC-ceecceeeEEecC-------ccc-ccccCCcce-EEEEehhccEEEEEecceEEEE Confidence 7533 3333322210 111111 2234444332211 001 111112222 23322222222221122222 Q ss_pred ccceec---CceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 309 LAPQLL---GLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 309 ~~~~~~---~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .+-... 
.-...+.++.|++ +.+++|.|++.++.- T Consensus 355 ~~~~~~~f~~~~~~~r~~~r~d-~~~~~~~a~~~~~~~ 391 (408) T protein:vir:74 355 TNIGAGAFETDTTKIRVIDRFD-VKATDSEALVAGSFT 391 (408) T ss_pred eccccchhhcceeeEEEEEeeC-cEEecccceEEEEee Confidence 111111 1134566788875 568889999888754 No 91 >protein:vir:4856 Length: 293 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:106 # MgeName: DT1 # Cross-refs: genbank:acc:NP_049396;genbank:gi:9632424;genbank:GeneID:1258532 Probab=97.35 E-value=8e-05 Score=43.04 Aligned_cols=272 Identities=8% Similarity=-0.042 Sum_probs=137.6 Q ss_pred hcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeec-ccccceeecCC Q lcl|NC_011142. 22 FLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYD-GAAMGKFISAN 100 (343) Q Consensus 22 ~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~-~~G~a~~~~~~ 100 (343) .|.++.. .+. ++|.++.. +.+.+.|++........+++..+. +.+.....+.+.... ..+.+.+++.. T Consensus 1 ~l~~~~~------~t~--~~gg~liP--~~~~~~Ii~~~~~~~~l~~~~~~~-~~~~~~g~~~~~~~~~~~~~a~~v~Eg 69 (293) T protein:vir:48 1 MLDSKTD------HSG--SDAGLTIP--QDIRTAINTLVRQYDSLQEYVNVE-NVTTLTGSRVYEKWTDITGLANIDDEA 69 (293) T ss_pred Cceeecc------ccc--CcCceEec--hhHHHHHHHHHHhhhhhhhhceee-eccCCcceEEEEeecCCCcceeeecCC Confidence 3333322 111 22333333 344566777777666666655432 122222233443333 34567777665 Q ss_pred cCcccee-eeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCC Q lcl|NC_011142. 101 ASDLPRV-AQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPN 179 (343) Q Consensus 101 ~~dip~v-~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~ 179 (343) + .+|-. ...+++.....+.++.-+.+|.+=++.+ ..+|...-....++++++.+|+-++.|...... 
T Consensus 70 ~-~~~~~~~~~~~~i~l~~~k~~~~~~iS~ell~ds---~~~l~~~i~~~la~~~~~~~~~~i~~g~~~~~~-------- 137 (293) T protein:vir:48 70 G-KIADIDDPKLSLIKYTIKRYAGISTVTNSLLADS---AENILAWLSGWIAKKVVVTRNKAILGVVDKLPT-------- 137 (293) T ss_pred c-ccccccccceeEEEEeeeEEEEeehhhHHHHhhh---hHHHHHHHHHHHHHHHHHHHHhHHhhccccccc-------- Confidence 4 35543 4568888889999998888876555433 456877788888999999999999888643110 Q ss_pred ccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeecc Q lcl|NC_011142. 180 VTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTR 259 (343) Q Consensus 180 v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~ 259 (343) .....+ ++||.+++.++... + .....++|+|+.+..|..-. +..+.-++.--..+..--.+-| T Consensus 138 --------~~~~~~----~d~i~~~~~~l~~~--~-~~~a~~vmn~~~~~~L~~lk--d~~g~~l~~~~~~~~~~~~l~G 200 (293) T protein:vir:48 138 --------KPTLTK----WDDIIDLEAKVDPA--I-KQTSFFLTNTSGFTALKKVK--NALGDYLMERDVKSPTGYSIAG 200 (293) T ss_pred --------cccccC----HHHHHHHHHhhhhh--h-cCCCEEEEcHHHHHHHHHhh--ccCCceEeecCcCCCCCceecc Confidence 111223 45667777776532 2 23457999999999986532 3333221110001111112334 Q ss_pred ccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-ee----cCceeEeeeeeeeeeEEEECc Q lcl|NC_011142. 260 NPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QL----LGLGITVPAEYKISGTEYRYP 334 (343) Q Consensus 260 ~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~----~~~~~~~~~~~~~gGv~i~~P 334 (343) .|+.+.. ... .+....+ +..++|-+=.+.+.+..-..++..-- +. ..-...+.+..|++ +.+++| T Consensus 201 ~Pv~~~~-------~~~-~~~~~~~-~~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d-~~~~~~ 270 (293) T protein:vir:48 201 FAVKEIS-------DRW-LPNASSG-VMPLYFGDLKQAVTLFDRQQMSLLSTNIGGGAFETDTTKVRVIDRFD-VVATDT 270 (293) T ss_pred eeeEEec-------ccc-cCCccCC-ceEEEEEeccceEEEEEecceEEEEecccchhhhcCeEEEEEEEeeC-cEEecc Confidence 4432211 100 0111112 22233332222222211122221111 11 11124556678875 567899 Q ss_pred ceeeeeccC Q lcl|NC_011142. 
335 LCAQYVDML 343 (343) Q Consensus 335 ~ai~~~dGI 343 (343) .|++.+..= T Consensus 271 ~a~~~l~~~ 279 (293) T protein:vir:48 271 EAFVPASFK 279 (293) T ss_pred cceEEEEee Confidence 999987744 No 92 >protein:vir:9820 Length: 272 # NCBI annotation: putative major capsid/head protein # Family: family:all:522 # MgeID: mge:176 # MgeName: 315.4 # Cross-refs: genbank:acc:NP_795582;genbank:gi:28876339;genbank:GeneID:1257858 Probab=97.33 E-value=9.2e-05 Score=42.71 Aligned_cols=263 Identities=10% Similarity=-0.052 Sum_probs=136.1 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~~d 103 (343) |+... + ..+-.+..+.|.. .+.+.....+....+.-+... +..|. ++.++.++..|.+.+++.+ ++ T Consensus 1 MA~~~----T---~~~~~~iPev~s~---~v~~~~~~~~~~~~~~~~~~~~~g~~G~-tv~iP~~~~~~~a~~v~eg-~~ 68 (272) T protein:vir:98 1 MAVGT----T---KMAQMLDPEVLAD---MIDAEVGKAIRFAPLAEVDTTLEGQPGT-TLTVPKWDYIGDAEDVAEG-EA 68 (272) T ss_pred CCCcc----c---cchheechHHHHH---HHHHHHHHHhhhhccccccccccCCCCC-EEEEEEecCCCCcccccCC-Cc Confidence 22111 1 1112333344432 234433334444444443222 22343 5666777778889988875 57 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccc Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKT 183 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~ 183 (343) +|..+...+.....+..++..+.++.++.+. ...++...-...+.+.+++..|+.++.-- .| .+. 
T Consensus 69 i~~~~~~~~~~~~~~~~~~~~~~itd~~~~~---s~~d~~~~~~~~~~~~~a~~~d~~i~~~~-----~~------a~~- 133 (272) T protein:vir:98 69 IPMTQLGFKKTTMTIKKAGKGVEITDEAILS---GYGDPVGQAAKQIVEAIDHKVDADVLDAL-----SK------STQ- 133 (272) T ss_pred ccccccccceEEEEeeeeeeeeeecHHHHhh---ccccHHHHHHHHHHHHHHHHHHHHHHHHh-----cc------ccc- Confidence 8999999999999999999888887666544 35567777888888889888887766321 11 000 Q ss_pred ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCc-cHHHHHHhcCcceeeccccc Q lcl|NC_011142. 184 SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDR-TVIEHFQINNAYTLLTRNPI 262 (343) Q Consensus 184 ~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~-tvle~l~~n~~~~~~~~~p~ 262 (343) ..+.. .+ +++|.+++..+... + ..+..++++|..+..|.+-...+..+. ....-+..++..-.+.|.|+ T Consensus 134 -~~~~~--~t----~d~i~da~~~l~~~--~-~~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~~~~~~~g~ig~i~G~~V 203 (272) T protein:vir:98 134 -TVEAT--AT----VDGVSKALDIFNDE--D-DAETVIVMNPADASTLRLDAAKEWLGATEVGANRVVSGVYGEVLGVQI 203 (272) T ss_pred -ccccc--cC----HHHHHHHHHHHhcc--C-CCccEEEEcHHHHHHHHHhccccccccccccccccccccchhhcCeeE Confidence 01111 12 55677777776533 2 346789999999998854211111110 00111122232223344443 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ..+... .. + - +|.-++..+.+..-.+.+...- +.......+....+ .|+.+.+|.+++.+. T Consensus 204 i~s~~~----------p~---~--t--~~~~~~~a~~~~~~~~~~ve~~r~~~~~~~~i~~~~~-~~~~v~~~~~vv~~t 265 (272) T protein:vir:98 204 VRSRKC----------PK---G--T--AYMVRKGALRIMLKRNTMVETDRDITKAINQIVANKH-YGVYLYKAEKAVKIT 265 (272) T ss_pred EEcCCC----------Cc---c--e--EEEEcCCeEEEEecCCceeeeccccccceeEEEEEEE-EEEEEEcCCceEEEE Confidence 332211 00 1 1 1212233333333233222110 11222344444444 468899999888876 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) -= T Consensus 266 ~~ 267 (272) T protein:vir:98 266 LK 267 (272) T ss_pred ec Confidence 44 No 93 >protein:vir:3033 Length: 272 # NCBI annotation: major capsid protein # Family: family:all:522 # MgeID: mge:61 # MgeName: PhiNIH1.1 # Cross-refs: genbank:acc:NP_438146;genbank:gi:16271809;genbank:GeneID:929235 Probab=97.33 E-value=9.2e-05 Score=42.71 Aligned_cols=263 Identities=10% Similarity=-0.052 Sum_probs=136.1 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~~d 103 (343) |+... + ..+-.+..+.|.. .+.+.....+....+.-+... +..|. ++.++.++..|.+.+++.+ ++ T Consensus 1 MA~~~----T---~~~~~~iPev~s~---~v~~~~~~~~~~~~~~~~~~~~~g~~G~-tv~iP~~~~~~~a~~v~eg-~~ 68 (272) T protein:vir:30 1 MAVGT----T---KMAQMLDPEVLAD---MIDAEVGKAIRFAPLAEVDTTLEGQPGT-TLTVPKWDYIGDAEDVAEG-EA 68 (272) T ss_pred CCCcc----c---cchheechHHHHH---HHHHHHHHHhhhhccccccccccCCCCC-EEEEEEecCCCCcccccCC-Cc Confidence 22111 1 1112333344432 234433334444444443222 22343 5666777778889988875 57 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccc Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKT 183 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~ 183 (343) +|..+...+.....+..++..+.++.++.+. ...++...-...+.+.+++..|+.++.-- .| .+. 
T Consensus 69 i~~~~~~~~~~~~~~~~~~~~~~itd~~~~~---s~~d~~~~~~~~~~~~~a~~~d~~i~~~~-----~~------a~~- 133 (272) T protein:vir:30 69 IPMTQLGFKKTTMTIKKAGKGVEITDEAILS---GYGDPVGQAAKQIVEAIDHKVDADVLDAL-----SK------STQ- 133 (272) T ss_pred ccccccccceEEEEeeeeeeeeeecHHHHhh---ccccHHHHHHHHHHHHHHHHHHHHHHHHh-----cc------ccc- Confidence 8999999999999999999888887666544 35567777888888889888887766321 11 000 Q ss_pred ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCc-cHHHHHHhcCcceeeccccc Q lcl|NC_011142. 184 SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDR-TVIEHFQINNAYTLLTRNPI 262 (343) Q Consensus 184 ~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~-tvle~l~~n~~~~~~~~~p~ 262 (343) ..+.. .+ +++|.+++..+... + ..+..++++|..+..|.+-...+..+. ....-+..++..-.+.|.|+ T Consensus 134 -~~~~~--~t----~d~i~da~~~l~~~--~-~~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~~~~~~~g~ig~i~G~~V 203 (272) T protein:vir:30 134 -TVEAT--AT----VDGVSKALDIFNDE--D-DAETVIVMNPADASTLRLDAAKEWLGATEVGANRVVSGVYGEVLGVQI 203 (272) T ss_pred -ccccc--cC----HHHHHHHHHHHhcc--C-CCccEEEEcHHHHHHHHHhccccccccccccccccccccchhhcCeeE Confidence 01111 12 55677777776533 2 346789999999998854211111110 00111122232223344443 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ..+... .. + - +|.-++..+.+..-.+.+...- +.......+....+ .|+.+.+|.+++.+. T Consensus 204 i~s~~~----------p~---~--t--~~~~~~~a~~~~~~~~~~ve~~r~~~~~~~~i~~~~~-~~~~v~~~~~vv~~t 265 (272) T protein:vir:30 204 VRSRKC----------PK---G--T--AYMVRKGALRIMLKRNTMVETDRDITKAINQIVANKH-YGVYLYKAEKAVKIT 265 (272) T ss_pred EEcCCC----------Cc---c--e--EEEEcCCeEEEEecCCceeeeccccccceeEEEEEEE-EEEEEEcCCceEEEE Confidence 332211 00 1 1 1212233333333233222110 11222344444444 468899999888876 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) -= T Consensus 266 ~~ 267 (272) T protein:vir:30 266 LK 267 (272) T ss_pred ec Confidence 44 No 94 >protein:vir:102655 Length: 322 # NCBI annotation: Hypothetical protein # Family: family:all:6384 # MgeID: mge:1624 # MgeName: VP2 # Cross-refs: genbank:acc:YP_052979;genbank:gi:50282923;genbank:GeneID:2948122 Probab=97.33 E-value=8.8e-05 Score=42.80 Aligned_cols=301 Identities=9% Similarity=-0.006 Sum_probs=134.8 Q ss_pred hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeec--ccccc---e Q lcl|NC_011142. 21 KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYD--GAAMG---K 95 (343) Q Consensus 21 ~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~--~~G~a---~ 95 (343) |+|.+--+ |+|.|.++- .-+|- +|+..==..+++.....|... +-..+....+..-..+.+.+ ..|+. . T Consensus 1 ~~~~~~~~-~~~~Ms~~i--~~~fv-~qy~~~v~~~~qq~~s~L~~t--V~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 74 (322) T protein:vir:10 1 MKLNAIMS-MLPLIAGDI--DQAFV-QTYETTLRILSQQKSAKLKQY--CQHKNESSESHNWETLASMDPDAVKRKRSRQ 74 (322) T ss_pred Ccccceee-eeeeeechh--hhHHH-HHHHHHHHHHHHHhhhhhhcc--cccccccccccceeecccccccccccccccc Confidence 33322222 467776532 23343 555422223555444333322 22222222222211222211 12222 2 Q ss_pred eecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeee Q lcl|NC_011142. 96 FISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLL 175 (343) Q Consensus 96 ~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLl 175 (343) ..++..-|.|..+.....+...+..+..++.+. 
++..+ ++..++...-..++..+++++.|++++.|--+....| T Consensus 75 ~~~d~~~dtp~~~~~~~~r~~~~~d~~~~~~VD--d~D~~-k~~~D~~~~~~~~~a~AL~R~~D~~I~~a~~g~a~~~-- 149 (322) T protein:vir:10 75 QSADGTYPTPVNNKPFAKRRTNVDTYDTGHVVE--QEDIS-QMLLDPNSALITSQAYAMARKTDDLIIAGAWKPASIK-- 149 (322) T ss_pred cccCcccCCCccccccceEEEeecccccceecc--hHHHH-HhhcCchHHHHHHHHHHhhhHHHHHHHhhhhcccccc-- Confidence 334444477887777777777777777766554 44442 3356677778889999999999998886533222222 Q ss_pred ecCCccccc-cCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcc Q lcl|NC_011142. 176 NNPNVTKTS-ATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAY 254 (343) Q Consensus 176 N~p~v~~~~-~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~ 254 (343) .++.+... ++..-...+..--++.|.++...|.++.---..+-.++++|+.|..|..-.--.+.+..=-+.|..++.- T Consensus 150 -~~gt~v~~~ss~~i~~g~~g~t~~kl~~a~~~l~~~dvp~d~~R~~vv~p~~~~~LL~d~~~ts~D~~~~~~l~~~G~i 228 (322) T protein:vir:10 150 -GTGQPVEFLATQEIGDGTKPISFDYVTEITERFLENEIEPEVSKVIVIGPTQARKLLQITEATSADYTSAMDLQSKGII 228 (322) T ss_pred -ccccccccCCCcccccCccchhHHHHHHHHHHHHhcCCCCCCCeEEEeCHHHHHHHhcchhhhhhhcccchhhhhcCee Confidence 11111100 0000000111122445666666666543211234569999999999975211011111112233333321 Q ss_pred eeeccccccccccceeeechhh----------hccccCCccceEEEEEcccceEEEeeccchhcccceec--CceeEeee Q lcl|NC_011142. 255 TLLTRNPIDIKIRFQLMATELA----------AAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLL--GLGITVPA 322 (343) Q Consensus 255 ~~~~~~p~~i~~~~~l~~~~~~----------~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~--~~~~~~~~ 322 (343) ..+-|- .++....+. ..+...+.+...++|.++ -+.++...+++.---+.. ...+.+.. 
T Consensus 229 g~~lGf-------~~i~s~~lp~~~~t~~~~~~~~~~~~~~~~~~a~~k~--Av~~a~~~dv~~~i~~~~~~~~a~~I~~ 299 (322) T protein:vir:10 229 TNWMGY-------TWIVSTRLDKFDPTQWGMAAEDGPQGDEIWCIAMTDM--ALGYHSCKDIWTKVAEDPSASFAWRIYS 299 (322) T ss_pred eeeeeE-------EEEEeccCCccccccccccccCCCCccceeEEEEecC--ceeEEEeeeeeEEeeccCCcchhhhhhh Confidence 111111 111111111 111111223344566543 455554444333222212 22355655 Q ss_pred eeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 323 EYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 323 ~~~~gGv~i~~P~ai~~~dGI 343 (343) ....| +.+-+|..|+.++=- T Consensus 300 ~~~~G-a~ri~~~gVv~i~~~ 319 (322) T protein:vir:10 300 AFTAD-CVRVEDEHIFKLRLK 319 (322) T ss_pred hhhhC-ceEeccCcEEEEEEe Confidence 56554 444478777766555 No 95 >protein:vir:81160 Length: 371 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:1892 # MgeName: Geobacillus virus E2 # Cross-refs: genbank:acc:YP_001285811;genbank:gi:148747732;genbank:GeneID:5247203 Probab=97.33 E-value=6.8e-05 Score=43.43 Aligned_cols=293 Identities=8% Similarity=0.045 Sum_probs=139.7 Q ss_pred CCccee-----ccchhhhhchhhhch-hcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccC Q lcl|NC_011142. 1 MSEKRV-----VIDAQTIAGNRWLNK-FLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLA 74 (343) Q Consensus 1 ~~~~~~-----~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~ 74 (343) ..++.. .-....-+-+++++. ... ++.....+.|.++.. +.+.+.+++........++++.+.. T Consensus 60 ~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~--------a~~~~t~~~gg~~vP--~~~~~~ii~~~~~~s~i~~~~~~~~ 129 (371) T protein:vir:81 60 EDKEPLKPTVQVKENEVEAFVNHIRTRFRN--------AMSEGSNQDGGYTVP--QDIQTRINELRESKDALQNLITVEP 129 (371) T ss_pred ccccccccchhhHHHHHHHHHHHHHHHHHH--------hhccCCCccCceeec--HhHHHHHHHHHHhhhhhhhhceeee Confidence 000000 000000000111110 000 111111122333333 3456678888777777777765422 Q ss_pred CCCcceeEEEEeeecccccceeecCCcCccce-eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHH Q lcl|NC_011142. 
75 NIPEYATHWNYRSYDGAAMGKFISANASDLPR-VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRG 153 (343) Q Consensus 75 ~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~-v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~ 153 (343) .+.+...+.+......+.+.+++.+ ..+|. .+..++......+.++.-+.+|.+=++.+ ..+|..--....+.+ T Consensus 130 -~~~~~~~~~~~~~~~~~~a~~v~Eg-~~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds---~~~l~~~i~~~l~~a 204 (371) T protein:vir:81 130 -VTTLSGSRVFKKRSQQTGFVEVAEG-AAIGEKATPQFTLLQYQVKKYAGFFRVTNELLNDS---TEAIVNTLVRWIGDE 204 (371) T ss_pred -ccCCceeEEEEeecCCcceeeeccc-cccccccccceeeEEeeeeEEEEeehhhHHHHhhh---hHHHHHHHHHHHHHH Confidence 2222333344444445667777765 34663 45678899999999998888877655543 346777778888899 Q ss_pred HHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhc Q lcl|NC_011142. 154 SEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASS 233 (343) Q Consensus 154 ~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~ 233 (343) +++.+|+.+++|+....-.|. .+.+++...+...+.. .....-.++|+|..+..|.. T Consensus 205 ~~~~~~~~i~~g~g~~~~~~~-----------------~~~~~i~~~~~~~l~~------~~~~~a~~vmn~~~~~~L~~ 261 (371) T protein:vir:81 205 SRVTRNGLIINVLNTKAKTAI-----------------ADLDGLKQIINVQLDP------VFRSTSSVIVNQDAFNWLDT 261 (371) T ss_pred HHHHHHHHHHhhccccccccc-----------------ccHHHHHHHHHhhcch------hhhcCCEEEEcHHHHHHHHH Confidence 999999999999764321111 1233333333322211 12233579999999999875 Q ss_pred cccCCCCCccHHHHHHhcCc----ceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc Q lcl|NC_011142. 234 LLMTGYTDRTVIEHFQINNA----YTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML 309 (343) Q Consensus 234 ~~~~~~~~~tvle~l~~n~~----~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~ 309 (343) .. +..|. ||-..++ ...+-|.|+...... ..... ...+.+....-++|-+=.+.+.+.....++.. 
T Consensus 262 lk--d~~g~----~l~~~~~~~~~~~~l~G~pV~~~~~~--~~~~~--~~~~~~~~~~~i~~Gd~~~~~~~~~~~~~~i~ 331 (371) T protein:vir:81 262 LK--DQNGQ----YLLQPSISSPTGRQLLGLPVVIVSNK--VLANR--VDGGTGAQFAPIIVGDLKEAVVMFDRQRTEIM 331 (371) T ss_pred hh--ccCCC----eeeecccCCCCCceecceeEEEeccc--ccCcc--ccccccCCcceEEEEehhceEEEEeecceEEE Confidence 32 32332 1111111 122334554332211 00000 01111111222333321222333222222222 Q ss_pred cceec-----CceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 310 APQLL-----GLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 310 ~~~~~-----~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ..... .-...+.++.++ +..+++|.+++.++ + T Consensus 332 ~~~~~~~~f~~~~v~~~~~~r~-d~~~~~~~a~~~~~-~ 368 (371) T protein:vir:81 332 SSNVAMDAFETDATLWRAIERM-DVKMRDDEAFVFGE-V 368 (371) T ss_pred EeccccchhhcCceEEEEEEee-ccEEecccceEEEE-E Confidence 11111 112455667776 56888899998887 6 No 96 >protein:vir:1025 Length: 408 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:20 # MgeName: bIL286 # Cross-refs: genbank:acc:NP_076679;genbank:gi:13095788;genbank:GeneID:920362 Probab=97.31 E-value=9.2e-05 Score=42.71 Aligned_cols=301 Identities=9% Similarity=0.027 Sum_probs=138.1 Q ss_pred CCcceeccchhhhhchhhhchhc-------ccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhcccc Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFL-------DSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVL 73 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~-------~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~ 73 (343) -.++...-+.+.-+..+..+... ..+......++..-..+.|.++.. +.+.+.|++..+.....+.+..+. 
T Consensus 76 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~t~~~gg~~vP--~~~~~~Ii~~~~~~~~l~~~~~~~ 153 (408) T protein:vir:10 76 EEEKGPLNKSENELKDKFVKDFVNMVRNPMAFMNTVSSKTETSGSDSAAGLTIP--QDIRTMINTLVRQYDSLQQYVRVE 153 (408) T ss_pred cccccccccchhhhHHHHHHHHHHHhhcchhhhhhhhhhhhhcccccCCceecc--HhHHHHHHHHHHhhchhhhhccee Confidence 00000000000000000000000 000000111111111233445555 455667888777776666664432 Q ss_pred CCCCcceeEEEEe-eecccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHH Q lcl|NC_011142. 74 ANIPEYATHWNYR-SYDGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAF 151 (343) Q Consensus 74 ~~~~~~~~~~~~~-~~~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~ 151 (343) . .+...-.+.+. ..+..+.+.+++..+ .+|..+ ..++......+.++.-+.+|.+=|+. ...+|..--....+ T Consensus 154 ~-~~~~~~~~~~~~~~~~~~~a~~v~E~~-~~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~ 228 (408) T protein:vir:10 154 S-VSTSNGSRVYEKWTDVTPLTVMDAEDG-KIPDLDNPQLTIIKYLIKRYAGIITATNTSLKD---TAENILAWLSSWIA 228 (408) T ss_pred e-ccCCcceEEEeeccccccceeeecCcc-ccccccCcceeeEEeeeeeEEeeehhHHHHHhh---chHHHHHHHHHHHH Confidence 1 11111122222 223446667776653 355433 46788888899999888877654443 35567777888889 Q ss_pred HHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHH Q lcl|NC_011142. 152 RGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRA 231 (343) Q Consensus 152 ~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L 231 (343) +++...+|+-+++|+.... +..+ ..+.+++++.++.. +.. 
.....-.++++|..|..| T Consensus 229 ~~~~~~~~~~il~g~g~~~----------~~~~------~~~~~~l~~~~~~~---~~~---~~~~~a~~v~n~~~~~~l 286 (408) T protein:vir:10 229 KKVVVTRNQAIIEVMKAAP----------KKPT------IAKFDDVITMINTA---VDP---AIIATSSLLTNQSGLNKL 286 (408) T ss_pred HHHHHHHHHHHhhcccccc----------cccc------cccHHHHHHHHHHh---hhh---hhccCCEEEEcHHHHHHH Confidence 9999999999999976321 1111 12334443333322 211 222334689999999999 Q ss_pred hccccCCCCCccHHHH-HHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc Q lcl|NC_011142. 232 SSLLMTGYTDRTVIEH-FQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA 310 (343) Q Consensus 232 ~~~~~~~~~~~tvle~-l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~ 310 (343) .+-. +..|.-++.- +....+ ..+-|.|+.+.... ..+....+ +..++|-+=.+.+.+..-..++... T Consensus 287 ~~lk--d~~G~~i~~~~~~~~~~-~~l~G~PV~~~~~~--------~~~~~~~~-~~~i~~gd~~~~~~~~~~~~~~v~~ 354 (408) T protein:vir:10 287 ALVK--TAEGKYLLEPDPTKPNS-YLIKGKQVIVVADR--------WLPNTGST-VYPLYYGDMSQAITLFDRENMSLLP 354 (408) T ss_pred HHhh--ccCCceEeccCcCCCCC-ceecceeeEEeccc--------ccCccCCC-ceEEEEEehhccEEEEEecceEEEE Confidence 7532 4344333211 111111 12334443221110 00111111 2223332222333332222333221 Q ss_pred c-eecC----ceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 311 P-QLLG----LGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 311 ~-~~~~----~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) . +... 
-...+.++.|+ ++.+.+|.+++.++.- T Consensus 355 ~~~~~~~f~~~~~~~r~~~r~-d~~v~~~~a~~~~~~~ 391 (408) T protein:vir:10 355 TNIGAGAFETDTTKIRVIDRF-DVKATDSEALVAGSFS 391 (408) T ss_pred cccccchhhcCceEEEEEEee-ccEEeccccEEEEEee Confidence 1 1111 12455667777 5678889999988866 No 97 >protein:vir:1268 Length: 397 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:329 # MgeName: phi-105 # Cross-refs: genbank:acc:NP_690760;genbank:gi:22855000;genbank:GeneID:955203 Probab=97.28 E-value=5.6e-05 Score=43.91 Aligned_cols=298 Identities=10% Similarity=-0.008 Sum_probs=137.8 Q ss_pred CCcceeccchhhhhchhhhchhcccccccC-------------cchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccch Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIG-------------VPSVVNDADGGAAYYISQLASLETTVYEVPYADITYL 67 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-------------~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~ 67 (343) ..+.+...+-......+..+.......+-. ..++.....+.|.++.. +.+.+.|++..+.....+ T Consensus 77 ~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~gg~lvP--~~~~~~ii~~~~~~~~l~ 154 (397) T protein:vir:12 77 PEGQRSQGQGNEERQQQYSKAFLKGLRGKRLTDEERDLLDSPEFRAMSGINDEDGGILIP--EDIGRQIHEFKRQFEPLE 154 (397) T ss_pred hcccccccchhhHHHHHHHHHHHHHHhccCCcHHHHHHHhhhhhhhccccccccCcccCc--hhHHHHHHHhhhhhhhHH Confidence 000000000000000001110000000000 00111111223444444 455667888777776666 Q ss_pred hhccccCCCCcceeEEEEeeecccccceeecCCcCcccee-eeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHH Q lcl|NC_011142. 68 EDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRV-AQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQ 146 (343) Q Consensus 68 ~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v-~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k 146 (343) .++.+.. .+.+.-.+.+......+.+.+++.++. +|-. ...++........++.-..+|.+=++. 
...+|..-- T Consensus 155 ~~~~~~~-~~~~~~~~~~~~~~~~~~a~~v~Eg~~-~~~~~~~~~~~v~~~~~k~~~~~~is~e~l~d---s~~~l~~~i 229 (397) T protein:vir:12 155 QYVTVEP-VTTRSGTRLLEKNADMVPFSPVEELGN-LPEIDQPRFTKVSYSIIDYGGIMTLSNSMLND---SDQAIMTYV 229 (397) T ss_pred hhcceee-ccCCceeEEEEEecCCcceeeeccccc-ccccccccceeEEeeheeeEeeehhhHHHHhh---chHHHHHHH Confidence 6654421 111222333444444556777776643 4533 346788888888888888877654433 345677778 Q ss_pred HHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHH-HHHHhcCCeecccEEEecH Q lcl|NC_011142. 147 AELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVF-AVVKASKRFHTPNTVLMFP 225 (343) Q Consensus 147 ~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~-~l~~~s~g~~~p~tL~l~p 225 (343) ....++++++.+|..+++|+....-.|++ + ++||.+++. .+. ........++++| T Consensus 230 ~~~l~~~~~~~~d~~il~G~g~~~~~g~~-----------------~----~~~i~~~~~~~l~---~~~~~~a~~~~n~ 285 (397) T protein:vir:12 230 AKWFAKKSVVTRNNLILAAIASLKKVDID-----------------G----LDGIKKALNVTLD---PMVAPGSIVLTNQ 285 (397) T ss_pred HHHHHHHHHHHHHHHHHhccccccccccc-----------------c----HHHHHHHHhhccc---hhhhCCCEEEEcH Confidence 88889999999999999998643222221 1 234444442 221 1222345799999 Q ss_pred HHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccc Q lcl|NC_011142. 226 DLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIP 305 (343) Q Consensus 226 ~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~ 305 (343) ..|..|.+-. +..|.-++.--..+..-..+-|.|+.... .+......+ +.-+++-+=.+.+.+..-+. 
T Consensus 286 ~~~~~L~~lk--d~~G~~l~~~~~~~g~~~~l~G~pv~~~~---------~~~~~~~~~-~~~~~~gd~~~~~~~~~~~~ 353 (397) T protein:vir:12 286 DGYDWLDTLK--DGTGRYLLQPDPTNPTKKLLDGRPVVPFT---------NRVLKTQKG-KAPLIIGNLKEAIVLFDREQ 353 (397) T ss_pred HHHHHHHHhh--ccCCceeecccccCCCCccccceeeEEec---------ccccccCCC-ccEEEEEehhceEEEEeecc Confidence 9999996432 33332111000001111122344432111 010011112 22233333233333332223 Q ss_pred hhcccce------ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 306 FRMLAPQ------LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 306 ~~~~~~~------~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++..-.. .++ ...+.++.++ +..+++|.|++.++-= T Consensus 354 ~~i~~~~~~~~~f~~~-~~~~r~~~r~-d~~~~~~~a~~~~~~t 395 (397) T protein:vir:12 354 QSIASTDTGAGAFETN-STKVRGIERE-DVRKWDEDAVVFGQIT 395 (397) T ss_pred eEEEEeccccchhhcC-ceEEEEEEee-ccEEecccceEEEEEe Confidence 2221111 112 3456677877 4677999999888766 No 98 >protein:vir:4997 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:109 # MgeName: Sfi21 # Cross-refs: genbank:acc:NP_049971;genbank:gi:9632943;genbank:GeneID:1262106 Probab=97.20 E-value=9.3e-05 Score=42.68 Aligned_cols=300 Identities=9% Similarity=-0.002 Sum_probs=139.4 Q ss_pred CCcceeccchhhhhchhhhchhcccccccC---cchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIG---VPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIP 77 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~ 77 (343) ..++...-+-..-....+.+.......+-. ..++..-....|.++.+ +.+...|++........+++..+. 
+.+ T Consensus 73 ~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~t~~~gg~~iP--~~~~~~ii~~~~~~~~l~~~~~~~-~~~ 149 (397) T protein:vir:49 73 EEEKKPLTKNEEEVKANFVKDFKNLVRGRYQNLLDSKTDGSGSDAGLTIP--QDIRTAINTLVRQFDSLQEYVNVE-NVT 149 (397) T ss_pred ccccccccchhhHHHHHHHHHHHHHhhcchhhHHHhhhccCCccCcceec--HHHHHHHHHHHHhhhhHhhhccee-ecc Confidence 111111111111111111110000000000 00001001122334443 334456777666666666665542 122 Q ss_pred cceeEEEEeee-cccccceeecCCcCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHH Q lcl|NC_011142. 78 EYATHWNYRSY-DGAAMGKFISANASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSE 155 (343) Q Consensus 78 ~~~~~~~~~~~-~~~G~a~~~~~~~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~ 155 (343) .+...+.+... +..+.+.+++.. ..+|..+ ..++......+.++.-+.+|.+=|+. ...++..--....+++++ T Consensus 150 ~~~~~~~~~~~~~~~~~a~~v~E~-~~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~d---s~~~l~~~i~~~l~~~~~ 225 (397) T protein:vir:49 150 TLTGSRVYEKWADITGLAKLDDEG-GQIGQNDDPKLSLIRYAIKRYAGISTVTNSLLAD---SAENILAWLSGWIAKKVV 225 (397) T ss_pred CCcceEEEEeeccCCcceeeeccc-cccccccccceeeeEeeeeeeEeehhhHHHHHhh---hhHHHHHHHHHHHHHHHH Confidence 23333444433 334566776654 3355444 35777788888888877776543433 345677778888999999 Q ss_pred HhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccc Q lcl|NC_011142. 156 EHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLL 235 (343) Q Consensus 156 ~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~ 235 (343) +.+|+-+++|+.... +..+. .+ +++|.+++.++... ...+..++|+|..|..|.+-. 
T Consensus 226 ~~~d~ail~G~g~~~----------~~~~~------~~----~d~i~~~~~~l~~~---~~~~a~~v~n~~~~~~l~~lk 282 (397) T protein:vir:49 226 VTRNKAILEAIGTLP----------NKPTL------AK----WDDIIDLQAKVDPA---IKQTSLFLTNTSGFTALKKVK 282 (397) T ss_pred HHHHHHHHhcccccc----------ccccc------cC----HHHHHHHHHhhhhh---hcCCCEEEEcHHHHHHHHHhh Confidence 999999999975321 11111 12 35666777776532 234678999999999997533 Q ss_pred cCCCCCccHHHH-HHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccce-- Q lcl|NC_011142. 236 MTGYTDRTVIEH-FQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQ-- 312 (343) Q Consensus 236 ~~~~~~~tvle~-l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~-- 312 (343) +..|.-++.- +. ++....+-|.|+...... . ...+.++ +..++|-+=.+.+.+..-..++..--. T Consensus 283 --d~~g~~l~~~~~~-~g~~~~l~G~pV~~~~~~-------~-~~~~~~~-~~~~~~gd~~~~~~~~~~~~~~i~~~~~~ 350 (397) T protein:vir:49 283 --NAMGDYLMERDVK-SPTGYSIDGFVVKEISDR-------F-LPNGTGG-AMPLYFGDLKQAVTLFDRQHLSLLSTNIG 350 (397) T ss_pred --ccCCceeeccccc-CCCCceecceeeEEeccc-------c-cccccCC-ceeEEEeeccceEEEEeecccEEEEeccc Confidence 3333222110 11 111112334443221100 0 0111122 222333332233332221222222111 Q ss_pred ---ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
313 ---LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 313 ---~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ...-...+.++.+++| .+.+|.|++.++.= T Consensus 351 ~~~~~~~~~~~~~~~r~d~-~~~~~~a~~~~~~~ 383 (397) T protein:vir:49 351 GGAFETDTTKVRVIDRFDV-VSTDTEAFVPASFK 383 (397) T ss_pred cchhhcCeeeEEEEEeecc-EEecccceEEEEec Confidence 0111345567788754 57889999988643 No 99 >protein:vir:3845 Length: 395 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:322 # MgeName: phi adh # Cross-refs: genbank:acc:NP_050151;swissprot:trembl:q9t1f6;genbank:gi:9633043;uniprot:Q9T1F6;genbank:GeneID:1262163 Probab=97.18 E-value=8.3e-05 Score=42.97 Aligned_cols=294 Identities=7% Similarity=-0.011 Sum_probs=138.1 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) +..+.-..+++.+.. .+.+........ +. . ..+.|.++.+ +.+.+.|++........+.+..+. +..... T Consensus 81 ~~~~~~~~~~~~~~~-~~~~~~~~~~~~-~~---~--~~~~gg~~vP--~~~~~~ii~~~~~~~~l~~~~~~~-~~~~~~ 150 (395) T protein:vir:38 81 LPVKDGKPDAQAMKN-QFVKDFKNLVTS-GT---T--GTGNAGLTIP--EDIQLQIRTLTRSFTSLESLANVE-NVTTSH 150 (395) T ss_pred cchhhhhHHHHHHHH-HHHHHHHHHHhh-cc---C--ccCCCceecc--hhHhhHHHHHHHhhcchhhhccee-eccCCc Confidence 333333333333322 222211111111 10 0 1122333333 344567788777776666665432 111122 Q ss_pred eEEEEeee-cccccceeecCCcCcccee-eeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhh Q lcl|NC_011142. 81 THWNYRSY-DGAAMGKFISANASDLPRV-AQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHS 158 (343) Q Consensus 81 ~~~~~~~~-~~~G~a~~~~~~~~dip~v-~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~ 158 (343) ..+.+... +..+.+.+++.. ..+|.. 
...++......+.++.-+.+|..=++ ....+|..--....++++.+.+ T Consensus 151 ~~~~~~~~~~~~~~a~~v~E~-~~~~~~~~~~f~~v~~~~~k~~~~~~iS~ell~---ds~~~l~~~i~~~la~~~~~~~ 226 (395) T protein:vir:38 151 GSRVYEKLADITPLKDLDDES-ALIGDNDDPELTVVKYLIHRYAGITTVTNTLLK---DTVDNIIQWLVNWAAKKDVVTR 226 (395) T ss_pred ceEEEEeeccCCccccccccc-cccccccccceeeEEeeeeeeEeehhhHHHHHh---hhHHHHHHHHHHHHHHHHHHHH Confidence 22333322 233445555544 345543 35677888888888888777654332 2355677778888999999999 Q ss_pred hheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCC Q lcl|NC_011142. 159 QRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTG 238 (343) Q Consensus 159 n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~ 238 (343) |+-+++|+.... +... . .+.++ |.+++...... .......++|+|+.|..|.... + T Consensus 227 ~~~il~g~g~~~----------~~~~----~--~~~~~----i~~~~~~~l~~--~~~~~a~~v~n~~~~~~L~~lk--d 282 (395) T protein:vir:38 227 NAKILEVMGKAP----------KKPT----I--SQFDN----IKDLENNTLDP--AIESTSSFITNQSGYNILSKVK--D 282 (395) T ss_pred HHHHhhcccccc----------cccc----c--ccHHH----HHHHHHHhhhh--hhcCCCEEEEcHHHHHHHHHhh--c Confidence 999999976421 1110 1 12333 33333321111 1223356899999999997533 3 Q ss_pred CCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccce-ec--- Q lcl|NC_011142. 239 YTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQ-LL--- 314 (343) Q Consensus 239 ~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~-~~--- 314 (343) ..|.-++.--..++.-..+-|.|+..... +...+.++... ++|-+-.+.+.+..-..++..-.. .. 
T Consensus 283 ~~G~~l~~~~~~~~~~~~l~G~pV~~~~~---------~~~~~~~~~~~-i~~gd~~~~~~i~~~~~~~i~~~~~~~~~~ 352 (395) T protein:vir:38 283 ADGRYLMQPDVTSPDKYLIDGKPVIRIAD---------KWLPDVSGSHP-LYFGDLKQGITLFDRQQMQIDTTNVGAGSF 352 (395) T ss_pred cCCceeeccCcCCCCcceeccceeEEecc---------cccCcCCCcce-EEEEeccccEEEEEecceEEEEeccccchh Confidence 33332211000111111233444322211 00011112222 333332333333322333222111 11 Q ss_pred -CceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 315 -GLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 315 -~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .-.+.+.++.++ ++.+.+|.+++.++.- T Consensus 353 ~~~~~~~r~~~r~-d~~~~~~~a~~~~~~~ 381 (395) T protein:vir:38 353 EHDTTKLRFIDRF-DVQLIDDGAFAAASFK 381 (395) T ss_pred hcCceEEEEEEee-ccEEecccceEEEEee Confidence 112455667776 5678889999999977 No 100 >protein:vir:3870 Length: 400 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:82 # MgeName: A2 # Cross-refs: genbank:acc:NP_680487;swissprot:trembl:q8ltc0;genbank:gi:22296527;interpro:IPR006444;uniprot:Q8LTC0;genbank:GeneID:951713 Probab=97.13 E-value=3.4e-05 Score=45.05 Aligned_cols=293 Identities=9% Similarity=0.000 Sum_probs=132.5 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) +++....--.+.-.+.............--....... +.|.++.+ +.+.+.|++.....-..+.++.+. +-.. 
T Consensus 103 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~gg~~vP--~~~~~~ii~~~~~~~~l~~~~~~~---~~~~ 175 (400) T protein:vir:38 103 GRNTDGVNFEKTDVGTFAVLRAVPTDASDAVNAGVKA--ADAASTIP--ETISNTPQRELQTVVDLKPFTNVF---QAST 175 (400) T ss_pred HHHHHHHHHHHHHHHHHhhhhhhhHHHHHHHhhcccc--cCCccccc--HHHHHHHHHHHHhhhhhhhcceeE---eccC Confidence 0000000000000000000000000000000000111 22344444 455677777666665555555542 2222 Q ss_pred eEEEEeeec-ccccceeecCCcCccc-eeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhh Q lcl|NC_011142. 81 THWNYRSYD-GAAMGKFISANASDLP-RVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHS 158 (343) Q Consensus 81 ~~~~~~~~~-~~G~a~~~~~~~~dip-~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~ 158 (343) .+..|.+.. ..+.+.+++..+. .| ..+..++........++.-+.+|.+=|+ ....++..--....+.++...+ T Consensus 176 ~~~~~~~~~~~~~~~~~~~E~~~-~~~~~~~~f~~i~~~~~k~~~~~~is~ell~---ds~~~~~~~i~~~l~~~~~~~~ 251 (400) T protein:vir:38 176 QKGTYPTVANATTKMVTVAELEK-NPAMAKPEFKPVNWSVETYRQALPVSQESID---DSAIDLVGLIAQNGQQIKVNTT 251 (400) T ss_pred cceEEEEEecCCCcccccccccc-ccccccccceeeEeehhheeeehhhHHHHHh---hhHHHHHHHHHHHHHHHHHHHH Confidence 244555544 4566777766543 44 3455677888888888877777664232 3345677777777888888999 Q ss_pred hheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCC Q lcl|NC_011142. 159 QRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTG 238 (343) Q Consensus 159 n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~ 238 (343) |..+++|....... . ..+.++ +.+++...... . 
....++|+|..|..|..- .+ T Consensus 252 ~~~i~~~~~~~~~~---------------~--~~~~~~----~~~~~~~~~~~-~---~~a~~v~~~~~~~~l~~l--kd 304 (400) T protein:vir:38 252 NGAVATLLKGFTAK---------------T--ISSVDD----LKHINNVDLDP-A---YSRVIIASQSFYNFLDTV--KD 304 (400) T ss_pred HHhhhhcccccccc---------------c--cccHHH----HHHHHHhhhhh-h---hCcEEEEcHHHHHHHHHh--hc Confidence 99998886532100 1 112333 33333322111 1 135799999999998743 23 Q ss_pred CCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCcee Q lcl|NC_011142. 239 YTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLGI 318 (343) Q Consensus 239 ~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~ 318 (343) ..|.-++.-=..++.-..+-|.|+.... ... .+..| |-.++|-+=.+.+.+..-..++........... T Consensus 305 ~~G~~i~~~~~~~~~~~~l~G~pv~~~~-------~~~---~~~~g-~~~~~~gd~s~~~~~~~~~~~~~~~~~~~~~~~ 373 (400) T protein:vir:38 305 GNGRYLLQDSILTPSGKSVLGMPIAVVS-------DDT---LGAAG-EAHAFLGDIKRAILFANRADFMVRWVDDQIYGQ 373 (400) T ss_pred cCCCeeeecCcCCCCccccccceeEEec-------ccc---cCCCC-ceEEEEEeccccEEEEeecceEEEEecccccce Confidence 3333222100001111123344432211 111 11122 333444332222222222223322222222234 Q ss_pred EeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 319 TVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 319 ~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .+.++.|++ +.+..|.+|+.+..= T Consensus 374 ~~~~~~r~d-~~~~~~~a~~~l~~~ 397 (400) T protein:vir:38 374 FLQAGMRFG-VSVADEKAGYFLTYT 397 (400) T ss_pred eEEEEEEec-cEEecccceEEEEee Confidence 456778874 556679999987777 No 101 >protein:vir:100172 Length: 394 # NCBI annotation: putative major head protein # Family: family:all:21 # MgeID: mge:1524 # MgeName: phi AT3 # Cross-refs: genbank:acc:YP_025031;genbank:gi:48697264;genbank:GeneID:2948270 Probab=97.07 E-value=0.00018 Score=41.15 Aligned_cols=294 Identities=11% Similarity=0.047 Sum_probs=135.1 Q ss_pred CCcceeccchh---hhhchhhhchhccc-----ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccc Q lcl|NC_011142. 
1 MSEKRVVIDAQ---TIAGNRWLNKFLDS-----NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPV 72 (343) Q Consensus 1 ~~~~~~~~~~~---~~~~~~~~~~~~~~-----~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v 72 (343) ..+....-+.. .-...+.++..+-. ..+.+. .+ .+.|.++.. +.+.+.|++........+.++.+ T Consensus 75 ~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~---~t--~~~gg~~vP--~~~~~~ii~~~~~~~~l~~~~~~ 147 (394) T protein:vir:10 75 DNAQPNGTDLKKKPIDAKKKAINDFIHSHGKVIDNAAGH---VT--STEAGVLIP--EEIIYDPTAEVNSVVDLSTLVTK 147 (394) T ss_pred hhhcccccchhhhHHHHHHHHHHHHHhccchhhhhhhcc---cc--cccCceecc--HHHHHHHHHHHHhhhhhhhhcee Confidence 00000000000 00000011110000 001110 11 123344444 45667788877777666666554 Q ss_pred cCCCCcceeEEEEeeecc-cccceeecCCcCccce-eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHH Q lcl|NC_011142. 73 LANIPEYATHWNYRSYDG-AAMGKFISANASDLPR-VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELA 150 (343) Q Consensus 73 ~~~~~~~~~~~~~~~~~~-~G~a~~~~~~~~dip~-v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA 150 (343) . +....+..|.+... .+.+.+++..+ ..|- -+..++.....++.++.-..+|.+=|+.+ ..+|..--.... T Consensus 148 ~---~~~~~~~~~~~~~~~~~~~~~~~E~~-~~~~~~~~~~~~v~l~~~k~~~~~~iS~ell~ds---~~~l~~~i~~~l 220 (394) T protein:vir:10 148 T---PVTTPKGTYPILKRATDRFSSVAELA-ENPALAEPEFEQVDWSVSTYRGAIPLSEEAIADS---AVDLTSLVGQSI 220 (394) T ss_pred e---eccCCceEEEEEecCCCccccccccc-cccccccccceeEEeeeeeeEeeehhHHHHHhhh---hHHHHHHHHHHH Confidence 2 22222344444443 46666776654 3453 34578888888888888888877655543 346777777888 Q ss_pred HHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHH Q lcl|NC_011142. 151 FRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKR 230 (343) Q Consensus 151 ~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~ 230 (343) +++++..+|+-+++|.... .+.... +..+. ++|.+++...... .+ .-.++|+|+.|.. 
T Consensus 221 a~~~~~~~~~~il~g~g~~----------~~~~~~----~~~~~----d~l~~~~~~~~~~-~~---~a~~vmn~~~~~~ 278 (394) T protein:vir:10 221 NEKSVNTYNAMIAPVLQSF----------TAKATT----TDTLV----DSLKHILNVDLDP-AY---SRALVVTQSLFNT 278 (394) T ss_pred HHHHHHHHHHHHhhccccc----------cccccc----ccccH----HHHHHHHHhhhhh-hc---cCEEEecHHHHHH Confidence 8899999999998887531 111111 11223 3444444322211 12 2469999999999 Q ss_pred HhccccCCCCCccHHHHHHhc----CcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccch Q lcl|NC_011142. 231 ASSLLMTGYTDRTVIEHFQIN----NAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPF 306 (343) Q Consensus 231 L~~~~~~~~~~~tvle~l~~n----~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~ 306 (343) |..-. +..|.-++.--..+ +....+-|.|+... .. ...+.+..+-.++|-+=.+.+.+..-..+ T Consensus 279 l~~lk--d~~G~~i~~~~~~~~~~~~~~~~L~G~PV~~~-------~~---~~~~~~~~~~~i~~gd~s~~~~~~~~~~~ 346 (394) T protein:vir:10 279 LDTLK--DKNGRYLLHDASDSITDGTAKGTVLGVPVYVV-------GD---ALLGSAAGDQKAFVGDLKRGVLFADRQQV 346 (394) T ss_pred HHHhh--ccCCCeeeeccccccccCCcccccccceeEEe-------cc---cccCCCCCceEEEEeeccccEEEEeecce Confidence 97532 33332211100000 00012233343221 10 00111112333344322232333222333 Q ss_pred hcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
307 RMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 307 ~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +............+..+.|++ +.+++|.+|+++..= T Consensus 347 ~v~~~~~~~~~~~~~~~~r~d-~~~~~~~ai~~~~~~ 382 (394) T protein:vir:10 347 TLAWEDSKIYGRYLGAAFRFG-VKQADSNAGYFVTNT 382 (394) T ss_pred EEEEecccccceeEEEEEEec-cEEeccccEEEEEee Confidence 332222222233445667875 677779999887644 No 102 >protein:vir:4830 Length: 397 # NCBI annotation: MPL-7201 # Family: family:all:21 # MgeID: mge:105 # MgeName: 7201 # Cross-refs: genbank:acc:NP_038327;genbank:gi:9634653;genbank:GeneID:1262632 Probab=97.06 E-value=0.00018 Score=41.14 Aligned_cols=299 Identities=7% Similarity=-0.053 Sum_probs=135.5 Q ss_pred CCcceeccchhhhhchhhhchhccccccc-----CcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATI-----GVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN 75 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~ 75 (343) ...++...+...-..++..+.......+- ..-+..+. ..+.++.. +.+.+.|++........+.+..+.. T Consensus 73 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~t~--~~gg~~iP--~~~~~~ii~~~~~~~~l~~~~~~~~- 147 (397) T protein:vir:48 73 EEEKKPLTKSEEEVKAGFVKDFKNLVRGRYQNLLDSKTDASG--SDAGLTIP--QDIQTAIHTLVRQYDSLQEYVNVEN- 147 (397) T ss_pred hhccccccchhhHHHHHHHHHHHHHHhhhhhHHHHHhhccCC--cccccccc--HHHHHHHHHHHHHHHHHHhhhceee- Confidence 00011111111111111111000000000 00000111 12333333 3445677887776666666655422 Q ss_pred CCcceeEEEEe-eecccccceeecCCcCcccee-eeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHH Q lcl|NC_011142. 76 IPEYATHWNYR-SYDGAAMGKFISANASDLPRV-AQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRG 153 (343) Q Consensus 76 ~~~~~~~~~~~-~~~~~G~a~~~~~~~~dip~v-~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~ 153 (343) .+.+...+.+. ..+..+.+.+++.. ..+|.. ...++........++.-+.+|.+=|+. 
...++..--....+++ T Consensus 148 ~~~~~~~~~~~~~~~~~~~a~~v~E~-~~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~d---s~~~l~~~v~~~l~~~ 223 (397) T protein:vir:48 148 VTTLTGSRVYEKWADITGLAKLDDEA-GSIGTNDDPKLYPIRYAIKRYAGISTVTNSLLAD---SAENILAWLSGWIAKK 223 (397) T ss_pred ccCCcceEEEEeecCCCcceeeeccc-cccccccccceeeEEeeheeeeeehhhHHHHHhh---chHHHHHHHHHHHHHH Confidence 22122222222 22344556666654 345544 356788888888888888887654443 3456777788889999 Q ss_pred HHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhc Q lcl|NC_011142. 154 SEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASS 233 (343) Q Consensus 154 ~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~ 233 (343) +++.+|+.+++|+...+..| + . .+ +++|.+++.++... + ..+..++++|..|..|.. T Consensus 224 ~~~~~d~~il~G~g~~~~~~----------~----~--~~----~d~i~~~~~~l~~~--~-~~~a~~v~n~~~~~~L~~ 280 (397) T protein:vir:48 224 VVVTRNKAILEAIATLPTKP----------T----L--TK----WDDIIDLQAKVDPA--I-KQTSFFLTNTSGFTALKK 280 (397) T ss_pred HHHHHHHHHhhccccccccc----------c----c--cc----HHHHHHHHHHhhhh--h-cCCCEEEECHHHHHHHHH Confidence 99999999999975432111 0 0 12 34555666665432 2 345789999999999975 Q ss_pred cccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc--cc Q lcl|NC_011142. 234 LLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML--AP 311 (343) Q Consensus 234 ~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~--~~ 311 (343) .. +..|.-++.--..++.-..+-|.|+.+..... ...+..+...++ |-+=.+.+.+..-..++.. .. 
T Consensus 281 lk--d~~G~~i~~~~~~~~~~~~l~G~PV~~~~~~~--------~~~~~~~~~~~~-~gd~~~~~~~~~~~~~~i~~~~~ 349 (397) T protein:vir:48 281 VK--NAFGDYLMERDVKSPTGYSIDGFAVKEVADRW--------LANASSGAMPLY-FGDLKQAVTLFDRQQMSLLSTNI 349 (397) T ss_pred hh--cCCCceeeccCcCCCCCceeccceeEEecccc--------cCCcCCCceEEE-EEeccceEEEEeecceEEEEecc Confidence 32 33332221100011111233444432211100 011122233333 3221222222221222211 10 Q ss_pred ---eecCceeEeeeeeeeeeEEEECcceeeeec--cC Q lcl|NC_011142. 312 ---QLLGLGITVPAEYKISGTEYRYPLCAQYVD--ML 343 (343) Q Consensus 312 ---~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d--GI 343 (343) ....-...+.++.++ ++.+++|.+++.++ .. T Consensus 350 ~~~~~~~~~~~~r~~~r~-d~~~~~~~a~~~~~~~~~ 385 (397) T protein:vir:48 350 GGGAFETDTTKIRVIDRF-DVVATDTESFVPASFKAI 385 (397) T ss_pred chhhhhcCceeEEEEeee-ccEEecccceEEEEeccc Confidence 011112455567776 46778899986655 33 No 103 >protein:vir:93742 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1475 # MgeName: 55 # Cross-refs: genbank:acc:YP_240459;genbank:gi:66396126;genbank:GeneID:5133511 Probab=97.03 E-value=0.0002 Score=40.86 Aligned_cols=264 Identities=7% Similarity=-0.056 Sum_probs=131.4 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~~d 103 (343) |++ ..+.- .-.+..+.|. +.+.+.....+....+..+... +.+|. 
++.++.+...|.++.+.++ ++ T Consensus 1 ma~----~~T~~---~~~iiPev~~---~~v~~~~~~~~~~~~~~~~~~~l~g~~G~-tv~ip~~~~~g~~~~~~eg-~~ 68 (274) T protein:vir:93 1 MPQ----GITKT---SNQIIPEVLA---PMMQAQLEKKLRFASFAEVDSTLQGQPGD-TLTFPAFVYSGDAQVVAEG-EK 68 (274) T ss_pred CCc----cceeh---hheechHHHH---HHHHHHHHhhhhhcccccccccccCCCCC-EEEEEeeccCCCcccccCC-Cc Confidence 221 11111 1122223232 2233333344444455544332 22343 5677777777888888664 57 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccc Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKT 183 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~ 183 (343) ++..+...+.....+...+.+|.+ .|+..++. +.++-..-...+.+.+++..|+.++..-.+. + ... T Consensus 69 i~~~~it~~~~~~~i~~~~~~~~i--~D~~~~~~-~~d~~~~~~~~~~~~~a~~~d~~~~~~~~~a--------~-~~~- 135 (274) T protein:vir:93 69 IPTDILETKKREAKIRKIAKGTSI--TDEALLSG-YGDPQGEQVRQHGLAHANKVDNDVLEALMGA--------K-LTV- 135 (274) T ss_pred ccccccccceeEEEeeeecccccc--cHHHHHhh-ccchHHHHHHHHHHHHHHHHHHHHHHHHhcc--------c-ccc- Confidence 888888888888888887766555 55554443 4455566777788888888887766332110 0 001 Q ss_pred ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccH-HHHHHhcCcceeeccccc Q lcl|NC_011142. 184 SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTV-IEHFQINNAYTLLTRNPI 262 (343) Q Consensus 184 ~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tv-le~l~~n~~~~~~~~~p~ 262 (343) ++ +..+ +++|.+++.++-.. . 
..+..|+++|..+..|.+.........+- .+-+..++..-.+.|.++ T Consensus 136 --~~--~~~~----~d~i~dA~~~l~d~--~-~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~V 204 (274) T protein:vir:93 136 --NA--DITK----LNGLQSAIDKFNDE--D-LEPMVLFINPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAII 204 (274) T ss_pred --cc--cccC----HHHHHHHHHHhhhc--c-CCccEEEeCHHHHHHHHhhhhhcccccccccccceeecccceecCeeE Confidence 10 0112 45666777766543 2 36789999999999997431111000000 011122333333333332 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ..+.. .. .+ .+|.-.+..+.+....+.+...- .++...-.+.... ..|+-+.+|.+++.+. T Consensus 205 i~s~~----------~p---~~----t~~l~~~gai~~~~~~~~~vE~~Rd~~~~~d~i~~~~-~y~~~~~~~~~~v~~t 266 (274) T protein:vir:93 205 VRTNK----------LE---AG----TAILAKKGAVKLILKRDFFLEVARDASTKTTALYSDK-HYVAYLYDESKAVKIT 266 (274) T ss_pred EEcCC----------CC---cc----eEEEEeCCeEEEEecCCcccccccchhhcccEEEEEE-EEEEEEEcCCceEEEe Confidence 22211 00 01 12222344444444333332111 1112233343333 3589999999988877 Q ss_pred cC Q lcl|NC_011142. 342 ML 343 (343) Q Consensus 342 GI 343 (343) -= T Consensus 267 ~~ 268 (274) T protein:vir:93 267 KG 268 (274) T ss_pred eC Confidence 44 No 104 >protein:vir:97255 Length: 310 # NCBI annotation: hypothetical protein ORF017 # Family: family:all:1120 # MgeID: mge:1657 # MgeName: M6 # Cross-refs: genbank:acc:YP_001294525;genbank:gi:149408246;genbank:GeneID:5237120 Probab=96.97 E-value=0.00023 Score=40.53 Aligned_cols=285 Identities=12% Similarity=0.010 Sum_probs=133.0 Q ss_pred cchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeee-c--ccccceeecCCc-Cccce Q lcl|NC_011142. 
31 VPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSY-D--GAAMGKFISANA-SDLPR 106 (343) Q Consensus 31 ~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~-~--~~G~a~~~~~~~-~dip~ 106 (343) -|+++. |++ ..+.. ..+...|+|.....-...+.+|-.. +..+ .+.|.-. . .++...+...+. .+.|. T Consensus 1 mpaltL-aea--~k~~~--d~l~~~ViE~~~~~s~lL~~LpF~~-veg~--~~~ynR~~~~~~~~~~~v~~~~~~~g~~~ 72 (310) T protein:vir:97 1 MASVTL-AES--AKLAQ--DELVAGVIENIITVNRMFDVLPFDS-IEGN--SLAYNRENVLGDVIMAGVGTTFSGAGAGK 72 (310) T ss_pred Ccccch-HHH--hhcCc--chHHHHHHHHHhccchHHHhCCccc-ccCC--cceeeEeeccCCcccccccccccCCCccc Confidence 132222 111 11111 3445567776554444445555321 1111 1223211 1 122222111111 12233 Q ss_pred eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCcc--HHHHHHHHHHHHHhhhheeeeeehh-hcceeeeecC-Cccc Q lcl|NC_011142. 107 VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPID--SMQAELAFRGSEEHSQRVAYFGDTN-RNMSGLLNNP-NVTK 182 (343) Q Consensus 107 v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~--~~k~~aA~~~~~~~~n~~~f~G~~~-~g~~GLlN~p-~v~~ 182 (343) ....++++...+..++..+.+.-+-.+. ..+-+-+ ........+++.++.+...+|||.. .+++||+..= +-.. T Consensus 73 ~~~t~~~~~~~L~i~~g~~~Vd~~i~dl--~~~~~~dq~~~Ql~~~iea~~~~~e~~lINGD~a~n~F~GL~~~~~~~q~ 150 (310) T protein:vir:97 73 AAATFTKVNSNLTTIMGDAEVNGLIQAT--RSGDGNDQTAVQIASKAKSAGRKYQDQLINGNGAGNEFAGLIQLCASGQK 150 (310) T ss_pred cccccceeeeeeeeeeehhhhhhHHHhh--hcCChHHHHHHHHHHHHHHHHHHHHHHhhccccCCCcccchhhcCCccce Confidence 3344566666777777666554321111 1232333 3455667788899999999999874 4678997652 1122 Q ss_pred cccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHH---HHHHhccccC-CCCCccHHHHHHhcCcceeec Q lcl|NC_011142. 183 TSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDL---WKRASSLLMT-GYTDRTVIEHFQINNAYTLLT 258 (343) Q Consensus 183 ~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~---~~~L~~~~~~-~~~~~tvle~l~~n~~~~~~~ 258 (343) ...++.-..-| ++|+-++++.+|.. .-.|..|+++|+. +..+.|.-.. ..+..++. .-..+..... 
T Consensus 151 i~~~~~gg~~t----~d~LDeLl~~v~~~---~g~p~~~l~~~~~~r~i~A~~R~~~~~g~~~~~~~---~~G~~v~~~~ 220 (310) T protein:vir:97 151 ATTGATGSAIS----FAILDELMDLVVDK---DGQVDYLTMHARTLRSYKALLRALGGASINEVVEL---PSGAEVPAYS 220 (310) T ss_pred eecCCCCCCCC----HHHHHHHHHHHhcC---CCCCCEEEecHHHHHHHHHHHHHhcCCCCCCcccc---CCCCEEeeeC Confidence 22111111224 46778888888742 2358899999974 5555542100 11111111 1123444555 Q ss_pred cccccccccceeeechhhhccccCCccceEEEEEcccc-----eEEEeecc----chhccc-ceecC-ceeEeeeeeeee Q lcl|NC_011142. 259 RNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSER-----NLALAKPI----PFRMLA-PQLLG-LGITVPAEYKIS 327 (343) Q Consensus 259 ~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~-----~~~~~v~~----~~~~~~-~~~~~-~~~~~~~~~~~g 327 (343) |.|+..-- ..+.-...+ ..+|+...++..-..+ ++.++.+. ..++.. .+.+. ..|.+.. .- T Consensus 221 GiPi~~~d----~ip~~~~~~-~~~gtTsIya~r~Ge~~~~~Gv~Gl~~~~~~glsVr~~G~~~~~~v~~~~V~~---Y~ 292 (310) T protein:vir:97 221 GTPIFRND----YIPTNQTKG-GTTGCTTIFAGTLDDGSRTHGIAGLTATQAAGIQVVDVGESEDSDEHIWRVKW---YC 292 (310) T ss_pred CeEEEEeC----ccCCCcccc-ccCCceeEEEEeeCccccccceeccccCCccceeEEeCCcccCCcceeEEEEE---ee Confidence 65543211 011111112 2234444555544432 33333221 233333 23333 4566544 35 Q ss_pred eEEEECcceeeeeccC Q lcl|NC_011142. 328 GTEYRYPLCAQYVDML 343 (343) Q Consensus 328 Gv~i~~P~ai~~~dGI 343 (343) |+-++-|.|++.+.|| T Consensus 293 ~~av~~~~A~a~L~~V 308 (310) T protein:vir:97 293 GLALFSEKGLACADGI 308 (310) T ss_pred eEEEecccceeeeccc Confidence 7899999999999999 No 105 >protein:vir:1383 Length: 421 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:314 # MgeName: phi3626 # Cross-refs: genbank:acc:NP_612835;genbank:gi:20065969;genbank:GeneID:935826 Probab=96.83 E-value=0.00029 Score=39.98 Aligned_cols=294 Identities=7% Similarity=0.031 Sum_probs=125.6 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcc----hhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCC Q lcl|NC_011142. 
1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVP----SVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANI 76 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~----~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~ 76 (343) ........+......+...+.......+.... ++.+. ..|.++.+ +.+.+.|++........+.++.+.. . T Consensus 79 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ra~~t~--~~gg~liP--~~~~~~Ii~~~~~~~~l~~l~~~~~-~ 153 (421) T protein:vir:13 79 GGRVIINGDSKEEKRSLQLSAMSKTIRGIQLSEEERDIMSS--TNNGAVIP--QEFVNEFEKLKEGYPSLKEHCHVIP-V 153 (421) T ss_pred ccccccccchhHHHHHHHHHHHHHhhhccchhHHHhhcccc--CCcceecc--hhhHHHHHHHHHhhhhhhhhceeee-c Confidence 11111111111111111111111000000000 01111 12333443 4455667776666655565554421 2 Q ss_pred CcceeEEEEeeeccccc--ceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 77 PEYATHWNYRSYDGAAM--GKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 77 ~~~~~~~~~~~~~~~G~--a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) + ..+..|.+...... +.+.+.. ..+|..+..++.....++.++.-+.+|.+=|+.+ ..+|..--....++++ T Consensus 154 ~--~~~~~~~~~~~~~~~~~~~~~E~-~~~~~s~~~f~~i~~~~~k~~~~v~iS~ell~ds---~~~l~~~i~~~la~~~ 227 (421) T protein:vir:13 154 N--RNAGKMPVRAGASVDKLANLAKD-TELVKAMLKTQPMAYDIDDYGLLAPIDNSLLEDS---EINFLEFVNEEFAEFA 227 (421) T ss_pred c--CCceEEEEeecCCccceeecccc-ccccccccceeEEEeeeeeeEeehhhhHHHHhhh---HHHHHHHHHHHHHHHH Confidence 2 22233433333222 3333333 4577777778888888888888777765434332 3345555556666667 Q ss_pred HHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhcc Q lcl|NC_011142. 155 EEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSL 234 (343) Q Consensus 155 ~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~ 234 (343) ...+|.-+.. ...|+++.+++ .+ ++||.++++.+... + ..+..++|+|..|..|... 
T Consensus 228 ~~~~~~~i~~-----~~~g~~~~~~~-----------~~----~d~i~~~~~~l~~~--~-~~~a~~v~n~~~~~~l~~l 284 (421) T protein:vir:13 228 VNTENAEIVK-----QAKAVLAEETI-----------ND----YAGLVKTINSLVPN--A-RKRAIIVTNSDGRAYLDGL 284 (421) T ss_pred HHHhhhhHhh-----hhhhccccccc-----------cc----hHHHHHHHHHhhhh--h-cCCCEEEEcHHHHHHHHHh Confidence 7777654332 23444433221 12 45666777766432 2 2356899999999999753 Q ss_pred ccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceec Q lcl|NC_011142. 235 LMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLL 314 (343) Q Consensus 235 ~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~ 314 (343) -+..|.=++.-+....+ ..+-|.|+... .... .+.++ +-.++|-+-.+.+.+..-..++....... T Consensus 285 --kd~~G~~i~~~~~~~~~-~tl~G~pV~~~-------~~~~---~~~~~-~~~~~~gd~~~~~~~~~~~~~~v~~~~~~ 350 (421) T protein:vir:13 285 --MDKQGRPLLKELSDGGD-LVFKGRPVIEL-------EESI---FDVGD-ETKFIVSDFKTLIKFMDRKQYLIDQSKEA 350 (421) T ss_pred --hcCCCceeecCcCCCCC-ceecceeeEEe-------cccc---ccCCC-ceEEEEEeccccEEEEEecceEEEeeccc Confidence 24444322221111111 12334443211 1111 11112 22333333333344433344433322111 Q ss_pred C---ceeEeeeeeeeeeEEEECcce-----------eeeeccC Q lcl|NC_011142. 315 G---LGITVPAEYKISGTEYRYPLC-----------AQYVDML 343 (343) Q Consensus 315 ~---~~~~~~~~~~~gGv~i~~P~a-----------i~~~dGI 343 (343) . -.+.+.+..|++|. +..|.+ ++..++. 
T Consensus 351 ~f~~~~~~~r~~~r~d~~-~~~~~a~~~~~~~~~~a~v~~~~~ 392 (421) T protein:vir:13 351 GYTKNETIARIIERFDVN-SPLDKSSDAEKIRKFGVIVKLQEV 392 (421) T ss_pred ccccCeeEEEEEeeecce-eecchhhheeeecccceeeccccc Confidence 1 12345566676443 334444 4444444 No 106 >protein:vir:78640 Length: 352 # NCBI annotation: phage capsid # Family: family:all:658 # MgeID: mge:1855 # MgeName: tp310-2 # Cross-refs: genbank:acc:YP_001429943;genbank:gi:156603997;genbank:GeneID:5525386 Probab=96.73 E-value=0.00016 Score=41.46 Aligned_cols=291 Identities=9% Similarity=0.006 Sum_probs=141.5 Q ss_pred CCcceeccchhhh-----------hchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhh Q lcl|NC_011142. 1 MSEKRVVIDAQTI-----------AGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLED 69 (343) Q Consensus 1 ~~~~~~~~~~~~~-----------~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~ 69 (343) ..+.-..-..+.+ .++...+.....+.- ..++.....++|.++.+ +.+.++|++........|.+ T Consensus 41 ~~~~~~~~~~~~~~~~~~~~r~~~~~~~~~~~~~~~~~~--~~al~~~~~~~gG~lIP--~~~~~~Ii~~l~~~s~l~~~ 116 (352) T protein:vir:78 41 EAYQSLNDNEKLVKAKAEFYRHAILPNEFEKPSMEAQRL--LHALPTGNDSGGDKLLP--KTLSKEIVSEPFAKNQLREK 116 (352) T ss_pred ccccccchhhhHHHHHHHHHHHHhhhhHHHHHHhhHHHH--HHHhccCCCCCCceecc--HhHHHHHHHHHHhhcchhhh Confidence 1100000000111 111111100000000 01112223345666666 45667788876666666776 Q ss_pred ccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHH Q lcl|NC_011142. 70 VPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAEL 149 (343) Q Consensus 70 i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~a 149 (343) ..+.+..+ ... ..+....+.+.+++.. ..+|..+..++......+.++.-+.+|.+=|+.+ ..+|..--... 
T Consensus 117 ~~v~~~~~---~~~-p~~~~~~~~a~~v~E~-~~~~~~~~~f~~v~~~~~k~~~~i~is~ell~Ds---~~~l~~~i~~~ 188 (352) T protein:vir:78 117 ARLTNIKG---LEI-PRVSYTLDDDDFITDV-ETAKELKLKGDTVKFTTNKFKVFAAISDTVIHGS---DVDLVNWVENA 188 (352) T ss_pred eeeEecCC---ceE-EEEecCCCcccccccc-cccccccccceeeeecceeEEeechhhHHHHhhh---hHHHHHHHHHH Confidence 66543222 111 1222233567777654 4577778888888888999988888877644433 45576666677 Q ss_pred HHHHHHHhhhheee-eeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHH Q lcl|NC_011142. 150 AFRGSEEHSQRVAY-FGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLW 228 (343) Q Consensus 150 A~~~~~~~~n~~~f-~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~ 228 (343) .++++...++..+| .|+......|+++++++...+..+ .+++|.+++..|... +.. .-+.+|.+..+ T Consensus 189 la~~~~~~e~~~~~~~g~g~~~~~g~l~~~~~~~~t~~~---------~~d~i~~~~~~l~~~--~~~-~a~~~mn~~t~ 256 (352) T protein:vir:78 189 LQSGLAAKERKDALAVSPKSGLEHMSFYNGSVKEVEGAN---------MYDAIINALADLHED--YRD-NATIYMRYADY 256 (352) T ss_pred HHHHHHHHHHHhhhhcCCCCcccccceeccccccccccc---------hHHHHHHHHhccChh--hhc-CCEEEEehHHH Confidence 77777777777555 555444457888888766544321 256666666665432 221 24588888888 Q ss_pred HHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhc Q lcl|NC_011142. 229 KRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRM 308 (343) Q Consensus 229 ~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~ 308 (343) ..|.+.. .+ .+..++ ...+.+ +-|.|+.+..- + ++..+ -+.+.=++.. ..+.. T Consensus 257 ~~l~~~~-~~-~~~~~~----~~~~~~-llG~PV~~~~~------------~----~~~~~-Gdf~~~~~~~---~~~~~ 309 (352) T protein:vir:78 257 VKIISVL-SN-GTTNFF----DTPAEK-VFGKPVVFTDA------------A----VKPIV-GDFNYFGINY---DGTTY 309 (352) T ss_pred HHHHHHH-hc-cCCccc----ccCCcc-ccccceEEecC------------C----CceeE-eehhhhhhhh---hhhee Confidence 7776543 22 232222 222332 33556543211 0 01111 0100000000 00110 Q ss_pred cc-ceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
309 LA-PQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 309 ~~-~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -. -+...-...+.+.+|++|. +.+|.|++.+.-= T Consensus 310 ~~~~~~~~g~~~f~~~~r~Dg~-~~~~eA~~~l~~~ 344 (352) T protein:vir:78 310 DTDKDVKKGEYLFVLTAWYDQQ-RTLDSAFRIAKAK 344 (352) T ss_pred eeeccccCCeeEEEEEeeeCce-eechhheEEEEee Confidence 00 0111223555667787654 6779998776543 No 107 >protein:vir:96123 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1602 # MgeName: 37 # Cross-refs: genbank:acc:YP_240078;genbank:gi:66395742;genbank:GeneID:5133103 Probab=96.71 E-value=0.00039 Score=39.26 Aligned_cols=264 Identities=7% Similarity=-0.057 Sum_probs=131.8 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~~d 103 (343) |++.. +. ..-.+-.+-|. +.+.+.....+....+..+... +.+|. ++.++.+...|.+..+.++ ++ T Consensus 1 ma~~~----T~---~~d~i~Pev~s---~~v~~~~~~~~~~~~~~~~~~~l~g~~G~-tv~ip~~~~~g~~~~~~~g-~~ 68 (274) T protein:vir:96 1 MAQGT----TK---VSNLIVPEVLA---PMMQAELDKKLRFAQFADIDSTLVGQPGD-TLTFPAFTYSGDAQVIAEG-EK 68 (274) T ss_pred CCccc----cc---hhhhhhhHHHH---HHHHHHHHhhhhhcccccccccccCCCCC-EEEEEeeccCCCccccCCC-Cc Confidence 22211 10 11122222333 2234444444555555554432 23353 5777777777888887665 57 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccc Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKT 183 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~ 183 (343) ++..+...+.....+...+.+|.+ .|++..+ .+.++-......+...+++..|+.++.--.+ ... .. 
T Consensus 69 i~~~~it~~~~~~~i~~~~~~~~i--~D~~~~~-~~~d~~~~~~~~~~~~~a~~~d~~i~~~l~~---------a~~-~~ 135 (274) T protein:vir:96 69 IPVDQIGTSKREAKVRKIGKGTEL--TDEAVLS-GFGDPQGEAVRQHGLAIANKVDNDVLEALKG---------ATL-TV 135 (274) T ss_pred CchhhcccceeEEEEEeeeceeee--cHHHHHh-hcchHHHHHHHHHHHHHHHHHHHHHHHHHhc---------CCC-Cc Confidence 888888888888888887766655 5555544 4555556677778888888888877632211 000 01 Q ss_pred ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCcc-HHHHHHhcCcceeeccccc Q lcl|NC_011142. 184 SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRT-VIEHFQINNAYTLLTRNPI 262 (343) Q Consensus 184 ~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~t-vle~l~~n~~~~~~~~~p~ 262 (343) .++. -+ ++.|.++...+-.. ...+..|+++|..+..|.+-...+...-+ ..+-+.+++..-.+.|.++ T Consensus 136 ~~~~----~~----~d~i~dA~~~l~d~---~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~~~~~g~ig~~~G~~V 204 (274) T protein:vir:96 136 EADI----TK----LDGLQTAIDKFNDE---DLEPMVLFVNPLDAGGLRTSASDNFTRPTQLGDNIIVKGAFGEALGAVI 204 (274) T ss_pred Cccc----cc----HHHHHHHHHHhccc---CCCceEEEeCHHHHHHHHhcccccccccccccccceeecccceecCeeE Confidence 1111 11 55666777776543 23678899999999999642111100000 0011122333333333332 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ..+.. + + .+ . +|.-.+.-+.+....+++.-.- .++.....+.... ..|+-+.+|..++.+. T Consensus 205 i~s~~--~--p---------~~--t--~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~~-~yg~~~~~~~~vv~~t 266 (274) T protein:vir:96 205 VRSNK--L--N---------KG--E--ALLAKKGAVKLITKRDFFLEKDRDASRKSTALYSDK-HYVAYLYDESKVVKIT 266 (274) T ss_pred EEcCC--C--C---------cc--e--EEEEeCcceeeeecCCcccccccchhhcccEEEEee-EEEEEEEcCccEEEEE Confidence 22211 0 0 01 1 1222233344433333332110 1111223333333 4589999998888765 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) -= T Consensus 267 ~~ 268 (274) T protein:vir:96 267 KG 268 (274) T ss_pred cC Confidence 44 No 108 >protein:vir:94494 Length: 274 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1508 # MgeName: 88 # Cross-refs: genbank:acc:YP_240676;genbank:gi:66396348;genbank:GeneID:5133758 Probab=96.66 E-value=0.00043 Score=39.06 Aligned_cols=264 Identities=8% Similarity=-0.039 Sum_probs=129.8 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~~d 103 (343) |++ ..+.-+| .+..+-|. +.|.+.....+....+..+... +.+| .++.++.+...|.+..+.++ ++ T Consensus 1 ma~----~~T~~~d---~iiPev~~---~~v~~~~~~~l~~~~~~~~d~~l~g~~G-~tv~iP~~~~~g~a~~~~~g-~~ 68 (274) T protein:vir:94 1 MPQ----GLTKTSD---QIIPEVLA---PMMQAQLEKKLRFASFAEVDSTLQGQPG-DTLTFPAFVYSGDAQVVAEG-EK 68 (274) T ss_pred CCc----cceehhh---eechHHHH---HHHHHhhhhhhhhcccceecccccCCCC-CEEEEeeecCCCccccccCC-Cc Confidence 222 0111111 12222232 2234444444555555544332 2234 46777777778888877665 57 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccc Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKT 183 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~ 183 (343) ++..+...+.....+...+.+|. +.|++.++..+- +-..-...+..++++..|+.++.--.. -..... 
T Consensus 69 i~~~~lt~~~~~~~i~~~~~~~~--i~D~~~~~~~~d-p~~~~~~~~a~a~a~~vd~~~~~~l~~---------a~~~~~ 136 (274) T protein:vir:94 69 IPTDILETKKREAKIRKIAKGTS--ITDEALLSGYGD-PQGEQVRQHGLAHANKVDNDVLEALMG---------AKLTVN 136 (274) T ss_pred ccccccccceeEEEeeeecceec--ccHHHHHhccch-HHHHHHHHHHHHHHHHHHHHHHHHHhc---------cCcccc Confidence 88888888888888888766555 556665554444 445566777788888888765522111 011110 Q ss_pred ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccH-HHHHHhcCcceeeccccc Q lcl|NC_011142. 184 SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTV-IEHFQINNAYTLLTRNPI 262 (343) Q Consensus 184 ~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tv-le~l~~n~~~~~~~~~p~ 262 (343) + +.-+ ++.|.++..++-.. ...+..|+++|..+..|.+....+....+- .+-+..++..-.+.|.++ T Consensus 137 -~----~~~~----~d~i~dA~~~l~d~---~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~V 204 (274) T protein:vir:94 137 -A----DITK----LNGLQSAIDKFNDE---DLEPMVLFVNPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAII 204 (274) T ss_pred -c----cccC----HHHHHHHHHHhhcc---CCCceEEEeCHHHHHHHHhhhhhhccccCcccccceeccccceecCeeE Confidence 0 1112 45667777776543 235788999999999997531111000000 011122333333334332 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ..+... + .+ .+|.-.+.-+.+....+++...- .++...-.+.... ..|+-+.+|..++.+. T Consensus 205 i~s~~~----p---------~~----t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~~-~y~~~~~~~~~vv~~t 266 (274) T protein:vir:94 205 VRTNKL----E---------AG----TAILAKKGAVKLILKRDFFLEVARDASTKTTALYSDK-HYVAYLYDESKAVKIT 266 (274) T ss_pred EEcCCC----C---------cc----eEEEEeCcceEeeecCCceeccccchhhcccEEEEEE-EEEEEEEcCCceEEEe Confidence 222110 0 01 12222233334333333332111 1111223333333 4588999998888876 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) -= T Consensus 267 ~~ 268 (274) T protein:vir:94 267 KG 268 (274) T ss_pred cC Confidence 44 No 109 >protein:vir:97433 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1676 # MgeName: 92 # Cross-refs: genbank:acc:YP_240749;genbank:gi:66396420;genbank:GeneID:5133789 Probab=96.66 E-value=0.00043 Score=39.06 Aligned_cols=264 Identities=8% Similarity=-0.039 Sum_probs=129.8 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~~d 103 (343) |++ ..+.-+| .+..+-|. +.|.+.....+....+..+... +.+| .++.++.+...|.+..+.++ ++ T Consensus 1 ma~----~~T~~~d---~iiPev~~---~~v~~~~~~~l~~~~~~~~d~~l~g~~G-~tv~iP~~~~~g~a~~~~~g-~~ 68 (274) T protein:vir:97 1 MPQ----GLTKTSD---QIIPEVLA---PMMQAQLEKKLRFASFAEVDSTLQGQPG-DTLTFPAFVYSGDAQVVAEG-EK 68 (274) T ss_pred CCc----cceehhh---eechHHHH---HHHHHhhhhhhhhcccceecccccCCCC-CEEEEeeecCCCccccccCC-Cc Confidence 222 0111111 12222232 2234444444555555544332 2234 46777777778888877665 57 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccc Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKT 183 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~ 183 (343) ++..+...+.....+...+.+|. +.|++.++..+- +-..-...+..++++..|+.++.--.. -..... 
T Consensus 69 i~~~~lt~~~~~~~i~~~~~~~~--i~D~~~~~~~~d-p~~~~~~~~a~a~a~~vd~~~~~~l~~---------a~~~~~ 136 (274) T protein:vir:97 69 IPTDILETKKREAKIRKIAKGTS--ITDEALLSGYGD-PQGEQVRQHGLAHANKVDNDVLEALMG---------AKLTVN 136 (274) T ss_pred ccccccccceeEEEeeeecceec--ccHHHHHhccch-HHHHHHHHHHHHHHHHHHHHHHHHHhc---------cCcccc Confidence 88888888888888888766555 556665554444 445566777788888888765522111 011110 Q ss_pred ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccH-HHHHHhcCcceeeccccc Q lcl|NC_011142. 184 SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTV-IEHFQINNAYTLLTRNPI 262 (343) Q Consensus 184 ~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tv-le~l~~n~~~~~~~~~p~ 262 (343) + +.-+ ++.|.++..++-.. ...+..|+++|..+..|.+....+....+- .+-+..++..-.+.|.++ T Consensus 137 -~----~~~~----~d~i~dA~~~l~d~---~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~V 204 (274) T protein:vir:97 137 -A----DITK----LNGLQSAIDKFNDE---DLEPMVLFVNPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAII 204 (274) T ss_pred -c----cccC----HHHHHHHHHHhhcc---CCCceEEEeCHHHHHHHHhhhhhhccccCcccccceeccccceecCeeE Confidence 0 1112 45667777776543 235788999999999997531111000000 011122333333334332 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ..+... + .+ .+|.-.+.-+.+....+++...- .++...-.+.... ..|+-+.+|..++.+. T Consensus 205 i~s~~~----p---------~~----t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~~-~y~~~~~~~~~vv~~t 266 (274) T protein:vir:97 205 VRTNKL----E---------AG----TAILAKKGAVKLILKRDFFLEVARDASTKTTALYSDK-HYVAYLYDESKAVKIT 266 (274) T ss_pred EEcCCC----C---------cc----eEEEEeCcceEeeecCCceeccccchhhcccEEEEEE-EEEEEEEcCCceEEEe Confidence 222110 0 01 12222233334333333332111 1111223333333 4588999998888876 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) -= T Consensus 267 ~~ 268 (274) T protein:vir:97 267 KG 268 (274) T ss_pred cC Confidence 44 No 110 >protein:vir:739 Length: 231 # NCBI annotation: major structural protein 4 # Family: family:all:522 # MgeID: mge:14 # MgeName: Tuc2009 # Cross-refs: genbank:acc:NP_108716;genbank:gi:13487838;genbank:GeneID:920884 Probab=96.48 E-value=0.00043 Score=39.05 Aligned_cols=225 Identities=8% Similarity=-0.061 Sum_probs=115.5 Q ss_pred cCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHH Q lcl|NC_011142. 73 LANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFR 152 (343) Q Consensus 73 ~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~ 152 (343) ....+.|+ +++++.+ .|.+..++.+ +.+|......+.....|.+.+++|+++..+ .....|-|+ ..-....+. T Consensus 1 ~~~~~~Gd-tit~P~~--iGda~~v~eG-~~i~~~~l~~t~~~atIk~~gk~~~itD~a--~l~~~gDp~-~ea~~Q~~~ 73 (231) T protein:vir:73 1 ENGINLAN-LCEYPND--IGDAADVAEG-GEISLDKIGTTTKSVTIKKAAKGTEITDEA--ALSGYGDPI-GESNKQLGL 73 (231) T ss_pred CccccCCc-eEEeccc--ccchhhhcCC-CcCChhhccccceeeeEeeeccceeeeHHH--HhhccCchH-HHHHHHHHH Confidence 44455554 4566644 7888888876 458888888999999999999888875544 444566555 445555666 Q ss_pred HHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHh Q lcl|NC_011142. 153 GSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRAS 232 (343) Q Consensus 153 ~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~ 232 (343) .++.+.|+=++ +.+. .+.|+.+++. =++.|++++..+-.. 
...++.++++|+.+..|- T Consensus 74 ~iA~kvD~di~---~~~~---------------~a~l~~~~~~-t~d~i~~A~~~fgde---~~~~~vivv~p~~~~~Lr 131 (231) T protein:vir:73 74 SLANKVDDDLL---KAAK---------------TTSQTVSTKA-NVDGVQAALDIFNDE---DAQAYVLIVNPKDAAKIR 131 (231) T ss_pred HHHHhhhHHHH---Hhhc---------------cccccccccc-cHHHHHHHHHHhccc---cccceEEEEcchHHHhhh Confidence 66666665333 0110 0112222221 266777787777542 346789999999999984 Q ss_pred ccccCCCCC-cc-HHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc Q lcl|NC_011142. 233 SLLMTGYTD-RT-VIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA 310 (343) Q Consensus 233 ~~~~~~~~~-~t-vle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~ 310 (343) + -. +... .+ ..+=+..++.+-.+.|.++..+.. . +.++....-|.-.+.=+.+..-.+++.- T Consensus 132 k-~~-~~~~~~~~~g~~i~~~G~iG~i~G~~Vi~S~~---------~----~~~~~~~~~~i~~~gAl~~~~k~~~~vE- 195 (231) T protein:vir:73 132 K-DA-NAKNIGSEVGANALINGTYADVLGAQIVRSKK---------L----AEGSALMFKIVSNSPALKLVLKRGVQVE- 195 (231) T ss_pred h-cc-chhhhhhhhccceeeecccceEcceEEEEcCC---------C----CCCceeeeeEEeeccceeeeecccceee- Confidence 3 11 1111 01 111122233333344433222111 0 1111122112112222222222222211 Q ss_pred ceec--CceeEeeeeeeeeeEEEECcceeeee--ccC Q lcl|NC_011142. 311 PQLL--GLGITVPAEYKISGTEYRYPLCAQYV--DML 343 (343) Q Consensus 311 ~~~~--~~~~~~~~~~~~gGv~i~~P~ai~~~--dGI 343 (343) .++. 
.....+ +.-...+|-+++|..++.+ .|+ T Consensus 196 tdRd~~~k~~~i-~~~~~y~v~l~~~~~vv~~t~~g~ 231 (231) T protein:vir:73 196 TDRDIVTKTTVI-TADEHYAAYLYDLTKVVNITFTGV 231 (231) T ss_pred ccccccccccEE-EEeEEEEEEEEcCccEEEEEeecC Confidence 1111 112233 2233468999999988875 677 No 111 >protein:vir:105334 Length: 276 # NCBI annotation: putative phage major capsid protein # Family: family:all:522 # MgeID: mge:1679 # MgeName: PH15 # Cross-refs: genbank:acc:YP_950669;genbank:gi:119967839;genbank:GeneID:4643213 Probab=96.36 E-value=0.0007 Score=37.86 Aligned_cols=265 Identities=7% Similarity=-0.030 Sum_probs=125.9 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCC-cceeEEEEeeecccccceeecCCcCcc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIP-EYATHWNYRSYDGAAMGKFISANASDL 104 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~-~~~~~~~~~~~~~~G~a~~~~~~~~di 104 (343) |++. .+..+ -.+..+-|. +-|.+.....+....+..+..... -+-.++.++.+...|.+..+.++ +++ T Consensus 1 Ma~~----~T~l~---d~i~Pev~~---~~v~~~~~~~~~~~~~~~~~~~l~g~~G~ti~iP~~~~igda~~~~eg-~~i 69 (276) T protein:vir:10 1 MAQG----TTTKS---TQIVPEVLA---PMMQAELDKKLRFAQFADIDSTLVGQPGDTLTFPAFVYSGDATVVPEG-QKI 69 (276) T ss_pred CCcc----eeehh---hhhchHHHH---HHHHHHHHhhhhhcccceecccccCCCCCEEEeeeecCCCccccccCC-Ccc Confidence 2210 01111 112222222 223333333344444544433322 12345677777778888887776 468 Q ss_pred ceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccc Q lcl|NC_011142. 105 PRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTS 184 (343) Q Consensus 105 p~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~ 184 (343) |......+.....+...+.+|.++ |+...+. +.++-..-...+...+++..|+-++. .+ +. 
T Consensus 70 ~~~~lt~~~~~a~i~~~~k~~~~t--D~a~~~~-~~dp~~~~~~~~~~~~a~~~d~~~~~---~l------~~------- 130 (276) T protein:vir:10 70 PVDKIETNRREAKIHKIGKGTDIT--DEALLSG-YGDPQGEAVRQHGLAIANKVDNDVLE---AL------RG------- 130 (276) T ss_pred CccccccceeeEEeehcccccccc--HHHHHhh-ccchHHHHHHHHHHHHHHHHHHHHHH---HH------hc------- Confidence 888888889889998877776665 5554333 44555666677777788887765541 11 00 Q ss_pred cCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHH-HHHHhcCcceeecccccc Q lcl|NC_011142. 185 ATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVI-EHFQINNAYTLLTRNPID 263 (343) Q Consensus 185 ~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvl-e~l~~n~~~~~~~~~p~~ 263 (343) +...++..+. -++.|.+++..+-.. ...+..|+++|+.+..|.+-...+....+-. +=+..++.+-.+.|.++. T Consensus 131 ~~~~~~~~~~--t~d~i~~A~~~lgd~---~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi 205 (276) T protein:vir:10 131 TKLTVSADIG--TLAGLEAAIDTFDDE---DLEPMVLFINPKDAGKLRSSASDNFTRATELGDNIIVKGAFGEALGAVIV 205 (276) T ss_pred cccccccccc--CHHHHHHHHHHhccc---cCcccEEEEcHHHHHHHHHhccccccccccccccceeccccceecceeEE Confidence 0001111111 145566677666432 2367899999999999964211111110100 001112222222233222 Q ss_pred ccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeecc Q lcl|NC_011142. 264 IKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVDM 342 (343) Q Consensus 264 i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dG 342 (343) .... + + .+. +|-..+.-+.+....+++.-.- ......-.+-... ..|+.+.+|..++.+.= T Consensus 206 ~s~~---------~----p--~~t--~~l~~~gAi~~~~~~~~~vE~dRd~~~~~d~i~~~~-~y~~~~~~~~~vv~~t~ 267 (276) T protein:vir:10 206 RSKK---------L----D--EGE--AILAKRGAVKLITKRDFFLETDRDPSTKTTALYSDK-HYVAYLYDESKAVKVTK 267 (276) T ss_pred EcCC---------C----C--cce--EEEEeccceeeeecCCceeecccchhhcccEEEEee-EEEEEEEcCcceEEEec Confidence 1110 0 0 011 1222233333333333332111 0111222332333 45899999998888763 Q ss_pred C Q lcl|NC_011142. 
343 L 343 (343) Q Consensus 343 I 343 (343) - T Consensus 268 ~ 268 (276) T protein:vir:10 268 G 268 (276) T ss_pred C Confidence 3 No 112 >protein:vir:4092 Length: 390 # NCBI annotation: major capsid protein a # Family: family:all:635 # MgeID: mge:86 # MgeName: 2389 # Cross-refs: genbank:acc:NP_510986;swissprot:trembl:q8w604;genbank:gi:17488508;uniprot:Q8W604;genbank:GeneID:1260361 Probab=96.26 E-value=0.00082 Score=37.51 Aligned_cols=301 Identities=9% Similarity=-0.012 Sum_probs=138.6 Q ss_pred CCccee-----ccchhhh--hchhhh----chhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhh Q lcl|NC_011142. 1 MSEKRV-----VIDAQTI--AGNRWL----NKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLED 69 (343) Q Consensus 1 ~~~~~~-----~~~~~~~--~~~~~~----~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~ 69 (343) ..|.+. ..|.... .+.+.+ +.++..... .-+.+++..+.. +.+...|++.....-..+.+ T Consensus 47 ~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~r~~~~~~~~-------~~~~~~gg~lvP--~~~~~~I~~~~~~~s~i~~~ 117 (390) T protein:vir:40 47 IAQARKEVNREMNDNNVLASRGANALTSDESKYYNEVIA-------GNGFAGVTALLP--PTVFERVFEDLTVEHPLLSK 117 (390) T ss_pred HHHHHHHHHHHHHHHHHHHhcCchhccHHHHHHHHHHHh-------ccCcccCccccc--HHHHHHHHHHHHhhhhhhhh Confidence 000000 0000000 000000 111111011 011223333433 34445566655555444554 Q ss_pred ccccCCCCcceeEEEEeeecccccceeecCCcCccc-eeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHH Q lcl|NC_011142. 70 VPVLANIPEYATHWNYRSYDGAAMGKFISANASDLP-RVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAE 148 (343) Q Consensus 70 i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip-~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~ 148 (343) +.+.. . +.....+.+....+.+.+++..+ .+| ..+..++......+.++.-+.+|.+=++.+ ..++..--.. 
T Consensus 118 ~~~~~-~--~~~~~~i~~~~~~~~a~~~~E~~-~~~~~~~~~f~~i~l~~~k~~~~i~iS~ell~ds---~~~l~~~i~~ 190 (390) T protein:vir:40 118 INFVN-T--TATTEWIISVGDVATAWWGPLCA-EIKEVLDNGFDKIQTGMYKLSAYIPVCNAMLDLG---PSWLDQYVRT 190 (390) T ss_pred ceeee-c--CCceeEEEEEcCCcceeeecccc-ccCccccccceeeEeeeeeEEEeehhhHHHHhcc---hHHHHHHHHH Confidence 44321 1 22333344556667777766543 344 346678888888898888888875555433 4568888888 Q ss_pred HHHHHHHHhhhheeeeeehhhcceeeeecCCccccccC--cCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHH Q lcl|NC_011142. 149 LAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSAT--VNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPD 226 (343) Q Consensus 149 aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~--~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~ 226 (343) ..+++++..+|+.+++|+....-.|+||..+....... ....+-|.+.+.+.+..+...+.........--.++|+|+ T Consensus 191 ~la~~i~~~~~~a~l~G~G~~~P~Gil~~~~~~~~~~~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~a~~i~n~~ 270 (390) T protein:vir:40 191 ILGEAMALGLEAGIVNGSGKDQPIGMMRDLNNVTAGEHPVKTATPLTDLTPATLATKVMLPLTDNGKKSVSDAILVINPA 270 (390) T ss_pred HHHHHHHHHHHhhhhcccCCCccceeeeccccccccccccccccccchhhHHHHHHHHHHHhhcchhhhhcCceEEEcch Confidence 99999999999999999876667899998763322111 1111223333444444444433322222223345788887 Q ss_pred HH-HHHhc-cccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeecc Q lcl|NC_011142. 227 LW-KRASS-LLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPI 304 (343) Q Consensus 227 ~~-~~L~~-~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~ 304 (343) .+ .+|.. +...+..|.- +...- ..|.|+..... .+.+ . ++|-+ ...+-+..-. 
T Consensus 271 t~~~~l~~~~~~~d~~G~~----v~~~~----~~g~pvv~~~~-------------~p~~--~-i~~Gd-~s~~~i~~~~ 325 (390) T protein:vir:40 271 DYWSKIYAATSYMTPQGVW----VTGIL----PVPLEIVQSVA-------------VPVG--K-AVAGR-AKDYFMGIGS 325 (390) T ss_pred hHHHHHHHHhhccCCCCcc----ccccC----CCceeEEEcCC-------------CCCC--c-EEEEe-eceEEEEeec Confidence 64 33321 1122223321 11111 12334322110 0111 1 22221 1112222223 Q ss_pred chhcccc-eec--CceeEeeeeeeeeeEEEECcceeeee--ccC Q lcl|NC_011142. 305 PFRMLAP-QLL--GLGITVPAEYKISGTEYRYPLCAQYV--DML 343 (343) Q Consensus 305 ~~~~~~~-~~~--~~~~~~~~~~~~gGv~i~~P~ai~~~--dGI 343 (343) .++...- +.. .-...+....|+ ++.++.|.|++.+ .++ T Consensus 326 ~~~v~~~~~~~f~~~~~~~r~~~r~-dg~v~~~~A~~~l~~~~~ 368 (390) T protein:vir:40 326 EQVIRTSTEYRLLDDETLYYAKQYA-NGRPKDNSSFLVFDITGL 368 (390) T ss_pred ceEEEecchhhhhcCcEEEEEEEEe-CCEEecccceEEEEeecc Confidence 3332211 111 123556667887 4667779998855 455 No 113 >protein:vir:96833 Length: 275 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1642 # MgeName: EW # Cross-refs: genbank:acc:YP_240157;genbank:gi:66395822;genbank:GeneID:5133174 Probab=96.15 E-value=0.00094 Score=37.17 Aligned_cols=265 Identities=7% Similarity=-0.037 Sum_probs=124.5 Q ss_pred ccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCc Q lcl|NC_011142. 24 DSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANA 101 (343) Q Consensus 24 ~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~ 101 (343) .+|++ .+..+| .+-.+-|. +-|.+.....+....+..+... 
+.+| .++.++.+...|.+..+.++ T Consensus 1 ~~~~~-----~T~l~d---~i~PEv~~---~~v~~~~~~~~~~~~~~~~~~~l~g~~G-~tv~iP~~~~ig~a~~~~~g- 67 (275) T protein:vir:96 1 MALEN-----MTKLAN---MVNPEVLA---PMMQAELDKKLKFAQFADIDNTLVGQPG-NTITFPAFVYSGDAKVVPEG- 67 (275) T ss_pred CCCcc-----cchhhh---hhchHHHH---HHHHHHHHHhhhhcccceecccccCCCC-CEEEeeeeccCCccccccCC- Confidence 12211 111111 22222232 1233333334444555444332 2234 45677777778888887665 Q ss_pred CccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcc Q lcl|NC_011142. 102 SDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVT 181 (343) Q Consensus 102 ~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~ 181 (343) ++++..+...+.....+...+.+|.+ .|++..+..+ ++-..-...+...+++..|+-++. .++ .-..+ T Consensus 68 ~~i~~~~lt~~~~~~~i~~~~~~~~i--~D~~~~~~~~-d~~~~~~~~~a~~~a~~~d~~ll~---~l~------~a~~~ 135 (275) T protein:vir:96 68 EEIPIDLIETKKRQATIRKIGKGTVL--TDEALLSGYG-DPKGEAVRQHGLAIANKVDNDVLE---ALQ------GATLK 135 (275) T ss_pred CCcchhhcccceeeEEeehhcccccc--cHHHHHhhcc-chHHHHHHHHHHHHHHHHHHHHHH---HHh------ccccc Confidence 57888888888888888887766655 5555444444 444556666777788787776551 111 10001 Q ss_pred ccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccH-HHHHHhcCcceeeccc Q lcl|NC_011142. 182 KTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTV-IEHFQINNAYTLLTRN 260 (343) Q Consensus 182 ~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tv-le~l~~n~~~~~~~~~ 260 (343) . ..+ .-+ ++.|.+++..+-.. ...+..|+++|+.+..|.+-........+. .+-+..|+..-.+.|. 
T Consensus 136 ~-~~~----~~~----~d~i~dA~~~lgd~---~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~~~~~G~ig~~~G~ 203 (275) T protein:vir:96 136 V-EAD----ITK----LAGLQTAIDKFNDE---DLEPMVLFVNPLDAGKLRASATDNFTRATLLGDNVIVKGAFGEALGA 203 (275) T ss_pred c-ccc----ccC----HHHHHHHHHHhccc---cCCccEEEeCHHHHHHHHhcccccccccccccccceeccccceecCe Confidence 1 111 112 45566677666432 236789999999999985421101000000 0111123322233333 Q ss_pred cccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeee Q lcl|NC_011142. 261 PIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQY 339 (343) Q Consensus 261 p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~ 339 (343) ++..+.. . .. + ..+++ .+.-+.+....+++.-.- .+....-.+... ...|+.+.+|..++. T Consensus 204 ~Vi~s~~-------~---p~---~--t~~i~--~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~-~~y~~~~~~~~~vv~ 265 (275) T protein:vir:96 204 IIVRSNK-------I---KE---G--EAILA--KRGAVKLITKRDFFLETERHASHKSTALFSD-KHYVAYLYDESKVVK 265 (275) T ss_pred eEEEeCC-------C---Cc---c--eEEEE--eccceeeeecCCcccccccchhhcCcEEEEe-EEEEEEEEcCccEEE Confidence 3221110 0 00 1 11222 222333332222221110 001122233223 345899999998887 Q ss_pred ec------cC Q lcl|NC_011142. 340 VD------ML 343 (343) Q Consensus 340 ~d------GI 343 (343) +. |+ T Consensus 266 ~t~~~~~~~~ 275 (275) T protein:vir:96 266 ITKSASGLGV 275 (275) T ss_pred EEecccccCC Confidence 63 33 No 114 >protein:vir:80128 Length: 466 # NCBI annotation: Phage capsid protein # Family: family:all:635 # MgeID: mge:1877 # MgeName: bacteriophage bv1 # Cross-refs: genbank:acc:YP_001425603;genbank:gi:155042936;genbank:GeneID:5469556 Probab=96.10 E-value=0.001 Score=37.03 Aligned_cols=313 Identities=10% Similarity=0.012 Sum_probs=134.0 Q ss_pred CCcceeccc---------hhhhhchhhhchhccccccc---C-cchh-ecchhhhhhhhHHHHHHHHHHHHhhhhhcccc Q lcl|NC_011142. 
1 MSEKRVVID---------AQTIAGNRWLNKFLDSNATI---G-VPSV-VNDADGGAAYYISQLASLETTVYEVPYADITY 66 (343) Q Consensus 1 ~~~~~~~~~---------~~~~~~~~~~~~~~~~~~~~---~-~~~~-~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~ 66 (343) ..++...+. .+.+...++....+.....- . ...+ ...+.+++..+.++ .+-.+|++........ T Consensus 102 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~vP~--~~~~~i~~~l~~~~~l 179 (466) T protein:vir:80 102 SGARTQQFVGGETRMKGFFRNMPYEQRAALIARSEVKEFLAQVRTLAQQKRAVSGAELTIPD--VMLELLRDNMHRYSKL 179 (466) T ss_pred HhhhhhHHhhHHHHHHHHHHhhhhhhHHHHHHHHHHHHHHHHHHHHhhhhhhhccccccccH--HHHHHHHHhhhhhhhh Confidence 111111110 00000000000000000000 0 0000 01122333334442 3334555544433333 Q ss_pred hhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHH Q lcl|NC_011142. 67 LEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQ 146 (343) Q Consensus 67 ~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k 146 (343) +..+.+..- + | ...+.+......+.+.+.. .++|..+..++.....++.++.-+.+|.+=|+.+ ..++..-- T Consensus 180 ~~~~~v~~~-~-g--~~~~~~~~~~~~a~wv~E~-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds---~~~l~~~i 251 (466) T protein:vir:80 180 ISKVRLRPL-K-G--TARQNIAGAIPEGVWTEAV-ANLNELSLSFSQIEVDGYKVGGFIPIPNSTLEDS---DLNLADEI 251 (466) T ss_pred hhheeeeec-C-c--eeEeeeecCCcceeecccc-cccccccccccceeecceeeeeehhhhHHHHhcc---hHHHHHHH Confidence 333332211 1 1 1233334444556666544 4678888888899999999988777776655543 55688888 Q ss_pred HHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccC-----cCccc-------------cCHHHHHHHHHHHHHHH Q lcl|NC_011142. 147 AELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSAT-----VNYAT-------------CTGQELFDLLNNPVFAV 208 (343) Q Consensus 147 ~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~-----~~w~~-------------~t~~~i~~di~~~~~~l 208 (343) ....+.++...+|+-+++|+....-.|+||+.+....... ..+.. 
.++...+.++...+..+ T Consensus 252 ~~~la~~~~~~~~~ail~G~G~~~P~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 331 (466) T protein:vir:80 252 LDAIGQAIGFALDKAILYGTGTKMPVGIVTRLAQTTQPPNWGTKAPAWTNLSTTNLLKIDPTGKSAEEFFSELVLKLSKA 331 (466) T ss_pred HHHHHHHHHHHHhhheeeccCCCCcceeeecccccccccccccccccccccchhhhhhhhhhccchhhHHHHHHHHHHhh Confidence 8889999999999999999987667899998754322111 11111 11222233332222221 Q ss_pred HHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceE Q lcl|NC_011142. 209 VKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRY 288 (343) Q Consensus 209 ~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~ 288 (343) . .+........++++..+..|.......+.+- .+-+--.|.+ .+-|.|+...... +. + ..-.|-...+ T Consensus 332 ~--~~~~~~~~~w~~~~~~~~~l~~~~~~~~~~g-~~~~~~~~~~--~i~G~pvv~s~~~----~~--~-~~~~g~~~~y 399 (466) T protein:vir:80 332 R--ANYSNGMKFWAMSSNTHAVLMSKAITFNSAG-ALVASLNNTM--PIVGGDIVILDFI----PD--N-DIIGGYGSLY 399 (466) T ss_pred h--ccccCCceeEEecchhHHHhhcccccccCCc-cccccCCCcc--cccccceeecCcc----Cc--c-ceeeeccccE Confidence 1 1122222335667777777654332211110 0000000100 0223333221100 00 0 0000011112 Q ss_pred EEEEcccceEEEeeccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 289 VVYDKSERNLALAKPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 289 v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +++.+ .-+++....... 
-.++ ...+....|+ +..+++|.|++++++= T Consensus 400 ~i~~r--~~~~i~~~~~~~----f~~d-~~~~r~~~r~-dg~~~~~~afv~~~~~ 446 (466) T protein:vir:80 400 LLAER--ADIKLAQSEHVR----FIED-QTVFKGTARY-DGKPVFGEGFVAVNIA 446 (466) T ss_pred EEEee--cceEEEechhhh----hhcC-cEEEEEEEEE-ccEEeccCceEEEEec Confidence 22221 112221111000 0122 2446667887 5566899999999744 No 115 >protein:vir:94933 Length: 330 # NCBI annotation: putative phage structural protein # Family: family:all:1120 # MgeID: mge:1538 # MgeName: Xp15 # Cross-refs: genbank:acc:YP_239278;genbank:gi:66392060;genbank:GeneID:5076578 Probab=95.83 E-value=0.0013 Score=36.34 Aligned_cols=307 Identities=14% Similarity=0.025 Sum_probs=133.8 Q ss_pred CCcceeccch----hhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCC Q lcl|NC_011142. 1 MSEKRVVIDA----QTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANI 76 (343) Q Consensus 1 ~~~~~~~~~~----~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~ 76 (343) |- |-.-||. ++|+- --...+.|+ .+.|++ ..+.. +.+...|+|.....-...+.+|... . T Consensus 1 ~~-~~~~~~~~~~~~~~~~---------~~p~l~m~a-lTLaea--~~l~~--d~~~~~VIE~l~~~s~iL~~lpf~~-v 64 (330) T protein:vir:94 1 MV-RICTPPLRGRWRTLTH---------QFPELKMPT-VTLAES--AKLSQ--DHLVSGLIETIVEVNPLYEMMPFTE-I 64 (330) T ss_pred Cc-eecCCccccceeehhc---------cccccchhh-hhhhHH--hhcCc--hhhHHHHHHhhhccchHHhhccccc-c Confidence 21 2222222 22221 001111222 233442 22222 3456678887766655566666421 1 Q ss_pred CcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCC--ccHHHHHHHHHHH Q lcl|NC_011142. 
77 PEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMP--IDSMQAELAFRGS 154 (343) Q Consensus 77 ~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~--l~~~k~~aA~~~~ 154 (343) ..+ .+.|......+.+.+..-+...-|.-...+.+.+..+..++..+.++.+- +...|-+ +.........+++ T Consensus 65 e~~--~~~~~r~~~lp~a~~r~~n~~~~~~~~~Tf~q~t~~l~~l~~~~~Vd~~i---adl~g~~~d~~~~q~~~~ieal 139 (330) T protein:vir:94 65 EGN--ALAYNRENVLGDVQFLAVGGTITAKNPATFTKVTSELTTLIGDAEVNGLI---QATRSDFMDQTSVQVASKAKSI 139 (330) T ss_pred cCC--cceeeeeecCCcceeeeccccccccCcceeeeeeechhhhhhhHHHHHHH---HHhcCCHHHHHHHHHHHHHHHH Confidence 111 12333222234444332221111111122333344455555444433222 2233443 3445556677789 Q ss_pred HHhhhheeeeeehh-hcceeeeecCC-ccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHh Q lcl|NC_011142. 155 EEHSQRVAYFGDTN-RNMSGLLNNPN-VTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRAS 232 (343) Q Consensus 155 ~~~~n~~~f~G~~~-~g~~GLlN~p~-v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~ 232 (343) .+++.+..+|||.. .++.||++.=. -+....+..=..-| ++|+-+++..++.. .-.|..|+++......|. T Consensus 140 ~~~~e~~linGDs~~~~F~GL~~~~~~~q~i~tg~~gg~~T----~d~LDeLl~~v~~~---~g~~~~~l~n~a~~r~I~ 212 (330) T protein:vir:94 140 GRQYQASMITGDGTGNSFQGMMGLVAASQTISAGANGGTLT----FELLDQLLDLVKDK---DGQVDYLMSSFAMRRKYF 212 (330) T ss_pred HHHHHHHhhccCCCCccccchhhcCCcccEEecCCCCCCCC----HHHHHHHHHHhcCC---CCCCcEEEechhHHHHHH Confidence 99999999999865 46789975321 11221111112234 46677777777642 125888998777655553 Q ss_pred cc-ccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcc-----cceEEEeecc-- Q lcl|NC_011142. 233 SL-LMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKS-----ERNLALAKPI-- 304 (343) Q Consensus 233 ~~-~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~-----~~~~~~~v~~-- 304 (343) .- |-....++.--....-..+.....|.|+...- ..+.-...+. .+|+...++..-. .-+..++-+. 
T Consensus 213 a~~R~~~~~~v~~~~~~~~G~~v~~~~GvPi~~~d----~ip~~~~~~~-~~~ttsIyav~~G~~~~~qgV~Gl~~~g~~ 287 (330) T protein:vir:94 213 SLLRALGGAAIGEVMTLPSGRQIPTYRGVPWFVND----FIPSNMTQGT-ATNATAIFAGTFDDGSNKYGIAGLTARGSA 287 (330) T ss_pred HHHHhccCCCCCCcccccCCCEEeeeCCeEEEecc----cccCCCCccc-CCCceeEEEEeecccccccceEeecCCCCC Confidence 31 11111121000000112333344455532210 0111111112 2344444444321 2334444222 Q ss_pred --chhccc-ceecC-ceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 305 --PFRMLA-PQLLG-LGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 305 --~~~~~~-~~~~~-~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ..+... .+.+. ..|.+.. .-|+-++-|.|++.+.|| T Consensus 288 glsVr~~G~~~~k~v~~~~v~~---y~~~av~~~~a~~~L~~V 327 (330) T protein:vir:94 288 GLRVQNVGAKENADETITRVKM---YCGFANFSQLGLAAIKGL 327 (330) T ss_pred cceeeeCCCccccceeeEEEEE---eeeeEEechhheeeeccc Confidence 123322 23333 3455533 357889999999999999 No 116 >protein:vir:95376 Length: 425 # NCBI annotation: phage major capsid protein # Family: family:all:635 # MgeID: mge:1567 # MgeName: GBSV1 # Cross-refs: genbank:acc:YP_764476;genbank:gi:115334630;genbank:GeneID:5179263 Probab=95.73 E-value=0.0015 Score=35.99 Aligned_cols=305 Identities=11% Similarity=-0.003 Sum_probs=137.8 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcch-----hecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPS-----VVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN 75 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~ 75 (343) -+++..+++...-.....++.-.... ...+.. ..+.+.+++.++.+ +.+.+.|++........+.++.+.. 
T Consensus 101 ~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~gg~~vP--~~~~~~Ii~~l~~~~~i~~~~~~~~- 176 (425) T protein:vir:95 101 QGSKGDVVEMNRLQVREMLKTGEYYK-RSEVVEFYEKFRNLRAVAGGELTIP--EVVVNRIMDIMGDYTTLYPLVDKIR- 176 (425) T ss_pred hhhhhhHHHHHHHHHHHHHhhhhhhh-hhHHHHHHHHHHhhcccccCceecc--HHHHHHHHHHHHhhhhHHHhhceee- Confidence 01111222211111111110000000 000000 01111234545555 3455667776665555555554322 Q ss_pred CCcceeEEEEeeecccccceeecCCcCccceeee-ccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 76 IPEYATHWNYRSYDGAAMGKFISANASDLPRVAQ-SAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 76 ~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~-~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) .+ | ...+.+....+.+.+++..+ .+|..+. .++......+.++.-+.+|.+=|+.+ ..++..--....+.++ T Consensus 177 ~~-g--~~~ip~~~~~~~a~~v~E~~-~~~~~~~~~f~~i~l~~~k~~~~~~iS~ell~ds---~~~l~~~i~~~l~~~i 249 (425) T protein:vir:95 177 VK-G--TTRILVDTDTSPATWIEQSG-ALPTGDVGTIASIDFDGFKVGKVTFVDNYLLQDS---IINLDDYVTKKIARAI 249 (425) T ss_pred cC-c--eeEEEEecCCcccccccccc-ccccccccccceeeeeheeeeeeehhhHHHHhcc---HHHHHHHHHHHHHHHH Confidence 22 2 23445566667777777664 4666664 47888888888888778776544443 3457788888899999 Q ss_pred HHhhhheeeeeehhh--cceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCee-cccEEEecHHHH-HH Q lcl|NC_011142. 155 EEHSQRVAYFGDTNR--NMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFH-TPNTVLMFPDLW-KR 230 (343) Q Consensus 155 ~~~~n~~~f~G~~~~--g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~-~p~tL~l~p~~~-~~ 230 (343) ++.+|+-+++|+... .-.|++++-..... .+..+.+ ..++++.+++..+... +.. ....++|++..| .. 
T Consensus 250 ~~~~d~~il~G~G~~~~~p~Gil~~~~~~~~--~~~~~~~---~~~~~~~~~~~~~~~~--~~~~~~~~~v~~~~~~~~~ 322 (425) T protein:vir:95 250 AKALDLAIVKGTGAANKQPLGIIPSLPPENQ--VTVEADN---NLLKNLVKQIGLIDTG--DDSVGEIVAVMKRSTYYNR 322 (425) T ss_pred HHHHHHHhhccCCCCccccceeecccccccc--ccccccc---chHHHHHHHHHhhhhh--ccccCceEEEEeChHHHHH Confidence 999999999998642 34788876332211 1122211 1356667777665432 222 223467777754 33 Q ss_pred Hhc-cccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc Q lcl|NC_011142. 231 ASS-LLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML 309 (343) Q Consensus 231 L~~-~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~ 309 (343) |.. +..-+..|.-++. ..+...-.+-|.|+..... . . .+. ++|-+=.+ +-+..-..++.. T Consensus 323 l~~l~~~kd~~g~~i~~--~~~~~~~~l~G~pvv~~~~-------~---~-----~~~-i~~Gd~~~-~~~~~~~~~~i~ 383 (425) T protein:vir:95 323 LVEFSIQVDSNGNVVGK--LPNLRTPDLLGLRVVFNNF-------L---D-----DDT-VLFGEFEQ-YTLVERENITID 383 (425) T ss_pred HHHHHhhcCCCCceeec--cCCCCCccccceeeEEcCc-------C---C-----Ccc-EEEEeccc-EEEEeecceEEE Confidence 421 1112333321111 0111111122333221110 0 0 011 22211111 111111111111 Q ss_pred c-ceecC--ceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 310 A-PQLLG--LGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 310 ~-~~~~~--~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) . .+... 
-...+..+.++ +..+++|.|+++++== T Consensus 384 ~~~~~~f~~~~~~~~~~~r~-d~~~~~~~a~~~~~i~ 419 (425) T protein:vir:95 384 SSTHVKFTEDQTAFRGKGRF-DGKPVKPEAFVLVTIT 419 (425) T ss_pred eecccccccCceEEEEEEee-CcEeecccceEEEEec Confidence 0 11111 12344455665 5688899999988522 No 117 >protein:vir:9704 Length: 394 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:174 # MgeName: 315.2 # Cross-refs: genbank:acc:NP_795466;genbank:gi:28876225;genbank:GeneID:1257769 Probab=95.60 E-value=0.0014 Score=36.21 Aligned_cols=289 Identities=8% Similarity=0.009 Sum_probs=124.1 Q ss_pred CCcce----------eccch-hhhhchhhhchhccccccc-CcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchh Q lcl|NC_011142. 1 MSEKR----------VVIDA-QTIAGNRWLNKFLDSNATI-GVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLE 68 (343) Q Consensus 1 ~~~~~----------~~~~~-~~~~~~~~~~~~~~~~~~~-~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~ 68 (343) +.+++ ...+. ............+...... ......+ .+.|.++.. +.+.+.|++........+. T Consensus 85 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~t--~~~gg~liP--~~~~~~ii~~~~~~~~l~~ 160 (394) T protein:vir:97 85 KTYRESVNDFIRSKGKIVNDSLRFEGKDEVLMPINETTPVEPQKDGIK--KENAKPVSS--EEILYTPAREVKTVVDLKP 160 (394) T ss_pred HHHHHHHHHHHHHHHHHhhhhhhhhhHHHHHHHHHhhhhhhhhccccc--cccccccCh--HHHHHHHHHHhhhhhhhhh Confidence 00000 00000 0000000000000000000 0000011 122444444 3455677776666655555 Q ss_pred hccccCCCCcceeEEEEeeec-ccccceeecCCcCccce-eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHH Q lcl|NC_011142. 69 DVPVLANIPEYATHWNYRSYD-GAAMGKFISANASDLPR-VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQ 146 (343) Q Consensus 69 ~i~v~~~~~~~~~~~~~~~~~-~~G~a~~~~~~~~dip~-v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k 146 (343) +..+.. .+-+ +..+.+.. ..+.+.+++..+ ..|. -+..++........++.-+.+|.+=|+. 
...++..-- T Consensus 161 ~~~~~~-~~~~--~~~~~~~~~~~~~~~~v~E~~-~~~~~~~~~~~~v~l~~~k~~~~i~is~ell~d---s~~~~~~~i 233 (394) T protein:vir:97 161 FTTVYQ-AKKA--SGKYPVLQRATTKMVTVAELE-KNPALAKPDFKDVAWNIDTYRGAIPLSQESIDD---ADVDLVGIV 233 (394) T ss_pred hceeee-ccCc--ceEEEEEecCCCccceecccc-cccccccccceeEEeehhheeeehhhHHHHHhh---hhHHHHHHH Confidence 544421 1212 23344443 334556666554 3453 3456778888888888777776543332 244566767 Q ss_pred HHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHH Q lcl|NC_011142. 147 AELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPD 226 (343) Q Consensus 147 ~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~ 226 (343) ....++.+...+|..+++|..... +.. ..+.+ ||.+++...... . ..-.++|+|. T Consensus 234 ~~~la~~~~~~~~~~i~~g~~~~~---------------~~~--~~~~~----~~~~~~~~~~~~-~---~~a~~v~n~~ 288 (394) T protein:vir:97 234 SESISQIKVNTTNDAIAKVLKSFT---------------TKT--VKNLD----EIKALLNGGFDP-A---YNVSLIVSQS 288 (394) T ss_pred HHHHHHHHHHHHHHHHhhcccccc---------------ccc--cccHH----HHHHHHHhhhhh-h---hCCEEEEcHH Confidence 777888888899988887753210 011 12333 444444332211 1 1246899999 Q ss_pred HHHHHhccccCCCCCccHHH-HHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccc Q lcl|NC_011142. 227 LWKRASSLLMTGYTDRTVIE-HFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIP 305 (343) Q Consensus 227 ~~~~L~~~~~~~~~~~tvle-~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~ 305 (343) .|..|.... +..|.-++. -+. ++.-..+-|.|+... .. ... +.+.+++-+-+ +.+.+..-.. T Consensus 289 ~~~~l~~lk--d~~G~~i~~~~~~-~~~~~~l~G~pv~~~-------~~---~~~---~~~~~~~gd~~-~~~~~~~~~~ 351 (394) T protein:vir:97 289 FYQTLDTLK--DGNGRYLLQDDIT-AVSGKVLLGKPVFVL-------SD---EVL---GANKAFIGDFK-RGVLFADRKD 351 (394) T ss_pred HHHHHHHhh--ccCCCeeeecCcC-CCCCceeccceeEEe-------cc---ccc---CCccEEEeecc-ccEEEEEecc Confidence 999986532 333322211 000 111112334443221 10 011 11111111111 1111111112 Q ss_pred hhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
306 FRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 306 ~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++............+.++.|+ |+.+.+|.+|+.++.= T Consensus 352 ~~~~~~~~~~~~~~~~~~~r~-d~~v~~~~a~~~~~~~ 388 (394) T protein:vir:97 352 LGLRWADNEIYGQYLQAVLRF-GVSKVDDKAGYYVTFT 388 (394) T ss_pred eEEEEecccccceeEEEEEEE-ccEEecccceEEEEec Confidence 222111111122334667887 4577789999988877 No 118 >protein:vir:99888 Length: 309 # NCBI annotation: capsid protein # Family: family:all:908 # MgeID: mge:1480 # MgeName: B3 # Cross-refs: genbank:acc:YP_164075;genbank:gi:56692607;genbank:GeneID:3192616 Probab=95.47 E-value=0.002 Score=35.42 Aligned_cols=270 Identities=11% Similarity=0.035 Sum_probs=115.3 Q ss_pred cCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeec---CCcCccc Q lcl|NC_011142. 29 IGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFIS---ANASDLP 105 (343) Q Consensus 29 ~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~---~~~~dip 105 (343) |+...+..| ..|+.+= +.-.-+++.+.++||.. +-..+++.|..++..-...... ....+.- T Consensus 1 ~~~~~~~~d---------p~LT~~A---~gy~n~~~Ia~~l~P~v---pV~~~~~~~~~f~~~e~F~~~~t~r~~~~~~~ 65 (309) T protein:vir:99 1 MSNAPFPID---------PELTAIA---IAYRNGRMISDEVLPRV---PVGKQEFKFWKYDLAQGFTVPETLVGRKSKPN 65 (309) T ss_pred CCCCCcCcC---------HhHHHHH---hhccChhhhhhhcCCcc---ccCccccceeeechhhcccccchhhccCCCcc Confidence 221111222 1233332 22233457788888874 3334444444443321111110 1111223 Q ss_pred eeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhh----hheeeeeehhhcceeeeecCC-- Q lcl|NC_011142. 106 RVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHS----QRVAYFGDTNRNMSGLLNNPN-- 179 (343) Q Consensus 106 ~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~----n~~~f~G~~~~g~~GLlN~p~-- 179 (343) .++.........+...+....+..+|+..|. .+.++.......+...+...+ -++++.-. |.|. 
T Consensus 66 ~v~~~~~~~~~~~~~~~L~~~i~~~~~~~a~-~~~d~~~~Av~~l~~~i~l~rE~~~A~lv~~~a---------~y~~~~ 135 (309) T protein:vir:99 66 EVEFSATDETGSTEDHGLDAPVPQADIDNAP-TNYNPLGHATEQTTNLILLDREARTSKLVFSPN---------SYAAGN 135 (309) T ss_pred eEeecccCceeeecccceeecCCchhhhhcc-CCCCHHHHHHHHHHHHHHHHHHHHHHHHhcChh---------hcCCCc Confidence 4455555666666666666677777877653 356666665555555554443 33333321 2221 Q ss_pred ccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeecc Q lcl|NC_011142. 180 VTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTR 259 (343) Q Consensus 180 v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~ 259 (343) .-+.+.+..|++.++ +++.||.++..++ | ..|++++|..+.|..|.+ ...+++.|+-++....+.. T Consensus 136 k~~Lsgt~~wsd~~S-DPi~~i~~~~~~~-----g-~~PN~~vlg~~~~~~l~~-------hp~i~~~ik~~~~~~g~it 201 (309) T protein:vir:99 136 KTTLSGADQWSDPTS-NPLPVITDALDSV-----I-LRPNIGVLGRRTATILRR-------HPKIVKAYNGSLGDEGMVP 201 (309) T ss_pred eEEecCccccCCCCC-CcHHHHHHHHHhh-----C-CCcceEEechHHHHHHhh-------CHHHHHHhcCCCccccccC Confidence 112223346988765 4688999887765 3 489999999999988753 1133333332221111110 Q ss_pred ccccccccce----eeec-hhhhccccCC-------ccceEEEEEccc-ceEEEeeccchhcccc----eecCceeEeee Q lcl|NC_011142. 260 NPIDIKIRFQ----LMAT-ELAAAGVSNG-------NKDRYVVYDKSE-RNLALAKPIPFRMLAP----QLLGLGITVPA 322 (343) Q Consensus 260 ~p~~i~~~~~----l~~~-~~~~~g~~~~-------g~dr~v~y~~~~-~~~~~~v~~~~~~~~~----~~~~~~~~~~~ 322 (343) .. ......+ ++.. .+..+..+.. |.+..++|.... +.+. .| ++... ....-.+..++ T Consensus 202 ~~-~la~l~~ve~V~vg~a~~n~a~~g~~~~~~~iwg~~~~L~y~~~~~~~~~----~p-s~G~t~~~~~r~~g~~~d~~ 275 (309) T protein:vir:99 202 MA-FLQELLELDAIYIGEARLNIARPGQNPNLIRAWGPHASFIYRDRLADTRN----GT-TFGLTAQWGDRVSGSIADPN 275 (309) T ss_pred HH-HHHHHhCcceEEeecceeeccccccccccccccCCcEEEEEcCCCCCCcc----cc-cccceeecccccCCceeeee Confidence 00 0000000 0000 0000000000 345555664322 1211 11 11111 11222345566 Q ss_pred eeeeeeEEEE-----CcceeeeeccC Q lcl|NC_011142. 
323 EYKISGTEYR-----YPLCAQYVDML 343 (343) Q Consensus 323 ~~~~gGv~i~-----~P~ai~~~dGI 343 (343) +..-||-.|| .|.-++.--|. T Consensus 276 ~~~~g~~~vr~~~~~k~~i~~~d~G~ 301 (309) T protein:vir:99 276 IGLRGGQRVRVGESVKELVTAPDLGF 301 (309) T ss_pred eccCCceEEEEeccccchhcchhcch Confidence 5555553333 12222222222 No 119 >protein:vir:93881 Length: 387 # NCBI annotation: ORF011 # Family: family:all:658 # MgeID: mge:1485 # MgeName: 3A # Cross-refs: genbank:acc:YP_239938;genbank:gi:66395599;genbank:GeneID:5130947 Probab=95.43 E-value=0.0021 Score=35.29 Aligned_cols=291 Identities=10% Similarity=0.027 Sum_probs=136.3 Q ss_pred CCcceeccch-----hh-hhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccC Q lcl|NC_011142. 1 MSEKRVVIDA-----QT-IAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLA 74 (343) Q Consensus 1 ~~~~~~~~~~-----~~-~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~ 74 (343) +.++.....+ +. +.+.+.......+..- ..++.+...++|.++.. +.+.+.|++.....-..+.+..+.+ T Consensus 81 ~~~~~~~~~~~~~~~r~~~~~~~~~~~~~~~~~~--~~al~~~t~s~gG~~IP--~~~~~~Ii~~~~~~~~l~~~~~v~~ 156 (387) T protein:vir:93 81 LNDHEKMVKAKAEFYRHAILPNEFEKPSMEAQRL--LHALPTGNDSGGDKLLP--KTLSKEIVSEPFAKNQLREKARLTN 156 (387) T ss_pred cchhhHHHHHHHHHHHHHhhhhhhhhhhhhhHHH--HHhhccCcCCCCceeec--hhHHHHHHHHHHhhchhhhheeeee Confidence 2211111111 11 1111111000000000 01112222344555555 3455667776666555566665533 Q ss_pred CCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 75 NIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 75 ~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) -.+ .++. .+....+.+.+++... 
..|..+..++......+.++.-+.+|.+=|+ ....++..--....+.++ T Consensus 157 ~~~---~~~p-~~~~~~~~a~~v~E~~-~~~~~~~~f~~v~~~~~k~~~~~~iS~ell~---Ds~~~l~~~i~~~la~~~ 228 (387) T protein:vir:93 157 IKG---LEIP-RVSYTLDDDDFITDVE-TAKELKLKGDTVKFTTNKFKVFAAISDTVIH---GSDVDLVNWVENALQSGL 228 (387) T ss_pred cCC---ceEE-EEeecCCccccccCcc-cccccccccceeeeeheeeeeechhhHHHHh---hhHHHHHHHHHHHHHHHH Confidence 221 1111 1222345566776543 4666777788888888888887888755343 234557777777777888 Q ss_pred HHhhhheeee-eehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhc Q lcl|NC_011142. 155 EEHSQRVAYF-GDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASS 233 (343) Q Consensus 155 ~~~~n~~~f~-G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~ 233 (343) ...++..+|. |+....-.|+++++++...+.. ..+++|.+++..+... +.. .-..+|++..|..+.+ T Consensus 229 ~~~e~~~~~~~g~g~g~p~g~l~~~~~~~v~~~---------~~~d~i~~~~~~l~~~--~~~-~a~~~mn~~t~~~~~~ 296 (387) T protein:vir:93 229 AAKERKDALAVSPKSGLDHMSFYNGSVKEVEGA---------DMYDAIINALADLHED--YRD-NATIYMRYADYVKIIS 296 (387) T ss_pred HHHHHHhHhhcCCCccccceeeecccccccccc---------chHHHHHHHHhccChh--hhc-CCEEEEechHHHHHHH Confidence 8888776664 4443345788888776554332 2256677777666432 221 2357888887766544 Q ss_pred cccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc-ce Q lcl|NC_011142. 234 LLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA-PQ 312 (343) Q Consensus 234 ~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~-~~ 312 (343) .. .++++ .++ ...+. .+-|.|+.+..- + .+ +++-+.+.=++. . 
..+...+ .+ T Consensus 297 ~~-~d~~~-~~~----~~~~~-~llG~PV~~~~~------------~----~~-~~~GDf~~~~~~-~--~~~~~~~~~~ 349 (387) T protein:vir:93 297 VL-SNGTT-NFF----DTPAE-KVFGKPVVFTDA------------A----VK-PIVGDFNYFGIN-Y--DGTTYDTDKD 349 (387) T ss_pred HH-hcCCC-ccc----ccCCc-cccccceEEecC------------C----Cc-eeeeehhhhhee-h--hhheeeeccc Confidence 32 23222 111 12222 234556543211 0 01 111011100000 0 0111110 11 Q ss_pred ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 313 LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 313 ~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ...-.+.+-+..|++|. +.+|.|++++.-= T Consensus 350 ~~~~~~~~~~~~r~d~~-v~~~eA~~~l~~k 379 (387) T protein:vir:93 350 VKKGEYLFVLTAWYDQQ-RTLDSAFRIAKAK 379 (387) T ss_pred ccCCceeEEEEeeeCce-eechhheEEEEee Confidence 12223444556787554 6679999977543 No 120 >protein:vir:9361 Length: 402 # NCBI annotation: SLT orf 37-like protein # Family: family:all:658 # MgeID: mge:166 # MgeName: phi 12 # Cross-refs: genbank:acc:NP_803339;genbank:gi:29028650;genbank:GeneID:1258088 Probab=95.25 E-value=0.0022 Score=35.18 Aligned_cols=290 Identities=10% Similarity=0.016 Sum_probs=134.2 Q ss_pred CCcceeccchh------hhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccC Q lcl|NC_011142. 1 MSEKRVVIDAQ------TIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLA 74 (343) Q Consensus 1 ~~~~~~~~~~~------~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~ 74 (343) .+++.....+. ...+++..+....+..- ..++.+-.+++|.++.. +.+...|++.....-..+.+..+.+ T Consensus 96 ~~~~~~~~~~~~~~~r~~~~~~~~~~~~~~~~~~--~~a~~~~t~~~GG~lIP--~~~~~~Ii~~~~~~~~l~~~~~v~~ 171 (402) T protein:vir:93 96 LSDNEKMVKAKAEFYRHAILPNEFEKPSMEAQRL--LHALPTGNDSGGDKLLP--KTLSKEIVSEPFAKNQLREKARLTN 171 (402) T ss_pred CchhHHHHHHHHHHHHHHHhhhhHHHHHHhHHHH--HhhhccCCCcCCccccc--hhHHHHHHHhHHhhhhhhhhceeee Confidence 11111111110 01111111110000000 00111111233445555 4456677776666655566655533 Q ss_pred CCCcceeEEEEeee-cccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHH Q lcl|NC_011142. 
75 NIPEYATHWNYRSY-DGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRG 153 (343) Q Consensus 75 ~~~~~~~~~~~~~~-~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~ 153 (343) -.+ .+ +... ...+.+.+++..+ ..|..+..++......+.++.-+.+|.+=|+- ...++..--....+++ T Consensus 172 ~~~---~~--~p~~~~~~~~a~~v~Eg~-~~~~~~~~f~~i~~~~~k~~~~i~iS~ell~D---s~~~l~~~i~~~la~~ 242 (402) T protein:vir:93 172 IKG---LE--IPRVSYTLDDDDFITDVE-TAKELKAKGDTVKFTTNKFKVFAAISDTVIHG---SDVDLVNWVENALQSG 242 (402) T ss_pred cCC---ce--eeeeeccCCccccccccc-cccccccccceeeecceeeeeechhhHHHHhh---hHHHHHHHHHHHHHHH Confidence 221 11 2222 2334566766543 46767777888888888888877887553433 2455666667777777 Q ss_pred HHHhhhheeee-eehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHh Q lcl|NC_011142. 154 SEEHSQRVAYF-GDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRAS 232 (343) Q Consensus 154 ~~~~~n~~~f~-G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~ 232 (343) +...+++.+|. |+....-.|+++++++...+.. ..+++|.+++..|... +. ..-..+|.+..+..+. T Consensus 243 ~~~~e~~~~~~~g~g~g~p~g~~~~~~~~~~~~~---------~~~d~l~~~~~~l~~~--y~-~na~~imn~~t~~~~~ 310 (402) T protein:vir:93 243 LAAKERKDALAVSPKSGLEHMSFYNGSVKEVEGA---------DMYDAIINALADLHED--YR-DNATIYMRYADYVKII 310 (402) T ss_pred HHHHHHHhHhhcCCCccccceeeecccccccccc---------chHHHHHHHHhccChh--hh-cCCEEEEechHHHHHH Confidence 77777766554 4443345788887776554322 2356777777766432 22 1235788888776665 Q ss_pred ccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcc-cc Q lcl|NC_011142. 233 SLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRML-AP 311 (343) Q Consensus 233 ~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~-~~ 311 (343) +.. .++ +..++ ...|. .+-|.|+.+..- . ++..++ +.+ ..+. 
.-..+..- .- T Consensus 311 ~~~-~d~-~~~~~----~~~~~-~llG~PV~~t~~------------~----~~i~~G-Df~-~~~~--~~~~~~~~~~~ 363 (402) T protein:vir:93 311 SVL-SNG-TTNFF----DTPAE-KVFGKPVVFTDA------------A----VKPIVG-DFN-YFGI--NYDGTTYDTDK 363 (402) T ss_pred HHH-hcC-CCccc----ccCCc-cccccceEEecC------------C----Cceeee-chh-hhhh--hhhhhhhhhhh Confidence 433 222 22221 12222 233555533210 0 011110 000 0000 00001000 01 Q ss_pred eecCceeEeeeeeeeeeEEEECcceeeeecc--C Q lcl|NC_011142. 312 QLLGLGITVPAEYKISGTEYRYPLCAQYVDM--L 343 (343) Q Consensus 312 ~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dG--I 343 (343) +...-...+-+..|++| .+.+|.|++++.- - T Consensus 364 ~~~~~~~~~~~~~r~Dg-~v~~~~A~~~l~ik~~ 396 (402) T protein:vir:93 364 DVKKGEYLFVLTAWYDQ-QRTLDSAFRIAKAKEN 396 (402) T ss_pred cccCCceEEEEEEEeCc-EEechhheEEEEeecC Confidence 11122355666778855 4557999986543 2 No 121 >protein:vir:1239 Length: 274 # NCBI annotation: similar to phage B1 major head protein # Family: family:all:522 # MgeID: mge:25 # MgeName: phi ETA # Cross-refs: genbank:acc:NP_510938;genbank:gi:17426272;genbank:GeneID:927376 Probab=95.24 E-value=0.0025 Score=34.89 Aligned_cols=265 Identities=8% Similarity=-0.024 Sum_probs=123.2 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCc-ceeEEEEeeecccccceeecCCcCcc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPE-YATHWNYRSYDGAAMGKFISANASDL 104 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~-~~~~~~~~~~~~~G~a~~~~~~~~di 104 (343) |++ ..+..+| .+..+-|. +.|.+.....+....+..+...+.. 
+-.++.++.+...|.+..+.++ +++ T Consensus 1 ma~----~~T~l~d---~iiPev~~---~~v~~~~~~~l~~~~~~~~d~~l~g~~G~tv~iP~~~~ig~a~~~~~g-~~i 69 (274) T protein:vir:12 1 MAQ----GLTKTSN---QIIPEVLA---PMMQAQLEKKLRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQVVAEG-EKI 69 (274) T ss_pred CCc----ceeehhh---hhchHHHH---HHHHHHHHhhhhhcccceecccccCCCCCEEEEeeecCCCccccccCC-Ccc Confidence 221 1111111 12222232 2233333333444555544433221 2345677777777888887665 578 Q ss_pred ceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccc Q lcl|NC_011142. 105 PRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTS 184 (343) Q Consensus 105 p~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~ 184 (343) +..+...+.....+.+.+.+|.+ .|++..+..+-++ ......+...+++..|+-++.--.. .+. +. . T Consensus 70 ~~~~lt~~~~~~~i~~~~~~~~i--~D~~~~~~~~d~~-~~~~~q~~~~~a~~vd~~~l~~~~~--------a~~-~~-~ 136 (274) T protein:vir:12 70 PTDILETKKREAKIRKIAKGTSI--TDEALLSGYGDPQ-GEQVRQHGLAHANKVDNDVLEALMG--------AKL-TV-N 136 (274) T ss_pred chhhcccceeeEEeeeecceeee--cHHHHHhcccchH-HHHHHHHHHHHHHHHHHHHHHHHhc--------ccc-cc-c Confidence 88888888888888887666554 5566555544444 5566667777777777655421110 000 00 0 Q ss_pred cCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccH-HHHHHhcCcceeecccccc Q lcl|NC_011142. 185 ATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTV-IEHFQINNAYTLLTRNPID 263 (343) Q Consensus 185 ~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tv-le~l~~n~~~~~~~~~p~~ 263 (343) . ..-+ ++.|.++..++-.. ...+..|+++|..+..|.+-...+...-+- .+=+..++..-.+.|.++. 
T Consensus 137 ~----~a~~----~d~i~dA~~~lgd~---~~~~~~ivv~p~~~~~L~k~~~~~fv~~s~~g~~~~~~G~ig~~~G~~Vi 205 (274) T protein:vir:12 137 A----DITK----LNGLQSAIDKFNDE---DLEPMVLFINPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAIIV 205 (274) T ss_pred c----cccC----HHHHHHHHHHhccc---cccccEEEeCHHHHHHHHhhhhhhccccccccccceecccceeecCeeEE Confidence 0 0112 45566676666432 236788999999999987531101000000 0011123333333333332 Q ss_pred ccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeecc Q lcl|NC_011142. 264 IKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVDM 342 (343) Q Consensus 264 i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dG 342 (343) .+... + . ++ .|.-.+.-+.+....+++.-.- .++...-.+... ...|+-+.+|..++.+.. T Consensus 206 ~s~~~----p------~---~t----~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~-~~y~~~~~~~~~vv~~t~ 267 (274) T protein:vir:12 206 RSNKL----E------A---GT----AILAKKGAVKLILKRDFFLEVARDASTKTTALYSD-KHYVAYLYDESKAVKITK 267 (274) T ss_pred EeCCC----C------c---ce----EEEEeccceeeeecCCceeccccchhhcccEEEee-eEEEEEEEcCCceEEEEc Confidence 22110 0 0 00 1111122222222222221100 011112222222 345888889988888876 Q ss_pred C Q lcl|NC_011142. 343 L 343 (343) Q Consensus 343 I 343 (343) = T Consensus 268 ~ 268 (274) T protein:vir:12 268 G 268 (274) T ss_pred C Confidence 5 No 122 >protein:vir:962 Length: 397 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:19 # MgeName: bIL285 # Cross-refs: genbank:acc:NP_076616;genbank:gi:13095724;genbank:GeneID:920264 Probab=95.07 E-value=0.0016 Score=35.91 Aligned_cols=292 Identities=7% Similarity=-0.016 Sum_probs=121.6 Q ss_pred CCcc----eeccchhhhhchhhhc-------hhccccc-ccC--cchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccc Q lcl|NC_011142. 1 MSEK----RVVIDAQTIAGNRWLN-------KFLDSNA-TIG--VPSVVNDADGGAAYYISQLASLETTVYEVPYADITY 66 (343) Q Consensus 1 ~~~~----~~~~~~~~~~~~~~~~-------~~~~~~~-~~~--~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~ 66 (343) ..+. ..+-..+.+....... .++.... ... ..+...+ ..+.+... 
+.+...+++ ....... T Consensus 87 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~vp--~~~~~~i~~-~~~~~~l 161 (397) T protein:vir:96 87 AADPTDQKPKDGEKRKMKKFKVTEEELAEKRSAINAFVKSKGAEKRDGFTS--VEGGALIP--QELLQPQLE-PKDIVDL 161 (397) T ss_pred hhhhhhhhhHHHHHHHHHHHhhhhHHHHHHHHHHHHHHHhhhhhhhhcccc--cccccchh--HHHHHHHHH-hhhhhhH Confidence 0000 0000000000000000 0000000 000 0000111 12222222 344455555 2333233 Q ss_pred hhhccccCCCCcceeEEEEeeecc-cccceeecCCcCccc-eeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccH Q lcl|NC_011142. 67 LEDVPVLANIPEYATHWNYRSYDG-AAMGKFISANASDLP-RVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDS 144 (343) Q Consensus 67 ~~~i~v~~~~~~~~~~~~~~~~~~-~G~a~~~~~~~~dip-~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~ 144 (343) +..+.+ .+....+..+.+... .+.+.+++..+. .| ..+..++.....+..++.-..++.+=|+.+ ..++.. T Consensus 162 ~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~E~~~-~~~~~~~~~~~i~~~~~~~~~~~~~s~ell~ds---~~~l~~ 234 (397) T protein:vir:96 162 SKYVRS---VPVNSASGKFPVISKSGSKMATVQQLEK-NPQLANPKMVEIDYSVATRRGYIPISQEMIDDA---SYDVTG 234 (397) T ss_pred HHhhhh---ccccccceeEEEEeccCCcccccccccc-ccccccccccceeecHhHhhcchhhHHHHHhhh---HHHHHH Confidence 333332 222223344444432 344455555443 34 345567777777777776666665544433 345666 Q ss_pred HHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEec Q lcl|NC_011142. 145 MQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMF 224 (343) Q Consensus 145 ~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~ 224 (343) --....+.++...+|.-+++|.....-.| ..+. +||.+++...... 
.+ .-.++|+ T Consensus 235 ~i~~~l~~~~~~~~~~~i~~g~g~~~~~~-----------------~~~~----d~~~~~~~~~~~~-~~---~a~~v~n 289 (397) T protein:vir:96 235 LIADEIQDQSLNTKNADIAAVLKTATAKS-----------------VVGV----DGLKDLINKEIKK-VY---DVKLFIS 289 (397) T ss_pred HHHHHHHHHHHHHHHHHHhhccccccccc-----------------ccch----HHHHHHHHHhhhh-hc---CcEEEEc Confidence 66777888888899998888865321111 1123 3444444432221 11 2469999 Q ss_pred HHHHHHHhccccCCCCCccHHH-HHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeec Q lcl|NC_011142. 225 PDLWKRASSLLMTGYTDRTVIE-HFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKP 303 (343) Q Consensus 225 p~~~~~L~~~~~~~~~~~tvle-~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~ 303 (343) |+.|..|..- .+..|.-++. -+. +.....+-|.|+.... . ...+. ..+ +..++|-+=.+.+.+..- T Consensus 290 ~~~~~~l~~l--kd~~G~~~~~~~~~-~~~~~~l~G~pv~~~~-------~-~~~~~-~~~-~~~~~~gd~~~~~~~~~~ 356 (397) T protein:vir:96 290 ASMYSELDKL--KDKNGRYLLQDSIT-AASGKQLLGKEVVVLD-------D-DVIGK-SVG-NVVGFIGDAKAFASFFDR 356 (397) T ss_pred HHHHHHHHHh--hccCCCeEeccCcc-CCCcccccccceEEec-------c-cccCC-CCC-ceEEEEeehhcceEeEee Confidence 9999999653 2434432211 011 1111123344432211 0 00111 112 223333322222223222 Q ss_pred cchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
304 IPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 304 ~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +.++............+..+.|++ +.+++|.+++.+.-= T Consensus 357 ~~~~~~~~~~~~~~~~~~~~~r~d-~~~~~~~a~~~~~~~ 395 (397) T protein:vir:96 357 KQVSVSWVDNNIYGQLLAGIIRYD-VKATDKKAGFYVTFT 395 (397) T ss_pred cceEEEEecccccceeEEEEEEEc-cEEecccceEEEEee Confidence 334433322222334455667774 577799999887632 No 123 >protein:vir:96262 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1612 # MgeName: ROSA # Cross-refs: genbank:acc:YP_240311;genbank:gi:66395978;genbank:GeneID:5133339 Probab=94.97 E-value=0.0031 Score=34.36 Aligned_cols=264 Identities=8% Similarity=-0.065 Sum_probs=125.4 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~~d 103 (343) |++ ..+.=+ -.+-.+-|. +.|.+.....+....+..+... +.+| .++.++.+...|.+..+.++ ++ T Consensus 1 m~~----~~T~l~---d~i~Pev~~---~~v~~~~~~~l~~~~~~~~~~~l~g~~G-~tv~iP~~~~ig~a~~~~~g-~~ 68 (274) T protein:vir:96 1 MAQ----GMTKLT---NQIVPEVLA---PMMQAELEKKLRFASFAEIDNTLVGQPG-DTLTFPAFIYSGDAKVVAEG-EK 68 (274) T ss_pred CCc----ceeehh---heechHHHH---HHHHHHHHhhhhccccceecccccCCCC-CEEEeeeecCCCccccccCC-Cc Confidence 211 001101 111122222 1233333444455555443332 2234 56777777878988887664 57 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccc Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKT 183 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~ 183 (343) ++......+.....+.+.+.+|.+ .|++..+..+ ++-..-...+...+++..|+.++- .+. 
+ T Consensus 69 i~~~~lt~~~~~~~i~~~~~a~~i--~D~~~~~~~~-d~~~~~~~~~~~~~a~~vd~~i~~---~l~--~---------- 130 (274) T protein:vir:96 69 IPTDILETKKREAKIRKIAKGTSI--SDEALLSGYG-DPQGEQVRQHGLAHANKVDDDVLE---ALK--S---------- 130 (274) T ss_pred cchhhcccceeEEEeeeeecceee--hHHHHhhccc-hHHHHHHHHHHHHHHHHHHHHHHH---HHh--c---------- Confidence 888888888888888877666555 5666555444 444556677777788777775541 111 0 Q ss_pred ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccH-HHHHHhcCcceeeccccc Q lcl|NC_011142. 184 SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTV-IEHFQINNAYTLLTRNPI 262 (343) Q Consensus 184 ~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tv-le~l~~n~~~~~~~~~p~ 262 (343) +...+++++. -++.|.++..++-.. ...+..|+++|..+..|.+-..-+...-|- .+=+..|+..-.+.|.++ T Consensus 131 -a~~~~~~~~~--~~d~i~~A~~~lgd~---~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~V 204 (274) T protein:vir:96 131 -AKLTVEADIT--KLTGLQTAIDKFNDE---DLEPMVLFISPLDAGKLRGDATTNFTRATELGDDVIVKGAFGEALGAVI 204 (274) T ss_pred -cccccccccc--CHHHHHHHHHHhccc---cccccEEEeCHHHHHHHHhhccccccccccccccceeccccceecCeEE Confidence 0001111111 155666677666432 236789999999999997531101000000 011222333333333332 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ..+.. ... + . .|.-.+.-+.+....+++.-.- .++...-.+.. -...|+.+.+|..++.+. T Consensus 205 i~s~~----------~~~---~--t--~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~-~~~y~~~~~~~~~~v~~t 266 (274) T protein:vir:96 205 VRSNK----------LEA---G--T--AILAKKGAVKLITKRDFFLETDRDPSTKTTALYS-DKHYVAYLYDESKAVKIT 266 (274) T ss_pred EEeCC----------CCC---c--e--EEEEeccceeeeecCCcccccccccccccCEEEE-eEEEEEEEEcCCcEEEEE Confidence 22111 000 0 0 1111122222222222221110 11112223322 345689999998888877 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) -= T Consensus 267 k~ 268 (274) T protein:vir:96 267 KG 268 (274) T ss_pred cC Confidence 43 No 124 >protein:vir:95898 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1588 # MgeName: 71 # Cross-refs: genbank:acc:YP_240385;genbank:gi:66396054;genbank:GeneID:5133409 Probab=94.97 E-value=0.0031 Score=34.36 Aligned_cols=264 Identities=8% Similarity=-0.065 Sum_probs=125.4 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC--CCcceeEEEEeeecccccceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN--IPEYATHWNYRSYDGAAMGKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~--~~~~~~~~~~~~~~~~G~a~~~~~~~~d 103 (343) |++ ..+.=+ -.+-.+-|. +.|.+.....+....+..+... +.+| .++.++.+...|.+..+.++ ++ T Consensus 1 m~~----~~T~l~---d~i~Pev~~---~~v~~~~~~~l~~~~~~~~~~~l~g~~G-~tv~iP~~~~ig~a~~~~~g-~~ 68 (274) T protein:vir:95 1 MAQ----GMTKLT---NQIVPEVLA---PMMQAELEKKLRFASFAEIDNTLVGQPG-DTLTFPAFIYSGDAKVVAEG-EK 68 (274) T ss_pred CCc----ceeehh---heechHHHH---HHHHHHHHhhhhccccceecccccCCCC-CEEEeeeecCCCccccccCC-Cc Confidence 211 001101 111122222 1233333444455555443332 2234 56777777878988887664 57 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccc Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKT 183 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~ 183 (343) ++......+.....+.+.+.+|.+ .|++..+..+ ++-..-...+...+++..|+.++- .+. 
+ T Consensus 69 i~~~~lt~~~~~~~i~~~~~a~~i--~D~~~~~~~~-d~~~~~~~~~~~~~a~~vd~~i~~---~l~--~---------- 130 (274) T protein:vir:95 69 IPTDILETKKREAKIRKIAKGTSI--SDEALLSGYG-DPQGEQVRQHGLAHANKVDDDVLE---ALK--S---------- 130 (274) T ss_pred cchhhcccceeEEEeeeeecceee--hHHHHhhccc-hHHHHHHHHHHHHHHHHHHHHHHH---HHh--c---------- Confidence 888888888888888877666555 5666555444 444556677777788777775541 111 0 Q ss_pred ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccH-HHHHHhcCcceeeccccc Q lcl|NC_011142. 184 SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTV-IEHFQINNAYTLLTRNPI 262 (343) Q Consensus 184 ~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tv-le~l~~n~~~~~~~~~p~ 262 (343) +...+++++. -++.|.++..++-.. ...+..|+++|..+..|.+-..-+...-|- .+=+..|+..-.+.|.++ T Consensus 131 -a~~~~~~~~~--~~d~i~~A~~~lgd~---~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~V 204 (274) T protein:vir:95 131 -AKLTVEADIT--KLTGLQTAIDKFNDE---DLEPMVLFISPLDAGKLRGDATTNFTRATELGDDVIVKGAFGEALGAVI 204 (274) T ss_pred -cccccccccc--CHHHHHHHHHHhccc---cccccEEEeCHHHHHHHHhhccccccccccccccceeccccceecCeEE Confidence 0001111111 155666677666432 236789999999999997531101000000 011222333333333332 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ..+.. ... + . .|.-.+.-+.+....+++.-.- .++...-.+.. -...|+.+.+|..++.+. T Consensus 205 i~s~~----------~~~---~--t--~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~-~~~y~~~~~~~~~~v~~t 266 (274) T protein:vir:95 205 VRSNK----------LEA---G--T--AILAKKGAVKLITKRDFFLETDRDPSTKTTALYS-DKHYVAYLYDESKAVKIT 266 (274) T ss_pred EEeCC----------CCC---c--e--EEEEeccceeeeecCCcccccccccccccCEEEE-eEEEEEEEEcCCcEEEEE Confidence 22111 000 0 0 1111122222222222221110 11112223322 345689999998888877 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) -= T Consensus 267 k~ 268 (274) T protein:vir:95 267 KG 268 (274) T ss_pred cC Confidence 43 No 125 >protein:vir:100884 Length: 389 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1473 # MgeName: Lc-Nu # Cross-refs: genbank:acc:YP_358764;genbank:gi:78000028;genbank:GeneID:3726155 Probab=94.91 E-value=0.0032 Score=34.26 Aligned_cols=288 Identities=10% Similarity=0.026 Sum_probs=132.3 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) +.++.. +.+.=+-.+.++.-...+..++ ..++ +.|.++.. +.+.+.|++........+.++.+. +.+. T Consensus 83 ~~~~~~--~~~~~~~~~~lr~~~~~~~~~~---~~t~--~~gg~~vP--~~~~~~i~~~~~~~~~l~~~~~~~---~~~~ 150 (389) T protein:vir:10 83 LSKKPI--DAKKKAINDFIHSHGKVIDATS---KVTS--TEAGVLIP--EEIIYDPTAEVNSVVDLSTLVTKT---PVTT 150 (389) T ss_pred cchhHH--HHHHHHHHHHhhcchhhhhhhc---cccc--CCcceeeh--HHHHHHHHHHHHhhhhHHhhccee---eccC Confidence 111111 1111011111111111111111 0111 22344444 344566777666666666655442 2222 Q ss_pred eEEEEeeecc-cccceeecCCcCccc-eeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhh Q lcl|NC_011142. 81 THWNYRSYDG-AAMGKFISANASDLP-RVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHS 158 (343) Q Consensus 81 ~~~~~~~~~~-~G~a~~~~~~~~dip-~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~ 158 (343) .+..|..... .+.+.+++.++. 
.| .-+..++.....++.++.-+.+|.+=|+.+ ..++..--....++++...+ T Consensus 151 ~~~~~~~~~~~~~~~~~~~E~~~-~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~ds---~~~l~~~i~~~la~~~~~~~ 226 (389) T protein:vir:10 151 PKGTYPILKRATDRFSSVAELAE-NPKLAEPEFNKVDWSVATYRGAIPLSEEAIADS---AVDLTALVGQSIKEKSVNTY 226 (389) T ss_pred CeeEEEEEecCCCcccccccccc-ccccccccceeeeeeheeeEeeehhhHHHHhhh---hHHHHHHHHHHHHHHHHHHH Confidence 2344444432 344455555543 44 345667888888899998888877645433 44677777778888999999 Q ss_pred hheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCC Q lcl|NC_011142. 159 QRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTG 238 (343) Q Consensus 159 n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~ 238 (343) |..++.|..... +.... ...+ ++++.++++.... ..+ ...++++|+.|..|..-. + T Consensus 227 ~~~i~~g~~~~~----------~~~~~----~~~~----~d~l~~~~~~~~~-~~~---~a~~~~n~~~~~~L~~lk--d 282 (389) T protein:vir:10 227 NAMIAPVLQSFT----------AKKTT----TDTL----VDSLKHILNVDLD-PAY---SRALVVTQSLFNTLDTLK--D 282 (389) T ss_pred HHHHhhhhcccc----------ccccc----cccc----HHHHHHHHHhhhh-hhh---CcEEEecHHHHHHHHHhh--c Confidence 999888865321 11100 1112 3344444432111 111 247999999999997533 3 Q ss_pred CCCccHHHHHHhcCc-c-------eeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc Q lcl|NC_011142. 239 YTDRTVIEHFQINNA-Y-------TLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA 310 (343) Q Consensus 239 ~~~~tvle~l~~n~~-~-------~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~ 310 (343) ..|. ||-...+ . ..+-|.|+.... . . 
..+..+.+-.++|-+=.+.+.+...+.++..- T Consensus 283 ~~G~----~i~~~~~~~~~~~~~~~~l~G~pV~~~~-------~--~-~~~~~~~~~~~~~gd~~~~~~~~~~~~~~i~~ 348 (389) T protein:vir:10 283 KNGR----YLLHDASDSITDGTAKGTILGVPVYVVG-------D--T-LLGSLAGDQKAFVGDLKRGVLFTDRQQVTLAW 348 (389) T ss_pred cCCC----eeeecCcccccccccccccccceeEEec-------c--c-ccCCCCCceEEEEeeccccEEEEeecceEEEe Confidence 3332 1111100 0 112333432110 0 0 01111223233333222323333333344332 Q ss_pred ceecCceeEeeeeeeeeeEEEECcceeeeec--cC Q lcl|NC_011142. 311 PQLLGLGITVPAEYKISGTEYRYPLCAQYVD--ML 343 (343) Q Consensus 311 ~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d--GI 343 (343) .........+....|++|. +.+|.|++.++ .. T Consensus 349 ~~~~~~~~~~~~~~r~d~~-~~~~~a~~~~~~~~~ 382 (389) T protein:vir:10 349 EDSKIYGKYLGAAFRFGVQ-KADSKAGYFVTNTDV 382 (389) T ss_pred eccccccceEEEEEEeccE-EecccceEEEEeecc Confidence 2222333345566787644 68899987665 44 No 126 >protein:vir:9643 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:173 # MgeName: 315.1 # Cross-refs: genbank:acc:NP_795405;genbank:gi:28876178;genbank:GeneID:1257724 Probab=94.81 E-value=0.0034 Score=34.09 Aligned_cols=302 Identities=9% Similarity=-0.019 Sum_probs=136.3 Q ss_pred CC-cceeccchhhhhchhhhchhcccccccCcchh------------ecchhhhhhhhHHHHHHHHHHHHhhhhhcccch Q lcl|NC_011142. 1 MS-EKRVVIDAQTIAGNRWLNKFLDSNATIGVPSV------------VNDADGGAAYYISQLASLETTVYEVPYADITYL 67 (343) Q Consensus 1 ~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~------------~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~ 67 (343) |. +--..+..+.-+..++ +..+. 
-+.+.+ ..-.++++.++.+ +.+..+|++.....-..+ T Consensus 38 ~~~~~~~~~~~~~~~e~~~---~~~~~--~~~~~lt~ee~~~~~~~~~~~~~~~gg~lvP--~~~~~~I~~~l~~~s~i~ 110 (377) T protein:vir:96 38 AFTTMGDEILAKNEEEMER---MFDLR--DKNRELTAEEIKFFNDIDKNVGGKDKFKLLP--EETMVQVFDDLVAEHPLL 110 (377) T ss_pred HHHHHHHHHHHHHHHHHHH---HHHhc--cCCcccCHHHHHHHHHHHhcCCCCCCceecC--HHHHHHHHHHHHhhhhhh Confidence 00 0000000000000000 00000 000010 0111334444444 234455665444333333 Q ss_pred hhccccCCCCcceeEEEEeeecccccceeecCCcCccc-eeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHH Q lcl|NC_011142. 68 EDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLP-RVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQ 146 (343) Q Consensus 68 ~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip-~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k 146 (343) .+..+.. .+ +. ..+...+..+.+.+++..+ .++ ..+..++....+.+.++.-..++.+=|+. +..+++.-- T Consensus 111 ~~~~v~~-~~-~~--~~i~~~~~~~~a~wv~e~~-~~~~~~~~~f~~i~l~~~kl~~~~~is~~ll~d---s~~~le~~i 182 (377) T protein:vir:96 111 KVINFKN-TS-LR--LKALTAETSGTAVWGDIFG-EIKGQLKQAFKEQDFSQFKLTAFVVIPKDALKF---GPKWLKQFI 182 (377) T ss_pred hhceeEe-cC-Cc--eEEEEecCCcceeEeeccc-ccccccCccceeEeeeeeeEEeechhhHHHhhc---chhhHHHHH Confidence 3333321 11 11 2344456667777766543 343 44667888888888888777776544433 466788888 Q ss_pred HHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcC---------------ccccCHHHHHHHHHHHHHHHHHh Q lcl|NC_011142. 147 AELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVN---------------YATCTGQELFDLLNNPVFAVVKA 211 (343) Q Consensus 147 ~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~---------------w~~~t~~~i~~di~~~~~~l~~~ 211 (343) ....+++++..+++-+++|+...+-.|+||++........+. ....+++.+.+.+..+...+... 
T Consensus 183 ~~~l~~~~~~~~~~a~i~G~G~~~P~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~ 262 (377) T protein:vir:96 183 TEQLKEAIAVALELAIVKGNGLLQPVGLLKDLSQPTVDQSTGRDITTYKTDKEAIADLSDLDPDTAVELLVPVMKHLSVN 262 (377) T ss_pred HHHHHHHHHHHHhhceEeccCCCcceeeeeccccccccccccccccceeeccccccccccCChhHHHHHHHHHHHhhccc Confidence 999999999999999999998878899999887544322111 11234556655555555544332 Q ss_pred cCCe----ecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccce Q lcl|NC_011142. 212 SKRF----HTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDR 287 (343) Q Consensus 212 s~g~----~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr 287 (343) ..+. .+.-.++|+|..+..+...+.... .++.+..+-+.|..+-... ..+.++ T Consensus 263 ~~~~~~~~~~~a~~~mn~~t~~~~~~~~~~~~----------~~G~~~~~l~~p~~v~~s~-----------~~p~~~-- 319 (377) T protein:vir:96 263 DKKHPLKIAGQVKLLLNPEDRWTLEAKFTSRN----------QFGEYVTVLPHGITILESL-----------AVETGK-- 319 (377) T ss_pred cccccccccCceEEEEchhhHHhccccccccC----------CCCCceeccCCCceEEecC-----------CCCccc-- Confidence 2111 123458899888766532221111 1111112222222110000 000111 Q ss_pred EEEEEcccceEEEeeccchhcccc-eecCc--eeEeeeeeeeeeEEEECcceeeeec--cC Q lcl|NC_011142. 288 YVVYDKSERNLALAKPIPFRMLAP-QLLGL--GITVPAEYKISGTEYRYPLCAQYVD--ML 343 (343) Q Consensus 288 ~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~--~~~~~~~~~~gGv~i~~P~ai~~~d--GI 343 (343) ++..+.+. ..+..-..++...- +.... 
...+...+|.+| .++.|.|++.++ |= T Consensus 320 i~fgdf~~--Y~i~~r~~~~i~~~~~~~~~~d~~~f~~~~r~dG-~~~d~~a~~vl~l~~~ 377 (377) T protein:vir:96 320 AIAFVANR--YDAFMATASTIEEYDQTFAMEDLQLYLTKNYFYG-KAKDNHTAALLTLAGG 377 (377) T ss_pred EEEEEcCc--EEEEEecccEEEeehhhhhhcCCeEEEEEEEEcC-EEecCCcEEEEEEecC Confidence 12112111 22222222222111 11111 233455666644 556777766554 11 No 127 >protein:vir:9509 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:170 # MgeName: phiN315 # Cross-refs: genbank:acc:NP_835556;genbank:gi:30043951;genbank:GeneID:1260537 Probab=94.76 E-value=0.0036 Score=34.00 Aligned_cols=302 Identities=7% Similarity=-0.051 Sum_probs=141.1 Q ss_pred CCcceecc----------chhhhhchhhhchhcccccccCcc-----------hhecchhhhhhhhHHHHHHHHHHHHhh Q lcl|NC_011142. 1 MSEKRVVI----------DAQTIAGNRWLNKFLDSNATIGVP-----------SVVNDADGGAAYYISQLASLETTVYEV 59 (343) Q Consensus 1 ~~~~~~~~----------~~~~~~~~~~~~~~~~~~~~~~~~-----------~~~~dA~~~~~f~~~~l~~id~~v~e~ 59 (343) -.|..... +.+.-+.+ .++.+... ..+.+ ++.....++|.++.. +.+..+|++. T Consensus 25 ~~~~~~~~~~~~~~~~~~~~~~~~~~-e~~~~~~~--~~~~~~lt~~e~~~~~~~~~~~~~~gg~lvP--~~~~~~I~~~ 99 (381) T protein:vir:95 25 PQERQNELYGDMINQLFEETKLQAKA-EAERVSSL--PKSAQSLSANQRSFFMDINKNVNYKEEKLLP--EETIDRIFED 99 (381) T ss_pred hhHHHHHHHHHHHHhhhhhHHHHHHH-HHHHHHHh--ccCcccccHHHHHHHHHHhcccCCCCceecC--HHHHHHHHHH Confidence 00000000 00000000 00000000 00111 111122234555555 4556677776 Q ss_pred hhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccc-eeeeccceeEEEEEEEEEEEeecHHHHHHHHHh Q lcl|NC_011142. 60 PYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLP-RVAQSAKLHQVELGYAGVECHYSLDELRTTAAV 138 (343) Q Consensus 60 ~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip-~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~ 138 (343) ....-..+++..+.. .+ +. ..+...+..+.+.|++..+ .++ ..+..+++.....+.++.-..++.+=|+. . 
T Consensus 100 l~~~s~i~~~~~v~~-~~-~~--~~i~~~~~~~~a~w~~e~~-~~~~~~~~~f~~i~l~~~kl~~~~~is~elL~D---s 171 (381) T protein:vir:95 100 LTTNHPLLADLGIKN-AG-LR--LKFLKSETSGVAVWGKIYG-EIKGQLDAAFSEETAIQNKLTAFVVLPKDLNDF---G 171 (381) T ss_pred HHhhccceeheeeEe-cC-cc--eEEEEecCCcceeeecccc-cccccccccceeeeecceeEEeechhhHHHhhc---C Confidence 665555555555432 22 11 2344556677777766543 343 33556778888888888777766544433 3 Q ss_pred CCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccC---------cCccccCHHHHHHHHHHHHHHHH Q lcl|NC_011142. 139 NMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSAT---------VNYATCTGQELFDLLNNPVFAVV 209 (343) Q Consensus 139 g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~---------~~w~~~t~~~i~~di~~~~~~l~ 209 (343) ..+++.--....+++++..+++-+++|+...+-.|||++.+......+ ..+...++...++.+..++..+. T Consensus 172 ~~~ie~~i~~~la~~~a~~~~~a~i~G~G~~qP~Gil~~~~~~~~~~~g~~~~~~~~~t~t~~~~~~~~~~l~~~~~~~~ 251 (381) T protein:vir:95 172 PAWIERFVRVQIEEAFAVALETAFLKGTGKDQPIGLNRQVQKGVSVTEGAYPEKEEQGTLTFANPRATVNELTQVFKYHS 251 (381) T ss_pred HHHHHHHHHHHHHHHHHHHhhheeEeccCCCCceeeeeccCcccccccccccccccccccccccchhhHHHHHHHHHhhc Confidence 456888888889999999999999999988788999998753221111 11222233344455555555553 Q ss_pred HhcCCee----cccEEEecHHHHHHHhccccC-CCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCc Q lcl|NC_011142. 210 KASKRFH----TPNTVLMFPDLWKRASSLLMT-GYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGN 284 (343) Q Consensus 210 ~~s~g~~----~p~tL~l~p~~~~~L~~~~~~-~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g 284 (343) ...++.. .-..++|.|..+..+...... +..| .+...-+.|..+-.. 
..+ ..+ T Consensus 252 ~~~~~~~~~~~~~a~~~mn~~t~~~l~~~~~~~~~~G-----------~~v~~l~~g~~vv~s--~~~---------p~~ 309 (381) T protein:vir:95 252 TNEKGKSVAVKGNVTMVVNPSDAFEVQAQYTHLNANG-----------VYVTALPFNLNVIES--TVQ---------EAG 309 (381) T ss_pred cccccccccccCceEEEEccccHHhhccccccCCCCC-----------ceeecCCCCceEEec--CCC---------CcC Confidence 3222211 123578899887776432211 1111 111111111111000 000 111 Q ss_pred cceEEEEEcccceEEEeeccchhcccc-eecCc--eeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 285 KDRYVVYDKSERNLALAKPIPFRMLAP-QLLGL--GITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 285 ~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~--~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) + ++..+.+. ..+..-..++.-.- +.... ...+....|.+ ..++.|.|+++++ | T Consensus 310 -~-iifgDfs~--Y~i~~r~~~~i~~~~~~~~~~d~~~f~a~~r~d-g~~~~~~A~~v~~-l 365 (381) T protein:vir:95 310 -K-VLTYVKGL--YDGYLAGGINVQKFKETLALDDMDLYTAKQFAY-GKAKDNKVAAVWK-L 365 (381) T ss_pred -c-EEEEeccc--EEEEEecccEEEeechhHhhcCCeEEEEEEEEc-CEEecCceEEEEE-E Confidence 1 22222222 22222222222111 11111 23455677764 5678899988877 6 No 128 >protein:vir:101291 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:1591 # MgeName: phiNM3 # Cross-refs: genbank:acc:YP_908831;genbank:gi:118725095;genbank:GeneID:4555862 Probab=94.76 E-value=0.0036 Score=34.00 Aligned_cols=302 Identities=7% Similarity=-0.051 Sum_probs=141.1 Q ss_pred CCcceecc----------chhhhhchhhhchhcccccccCcc-----------hhecchhhhhhhhHHHHHHHHHHHHhh Q lcl|NC_011142. 1 MSEKRVVI----------DAQTIAGNRWLNKFLDSNATIGVP-----------SVVNDADGGAAYYISQLASLETTVYEV 59 (343) Q Consensus 1 ~~~~~~~~----------~~~~~~~~~~~~~~~~~~~~~~~~-----------~~~~dA~~~~~f~~~~l~~id~~v~e~ 59 (343) -.|..... +.+.-+.+ .++.+... ..+.+ ++.....++|.++.. +.+..+|++. 
T Consensus 25 ~~~~~~~~~~~~~~~~~~~~~~~~~~-e~~~~~~~--~~~~~~lt~~e~~~~~~~~~~~~~~gg~lvP--~~~~~~I~~~ 99 (381) T protein:vir:10 25 PQERQNELYGDMINQLFEETKLQAKA-EAERVSSL--PKSAQSLSANQRSFFMDINKNVNYKEEKLLP--EETIDRIFED 99 (381) T ss_pred hhHHHHHHHHHHHHhhhhhHHHHHHH-HHHHHHHh--ccCcccccHHHHHHHHHHhcccCCCCceecC--HHHHHHHHHH Confidence 00000000 00000000 00000000 00111 111122234555555 4556677776 Q ss_pred hhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccc-eeeeccceeEEEEEEEEEEEeecHHHHHHHHHh Q lcl|NC_011142. 60 PYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLP-RVAQSAKLHQVELGYAGVECHYSLDELRTTAAV 138 (343) Q Consensus 60 ~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip-~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~ 138 (343) ....-..+++..+.. .+ +. ..+...+..+.+.|++..+ .++ ..+..+++.....+.++.-..++.+=|+. . T Consensus 100 l~~~s~i~~~~~v~~-~~-~~--~~i~~~~~~~~a~w~~e~~-~~~~~~~~~f~~i~l~~~kl~~~~~is~elL~D---s 171 (381) T protein:vir:10 100 LTTNHPLLADLGIKN-AG-LR--LKFLKSETSGVAVWGKIYG-EIKGQLDAAFSEETAIQNKLTAFVVLPKDLNDF---G 171 (381) T ss_pred HHhhccceeheeeEe-cC-cc--eEEEEecCCcceeeecccc-cccccccccceeeeecceeEEeechhhHHHhhc---C Confidence 665555555555432 22 11 2344556677777766543 343 33556778888888888777766544433 3 Q ss_pred CCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccC---------cCccccCHHHHHHHHHHHHHHHH Q lcl|NC_011142. 139 NMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSAT---------VNYATCTGQELFDLLNNPVFAVV 209 (343) Q Consensus 139 g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~---------~~w~~~t~~~i~~di~~~~~~l~ 209 (343) ..+++.--....+++++..+++-+++|+...+-.|||++.+......+ ..+...++...++.+..++..+. 
T Consensus 172 ~~~ie~~i~~~la~~~a~~~~~a~i~G~G~~qP~Gil~~~~~~~~~~~g~~~~~~~~~t~t~~~~~~~~~~l~~~~~~~~ 251 (381) T protein:vir:10 172 PAWIERFVRVQIEEAFAVALETAFLKGTGKDQPIGLNRQVQKGVSVTEGAYPEKEEQGTLTFANPRATVNELTQVFKYHS 251 (381) T ss_pred HHHHHHHHHHHHHHHHHHHhhheeEeccCCCCceeeeeccCcccccccccccccccccccccccchhhHHHHHHHHHhhc Confidence 456888888889999999999999999988788999998753221111 11222233344455555555553 Q ss_pred HhcCCee----cccEEEecHHHHHHHhccccC-CCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCc Q lcl|NC_011142. 210 KASKRFH----TPNTVLMFPDLWKRASSLLMT-GYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGN 284 (343) Q Consensus 210 ~~s~g~~----~p~tL~l~p~~~~~L~~~~~~-~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g 284 (343) ...++.. .-..++|.|..+..+...... +..| .+...-+.|..+-.. ..+ ..+ T Consensus 252 ~~~~~~~~~~~~~a~~~mn~~t~~~l~~~~~~~~~~G-----------~~v~~l~~g~~vv~s--~~~---------p~~ 309 (381) T protein:vir:10 252 TNEKGKSVAVKGNVTMVVNPSDAFEVQAQYTHLNANG-----------VYVTALPFNLNVIES--TVQ---------EAG 309 (381) T ss_pred cccccccccccCceEEEEccccHHhhccccccCCCCC-----------ceeecCCCCceEEec--CCC---------CcC Confidence 3222211 123578899887776432211 1111 111111111111000 000 111 Q ss_pred cceEEEEEcccceEEEeeccchhcccc-eecCc--eeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 285 KDRYVVYDKSERNLALAKPIPFRMLAP-QLLGL--GITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 285 ~dr~v~y~~~~~~~~~~v~~~~~~~~~-~~~~~--~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) + ++..+.+. ..+..-..++.-.- +.... 
...+....|.+ ..++.|.|+++++ | T Consensus 310 -~-iifgDfs~--Y~i~~r~~~~i~~~~~~~~~~d~~~f~a~~r~d-g~~~~~~A~~v~~-l 365 (381) T protein:vir:10 310 -K-VLTYVKGL--YDGYLAGGINVQKFKETLALDDMDLYTAKQFAY-GKAKDNKVAAVWK-L 365 (381) T ss_pred -c-EEEEeccc--EEEEEecccEEEeechhHhhcCCeEEEEEEEEc-CEEecCceEEEEE-E Confidence 1 22222222 22222222222111 11111 23455677764 5678899988877 6 No 129 >protein:vir:94424 Length: 387 # NCBI annotation: ORF010 # Family: family:all:658 # MgeID: mge:1506 # MgeName: 47 # Cross-refs: genbank:acc:YP_240005;genbank:gi:66395666;genbank:GeneID:5133084 Probab=94.20 E-value=0.0037 Score=33.90 Aligned_cols=291 Identities=10% Similarity=0.032 Sum_probs=134.5 Q ss_pred CCcceeccchh-----h-hhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccC Q lcl|NC_011142. 1 MSEKRVVIDAQ-----T-IAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLA 74 (343) Q Consensus 1 ~~~~~~~~~~~-----~-~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~ 74 (343) ..++.....+. . ..++...+....+..- ..++..-..++|.++.. +.+.++|++.....-..+.+..+.+ T Consensus 81 ~~~~~~~~~~~~~~~r~~~~~~~~~~~~~~~~~~--~~a~~~~~~~~gG~lIP--~~~~~~Ii~~~~~~~~l~~~~~~~~ 156 (387) T protein:vir:94 81 LSDNEKMVKAKAEFYRHAILPNEFEKPSMEAQRL--LHALPTGNDSGGDKLLP--KTLSKEIVSEPFAKNQLREKARLTN 156 (387) T ss_pred CchhHHHHHHHHHHHHHHHhhhhHHHHHHHHHHH--HhhhccCCCCCCceeec--hhHHHHHHHHHHhhchhhhhceeee Confidence 11111100000 0 0000000000000000 00111111233445555 3456778877666655566655532 Q ss_pred CCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 75 NIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 75 ~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) -.+ .++. .+....+.+.+++.. ...|..+..++......+.++.-+.+|.+=|+. 
...++..--....++++ T Consensus 157 ~~~---~~~p-~~~~~~~~a~~v~Eg-~~~~~~~~~f~~v~l~~~k~~~~i~iS~ell~d---s~~~l~~~i~~~la~~~ 228 (387) T protein:vir:94 157 IKG---LEIP-RVSYTLDDDDFITDV-ETAKELKAKGDTVKFTTNKFKVFAAISDTVIHG---SDVDLVNWVENALQSGL 228 (387) T ss_pred cCC---ceee-eeeccCCcccccccc-ccccccccccceeeechheeeeechhhHHHHhh---hHHHHHHHHHHHHHHHH Confidence 221 1111 122233556666654 346777778888888888888888887553433 34556666667777777 Q ss_pred HHhhhheeee-eehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhc Q lcl|NC_011142. 155 EEHSQRVAYF-GDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASS 233 (343) Q Consensus 155 ~~~~n~~~f~-G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~ 233 (343) ...+++.+|. |+....-.|.++.++++..+.. ..+++|.+++..+... +. ..-..+|.+..|..+.+ T Consensus 229 ~~~e~~~~~~~g~g~g~~~g~~~~~~~~~~~~~---------~~~d~i~~~~~~l~~~--y~-~na~~imn~~t~~~~~~ 296 (387) T protein:vir:94 229 AAKERKDALAVSPKSGLEHMSFYNGSVKEVEGA---------DMYDAIINALADLHED--YR-DNATIYMRYADYVKIIS 296 (387) T ss_pred HHHHHHhHhhcCCCccccceeeecccccccccc---------chHHHHHHHHhccChh--hh-cCCEEEEechHHHHHHH Confidence 7777776664 4433345778887776554332 2356777777766432 22 12357788877766654 Q ss_pred cccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc-ce Q lcl|NC_011142. 234 LLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA-PQ 312 (343) Q Consensus 234 ~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~-~~ 312 (343) .. .++ +..++ ...+. .+-|.|+.+... . .+ +++-+.+.=++.. ..+.... 
-+ T Consensus 297 ~~-~~~-~~~~~----~~~~~-~llG~PV~~~~~------------~----~~-~~~GDf~~~~~~~---~~~~~~~~~~ 349 (387) T protein:vir:94 297 VL-SNG-TTNFF----DTPAE-KVFGKPVVFTDA------------A----VK-PIVGDFNYFGINY---DGTTYDTDKD 349 (387) T ss_pred HH-hcC-CCccc----ccCCc-cccccceEEecC------------C----Cc-eeeechhhhhhhh---hhhhheeccc Confidence 33 222 22221 12222 233555433210 0 01 1110111000000 0111111 11 Q ss_pred ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 313 LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 313 ~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ...-.+.+.+..|++ ..+++|.|++++.-= T Consensus 350 ~~~~~~~~~~~~r~D-g~v~~~~A~~~l~~k 379 (387) T protein:vir:94 350 VKKGEYLFVLTAWYD-QQRTLDSAFRIAKAK 379 (387) T ss_pred ccCCceEEEEEEEeC-cEeechhheEEEEee Confidence 112234566677875 555689999986542 No 130 >protein:vir:96978 Length: 387 # NCBI annotation: ORF009 # Family: family:all:658 # MgeID: mge:1643 # MgeName: 42e # Cross-refs: genbank:acc:YP_239859;genbank:gi:66395517;genbank:GeneID:5133011 Probab=94.20 E-value=0.0037 Score=33.90 Aligned_cols=291 Identities=10% Similarity=0.032 Sum_probs=134.5 Q ss_pred CCcceeccchh-----h-hhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccC Q lcl|NC_011142. 1 MSEKRVVIDAQ-----T-IAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLA 74 (343) Q Consensus 1 ~~~~~~~~~~~-----~-~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~ 74 (343) ..++.....+. . ..++...+....+..- ..++..-..++|.++.. +.+.++|++.....-..+.+..+.+ T Consensus 81 ~~~~~~~~~~~~~~~r~~~~~~~~~~~~~~~~~~--~~a~~~~~~~~gG~lIP--~~~~~~Ii~~~~~~~~l~~~~~~~~ 156 (387) T protein:vir:96 81 LSDNEKMVKAKAEFYRHAILPNEFEKPSMEAQRL--LHALPTGNDSGGDKLLP--KTLSKEIVSEPFAKNQLREKARLTN 156 (387) T ss_pred CchhHHHHHHHHHHHHHHHhhhhHHHHHHHHHHH--HhhhccCCCCCCceeec--hhHHHHHHHHHHhhchhhhhceeee Confidence 11111100000 0 0000000000000000 00111111233445555 3456778877666655566655532 Q ss_pred CCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 
75 NIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 75 ~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) -.+ .++. .+....+.+.+++.. ...|..+..++......+.++.-+.+|.+=|+. ...++..--....++++ T Consensus 157 ~~~---~~~p-~~~~~~~~a~~v~Eg-~~~~~~~~~f~~v~l~~~k~~~~i~iS~ell~d---s~~~l~~~i~~~la~~~ 228 (387) T protein:vir:96 157 IKG---LEIP-RVSYTLDDDDFITDV-ETAKELKAKGDTVKFTTNKFKVFAAISDTVIHG---SDVDLVNWVENALQSGL 228 (387) T ss_pred cCC---ceee-eeeccCCcccccccc-ccccccccccceeeechheeeeechhhHHHHhh---hHHHHHHHHHHHHHHHH Confidence 221 1111 122233556666654 346777778888888888888888887553433 34556666667777777 Q ss_pred HHhhhheeee-eehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhc Q lcl|NC_011142. 155 EEHSQRVAYF-GDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASS 233 (343) Q Consensus 155 ~~~~n~~~f~-G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~ 233 (343) ...+++.+|. |+....-.|.++.++++..+.. ..+++|.+++..+... +. ..-..+|.+..|..+.+ T Consensus 229 ~~~e~~~~~~~g~g~g~~~g~~~~~~~~~~~~~---------~~~d~i~~~~~~l~~~--y~-~na~~imn~~t~~~~~~ 296 (387) T protein:vir:96 229 AAKERKDALAVSPKSGLEHMSFYNGSVKEVEGA---------DMYDAIINALADLHED--YR-DNATIYMRYADYVKIIS 296 (387) T ss_pred HHHHHHhHhhcCCCccccceeeecccccccccc---------chHHHHHHHHhccChh--hh-cCCEEEEechHHHHHHH Confidence 7777776664 4433345778887776554332 2356777777766432 22 12357788877766654 Q ss_pred cccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc-ce Q lcl|NC_011142. 234 LLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA-PQ 312 (343) Q Consensus 234 ~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~-~~ 312 (343) .. .++ +..++ ...+. .+-|.|+.+... . .+ +++-+.+.=++.. ..+.... 
-+ T Consensus 297 ~~-~~~-~~~~~----~~~~~-~llG~PV~~~~~------------~----~~-~~~GDf~~~~~~~---~~~~~~~~~~ 349 (387) T protein:vir:96 297 VL-SNG-TTNFF----DTPAE-KVFGKPVVFTDA------------A----VK-PIVGDFNYFGINY---DGTTYDTDKD 349 (387) T ss_pred HH-hcC-CCccc----ccCCc-cccccceEEecC------------C----Cc-eeeechhhhhhhh---hhhhheeccc Confidence 33 222 22221 12222 233555433210 0 01 1110111000000 0111111 11 Q ss_pred ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 313 LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 313 ~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ...-.+.+.+..|++ ..+++|.|++++.-= T Consensus 350 ~~~~~~~~~~~~r~D-g~v~~~~A~~~l~~k 379 (387) T protein:vir:96 350 VKKGEYLFVLTAWYD-QQRTLDSAFRIAKAK 379 (387) T ss_pred ccCCceEEEEEEEeC-cEeechhheEEEEee Confidence 112234566677875 555689999986542 No 131 >protein:vir:2685 Length: 387 # NCBI annotation: hypothetical protein # Family: family:all:658 # MgeID: mge:57 # MgeName: phiSLT # Cross-refs: genbank:acc:NP_075504;genbank:gi:12719433;genbank:GeneID:920169 Probab=94.20 E-value=0.0037 Score=33.90 Aligned_cols=291 Identities=10% Similarity=0.032 Sum_probs=134.5 Q ss_pred CCcceeccchh-----h-hhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccC Q lcl|NC_011142. 1 MSEKRVVIDAQ-----T-IAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLA 74 (343) Q Consensus 1 ~~~~~~~~~~~-----~-~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~ 74 (343) ..++.....+. . ..++...+....+..- ..++..-..++|.++.. +.+.++|++.....-..+.+..+.+ T Consensus 81 ~~~~~~~~~~~~~~~r~~~~~~~~~~~~~~~~~~--~~a~~~~~~~~gG~lIP--~~~~~~Ii~~~~~~~~l~~~~~~~~ 156 (387) T protein:vir:26 81 LSDNEKMVKAKAEFYRHAILPNEFEKPSMEAQRL--LHALPTGNDSGGDKLLP--KTLSKEIVSEPFAKNQLREKARLTN 156 (387) T ss_pred CchhHHHHHHHHHHHHHHHhhhhHHHHHHHHHHH--HhhhccCCCCCCceeec--hhHHHHHHHHHHhhchhhhhceeee Confidence 11111100000 0 0000000000000000 00111111233445555 3456778877666655566655532 Q ss_pred CCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 
75 NIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 75 ~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) -.+ .++. .+....+.+.+++.. ...|..+..++......+.++.-+.+|.+=|+. ...++..--....++++ T Consensus 157 ~~~---~~~p-~~~~~~~~a~~v~Eg-~~~~~~~~~f~~v~l~~~k~~~~i~iS~ell~d---s~~~l~~~i~~~la~~~ 228 (387) T protein:vir:26 157 IKG---LEIP-RVSYTLDDDDFITDV-ETAKELKAKGDTVKFTTNKFKVFAAISDTVIHG---SDVDLVNWVENALQSGL 228 (387) T ss_pred cCC---ceee-eeeccCCcccccccc-ccccccccccceeeechheeeeechhhHHHHhh---hHHHHHHHHHHHHHHHH Confidence 221 1111 122233556666654 346777778888888888888888887553433 34556666667777777 Q ss_pred HHhhhheeee-eehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhc Q lcl|NC_011142. 155 EEHSQRVAYF-GDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASS 233 (343) Q Consensus 155 ~~~~n~~~f~-G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~ 233 (343) ...+++.+|. |+....-.|.++.++++..+.. ..+++|.+++..+... +. ..-..+|.+..|..+.+ T Consensus 229 ~~~e~~~~~~~g~g~g~~~g~~~~~~~~~~~~~---------~~~d~i~~~~~~l~~~--y~-~na~~imn~~t~~~~~~ 296 (387) T protein:vir:26 229 AAKERKDALAVSPKSGLEHMSFYNGSVKEVEGA---------DMYDAIINALADLHED--YR-DNATIYMRYADYVKIIS 296 (387) T ss_pred HHHHHHhHhhcCCCccccceeeecccccccccc---------chHHHHHHHHhccChh--hh-cCCEEEEechHHHHHHH Confidence 7777776664 4433345778887776554332 2356777777766432 22 12357788877766654 Q ss_pred cccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccc-ce Q lcl|NC_011142. 234 LLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLA-PQ 312 (343) Q Consensus 234 ~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~-~~ 312 (343) .. .++ +..++ ...+. .+-|.|+.+... . .+ +++-+.+.=++.. ..+.... 
-+ T Consensus 297 ~~-~~~-~~~~~----~~~~~-~llG~PV~~~~~------------~----~~-~~~GDf~~~~~~~---~~~~~~~~~~ 349 (387) T protein:vir:26 297 VL-SNG-TTNFF----DTPAE-KVFGKPVVFTDA------------A----VK-PIVGDFNYFGINY---DGTTYDTDKD 349 (387) T ss_pred HH-hcC-CCccc----ccCCc-cccccceEEecC------------C----Cc-eeeechhhhhhhh---hhhhheeccc Confidence 33 222 22221 12222 233555433210 0 01 1110111000000 0111111 11 Q ss_pred ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 313 LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 313 ~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ...-.+.+.+..|++ ..+++|.|++++.-= T Consensus 350 ~~~~~~~~~~~~r~D-g~v~~~~A~~~l~~k 379 (387) T protein:vir:26 350 VKKGEYLFVLTAWYD-QQRTLDSAFRIAKAK 379 (387) T ss_pred ccCCceEEEEEEEeC-cEeechhheEEEEee Confidence 112234566677875 555689999986542 No 132 >protein:vir:100632 Length: 381 # NCBI annotation: 77ORF006 # Family: family:all:635 # MgeID: mge:1476 # MgeName: 77 # Cross-refs: genbank:acc:NP_958606;genbank:gi:41189521;genbank:GeneID:2743778 Probab=93.87 E-value=0.0061 Score=32.72 Aligned_cols=302 Identities=8% Similarity=-0.089 Sum_probs=133.2 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcc-----------hhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhh Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVP-----------SVVNDADGGAAYYISQLASLETTVYEVPYADITYLED 69 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----------~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~ 69 (343) +.+... =|.+..+..+.-+..+ . ..+.. ++..+..++|.++.. +.+..+|++.....-..|.+ T Consensus 36 ~~~~~~-~~~~~~~~~e~~~~~~-~--~~~~~~l~~~e~~~~~~~~~~t~~~Gg~lvP--~~~~~~I~~~l~~~spir~~ 109 (381) T protein:vir:10 36 MINQLF-EETKLQAKAEAERVSS-L--PKSAQTLSANQRNFFMDINKSVGYKEEKLLP--EETIDRIFEDLTTNHPLLAD 109 (381) T ss_pred HHHhhh-hhHHHHHHHHHHHHHH-h--cccccccCHHHHHHHHHHhhcCCCCCceecC--HHHHHHHHHHHHhhcceeee Confidence 000000 0111111111111000 0 00111 122222344555555 45566777755544344444 Q ss_pred ccccCCCCcceeEEEEeeecccccceeecCCcCccc-eeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHH Q lcl|NC_011142. 
70 VPVLANIPEYATHWNYRSYDGAAMGKFISANASDLP-RVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAE 148 (343) Q Consensus 70 i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip-~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~ 148 (343) ..+.. .+ +. ..+...+..+.+.|....+ .++ ..+..+++...+.+.++.-..++.+=|+.+ ..+|+.--.. T Consensus 110 a~v~~-~~-~~--~~i~~~~~~~~a~W~~e~~-~~~~~~~~~f~~i~l~~~kl~a~i~is~elL~Ds---~~~le~~i~~ 181 (381) T protein:vir:10 110 LGIKN-AG-LR--LKFLKSETSGVAVWGKIYG-EIKGQLDAAFSEETAIQNKLTAFVVLPKDLNDFG---PAWIERFVRV 181 (381) T ss_pred eeeEe-cC-cc--eEEEeecCCcceEEeeccc-ccccccCccceeEeecceeEEeeccccHHHHhcc---HHHHHHHHHH Confidence 44322 22 11 2344556667777755432 333 345567888888888887777765444433 5568888888 Q ss_pred HHHHHHHHhhhheeeeeehhhcceeeeecCCccc--ccc-CcCc------cccCHHHHHHHHHHHHHHHHHhcCCee--- Q lcl|NC_011142. 149 LAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTK--TSA-TVNY------ATCTGQELFDLLNNPVFAVVKASKRFH--- 216 (343) Q Consensus 149 aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~--~~~-~~~w------~~~t~~~i~~di~~~~~~l~~~s~g~~--- 216 (343) ..++++++.+++-+++|+...+-.|||++.+-.. ... .+++ ...++...++.+..++..+.....+.. T Consensus 182 ~la~~~a~~~~~afi~GdG~~qP~Gil~~~~~~~~~~~g~~~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~ 261 (381) T protein:vir:10 182 QIEEAFAVALETAFLKGTGKDQPIGLNRQVQKGVSVTDGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAV 261 (381) T ss_pred HHHHHHHHHhhceeEecccCCCceeeeecCCccccccccccccccccccccccchhhHHHHHHHHHHhhhhhhccccccc Confidence 8999999999999999998878899998754221 111 1111 111222233333333333322111111 Q ss_pred -cccEEEecHHHHHHHhcccc-CCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcc Q lcl|NC_011142. 217 -TPNTVLMFPDLWKRASSLLM-TGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKS 294 (343) Q Consensus 217 -~p~tL~l~p~~~~~L~~~~~-~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~ 294 (343) .-.+++|+|..+..+..... .+..|. |+ .....+.++... . 
..+.++ ++..+.+ T Consensus 262 ~~~~~~vmn~~t~~~l~~~~~~~~~~G~----~v-----~~lp~g~~vv~~----~---------~~p~~~--i~fGDfs 317 (381) T protein:vir:10 262 KGNVTMVVNPSDAFEVQAQYTHLNANGV----YV-----TALPFNLNVIES----T---------VQEAGK--VLTYVKG 317 (381) T ss_pred cCceEEEEchhhHHhhccccccCCCCCc----ee-----ecCCCCceeEEc----C---------CCCcCc--EEEEEcc Confidence 12357889988777643221 111111 00 000001111000 0 001111 2222222 Q ss_pred cceEEEeeccchhcccc-eecCc--eeEeeeeeeeeeEEEECcceeeeec----cC Q lcl|NC_011142. 295 ERNLALAKPIPFRMLAP-QLLGL--GITVPAEYKISGTEYRYPLCAQYVD----ML 343 (343) Q Consensus 295 ~~~~~~~v~~~~~~~~~-~~~~~--~~~~~~~~~~gGv~i~~P~ai~~~d----GI 343 (343) . ..+..-+.++.-.- +.... ...+....|.+ -.++.|.|+++++ |- T Consensus 318 ~--Y~i~~r~~~~i~~~~~~~~~~d~~~f~a~~r~d-G~~~~~~A~~v~~l~~~~~ 370 (381) T protein:vir:10 318 L--YDGYLAGGINVQKFKETLALDDMDLYTAKQFAY-GKAKDNKVAAVWKLDLKGH 370 (381) T ss_pred c--EEEEEecccEEEeechhhhhcCceEEEEEEEEc-CEEecCCcEEEEEEeecCC Confidence 2 22222222221110 11111 23344566654 4567788877643 22 No 133 >protein:vir:95107 Length: 270 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1549 # MgeName: X2 # Cross-refs: genbank:acc:YP_240822;genbank:gi:66394683;genbank:GeneID:5133901 Probab=93.47 E-value=0.0075 Score=32.25 Aligned_cols=260 Identities=7% Similarity=-0.026 Sum_probs=123.2 Q ss_pred hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCc-ceeEEEEeeecccccceeecC Q lcl|NC_011142. 21 KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPE-YATHWNYRSYDGAAMGKFISA 99 (343) Q Consensus 21 ~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~-~~~~~~~~~~~~~G~a~~~~~ 99 (343) |+.-....+=+|..+. .++ .|.....+....+..+...+.. 
+-.+++++.++..|.+..+.+ T Consensus 1 Ma~T~~~d~I~Pev~~-------------~~V----~e~~~~~~~~~~~~~~d~~L~g~~G~ti~~P~~~~igdae~~~e 63 (270) T protein:vir:95 1 MTQTKKANLINPEVLA-------------NVV----SAQMQNAIRFTPYAVTDDTLVGQPGDTITRPKYAYIGAAEDLQE 63 (270) T ss_pred CCceehhhhcchHHHH-------------HHH----HHHHHhHHhhccccccccccCCCCCCEEEeeeecCCCccccccC Confidence 1111111111222211 122 2221112222333333322221 234567777888899998887 Q ss_pred CcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCC Q lcl|NC_011142. 100 NASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPN 179 (343) Q Consensus 100 ~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~ 179 (343) + ++++......+.....+...+++|.+ -|++.....+-|+ ..-....+..++++.|+.++ +.+ .| T Consensus 64 g-~~i~~~~lt~~~~~a~i~~~gk~~~i--tD~a~~~~~~dp~-~~~~~q~a~~~a~~~d~~li---~~l--~~------ 128 (270) T protein:vir:95 64 G-VAMDTTQMSMTTTKVTVKETGKAVEV--TQTAIITNVNGTL-QEASRQLAMSLADKVEIDYI---AEL--NK------ 128 (270) T ss_pred C-CccchhhcccchheeeeehhhCccee--cHHHHhhhccchH-HHHHHHHHHHHHHHHHHHHH---HHh--cc------ Confidence 6 46888888888888999888777665 5555544445554 44555677777777666543 111 11 Q ss_pred ccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeecc Q lcl|NC_011142. 180 VTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTR 259 (343) Q Consensus 180 v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~ 259 (343) .... ++. +.+ +++|++++..+-. ....++.++++|..+..|.+-..-. 
.....+-+..|+.+-.+.| T Consensus 129 a~~~-~~~---~~t----~~~~~dA~~~lgd---~~~~~~~i~vhs~~~~~Lrk~~~~~--~~~~~~~~~~~G~ig~~~G 195 (270) T protein:vir:95 129 SKQT-ATV---SAD----ATGILDAIEVFNS---ENDEDYVLYVNPKDYNKLVKSLFKV--GGNVQDRAISKGDLVEIVG 195 (270) T ss_pred cccc-ccc---ccC----HHHHHHHHHHhcc---ccCCCcEEEEcHHHHHHHHhhhccc--ccccccchhcccccceecc Confidence 0000 000 112 4566777766532 2345789999999999985322111 1111111223333333333 Q ss_pred ccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceec--CceeEeeeeeeeeeEEEECccee Q lcl|NC_011142. 260 NPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLL--GLGITVPAEYKISGTEYRYPLCA 337 (343) Q Consensus 260 ~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~--~~~~~~~~~~~~gGv~i~~P~ai 337 (343) .++.+. + ..... .-.|...+.-+.+....+++.- .++. .....+-. -+..|+.+++|..+ T Consensus 196 ~~Viv~--------s-~~~~~-------~~~~l~~~gAi~~~~~~~~~vE-tdRd~~~~~d~i~~-~~~y~v~~~~~skv 257 (270) T protein:vir:95 196 VSDIVK--------S-KRVSE-------NTAFLQRYGAMEIVNKKKPEAY-TDFDILKRTHLLST-NYHYSVNLKDETGV 257 (270) T ss_pred eeEEEe--------C-CCCCc-------eeEEEEeccceeeeecCCceee-eccchhhcccEEEe-eeEEEEEEEccceE Confidence 332111 0 00010 1223333444454444443321 1111 11222222 23468899999988 Q ss_pred eeeccC Q lcl|NC_011142. 338 QYVDML 343 (343) Q Consensus 338 ~~~dGI 343 (343) +.++== T Consensus 258 v~~t~~ 263 (270) T protein:vir:95 258 VKVTFK 263 (270) T ss_pred EEEEec Confidence 765311 No 134 >protein:vir:78739 Length: 332 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:1856 # MgeName: Syn5 # Cross-refs: genbank:acc:YP_001285448;genbank:gi:148724482;genbank:GeneID:5220210 Probab=91.55 E-value=0.015 Score=30.53 Aligned_cols=291 Identities=7% Similarity=-0.012 Sum_probs=125.0 Q ss_pred cccccccCcchhe------cchhhh-hhhhHHHHH-HHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccc Q lcl|NC_011142. 
23 LDSNATIGVPSVV------NDADGG-AAYYISQLA-SLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMG 94 (343) Q Consensus 23 ~~~~~~~~~~~~~------~dA~~~-~~f~~~~l~-~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a 94 (343) +....+|+.|.-. -++|.. ..|+ ++|. .++.++-+. -..+.++.+.+ .- +-.++.+.. .|.. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~d~~~al~l-e~~~geV~~~f~~~----s~~~~~~~~r~-i~-~G~tv~i~~---ig~~ 70 (332) T protein:vir:78 1 MTTLSNFSLPNQANGGARNADYDVRYATAL-KLFSGEVFTAFNNA----SIFKGLVRSYD-LR-GGKSKQFMF---TGKL 70 (332) T ss_pred CcccccccCCccccCCccccccccchhhhh-hhhhhhHHHHHHHH----hhhhhcccccc-cc-ccceEEEEe---ccce Confidence 1112222222211 122222 3444 6654 665555332 12344444432 22 233344332 3444 Q ss_pred ee--ecCCcCcc-ceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeee----e-e Q lcl|NC_011142. 95 KF--ISANASDL-PRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYF----G-D 166 (343) Q Consensus 95 ~~--~~~~~~di-p~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~----G-~ 166 (343) +. +..+ .++ |..+.+-.+....|-. ..-|..-+.++..++ ...++-..-.+.+..++++..|+.++- + . T Consensus 71 ~~~~~~~g-~~l~~~~~~~~~~~~l~ID~-~ky~~~~VddiD~~q-~~~dl~~~~~~~~g~aLA~~~D~~i~~~l~~aa~ 147 (332) T protein:vir:78 71 SAGYHTPG-TPIVGDAGIKANEKTLVMDD-LLVSSQFVYSLDEIF-SQYSTRAEVSKQIGEALATHYDERIARVLAKASA 147 (332) T ss_pred eEeeecCC-CCCCCCCCCCCceEEEEEeh-hhhhHHHHHhHHHHh-cCcchHHHHHHHHHHHHHHHHHHHHHHHHHhhhc Confidence 33 2222 222 2223333443333222 123445567788754 446688888888999999999987652 1 1 Q ss_pred hhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCee-cccEEEecHHHHHHHhc---ccc-C-CC- Q lcl|NC_011142. 167 TNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFH-TPNTVLMFPDLWKRASS---LLM-T-GY- 239 (343) Q Consensus 167 ~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~-~p~tL~l~p~~~~~L~~---~~~-~-~~- 239 (343) ......| .|+-.....+.. .+.+++.+++-|.++..+|.++ .+- .-..++++|..|..|.+ ++. + +. 
T Consensus 148 ~~~~~~~---~~g~~~~~~~~~-~~~~~~~~~~~i~~a~~~Lde~--~VP~~gR~~vv~P~~y~~Ll~~~d~~~~n~~~~ 221 (332) T protein:vir:78 148 EASPVTG---EPGGFHVNIGAG-NTNDAQAIVDGFFEAAAVLDER--SAPQEGRVAVLSPRQYYSLISSVDTNILNREIG 221 (332) T ss_pred ccCcccc---cccccccccCCc-cccCHHHHHHHHHHHHHHHhhc--CCCccCCEEEeCHHHHHHHHhhcCceeeeeecc Confidence 1111222 122111111111 2346889999999999988764 331 11468899999988864 111 1 11 Q ss_pred -CCccHHHHHHhcCcceeeccccccccccceeeechhhhc---cccCCc----------cceEEEEEcccceEEEeeccc Q lcl|NC_011142. 240 -TDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAA---GVSNGN----------KDRYVVYDKSERNLALAKPIP 305 (343) Q Consensus 240 -~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~---g~~~~g----------~dr~v~y~~~~~~~~~~v~~~ 305 (343) .+.++.. ... ...+.|.++....+. +...+. ..+..| +-..++| .++-+.+...++ T Consensus 222 ~~~~~~~~---g~~-i~~i~G~~V~~Sn~l----p~~~g~~~~~~~~~~~~n~~~~~~~~~~~~~~--h~~a~~~v~~~~ 291 (332) T protein:vir:78 222 NSQGDMNS---GKG-LYSIAGIRILKSNNL----AGLYGQDLSSAAVTGENNDYQVDASALAGLIF--HREAAGCIQSVA 291 (332) T ss_pred ccccceec---cee-eeEEeeeEEEecCcc----ccCcccccccccccccccccccccccceEEee--cccceeeeeeec Confidence 1111110 000 111222221111110 000000 000000 1111222 233343433333 Q ss_pred hhcccc----eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 306 FRMLAP----QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 306 ~~~~~~----~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++..-. .++.....+..... 
.|+.+.||.+++.+.== T Consensus 292 ~~~~~t~~~~~~~~~~d~i~~~~~-~G~~v~rPe~~v~l~~a 332 (332) T protein:vir:78 292 PTIQTTSGDFNVQYQGDLIVGKLA-MGCGSLRTSVAGSFQAA 332 (332) T ss_pred cchhhhhcccchhhhHhhhhhhhh-hcCceecccceEEEeeC Confidence 322111 12223344445554 47899999998766555 No 135 >protein:vir:102823 Length: 470 # NCBI annotation: major structural protein # Family: family:all:2450 # MgeID: mge:1610 # MgeName: YS40 # Cross-refs: genbank:acc:YP_874086;genbank:gi:118197693;genbank:GeneID:4496015 Probab=90.81 E-value=0.0015 Score=36.01 Aligned_cols=293 Identities=14% Similarity=0.040 Sum_probs=118.7 Q ss_pred CC-cceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhh--cccchhhccccCCCC Q lcl|NC_011142. 1 MS-EKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYA--DITYLEDVPVLANIP 77 (343) Q Consensus 1 ~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~--~l~~~~~i~v~~~~~ 77 (343) |. |.---.|- |-++.|+.|+.+ +++ +.. +.+|+++...-.. +++.-.-++- .+.. T Consensus 1 ~~~~~~~~~~~---a~~~al~~a~~~---------------g~A-lR~--EsLd~~l~~lt~~~~~ftf~~~i~k-~~a~ 58 (470) T protein:vir:10 1 MPYEHLKHLDE---ATLKALNAAGQV---------------AES-LER--EDLEPEVTQLNVLDTPLTDLLSKNA-VKAK 58 (470) T ss_pred CChhHhhhhhH---HHHHHHHHhhhc---------------chh-hhh--hhhccceeEeeecCccchhhhhcCC-chhh Confidence 21 11001111 112222222211 122 111 3444444332211 2222222221 1222 Q ss_pred cceeEEEEee-ecccccce--eecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 78 EYATHWNYRS-YDGAAMGK--FISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 78 ~~~~~~~~~~-~~~~G~a~--~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) ..... |.+ ++..|+.. 
..+ ....-+..+.++.++...+..++.+.+++...+...+-.=.++....-+.|...+ T Consensus 59 STV~e--y~~~~~rhG~~g~s~~~-E~~l~~~~d~~~~Rr~v~~K~l~~~~~VT~~a~~~~~n~v~d~~~~~~~dai~~i 135 (470) T protein:vir:10 59 AYEHE--YNVVTARHDKIGYAAFR-EGGLPRTVEVNVVRRRIRPMLVGHRITVTELATRTTQNGVMQIDELVKREKMIAV 135 (470) T ss_pred hHhhh--hhhhccccccccceeec-ccccCccCCCceEEEEEEEEEEeecchhhhhhhhhhhccccchHHHHHHHHHHHH Confidence 22222 221 22223332 232 3333455677888888888899999888776665533334488888888999999 Q ss_pred HHhhhheeeeeehhhc-----------ceeeee--cCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEE Q lcl|NC_011142. 155 EEHSQRVAYFGDTNRN-----------MSGLLN--NPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTV 221 (343) Q Consensus 155 ~~~~n~~~f~G~~~~g-----------~~GLlN--~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL 221 (343) ++......||||+.+. +.||.| +++-+.-.. +....-. + .+.|+++-..+. .+++.-.|+-+ T Consensus 136 a~tiE~a~FyGDs~l~s~~~g~~~gleFDGl~~lId~~~~~NVi--DarG~~L-s-~~~L~~aa~~I~-~~~~fGt~TD~ 210 (470) T protein:vir:10 136 ANEFEYLAFYGDNLLGDDVPGSPNNLQQDGIINIIKRGAPQNVL--DAGGRPL-S-IDLLWEAESRVV-STQAFANPTAV 210 (470) T ss_pred HHHHHhhhhhhccccccccCcccCceeccchhhhccCCCCcccc--ccCCCCc-c-HHHHHHHHhhhc-ccccccChhhh Confidence 9999999999988552 344422 211010000 1101111 0 244555554443 34567789999 Q ss_pred EecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEc-------- Q lcl|NC_011142. 222 LMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDK-------- 293 (343) Q Consensus 222 ~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~-------- 293 (343) .||+...+.|..-..+.. ..+..+|......|.++...... .-..++ ++..+... 
T Consensus 211 ~lp~~vka~f~~~~~~~q------Rv~~~~N~~~~~~G~~v~~f~sa---~G~I~L--------~~s~~m~~~~k~~p~~ 273 (470) T protein:vir:10 211 FISYVDKLNLQASFYQIS------RVMTTADRRAGLLGADAQSYIGV---RGEHSL--------YPSQFLGDFHKFNPAR 273 (470) T ss_pred ccchhHHHHHHHhhcCce------EEEEecCCCceeeeeeccceeee---eeeeee--------cccccccchhhcCccc Confidence 999999998875433211 11222333333333332110000 000000 00011100 Q ss_pred -ccceEEEeec---------cchhcccceecCc--------eeEeeeeeeeeeEEEECcceee-------eeccC Q lcl|NC_011142. 294 -SERNLALAKP---------IPFRMLAPQLLGL--------GITVPAEYKISGTEYRYPLCAQ-------YVDML 343 (343) Q Consensus 294 -~~~~~~~~v~---------~~~~~~~~~~~~~--------~~~~~~~~~~gGv~i~~P~ai~-------~~dGI 343 (343) +.++-.++-| .+...++.+-+.. +|..+...+.|-- +|.++- .-+|| T Consensus 274 l~~~v~~~aAP~~~~tv~~t~~~~a~~~~sk~g~~~~~~v~sy~y~v~~~~gds---~s~~v~vt~t~~~v~kgv 345 (470) T protein:vir:10 274 FGAEVGDFAAPSNSWTVSTTDNFVTLPYNSGLGDPANTTVYSYAFKAANFYGES---AAKYIDVYIDSTEAGKGV 345 (470) T ss_pred CCcccCCcccCceeEEeecCCCceeecccCCCCcccCcceeEEEEEEEEecCCC---CcceEEEEEeeehhccee Confidence 0011111111 1111222222211 1222222222111 122221 22333 No 136 >protein:vir:97031 Length: 402 # NCBI annotation: 31 # Family: family:all:2806 # MgeID: mge:1644 # MgeName: K1-5 # Cross-refs: genbank:acc:YP_654132;genbank:gi:108862016;genbank:GeneID:5075980 Probab=90.23 E-value=0.022 Score=29.67 Aligned_cols=300 Identities=8% Similarity=-0.008 Sum_probs=127.0 Q ss_pred hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeeccccccee--ec Q lcl|NC_011142. 21 KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKF--IS 98 (343) Q Consensus 21 ~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~--~~ 98 (343) |... .....|...-.++....|+....-.+++.+-+.. ..+.++.+++ +. +-.++.|. ..|..+. +. 
T Consensus 1 Ms~~--n~~t~~~~~~s~~~~al~le~f~geV~taF~~~s----i~~~~~~vrt-i~-~GkS~qf~---~iG~~~a~y~~ 69 (402) T protein:vir:97 1 MSTP--NTLTNVAVSASGEVDSLLIEKFNGKVNEQYLKGE----NILSYFDVQT-VT-GTNTVSNK---YLGETELQVLA 69 (402) T ss_pred CCCc--ccccccccccccchhhhhhhhhhhhHHHHHHHHH----hhcCcceeee-ec-ccceEEEE---EEeeeEEeeec Confidence 2111 1112233332233445666555667777664421 2234444432 22 23334433 2344443 11 Q ss_pred CCcCccceeeeccceeEEEE--EEEEEEEeecHHHHHHHHHhCCC-ccHHHHHHHHHHHHHhhhheeee-----eehhh- Q lcl|NC_011142. 99 ANASDLPRVAQSAKLHQVEL--GYAGVECHYSLDELRTTAAVNMP-IDSMQAELAFRGSEEHSQRVAYF-----GDTNR- 169 (343) Q Consensus 99 ~~~~dip~v~~~~~~~~~~v--~~~~~~~~~~~~El~~a~~~g~~-l~~~k~~aA~~~~~~~~n~~~f~-----G~~~~- 169 (343) .+ ..+-.....-++....| ..+..-| +.+|..+ +...+ +...-...+..++++..|+.++- |-... T Consensus 70 ~G-~~ldg~~~~~~k~~ItID~lL~a~~~---V~diDea-q~~yD~vRse~s~e~G~ALA~~~Dq~ii~~i~~aa~a~t~ 144 (402) T protein:vir:97 70 PG-QSPNATPTQADKNQLVIDTTVIARNT---VAHIHDV-QGDIDSLKPKLAMNQAKQLKRLEDQMAIQQMLLGGIANTK 144 (402) T ss_pred cc-cccCCCCcccccEEEEeCceeechhh---hhhHHHH-HhcccchhHHHHHHHHHHHHHHHHHHHHHHHHHhhccccc Confidence 11 11101111122222221 1122222 3444442 33565 56666677888888888885531 11111 Q ss_pred ---cceeeeecC-CccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhcc--ccCCCCCcc Q lcl|NC_011142. 170 ---NMSGLLNNP-NVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSL--LMTGYTDRT 243 (343) Q Consensus 170 ---g~~GLlN~p-~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~--~~~~~~~~t 243 (343) +..+...+- +.+..... .=...+++.+++-|.++..+|.++.=-... 
..++|+|..|..|..- .++..++.+ T Consensus 145 ~~~~~~~~~~~g~s~~~~~t~-~~a~~~~~~l~~ai~~a~~~LdEkdVP~~d-Rv~vv~P~~y~~Ll~~~rl~n~d~~~~ 222 (402) T protein:vir:97 145 AERNKPRVKGHGFSINVNVTE-SEALANPQYVMAAVEYALEQQLEQEVDISD-VAIMMPWKFFNALRDADRIVDKTYTIS 222 (402) T ss_pred cccccCccccccccccccccc-chhhcCHHHHHHHHHHHHHHHHhcCCCccc-cEEEeChHHHHHHhhcccccchhhccc Confidence 111111111 11111111 112357889999999999888764211122 4799999999988742 111100000 Q ss_pred -HHHHHHhcCcceeeccccccccccceeeechh---hhcccc---------CCccceEEEEEcccceEEEeeccchhcc- Q lcl|NC_011142. 244 -VIEHFQINNAYTLLTRNPIDIKIRFQLMATEL---AAAGVS---------NGNKDRYVVYDKSERNLALAKPIPFRML- 309 (343) Q Consensus 244 -vle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~---~~~g~~---------~~g~dr~v~y~~~~~~~~~~v~~~~~~~- 309 (343) -..+. +.-...+.|.++..+.+.--.+... ....++ .-.+-++++|.+ +-+.-.-.++++.- T Consensus 223 ~~g~~~--~G~v~~v~Gv~Vv~SnnlP~~a~~it~~~ls~a~~G~~y~~t~d~t~~~~~~f~~--~Av~tvk~~~vT~~~ 298 (402) T protein:vir:97 223 QSGATI--NGFVLSSYNCPVIPSNRFPTFAQDQAHHLLSNEDNGYRYDPIAEMNGAVAVLFTS--DALLVGRTIEVTGDI 298 (402) T ss_pred cCCccc--cceeEEEeceEEEecCccccccccccccccccCCCCccCCcCcccceeEEEEEec--ceEEEEEeeccccch Confidence 00011 1111122333322221111000000 000111 112445666654 33332333444332 Q ss_pred cceecCceeEeeeeeeeeeEEEECcceeeeec-----------cC Q lcl|NC_011142. 310 APQLLGLGITVPAEYKISGTEYRYPLCAQYVD-----------ML 343 (343) Q Consensus 310 ~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d-----------GI 343 (343) --+.+...|.+.+.+.. |+..+||.++..+. 
|+ T Consensus 299 ~~d~r~~~~~id~~~a~-G~g~~RPeaa~vv~~~~~~t~~~~~~~ 342 (402) T protein:vir:97 299 FYEKKEKTYYIDTFMAE-GAIPDRWEAVSVVTTKRDATTGDAGGP 342 (402) T ss_pred hhchhHHHHHHHHHHHh-CCcccCccceEEEEEecccccccCCcc Confidence 22344455666666665 79999999998882 22 No 137 >protein:vir:98480 Length: 348 # NCBI annotation: ORFp38 # Family: family:all:1083 # MgeID: mge:1589 # MgeName: VWB # Cross-refs: genbank:acc:NP_958280;genbank:gi:41057254;uniprot:Q38595;genbank:GeneID:2732864 Probab=89.95 E-value=0.023 Score=29.52 Aligned_cols=272 Identities=13% Similarity=0.033 Sum_probs=108.9 Q ss_pred cCcchhecchhhhhhhhHHHHHHHHHHHH-hhhhhcccchhhccccCCCCcceeEEEEeeeccc-cc---ceeecCCcCc Q lcl|NC_011142. 29 IGVPSVVNDADGGAAYYISQLASLETTVY-EVPYADITYLEDVPVLANIPEYATHWNYRSYDGA-AM---GKFISANASD 103 (343) Q Consensus 29 ~~~~~~~~dA~~~~~f~~~~l~~id~~v~-e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~-G~---a~~~~~~~~d 103 (343) |+. +. + --.|...+|+.+=.++. +.+...+-..++||... . ..+.|...... +. +.+.+..+. T Consensus 1 M~~---~~--~-~d~~~~~~l~~~i~~~~~~~~~~~~l~~~~fp~~~-~----~~~~~~~~~~~~~~~~~a~~~~~~~~- 68 (348) T protein:vir:98 1 MSW---TL--D-TEFIEPTQLTGLIREALRDLQVNRFRLARWLPNVD-V----DDITFEFLRGGGGLAETASYRSWDTE- 68 (348) T ss_pred Ccc---hh--h-hhccCHHHHHHHHHHHhhccCcchhhHHhcCCCcc-c----cceEEEEEeccCCceeeeeeecCCCc- Confidence 121 00 1 11344555554433332 22334466788998632 1 12333332221 11 223332222 Q ss_pred cceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCC----ccHHHHHHHHHHH----HHhhhheeeeee---hhhcc Q lcl|NC_011142. 104 LPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMP----IDSMQAELAFRGS----EEHSQRVAYFGD---TNRNM 171 (343) Q Consensus 104 ip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~----l~~~k~~aA~~~~----~~~~n~~~f~G~---~~~g~ 171 (343) -|... ..++.....+..++..+.++..|+...+....+ .-.+.....++.+ +...-++++.|- .+.+. 
T Consensus 69 ~~~~~r~g~~~~~~~~~~i~~~~~i~~~d~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~~~g~~~ 148 (348) T protein:vir:98 69 SKIGRREGLAKVMGELPPISEKIPLNEYDRLRLRKLSRDEALPFIARDAQRLARNIGARFEVARGSALVNATVPVTELQQ 148 (348) T ss_pred cceeecccceeeeeeccccccccccCHHHHHHhcCChHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeEEEecCce Confidence 22222 234556667777888888888887764321110 0011111122222 222234455551 11121 Q ss_pred eee-eecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHh Q lcl|NC_011142. 172 SGL-LNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQI 250 (343) Q Consensus 172 ~GL-lN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~ 250 (343) .+ +..|.-...+.++.|+.....+++.||.+.+..+...+ | ..|..++|+++.|..|.+ +..+.+.+.- T Consensus 149 -~vDyg~~~~~~~t~~~~Ws~~~~adp~~di~~~~~~~~~~~-G-~~p~~~vm~~~~~~~l~~-------~~~i~~~~~~ 218 (348) T protein:vir:98 149 -TVDFGRIGSHSVVAAVLWSVHATATPISDLESWVATYEDTN-G-QSPGVILMPKAAVSHMRQ-------CEEVIRQVFP 218 (348) T ss_pred -EEccccCcccccccccccCCCCCCCHHHHHHHHHHHHHHcc-C-CcceEEEeCHHHHHHHhc-------CHHHHHHHhc Confidence 11 23343334556678964332357899999998877543 3 358999999999999853 3355555543 Q ss_pred cCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCceeEeeeeeeeeeEE Q lcl|NC_011142. 251 NNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLGITVPAEYKISGTE 330 (343) Q Consensus 251 n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~ 330 (343) .+.... .+ .+.. ......+. ..|+ -...+|+ . .+.. +=...+..+.+.-...| .. .... T Consensus 219 ~~~~~~---~~-~~~~--~~~~~~~~----~~g~-~~i~~~d--~-~~~~----~g~~~~~~p~~~i~l~p--~~-~~~~ 277 (348) T protein:vir:98 219 LAPSGT---AP-MVSV--EQLNTVLS----SMGL-PPIEVYD--A-KVAV----DGVSTRITPANAIALLP--EP-GATD 277 (348) T ss_pred cCcccc---cc-ccCH--HHHHHHHH----hhCC-eEEEEee--e-EEEc----CCceeceecCCeEEEEe--cC-Cccc Confidence 222110 00 0000 00000111 1111 1233332 1 1111 00011111222111111 00 1111 Q ss_pred EECcce-eeeeccC Q lcl|NC_011142. 
331 YRYPLC-AQYVDML 343 (343) Q Consensus 331 i~~P~a-i~~~dGI 343 (343) ++++.. -...-|- T Consensus 278 ~~~~~~~G~t~~G~ 291 (348) T protein:vir:98 278 AAQPTELGATLLGT 291 (348) T ss_pred ccccccccceeccc Confidence 111111 1122232 No 138 >protein:vir:2736 Length: 348 # NCBI annotation: putative structural protein # Family: family:all:1083 # MgeID: mge:58 # MgeName: O1205 # Cross-refs: genbank:acc:NP_695109;genbank:gi:23455878;genbank:GeneID:955608 Probab=88.97 E-value=0.029 Score=29.01 Aligned_cols=277 Identities=10% Similarity=0.009 Sum_probs=101.2 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeeccc---cc-ceeecCCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGA---AM-GKFISANA 101 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~---G~-a~~~~~~~ 101 (343) |+. -.-.|+..+++..-.++ ..+...+-..++||... .. + +.+...+.. .. +.+++..+ T Consensus 1 M~~-----------i~d~f~~~~l~~~v~~~-~~~~~~~l~~~~Fp~~~-~~-~---~~~~~~~~~~~~~~~a~~v~~~~ 63 (348) T protein:vir:27 1 MGL-----------IYDKVTASNIAGYFNAL-QENVSSTLGESIFPARK-QL-G---TKLSYIKGASGQSVALKAAAFDT 63 (348) T ss_pred Ccc-----------hhhhcCHHHHHHHHHhc-cchhhhhhHhhcCCCcc-cc-c---eeEEEEeeccCceeEeeeecCCC Confidence 111 11234444444321111 11223344456777422 11 1 122222221 11 22233222 Q ss_pred CccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHH-----------H----HHHHHHHHHhhhheeeeee Q lcl|NC_011142. 102 SDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQ-----------A----ELAFRGSEEHSQRVAYFGD 166 (343) Q Consensus 102 ~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k-----------~----~aA~~~~~~~~n~~~f~G~ 166 (343) .....-...++..+..++.+...+.++..|++......-...... . 
...++..+...-+++++|- T Consensus 64 ~~~~~~r~~~~~~~~~~p~i~~~~~i~~~d~~~~~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~~al~~Gk 143 (348) T protein:vir:27 64 NVTIRDRVSAEMHDEQMPFFKEAMLVKENDRQQLNLVKDSGNAVLVNTIVAGIFNDNLTLVNGARARLEAMRMQVLATGK 143 (348) T ss_pred CcceecccceeeeeeecCccccccccCHHHHHHHHHhhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCe Confidence 221112233455566777788888888777655433322111111 1 1122222333334455551 Q ss_pred ---hhhcce-ee-eecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCC Q lcl|NC_011142. 167 ---TNRNMS-GL-LNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTD 241 (343) Q Consensus 167 ---~~~g~~-GL-lN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~ 241 (343) .+.|.. .+ ++.|.-...+.++.|++.++ +++.||.+....+.. .|. .|..++|+++.|..|.+ + T Consensus 144 i~i~~~~~~~~vdfg~~~~~~~t~~~~W~~~~a-dp~~di~~~~~~~~~--~G~-~~~~ii~~~~~~~~l~~-------~ 212 (348) T protein:vir:27 144 IAFTSDGVNKDIDYGVKPDHKKQVSKSWAEPGA-TPLADLEDAIETARE--LGL-NPERAVMNAKTFGLIRK-------A 212 (348) T ss_pred eEEecCCeeEEEeecCCcccceeeeeccCCCCC-CHHHHHHHHHHHHHh--cCC-cccEEEECHHHHHHHhc-------C Confidence 122221 10 22232223444557988766 578999999877753 354 88999999999999863 2 Q ss_pred ccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEc----ccceEEEeeccc-hhcccceecC- Q lcl|NC_011142. 242 RTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDK----SERNLALAKPIP-FRMLAPQLLG- 315 (343) Q Consensus 242 ~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~----~~~~~~~~v~~~-~~~~~~~~~~- 315 (343) ..+.+.+..++....... .. .....+. ..+|- ..++|+. ..-...-.+|.. 
+.++|....+ T Consensus 213 ~~v~~~~~~~~~~~~~i~----~~----~~~~~~~----~~~g~-~i~~yd~~y~d~~G~~~~~~p~~~vvl~~~~~~G~ 279 (348) T protein:vir:27 213 ASTVKVIKPLAGDGSAVT----KA----ELENYIA----DNFGV-SIVLENGTYRNDKGEVSKFYPDGHLTLIPNGPLGN 279 (348) T ss_pred HHHHHHhcccCccccccC----HH----HHHHHHH----hhcCc-eEEEEeeEEEcCCCcCcccccCCeEEEEcCCccee Confidence 334444432222111100 00 0000000 01111 1222221 000111112221 1111211111 Q ss_pred ceeEeee------eeeee---------eEEE---ECcceeeeec-----cC Q lcl|NC_011142. 316 LGITVPA------EYKIS---------GTEY---RYPLCAQYVD-----ML 343 (343) Q Consensus 316 ~~~~~~~------~~~~g---------Gv~i---~~P~ai~~~d-----GI 343 (343) ..|-... ....+ |..+ .......+.. .+ T Consensus 280 ~~yG~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dP~~~~~~~~s~~l 330 (348) T protein:vir:27 280 TVFGTTPEESDLFADNTVNAEVEIVDNGIAVTTTKTTDPVNVQTKVSMVAL 330 (348) T ss_pred EEeccCcchhhhhhccccccceeeeCCeeEEEeeecCCCceEEEEEeeeee Confidence 0110000 00000 0110 0111000100 00 No 139 >protein:vir:1084 Length: 437 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:21 # MgeName: bIL309 # Cross-refs: genbank:acc:NP_076738;genbank:gi:13095848;genbank:GeneID:920418 Probab=88.28 E-value=0.033 Score=28.68 Aligned_cols=297 Identities=8% Similarity=-0.059 Sum_probs=119.6 Q ss_pred CCccee---ccchhhhhchhhhchhcccccccC--cchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC Q lcl|NC_011142. 1 MSEKRV---VIDAQTIAGNRWLNKFLDSNATIG--VPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLAN 75 (343) Q Consensus 1 ~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~ 75 (343) +.+... ......-................. ......+ ..+.++.. +.+...+.+ ....-..++++.+. 
T Consensus 120 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e~~~~~~~~~--~~~g~lvp--~~~~~~i~~-~~~~~~l~~~~~~~-- 192 (437) T protein:vir:10 120 RDAGGLQDMKLKVGGEIADKKVTAFADYLKTGEVRDVTGIAL--KDGKVIIP--ETILTPEKE-VHQFPRLGSLVRTE-- 192 (437) T ss_pred HhHHHHhHHHHHHHHHHHHhhhhhhHHHHHhhhhhhhhhccc--ccccccch--HHHHHHHHH-hhhhhhhhhcceeE-- Confidence 000000 000000000000000000000000 0000111 12223333 122333333 22222333443332 Q ss_pred CCcceeEEEEeeecc-cccceeecCCcCccce-eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHH Q lcl|NC_011142. 76 IPEYATHWNYRSYDG-AAMGKFISANASDLPR-VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRG 153 (343) Q Consensus 76 ~~~~~~~~~~~~~~~-~G~a~~~~~~~~dip~-v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~ 153 (343) +-......+.+... .+.+.+++..+ .+|. .+..++......+.++.-+.+|.+=|+.+ ..+|..--....+.+ T Consensus 193 -~~~~~~~~~~~~~~~~~~~~~~~e~~-~~~e~~~~~~~~v~~~~~k~~~~~~is~ell~ds---~~~~~~~i~~~l~~~ 267 (437) T protein:vir:10 193 -SVTTTTGKLPIFNNSTDLLTAHTEYG-QTTKNATPVITPILWDLKTYTGGYVFSQELISDS---SYDWQAELQSRLIEL 267 (437) T ss_pred -eeccCceeeEEeeccccccccccccc-cccccccccceeeeeehhheeeehhhhHHHHhhh---HHHHHHHHHHHHHHH Confidence 11222333444432 34445554433 3453 33457777788888887777776444432 345666677778889 Q ss_pred HHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhc Q lcl|NC_011142. 154 SEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASS 233 (343) Q Consensus 154 ~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~ 233 (343) +...+|.-+++|+.. +.+..+ .+.+.+++.+-++.. +.. .+. ..-.++|+|..|..|.. 
T Consensus 268 ~~~~~~~~i~~g~g~----------~~~~~~-----~~~~~~~~~~~~~~~---l~~--~~~-~~~~~~~~~~~~~~l~~ 326 (437) T protein:vir:10 268 RDNTDDSLIITALTD----------GIKKTT-----STYLLGDLKKVLNVT---LKP--QDS-AAASIVMSQSAYNLFDM 326 (437) T ss_pred HHHHHHHHHhhhhcc----------cccccc-----cccchhhHHHHHHhh---hhh--hhh-cCCEEEEcHHHHHHHHH Confidence 999999999999753 111111 112233333323222 211 121 22368999999999865 Q ss_pred cccCCCCCccHHH-HHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccce Q lcl|NC_011142. 234 LLMTGYTDRTVIE-HFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQ 312 (343) Q Consensus 234 ~~~~~~~~~tvle-~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~ 312 (343) -. +..|.-++. -+....+ ..+-|.|+.+.... ....+..| +..++|-+=.+.+.+..-+.++..-.. T Consensus 327 lk--d~~g~~~~~~~~~~~~~-~~l~G~pv~~~~~~--------~~~~~~~~-~~~~~~gd~~~~~~~~~r~~~~~~~~~ 394 (437) T protein:vir:10 327 AT--DAMGRPLLQPNVTAATG-YTLLGKTVVIVDDK--------LFPSASAG-DVNIVVAPLKKAVINFKLTEITGQFQD 394 (437) T ss_pred hh--ccCCCeeeccCccCCCC-cccccceeEEeccc--------ccCCcCCC-ceEEEEeeccccEEEEeeeceEEEEec Confidence 32 333332221 0111111 12334443321110 00111112 223334332232322222223321111 Q ss_pred -ecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 313 -LLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 313 -~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .......+....|+ ++.+..|.|++++.|= T Consensus 395 ~~~~~~~~~~~~~r~-d~~~~~~~a~~~l~~~ 425 (437) T protein:vir:10 395 TYDIWYKQLGIFLRQ-NVVQASKDLIVNLTGK 425 (437) T ss_pred ccccccceeeEEEEE-ccEEecccceEEEEee Confidence 11222334445676 5667789999999876 No 140 >protein:vir:78350 Length: 383 # NCBI annotation: Cps # Family: family:all:635 # MgeID: mge:1850 # MgeName: B025 # Cross-refs: genbank:acc:YP_001468644;genbank:gi:157325222;genbank:GeneID:5601696 Probab=87.81 E-value=0.036 Score=28.48 Aligned_cols=301 Identities=11% Similarity=-0.007 Sum_probs=128.6 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcc-----------hhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhh Q lcl|NC_011142. 
1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVP-----------SVVNDADGGAAYYISQLASLETTVYEVPYADITYLED 69 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----------~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~ 69 (343) +.|++... ++.-. ++....+.. ..-|.. ++.....++|.++.. +.+..+|++.....-..+.+ T Consensus 43 ~~~~~~~~-~~~~~-~~~~~~~~~--~~~g~~~lt~~e~~~~~~~~~~~~~~gg~lvP--~~~~~~I~~~l~~~s~l~~~ 116 (383) T protein:vir:78 43 MAADIMEQ-AKKEA-RQEADAYIS--ASRTDKNITNEEIKFFNDINKEVGYKEETLLP--QTVVDEIFEDLTTEHPFLAS 116 (383) T ss_pred HHHHHHHH-HHHHH-HHHHHHHHH--hcCChhhhhHHHHHHHHHHhccCCCCCccccC--HHHHHHHHHHHHhhccceee Confidence 22211110 00000 000000000 000110 111122344555555 34555666655444344444 Q ss_pred ccccCCCCcceeEEEEeeecccccceeecCCcCccc-eeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHH Q lcl|NC_011142. 70 VPVLANIPEYATHWNYRSYDGAAMGKFISANASDLP-RVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAE 148 (343) Q Consensus 70 i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip-~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~ 148 (343) ..+.. .+ + . ..+...+..+.+.+++..+ .++ ..+..++......+.++.-..++.+=|+. ...+|+.--.. T Consensus 117 ~~v~~-~~-~-~-~~i~~~~~~~~a~w~~e~~-~~~~~~~~~f~~i~l~~~kl~~~i~is~ell~D---s~~~ie~~i~~ 188 (383) T protein:vir:78 117 IGMRT-TG-L-R-TKFLKSETSGVAVWGKIFG-EIKGQLDATFSDEESIQNKLTAFVVVPKDLEKF---GPAWVKRFVVT 188 (383) T ss_pred eeeEe-cC-C-c-eEEEEEcCCcceEEeeccc-ccccccCcceeeEeecceeeEeeccchHHHhhc---cHHHHHHHHHH Confidence 44321 22 1 1 2345566667777765543 233 34556778888888888766665444433 35578888888 Q ss_pred HHHHHHHHhhhheeeeeehhhcceeeeecCCcccccc---CcCccc---cCHHHHHHHHHHHHHHHHHhcCCe------- Q lcl|NC_011142. 149 LAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSA---TVNYAT---CTGQELFDLLNNPVFAVVKASKRF------- 215 (343) Q Consensus 149 aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~---~~~w~~---~t~~~i~~di~~~~~~l~~~s~g~------- 215 (343) ..+++++..+|+.+++|+...+-.||+++.+...... ...|.. .+.+++. .+...+..+....... 
T Consensus 189 ~l~~~~a~~~~~a~i~G~G~~qP~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~l~~~~~~~~~~~~~~~~~ 267 (383) T protein:vir:78 189 QIEEAFAVALESAYIVGDGNDKPIGLNRKVGKGSTVVDGVYAEKAATGTLTFANPK-TTVNELTDVYKYHSVKENGHPLN 267 (383) T ss_pred HHHHHHHHHHhhheEeccCCCCceeeeeccCCcccccccccccccccchhhhhhhH-HHHHHHHHHHhccchhcccchhh Confidence 9999999999999999998777899998765322111 112221 1222221 1122222222111110 Q ss_pred -ecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcc Q lcl|NC_011142. 216 -HTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKS 294 (343) Q Consensus 216 -~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~ 294 (343) .+..+.++.|..+..+...+...+ .++.+...-+.|..+-.... .+.+ .++..+.+ T Consensus 268 ~~~~~~~~~n~~~~~~~~~~~~~~~----------~~G~~~t~l~~~~~iv~s~~-----------~p~~--~iifgdfs 324 (383) T protein:vir:78 268 VAGKVTLLVNPTDAWDVKKQYTSLN----------ANGVYVTALPFNLNIIESLF-----------VPEK--KAISYVAE 324 (383) T ss_pred hcCceEEEEcCcchhhhccchhccC----------CCCceeeecCCCceEEecCC-----------CCcc--cEEEeecc Confidence 111245667655433321110000 11111111122221100000 0011 11111111 Q ss_pred cceEEEeeccchhcccc-eecCc--eeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 295 ERNLALAKPIPFRMLAP-QLLGL--GITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 295 ~~~~~~~v~~~~~~~~~-~~~~~--~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) . +-+..-..++.-.- +.... 
...+....|.+| .++.|.|+++++ | T Consensus 325 ~--Y~i~~r~~~~i~~~~~~~f~~d~~~f~~~~r~dG-~~~~~~A~~vl~-~ 372 (383) T protein:vir:78 325 R--YDALIGGPLDIGTYDQTLAIEDLNLYAAKQFAYG-KAKDDKAAAVWT-L 372 (383) T ss_pred c--eEEEecccceEEecchhhhhcCceEEEEEEEEcC-EEecCCeEEEEE-E Confidence 1 22222222222111 11111 233455677654 788899988876 5 No 141 >protein:vir:95963 Length: 395 # NCBI annotation: ORF009 # Family: family:all:635 # MgeID: mge:1594 # MgeName: 2638A # Cross-refs: genbank:acc:YP_239802;genbank:gi:66395459;genbank:GeneID:5132880 Probab=87.28 E-value=0.04 Score=28.26 Aligned_cols=302 Identities=10% Similarity=-0.009 Sum_probs=131.0 Q ss_pred CCcceeccchhhhhchhhhchh-ccc--ccccC------cchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhcc Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKF-LDS--NATIG------VPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVP 71 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~-~~~--~~~~~------~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~ 71 (343) +.++... +++..+..+..... +.. ...+. ..++..+..+++.++.. +.+..+|++.....-..+.+.. T Consensus 45 ~~~~~~~-~~~~e~~~~~~~~~~~~~r~~~~l~~ee~~~~~~~~~~t~~~gG~liP--~~~~~~Ii~~l~~~s~i~~~~~ 121 (395) T protein:vir:95 45 LSNDLQE-EITAEINNRVVDNGILAKRSQDPLTSEERKFFNDINYDVGYTDEKILP--ETVVERVFDDLQKDHPLLSKIN 121 (395) T ss_pred HHHHHHH-HHHHHHHHHHHHHHHHhhcCccccchHHHHHHHHHhhccCCCCceecc--HHHHHHHHHHHHhhhhhhhhce Confidence 1100000 00000000000000 000 00000 00111122334445554 4556667776665555555555 Q ss_pred ccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHH Q lcl|NC_011142. 72 VLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAF 151 (343) Q Consensus 72 v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~ 151 (343) +.. 
.+ + ...+...+..+.+.+....+.--+..+..++......+.++.-..+|.+=|+ ....+++.--....+ T Consensus 122 v~~-~~-~--~~~i~~~~~~~~a~w~~e~~~~~~~~~~~f~~i~l~~~kl~~~~~iS~ell~---ds~~~ie~~i~~~la 194 (395) T protein:vir:95 122 FQN-AG-I--KTRVIKADPAGQAVWGKVFGEIKGQLDAAFREENFTQYKLTCFVVLPDDLST---FGPAWIERFVRTQIQ 194 (395) T ss_pred eEe-cC-C--ceEEEEecCCcceEEeecccccCccccccceeeeeceeeEEEeecccHHHHh---cchhHHHHHHHHHHH Confidence 432 22 1 1345556677777775443321234566677888888888877777654443 346678888899999 Q ss_pred HHHHHhhhheeeeeehhhc--ceeeeecCCccccccCcCcccc----CHHHH---HHHHHHHHHHHHHhcCC----eecc Q lcl|NC_011142. 152 RGSEEHSQRVAYFGDTNRN--MSGLLNNPNVTKTSATVNYATC----TGQEL---FDLLNNPVFAVVKASKR----FHTP 218 (343) Q Consensus 152 ~~~~~~~n~~~f~G~~~~g--~~GLlN~p~v~~~~~~~~w~~~----t~~~i---~~di~~~~~~l~~~s~g----~~~p 218 (343) .++++.+|+-+++|+...+ =.||||+.+...... .|... |.+.+ +..+..++..+....++ ...- T Consensus 195 ~~ia~~~~~a~i~G~G~~~~qP~Gil~~~~~~~~~~--~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~ 272 (395) T protein:vir:95 195 EAISVALESAIINGGGAAKTQPVGLMKDVNTNSGAV--TDKASSGTLTFADADTTILELNDVLKNLSVDEKGKELKIDGK 272 (395) T ss_pred HHHHHHHhhheeeccCCCCcCceeeeeccccccccc--ccccccchhhhhhhHhhHHHHHHHHHhhccccccchhhhcCc Confidence 9999999999999986532 479999876433221 22211 22222 22222222222111111 1122 Q ss_pred cEEEecHHHHHHHhccccC-CCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccce Q lcl|NC_011142. 219 NTVLMFPDLWKRASSLLMT-GYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERN 297 (343) Q Consensus 219 ~tL~l~p~~~~~L~~~~~~-~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~ 297 (343) -+++|+|..+..+...+.- +..| -...+-+.|+.+- .. 
...+.++ ++|-+=.+ T Consensus 273 ~~~~mn~~t~~~~~g~~~~~~~~G-----------~~~~~lg~g~~v~-----~~------~~~p~~~---i~fgdfs~- 326 (395) T protein:vir:95 273 VALVVNPRDSWDVQARYTYLTANG-----------GFVTVLPYNVTII-----TS------EFVPEGK---LVAFVTDR- 326 (395) T ss_pred eEEEEcchhhhhcCCcceeccCCC-----------cceeccCCcceEE-----Ec------CCCCCCc---EEEEeccc- Confidence 3577888776554322211 1111 0011111121000 00 0001111 22221111 Q ss_pred EEEeeccch--hcccceecC--ceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 298 LALAKPIPF--RMLAPQLLG--LGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 298 ~~~~v~~~~--~~~~~~~~~--~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +.+..-..+ .++. +... -...+....|++ ..++.|.|+.+++ | T Consensus 327 y~i~~r~~~~i~~~~-~~~~~~d~~~f~~~~r~d-g~~~~~~A~~~l~-i 373 (395) T protein:vir:95 327 YNAVRGGGLTVKKFD-QTLALEDAVLFTAKTFAY-GQPDDNKASAVYD-L 373 (395) T ss_pred EEEEEecceEEEecc-chhhhCCcEEEEEEEEEC-CEEeccccEEEEE-e Confidence 112111122 1111 1111 124455677774 5677788876532 2 No 142 >protein:vir:107882 Length: 307 # NCBI annotation: gp34 # Family: family:all:908 # MgeID: mge:1565 # MgeName: BcepMu # Cross-refs: genbank:acc:YP_024707;genbank:gi:48696944;genbank:GeneID:2845970 Probab=86.84 E-value=0.043 Score=28.09 Aligned_cols=271 Identities=10% Similarity=0.045 Sum_probs=104.7 Q ss_pred ccccCcchhecchhhhhhhhH-HHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceee----cCC Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYI-SQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFI----SAN 100 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~-~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~----~~~ 100 (343) |-+++ .+|.. ..|+.+= +.-.-+.+.+.++||... -..++..|..++. ...... +-. 
T Consensus 1 m~~~~-----------~~~~~dp~LT~~A---~gy~n~~~ia~~l~P~vp---v~~~~~k~~~f~~-eaF~~~~t~r~~~ 62 (307) T protein:vir:10 1 MGRLS-----------KLRIVDPVLTNLA---IGYTNAEFIGQSLMPVVE---VEKEGGKIPKFGK-ESFRLYKTERALR 62 (307) T ss_pred CCCCC-----------CCcccChhHHHHH---HhhcchhhhhhhcCCccc---ccccccceeeECc-ccccchhhhcccC Confidence 11111 12221 1234332 222224578888888743 3334444444431 111100 000 Q ss_pred cCccceeeecc-ceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhh----hheeeeeehhhcceeee Q lcl|NC_011142. 101 ASDLPRVAQSA-KLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHS----QRVAYFGDTNRNMSGLL 175 (343) Q Consensus 101 ~~dip~v~~~~-~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~----n~~~f~G~~~~g~~GLl 175 (343) + +.-.++... +.....+......+-...+ .......++.....+.+...+...+ -+++|.... |+ T Consensus 63 ~-~~~~v~~~~~~~~~~~~~~~~L~~~id~r---~~~~~~~~~~~~av~~l~d~I~l~~E~~~A~l~~~~~~----y~-- 132 (307) T protein:vir:10 63 A-RSNRMNPEDLGSIDIVLDEHDLEYPIDYR---EDQESAFPLEQAAVQTATEAIQLRREKMVADLAQNPNS----YA-- 132 (307) T ss_pred C-CcceeecccccccccccccccccccCChh---hcCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHhcCccc----cC-- Confidence 0 000111110 1111112222222222222 2334455555555555544443333 445544321 11 Q ss_pred ecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcce Q lcl|NC_011142. 176 NNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYT 255 (343) Q Consensus 176 N~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~ 255 (343) ..+.-+.+.+..|+.+++ +++.||.+.+.++...+ ...|++++|+.+.|..|.+ +..+++.|+-+.. 
T Consensus 133 -~~~k~tLsGt~~Wsd~~s-DPi~di~~~~~ai~~~~--g~~Pn~~vlg~~a~~al~~-------hp~i~e~lk~~~~-- 199 (307) T protein:vir:10 133 -GGNKKQLSATEKFTAAGS-DPVGVIEDGKEAIRTKI--GRRPNTMVIGASAYKTLKA-------HPQLIEKIKYSMK-- 199 (307) T ss_pred -CCceEEeccccccCCCCC-CcHHHHHHHHHHHHhhh--CCccceEEeCHHHHHHHhc-------CHHHHHHhCCccc-- Confidence 112222334457988765 56899999999888653 3579999999999998863 1234455442211 Q ss_pred eeccccccccccceeee-----------chhhhccccCCccceEEEEEccc-ceEEEeeccchhccc-ceecCceeEeee Q lcl|NC_011142. 256 LLTRNPIDIKIRFQLMA-----------TELAAAGVSNGNKDRYVVYDKSE-RNLALAKPIPFRMLA-PQLLGLGITVPA 322 (343) Q Consensus 256 ~~~~~p~~i~~~~~l~~-----------~~~~~~g~~~~g~dr~v~y~~~~-~~~~~~v~~~~~~~~-~~~~~~~~~~~~ 322 (343) .. ++.+.+.++.. ......-.-.-|.+..++|.... ..-.-.+-+| ++.. .+.++..+..++ T Consensus 200 g~----it~~~la~ll~v~~i~vg~a~~~~~~~~~~~iw~~~~vl~yv~~~~~~~~~~~~ep-sfGyT~~~~g~~~~d~~ 274 (307) T protein:vir:10 200 GI----VTVDLLKEIFEVENIAVGEAIYADDKDRFTDIWGANIVLAYVPLQRGGQQRTPYEP-SYGYTLRKKGNPVVDTR 274 (307) T ss_pred cc----cCHHHHHHHhCceeEEEeeeeeeccCCccceeCCCceEEEecccccCCCCCccccc-ccceeEEEcCCeEeece Confidence 10 00000000000 00000000001334455553211 0000000011 1111 123444444444 Q ss_pred eeeeeeEEEE------CcceeeeeccC Q lcl|NC_011142. 323 EYKISGTEYR------YPLCAQYVDML 343 (343) Q Consensus 323 ~~~~gGv~i~------~P~ai~~~dGI 343 (343) .+ -+|+.+. .|.-++.--|. T Consensus 275 ~~-~~~~~~~r~~~~~~~~i~~~~~G~ 300 (307) T protein:vir:10 275 IE-DGKLELVRSTDIFRPYLLGADAGY 300 (307) T ss_pred ec-CCceeEEeccccccceeecccccc Confidence 44 3454433 23333333333 No 143 >protein:vir:98635 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:1601 # MgeName: phi3396 # Cross-refs: genbank:acc:YP_001039923;genbank:gi:126011098;genbank:GeneID:4818471 Probab=86.19 E-value=0.047 Score=27.85 Aligned_cols=302 Identities=10% Similarity=-0.030 Sum_probs=124.5 Q ss_pred CCcceeccchhhhh---chhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCC Q lcl|NC_011142. 
1 MSEKRVVIDAQTIA---GNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIP 77 (343) Q Consensus 1 ~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~ 77 (343) ..|.+..+.++.-. .++. +.++..-...+ +.+++.++.. +.+..+|++.....-..+.+.-+.. .+ T Consensus 51 ~~e~~~~~~~~~~~~~lt~ee-~~~~~~~~~~~-------~~~~gg~~vP--~~~~~~I~~~l~~~s~i~~~~~v~~-~~ 119 (377) T protein:vir:98 51 EEEMERMFDLRDKNRELTAEE-IKFFNDIDKNV-------GGKDKFKLLP--EETMVQVFDDLVAEHPLLKVINFKN-TS 119 (377) T ss_pred HHHHHHHHHhccCCcccCHHH-HHHHHHHHhcc-------CCCCCccccC--HHHHHHHHHHHHHhhhhhhheeeEe-cC Confidence 00100000000000 0000 00110000111 1223344443 3344456554443333333333221 11 Q ss_pred cceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHh Q lcl|NC_011142. 78 EYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEH 157 (343) Q Consensus 78 ~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~ 157 (343) +. ..+...+..+.+.+++..+.--+..+..++....+.+.++.-..++.+=|+. +..++..--....++++++. T Consensus 120 -~~--~~~~~~~~~~~a~w~~e~~~~~~~~~~~f~~i~l~~~kl~a~~~is~elL~d---s~~~ie~~i~~~la~~~a~~ 193 (377) T protein:vir:98 120 -LR--LKALTAETSGTAVWGDIFGEIKGQLKQAFKEQDFSQFKLTAFVVIPKDALKF---GPKWIKQFITEQLKEAIAVA 193 (377) T ss_pred -cc--eEEEEecCCcceeEeecccccCcccCccceeEeecceeEEeeecccHHhhhc---cHhHHHHHHHHHHHHHHHHH Confidence 11 2345566777788876543312344556778888888888777776544443 35568888888899999999 Q ss_pred hhheeeeeehhhcceeeeecCCccccccCcCccccCHHH---HHHHHH------------HHHHHHHH----hcCCeecc Q lcl|NC_011142. 158 SQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQE---LFDLLN------------NPVFAVVK----ASKRFHTP 218 (343) Q Consensus 158 ~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~---i~~di~------------~~~~~l~~----~s~g~~~p 218 (343) +++-+++|+...+-.||||++........+.+.+.+... .+.|+. -+++.... .-+...+. 
T Consensus 194 ~~~a~i~G~G~~qP~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~a~~~m~~~t~~~~~klkd~~G~ 273 (377) T protein:vir:98 194 LELAIVKGDGLLQPVGLLKDLSQPTVDQSTGRDITTYKTDKEAIADLSDLTPDNAPKKLVPVMKHLSVNDKKRPLKIAGQ 273 (377) T ss_pred HhhceEeccCCCcceeeeecccccccccccccccccccchhhhHhhhhhhchhHHHHHHHHHHHHHHHHHHhhhhccCCc Confidence 999999999887889999987543322221221111111 111111 01111110 00112344 Q ss_pred cEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceE Q lcl|NC_011142. 219 NTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNL 298 (343) Q Consensus 219 ~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~ 298 (343) +.++++|..+..+.--+...+ .++.+..+-+.|..+-.. ...+. ..+-.|--.+++++++. -+ T Consensus 274 ~i~~~n~~~~~~~~p~~~~~~----------~~G~~~t~lg~p~~vv~s--~~~p~---~~i~fgdf~~Y~i~~r~--~~ 336 (377) T protein:vir:98 274 VKLILNPEDRWALEAQFTSRN----------QFGEYVTVLPHGITILES--LAVET---GKAIAFVANRYDAFMAT--AS 336 (377) T ss_pred eEEEecccchhhccccccccC----------CCCccccccCCCceEEec--CCCCc---ccEEEEEecceeEEeec--ce Confidence 556667765544431111000 011111111222111000 00000 00000001112332221 11 Q ss_pred EEeeccchhcccceecCceeEeeeeeeeeeEEEECcceeeeec---c Q lcl|NC_011142. 
299 ALAKPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVD---M 342 (343) Q Consensus 299 ~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d---G 342 (343) ++..-....+ .++ ...+....|.+| .++.|.|++.++ | T Consensus 337 ~i~~~~~~~~----~~d-~~~f~~~~r~dg-~~~~~~a~~vl~i~~~ 377 (377) T protein:vir:98 337 TIEEYDQTFA----MED-LQLYLTKNYFYG-KAKDNHTAALLTLAGG 377 (377) T ss_pred EEEeechhhh----hcC-ceEEEEEEEEcC-EEeccCcEEEEEEecC Confidence 1111000000 011 233455667654 778888877665 3 No 144 >protein:vir:94576 Length: 347 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1516 # MgeName: Berlin # Cross-refs: genbank:acc:YP_919012;genbank:gi:119637776;genbank:GeneID:5179336 Probab=85.86 E-value=0.05 Score=27.73 Aligned_cols=308 Identities=9% Similarity=0.013 Sum_probs=127.2 Q ss_pred hhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHH-HHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeee Q lcl|NC_011142. 10 AQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQL-ASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSY 88 (343) Q Consensus 10 ~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l-~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~ 88 (343) ..-+++-++++ .+++.|.. .++....|+ ++| -++++++-+. -..+.++.+++ +- +-.++.+. T Consensus 1 ma~~~~~~~~~----t~~g~~~~----~~d~~al~i-e~~~geV~~~f~~~----s~~~~~~~~rt-i~-~G~sv~~~-- 63 (347) T protein:vir:94 1 MANMNGGQQMG----KDQGKGMS----AGDKLALFL-KVFGGEVLTAFTRT----SVTMNKHLVRS-IQ-SGKSAQFP-- 63 (347) T ss_pred CCccccccccc----cccccCCc----ccchHHHHH-HHHhHHHHHHHHHH----Hhhhhhhhhee-cc-ccceEEee-- Confidence 11111112222 12222211 122233455 544 4666655332 23444444432 22 22334333 Q ss_pred cccccceeec-CCcCcc--ceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeee- Q lcl|NC_011142. 89 DGAAMGKFIS-ANASDL--PRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYF- 164 (343) Q Consensus 89 ~~~G~a~~~~-~~~~di--p~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~- 164 (343) ..|..+... ..+.++ |..+....+....|-.. 
.-++.-+.++..+ ++..++...-...+..++++..|+.++- T Consensus 64 -~iG~~~~~~~~~G~~l~~~~~~~~~~e~~ltID~~-~y~~~~VddiD~~-q~~~D~rs~~~~~~g~ALA~~~D~~i~~~ 140 (347) T protein:vir:94 64 -VLGRTKAAYLQPGENLDDKRKDMKHTEKTINIDGL-LTADVLIYDIEDA-MNHYDVRSEYTAQLGESLAMAADGAVLAE 140 (347) T ss_pred -eccceeEeeeecCcCCCCCcCCccccceEEEEcch-hhhhhhhhhHHHH-hcCcchHHHHHHHHHHHHHHHHHHHHHHH Confidence 445554321 112222 22333344444333322 1223345677764 4566777888888999999999987652 Q ss_pred ---e-e----hhhcceeeeecCCccccccCcCc--cccCHHHHHHHHHHHHHHHHHhcCCee-cccEEEecHHHHHHHhc Q lcl|NC_011142. 165 ---G-D----TNRNMSGLLNNPNVTKTSATVNY--ATCTGQELFDLLNNPVFAVVKASKRFH-TPNTVLMFPDLWKRASS 233 (343) Q Consensus 165 ---G-~----~~~g~~GLlN~p~v~~~~~~~~w--~~~t~~~i~~di~~~~~~l~~~s~g~~-~p~tL~l~p~~~~~L~~ 233 (343) + . ......|..-.-.+......+.+ ..++++.+++-|.++...|.++ .+- .+..++++|+.|..|.+ T Consensus 141 l~~~a~~~~~~~~~~~g~~~~~~v~i~~~~~~~~~~~~~~~~~~d~i~~a~~~Lde~--dVP~~~R~~vv~P~~y~~LLk 218 (347) T protein:vir:94 141 MAKLCNLPTANNENIAGLGKAHVLEVGDQATLQGDQVKLGQAIIAQLTLARAKLTGN--YVPSSDRVFYTTPDNYSAILA 218 (347) T ss_pred HHHhhccccccccccccCCcceeEeeeccccccccccccHHHHHHHHHHHHHHhhhc--CCCCCCCEEEeChHHHHHHHH Confidence 1 1 11111221100011111111111 2356889999999999888764 332 35789999999988875 Q ss_pred cccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccC--Cc------------------cceEEEEEc Q lcl|NC_011142. 234 LLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSN--GN------------------KDRYVVYDK 293 (343) Q Consensus 234 ~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~--~g------------------~dr~v~y~~ 293 (343) .......+.....-+. ++-...+.|.++....+.-.....-...+.+. .+ +-..+++. 
T Consensus 219 ~~~~~~~~~~~~~~~~-~G~V~~v~G~~V~~Sn~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~d~~~~~~l~~~- 296 (347) T protein:vir:94 219 ALMPNAANYQALIDPS-TGSIRNVMGFEVIEVPHLTAGGAGDNRAEEGVAPTNQKHAFPDTASGDTRVALDNVVGLFNH- 296 (347) T ss_pred hhcccccccccccccc-cceeEEeeceEEEEcCccccccCcccccccccccccccccccccccccccccccceEEEEec- Confidence 3222222211111111 12222222322221111110000000111100 00 01112221 Q ss_pred ccceEEEeeccchhccc-ceecCceeEeeeeeeeeeEEEECcceeeeec--cC Q lcl|NC_011142. 294 SERNLALAKPIPFRMLA-PQLLGLGITVPAEYKISGTEYRYPLCAQYVD--ML 343 (343) Q Consensus 294 ~~~~~~~~v~~~~~~~~-~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d--GI 343 (343) ++-+.....++++.-. -+.+-..+.+.+.... |+-++||.+.+-.. -= T Consensus 297 -~~A~~tv~~~~~~~e~~~~~~~~~~~i~~~~a~-G~g~~rPe~a~~i~~~~a 347 (347) T protein:vir:94 297 -RSAVGTVKLKDMALERARRANFQADQIIAKYAM-GHGGLRPEACGALVFKKA 347 (347) T ss_pred -hhhhhhhhhcccceeeeechhhhhhhhhhhhhh-cCcccccceeEEEEecCC Confidence 2211111112221111 1222234555566654 78889998875321 11 No 145 >protein:vir:105645 Length: 400 # NCBI annotation: putative major capsid protein # Family: family:all:2806 # MgeID: mge:1674 # MgeName: K1E # Cross-refs: genbank:acc:YP_425009;genbank:gi:83571757;uniprot:Q2WC43;genbank:GeneID:3837286 Probab=85.62 E-value=0.051 Score=27.65 Aligned_cols=302 Identities=8% Similarity=-0.038 Sum_probs=129.9 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |+ -..+...|...--++....|+....-.+++.+-+.. ..+.++.+.+ +. +. 
T Consensus 1 Ms----------------------~~n~~t~p~~~gsg~~~aL~Le~f~GeV~taF~~~s----i~~~~~~vRt-I~-~g 52 (400) T protein:vir:10 1 MS----------------------TPNNLTNVAVSASGEVDSLLIEKFNGKVNEQYLKGE----NIMSYFDVQT-VT-GT 52 (400) T ss_pred CC----------------------CCccccccccccccchhhhHHhHhcchHHHHHHHHh----hhcccceeee-ec-cc Confidence 11 011111122111122334555555557776663322 2344555543 22 22 Q ss_pred eEEEEeeecccccceeec-CCcCccceeeeccceeEEE--EEEEEEEEeecHHHHHHHHHhCCC-ccHHHHHHHHHHHHH Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFIS-ANASDLPRVAQSAKLHQVE--LGYAGVECHYSLDELRTTAAVNMP-IDSMQAELAFRGSEE 156 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~~-~~~~dip~v~~~~~~~~~~--v~~~~~~~~~~~~El~~a~~~g~~-l~~~k~~aA~~~~~~ 156 (343) .++.|. ..|+.+.-. ..+..+-.....-++.... -..+..-+-|.+.|... ..+ +...-....-.++++ T Consensus 53 kS~qf~---~lG~s~a~y~~pG~~ldg~~~~~dk~~ItIDtLL~a~~~V~dlDd~q~----~yD~vRse~s~e~G~ALA~ 125 (400) T protein:vir:10 53 NTVSNK---YLGETELQVLAPGQSPAATSTQADKNQLVIDATVIARNTVAHLHDVQG----DIDSLKPKLATNQAKQLKK 125 (400) T ss_pred ceEEEE---EeeeeEEeeecCCCCcCCCCcccCcEEEEeCceeeecchhhhHHHHhh----ccccccHHHHHHHHHHHHH Confidence 233333 234443211 1111111111222232222 22333334444455443 555 555555666777777 Q ss_pred hhhheee-----eeeh----hhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHH Q lcl|NC_011142. 157 HSQRVAY-----FGDT----NRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDL 227 (343) Q Consensus 157 ~~n~~~f-----~G~~----~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~ 227 (343) ..|+.++ -|.+ ..+..|..-++.....+....=...+++.+...|.++..++.++.=- .....+++||.. 
T Consensus 126 ~~Dq~iiq~i~~a~~a~t~~~~~~~~g~~~g~s~~v~~~~~~~~~~~~~l~~A~~~A~~~LdEkdVP-~~d~vvl~pp~~ 204 (400) T protein:vir:10 126 MEDEMLIQQMLLGGIANTQAKRTNPRVKGHGFSVNVEVNEGEALVNPQYVMAAVEFALEQQLEQEVD-ISDVAILMPWRY 204 (400) T ss_pred HHHHHHHHHHHHhcccccccccccCCccccccceeecccccccccCHHHHHHHHHHHHHHHHhcCCC-ccceEEEcCHHH Confidence 7777543 2211 11222322222211121222223347889999999999988764221 223678899999 Q ss_pred HHHHhcc--ccCCCCCccH-HHHHHhcCcceeeccccccccccceeeech-----hhhccc-------cCCccceEEEEE Q lcl|NC_011142. 228 WKRASSL--LMTGYTDRTV-IEHFQINNAYTLLTRNPIDIKIRFQLMATE-----LAAAGV-------SNGNKDRYVVYD 292 (343) Q Consensus 228 ~~~L~~~--~~~~~~~~tv-le~l~~n~~~~~~~~~p~~i~~~~~l~~~~-----~~~~g~-------~~~g~dr~v~y~ 292 (343) |..|... .++-.++.+- ..+.+.. ...+.|.++..+.+.--.... +-.++. +.-.+-++++|. T Consensus 205 Ys~Ll~~dkLvnrdf~~s~~g~~~~g~--v~~v~Gv~Iv~Sn~lP~~a~~~~~~~lS~a~~G~~y~~t~d~s~~~av~F~ 282 (400) T protein:vir:10 205 FNVLRDADRIVDKSYTISQSGATIQGF--VLSSYNCPVIPSNRFPKYSQGQKHHLLSNEDNGYRYDPIAEMNGAIAVLFT 282 (400) T ss_pred HHHHHhCCcccchhccccCCCccccce--EEEEeceEEEeeCcCCcccCcccccccccCCCCccCCccccccceeEEEEe Confidence 9777542 2221111110 1112111 112333333222221100000 000000 112355677776 Q ss_pred cccceEEEeeccchhcc-cceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
293 KSERNLALAKPIPFRML-APQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 293 ~~~~~~~~~v~~~~~~~-~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++.= --.-.+|++.- --+.+...|.+.+...+ |+..+||.|++.+.=- T Consensus 283 ~sAv--~tvk~~~lt~~~~~d~r~~~~~id~~~a~-G~g~~RPeaa~vv~~~ 331 (400) T protein:vir:10 283 ADAL--LVGRSIDVIGDIFYEKKEKTYYIDTFMSE-GAIPDRWEAVSVVTTK 331 (400) T ss_pred hhhe--EEEEeeccccccccchhhHHHHHHHHHHh-CCcccchhheEEEEec Confidence 5521 11222333332 12445566777777765 7999999999877544 No 146 >protein:vir:4902 Length: 348 # NCBI annotation: gp348 # Family: family:all:1083 # MgeID: mge:107 # MgeName: Sfi11 # Cross-refs: genbank:acc:NP_056680;genbank:gi:9635015;genbank:GeneID:1262657 Probab=85.47 E-value=0.053 Score=27.60 Aligned_cols=280 Identities=11% Similarity=0.014 Sum_probs=105.0 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeeccccc-ceeecCCcCcc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAM-GKFISANASDL 104 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~-a~~~~~~~~di 104 (343) |+.+ --.|+..+++..-..+- .+...+-...+||... .. ..+...+........ +.+++..+... T Consensus 1 M~~l-----------~d~f~~~~l~~~v~~~~-~~~~~~l~~~~Fp~~~-~~-~~~~~~~~~~~~~~~~a~~v~~~~~~~ 66 (348) T protein:vir:49 1 MGLI-----------YDKVTASNIAGYFNALQ-ENVDSTLGESIFPARK-QL-GTKLSYITGASGQSVALKAAAFDTNVT 66 (348) T ss_pred Ccch-----------hhhcCHHHHHHHHHhcc-ccchhhhHhhcCCCcc-cc-CceeEEEEeecCceeeeeeecCCCCcc Confidence 1111 11344444432211111 1223344567777432 11 122222222222221 22333332222 Q ss_pred ceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHH---------------HHHHHHHHhhhheeeeee--- Q lcl|NC_011142. 105 PRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAE---------------LAFRGSEEHSQRVAYFGD--- 166 (343) Q Consensus 105 p~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~---------------aA~~~~~~~~n~~~f~G~--- 166 (343) ..-...++..+..++.+...+.++..|+...+...-+-....+. 
..++..+...-++++.|- T Consensus 67 ~~~r~~~~~~~~~~p~i~~~~~i~~~d~~~l~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~i 146 (348) T protein:vir:49 67 VRDRVSAEMHDEQMPFFKEAMLVKENDRQQLNLVKDSGNAALVNTIVAGIFNDNLTLVNGARARLEAMRMQVLATGKIAF 146 (348) T ss_pred eecccceeeeeeecCccccccccCHHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeEEE Confidence 22233455566777888888888887765444332222111111 122223333334444451 Q ss_pred hhhcc-eee-eecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccH Q lcl|NC_011142. 167 TNRNM-SGL-LNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTV 244 (343) Q Consensus 167 ~~~g~-~GL-lN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tv 244 (343) .+.|. +.+ +..|.-...+.++.|++.++ +++.||.+.+..+.. + |. .|.+++|+++.|..|.+ +..+ T Consensus 147 ~~~g~~~~vdyg~~~~~~~t~~~~W~~~~a-dp~~di~~~~~~~~~-~-G~-~~~~ii~~~~~~~~l~~-------~~~v 215 (348) T protein:vir:49 147 TSDGVNKDIDYGVKPDHKKQVSKSWAEPGA-TPLADLEDAIETARE-L-GL-NPERAVMNAKTFGLIRK-------AAST 215 (348) T ss_pred ecCCceEEEeecCCcccceeeeeccCCCCC-CHHHHHHHHHHHHHh-c-CC-cccEEEeCHHHHHHHhc-------CHHH Confidence 11121 111 22232223344557988665 588999999877754 3 54 78999999999998853 2334 Q ss_pred HHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEc----ccceEEEeeccch-hcccceecC-cee Q lcl|NC_011142. 245 IEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDK----SERNLALAKPIPF-RMLAPQLLG-LGI 318 (343) Q Consensus 245 le~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~----~~~~~~~~v~~~~-~~~~~~~~~-~~~ 318 (343) .+.+...+........ . .....+. ..+|- ..++|+. ..-...-.+|... 
.++|....+ ..| T Consensus 216 ~~~~~~~~~~~~~i~~----~----~~~~~~~----~~~g~-~i~~y~~~y~d~dG~~~~~~p~~~v~l~~~~~~G~~~y 282 (348) T protein:vir:49 216 VKVIKPLAGDGSSVTK----A----ELDNYIA----DNFGV-TVVLENGTYRNEKGEVSKFFPDGHLTLIPNGPLGNTVF 282 (348) T ss_pred HHHhhccCcccccccH----H----HHHHHHH----hhcCc-eEEEEeeEEEecCCcEeeeecCCeEEEecCCCcceeEE Confidence 4444322221111000 0 0000000 00111 1222211 1111112233221 112221111 111 Q ss_pred Eeee---------------eeeeeeEEEE-----Ccceeeee---ccC Q lcl|NC_011142. 319 TVPA---------------EYKISGTEYR-----YPLCAQYV---DML 343 (343) Q Consensus 319 ~~~~---------------~~~~gGv~i~-----~P~ai~~~---dGI 343 (343) -... ...=.|+.++ .|...-.+ ..+ T Consensus 283 g~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dP~~~~~~~~s~~l 330 (348) T protein:vir:49 283 GTTPEESDLFADNTVNADVEIVDNGIAVTTTKTTDPVNVQTKVSMVAL 330 (348) T ss_pred ecChhhhhhccccccccceeecCCeEEEeeeecCCCceEEEEEeeecc Confidence 0000 0000111111 11111111 000 No 147 >protein:vir:99675 Length: 324 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1523 # MgeName: VP4 # Cross-refs: genbank:acc:YP_249589;genbank:gi:68299740;genbank:GeneID:3799990 Probab=85.16 E-value=0.055 Score=27.50 Aligned_cols=260 Identities=9% Similarity=-0.014 Sum_probs=106.9 Q ss_pred hccccCCCCcceeEEEEeeecccccceeec-CCcCccce--eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHH Q lcl|NC_011142. 69 DVPVLANIPEYATHWNYRSYDGAAMGKFIS-ANASDLPR--VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSM 145 (343) Q Consensus 69 ~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~-~~~~dip~--v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~ 145 (343) ++ ..+--| .++.+ +..|..+... ..++++.. -+..-.+....|=. ..-+..-+.|+..++ +..++-.. 
T Consensus 1 ~v---r~i~~g-~s~~~---~~iG~~~~~~~~~G~~l~~~~~~~~~~e~~itID~-~l~~~~~VdDiD~~q-a~~Dlr~e 71 (324) T protein:vir:99 1 MT---RTITSG-KSAQF---PVMGRTKARYLKQGQSLDDGREDIKHTEKVITIDG-LLTTDVLIYDIEDAM-NHYDVRSE 71 (324) T ss_pred Ce---eeeecC-ceEEE---eeeeeeEeccccCCCCcCCCcCCcCcccEEEEecc-hhhhhhhhhhHHHHh-cCccchhH Confidence 11 111111 22222 2335554321 11222211 11222332222211 111234456777754 55778888 Q ss_pred HHHHHHHHHHHhhhheeeee-------ehhhcceeeeecCCcc--ccccCcCccccCHHHHHHHHHHHHHHHHHhcCCee Q lcl|NC_011142. 146 QAELAFRGSEEHSQRVAYFG-------DTNRNMSGLLNNPNVT--KTSATVNYATCTGQELFDLLNNPVFAVVKASKRFH 216 (343) Q Consensus 146 k~~aA~~~~~~~~n~~~f~G-------~~~~g~~GLlN~p~v~--~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~ 216 (343) -.+.+..++++..|+.++-= .+.....+.....+.. ..+.+..=...+++.+++-|.++..+|.++.=-.. T Consensus 72 ~s~~~G~aLA~~~Dq~i~~~~a~~~~~~a~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~dai~~a~~~Lde~~VP~~ 151 (324) T protein:vir:99 72 YSTQMGEALAMAADVANYAEMAKLVNSRKETTNENIEGLGAASLVKITGKKEDPAKYGTQVIQALTYARAAFAKKYIPAG 151 (324) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhhcccccccCCcccCCccceecccccccccccCHHHHHHHHHHHHHHHhhcCCCCC Confidence 88899999999999877511 1111111111111111 11111112345688999999999998876532222 Q ss_pred cccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccc---------------- Q lcl|NC_011142. 217 TPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGV---------------- 280 (343) Q Consensus 217 ~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~---------------- 280 (343) ...++|+|+.|..|..-+...+.+..-.. ...++-...+.|.++....+..-.. 
..+...+ T Consensus 152 -gR~~vv~P~~y~~Ll~~~~~~~~~~~~~~-~~~~G~V~~i~Gf~V~~Sn~lp~~~-~t~~~~a~~~~~~~~~~~~~~~~ 228 (324) T protein:vir:99 152 -DRTFYTDPDTYSAILAALMPNAANYAALI-DPETGNIRNVMGFEVVETPHMTAQM-VTNPTDAFDGTGHIFPATGDSTT 228 (324) T ss_pred -CCEEEeChHHHHHHhhccccccccccccc-ceecceEEEEeceEEEecCCccccc-ccccccccccccccccccccccc Confidence 25699999999988643222211110000 0111211222222221111110000 0000000 Q ss_pred -----cCCccceEEEEEcccc-eEEEeeccchhcccceecCceeEeeeeeeeeeEEEECcceeeeec-------cC Q lcl|NC_011142. 281 -----SNGNKDRYVVYDKSER-NLALAKPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVD-------ML 343 (343) Q Consensus 281 -----~~~g~dr~v~y~~~~~-~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d-------GI 343 (343) +..++-+.++|.++-= +++. ++..++....+ +...+.+...... |+.+.||.+++.+. |+ T Consensus 229 ~~ky~~d~~~~~gl~~~~~a~~tv~~-~~~~~e~~~~~-~~~~d~i~~~~a~-G~~~lRPe~a~~v~l~~~~~~~~ 301 (324) T protein:vir:99 229 TGKMTVGADNVVGLFVHRSAVATLKL-KDMALERARRP-EYQADQIIAKYAM-GHGGLRPEAVGAIIFEDGETPAV 301 (324) T ss_pred ccccccccCceeEEEEehhheEEEee-ecceecceech-hhHHHhhhhhhhh-cCcccccceEEEEEEccCccccc Confidence 0011122223222210 1111 11111222212 2234555555554 78899999887665 33 No 148 >protein:vir:79078 Length: 307 # NCBI annotation: gp8 # Family: family:all:908 # MgeID: mge:1862 # MgeName: phiE255 # Cross-refs: genbank:acc:YP_001111208;genbank:gi:134288798;genbank:GeneID:4960752 Probab=84.91 E-value=0.057 Score=27.41 Aligned_cols=274 Identities=12% Similarity=0.032 Sum_probs=100.0 Q ss_pred ccccCcchhecchhhhhhhhH-HHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceee----cCC Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYI-SQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFI----SAN 100 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~-~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~----~~~ 100 (343) |-++. .+|.. ..|+.+= +.-.-+++.+.++||.. +-..+++.|..++. ...... +-. 
T Consensus 1 m~~~~-----------~~~~~dp~LT~~A---~gy~n~~~Iad~lfP~v---pV~~~~~k~~~f~~-e~f~~~~t~ra~~ 62 (307) T protein:vir:79 1 MGRLS-----------KLRIVDPVLTNLA---IGYTNAEFIGQTLMPVV---EVEKEGGKIPKFGK-ESFRLYQTERALR 62 (307) T ss_pred CCCCC-----------CCcccCHHHHHHH---hhccchhhhhhhcCCcc---cccccccceeeecc-ccccccccccccC Confidence 11111 12221 1234332 22224568888898864 33444444444431 111100 000 Q ss_pred cCccceee-eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHH----HHHHHhhhheeeeeehhhcceeee Q lcl|NC_011142. 101 ASDLPRVA-QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAF----RGSEEHSQRVAYFGDTNRNMSGLL 175 (343) Q Consensus 101 ~~dip~v~-~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~----~~~~~~~n~~~f~G~~~~g~~GLl 175 (343) + +-..++ ..++.....+.......-...+ .....+.++.....+... +..+...-+++|.+.. | T Consensus 63 ~-~~~~v~~~~~~~~~~~~~~~~l~~~id~r---~~~~~~~~~~~~Av~~l~d~I~l~~E~~~A~l~~~~~~----y--- 131 (307) T protein:vir:79 63 A-KSNRMNPEDIDSVDVNLDEHDLEYPIDYR---EDQESAFPLEQAAVQTATDAIQLRREKMIADLSQNPSS----Y--- 131 (307) T ss_pred C-Ccceeeeeccccccccccccchhhcccch---hcCCCCCCHHHHHHHHHHHHHHhHHHHHHHHHhccccc----c--- Confidence 0 001111 0111111122221111111111 122334444443333332 3333344455554332 1 Q ss_pred ecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcce Q lcl|NC_011142. 176 NNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYT 255 (343) Q Consensus 176 N~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~ 255 (343) ...+.-+.+.+..|+.+++ +++.||.+.+.++...+ ...|++++|+++.|..|.+ +..+++.|+-++. 
T Consensus 132 ~~~~k~tLsgt~~Wsd~~s-DPi~di~~~~~ai~~~~--g~~Pn~~vlg~~a~~~l~~-------h~~i~~~lk~~~~-- 199 (307) T protein:vir:79 132 AAGNKKQLSATEKFTAANS-DPVGVIEDGKEAIRTKI--GRRPNTMVIGASAYKTLKA-------HPQLIEKIKYSMK-- 199 (307) T ss_pred CCCceEEEccCcccCCCCC-CcHHHHHHHHHHHHHhh--CCccceEEeCHHHHHHHhc-------CHHHHHHhcCccc-- Confidence 1112223334456988765 56899999999888753 3579999999999998863 2234444432221 Q ss_pred eeccccccccccce------eee--chhhhccccCCccceEEEEEcc-cceEEEeeccchhcccc-eecCceeEeeeeee Q lcl|NC_011142. 256 LLTRNPIDIKIRFQ------LMA--TELAAAGVSNGNKDRYVVYDKS-ERNLALAKPIPFRMLAP-QLLGLGITVPAEYK 325 (343) Q Consensus 256 ~~~~~p~~i~~~~~------l~~--~~~~~~g~~~~g~dr~v~y~~~-~~~~~~~v~~~~~~~~~-~~~~~~~~~~~~~~ 325 (343) .+.. |.......+ ..+ ....+...-.-|.+..++|... +.+-.-.+-+| ++... +.++.-...++.+ T Consensus 200 g~it-~~~la~l~~v~~V~vg~a~y~~~~~~~~~iw~~~~~l~y~~~~~~~~~~~~~~p-s~Gyt~~~~g~~~~d~~~~- 276 (307) T protein:vir:79 200 GIVT-VDLLKEIFEVENIAVGEAIYADDKDRFTDIWGANIVLAYVPLQRGGQQRTPYEP-SYGYTLRKKGNPVVDTRIE- 276 (307) T ss_pred cccC-HHHHHHHhCceeEEEeeeeeecccccchhcCCCceEEEecccccCCCCCccccc-ccceeEEecCceEEecccC- Confidence 1100 000000000 000 0000000000133555555421 11111011111 11111 1122222222332 Q ss_pred eeeEEEE------CcceeeeeccC Q lcl|NC_011142. 326 ISGTEYR------YPLCAQYVDML 343 (343) Q Consensus 326 ~gGv~i~------~P~ai~~~dGI 343 (343) -+|++++ .|.-++.--|. T Consensus 277 ~~~~~~vrv~~~~~~~i~~~~~G~ 300 (307) T protein:vir:79 277 DGKLELVRATDIFRPYLLGADAGY 300 (307) T ss_pred CCceeEEeecccccceeeccccch Confidence 2333332 33333333333 No 149 >protein:vir:94622 Length: 341 # NCBI annotation: PfWMP4_37 # Family: family:all:2203 # MgeID: mge:1525 # MgeName: Pf-WMP4 # Cross-refs: genbank:acc:YP_762667;genbank:gi:115304375;genbank:GeneID:5142322 Probab=84.58 E-value=0.059 Score=27.31 Aligned_cols=296 Identities=10% Similarity=-0.038 Sum_probs=115.3 Q ss_pred hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccC-CCCcceeEEEEeeecccccceeecC Q lcl|NC_011142. 
21 KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLA-NIPEYATHWNYRSYDGAAMGKFISA 99 (343) Q Consensus 21 ~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~-~~~~~~~~~~~~~~~~~G~a~~~~~ 99 (343) |++ -.+++.|++.+ ...--|-.+.|.. .+.+.....+.++.++.-.. +... -.++.++..- ...++.+.. T Consensus 1 ~~~--~~~~~~~~~~t--~~v~~fipei~s~---~i~~~l~~~~v~~~~~~d~~~~~~~-Gdtv~ip~~g-~~~~~d~~~ 71 (341) T protein:vir:94 1 MAL--GNTITGPSINT--QRGQQFIPEQWLS---EVQMFRKAKMLDTSVVKTWGAQVKK-GDTFHVPRIS-ELGVEDKAT 71 (341) T ss_pred Ccc--hhhhccccccc--hhHHHHHHHHHHH---HHHHHHHhhcchhhccccccccccC-CceEEEeccC-cceeeeecC Confidence 111 11233444432 2233344444542 35555555666666653221 1111 2456555432 333444422 Q ss_pred CcCccceeeeccceeEEEEEE-EEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecC Q lcl|NC_011142. 100 NASDLPRVAQSAKLHQVELGY-AGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNP 178 (343) Q Consensus 100 ~~~dip~v~~~~~~~~~~v~~-~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p 178 (343) +.+++..+.+-.+....+-. ...++.++ +++..+ ...++-.+-...+..++++..|+.++--.+....... + T Consensus 72 -~~~i~~~~~~~~~~~itiD~~~~~~~~i~--d~d~~~-~~~d~~~~~~~~~~~aLA~~~D~~i~~~~a~~~~~~~---~ 144 (341) T protein:vir:94 72 -DVPVGVQPVNDTDFVITVDTDRTTAVALD--DLLEIQ-ASYDLRAPYLEAMGYALAKDMTGSILGLRAAVQNTAS---Q 144 (341) T ss_pred -CCccccccccCceEEEEEeeeeecceeec--hHHHHh-hccchHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc---C Confidence 23455555555555555522 34555554 555433 3557777777888888988888876532221111100 1 Q ss_pred CccccccCcCcc-ccCHH-HHHHHHHHHHHHHHHhcCCee-cccEEEecHHHHHHHhccccCCCCCccHHHHH----Hhc Q lcl|NC_011142. 179 NVTKTSATVNYA-TCTGQ-ELFDLLNNPVFAVVKASKRFH-TPNTVLMFPDLWKRASSLLMTGYTDRTVIEHF----QIN 251 (343) Q Consensus 179 ~v~~~~~~~~w~-~~t~~-~i~~di~~~~~~l~~~s~g~~-~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l----~~n 251 (343) + +.. +.+.. +.+++ -.++.|.++...|.+. ++- ....|+++|..|..|.+- + ..+-.++. 
.++ T Consensus 145 ~-~~~--~~~~~~t~~~~~~~~~~i~~a~~~Lde~--~VP~~gR~lvv~P~~~~~Ll~~---~--~~~~~~~~g~~~l~~ 214 (341) T protein:vir:94 145 N-VFS--SSNGAITGNGQAFSFAVFLAARRLLLEA--DVPEEKIVLLISPGQESALFTI---P--QFISKDFINNAPIAQ 214 (341) T ss_pred c-ccc--CccccccCchhhhhHHHHHHHHHHHhhc--CCCccCCEEEeCHHHHHHHhhc---h--hhhhhhccccchhhe Confidence 1 000 01111 11122 2356666676666543 321 235699999999999641 1 11111111 112 Q ss_pred Ccceeeccccccccccceeeechhhhcccc----------------------CCccceEEEEEccc-ceEEEeeccchhc Q lcl|NC_011142. 252 NAYTLLTRNPIDIKIRFQLMATELAAAGVS----------------------NGNKDRYVVYDKSE-RNLALAKPIPFRM 308 (343) Q Consensus 252 ~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~----------------------~~g~dr~v~y~~~~-~~~~~~v~~~~~~ 308 (343) +....+-|.++......-.........+.+ ..+.-+.+++.++. -.+++.-|+-++. T Consensus 215 G~ig~i~G~~V~~Sn~lp~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~gl~~~~~av~~~k~~~~~~~~~ 294 (341) T protein:vir:94 215 GQIGSLMGVRVIRTSLIGNNSATGWRNGAPTIAPAEATPGFTGSRYLPKQDSFTSLPATFTGNSRPVHTAVMCHMDWAAA 294 (341) T ss_pred eeeeeEeceEEEEeccccccccccccccccceecccccccccccccccccccccccEEEEEEecccccceeeecchhhhc Confidence 222222222221111100000000000000 00011111111111 1111111111111 Q ss_pred ccceec---------CceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 309 LAPQLL---------GLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 309 ~~~~~~---------~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ..++.. .....+.... 
.-|+-+.||.+++.+-=- T Consensus 295 ~~~~~~~~~~~~~~~~~~~~i~~~~-~~G~~~lrp~~~v~~~~~ 337 (341) T protein:vir:94 295 VVSKAPRVTQSFENREQVWLMVGRQ-AYGARLYRPLHAVNIHTT 337 (341) T ss_pred cccccccccccchhhhhhhhhhhhh-hhcccccCcceeEEEecC Confidence 111111 1112222223 236777777776543322 No 150 >protein:vir:6378 Length: 346 # NCBI annotation: capsid protein E # Family: family:all:1021 # MgeID: mge:133 # MgeName: BcepNazgul # Cross-refs: genbank:acc:NP_918991;genbank:gi:34610166;genbank:GeneID:2559600 Probab=81.63 E-value=0.084 Score=26.48 Aligned_cols=287 Identities=10% Similarity=0.025 Sum_probs=104.4 Q ss_pred cchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeeccccc-ceeecCCcCccceeeecccee Q lcl|NC_011142. 36 NDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAM-GKFISANASDLPRVAQSAKLH 114 (343) Q Consensus 36 ~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~-a~~~~~~~~dip~v~~~~~~~ 114 (343) || .|+..+|+..-.+ .+...+-..++||-.. ......+.+...++.-. +..++......+.-...++-. T Consensus 1 ~d-----~f~~~~l~~~i~~---~p~~~~l~~~~fp~~~--~~~t~~i~i~~~~g~~~la~~v~~~~~~~~~~~~g~~~~ 70 (346) T protein:vir:63 1 ME-----IFDTLTLAGVIQS---GPALSMYWQGFYPNEI--TFDTDEILFDLVFKDKKLAPFVAPNVQGRVIAARGYTTK 70 (346) T ss_pred CC-----ccCHHHHHHHHHh---cCCccchhhhcCcccc--ccccceEEEEEecCceeeeeeecCCCCcceecccceeee Confidence 33 5666666543222 2344555667777422 23445555655544222 233343333333323223333 Q ss_pred EEEEEEEEEEEeecHHHHHHHHHh------CCCccH-------HHHHHHHHHHHHhhh----heeeeee---hhhcceee Q lcl|NC_011142. 115 QVELGYAGVECHYSLDELRTTAAV------NMPIDS-------MQAELAFRGSEEHSQ----RVAYFGD---TNRNMSGL 174 (343) Q Consensus 115 ~~~v~~~~~~~~~~~~El~~a~~~------g~~l~~-------~k~~aA~~~~~~~~n----~~~f~G~---~~~g~~GL 174 (343) ...++.+.....++..|+...+.. +.+... ++....++.++..++ +.+..|. .+-++.-. 
T Consensus 71 ~~~~p~i~~~~~i~~~d~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~i~~~~E~m~~~al~~gki~~~g~~~~~~ 150 (346) T protein:vir:63 71 TFRPAYVKPKDVINPNRTLKRRAGEQPIIGGMSLQERFQAVVADSQLEQRQRIENRIEWMCAMATIYGYVDVVGEAFPMQ 150 (346) T ss_pred EeecCccCccceeCHHHHHHHhhhhhhccCCcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEEeeCCceeEE Confidence 445566777778888887653321 111111 122222333332222 2222331 11111111 Q ss_pred -ee--cC--CccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHH Q lcl|NC_011142. 175 -LN--NP--NVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQ 249 (343) Q Consensus 175 -lN--~p--~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~ 249 (343) .+ -| +....+.+..|+..++ +++.||.+.+..+...++ ..|.+++|+++.|..|.+ +..+.+.+. T Consensus 151 ~vdfg~~~~~~~~lt~~~~W~~~~a-dp~~di~~~~~~~~~~~g--~~~~~~i~~~~~~~~l~~-------~~~v~~~~~ 220 (346) T protein:vir:63 151 RVDFGRDPALTVQLTGGAAWDQATS-DPLGNIQTMRTTAWKKSN--STITRLTMGLDAWSLFSQ-------KPAVVELLN 220 (346) T ss_pred EEeeCCCccceeeecccccCCCCCC-CHHHHHHHHHHHHHHccC--CceEEEEECHHHHHHHhc-------CHHHHHHHh Confidence 11 11 1112345567987665 478999999988876533 468899999999998853 223333332 Q ss_pred hcC-ccee-ec------ccccc----c-------cccceeeechh-hhccc--cCCccceEEEEEccc-ceEEEeeccch Q lcl|NC_011142. 250 INN-AYTL-LT------RNPID----I-------KIRFQLMATEL-AAAGV--SNGNKDRYVVYDKSE-RNLALAKPIPF 306 (343) Q Consensus 250 ~n~-~~~~-~~------~~p~~----i-------~~~~~l~~~~~-~~~g~--~~~g~dr~v~y~~~~-~~~~~~v~~~~ 306 (343) -++ .... +. +.... + ....+.....+ +..|. ..--.|.++.+.... -.+...-+.++ T Consensus 221 ~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~gi~i~~y~~~y~d~~G~~~~~ip~~~v~~~p~~~~g~~~yg~~~d~ 300 (346) T protein:vir:63 221 LFYKGSTSDFNRSRLDDGSPVQYQGTIGGYNGMGTLELYTYHDTYTGDDNTEQEILGSYDVVGTGPGLQGTQCFGAIMDF 300 (346) T ss_pred hhccccccccchhhcccchhhhhhhhHhhhhccCCeEEEEeccEEEcCCCceeccccCCeEEEEecCCcceEEEeecccc Confidence 110 0000 00 00000 0 00000000000 00000 000012222221110 00101000000 Q ss_pred hc-------cc---ceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
307 RM-------LA---PQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 307 ~~-------~~---~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) .. .+ .+.+.....+...++- =.++.+|.++..++== T Consensus 301 ~~~~~~~~~~~~~~~~~dp~~~~~~~~s~p-lPv~~~p~~~~~~~V~ 346 (346) T protein:vir:63 301 KNGLVPTRMFPKMWEEEDPSVAMLMTQSAP-LMVPAQPNASFRMTVK 346 (346) T ss_pred ccCcccceeeeEEEEecCCCEEEEEEeeec-cceecCCCcEEEEEeC Confidence 00 00 0000001111111000 0011222221111000 No 151 >protein:vir:8885 Length: 347 # NCBI annotation: major capsid protein A # Family: family:all:975 # MgeID: mge:161 # MgeName: gh-1 # Cross-refs: genbank:acc:NP_813774;genbank:gi:29366729;genbank:GeneID:1258837 Probab=80.95 E-value=0.09 Score=26.31 Aligned_cols=310 Identities=10% Similarity=0.003 Sum_probs=126.2 Q ss_pred hhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHH-HHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeee Q lcl|NC_011142. 10 AQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLA-SLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSY 88 (343) Q Consensus 10 ~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~-~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~ 88 (343) -.-+.+.++++ +.++-|.. .+|....|. ++|. +++...-+. -..+.++.+++ +-.| .++.+. T Consensus 1 ~a~~~~~~~~~----~~~g~~~~----~~d~~al~i-e~~~geV~~~f~~~----s~~~~~~~~r~-i~~G-~sv~~~-- 63 (347) T protein:vir:88 1 MANATGGQQIG----ANQGKGQS----AADKLALFL-KVFGGEVLTAFVRR----SVTMDKHMVRT-IQNG-KSASFP-- 63 (347) T ss_pred CCCcccchhhh----ccCCCCcc----ccchHHHHH-HHHHHHHHHHHHHH----hhhhhcccccc-ccCc-ceEEEe-- Confidence 11122333332 22222211 123223444 6654 666544332 24455555543 2223 334333 Q ss_pred cccccceeec-CCcCcc--ceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeee Q lcl|NC_011142. 89 DGAAMGKFIS-ANASDL--PRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFG 165 (343) Q Consensus 89 ~~~G~a~~~~-~~~~di--p~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G 165 (343) ..|..+... 
..+.++ |..+..-.+....|-.+- -+..-+.++..+ +...++-..-...+..++++..|+.++-- T Consensus 64 -~iG~~~~~~~~~g~~l~~~~~~~~~~~~~i~ID~~~-y~~~~Vdd~D~~-q~~~D~r~~~~~~~g~aLA~~~D~~i~~~ 140 (347) T protein:vir:88 64 -VMGRTKGYYLAPGENLDDKRKDIKHSEKVIQIDGLL-TSDVLIYDIEDA-MNHYDVRAEYSAQLGEALAIAADGAVLAE 140 (347) T ss_pred -eecceeeeeeccccCCCCCCCCCccceEEEEEechh-hhhhhhhhHHHH-hhcCCchHHHHHHHHHHHHHHHHHHHHHH Confidence 344444311 112222 223333444444443321 123345666663 44566777778889999999999887522 Q ss_pred e------h---hhcceeeeecCCccccccCc-CccccCHHHHHHHHHHHHHHHHHhcCCee-cccEEEecHHHHHHHhcc Q lcl|NC_011142. 166 D------T---NRNMSGLLNNPNVTKTSATV-NYATCTGQELFDLLNNPVFAVVKASKRFH-TPNTVLMFPDLWKRASSL 234 (343) Q Consensus 166 ~------~---~~g~~GLlN~p~v~~~~~~~-~w~~~t~~~i~~di~~~~~~l~~~s~g~~-~p~tL~l~p~~~~~L~~~ 234 (343) - + +..+.|+-....++..++.. .=+.++++.+++.|.++...|.++ .+- ....++|+|+.|..|... T Consensus 141 l~~~a~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~a~~~Lde~--~VP~~gR~~vv~P~~y~~Ll~~ 218 (347) T protein:vir:88 141 MAKLCNLPAASNENIAGLGQAVVLNIGAAADLVDVEARGKAILKGLTLARARLTKN--YVPAGDRRFYCAPEDYSAILSA 218 (347) T ss_pred HHHhhccccccccccCCccccccccccccccccchhhhHHHHHHHHHHHHHHHhhc--CCCCCCCEEEeCHHHHHHHhcc Confidence 1 1 11233321111111111111 112345677888899998888654 332 246799999999888653 Q ss_pred ccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEE--------EEEcccce-EEEee--- Q lcl|NC_011142. 235 LMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYV--------VYDKSERN-LALAK--- 302 (343) Q Consensus 235 ~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v--------~y~~~~~~-~~~~v--- 302 (343) ......+.....-+. ++....+.|-.+....+.........+.+.+..++.... -|.-+..+ +.+.. 
T Consensus 219 ~~~~~~~~~~~~~~~-~G~vg~i~G~~V~~s~nlp~~~~~~~~~~~~~~~t~~~~~~~~~~~~~~~~d~~~~~~l~~~~~ 297 (347) T protein:vir:88 219 LMPNAANYAALIDPE-TGNIRNVMGFEVIEVPHLTVGGAGDNNPADGVAPTNQKHIFPATATGDDRVAQNNVVGLFNHRS 297 (347) T ss_pred hhhhhhhhccccchh-cceeeeeccceEEEeecccccccccccccccccccccccccccccccccccccCcEEEEEechh Confidence 322221111111111 111111222211111110000000000000000000000 01111111 11111 Q ss_pred ------ccchhccc-ceecCceeEeeeeeeeeeEEEECcceeeee-ccC Q lcl|NC_011142. 303 ------PIPFRMLA-PQLLGLGITVPAEYKISGTEYRYPLCAQYV-DML 343 (343) Q Consensus 303 ------~~~~~~~~-~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~-dGI 343 (343) .++++.-. -.++...+.+.+.... |+.+.||.+++.+ ..- T Consensus 298 a~g~v~~~d~~~e~~r~~~~~~d~i~~~~~~-G~~~~rPe~a~~~~~~~ 345 (347) T protein:vir:88 298 AVGTVKLKDMALERARRPEFQADQIIGKYAM-GHGGLRPEAAGALVFTP 345 (347) T ss_pred hhhheecccceeeeeechhhHHHHhhhhhhh-cCceeccceEEEEEeCC Confidence 11111110 1122345566666655 7999999866433 333 No 152 >protein:vir:79928 Length: 393 # NCBI annotation: major head protein # Family: family:all:30335 # MgeID: mge:1874 # MgeName: 0305phi8-36 # Cross-refs: genbank:acc:YP_001429616;genbank:gi:156564106;genbank:GeneID:5525693 Probab=80.45 E-value=0.095 Score=26.20 Aligned_cols=311 Identities=12% Similarity=0.082 Sum_probs=145.1 Q ss_pred CCc--ceeccchhhhhch-------hhhch--hcccccc-cCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchh Q lcl|NC_011142. 1 MSE--KRVVIDAQTIAGN-------RWLNK--FLDSNAT-IGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLE 68 (343) Q Consensus 1 ~~~--~~~~~~~~~~~~~-------~~~~~--~~~~~~~-~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~ 68 (343) |.| ---+.|+..++=| +++-. +.+.+.+ +.+-.+++.++ +.++.. .-|...+.|++.|-.-... 
T Consensus 28 me~~et~~e~~~~~~~~~~~e~el~E~f~Kmm~G~~p~~eV~~~e~mtt~~--a~IliP--~vis~v~~Eaaepl~~~~k 103 (393) T protein:vir:79 28 MERGETLAEADANKLALNEEETQILESFAKMMEGETPTNEVNLREFMATPS--AQILIP--RVIVGTMREAAEPLYIGTK 103 (393) T ss_pred hhhhhhhhhhhhhhhhcchhHHHHHHHHHHHhcCCCchhheehhhhhcCCC--cceech--hhhhhhhhhcccchhHHHH Confidence 432 1223333333333 23321 1111110 22223333333 333333 4566666665554444444 Q ss_pred hccccCCCCcceeEEEEeeecccccceeecCCcCccceee---eccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHH Q lcl|NC_011142. 69 DVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVA---QSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSM 145 (343) Q Consensus 69 ~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~---~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~ 145 (343) ++...+ ...| ++..|..+. +=++..+++++. +|..+ ...+.....+-..+....|+.+=+. ..|.++-.- T Consensus 104 l~qk~~-L~~G-rsm~F~~~g-~~Ra~~IgEGgE-~~~~sld~~T~dsv~~~~gK~G~~Ia~SqEmIs---DSg~Dvin~ 176 (393) T protein:vir:79 104 MLQKIR-LKSG-QSMIFPSIG-IMRAYDVAEGQE-IPEDSIDWQTHESPEIRVGKSGIRLRFTDEMIS---DSQWDLMSM 176 (393) T ss_pred HHHHHh-hhcC-cceeccchh-eeeecccccccc-ccccchhhhcCCceeEEechhhhhhhhHHHHhh---cchHHHHHH Confidence 443211 1112 111111110 111112222211 22222 2233444555666666777665554 458888888 Q ss_pred HHHHHHHHHHHhhhheeeeeehhhcc---eeeeecCCccccccCcCccc-cCHHHHHHHHHHHHHHHHHhcCCeecccEE Q lcl|NC_011142. 146 QAELAFRGSEEHSQRVAYFGDTNRNM---SGLLNNPNVTKTSATVNYAT-CTGQELFDLLNNPVFAVVKASKRFHTPNTV 221 (343) Q Consensus 146 k~~aA~~~~~~~~n~~~f~G~~~~g~---~GLlN~p~v~~~~~~~~w~~-~t~~~i~~di~~~~~~l~~~s~g~~~p~tL 221 (343) ...+|-++++++.+..+|+|.+..|- .|+...|-...+.- +..+ -.+-=-++|+.+++-++.. 
..+.|.+| T Consensus 177 ~l~aA~RaMaRkKee~a~n~fk~~ghtvfDa~st~t~ahptGr--~~~~~qNGTlSleDllDm~~av~~---~hyt~svi 251 (393) T protein:vir:79 177 MIKQAGRAMGRHKEQKAYHQFRSHGHTVFDNYSTNKLAHTTGL--DKNGVQNDTFSAEDFLDLIIAVMA---NEYTPSDL 251 (393) T ss_pred HHHHHHHHHHhhhHHHHHhhhhcccceeeeccccCccceeecC--CccccccccccHHHHHHHHHHHhc---ccCCcceE Confidence 89999999999999999999998875 45444443222111 1111 0111235677777655542 45689999 Q ss_pred EecHHHHHHHhccc------cCC--CCCc--------cHHHHHHhcCcceeeccccccccccceeeechhhhccccCCcc Q lcl|NC_011142. 222 LMFPDLWKRASSLL------MTG--YTDR--------TVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNK 285 (343) Q Consensus 222 ~l~p~~~~~L~~~~------~~~--~~~~--------tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~ 285 (343) .|.|-.|+.+.+.. .+. +++. -.-+-|+..- |-+.-+....+++.... . T Consensus 252 ~MHPLAWnv~AKna~me~~~~na~gN~~~~~~~ts~algp~~i~~~~--------~~nlnv~~sPfvp~d~k-------~ 316 (393) T protein:vir:79 252 MMHPLAWTVFAKNELMGSLQANPYGNYPAKGAPSSMALGPDSIQGRL--------PFNFNVNLSPFIPLDKK-------S 316 (393) T ss_pred EEcCchhhhhhhhhhhcceeeccccccCccccchhhhhchhhhcccc--------ccceeEEEecccccccc-------c Confidence 99999998875421 110 0100 0111122111 11111112222222111 3 Q ss_pred ceEEEEEcccceEEEeeccc-hhcccceecCc-eeEeeeeeeeeeE-EEECcceeeeeccC Q lcl|NC_011142. 286 DRYVVYDKSERNLALAKPIP-FRMLAPQLLGL-GITVPAEYKISGT-EYRYPLCAQYVDML 343 (343) Q Consensus 286 dr~v~y~~~~~~~~~~v~~~-~~~~~~~~~~~-~~~~~~~~~~gGv-~i~~P~ai~~~dGI 343 (343) .|+=.|.=|+.++..-+.-+ ++.-.-+-+.- -..++..+|. 
|+ ++---.+|+....| T Consensus 317 ~rFd~~~Vd~NnvgvlLV~D~i~tdq~ddk~rdiq~iKl~ERY-G~gvLn~gkaiavakNI 376 (393) T protein:vir:79 317 RRFDVYAVDRNNVGVLLVRDDLKTDQWDEKARGLQNIKMIERY-GIGILNEGKAIAVAKNI 376 (393) T ss_pred ceeeEEEeecCCceEEEEecCcceeccccccccceeeeeeeee-ceeeeeCCceEEEEecc Confidence 46666666666666544222 22222221111 2466777776 45 77777888888888 No 153 >protein:vir:96490 Length: 348 # NCBI annotation: head protein # Family: family:all:1083 # MgeID: mge:1620 # MgeName: 2972 # Cross-refs: genbank:acc:YP_238492;genbank:gi:66391768;genbank:GeneID:5176912 Probab=80.11 E-value=0.098 Score=26.12 Aligned_cols=279 Identities=11% Similarity=0.040 Sum_probs=100.7 Q ss_pred ccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeee-ccccc-ceeecCCcCc Q lcl|NC_011142. 26 NATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSY-DGAAM-GKFISANASD 103 (343) Q Consensus 26 ~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~-~~~G~-a~~~~~~~~d 103 (343) |+.+ --.|+..+++..-+.+ ..+...+-..++||... .. . ..+.+... ..... +.+++..+.. T Consensus 1 M~~i-----------~d~f~~~~l~~~i~~~-~~~~~~~l~~~~Fp~~~-~~-~-~~~~~~~~~~~~~~~a~~v~~~~~~ 65 (348) T protein:vir:96 1 MGLI-----------YDKVTASNIAGYFNTL-QENVDSTLGESIFPARK-QL-G-TKLSYIKGASGQSVALKAAAFDTNV 65 (348) T ss_pred Ccch-----------hhccCHHHHHHHHHhc-ccchhhhhhhhcCCCcc-cc-c-eeEEEEeecCCceeEeeeecCCCCc Confidence 1111 1134444443221111 11233445567777432 11 1 11222111 11111 2233333222 Q ss_pred cceeeeccceeEEEEEEEEEEEeecHHHHHHHHH---hCCCccHHH--------HHHHHHHH----HHhhhheeeeee-- Q lcl|NC_011142. 104 LPRVAQSAKLHQVELGYAGVECHYSLDELRTTAA---VNMPIDSMQ--------AELAFRGS----EEHSQRVAYFGD-- 166 (343) Q Consensus 104 ip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~---~g~~l~~~k--------~~aA~~~~----~~~~n~~~f~G~-- 166 (343) ...-...++.....++.+.....++..|+..... .+.+-..+. 
....++.+ +...-+++++|- T Consensus 66 ~~~~r~~~~~~~~~~p~i~~~~~i~~~d~~~l~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~ 145 (348) T protein:vir:96 66 TIRDRVSAEIHDEQMPFFKEALLVKENDRQQLNLVKDTGNEALINTIVAGIFNDDVTLINGARARLEAMRMQVLATGKIA 145 (348) T ss_pred ceecccceeeeeeecCccccccccCHHHHHHHHhhhccCCchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCeeE Confidence 2222333556666777777778888777654322 222211111 11111222 222234444551 Q ss_pred -hhhcce-ee-eecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCcc Q lcl|NC_011142. 167 -TNRNMS-GL-LNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRT 243 (343) Q Consensus 167 -~~~g~~-GL-lN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~t 243 (343) .+.|.. .+ +..|.-...+.+..|+++++ +++.||.+.+..+.. .|. .|.+++|+++.|..|.+ +.. T Consensus 146 ~~~~~~~~~vdfg~~~~~~~t~~~~W~~~~a-dp~~di~~~~~~~~~--~G~-~~~~~i~~~~~~~~l~~-------~~~ 214 (348) T protein:vir:96 146 FTSDGVNKDIDYGVKADHKKQVSKSWAEPGA-TPLADLEDAIETARE--LGL-NPERAIMNAKTFGLIRK-------AAS 214 (348) T ss_pred eecCCeeEEEeccCCcccceeeccccCCCCC-CHHHHHHHHHHHHHh--cCC-cccEEEeCHHHHHHHhc-------CHH Confidence 112211 10 11232223344567998765 588999999877653 354 68999999999999852 234 Q ss_pred HHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEc----ccceEEEeeccc-hhcccceecC-ce Q lcl|NC_011142. 244 VIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDK----SERNLALAKPIP-FRMLAPQLLG-LG 317 (343) Q Consensus 244 vle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~----~~~~~~~~v~~~-~~~~~~~~~~-~~ 317 (343) +.+.+.-.+.......... ....+. ..+|-+ .++|+. ......-.+|.. +.++|....+ .. 
T Consensus 215 v~~~~~~~~~~~~~~~~~~--------~~~~~~----~~~g~~-i~~y~~~y~d~~G~~~~~~p~~~v~l~~~~~~G~~~ 281 (348) T protein:vir:96 215 TVKAIKPLAGDGSSVTKAE--------LQNYVA----DNYGVE-IVLENGTYRNEKGEVSKFFPDGHLTLIPNGPLGNTV 281 (348) T ss_pred HHHHHhccCCccccccHHH--------HHHHHh----hhcCce-EEEEccEEEecCCcEeccccCCeEEEEcCCCceeEE Confidence 4454442222111100000 000000 011111 222211 001111112221 1111111111 00 Q ss_pred eEeee------eeee---------eeE---EEECcceeeeec-----cC Q lcl|NC_011142. 318 ITVPA------EYKI---------SGT---EYRYPLCAQYVD-----ML 343 (343) Q Consensus 318 ~~~~~------~~~~---------gGv---~i~~P~ai~~~d-----GI 343 (343) |-... .... .|. .+.......+.. .+ T Consensus 282 yg~~~e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dP~~~~~~~~s~pl 330 (348) T protein:vir:96 282 FGTTPEESDLFADNTVNADVEIVDSGIAVTTTKTTDPVNVQTKVSMVAL 330 (348) T ss_pred eccChhhhhhhhcccccccceecCCeeEEEeeecCCCceEEEEEeeeee Confidence 00000 0000 001 011111000000 00 No 154 >protein:vir:6324 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:132 # MgeName: phiKMV # Cross-refs: genbank:acc:NP_877471;genbank:gi:33300843;uniprot:Q7Y2D3;genbank:GeneID:1482613 Probab=79.86 E-value=0.1 Score=26.06 Aligned_cols=296 Identities=10% Similarity=0.000 Sum_probs=124.1 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) || .+.+...|..---+.....|+..-.-.+++++-+. 
-..+.+..+++ +- +- T Consensus 1 ms----------------------~~~~~tr~~~~~s~~d~al~le~f~geV~~af~~~----s~~~~~~~~rt-i~-~g 52 (335) T protein:vir:63 1 MS----------------------FLNDLTRPNYAGKNADVDIHLEEHLGIVDKHFAYT----SKFAPLMNIRD-LR-GS 52 (335) T ss_pred CC----------------------CcccchhhhcccccchhheehhhhhhhHHHHHHhh----hhhccccceee-ec-cc Confidence 11 11111112111111122355533334666655332 23345555543 21 23 Q ss_pred eEEEEeeecccccceee----cCCcCccceeeeccceeEEEEEE--EEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHH Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFI----SANASDLPRVAQSAKLHQVELGY--AGVECHYSLDELRTTAAVNMPIDSMQAELAFRGS 154 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~----~~~~~dip~v~~~~~~~~~~v~~--~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~ 154 (343) .++.++ ..|..+.. |..-++.|.. .++....|=. +..- -+.++.. .+...++-..-....-.++ T Consensus 53 ~s~~~~---~iG~~~~~~~~pG~~l~~~~~~---~~k~~itVD~ll~a~~---~I~dlDe-~~~~yDvRse~s~e~G~aL 122 (335) T protein:vir:63 53 NVVRLD---RLGNVEAKGRRAGEELERSRVV---NDKWNLTVDTLLYLRH---QFDHQDE-WTQSFDMRKEVAELDGQEL 122 (335) T ss_pred eeEEEe---eeeeeeeecccCCcCcCCCCcc---ccceEEEecceeechh---hhhhHHH-HhcCchhHHHHHHHHHHHH Confidence 344443 33444432 1111222321 1232332221 2222 2445554 3345667677777788888 Q ss_pred HHhhhheee------eee-hhhcceeeeecCCccc-cccCcCccccCHHHHHHHHHHHHHHHHHhcCCe--ecccEEEec Q lcl|NC_011142. 155 EEHSQRVAY------FGD-TNRNMSGLLNNPNVTK-TSATVNYATCTGQELFDLLNNPVFAVVKASKRF--HTPNTVLMF 224 (343) Q Consensus 155 ~~~~n~~~f------~G~-~~~g~~GLlN~p~v~~-~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~--~~p~tL~l~ 224 (343) ++..|+.+| -+. +....+|-++ +|+.. ...+..=+...++.+++-+..+..+|.++.-.. 
..+..++|+ T Consensus 123 A~~~D~~~~~~i~~aa~~~a~~~~~~~~~-~G~~~~~~~tg~~~~~~~~~l~~a~~~a~~~L~e~dVP~~~~~dr~~vv~ 201 (335) T protein:vir:63 123 ARKFDQACLIQVIKAAAMDAPVDLEDAFS-PGVLEKLDLTGLTAKQAADKIVRMHRRVVETFIDRDLGDAVYSEGLTPMS 201 (335) T ss_pred HHHHHHHHHHHHHhhccccCccccCCCcC-CCcceeeeeccCcccccHHHHHHHHHHHHHHHHhccCCCcccCceEEEeC Confidence 998888765 111 1122333333 23221 111111122358888888888888887642111 123679999 Q ss_pred HHHHHHHhcc--ccCCCCCcc--HHHHHHhcCcceeeccccccccccceeeechhhhcc-----c--c--CCccceEEEE Q lcl|NC_011142. 225 PDLWKRASSL--LMTGYTDRT--VIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAG-----V--S--NGNKDRYVVY 291 (343) Q Consensus 225 p~~~~~L~~~--~~~~~~~~t--vle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g-----~--~--~~g~dr~v~y 291 (343) |..|..|..- .++..++-+ .-.+. +.-...+.|.++..+.. .+.-...+ + + ...+.++.++ T Consensus 202 P~~y~~Ll~~~~l~n~~~~~s~~~~~~~--~g~v~~v~Gv~V~~sn~----lP~~~~t~~~lg~a~n~~~~d~~~~~~~~ 275 (335) T protein:vir:63 202 PRVFSLLLEHDKLMNVEYQATGATNDYV--KSRVAILNGVKVLETPR----FATKAIAAHPLGRHFNVSAEESERQIALF 275 (335) T ss_pred hHHHHHHhcccccccccccccccccccc--CceeEEeeceEEEeecc----CCCCCcccccccccCCccccccceeEEEE Confidence 9999998642 111000000 00000 00111122222211111 11101000 0 0 0112233333 Q ss_pred EcccceEEEeeccchhcc-cceecCceeEeeeeeeeeeEEEECcceee--eeccC Q lcl|NC_011142. 292 DKSERNLALAKPIPFRML-APQLLGLGITVPAEYKISGTEYRYPLCAQ--YVDML 343 (343) Q Consensus 292 ~~~~~~~~~~v~~~~~~~-~~~~~~~~~~~~~~~~~gGv~i~~P~ai~--~~dGI 343 (343) . .++-+.....++++.- .-+.+...+.+.+.... 
|+.++||.+++ ...|| T Consensus 276 ~-~~~Al~t~~~~~vt~e~~~~~~~~~~~i~~~~a~-G~g~lRPe~a~~i~~tg~ 328 (335) T protein:vir:63 276 L-PSKTLITAQVAPVQAKLWEDNEKFSWVLDTFQMY-NIGARRPDTAGAIELKGI 328 (335) T ss_pred E-ecceEEEEEEeecccceeeccchhhHHhHHHHHc-CCcccccceEEEEEEcCC Confidence 2 2333333333444321 11223355666666664 79999996655 56788 No 155 >protein:vir:94800 Length: 319 # NCBI annotation: ORF012 # Family: family:all:701 # MgeID: mge:1531 # MgeName: 29 # Cross-refs: genbank:acc:YP_240536;genbank:gi:66396203;genbank:GeneID:5133580 Probab=76.57 E-value=0.13 Score=25.37 Aligned_cols=287 Identities=10% Similarity=-0.013 Sum_probs=121.2 Q ss_pred CCcceeccchhhhhch-hhhchhcccccccCcchhecchhhhhhhhHHHHH-HHHHHHHhhhhhcccchhhccccCC-CC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGN-RWLNKFLDSNATIGVPSVVNDADGGAAYYISQLA-SLETTVYEVPYADITYLEDVPVLAN-IP 77 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~-~id~~v~e~~~~~l~~~~~i~v~~~-~~ 77 (343) |.. +|-+. -.|+..|+--+.-++ ++.-.-+++.+. .+|+ .+....+...+-+.++ -. T Consensus 1 ~~~--------~~~~~~~~~~~~~~~~~~~~~-------~~nt~~l~~k~~~~LD~-----~~~~~~~s~~~~~N~~~e~ 60 (319) T protein:vir:94 1 MNK--------TIKNATGMLKLNLQHFANKSV-------EPGQTLLKNKHVGILER-----VTAVNAYSTPALISNDAIF 60 (319) T ss_pred CCc--------ccccccceeEeehhhhhccCC-------CcchHHHHHHHHHHHHH-----HHHHhhhhhhcccCcceEe Confidence 211 11000 001111111111111 122233333332 3333 2222222221111111 11 Q ss_pred cceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhC-CCccHHHHHHHHHHHHH Q lcl|NC_011142. 78 EYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVN-MPIDSMQAELAFRGSEE 156 (343) Q Consensus 78 ~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g-~~l~~~k~~aA~~~~~~ 156 (343) .+..++....++..|-. 
-|.- .++...-+++.++....+- -..+|.+.+.++...+..+ +.......+.++..+.- T Consensus 61 ~gg~tVkIp~i~~~gl~-DY~R-~~g~~~g~vt~~~~t~tid-qdR~~~F~VD~~D~~Etn~~l~a~~i~~~~~~~~v~P 137 (319) T protein:vir:94 61 MEGRSFTVMKGDTTELK-DYKR-NATNEFDHPKIEETTYFLD-QEKYWGRFVDALDRKDTEGNIDINYVVARQGAEVVAP 137 (319) T ss_pred ccCcEEEEeeecccccc-cccC-CCCcccCCcccceeEEEee-cccccccccchhhHhhhhchhhHHHHHHHHHHHHhhh Confidence 24556777766665533 2211 1223333444555554443 3677888888888766533 22222334445555555 Q ss_pred hhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhcc-c Q lcl|NC_011142. 157 HSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSL-L 235 (343) Q Consensus 157 ~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~-~ 235 (343) ..|...|--..... | ...+ .+.|++.+++.|.++..+|.++ ++-....|+++|..+..|..- + T Consensus 138 EiDay~~skla~~a--~-------~~~~-----~~~t~~n~y~~i~~a~~~Lde~--~VP~~Rvl~Vtp~~~~~L~~~~~ 201 (319) T protein:vir:94 138 YLDNLRFATLARNK--A-------KHLT-----VGTGSDAQYDAVLDVSVELDEI--KAPENRVLFVSPTFYKGIKKFVI 201 (319) T ss_pred hhhHHHHHHHHhhc--c-------cccc-----cccCHHHHHHHHHHHHHHHHhc--CCCCCcEEEeCHHHHHHHHhhhh Confidence 55655443322210 0 0011 1246788999999999999875 443457799999999999531 1 Q ss_pred cCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecC Q lcl|NC_011142. 236 MTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLG 315 (343) Q Consensus 236 ~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~ 315 (343) .....+ +.+-...++....+.|.++ ..++..... +.+.++++..-.-..... ..++.+.+.++. 
T Consensus 202 f~~~~~--~~~~~~~~g~Vg~idG~~V-------i~vps~~~k-----~in~i~~h~~A~~~~~k~--~~~~~~~p~~~~ 265 (319) T protein:vir:94 202 ALPQGD--TRQQVLGKGVQGELDGFVI-------VKVPTKLLQ-----GLQAIAVVGEVLASPIQA--DLAKTNSNIPGM 265 (319) T ss_pred hhcccc--ccccceeeeeceeecCeEE-------EEecccccc-----cceEEEEcCCeeeeeeee--eeeeccCCCccc Confidence 111111 1111111222222222221 111111111 223333332211111110 112322223444 Q ss_pred ceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 316 LGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 316 ~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ..|.+.+. .+.|+.|.+|...+.+... T Consensus 266 ~a~~v~gr-~y~d~~V~~~k~~~Iy~~~ 292 (319) T protein:vir:94 266 FGTLAEQL-LYTGAFVPEHLQKYIFTIG 292 (319) T ss_pred cceeeeee-eeeeeEEeccccceEEEee Confidence 45666654 4789999999865544433 No 156 >protein:vir:97331 Length: 319 # NCBI annotation: ORF011 # Family: family:all:701 # MgeID: mge:1666 # MgeName: 52A # Cross-refs: genbank:acc:YP_240611;genbank:gi:66396278;genbank:GeneID:5133687 Probab=76.57 E-value=0.13 Score=25.37 Aligned_cols=287 Identities=10% Similarity=-0.013 Sum_probs=121.2 Q ss_pred CCcceeccchhhhhch-hhhchhcccccccCcchhecchhhhhhhhHHHHH-HHHHHHHhhhhhcccchhhccccCC-CC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGN-RWLNKFLDSNATIGVPSVVNDADGGAAYYISQLA-SLETTVYEVPYADITYLEDVPVLAN-IP 77 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~-~id~~v~e~~~~~l~~~~~i~v~~~-~~ 77 (343) |.. +|-+. -.|+..|+--+.-++ ++.-.-+++.+. .+|+ .+....+...+-+.++ -. T Consensus 1 ~~~--------~~~~~~~~~~~~~~~~~~~~~-------~~nt~~l~~k~~~~LD~-----~~~~~~~s~~~~~N~~~e~ 60 (319) T protein:vir:97 1 MNK--------TIKNATGMLKLNLQHFANKSV-------EPGQTLLKNKHVGILER-----VTAVNAYSTPALISNDAIF 60 (319) T ss_pred CCc--------ccccccceeEeehhhhhccCC-------CcchHHHHHHHHHHHHH-----HHHHhhhhhhcccCcceEe Confidence 211 11000 001111111111111 122233333332 3333 2222222221111111 11 Q ss_pred cceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhC-CCccHHHHHHHHHHHHH Q lcl|NC_011142. 
78 EYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVN-MPIDSMQAELAFRGSEE 156 (343) Q Consensus 78 ~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g-~~l~~~k~~aA~~~~~~ 156 (343) .+..++....++..|-. -|.- .++...-+++.++....+- -..+|.+.+.++...+..+ +.......+.++..+.- T Consensus 61 ~gg~tVkIp~i~~~gl~-DY~R-~~g~~~g~vt~~~~t~tid-qdR~~~F~VD~~D~~Etn~~l~a~~i~~~~~~~~v~P 137 (319) T protein:vir:97 61 MEGRSFTVMKGDTTELK-DYKR-NATNEFDHPKIEETTYFLD-QEKYWGRFVDALDRKDTEGNIDINYVVARQGAEVVAP 137 (319) T ss_pred ccCcEEEEeeecccccc-cccC-CCCcccCCcccceeEEEee-cccccccccchhhHhhhhchhhHHHHHHHHHHHHhhh Confidence 24556777766665533 2211 1223333444555554443 3677888888888766533 22222334445555555 Q ss_pred hhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhcc-c Q lcl|NC_011142. 157 HSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSL-L 235 (343) Q Consensus 157 ~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~-~ 235 (343) ..|...|--..... | ...+ .+.|++.+++.|.++..+|.++ ++-....|+++|..+..|..- + T Consensus 138 EiDay~~skla~~a--~-------~~~~-----~~~t~~n~y~~i~~a~~~Lde~--~VP~~Rvl~Vtp~~~~~L~~~~~ 201 (319) T protein:vir:97 138 YLDNLRFATLARNK--A-------KHLT-----VGTGSDAQYDAVLDVSVELDEI--KAPENRVLFVSPTFYKGIKKFVI 201 (319) T ss_pred hhhHHHHHHHHhhc--c-------cccc-----cccCHHHHHHHHHHHHHHHHhc--CCCCCcEEEeCHHHHHHHHhhhh Confidence 55655443322210 0 0011 1246788999999999999875 443457799999999999531 1 Q ss_pred cCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecC Q lcl|NC_011142. 236 MTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLG 315 (343) Q Consensus 236 ~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~ 315 (343) .....+ +.+-...++....+.|.++ ..++..... +.+.++++..-.-..... ..++.+.+.++. 
T Consensus 202 f~~~~~--~~~~~~~~g~Vg~idG~~V-------i~vps~~~k-----~in~i~~h~~A~~~~~k~--~~~~~~~p~~~~ 265 (319) T protein:vir:97 202 ALPQGD--TRQQVLGKGVQGELDGFVI-------VKVPTKLLQ-----GLQAIAVVGEVLASPIQA--DLAKTNSNIPGM 265 (319) T ss_pred hhcccc--ccccceeeeeceeecCeEE-------EEecccccc-----cceEEEEcCCeeeeeeee--eeeeccCCCccc Confidence 111111 1111111222222222221 111111111 223333332211111110 112322223444 Q ss_pred ceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 316 LGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 316 ~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ..|.+.+. .+.|+.|.+|...+.+... T Consensus 266 ~a~~v~gr-~y~d~~V~~~k~~~Iy~~~ 292 (319) T protein:vir:97 266 FGTLAEQL-LYTGAFVPEHLQKYIFTIG 292 (319) T ss_pred cceeeeee-eeeeeEEeccccceEEEee Confidence 45666654 4789999999865544433 No 157 >protein:vir:80213 Length: 334 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1879 # MgeName: LKA1 # Cross-refs: genbank:acc:YP_001522884;genbank:gi:158345177;genbank:GeneID:5687476 Probab=75.77 E-value=0.14 Score=25.22 Aligned_cols=303 Identities=8% Similarity=0.006 Sum_probs=125.3 Q ss_pred hhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHH-HHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeee Q lcl|NC_011142. 10 AQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQL-ASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSY 88 (343) Q Consensus 10 ~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l-~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~ 88 (343) -.++.+|.+- .|+.--.+.+...|+ ++| -.|++..-+. -..+.+..+++ +--| .++.+. T Consensus 1 m~~~~~~~~t-----------~~~~~~~~~~~~l~l-e~~~geV~~af~~~----s~~~~~~~~r~-i~~G-~s~~~~-- 60 (334) T protein:vir:80 1 MTYPAANTHT-----------RPGWGGANSDVSLHI-EEHLGLVDASFMYS----SKFASWMNVRS-LRGT-NQLRVD-- 60 (334) T ss_pred CCCCcCCCcc-----------ccccccccchheehh-hhhhhHHHHHHHHh----hhhhccceeee-cccc-ceEEEe-- Confidence 2233333222 232222122233454 444 4666555332 23344444432 2212 334333 Q ss_pred cccccceeec-CCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeeh Q lcl|NC_011142. 
89 DGAAMGKFIS-ANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDT 167 (343) Q Consensus 89 ~~~G~a~~~~-~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~ 167 (343) ..|..+... ..+..+..-...-++....|-.. .-++.-+.++..+ ++..++...-...+..++++..|+.++--.. T Consensus 61 -~iG~~~~~~~~~g~~l~~~~~~~~~~~l~ID~~-l~~~~~VddiD~~-q~~~D~rse~~~~~G~aLA~~~D~~~~~~l~ 137 (334) T protein:vir:80 61 -RVGASTIAGRKAGEELVVQKNVSDKLNLTVDTV-LYARHFFDKFDEW-TSNLDVRKETAREDGIALARQYDQACIIQLQ 137 (334) T ss_pred -eecceeeeeecCCCCCCCCCcccCceEEEEeee-eehhhhHhhHHHH-hcCcchHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 334444311 11122222122223333333321 1223445677764 4456677888888899999999987663311 Q ss_pred h-------hcceeeeecCCccc--cccCcCccccCHHHHHHHHHHHHHHHHHhcCCee--cccEEEecHHHHHHHhcc-c Q lcl|NC_011142. 168 N-------RNMSGLLNNPNVTK--TSATVNYATCTGQELFDLLNNPVFAVVKASKRFH--TPNTVLMFPDLWKRASSL-L 235 (343) Q Consensus 168 ~-------~g~~GLlN~p~v~~--~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~--~p~tL~l~p~~~~~L~~~-~ 235 (343) . .+.+.-+.+.+... .+..+.=...+++.+++-+..+...|.++.-.-. ....++++|..|..|..- + T Consensus 138 kaa~~~~~~~~~~~~~~G~~~~~~~~g~~~~~~~~~~~l~~a~~~a~~~L~e~dvp~~~~~~R~~vv~P~~y~~Ll~~~r 217 (334) T protein:vir:80 138 KCGDFLAPAHLKPAFHDGILLPSTISGLAADAAADADVLVAAHRQGVEAMVFRDLGDQLMSEGVTLLDPVIFSFLLEHDR 217 (334) T ss_pred HhhhhcccccccccccCCcceeecccccccchhhhHHHHHHHHHHHHHHHHhcCCCCCcCCceEEEeChHHHHHHhcccc Confidence 0 11111111111111 1111111234688888888888888877532211 236899999999998652 1 Q ss_pred cCCC-CC--ccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCc---------cceEEEEEcccceEEEeec Q lcl|NC_011142. 236 MTGY-TD--RTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGN---------KDRYVVYDKSERNLALAKP 303 (343) Q Consensus 236 ~~~~-~~--~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g---------~dr~v~y~~~~~~~~~~v~ 303 (343) +-+. ++ -+...+- +.-...+.|.++..+. ..+.........|+ +.++.++ ..++-+..... 
T Consensus 218 ~~n~d~~~s~~~~~~~--~g~i~~v~G~~V~~Sn----~~P~~~~t~~~~g~~~~~~agd~t~~~~~~-~~~~Al~t~~~ 290 (334) T protein:vir:80 218 LMNVEFGAKEGGNSFV--GGRIAMLNGVRVVETP----RFPQSAITANALGADFNVTDAEVRRKMITF-IPSMALISAQV 290 (334) T ss_pred cccceecccccccccc--ceeEEEEeceEEEeec----CCCCccccccccccccccccccccceEEEE-EeCceEEEEEE Confidence 1100 00 0000000 1111112222221111 11211111111111 1122222 12222222222 Q ss_pred cchhccc-ceecCceeEeeeeeeeeeEEEECccee--eeeccC Q lcl|NC_011142. 304 IPFRMLA-PQLLGLGITVPAEYKISGTEYRYPLCA--QYVDML 343 (343) Q Consensus 304 ~~~~~~~-~~~~~~~~~~~~~~~~gGv~i~~P~ai--~~~dGI 343 (343) ++++.-. -+.+...+.+.+... .|+-+.||.++ +-++++ T Consensus 291 ~~~~~e~~~~~~~~~d~i~~~~a-~G~g~lRPeaa~vv~~~~~ 332 (334) T protein:vir:80 291 HPVSAQFWEEKKDFGHYLDTFQS-YNIGQRRPDAVAVHDITVT 332 (334) T ss_pred eecceeeeechhhHHHHHHHHHH-cCCceeccceEEEEEEeee Confidence 3322111 011233444545544 48999999655 456777 No 158 >protein:vir:96666 Length: 462 # NCBI annotation: ORF016 # Family: family:all:2450 # MgeID: mge:1623 # MgeName: Twort # Cross-refs: genbank:acc:YP_238545;genbank:gi:66391271;genbank:GeneID:5130448 Probab=71.77 E-value=0.19 Score=24.52 Aligned_cols=294 Identities=13% Similarity=0.112 Sum_probs=122.8 Q ss_pred CCcceeccchhhhhch---hhhchhcccccccCcchhecchh-hhhhhhHHHHHHHHHHHHhhhh--hcccchhhccccC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGN---RWLNKFLDSNATIGVPSVVNDAD-GGAAYYISQLASLETTVYEVPY--ADITYLEDVPVLA 74 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~dA~-~~~~f~~~~l~~id~~v~e~~~--~~l~~~~~i~v~~ 74 (343) ||-..-+-..+-.+.. +.+...+.+-=+++ | |.+ .++++..+ .+|+++..--+ .+++.-+-++- . 
T Consensus 1 ~~~~~~~~~~~~~~~~~~~e~~~KS~~tg~g~~-p----~~q~~~gAlR~e---sL~~~i~~Lt~~~~~~~~~~~i~k-~ 71 (462) T protein:vir:96 1 MHKDTNLTAEQNKYADKFQEEVMKSYQTGYGIT-P----DTQVDAGALRRE---ILDDQITMLTWTQDDLIFYREISR-R 71 (462) T ss_pred CccccccchhhhhhhchhhHHHHHHHhcCCCcC-C----ccccccchhhhh---hhhhhhheeeecccchhhhhhcCC-c Confidence 5533333333333221 11122222111111 1 222 35666654 44444433222 23343333332 2 Q ss_pred CCCcceeEEE-EeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHH Q lcl|NC_011142. 75 NIPEYATHWN-YRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRG 153 (343) Q Consensus 75 ~~~~~~~~~~-~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~ 153 (343) +.......+. |......|.+.+++.. ...+..+.++.+++..+..++..-..++..-. +-.=.+......+.|... T Consensus 72 ~a~sTv~~y~~~~~~G~~g~~~f~~E~-g~~~~~d~~~~R~~~~~k~l~~t~~vsi~~tl--~n~~~d~~~~~~~dai~~ 148 (462) T protein:vir:96 72 PAQSTVQKYDVYLRHGNVGHSRFVREV-GVAPVSDPNIRQKTVEMKYVSDTKNLSIASTL--VNNIQDPMQILTEDAIAV 148 (462) T ss_pred hhhhhhhhheeeeccCccccccccccc-cccccCCCceEEEEEEEEEEeeeeeechhhhh--ccchhhHHHHHHHHHHHH Confidence 2222222222 2223344555555554 34677888899999999999988777765432 111223446777788899 Q ss_pred HHHhhhheeeeeehhhcceee---eecCCccccccCcCccccCHHHH-HHHHHHHHHHHHHhcCCeecccEEEecHHHHH Q lcl|NC_011142. 154 SEEHSQRVAYFGDTNRNMSGL---LNNPNVTKTSATVNYATCTGQEL-FDLLNNPVFAVVKASKRFHTPNTVLMFPDLWK 229 (343) Q Consensus 154 ~~~~~n~~~f~G~~~~g~~GL---lN~p~v~~~~~~~~w~~~t~~~i-~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~ 229 (343) +++......||||+.+.=.+- |+.-|+...-.+.+--.+-++.. -+.|+.+-.. 
.+++.-.|+-+.||....+ T Consensus 149 ~a~tiE~a~Fygds~l~~~~~~~gleFDGl~~lI~~~NViDarG~~Ls~~~ln~aa~~---i~~~fGt~TD~~~p~~v~a 225 (462) T protein:vir:96 149 VAKTIEWASFYGDASLTADPTGQGLEFDGLAKLIDKDNVIDAKGESLTETLLNRSAVL---IGKSFGTATDAYMPIGVHA 225 (462) T ss_pred HHHHHHHHHhhhhcccCCCccccccchhhhhhhcCCCceeecCCCCccHHHHhhhhhh---cccccCChhheecchHHHH Confidence 999999999999987643111 33334322222111111111111 2344444322 2456678999999999999 Q ss_pred HHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeec----cc Q lcl|NC_011142. 230 RASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKP----IP 305 (343) Q Consensus 230 ~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~----~~ 305 (343) .|..-.++.. ..+...|......|.++.. |....-.++|+=. .| T Consensus 226 ~f~~~~l~~q------rv~~~~n~g~~~~G~~v~~--------------------------f~s~~G~I~L~~s~~m~~~ 273 (462) T protein:vir:96 226 DFVNSVLGRQ------MQLMQDNSGNVNAGYNVQG--------------------------FYSSRGFIKLHGSTVMENE 273 (462) T ss_pred HHHHhhcCce------EEEEcCCCCceeeeeeccc--------------------------eeeeeeeeeeCCceecCcc Confidence 9874332211 1122233332233322110 1111112222200 01 Q ss_pred hhcccceecCceeEeeeeeeeee-EEEECcceeeeeccC Q lcl|NC_011142. 306 FRMLAPQLLGLGITVPAEYKISG-TEYRYPLCAQYVDML 343 (343) Q Consensus 306 ~~~~~~~~~~~~~~~~~~~~~gG-v~i~~P~ai~~~dGI 343 (343) ... -.+. ...-..|..-.... +.--.+....-=.+. 
T Consensus 274 ~i~-~~~~-~~~p~ap~~~~vsaTv~t~~~g~f~~~~d~ 310 (462) T protein:vir:96 274 LIL-DESL-QPLPNAPQPATVKATVETGKKGLFTDEHDR 310 (462) T ss_pred ccc-cccc-ccCCCCCCCCceeEEEEeCCCCCCCCccCc Confidence 110 0000 00001111111110 111111100000001 No 159 >protein:vir:99311 Length: 463 # NCBI annotation: putative capsid protein # Family: family:all:2450 # MgeID: mge:1655 # MgeName: K # Cross-refs: genbank:acc:YP_024474;genbank:gi:48696433;genbank:GeneID:2948039 Probab=69.64 E-value=0.22 Score=24.19 Aligned_cols=284 Identities=13% Similarity=0.121 Sum_probs=121.3 Q ss_pred CCcceeccchhhhhchh-hhchhcccccccCcchhecchh-hhhhhhHHHHHHHHHHHHhhhh--hcccchhhccccCCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNR-WLNKFLDSNATIGVPSVVNDAD-GGAAYYISQLASLETTVYEVPY--ADITYLEDVPVLANI 76 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~dA~-~~~~f~~~~l~~id~~v~e~~~--~~l~~~~~i~v~~~~ 76 (343) |.||--..-++-+-.-+ .+...+.+--+++ | |.+ .++++..+.| |+.+...-+ .+|+.-.-++- .+. T Consensus 3 ~~~~~~~~~~~~~~~~~e~~~KS~~tg~g~~-p----~~q~~~~AlR~EsL---~~~i~~Lt~~~~~f~~~~~i~k-~~a 73 (463) T protein:vir:99 3 IEKNLSDVQQKYADQFQEDVVKSFQTGYGIT-P----DTQIDAGALRREIL---DDQITMLTWTNEDLIFYRDISR-RPA 73 (463) T ss_pred cccccchHHHHHHhhhhHHHHHHhhcCCccC-C----ccccCcchhhhhhh---hhhhheeeecccchhhhhhcCC-chh Confidence 55554333322221111 1112222111111 1 223 2455555544 444433222 33444444432 222 Q ss_pred CcceeEEE-EeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHH Q lcl|NC_011142. 77 PEYATHWN-YRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSE 155 (343) Q Consensus 77 ~~~~~~~~-~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~ 155 (343) ......+. |......|.+.+++.. ...+..+.++.+++..+..++.....+...-. 
+-...+......+.|...++ T Consensus 74 ~STV~~y~~~~~~G~~g~~~f~~E~-g~~~~~d~~~~Rr~~~~K~l~~~~~VS~~~~l--~n~~~d~~~~~~~dai~~ia 150 (463) T protein:vir:99 74 QSTVVKYDQYLRHGNVGHSRFVKEI-GVAPVSDPNIRQKTVSMKYVSDTKNMSIASGL--VNNIADPSQILTEDAIAVVA 150 (463) T ss_pred hhhhhhheeeeccCccccccccccc-cccccCCCceEEEEEEeeeeehhhhhhhHHHh--hcccccHHHHHHHHHHHHHH Confidence 22222222 2223344555555554 34567788888888888888877766554422 33345667778888999999 Q ss_pred Hhhhheeeeeehhhcc----eeeeecCCccccccCc---CccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHH Q lcl|NC_011142. 156 EHSQRVAYFGDTNRNM----SGLLNNPNVTKTSATV---NYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLW 228 (343) Q Consensus 156 ~~~n~~~f~G~~~~g~----~GLlN~p~v~~~~~~~---~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~ 228 (343) +......||||+.+.= .| |+.-|+...-.+. |-...-..+ +.|+++-..+ +.+.-.|+-+.||.... T Consensus 151 ~tiE~a~FyGds~l~~~~~~~g-leFDGl~~lId~enviDarG~~Ls~--~~ln~Aa~~i---~~~fGt~TD~~lp~~vk 224 (463) T protein:vir:99 151 KTIEWASFYGDASLTSEVEGEG-LEFDGLAKLIDKNNVINAKGNQLTE--KHLNEAAVRI---GKGFGTATDAYMPIGVH 224 (463) T ss_pred HHHHHHHhhhhhccCCCcCccc-cchhhhhhhcCCCCeeecCCCcccH--HHHhhhhhhh---hcccCChhheecchHHH Confidence 9999999999986542 12 2233322211111 111111112 2355444333 44667899999999999 Q ss_pred HHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhc Q lcl|NC_011142. 229 KRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRM 308 (343) Q Consensus 229 ~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~ 308 (343) +.|..-.++.. ..+...|+.....|.++.. |....-.++|+=..-+. T Consensus 225 a~f~~~~l~~q------rv~~~~N~~~~~~G~~v~~--------------------------f~s~~G~I~L~~s~~m~- 271 (463) T protein:vir:99 225 ADFVNSILGRQ------MQLMQDNSGNVNTGYSVNG--------------------------FYSSRGFIKLHGSTVME- 271 (463) T ss_pred HHHHHHhcCce------EEEEcCCCCceeeeeeccc--------------------------eeeeeeeeeeCCceecC- Confidence 99974333211 1122233333333322110 00111112221000000 Q ss_pred ccceecCceeEee-eeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
309 LAPQLLGLGITVP-AEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 309 ~~~~~~~~~~~~~-~~~~~gGv~i~~P~ai~~~dGI 343 (343) .+..+.-.-. .-.+. --|...+.+.+- T Consensus 272 ---~~~il~~~~~~~p~ap-----~~~~~tatv~~~ 299 (463) T protein:vir:99 272 ---NELILDESLQPLPNAP-----QPAKVTATVETK 299 (463) T ss_pred ---CcccccchhhcCCCCc-----cCceeEEEEeec Confidence 0000000000 00000 001111122111 No 160 >protein:vir:95603 Length: 463 # NCBI annotation: ORF016 # Family: family:all:2450 # MgeID: mge:1577 # MgeName: G1 # Cross-refs: genbank:acc:YP_240903;genbank:gi:66394965;genbank:GeneID:5132544 Probab=69.64 E-value=0.22 Score=24.19 Aligned_cols=284 Identities=13% Similarity=0.121 Sum_probs=121.3 Q ss_pred CCcceeccchhhhhchh-hhchhcccccccCcchhecchh-hhhhhhHHHHHHHHHHHHhhhh--hcccchhhccccCCC Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNR-WLNKFLDSNATIGVPSVVNDAD-GGAAYYISQLASLETTVYEVPY--ADITYLEDVPVLANI 76 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~dA~-~~~~f~~~~l~~id~~v~e~~~--~~l~~~~~i~v~~~~ 76 (343) |.||--..-++-+-.-+ .+...+.+--+++ | |.+ .++++..+.| |+.+...-+ .+|+.-.-++- .+. T Consensus 3 ~~~~~~~~~~~~~~~~~e~~~KS~~tg~g~~-p----~~q~~~~AlR~EsL---~~~i~~Lt~~~~~f~~~~~i~k-~~a 73 (463) T protein:vir:95 3 IEKNLSDVQQKYADQFQEDVVKSFQTGYGIT-P----DTQIDAGALRREIL---DDQITMLTWTNEDLIFYRDISR-RPA 73 (463) T ss_pred cccccchHHHHHHhhhhHHHHHHhhcCCccC-C----ccccCcchhhhhhh---hhhhheeeecccchhhhhhcCC-chh Confidence 55554333322221111 1112222111111 1 223 2455555544 444433222 33444444432 222 Q ss_pred CcceeEEE-EeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHH Q lcl|NC_011142. 77 PEYATHWN-YRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSE 155 (343) Q Consensus 77 ~~~~~~~~-~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~ 155 (343) ......+. |......|.+.+++.. ...+..+.++.+++..+..++.....+...-. 
+-...+......+.|...++ T Consensus 74 ~STV~~y~~~~~~G~~g~~~f~~E~-g~~~~~d~~~~Rr~~~~K~l~~~~~VS~~~~l--~n~~~d~~~~~~~dai~~ia 150 (463) T protein:vir:95 74 QSTVVKYDQYLRHGNVGHSRFVKEI-GVAPVSDPNIRQKTVSMKYVSDTKNMSIASGL--VNNIADPSQILTEDAIAVVA 150 (463) T ss_pred hhhhhhheeeeccCccccccccccc-cccccCCCceEEEEEEeeeeehhhhhhhHHHh--hcccccHHHHHHHHHHHHHH Confidence 22222222 2223344555555554 34567788888888888888877766554422 33345667778888999999 Q ss_pred Hhhhheeeeeehhhcc----eeeeecCCccccccCc---CccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHH Q lcl|NC_011142. 156 EHSQRVAYFGDTNRNM----SGLLNNPNVTKTSATV---NYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLW 228 (343) Q Consensus 156 ~~~n~~~f~G~~~~g~----~GLlN~p~v~~~~~~~---~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~ 228 (343) +......||||+.+.= .| |+.-|+...-.+. |-...-..+ +.|+++-..+ +.+.-.|+-+.||.... T Consensus 151 ~tiE~a~FyGds~l~~~~~~~g-leFDGl~~lId~enviDarG~~Ls~--~~ln~Aa~~i---~~~fGt~TD~~lp~~vk 224 (463) T protein:vir:95 151 KTIEWASFYGDASLTSEVEGEG-LEFDGLAKLIDKNNVINAKGNQLTE--KHLNEAAVRI---GKGFGTATDAYMPIGVH 224 (463) T ss_pred HHHHHHHhhhhhccCCCcCccc-cchhhhhhhcCCCCeeecCCCcccH--HHHhhhhhhh---hcccCChhheecchHHH Confidence 9999999999986542 12 2233322211111 111111112 2355444333 44667899999999999 Q ss_pred HHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhc Q lcl|NC_011142. 229 KRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRM 308 (343) Q Consensus 229 ~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~ 308 (343) +.|..-.++.. ..+...|+.....|.++.. |....-.++|+=..-+. T Consensus 225 a~f~~~~l~~q------rv~~~~N~~~~~~G~~v~~--------------------------f~s~~G~I~L~~s~~m~- 271 (463) T protein:vir:95 225 ADFVNSILGRQ------MQLMQDNSGNVNTGYSVNG--------------------------FYSSRGFIKLHGSTVME- 271 (463) T ss_pred HHHHHHhcCce------EEEEcCCCCceeeeeeccc--------------------------eeeeeeeeeeCCceecC- Confidence 99974333211 1122233333333322110 00111112221000000 Q ss_pred ccceecCceeEee-eeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
309 LAPQLLGLGITVP-AEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 309 ~~~~~~~~~~~~~-~~~~~gGv~i~~P~ai~~~dGI 343 (343) .+..+.-.-. .-.+. --|...+.+.+- T Consensus 272 ---~~~il~~~~~~~p~ap-----~~~~~tatv~~~ 299 (463) T protein:vir:95 272 ---NELILDESLQPLPNAP-----QPAKVTATVETK 299 (463) T ss_pred ---CcccccchhhcCCCCc-----cCceeEEEEeec Confidence 0000000000 00000 001111122111 No 161 >protein:vir:80180 Length: 381 # NCBI annotation: capsid protein # Family: family:all:2203 # MgeID: mge:1878 # MgeName: Pf-WMP3 # Cross-refs: genbank:acc:YP_001285797;genbank:gi:148747831;genbank:GeneID:5220456 Probab=61.66 E-value=0.35 Score=23.09 Aligned_cols=282 Identities=9% Similarity=-0.022 Sum_probs=110.7 Q ss_pred hhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCC--CcceeEEEEee Q lcl|NC_011142. 10 AQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANI--PEYATHWNYRS 87 (343) Q Consensus 10 ~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~--~~~~~~~~~~~ 87 (343) -.+|||| +. .+| . -|+..+.-.|..+.|. ..+.+.....+....+..- .+. -.| .++.++. T Consensus 1 ~~~~~~~---~~------~~~-~--~~~~t~~~~fiPev~s---~~v~~~l~~~lv~~~l~~~-~~~~~~~G-dTV~ip~ 63 (381) T protein:vir:80 1 MATIQGT---GG------YKG-S--AVDLSNVQVFIPEVWS---SEVRMFRDQKFAALEATKK-IPFEGKKG-DLIHIPN 63 (381) T ss_pred Cceeccc---cc------ccC-c--ccchhhHHhhhhHHHH---HHHHHHHHHhhhhhhcccc-ccceeecC-ceEEeec Confidence 4556554 10 011 1 2333455666666554 3344544555565554432 221 122 3455543 Q ss_pred ecccccceeecCCcCccceeeeccceeEEEEEE-EEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeee Q lcl|NC_011142. 88 YDGAAMGKFISANASDLPRVAQSAKLHQVELGY-AGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGD 166 (343) Q Consensus 88 ~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~-~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~ 166 (343) .- ...++.+.. ...++..+..-......|-. ...++.++ +++..+ ...++..+-...+..++++..|+.++--. 
T Consensus 64 ~g-~~~a~d~~~-g~~i~~~~~~~~~~~itID~~~~~~~~Id--d~D~~~-~~~D~~~~~~~~~~~aLA~~~D~~i~~~~ 138 (381) T protein:vir:80 64 IS-RAAVYDKQP-QTPVNLQARTDSEFTFTVTKYKESSFMIE--DIVNTQ-ASYTLRQYYTKEAGYALARDMDNFALAHR 138 (381) T ss_pred cC-cceeeeecC-CCcccccccCCceEEEEEeeeeecceeec--hHHHHh-hccChHHHHHHHHHHHHHHHHHHHHHHHH Confidence 32 222332322 23344444444444444422 34445554 444432 23466677777778888888888765321 Q ss_pred hhhc-ceeeeecCC---ccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCe-ecccEEEecHHHHHHHhccccCCCCC Q lcl|NC_011142. 167 TNRN-MSGLLNNPN---VTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRF-HTPNTVLMFPDLWKRASSLLMTGYTD 241 (343) Q Consensus 167 ~~~g-~~GLlN~p~---v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~-~~p~tL~l~p~~~~~L~~~~~~~~~~ 241 (343) .... ..+-...+. +...+....-...+.+..+++|.++...|.+. .+ .....|+++|..|..|.+- .. T Consensus 139 ~~~~~~~~~~~~t~~~~i~~~~~~~~~t~~~~~~t~~~i~~a~~~Lde~--~VP~egR~lvv~P~~~~~Ll~~--~~--- 211 (381) T protein:vir:80 139 AVINAFPSQRIYSYDTTLGDGTVNAHLTGTPAPLTYAALLLAKQKLDEA--DVPQEGRIVMVSPAQYIDLLSI--NQ--- 211 (381) T ss_pred hhcccccccccccccccccccccccccccchhhHHHHHHHHHHHHHhhc--CCCcCCcEEEeCHHHHHHHhhc--hh--- Confidence 1100 000000011 11111111112234566788888888888764 22 1235799999999988641 11 Q ss_pred ccHHHHHH----hcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhcccceecCce Q lcl|NC_011142. 242 RTVIEHFQ----INNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLG 317 (343) Q Consensus 242 ~tvle~l~----~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~ 317 (343) .+-.++.- +++....+-|.+... ...+.. +.. .+.... ..-|....|. ..+.. T Consensus 212 ~~~ad~~~~~~l~~G~Ig~i~G~~Vv~-------Sn~lp~-~~~-t~~~~~-------------agap~~~~~~-~~~~~ 268 (381) T protein:vir:80 212 FISVDFSQVKPVTSGVVGTILGMEVIV-------TTQIGI-NSL-TGYVNG-------------QGAPTQPTPG-VLGSP 268 (381) T ss_pred hhhhhhccchhhhceeeeEEcceEEEe-------eccccc-ccc-cceeee-------------cccccccccc-ccccc Confidence 11111111 111111222221111 111100 000 000000 0011111110 00000 Q ss_pred eEee---eeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 
318 ITVP---AEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 318 ~~~~---~~~~~gGv~i~~P~ai~~~dGI 343 (343) |.-. .....+.+..|-.....-+.|+ T Consensus 269 ~~g~~s~~a~av~~~k~yd~~~~~~~~~~ 297 (381) T protein:vir:80 269 YLPDQAGTANVVNTGSASDLAVSLSYFGL 297 (381) T ss_pred cccccccceeeeeeeeeeceeeeeeeccc Confidence 0000 0123333444444433344444 No 162 >protein:vir:7019 Length: 401 # NCBI annotation: major capsid protein # Family: family:all:2806 # MgeID: mge:141 # MgeName: SP6 # Cross-refs: genbank:acc:NP_853592;genbank:gi:31711674;genbank:GeneID:1481800 Probab=59.19 E-value=0.4 Score=22.78 Aligned_cols=301 Identities=8% Similarity=-0.037 Sum_probs=122.2 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |+ -..+...|...--++....|+....-.+++.+-+.. ..+.++.+++ +. +. T Consensus 1 Ms----------------------~~n~~t~~~~~~sg~~~al~Le~f~GeV~taF~~~s----i~~~~~~vRt-i~-~g 52 (401) T protein:vir:70 1 MS----------------------TPNNLTNVAVSASGEVDSLLIEKFNGKVNEQYLKGE----NIMSYFDVQT-VT-GT 52 (401) T ss_pred CC----------------------CCccccccccccccchhHhHHhHhcchHHHHHHHHh----hhcccceeee-ec-cc Confidence 11 001111121111112334555555557776663322 2334455543 22 22 Q ss_pred eEEEEeeecccccceee--cCCcCccceeeeccceeEEEE--EEEEEEEeecHHHHHHHHHhCCC-ccHHHHHHHHHHHH Q lcl|NC_011142. 81 THWNYRSYDGAAMGKFI--SANASDLPRVAQSAKLHQVEL--GYAGVECHYSLDELRTTAAVNMP-IDSMQAELAFRGSE 155 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~~--~~~~~dip~v~~~~~~~~~~v--~~~~~~~~~~~~El~~a~~~g~~-l~~~k~~aA~~~~~ 155 (343) .++.|. ..|..+.- ..+ ..+-......++....| ..+.. 
.-+.+|..+ +...+ +...-....-.+++ T Consensus 53 kS~qf~---~~G~s~~~~~~pG-~~ld~~~~~~dK~~ItID~lL~a~---~~V~dlDe~-q~~yD~vRse~s~e~G~ALA 124 (401) T protein:vir:70 53 NTVSNK---YLGETELQVLAPG-QSPAATSTQADKNQLVIDATVIAR---NTVAHLHDV-QGDIDSLKPKLATNQAKQLK 124 (401) T ss_pred ceEEEE---EeeeeEeeeecCC-CCcCCCCcccccEEEEeCceeehh---hhhhhHHHH-HhcccccchHHHHHHHHHHH Confidence 233333 23444321 111 11111111122222221 11222 223444443 23444 45555556666777 Q ss_pred Hhhhheee-----eeehh----hcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHH Q lcl|NC_011142. 156 EHSQRVAY-----FGDTN----RNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPD 226 (343) Q Consensus 156 ~~~n~~~f-----~G~~~----~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~ 226 (343) +..|+.++ -|-.. .....-..++..-......+=...+++++.+.|.++..+|.++.=- .....+++||. T Consensus 125 ~~~Dq~iiq~i~~aa~ana~~~~~~p~~~~~G~~i~v~~~~~~~~~~~~~l~~ai~dA~~~LdEkdVP-~~r~vvl~pp~ 203 (401) T protein:vir:70 125 RMEDEMLIQQMMLGGIANTQAKRTNPRVKGHGFSINVEVAEGEALVNPQYVMAAVEFALEQQLEQEVD-ISDVAILMPWR 203 (401) T ss_pred HHHHHHHHHHHHHhccccccccccCCCcCCCceEEeccccccccccCHHHHHHHHHHHHHHHHhcCCC-ccceEEEcCHH Confidence 77776442 12110 0000000000000111111223457899999999999998764221 23467888999 Q ss_pred HHHHHhcc--ccCCCCCccH-HHHHHhcCcceeeccccccccccceeeech---hhhccc---------cCCccceEEEE Q lcl|NC_011142. 227 LWKRASSL--LMTGYTDRTV-IEHFQINNAYTLLTRNPIDIKIRFQLMATE---LAAAGV---------SNGNKDRYVVY 291 (343) Q Consensus 227 ~~~~L~~~--~~~~~~~~tv-le~l~~n~~~~~~~~~p~~i~~~~~l~~~~---~~~~g~---------~~~g~dr~v~y 291 (343) .|..|... .++-.++.+- ..+.+.. + ..+.|.++..+.+.--.... 
.....+ +.-.+-++++| T Consensus 204 ~Ys~Ll~~d~L~nrd~~~s~~g~~~~G~-v-~~vaGv~Vv~SnnlP~~a~~it~~~ls~a~~G~~y~~~~d~s~~~~v~f 281 (401) T protein:vir:70 204 YFNVLRDADRIVDKTYTISQSGATIQGF-T-LSSYNCPVIPSNRFPKYSQGQTHHLLSNEDNGYRYDPLPAMNGAIAVLF 281 (401) T ss_pred HHHHHHhcCcccchhhccccCCccccce-E-EEEeceEEEeeccccccccccccccccccCCCccCCCCccccceeEEEE Confidence 99777653 2211111000 1111111 1 12233333222221100000 000001 11235567777 Q ss_pred EcccceEEEeeccchhcc-cceecCceeEeeeeeeeeeEEEECcceeeee----ccC Q lcl|NC_011142. 292 DKSERNLALAKPIPFRML-APQLLGLGITVPAEYKISGTEYRYPLCAQYV----DML 343 (343) Q Consensus 292 ~~~~~~~~~~v~~~~~~~-~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~----dGI 343 (343) .++.= --.-.+|++.- --+.+...|.+.+.... |+..+||.|++.+ +|. T Consensus 282 ~~~Av--~tvk~~~lt~~~~~d~r~~~~~id~~~a~-g~g~~RPeaa~vv~~k~~~~ 335 (401) T protein:vir:70 282 TADAL--LVGRSIDVTGDIFYEKKEKTYYIDTFMAE-GAIPDRWEAVSVVTTKRNTT 335 (401) T ss_pred ehhhe--EEEEeeccccchhhhhhhhHHHHHHHHHh-CCcccchhheEEEeecCccc Confidence 65521 11222344321 12344566777777765 7999999999875 222 No 163 >protein:vir:78935 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1860 # MgeName: LKD16 # Cross-refs: genbank:acc:YP_001522824;genbank:gi:158345059;genbank:GeneID:5687425 Probab=58.21 E-value=0.42 Score=22.66 Aligned_cols=299 Identities=9% Similarity=-0.014 Sum_probs=123.8 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHH-HHHHHHHHhhhhhcccchhhccccCCCCcc Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQL-ASLETTVYEVPYADITYLEDVPVLANIPEY 79 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l-~~id~~v~e~~~~~l~~~~~i~v~~~~~~~ 79 (343) || .+.....|..---+.+...|+ ++| -++++++-+. 
-..+.+..+++ +- + T Consensus 1 ms----------------------~~~~~t~~~~~~s~~d~al~l-e~f~geV~~af~~~----s~~~~~~~~rt-i~-~ 51 (335) T protein:vir:78 1 MS----------------------FLNDLTRPNYAGKNADVDIHL-EEHLGIVDKHFAYT----SKFAPLMNIRD-LR-G 51 (335) T ss_pred CC----------------------ccccccccccccccchhhhhh-hhhhhHHHHHHHHh----hhhccccceee-ec-c Confidence 11 111111121111111223554 444 4666655332 23445555543 22 2 Q ss_pred eeEEEEeeecccccceee----cCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHH Q lcl|NC_011142. 80 ATHWNYRSYDGAAMGKFI----SANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSE 155 (343) Q Consensus 80 ~~~~~~~~~~~~G~a~~~----~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~ 155 (343) -.++.++ ..|..+.. |..-+..|.. .++....|=..= -.+.-+.++.. .++..++...-......+++ T Consensus 52 g~s~~~~---~iG~~~~~~~~pG~~l~~~~~~---~~k~~itID~ll-~a~~~VddlDe-~~~~yDvR~e~s~~~G~aLA 123 (335) T protein:vir:78 52 SNVVRLD---RLGNVEAKGRRAGEELERSRVV---NDKWNLTVDTLL-YLRHQFDHQDE-WTQSFDMRKEVAELDGQELA 123 (335) T ss_pred ceeEEEe---eeeeeeecccccCcccCCCCcc---cCCeEEEeccee-echhhHhhHHH-hhcCchhHHHHHHHHHHHHH Confidence 3344443 33555432 2222222321 123223222111 11222455555 34466777777888888999 Q ss_pred Hhhhheee------eee-hhhcceeeeecCCccccc-cCcCccccCHHHHHHHHHHHHHHHHHhcCCeecc--cEEEecH Q lcl|NC_011142. 156 EHSQRVAY------FGD-TNRNMSGLLNNPNVTKTS-ATVNYATCTGQELFDLLNNPVFAVVKASKRFHTP--NTVLMFP 225 (343) Q Consensus 156 ~~~n~~~f------~G~-~~~g~~GLlN~p~v~~~~-~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p--~tL~l~p 225 (343) +..|+.++ .+. +.....+-++ ||..... 
....=++..++.+.+-+.++...+.+..---..+ ..++|+| T Consensus 124 ~~~Dq~~~~~l~~aa~~~a~~~~~~~~~-~G~~~~~~~tg~~~~~~~~~l~~a~~~a~~~l~ekdvP~~~~~~rv~vv~P 202 (335) T protein:vir:78 124 RKFDQACLIQVIKAAAMDAPVDLEDAFS-PGVLEKLDLTGLTAKEAAEKIVRMHRRVVETFIERDLGDAVYSEGLTPMSP 202 (335) T ss_pred HHHHHHHHHHHHhhcccccccccCCCcC-CCcceeeeeccccccccHHHHHHHHHHHHHHHHhccCCCCCCCccEEEeCh Confidence 99898766 111 1111122222 2322111 1111123458888888888888887542111111 4689999 Q ss_pred HHHHHHhcc--ccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhc-----cccC----CccceEEEEEcc Q lcl|NC_011142. 226 DLWKRASSL--LMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAA-----GVSN----GNKDRYVVYDKS 294 (343) Q Consensus 226 ~~~~~L~~~--~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~-----g~~~----~g~dr~v~y~~~ 294 (343) ..|..|..- .++..++-+--.-...+.-...+.|.++..+.. .+.-... .+++ .-+.++.++ .. T Consensus 203 ~~y~~Ll~~~~l~n~~~~~s~~~~~~~~g~v~~v~Gv~V~~Sn~----lP~~~~t~~~lg~a~n~~~~d~~~~~~~~-~~ 277 (335) T protein:vir:78 203 RVFSLLLEHDKLMSVEYQATGATNDYVKSRVAILNGVKVLETPR----FATKAISAHPLGRHFNVSAEEAERQIALF-LP 277 (335) T ss_pred HHHHHHhcccccccccccccccccccccceeEEeeceEEEeecc----CCCCCCccccccccCCcccccccceEEEE-Ee Confidence 999998642 111000000000000001111222222211111 1111100 0000 012233333 23 Q ss_pred cceEEEeeccchhcc-cceecCceeEeeeeeeeeeEEEECcceee--eeccC Q lcl|NC_011142. 295 ERNLALAKPIPFRML-APQLLGLGITVPAEYKISGTEYRYPLCAQ--YVDML 343 (343) Q Consensus 295 ~~~~~~~v~~~~~~~-~~~~~~~~~~~~~~~~~gGv~i~~P~ai~--~~dGI 343 (343) ++-+--...+++..- .-+.+...+.+.+.... 
|+.++||.+++ ...|| T Consensus 278 ~~Al~t~~~~~~~~e~~~~~~~~~~~i~~~~a~-G~g~lRPe~a~~i~~tg~ 328 (335) T protein:vir:78 278 SKTLITAQVAPVQAKLWEDHDQFSWVLDTFQMY-NIGARRPDTAGAIELKGI 328 (335) T ss_pred cceEEEEEEEecccceeeccchhhHhhhHHHHc-CCcccCcceEEEEEecCC Confidence 333333233333321 11223355666666664 79999997655 56788 No 164 >protein:vir:7990 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:151 # MgeName: Che8 # Cross-refs: genbank:acc:NP_817344;genbank:gi:29565772;genbank:GeneID:1258978 Probab=51.77 E-value=0.57 Score=21.91 Aligned_cols=264 Identities=10% Similarity=0.054 Sum_probs=112.7 Q ss_pred cCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccC--CCCcceeEEEEeeecccccceeecCCcCccce Q lcl|NC_011142. 29 IGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLA--NIPEYATHWNYRSYDGAAMGKFISANASDLPR 106 (343) Q Consensus 29 ~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~--~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~ 106 (343) |.+ -.|..+.|. ..+.+.....+....++.... .+..|+ ++.++.....+.+..... +..++. T Consensus 1 MA~----------~~~~pei~~---~~v~~~~~~~lv~~~l~~~~~~~~~~~Gd-Tv~ip~~~~~~~~d~~~~-~~~~~~ 65 (273) T protein:vir:79 1 MAF----------NNFIPELWS---DMLLEEWTAQTVFANLVNREYEGIASKGN-VVHIAGVVAPTVKDYKAA-GRQTSA 65 (273) T ss_pred Ccc----------hhhhHHHHH---HHHHHHHHhhccchhhhhccccccccCCc-EEEEeecCcccccccccC-CCccCc Confidence 111 123445454 335555555566666553321 122343 566665544343332211 222333 Q ss_pred eeeccceeEEEEEE-EEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccccc Q lcl|NC_011142. 107 VAQSAKLHQVELGY-AGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSA 185 (343) Q Consensus 107 v~~~~~~~~~~v~~-~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~ 185 (343) -+.+-++....+-. ...++.++ +++..+ ...++. +-...+..++++..|+.++ + ++..-+. ..+. 
T Consensus 66 ~~~~~~~~~~tid~~~~~~~~i~--d~d~~~-~~~~~~-~~~~~~~~ala~~vD~~i~-~--------~~~~a~~-~~~~ 131 (273) T protein:vir:79 66 DAISDTGVDLLIDQEKSIDFLVD--DIDRVQ-VAGSLE-AYTRAGATALATDTDKFIA-D--------MLVDNGT-ALTG 131 (273) T ss_pred cccccceEEEEEeeecccceeec--cHHHHh-hcccHH-HHHHHHHHHHHHHHHHHHH-H--------HHhhccc-cccc Confidence 34444555555544 35556655 444433 344675 3555677788888776543 1 1100000 0000 Q ss_pred CcCccccCHHHHHHHHHHHHHHHHHhcCCe-ecccEEEecHHHHHHHhcc--ccCCCCCccHHHHHHhcCcceeeccccc Q lcl|NC_011142. 186 TVNYATCTGQELFDLLNNPVFAVVKASKRF-HTPNTVLMFPDLWKRASSL--LMTGYTDRTVIEHFQINNAYTLLTRNPI 262 (343) Q Consensus 186 ~~~w~~~t~~~i~~di~~~~~~l~~~s~g~-~~p~tL~l~p~~~~~L~~~--~~~~~~~~tvle~l~~n~~~~~~~~~p~ 262 (343) + ..-+++.+++.|.++..++.++ ++ .....|+++|..+..|... ..... +..--+-..+++..-.+-|.+ T Consensus 132 ~---~~~~~~~~~~~i~~a~~~ld~~--~vP~~~R~lvv~p~~~~~Ll~~~~~~~~~-~~~~~~~~l~~G~ig~~~G~~- 204 (273) T protein:vir:79 132 S---APSDADDAFDLIASALKELTKA--NVPNVGRVVVVNAEMAFWLRSSGSKLTSA-DTSGDAAGLRAGTIGNLLGAR- 204 (273) T ss_pred c---cccchhhHHHHHHHHHHHhhhc--cCCccCcEEEECHHHHHHHhhchhhhhhh-hhcccccceeeeEeeEEeceE- Confidence 1 1224667788888888877653 32 1224799999999987531 11110 000000000111111111211 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeec-cchhcccceecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKP-IPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~-~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ......+. .+. + ...+++.++- +.+..- ..+..+.. ++.....+..... .|+.+.||.+++.+. T Consensus 205 ------i~~s~~lp---~~~-~-~~~~a~~~~A--~~~a~~~~~~e~~r~-~~~~~~~v~~~~~-yg~~v~~p~~vv~~~ 269 (273) T protein:vir:79 205 ------IVESNNLR---DTD-D-EQFVAFHPSA--AAYVSQIDTVEALRD-QDSFSDRIRALHV-YGGKVVRPTGVVVFN 269 (273) T ss_pred ------EEeccccc---ccC-c-eEEEEEeccc--eeeeeehhhhhcccC-cccceeeeeeeee-eeeEEecCceEEEEe Confidence 11111111 000 1 1123333221 111110 01111111 2233445545544 578888999888765 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) == T Consensus 270 ~~ 271 (273) T protein:vir:79 270 KT 271 (273) T ss_pred cc Confidence 33 No 165 >protein:vir:1541 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:31 # MgeName: phiYeO3-12 # Cross-refs: genbank:acc:NP_052109;swissprot:trembl:q9t107;genbank:gi:9634035;uniprot:Q9T107;genbank:GeneID:1262383 Probab=49.82 E-value=0.63 Score=21.69 Aligned_cols=305 Identities=11% Similarity=0.041 Sum_probs=123.9 Q ss_pred hhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeec Q lcl|NC_011142. 10 AQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYD 89 (343) Q Consensus 10 ~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~ 89 (343) ...+++...++ .+++.|.. .++....|...-...++..+-+ .-..+.++.+.+ ...|. ++.+. T Consensus 1 ma~~~~~~~~~----t~~~~~~~----~~~~~a~~ie~f~g~V~~~f~~----~s~~~~~~~~~~-~~~G~-sv~i~--- 63 (347) T protein:vir:15 1 MANIQGGQQIG----TNQGKGQS----AADKLALFLKVFGGEVLTAFAR----TSVTMPRHMLRS-IASGK-SAQFP--- 63 (347) T ss_pred CCccccCCccc----cccccCCC----cchHHHHHHHHHHHHHHHHHHH----hhhhhhcccccc-ccccc-eeEee--- Confidence 33344444443 22232211 1122334443333455543332 123445554432 22232 33333 Q ss_pred cccccee--ecCCcCccce--eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeee- Q lcl|NC_011142. 90 GAAMGKF--ISANASDLPR--VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYF- 164 (343) Q Consensus 90 ~~G~a~~--~~~~~~dip~--v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~- 164 (343) ..|..+. +.. +.+++. 
.+..-.+....|-..- -+..-+.++..++ +..++-..-...+..++++..|+.++- T Consensus 64 ~ig~~t~~~~~~-g~~l~~~~~~~~~~e~~ltID~~~-~~~~~VddlD~~q-~~~D~~~~~~~~~g~aLA~~~D~~i~~~ 140 (347) T protein:vir:15 64 VIGRTKAAYLKP-GENLDDKRKDIKHTEKVIHIDGLL-TADVLIYDIEDAM-NHYDVRAEYTAQLGESLAMAADGAVLAE 140 (347) T ss_pred eccceeeeeecc-CCCCCCCCCCCccceEEEEechhh-hhhHHhhhHHHHh-cCCcchHHHHHHHHHHHHHHHHHHHHHH Confidence 2333332 121 122221 1122233333332221 1233456777644 566787888889999999999988761 Q ss_pred ----eehh---hcceeeeecCCccc-cccC-cCc--cccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhc Q lcl|NC_011142. 165 ----GDTN---RNMSGLLNNPNVTK-TSAT-VNY--ATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASS 233 (343) Q Consensus 165 ----G~~~---~g~~GLlN~p~v~~-~~~~-~~w--~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~ 233 (343) .+.. ....+..-.+++.. .... .+. +..+++.|++-+.++..+|.++.=-. ....++++|+.|..|.. T Consensus 141 l~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~i~d~~~~a~~~Lde~~VP~-~gR~~vv~P~~y~~LL~ 219 (347) T protein:vir:15 141 LAGLVNLPDASNENIEGLGKPTVLTLVKPTTGDLTDPVELGKAIIAQLTIARASLTKNYVPA-ADRTFYTTPDNYSAILA 219 (347) T ss_pred HHHHhhccccccccccccCccccccccccccccchhhhhHHHHHHHHHHHHHHHHhhcCCCc-cCCEEEeCHHHHHHHhc Confidence 1111 11111111111111 1111 111 12346778888888888887653211 23679999999999975 Q ss_pred cccCCCCC-ccHHHHHHhcCcceeeccccccccccceeeechhhhc----cccCCcc---------ceEEEEEc------ Q lcl|NC_011142. 234 LLMTGYTD-RTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAA----GVSNGNK---------DRYVVYDK------ 293 (343) Q Consensus 234 ~~~~~~~~-~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~----g~~~~g~---------dr~v~y~~------ 293 (343) .......+ .+-.. + .++....+-|.++....+. +..... 
++..|.+ .....|.+ T Consensus 220 ~~~~~~~d~~~~~~-~-~~G~Vg~i~G~~V~~Sn~l----p~~~~t~~~~~~~~g~~~~~~~~~~~~~~~~f~~~~~l~~ 293 (347) T protein:vir:15 220 ALMPNAANYQALID-H-ERGTIRNVMGFEVVEVPHL----TAGGAGDTREDAPADQKHAFPATSSTTVKVALDNVVGLFQ 293 (347) T ss_pred cccccccccccccc-c-cceEEEEEeceEEEecccc----cccccccccccccccccccccccccceeeeccccceeeee Confidence 32111111 00001 1 2332223333322221111 000000 0000000 01111111 Q ss_pred ccceEEEeeccch--hcccceecCceeEeeeeeeeeeEEEECcceeeee--ccC Q lcl|NC_011142. 294 SERNLALAKPIPF--RMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYV--DML 343 (343) Q Consensus 294 ~~~~~~~~v~~~~--~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~--dGI 343 (343) .++-+.....+++ +....+ +.....+...... |+-+.||.+++-+ -|| T Consensus 294 h~~A~g~v~~~~~~~e~~~~~-~~~~d~i~~~~~~-G~~vlrP~~av~~~~~~~ 345 (347) T protein:vir:15 294 HRSAVGTVKLKDLALERARRA-NYQADQIIAKYAM-GHGGLRPEAAGAIVLPKV 345 (347) T ss_pred ccceeeeeEeeceeeeecccc-hhhhhhhehhhhc-CCceeccccEEEEecCCC Confidence 1111111111221 222222 2223344444444 8999999987654 455 No 166 >protein:vir:10450 Length: 344 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:184 # MgeName: phiA1122 # Cross-refs: genbank:acc:NP_848297;genbank:gi:30387487;genbank:GeneID:1733971 Probab=47.30 E-value=0.71 Score=21.41 Aligned_cols=310 Identities=12% Similarity=0.046 Sum_probs=120.6 Q ss_pred hhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHH-HHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeee Q lcl|NC_011142. 10 AQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLA-SLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSY 88 (343) Q Consensus 10 ~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~-~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~ 88 (343) ...+.+-+..+.. ..|...-..+....|+ ++|. ++++++-+. -..+.++.+++ +- +-.++.+. 
T Consensus 1 ma~~~~~~~~n~~-------~~~~~~~~~~~~al~i-e~~~geV~~~f~~~----s~~~~~~~~r~-i~-~g~s~~~~-- 64 (344) T protein:vir:10 1 MANMTGGQQLGTN-------QGKDVMAAGDKLALFL-KVFGGEVLTAFART----SVTTSRHMVRS-IS-SGKSAQFP-- 64 (344) T ss_pred CccccccccCCcc-------cCCccCCccchhHHHH-HHHHHHHHHHHHHH----hhhcccceeee-ec-ccceEEEE-- Confidence 1111111222210 0111111112234444 5553 676665443 23344554432 22 22334333 Q ss_pred cccccceeec-CCcCcccee--eeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeee- Q lcl|NC_011142. 89 DGAAMGKFIS-ANASDLPRV--AQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYF- 164 (343) Q Consensus 89 ~~~G~a~~~~-~~~~dip~v--~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~- 164 (343) ..|..+.-. ..+.+++.. +..-.+....|=. ..-+..-+.|+..+ ++..++...-...+..++++..|+.++- T Consensus 65 -~iG~~~~~~~~~G~~l~~t~~~~~~~e~~l~ID~-~~y~~~~VdDiD~~-q~~~D~r~~~~~~~G~aLA~~~D~~i~~~ 141 (344) T protein:vir:10 65 -VLGRTQAAYLAPGENLDDIRKDIKHTEKVITIDG-LLTADVLIYDIEDA-MNHYDVRSEYTSQLGESLAMAADGAVLAE 141 (344) T ss_pred -eeceeEEEeeecCCCCCCCCCCcccceEEEEEcc-hhhhhhhhhhHHHH-hcCcchHHHHHHHHHHHHHHHHHHHHHHH Confidence 334444211 112233221 1222332222221 11234446777764 4566787888888899999999987752 Q ss_pred ---eeh-----hhcceeeeecCCccccccCcC--ccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhcc Q lcl|NC_011142. 165 ---GDT-----NRNMSGLLNNPNVTKTSATVN--YATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSL 234 (343) Q Consensus 165 ---G~~-----~~g~~GLlN~p~v~~~~~~~~--w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~ 234 (343) +.. .....|+-..-.+.....+.. =...+++.+++.|.++...|.++.=-.. 
...++++|+.|..|..- T Consensus 142 la~~a~~~~~~~~~~~g~~~~~~~~~~~~~~~~t~~~~~~~~~~~~i~~a~~~Lde~~VP~~-gR~~vv~P~~y~~Ll~~ 220 (344) T protein:vir:10 142 IAGLCNVESQYNENITGLGTATVIETTQDKTTLTDQVALGKEIIAALTKARAALTKNYVPSS-DRVFYCDPDSYSAILAA 220 (344) T ss_pred HHhhhccccccccccccccccceeecccccccccchhhhHHHHHHHHHHHHHHHhhcCCCcc-CCEEEeChHHHHHHhhc Confidence 111 112222211111111111111 1234567888889999888876522111 25688999999998652 Q ss_pred cc--CCCCCccHHHHHHhcCcceeeccccccccccceee-ech--hhhcccc---CCc--cceEEEEEcc------cceE Q lcl|NC_011142. 235 LM--TGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLM-ATE--LAAAGVS---NGN--KDRYVVYDKS------ERNL 298 (343) Q Consensus 235 ~~--~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~-~~~--~~~~g~~---~~g--~dr~v~y~~~------~~~~ 298 (343) .. ...++ .-.. ..++....+.|.++....+..-- ... ....|.. .++ -...+.+.+. |+-+ T Consensus 221 ~~~~~~~~~-~~~~--~~~G~V~~v~G~~V~~Sn~lp~~~~~~~~~~~tg~~~~~~~~~~~~~~~~~s~~~~l~~h~~A~ 297 (344) T protein:vir:10 221 LMPNAANYA-ALID--PEKGSIRNVMGFEVVEVPHLTAGGAGTSREGTTGQKHAFPATKSGNDKVAKDNVIGLFMHRSAV 297 (344) T ss_pred ccccccccc-cccc--eeeeEEEEEeceEEEeccccccccCCcccccccCccccccCCcccceeeecceeEEEeechhhh Confidence 21 11111 0000 11222222222222111110000 000 0000000 000 0011111111 1111 Q ss_pred EEeeccchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 299 ALAKPIPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 299 ~~~v~~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -....++++.-.. ..+...+.+.+.... 
|+.+.||.+++.+.== T Consensus 298 ~~v~~~~~~~e~~r~~~~~~d~i~g~~~~-G~~vlRPe~a~~v~~~ 342 (344) T protein:vir:10 298 GTVKLRDLALERARRANFQADQIIAKYAM-GHGGLRPEAAGAVVFK 342 (344) T ss_pred hhhhhccceeecccchhHHHHHHHHHhhc-ccceecccceEEEEee Confidence 1111112111110 122234555555554 7899999877543211 No 167 >protein:vir:103323 Length: 364 # NCBI annotation: major capsid-like protein # Family: family:all:2806 # MgeID: mge:1609 # MgeName: Era103 # Cross-refs: genbank:acc:YP_001039668;genbank:gi:125999997;genbank:GeneID:4818399 Probab=44.93 E-value=0.79 Score=21.15 Aligned_cols=299 Identities=8% Similarity=-0.014 Sum_probs=127.5 Q ss_pred hhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecC- Q lcl|NC_011142. 21 KFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISA- 99 (343) Q Consensus 21 ~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~- 99 (343) |.... ....|...-.++....|+....-.+++.+-+.. ..+.++.+++ +. +-.++.+. ..|..+...- T Consensus 1 ms~~n--~~t~~~~~~~~~~~al~le~f~geV~taf~~~s----~~~~~~~~rt-i~-~gkS~q~~---~iG~~~~~~~~ 69 (364) T protein:vir:10 1 MSNPN--VLTQPAVSASGEVDSLLIEKFNNRVHEQYLKGE----NLLQWFDVQE-VV-GTNSVSNK---YIGETELQVLS 69 (364) T ss_pred CCCcc--cccccccccccchhhhhhhhhhhhHHHHHHHHH----hhcCcceeee-ec-ccceEEee---eeeeeEEeeec Confidence 22211 112233332234445666655667777664421 2234444432 22 22333333 2344443110 Q ss_pred CcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCC-ccHHHHHHHHHHHHHhhhheeee----eehhhcceee Q lcl|NC_011142. 100 NASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMP-IDSMQAELAFRGSEEHSQRVAYF----GDTNRNMSGL 174 (343) Q Consensus 100 ~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~-l~~~k~~aA~~~~~~~~n~~~f~----G~~~~g~~GL 174 (343) .+..+-.....-++....|=..= -++.-+.++..+ +...+ +...-...+..++++..|+.++- +-. 
.+..+- T Consensus 70 ~G~~ld~~~~~~~k~~itID~ll-~a~~~V~diDe~-q~~~D~vR~e~s~e~G~ALA~~~Dq~i~~~v~~aa~-a~~~~~ 146 (364) T protein:vir:10 70 PGKSPDASPTEFDKNRLVVDTTV-IARNTVAHFHDV-QNDIDGLKSKLSVNQAKKLKKMEDSMVIQQLVLGGI-SNTEAI 146 (364) T ss_pred cCcccCCCCcccCcEEEEeccee-eechhhhhHHHH-hcCccchhHHHHHHHHHHHHHHHHHHHHHHHHhhhh-hccccc Confidence 01111111111222222221110 012224555543 33555 55666667788888888887641 100 111221 Q ss_pred eecCCcc-cc-----ccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhcc--ccCCCCCcc-HH Q lcl|NC_011142. 175 LNNPNVT-KT-----SATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSL--LMTGYTDRT-VI 245 (343) Q Consensus 175 lN~p~v~-~~-----~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~--~~~~~~~~t-vl 245 (343) .+.|.+. .. .....=...+++.+++-|.++...|.++.=-..+ ..++|+|..|..|..- .++-.++.+ -- T Consensus 147 ~~~~~~~~~g~~i~~~~~a~~~~~~~~~l~~ai~~a~~~LdEkdVP~~~-R~~vv~P~~y~~Ll~~~~lvn~d~~~~~~~ 225 (364) T protein:vir:10 147 RKNPRVAGHGFSIHIVGLASSFLTSPQYMMAAIEMAMEQQTEQEVDTSE-LCGLMPWTAFNCLRDADRIVDKSYTIAASD 225 (364) T ss_pred ccCCcccCCcceeeecccCcchhhhHHHHHHHHHHHHHHHhhcCCCccc-cEEEeChHHHHHHhcCCccccccccccCCC Confidence 1122111 00 0111112345788888888888888764211122 5799999999888653 221111100 00 Q ss_pred HHHHhcCcceeeccccccccccceeeechhhhcc-------------c--cC-----C--ccceEEEEEcccceEEEeec Q lcl|NC_011142. 246 EHFQINNAYTLLTRNPIDIKIRFQLMATELAAAG-------------V--SN-----G--NKDRYVVYDKSERNLALAKP 303 (343) Q Consensus 246 e~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g-------------~--~~-----~--g~dr~v~y~~~~~~~~~~v~ 303 (343) .+. +.-...+.|.++..+.+. +...... + |. + .+-++++|.+ +-+.-.-. 
T Consensus 226 ~~~--~G~v~~v~Gv~Vv~Sn~l----P~~~~~~~~t~~~t~h~ls~~~~g~~y~v~~d~~~~~~~~f~~--~Al~tv~~ 297 (364) T protein:vir:10 226 NTV--DGFVLKSWNTPIVPSNRF----PKLSDNTEGTGNTKHHKLSNAGNGNRYDVTAGQTSAQAVLFTQ--DALLVGRT 297 (364) T ss_pred ccc--cceeEEEeceEEEecccc----ccccccccccccccccccccccCCcccccccccceeEEEEEec--ceEEEEEE Confidence 111 111112233333222211 1111100 0 00 1 2455666644 33333333 Q ss_pred cchhcccc-eecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 304 IPFRMLAP-QLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 304 ~~~~~~~~-~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++++.-.- +.+...|.+.+.... |+-++||.+++-+.== T Consensus 298 ~~~t~e~~~~~~~~~~~ida~~a~-G~g~lRPeaa~~i~~~ 337 (364) T protein:vir:10 298 ISITGDIFYEKKEKTWYIDTFLAE-GAIPDRWEAVAVVTAA 337 (364) T ss_pred ecceeeeeeccceeeeeeeeehcc-cCcccCccceEEEEec Confidence 44433211 223456666666654 7999999998876433 No 168 >protein:vir:97397 Length: 517 # NCBI annotation: major capsid protein # Family: family:all:11745 # MgeID: mge:1675 # MgeName: Q54 # Cross-refs: genbank:acc:YP_762590;genbank:gi:115304291;genbank:GeneID:5130600 Probab=43.98 E-value=0.82 Score=21.05 Aligned_cols=303 Identities=11% Similarity=-0.057 Sum_probs=96.7 Q ss_pred CCcceec----------------------------------cchhhhhchhhhch--hcccccccCcchhec--chhhhh Q lcl|NC_011142. 1 MSEKRVV----------------------------------IDAQTIAGNRWLNK--FLDSNATIGVPSVVN--DADGGA 42 (343) Q Consensus 1 ~~~~~~~----------------------------------~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~--dA~~~~ 42 (343) +.+.... .+............ 
+......-+.-...+ +....+ T Consensus 168 l~~~~~~~~~~~~e~~~~l~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 247 (517) T protein:vir:97 168 LKERENGGDNAALKTVSELAANLMKQRESEKILGVEALKVTPEATEFLKTREAEVAYMSASLTKDPKAAWTAELKERGIS 247 (517) T ss_pred HHHHHHHHHHHHHhhhhhhhhhHHHHHHhhhhcccccccccchhhHHHHHHHHHHHHHHhcccccccceeeeeccccccc Confidence 1111100 00000000000000 000000000000000 011112 Q ss_pred hhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEE Q lcl|NC_011142. 43 AYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAG 122 (343) Q Consensus 43 ~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~ 122 (343) .++.. ..+-..+.......-..++++++.. .+...+. .....+.+.+... +...|..+..++....++..++ T Consensus 248 ~~~~p--~~~~~~i~~~~~~~~~i~~~~~~~~---i~~~~~~--~~~~~~~a~~~~e-G~~kp~s~~tf~~~~~~~~~ia 319 (517) T protein:vir:97 248 GMPAP--AGILKRIQDAVNDEGSLLPFIRHEN---LPTLVVG--GDNALTQGTGHTT-GTDKTESNITLQTRVLTPQYVY 319 (517) T ss_pred ccccc--hHHHHHHHHhhhhhccceeeeeecc---ccceeee--cccccceeeeeec-CCcccccccceeeEEeeHhhhh Confidence 22211 1111122221111112223333211 1111111 1111122223322 2335666666777777777777 Q ss_pred EEEeecHHHHHHHHHhCCC-ccHHHHHHHHHHHHHhhhheeeeeehh-hcceeeeecCCccccccCcCccccCHHHHHHH Q lcl|NC_011142. 123 VECHYSLDELRTTAAVNMP-IDSMQAELAFRGSEEHSQRVAYFGDTN-RNMSGLLNNPNVTKTSATVNYATCTGQELFDL 200 (343) Q Consensus 123 ~~~~~~~~El~~a~~~g~~-l~~~k~~aA~~~~~~~~n~~~f~G~~~-~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~d 200 (343) .-...|.+-|+.+...-.+ |..--....+..+++.|++-+++|+.. .+..|+++..+... ..+.-.+.+..+++.. 
T Consensus 320 ~~~~~S~qll~Ds~~dd~~~l~s~i~~~l~~~l~~~ee~a~l~GdGtg~~~~gi~~~a~~~~--~~~~~~~~~~~d~i~~ 397 (517) T protein:vir:97 320 KYIKLPKIVMNSNATDIAGAILTYVMNRLPDMVIMAVNRAIIMGGVTGVSETQIYPVVGDAW--ATNVTGTTNIQELLEK 397 (517) T ss_pred hhhhhhHHHHHHhhhccHHHHHHHHHHHHHHHHHHHHHHHHhcccCCCcccccccccccccc--cccccccchHHHHHHH Confidence 7666666655544321111 555566678889999999999999863 23345543322100 0000011122233333 Q ss_pred HHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhh--c Q lcl|NC_011142. 201 LNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAA--A 278 (343) Q Consensus 201 i~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~--~ 278 (343) +..++.. ...-.++|+|..|..|.+.. +..|.=++ ...+. .+.+..+-..... .+.... . T Consensus 398 l~~a~~~--------a~~a~~vmn~~t~~~I~klK--D~~G~Yl~----~~~~~---~~~~~~l~G~~~~-~~~~~~~~~ 459 (517) T protein:vir:97 398 LSVATPK--------AADSTLVIHRNDLAAIRFLK--DKNGNYVF----PVGVS---NQTIATHFGFNRL-VQSVAVDEK 459 (517) T ss_pred HHHHhhh--------ccCCEEEECHHHHHHHHHhh--cCCCCeec----cCcCC---cccccccCCcccc-ccccccCce Confidence 3222211 11346899999999997543 43332111 11000 0000000000000 000000 0 Q ss_pred cccCCccceEEEEEcccceEEEeeccchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 279 GVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 279 g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) -.+. 
..+-.++-...-..+ ..|-+- ...-.| -...+++| -|+.|.++++..=- T Consensus 460 ~~~~-~~~y~i~~~~g~~~~-----~~fd~~---~n~~~f--~~~~~~~g-~i~~~~r~a~~~~~ 512 (517) T protein:vir:97 460 TAVS-LSGYVTNGSRGMEFE-----QGTILV---ENNKEY--LFEMPISG-SLEYKGTTAYGTYT 512 (517) T ss_pred eEee-ccccEEEeecceeee-----eeeecc---cCceeE--eeeeeecc-ccccccceEEEEEc Confidence 0000 000000000000000 111100 001111 11233333 34444444432111 No 169 >protein:vir:106590 Length: 349 # NCBI annotation: putative major head protein # Family: family:all:1083 # MgeID: mge:1598 # MgeName: Lj965 # Cross-refs: genbank:acc:NP_958585;genbank:gi:41179245;genbank:GeneID:2717126 Probab=37.97 E-value=1.1 Score=20.38 Aligned_cols=280 Identities=10% Similarity=0.061 Sum_probs=103.2 Q ss_pred cCcchhecchhh-----hhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeeeccccc-ceeecCCcC Q lcl|NC_011142. 29 IGVPSVVNDADG-----GAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSYDGAAM-GKFISANAS 102 (343) Q Consensus 29 ~~~~~~~~dA~~-----~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~-a~~~~~~~~ 102 (343) |-...+.+|-+. .-.|+..++..+-+ +.+.+.+-+.++||... .. +.+............ +..++-.+. T Consensus 1 ~~~~~~~~~~~~~~~~~~d~~~~~~l~~~~~---~~~~~~~l~~~~Fp~~~-~~-~~~~~~~~~~~~~~~~a~~v~~~~~ 75 (349) T protein:vir:10 1 MKNQKLQLDLQRFATPILDMFSQNTVLDYTR---NRQYPEMLGDTLFPAVK-VP-TLEVDILKAGSRVPTIASVSAFDAE 75 (349) T ss_pred CCcchhhHHHHHHHHHhhcccCHHHHHHHHH---hcCcchhhHhhcCCccc-cc-cceeEEEeeccCcceeeeeecCCCC Confidence 222344443211 11334444433322 22334566677888532 11 111111111111111 122222221 Q ss_pred ccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHH--------HHHHHHHHHH----hhhheeeeee---h Q lcl|NC_011142. 103 DLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQ--------AELAFRGSEE----HSQRVAYFGD---T 167 (343) Q Consensus 103 dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k--------~~aA~~~~~~----~~n~~~f~G~---~ 167 (343) .|..+-........++.+...+.++..|+......+.+-.... ....++.+.. ..-+++++|. . 
T Consensus 76 -~~~~~r~~~~~~~~~p~ik~~~~i~e~dl~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~q~l~~Gki~~~ 154 (349) T protein:vir:10 76 -AEIGTREASKMTAELAYVKRKMQITEEMLIKLQSPRNTAEENYLKQYVFDDIDAMVQAVKARGEKMTMEMFATGKITDK 154 (349) T ss_pred -cceecccceeEEeeccccccccccCHHHHHHHhhccCcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeeEEc Confidence 2333333344445566777788889899887666554422111 1112222222 2234444551 1 Q ss_pred hhcce---eeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhccccCCCCCccH Q lcl|NC_011142. 168 NRNMS---GLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSLLMTGYTDRTV 244 (343) Q Consensus 168 ~~g~~---GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~~~~~~~~~tv 244 (343) +.|+. |. ...+....+.++.|+++++ ++++||.+.+..+ | ..|..++|+++.+..|.+ +..+ T Consensus 155 ~~g~~vD~g~-~~~~~~~lt~~~~Ws~~~a-dpi~Di~~~~~~~-----g-~~p~~~vm~~~~~~~l~~-------~~~i 219 (349) T protein:vir:10 155 KNGIAIDYGV-PKKHQETLSGTKTWDKSDA-SIIDNLQDWSDSL-----D-VTPTRALTSKKVLRILMR-------STEI 219 (349) T ss_pred CCcEEEeccc-CccceeEecCcccCCCCCC-CHHHHHHHHHHHh-----C-CCccEEEeCHHHHHHHhc-------CHHH Confidence 11211 10 1112222345567988665 5788998776543 3 358999999999999853 2344 Q ss_pred HHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEc----ccc----eEEEeeccchh-cccceecC Q lcl|NC_011142. 245 IEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDK----SER----NLALAKPIPFR-MLAPQLLG 315 (343) Q Consensus 245 le~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~----~~~----~~~~~v~~~~~-~~~~~~~~ 315 (343) .+.+..++..... + . ..+ ...+. +.++- ...+|+. +.. +-.-.+|...- ++|....+ T Consensus 220 ~~~~~~~~~~~~~---~--~---~~~-~~~l~----~~~~~-~i~~yd~~y~d~~~~~~~t~~~~~p~~~v~l~~~~~~G 285 (349) T protein:vir:10 220 KEAIFGKDTGRVV---G--Q---ADL-DQWMT----AQGLP-IIRAYDGKYRDEDSRGNLTTNSYFPEDRIVLFNDEVPG 285 (349) T ss_pred HHHhccccccccc---C--H---HHH-HHHHH----hcCCc-eEEEEeeEEEeecCCCceeecccccCCeEEEecCCCce Confidence 4444322111000 0 0 000 00000 01111 1233321 000 11112232221 11211111 Q ss_pred -ceeEeee-----------eeeee-eEEEE-----Ccce---eeeeccC Q lcl|NC_011142. 
316 -LGITVPA-----------EYKIS-GTEYR-----YPLC---AQYVDML 343 (343) Q Consensus 316 -~~~~~~~-----------~~~~g-Gv~i~-----~P~a---i~~~dGI 343 (343) ..|-... ....+ |..++ .|.. .+....+ T Consensus 286 ~~~yG~~~e~~~~~~g~~~~~~~~~~~~~~~~~~~dP~~~~~~~~s~~l 334 (349) T protein:vir:10 286 QKIYGPTPEENRLISSNAQVSNVGNIMAKIYETSEDPIGTWILASATML 334 (349) T ss_pred eEEeeccchhhhhcccccceeeccceEEEeeeecCCCceEEEEEeeeee Confidence 1110000 00000 11111 1110 0011111 No 170 >protein:vir:102605 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1661 # MgeName: Llij # Cross-refs: genbank:acc:YP_655002;genbank:gi:109392192;genbank:GeneID:4157227 Probab=37.93 E-value=1.1 Score=20.37 Aligned_cols=264 Identities=11% Similarity=0.050 Sum_probs=112.2 Q ss_pred cCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccC--CCCcceeEEEEeeecccccceeecCCcCccce Q lcl|NC_011142. 29 IGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLA--NIPEYATHWNYRSYDGAAMGKFISANASDLPR 106 (343) Q Consensus 29 ~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~--~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~ 106 (343) |.+ -.|..+.|.. .+.+.....+....++.... .+..| .++.++.....+.+. +-..+..++. T Consensus 1 MA~----------~~~~pe~~~~---~v~~~~~~~lv~~~l~~~~~~~~~~~G-dtv~ip~~~~~~~~d-~~~~~~~~~~ 65 (273) T protein:vir:10 1 MAF----------NNFIPELWSD---MLLEEWTAQTVFANLVNREYEGTASKG-NVVHIAGVVAPTVKD-YKAAGRQTSA 65 (273) T ss_pred Ccc----------hhhhHHHHHH---HHHHHHHhhhccchhhccccccccccC-ceEEEeecccccccc-cccCCCccCc Confidence 111 1234455542 34444444555555554321 23334 366666554433322 2111122222 Q ss_pred eeeccceeEEEEEE-EEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccccc Q lcl|NC_011142. 107 VAQSAKLHQVELGY-AGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSA 185 (343) Q Consensus 107 v~~~~~~~~~~v~~-~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~ 185 (343) -+.+-+.....+-. ...++.++ +++..+.. .++.. -...+..+++...|+.++. .+ .+ -+ +..+. 
T Consensus 66 ~~~~~~~~~~tid~~~~~~~~i~--d~d~~~~~-~~~~~-~~~~~~~alA~~vD~~i~~---~~--~~----a~-~~~~~ 131 (273) T protein:vir:10 66 DAISDTGVDLLIDQEKSIDFLVD--DIDRVQVA-GSLEA-YTRAGATALATDTDKFIAD---ML--VD----NG-TALTG 131 (273) T ss_pred cccccceEEEEEeeeeecceEee--cHHHhhhh-ccHHH-HHHHHHHHHHHHHHHHHHH---HH--hc----cc-ccccc Confidence 23333344444422 35555554 55554443 35643 5556777888887766551 00 00 00 00011 Q ss_pred CcCccccCHHHHHHHHHHHHHHHHHhcCCe-ecccEEEecHHHHHHHhcc--ccCCCCCccHHHHHHhcCcceeeccccc Q lcl|NC_011142. 186 TVNYATCTGQELFDLLNNPVFAVVKASKRF-HTPNTVLMFPDLWKRASSL--LMTGYTDRTVIEHFQINNAYTLLTRNPI 262 (343) Q Consensus 186 ~~~w~~~t~~~i~~di~~~~~~l~~~s~g~-~~p~tL~l~p~~~~~L~~~--~~~~~~~~tvle~l~~n~~~~~~~~~p~ 262 (343) + +.-++..+++.|.++..+|.+. .+ .....|+++|..|..|.+- .... .+..--+=..+++....+-|.+ T Consensus 132 ~---~~~~~~~~~~~i~~a~~~ld~~--~vP~~~R~lvv~p~~~~~L~~~~~~~~~-~~~~~~~~~l~~G~ig~i~G~~- 204 (273) T protein:vir:10 132 S---APTDADDAFDLIAKALKELTKA--NVPNVGRVVVVNAEMAFWLRSSGSKLTS-ADTSGDAAGLRAGTIGNLLGAR- 204 (273) T ss_pred c---cccchhHHHHHHHHHHHHhhhc--CCCcCCCEEEECHHHHHHHhcchhhhhh-hhccccccceeeeeeeEEeceE- Confidence 1 1235677899999998888654 22 1235799999999988531 1111 0000000001111111222211 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeecc-chhcccceecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPI-PFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~-~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ......+. .+. ....+++.++- +.+..-. .+..+..+ +.....+..... .|+.+.||.+++.+. T Consensus 205 ------v~~s~~lp---~~~--~~~~~~~~~~A--~~~a~q~~~~e~~r~~-~~~~~~v~~~~~-yg~~v~~~~~~~~l~ 269 (273) T protein:vir:10 205 ------IVESNNLR---DTD--DEQFVAFHPSA--AAYVSQIDTVEALRDQ-DSFSDRIRALHV-YGGKVVRPTGVVVFN 269 (273) T ss_pred ------EEEecccc---cCC--ccEEEEEeccc--eeeeeeeehhhcccCC-Ccceeeeeeeee-eeeeEeccceEEEEe Confidence 11111110 011 11233443221 1111100 11111111 223444545444 578888999888754 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) == T Consensus 270 ~~ 271 (273) T protein:vir:10 270 KT 271 (273) T ss_pred cc Confidence 33 No 171 >protein:vir:105822 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1636 # MgeName: PMC # Cross-refs: genbank:acc:YP_655767;genbank:gi:109522090;genbank:GeneID:4157630 Probab=37.93 E-value=1.1 Score=20.37 Aligned_cols=264 Identities=11% Similarity=0.050 Sum_probs=112.2 Q ss_pred cCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccC--CCCcceeEEEEeeecccccceeecCCcCccce Q lcl|NC_011142. 29 IGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLA--NIPEYATHWNYRSYDGAAMGKFISANASDLPR 106 (343) Q Consensus 29 ~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~--~~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~ 106 (343) |.+ -.|..+.|.. .+.+.....+....++.... .+..| .++.++.....+.+. +-..+..++. T Consensus 1 MA~----------~~~~pe~~~~---~v~~~~~~~lv~~~l~~~~~~~~~~~G-dtv~ip~~~~~~~~d-~~~~~~~~~~ 65 (273) T protein:vir:10 1 MAF----------NNFIPELWSD---MLLEEWTAQTVFANLVNREYEGTASKG-NVVHIAGVVAPTVKD-YKAAGRQTSA 65 (273) T ss_pred Ccc----------hhhhHHHHHH---HHHHHHHhhhccchhhccccccccccC-ceEEEeecccccccc-cccCCCccCc Confidence 111 1234455542 34444444555555554321 23334 366666554433322 2111122222 Q ss_pred eeeccceeEEEEEE-EEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCcccccc Q lcl|NC_011142. 107 VAQSAKLHQVELGY-AGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSA 185 (343) Q Consensus 107 v~~~~~~~~~~v~~-~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~ 185 (343) -+.+-+.....+-. ...++.++ +++..+.. .++.. -...+..+++...|+.++. .+ .+ -+ +..+. 
T Consensus 66 ~~~~~~~~~~tid~~~~~~~~i~--d~d~~~~~-~~~~~-~~~~~~~alA~~vD~~i~~---~~--~~----a~-~~~~~ 131 (273) T protein:vir:10 66 DAISDTGVDLLIDQEKSIDFLVD--DIDRVQVA-GSLEA-YTRAGATALATDTDKFIAD---ML--VD----NG-TALTG 131 (273) T ss_pred cccccceEEEEEeeeeecceEee--cHHHhhhh-ccHHH-HHHHHHHHHHHHHHHHHHH---HH--hc----cc-ccccc Confidence 23333344444422 35555554 55554443 35643 5556777888887766551 00 00 00 00011 Q ss_pred CcCccccCHHHHHHHHHHHHHHHHHhcCCe-ecccEEEecHHHHHHHhcc--ccCCCCCccHHHHHHhcCcceeeccccc Q lcl|NC_011142. 186 TVNYATCTGQELFDLLNNPVFAVVKASKRF-HTPNTVLMFPDLWKRASSL--LMTGYTDRTVIEHFQINNAYTLLTRNPI 262 (343) Q Consensus 186 ~~~w~~~t~~~i~~di~~~~~~l~~~s~g~-~~p~tL~l~p~~~~~L~~~--~~~~~~~~tvle~l~~n~~~~~~~~~p~ 262 (343) + +.-++..+++.|.++..+|.+. .+ .....|+++|..|..|.+- .... .+..--+=..+++....+-|.+ T Consensus 132 ~---~~~~~~~~~~~i~~a~~~ld~~--~vP~~~R~lvv~p~~~~~L~~~~~~~~~-~~~~~~~~~l~~G~ig~i~G~~- 204 (273) T protein:vir:10 132 S---APTDADDAFDLIAKALKELTKA--NVPNVGRVVVVNAEMAFWLRSSGSKLTS-ADTSGDAAGLRAGTIGNLLGAR- 204 (273) T ss_pred c---cccchhHHHHHHHHHHHHhhhc--CCCcCCCEEEECHHHHHHHhcchhhhhh-hhccccccceeeeeeeEEeceE- Confidence 1 1235677899999998888654 22 1235799999999988531 1111 0000000001111111222211 Q ss_pred cccccceeeechhhhccccCCccceEEEEEcccceEEEeecc-chhcccceecCceeEeeeeeeeeeEEEECcceeeeec Q lcl|NC_011142. 263 DIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPI-PFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVD 341 (343) Q Consensus 263 ~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~-~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~d 341 (343) ......+. .+. ....+++.++- +.+..-. .+..+..+ +.....+..... .|+.+.||.+++.+. T Consensus 205 ------v~~s~~lp---~~~--~~~~~~~~~~A--~~~a~q~~~~e~~r~~-~~~~~~v~~~~~-yg~~v~~~~~~~~l~ 269 (273) T protein:vir:10 205 ------IVESNNLR---DTD--DEQFVAFHPSA--AAYVSQIDTVEALRDQ-DSFSDRIRALHV-YGGKVVRPTGVVVFN 269 (273) T ss_pred ------EEEecccc---cCC--ccEEEEEeccc--eeeeeeeehhhcccCC-Ccceeeeeeeee-eeeeEeccceEEEEe Confidence 11111110 011 11233443221 1111100 11111111 223444545444 578888999888754 Q ss_pred cC Q lcl|NC_011142. 
342 ML 343 (343) Q Consensus 342 GI 343 (343) == T Consensus 270 ~~ 271 (273) T protein:vir:10 270 KT 271 (273) T ss_pred cc Confidence 33 No 172 >protein:vir:8324 Length: 410 # NCBI annotation: gp41 # Family: family:all:30827 # MgeID: mge:154 # MgeName: Corndog # Cross-refs: genbank:acc:NP_817892;genbank:gi:29566325;genbank:GeneID:1259520 Probab=37.63 E-value=1.1 Score=20.34 Aligned_cols=292 Identities=12% Similarity=0.039 Sum_probs=118.1 Q ss_pred CCcceeccc---------hhhh----hchhh-------hchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhh Q lcl|NC_011142. 1 MSEKRVVID---------AQTI----AGNRW-------LNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVP 60 (343) Q Consensus 1 ~~~~~~~~~---------~~~~----~~~~~-------~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~ 60 (343) |.|+-+.+- ++-+ .|++- ++.+.+..-+. |- ..+...+|-.=--++++.. T Consensus 89 ~r~~p~~~~veyRSaGE~lkal~~~~~Gd~~A~~~~e~~r~a~~~~~Tg---------d~-~~~i~~~~v~d~i~li~q~ 158 (410) T protein:vir:83 89 MRGSPVGTEVEYRSAGEYMLDMWNSAQGNASAADRLEVYARAADHQKTG---------DL-QGVIPDPIVGPVIDFIDSA 158 (410) T ss_pred CcCCCCCCCcccccHHHHHHHHhccCCchHHHHHHHHHHHHhhccCccc---------cc-ccccchhHhhhHHHHHhhc Confidence 333321110 1111 22222 22222222110 11 1112222321111344433 Q ss_pred hhcccchhhccccCCCCcceeEEEEeeecccccceee------cCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHH Q lcl|NC_011142. 
61 YADITYLEDVPVLANIPEYATHWNYRSYDGAAMGKFI------SANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRT 134 (343) Q Consensus 61 ~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~------~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~ 134 (343) ++- +.+|+ -+|.--.++.|.+......+..+ ++.++.+|+-.+.++..+..|..++-.-.+|.+.+++ T Consensus 159 r~i---~slf~---tLP~~g~T~eY~v~t~~~tV~~q~~~~kqa~EGd~L~~gKl~~~t~tA~ikTyGGyt~LSRQ~IER 232 (410) T protein:vir:83 159 RPL---VSTLG---TLPLNNATFYRPIVSQRPAVGLQGVAGGASDEKTELDSQKMVIDRLTVNAKTLGGYVNVSRQAIDF 232 (410) T ss_pred cch---hhhhh---hCCCCCCeeEEeeecccccccccccccccccccccccccceeeeeccceeehhcCcccccceeeec Confidence 322 22222 12222337777777655544332 4566779999999999999999999888899999998 Q ss_pred HHHhCCCccHHHHHHHHHHHHHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCC Q lcl|NC_011142. 135 TAAVNMPIDSMQAELAFRGSEEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKR 214 (343) Q Consensus 135 a~~~g~~l~~~k~~aA~~~~~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g 214 (343) +.-.-++...+-...|.....+.--+-.|++... +. . +-...|+++...-|.++...+....++ T Consensus 233 s~v~~L~~~lraL~~AYA~atea~vra~L~~t~t-~~-------------~--a~~~~Tad~~~~~i~da~~~v~da~~~ 296 (410) T protein:vir:83 233 SSPSALDLVVNGLGQQYAIETEALVGAALASTST-GA-------------V--GYGNATADNVASAIWQAAGAVYTAVKG 296 (410) T ss_pred CChhhHHHHHHHHHHHHHHHHHHHHHHHHHHhhh-hh-------------h--hhhhccHHHHHHHHHHHHHHHhhhhcc Confidence 7655544444333222222222222222222211 10 0 111225556555566655555544333 Q ss_pred eecccEEEecHHHHHHHhccccCCCC--CccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEE Q lcl|NC_011142. 215 FHTPNTVLMFPDLWKRASSLLMTGYT--DRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYD 292 (343) Q Consensus 215 ~~~p~tL~l~p~~~~~L~~~~~~~~~--~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~ 292 (343) .+-..|+++|..+..+-.+-..-++ ..|.- +--+-....+.|..+.+..+. . ....+++ . 
+|- T Consensus 297 -~~~~~i~vS~DVl~~~~~~f~~~~~~~~dt~G--fg~~~lg~gi~G~~~~ipVvm---~---~~a~AgT-----A-~f~ 361 (410) T protein:vir:83 297 -MGRLVIAIAPDVLGDFGPLFAPVNPTNAHSTG--FEAGRFGQGVMGSISGIPVVM---S---AALGSGD-----A-YLF 361 (410) T ss_pred -ceeeeEEechhhhhhccceeeccCCCCccccc--ccccccccchhhhhcccceEE---e---cCCCcCe-----e-eEe Confidence 3457799999998776543221111 11211 111111122333322222211 1 1122221 1 221 Q ss_pred cccceEEE-eec-cchhcccceecCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 293 KSERNLAL-AKP-IPFRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 293 ~~~~~~~~-~v~-~~~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) ++.-++. +=+ -|++......- +....|. ...++-...|..+.=+.|- T Consensus 362 -~~~Ai~~~eS~~gp~qL~d~~i~--nLt~~yS-gY~a~a~~~~~gliPv~g~ 410 (410) T protein:vir:83 362 -STAAIECFEQRVGTLQVVEPSVF--GLQVAYA-GYFSTLVVNEDAIVPLVGS 410 (410) T ss_pred -ccceeeeeecCCceeEeeCCchh--hhhhhhe-eeeeeccccccceeeeccC Confidence 3332222 111 23333222222 2222222 1222333333333333333 No 173 >protein:vir:3364 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:67 # MgeName: T3 # Cross-refs: genbank:acc:NP_523335;genbank:gi:17570826;genbank:GeneID:927448 Probab=34.65 E-value=1.3 Score=20.00 Aligned_cols=309 Identities=10% Similarity=0.019 Sum_probs=124.9 Q ss_pred hhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHH-HHHHHHHhhhhhcccchhhccccCCCCcceeEEEEeee Q lcl|NC_011142. 10 AQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLA-SLETTVYEVPYADITYLEDVPVLANIPEYATHWNYRSY 88 (343) Q Consensus 10 ~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~-~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~ 88 (343) ..-+++-++++ .+++.|.. . ++....|. ++|. .++..+-+. -..+.++.+.+ .-.| .++.+. 
T Consensus 1 ~~~~~~~~~~~----t~~g~~~~--~--~~~~al~i-e~~~g~V~~~f~~~----s~~~~~v~~r~-~~~G-~sv~i~-- 63 (347) T protein:vir:33 1 MANIQGGQQIG----TNQGKGQS--A--ADKLALFL-KVFGGEVLTAFART----SVTMPRHMLRS-IASG-KSAQFP-- 63 (347) T ss_pred CCCCccCcccc----cccccCCc--c--cchHHHHH-HHHHHHHHHHHHHH----Hhhhhhhcccc-cccc-ceeEee-- Confidence 11122222222 22333311 1 12223455 6654 555544332 23444554432 2223 333333 Q ss_pred ccccccee--ecCCcCccce--eeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHHhhhheee- Q lcl|NC_011142. 89 DGAAMGKF--ISANASDLPR--VAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEEHSQRVAY- 163 (343) Q Consensus 89 ~~~G~a~~--~~~~~~dip~--v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~~~f- 163 (343) ..|..+. +.. +.+++. .+..-.+....|-.+- -+..-+.++..++ +..++-..-...+..++++..|+.++ T Consensus 64 -~iG~~t~~~~~~-g~~l~~~~~~~~~~e~~ltiD~~~-y~~~~VddiD~~q-~~~D~~~~~~~~~g~aLA~~~D~~i~~ 139 (347) T protein:vir:33 64 -VIGRTKAAYLKP-GENLDDKRKDIKHTEKVIHIDGLL-TADVLIYDIEDAM-NHYDVRAEYTAQLGESLAMAADGAVLA 139 (347) T ss_pred -eccceeeeeecC-CCCCCCCCCCCccceEEEEechhh-hhhHHHhhHHHHh-cCCchhHHHHHHHHHHHHHHHHHHHHH Confidence 3344432 221 122221 1122233333322211 1123356777654 46677777888899999999999886 Q ss_pred ----eeehh---hcceeeeecCCccc---cccCcCcc-ccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHh Q lcl|NC_011142. 164 ----FGDTN---RNMSGLLNNPNVTK---TSATVNYA-TCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRAS 232 (343) Q Consensus 164 ----~G~~~---~g~~GLlN~p~v~~---~~~~~~w~-~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~ 232 (343) .+... ....+.+..+.... .+.++.|. .++++.|++.|.++...|.++.=-.. ...++++|+.|..|. 
T Consensus 140 ~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~tg~~~d~~~~a~~i~~~i~~a~~~Lde~~VP~~-gR~~vv~P~~y~~Ll 218 (347) T protein:vir:33 140 ELAGLVNLPDGSNENIEGLGKPTVLTLVKPTTGSLTDPVELGKAIIAQLTIARASLTKNYVPAA-DRTFYTTPDNYSAIL 218 (347) T ss_pred HHHHhhhhhcccccccccccccccccccccccccccchhhhHHHHHHHHHHHHHHHhhcCCCcc-CcEEEeCHHHHHHHh Confidence 22211 11122222222111 11222332 24678899999999999876532222 356999999999987 Q ss_pred ccccCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccc--cC-------CccceEEEEEc------ccce Q lcl|NC_011142. 233 SLLMTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGV--SN-------GNKDRYVVYDK------SERN 297 (343) Q Consensus 233 ~~~~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~--~~-------~g~dr~v~y~~------~~~~ 297 (343) .-..-.+.+..-.+-+ .++....+.|.++....+.......-...++ |. .+..+..++.+ .++- T Consensus 219 ~~~~~~~~d~~~~~~~-~~G~V~~i~G~~V~~Sn~lp~~~~~~~~~~~~ag~~~~~~~~~~~~~~~a~~~~~gl~~h~~A 297 (347) T protein:vir:33 219 AALMPNAANYQALLDP-ERGTIRNVMGFEVVEVPHLTAGGAGDTREDAPADQKHAFPATSSTTVKVALDNVVGLFQHRSA 297 (347) T ss_pred cccccccccccccccc-ccceeEEEeceeEEEecccccCccccccccccccccccccCCcccceeccccceeeeeecchh Confidence 5211111111001111 1222222333222111111000000000000 00 00111111110 1111 Q ss_pred EEEeeccc--hhcccceecCceeEeeeeeeeeeEEEECcceeeee--ccC Q lcl|NC_011142. 298 LALAKPIP--FRMLAPQLLGLGITVPAEYKISGTEYRYPLCAQYV--DML 343 (343) Q Consensus 298 ~~~~v~~~--~~~~~~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~--dGI 343 (343) +.....++ ++....+ +.....+...... 
|+-+.||.+++-+ -|| T Consensus 298 ~g~v~~~~~~~e~~r~~-~~~~d~i~~~~~~-G~~vlrP~~av~i~~~~~ 345 (347) T protein:vir:33 298 VGTVKLKDLALERARRA-NYQADQIIAKYAM-GHGGLRPEAAGAIVLPKV 345 (347) T ss_pred heeeeeeceeeeeccch-hhhhHhhhhhhhc-CCceecccceEEEecCCC Confidence 11111111 1222212 2223444455554 8999999987654 455 No 174 >protein:vir:94711 Length: 347 # NCBI annotation: capsid # Family: family:all:975 # MgeID: mge:1528 # MgeName: K1F # Cross-refs: genbank:acc:YP_338120;genbank:gi:77118198;genbank:GeneID:3707734 Probab=29.47 E-value=1.7 Score=19.38 Aligned_cols=304 Identities=9% Similarity=0.001 Sum_probs=117.0 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHHHHHHHHHhhhhhcccchhhccccCCCCcce Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLASLETTVYEVPYADITYLEDVPVLANIPEYA 80 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~ 80 (343) |.-- +-+.++ .+.+.| ++ ++|....|.....-.++...-+. -..+.++.+.+ +- +- T Consensus 1 m~~~----------~~~~~~----t~~g~~-~~---~~d~~al~ik~f~~eV~~~f~~~----s~~~~~~~~r~-i~-~G 56 (347) T protein:vir:94 1 MANV----------PGQKIG----TDQGKG-KS---SSDALALFLKVFAGEVLTAFTRR----SVTADKHIVRT-IQ-NG 56 (347) T ss_pred CCCC----------Cccccc----cccccC-Cc---cccHHHHHHHHHhHHHHHHHHHH----Hhhhccccccc-cc-cc Confidence 1100 001111 111111 11 12222344433233444332211 12233444332 22 22 Q ss_pred eEEEEeeeccccccee--ecCCcCcccee--eeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHHH Q lcl|NC_011142. 81 THWNYRSYDGAAMGKF--ISANASDLPRV--AQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSEE 156 (343) Q Consensus 81 ~~~~~~~~~~~G~a~~--~~~~~~dip~v--~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~ 156 (343) .++.+. ..|..+. +.. +++++.. +..-.+....|-.+- -+..-+.++.. 
.+...++..+-...+..++++ T Consensus 57 ~sv~i~---~iG~~tv~~~t~-G~~l~~~~~~~~~~e~~itID~~~-~~~~~VddiD~-~q~~~D~~~~~~~~~g~aLa~ 130 (347) T protein:vir:94 57 KSAQFP---VMGRTSGVYLAP-GERLSDKRKGIKHTEKVITIDGLL-TADVMIFDIED-AMNHYDVAGEYSNQLGEALAI 130 (347) T ss_pred ceEEEe---cccceeeeeecC-CCCcCCCCCCCCcceEEEEecchh-hhhHHhhhHHH-HhcCcchHHHHHHHHHHHHHH Confidence 333333 3344442 111 1222111 122223333322221 12333456666 445667878888899999999 Q ss_pred hhhheeee---------eehhhcceeeeecCCcc-ccccCc-CccccCHHHHHHHHHHHHHHHHHhcCCee-cccEEEec Q lcl|NC_011142. 157 HSQRVAYF---------GDTNRNMSGLLNNPNVT-KTSATV-NYATCTGQELFDLLNNPVFAVVKASKRFH-TPNTVLMF 224 (343) Q Consensus 157 ~~n~~~f~---------G~~~~g~~GLlN~p~v~-~~~~~~-~w~~~t~~~i~~di~~~~~~l~~~s~g~~-~p~tL~l~ 224 (343) ..|+.++. +.+.....|+- .+++. ..+.+. .=..++++.+++.|.++...|.+. .+- ....++|+ T Consensus 131 ~~D~~i~~~~~~~aa~~~~~~~~~~g~~-~~s~~~~~~~~~~~~~~~~~~~~~~~i~~a~~~Lde~--~VP~~~R~~vv~ 207 (347) T protein:vir:94 131 AADGAVLAEMAILCNLPAASNENIAGLG-TASVLEVGKKADLDTPAKLGEAIIGQLTIARAKLTSN--YVPAGDRYFYTT 207 (347) T ss_pred HHHHHHHHHHHHHhccccccccccCCCc-ccceeeccccccccchhhhHHHHHHHHHHHHHHHhhc--CCCCCCcEEEeC Confidence 99987742 11122223321 12211 111111 112345778888888888888754 221 23578999 Q ss_pred HHHHHHHhccccCCCCCccHHHHHH----hcCcceeeccccccccccceeeechhhhcc---ccCCccceEEE------E Q lcl|NC_011142. 225 PDLWKRASSLLMTGYTDRTVIEHFQ----INNAYTLLTRNPIDIKIRFQLMATELAAAG---VSNGNKDRYVV------Y 291 (343) Q Consensus 225 p~~~~~L~~~~~~~~~~~tvle~l~----~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g---~~~~g~dr~v~------y 291 (343) |..|..|...+.- +..++.. +++....+-|.++....+.-.....-...+ ....|.+..+. 
| T Consensus 208 P~~~~~Ll~~~~~-----~~~~~~~~~~~~~G~Vg~i~G~~V~~Sn~lp~~~~t~~~~~~~~~~~aG~~~~~~~~~~~~~ 282 (347) T protein:vir:94 208 PDNYSAILAALMP-----NAANYAALIDPETGNIRNVMGFVVVEVPHLVQGGAGETRGDDGITIASGQKHAFPATASSDV 282 (347) T ss_pred HHHHHHHhccchh-----hhhhccccccccccceEEEeceEEEecCcccccccccccccCcceecCcccccccccchhhh Confidence 9999988643221 1111111 222222222322211111100000000000 00111111110 1 Q ss_pred Ec----------ccceEEEeeccchhccc-ceecCceeEeeeeeeeeeEEEECcceeeeecc-C Q lcl|NC_011142. 292 DK----------SERNLALAKPIPFRMLA-PQLLGLGITVPAEYKISGTEYRYPLCAQYVDM-L 343 (343) Q Consensus 292 ~~----------~~~~~~~~v~~~~~~~~-~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~dG-I 343 (343) .- .++-+.....++++.-. -..+...+.+.+... .|+.+.||.+++.+.= . T Consensus 283 ~~~~~~~~~l~~h~~A~~~v~~~~~~~e~~r~~~~~~d~i~~~~~-~G~~~~rP~~a~~~~~~~ 345 (347) T protein:vir:94 283 KVTMDNVVGLFSHRSAVGTVKLRDLALERDRDVDAQGDLIVGKYA-MGHGGLRPEAAGALVFSP 345 (347) T ss_pred cccccceeEEEeehhhhhhhhcccccccchhchhhHHHHhhhhhh-hcCcccccceeEEEEecC Confidence 10 11111111111221111 111223455666665 4799999998765421 1 No 175 >protein:vir:100057 Length: 375 # NCBI annotation: T7-like capsid protein # Family: family:all:975 # MgeID: mge:1604 # MgeName: P-SSP7 # Cross-refs: genbank:acc:YP_214206;genbank:gi:61806429;genbank:GeneID:3294737 Probab=25.91 E-value=2 Score=18.93 Aligned_cols=308 Identities=11% Similarity=0.063 Sum_probs=120.5 Q ss_pred CCcceeccchhhhhchhhhchhcccccccCcchhecchhhhhhhhHHHHH-HHHHHHHhhhhhcccchhhccccCCCCcc Q lcl|NC_011142. 1 MSEKRVVIDAQTIAGNRWLNKFLDSNATIGVPSVVNDADGGAAYYISQLA-SLETTVYEVPYADITYLEDVPVLANIPEY 79 (343) Q Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~dA~~~~~f~~~~l~-~id~~v~e~~~~~l~~~~~i~v~~~~~~~ 79 (343) |+.--+ .++ .++...+.|..- .|++.-+.+.+++. .+++.+-+. 
-..+.++.+.+ .- + T Consensus 1 ~~~~~~----------~~~----~~~n~~t~~~~~-~~~~~~al~le~f~geV~~~f~~~----si~~~~~~~rt-i~-~ 59 (375) T protein:vir:10 1 MANANQ----------VAL----GRSNLSTGTGYG-GATDKYALYLKLFSGEMFKGFQHE----TIARDLVTKRT-LK-N 59 (375) T ss_pred Cccccc----------ccc----CccccCCccccc-cccchHHHHHHHHhHHHHHHHHHH----Hhhhccccccc-cc-c Confidence 221110 000 011111112111 12233344445543 666555332 23344554432 22 2 Q ss_pred eeEEEEeeeccccccee--ecCCc--CccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhCCCccHHHHHHHHHHHH Q lcl|NC_011142. 80 ATHWNYRSYDGAAMGKF--ISANA--SDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVNMPIDSMQAELAFRGSE 155 (343) Q Consensus 80 ~~~~~~~~~~~~G~a~~--~~~~~--~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~ 155 (343) -.++.+. ..|..+. +..+. ++-|..+.+..+....|-.. .-|..-+.|+..+ +...++-..-...+..+++ T Consensus 60 Gksv~f~---~iG~~t~~~~t~G~~i~~~~~~d~~~te~~l~ID~~-~y~~~~VdDiD~a-qa~~Dlr~e~s~~~G~aLA 134 (375) T protein:vir:10 60 GKSLQFI---YTGRMTSSFHTPGTPILGNADKAPPVAEKTIVMDDL-LISSAFVYDLDET-LAHYELRGEISKKIGYALA 134 (375) T ss_pred CceEEEE---eeeeeEEeeecCCcCcCCccccCCCCCceEEEecch-hhhhhhHhhHHHH-hcCchhHHHHHHHHHHHHH Confidence 2333333 3344442 22221 22233333333333333221 1233445677764 4556777778888899999 Q ss_pred Hhhhheeee----e-ehhhccee--eeecCCccccc---cCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecH Q lcl|NC_011142. 156 EHSQRVAYF----G-DTNRNMSG--LLNNPNVTKTS---ATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFP 225 (343) Q Consensus 156 ~~~n~~~f~----G-~~~~g~~G--LlN~p~v~~~~---~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p 225 (343) +..|+.++- | .....+.| .+ .|+-.... ....=...|++.+++.|.++..+|.++.=-.. 
...++++| T Consensus 135 ~~~D~~i~~~l~kaa~~~~p~~~~~~~-~~Gg~~i~~~sg~~~~~~~ta~~~~~ai~~a~~~Lde~~VP~~-~R~~vv~P 212 (375) T protein:vir:10 135 EKYDRLIFRSITRGARSASPVSATNFV-EPGGTQIRVGSGTNESDAFTASALVNAFYDAAAAMDEKGVSSQ-GRCAVLNP 212 (375) T ss_pred HHHHHHHHHHHHHhhhhcccccccccc-ccCcceeeeccccccccccCHHHHHHHHHHHHHHHhhcCCCCC-CCEEEeCh Confidence 999987762 1 11111000 00 01111111 11112335799999999999999876532212 24588999 Q ss_pred HHHHHHhccccCC-CCCccHH-HHHHhcCcceeeccccccccccceeeechhhhc------------------------- Q lcl|NC_011142. 226 DLWKRASSLLMTG-YTDRTVI-EHFQINNAYTLLTRNPIDIKIRFQLMATELAAA------------------------- 278 (343) Q Consensus 226 ~~~~~L~~~~~~~-~~~~tvl-e~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~------------------------- 278 (343) +.|..|..-..++ ..+.... +=+..++--..+.|.++....+ .+...+. T Consensus 213 ~~y~~Ll~~~d~~~~~n~d~~~~~~~~~g~v~~i~Gv~V~~Sn~----lP~~~~~~~~~g~~~~~~a~~~~~~~~~~~~~ 288 (375) T protein:vir:10 213 RQYYALIQDIGSNGLVNRDVQGSALQSGNGVIEIAGIHIYKSMN----IPFLGKYGVKYGGTTGETSPGNLGSHIGPTPE 288 (375) T ss_pred HHHHHHHhcCCccceeeecccccceeccceEEEEeceEEEEecc----ccccccccccccccccccchhhhhccccccCC Confidence 9998885310000 0000000 0000011111122222111111 0100000 Q ss_pred ----ccc----------CCccceEEEEEcccceEEEeeccchhccc----ceecCceeEeeeeeeeeeEEEECcceeeee Q lcl|NC_011142. 279 ----GVS----------NGNKDRYVVYDKSERNLALAKPIPFRMLA----PQLLGLGITVPAEYKISGTEYRYPLCAQYV 340 (343) Q Consensus 279 ----g~~----------~~g~dr~v~y~~~~~~~~~~v~~~~~~~~----~~~~~~~~~~~~~~~~gGv~i~~P~ai~~~ 340 (343) .+| ..++-..+++. ++-+.-...++++.-- -+++-..+.+-..... |+.+.||.+++-+ T Consensus 289 ~~~~~~g~~~~y~~d~~~~~~~~~~~~~--~~A~g~v~~~~~~~~~~~~~~~~~~q~~~i~~~~a~-G~~~lrp~~av~l 365 (375) T protein:vir:10 289 NANATGGVNNDYGTNAELGAKSCGLIFQ--KEAAGVVEAIGPQVQVTNGDVSVIYQGDVILGRMAM-GADYLNPAAAVEL 365 (375) T ss_pred cceeeccccccccccccccCceEEEEEc--hhheeeeeeeccccccccchhhheeeeeeeeeeeee-ccCccCceeEEEE Confidence 000 00122233332 2211111112222111 1222223444344443 6788888886654 Q ss_pred ccC Q lcl|NC_011142. 
341 DML 343 (343) Q Consensus 341 dGI 343 (343) .== T Consensus 366 ~~~ 368 (375) T protein:vir:10 366 YIG 368 (375) T ss_pred ecC Confidence 211 No 176 >protein:vir:107120 Length: 329 # NCBI annotation: conserved phage protein # Family: family:all:701 # MgeID: mge:1571 # MgeName: CNPH82 # Cross-refs: genbank:acc:YP_950606;genbank:gi:119953686;genbank:GeneID:4643129 Probab=23.65 E-value=2.3 Score=18.62 Aligned_cols=289 Identities=10% Similarity=-0.003 Sum_probs=122.7 Q ss_pred cchhhhhchhhhc-----------hhcccccccCcchhecchhh-hhhhhHHHHHHHHHHHHhhhhhcccchhhccccCC Q lcl|NC_011142. 8 IDAQTIAGNRWLN-----------KFLDSNATIGVPSVVNDADG-GAAYYISQLASLETTVYEVPYADITYLEDVPVLAN 75 (343) Q Consensus 8 ~~~~~~~~~~~~~-----------~~~~~~~~~~~~~~~~dA~~-~~~f~~~~l~~id~~v~e~~~~~l~~~~~i~v~~~ 75 (343) .|---|-|-+.++ ..|+--++-++ .- ..+........+|+.+....+. ..-++.-.-. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-------~~nt~~l~~k~~~~LD~~~~~~~~s---~~~~~N~~~e 70 (329) T protein:vir:10 1 MDGIFITGVKTMNKEIKNATGKLKLNLQHFANKSV-------EPGDTLLKNKHVGILEKVTAANSYS---APAVISNDAI 70 (329) T ss_pred CCceEEechhhhhhhhhcccceeEEehhhhcCCcc-------CCchhHHHHHHHHHHHHHHHhhcee---eeeeccccee Confidence 1111122222111 11111111110 01 1222222334555433332221 1111110001 Q ss_pred CCcceeEEEEeeecccccceeecCCcCccceeeeccceeEEEEEEEEEEEeecHHHHHHHHHhC-CCccHHHHHHHHHHH Q lcl|NC_011142. 76 IPEYATHWNYRSYDGAAMGKFISANASDLPRVAQSAKLHQVELGYAGVECHYSLDELRTTAAVN-MPIDSMQAELAFRGS 154 (343) Q Consensus 76 ~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~~~~~~~~~v~~~~~~~~~~~~El~~a~~~g-~~l~~~k~~aA~~~~ 154 (343) . -+..++....+...|-.. 
|.- .++...-+++..+....+-+ ..+|.+.+.++...+..+ +.....-.+.++..+ T Consensus 71 ~-~~g~tVkIp~i~~~gl~D-Y~R-~~g~~~g~vt~~~~t~tidq-dR~~~F~VD~~D~dEtn~~l~a~~i~~~~~~~~v 146 (329) T protein:vir:10 71 F-MQGRSFTVIKGDVTELKD-YKR-NATNEFDHPQIQETTYFLDQ-EKYWGRFVDALDRRDTEGNIDINYVVAKQASEVV 146 (329) T ss_pred e-ccCcEEEEeeeccccccc-ccC-CCCccccccccceeEEEeec-ccceeeecchhhHhhhhhhhhHHHHHHHHHHHHh Confidence 1 245567776666555332 210 12233334445565555444 778888888888766533 222233334455666 Q ss_pred HHhhhheeeeeehhhcceeeeecCCccccccCcCccccCHHHHHHHHHHHHHHHHHhcCCeecccEEEecHHHHHHHhcc Q lcl|NC_011142. 155 EEHSQRVAYFGDTNRNMSGLLNNPNVTKTSATVNYATCTGQELFDLLNNPVFAVVKASKRFHTPNTVLMFPDLWKRASSL 234 (343) Q Consensus 155 ~~~~n~~~f~G~~~~g~~GLlN~p~v~~~~~~~~w~~~t~~~i~~di~~~~~~l~~~s~g~~~p~tL~l~p~~~~~L~~~ 234 (343) .-..|...|--..... | .. +. .+.|++.+++.|.++..+|.++ +.-....|+++|..+..|.+- T Consensus 147 ~pEiDay~~skla~~a--~-------~~--~~---~~~t~~nay~~i~~a~~~Lde~--~vp~~Rvl~VtP~~~~~Lk~~ 210 (329) T protein:vir:10 147 APYLDNLRFATLARNK--A-------KH--LT---VGSGADAQYDAVLDVSVELDEI--GAGASRILFVTPKFYKGIKKF 210 (329) T ss_pred hhHHHHHHHHHHHhhc--c-------cc--cc---cccCHHHHHHHHHHHHHHHHhc--CCCCCcEEEeCHHHHHHHHhh Confidence 5555655442221110 0 00 11 1246788999999999999875 444556899999999998641 Q ss_pred c-cCCCCCccHHHHHHhcCcceeeccccccccccceeeechhhhccccCCccceEEEEEcccceEEEeeccchhccccee Q lcl|NC_011142. 235 L-MTGYTDRTVIEHFQINNAYTLLTRNPIDIKIRFQLMATELAAAGVSNGNKDRYVVYDKSERNLALAKPIPFRMLAPQL 313 (343) Q Consensus 235 ~-~~~~~~~tvle~l~~n~~~~~~~~~p~~i~~~~~l~~~~~~~~g~~~~g~dr~v~y~~~~~~~~~~v~~~~~~~~~~~ 313 (343) . ....... .+-...++....+.|.++ ..++..... +.+.++++..-.-.+... 
..++.+.+.+ T Consensus 211 ~~f~~~~~~--~~~~~~~g~Vg~idG~~I-------i~vps~~~k-----~in~ii~~~~A~~~~~K~--~~~~~~~p~~ 274 (329) T protein:vir:10 211 VIELPQGDN--RQQVLGKGVQGELDGFTI-------VKVPSKMLQ-----GVEAMAVIGEVMASPIQA--NEAKLNSNVP 274 (329) T ss_pred hhhhccccc--cccceeeeeeeeecCeEE-------EEecCCccc-----ceeEEEEcCCceeeeeee--eeeeeeCCCC Confidence 1 0000000 000111111112222221 111111111 123333332211111110 1223222233 Q ss_pred cCceeEeeeeeeeeeEEEECcceeeeeccC Q lcl|NC_011142. 314 LGLGITVPAEYKISGTEYRYPLCAQYVDML 343 (343) Q Consensus 314 ~~~~~~~~~~~~~gGv~i~~P~ai~~~dGI 343 (343) +...|.+.+. .+.|+.|.+|.+.+.+.-+ T Consensus 275 ~~~a~~v~gr-~yyd~~V~~~k~~~I~~~~ 303 (329) T protein:vir:10 275 GMFGTLAEQM-LYTGAFVPEHLQKYIFTIG 303 (329) T ss_pred ccchheeeee-eeeeeEEEccccCEEEEec Confidence 4445777654 4789999999865433333 Done!