Query         lcl|NC_020858.1_cdsid_YP_007675402.1 [gene=SUAG_00010] [protein=hypothetical protein] [protein_id=YP_007675402.1] [location=complement(5641..6633)]
Match_columns 330
No_of_seqs    77 out of 87
Neff          6.5
Searched_HMMs 1612
Date          Thu Nov 7 16:15:17 2013
Command       /home/guerois/workspace/virfam/python/lib/hhsearch//hhsearch2 -i .//seq/seq_10 -d /home/guerois/workspace/virfam/python/profile_database/capsid_neck_tail.hhm -glob -cpu 7 -o .//seq/HHR/seq_10_vs_rec_db.hhr

No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 protein:vir:8843 Length: 317 # 100.0 2E-117 1E-120 659.9 30.0 312 1-326 1-317 (317)
2 protein:vir:96442 Length: 418 100.0 1.4E-67 8.4E-71 387.2 20.8 299 1-330 1-410 (418)
3 protein:vir:103370 Length: 418 100.0 3.6E-60 2.3E-63 346.4 19.5 303 1-330 1-410 (418)
4 protein:vir:97255 Length: 310 99.2 3.7E-12 2.3E-15 83.2 17.1 289 1-325 1-310 (310)
5 protein:vir:94933 Length: 330 99.0 3.4E-11 2.1E-14 78.0 14.3 288 1-325 25-330 (330)
6 protein:vir:80835 Length: 464 98.9 1.5E-10 9.3E-14 74.4 14.4 290 1-330 20-335 (464)
7 protein:vir:102823 Length: 470 98.8 6.1E-10 3.8E-13 71.1 14.9 292 1-330 19-340 (470)
8 protein:vir:96666 Length: 462 98.7 4.4E-09 2.7E-12 66.4 16.8 290 1-330 26-338 (462)
9 protein:vir:95603 Length: 463 98.7 2.4E-09 1.5E-12 67.8 15.2 288 1-330 26-335 (463)
10 protein:vir:99311 Length: 463 98.7 2.4E-09 1.5E-12 67.8 15.2 288 1-330 26-335 (463)
11 protein:vir:100851 Length: 514 98.6 5.1E-09 3.1E-12 66.1 14.9 293 1-330 43-368 (514)
12 protein:vir:94771 Length: 298 98.5 3.8E-08 2.4E-11 61.2 16.9 283 1-324 1-298 (298)
13 protein:vir:1886 Length: 385 # 98.3 1.9E-07 1.2E-10 57.4 16.8 277 1-330 103-385 (385)
14 protein:vir:191 Length: 385 # 98.3 1.9E-07 1.2E-10 57.4 16.8 277 1-330 103-385 (385)
15 protein:vir:63741 Length: 468 98.3 9.4E-08 5.8E-11 59.1 13.7 295 1-330 23-356 (468)
16 protein:vir:80491 Length: 467 98.3 9.8E-08 6.1E-11 59.0 13.6 295 1-330 23-355 (467)
17 protein:vir:348 Length: 321 # 98.2 4.7E-08 2.9E-11 60.8 11.5 282 1-323 1-321 (321)
18 protein:vir:100135 Length: 418 98.2 5.1E-07 3.1E-10 55.1 16.8 281 1-330 132-417 (418)
19 protein:vir:9574 Length: 300 # 98.2 4.6E-07 2.8E-10 55.3 16.5 283 1-325 1-300 (300)
20 protein:vir:1638 Length: 298 # 98.2 6.7E-07 4.2E-10 54.4 17.3 282 1-324 1-298 (298)
21 protein:vir:81070 Length: 390 98.1 1.4E-06 9E-10 52.6 16.5 276 1-324 107-390 (390)
22 protein:vir:9759 Length: 303 # 98.0 2.8E-06 1.7E-09 51.1 17.3 285 1-325 1-303 (303)
23 protein:vir:41 Length: 299 # N 98.0 5.6E-06 3.5E-09 49.4 18.8 279 1-326 1-299 (299)
24 protein:vir:95763 Length: 297 98.0 2.3E-06 1.4E-09 51.5 16.4 275 1-326 1-297 (297)
25 protein:vir:9309 Length: 324 # 98.0 4.9E-06 3.1E-09 49.7 18.1 281 1-330 21-320 (324)
26 protein:vir:4339 Length: 395 # 98.0 2.8E-06 1.7E-09 51.1 16.5 281 1-329 107-395 (395)
27 protein:vir:9410 Length: 415 # 98.0 1.8E-06 1.1E-09 52.0 15.4 282 1-330 117-405 (415)
28 protein:vir:104085 Length: 320 98.0 5.1E-06 3.2E-09 49.6 17.7 291 1-329 1-320 (320)
29 protein:vir:7771 Length: 330 # 97.9 7E-06 4.3E-09 48.8 18.4 296 1-330 1-325 (330)
30 protein:vir:8187 Length: 311 # 97.9 5E-06 3.1E-09 49.7 17.0 283 1-329 1-311 (311)
31 protein:vir:105905 Length: 304 97.9 6.2E-06 3.8E-09 49.1 17.4 280 1-324 1-304 (304)
32 protein:vir:94142 Length: 304 97.9 6.2E-06 3.8E-09 49.1 17.4 280 1-324 1-304 (304)
33 protein:vir:103955 Length: 324 97.9 7.5E-06 4.7E-09 48.7 17.8 280 1-330 18-320 (324)
34 protein:vir:99749 Length: 324 97.9 9.6E-06 6E-09 48.1 17.8 281 1-330 21-320 (324)
35 protein:vir:97148 Length: 324 97.9 1.3E-05 7.9E-09 47.4 18.3 278 1-330 24-320 (324)
36 protein:vir:96223 Length: 324 97.8 6.4E-06 4E-09 49.1 16.5 278 1-330 21-320 (324)
37 protein:vir:10364 Length: 390 97.8 9.3E-06 5.8E-09 48.2 16.9 276 1-324 107-390 (390)
38 protein:vir:78830 Length: 324 97.8 2E-05 1.3E-08 46.3 18.0 278 1-330 18-320 (324)
39 protein:vir:96392 Length: 324 97.8 2E-05 1.3E-08 46.3 18.0 278 1-330 18-320 (324)
40 protein:vir:81100 Length: 415 97.8 1.9E-05 1.2E-08 46.4 17.8 284 1-328 117-415 (415)
41 protein:vir:98339 Length: 415 97.8 1.9E-05 1.2E-08 46.4 17.8 284 1-328 117-415 (415)
42 protein:vir:79987 Length: 415 97.8 1.9E-05 1.2E-08 46.4 17.8 284 1-328 117-415 (415)
43 protein:vir:3158 Length: 321 # 97.7 6.7E-06 4.1E-09 49.0 15.1 288 1-330 1-320 (321)
44 protein:vir:4600 Length: 415 # 97.7 2.1E-05 1.3E-08 46.2 17.6 281 1-328 120-415 (415)
45 protein:vir:4700 Length: 415 # 97.7 2.1E-05 1.3E-08 46.2 17.6 281 1-328 120-415 (415)
46 protein:vir:4511 Length: 409 # 97.7 1.3E-05 7.8E-09 47.4 16.3 286 1-328 117-409 (409)
47 protein:vir:97053 Length: 390 97.7 1.2E-05 7.3E-09 47.6 15.9 277 1-324 107-390 (390)
48 protein:vir:485 Length: 407 # 97.5 3.8E-05 2.4E-08 44.8 16.2 289 1-330 106-405 (407)
49 protein:vir:4997 Length: 397 # 97.4 4E-05 2.5E-08 44.7 15.8 269 1-330 109-395 (397)
50 protein:vir:4159 Length: 315 # 97.4 2.6E-05 1.6E-08 45.7 14.7 289 1-322 12-315 (315)
51 protein:vir:4830 Length: 397 # 97.4 5.3E-05 3.3E-08 44.0 16.3 271 1-330 109-389 (397)
52 protein:vir:8102 Length: 543 # 97.3 9.8E-05 6.1E-08 42.6 17.3 284 1-330 247-543 (543)
53 protein:vir:4953 Length: 397 # 97.3 0.0001 6.2E-08 42.5 17.0 267 1-330 109-389 (397)
54 protein:vir:2430 Length: 318 # 97.1 0.00016 9.8E-08 41.4 17.2 286 1-330 14-318 (318)
55 protein:vir:78523 Length: 338 97.0 0.0002 1.2E-07 40.9 18.8 295 1-328 10-338 (338)
56 protein:vir:2504 Length: 305 # 97.0 0.00022 1.4E-07 40.7 17.3 285 1-330 1-305 (305)
57 protein:vir:101650 Length: 497 97.0 9.9E-05 6.2E-08 42.5 13.6 304 1-329 151-497 (497)
58 protein:vir:7855 Length: 497 # 97.0 9.9E-05 6.2E-08 42.5 13.6 304 1-329 151-497 (497)
59 protein:vir:99424 Length: 360 96.9 0.00025 1.6E-07 40.3 15.5 306 1-328 15-360 (360)
60 protein:vir:1025 Length: 408 # 96.9 0.00026 1.6E-07 40.3 18.5 273 1-330 116-403 (408)
61 protein:vir:4226 Length: 326 # 96.9 0.00026 1.6E-07 40.3 18.4 288 1-328 20-326 (326)
62 protein:vir:94673 Length: 419 96.9 0.00027 1.7E-07 40.1 19.1 284 1-330 121-419 (419)
63 protein:vir:4856 Length: 293 # 96.9 0.0003 1.8E-07 39.9 18.6 270 1-330 5-285 (293)
64 protein:vir:4456 Length: 401 # 96.8 0.00034 2.1E-07 39.6 16.3 284 1-329 107-401 (401)
65 protein:vir:3033 Length: 272 # 96.7 0.00037 2.3E-07 39.4 16.9 261 1-330 1-270 (272)
66 protein:vir:9820 Length: 272 # 96.7 0.00037 2.3E-07 39.4 16.9 261 1-330 1-270 (272)
67 protein:vir:1433 Length: 435 # 96.7 0.00038 2.4E-07 39.3 16.1 283 1-327 130-435 (435)
68 protein:vir:3870 Length: 400 # 96.7 0.00028 1.8E-07 40.0 14.1 263 1-330 129-400 (400)
69 protein:vir:2344 Length: 397 # 96.7 0.00039 2.4E-07 39.2 16.3 282 1-330 10-311 (397)
70 protein:vir:81227 Length: 413 96.7 0.00042 2.6E-07 39.1 15.0 287 1-329 113-413 (413)
71 protein:vir:101607 Length: 379 96.7 0.00042 2.6E-07 39.1 17.2 268 1-326 104-379 (379)
72 protein:vir:100247 Length: 425 96.7 0.00042 2.6E-07 39.1 16.9 295 1-330 127-425 (425)
73 protein:vir:3991 Length: 404 # 96.5 0.00058 3.6E-07 38.3 17.6 272 1-330 116-397 (404)
74 protein:vir:7409 Length: 408 # 96.4 0.00062 3.8E-07 38.2 18.6 267 1-330 116-397 (408)
75 protein:vir:100172 Length: 394 96.4 0.00068 4.2E-07 37.9 16.6 277 1-330 107-389 (394)
76 protein:vir:80684 Length: 315 96.3 0.00076 4.7E-07 37.7 15.7 289 1-330 1-310 (315)
77 protein:vir:100884 Length: 389 96.3 0.00081 5E-07 37.5 16.1 272 1-330 109-386 (389)
78 protein:vir:3613 Length: 272 # 96.1 0.0011 6.5E-07 36.9 17.3 260 1-326 1-272 (272)
79 protein:vir:8420 Length: 477 # 96.0 0.00066 4.1E-07 38.0 12.5 300 1-330 151-475 (477)
80 protein:vir:4197 Length: 314 # 96.0 0.0012 7.4E-07 36.6 17.4 291 1-326 7-314 (314)
81 protein:vir:105334 Length: 276 95.9 0.0012 7.7E-07 36.5 19.2 264 1-330 1-275 (276)
82 protein:vir:96123 Length: 274 95.8 0.0014 8.7E-07 36.2 18.3 263 1-329 1-274 (274)
83 protein:vir:80376 Length: 435 95.8 0.0015 9.3E-07 36.1 16.0 284 1-327 130-435 (435)
84 protein:vir:96833 Length: 275 95.7 0.0015 9.4E-07 36.0 16.9 258 1-330 1-272 (275)
85 protein:vir:99920 Length: 311 95.7 0.0016 9.7E-07 36.0 15.8 283 1-325 1-311 (311)
86 protein:vir:9704 Length: 394 # 95.7 0.0016 9.8E-07 36.0 15.5 261 1-330 125-392 (394)
87 protein:vir:104256 Length: 458 95.7 0.0017 1E-06 35.8 17.6 293 1-329 159-458 (458)
88 protein:vir:6242 Length: 390 # 95.5 0.0019 1.2E-06 35.5 16.6 273 1-330 106-390 (390)
89 protein:vir:1328 Length: 392 # 95.5 0.002 1.3E-06 35.3 17.4 280 1-330 106-392 (392)
90 protein:vir:95376 Length: 425 95.4 0.0016 9.7E-07 36.0 12.2 273 1-330 138-422 (425)
91 protein:vir:108211 Length: 318 95.3 0.0022 1.4E-06 35.2 12.7 283 1-328 1-318 (318)
92 protein:vir:102119 Length: 404 95.2 0.0025 1.5E-06 34.9 16.5 279 1-329 110-404 (404)
93 protein:vir:5739 Length: 366 # 95.0 0.003 1.8E-06 34.4 15.7 278 1-325 64-366 (366)
94 protein:vir:1084 Length: 437 # 94.8 0.0034 2.1E-06 34.1 14.8 268 1-330 152-435 (437)
95 protein:vir:78223 Length: 333 94.8 0.0035 2.2E-06 34.0 18.0 291 1-329 10-333 (333)
96 protein:vir:93742 Length: 274 94.7 0.0036 2.2E-06 34.0 18.7 261 1-330 1-272 (274)
97 protein:vir:739 Length: 231 # 94.3 0.0047 2.9E-06 33.4 17.0 229 35-326 1-231 (231)
98 protein:vir:94494 Length: 274 94.0 0.0058 3.6E-06 32.8 17.9 261 1-330 1-272 (274)
99 protein:vir:97433 Length: 274 94.0 0.0058 3.6E-06 32.8 17.9 261 1-330 1-272 (274)
100 protein:vir:962 Length: 397 # 93.7 0.0067 4.2E-06 32.5 12.7 268 1-325 129-397 (397)
101 protein:vir:3845 Length: 395 # 92.9 0.0094 5.8E-06 31.7 17.7 269 1-330 105-388 (395)
102 protein:vir:4092 Length: 390 # 92.6 0.011 6.6E-06 31.4 14.6 275 1-330 84-372 (390)
103 protein:vir:79928 Length: 393 92.5 0.011 6.9E-06 31.3 12.9 297 1-330 74-382 (393)
104 protein:vir:102082 Length: 392 92.0 0.013 8.1E-06 30.9 16.0 272 1-330 106-389 (392)
105 protein:vir:107593 Length: 392 92.0 0.013 8.1E-06 30.9 16.0 272 1-330 106-389 (392)
106 protein:vir:105004 Length: 392 92.0 0.013 8.1E-06 30.9 16.0 272 1-330 106-389 (392)
107 protein:vir:102873 Length: 392 92.0 0.013 8.1E-06 30.9 16.0 272 1-330 106-389 (392)
108 protein:vir:1383 Length: 421 # 91.5 0.016 9.6E-06 30.5 16.0 264 1-330 114-396 (421)
109 protein:vir:81160 Length: 371 91.4 0.016 1E-05 30.4 15.2 261 1-329 91-371 (371)
110 protein:vir:105038 Length: 428 90.6 0.02 1.2E-05 29.9 15.9 276 1-325 125-428 (428)
111 protein:vir:80068 Length: 301 90.4 0.021 1.3E-05 29.8 12.7 288 1-323 1-301 (301)
112 protein:vir:96262 Length: 274 86.8 0.043 2.6E-05 28.1 18.6 259 1-329 1-274 (274)
113 protein:vir:95898 Length: 274 86.8 0.043 2.6E-05 28.1 18.6 259 1-329 1-274 (274)
114 protein:vir:103886 Length: 302 86.3 0.047 2.9E-05 27.9 13.0 285 1-330 1-301 (302)
115 protein:vir:1268 Length: 397 # 85.5 0.052 3.2E-05 27.6 17.7 263 1-329 123-397 (397)
116 protein:vir:104342 Length: 314 85.5 0.053 3.3E-05 27.6 10.7 276 1-326 19-314 (314)
117 protein:vir:80930 Length: 278 84.7 0.059 3.6E-05 27.3 16.5 263 1-326 1-278 (278)
118 protein:vir:95318 Length: 328 84.3 0.062 3.8E-05 27.2 12.1 250 1-273 1-328 (328)
119 protein:vir:5255 Length: 304 # 80.4 0.096 5.9E-05 26.2 11.5 282 1-322 1-304 (304)
120 protein:vir:104479 Length: 310 80.0 0.0059 3.6E-06 32.8 1.9 94 1-100 208-310 (310)
121 protein:vir:103285 Length: 296 77.4 0.12 7.7E-05 25.5 14.7 281 1-326 1-296 (296)
122 protein:vir:1239 Length: 274 # 77.3 0.13 7.8E-05 25.5 18.1 263 1-330 1-272 (274)
123 protein:vir:6212 Length: 434 # 76.9 0.13 8.1E-05 25.4 16.5 279 1-329 141-434 (434)
124 protein:vir:79642 Length: 329 73.9 0.17 0.0001 24.9 13.0 289 1-326 26-329 (329)
125 protein:vir:9643 Length: 377 # 69.1 0.23 0.00014 24.1 10.7 286 1-326 76-377 (377)
126 protein:vir:6324 Length: 335 # 66.7 0.26 0.00016 23.8 11.0 288 1-330 1-332 (335)
127 protein:vir:107687 Length: 319 59.8 0.39 0.00024 22.9 13.3 280 1-323 16-319 (319)
128 protein:vir:95107 Length: 270 56.9 0.45 0.00028 22.5 15.3 257 1-330 1-268 (270)
129 protein:vir:2685 Length: 387 # 50.9 0.6 0.00037 21.8 13.2 265 1-330 116-385 (387)
130 protein:vir:96978 Length: 387 50.9 0.6 0.00037 21.8 13.2 265 1-330 116-385 (387)
131 protein:vir:94424 Length: 387 50.9 0.6 0.00037 21.8 13.2 265 1-330 116-385 (387)
132 protein:vir:96762 Length: 632 49.5 0.64 0.00039 21.7 14.6 266 1-324 355-632 (632)
133 protein:vir:9361 Length: 402 # 48.3 0.67 0.00042 21.5 13.6 266 1-330 131-400 (402)
134 protein:vir:9927 Length: 295 # 46.1 0.75 0.00046 21.3 7.6 267 1-330 1-284 (295)
135 protein:vir:93616 Length: 645 44.4 0.81 0.0005 21.1 17.3 277 1-330 332-644 (645)
136 protein:vir:5974 Length: 324 # 44.1 0.82 0.00051 21.1 13.7 266 1-330 1-295 (324)
137 protein:vir:106647 Length: 303 43.1 0.86 0.00053 20.9 9.6 251 1-330 1-292 (303)
138 protein:vir:95963 Length: 395 35.0 1.3 0.00078 20.0 12.5 274 1-330 86-380 (395)
139 protein:vir:93881 Length: 387 34.7 1.3 0.00079 20.0 13.1 266 1-330 116-385 (387)
140 protein:vir:9875 Length: 296 # 34.5 1.3 0.0008 20.0 12.3 251 1-330 15-291 (296)
141 protein:vir:78640 Length: 352 34.1 1.3 0.00081 19.9 14.3 262 1-330 83-350 (352)
142 protein:vir:78935 Length: 335 33.4 1.4 0.00084 19.8 10.5 287 1-330 1-332 (335)
143 protein:vir:98635 Length: 377 33.2 1.4 0.00085 19.8 9.3 289 1-326 76-377 (377)
144 protein:vir:101291 Length: 381 24.4 2.2 0.0014 18.7 10.7 288 1-330 74-374 (381)
145 protein:vir:9509 Length: 381 # 24.4 2.2 0.0014 18.7 10.7 288 1-330 74-374 (381)
146 protein:vir:94070 Length: 339 21.9 2.5 0.0016 18.4 12.4 279 1-323 35-339 (339)

No 1
>protein:vir:8843 Length: 317 # NCBI annotation: major head protein # Family: family:all:3919 # MgeID: mge:158 # MgeName: PaP3 # Cross-refs: genbank:acc:NP_775251;genbank:gi:27476049;genbank:GeneID:2700597
Probab=100.00 E-value=2.4e-117 Score=659.93 Aligned_cols=312 Identities=30% Similarity=0.423 Sum_probs=289.3

Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe
Q lcl|NC_020858.
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) ||+|+++|+||+++|+||||+|+|++|+|+||||+|+|+++++++++|+||||+|+++++|+++||+|+++++.++|+++ T Consensus 1 ma~~~~~~~t~~~~g~~~dl~~~I~~isp~dTPf~S~i~~~~a~~~~~~W~~d~l~~~~~~~~~EG~da~~~~~~~r~~~ 80 (317) T protein:vir:88 1 MATPTNAVSTVEINGKREDLIDIIYNIAPYDTPFMSAIGKGVATAITHEWQTDELRQPGKNTRVEGEDATIKAGSFTTML 80 (317) T ss_pred CCccccceEeeeeeeeeechhhhheecCCccCcceeeecCceecccEEEEEeeecCCccccccccCcccccccccCCEEe Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred cceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCc----CCcccccchhHHHHHhcccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNAS----VGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~----~~~~~r~~~Gi~~~i~tn~~~g 156 (330) +||||||+|+++||||+||++.+|+++|++||++||++|||||||++||+|++. .++++|+|+||++||++|+..+ T Consensus 81 ~N~tQIf~k~v~VSgTa~av~~~G~~~ela~q~~kk~~EikrdmE~~li~g~~a~~~~~~t~~r~~~Gl~~~i~t~~~~~ 160 (317) T protein:vir:88 81 NNYCQISDETLQVTGTADRVKKAGRKNELAYQLAKKSKELKLDMEYALVGAPQAKVQRNTTTPGQMANIFAYYKTNGSLG 160 (317) T ss_pred ccEEEEEEeEEEEeehhhhhhhcCccchhHHHHHHHHHHHHHHHHHHHhcCeeeccCCCCccchhhhhHHHHhccCceec Confidence 999999999999999999999999999999999999999999999999999864 2456899999999999999887 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcce Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNN 236 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~ 236 (330) ++|.. +..+...++++|++++|||++|++++|+||++||+++.+||+|.+|++|++|+.+... 
+....+.++ T Consensus 161 ~~g~~----~~~~~~~~~t~~t~~~lte~~l~~~l~~i~~~Gg~~~~i~v~a~~k~~i~~~~~~~~~----~i~~~~~~~ 232 (317) T protein:vir:88 161 ANGVA----PVGDGSNTGTAGDLRLLTEDMLLNASESIWRNGGQANSIQTSSSIKKAISKNMKGRAT----EITLDASDN 232 (317) T ss_pred cCccc----cccCCCccccccccccccHHHHHHHHHHHHhcCCCCCEEEeChHHHHHHHHHhcCCce----eEEEcccCe Confidence 76653 3344456679999999999999999999999999999999999999999999886543 234456788 Q ss_pred eEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEEecchh Q lcl|NC_020858. 237 SIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKNEKG 316 (330) Q Consensus 237 ~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N~~a 316 (330) +|+..|++|+||||+|+||+|||||+ +.+++|||+||+++||||+ +.|+|||+||++|+||++|+|||||||+| T Consensus 233 ~~g~~v~~~~tdfG~v~ii~~r~lp~-----~~~~~~D~~~~~l~~Lr~~-~~e~laKtGd~~k~~i~~E~tLe~~N~~a 306 (317) T protein:vir:88 233 RIAQTVDVYESDFGKYTIRANRWFHE-----NTLFVFDPKMHSLCYLRPF-FQHELAKTGDSEKRQLLVEYTFRVNNEKS 306 (317) T ss_pred EEEEEEEEEEeCCeEEEEEeCCCCCC-----CeEEEEcccccceeecccc-eeeccCCCcccceeEEEEEEEEEEcCccc Confidence 99999999999999999999999995 5789999999999999986 67999999999999999999999999999 Q ss_pred eeEEeccc-cc Q lcl|NC_020858. 317 LGVAADLY-GL 326 (330) Q Consensus 317 ~g~i~gLt-~~ 326 (330) ||+|+||+ +| T Consensus 307 ~a~i~~l~~~~ 317 (317) T protein:vir:88 307 GALIRDVVAQL 317 (317) T ss_pred eeEEEEecccC Confidence 99999874 33 No 2 >protein:vir:96442 Length: 418 # NCBI annotation: hypothetical protein # Family: family:all:11266 # MgeID: mge:1616 # MgeName: 119X # Cross-refs: genbank:acc:YP_001218814;genbank:gi:147917331;genbank:GeneID:5142645 Probab=100.00 E-value=1.4e-67 Score=387.17 Aligned_cols=299 Identities=18% Similarity=0.197 Sum_probs=250.7 Q ss_pred CCccccceeecc--ccccccccceeeEecCCc-ccceeeeecc---ceeccceeeeeeeeccCc---------cc----- Q lcl|NC_020858. 
1 MAVVTNTFQSTG--AKGNREELADVVSRITPE-DTPIYSMIEK---VSFDTTHPEWTTDELAAP---------GA----- 60 (330) Q Consensus 1 Ma~~t~~~~t~~--~~g~~edl~d~I~~i~p~-dTP~~s~ig~---~~~~~~~~~W~td~L~~~---------~~----- 60 (330) |++.++-|.+.. ...|--+++.-|-+-=|- ..|++++||. .++.+..|.|..|+|... ++ T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~PN~~~p~l~~i~~g~~~~~~~~t~~w~~d~l~~~~~~~ta~~~a~~T~i~ 80 (418) T protein:vir:96 1 MSVYAGIFNTTLNPQELNMKSFAGTILRRVPNGSAPLLAMTSVVGSTTAKASTHGYFSKTMVFASAVVTAEALADATVLT 80 (418) T ss_pred CceeeeecccCCChhhhchhhhhhhhhhhcCCcccchhhhhcccCccccceeEEEEEeeEeeeeeEEEEEEEecCceEEE Confidence 999998887543 234555777776665554 5789999964 467889999999986410 00 Q ss_pred --------------------------------------------------------cccccccccccccccCceEecceE Q lcl|NC_020858. 61 --------------------------------------------------------NITLEGDEYTFDATVSPERLGNYT 84 (330) Q Consensus 61 --------------------------------------------------------na~~EG~d~~~~~~~~~~~~~N~t 84 (330) .++.||+|+++++...+++++||| T Consensus 81 V~~~~~f~~~~l~~~~~~~EvirVtsVng~~lTV~RG~~~t~aa~iaag~~~~~ig~~~eEGsd~~ta~~~k~~~vsN~t 160 (418) T protein:vir:96 81 VENSDGLTKGMIFYNEATGENMRLELVNGLNLTVKRQTGRIAAAIIAANTKLIVIGTAFEEGSQRPTARSIQPVYVPNFT 160 (418) T ss_pred ecCCcccccccEEEEecCCeEEEEEEEeCCEEEEEEccCCeeeeeeecCceEEEeecCcccccccCCcceecceeccchh Confidence 233499999999999899999999 Q ss_pred EEEeeeeeehhHHHH-HhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCC---c-----ccccchhHHHHHhccccc Q lcl|NC_020858. 85 QIMRKSGIISGTQNI-TDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVG---G-----ATRESGSLPTWVKTNVSR 155 (330) Q Consensus 85 QIf~~~v~VS~T~~a-v~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~---~-----~~r~~~Gi~~~i~tn~~~ 155 (330) |||++.|+||||||| +.++|+.+++++| +++++++|+++|++++.|++.++ + ++|+|+||++|+.+|+.. 
T Consensus 161 QIf~e~vsVSgTAqA~v~qaGvsn~~~~e-~d~l~~~kv~iE~ali~g~~~~~~~ng~p~~~t~R~m~gI~~f~~~Nvi~ 239 (418) T protein:vir:96 161 QIFRNAWALTDTARASYAEAGYSNITESR-RDCMDFHATEQETAIFFGQAFMGTYNGQPLHTTQGIVDAIRQYAPDNVNA 239 (418) T ss_pred heehhhhhhhhhhhhhhhhcCcchhHHHH-HHHHHHHHHHHHHhhhccccccCCCCCcccccccchhHHHHhhccccccc Confidence 999999999999999 5779999998888 79999999999999999998653 2 369999999999888763 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHHHh----cCCce----eEEEeChHHHHHHHHhhccceeeeee Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQ----SGANF----KHVFVSPYVKSVFVTFMSDTNVASFR 227 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~----~Gg~~----~~i~v~~~~k~~is~f~~~~~~~~~r 227 (330) +++++++|+++|.++++++|. .|++. +.++|++.+|++|++|+++ . T Consensus 240 --------------------ag~~~~~t~d~L~~~~~~a~~~g~n~G~~~~~~~y~~~V~a~~k~~I~k~~~~-I----- 293 (418) T protein:vir:96 240 --------------------MPNPTAVTYDDVVDATIDAFKWSVNVGDNTQRVMFCDTVGMRTMQDIGRFFGE-V----- 293 (418) T ss_pred --------------------cCCCCcCCHHHHHHHHHHHHhhcCCCCCcccceEEEEEeChHHHHHHhhhhce-e----- Confidence 445678999999999999998 33432 5789999999999999864 2 Q ss_pred eeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhccc--CCccccccccccc--------- Q lcl|NC_020858. 228 YAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWL--RKIAEDKKVAKTG--------- 296 (330) Q Consensus 228 ~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~L--r~~~~~e~laKtG--------- 296 (330) .....+++++..|++|+||||.++||+|||||++.+.++.|+++||++|+++|| |++ ++|.|+|+| T Consensus 294 --~~~~~en~~G~vv~~~~Td~G~v~ii~n~~~pad~I~~g~mlVvD~~~vkL~yL~~R~~-~~E~l~k~G~~~~~~~~~ 370 (418) T protein:vir:96 294 --TVTQRETSYGMVFTEWKFFKGRLIIKEHPLFSAIGISPGFAVVVDVPAVKLAYMDGRNA-KVENYGQGGGENKSGATD 370 (418) T ss_pred --EeccccceeceEEEEEEeeccEEEEEecCCCCccccCcceEEEEecCceEEEEecCCCc-cchhcccCCCcccccccc Confidence 245678999999999999999999999999999999999999999999999999 654 679999999 Q ss_pred -------cceeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
297 -------DAEKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 297 -------d~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) |++|+||++|++||++||++|++|+||--- +.+ T Consensus 371 ~~~~~~~D~~~G~l~~Eltle~~N~~a~a~itgl~~~-~~~ 410 (418) T protein:vir:96 371 YSYGHGVDAQGGSLTSEWALELLNPQGCAVITGLQKA-KER 410 (418) T ss_pred cccccccccccCEEEEEEEEEeecccccEEeeccccc-ccc Confidence 999999999999999999999999988432 222 No 3 >protein:vir:103370 Length: 418 # NCBI annotation: hypothetical protein # Family: family:all:11266 # MgeID: mge:1621 # MgeName: PaP2 # Cross-refs: genbank:acc:YP_024741;genbank:gi:48697083;genbank:GeneID:2846038 Probab=100.00 E-value=3.6e-60 Score=346.45 Aligned_cols=303 Identities=17% Similarity=0.203 Sum_probs=238.8 Q ss_pred CCccccceeecc--ccccccccceeeEecCCc-ccceeeeecc---ceeccceeeeeeeeccCc---------cc----- Q lcl|NC_020858. 1 MAVVTNTFQSTG--AKGNREELADVVSRITPE-DTPIYSMIEK---VSFDTTHPEWTTDELAAP---------GA----- 60 (330) Q Consensus 1 Ma~~t~~~~t~~--~~g~~edl~d~I~~i~p~-dTP~~s~ig~---~~~~~~~~~W~td~L~~~---------~~----- 60 (330) |++.++-|.+.. ...|--+++.-|-+-=|- .+|++++|+. .++++..|.|..|+|-.. ++ T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~PN~~~pll~li~~g~~~ta~ast~~w~~d~~~~~~~~~ta~a~a~~T~l~ 80 (418) T protein:vir:10 1 MSVYAGIFNTTLNPQELNMKSFAGTILRRVPNGSAPLLAMTSVVGSTTAKASTHGYFSKTMVFASAVVTAEAAADATVLT 80 (418) T ss_pred CceeccccccCCChhhhchhhhhhhhhhhcCCcchhhhhhhhcccccccceeEEEEEEEEEeeeeEEEEEEEecCceEEE Confidence 999998887543 234555777776665554 5789999954 367889999999875210 00 Q ss_pred --------------------------------------------------------cccccccccccccccCceEecceE Q lcl|NC_020858. 
61 --------------------------------------------------------NITLEGDEYTFDATVSPERLGNYT 84 (330) Q Consensus 61 --------------------------------------------------------na~~EG~d~~~~~~~~~~~~~N~t 84 (330) .++.||+|+++++...+++++||| T Consensus 81 ve~~~~f~~~~l~~~~~~~Evirv~sVng~~lTV~Rg~~~t~aaaia~n~~~~~Ig~~~eEGsd~~ta~~~k~~~vsNvt 160 (418) T protein:vir:10 81 VENSDGLTKGMIFYNEATGENMRLELVNGLNLTVKRQTGRISAAIIAANTKLIVIGTAFEEGSQRPTARSIQPVYVPNFT 160 (418) T ss_pred EcCcceeccccEEEEccCCeEEEEEEEeCCEEEEEEecCCeeEEEEecCceEEEeccccccccccCCcceecceeccchh Confidence 234599999999999999999999 Q ss_pred EEEeeeeeehhHHHHH-hhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCC---cc-cccchhHHHHHhccccccccc Q lcl|NC_020858. 85 QIMRKSGIISGTQNIT-DEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVG---GA-TRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 85 QIf~~~v~VS~T~~av-~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~---~~-~r~~~Gi~~~i~tn~~~g~~g 159 (330) |||++.|+||||+||+ .++|..|++++|..+++++ ++|||+++|+|++.++ +. .|+|+||+.+++...... T Consensus 161 QIF~~avsvSgTaqAs~~q~Gvsn~~ese~drk~~~-av~iEkalI~G~~~~~~~~~g~~R~m~GIl~~vr~~~~gn--- 236 (418) T protein:vir:10 161 QIFRNAWALTDTARASYAEAGYSNITESRRDCMDFH-ATEQETAIFFGQAFMGTYNGQPLHTTQGIVDAVRQYAPDN--- 236 (418) T ss_pred hhhhhhhhhhhhhhhccccccCchHHHHHHHHHHHH-HHHHHHHHhcccccCCCcCCcchhhHHHHHHHHhhhcccc--- Confidence 9999999999999995 6689999999996666665 5799999999986543 33 499999998886432210 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHHHh----cCCce----eEEEeChHHHHHHHHhhccceeeeeeeeec Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQ----SGANF----KHVFVSPYVKSVFVTFMSDTNVASFRYAAS 231 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~----~Gg~~----~~i~v~~~~k~~is~f~~~~~~~~~r~~~~ 231 (330) -..++++.++|.++|.++++.+|+ .|++. +.++|++.+|++|++|+++ . .. 
T Consensus 237 -------------Vv~a~~~t~~s~d~l~~a~~~af~~g~~~G~~~q~~~f~~~V~~~~k~~I~k~~~~-I-------~~ 295 (418) T protein:vir:10 237 -------------VNAMPNPTAVTYDDVVDATIDAFKWSVNVGDNTQRVMFCDTVGMRTMQDIGRFFGE-V-------TV 295 (418) T ss_pred -------------eeccCCCCccCHHHHHHHHHHHhhccCCCcccccceeEEEEeChHHHHHhhhhhhh-e-------ee Confidence 012344568999999999999977 34443 6799999999999999874 2 34 Q ss_pred CCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhccc--CCccccccccccc------------- Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWL--RKIAEDKKVAKTG------------- 296 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~L--r~~~~~e~laKtG------------- 296 (330) .+.++.++..|++|.+.+|.|.+..||.+|+-.+.+++|+++||++++++|| | .+.+|.|+|+| T Consensus 296 ~~~e~~~G~vv~~~~~~~G~I~L~~~p~~~~~~lp~g~mlVvD~~~vkL~~L~~R-~~~~E~l~k~G~~~~~~~~~~~~~ 374 (418) T protein:vir:10 296 TQRETSYGMVFTEWKFFKGRLILKEHPLFSAIGISPGFAVVVDVPAVKLAYMDGR-NAKVENYGQGGGENKSGATDYSYG 374 (418) T ss_pred cccceeeeEEEEEEEcceEEEEeecccccccccCCCceEEEEccccceEEEeccc-cccchhcccCCCcccccccccccc Confidence 5678899999999965556555555554555555567899999999999999 6 45789999999 Q ss_pred ---cceeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
297 ---DAEKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 297 ---d~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) |++|++|++|++||++||++|++|+|| |.-+.+ T Consensus 375 ~~~D~~kG~iv~E~tLe~~N~~a~avitgl-~~~~~~ 410 (418) T protein:vir:10 375 HGVDAQGGSLTSEWALELLNPQGCAVITGL-QKAKER 410 (418) T ss_pred cccccccceEEEEeeeeeecccceEEeecc-ceeccc Confidence 999999999999999999999999999 988877 No 4 >protein:vir:97255 Length: 310 # NCBI annotation: hypothetical protein ORF017 # Family: family:all:1120 # MgeID: mge:1657 # MgeName: M6 # Cross-refs: genbank:acc:YP_001294525;genbank:gi:149408246;genbank:GeneID:5237120 Probab=99.18 E-value=3.7e-12 Score=83.23 Aligned_cols=289 Identities=14% Similarity=0.074 Sum_probs=165.0 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeecccee-ccceeeeeeeeccCcc-----ccccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSF-DTTHPEWTTDELAAPG-----ANITLEGDEYTFDAT 74 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~-~~~~~~W~td~L~~~~-----~na~~EG~d~~~~~~ 74 (330) |+..|=. -..++-.-.+...|+..-..+-+++..+-=..+ .+.++=-++.++.+.. ....-||..-...+ T Consensus 1 mpaltLa---ea~k~~~d~l~~~ViE~~~~~s~lL~~LpF~~veg~~~~ynR~~~~~~~~~~~v~~~~~~~g~~~~~~t- 76 (310) T protein:vir:97 1 MASVTLA---ESAKLAQDELVAGVIENIITVNRMFDVLPFDSIEGNSLAYNRENVLGDVIMAGVGTTFSGAGAGKAAAT- 76 (310) T ss_pred CcccchH---HHhhcCcchHHHHHHHHHhccchHHHhCCcccccCCcceeeEeeccCCcccccccccccCCCccccccc- Confidence 8755411 111222222333333333444444443210001 1223333344433322 11112333222222 Q ss_pred cCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccc Q lcl|NC_020858. 75 VSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVS 154 (330) Q Consensus 75 ~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~ 154 (330) .....--+-|+...+.|-.--+........++.++|+..+.+.+++-.|+.||+|.... +...||...++.-.. 
T Consensus 77 --~~~~~~~L~i~~g~~~Vd~~i~dl~~~~~~dq~~~Ql~~~iea~~~~~e~~lINGD~a~----n~F~GL~~~~~~~q~ 150 (310) T protein:vir:97 77 --FTKVNSNLTTIMGDAEVNGLIQATRSGDGNDQTAVQIASKAKSAGRKYQDQLINGNGAG----NEFAGLIQLCASGQK 150 (310) T ss_pred --cceeeeeeeeeeehhhhhhHHHhhhcCChHHHHHHHHHHHHHHHHHHHHHHhhccccCC----CcccchhhcCCccce Confidence 23344456777777777744444322223589999999999999999999999997542 246677666531111 Q ss_pred cccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCc Q lcl|NC_020858. 155 RGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGK 234 (330) Q Consensus 155 ~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~ 234 (330) . . +..++-++|-++|++++..+|..+|.+..+++||..+++|.+|....... ..+... T Consensus 151 i----------------~--~~~~gg~~t~d~LDeLl~~v~~~~g~p~~~l~~~~~~r~i~A~~R~~~~~----g~~~~~ 208 (310) T protein:vir:97 151 A----------------T--TGATGSAISFAILDELMDLVVDKDGQVDYLTMHARTLRSYKALLRALGGA----SINEVV 208 (310) T ss_pred e----------------e--cCCCCCCCCHHHHHHHHHHHhcCCCCCCEEEecHHHHHHHHHHHHHhcCC----CCCCcc Confidence 0 0 11122357889999999999999999999999999999999997643211 111111 Q ss_pred ceeEEEEEEEEEcCCeEEEEEEcCcCCCccc-----cccEEEEEcchh----hhhcccC----Ccccccccc--ccccce Q lcl|NC_020858. 235 NNSIVANADVYEGPFGKVMIHPNRVMAGSGA-----LARNAFFVDPEF----LQFGWLR----KIAEDKKVA--KTGDAE 299 (330) Q Consensus 235 ~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~-----~a~~~~~ld~~~----~~~~~Lr----~~~~~e~la--KtGd~~ 299 (330) ..-||.-|. -|+-|.|+++-++|.+.. ...+++.+.... .-+..|. |...-+.+. ...+.. T Consensus 209 ~~~~G~~v~----~~~GiPi~~~d~ip~~~~~~~~~gtTsIya~r~Ge~~~~~Gv~Gl~~~~~~glsVr~~G~~~~~~v~ 284 (310) T protein:vir:97 209 ELPSGAEVP----AYSGTPIFRNDYIPTNQTKGGTTGCTTIFAGTLDDGSRTHGIAGLTATQAAGIQVVDVGESEDSDEH 284 (310) T ss_pred ccCCCCEEe----eeCCeEEEEeCccCCCccccccCCceeEEEEeeCccccccceeccccCCccceeEEeCCcccCCcce Confidence 222333332 345589999999987532 134556655543 2232332 111222232 244678 Q ss_pred eeEEEEEEEEEEecchheeEEecccc Q lcl|NC_020858. 
300 KFMLIGEGALKPKNEKGLGVAADLYG 325 (330) Q Consensus 300 k~~i~~E~tLe~~N~~a~g~i~gLt~ 325 (330) ++.|.+-+++-+.+|+|.|++++++- T Consensus 285 ~~~V~~Y~~~av~~~~A~a~L~~V~~ 310 (310) T protein:vir:97 285 IWRVKWYCGLALFSEKGLACADGITN 310 (310) T ss_pred eEEEEEeeeEEEecccceeeeccccC Confidence 89999999999999999999999998 No 5 >protein:vir:94933 Length: 330 # NCBI annotation: putative phage structural protein # Family: family:all:1120 # MgeID: mge:1538 # MgeName: Xp15 # Cross-refs: genbank:acc:YP_239278;genbank:gi:66392060;genbank:GeneID:5076578 Probab=98.98 E-value=3.4e-11 Score=77.98 Aligned_cols=288 Identities=11% Similarity=0.087 Sum_probs=160.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceecc-ceeeeeeeeccCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDT-THPEWTTDELAAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~-~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) |+..| ... ..++-...+...|+..-....|++..+-=..+.+ .++-=++.+|..+.-... +...+.....+.+. T Consensus 25 m~alT--Lae-a~~l~~d~~~~~VIE~l~~~s~iL~~lpf~~ve~~~~~~~r~~~lp~a~~r~~--n~~~~~~~~~Tf~q 99 (330) T protein:vir:94 25 MPTVT--LAE-SAKLSQDHLVSGLIETIVEVNPLYEMMPFTEIEGNALAYNRENVLGDVQFLAV--GGTITAKNPATFTK 99 (330) T ss_pred hhhhh--hhH-HhhcCchhhHHHHHHhhhccchHHhhcccccccCCcceeeeeecCCcceeeec--cccccccCcceeee Confidence 66444 222 1222222333333333344456665442111111 222224444433321111 11111111112233 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccc-cchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGR-ATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~-~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) ...-|-|+.-.+.|..-.. +.+|. .+..++|+..+.+.|+..+|..||+|. +.+++..||...+.-... 
T Consensus 100 ~t~~l~~l~~~~~Vd~~ia--dl~g~~~d~~~~q~~~~ieal~~~~e~~linGD----s~~~~F~GL~~~~~~~q~---- 169 (330) T protein:vir:94 100 VTSELTTLIGDAEVNGLIQ--ATRSDFMDQTSVQVASKAKSIGRQYQASMITGD----GTGNSFQGMMGLVAASQT---- 169 (330) T ss_pred eeechhhhhhhHHHHHHHH--HhcCCHHHHHHHHHHHHHHHHHHHHHHHhhccC----CCCccccchhhcCCcccE---- Confidence 3334556666666663322 23453 478899999999999999999999995 234677787665521111 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeE Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSI 238 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~ 238 (330) ..+.+++-++|-++|++++..+|..+|.+..|+++.++.++|.+|...... +........-| T Consensus 170 --------------i~tg~~gg~~T~d~LDeLl~~v~~~~g~~~~~l~n~a~~r~I~a~~R~~~~----~~v~~~~~~~~ 231 (330) T protein:vir:94 170 --------------ISAGANGGTLTFELLDQLLDLVKDKDGQVDYLMSSFAMRRKYFSLLRALGG----AAIGEVMTLPS 231 (330) T ss_pred --------------EecCCCCCCCCHHHHHHHHHHhcCCCCCCcEEEechhHHHHHHHHHHhccC----CCCCCcccccC Confidence 011123346889999999999999899999999999999999999653321 11111111122 Q ss_pred EEEEEEEEcCCeEEEEEEcCcCCCcccc-----ccEEEEEcch--hh--hhcccC----Ccccccccc--ccccceeeEE Q lcl|NC_020858. 239 VANADVYEGPFGKVMIHPNRVMAGSGAL-----ARNAFFVDPE--FL--QFGWLR----KIAEDKKVA--KTGDAEKFML 303 (330) Q Consensus 239 ~~~v~~~~tdfG~v~iv~nR~mp~~~~~-----a~~~~~ld~~--~~--~~~~Lr----~~~~~e~la--KtGd~~k~~i 303 (330) |..| --|+-|.|+++-++|.+... .-+++++... .. -+..|. |...-+.+. .+.+..+|.| T Consensus 232 G~~v----~~~~GvPi~~~d~ip~~~~~~~~~~ttsIyav~~G~~~~~qgV~Gl~~~g~~glsVr~~G~~~~k~v~~~~v 307 (330) T protein:vir:94 232 GRQI----PTYRGVPWFVNDFIPSNMTQGTATNATAIFAGTFDDGSNKYGIAGLTARGSAGLRVQNVGAKENADETITRV 307 (330) T ss_pred CCEE----eeeCCeEEEecccccCCCCcccCCCceeEEEEeecccccccceEeecCCCCCcceeeeCCCccccceeeEEE Confidence 3222 12344778888888875321 2345555533 22 222332 111222233 3556788999 Q ss_pred EEEEEEEEecchheeEEeccc-c Q lcl|NC_020858. 
304 IGEGALKPKNEKGLGVAADLY-G 325 (330) Q Consensus 304 ~~E~tLe~~N~~a~g~i~gLt-~ 325 (330) .+-+++-+.+|+|.|++.++- | T Consensus 308 ~~y~~~av~~~~a~~~L~~V~~g 330 (330) T protein:vir:94 308 KMYCGFANFSQLGLAAIKGLIPG 330 (330) T ss_pred EEeeeeEEechhheeeeccccCC Confidence 999999999999999999863 5 No 6 >protein:vir:80835 Length: 464 # NCBI annotation: putative major capsid protein # Family: family:all:2450 # MgeID: mge:1885 # MgeName: phiEF24C # Cross-refs: genbank:acc:YP_001504125;genbank:gi:158079312;genbank:GeneID:5666484 Probab=98.88 E-value=1.5e-10 Score=74.43 Aligned_cols=290 Identities=14% Similarity=0.113 Sum_probs=183.7 Q ss_pred CCccccce-----eeccccccccccceeeEecCCccccee--eeeccceeccceeeeeeeeccCc-cccc-ccccccccc Q lcl|NC_020858. 1 MAVVTNTF-----QSTGAKGNREELADVVSRITPEDTPIY--SMIEKVSFDTTHPEWTTDELAAP-GANI-TLEGDEYTF 71 (330) Q Consensus 1 Ma~~t~~~-----~t~~~~g~~edl~d~I~~i~p~dTP~~--s~ig~~~~~~~~~~W~td~L~~~-~~na-~~EG~d~~~ 71 (330) =|..++-- ++...-..||+|.+.|.+++-.+.+|. .-|.+.+++|+.|+|-...=..- ..-+ .-|+..+.. T Consensus 20 Ks~ttgy~~~p~~q~~~~AlRrEsL~~~i~~Lt~~~~~f~f~~di~k~~a~STV~~y~~~~~~G~~g~~~f~~E~g~~~~ 99 (464) T protein:vir:80 20 KGFTTGYGITPESQTDAAALRREFLDDQITMLTWADGDLSFYRDITKRPATSTVAKYDVYLAHGRVGHTRFTREIGVAPI 99 (464) T ss_pred HHHHhCCccCcccccCcchhhhhhhhhhhheeeecccchhhhhhcCCchhhhhhhhhheeeccCcccccccccccccccc Confidence 11112222 222335689999999999999988873 34578999999999988874333 2333 368887766 Q ss_pred ccccCceEecceEEEEeeeeeehhHHHH-----HhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCccc-----cc Q lcl|NC_020858. 72 DATVSPERLGNYTQIMRKSGIISGTQNI-----TDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGAT-----RE 141 (330) Q Consensus 72 ~~~~~~~~~~N~tQIf~~~v~VS~T~~a-----v~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~-----r~ 141 (330) ....-+.+.-|+- -+|+|.+. 
+.+.+ .+-++.|...++.-+...+|+++..|...-...| =+ T Consensus 100 ~d~~~~Rr~~~~K-------fl~~~r~vsia~~lvn~~-~d~~~~~~~dai~~va~tiE~a~FyGds~l~~~~~~~~gle 171 (464) T protein:vir:80 100 SDPNLRQKTVNMK-------YVSDTKNMSIATGLVNNI-EDPMRILTDDAISVVAKTIEWASFYGDSDLSENPDAGSGLE 171 (464) T ss_pred CCCceEEEEEEee-------eeecceeeeeehhhhcch-hhHHHHHHHHHHHHHHHHHHHHHhhhccccCCCCCCccccc Confidence 5544444433322 22233221 11223 4899999999999999999999999886544332 36 Q ss_pred chhHHHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHH-HHhhcc Q lcl|NC_020858. 142 SGSLPTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVF-VTFMSD 220 (330) Q Consensus 142 ~~Gi~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~i-s~f~~~ 220 (330) .+||..+|+-.+.. |.-+..||+++|+.+...+-.+=|++..++++...|-.| +.+... T Consensus 172 FDGl~~lI~~~NVi--------------------DarG~~Ls~~~ln~Aa~~i~~~fGt~TD~~lp~~v~a~f~n~~l~~ 231 (464) T protein:vir:80 172 FDGLAKLIDKHNVL--------------------DAKGASLTEALLNQASVLVGKGYGTPTDAYMPIGVQADFVNQQLDR 231 (464) T ss_pred hhhhHhhcCCCcee--------------------ecCCCCcCHHHHhhhhhhhhcccCChhhcccchhHHHHHHhhhcCc Confidence 89999998532222 122346999999999888866558999999999999665 776554 Q ss_pred ceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccccc--cc--ccc Q lcl|NC_020858. 221 TNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKK--VA--KTG 296 (330) Q Consensus 221 ~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~--la--KtG 296 (330) + +|....+++....|..|..|.|-.|.+++..+-+|.... ++||+..+... -|.+-... 
++ ..| T Consensus 232 q----~~~~~~n~~~~~~G~~v~~f~sa~G~i~L~~s~~m~~~~-------~ld~~~~~~~~-apaapsvt~tv~~~~~g 299 (464) T protein:vir:80 232 Q----VQVISDNGQNATMGFNVKGFNSARGFIRLHGSTVMELEQ-------ILDENRMQLPN-APQKATVKATLEAGTKG 299 (464) T ss_pred e----eEEEcCCCCcceeeeecccccccccceeccCccccCccc-------ccccccccCCC-CcCCceeEEEecCCccc Confidence 3 334446666678999999999999999999888886543 47777655443 11111111 11 111 Q ss_pred cc--eeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 297 DA--EKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 297 d~--~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) .. +..-...+|-+.+.|..+-..+.....-+.+. T Consensus 300 ~f~~~~~~~~~~Ykv~~vn~~GeS~ps~~~~~ti~~ 335 (464) T protein:vir:80 300 KFRDEDLTIDTEYKVVVVSDDAESAPSDVASVVIDD 335 (464) T ss_pred CCccccccceeEEEEEEECCCCccccceeeeeeecC Confidence 11 22122346777777776644443322222111 No 7 >protein:vir:102823 Length: 470 # NCBI annotation: major structural protein # Family: family:all:2450 # MgeID: mge:1610 # MgeName: YS40 # Cross-refs: genbank:acc:YP_874086;genbank:gi:118197693;genbank:GeneID:4496015 Probab=98.79 E-value=6.1e-10 Score=71.11 Aligned_cols=292 Identities=17% Similarity=0.137 Sum_probs=171.9 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceee--eeccceeccceeeeeeeecc--CccccccccccccccccccC Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYS--MIEKVSFDTTHPEWTTDELA--APGANITLEGDEYTFDATVS 76 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s--~ig~~~~~~~~~~W~td~L~--~~~~na~~EG~d~~~~~~~~ 76 (330) -|+.++ .+ ..+|||.+.|.+++-.+.+|+= -|.+.+++|+.|+|-....+ ..+-.+.-||.........- T Consensus 19 ~a~~~g-----~A-lR~EsLd~~l~~lt~~~~~ftf~~~i~k~~a~STV~ey~~~~~rhG~~g~s~~~E~~l~~~~d~~~ 92 (470) T protein:vir:10 19 AAGQVA-----ES-LEREDLEPEVTQLNVLDTPLTDLLSKNAVKAKAYEHEYNVVTARHDKIGYAAFREGGLPRTVEVNV 92 (470) T ss_pred Hhhhcc-----hh-hhhhhhccceeEeeecCccchhhhhcCCchhhhHhhhhhhhccccccccceeecccccCccCCCce Confidence 222221 11 3899999999999999999854 56889999999999553332 22222335777765553322 Q ss_pred ceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCC----c--ccccchhHHHHHh Q lcl|NC_020858. 77 PERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVG----G--ATRESGSLPTWVK 150 (330) Q Consensus 77 ~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~----~--~~r~~~Gi~~~i~ 150 (330) +.|.- ..-.+.....||.-+-.....|+.|.++.|...++.-+..-+|+++..|...-. + .+=+.+||...|+ T Consensus 93 ~Rr~v-~~K~l~~~~~VT~~a~~~~~n~v~d~~~~~~~dai~~ia~tiE~a~FyGDs~l~s~~~g~~~gleFDGl~~lId 171 (470) T protein:vir:10 93 VRRRI-RPMLVGHRITVTELATRTTQNGVMQIDELVKREKMIAVANEFEYLAFYGDNLLGDDVPGSPNNLQQDGIINIIK 171 (470) T ss_pred EEEEE-EEEEEeecchhhhhhhhhhhccccchHHHHHHHHHHHHHHHHHhhhhhhccccccccCcccCceeccchhhhcc Confidence 22222 222333344444332223345888999999999999999999999999865322 2 2347899999885 Q ss_pred c----ccccccccccccccccccccccccccccccccHHHHHHHHHHHHh--cCCceeEEEeChHHHHHHHHhhccceee Q lcl|NC_020858. 151 T----NVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQ--SGANFKHVFVSPYVKSVFVTFMSDTNVA 224 (330) Q Consensus 151 t----n~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~--~Gg~~~~i~v~~~~k~~is~f~~~~~~~ 224 (330) - |+. +.-++.|+++.|++....+-. +=|+++.++++...|-.|..-.-.+... 
T Consensus 172 ~~~~~NVi---------------------DarG~~Ls~~~L~~aa~~I~~~~~fGt~TD~~lp~~vka~f~~~~~~~qRv 230 (470) T protein:vir:10 172 RGAPQNVL---------------------DAGGRPLSIDLLWEAESRVVSTQAFANPTAVFISYVDKLNLQASFYQISRV 230 (470) T ss_pred CCCCcccc---------------------ccCCCCccHHHHHHHHhhhcccccccChhhhccchhHHHHHHHhhcCceEE Confidence 2 333 223457999999999998854 4478999999999988887643333222 Q ss_pred eeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcc--------ccccccc-- Q lcl|NC_020858. 225 SFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIA--------EDKKVAK-- 294 (330) Q Consensus 225 ~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~--------~~e~laK-- 294 (330) .+ ..+...-..|..|..|.|-.|.|++.++-+||..... +-+.||-+--+++ -|.. -...+++ T Consensus 231 ~~---~~N~~~~~~G~~v~~f~sa~G~I~L~~s~~m~~~~k~--~p~~l~~~v~~~a--AP~~~~tv~~t~~~~a~~~~s 303 (470) T protein:vir:10 231 MT---TADRRAGLLGADAQSYIGVRGEHSLYPSQFLGDFHKF--NPARFGAEVGDFA--APSNSWTVSTTDNFVTLPYNS 303 (470) T ss_pred EE---ecCCCceeeeeeccceeeeeeeeeecccccccchhhc--CcccCCcccCCcc--cCceeEEeecCCCceeecccC Confidence 21 1233334588899999999999999999999742110 0012333211111 0100 0001111 Q ss_pred -ccc-ceeeEEEEEEEEEEecchhee--EEeccccccccC Q lcl|NC_020858. 295 -TGD-AEKFMLIGEGALKPKNEKGLG--VAADLYGLTAST 330 (330) Q Consensus 295 -tGd-~~k~~i~~E~tLe~~N~~a~g--~i~gLt~~~~~~ 330 (330) .|+ ..+. +++|..++.|-.+-+ .|.+-+ .+.+. 
T Consensus 304 k~g~~~~~~--v~sy~y~v~~~~gds~s~~v~vt-~t~~~ 340 (470) T protein:vir:10 304 GLGDPANTT--VYSYAFKAANFYGESAAKYIDVY-IDSTE 340 (470) T ss_pred CCCcccCcc--eeEEEEEEEEecCCCCcceEEEE-Eeeeh Confidence 111 1122 446666666655322 222221 11111 No 8 >protein:vir:96666 Length: 462 # NCBI annotation: ORF016 # Family: family:all:2450 # MgeID: mge:1623 # MgeName: Twort # Cross-refs: genbank:acc:YP_238545;genbank:gi:66391271;genbank:GeneID:5130448 Probab=98.70 E-value=4.4e-09 Score=66.41 Aligned_cols=290 Identities=13% Similarity=0.068 Sum_probs=175.4 Q ss_pred CCc---cccceeeccccccccccceeeEecCCccccee--eeeccceeccceeeeeeeeccCc-ccccc-cccccccccc Q lcl|NC_020858. 1 MAV---VTNTFQSTGAKGNREELADVVSRITPEDTPIY--SMIEKVSFDTTHPEWTTDELAAP-GANIT-LEGDEYTFDA 73 (330) Q Consensus 1 Ma~---~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~--s~ig~~~~~~~~~~W~td~L~~~-~~na~-~EG~d~~~~~ 73 (330) |.+ ++..-++...-.-||+|.+.|.+++-.+.+|. .-|.+.++.|+.|+|-...=..- ..-+. -|+..+.... T Consensus 26 ~~tg~g~~p~~q~~~gAlR~esL~~~i~~Lt~~~~~~~~~~~i~k~~a~sTv~~y~~~~~~G~~g~~~f~~E~g~~~~~d 105 (462) T protein:vir:96 26 YQTGYGITPDTQVDAGALRREILDDQITMLTWTQDDLIFYREISRRPAQSTVQKYDVYLRHGNVGHSRFVREVGVAPVSD 105 (462) T ss_pred HhcCCCcCCccccccchhhhhhhhhhhheeeecccchhhhhhcCCchhhhhhhhheeeeccCccccccccccccccccCC Confidence 221 11111222234678999999999999888874 45678999999999988874333 33333 6888876665 Q ss_pred ccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCc----ccccchhHHHHH Q lcl|NC_020858. 74 TVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGG----ATRESGSLPTWV 149 (330) Q Consensus 74 ~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~----~~r~~~Gi~~~i 149 (330) ..-+.+.-++-=+. .+-.||- ++.-..++.+-++.|.+.++.-+...+|+++..|.++-.. 
.-=+.+||..+| T Consensus 106 ~~~~R~~~~~k~l~-~t~~vsi--~~tl~n~~~d~~~~~~~dai~~~a~tiE~a~Fygds~l~~~~~~~gleFDGl~~lI 182 (462) T protein:vir:96 106 PNIRQKTVEMKYVS-DTKNLSI--ASTLVNNIQDPMQILTEDAIAVVAKTIEWASFYGDASLTADPTGQGLEFDGLAKLI 182 (462) T ss_pred CceEEEEEEEEEEe-eeeeech--hhhhccchhhHHHHHHHHHHHHHHHHHHHHHhhhhcccCCCccccccchhhhhhhc Confidence 54444433332221 1223331 1222346788999999999999999999999998765332 235689998888 Q ss_pred hcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeee Q lcl|NC_020858. 150 KTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYA 229 (330) Q Consensus 150 ~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~ 229 (330) +-.+.. |.-++.|++++|+.....+-.+=|++..++++...|-.|..+.-.+.... . T Consensus 183 ~~~NVi--------------------DarG~~Ls~~~ln~aa~~i~~~fGt~TD~~~p~~v~a~f~~~~l~~qrv~---~ 239 (462) T protein:vir:96 183 DKDNVI--------------------DAKGESLTETLLNRSAVLIGKSFGTATDAYMPIGVHADFVNSVLGRQMQL---M 239 (462) T ss_pred CCCcee--------------------ecCCCCccHHHHhhhhhhcccccCChhheecchHHHHHHHHhhcCceEEE---E Confidence 532221 12235799999999998885454789999999999999986444333222 2 Q ss_pred ecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccccccccc------c---ccee Q lcl|NC_020858. 230 ASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKT------G---DAEK 300 (330) Q Consensus 230 ~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKt------G---d~~k 300 (330) ..+...-..|..|..|.|-.|.|++..++||..+. ++|-+.-.++- -|. ...|+-| | |. + T Consensus 240 ~~n~g~~~~G~~v~~f~s~~G~I~L~~s~~m~~~~-------i~~~~~~~~p~-ap~--~~~vsaTv~t~~~g~f~~~-~ 308 (462) T protein:vir:96 240 QDNSGNVNAGYNVQGFYSSRGFIKLHGSTVMENEL-------ILDESLQPLPN-APQ--PATVKATVETGKKGLFTDE-H 308 (462) T ss_pred cCCCCceeeeeeccceeeeeeeeeeCCceecCccc-------ccccccccCCC-CCC--CCceeEEEEeCCCCCCCCc-c Confidence 22333346888999999999999999999998654 45655543332 111 1122222 2 22 1 Q ss_pred eEEEEE--EEEEEecchhee-EEeccccccccC Q lcl|NC_020858. 
301 FMLIGE--GALKPKNEKGLG-VAADLYGLTAST 330 (330) Q Consensus 301 ~~i~~E--~tLe~~N~~a~g-~i~gLt~~~~~~ 330 (330) . .+| |-+...|..+-. -..-++. +.+. T Consensus 309 d--~~~y~Y~V~avs~dgeS~PS~~Vta-Tva~ 338 (462) T protein:vir:96 309 D--RAELTYKVVVNSDDAQSAPSEAVTA-TVNN 338 (462) T ss_pred C--ceeEEEEEEEECCCCccccceeeEe-eeec Confidence 1 244 344444543211 0000000 0000 No 9 >protein:vir:95603 Length: 463 # NCBI annotation: ORF016 # Family: family:all:2450 # MgeID: mge:1577 # MgeName: G1 # Cross-refs: genbank:acc:YP_240903;genbank:gi:66394965;genbank:GeneID:5132544 Probab=98.69 E-value=2.4e-09 Score=67.85 Aligned_cols=288 Identities=13% Similarity=0.055 Sum_probs=177.9 Q ss_pred CCc---cccceeeccccccccccceeeEecCCcccceee--eeccceeccceeeeeeeeccCc-ccccc-cccccccccc Q lcl|NC_020858. 1 MAV---VTNTFQSTGAKGNREELADVVSRITPEDTPIYS--MIEKVSFDTTHPEWTTDELAAP-GANIT-LEGDEYTFDA 73 (330) Q Consensus 1 Ma~---~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s--~ig~~~~~~~~~~W~td~L~~~-~~na~-~EG~d~~~~~ 73 (330) |.+ ++..-++...-..||+|...|.+++-.+.+|.= -|.+.+++|+.|+|-...=..- ..-+. -|+..+.... T Consensus 26 ~~tg~g~~p~~q~~~~AlR~EsL~~~i~~Lt~~~~~f~~~~~i~k~~a~STV~~y~~~~~~G~~g~~~f~~E~g~~~~~d 105 (463) T protein:vir:95 26 FQTGYGITPDTQIDAGALRREILDDQITMLTWTNEDLIFYRDISRRPAQSTVVKYDQYLRHGNVGHSRFVKEIGVAPVSD 105 (463) T ss_pred hhcCCccCCccccCcchhhhhhhhhhhheeeecccchhhhhhcCCchhhhhhhhheeeeccCccccccccccccccccCC Confidence 322 111113333456899999999999999888843 4578999999999988874333 33333 6887765554 Q ss_pred ccCceEecceEEEEeeeeeehhHHHHHhhc----cccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcc----cccchhH Q lcl|NC_020858. 74 TVSPERLGNYTQIMRKSGIISGTQNITDEA----GRATKVKEQKLKKGVELRKDVEFSIVATNASVGGA----TRESGSL 145 (330) Q Consensus 74 ~~~~~~~~N~tQIf~~~v~VS~T~~av~~~----G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~----~r~~~Gi 145 (330) .. |.|+-...==+|.|.+.+... ++.+-++.|...++.-+...+|+++..|...-... 
-=+.+|| T Consensus 106 ~~-------~~Rr~~~~K~l~~~~~VS~~~~l~n~~~d~~~~~~~dai~~ia~tiE~a~FyGds~l~~~~~~~gleFDGl 178 (463) T protein:vir:95 106 PN-------IRQKTVSMKYVSDTKNMSIASGLVNNIADPSQILTEDAIAVVAKTIEWASFYGDASLTSEVEGEGLEFDGL 178 (463) T ss_pred Cc-------eEEEEEEeeeeehhhhhhhHHHhhcccccHHHHHHHHHHHHHHHHHHHHHhhhhhccCCCcCccccchhhh Confidence 43 333322222255555544433 34578999999999999999999999987654332 2468899 Q ss_pred HHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeee Q lcl|NC_020858. 146 PTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVAS 225 (330) Q Consensus 146 ~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~ 225 (330) ...|+-.... |.-++.|++++|+.+...+-.+=|+++.++++...|-.|..+.-.+.... T Consensus 179 ~~lId~envi--------------------DarG~~Ls~~~ln~Aa~~i~~~fGt~TD~~lp~~vka~f~~~~l~~qrv~ 238 (463) T protein:vir:95 179 AKLIDKNNVI--------------------NAKGNQLTEKHLNEAAVRIGKGFGTATDAYMPIGVHADFVNSILGRQMQL 238 (463) T ss_pred hhhcCCCCee--------------------ecCCCcccHHHHhhhhhhhhcccCChhheecchHHHHHHHHHhcCceEEE Confidence 8888533322 22345799999999888886655899999999999999986544433222 Q ss_pred eeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcc--cCCc--cccccccccc---cc Q lcl|NC_020858. 226 FRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGW--LRKI--AEDKKVAKTG---DA 298 (330) Q Consensus 226 ~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~--Lr~~--~~~e~laKtG---d~ 298 (330) ...+......|..|..|.|-.|.|++..++||..+. ++|-+....+- -.|- ..-++..|.| +. T Consensus 239 ---~~~N~~~~~~G~~v~~f~s~~G~I~L~~s~~m~~~~-------il~~~~~~~p~ap~~~~~tatv~~~~~~~~~~~~ 308 (463) T protein:vir:95 239 ---MQDNSGNVNTGYSVNGFYSSRGFIKLHGSTVMENEL-------ILDESLQPLPNAPQPAKVTATVETKQKGAFENEE 308 (463) T ss_pred ---EcCCCCceeeeeeccceeeeeeeeeeCCceecCCcc-------cccchhhcCCCCccCceeEEEEeeccCCCCCCcc Confidence 222333346888999999999999999999998654 45555442221 1110 0111112222 11 Q ss_pred eeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
299 EKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 299 ~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) ++.. -.|-+.+.|-.+-.+ .|-+-.+| T Consensus 309 ~~a~--~~Y~vv~~s~~geS~---pS~ivtaT 335 (463) T protein:vir:95 309 DRAG--LSYKVVVNSDDAQSA---PSEEVTAT 335 (463) T ss_pred cccc--eEEEEEEECCCCCcc---cchheeee Confidence 2221 235555656554331 11111122 No 10 >protein:vir:99311 Length: 463 # NCBI annotation: putative capsid protein # Family: family:all:2450 # MgeID: mge:1655 # MgeName: K # Cross-refs: genbank:acc:YP_024474;genbank:gi:48696433;genbank:GeneID:2948039 Probab=98.69 E-value=2.4e-09 Score=67.85 Aligned_cols=288 Identities=13% Similarity=0.055 Sum_probs=177.9 Q ss_pred CCc---cccceeeccccccccccceeeEecCCcccceee--eeccceeccceeeeeeeeccCc-ccccc-cccccccccc Q lcl|NC_020858. 1 MAV---VTNTFQSTGAKGNREELADVVSRITPEDTPIYS--MIEKVSFDTTHPEWTTDELAAP-GANIT-LEGDEYTFDA 73 (330) Q Consensus 1 Ma~---~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s--~ig~~~~~~~~~~W~td~L~~~-~~na~-~EG~d~~~~~ 73 (330) |.+ ++..-++...-..||+|...|.+++-.+.+|.= -|.+.+++|+.|+|-...=..- ..-+. -|+..+.... T Consensus 26 ~~tg~g~~p~~q~~~~AlR~EsL~~~i~~Lt~~~~~f~~~~~i~k~~a~STV~~y~~~~~~G~~g~~~f~~E~g~~~~~d 105 (463) T protein:vir:99 26 FQTGYGITPDTQIDAGALRREILDDQITMLTWTNEDLIFYRDISRRPAQSTVVKYDQYLRHGNVGHSRFVKEIGVAPVSD 105 (463) T ss_pred hhcCCccCCccccCcchhhhhhhhhhhheeeecccchhhhhhcCCchhhhhhhhheeeeccCccccccccccccccccCC Confidence 322 111113333456899999999999999888843 4578999999999988874333 33333 6887765554 Q ss_pred ccCceEecceEEEEeeeeeehhHHHHHhhc----cccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcc----cccchhH Q lcl|NC_020858. 74 TVSPERLGNYTQIMRKSGIISGTQNITDEA----GRATKVKEQKLKKGVELRKDVEFSIVATNASVGGA----TRESGSL 145 (330) Q Consensus 74 ~~~~~~~~N~tQIf~~~v~VS~T~~av~~~----G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~----~r~~~Gi 145 (330) .. |.|+-...==+|.|.+.+... ++.+-++.|...++.-+...+|+++..|...-... 
-=+.+|| T Consensus 106 ~~-------~~Rr~~~~K~l~~~~~VS~~~~l~n~~~d~~~~~~~dai~~ia~tiE~a~FyGds~l~~~~~~~gleFDGl 178 (463) T protein:vir:99 106 PN-------IRQKTVSMKYVSDTKNMSIASGLVNNIADPSQILTEDAIAVVAKTIEWASFYGDASLTSEVEGEGLEFDGL 178 (463) T ss_pred Cc-------eEEEEEEeeeeehhhhhhhHHHhhcccccHHHHHHHHHHHHHHHHHHHHHhhhhhccCCCcCccccchhhh Confidence 43 333322222255555544433 34578999999999999999999999987654332 2468899 Q ss_pred HHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeee Q lcl|NC_020858. 146 PTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVAS 225 (330) Q Consensus 146 ~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~ 225 (330) ...|+-.... |.-++.|++++|+.+...+-.+=|+++.++++...|-.|..+.-.+.... T Consensus 179 ~~lId~envi--------------------DarG~~Ls~~~ln~Aa~~i~~~fGt~TD~~lp~~vka~f~~~~l~~qrv~ 238 (463) T protein:vir:99 179 AKLIDKNNVI--------------------NAKGNQLTEKHLNEAAVRIGKGFGTATDAYMPIGVHADFVNSILGRQMQL 238 (463) T ss_pred hhhcCCCCee--------------------ecCCCcccHHHHhhhhhhhhcccCChhheecchHHHHHHHHHhcCceEEE Confidence 8888533322 22345799999999888886655899999999999999986544433222 Q ss_pred eeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcc--cCCc--cccccccccc---cc Q lcl|NC_020858. 226 FRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGW--LRKI--AEDKKVAKTG---DA 298 (330) Q Consensus 226 ~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~--Lr~~--~~~e~laKtG---d~ 298 (330) ...+......|..|..|.|-.|.|++..++||..+. ++|-+....+- -.|- ..-++..|.| +. T Consensus 239 ---~~~N~~~~~~G~~v~~f~s~~G~I~L~~s~~m~~~~-------il~~~~~~~p~ap~~~~~tatv~~~~~~~~~~~~ 308 (463) T protein:vir:99 239 ---MQDNSGNVNTGYSVNGFYSSRGFIKLHGSTVMENEL-------ILDESLQPLPNAPQPAKVTATVETKQKGAFENEE 308 (463) T ss_pred ---EcCCCCceeeeeeccceeeeeeeeeeCCceecCCcc-------cccchhhcCCCCccCceeEEEEeeccCCCCCCcc Confidence 222333346888999999999999999999998654 45555442221 1110 0111112222 11 Q ss_pred eeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
299 EKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 299 ~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) ++.. -.|-+.+.|-.+-.+ .|-+-.+| T Consensus 309 ~~a~--~~Y~vv~~s~~geS~---pS~ivtaT 335 (463) T protein:vir:99 309 DRAG--LSYKVVVNSDDAQSA---PSEEVTAT 335 (463) T ss_pred cccc--eEEEEEEECCCCCcc---cchheeee Confidence 2221 235555656554331 11111122 No 11 >protein:vir:100851 Length: 514 # NCBI annotation: hypothetical protein # Family: family:all:2450 # MgeID: mge:1633 # MgeName: LP65 # Cross-refs: genbank:acc:YP_164744;genbank:gi:56693157;genbank:GeneID:3197484 Probab=98.61 E-value=5.1e-09 Score=66.06 Aligned_cols=293 Identities=15% Similarity=0.068 Sum_probs=171.5 Q ss_pred CCcccc-----ceeeccccccccccceeeEecCCcccceee--eeccceeccceeeeeeeeccCc-ccc-cccccccccc Q lcl|NC_020858. 1 MAVVTN-----TFQSTGAKGNREELADVVSRITPEDTPIYS--MIEKVSFDTTHPEWTTDELAAP-GAN-ITLEGDEYTF 71 (330) Q Consensus 1 Ma~~t~-----~~~t~~~~g~~edl~d~I~~i~p~dTP~~s--~ig~~~~~~~~~~W~td~L~~~-~~n-a~~EG~d~~~ 71 (330) =|..++ +-++..+-..+|||.+.|.+++-.+.+|+= -|.+.+++|+.|+|....=..- ..- ..-|+..... T Consensus 43 ~a~t~gy~~~~~~~t~gaAlR~EsLd~~l~~Lt~~~~~ftf~~~i~k~~a~STV~ey~~~~~~G~~G~~~f~~E~gi~~~ 122 (514) T protein:vir:10 43 SAFTAGHSITPDTQTDGAANRIESLNRDLKVTTWGERDFTLYNDIAKQPVDNTVLKYTQYYSHGRTGHSLFQPEIGIGDV 122 (514) T ss_pred hhhccccccCCccccCccchhhhhhccceeEeeecCcchhhhhhcCCchhhHHHhhhhhhcccCcccccccccccccCcC Confidence 112221 112333456899999999999999999854 5688999999999988873322 223 3368876555 Q ss_pred ccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCC----cccccchhHHH Q lcl|NC_020858. 72 DATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVG----GATRESGSLPT 147 (330) Q Consensus 72 ~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~----~~~r~~~Gi~~ 147 (330) ....-+.+.-++-=+. ....||--+.- ..|+-+-++.|...++.-+...+|+++..|...-. +.+-+.+||.. 
T Consensus 123 ~d~~~~rk~~~~k~l~-~~~~vS~~~~l--~n~i~d~~~~~~~dai~~ia~tiE~a~FyGDs~L~s~~~~~gleFDGl~~ 199 (514) T protein:vir:10 123 NNPNERQRTINIKYIV-DTHVTSIALQR--ANTIVDSLKVQEYAAISTVIKTDEWAMFYGDADLTSGQKGEGLQFDGLFK 199 (514) T ss_pred CCcceEEEEEeeeeee-eeeeeeehhhh--ccchhhHHHHHHHHHHHHHHHHHHHHHhhhcccCCCccccCcchhhhHHH Confidence 4444343333322111 12233322222 22666899999999999999999999999876533 45688999999 Q ss_pred HHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHH-hhccceeeee Q lcl|NC_020858. 148 WVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVT-FMSDTNVASF 226 (330) Q Consensus 148 ~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~-f~~~~~~~~~ 226 (330) .|+-.+.. |.-+..|++++|+.+...+-.+=|+++.++++...|-.|.. |+.. . + T Consensus 200 lI~~~NvI--------------------DarG~~Ls~~~ln~aA~~i~~gfGt~TD~ylp~~vka~f~~~~~~~-q---R 255 (514) T protein:vir:10 200 LIAPENHI--------------------DLRGGRLSPAALNMAARKIGEGFGTPTDAYMPIGIKADFVNQHLNG-Q---R 255 (514) T ss_pred hhcCCCeE--------------------ecCCCCccHHHHhhhhhhhhcccCChhheeCchHHHHHHhhcccCc-c---e Confidence 99532221 22334799999999998777776899999999988765544 3332 1 1 Q ss_pred eeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccccccccccc--------- Q lcl|NC_020858. 227 RYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGD--------- 297 (330) Q Consensus 227 r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd--------- 297 (330) .....+...-..+..|+.|.|-.|.|++..+-+|.....+ |.. .....--|.+....++-|.+ T Consensus 256 V~~~~n~~~~~~G~~v~~f~s~~G~I~L~gs~im~~~n~L-------~~~-~~~~~~Ap~~~~va~svT~~~~g~~~~ad 327 (514) T protein:vir:10 256 VMLPGQTGGMTTGLDIDKFLSAHGSIRIQGSTIMDSDNKL-------DFD-RPVSPTAPTAPQLSATVTPDGGGLWHEAD 327 (514) T ss_pred EEeecCccceeeeeeccceeEeccceeecCCeeecccccC-------ccC-CccCCcCCCCCcceEEEecCcccccCccc Confidence 1122233345678888999999999999888888765543 211 11111111111111111111 Q ss_pred ---c-eeeEEE------EEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
298 ---A-EKFMLI------GEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 298 ---~-~k~~i~------~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) + -+..+- -.|.+...|..+-. .-..=++++. T Consensus 328 ~t~~~g~~~~~~~~g~~~sYaVv~~n~~GeS--~ps~~vtaT~ 368 (514) T protein:vir:10 328 KTDSKGEVILNKEVGVEQSYVAVMVSRHGDS--RPSLVQTATP 368 (514) T ss_pred ccccccccccccccceeEEEEEEEECCCCcc--cccceeeeee Confidence 1 111122 23445555554321 0000122222 No 12 >protein:vir:94771 Length: 298 # NCBI annotation: major head protein # Family: family:all:966 # MgeID: mge:1529 # MgeName: phi LC3 # Cross-refs: genbank:acc:NP_996706;genbank:gi:45597421;genbank:GeneID:2769044 Probab=98.50 E-value=3.8e-08 Score=61.25 Aligned_cols=283 Identities=9% Similarity=-0.035 Sum_probs=140.1 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) ||+-++..... .+.+.|...-....|+.++....+..+-.+.+...+-. +...-..||.+.+.......... T Consensus 1 ma~~gG~lip~-------~~~~~ii~~~~~~s~i~~~~~~~~~~~~~~~~p~~~~~-~~a~~v~Eg~~~~~~~~~f~~v~ 72 (298) T protein:vir:94 1 MVLNKGTLFDP-------ELVTDLISKVAGKSSIARLSAQKPIPFNGEKVFTFTMD-SEIDVVAESGKKTHGGVTLAPQT 72 (298) T ss_pred CeeccccccCh-------hHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEecC-cceEEeeCCccccccccceeEEE Confidence 99877543333 34445555555666777665444333322333222211 11122358877665433322221 Q ss_pred cceEEEEeeeeeehhHHHHHhhcc--ccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEAG--RATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~G--~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) -+.. -+...+.||.- .+.... ..+..++-..+-...+.+.+|.++++|.....+.+-.-.|...+.. . 
T Consensus 73 l~~~-k~~~~~~iS~e--ll~~~~~~~~~l~~~i~~~la~ai~~~~d~~~l~G~~~~~g~~~~~~~~~~~~~-----~-- 142 (298) T protein:vir:94 73 MVPI-KVEYGARISDE--FMYASDEEKINILQAFNDGFAKKVARGIDLMAFHGVNPRLGTASAVIGTNHFDS-----K-- 142 (298) T ss_pred Eeee-EEEEeeehhHH--HhccCCccHHHHHHHHHHHHHHHHHHHHHHHhhcccccCCCccccccccccccc-----c-- Confidence 1222 22334555543 222111 1223344445566778899999999986433332211111111110 0 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeE Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSI 238 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~ 238 (330) .+.....++......++|.+++.++..++.+...+++||..+.++.++ ++.+. |+...+... T Consensus 143 -----------~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l-kd~~G---~~l~~~~~~--- 204 (298) T protein:vir:94 143 -----------VTQKVEAPRGIADPNGAIENAVELLTGVDADVTGIAINPSFRSALAKQ-KDLQG---NALFPELKW--- 204 (298) T ss_pred -----------cccccccccccccHHHHHHHHHHhhhhcCCCccEEEEcHHHHHHHHHh-hccCC---CeeecCccc--- Confidence 000111112222345678889999988888888899999999999886 33331 222111100 Q ss_pred EEEEEEEEcCCeEEEEEEcCcCCCccc-cccEEEEEcchhhhhcccCCccccccccc------------cccceeeEEEE Q lcl|NC_020858. 239 VANADVYEGPFGKVMIHPNRVMAGSGA-LARNAFFVDPEFLQFGWLRKIAEDKKVAK------------TGDAEKFMLIG 305 (330) Q Consensus 239 ~~~v~~~~tdfG~v~iv~nR~mp~~~~-~a~~~~~ld~~~~~~~~Lr~~~~~e~laK------------tGd~~k~~i~~ 305 (330) +.... +=+| +.++.+..||.... ....+|+.|.+..-....|...+-+ +.. .-+...+.... T Consensus 205 ~~~~~---tl~G-~PV~~~~~v~~~~~~~~~~~~~Gdfs~~~~~~~~~~~~~~-~~~~~~~d~~~~~~f~~~~v~~r~~~ 279 (298) T protein:vir:94 205 GATPD---TING-LPVDVNKTVSDMSLTQRDRAIIGDFANGFKWGYAKEVPLE-VIQYGDPDNSGLDLKGYNQVYIRAEL 279 (298) T ss_pred CCCCc---eecc-eeeEEecccccccCCCccEEEEeeccceEEEEEecCceEE-EeecCCCcCcchhhhhcCcEEEEEEE Confidence 00000 0123 47777788885422 1235677788743211112111111 111 11223344566 Q ss_pred EEEEEEecchheeEEeccc Q lcl|NC_020858. 
306 EGALKPKNEKGLGVAADLY 324 (330) Q Consensus 306 E~tLe~~N~~a~g~i~gLt 324 (330) .+++.+++|+|..+|.+-| T Consensus 280 r~~~~~~~~~a~~~l~~~t 298 (298) T protein:vir:94 280 FLGWGILDATKFARVTEAN 298 (298) T ss_pred EeccEeecccceEEEEecC Confidence 7889999999999999988 No 13 >protein:vir:1886 Length: 385 # NCBI annotation: major capsid subunit precursor # Family: family:all:585 # MgeID: mge:41 # MgeName: HK022 # Cross-refs: genbank:acc:NP_037666;genbank:gi:9634124;genbank:GeneID:1262513 Probab=98.33 E-value=1.9e-07 Score=57.39 Aligned_cols=277 Identities=9% Similarity=0.021 Sum_probs=145.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) .+..+++ +.......+.+.+.|+..--...|+++++......+..+++....-.++...-..||...+..... .. T Consensus 103 ~~~~~~~--~~~g~~i~~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~---~~ 177 (385) T protein:vir:18 103 KSLGSDA--DSAGSLIQPMQIPGIIMPGLRRLTIRDLLAQGRTSSNALEYVREEVFTNNADVVAEKALKPESDIT---FS 177 (385) T ss_pred hhhcccc--ccCCceecchhhhHHHHHhhhccchhhhcceecccCcceEEEEEecCCcceeeeccCccccccccc---ee Confidence 1111111 111112334566667776678899999877666655555555544333333334577665544322 11 Q ss_pred cceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATGA 160 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g~ 160 (330) .....+++-...+.=|.+.++.. .+...|=...-...+.+-+|.++|+|.... ....||....... 
T Consensus 178 ~~~~~~~k~~~~~~is~ell~d~--~~l~~~i~~~la~a~~~~~d~~~l~G~g~~----~~~~Gi~~~~~~~-------- 243 (385) T protein:vir:18 178 KQTANVKTIAHWVQASRQVMDDA--PMLQSYINNRLMYGLALKEEGQLLNGDGTG----DNLEGLNKVATAY-------- 243 (385) T ss_pred EEEEeeeeEEEeehhhHHHHhhH--HHHHHHHHHHHHHHHHHHHHHHHHhccCCC----Ccccccccccccc-------- Confidence 22233333333333344455543 233444444556668888999999885321 1234544321100 Q ss_pred cccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeEEE Q lcl|NC_020858. 161 NGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSIVA 240 (330) Q Consensus 161 ~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~~ 240 (330) ..+.......+.+.|.+++.++-.++.....+++||....+|..+ ++... |+...+.. .. T Consensus 244 ------------~~~~~~~~~~~~d~i~~~~~~l~~~~~~~~~~~~~~~~~~~l~~l-kd~~G---~~l~~~~~----~~ 303 (385) T protein:vir:18 244 ------------DTSLNATGDTRADIIAHAIYQVTESEFSASGIVLNPRDWHNIALL-KDNEG---RYIFGGPQ----AF 303 (385) T ss_pred ------------cccccccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh-hcCCC---ceeccCcc----cC Confidence 001111123566888888888888888888899999998888875 34332 22221100 00 Q ss_pred EEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchh-hhhcccCCcccccccccc-----ccceeeEEEEEEEEEEecc Q lcl|NC_020858. 241 NADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKVAKT-----GDAEKFMLIGEGALKPKNE 314 (330) Q Consensus 241 ~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKt-----Gd~~k~~i~~E~tLe~~N~ 314 (330) .. .+=+| +.|+.+++||++. +++.|+.. +.+ +.+.-..-+..-.. -+...+.....++..+++| T Consensus 304 ~~---~~l~G-~pV~~~~~~p~~~-----~~~gd~~~~~~~-~~~~~~~v~~~~~~~~~~~~~~~~~~~~~r~~~~v~~~ 373 (385) T protein:vir:18 304 TS---NIMWG-LPVVPTKAQAAGT-----FTVGGFDMASQV-WDRMDATVEVSREDRDNFVKNMLTILCEERLALAHYRP 373 (385) T ss_pred CC---ceecc-eeeEEcCcCCCCc-----EEEeecccEEEE-EEecceEEEEeccccchhhcCcEEEEEEEeeccEEecc Confidence 00 11246 6788999999764 56788764 322 22321111100011 1233444556689999999 Q ss_pred hheeEEeccccccccC Q lcl|NC_020858. 
315 KGLGVAADLYGLTAST 330 (330) Q Consensus 315 ~a~g~i~gLt~~~~~~ 330 (330) .|..+++-=+ +| T Consensus 374 ~a~~~~~~~a----a~ 385 (385) T protein:vir:18 374 TAIIKGTFSS----GS 385 (385) T ss_pred cceEEEEecc----CC Confidence 9998876432 23 No 14 >protein:vir:191 Length: 385 # NCBI annotation: major head subunit precursor # Family: family:all:585 # MgeID: mge:6 # MgeName: HK97 # Cross-refs: genbank:acc:NP_037701;genbank:gi:9634158;genbank:GeneID:1262530 Probab=98.33 E-value=1.9e-07 Score=57.39 Aligned_cols=277 Identities=9% Similarity=0.021 Sum_probs=145.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) .+..+++ +.......+.+.+.|+..--...|+++++......+..+++....-.++...-..||...+..... .. T Consensus 103 ~~~~~~~--~~~g~~i~~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~---~~ 177 (385) T protein:vir:19 103 KSLGSDA--DSAGSLIQPMQIPGIIMPGLRRLTIRDLLAQGRTSSNALEYVREEVFTNNADVVAEKALKPESDIT---FS 177 (385) T ss_pred hhhcccc--ccCCceecchhhhHHHHHhhhccchhhhcceecccCcceEEEEEecCCcceeeeccCccccccccc---ee Confidence 1111111 111112334566667776678899999877666655555555544333333334577665544322 11 Q ss_pred cceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATGA 160 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g~ 160 (330) .....+++-...+.=|.+.++.. .+...|=...-...+.+-+|.++|+|.... ....||....... 
T Consensus 178 ~~~~~~~k~~~~~~is~ell~d~--~~l~~~i~~~la~a~~~~~d~~~l~G~g~~----~~~~Gi~~~~~~~-------- 243 (385) T protein:vir:19 178 KQTANVKTIAHWVQASRQVMDDA--PMLQSYINNRLMYGLALKEEGQLLNGDGTG----DNLEGLNKVATAY-------- 243 (385) T ss_pred EEEEeeeeEEEeehhhHHHHhhH--HHHHHHHHHHHHHHHHHHHHHHHHhccCCC----Ccccccccccccc-------- Confidence 22233333333333344455543 233444444556668888999999885321 1234544321100 Q ss_pred cccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeEEE Q lcl|NC_020858. 161 NGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSIVA 240 (330) Q Consensus 161 ~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~~ 240 (330) ..+.......+.+.|.+++.++-.++.....+++||....+|..+ ++... |+...+.. .. T Consensus 244 ------------~~~~~~~~~~~~d~i~~~~~~l~~~~~~~~~~~~~~~~~~~l~~l-kd~~G---~~l~~~~~----~~ 303 (385) T protein:vir:19 244 ------------DTSLNATGDTRADIIAHAIYQVTESEFSASGIVLNPRDWHNIALL-KDNEG---RYIFGGPQ----AF 303 (385) T ss_pred ------------cccccccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh-hcCCC---ceeccCcc----cC Confidence 001111123566888888888888888888899999998888875 34332 22221100 00 Q ss_pred EEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchh-hhhcccCCcccccccccc-----ccceeeEEEEEEEEEEecc Q lcl|NC_020858. 241 NADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKVAKT-----GDAEKFMLIGEGALKPKNE 314 (330) Q Consensus 241 ~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKt-----Gd~~k~~i~~E~tLe~~N~ 314 (330) .. .+=+| +.|+.+++||++. +++.|+.. +.+ +.+.-..-+..-.. -+...+.....++..+++| T Consensus 304 ~~---~~l~G-~pV~~~~~~p~~~-----~~~gd~~~~~~~-~~~~~~~v~~~~~~~~~~~~~~~~~~~~~r~~~~v~~~ 373 (385) T protein:vir:19 304 TS---NIMWG-LPVVPTKAQAAGT-----FTVGGFDMASQV-WDRMDATVEVSREDRDNFVKNMLTILCEERLALAHYRP 373 (385) T ss_pred CC---ceecc-eeeEEcCcCCCCc-----EEEeecccEEEE-EEecceEEEEeccccchhhcCcEEEEEEEeeccEEecc Confidence 00 11246 6788999999764 56788764 322 22321111100011 1233444556689999999 Q ss_pred hheeEEeccccccccC Q lcl|NC_020858. 
315 KGLGVAADLYGLTAST 330 (330) Q Consensus 315 ~a~g~i~gLt~~~~~~ 330 (330) .|..+++-=+ +| T Consensus 374 ~a~~~~~~~a----a~ 385 (385) T protein:vir:19 374 TAIIKGTFSS----GS 385 (385) T ss_pred cceEEEEecc----CC Confidence 9998876432 23 No 15 >protein:vir:63741 Length: 468 # NCBI annotation: Cps # Family: family:all:2450 # MgeID: mge:1517 # MgeName: P100 # Cross-refs: genbank:gi:82547622;genbank:GeneID:3783474 Probab=98.26 E-value=9.4e-08 Score=59.12 Aligned_cols=295 Identities=14% Similarity=0.120 Sum_probs=174.6 Q ss_pred CC-ccccce-----eeccccccccccceeeEecCCcccceee--eeccceeccceeeeeeeeccCc-ccccc-ccccccc Q lcl|NC_020858. 1 MA-VVTNTF-----QSTGAKGNREELADVVSRITPEDTPIYS--MIEKVSFDTTHPEWTTDELAAP-GANIT-LEGDEYT 70 (330) Q Consensus 1 Ma-~~t~~~-----~t~~~~g~~edl~d~I~~i~p~dTP~~s--~ig~~~~~~~~~~W~td~L~~~-~~na~-~EG~d~~ 70 (330) |. ..++-- ++...-..||+|+..|.+++-.+.+|.= -|.+.++.|+.|+|-...=..- ..-+. -|+.... T Consensus 23 ~Ks~~agy~~~p~~q~~~~AlR~EsL~~~i~~L~~~~~~f~~~~di~k~~a~stv~~y~~~~~~G~~g~~~f~~E~g~~~ 102 (468) T protein:vir:63 23 LKSFTTGYGITPDTQTDAGALRREFLDDQISMLTWTENDLTFYKDIAKKPATSTVAKYDVYMQHGKVGHTRFTREIGVAP 102 (468) T ss_pred HHHHHcCcccCCccccCcchhhhhhhhhhhheeeecccchhhhhhcccchhhhhhhhheeeeccCccccccccccccccc Confidence 11 122111 2223346799999999999999988843 4578999999999988874333 33333 6888776 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccc-----cchhH Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATR-----ESGSL 145 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r-----~~~Gi 145 (330) .....-+.+.-++- -+..+-.||--+.- .-++.+-++.|...++.-+...+|+++..|.....+++. 
+-+|| T Consensus 103 ~~~~~~~r~~~~~k-~l~~~~~vs~~~~l--~n~i~d~~~~~~~~ai~~~a~tiE~a~FyGds~l~~s~~~~~glqfDGi 179 (468) T protein:vir:63 103 VSDPNIRQKTVNMK-FASDTKNISIAAGL--VNNIQDPMQILTDDAIVNIAKTIEWASFFGDSDLSDSPEPQAGLEFDGL 179 (468) T ss_pred cCCCceEEEEEEee-eeeeeeeehhhhhh--hcchhhHHHHHHHHHHHHHHHHHHHHhhhcccccccCCCccccccccce Confidence 65544444433322 22233344433333 334678899999999999999999999999877654433 57777 Q ss_pred HHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHH-HHhhccceee Q lcl|NC_020858. 146 PTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVF-VTFMSDTNVA 224 (330) Q Consensus 146 ~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~i-s~f~~~~~~~ 224 (330) ...|.-.. +.+ .-+..||+++|+++...+-+.=|.+..++++...|-.| +.+...+ T Consensus 180 ~~li~~en----------------viD----a~G~~ls~~~lneaa~~i~~gfG~~td~~~~~~v~a~~~~~~L~~q--- 236 (468) T protein:vir:63 180 AKLINQDN----------------VHD----ARGASLTESLLNQAAVMISKGYGTPTDAYMPVGVQADFVNQQLSKQ--- 236 (468) T ss_pred eEEecCCc----------------eec----cCCCccCHHHHHHHhhhccccccChhhhhcchhHHhhhhhhhcCce--- Confidence 77663211 111 22345999999998877766547888999999999666 6654432 Q ss_pred eeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccc----ccccccc-ce Q lcl|NC_020858. 225 SFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDK----KVAKTGD-AE 299 (330) Q Consensus 225 ~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e----~laKtGd-~~ 299 (330) .+....++.....+..|.-|.|..|.|++..+-+|.... +++|+-...+. -|-+... ...++|- +. 
T Consensus 237 -~~v~~~n~~~~~~G~~v~g~~sa~G~I~l~gs~il~~~~-------~l~~~~~~~~~-Apsp~~vsaT~~~~~~g~~~~ 307 (468) T protein:vir:63 237 -TQLVRDNGNNVSVGFNIQGFHSARGFIKLHGSTVMENEQ-------ILDERILALPT-APQPAKVTATQEAGKKGQFRA 307 (468) T ss_pred -EEEEcCCCCceeeeecccceecceeeeeecCceeecccc-------CCCcccccccc-cccCCccceeeecccCCcccC Confidence 233334555667888888999999999998887776544 35555444332 1211111 0011111 00 Q ss_pred eeEEEEEEEEEEecchheeEEe------------------ccccccccC Q lcl|NC_020858. 300 KFMLIGEGALKPKNEKGLGVAA------------------DLYGLTAST 330 (330) Q Consensus 300 k~~i~~E~tLe~~N~~a~g~i~------------------gLt~~~~~~ 330 (330) ..--.-+|-+.+.|..+-..+. .|.++..++ T Consensus 308 ~~~a~y~Y~v~~vs~~GES~pS~~vtvTVaa~~dg~~ltIt~~~~~~~~ 356 (468) T protein:vir:63 308 EDLAAHEYKVVVSSDDAESIASEVATATVTAKDDGVKLEIELAPMYSSR 356 (468) T ss_pred CCcceEEEEEEEECCCCccccccceEEEecCcccceeEEEEecCCCCCc Confidence 0000123444444443322211 111222222 No 16 >protein:vir:80491 Length: 467 # NCBI annotation: Cps # Family: family:all:2450 # MgeID: mge:1883 # MgeName: A511 # Cross-refs: genbank:acc:YP_001468466;genbank:gi:157325041;genbank:GeneID:5601449 Probab=98.25 E-value=9.8e-08 Score=59.02 Aligned_cols=295 Identities=14% Similarity=0.117 Sum_probs=174.6 Q ss_pred CCccccce-----eeccccccccccceeeEecCCcccceee--eeccceeccceeeeeeeeccCc-ccccc-cccccccc Q lcl|NC_020858. 1 MAVVTNTF-----QSTGAKGNREELADVVSRITPEDTPIYS--MIEKVSFDTTHPEWTTDELAAP-GANIT-LEGDEYTF 71 (330) Q Consensus 1 Ma~~t~~~-----~t~~~~g~~edl~d~I~~i~p~dTP~~s--~ig~~~~~~~~~~W~td~L~~~-~~na~-~EG~d~~~ 71 (330) =|..++-- ++...-..||+|+..|.+++-.+.+|.= -|.+.++.|+.|+|-...=..- ..-+. -|+..... 
T Consensus 23 Ks~~agy~~~p~tq~~~~AlR~EsL~~~i~~Lt~~~~~f~~~~di~k~~a~stv~~y~~~~~~G~~g~~~f~~E~g~~~~ 102 (467) T protein:vir:80 23 KSFTTGYGITPDTQTDAGALRREFLDDQISMLTWTENDLTFYKDIAKKPATSTVAKYDVYMQHGKVGHTRFTREIGVAPV 102 (467) T ss_pred HHHHcccccCCccccCcchhhhhhhhhhhheeeccccchhhhhhcccchhhhhhhhheeeeccCcccccccccccccccc Confidence 11122222 2223346799999999999999988843 4578999999999988874333 33333 68887766 Q ss_pred ccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccc-----cchhHH Q lcl|NC_020858. 72 DATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATR-----ESGSLP 146 (330) Q Consensus 72 ~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r-----~~~Gi~ 146 (330) ....-+.+.-++- -+..+-.||--+.- .-++.+-++.|...++.-+...+|+++..|.....+++. +-+||. T Consensus 103 ~~~~~~r~~~~~k-~l~~~~~vs~~~~l--~n~i~d~~~~~~~~ai~~~a~tiE~a~FyGds~l~~s~~~~~glqfDGi~ 179 (467) T protein:vir:80 103 SDPNIRQKTVNMK-FASDTKNISIAAGL--VNNIQDPMQILTDDAIVNIAKTIEWASFFGDSDLSDSPEPQAGLEFDGLA 179 (467) T ss_pred CCCceEEEEEEee-eeeeeeeehhhhhh--hcchhhHHHHHHHHHHHHHHHHHHHHhhhcccccccCCCcccccccccee Confidence 5544444433322 22233344433333 334678899999999999999999999999877654433 577777 Q ss_pred HHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHH-HHhhccceeee Q lcl|NC_020858. 147 TWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVF-VTFMSDTNVAS 225 (330) Q Consensus 147 ~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~i-s~f~~~~~~~~ 225 (330) ..|.-.. 
+.+ .-+..||+++|+++...+-+.=|.+..++++...|-.| +.+...+ T Consensus 180 ~li~~en----------------viD----a~G~~ls~~~lneaa~~i~~gfG~~td~~~p~~v~a~~~~~~L~~q---- 235 (467) T protein:vir:80 180 KLINQDN----------------VHD----ARGASLTESLLNQAAVMISKGYGTPTDAYMPVGVQADFVNQQLSKQ---- 235 (467) T ss_pred EEecCCc----------------eec----cCCCccCHHHHHHHhhhccccccChhhhhcchhHHhhhhhhhcCce---- Confidence 7663211 111 22345999999998877766547888999999999666 6654432 Q ss_pred eeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccc----ccccccc-cee Q lcl|NC_020858. 226 FRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDK----KVAKTGD-AEK 300 (330) Q Consensus 226 ~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e----~laKtGd-~~k 300 (330) .+....++.....+..|.-|.|..|.|++..+-+|.... +++|+-...+. -|-+... ...++|- +.. T Consensus 236 ~~v~~~n~~~~~~G~~v~g~~sa~G~I~l~gs~il~~~~-------~l~~~~~~~~~-Apsp~~vsaT~~~~~~g~~~~~ 307 (467) T protein:vir:80 236 TQLVRDNGNNVSVGFNIQGFHSARGFIKLHGSTVMENEQ-------ILDERILALPT-APQPAKVTATQEAGKKGQFRAE 307 (467) T ss_pred EEEEcCCCCceeeeecccceecceeeeeecCceeecccc-------CCCcccccccc-cccCCccceeeecccCCcccCC Confidence 233334555667888888999999999998887776544 35555444332 1211110 0011111 001 Q ss_pred eEEEEEEEEEEecchheeEEe------------------ccccccccC Q lcl|NC_020858. 301 FMLIGEGALKPKNEKGLGVAA------------------DLYGLTAST 330 (330) Q Consensus 301 ~~i~~E~tLe~~N~~a~g~i~------------------gLt~~~~~~ 330 (330) .--.-+|-+.+.|..+-..+. 
.|.++..++ T Consensus 308 ~~a~y~Y~v~~vs~~GES~pS~~vtvTVaa~~dg~~ltIt~~~~~~~~ 355 (467) T protein:vir:80 308 DLAAHEYKVVVSSDDAESIASEVATATVTAKDDGVKLEIELAPMYSSR 355 (467) T ss_pred CcceEEEEEEEECCCCccccccceEEEecCcccceeEEEEecCCCCCc Confidence 000123444444443322211 111222222 No 17 >protein:vir:348 Length: 321 # NCBI annotation: major virion structural protein # Family: family:all:3198 # MgeID: mge:9 # MgeName: Mx8 # Cross-refs: genbank:acc:NP_203462;genbank:gi:15320618;genbank:GeneID:921734 Probab=98.23 E-value=4.7e-08 Score=60.76 Aligned_cols=282 Identities=13% Similarity=0.134 Sum_probs=153.8 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeecc-----------------ceeccceeeeee--eeccCcccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEK-----------------VSFDTTHPEWTT--DELAAPGAN 61 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~-----------------~~~~~~~~~W~t--d~L~~~~~n 61 (330) |+.|. ++---++ ..+..+..+..+=-.++|||-.+.+ +.+....+.|-+ |.|.- ++. T Consensus 1 mp~~~--lsel~t~-tl~~rs~~~~D~v~~~n~LL~~L~~kG~~~~~~gg~~I~~~l~y~~~s~~~wy~Gyd~l~~-~p~ 76 (321) T protein:vir:34 1 MPFPN--ISDIITT-TIESRSGVIADNVTKNNAILARLAKRGKPRLVSGGYTILEELSFSGNSNGGWYSGYDVLPT-APQ 76 (321) T ss_pred CCCch--HHHHHHH-HHHhhcchhhhhhhcccHHHHHHHhcCcccccCCCeeEEEEEeeccCcceeEEEeeeeecc-chh Confidence 87754 2211111 1222233333333456777765521 122334455543 33432 223 Q ss_pred ccccccccccccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCC---cCCc- Q lcl|NC_020858. 62 ITLEGDEYTFDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNA---SVGG- 137 (330) Q Consensus 62 a~~EG~d~~~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~---~~~~- 137 (330) -.+|.+.++-... .. .+.|||+-+= .+.|. +|+-.-+++.-+..|..|.+.-. 
..++ T Consensus 77 d~~~~Aef~wk~a-----a~--------~~~isg~e~l-~n~g~-----~~~idll~~~~~~ae~t~~n~l~~~l~sdGT 137 (321) T protein:vir:34 77 DVISSAEYALKQY-----AV--------PVVISGLEML-QNSGK-----EAQLDLLEARMNVAEATMANDISAALYGDGT 137 (321) T ss_pred hhccccccchhhe-----eE--------eeEEehhHHh-hccch-----HHHHHHHHHHHHHHHHHHHhhhhHhhhcccc Confidence 3345555433322 22 3456766543 33332 24444455555555555555432 1223 Q ss_pred --ccccchhHHHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHh---cC-CceeEEEeChHHH Q lcl|NC_020858. 138 --ATRESGSLPTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQ---SG-ANFKHVFVSPYVK 211 (330) Q Consensus 138 --~~r~~~Gi~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~---~G-g~~~~i~v~~~~k 211 (330) ..|++.||..+|...+..|+.|.+.-.+-.- =-+..+++.. ..|-..+...|.++|. .| ..|+.|+++...- T Consensus 138 a~g~~~i~GL~~lv~~~p~tGtvGGIdra~~~~-WRn~~~d~~~-~~t~~tl~~~m~~~w~~~~Rg~~~PDlii~~~~~y 215 (321) T protein:vir:34 138 AFGGRAINGLDGAVPVDPTVGTYGGINRALWPF-WRSQVEDMAA-VATINTIQPAMTKLWSRCVRGADMPDLIMSGNDAW 215 (321) T ss_pred ccccchhhhhhhhcccCCCCceeccccccchhh-hhhhhhhhhh-cccHHHHHHHHHHHHHhhccCCCCccEEEechHHH Confidence 3689999999997666666555432221100 0012233222 2577778899999996 34 4899999998775 Q ss_pred HHHHHhhccceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcC----cCCCccccccEEEEEcchhhhhcccCCcc Q lcl|NC_020858. 212 SVFVTFMSDTNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNR----VMAGSGALARNAFFVDPEFLQFGWLRKIA 287 (330) Q Consensus 212 ~~is~f~~~~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR----~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~ 287 (330) .+ +..+..+.+|+...++ .+.|..==.|.+ +.||.+. 
+||+ +.+|+||.+|+++++...- T Consensus 216 ~~----y~~s~q~~qR~~~~~~--a~~Gf~~Lky~~----~div~D~~~g~~~pa-----n~~yfiNT~yl~~r~h~~~- 279 (321) T protein:vir:34 216 TT----YSNSLQVLQRFTSAEE--ANLGFRSLKFLS----TDVVLDGGIGGFAGA-----NTMYFLNTKYLHFRPHKDR- 279 (321) T ss_pred HH----HHHhhheeeeeccccc--ccccceeeeeee----EEEEEeCCCCCCccc-----cceeeeecceEEEEEcCCC- Confidence 55 3445566666644433 233321112443 8888887 5774 4689999999999875431 Q ss_pred cccccccc------ccceeeEEEEEEEEEEecchheeEEecc Q lcl|NC_020858. 288 EDKKVAKT------GDAEKFMLIGEGALKPKNEKGLGVAADL 323 (330) Q Consensus 288 ~~e~laKt------Gd~~k~~i~~E~tLe~~N~~a~g~i~gL 323 (330) ...|+... =|+.-.+|..-.-|-|.|+.+|+++.+- T Consensus 280 ~~~pi~p~r~~~~NqdA~~q~I~~~GnL~~sn~~~~~vL~~~ 321 (321) T protein:vir:34 280 NMVPLSPSRRAAFNQDAEAQILAWAGNLTCSGAQFQGRLIAE 321 (321) T ss_pred ceeecCcccccccchhHHhhhhhhhheeeeecccceeEEeeC Confidence 11233322 2344466777888999999999999988 No 18 >protein:vir:100135 Length: 418 # NCBI annotation: gp5 # Family: family:all:585 # MgeID: mge:1639 # MgeName: phi1026b # Cross-refs: genbank:acc:NP_945035;genbank:gi:38707895;genbank:GeneID:2744182 Probab=98.21 E-value=5.1e-07 Score=55.10 Aligned_cols=281 Identities=10% Similarity=-0.025 Sum_probs=145.9 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) +....++-++.......+.+...|+..-....|++.++......+....|....-..+...-..||...+.......... 
T Consensus 132 ~~~~~~~~~~~~g~lvp~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~f~~v~ 211 (418) T protein:vir:10 132 VPATVGSGVSGSNSLVVADRQAGIIAPPQRKMTIRDLLMPGQTSSSSIEYTVETGFTNNAAAVAEGAQKPTSDLKFNLKN 211 (418) T ss_pred hhhhccCCCCCCccccchhHHHHHHHHHhhhhhHHhhcceeeccCCceeEEEEecCCCceeeeccCccccccccceeeEE Confidence 11111111111112234467777777778888998887776666555666665544443344468877655433222111 Q ss_pred cceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATGA 160 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g~ 160 (330) -.... +...+.|| .+.+... .+..+|=...-...+.+-+|.++|+|..... ...||........ T Consensus 212 ~~~~k-~~~~~~is--~ell~ds--~~l~~~i~~~l~~a~~~~~d~a~l~G~g~~~----~p~Gi~~~~~~~~------- 275 (418) T protein:vir:10 212 QPVRT-IAHLFKAS--RQILDDA--PALQSYIDGRARYGLQLTEEGQILKGDGTGA----NILGILPQASAFM------- 275 (418) T ss_pred Eeeee-EEEeehhh--HHHHHhH--HHHHHHHHHHHHHHHHHHHHHHHhccCCCCc----ccccccccccccc------- Confidence 11111 11223344 4444443 2444444445566789999999999854321 1235443321000 Q ss_pred cccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeEEE Q lcl|NC_020858. 161 NGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSIVA 240 (330) Q Consensus 161 ~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~~ 240 (330) .+.+.....+.++|.+++.++..++.....++|||.....|..+- +.+. |+...+- .. . 
T Consensus 276 -------------~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~lk-d~~G---~~i~~~~-~~---~ 334 (418) T protein:vir:10 276 -------------PSITLANATPIDKIRLALLQAVLAEFPATGIVLNPIDWASIELTK-DSQG---RYIVGNP-VN---G 334 (418) T ss_pred -------------ccccccccccHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHhh-cCCC---ceecccc-cc---C Confidence 011111234556788888888877777777899999988887763 3321 2222110 00 0 Q ss_pred EEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccccccccc-----ccceeeEEEEEEEEEEecch Q lcl|NC_020858. 241 NADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKT-----GDAEKFMLIGEGALKPKNEK 315 (330) Q Consensus 241 ~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKt-----Gd~~k~~i~~E~tLe~~N~~ 315 (330) . --+=+| +.|+.+.+||.+. +++.|++..-+-+.+.-..-+...-. -+...+..+..+...+++|+ T Consensus 335 ~---~~~l~G-~pV~~~~~~p~~~-----~~~gd~s~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~~~d~~~~~~~ 405 (418) T protein:vir:10 335 T---TPRLWN-LPVVETQAMTANE-----FLVGAFSMAAQIFDRMEIEVLLSTENVDDFEKNMVSIRAEERLALAVYRPE 405 (418) T ss_pred C---Cceecc-eeeEEcCCCCCCc-----EEEeeccceEEEEEecceEEEEecccchhhhcCceEEEEEEeeccEEeccc Confidence 0 001245 5788899999754 56788775222222221111100111 23345556777999999999 Q ss_pred heeEEeccccccccC Q lcl|NC_020858. 316 GLGVAADLYGLTAST 330 (330) Q Consensus 316 a~g~i~gLt~~~~~~ 330 (330) |...++.- ++.. T Consensus 406 a~~~~~~~---~~~~ 417 (418) T protein:vir:10 406 SFVTGALV---EQAG 417 (418) T ss_pred ceEEEEec---cCCC Confidence 98765532 2222 No 19 >protein:vir:9574 Length: 300 # NCBI annotation: gp40 # Family: family:all:966 # MgeID: mge:171 # MgeName: SM1 # Cross-refs: genbank:acc:NP_862879;genbank:gi:32469471;genbank:GeneID:1461316 Probab=98.21 E-value=4.6e-07 Score=55.34 Aligned_cols=283 Identities=11% Similarity=0.031 Sum_probs=139.2 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) ||..+..- ..-..+.+.+.|...-....|+..+.......+....+...+-. +..--..||.+.+..... + T Consensus 1 ma~~t~~~----G~lip~~~~~~ii~~l~~~s~i~~l~~~~~~~~~~~~~p~~~~~-~~a~wv~Eg~~~~~s~~~----f 71 (300) T protein:vir:95 1 MSEAQLSK----GNLFNPELVTKVINKVKGHSSIAKLSPQKPIPFNGQREFVFDFD-SDIDIVAENGKKTHGGVS----L 71 (300) T ss_pred CcccccCC----cceechhhHHHHHHHHHhhhhhhhhcceeeccCCceEEEEEecC-cceEEeeCCccccccccc----c Confidence 98877441 11233456666666556667776654433333233333333222 222233577776544322 2 Q ss_pred cceE---EEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 81 GNYT---QIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 81 ~N~t---QIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) .+++ .-+..-+.||.-...-......+...+=..+-.+.+.+-+|.++++|...-.+++...-|....-. .. T Consensus 72 ~~v~l~~~k~~~~~~iS~ell~~~~d~~~~l~~~i~~~l~~aia~~~d~~~l~G~~~~~g~~~~~~~~~~~~~--~~--- 146 (300) T protein:vir:95 72 DPVTIVPLKVEYGARVSDEFLHASEEAKVDMLTDFVEGFSKKLARGLDIMSIHGINPRTKQASTIIGDNCFDK--KV--- 146 (300) T ss_pred eeeEeeeEEEEEeehhhHHHhccCCCCHHHHHHHHHHHHHHHHHHHHHHhhhhcccCCCCCCccccccccccc--cc--- Confidence 2222 123334445433221001111122233333456778999999999986433333211111111100 00 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) .......+ ..+.+.|.+++.++-.++.++..+++||....++..+- +.+. |+...+.. 
T Consensus 147 ------------~~~~~~~~---~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~L~~lk-d~~G---~~i~~~~~--- 204 (300) T protein:vir:95 147 ------------TQTVPFKD---TNPDESMEDAVGMIDGSERDITGAILDPIFTTALSKMK-NAEG---GKLYPELA--- 204 (300) T ss_pred ------------ceeecccc---cchHHHHHHHHHHhhhcCCCccEEEECHHHHHHHHHhh-ccCC---CeeccCcc--- Confidence 00001111 24457788888888888888888999999999998863 3321 11111100 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCcccccc-EEEEEcchh-hhhcccCCcccccccccccc------------ceeeEE Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALAR-NAFFVDPEF-LQFGWLRKIAEDKKVAKTGD------------AEKFML 303 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~-~~~~ld~~~-~~~~~Lr~~~~~e~laKtGd------------~~k~~i 303 (330) .+.. .-+=+| +.++.+..||.+..... .+|+-|.+. +.+.. |...+.+ ....+| ..-+.. T Consensus 205 ~~~~---~~~l~G-~Pv~~s~~v~~~~~~~~~~~~~GDf~~~~~~~~-~~~~~~~-v~~~~~~d~~~~~~f~~~~v~~r~ 278 (300) T protein:vir:95 205 WGGV---PDAING-LAVDKNRTVSYSQTDPKNTAIVGDFETMFKWGY-AKEVPME-IIKYGDPDNSGRDLKGYNQIYIRC 278 (300) T ss_pred ccCC---Cceecc-eeeEEecCCCCCCCCCccEEEEeeccceEEEEE-ecccEEE-EeeccCCCCcchhhhhcCcEEEEE Confidence 0000 001123 47788888887643222 245567652 22222 2111111 111111 133344 Q ss_pred EEEEEEEEecchheeEEecccc Q lcl|NC_020858. 304 IGEGALKPKNEKGLGVAADLYG 325 (330) Q Consensus 304 ~~E~tLe~~N~~a~g~i~gLt~ 325 (330) +.++++.+++|.|..+|++-=| T Consensus 279 ~~r~d~~v~~~~a~~~l~~~~g 300 (300) T protein:vir:95 279 EAYIGWGIMDAASFARIVKTGG 300 (300) T ss_pred EEeecceeecccceEEEecCCC Confidence 5668899999999999999999 No 20 >protein:vir:1638 Length: 298 # NCBI annotation: Structural protein # Family: family:all:966 # MgeID: mge:33 # MgeName: r1t # Cross-refs: genbank:acc:NP_695059;genbank:gi:23455750;genbank:GeneID:955469 Probab=98.21 E-value=6.7e-07 Score=54.43 Aligned_cols=282 Identities=10% Similarity=0.012 Sum_probs=136.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccc--eeeeeeeeccCccccccccccccccccccCce Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTT--HPEWTTDELAAPGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~--~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~ 78 (330) ||..++.+..+ ++.+.|+..-....|+.++.......+- .+-+++.. +...-..||++.+........ T Consensus 1 ma~~gG~lvp~-------~~~~~ii~~~~~~s~i~~l~~~~~~~~~~~~ip~~~~~---~~a~~v~E~~~~~~~~~~f~~ 70 (298) T protein:vir:16 1 MVLNKGTLFDP-------TLVTDLISKVAGKSSIARLSAQKPIPFNGEKVFTFTMD---SEIDVVAESGKKTHGGVTLAP 70 (298) T ss_pred CcccCcceech-------hHHHHHHHHHHhhhhhhhhcceeeccCCceEEEEEecC---cceEEecCCccccccccceeE Confidence 99887665443 3444444444556777776554433322 22233221 112233578776655333222 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) ..-+... +..-+.||.-...-..--..+-.++=..+-.+.+.+-+|.++++|.....+.+...-|+.... +.. T Consensus 71 v~l~~~k-~a~~~~iS~ell~~s~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~~~~~g~~~~~~~~~~~~--~~~---- 143 (298) T protein:vir:16 71 QTMVPIK-VEYGARISDEFMYASDEEKINILQEFNDGFAKKVARGIDLMAFHGVNPRLGTASAVIGTNHFD--SKV---- 143 (298) T ss_pred EEEeeee-EEEeehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHHhhccccCCCCcccccccccccc--ccc---- Confidence 2112222 223344554332111100012222333344556778999999998643333332222211110 000 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeE Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSI 238 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~ 238 (330) ......++...-..++|.+++.++..++.+...+++||..+.+|.++ +|.+. |+...+.... 
T Consensus 144 ------------~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l-kd~~G---~~i~~~~~~~-- 205 (298) T protein:vir:16 144 ------------TQKVEAPRGIADPNGAIENAVELLTGVDADVTGIAINPSFRSALAKQ-KDLQD---NALFPELKWG-- 205 (298) T ss_pred ------------ccccccccccccHHHHHHHHHHHhhhcCCCccEEEEcHHHHHHHHHh-hccCC---CeeecCcccC-- Confidence 00000011111124578888888888888777799999999999875 44432 2222111000 Q ss_pred EEEEEEEEcCCeEEEEEEcCcCCCcccc-ccEEEEEcchhh-hhcccCCccccccccccc------------cceeeEEE Q lcl|NC_020858. 239 VANADVYEGPFGKVMIHPNRVMAGSGAL-ARNAFFVDPEFL-QFGWLRKIAEDKKVAKTG------------DAEKFMLI 304 (330) Q Consensus 239 ~~~v~~~~tdfG~v~iv~nR~mp~~~~~-a~~~~~ld~~~~-~~~~Lr~~~~~e~laKtG------------d~~k~~i~ 304 (330) .... +=+| +.++.+..||..... ...+|+-|.+.. .+.. |...+-+ +...+ |...+..+ T Consensus 206 -~~~~---~l~G-~PV~~~~~v~~~~~~~~~~~~~GDfs~~~~~~~-~~~~~~~-~~~~~~~~~~~~~~f~~~~v~~ra~ 278 (298) T protein:vir:16 206 -ATPD---TING-LPVDVNKTVSDMSLTQRDRAIIGDFANGFKWGY-AKEVPLE-VIQYGDPDNSGLDLKGYNQVYIRAE 278 (298) T ss_pred -CCCc---eecc-eeeEEecccccccCCCccEEEEeeccceEEEEE-ecCceEE-EeeccCCcCcchhhhhcCcEEEEEE Confidence 0000 0123 477778888864221 234666677532 2222 2111111 11122 22334455 Q ss_pred EEEEEEEecchheeEEeccc Q lcl|NC_020858. 305 GEGALKPKNEKGLGVAADLY 324 (330) Q Consensus 305 ~E~tLe~~N~~a~g~i~gLt 324 (330) ..+...+++|+|..+|.+-+ T Consensus 279 ~r~d~~v~~~~a~~~l~~at 298 (298) T protein:vir:16 279 LFLGWGILDATKFARVTEAN 298 (298) T ss_pred EEEccEeecccceEEEeecC Confidence 56889999999999999988 No 21 >protein:vir:81070 Length: 390 # NCBI annotation: p09 # Family: family:all:585 # MgeID: mge:1889 # MgeName: Xop411 # Cross-refs: genbank:acc:YP_001285679;genbank:gi:148727187;genbank:GeneID:5247115 Probab=98.06 E-value=1.4e-06 Score=52.59 Aligned_cols=276 Identities=11% Similarity=-0.005 Sum_probs=144.0 Q ss_pred CCc----cccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccC Q lcl|NC_020858. 
1 MAV----VTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVS 76 (330) Q Consensus 1 Ma~----~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~ 76 (330) .+. .+.+ ++....-..+.+...|+..-....|+.++.......+..+.|....-.++...-..||+..+...... T Consensus 107 ~~~~~~~~~~~-~~~~g~~~~~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~~ 185 (390) T protein:vir:81 107 KAALNTASTDA-AGSAGALTTPNRLPGFITPPDARLTVRDLIGSGRTDSALIEYVQETGFVNNAAIVAEGALKPESSLKF 185 (390) T ss_pred HHHHHhhcccc-ccCCcceechhhhHHHHHHHhhhhhhhhhcceeeccCCceEEEEEecCCcceeeecCCccccccccee Confidence 111 1100 01111112233444455554666788877666665555566666554333333346887765543332 Q ss_pred ceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccc Q lcl|NC_020858. 77 PERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 77 ~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g 156 (330) ....-+.. -+...+.||. +.+.... +..++-...-...+.+-+|.+||+|.... . ...||+........ T Consensus 186 ~~i~~~~~-k~~~~~~is~--ell~d~~--~~~~~i~~~l~~~~~~~~d~a~l~G~g~~-~---~~~Gi~~~~~~~~~-- 254 (390) T protein:vir:81 186 AKKTDTTH-VIAHTMKATR--QILSDAP--QLASYMNNRLIRGLKVKEDAEILRGTGAN-D---GLLGLIPQATTYAA-- 254 (390) T ss_pred eEEEEeee-EEEEeehhhH--HHHHhHH--HHHHHHHHHHHHHHHHHHHHHHHhcCCCC-C---cccceeeccccccc-- Confidence 22222222 1222334443 4555442 45555555667778999999999985421 1 13454432110000 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcce Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNN 236 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~ 236 (330) +.......+.++|.+++.++-.++.....+++||....+|..+- |.. .|+...+.... 
T Consensus 255 ------------------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~lk-d~~---G~~l~~~~~~~ 312 (390) T protein:vir:81 255 ------------------PTTIAGATRVDQLRLAMLQASLAEYNPSGIVINPIDWAAIELAK-DAN---NQYLIGNARGT 312 (390) T ss_pred ------------------ccccccchhHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHhh-cCC---CceeecCcccc Confidence 11111234557788888888888877778999999988888763 332 22222211100 Q ss_pred eEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccc----cccccccccceeeEEEEEEEEEEe Q lcl|NC_020858. 237 SIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAE----DKKVAKTGDAEKFMLIGEGALKPK 312 (330) Q Consensus 237 ~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~----~e~laKtGd~~k~~i~~E~tLe~~ 312 (330) .. -+=+| +.|+.+.+||.+. +++.|++..-.-+.|.-.. .+..--+-+...+.++..+...++ T Consensus 313 ----~~---~~l~G-~pv~~~~~~p~~~-----~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~v~~r~~~r~d~~v~ 379 (390) T protein:vir:81 313 ----LT---PTLWG-LPVVATQAMAPGE-----FLVGAFDLAAQIFDQWDARVEIGYVGEDFQRNMITVLAEERLALVVY 379 (390) T ss_pred ----cC---ceecc-eeeEEcCCCCCCc-----EEEEehhceEEEEEecceEEEEecccchhhcCcEEEEEEEeeccEEe Confidence 00 01134 3677888999654 5788887533223332111 111111234556677888999999 Q ss_pred cchheeEEeccc Q lcl|NC_020858. 313 NEKGLGVAADLY 324 (330) Q Consensus 313 N~~a~g~i~gLt 324 (330) +|.|+.+|+ |. T Consensus 380 ~~~a~v~~t-~a 390 (390) T protein:vir:81 380 RPEALISGS-FA 390 (390) T ss_pred cccceEEEE-eC Confidence 999997665 33 No 22 >protein:vir:9759 Length: 303 # NCBI annotation: putative structural protein # Family: family:all:966 # MgeID: mge:175 # MgeName: 315.3 # Cross-refs: genbank:acc:NP_795521;genbank:gi:28876283;genbank:GeneID:1257824 Probab=98.02 E-value=2.8e-06 Score=51.06 Aligned_cols=285 Identities=9% Similarity=-0.016 Sum_probs=136.2 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) ||+.+.+- .-.-+.+++.|...-....|+..+.......+-.......+.. +..--..||+..+.......... T Consensus 1 m~t~t~gg-----~liP~~~~~~ii~~l~~~s~i~~l~~~~~~~~~~~~ip~~~~~-~~a~wv~E~~~~~~s~~~f~~v~ 74 (303) T protein:vir:97 1 MGTETSKA-----SLFDKHLVSDLINKVKGHSSLAKLSSQKPIPFNGSKEFTFTLD-SDIDVVAENGKKTHGGLSLEPVT 74 (303) T ss_pred CcccCCCC-----eEcchhHHHHHHHHHHhhchhhhhcceeecCCCceEEEEEecC-cceEEeecCccccccccceeeEE Confidence 88755332 1123456666666666677777765444333222222111111 11122347776654433221111 Q ss_pred cceEEEEeeeeeehhHHHHHhhc-c-ccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEA-G-RATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~-G-~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) -+. .-+..-+.||. +-+..- . ..+..++=..+-.+.+.+-+|.++|+|.....+++-..-|+... . T Consensus 75 l~~-~kl~~~~~iS~--ell~~~~d~~~~l~~~i~~~la~a~~~~ld~a~l~G~~~~~g~~~~~~~~~~~-----~---- 142 (303) T protein:vir:97 75 IVP-IKVEYGARLSD--EFLYATEEEKIDILKAFNEGFAKKLARGIDLMAMHGINPRTKKASDVIGTNHF-----D---- 142 (303) T ss_pred eee-EEEEEeehhhH--HHhhcCccchHHHHHHHHHHHHHHHHHHHHhhhhcccccCCcccccccccccc-----c---- Confidence 111 11222333332 222111 1 11223333344455678899999999964333332111111000 0 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeE Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSI 238 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~ 238 (330) +........+....+.++|.+++.++..++..+..+++||....++..+ ++.+. ++.....- .. 
T Consensus 143 ----------~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~L~~l-kd~~g---~~~~~~~~--~~ 206 (303) T protein:vir:97 143 ----------SKVTQVVKFTESEDADANIEAAVNLIQGAEGVVTGLAMDTEFSTALAKV-TNGEM---GPKMYPEL--AW 206 (303) T ss_pred ----------cccccccccccccchHHHHHHHHHHHhhcCCCccEEEEcHHHHHHHHHh-hccCC---CeEEecCc--cC Confidence 0000001111123456788888888888888888899999999888775 44432 11111100 00 Q ss_pred EEEEEEEEcCCeEEEEEEcCcCCCcccccc---EEEEEcchh-hhhcccCCcccccccccccc------------ceeeE Q lcl|NC_020858. 239 VANADVYEGPFGKVMIHPNRVMAGSGALAR---NAFFVDPEF-LQFGWLRKIAEDKKVAKTGD------------AEKFM 302 (330) Q Consensus 239 ~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~---~~~~ld~~~-~~~~~Lr~~~~~e~laKtGd------------~~k~~ 302 (330) +...+ +=+| +.++.+++||.....+. .+++.|.+. ..+.. |.-.+-+ +-.-+| ...+. T Consensus 207 ~~~~~---~l~G-~Pv~~s~~v~~~~~~~~~~~~~~~Gdf~~~~~~~~-~~~~~~~-~~~~~~~d~~~~~~~~~n~~~~r 280 (303) T protein:vir:97 207 GANPD---SING-LKSSVNTTVGAGADEAESKDLVIIGDFESMFKWGY-AKQIPME-IIKYGDPDNSGKDLKGYNQIYLR 280 (303) T ss_pred CCCCc---eecc-eeeEEecccCCccccCCCccEEEEeeccccEEEEE-ecCcEEE-EeeccCCCCcchhhhhcCcEEEE Confidence 00111 1234 78888999987543332 366777643 22222 2111111 111222 12333 Q ss_pred EEEEEEEEEecchheeEEecccc Q lcl|NC_020858. 303 LIGEGALKPKNEKGLGVAADLYG 325 (330) Q Consensus 303 i~~E~tLe~~N~~a~g~i~gLt~ 325 (330) .+..+...+++|+|..+|++.-= T Consensus 281 ~~~r~~~~v~~p~af~~l~~~~~ 303 (303) T protein:vir:97 281 AEAYIGWGILDAKSFARVTKGEV 303 (303) T ss_pred EEEEeccEeecccceEEeeCCCC Confidence 45668899999999999987654 No 23 >protein:vir:41 Length: 299 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:2 # MgeName: A118 # Cross-refs: genbank:acc:NP_463467;swissprot:trembl:q9t1b7;genbank:gi:16798789;uniprot:Q9T1B7;genbank:GeneID:922353 Probab=98.01 E-value=5.6e-06 Score=49.37 Aligned_cols=279 Identities=11% Similarity=0.002 Sum_probs=148.0 Q ss_pred CCccccceeeccc--cccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCce Q lcl|NC_020858. 
1 MAVVTNTFQSTGA--KGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~--~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~ 78 (330) |-.=+.+.++... ...-+.+++.|..--....|+.++....+..+....+...+ .+...-.-||++.+....... T Consensus 1 ~g~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~--~~~a~~v~E~~~~~~~~~~f~- 77 (299) T protein:vir:41 1 MGFNPDTTTMQSAKTGSIPINISEQIITGVKNGSAAMKLAKAVPMTKPEEEFTFMS--GVGAFWVDEAERIQTSKPTFT- 77 (299) T ss_pred CCcCCCcccccCCCceecchhHHHHHHHHHHhcchhhhhceeeecCCCcEEEEEEc--CCceeeeecCcccccccccee- Confidence 2222222222111 12234576777666677788877765555544444444433 333333457777654332211 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) .-...+.+-...+.=|.+.++... .+-..+=...-...+.+.+|.++|+|.... .+ -|++.-.. . T Consensus 78 --~v~l~~~k~~~~~~is~ell~ds~-~~~~~~i~~~l~~a~~~~~d~a~l~G~g~~--~~---~gil~~~~---~---- 142 (299) T protein:vir:41 78 --KAKMRSKKMGVIIPTTKENLNYSV-TNFFSLMQAEIVEAFYKKFDQAVFTGVESP--YN---WNILKSAT---D---- 142 (299) T ss_pred --EEEEeeEEEEEeehhhHHHHhcCH-HHHHHHHHHHHHHHHHHHHHHHHhhcccCc--cc---cccccccc---c---- Confidence 112233333444445556665433 233333444555668899999999986421 11 13222110 0 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhcccee-eeeeeeecCCccee Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNV-ASFRYAASNGKNNS 237 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~-~~~r~~~~~~~~~~ 237 (330) +.+.......+-++|.+++.++-.++.....++|||....++.++- +.+. ...+....++.. 
T Consensus 143 --------------~~~~~~~~~~~~~~l~~~~~~l~~~~~~~~~~v~n~~~~~~L~~lk-d~~G~~l~~~~~~~~~~-- 205 (299) T protein:vir:41 143 --------------ASNLVEETANKYDDLNEAIGLIEAEDLEPNGIATIRKQRVKYRSTK-DGNGMPIFNTATSNGVD-- 205 (299) T ss_pred --------------cceeeccccccHHHHHHHHHhhhcccCCcCEEEEcHHHHHHHHHhh-ccCCceeecCCcCCCCc-- Confidence 0001112235678899999999888887778999999999999864 3321 111111111111 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccc----ccc-------------cccccee Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDK----KVA-------------KTGDAEK 300 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e----~la-------------KtGd~~k 300 (330) +=+| +.|+.+.+||.++- ...+++.|++++-+ ++|.-...+ ..- ..-+... T Consensus 206 ---------~l~G-~PV~~~~~~~~~~~-~~~~~~gdfs~~~i-~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 273 (299) T protein:vir:41 206 ---------DVLG-LPIAYTPKYTFGDK-DISELVGDWNQAYY-GILRGVEYEILTEATLTTVADETGKPLNLAERDMAA 273 (299) T ss_pred ---------eecc-eeeEEecccCCCCC-ceEEEEEecccEEE-EEecCcEEEEeecccccccccccccchhhhhcCcEE Confidence 1134 67888889997652 23477888887644 223211111 100 1223344 Q ss_pred eEEEEEEEEEEecchheeEEeccccc Q lcl|NC_020858. 301 FMLIGEGALKPKNEKGLGVAADLYGL 326 (330) Q Consensus 301 ~~i~~E~tLe~~N~~a~g~i~gLt~~ 326 (330) +..+..+++.+++|.|..+|..-++= T Consensus 274 ~r~~~~~d~~v~~~~A~~~l~~~aa~ 299 (299) T protein:vir:41 274 IKATFEVGFMVVKDEAFSAVQPKAGN 299 (299) T ss_pred EEEEEEeccEEecccceEEEEeccCC Confidence 45567789999999999988654443 No 24 >protein:vir:95763 Length: 297 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1578 # MgeName: SMP # Cross-refs: genbank:acc:YP_950590;genbank:gi:119953785;genbank:GeneID:5076833 Probab=97.99 E-value=2.3e-06 Score=51.49 Aligned_cols=275 Identities=8% Similarity=-0.025 Sum_probs=141.6 Q ss_pred CCcc-----ccceeeccccccccccceeeEecCCcccceeeeeccceeccce-eeeeeeeccCccccccccccccccccc Q lcl|NC_020858. 
1 MAVV-----TNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTH-PEWTTDELAAPGANITLEGDEYTFDAT 74 (330) Q Consensus 1 Ma~~-----t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~-~~W~td~L~~~~~na~~EG~d~~~~~~ 74 (330) |.+. ..+-++.....--+.+.+.|...-....|++++.......+.. ..+.... ..+...-..||++.+.... T Consensus 1 m~~~~~~~~~~~~t~~~~~lvP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~~-~~~~a~~v~Eg~~~~~~~~ 79 (297) T protein:vir:95 1 MTVQTFNPENVLVSQKKDGTLHKEFTDIIMKEVAQNSLVMQLGQYQEMEGEQEKTVYVQT-DGISAYWVNETEKIKTDKP 79 (297) T ss_pred CCccccccccccccCCCcceechhHHHHHHHHHHhhchhhhhcceeecCCCccEEEEEEc-CCceeEEeecCcccccccc Confidence 4433 1111121222234577777777778888988876554443222 2222222 2222233468887765433 Q ss_pred cCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccc Q lcl|NC_020858. 75 VSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVS 154 (330) Q Consensus 75 ~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~ 154 (330) ..... -..+.+=...+.=|.+.+.... .+...+=...-...+.+.+|.++|+|...... .|+..-+. T Consensus 80 ~f~~v---~l~~~k~~~~~~is~ell~ds~-~~l~~~i~~~la~ai~~~~d~a~l~G~g~~~~-----~gi~~~~~---- 146 (297) T protein:vir:95 80 EVVPV---TLKAHKLGIILVTSREALNYTW-KKFFEDMKPQIVEAFYKKIDEAGLLGHDTPFA-----NSVAKAAK---- 146 (297) T ss_pred ceeEE---EEeeEEEEEeehhhHHHHhcCH-HHHHHHHHHHHHHHHHHHHHHHHhcccCCccc-----cccccccc---- Confidence 22111 1223333333444445544332 23333333444555999999999988542211 12221110 Q ss_pred cccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCc Q lcl|NC_020858. 155 RGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGK 234 (330) Q Consensus 155 ~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~ 234 (330) ..+......++-++|.+++.++..++.....++++|....++.++. +... ++... +. 
T Consensus 147 ------------------~~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~l~-d~~G---~~i~~-~~ 203 (297) T protein:vir:95 147 ------------------DANKVIGGPINYDNILKLQDALYDADVEPNAFVSKIQNRSALREAR-DGNK---VSIYD-KA 203 (297) T ss_pred ------------------ccceecccccCHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHHhh-ccCC---ceeec-CC Confidence 0011122346788899999999998888888999999999988763 3221 22111 11 Q ss_pred ceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---ccccc-cc------------ccccc Q lcl|NC_020858. 235 NNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKK-VA------------KTGDA 298 (330) Q Consensus 235 ~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~-la------------KtGd~ 298 (330) . .+-+|. .++.....+ .....+++.|++.+-+..-.++ ...+. +. -+-|. T Consensus 204 ~----------~~l~G~-Pv~~~~~~~---~~~~~~~~gd~s~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~ 269 (297) T protein:vir:95 204 A----------NTIDGI-TTVDLKSAR---FEKGDLLAGDFDNLIYGVPYNITYKISEEGQISTITNADGTPINLFEQEM 269 (297) T ss_pred C----------Ccccce-eeEeecCCC---CCCceEEEEecccEEEEEecCeEEEEeeccccccccccCccchhhhhcCc Confidence 1 112232 233322222 2345677888887644321111 00010 00 12244 Q ss_pred eeeEEEEEEEEEEecchheeEEeccccc Q lcl|NC_020858. 299 EKFMLIGEGALKPKNEKGLGVAADLYGL 326 (330) Q Consensus 299 ~k~~i~~E~tLe~~N~~a~g~i~gLt~~ 326 (330) ..+.++..+...+.||+|..+|..-+.. T Consensus 270 ~~~r~~~~~d~~v~~~~a~~~l~~at~~ 297 (297) T protein:vir:95 270 IAIRATMDIAVMITKTDAFAKLTPAERV 297 (297) T ss_pred EEEEEEEEeccEeecccceEEEeecCCC Confidence 5566677899999999999998766665 No 25 >protein:vir:9309 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:165 # MgeName: phi 11 # Cross-refs: genbank:acc:NP_803287;genbank:gi:29028597;genbank:GeneID:1258044 Probab=97.98 E-value=4.9e-06 Score=49.67 Aligned_cols=281 Identities=10% Similarity=0.010 Sum_probs=144.6 Q ss_pred CC-ccccceeecc-cc-ccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCc Q lcl|NC_020858. 
1 MA-VVTNTFQSTG-AK-GNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSP 77 (330) Q Consensus 1 Ma-~~t~~~~t~~-~~-g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~ 77 (330) +. .-.++.++.. .. ..-+.+.+.|...=....|+.++.......+..+.+...+-. +...-..||++.+....... T Consensus 21 ~~~~~a~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~ip~~~~~-~~a~~v~Eg~~~~~~~~~f~ 99 (324) T protein:vir:93 21 PQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTFWADK-PGAYWVGEGQKIETSKATWV 99 (324) T ss_pred hhhcccccccccCCCcceechhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEecC-cceeeecCCcccccccccee Confidence 00 0011111111 11 122356677766668888888877666554444444332222 22223358877765433211 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ...-+. .-+...+.||. +.+.... .+-..+=...-.+.+.+.+|.++|+|.... . ...|+...+... T Consensus 100 ~i~~~~-~k~~~~~~iS~--ell~ds~-~~l~~~i~~~l~~aia~~~d~a~l~G~g~~-~---~~~~~~~~~~~~----- 166 (324) T protein:vir:93 100 NATMRA-FKLGVILPVTK--EFLNYTY-SQFFEEMKPMIAEAFYKKFDEAGILNQGNN-P---FGKSIAQSIEKT----- 166 (324) T ss_pred EEEEEe-EEEEEeehhhH--HHHhcch-HHHHHHHHHHHHHHHHHHHHHHHhcCCCCC-C---cCcccccccccc----- Confidence 111111 11223334443 3333221 122233333444668899999999985421 1 112322222100 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) ...+...++.++|.+++.++-.++.....++++|.....|..+ ++... 
|+...++...+ T Consensus 167 -----------------~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l-~d~~G---~~~~~~~~~~~ 225 (324) T protein:vir:93 167 -----------------NKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKI-VDPET---KERIYDRNSDS 225 (324) T ss_pred -----------------ceeccccccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh-hCCCC---CeeecCCCCCc Confidence 0111224778889999999988888888899999999999876 44332 22222222211 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---ccccc-c------------cccccceee Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKK-V------------AKTGDAEKF 301 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~-l------------aKtGd~~k~ 301 (330) + +| +.|+.+..++. ....+++.|++++-+..-.++ ...+. + ...-|...+ T Consensus 226 l----------~G-~PVv~~~~~~~---~~~~i~~gdfs~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~n~~~~ 291 (324) T protein:vir:93 226 L----------DG-LPVVNLKSSNL---KRGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVAL 291 (324) T ss_pred c----------cc-eeeEeecCCCC---CcceEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEE Confidence 1 22 23333322222 234567788876644331221 00010 0 012245667 Q ss_pred EEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 302 MLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 302 ~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) .+...+++.+.+|.|..+|++-..=|..| T Consensus 292 r~~~r~d~~v~~~~a~~~l~~a~~~~~~~ 320 (324) T protein:vir:93 292 RATMHVALHIADDKAFAKLVPADKRTDSV 320 (324) T ss_pred EEEEEeccEEecccceEEEecccccCCCC Confidence 77788999999999999999998888888 No 26 >protein:vir:4339 Length: 395 # NCBI annotation: major head protein # Family: family:all:585 # MgeID: mge:93 # MgeName: D3 # Cross-refs: genbank:acc:NP_061502;genbank:gi:9635591;genbank:GeneID:1262860 Probab=97.97 E-value=2.8e-06 Score=51.06 Aligned_cols=281 Identities=9% Similarity=-0.019 Sum_probs=144.8 Q ss_pred CCccccceeeccc---cccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCc Q lcl|NC_020858. 
1 MAVVTNTFQSTGA---KGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSP 77 (330) Q Consensus 1 Ma~~t~~~~t~~~---~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~ 77 (330) +.......++... .-..+.+...|...-....|+.+++......+..++|......++...-..||+..+....... T Consensus 107 ~~~~~~~~~~~~~~~g~~vp~~~~~~ii~~~~~~~~l~~l~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~~~ 186 (395) T protein:vir:43 107 VSMPRSAITSIDGSGGALVAPDRRPGVVAAPQRRLTIRDLVAPGTTESNSVEYVRETGFVNNAAPVSEGTQKPYSDLTFE 186 (395) T ss_pred hhhhhhhhcccCCCCccccchhhHHHHHHHHHhhhhHHhhccceecCCCceEEEEEecCCCceeeecCCcccccccccee Confidence 1111111111111 1133456667777767888999988877776667778776555443334468877665443322 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) . +.--.+-+...+.||. +.++.. .+..+|=...-...+.+-+|.++|+|.... .. ..||......... T Consensus 187 ~-i~~~~~k~~~~~~is~--ell~d~--~~l~~~v~~~la~a~~~~~d~~~l~G~g~~-~~---~~Gi~~~~~~~~~--- 254 (395) T protein:vir:43 187 L-ENAPVRTIAHLFKASR--QILDDA--SALQSYIDARARYGLMLVEECQLLYGNGTG-AN---LHGIIPQAQAYAP--- 254 (395) T ss_pred E-EEEeeeeEEEeehhhH--HHHHhH--HHHHHHHHHHHHHHHHHHHHHHHHhccCCC-Cc---ccccccccccccc--- Confidence 2 2112222333344553 344433 233344444555678889999999985421 11 3454433211111 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) ..+.++.....-+.+.+++.++-.++.....+++||....+|..+ .+.. .|+...+.. 
T Consensus 255 ---------------~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l-kd~~---G~~i~~~~~--- 312 (395) T protein:vir:43 255 ---------------PSGVVVTAEQRIDRIRLAILQAQLAEFPASGIVLNPIDWALIELN-KDAE---NRYIIGSPQ--- 312 (395) T ss_pred ---------------ccccccccchhHHHHHHHHHhhccccCCCcEEEEcHHHHHHHHHh-hccC---Cceeccccc--- Confidence 011111223445666677777766777777899999998888876 3332 122221100 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccc-----cceeeEEEEEEEEEEe Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTG-----DAEKFMLIGEGALKPK 312 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtG-----d~~k~~i~~E~tLe~~ 312 (330) .... -+=+| +.|+.+.+||.+. +++.|+...-+.+.|.-..-+-..-.+ +...+.++..+...++ T Consensus 313 -~~~~---~~l~G-~pVv~~~~~~~~~-----~~~gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~v~ 382 (395) T protein:vir:43 313 -NGTT---PTLWR-LPVVETQAITQDE-----FLTGAFSLGAQIFDRMDIEVLVSTENDKDFENNMVTIRAEERLAFAVY 382 (395) T ss_pred -cCCC---ceecc-eeeEEcCCCCCCc-----EEEEeccceEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEe Confidence 0000 11245 5788899999754 567887653222223211111111111 2334455566899999 Q ss_pred cchheeEEecccccccc Q lcl|NC_020858. 313 NEKGLGVAADLYGLTAS 329 (330) Q Consensus 313 N~~a~g~i~gLt~~~~~ 329 (330) +|.|...+. + +++ T Consensus 383 ~~~a~~~~~-~---taa 395 (395) T protein:vir:43 383 RPEAFVTGS-L---TAS 395 (395) T ss_pred cccceEEEE-e---ccC Confidence 999987763 2 222 No 27 >protein:vir:9410 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:167 # MgeName: phi 13 # Cross-refs: genbank:acc:NP_803388;genbank:gi:29028700;genbank:GeneID:1258136 Probab=97.96 E-value=1.8e-06 Score=52.03 Aligned_cols=282 Identities=11% Similarity=0.005 Sum_probs=136.4 Q ss_pred CCccccceeecc-ccccccccceeeEecCCcccceeeeeccceeccceeeee-eeeccCcccccccccccccccc-ccCc Q lcl|NC_020858. 
1 MAVVTNTFQSTG-AKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWT-TDELAAPGANITLEGDEYTFDA-TVSP 77 (330) Q Consensus 1 Ma~~t~~~~t~~-~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~-td~L~~~~~na~~EG~d~~~~~-~~~~ 77 (330) .....+..++.. ..-.-+++.+.|...-...+|+..++......+..+.+. ...-..+...-..||++.+... .... T Consensus 117 ~~~~~~~~~~~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~Eg~~~~~~~~~~~~ 196 (415) T protein:vir:94 117 NDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELEENPELAVKPFF 196 (415) T ss_pred hhhhhhccccccccccCcHHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEeecCCccceeccccccccccccccce Confidence 111111111111 112223677777777677888888877665544444332 2222233334446787765322 1111 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ...-+.-.+ ...+.|| .+.+.... .+-.+|=...-...+.+-+|.++|+|... +++-. ++..+.... T Consensus 197 ~i~~~~~k~-~~~~~is--~ell~ds~-~~~~~~i~~~l~~~~~~~~~~~il~g~g~--g~~~~--~~~~~~~~~----- 263 (415) T protein:vir:94 197 QLAYDINTH-RGYFRIS--REAIEDAK-VNVLQELKLWMARTIAATRNKAIIDVITK--GSTGS--TSSGFEKEG----- 263 (415) T ss_pred eeEeeheee-eeechhh--HHHHhhch-HHHHHHHHHHHHHHHHHHHHHHHhhcccc--Ccccc--ccccccccc----- Confidence 111111111 1223343 33444332 12333334445556788899999988542 11111 000000000 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec----CC Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS----NG 233 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~----~~ 233 (330) .+.......+-+.|.+++.++-.++.....+++||....+|..+ ++.+. |+... 
++ T Consensus 264 ----------------~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l-kd~~G---~~l~~~~~~~~ 323 (415) T protein:vir:94 264 ----------------KKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM-KDKLG---NYLIQPDVKEK 323 (415) T ss_pred ----------------cccccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh-hccCC---CeeeccCcCCC Confidence 00011123566778888888877776667789999999999876 44331 11111 11 Q ss_pred cceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 234 KNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 234 ~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) ...+| +| +.|+....||.+......+++.|+...-+-+.|.-...+...-..+...+..+..+.+.+.+ T Consensus 324 ~~~~l----------~G-~pV~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~r~~~r~d~~~~~ 392 (415) T protein:vir:94 324 TQQRL----------LG-AKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDCRILD 392 (415) T ss_pred CCcee----------cc-eeeEEecccccCCCCccEEEEEehhccEEEEeecceEEEEeccccCceEEEEEEEeccEEec Confidence 11011 12 24555566776554334467777663222222321111111112344556667779999999 Q ss_pred chheeEEeccccccccC Q lcl|NC_020858. 314 EKGLGVAADLYGLTAST 330 (330) Q Consensus 314 ~~a~g~i~gLt~~~~~~ 330 (330) |.|..+|+ ++++. T Consensus 393 ~~a~~~~~----~~~~~ 405 (415) T protein:vir:94 393 YKSAIVIE----YDDSE 405 (415) T ss_pred cccEEEEE----EeccC Confidence 99998886 33333 No 28 >protein:vir:104085 Length: 320 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:1656 # MgeName: Che12 # Cross-refs: genbank:acc:YP_655596;genbank:gi:109392467;genbank:GeneID:4156953 Probab=97.95 E-value=5.1e-06 Score=49.58 Aligned_cols=291 Identities=12% Similarity=0.061 Sum_probs=141.0 Q ss_pred CCcccc--------cee--eccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 
1 MAVVTN--------TFQ--STGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~--------~~~--t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||.-+. ..+ +...-..-+.+.+.|+..-....|+++++......+..+.|...+-. +...-..||++.+ T Consensus 1 ~~~~~~~~~~~~~~~~t~~~~~~~~ip~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~p~~~~~-~~a~~v~E~~~~~ 79 (320) T protein:vir:10 1 MAAGTAFQVDHAQIAQTGDTMFKGYLEPEQAKDYFAEAEKTSIVQQFAQKVPMGTTGQKIPHWIGD-VSAQWIGEGDMKP 79 (320) T ss_pred CCCCccCCHHHHHhhccccccccccccHHHHHHHHHHHHhccchhhhcceeeccCCceEEEEEeCC-cceEEecCCcccc Confidence 322221 001 11111123356666666666778888877665554444554443322 2222335888876 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ........ .-..+.+=...+.=|.+.++... .+..++=..+-.+.+.+.+|.++|+|..+ +.+...-|+.... T Consensus 80 ~~~~~f~~---v~~~~~k~~~~~~is~ell~ds~-~~l~~~i~~~l~~a~a~~~d~a~l~G~g~--~~~~~~~~~~~~~- 152 (320) T protein:vir:10 80 ITKGNMTS---QNIAPHKIATIFVASAETVRANP-ANYLGTMRTKVATAFAMAFDSAALNGTDS--PFPTYLAQTTKSV- 152 (320) T ss_pred ccccceeE---EEEeeEEEEEeehhhHHHHhcCh-HHHHHHHHHHHHHHHHHHHHHHhhcccCC--CCCcccccccccc- Confidence 54332211 12233333444444555555433 23344444566678899999999998753 2221111111110 Q ss_pred cccccccccccccccccccccccccccccccc--cHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceee-eee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAF--SKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVA-SFR 227 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~l--Te~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~-~~r 227 (330) +.. ..+..+...+ .++.+.+++..+..++.....+++||....++..+ ++.+.. 
..+ T Consensus 153 -~~~------------------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l-kd~~G~~l~~ 212 (320) T protein:vir:10 153 -SLA------------------DPGGATASDLTAYDAVAVNGLSLLVNAKKKWTHTLLDDIVEPILNGA-KDKNGRPLFI 212 (320) T ss_pred -cce------------------ecccccccccccHHHHHHHHHhhhhcccCCCcEEEEcHHHHHHHHHh-hccCCceeec Confidence 000 0001111112 23446666667776666777899999999999875 443311 111 Q ss_pred eeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---cccccccccc-------- Q lcl|NC_020858. 228 YAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKKVAKTG-------- 296 (330) Q Consensus 228 ~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~laKtG-------- 296 (330) .....+....+. ....--+.++++.+||.+... +++.|++.+-+..-.+. ...+...+.| T Consensus 213 ~~~~~~~~~~~~------~~~i~g~pv~~~~~~~~~~~~---~~~gd~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~ 283 (320) T protein:vir:10 213 ESTYTDENSPFR------AGRIVSRPTILSDHVADGTTV---GYMGDFRNVIWGQVGGLSFDVTDQATLNLGTPTEPNFV 283 (320) T ss_pred cccccCcccccc------CceeeeeeeEecCCCCCCceE---EEEeecceEEEEEecCeEEEEeecceeeeccccccccc Confidence 000111110000 001123567888889876532 45667765543321111 1112222222 Q ss_pred -----cceeeEEEEEEEEEEecchheeEEecccccccc Q lcl|NC_020858. 
297 -----DAEKFMLIGEGALKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 297 -----d~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~ 329 (330) +...+..+..+.+.+.+|+|..+|++..= .++ T Consensus 284 ~~f~~~~~~~r~~~~~d~~v~~~~a~~~l~~~~a-p~~ 320 (320) T protein:vir:10 284 SLWQHNLVAVRVEAEYAFHNNDKDAFVKLTNVVT-PDA 320 (320) T ss_pred hhhhcCcEEEEEEEeeccEEecccceEEEEeccC-CCC Confidence 33445556778999999999888865542 222 No 29 >protein:vir:7771 Length: 330 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:149 # MgeName: Bxz2 # Cross-refs: genbank:acc:NP_817605;genbank:gi:29566035;genbank:GeneID:1259229 Probab=97.95 E-value=7e-06 Score=48.84 Aligned_cols=296 Identities=11% Similarity=0.007 Sum_probs=149.5 Q ss_pred CCccccc----eeeccccc-cccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcccccccccccccccccc Q lcl|NC_020858. 1 MAVVTNT----FQSTGAKG-NREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATV 75 (330) Q Consensus 1 Ma~~t~~----~~t~~~~g-~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~ 75 (330) ||..... .++...-+ ..+.+.+.|...--...|+..+.......+....+...+-. +...-..||...+..... T Consensus 1 m~~~~~~a~~~~~t~~~g~~i~~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~p~~~~~-~~a~~v~Eg~~~~~~~~~ 79 (330) T protein:vir:77 1 MAGSTVPSTQVALTGDFSAFLTPEQSQDYFAEIEKTSIVQRIARKVPMGPTGISIPHWTGA-VSASWTGEAERKPITKGS 79 (330) T ss_pred CcccccchhhccccCCCcceechhHHHHHHHHHHhccchhhhcceeeccCCceEEEEEcCC-cceeEecCCCccccccce Confidence 7776322 22222212 23456666666666778888876655554444554443322 222223588777655433 Q ss_pred CceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccc- Q lcl|NC_020858. 76 SPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVS- 154 (330) Q Consensus 76 ~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~- 154 (330) ..... ..+.+=...|.=|.+.+.... .+-.++=...-.+.+.+.+|+++|+|.... ....|++........ 
T Consensus 80 f~~i~---~~~~k~~~~~~is~ell~ds~-~~~~~~i~~~l~~ai~~~~~~~~l~G~g~~----~~~~g~~~~~~~~~~~ 151 (330) T protein:vir:77 80 FGKQE---LEPVKITTIFAESAEVVRLNP-LNYLNTMRTKIAEAIALKFDAAAIHGIDKP----SAFKGYLAETTKVVSL 151 (330) T ss_pred eeEEE---EeEEEEEEeehhhHHHHhcch-HHHHHHHHHHHHHHHHHHHHHHhhcccCCC----Ccccccccccccccee Confidence 22221 222222233333344444332 233344444556678899999999986532 222444433311110 Q ss_pred cccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCc Q lcl|NC_020858. 155 RGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGK 234 (330) Q Consensus 155 ~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~ 234 (330) ..+ ...+.........++|.+++.+++.++.....+++|+.....|..+ ++.+. |+...... T Consensus 152 ~~~--------------~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l-kd~~G---~~l~~~~~ 213 (330) T protein:vir:77 152 ADT--------------NLTTASGPQGNAYLAVNNALSLLVNSGKKWTGTLLDNVTEPILNTA-VDGNG---RPLFVEST 213 (330) T ss_pred ecc--------------cccccccccchhHHHHHHHHHhhhhcCCCccEEEEcHHHHHHHHHH-hccCC---ceeecCcc Confidence 000 0011111223445678888888998888888899999999998875 44331 22211110 Q ss_pred ceeEEEEEEEE--EcCCeEEEEEEcCcCCCccccc-cEEEEEcchhhhhcccCCc---cccccccccc------------ Q lcl|NC_020858. 235 NNSIVANADVY--EGPFGKVMIHPNRVMAGSGALA-RNAFFVDPEFLQFGWLRKI---AEDKKVAKTG------------ 296 (330) Q Consensus 235 ~~~~~~~v~~~--~tdfG~v~iv~nR~mp~~~~~a-~~~~~ld~~~~~~~~Lr~~---~~~e~laKtG------------ 296 (330) . ....... .+=+| +.|+.+.+||++.... ..+++.|++..-+.--.+. ...+...+.| T Consensus 214 ~---~~~~~~~~~~~l~G-~PV~~~~~~p~~~~~~~~~~~~gd~s~~~i~~~~~~~i~~~~e~~~~~~~~~~~~~~~~~~ 289 (330) T protein:vir:77 214 Y---TEQVGAIREGRILG-RPTYVADNVVNGTVGNRVVGVMGDFSQVIWGQIGGLSFDVTDQATLDFGEEQGGVWVPKLI 289 (330) T ss_pred c---cccccccCCceecc-eeeEEeccccCCCCCCccEEEEEecceEEEEEecCcEEEEeecceeeeccccccccccccc Confidence 0 0000000 01134 6788888898754211 2366778776643321111 0111111111 Q ss_pred -----cceeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
297 -----DAEKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 297 -----d~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) |...+.+...+...+++|+|..+|..-+ +.+ T Consensus 290 ~~f~~~~~~~r~~~r~d~~v~~~~a~~~i~~~~---~~~ 325 (330) T protein:vir:77 290 SLWQHNMVAVRCEAEFAFMVNDKDAFVKLTDQV---AGT 325 (330) T ss_pred chhhcCcEEEEEEEEeccEEecccceEEEEecc---CCc Confidence 3455566777899999999998865444 444 No 30 >protein:vir:8187 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:153 # MgeName: Che9d # Cross-refs: genbank:acc:NP_817980;genbank:gi:29566414;genbank:GeneID:2700968 Probab=97.92 E-value=5e-06 Score=49.66 Aligned_cols=283 Identities=11% Similarity=0.010 Sum_probs=136.6 Q ss_pred CCcccc-ceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTN-TFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~-~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) ||+++. .++-+ +.+.+.|...-..+.|+..+.......+-..++...+-.+ ...-..||.+.+......... T Consensus 1 mat~~~gg~lvP------~~~~~~ii~~~~~~s~i~~~~~~i~~~~~~~~~p~~~~~~-~a~wv~Eg~~~~~~~~~f~~v 73 (311) T protein:vir:81 1 MVALATGTFQLP------KHLVPGVWQKAQGQSVLARLSMAEPQEFGEQQYMTLTAPP-RGEVVGEGAQKSESTATFAPV 73 (311) T ss_pred CceecCCceEcc------hhHHHHHHHHHHhcchhhhhcceeecCCCceEEEEEeCCc-eeEEeecCcccccccceeeEE Confidence 877754 33333 3456666665566777777654433332333443332222 112235887766543322111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccc--cchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGR--ATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~--~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) .=+.. -+..-+.|| .+-+..... .+...+=..+....+.+-+|.++++|.. +++.-...|+...+....... 
T Consensus 74 ~l~~~-kl~~~~~iS--~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~a~l~G~~--~~~~~~~~gi~~~~~~~~~~~- 147 (311) T protein:vir:81 74 TAIPR-KVQVTQRFS--QEVKWADESRQLGVLQTMADLSGVALGRALDLIGIHGIN--PLTGAALSGSPAKILDTTNIV- 147 (311) T ss_pred EEeeE-EEEEeehhh--HHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHhhhcccc--CCCCcccccccccccccceee- Confidence 11111 112222333 332221110 1122333334556688999999999863 222222334444331110000 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceee-eeeeeecCCcce Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVA-SFRYAASNGKNN 236 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~-~~r~~~~~~~~~ 236 (330) . ...+........+..++.++..++.+++.+++||....+|.++ ++.+.. ..+.....+... T Consensus 148 --------------~--~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l-kd~~G~~l~~~~~~~~~~~ 210 (311) T protein:vir:81 148 --------------E--LTTGTSATPDLAVEAAVGLVLGDNLSPDGVALDNTFSFMLATQ-RDSQGRKLYPELGFGTDVA 210 (311) T ss_pred --------------e--ecccccchHHHHHHHHHHHhhhcCCCceEEEEcHHHHHHHHhh-hccCCCeeecCccccCCCc Confidence 0 0111112334567777777777788888899999999999876 443311 111001111111 Q ss_pred eEEEEEEEEEcCCeEEEEEEcCcCCCccc-------------cccEEEEEcchhhhhcccCCccccccccccc------- Q lcl|NC_020858. 237 SIVANADVYEGPFGKVMIHPNRVMAGSGA-------------LARNAFFVDPEFLQFGWLRKIAEDKKVAKTG------- 296 (330) Q Consensus 237 ~~~~~v~~~~tdfG~v~iv~nR~mp~~~~-------------~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtG------- 296 (330) ++ +| +.++.+..||.... ....+++.|.+.+-+...++. .-+ +...+ T Consensus 211 tl----------~G-~Pv~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~gDfs~~~i~~~~~~-~~~-~~~~~~~~~~~~ 277 (311) T protein:vir:81 211 SF----------AG-LNAAVSDTVRGGPEAVTASTGVYRTTNPNVKAIAGDFSAFRWGVQVSI-PLE-LIEFGDPDGLGD 277 (311) T ss_pred ee----------cc-eeEEecccccccccccccccchhcccCCccEEEEEecccEEEEEeccc-eEE-EeccCCCCcchh Confidence 11 22 45666666764322 123467888876555432221 111 12222 Q ss_pred ----cceeeEEEEEEEEEEecchheeEEecccccccc Q lcl|NC_020858. 
297 ----DAEKFMLIGEGALKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 297 ----d~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~ 329 (330) |...+..+..++..+++|+|..+|++-+- + T Consensus 278 ~~~~~~v~~r~~~r~d~~v~~~~a~~~l~~a~~---~ 311 (311) T protein:vir:81 278 LKRQNQIAIRAEVVYGIGIMSTDAFAVVRDADE---S 311 (311) T ss_pred hhhcCcEEEEEEEEeccEeecccceEEEEeecc---C Confidence 22345556678999999999988765433 3 No 31 >protein:vir:105905 Length: 304 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:1514 # MgeName: phiETA3 # Cross-refs: genbank:acc:YP_001004375;genbank:gi:122891830;genbank:GeneID:4712376 Probab=97.91 E-value=6.2e-06 Score=49.15 Aligned_cols=280 Identities=11% Similarity=-0.018 Sum_probs=142.7 Q ss_pred CCcccccee---eccccc--cccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcccccccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQ---STGAKG--NREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATV 75 (330) Q Consensus 1 Ma~~t~~~~---t~~~~g--~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~ 75 (330) ||+.+.+-. +...-| .-+.+.+.|..--....|++++.......+..+.+...+- .+...-..||+..+..... T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~ip~~~~-~~~a~~v~E~~~~~~~~~~ 79 (304) T protein:vir:10 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMANSAIMKLAKNEPMTAQKKKFTYLAK-GVGAYWVSETERIQTSKPE 79 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhccchhhhcceeeccCCceEEEEEeC-CcceEEeecCcccccccce Confidence 888763322 121112 2335666666555777888887655554443333322221 1111223477766544322 Q ss_pred CceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 76 SPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 76 ~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) .....-+... +..-+.|| .+.+.... .+..+|=...-.+.+.+.+|.++|+|........-...|+..-. 
T Consensus 80 ~~~i~~~~~k-~~~~~~iS--~ell~ds~-~~l~~~i~~~l~~~ia~~~d~~~l~G~g~~~~~~~~~~~~~~~~------ 149 (304) T protein:vir:10 80 YAQAEMEAKK-IGVIIPLS--KEFLKWTA-KDFFNEVKPLIAEAFYKAFDQAVIFGTKSPYNTSTSGKPLVEGA------ 149 (304) T ss_pred eeEEEEEEEE-EEEeehhh--HHHHhcch-HHHHHHHHHHHHHHHHHHHHhhheeccCCCcccccccccccccc------ Confidence 2222212222 22233444 34444332 22223333344466889999999998754322211111111100 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcc Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKN 235 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~ 235 (330) ....+..+....+-++|.+++.++-.++.....++|+|.....+.++ ++.. .|+..... T Consensus 150 ---------------~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~l-kd~~---G~~l~~~~-- 208 (304) T protein:vir:10 150 ---------------EEKGNVVTDTNNLYVDLSALMATIEDEELDPNGVLTTRSFRSKMRNA-LDAN---DRPLFDAN-- 208 (304) T ss_pred ---------------cccccccccccchHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHHh-hccC---CcEeecCC-- Confidence 01112222234567778888888888877777899999999988875 3332 12221111 Q ss_pred eeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccc--c-----------------ccc Q lcl|NC_020858. 236 NSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKV--A-----------------KTG 296 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~l--a-----------------KtG 296 (330) . -+=+| +.|+.+.+||.+.- ...+++.|++++-+.. |.-...+.+ + -.- T Consensus 209 ------~---~~l~G-~PV~~~~~~~~~~~-~~~~~~gd~~~~~~~~-~~~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~ 276 (304) T protein:vir:10 209 ------G---NEIMG-LPLSYTGADVYDKK-KSLALMGDWDYARYGI-LQGIEYAISEDATLTTLQASDASGQPVSLFER 276 (304) T ss_pred ------C---ccccc-eeeEEecccccCCC-CcEEEEEehhhEEEEE-ecceEEEEeecceeeeecccccCccchhhhhc Confidence 0 01234 46777888986543 3357788888764432 221111100 0 011 Q ss_pred cceeeEEEEEEEEEEecchheeEEeccc Q lcl|NC_020858. 
297 DAEKFMLIGEGALKPKNEKGLGVAADLY 324 (330) Q Consensus 297 d~~k~~i~~E~tLe~~N~~a~g~i~gLt 324 (330) |...+.++..+++.+++|.|..+|+.== T Consensus 277 ~~~~~r~~~r~~~~v~~~~a~~~l~~a~ 304 (304) T protein:vir:10 277 DMFALRATMHIAYMNVKPEAFATLKPTE 304 (304) T ss_pred CcEEEEEEEEeccEeecccceEEEEecC Confidence 2344556677999999999999887644 No 32 >protein:vir:94142 Length: 304 # NCBI annotation: ORF013 # Family: family:all:507 # MgeID: mge:1494 # MgeName: 96 # Cross-refs: genbank:acc:YP_240234;genbank:gi:66395898;genbank:GeneID:5133311 Probab=97.91 E-value=6.2e-06 Score=49.15 Aligned_cols=280 Identities=11% Similarity=-0.018 Sum_probs=142.7 Q ss_pred CCcccccee---eccccc--cccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcccccccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQ---STGAKG--NREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATV 75 (330) Q Consensus 1 Ma~~t~~~~---t~~~~g--~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~ 75 (330) ||+.+.+-. +...-| .-+.+.+.|..--....|++++.......+..+.+...+- .+...-..||+..+..... T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~ip~~~~-~~~a~~v~E~~~~~~~~~~ 79 (304) T protein:vir:94 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMANSAIMKLAKNEPMTAQKKKFTYLAK-GVGAYWVSETERIQTSKPE 79 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhccchhhhcceeeccCCceEEEEEeC-CcceEEeecCcccccccce Confidence 888763322 121112 2335666666555777888887655554443333322221 1111223477766544322 Q ss_pred CceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 76 SPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 76 ~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) .....-+... +..-+.|| .+.+.... .+..+|=...-.+.+.+.+|.++|+|........-...|+..-. 
T Consensus 80 ~~~i~~~~~k-~~~~~~iS--~ell~ds~-~~l~~~i~~~l~~~ia~~~d~~~l~G~g~~~~~~~~~~~~~~~~------ 149 (304) T protein:vir:94 80 YAQAEMEAKK-IGVIIPLS--KEFLKWTA-KDFFNEVKPLIAEAFYKAFDQAVIFGTKSPYNTSTSGKPLVEGA------ 149 (304) T ss_pred eeEEEEEEEE-EEEeehhh--HHHHhcch-HHHHHHHHHHHHHHHHHHHHhhheeccCCCcccccccccccccc------ Confidence 2222212222 22233444 34444332 22223333344466889999999998754322211111111100 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcc Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKN 235 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~ 235 (330) ....+..+....+-++|.+++.++-.++.....++|+|.....+.++ ++.. .|+..... T Consensus 150 ---------------~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~l-kd~~---G~~l~~~~-- 208 (304) T protein:vir:94 150 ---------------EEKGNVVTDTNNLYVDLSALMATIEDEELDPNGVLTTRSFRSKMRNA-LDAN---DRPLFDAN-- 208 (304) T ss_pred ---------------cccccccccccchHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHHh-hccC---CcEeecCC-- Confidence 01112222234567778888888888877777899999999988875 3332 12221111 Q ss_pred eeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccc--c-----------------ccc Q lcl|NC_020858. 236 NSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKV--A-----------------KTG 296 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~l--a-----------------KtG 296 (330) . -+=+| +.|+.+.+||.+.- ...+++.|++++-+.. |.-...+.+ + -.- T Consensus 209 ------~---~~l~G-~PV~~~~~~~~~~~-~~~~~~gd~~~~~~~~-~~~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~ 276 (304) T protein:vir:94 209 ------G---NEIMG-LPLSYTGADVYDKK-KSLALMGDWDYARYGI-LQGIEYAISEDATLTTLQASDASGQPVSLFER 276 (304) T ss_pred ------C---ccccc-eeeEEecccccCCC-CcEEEEEehhhEEEEE-ecceEEEEeecceeeeecccccCccchhhhhc Confidence 0 01234 46777888986543 3357788888764432 221111100 0 011 Q ss_pred cceeeEEEEEEEEEEecchheeEEeccc Q lcl|NC_020858. 
297 DAEKFMLIGEGALKPKNEKGLGVAADLY 324 (330) Q Consensus 297 d~~k~~i~~E~tLe~~N~~a~g~i~gLt 324 (330) |...+.++..+++.+++|.|..+|+.== T Consensus 277 ~~~~~r~~~r~~~~v~~~~a~~~l~~a~ 304 (304) T protein:vir:94 277 DMFALRATMHIAYMNVKPEAFATLKPTE 304 (304) T ss_pred CcEEEEEEEEeccEeecccceEEEEecC Confidence 2344556677999999999999887644 No 33 >protein:vir:103955 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1662 # MgeName: phiNM # Cross-refs: genbank:acc:YP_873992;genbank:gi:118430767;genbank:GeneID:4525449 Probab=97.90 E-value=7.5e-06 Score=48.67 Aligned_cols=280 Identities=11% Similarity=0.024 Sum_probs=147.2 Q ss_pred CC-----ccccceeeccccc-cccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccc Q lcl|NC_020858. 1 MA-----VVTNTFQSTGAKG-NREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDAT 74 (330) Q Consensus 1 Ma-----~~t~~~~t~~~~g-~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~ 74 (330) |. -+.++..+...-+ .-+.+.+.|...=....|+.++.......+..+++..-+- .+...-..||+..+.... T Consensus 18 ~~~~~~~~a~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~p~~~~-~~~a~~v~Eg~~~~~~~~ 96 (324) T protein:vir:10 18 NVKPQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTFWAD-KPGAYWVGEGQKIETSKA 96 (324) T ss_pred hhccceecccceeccCCCcceechhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEeC-CcceeEeccCcccccccc Confidence 10 0011111111111 2235666666555777888887766655555455544332 222222358888765543 Q ss_pred cCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccc Q lcl|NC_020858. 75 VSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVS 154 (330) Q Consensus 75 ~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~ 154 (330) ......-+... +...+.|| .+.+.... .+..+|=...-.+.+.+-+|.++|+|.... . ...|+...+.. 
T Consensus 97 ~~~~v~~~~~k-~~~~~~iS--~ell~ds~-~~l~~~i~~~l~~ai~~~~d~a~l~G~g~~-~---~~~~i~~~~~~--- 165 (324) T protein:vir:10 97 TWVNATMRAFK-LGVILPVT--KEFLNYTY-SQFFEEMKPMIAEAFYKKFDEAGILNQGNN-P---FGKSIAQSIEK--- 165 (324) T ss_pred ceeEEEEeeEE-EEEeehhh--HHHHhcch-HHHHHHHHHHHHHHHHHHHHHHhhhcCCCC-c---cCccccccccc--- Confidence 33222222222 22333444 34444332 233344444555678899999999885421 1 22233332211 Q ss_pred cccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCc Q lcl|NC_020858. 155 RGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGK 234 (330) Q Consensus 155 ~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~ 234 (330) .+..+...++.++|.+++.++-.++.....+++||.....+..+. +... |+...++. T Consensus 166 -------------------~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l~-d~~g---~~~~~~~~ 222 (324) T protein:vir:10 166 -------------------TNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIV-DPET---KERIYDRN 222 (324) T ss_pred -------------------cceeccccCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhh-ccCC---ceeecCCC Confidence 011122357889999999999888777778899999999988763 3221 22222221 Q ss_pred ceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccc---ccc--------------cccc Q lcl|NC_020858. 235 NNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDK---KVA--------------KTGD 297 (330) Q Consensus 235 ~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e---~la--------------KtGd 297 (330) .. +-+| +.|+.+..++.+ ...+++.|++.+-+..-.++ ..+ +.. -.-| T Consensus 223 ~~----------~l~G-~PV~~~~~~~~~---~~~~~~gd~~~~~~~~~~~~-~i~~~~~~~~~~~~~~~~~~~~~~~~~ 287 (324) T protein:vir:10 223 SD----------TLDG-LPVVNLKSSNLK---RGELITGDFDKLIYGIPQLI-EYKIDETAQLSTVKNEDGTPVNLFEQD 287 (324) T ss_pred Cc----------cccc-eeEEeecCCCCC---cceEEEEecccEEEEEecCc-EEEEeecccccccccccccchhhhhcC Confidence 11 1233 233444444433 34567778877655432221 111 000 1124 Q ss_pred ceeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
298 AEKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 298 ~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) ...+.....++..+.+|.|..+|++.+.=+..| T Consensus 288 ~~~~r~~~r~d~~v~~~~A~~~l~~a~~~~~~~ 320 (324) T protein:vir:10 288 MVALRATMHVALHIADDKAFAKLVPADKKTDSV 320 (324) T ss_pred cEEEEEEEEEccEEecccceEEEEeccCCCCCC Confidence 456666677999999999999988887777767 No 34 >protein:vir:99749 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1497 # MgeName: phiETA2 # Cross-refs: genbank:acc:YP_001004307;genbank:gi:122891761;genbank:GeneID:4712304 Probab=97.87 E-value=9.6e-06 Score=48.08 Aligned_cols=281 Identities=11% Similarity=0.013 Sum_probs=148.5 Q ss_pred CCc--cccceeecccc-ccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCc Q lcl|NC_020858. 1 MAV--VTNTFQSTGAK-GNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSP 77 (330) Q Consensus 1 Ma~--~t~~~~t~~~~-g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~ 77 (330) ++. +.++..+...- ..-+.+.+.|...-....|+.++.......+..+.+...+- .+...-..||+..+....... T Consensus 21 ~~~~~a~~~~~~~~~~~lip~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~p~~~~-~~~a~~v~Eg~~~~~~~~~~~ 99 (324) T protein:vir:99 21 PQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMRLGKYEPMEGTEKKFTFWAD-KPGAYWVGEGQKIETSKATWV 99 (324) T ss_pred hhhccccceeccCCCcceechhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEec-CcceeEeccCcccccccccee Confidence 100 00011111111 12235667766666888888888766655444444444332 222222358888766544332 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ...-+... +...+.||. +.+.... .+...|=...-.+.+.+-+|.++|+|.... + ...|+...+. 
T Consensus 100 ~v~~~~~k-~~~~~~iS~--ell~ds~-~~l~~~i~~~l~~ai~~~~d~~~l~G~g~~-~---~~~~~~~~~~------- 164 (324) T protein:vir:99 100 NATMRAFK-LGVILPVTK--EFLNYTY-SQFFEEMKPMIAEAFYKKFDEAGILNQGNN-P---FGKSIAQSIE------- 164 (324) T ss_pred EEEEeeEE-EEEeehhhH--HHHhcch-HHHHHHHHHHHHHHHHHHHHHHhhhcCCCC-c---cCcccccccc------- Confidence 22222222 223344443 3333332 233444444556668899999999885421 1 1223222211 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) ..+..+...++.++|.+++.++-.++.....+++||.....+..+ ++... |+...++.. T Consensus 165 ---------------~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l-~d~~g---~~~~~~~~~-- 223 (324) T protein:vir:99 165 ---------------KTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKI-VDPET---KERIYDRNS-- 223 (324) T ss_pred ---------------ccceeccccCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh-hcCCC---ceeecCCCC-- Confidence 001112235788999999999988877777889999999988876 33321 222222211 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcc---cccccc-------------ccccceee Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIA---EDKKVA-------------KTGDAEKF 301 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~---~~e~la-------------KtGd~~k~ 301 (330) .+-+| +.|+.+..++.+ ...+++.|++.+-+....++. ..+... -.-|...+ T Consensus 224 --------~~l~G-~PVv~~~~~~~~---~~~~i~gd~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~ 291 (324) T protein:vir:99 224 --------DTLDG-LPVVNLKSSNLK---RGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVAL 291 (324) T ss_pred --------ccccc-eeEEeecCCCCC---cceEEEEecccEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEE Confidence 11234 344444455543 235677888876554322210 011000 11244566 Q ss_pred EEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
302 MLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 302 ~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) .+...++..+.+|.|..+|++.+.=+..| T Consensus 292 r~~~r~d~~v~~~~a~~~lt~a~~~~~~~ 320 (324) T protein:vir:99 292 RATMHVALHIADDKAFAKLVPADKKTDSV 320 (324) T ss_pred EEEEEEccEEecccceEEEEeccCCCCCC Confidence 66677999999999999988877777766 No 35 >protein:vir:97148 Length: 324 # NCBI annotation: ORF010 # Family: family:all:507 # MgeID: mge:1654 # MgeName: 85 # Cross-refs: genbank:acc:YP_239726;genbank:gi:66394880;genbank:GeneID:5130881 Probab=97.85 E-value=1.3e-05 Score=47.43 Aligned_cols=278 Identities=11% Similarity=0.014 Sum_probs=146.5 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) ...-..+.++......-+.+.+.|...=....|+..+.......+..+.+...+- .+...-..||+..+..... + T Consensus 24 ~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~ip~~~~-~~~a~~v~Eg~~~~~~~~~----f 98 (324) T protein:vir:97 24 FNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTFWAD-KPGAYWVGEGQKIETSKAT----W 98 (324) T ss_pred hccccccccCCCcceechhHHHHHHHHHHhhcchhhhcceeeccCCceEEEEEec-CcceeEeccCccccccccc----e Confidence 1000001111111122345667666666888888887766555544444444332 2222233588776544332 3 Q ss_pred cceEEEEe---eeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 81 GNYTQIMR---KSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 81 ~N~tQIf~---~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) .+++--.+ ..+.|| .+.+.... .+...+=...-...+.+-+|.++|+|.... . ...|+...+.. 
T Consensus 99 ~~v~~~~~k~~~~~~is--~ell~ds~-~~l~~~i~~~l~~aia~~~d~a~l~G~g~~-~---~~~gi~~~~~~------ 165 (324) T protein:vir:97 99 VNATMRAFKLGVILPVT--KEFLNYTY-SQFFEEMKPMIAEAFYKKFDEAGILNQGNN-P---FGKSIAQSIEK------ 165 (324) T ss_pred eEEEEeeEEEEEeehhh--HHHHhcch-HHHHHHHHHHHHHHHHHHHHHHhhccCCCC-c---cCccccccccc------ Confidence 33332222 333344 33443332 233344445556678899999999986422 1 22333332210 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) .+..+...++.++|.++..++-.++.....++++|.....+..+. +... |+....+...+ T Consensus 166 ----------------~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~lk-d~~g---~~~~~~~~~~t 225 (324) T protein:vir:97 166 ----------------TNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIV-DPET---KERIYDRNSDT 225 (324) T ss_pred ----------------cceeccccCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhh-cCCC---ceeecCCCCcc Confidence 111122357888899999888888877778999999998888763 3321 22222221111 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---ccccccc-------------ccccceee Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKKVA-------------KTGDAEKF 301 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~la-------------KtGd~~k~ 301 (330) =+|. .|+.+...+.+ ...+++.|++.+-+..-.++ ..++... ..=|...+ T Consensus 226 ----------l~G~-PV~~~~~~~~~---~~~~~~gd~~~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~ 291 (324) T protein:vir:97 226 ----------LDGL-PVVNLKSSNLK---RGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVAL 291 (324) T ss_pred ----------ccce-eeEeecCCCCC---cceEEEEecccEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEE Confidence 1332 33333333322 33567788877644432221 0011100 11144555 Q ss_pred EEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
302 MLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 302 ~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) .+...+...+++|+|..+|.+.+.=+..| T Consensus 292 r~~~r~d~~v~~~~a~~~l~~~~~~~~~~ 320 (324) T protein:vir:97 292 RATMHVALHIADDKAFAKLVPADKKTDSV 320 (324) T ss_pred EEEEEeccEEecccceEEEEeccCCCCCC Confidence 66677899999999999999988877777 No 36 >protein:vir:96223 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1607 # MgeName: 69 # Cross-refs: genbank:acc:YP_239571;genbank:gi:66395304;genbank:GeneID:5132771 Probab=97.85 E-value=6.4e-06 Score=49.06 Aligned_cols=278 Identities=12% Similarity=0.025 Sum_probs=147.7 Q ss_pred CCc--cccceeecccc-ccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCc Q lcl|NC_020858. 1 MAV--VTNTFQSTGAK-GNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSP 77 (330) Q Consensus 1 Ma~--~t~~~~t~~~~-g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~ 77 (330) ++. +.+...+.... ..-+.+.+.|...-....|++++.......+..+.|...+-.+ ...-..||+..+..... T Consensus 21 ~~~~~a~~~~~~~~~~~lip~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~~p~~~~~~-~a~~v~Eg~~~~~~~~~-- 97 (324) T protein:vir:96 21 PQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTFWADKP-GAYWVGEGQKIETSKAT-- 97 (324) T ss_pred hhhcccccccccCCCcceechhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEecCc-ceeeecCCccccccccc-- Confidence 110 01000011111 1224576777666688888888877666665556666544322 22234588777654332 Q ss_pred eEecceE---EEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccc Q lcl|NC_020858. 78 ERLGNYT---QIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVS 154 (330) Q Consensus 78 ~~~~N~t---QIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~ 154 (330) +.+++ +-+...+.||. +.+.... .+...+=...-.+.+.+.+|.++|+|..... . ..|+...... 
T Consensus 98 --f~~v~~~~~k~~~~~~is~--ell~ds~-~~l~~~i~~~l~~aia~~~d~~~l~G~g~~~-~---~~~~~~~~~~--- 165 (324) T protein:vir:96 98 --WVNATMRAFKLGVILPVTK--EFLNYTY-SQFFEEMKPMIAEAFYKKFDEAGILNQGNNP-F---GKSIAQSIKK--- 165 (324) T ss_pred --eeEEEEEeEEEEEeehhhH--HHHhcch-HHHHHHHHHHHHHHHHHHHHHHhhhcCCCCC-c---Cccccccccc--- Confidence 33333 22333444543 2222221 2233333444556689999999998854211 1 1222221100 Q ss_pred cccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCc Q lcl|NC_020858. 155 RGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGK 234 (330) Q Consensus 155 ~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~ 234 (330) .+......++.+.|.+++.++-.++.+...++++|.....+..+ ++... |+...++. T Consensus 166 -------------------~~~~~~~~~~~~~i~~~~~~i~~~~~~~~~~i~n~~~~~~L~~l-kd~~G---~~~~~~~~ 222 (324) T protein:vir:96 166 -------------------TNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKI-VDPET---KERIYDRN 222 (324) T ss_pred -------------------cceecccccchHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh-hCCCC---CeeecCCC Confidence 01111224678889999999988888888899999999988875 44332 22222222 Q ss_pred ceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---ccccc-cc------------ccccc Q lcl|NC_020858. 235 NNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKK-VA------------KTGDA 298 (330) Q Consensus 235 ~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~-la------------KtGd~ 298 (330) ..+ =+| +.|+.+...+. ....+++.|++.+-+..-.++ ...+. +. -.-|. T Consensus 223 ~~~----------l~G-~PV~~~~~~~~---~~~~~~~gd~s~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~n~ 288 (324) T protein:vir:96 223 SDS----------LDG-LPVVNLKSSNL---KRGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDM 288 (324) T ss_pred CCc----------ccc-eeeEeecCCCC---CcceEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhhcCc Confidence 211 123 23333333322 234577788877654432221 00110 00 11234 Q ss_pred eeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
299 EKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 299 ~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) ..+.+...+.+.+++|.|..+|++-.+=+..| T Consensus 289 v~~r~~~r~d~~v~~~~a~~~l~~a~~~~~~~ 320 (324) T protein:vir:96 289 VALRATMHVALHIADDKAFAKLVPADKRTDSV 320 (324) T ss_pred EEEEEEEEeccEEecccceEEEecccccCCCC Confidence 55556677899999999999999988888888 No 37 >protein:vir:10364 Length: 390 # NCBI annotation: head protein; major capsid subunit precursor # Family: family:all:585 # MgeID: mge:183 # MgeName: Xp10 # Cross-refs: genbank:acc:NP_858956;genbank:gi:32128421;genbank:GeneID:2648357 Probab=97.81 E-value=9.3e-06 Score=48.16 Aligned_cols=276 Identities=12% Similarity=0.008 Sum_probs=137.8 Q ss_pred CCccc--cceeeccc-cccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCc Q lcl|NC_020858. 1 MAVVT--NTFQSTGA-KGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSP 77 (330) Q Consensus 1 Ma~~t--~~~~t~~~-~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~ 77 (330) ++... .+..+... ....+.+.+.|+..--...|+..++...+..+....|....-.++...-..||...+....... T Consensus 107 ~~~~~~~~~~~~~~~g~~~~~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~~~ 186 (390) T protein:vir:10 107 KAALNTASTDAAGSAGALTTPNRLPGFITQPDARLTVRDLIGSGRTDSALIEYVQETGFVNNAAIVAEGALKPESSLKFA 186 (390) T ss_pred HHHHHhhhcccccccccccchhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEecCCcceeeecCCcccccccccee Confidence 11000 00000000 0112233333444334556777776655555555566655443333233457777655433222 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ...-+.. -+...+.||. +.+... .+..+|-..+-...+.+-+|.++|+|... +. ...||+........ 
T Consensus 187 ~i~~~~~-k~~~~~~is~--ell~d~--~~l~~~i~~~l~~~~~~~~~~~il~G~G~-~~---~p~Gi~~~~~~~~~--- 254 (390) T protein:vir:10 187 KKTDTTH-VIAHTMKATR--QILSDA--PQLASYMNNRLIRGLKVKEDAEILRGTGA-ND---GLLGLIPQATTYAA--- 254 (390) T ss_pred EEEEeeE-EEEEeehhhH--HHHHhH--HHHHHHHHHHHHHHHHHHHHHHHhhcCCC-Cc---cccccccccccccc--- Confidence 2111221 2223344554 334433 24444444455567888999999998542 11 23355433211110 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) +.+.......+.+.+++.++-.++.....+++||.....|..+- +.. .|+...+... T Consensus 255 -----------------~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~L~~lk-d~~---g~~l~~~~~~-- 311 (390) T protein:vir:10 255 -----------------PTTIAGATRVDQLRLAMLQASLAEYPASGIVINPIDWAAIELAK-DAN---NQYLIGNARG-- 311 (390) T ss_pred -----------------cccccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHhh-cCC---CceeecCCcC-- Confidence 01111134456788888888877777778899999988888763 332 1222211110 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchh-hhhcccCCccc----cccccccccceeeEEEEEEEEEEe Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAE----DKKVAKTGDAEKFMLIGEGALKPK 312 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~----~e~laKtGd~~k~~i~~E~tLe~~ 312 (330) ... -+=+| +.|+.+++||++. +++.|++. +.+ +.|.-.. .+.-.-+-+...+.....+...++ T Consensus 312 --~~~---~~l~G-~pv~~~~~~p~~~-----~~~gdf~~~~~~-~~~~~~~i~~~~~~~~~~~~~~~~r~~~r~d~~v~ 379 (390) T protein:vir:10 312 --TLT---PTLWG-LPVVATQAMAPGE-----FLVGAFDLAAQI-FDQWDARVEIGYVNDDFQRNMVTVLAEERLALVVY 379 (390) T ss_pred --cCC---ceecc-eeeEEcCCCCCCc-----EEEEeccceEEE-EEecceEEEEeecccccccCcEEEEEEEeeccEEe Confidence 000 01133 4678889999654 56788864 333 2232111 111112335566777788999999 Q ss_pred cchheeEEeccc Q lcl|NC_020858. 
313 NEKGLGVAADLY 324 (330) Q Consensus 313 N~~a~g~i~gLt 324 (330) +|.|+.+|+ |. T Consensus 380 ~~~a~~~~~-~a 390 (390) T protein:vir:10 380 RPEALISGS-FA 390 (390) T ss_pred ccccEEEEE-eC Confidence 999986554 44 No 38 >protein:vir:78830 Length: 324 # NCBI annotation: major head protein # Family: family:all:507 # MgeID: mge:1858 # MgeName: 80alpha # Cross-refs: genbank:acc:YP_001285361;genbank:gi:148717889;genbank:GeneID:5246961 Probab=97.76 E-value=2e-05 Score=46.30 Aligned_cols=278 Identities=12% Similarity=0.018 Sum_probs=143.9 Q ss_pred CCc-----c-ccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccc Q lcl|NC_020858. 1 MAV-----V-TNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDAT 74 (330) Q Consensus 1 Ma~-----~-t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~ 74 (330) +.. + ..+.++......-+.+.+.|...-....|+++++......+..+....-+.. +...-..||+..+.... T Consensus 18 ~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~~p~~~~~-~~a~~v~Eg~~~~~~~~ 96 (324) T protein:vir:78 18 NVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTFWADK-PGAYWVGEGQKIETSKA 96 (324) T ss_pred hhhhhhhccccccccCcCccccchhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEecC-cceeEecCCcccccccc Confidence 100 0 0010111111122356666666667788888877666555444443332222 22233358877765433 Q ss_pred cCceEecceEEE---EeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhc Q lcl|NC_020858. 75 VSPERLGNYTQI---MRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKT 151 (330) Q Consensus 75 ~~~~~~~N~tQI---f~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~t 151 (330) . +.+++-- +...+.|| .+.+.... .+..++=...-.+.+.+.+|.++|+|.... . ...|+...... 
T Consensus 97 ~----~~~v~~~~~k~~~~~~is--~ell~ds~-~~l~~~i~~~la~ai~~~~d~a~l~G~g~~-~---~~~gi~~~~~~ 165 (324) T protein:vir:78 97 T----WVNATMRAFKLGVILPVT--KEFLNYTY-SQFFEEMKPMIAEAFYKKFDEAGILNQGNN-P---FGKSIAQSIEK 165 (324) T ss_pred c----eeEEEEeeEEEEEeehhh--HHHHhcch-HHHHHHHHHHHHHHHHHHHHHHHhccCCCC-C---cCccccccccc Confidence 2 2222211 22333333 33433332 233333334455778899999999986422 1 12233222110 Q ss_pred ccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec Q lcl|NC_020858. 152 NVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS 231 (330) Q Consensus 152 n~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~ 231 (330) .+..+...++.++|.++..++-.++.+...++++|....++..+. +.. .|+... T Consensus 166 ----------------------~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~vmn~~~~~~L~~l~-d~~---G~~~~~ 219 (324) T protein:vir:78 166 ----------------------TNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIV-DPE---TKERIY 219 (324) T ss_pred ----------------------cceeccccccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhh-ccC---CCeeec Confidence 011122346788899999988888888888999999998888763 332 122222 Q ss_pred CCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---cccccc-------------ccc Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKKV-------------AKT 295 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~l-------------aKt 295 (330) ++...+ =+| +.|+.+..++. ....+++.|++.+-+..-.++ ...+.. .-. T Consensus 220 ~~~~~~----------l~G-~PV~~~~~~~~---~~~~~~~gd~~~~~~g~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~ 285 (324) T protein:vir:78 220 DRNSDS----------LDG-LPVVNLKSSNL---KRGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFE 285 (324) T ss_pred CCCCCc----------ccc-eeeEeeCCCCC---CcceEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhh Confidence 222211 122 23333333332 234577888877644331221 001100 012 Q ss_pred ccceeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
296 GDAEKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 296 Gd~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) -|...+.....++..+++|.|..+|++...=+..| T Consensus 286 ~d~~~~r~~~r~d~~v~~~~A~~~l~~a~~~~~~~ 320 (324) T protein:vir:78 286 QDMVALRATMHVALHIADDKAFAKLVPADKRTDSV 320 (324) T ss_pred cCcEEEEEEEEEccEEecccceEEEecccccCCCC Confidence 24455566677899999999999999988877777 No 39 >protein:vir:96392 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1613 # MgeName: 53 # Cross-refs: genbank:acc:YP_239648;genbank:gi:66395381;genbank:GeneID:5132868 Probab=97.76 E-value=2e-05 Score=46.30 Aligned_cols=278 Identities=12% Similarity=0.018 Sum_probs=143.9 Q ss_pred CCc-----c-ccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccc Q lcl|NC_020858. 1 MAV-----V-TNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDAT 74 (330) Q Consensus 1 Ma~-----~-t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~ 74 (330) +.. + ..+.++......-+.+.+.|...-....|+++++......+..+....-+.. +...-..||+..+.... T Consensus 18 ~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~~p~~~~~-~~a~~v~Eg~~~~~~~~ 96 (324) T protein:vir:96 18 NVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLGKYEPMEGTEKKFTFWADK-PGAYWVGEGQKIETSKA 96 (324) T ss_pred hhhhhhhccccccccCcCccccchhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEecC-cceeEecCCcccccccc Confidence 100 0 0010111111122356666666667788888877666555444443332222 22233358877765433 Q ss_pred cCceEecceEEE---EeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhc Q lcl|NC_020858. 75 VSPERLGNYTQI---MRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKT 151 (330) Q Consensus 75 ~~~~~~~N~tQI---f~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~t 151 (330) . +.+++-- +...+.|| .+.+.... .+..++=...-.+.+.+.+|.++|+|.... . ...|+...... 
T Consensus 97 ~----~~~v~~~~~k~~~~~~is--~ell~ds~-~~l~~~i~~~la~ai~~~~d~a~l~G~g~~-~---~~~gi~~~~~~ 165 (324) T protein:vir:96 97 T----WVNATMRAFKLGVILPVT--KEFLNYTY-SQFFEEMKPMIAEAFYKKFDEAGILNQGNN-P---FGKSIAQSIEK 165 (324) T ss_pred c----eeEEEEeeEEEEEeehhh--HHHHhcch-HHHHHHHHHHHHHHHHHHHHHHHhccCCCC-C---cCccccccccc Confidence 2 2222211 22333333 33433332 233333334455778899999999986422 1 12233222110 Q ss_pred ccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec Q lcl|NC_020858. 152 NVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS 231 (330) Q Consensus 152 n~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~ 231 (330) .+..+...++.++|.++..++-.++.+...++++|....++..+. +.. .|+... T Consensus 166 ----------------------~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~vmn~~~~~~L~~l~-d~~---G~~~~~ 219 (324) T protein:vir:96 166 ----------------------TNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIV-DPE---TKERIY 219 (324) T ss_pred ----------------------cceeccccccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhh-ccC---CCeeec Confidence 011122346788899999988888888888999999998888763 332 122222 Q ss_pred CCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---cccccc-------------ccc Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKKV-------------AKT 295 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~l-------------aKt 295 (330) ++...+ =+| +.|+.+..++. ....+++.|++.+-+..-.++ ...+.. .-. T Consensus 220 ~~~~~~----------l~G-~PV~~~~~~~~---~~~~~~~gd~~~~~~g~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~ 285 (324) T protein:vir:96 220 DRNSDS----------LDG-LPVVNLKSSNL---KRGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFE 285 (324) T ss_pred CCCCCc----------ccc-eeeEeeCCCCC---CcceEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhh Confidence 222211 122 23333333332 234577888877644331221 001100 012 Q ss_pred ccceeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
296 GDAEKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 296 Gd~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) -|...+.....++..+++|.|..+|++...=+..| T Consensus 286 ~d~~~~r~~~r~d~~v~~~~A~~~l~~a~~~~~~~ 320 (324) T protein:vir:96 286 QDMVALRATMHVALHIADDKAFAKLVPADKRTDSV 320 (324) T ss_pred cCcEEEEEEEEEccEEecccceEEEecccccCCCC Confidence 24455566677899999999999999988877777 No 40 >protein:vir:81100 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:1891 # MgeName: tp310-1 # Cross-refs: genbank:acc:YP_001429874;genbank:gi:156603927;genbank:GeneID:5525320 Probab=97.75 E-value=1.9e-05 Score=46.44 Aligned_cols=284 Identities=11% Similarity=-0.016 Sum_probs=135.4 Q ss_pred CCccccceeeccc-cccccccceeeEecCCcccceeeeeccceeccceee-eeeeeccCccccccccccccccccc-cCc Q lcl|NC_020858. 1 MAVVTNTFQSTGA-KGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPE-WTTDELAAPGANITLEGDEYTFDAT-VSP 77 (330) Q Consensus 1 Ma~~t~~~~t~~~-~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~-W~td~L~~~~~na~~EG~d~~~~~~-~~~ 77 (330) +.......++.+. .-.-+++.+.|...--..+|+..++.....++..+. |....-..+...-..||++.+.... ... T Consensus 117 ~~~~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~E~~~~~~~~~~~~~ 196 (415) T protein:vir:81 117 NDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELEENPELAVKPFF 196 (415) T ss_pred hhhhhccccccccccccchHHHHHHHHHHHhhhhhhhheeeeeccCCceeEEEEeecCCccceeeccccccCccccccee Confidence 1111111112111 112236777777776777888888776554433332 2222222233334467877654321 111 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ...-+...+ ...+.||. +.+.... .+-.+|=..+-...+.+-+|+++|+|........ ++..... . 
T Consensus 197 ~v~~~~~k~-~~~~~iS~--ell~ds~-~~l~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~----~~~~~~~---~--- 262 (415) T protein:vir:81 197 QLAYDINTH-RGYFRISR--EAIEDAK-VNVLQELKLWMARTIAATRNKAIIDVITKGSTGS----TSSGFEK---E--- 262 (415) T ss_pred eEEeeeeee-EeeehhhH--HHHhhch-HHHHHHHHHHHHHHHHHHHHHHHhhccccCcccc----ccccccc---c--- Confidence 111122222 22344443 3333332 2333344445555678889999998764211110 0000000 0 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecC----C Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASN----G 233 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~----~ 233 (330) ..+.......+-+.|.+++.++..++.....+++|+....+|..+ ++.+. |+.... + T Consensus 263 ---------------~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~l-kd~~G---~~l~~~~~~~~ 323 (415) T protein:vir:81 263 ---------------GKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM-KDKLG---NYLIQPDVKEK 323 (415) T ss_pred ---------------ccccccccccchhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh-hccCC---ceeeccCcCCC Confidence 001111224566778888888877776666789999999998876 44331 222111 1 Q ss_pred cceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 234 KNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 234 ~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) ...+| +| +.|+....||........+++.|+...-+-+.|.-..-+...-..+......+..+...+.+ T Consensus 324 ~~~~l----------~G-~pV~~~~~~~~~~~~~~~~~~Gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~~v~~ 392 (415) T protein:vir:81 324 TQQRL----------LG-AKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDCRILD 392 (415) T ss_pred CCcee----------cc-eeeEEecccccCCCCccEEEEEehhccEEEEeecceEEEEeccccCceEEEEEEEeccEEec Confidence 11011 22 24555556775544334467777653211122221111111112233445566778899999 Q ss_pred chheeEEec--------cccccc Q lcl|NC_020858. 
314 EKGLGVAAD--------LYGLTA 328 (330) Q Consensus 314 ~~a~g~i~g--------Lt~~~~ 328 (330) |.|..++.- =-||-+ T Consensus 393 ~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:81 393 YKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred cccEEEEEEeccCCCCCccccCC Confidence 999987752 223333 No 41 >protein:vir:98339 Length: 415 # NCBI annotation: putative capsid protein # Family: family:all:21 # MgeID: mge:1581 # MgeName: phiPVL(108) # Cross-refs: genbank:acc:YP_918931;genbank:gi:119443693;genbank:GeneID:4594501 Probab=97.75 E-value=1.9e-05 Score=46.44 Aligned_cols=284 Identities=11% Similarity=-0.016 Sum_probs=135.4 Q ss_pred CCccccceeeccc-cccccccceeeEecCCcccceeeeeccceeccceee-eeeeeccCccccccccccccccccc-cCc Q lcl|NC_020858. 1 MAVVTNTFQSTGA-KGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPE-WTTDELAAPGANITLEGDEYTFDAT-VSP 77 (330) Q Consensus 1 Ma~~t~~~~t~~~-~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~-W~td~L~~~~~na~~EG~d~~~~~~-~~~ 77 (330) +.......++.+. .-.-+++.+.|...--..+|+..++.....++..+. |....-..+...-..||++.+.... ... T Consensus 117 ~~~~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~E~~~~~~~~~~~~~ 196 (415) T protein:vir:98 117 NDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELEENPELAVKPFF 196 (415) T ss_pred hhhhhccccccccccccchHHHHHHHHHHHhhhhhhhheeeeeccCCceeEEEEeecCCccceeeccccccCccccccee Confidence 1111111112111 112236777777776777888888776554433332 2222222233334467877654321 111 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ...-+...+ ...+.||. +.+.... .+-.+|=..+-...+.+-+|+++|+|........ ++..... . 
T Consensus 197 ~v~~~~~k~-~~~~~iS~--ell~ds~-~~l~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~----~~~~~~~---~--- 262 (415) T protein:vir:98 197 QLAYDINTH-RGYFRISR--EAIEDAK-VNVLQELKLWMARTIAATRNKAIIDVITKGSTGS----TSSGFEK---E--- 262 (415) T ss_pred eEEeeeeee-EeeehhhH--HHHhhch-HHHHHHHHHHHHHHHHHHHHHHHhhccccCcccc----ccccccc---c--- Confidence 111122222 22344443 3333332 2333344445555678889999998764211110 0000000 0 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecC----C Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASN----G 233 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~----~ 233 (330) ..+.......+-+.|.+++.++..++.....+++|+....+|..+ ++.+. |+.... + T Consensus 263 ---------------~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~l-kd~~G---~~l~~~~~~~~ 323 (415) T protein:vir:98 263 ---------------GKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM-KDKLG---NYLIQPDVKEK 323 (415) T ss_pred ---------------ccccccccccchhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh-hccCC---ceeeccCcCCC Confidence 001111224566778888888877776666789999999998876 44331 222111 1 Q ss_pred cceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 234 KNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 234 ~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) ...+| +| +.|+....||........+++.|+...-+-+.|.-..-+...-..+......+..+...+.+ T Consensus 324 ~~~~l----------~G-~pV~~~~~~~~~~~~~~~~~~Gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~~v~~ 392 (415) T protein:vir:98 324 TQQRL----------LG-AKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDCRILD 392 (415) T ss_pred CCcee----------cc-eeeEEecccccCCCCccEEEEEehhccEEEEeecceEEEEeccccCceEEEEEEEeccEEec Confidence 11011 22 24555556775544334467777653211122221111111112233445566778899999 Q ss_pred chheeEEec--------cccccc Q lcl|NC_020858. 
314 EKGLGVAAD--------LYGLTA 328 (330) Q Consensus 314 ~~a~g~i~g--------Lt~~~~ 328 (330) |.|..++.- =-||-+ T Consensus 393 ~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:98 393 YKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred cccEEEEEEeccCCCCCccccCC Confidence 999987752 223333 No 42 >protein:vir:79987 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:1875 # MgeName: tp310-3 # Cross-refs: genbank:acc:YP_001430002;genbank:gi:156604057;genbank:GeneID:5525447 Probab=97.75 E-value=1.9e-05 Score=46.44 Aligned_cols=284 Identities=11% Similarity=-0.016 Sum_probs=135.4 Q ss_pred CCccccceeeccc-cccccccceeeEecCCcccceeeeeccceeccceee-eeeeeccCccccccccccccccccc-cCc Q lcl|NC_020858. 1 MAVVTNTFQSTGA-KGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPE-WTTDELAAPGANITLEGDEYTFDAT-VSP 77 (330) Q Consensus 1 Ma~~t~~~~t~~~-~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~-W~td~L~~~~~na~~EG~d~~~~~~-~~~ 77 (330) +.......++.+. .-.-+++.+.|...--..+|+..++.....++..+. |....-..+...-..||++.+.... ... T Consensus 117 ~~~~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~E~~~~~~~~~~~~~ 196 (415) T protein:vir:79 117 NDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELEENPELAVKPFF 196 (415) T ss_pred hhhhhccccccccccccchHHHHHHHHHHHhhhhhhhheeeeeccCCceeEEEEeecCCccceeeccccccCccccccee Confidence 1111111112111 112236777777776777888888776554433332 2222222233334467877654321 111 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ...-+...+ ...+.||. +.+.... .+-.+|=..+-...+.+-+|+++|+|........ ++..... . 
T Consensus 197 ~v~~~~~k~-~~~~~iS~--ell~ds~-~~l~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~----~~~~~~~---~--- 262 (415) T protein:vir:79 197 QLAYDINTH-RGYFRISR--EAIEDAK-VNVLQELKLWMARTIAATRNKAIIDVITKGSTGS----TSSGFEK---E--- 262 (415) T ss_pred eEEeeeeee-EeeehhhH--HHHhhch-HHHHHHHHHHHHHHHHHHHHHHHhhccccCcccc----ccccccc---c--- Confidence 111122222 22344443 3333332 2333344445555678889999998764211110 0000000 0 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecC----C Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASN----G 233 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~----~ 233 (330) ..+.......+-+.|.+++.++..++.....+++|+....+|..+ ++.+. |+.... + T Consensus 263 ---------------~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~l-kd~~G---~~l~~~~~~~~ 323 (415) T protein:vir:79 263 ---------------GKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM-KDKLG---NYLIQPDVKEK 323 (415) T ss_pred ---------------ccccccccccchhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh-hccCC---ceeeccCcCCC Confidence 001111224566778888888877776666789999999998876 44331 222111 1 Q ss_pred cceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 234 KNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 234 ~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) ...+| +| +.|+....||........+++.|+...-+-+.|.-..-+...-..+......+..+...+.+ T Consensus 324 ~~~~l----------~G-~pV~~~~~~~~~~~~~~~~~~Gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~~v~~ 392 (415) T protein:vir:79 324 TQQRL----------LG-AKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDCRILD 392 (415) T ss_pred CCcee----------cc-eeeEEecccccCCCCccEEEEEehhccEEEEeecceEEEEeccccCceEEEEEEEeccEEec Confidence 11011 22 24555556775544334467777653211122221111111112233445566778899999 Q ss_pred chheeEEec--------cccccc Q lcl|NC_020858. 
314 EKGLGVAAD--------LYGLTA 328 (330) Q Consensus 314 ~~a~g~i~g--------Lt~~~~ 328 (330) |.|..++.- =-||-+ T Consensus 393 ~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:79 393 YKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred cccEEEEEEeccCCCCCccccCC Confidence 999987752 223333 No 43 >protein:vir:3158 Length: 321 # NCBI annotation: capsid protein gpE # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:316 # MgeName: PhiCh1 # Cross-refs: genbank:acc:NP_665929;genbank:gi:22091115;genbank:GeneID:951342 Probab=97.74 E-value=6.7e-06 Score=48.96 Aligned_cols=288 Identities=14% Similarity=0.106 Sum_probs=131.4 Q ss_pred CC---------cc--ccceeecccc---ccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccc Q lcl|NC_020858. 1 MA---------VV--TNTFQSTGAK---GNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEG 66 (330) Q Consensus 1 Ma---------~~--t~~~~t~~~~---g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG 66 (330) || .+ .+.+++.+.. -..+++.+.|..---+.+||++++......+...+=..-...........|| T Consensus 1 ~~~k~~~~~l~~~~~~~~~~~~~~~~g~~v~~~~~~~l~~~i~e~s~~l~~i~v~~v~~~~~~i~~~~~~~~~~~~~~e~ 80 (321) T protein:vir:31 1 MASRTINNDLSRITEKNALTVDDLDAGGTLPDPLWDEFWTDMIEETPLLDAIRTETVGAKKTRIPTLNIGERHRRPQDEG 80 (321) T ss_pred CchHHHHHHHHHHHHhccccccccCCcceeCHHHHHHHHHHHHHhhhhhhhceeeeccCcceeeeeeccCCccccccccc Confidence 11 11 1111111111 1234566666665567789999887655544332211111111111111233 Q ss_pred cccccccccCceEecceEEEEee---eeeehhHHHHHhhccc-cchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcc-ccc Q lcl|NC_020858. 67 DEYTFDATVSPERLGNYTQIMRK---SGIISGTQNITDEAGR-ATKVKEQKLKKGVELRKDVEFSIVATNASVGGA-TRE 141 (330) Q Consensus 67 ~d~~~~~~~~~~~~~N~tQIf~~---~v~VS~T~~av~~~G~-~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~-~r~ 141 (330) ..... +..+ .++.+.-=.++ -+.||. +-+...-. .+--++=..+-.+.+.+|++.++++|......+ ... 
T Consensus 81 ~~~~~-~~~~--~~~~~~~~~~k~~~~~~it~--e~L~d~a~~~d~e~~i~~~ia~~~a~~~~~~~~nGd~~~~~~~~~~ 155 (321) T protein:vir:31 81 EWNEN-ESDV--STGTIDISTEKATVAWDLPR--EVVQENPEGEALADRILNLMTDAWSADVEDLAANGDEDAEDSFENQ 155 (321) T ss_pred ccccc-cccc--eeeeeeeeeEEEEeehhccH--HHHHhhhcchhHHHHHHHHHHHHHHHHHHhheeeccccCCCccccc Confidence 22111 1111 11221111111 223332 23322211 233333444556779999999999997432111 123 Q ss_pred chhHHHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHH---HhcCCceeEEEeChHHHHHHHHhh Q lcl|NC_020858. 142 SGSLPTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQG---YQSGANFKHVFVSPYVKSVFVTFM 218 (330) Q Consensus 142 ~~Gi~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~---~~~Gg~~~~i~v~~~~k~~is~f~ 218 (330) ..|++..++.+... ...+...++.+.|.+++..+ |.+.++. ..+|++.....+-... T Consensus 156 n~G~l~~a~~~~~~-------------------~~~~~~~~~~d~l~~l~~~l~~~yr~~~~~-v~im~~~~~~~~~~~l 215 (321) T protein:vir:31 156 NDGFITVAEGDVET-------------------IDAADDILDNDLVIRTIAGLDSKYRARMNP-ALIVSEDQLLSYHYTL 215 (321) T ss_pred chhhhhhhcccccc-------------------ccccccccCHHHHHHHHHhccHhHhcCCCe-EEEechHHHHHHHHHH Confidence 45654443322110 11122346777788877775 3233333 4678888766655544 Q ss_pred ccceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccc---ccc Q lcl|NC_020858. 219 SDTNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKV---AKT 295 (330) Q Consensus 219 ~~~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~l---aKt 295 (330) .+...-..+....++. .. -++-+.++..++||.+. +++-|+..+.+.+-+.. ..+.. .+. T Consensus 216 ~~~~~~~~~~~l~~~~----------~~-tl~G~pvv~~~~mP~~~-----il~t~~~nl~~~~~~~~-~~~~~~~~~~~ 278 (321) T protein:vir:31 216 TDRDTPLGDNVIMGEA----------DV-NPFSFPIIGSGLWPDDK-----AMFTDPQNLIYALYRDL-EIDVLTESDKV 278 (321) T ss_pred hcCCCccccchhhccc----------cc-cccceeEEEcCCCCCCc-----EEEeccccEEEEEeecc-EEEEeecCccc Confidence 4332111000011111 11 25667888889999764 56788887755432321 11111 111 Q ss_pred ---ccceeeEEEEEEEEEEecchheeEEecccc----ccccC Q lcl|NC_020858. 
296 ---GDAEKFMLIGEGALKPKNEKGLGVAADLYG----LTAST 330 (330) Q Consensus 296 ---Gd~~k~~i~~E~tLe~~N~~a~g~i~gLt~----~~~~~ 330 (330) .+.....+.....-.+.++.+.+.|+||=- |-.+| T Consensus 279 ~~~~~~~~~~~~~~~~~~ve~~~a~a~~~~i~~~~~~~~~~~ 320 (321) T protein:vir:31 279 SERDLHARYFMRGDDDFAIENTEAVVLAEGLGDPLEHLEEET 320 (321) T ss_pred cccceeeEeeeeeecceeEeccccEEEEecCCcchhcccCCC Confidence 122233345557777889999999998732 22222 No 44 >protein:vir:4600 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:101 # MgeName: PVL # Cross-refs: genbank:acc:NP_058445;genbank:gi:9635171;genbank:GeneID:1262708 Probab=97.72 E-value=2.1e-05 Score=46.23 Aligned_cols=281 Identities=11% Similarity=0.023 Sum_probs=138.5 Q ss_pred CCccccceeeccc-cccccccceeeEecCCcccceeeeeccceeccceeeeeeeec-cCccccccccccccccccccCce Q lcl|NC_020858. 1 MAVVTNTFQSTGA-KGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDEL-AAPGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~-~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L-~~~~~na~~EG~d~~~~~~~~~~ 78 (330) ++..+ ++... .-.-+.+.+.|...-...+|+..++......+..+.+..-.. ..+...-..||+..+.... . T Consensus 120 ~~~~~---~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~Eg~~~~~~~~---~ 193 (415) T protein:vir:46 120 QGGSL---KTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELEENPELAV---K 193 (415) T ss_pred hhccc---cccCCcccccHHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEEecCCcceeecccccccccccc---c Confidence 22211 11111 222346777887777888899888776555544433322211 1222233457777653211 1 Q ss_pred Eecce-EEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 79 RLGNY-TQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 79 ~~~N~-tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) .+.++ ...++-...+.=|.+.++... .+...|=..+-...+.+-+|.++|+|... +.+.. ++..+.. . 
T Consensus 194 ~~~~v~~~~~k~~~~~~iS~ell~ds~-~~l~~~i~~~l~~~i~~~~d~~il~g~g~--g~~~~--~~~~~~~---~--- 262 (415) T protein:vir:46 194 PFFQLAYDINTHRGYFRISREAIEDAK-VNVLQELKLWMARTIAATRNKAIIDVITK--GSTGS--TSSGFEK---E--- 262 (415) T ss_pred ceeeEEeeeeeeEeeehhhHHHHhhch-HHHHHHHHHHHHHHHHHHHHHHHhhcccc--CCccc--ccccccc---c--- Confidence 12221 112222222223334444333 23344444555666788899999988542 11111 0000000 0 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec----CC Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS----NG 233 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~----~~ 233 (330) ..+.......+.++|.+++.++-.++.....+++||....+|..+ ++.+. |+... ++ T Consensus 263 ---------------~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l-kd~~G---~~i~~~~~~~~ 323 (415) T protein:vir:46 263 ---------------GKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM-KDKLG---NYLIQPDVKEK 323 (415) T ss_pred ---------------cceeccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh-hccCC---CeeeccCcCCC Confidence 001112234667788888888888776677889999999998876 34331 12211 11 Q ss_pred cceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 234 KNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 234 ~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) ...+| +| +.|+....||..+.....+++.|+...-+-+.|.-..-+...-..+......+..+...+.+ T Consensus 324 ~~~~l----------~G-~pV~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~~v~~ 392 (415) T protein:vir:46 324 TQQRL----------LG-AKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDCRILD 392 (415) T ss_pred CCccc----------cc-eeeEEeccccccCCCccEEEEEehhccEEEEeecceEEEeeccccCceEEEEEEEeccEEec Confidence 11011 22 25566666776554344567778774221122221111111112234455666778899999 Q ss_pred chheeEEe--------ccccccc Q lcl|NC_020858. 
314 EKGLGVAA--------DLYGLTA 328 (330) Q Consensus 314 ~~a~g~i~--------gLt~~~~ 328 (330) |.|..++. |=-||-+ T Consensus 393 ~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:46 393 YKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred cccEEEEEeeccCCCCCCccCCC Confidence 99987765 2223333 No 45 >protein:vir:4700 Length: 415 # NCBI annotation: phi PVL ORF 7 homologue # Family: family:all:21 # MgeID: mge:102 # MgeName: phiPV83 # Cross-refs: genbank:acc:NP_061632;genbank:gi:9635719;genbank:GeneID:1262976 Probab=97.72 E-value=2.1e-05 Score=46.23 Aligned_cols=281 Identities=11% Similarity=0.023 Sum_probs=138.5 Q ss_pred CCccccceeeccc-cccccccceeeEecCCcccceeeeeccceeccceeeeeeeec-cCccccccccccccccccccCce Q lcl|NC_020858. 1 MAVVTNTFQSTGA-KGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDEL-AAPGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~-~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L-~~~~~na~~EG~d~~~~~~~~~~ 78 (330) ++..+ ++... .-.-+.+.+.|...-...+|+..++......+..+.+..-.. ..+...-..||+..+.... . T Consensus 120 ~~~~~---~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~Eg~~~~~~~~---~ 193 (415) T protein:vir:47 120 QGGSL---KTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELEENPELAV---K 193 (415) T ss_pred hhccc---cccCCcccccHHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEEecCCcceeecccccccccccc---c Confidence 22211 11111 222346777887777888899888776555544433322211 1222233457777653211 1 Q ss_pred Eecce-EEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 79 RLGNY-TQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 79 ~~~N~-tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) .+.++ ...++-...+.=|.+.++... .+...|=..+-...+.+-+|.++|+|... +.+.. ++..+.. . 
T Consensus 194 ~~~~v~~~~~k~~~~~~iS~ell~ds~-~~l~~~i~~~l~~~i~~~~d~~il~g~g~--g~~~~--~~~~~~~---~--- 262 (415) T protein:vir:47 194 PFFQLAYDINTHRGYFRISREAIEDAK-VNVLQELKLWMARTIAATRNKAIIDVITK--GSTGS--TSSGFEK---E--- 262 (415) T ss_pred ceeeEEeeeeeeEeeehhhHHHHhhch-HHHHHHHHHHHHHHHHHHHHHHHhhcccc--CCccc--ccccccc---c--- Confidence 12221 112222222223334444333 23344444555666788899999988542 11111 0000000 0 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec----CC Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS----NG 233 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~----~~ 233 (330) ..+.......+.++|.+++.++-.++.....+++||....+|..+ ++.+. |+... ++ T Consensus 263 ---------------~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l-kd~~G---~~i~~~~~~~~ 323 (415) T protein:vir:47 263 ---------------GKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM-KDKLG---NYLIQPDVKEK 323 (415) T ss_pred ---------------cceeccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh-hccCC---CeeeccCcCCC Confidence 001112234667788888888888776677889999999998876 34331 12211 11 Q ss_pred cceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 234 KNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 234 ~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) ...+| +| +.|+....||..+.....+++.|+...-+-+.|.-..-+...-..+......+..+...+.+ T Consensus 324 ~~~~l----------~G-~pV~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~r~d~~v~~ 392 (415) T protein:vir:47 324 TQQRL----------LG-AKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRSQYQASWTDYMHFGECLMIAVRQDCRILD 392 (415) T ss_pred CCccc----------cc-eeeEEeccccccCCCccEEEEEehhccEEEEeecceEEEeeccccCceEEEEEEEeccEEec Confidence 11011 22 25566666776554344567778774221122221111111112234455666778899999 Q ss_pred chheeEEe--------ccccccc Q lcl|NC_020858. 
314 EKGLGVAA--------DLYGLTA 328 (330) Q Consensus 314 ~~a~g~i~--------gLt~~~~ 328 (330) |.|..++. |=-||-+ T Consensus 393 ~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:47 393 YKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred cccEEEEEeeccCCCCCCccCCC Confidence 99987765 2223333 No 46 >protein:vir:4511 Length: 409 # NCBI annotation: capsid # Family: family:all:21 # MgeID: mge:97 # MgeName: V # Cross-refs: genbank:acc:NP_599037;genbank:gi:19548995;genbank:GeneID:935211 Probab=97.71 E-value=1.3e-05 Score=47.44 Aligned_cols=286 Identities=11% Similarity=0.010 Sum_probs=134.0 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceecc-ceeeeeeeeccCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDT-THPEWTTDELAAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~-~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) |.+.+.. ......-+.+.+.|...=....|+.++....+..+ ....|....-..+..--..||...+......... T Consensus 117 ~~~~~~~---~gg~liP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~E~~~~~~~~~~f~~~ 193 (409) T protein:vir:45 117 QGVAQDE---KGGYTVPETFLAKVVEKMKSYGGIASVAQILTTSDGRTMEWATADGTSEVGVLLGENEEAGEEDTDFGMG 193 (409) T ss_pred ccCccCc---CCceeccHhHHHHHHHHHHhhhhhhhhceeeecCCCceEEEEeeccCcccccccccccccccccccccee Confidence 3222211 00111123455555555567777777655444432 3467776664443333446777654433221111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) .-.--.+....+.| |.+.+.... .+...|=...-...+.+-+|.+||+|... +++-+.-||+.... 
+ T Consensus 194 ~l~~~k~~~~~i~i--s~ell~ds~-~~l~~~i~~~la~a~~~~~~~a~l~G~G~--~~~~~p~Gil~~~~--------~ 260 (409) T protein:vir:45 194 SLGALKMTSKIIRV--SNELLQDSA-IDMEAYLARRIAERIGRGEARYLIQGTGA--GTPKQPKGLAASVT--------G 260 (409) T ss_pred eeeeeeeeeeehhh--hHHHHhccH-HHHHHHHHHHHHHHHHHHHHHHhhccCCC--CCccccceeeeccc--------c Confidence 10111111122234 444444432 22333333344556778899999998753 22223345543221 0 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHHHhcCC-ce-eEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGA-NF-KHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg-~~-~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) . . +.++...++-++|.+++..+-.+.. +. -.+++|+....++..+- |... |+.....-.. T Consensus 261 ~----------~---~~~~~~~~~~d~i~~l~~~l~~~~~~~a~~~~~~n~~~~~~l~~lk-d~~G---~~i~~~~~~~- 322 (409) T protein:vir:45 261 T----------T---QTAAANAVKWQEILALKHSIDPAYRRGPKFRLAFNDNTLKLISEME-DGQG---RPLWLPDIVG- 322 (409) T ss_pred c----------c---ccccccccchHHHHHHHHhhhhhhccCCeEEEEECHHHHHHHHHhh-cCCC---ceeeccCcCC- Confidence 0 0 1111224677778887776654321 22 24567998888888763 3321 2221111000 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccc----cccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAE----DKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~----~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) +. -.+=+| +.|+.+.+||........+++.|++..-+.. +.-.. .++.. .-+...+..+..+...+.+ T Consensus 323 -~~----~~~l~G-~PV~~~~~~p~~~~~~~~i~~Gd~~~~~i~~-~~~~~~~~~~d~~~-~~~~~~~~~~~r~d~~~~~ 394 (409) T protein:vir:45 323 -VA----PASVLN-VPYVIDQEIDDIGAGKKFMFCGDFDRFIIRR-VRYMILKRLVERYA-EYDQTGFLAFHRFDCILED 394 (409) T ss_pred -CC----Cceecc-eeeEEecCcCCccCCccEEEEeehhhhheee-ccceEEEEeecccc-cCCcEEEEEEEEeccEeec Confidence 00 011235 4788889999654434446667887643322 21111 11111 1123334445568889999 Q ss_pred chheeEEeccccccc Q lcl|NC_020858. 
314 EKGLGVAADLYGLTA 328 (330) Q Consensus 314 ~~a~g~i~gLt~~~~ 328 (330) |.|..+++.=..-.+ T Consensus 395 ~~A~~~l~~k~s~~~ 409 (409) T protein:vir:45 395 TSAIKALVGKGSVGG 409 (409) T ss_pred hhheEEEEeccCCCC Confidence 998876654322222 No 47 >protein:vir:97053 Length: 390 # NCBI annotation: putative head protein # Family: family:all:585 # MgeID: mge:1653 # MgeName: OP1 # Cross-refs: genbank:acc:YP_453565;genbank:gi:84662600;genbank:GeneID:5142468 Probab=97.70 E-value=1.2e-05 Score=47.60 Aligned_cols=277 Identities=11% Similarity=0.005 Sum_probs=137.6 Q ss_pred CCcccccee-eccc--cccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCc Q lcl|NC_020858. 1 MAVVTNTFQ-STGA--KGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSP 77 (330) Q Consensus 1 Ma~~t~~~~-t~~~--~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~ 77 (330) ++......+ +... ...-+.+.+.|...--...|+..++......+..+.|....-.++...-..||...+....... T Consensus 107 ~~~~~~~~~~~~~~~g~lip~~~~~~ii~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~~~ 186 (390) T protein:vir:97 107 KAALNTASTDAAGSAGALTTPNRLPGFITPPDARLTVRDLIGSGRTDSALIEYVQETGFVNNAAIVAEGALKPESSLKFA 186 (390) T ss_pred HHHHHhhhcccccccccccchhhhHHHHHHHhhhhhhHhhcceeeccCCceEEEEEecCCcceeeecCCcccccccccee Confidence 111110100 0111 1122344455555445566777776665555555566655433333233458887665543322 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ...-+... +...+.||. +.++.. .+..++=...-...+.+-+|.++|+|.... . ...||......... 
T Consensus 187 ~i~~~~~k-~~~~~~is~--ell~ds--~~l~~~i~~~la~a~~~~~d~a~l~G~g~~-~---~p~Gi~~~~~~~~~--- 254 (390) T protein:vir:97 187 KKTDTTHV-IAHTMKATR--QILSDA--PQLASYMNNRLIRGLKVKEDAEILRGTGAN-D---GLLGLIPQATTYAA--- 254 (390) T ss_pred EEEEeeee-EEEeehhhH--HHHHhH--HHHHHHHHHHHHHHHHHHHHHHHhhcCCCC-c---cccceeeccccccc--- Confidence 22222222 222344544 444443 243444444566778889999999985321 1 23354432210000 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) .+ +.....+.+.+.+++.++-.++.....+++||.....|.++ ++... ++...+.... T Consensus 255 ----------------~~-~~~~~~~~d~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l-kd~~G---~~l~~~~~~~- 312 (390) T protein:vir:97 255 ----------------PT-TIAGATRVDQLRLAMLQASLAEYPASGIVINPIDWAAIELA-KDANN---QYLIGNARGT- 312 (390) T ss_pred ----------------cc-cccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh-hcCCC---ceeecCccCC- Confidence 00 11113445667777777766777777789999999898876 44432 2222111000 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccc----cccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAE----DKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~----~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) ... .+--+.|+.+.+||++. +++.|++..-+-+.|.-.. .+.---+-+...+.+...+.+.+++ T Consensus 313 ~~~-------~l~G~pV~~~~~~~~~~-----~~~gd~~~~~~~~~~~~~~i~~~~~~~~f~~~~~~~r~~~r~d~~v~~ 380 (390) T protein:vir:97 313 LTP-------TLWGLPVVATQAMAPGE-----FLVGAFDLAAQIFDQWDARVEIGYVNDDFQRNMVTVLAEERLALVVYR 380 (390) T ss_pred CCc-------eecceeeEEcCCCCCCc-----EEEEeccceEEEEEecceEEEEeecccccccCcEEEEEEEeeccEEec Confidence 000 11124677888899654 5688887422222221111 1110012244556666779999999 Q ss_pred chheeEEeccc Q lcl|NC_020858. 
314 EKGLGVAADLY 324 (330) Q Consensus 314 ~~a~g~i~gLt 324 (330) |.|+.+|+ |+ T Consensus 381 ~~a~v~~~-~a 390 (390) T protein:vir:97 381 PEALITGS-FA 390 (390) T ss_pred cccEEEEE-eC Confidence 99986554 55 No 48 >protein:vir:485 Length: 407 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:11 # MgeName: P27 # Cross-refs: genbank:acc:NP_543092;swissprot:trembl:q8w627;genbank:gi:18249904;uniprot:Q8W627;genbank:GeneID:929693 Probab=97.50 E-value=3.8e-05 Score=44.82 Aligned_cols=289 Identities=12% Similarity=0.042 Sum_probs=136.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) |...+.+ ....-.-+++.+.|...-....|+.+........+..+..... ...+...-..||...+.... T Consensus 106 ~~~~t~~---~gG~~iP~~~~~~I~~~~~~~~~l~~~~~~~~~~~~~~~~~~~-~~~~~a~~v~E~~~~~~~~~------ 175 (407) T protein:vir:48 106 LQVGNDE---DGGYAIPEELDRTILTLLKDEVVMRQEATVITLGGSDYKKLVN-LGGTTSGWVGETDARPETAT------ 175 (407) T ss_pred hhcccCC---CCcccccHhHHHHHHHHHHhhhhhhhhceeeecCCCceEEEEe-cCCcceeeeccccccccccc------ Confidence 3222211 0111122466666776667777887765554443332222221 12222233457766543221 Q ss_pred cceEEEEee----eeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccc Q lcl|NC_020858. 81 GNYTQIMRK----SGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 81 ~N~tQIf~~----~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g 156 (330) ..+.||--. ..-|.=|.+.+.... .+..+|=..+-...+.+-+|.+||+|... + +.-||+.......... 
T Consensus 176 ~~f~~i~~~~~k~~~~~~iS~ell~ds~-~~l~~~i~~~l~~~i~~~~~~a~l~G~G~--~---~p~Gil~~~~~~~~~~ 249 (407) T protein:vir:48 176 SKLGLIEPFMGEIYGNPQATQKMLDDAF-FNVEDWINSELALEFAEQEEIAFTSGDGS--K---KPKGFLAYESTDEDDK 249 (407) T ss_pred ccceeEEeeeeeeEeehhhHHHHHhcch-HHHHHHHHHHHHHHHHHHHHhhhhccCCC--C---ccceeeeccccccccc Confidence 122232222 222223344444332 12223333344445678899999998643 2 2346554432111110 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec----C Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS----N 232 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~----~ 232 (330) +.. .+.....+.++...++-++|.+++.++..+......+++|+.....|..+ +|.+. |+... + T Consensus 250 ~~~--------~~~~~~~~~~~~~~~~~d~i~~l~~~l~~~~~~~a~~v~n~~~~~~L~~l-kD~~G---r~l~~~~~~~ 317 (407) T protein:vir:48 250 TRA--------FGKLQHIASGAASGVTADAIIKLIYTLRKAHRSGAKFMMNNSSLFAIRLL-KDNDG---NYLWRPGIEL 317 (407) T ss_pred ccc--------cccccccccccccccChHHHHHHHHhhchhhhcCCEEEEcHHHHHHHHHh-hccCC---ceeeccCcCC Confidence 000 00011112233345777888888777765543333578999998888875 44331 22211 1 Q ss_pred CcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchh-hhhcccCC-c-cccccccccccceeeEEEEEEEE Q lcl|NC_020858. 233 GKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRK-I-AEDKKVAKTGDAEKFMLIGEGAL 309 (330) Q Consensus 233 ~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~-~-~~~e~laKtGd~~k~~i~~E~tL 309 (330) +...+ =+| +.|+.+.+||........+++.|++. +.+ +.|. + ...++.+ +-+...+..+..+.. T Consensus 318 g~~~~----------l~G-~PV~~~~~~p~~~~~~~~i~~Gd~~~~~~i-~~~~~~~i~~d~~~-~~~~~~~~~~~r~d~ 384 (407) T protein:vir:48 318 GQPSS----------LAG-YGIVENEQMPDIAADAKAIAFGNFKRGYTI-VDRIGTRILRDPYT-NKPFVGFYTTKRTGG 384 (407) T ss_pred CCCce----------ecc-eeeEEecCcCCccCCccEEEEEeccccEEE-EEeeceEEEeeccc-cCCcEEEEEEEEecc Confidence 11111 123 46788888987544344466678763 222 1121 1 1222333 235556666667999 Q ss_pred EEecchheeEEeccccccccC Q lcl|NC_020858. 
310 KPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 310 e~~N~~a~g~i~gLt~~~~~~ 330 (330) .+.+|+|..++..=.-=++++ T Consensus 385 ~v~~~~a~~~l~~~aa~~~~~ 405 (407) T protein:vir:48 385 MLVDSQAIKLMKIGAATRQKA 405 (407) T ss_pred EEecccceEEEEeeccCCCCC Confidence 999999987654322222222 No 49 >protein:vir:4997 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:109 # MgeName: Sfi21 # Cross-refs: genbank:acc:NP_049971;genbank:gi:9632943;genbank:GeneID:1262106 Probab=97.45 E-value=4e-05 Score=44.67 Aligned_cols=269 Identities=7% Similarity=-0.044 Sum_probs=134.8 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceec--cceeeeeeeeccCccccccccccccccccccCce Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFD--TTHPEWTTDELAAPGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~--~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~ 78 (330) |+..+.+- ...-.-+.+.+.|...-....|++.+......+ +-...|....-..+...-..||+..+......-. T Consensus 109 ~~~~t~~~---gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~~~~ 185 (397) T protein:vir:49 109 KTDGSGSD---AGLTIPQDIRTAINTLVRQFDSLQEYVNVENVTTLTGSRVYEKWADITGLAKLDDEGGQIGQNDDPKLS 185 (397) T ss_pred hhccCCcc---CcceecHHHHHHHHHHHHhhhhHhhhcceeeccCCcceEEEEeeccCCcceeeecccccccccccccee Confidence 44332111 111122356666666667888887765543332 2223333222122222233577776543221111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) .+.=-..-+...+.||.- .+.... .+-..|=...-...+.+-+|.++|+|.... 
++ T Consensus 186 ~v~~~~~k~~~~~~iS~e--ll~ds~-~~l~~~i~~~l~~~~~~~~d~ail~G~g~~--~~------------------- 241 (397) T protein:vir:49 186 LIRYAIKRYAGISTVTNS--LLADSA-ENILAWLSGWIAKKVVVTRNKAILEAIGTL--PN------------------- 241 (397) T ss_pred eeEeeeeeeEeehhhHHH--HHhhhh-HHHHHHHHHHHHHHHHHHHHHHHHhccccc--cc------------------- Confidence 111112222344455543 333222 123333334445566778899999885321 00 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee----cCCc Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA----SNGK 234 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~----~~~~ 234 (330) ....++-+.|.+++-++-.++.....+++||....+|..+- +.+. |+.. .++. T Consensus 242 -------------------~~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~l~~lk-d~~g---~~l~~~~~~~g~ 298 (397) T protein:vir:49 242 -------------------KPTLAKWDDIIDLQAKVDPAIKQTSLFLTNTSGFTALKKVK-NAMG---DYLMERDVKSPT 298 (397) T ss_pred -------------------cccccCHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHhh-ccCC---ceeecccccCCC Confidence 00124556777777777767666678899999999999873 3331 2211 1111 Q ss_pred ceeEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchh-hhhcccCCccccccccccc-----cceeeEEEEEE Q lcl|NC_020858. 235 NNSIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKVAKTG-----DAEKFMLIGEG 307 (330) Q Consensus 235 ~~~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKtG-----d~~k~~i~~E~ 307 (330) . .+=+|. |.++.+.+||........+++.|+.. +.+.. |.-..-+...-.+ +...+.++.-+ T Consensus 299 ~----------~~l~G~pV~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~-~~~~~i~~~~~~~~~~~~~~~~~~~~~r~ 367 (397) T protein:vir:49 299 G----------YSIDGFVVKEISDRFLPNGTGGAMPLYFGDLKQAVTLFD-RQHLSLLSTNIGGGAFETDTTKVRVIDRF 367 (397) T ss_pred C----------ceecceeeEEecccccccccCCceeEEEeeccceEEEEe-ecccEEEEeccccchhhcCeeeEEEEEee Confidence 1 123453 56666788887766666688888763 43322 2211111111112 22334456668 Q ss_pred EEEEecchheeEEecccccc-----ccC Q lcl|NC_020858. 
308 ALKPKNEKGLGVAADLYGLT-----AST 330 (330) Q Consensus 308 tLe~~N~~a~g~i~gLt~~~-----~~~ 330 (330) ...+++|.|..+++--..-+ +++ T Consensus 368 d~~~~~~~a~~~~~~~~~~~~~~~~~~~ 395 (397) T protein:vir:49 368 DVVSTDTEAFVPASFKAIADQKAKLSTA 395 (397) T ss_pred ccEEecccceEEEEecccccccCccccc Confidence 89999999998875433322 222 No 50 >protein:vir:4159 Length: 315 # NCBI annotation: structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:87 # MgeName: psiM2 # Cross-refs: genbank:acc:NP_046968;genbank:gi:9630538;genbank:GeneID:1261712 Probab=97.44 E-value=2.6e-05 Score=45.68 Aligned_cols=289 Identities=9% Similarity=0.023 Sum_probs=125.5 Q ss_pred CCccccceeecccc-c-cccccceeeEecCCcccceeeeecccee-ccceeeeeeeeccCccc-cccccccccccccccC Q lcl|NC_020858. 1 MAVVTNTFQSTGAK-G-NREELADVVSRITPEDTPIYSMIEKVSF-DTTHPEWTTDELAAPGA-NITLEGDEYTFDATVS 76 (330) Q Consensus 1 Ma~~t~~~~t~~~~-g-~~edl~d~I~~i~p~dTP~~s~ig~~~~-~~~~~~W~td~L~~~~~-na~~EG~d~~~~~~~~ 76 (330) +-.++.++++.+.- | -.|+..+.+...--+.+||++.+..... .+..++...-.+..+-. ....+|...+..+..+ T Consensus 12 ~~~~~k~~t~~d~~Gg~l~P~~~~~~i~~~~e~s~~l~~~~vi~~~~~~~~~i~~~g~~~~~~~g~~~~~~~~~~~~~~~ 91 (315) T protein:vir:41 12 PFEIVPKIDVPDLGRGVLSVDRFGEFVKAVRDSAVIIPEARIDNALKSYEKDISRLSLVLDVGPGRDETGQKLAPPESTA 91 (315) T ss_pred hhhhhhhcCCcCCCCceechHHHHHHHHHHHhhhhhhhhceeeeccccccccccccccCcccccccccccCcCCCCCCcc Confidence 22222233322211 1 1233344444445567888887654221 22222211111111000 1111122111111111 Q ss_pred --ceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcC-CcccccchhHHHHHhccc Q lcl|NC_020858. 77 --PERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASV-GGATRESGSLPTWVKTNV 153 (330) Q Consensus 77 --~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~-~~~~r~~~Gi~~~i~tn~ 153 (330) ....-+ +.-+.--+.||.-.--=+..| .+--++=..+-.+++.+|+|.+|++|.-.. 
..-..+..|++......+ T Consensus 92 ~f~~~~l~-~~~l~~~~~it~elL~D~~~~-~~~e~~l~~~~a~~~a~~~~~~~~nGdg~s~~p~~~~~~G~l~~a~~~~ 169 (315) T protein:vir:41 92 EVKTNTLY-MREMVTKVVIHEDAIEDNIEG-KAFEQKIVTLLGEGISYVLEKYYLHGDTSSSDPLLRMSDGWLKLASEKL 169 (315) T ss_pred ccceeeec-eeeeeeeccccHHHHHhhhcc-ccHHHHHHHHHHHHHHHHHHHHhhccCCcCcCccccccccceecccccc Confidence 111111 122222244553332212223 244444444667789999999999996432 121245567655432221 Q ss_pred ccccccccccccccccccccccccccccccHHHHHHHHHHH---HhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 154 SRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQG---YQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 154 ~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~---~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) . ....++....++++.|.++++.+ |-+.+.--..+++.....++.++-.......-+-.. T Consensus 170 ~-----------------~~~~~~~a~~~~~d~l~~l~~sl~~~yr~~~~~~~~imn~~t~~~~rklk~~~g~~lw~~~~ 232 (315) T protein:vir:41 170 T-----------------ESDVDPEAEDWPMNLFDTMIESLPTPYRNNLPNMKFYVTWDIYRAYRDALKGRETGLGDQAL 232 (315) T ss_pred c-----------------ccccccccccccHHHHHHHHHhcChHHhhcCCceEEEEcHHHHHHHHHHhccCCCccccchh Confidence 1 01122333456788888887766 544433234678887778887775433221111111 Q ss_pred cCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc-cccccccccccceeeEEEEE--- Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI-AEDKKVAKTGDAEKFMLIGE--- 306 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~-~~~e~laKtGd~~k~~i~~E--- 306 (330) ..+... +=+| +.|+....||....-...+++-|+..+-+..-+.+ .+.+..++ ..+.+++.. T Consensus 233 ~~g~~~----------tl~G-~PV~~~~~m~~~~~~~~~ilf~d~~nl~~~~~~~i~i~~~~~a~---~~~~~~~~~~r~ 298 (315) T protein:vir:41 233 TGANSI----------LYDG-RPVQYVPALEALNDGKSRALFVVPTQLVYGFWRNIKVVPDYDAE---MRLTKYVASLRT 298 (315) T ss_pred hcCCCc----------eecc-cceEecccccccCCCCccEEEecccceEEEeccccEEEeeecCC---CCceEEEEEEEe Confidence 111111 1123 34455566765544445678889887644332221 11223333 233333332 Q ss_pred -EEEEEecchheeEEec Q lcl|NC_020858. 
307 -GALKPKNEKGLGVAAD 322 (330) Q Consensus 307 -~tLe~~N~~a~g~i~g 322 (330) +.+.+.|..+.+++.= T Consensus 299 d~~~~~~~~~a~~~~~v 315 (315) T protein:vir:41 299 DNHYEDEEGAVSATITV 315 (315) T ss_pred ceeEEeccceeEeeeeC Confidence 4456667655555544 No 51 >protein:vir:4830 Length: 397 # NCBI annotation: MPL-7201 # Family: family:all:21 # MgeID: mge:105 # MgeName: 7201 # Cross-refs: genbank:acc:NP_038327;genbank:gi:9634653;genbank:GeneID:1262632 Probab=97.43 E-value=5.3e-05 Score=44.02 Aligned_cols=271 Identities=7% Similarity=-0.021 Sum_probs=133.8 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcc--ccccccccccccccccCce Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPG--ANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~--~na~~EG~d~~~~~~~~~~ 78 (330) |+..+.+ ....-.-+++.+.|...-....|++++......++....+......... ..-..||+..+......-. T Consensus 109 ~~~~t~~---~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~~~~ 185 (397) T protein:vir:48 109 KTDASGS---DAGLTIPQDIQTAIHTLVRQYDSLQEYVNVENVTTLTGSRVYEKWADITGLAKLDDEAGSIGTNDDPKLY 185 (397) T ss_pred hhccCCc---cccccccHHHHHHHHHHHHHHHHHHhhhceeeccCCcceEEEEeecCCCcceeeecccccccccccccee Confidence 4333211 1111223467777777778888888876654433322222211111111 1223466665432211111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) .+.=-.+-+...+.||.. .+.... .+...|-...-...+.+-+|.++++|....... 
T Consensus 186 ~v~~~~~k~~~~~~iS~e--ll~ds~-~~l~~~v~~~l~~~~~~~~d~~il~G~g~~~~~-------------------- 242 (397) T protein:vir:48 186 PIRYAIKRYAGISTVTNS--LLADSA-ENILAWLSGWIAKKVVVTRNKAILEAIATLPTK-------------------- 242 (397) T ss_pred eEEeeheeeeeehhhHHH--HHhhch-HHHHHHHHHHHHHHHHHHHHHHHhhcccccccc-------------------- Confidence 111111222333445543 333222 123333334444556677899999875321100 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceee-eeeeeecCCccee Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVA-SFRYAASNGKNNS 237 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~-~~r~~~~~~~~~~ 237 (330) ....+.+.|.+++.++-.+......+++||....+|..+ ++.+.. ..+....++.. T Consensus 243 --------------------~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~L~~l-kd~~G~~i~~~~~~~~~~-- 299 (397) T protein:vir:48 243 --------------------PTLTKWDDIIDLQAKVDPAIKQTSFFLTNTSGFTALKKV-KNAFGDYLMERDVKSPTG-- 299 (397) T ss_pred --------------------cccccHHHHHHHHHHhhhhhcCCCEEEECHHHHHHHHHh-hcCCCceeeccCcCCCCC-- Confidence 002345567777767666655556789999999999886 333211 11111111111 Q ss_pred EEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchh-hhhcccCCccccc--ccc---ccccceeeEEEEEEEEE Q lcl|NC_020858. 238 IVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDK--KVA---KTGDAEKFMLIGEGALK 310 (330) Q Consensus 238 ~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e--~la---KtGd~~k~~i~~E~tLe 310 (330) .+=+|. |.++.+.+||..+.....+++.|+.. +.+.. +.-..-+ +.. -.-+...+..+..+... T Consensus 300 --------~~l~G~PV~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~-~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~ 370 (397) T protein:vir:48 300 --------YSIDGFAVKEVADRWLANASSGAMPLYFGDLKQAVTLFD-RQQMSLLSTNIGGGAFETDTTKIRVIDRFDVV 370 (397) T ss_pred --------ceeccceeEEecccccCCcCCCceEEEEEeccceEEEEe-ecceEEEEeccchhhhhcCceeEEEEeeeccE Confidence 112564 67777889988776677788889873 33332 2211111 110 11233455566678999 Q ss_pred EecchheeEEeccccccccC Q lcl|NC_020858. 
311 PKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 311 ~~N~~a~g~i~gLt~~~~~~ 330 (330) +++|.+..++. +...+.+. T Consensus 371 ~~~~~a~~~~~-~~~~~~~~ 389 (397) T protein:vir:48 371 ATDTESFVPAS-FKAIADQK 389 (397) T ss_pred EecccceEEEE-ecccccCC Confidence 99999886544 44444433 No 52 >protein:vir:8102 Length: 543 # NCBI annotation: gp6 # Family: family:all:21 # MgeID: mge:152 # MgeName: Che9c # Cross-refs: genbank:acc:NP_817683;genbank:gi:29566114;genbank:GeneID:1259308 Probab=97.31 E-value=9.8e-05 Score=42.56 Aligned_cols=284 Identities=10% Similarity=-0.074 Sum_probs=131.5 Q ss_pred CCccccceeeccccccccccceeeE-ecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVS-RITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~-~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) .+...+.-.+...+..-+++.+.|+ ..-....|+..+.......+. +.+....- .+...-..||+..+......... T Consensus 247 ~~~~~~~t~~~gg~lip~~~~~~ii~~~~~~~~~l~~~~~~~~~~g~-~~~~~~~~-~~~a~~v~Eg~~~~~~~~~~~~i 324 (543) T protein:vir:81 247 EVRAMGLTKADGGYLVPFQLDPTVIITSNGSLNDIRRFARQVVATGD-VWHGVSSA-AVQWSWDAEFEEVSDDSPEFGQP 324 (543) T ss_pred hhhhcccccccCcccCchhhhhHHHHHHHhhhchhhhhcccccCCcc-eEEEEecC-CcceeecccCcccccccccccee Confidence 1111111111111112235555443 332344566554433222222 22222221 22222345888776543332221 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) .-+... +...+.|| .+.+... .+-.++=...-...+.+-++.+||+|.. +.-+.-||+....... 
T Consensus 325 ~~~~~k-~~~~~~is--~ell~d~--~~~~~~i~~~l~~~~~~~~d~ail~G~G----t~~~p~Gi~~~~~~~~------ 389 (543) T protein:vir:81 325 EIPVKK-AQGFVPIS--IEALQDE--ANVTETVALLFAEGKDELEAVTLTTGTG----QGNQPTGIVTALAGTA------ 389 (543) T ss_pred eeeeee-eEeeehhh--HHHHhcc--HHHHHHHHHHHHHHHHHHHHHHHhccCC----CCcccccchhhccccc------ Confidence 111111 12233444 4555443 2555555555567788899999999853 2234455554321100 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeEE Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSIV 239 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~ 239 (330) ...+.++...++.+++.++...+=.+......+++||.....|..+- +... ++...+... T Consensus 390 ------------~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~l~~lk-d~~G---~~l~~~~~~---- 449 (543) T protein:vir:81 390 ------------AEIAPVTAETFALADVYAVYEQLAARHRRQGAWLANNLIYNKIRQFD-TQGG---AGLWTTIGN---- 449 (543) T ss_pred ------------ccccccccccccHHHHHHHHHhhhccccCCcEEEEcHHHHHHHHHhh-cCCC---ceeccCcCC---- Confidence 01122233357778888777776444333346889999999988863 3321 222111000 Q ss_pred EEEEEEEcCCeEEEEEEcCcCCCcccc---cc--EEEEEcchhhhhcccCCccccc---ccccccc----ceeeEEEEEE Q lcl|NC_020858. 240 ANADVYEGPFGKVMIHPNRVMAGSGAL---AR--NAFFVDPEFLQFGWLRKIAEDK---KVAKTGD----AEKFMLIGEG 307 (330) Q Consensus 240 ~~v~~~~tdfG~v~iv~nR~mp~~~~~---a~--~~~~ld~~~~~~~~Lr~~~~~e---~laKtGd----~~k~~i~~E~ 307 (330) .. -.+=+| +.|+...+||..... ++ .+++.|++.+-+.. +.-..-+ ....+-+ ...+..+..+ T Consensus 450 g~---~~~l~G-~pv~~~~~~~~~~~~~~~~~~~~i~~gd~~~~~i~~-~~~~~i~~~~~~~~~~~~~~~~~~~~~~~r~ 524 (543) T protein:vir:81 450 GE---PSQLLG-RPVGEAEAMDANWNTSASADNFVLLYGNFQNYVIAD-RIGMTVEFIPHLFGTNRRPNGSRGWFAYYRM 524 (543) T ss_pred CC---Cccccc-eeeEEeccccccccccccCCcceEEEeeccceeEEe-ecccEEEEeccccccchhhcCceEEEEEEee Confidence 00 011245 477777888865321 11 26677887655542 2111111 1111111 2345556668 Q ss_pred EEEEecchheeEEeccccccccC Q lcl|NC_020858. 
308 ALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 308 tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) ++.++||.|+.++.-=+. . T Consensus 525 d~~v~~~~A~~~l~~~~~----a 543 (543) T protein:vir:81 525 GADVVNPNAFRLLNVETA----S 543 (543) T ss_pred ccEeecccceEEEEeccc----C Confidence 999999999866553221 1 No 53 >protein:vir:4953 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:108 # MgeName: Sfi19 # Cross-refs: genbank:acc:NP_049929;genbank:gi:9632900;genbank:GeneID:1262076 Probab=97.30 E-value=0.0001 Score=42.50 Aligned_cols=267 Identities=8% Similarity=-0.031 Sum_probs=131.9 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeee--eccCccccccccccccccc-cccCc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTD--ELAAPGANITLEGDEYTFD-ATVSP 77 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td--~L~~~~~na~~EG~d~~~~-~~~~~ 77 (330) |+..+.+ ......-+.+.+.|...-....|+.+++.....++....+.-. .-..+...-..||+..+.. ..... T Consensus 109 ~~~~t~~---~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~~~~ 185 (397) T protein:vir:49 109 KTDASGS---DAGLTIPQDIQTAIHTLVSQYDSLQEYVNVENVTTLTGSRVYEKWTDITGLANIDDEAGKIADVDDPKLS 185 (397) T ss_pred hhccccc---cCcccccHhHHHHHHHHHHhhhhHHhhhceeecccCccceEEEeeccCCcceeeecCcccccccccccee Confidence 4433311 1111122467777777778888988887665443322222111 1111111223577665432 11111 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ...-+ .+-+...+.|| .+.+.... .+-.+|-...-...+.+-+|.++|+|....... 
T Consensus 186 ~i~~~-~~k~~~~~~iS--~ell~ds~-~~l~~~i~~~l~~~~~~~~d~ai~~G~g~~~~~------------------- 242 (397) T protein:vir:49 186 LIKYT-IKRYAGISTVT--NSLLADSA-ENILAWLSGWIAKKVVVTRNKAILEAIAALPTK------------------- 242 (397) T ss_pred eEEee-eeeEEeeehhH--HHHHhhhH-HHHHHHHHHHHHHHHHHHHHHHHHhhccccccc------------------- Confidence 11111 12222333444 33443332 223333344445566788999999885421100 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec----CC Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS----NG 233 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~----~~ 233 (330) ....+.+.|.+++.++-.+......+++||....+|..+ ++.+. |+... ++ T Consensus 243 ---------------------~~~~~~d~i~~~~~~l~~~~~~~a~~vmn~~~~~~l~~l-kd~~G---~~l~~~~~~~~ 297 (397) T protein:vir:49 243 ---------------------PTLTKWDDIIDLEAKVDPAIKQTSFFLTNTSGFTALKKV-KNALG---DYLMERDVKSP 297 (397) T ss_pred ---------------------cccccHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHh-hcCCC---ceeeccCcCCC Confidence 001345667777777766665556789999999999886 44331 22211 11 Q ss_pred cceeEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchh-hhhcccCCccccccccccc-----cceeeEEEEE Q lcl|NC_020858. 234 KNNSIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKVAKTG-----DAEKFMLIGE 306 (330) Q Consensus 234 ~~~~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKtG-----d~~k~~i~~E 306 (330) .. -+=+|. |.++.+.+||..+.....+++.|+.. +.+..-.+. .-+-..-++ +...+.++.. T Consensus 298 ~~----------~~l~G~PV~~~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~~~~-~i~~~~~~~~~~~~~~~~~r~~~r 366 (397) T protein:vir:49 298 TG----------YSIDGFAVKEVADRWLANGTGGAMPLYFGDLKQAVTLFDRQHM-SLLSTNIGGGAFETDTTKVRVIDR 366 (397) T ss_pred CC----------ceecceeeEEecccccccccCCceeEEEeeccceEEEEeecce-EEEEeccccchhhcCceeEEEEee Confidence 11 112453 55666888988776556678888773 333221121 111111122 2234455667 Q ss_pred EEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
307 GALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 307 ~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) +...+.+|.|..++. ++....++ T Consensus 367 ~d~~~~~~~a~~~~~-~~~~~~~~ 389 (397) T protein:vir:49 367 FDVVATDTEAFVPAS-FKAIADQK 389 (397) T ss_pred eCcEEecccceEEEE-eecccCCC Confidence 889999998886654 33333322 No 54 >protein:vir:2430 Length: 318 # NCBI annotation: major head subunit # Family: family:all:507 # MgeID: mge:52 # MgeName: D29 # Cross-refs: genbank:acc:NP_046832;genbank:gi:9630400;genbank:GeneID:1261582 Probab=97.13 E-value=0.00016 Score=41.42 Aligned_cols=286 Identities=12% Similarity=0.048 Sum_probs=130.0 Q ss_pred CCccccceeeccccc-cccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKG-NREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g-~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) |+..+.+ +.-+ .-+.+.+.|...-....|+..+.......+..+.+....- .+..--..||++.+..... T Consensus 14 ~~~~~~~----~~~~~ip~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~ip~~~~-~~~a~~v~Eg~~~~~~~~~---- 84 (318) T protein:vir:24 14 IAQTGDT----MFKGYLEPEQAKDYFAEAEKTSIVQQFAQKVPMGTTGQKIPHWVG-DVSAQWIGEGDMKPITKGN---- 84 (318) T ss_pred hhcccCc----ccceeechhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEeC-CcceEEecCCccccccccc---- Confidence 3332211 1111 1224555555554666788777655554444444332221 1221222477776554322 Q ss_pred ecceEEEEee-eeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 80 LGNYTQIMRK-SGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 80 ~~N~tQIf~~-~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) +..++--.+| ...+.=|.+.+.... .+-.++=...-.+.+.+.+|.++|+|.... .+ .|+..... 
T Consensus 85 f~~i~~~~~k~~~~~~iS~e~l~ds~-~~~~~~i~~~l~~~~~~~~d~a~l~G~g~~--~~---~~~~~~~~-------- 150 (318) T protein:vir:24 85 MTSQTIAPHKIATIFVASAETVRANP-ANYLGTMRTKVATAFAMAFDGAAMHGTDSP--FP---TYIGQTTK-------- 150 (318) T ss_pred eeEEEEeeEEEEEeehhhHHHhhcCh-HHHHHHHHHHHHHHHHHHHHHhhhcccCCC--CC---cccccccc-------- Confidence 2222211122 122222334444322 233344444555668999999999986421 11 12211110 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceee-eeeeeecCCccee Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVA-SFRYAASNGKNNS 237 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~-~~r~~~~~~~~~~ 237 (330) +. ......+......+.+.+++..+-.++.....+++||..+..+..+ ++.+.. ..+....++.... T Consensus 151 ~~-----------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l-kd~~G~~l~~~~~~~~~~~~ 218 (318) T protein:vir:24 151 AI-----------SIADTTGATTVYDQVAVNGLSLLVNDGKKWTHTLLDDITEPILNGA-KDQNGRPLFIESTYGEAASP 218 (318) T ss_pred cc-----------cccccccccchHHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh-hccCCceeecCccccCcccc Confidence 00 0000111112333445555555555555556789999999999875 443321 1110011111100 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---cccccccc-------------cccceee Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKKVAK-------------TGDAEKF 301 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~laK-------------tGd~~k~ 301 (330) +.. .+.--+.++..+.+|.+.. .+++.|++.+-+..-.+. ...+.--+ +=|...+ T Consensus 219 ~~~------~~i~g~pv~~~~~~~~~~~---~~~~gdfs~~~~~~~~~l~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~ 289 (318) T protein:vir:24 219 FRS------GRIVARPTILSDHVVEGTT---VGFMGDFSQLIWGQIGGLSFDVTDQATLNLGTVESPNFVSLWQHNLVAV 289 (318) T ss_pred ccC------ceEEEEeeEEeCCCCCCcc---EEEEeecceEEEEEecCeEEEEeeccceeccccccccchhhhhcCcEEE Confidence 000 0111246777788886643 346667776544321111 00111001 1134555 Q ss_pred EEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
302 MLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 302 ~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) .+...+...+++|+|..+|++.+-=+..- T Consensus 290 r~~~r~d~~v~~~~a~~~i~~~~a~~~~~ 318 (318) T protein:vir:24 290 RVEAEYAFHCNDAEAFVALTNVVSGGGEG 318 (318) T ss_pred EEEEEEccEEecccceEEEEeeccCCCCC Confidence 66778999999999988865544322222 No 55 >protein:vir:78523 Length: 338 # NCBI annotation: Putative head structural protein # Family: family:all:507 # MgeID: mge:1853 # MgeName: U2 # Cross-refs: genbank:acc:YP_001491585;genbank:gi:157786408;genbank:GeneID:5625675 Probab=97.03 E-value=0.0002 Score=40.86 Aligned_cols=295 Identities=8% Similarity=-0.002 Sum_probs=139.8 Q ss_pred CCccc---cceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcc-----ccc--cccccccc Q lcl|NC_020858. 1 MAVVT---NTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPG-----ANI--TLEGDEYT 70 (330) Q Consensus 1 Ma~~t---~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~-----~na--~~EG~d~~ 70 (330) |+.-+ +.+++.....--+.+.+.|...-....|+..+.......+..+.....+..+.+ ..+ ..||...+ T Consensus 10 ~~~~~~~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~ip~~~~~~~a~~v~~~~~~~~~Eg~~~~ 89 (338) T protein:vir:78 10 NTAGSNHQGRLAHVPSDLLPKEIVGPIFDKAQESSLVLRLGENIPISYGETIIPTTVKRPEVGQVGVGTSNEQREGGTKP 89 (338) T ss_pred hhcccccccceecccccccchHHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEecCccceeeccccccccccccccc Confidence 22222 122222222233456677777778888888876665555544444443332211 122 24666655 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ..........-+... +..-+.|| .+.+.... .+-.++=..+-.+.+.+.+|.++|+|..... +-...|+..... 
T Consensus 90 ~~~~~f~~v~l~~~k-~~~~~~is--~ell~ds~-~~~~~~i~~~la~a~~~~~d~~~l~G~g~~~--~~~~~gi~~~~~ 163 (338) T protein:vir:78 90 LSGTAWDTRSVAPIK-LATIVTVS--EEFARMNP-SGLYTKLQADLAYAIGRGIDLAVFHGKSPLT--GSALQGIDTNNV 163 (338) T ss_pred ccccceeEEEEEEEE-EEEeehhh--HHHHhcCH-HHHHHHHHHHHHHHHHHHHHHHhhcccCCCc--cccccccccccc Confidence 433222111111111 12223333 33333322 2333444445566789999999999875321 112233332221 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeChHHHHHHHHhh--ccceee-ee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSPYVKSVFVTFM--SDTNVA-SF 226 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~~~k~~is~f~--~~~~~~-~~ 226 (330) ..... ......+......+.|.+++.++-.+. .....++++|.....+.++. +|.... .. T Consensus 164 ~~~~~----------------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~m~~~~~~~L~~~~~l~d~~g~~l~ 227 (338) T protein:vir:78 164 IVNTT----------------NVDYLQTGTTPLLDRFLDGYDLVSANTDVDFNGWAADPRYRARLLRSQAYRDANGNVDP 227 (338) T ss_pred ccccc----------------ccccccccchhhHHHHHHHHHHhhhhccccceEEEEchHHHHHHHHHhhhccCCCceee Confidence 10000 001111222344566777766665433 24556889998888776653 222210 00 Q ss_pred eeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCcccc----ccEEEEEcchhhhhcccCCcc---cccccc------ Q lcl|NC_020858. 227 RYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGAL----ARNAFFVDPEFLQFGWLRKIA---EDKKVA------ 293 (330) Q Consensus 227 r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~----a~~~~~ld~~~~~~~~Lr~~~---~~e~la------ 293 (330) +.....+... +=+| +.++.+.+||..... ...+++.|+++..+..-.++. ..+..- T Consensus 228 ~~~~~~~~~~----------~l~G-~PV~~~~~ip~~~~~~~~~~~~~~~gdfs~~~~~~~~~~~i~~~~~~~~~~~~~~ 296 (338) T protein:vir:78 228 TRINLAASAG----------DLLG-LPVQFGKAVGGDLGAATDSKVRVVGGDFSQLKYGFADEIRVKMSDTATLTDNTSP 296 (338) T ss_pred cccccCCCCc----------eeee-eeEEEccccCccccccCCcccEEEEEecceEEEEeecccEEEEeecccccccccc Confidence 0001111110 1123 477888899864321 134777888876554322210 001110 Q ss_pred -------ccccceeeEEEEEEEEEEecchheeEEeccccccc Q lcl|NC_020858. 
294 -------KTGDAEKFMLIGEGALKPKNEKGLGVAADLYGLTA 328 (330) Q Consensus 294 -------KtGd~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~ 328 (330) ..=|...+..+..+...+.+|.|..+|++-+-=++ T Consensus 297 ~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~~~~ 338 (338) T protein:vir:78 297 TPQTVSMWQTNQIAILIEVTFGWLLGDKQAFVKFVDDEDPDA 338 (338) T ss_pred cccchhhhhcCcEEEEEEEEeccEeecccceEEEecccCCCC Confidence 11133445556668899999999999887555444 No 56 >protein:vir:2504 Length: 305 # NCBI annotation: major capsid subunit gp9 # Family: family:all:507 # MgeID: mge:53 # MgeName: TM4 # Cross-refs: genbank:acc:NP_569745;genbank:gi:18496895;genbank:GeneID:932268 Probab=97.00 E-value=0.00022 Score=40.65 Aligned_cols=285 Identities=15% Similarity=0.052 Sum_probs=137.0 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccC-ceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVS-PER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~-~~~ 79 (330) ||..+.+-- ...--+++.+.|...--..+|+.++.......+....+...+-.+ ...-..||+..+...... ... T Consensus 1 ma~~t~~~g---g~liP~~~~~~Ii~~~~~~s~l~~l~~~~~~~~~~~~~p~~~~~~-~a~wv~E~~~~~~~~~~~s~~~ 76 (305) T protein:vir:25 1 MADISRAEV---ASLIQEAYSDTLLAAAKQGSTVLSAFQNVNMGTKTTHLPVLATLP-EADWVGESATDPKGVKPTSKVT 76 (305) T ss_pred CCCccCCcc---ceecCHHHHHHHHHHHHhhchhhhhcceeeccCCcEEEEEEeCCc-ceEEeecccccccccccccccc Confidence 888774311 122344677777776677889988876666655556665544332 222235776655433211 112 Q ss_pred ecceE-EEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 80 LGNYT-QIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 80 ~~N~t-QIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) +..+. -.++=...+.=|.+.++... .+-.++=...-.+.+.+-+|.++|+|.-...+ ...-++..... 
T Consensus 77 f~~i~~~~~k~~~~~~is~ell~ds~-~~~~~~i~~~l~~~~a~~~d~a~~~G~g~~~~--~~~~~~~~~~~-------- 145 (305) T protein:vir:25 77 WANRTLVAEEIAVIIPVHENVIDDAT-VAVLTEVAELGGQAIGKKLDQAVIFGTDKPAS--WVSPALIPAAV-------- 145 (305) T ss_pred eeeEEeeeEEEEEeehhhHHHHhcch-HHHHHHHHHHHHHHHHHHHhhhheeccCCCCC--ccccccccccc-------- Confidence 22211 11222233333444444332 23334444455678899999999998632111 00001000000 Q ss_pred cccccccccccccccccccccccccHHHHHHHHH----HHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCc Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQ----QGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGK 234 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~----~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~ 234 (330) .. . .. ...+....+..++.+.+. .+..++...+.+++++.....+.++ +|.+ .|+...++ T Consensus 146 ~~-------~---~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~l-kd~~---G~~i~~~~- 209 (305) T protein:vir:25 146 TA-------G---QA-VEVVGGVANESDIVGATNRAAKAVASAGWAPDTLLSSLALRYEVANI-RDAN---GNPVFRDD- 209 (305) T ss_pred cc-------c---cc-ccccccchhhhHHHHHHHHHHHhhhhcccccceeEecHHHHHHHHHh-hccC---CceeecCC- Confidence 00 0 00 001111223333333333 3334454555688999998888876 4433 12222111 Q ss_pred ceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---cccccccccc---------cceeeE Q lcl|NC_020858. 235 NNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKKVAKTG---------DAEKFM 302 (330) Q Consensus 235 ~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~laKtG---------d~~k~~ 302 (330) +=+| +.++.+.++|.+.. ...+++.|++...+..-.+. ...+-.-+.+ |..... T Consensus 210 ------------~l~G-~Pv~~~~~~~~~~~-~~~~~~gd~s~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~R 275 (305) T protein:vir:25 210 ------------SFAG-FRTFFNRNGAWDAD-AAIEVIADSSRVKIGVRQDITVKFLDQATLGTGENQINLAERDMVALR 275 (305) T ss_pred ------------cccc-cceEEcCccCCCCC-ccEEEEEecceEEEEEecCeEEEEeeeeeeecCCceeeeeecCcEEEE Confidence 1122 34556667775432 23567788877654331121 0011001111 223333 Q ss_pred EEEEEEEEEecchheeEEeccc-c-ccccC Q lcl|NC_020858. 
303 LIGEGALKPKNEKGLGVAADLY-G-LTAST 330 (330) Q Consensus 303 i~~E~tLe~~N~~a~g~i~gLt-~-~~~~~ 330 (330) .+..+++.+.||++..+++++- + .+.++ T Consensus 276 ~~~r~~~~v~~p~a~v~~~~~~~~~~~pa~ 305 (305) T protein:vir:25 276 LKARFAYVLGVSATAQGANKTPVAVVAPAA 305 (305) T ss_pred EEEeecceeeCcccEEEEccccccccCCCC Confidence 4556899999999998888763 1 22333 No 57 >protein:vir:101650 Length: 497 # NCBI annotation: gp13 # Family: family:all:585 # MgeID: mge:1515 # MgeName: 244 # Cross-refs: genbank:acc:YP_654768;genbank:gi:109302766;genbank:GeneID:4156084 Probab=96.99 E-value=9.9e-05 Score=42.53 Aligned_cols=304 Identities=13% Similarity=0.035 Sum_probs=136.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) |+..+ ++......-+++...|+..-....|+..++......+....|....-..+...-..||...+..... + T Consensus 151 ~~~~~---~~~gg~~vp~~~~~~ii~~~~~~~~i~~l~~~~~~~~~~~~~~~~~~~~~~a~wv~E~~~~~~s~~~----f 223 (497) T protein:vir:10 151 NPFGS---TGTFAPGILPTFLPGIVEQLFYELSLADLISSRPVTSPNLSYLTESAAHNNAAAVAEAGTYPFSSEE----F 223 (497) T ss_pred hhccc---CcccccccchhhhHHHHHHHHhhhhHHhhccccccCCCceEEEEEcCCCCcceeeccCccccccccc----c Confidence 22211 1112223345666666665566778888776666555556665544433333344588776543322 2 Q ss_pred cceEEEEeeee-eehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccc------ Q lcl|NC_020858. 81 GNYTQIMRKSG-IISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNV------ 153 (330) Q Consensus 81 ~N~tQIf~~~v-~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~------ 153 (330) ..++-=.+|.. -+.=|.+.++.. .+.-+|=..+-...|.+-+|.+||+|.-.. +.-||+..-.... 
T Consensus 224 ~~i~~~~~k~a~~~~iS~ell~d~--~~l~~~i~~~l~~~i~~~~d~~~l~G~G~~-----~p~Gil~~~~~~~~~~~~~ 296 (497) T protein:vir:10 224 ARVYEQVGKVANALTITDEGLRDA--PELFNFVQGRLLEGIQRKEEVQLLAGGGYP-----GVNGLLQRSTGFTASSASS 296 (497) T ss_pred eeeEeeeeeeEeecHhHHHHHHhH--HHHHHHHHHHHHHHHHHHHHHHhhcCCCcc-----ccccccccccccccccccc Confidence 22222222211 122233333333 232233334455677888899999985321 1122221110000 Q ss_pred ------------ccccccccccc-----------------cccccccccccccccccccHHHHHHHHHHHHhcCC-ceeE Q lcl|NC_020858. 154 ------------SRGATGANGGY-----------------NTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGA-NFKH 203 (330) Q Consensus 154 ------------~~g~~g~~~~~-----------------~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg-~~~~ 203 (330) .....+..... ....+........ ........+...+..++..+. .++. T Consensus 297 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~ 375 (497) T protein:vir:10 297 LFGATSATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYP-TAAEIAENVFDAFVDIQLTLFQTPNA 375 (497) T ss_pred chhhhhhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhcccc-chhhhhhHHHHHHhhhhhhcccCCCe Confidence 00000000000 0000000000000 001222334445555555544 4556 Q ss_pred EEeChHHHHHHHHhhccceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhc-c Q lcl|NC_020858. 204 VFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFG-W 282 (330) Q Consensus 204 i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~-~ 282 (330) +++||....++..+ +|.+. |+...+......+..+..-.+=+| +.|+.+..||.+. .++.|.+...+. + T Consensus 376 ~vmn~~~~~~l~~l-kd~~G---~~i~~~~~~~~~~~~~~~~~~l~G-~pV~~t~~~~~~~-----~~~Gd~~~~~~~i~ 445 (497) T protein:vir:10 376 VVMNPRDWELLRLT-KDANG---QYMGGNFFGNAYGNPVNGGKNIWG-VPVVTTPLIPLGT-----ILVGHFAPSVIQTA 445 (497) T ss_pred EEEchHHHHHHHHh-hcCCC---ceeccCcccccccccccCCceeec-eeeEecCCCCCCc-----eEEeecccceEEEE Confidence 88999988888876 44331 222111100000000000012234 5677788888654 456666532221 1 Q ss_pred cCCccccccccccc-----cceeeEEEEEEEEEEecchheeEEecccccccc Q lcl|NC_020858. 
283 LRKIAEDKKVAKTG-----DAEKFMLIGEGALKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 283 Lr~~~~~e~laKtG-----d~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~ 329 (330) .|.-..-+-....+ +...+.++.++++.|++|.|+.++.--.+-++| T Consensus 446 ~r~~~~v~~~~~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~~~~~~ 497 (497) T protein:vir:10 446 RREGVTMQMTNSNGTDFVDGKVTVRAEERLGLLVYRPSAFQLIQLKKGATGS 497 (497) T ss_pred EecccEEEeecccchhhhcCcEEEEEEEeecceeeccccEEEEEecCCccCC Confidence 12211111111112 334455557799999999999999998888888 No 58 >protein:vir:7855 Length: 497 # NCBI annotation: gp12 # Family: family:all:585 # MgeID: mge:150 # MgeName: CJW1 # Cross-refs: genbank:acc:NP_817462;genbank:gi:29565891;genbank:GeneID:1259081 Probab=96.99 E-value=9.9e-05 Score=42.53 Aligned_cols=304 Identities=13% Similarity=0.035 Sum_probs=136.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) |+..+ ++......-+++...|+..-....|+..++......+....|....-..+...-..||...+..... + T Consensus 151 ~~~~~---~~~gg~~vp~~~~~~ii~~~~~~~~i~~l~~~~~~~~~~~~~~~~~~~~~~a~wv~E~~~~~~s~~~----f 223 (497) T protein:vir:78 151 NPFGS---TGTFAPGILPTFLPGIVEQLFYELSLADLISSRPVTSPNLSYLTESAAHNNAAAVAEAGTYPFSSEE----F 223 (497) T ss_pred hhccc---CcccccccchhhhHHHHHHHHhhhhHHhhccccccCCCceEEEEEcCCCCcceeeccCccccccccc----c Confidence 22211 1112223345666666665566778888776666555556665544433333344588776543322 2 Q ss_pred cceEEEEeeee-eehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccc------ Q lcl|NC_020858. 81 GNYTQIMRKSG-IISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNV------ 153 (330) Q Consensus 81 ~N~tQIf~~~v-~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~------ 153 (330) ..++-=.+|.. -+.=|.+.++.. .+.-+|=..+-...|.+-+|.+||+|.-.. +.-||+..-.... 
T Consensus 224 ~~i~~~~~k~a~~~~iS~ell~d~--~~l~~~i~~~l~~~i~~~~d~~~l~G~G~~-----~p~Gil~~~~~~~~~~~~~ 296 (497) T protein:vir:78 224 ARVYEQVGKVANALTITDEGLRDA--PELFNFVQGRLLEGIQRKEEVQLLAGGGYP-----GVNGLLQRSTGFTASSASS 296 (497) T ss_pred eeeEeeeeeeEeecHhHHHHHHhH--HHHHHHHHHHHHHHHHHHHHHHhhcCCCcc-----ccccccccccccccccccc Confidence 22222222211 122233333333 232233334455677888899999985321 1122221110000 Q ss_pred ------------ccccccccccc-----------------cccccccccccccccccccHHHHHHHHHHHHhcCC-ceeE Q lcl|NC_020858. 154 ------------SRGATGANGGY-----------------NTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGA-NFKH 203 (330) Q Consensus 154 ------------~~g~~g~~~~~-----------------~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg-~~~~ 203 (330) .....+..... ....+........ ........+...+..++..+. .++. T Consensus 297 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~ 375 (497) T protein:vir:78 297 LFGATSATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYP-TAAEIAENVFDAFVDIQLTLFQTPNA 375 (497) T ss_pred chhhhhhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhcccc-chhhhhhHHHHHHhhhhhhcccCCCe Confidence 00000000000 0000000000000 001222334445555555544 4556 Q ss_pred EEeChHHHHHHHHhhccceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhc-c Q lcl|NC_020858. 204 VFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFG-W 282 (330) Q Consensus 204 i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~-~ 282 (330) +++||....++..+ +|.+. |+...+......+..+..-.+=+| +.|+.+..||.+. .++.|.+...+. + T Consensus 376 ~vmn~~~~~~l~~l-kd~~G---~~i~~~~~~~~~~~~~~~~~~l~G-~pV~~t~~~~~~~-----~~~Gd~~~~~~~i~ 445 (497) T protein:vir:78 376 VVMNPRDWELLRLT-KDANG---QYMGGNFFGNAYGNPVNGGKNIWG-VPVVTTPLIPLGT-----ILVGHFAPSVIQTA 445 (497) T ss_pred EEEchHHHHHHHHh-hcCCC---ceeccCcccccccccccCCceeec-eeeEecCCCCCCc-----eEEeecccceEEEE Confidence 88999988888876 44331 222111100000000000012234 5677788888654 456666532221 1 Q ss_pred cCCccccccccccc-----cceeeEEEEEEEEEEecchheeEEecccccccc Q lcl|NC_020858. 
283 LRKIAEDKKVAKTG-----DAEKFMLIGEGALKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 283 Lr~~~~~e~laKtG-----d~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~ 329 (330) .|.-..-+-....+ +...+.++.++++.|++|.|+.++.--.+-++| T Consensus 446 ~r~~~~v~~~~~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~~~~~~ 497 (497) T protein:vir:78 446 RREGVTMQMTNSNGTDFVDGKVTVRAEERLGLLVYRPSAFQLIQLKKGATGS 497 (497) T ss_pred EecccEEEeecccchhhhcCcEEEEEEEeecceeeccccEEEEEecCCccCC Confidence 12211111111112 334455557799999999999999998888888 No 59 >protein:vir:99424 Length: 360 # NCBI annotation: hypothetical protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:1595 # MgeName: BJ1 # Cross-refs: genbank:acc:YP_919080;genbank:gi:119757038;genbank:GeneID:4606077 Probab=96.93 E-value=0.00025 Score=40.33 Aligned_cols=306 Identities=12% Similarity=0.071 Sum_probs=146.7 Q ss_pred CCcc-ccceeec--cccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcc--cccccc-ccccccccc Q lcl|NC_020858. 1 MAVV-TNTFQST--GAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPG--ANITLE-GDEYTFDAT 74 (330) Q Consensus 1 Ma~~-t~~~~t~--~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~--~na~~E-G~d~~~~~~ 74 (330) +..+ =.+++.. +.---+++..+.+.+.--+.+|||+.+...+..+...+= +.|.-.. -.++-| |.+...+.. T Consensus 15 ~~~i~k~~it~~~l~~g~L~p~~a~~Fl~~v~~~t~iL~~~r~~~~~s~~~ei--~kig~G~r~~r~~~e~~~~~~~~~~ 92 (360) T protein:vir:99 15 MNSLSQKDIGLAELDGFQLPVDVTEEFLERMQKGVQILGMADTMTLARLEMEV--PQFGVPRLSGHTRDEEGSRTENSEA 92 (360) T ss_pred HHHHHhhhccccccCceeecHHHHHHHHHHHhhccchhhhcceeecccccccc--cccccceeeccccccCCCCCcCCcC Confidence 1111 0000100 111134466677777777899999998776655544331 2222111 122223 222222222 Q ss_pred cCceEecceEEEEeeeeeehhHHHHHhh--ccccchHHHHH-HHHHHHHHHHHHHHHhcCCCcC-----Ccc----cccc Q lcl|NC_020858. 75 VSPERLGNYTQIMRKSGIISGTQNITDE--AGRATKVKEQK-LKKGVELRKDVEFSIVATNASV-----GGA----TRES 142 (330) Q Consensus 75 ~~~~~~~N~tQIf~~~v~VS~T~~av~~--~G~~~e~a~q~-~k~~~EikrD~E~a~i~g~~~~-----~~~----~r~~ 142 (330) ..+...-|.+.=++-.+.+ +.+.+.. 
.-.+.++..-+ ..=.+.+.+|||...++|.+.. +++ -..+ T Consensus 93 ~~~~v~~~~~~~~~~~~~i--~~~~~~~n~~~~~~~f~~~i~~~~ae~~~~Dle~l~~~g~~ds~d~~~~~~~d~fl~~~ 170 (360) T protein:vir:99 93 ESGSVKFNATDKSYYILVE--PKRDALKNTHYGPDQFGDYIVDQFIERYGNDLGLMGIRAGASSGNLQSIGGAAELDNTF 170 (360) T ss_pred ccccCccccccceeeEeec--hHHHHHhhhhcccchhHHHHHHHHHHHHHHHHHHHHhhccchhcccccCcccchhhhhh Confidence 1111112222212222222 2332221 22222333332 3344678999999999987653 122 2568 Q ss_pred hhHHHHHhcccccc--cc-----ccccccccccccc----ccccccccccccHHHHHHHHHHHHh---cCC--ceeEEEe Q lcl|NC_020858. 143 GSLPTWVKTNVSRG--AT-----GANGGYNTGTGLT----VAPTDGTQRAFSKAIMDDVMQQGYQ---SGA--NFKHVFV 206 (330) Q Consensus 143 ~Gi~~~i~tn~~~g--~~-----g~~~~~~~~~~~~----~~~t~gt~~~lTe~~l~~~~~~~~~---~Gg--~~~~i~v 206 (330) .|.+.....++..- ++ +..+....++... +.-+.|...++.+++|+++++.+=. ++- ++ .+++ T Consensus 171 dGwlKka~~~~~~id~a~d~t~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~lf~~~~~~Lp~kyr~~~~~~~-~~~~ 249 (360) T protein:vir:99 171 KGWIARAEGDAQSVDDAGDSTRIGLEDTATADADSMPSIANTDGSGNPQPVDTSLFNETIQTLDSRYRESDAYSP-VLMT 249 (360) T ss_pred HHHHHHhhcccchhhccccccccccccccccccccchhhhccccccccccchHHHHHHHHHhcchhhhcCcccce-EEEc Confidence 88777765443210 00 0000011111111 1123344567889999999988764 221 22 4566 Q ss_pred ChHHHHHHHHhhccceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc Q lcl|NC_020858. 207 SPYVKSVFVTFMSDTNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI 286 (330) Q Consensus 207 ~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~ 286 (330) ++.....-..+..+... .- |..--++. 
-..+++| +.|+.-++||.+ .+++.+|..+-+..=|-+ T Consensus 250 s~~~~~~yr~~L~~R~t------~L-Gd~~l~g~---~~~~~~G-ipi~~v~~~pd~-----~~mlT~p~NLi~g~~~~i 313 (360) T protein:vir:99 250 SPNQVQSYTMSLTERED------PL-GSAVIFGD---SDITPFS-YDLVGVNGFPDE-----YMMFTDPNNLAFGLYEEM 313 (360) T ss_pred cCchHHHHHHHHhccCc------cc-chhheecc---cccccce-eeeEEcCCCCCC-----ceEEeccCceeEEeeeee Confidence 76665555554433211 00 11111111 1345778 778888899965 467889998855422322 Q ss_pred ccccc------ccccccceeeEEEEEEEEEEecchheeEEeccccccc Q lcl|NC_020858. 287 AEDKK------VAKTGDAEKFMLIGEGALKPKNEKGLGVAADLYGLTA 328 (330) Q Consensus 287 ~~~e~------laKtGd~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~ 328 (330) +.+. .++.--+-++.+.+-+-..++++.|.++++||.-=.+ T Consensus 314 -ri~~~~e~~~~~~~~~~~~~~~~~~~D~~iee~~Av~~vt~~~~~~~ 360 (360) T protein:vir:99 314 -ELDQSTDTDKVHEQRLHSRNWLEGQFDFQIKEQQAGVLVTDLETPTA 360 (360) T ss_pred -EEeecccchhhhhhceeeeEEEEEEeeEEEEecccEEEEecCCCCCC Confidence 1111 1111112334444556677789999999999887777 No 60 >protein:vir:1025 Length: 408 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:20 # MgeName: bIL286 # Cross-refs: genbank:acc:NP_076679;genbank:gi:13095788;genbank:GeneID:920362 Probab=96.92 E-value=0.00026 Score=40.27 Aligned_cols=273 Identities=8% Similarity=0.020 Sum_probs=125.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccce--eeeeeeeccCccccccccccccccccccCce Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTH--PEWTTDELAAPGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~--~~W~td~L~~~~~na~~EG~d~~~~~~~~~~ 78 (330) |...+. +......-+.+.+.|...-....|++++.......+.. +.+....-..+...-..||+..+......-. 
T Consensus 116 ~~~~t~---~~gg~~vP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~~~~ 192 (408) T protein:vir:10 116 ETSGSD---SAAGLTIPQDIRTMINTLVRQYDSLQQYVRVESVSTSNGSRVYEKWTDVTPLTVMDAEDGKIPDLDNPQLT 192 (408) T ss_pred hhcccc---cCCceeccHhHHHHHHHHHHhhchhhhhcceeeccCCcceEEEeeccccccceeeecCccccccccCccee Confidence 221111 11111123467788888878888998887765544332 2221111111222233577665432111111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) .+.--..-+..-+.||.. .++... .+..+|=...-...+.+-+|.+||+|....... T Consensus 193 ~i~~~~~k~~~~~~iS~e--ll~ds~-~~l~~~i~~~l~~~~~~~~~~~il~g~g~~~~~-------------------- 249 (408) T protein:vir:10 193 IIKYLIKRYAGIITATNT--SLKDTA-ENILAWLSSWIAKKVVVTRNQAIIEVMKAAPKK-------------------- 249 (408) T ss_pred eEEeeeeeEEeeehhHHH--HHhhch-HHHHHHHHHHHHHHHHHHHHHHHhhcccccccc-------------------- Confidence 111112223334445443 333322 122222233334555667888888875421100 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) ....+-+++.+++......+ .....+++||....+|..+ ++.+. |+.....-.. 
T Consensus 250 --------------------~~~~~~~~l~~~~~~~~~~~~~~~a~~v~n~~~~~~l~~l-kd~~G---~~i~~~~~~~- 304 (408) T protein:vir:10 250 --------------------PTIAKFDDVITMINTAVDPAIIATSSLLTNQSGLNKLALV-KTAEG---KYLLEPDPTK- 304 (408) T ss_pred --------------------cccccHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh-hccCC---ceEeccCcCC- Confidence 00123445555554322222 2223578999999998886 33321 2222111000 Q ss_pred EEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchh-hhhcccCCcccccccccc-c----cceeeEEEEEEEEE Q lcl|NC_020858. 238 IVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKVAKT-G----DAEKFMLIGEGALK 310 (330) Q Consensus 238 ~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKt-G----d~~k~~i~~E~tLe 310 (330) . .-.+=+|. |.++.|.+||........+++.|+.. +.+..-.+. .-+...-. . +...+..+..+.+. T Consensus 305 --~---~~~~l~G~PV~~~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~~~~-~v~~~~~~~~~f~~~~~~~r~~~r~d~~ 378 (408) T protein:vir:10 305 --P---NSYLIKGKQVIVVADRWLPNTGSTVYPLYYGDMSQAITLFDRENM-SLLPTNIGAGAFETDTTKIRVIDRFDVK 378 (408) T ss_pred --C---CCceecceeeEEecccccCccCCCceEEEEEehhccEEEEEecce-EEEEcccccchhhcCceEEEEEEeeccE Confidence 0 00112564 56666778887554333467778773 333221121 11111111 1 33445555668999 Q ss_pred EecchheeEEe-----ccccccccC Q lcl|NC_020858. 311 PKNEKGLGVAA-----DLYGLTAST 330 (330) Q Consensus 311 ~~N~~a~g~i~-----gLt~~~~~~ 330 (330) +.+|.|..+++ ...|.+.++ T Consensus 379 v~~~~a~~~~~~~~~~~~~~~~~~~ 403 (408) T protein:vir:10 379 ATDSEALVAGSFSAIADQVGNFKTT 403 (408) T ss_pred EeccccEEEEEeeccccCCCCCCCC Confidence 99999988776 344444444 No 61 >protein:vir:4226 Length: 326 # NCBI annotation: observed 35.2Kd protein # Family: family:all:507 # MgeID: mge:89 # MgeName: L5 # Cross-refs: genbank:acc:NP_039681;swissprot:sw:q05223;genbank:gi:9625447;uniprot:Q05223;genbank:GeneID:2942929 Probab=96.92 E-value=0.00026 Score=40.25 Aligned_cols=288 Identities=11% Similarity=0.013 Sum_probs=131.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) |...+ +....-.-+.+.+.|+..-....|++.+.......+....+....-.+... -..||...+.......... T Consensus 20 ~~~~~----~~~g~~ip~~~~~~ii~~~~~~s~i~~~~~~~~~~~~~~~~p~~~~~~~a~-~v~Eg~~~~~~~~~f~~i~ 94 (326) T protein:vir:42 20 AQTGD----SMFEGYLEPEQAQDYFAEAEKISIVQQFAQKIPMGTTGQKIPHWTGDVSAS-WIGEGDMKPITKGNMTSQT 94 (326) T ss_pred eeccc----cCCcceechhhHHHHHHHHHhcchhhhhcceeeccCCceEEEEEeCCcceE-EecCCccccccccceeEEE Confidence 21111 111111334566666666677778888765555444445554433222211 2258887766543322221 Q ss_pred cceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATGA 160 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g~ 160 (330) -+ ..-+...+.|| .+.++... .+-.+|=..+-.+.+.+.+|.++|+|..+ +.| .|+..-....... T Consensus 95 ~~-~~k~~~~v~iS--~ell~~s~-~~~~~~i~~~l~~a~~~~~d~a~l~G~gs--~~p---~gi~~~~~~~~~~----- 160 (326) T protein:vir:42 95 IA-PHKIATIFVAS--AETVRANP-ANYLGTMRTKVATAFAMAFDNAAINGTDS--PFP---TFLAQTTKEVSLV----- 160 (326) T ss_pred Ee-eEEEEEeehhh--HHHHhcCH-HHHHHHHHHHHHHHHHHHHHHHhhcccCC--Ccc---cccccccccccee----- Confidence 11 12233334444 44444332 23334444444566899999999998652 222 1221111000000 Q ss_pred cccccccccccccccccccccccHHH--HHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceee-eeeeeecCCccee Q lcl|NC_020858. 161 NGGYNTGTGLTVAPTDGTQRAFSKAI--MDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVA-SFRYAASNGKNNS 237 (330) Q Consensus 161 ~~~~~~~~~~~~~~t~gt~~~lTe~~--l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~-~~r~~~~~~~~~~ 237 (330) ....+..+ ..++..+ +.+.+..+-..+.....++++|..+.++.++ ++.... ..+.....+.... 
T Consensus 161 ----------~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~a~~v~n~~~~~~L~~l-kd~~G~~l~~~~~~~~~~~~ 228 (326) T protein:vir:42 161 ----------DPDGTGSN-ADLTVYDAVAVNALSLLVNAGKKWTHTLLDDITEPILNGA-KDKSGRPLFIESTYTEENSP 228 (326) T ss_pred ----------eccccccc-ccchhHHHHHHHHHhhhhhhccCccEEEEeHHHHHHHHHh-hccCCceeeccccccCcccc Confidence 00001111 1122222 2223333333344445678999999998885 443211 1111111111100 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---cccccccccc-------------cceee Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKKVAKTG-------------DAEKF 301 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~laKtG-------------d~~k~ 301 (330) + ....+--+.++++.+||++... +++-|.+.+-+..-.++ ...+...++| |...+ T Consensus 229 ~------~~~~l~G~pv~~~~~~~~~~~~---~~~Gd~s~~~~~~~~~~~v~~~~e~~~~~~~~~~~~~~~~~~~d~~~~ 299 (326) T protein:vir:42 229 F------RLGRIVARPTILSDHVASGTVV---GYQGDFRQLVWGQVGGLSFDVTDQATLNLGTPQAPNFVSLWQHNLVAV 299 (326) T ss_pred c------cCceeeeeeEEEcCCCCCCceE---EEEeecceEEEEEecceEEEEeecceeeecccccccchhhhhcCcEEE Confidence 0 0001223577788899876542 34446554432211110 0111111222 33445 Q ss_pred EEEEEEEEEEecchheeEEeccccccc Q lcl|NC_020858. 302 MLIGEGALKPKNEKGLGVAADLYGLTA 328 (330) Q Consensus 302 ~i~~E~tLe~~N~~a~g~i~gLt~~~~ 328 (330) .....+..++.+|+|..+|++..--.+ T Consensus 300 r~~~~~d~~v~~~~a~~~l~~~~~~~~ 326 (326) T protein:vir:42 300 RVEAEYAFHCNDKDAFVKLTNVDATEA 326 (326) T ss_pred EEEEEeccEEecccceEEEeeccccCC Confidence 567778999999999988777765555 No 62 >protein:vir:94673 Length: 419 # NCBI annotation: major capsid protein # Family: family:all:585 # MgeID: mge:1527 # MgeName: mu1/6 # Cross-refs: genbank:acc:YP_579208;genbank:gi:93007444;genbank:GeneID:5076792 Probab=96.90 E-value=0.00027 Score=40.13 Aligned_cols=284 Identities=11% Similarity=-0.015 Sum_probs=130.6 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccC-c----cccc--ccccccccccc Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAA-P----GANI--TLEGDEYTFDA 73 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~-~----~~na--~~EG~d~~~~~ 73 (330) ++...+..++.......+.+.+.|...-.....+..++......+..+.|....... + ...+ ..||+..+... T Consensus 121 ~~~~~~~~~~~~~~~~p~~~~~~i~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~ 200 (419) T protein:vir:94 121 RDAPAGTITNPNVPHLPQLVPGIVPTTPDLPLLVADLLDQQNADYNVLEYIRDTSGTAGAGSTWNKAAVVPEGTAKPQST 200 (419) T ss_pred cccccccccCCcccccchhhhHHHHHHHhhhhhhhhcceeeeccCCceeeeeeccccccccccCcccceecCCccccccc Confidence 444443333332222223344444333222222333333333333333333322211 1 1112 24777765443 Q ss_pred ccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHH-HHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcc Q lcl|NC_020858. 74 TVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKL-KKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTN 152 (330) Q Consensus 74 ~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~-k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn 152 (330) .......-+. .-+...+.|| .+.++..+ + +...+. .-...+.+-+|.++|+|.... +.-|++.--. T Consensus 201 ~~~~~i~~~~-~k~~~~~~is--~ell~d~~--~-l~~~i~~~la~a~~~~~d~aii~G~G~~-----~p~Gi~~~~~-- 267 (419) T protein:vir:94 201 LSFDTITTTL-KTVAHWLPIT--RQAADDNS--Q-LMGYIQGRLTYGLRFLRDRQLLNGNGST-----EMQGILTTPG-- 267 (419) T ss_pred cceeeEEeee-eeEEEeehhh--HHHHHhHH--H-HHHHHHHHHHHHHHHHHHHHHHhccCcc-----cccceecccc-- Confidence 3222211111 2233344555 34444432 3 333343 457788899999999986531 1223322110 Q ss_pred cccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceee--eeeeee Q lcl|NC_020858. 153 VSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVA--SFRYAA 230 (330) Q Consensus 153 ~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~--~~r~~~ 230 (330) ... ......+.+.......++|.+++.++-.++.....++|||.....+..+....... .+... 
T Consensus 268 ~~~-------------~~~~~~~~~~t~~~~~~~l~~~~~~~~~~~~~~~~~v~n~~~~~~l~~~k~~~~~~~~~~~~~- 333 (419) T protein:vir:94 268 IGT-------------YQQPKPTAPATDEPPLVDIRRAKTVAEIAGFPPDGVVVHPQDWESIELDQAPGSGVFRVIANV- 333 (419) T ss_pred ccc-------------ccccccccccccchhHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHHhhcCCCceeecCCc- Confidence 000 00000111111223456666666676667767778999999999988764322211 00000 Q ss_pred cCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccccccc-----ccccceeeEEEE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVA-----KTGDAEKFMLIG 305 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~la-----KtGd~~k~~i~~ 305 (330) .++... +=+| +.|+.+.+||++. +++.|++..-+-+.|.-..-+... -+-+...+.++. T Consensus 334 ~~~~~~----------~l~G-~pV~~~~~~~~~~-----~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~r~~~ 397 (419) T protein:vir:94 334 QGEATP----------RIWG-LNVVSTVAIAQGT-----ALVGGFRQGATLWSRQGITVLMTDSHADFFTANTLVILAEF 397 (419) T ss_pred ccCCCc----------cccc-eeeEEcCCCCCcc-----EEEeeccceEEEEEecceEEEEeccccchhhcCcEEEEEEE Confidence 011111 1134 4778888999654 567888764333333211111111 122445667777 Q ss_pred EEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 306 EGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 306 E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) -+.+.+++|.|+.+++- =.++| T Consensus 398 r~d~~v~~~~a~~~~~~---~aa~~ 419 (419) T protein:vir:94 398 RANLAVYQPKAFVRVTF---AAATT 419 (419) T ss_pred eeccEEeccccEEEEEe---ccCCC Confidence 79999999999887653 22233 No 63 >protein:vir:4856 Length: 293 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:106 # MgeName: DT1 # Cross-refs: genbank:acc:NP_049396;genbank:gi:9632424;genbank:GeneID:1258532 Probab=96.86 E-value=0.0003 Score=39.93 Aligned_cols=270 Identities=7% Similarity=-0.026 Sum_probs=133.1 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeecc--CccccccccccccccccccCce Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELA--APGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~--~~~~na~~EG~d~~~~~~~~~~ 78 (330) |+..+.+ ....-.-+++.+.|...=....|+.++......++..-.|.--... .+...-..||...+....-.-. T Consensus 5 ~~~~t~~---~gg~liP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~g~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~~~~ 81 (293) T protein:vir:48 5 KTDHSGS---DAGLTIPQDIRTAINTLVRQYDSLQEYVNVENVTTLTGSRVYEKWTDITGLANIDDEAGKIADIDDPKLS 81 (293) T ss_pred ecccccC---cCceEechhHHHHHHHHHHhhhhhhhhceeeeccCCcceEEEEeecCCCcceeeecCCccccccccccee Confidence 5544422 1112234467777766667788887776554433322222222221 1112233577765432111111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhcc--ccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAG--RATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G--~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g 156 (330) .+.=-..=+...+.|| .+.++... ..+.+..++. +.+.+-+++.+++|...... T Consensus 82 ~i~l~~~k~~~~~~iS--~ell~ds~~~l~~~i~~~la---~~~~~~~~~~i~~g~~~~~~------------------- 137 (293) T protein:vir:48 82 LIKYTIKRYAGISTVT--NSLLADSAENILAWLSGWIA---KKVVVTRNKAILGVVDKLPT------------------- 137 (293) T ss_pred EEEEeeeEEEEeehhh--HHHHhhhhHHHHHHHHHHHH---HHHHHHHHhHHhhccccccc------------------- Confidence 1111112222234444 33344332 2334444444 34456667777765332110 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcce Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNN 236 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~ 236 (330) ....++-++|.+++.++-.+......+++|+.....|..+ +|... |+.....-.. 
T Consensus 138 ---------------------~~~~~~~d~i~~~~~~l~~~~~~~a~~vmn~~~~~~L~~l-kd~~g---~~l~~~~~~~ 192 (293) T protein:vir:48 138 ---------------------KPTLTKWDDIIDLEAKVDPAIKQTSFFLTNTSGFTALKKV-KNALG---DYLMERDVKS 192 (293) T ss_pred ---------------------cccccCHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHh-hccCC---ceEeecCcCC Confidence 0113566778888877766555445688999998888876 34321 2222111000 Q ss_pred eEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcch-hhhhcccCCccccccc-----cccccceeeEEEEEEEE Q lcl|NC_020858. 237 SIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPE-FLQFGWLRKIAEDKKV-----AKTGDAEKFMLIGEGAL 309 (330) Q Consensus 237 ~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~-~~~~~~Lr~~~~~e~l-----aKtGd~~k~~i~~E~tL 309 (330) - .-.+=+|. |.++.+.++|........+++.|.. ++.+..-.+. .-+.. .-.-+...+.++..+.. T Consensus 193 ~------~~~~l~G~Pv~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~-~i~~~~~~~~~~~~~~~~~r~~~r~d~ 265 (293) T protein:vir:48 193 P------TGYSIAGFAVKEISDRWLPNASSGVMPLYFGDLKQAVTLFDRQQM-SLLSTNIGGGAFETDTTKVRVIDRFDV 265 (293) T ss_pred C------CCceecceeeEEecccccCCccCCceEEEEEeccceEEEEEecce-EEEEecccchhhhcCeEEEEEEEeeCc Confidence 0 00112564 6667788888776555567788876 3333321111 11100 11234455667777999 Q ss_pred EEecchheeEEeccccccccC Q lcl|NC_020858. 310 KPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 310 e~~N~~a~g~i~gLt~~~~~~ 330 (330) .+++|+|..++. ++....+- T Consensus 266 ~~~~~~a~~~l~-~~~~~~~~ 285 (293) T protein:vir:48 266 VATDTEAFVPAS-FKAIADQK 285 (293) T ss_pred EEecccceEEEE-eeccccCC Confidence 999999998876 44433332 No 64 >protein:vir:4456 Length: 401 # NCBI annotation: Major capsid protein precursor # Family: family:all:21 # MgeID: mge:96 # MgeName: ST64B # Cross-refs: genbank:acc:NP_700379;genbank:gi:23505451;genbank:GeneID:955658 Probab=96.79 E-value=0.00034 Score=39.62 Aligned_cols=284 Identities=14% Similarity=0.053 Sum_probs=138.6 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) |+..+.. ....-.-+.+.+.|...-....|+..+.......+..+.|....-.. ...-..||...+.... T Consensus 107 ~~~~~~~---~GG~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~-~a~wv~E~~~~~~~~~------ 176 (401) T protein:vir:44 107 LQVGTDE---DGGYAVPEELDRSILSLLKDEVVMRQEATVITVGGSDYKKLVNLGGT-ASGWVGETDTRSQTAT------ 176 (401) T ss_pred hhcCCCC---CCceeccHhHHHHHHHHHHhhhhhhhhceeeecCCCceEEEEecCCc-cceeeccccccCcccc------ Confidence 4433221 11111224566677776677788877766556555555555543222 1122357665432211 Q ss_pred cceEEEEeee----eeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccc Q lcl|NC_020858. 81 GNYTQIMRKS----GIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 81 ~N~tQIf~~~----v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g 156 (330) ..+.||.-.. ..|.=|.+.++... .+...|=..+-...+.+-+|.+||+|... + +.-||+.......... T Consensus 177 ~~~~~v~~~~~k~~~~~~iS~ell~ds~-~~l~~~i~~~la~ai~~~~~~~~l~G~G~--~---~p~Gil~~~~~~~~~~ 250 (401) T protein:vir:44 177 SRLGLIEPFMGEIYGNPQATQKMLDDAF-FNVEAWINSELATEFAEQEEIAFTTGDGT--K---KPKGFLAYESTEESDK 250 (401) T ss_pred ccceeeeeehhheeeehhhhHHHHhcch-HHHHHHHHHHHHHHHHHHHHhhhhccCCC--C---ccceeecccccccccc Confidence 1233432222 22223344444432 12222333333455678899999998642 2 3446554432211111 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec----C Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS----N 232 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~----~ 232 (330) +... +.....+.+....++.+.|.+++..+-.+......+++|+....+|..+ +|.+. |+... 
+ T Consensus 251 ~~~~--------~~~~~~~t~~~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~L~~l-kd~~G---~~l~~~~~~~ 318 (401) T protein:vir:44 251 ARAF--------GKLQHIVSGEATAVTADAIIKLIYTLRKAHRTGAKFMMNNNSLFAIRLL-KDTEG---NYLWRPGLEL 318 (401) T ss_pred cccc--------ccccccccccccccCHHHHHHHHHhcchhhhcCCEEEEcHHHHHHHHHh-hccCC---ceeecCCcCC Confidence 1100 0001112233445777777777766543322223578999998888876 34331 22211 1 Q ss_pred CcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchh-hhhcccCC-c-cccccccccccceeeEEEEEEEE Q lcl|NC_020858. 233 GKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRK-I-AEDKKVAKTGDAEKFMLIGEGAL 309 (330) Q Consensus 233 ~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~-~-~~~e~laKtGd~~k~~i~~E~tL 309 (330) +...+| +| +.|+.+.+||........+++.|+.. +.+. .|. + ...++... -+...+..+..++. T Consensus 319 g~~~~l----------~G-~PVv~~~~~p~~~~~~~~i~~Gd~~~~~~i~-~~~~~~~~~~~~~~-~~~v~~~a~~r~d~ 385 (401) T protein:vir:44 319 GQPSSL----------AG-YGIAENEQMPDIAADAKAIAFGNFKRGYTIV-DRIGTRILRDPYTN-KPFVGFYTTKRTGG 385 (401) T ss_pred CCCcee----------cc-eeeEEecCcCCccCCccEEEEeehhccEEEE-EecceEEeeecccc-CCcEEEEEEEEecc Confidence 111111 12 46777888987554444467788754 2221 121 0 12233332 35556666677999 Q ss_pred EEecchheeEEecccccccc Q lcl|NC_020858. 310 KPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 310 e~~N~~a~g~i~gLt~~~~~ 329 (330) .+.++.|..+|.. .++ T Consensus 386 ~~~~~~a~~~l~~----~aa 401 (401) T protein:vir:44 386 MLVDSQAIKLLKI----AAA 401 (401) T ss_pred EEecccceEEEEe----ecC Confidence 9999999877553 222 No 65 >protein:vir:3033 Length: 272 # NCBI annotation: major capsid protein # Family: family:all:522 # MgeID: mge:61 # MgeName: PhiNIH1.1 # Cross-refs: genbank:acc:NP_438146;genbank:gi:16271809;genbank:GeneID:929235 Probab=96.75 E-value=0.00037 Score=39.42 Aligned_cols=261 Identities=12% Similarity=0.056 Sum_probs=133.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccce---e----eeeccceecc-ceeeeeeeeccCccccccccccccccc Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPI---Y----SMIEKVSFDT-THPEWTTDELAAPGANITLEGDEYTFD 72 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~---~----s~ig~~~~~~-~~~~W~td~L~~~~~na~~EG~d~~~~ 72 (330) ||..+.. -...-.-|-+++.|..--+..--| . ..-|+ .... ..+.|. .+..+ .-..||.+.+.. T Consensus 1 MA~~~T~---~~~~~iPev~s~~v~~~~~~~~~~~~~~~~~~~~~g~-~G~tv~iP~~~--~~~~a--~~v~eg~~i~~~ 72 (272) T protein:vir:30 1 MAVGTTK---MAQMLDPEVLADMIDAEVGKAIRFAPLAEVDTTLEGQ-PGTTLTVPKWD--YIGDA--EDVAEGEAIPMT 72 (272) T ss_pred CCCcccc---chheechHHHHHHHHHHHHHHhhhhccccccccccCC-CCCEEEEEEec--CCCCc--ccccCCCccccc Confidence 9966522 112223333444332111111111 0 11111 1111 234552 22222 234588887766 Q ss_pred cccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcc Q lcl|NC_020858. 73 ATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTN 152 (330) Q Consensus 73 ~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn 152 (330) ........-...+ +.+.+.||+-... .. ..+-+++-..+....+.|.+|..++..-.+.. T Consensus 73 ~~~~~~~~~~~~~-~~~~~~itd~~~~--~s-~~d~~~~~~~~~~~~~a~~~d~~i~~~~~~a~---------------- 132 (272) T protein:vir:30 73 QLGFKKTTMTIKK-AGKGVEITDEAIL--SG-YGDPVGQAAKQIVEAIDHKVDADVLDALSKST---------------- 132 (272) T ss_pred ccccceEEEEeee-eeeeeeecHHHHh--hc-cccHHHHHHHHHHHHHHHHHHHHHHHHhcccc---------------- Confidence 6555454445555 3466777755433 32 24666666667777888999998875311100 Q ss_pred cccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecC Q lcl|NC_020858. 153 VSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASN 232 (330) Q Consensus 153 ~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~ 232 (330) .......+.+.|.++++++=+++...+.++|||.....+.+.....+....+ . 
T Consensus 133 -----------------------~~~~~~~t~d~i~da~~~l~~~~~~~~~~vv~p~~~~~L~k~~~~~~~~~~~----~ 185 (272) T protein:vir:30 133 -----------------------QTVEATATVDGVSKALDIFNDEDDAETVIVMNPADASTLRLDAAKEWLGATE----V 185 (272) T ss_pred -----------------------cccccccCHHHHHHHHHHHhccCCCccEEEEcHHHHHHHHHhcccccccccc----c Confidence 0001134678899999998888888899999999877775432111100000 0 Q ss_pred CcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCC-ccccccccccccceeeEEEEEEEEEE Q lcl|NC_020858. 233 GKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRK-IAEDKKVAKTGDAEKFMLIGEGALKP 311 (330) Q Consensus 233 ~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~-~~~~e~laKtGd~~k~~i~~E~tLe~ 311 (330) +.. ...+-....+--++|+.+.+||.++ .+++++..+.+..-++ ..+.+..++. ..+......-|++.+ T Consensus 186 ~~~----~~~~g~ig~i~G~~Vi~s~~~p~~t-----~~~~~~~a~~~~~~~~~~ve~~r~~~~-~~~~i~~~~~~~~~v 255 (272) T protein:vir:30 186 GAN----RVVSGVYGEVLGVQIVRSRKCPKGT-----AYMVRKGALRIMLKRNTMVETDRDITK-AINQIVANKHYGVYL 255 (272) T ss_pred ccc----ccccccchhhcCeeEEEcCCCCcce-----EEEEcCCeEEEEecCCceeeecccccc-ceeEEEEEEEEEEEE Confidence 000 0000011112226899999999654 5789998777754333 1122222322 234444455688999 Q ss_pred ecchheeEEeccccccccC Q lcl|NC_020858. 312 KNEKGLGVAADLYGLTAST 330 (330) Q Consensus 312 ~N~~a~g~i~gLt~~~~~~ 330 (330) .||.+.-+++- -++- T Consensus 256 ~~~~~vv~~t~----~~a~ 270 (272) T protein:vir:30 256 YKAEKAVKITL----KDAA 270 (272) T ss_pred EcCCceEEEEe----cccc Confidence 99987766531 1111 No 66 >protein:vir:9820 Length: 272 # NCBI annotation: putative major capsid/head protein # Family: family:all:522 # MgeID: mge:176 # MgeName: 315.4 # Cross-refs: genbank:acc:NP_795582;genbank:gi:28876339;genbank:GeneID:1257858 Probab=96.75 E-value=0.00037 Score=39.42 Aligned_cols=261 Identities=12% Similarity=0.056 Sum_probs=133.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccce---e----eeeccceecc-ceeeeeeeeccCccccccccccccccc Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPI---Y----SMIEKVSFDT-THPEWTTDELAAPGANITLEGDEYTFD 72 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~---~----s~ig~~~~~~-~~~~W~td~L~~~~~na~~EG~d~~~~ 72 (330) ||..+.. -...-.-|-+++.|..--+..--| . ..-|+ .... ..+.|. .+..+ .-..||.+.+.. T Consensus 1 MA~~~T~---~~~~~iPev~s~~v~~~~~~~~~~~~~~~~~~~~~g~-~G~tv~iP~~~--~~~~a--~~v~eg~~i~~~ 72 (272) T protein:vir:98 1 MAVGTTK---MAQMLDPEVLADMIDAEVGKAIRFAPLAEVDTTLEGQ-PGTTLTVPKWD--YIGDA--EDVAEGEAIPMT 72 (272) T ss_pred CCCcccc---chheechHHHHHHHHHHHHHHhhhhccccccccccCC-CCCEEEEEEec--CCCCc--ccccCCCccccc Confidence 9966522 112223333444332111111111 0 11111 1111 234552 22222 234588887766 Q ss_pred cccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcc Q lcl|NC_020858. 73 ATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTN 152 (330) Q Consensus 73 ~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn 152 (330) ........-...+ +.+.+.||+-... .. ..+-+++-..+....+.|.+|..++..-.+.. T Consensus 73 ~~~~~~~~~~~~~-~~~~~~itd~~~~--~s-~~d~~~~~~~~~~~~~a~~~d~~i~~~~~~a~---------------- 132 (272) T protein:vir:98 73 QLGFKKTTMTIKK-AGKGVEITDEAIL--SG-YGDPVGQAAKQIVEAIDHKVDADVLDALSKST---------------- 132 (272) T ss_pred ccccceEEEEeee-eeeeeeecHHHHh--hc-cccHHHHHHHHHHHHHHHHHHHHHHHHhcccc---------------- Confidence 6555454445555 3466777755433 32 24666666667777888999998875311100 Q ss_pred cccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecC Q lcl|NC_020858. 153 VSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASN 232 (330) Q Consensus 153 ~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~ 232 (330) .......+.+.|.++++++=+++...+.++|||.....+.+.....+....+ . 
T Consensus 133 -----------------------~~~~~~~t~d~i~da~~~l~~~~~~~~~~vv~p~~~~~L~k~~~~~~~~~~~----~ 185 (272) T protein:vir:98 133 -----------------------QTVEATATVDGVSKALDIFNDEDDAETVIVMNPADASTLRLDAAKEWLGATE----V 185 (272) T ss_pred -----------------------cccccccCHHHHHHHHHHHhccCCCccEEEEcHHHHHHHHHhcccccccccc----c Confidence 0001134678899999998888888899999999877775432111100000 0 Q ss_pred CcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCC-ccccccccccccceeeEEEEEEEEEE Q lcl|NC_020858. 233 GKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRK-IAEDKKVAKTGDAEKFMLIGEGALKP 311 (330) Q Consensus 233 ~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~-~~~~e~laKtGd~~k~~i~~E~tLe~ 311 (330) +.. ...+-....+--++|+.+.+||.++ .+++++..+.+..-++ ..+.+..++. ..+......-|++.+ T Consensus 186 ~~~----~~~~g~ig~i~G~~Vi~s~~~p~~t-----~~~~~~~a~~~~~~~~~~ve~~r~~~~-~~~~i~~~~~~~~~v 255 (272) T protein:vir:98 186 GAN----RVVSGVYGEVLGVQIVRSRKCPKGT-----AYMVRKGALRIMLKRNTMVETDRDITK-AINQIVANKHYGVYL 255 (272) T ss_pred ccc----ccccccchhhcCeeEEEcCCCCcce-----EEEEcCCeEEEEecCCceeeecccccc-ceeEEEEEEEEEEEE Confidence 000 0000011112226899999999654 5789998777754333 1122222322 234444455688999 Q ss_pred ecchheeEEeccccccccC Q lcl|NC_020858. 312 KNEKGLGVAADLYGLTAST 330 (330) Q Consensus 312 ~N~~a~g~i~gLt~~~~~~ 330 (330) .||.+.-+++- -++- T Consensus 256 ~~~~~vv~~t~----~~a~ 270 (272) T protein:vir:98 256 YKAEKAVKITL----KDAA 270 (272) T ss_pred EcCCceEEEEe----cccc Confidence 99987766531 1111 No 67 >protein:vir:1433 Length: 435 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:30 # MgeName: phiE125 # Cross-refs: genbank:acc:NP_536362;genbank:gi:17975167;genbank:GeneID:929171 Probab=96.73 E-value=0.00038 Score=39.33 Aligned_cols=283 Identities=10% Similarity=0.077 Sum_probs=130.5 Q ss_pred CCccccceeeccc-cccccccceeeEecCCcccceeeeec-cceeccceeeeeeeeccCccccccccccccccccccCce Q lcl|NC_020858. 
1 MAVVTNTFQSTGA-KGNREELADVVSRITPEDTPIYSMIE-KVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~-~g~~edl~d~I~~i~p~dTP~~s~ig-~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~ 78 (330) ++..+ .++... ...-+++.+.|...-...+|+..+.- ..+..+-...|...+- .+...-..||...+..... T Consensus 130 ~~~~~--~t~~~gg~~vP~~~~~~ii~~l~~~~~i~~~~~~~~~~~~~~~~~p~~~~-~~~a~~v~E~~~~~~~~~~--- 203 (435) T protein:vir:14 130 MSLNT--LSPGAGGVLVPENLSSEVIELLRPKSVVRKLGARTLPLSNGNITIPRLKG-GAIVGYIGADTDIPTTQQQ--- 203 (435) T ss_pred hhccc--CCcCCCccccchhHHHHHHHHHhhhchhhhhcceeeecCCCceEEEEEeC-CcceeeeccCccccccccc--- Confidence 11111 111110 01122444544444345566655311 1222222334433321 1111223577665543221 Q ss_pred EecceE-EEEeeeeeehhHHHHHhhccccchHHHHHH-HHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccc Q lcl|NC_020858. 79 RLGNYT-QIMRKSGIISGTQNITDEAGRATKVKEQKL-KKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 79 ~~~N~t-QIf~~~v~VS~T~~av~~~G~~~e~a~q~~-k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g 156 (330) +.+++ -.++=...|.=|.+.++..+....+...+. .-...+.+-+|.+||+|... .-+..||..+..... T Consensus 204 -f~~i~~~~~k~~~~~~iS~ell~ds~~~~~l~~~i~~~l~~ai~~~~d~a~l~G~G~----~~~p~Gi~~~~~~~~--- 275 (435) T protein:vir:14 204 -FDDLKLTAKKMAALVPIANDLIKYAGVNPNVDQIVVGDLTAAIGAREDKAFIRDDGT----ANTPKGLRFWALPSN--- 275 (435) T ss_pred -eeEEEeeeEEEEEeehhhHHHHHhhccCHHHHHHHHHHHHHHHHHHHHHHhhccCCC----Cccccceeecccccc--- Confidence 22211 112222233334445555543322333333 44556888899999988532 223345543321100 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHHHHhcCCc--eeEEEeChHHHHHHHHhhccceeeeeeeeecCCc Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGAN--FKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGK 234 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~--~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~ 234 (330) +.....++.......++.+++..+..+.+. ...+++||....++..+ ++.+. |+...... 
T Consensus 276 --------------~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~v~n~~~~~~L~~l-kd~~G---~~l~~~~~ 337 (435) T protein:vir:14 276 --------------VITASDASTLQKIETDLGKVILALENADANLTQPGWIMAPRTFRFLEGL-RDGNG---NKVYPELA 337 (435) T ss_pred --------------eeccccccchhhHHHHHHHHHHHhhhccccccCCEEEEcHHHHHHHHHh-hccCC---ceeccCCC Confidence 011112222223345677777777765443 33578899998888775 33331 22221111 Q ss_pred ceeEEEEEEEEEcCCeEEEEEEcCcCCCccccc---cEEEEEcchhhhhcccCCccccc----cccc----------ccc Q lcl|NC_020858. 235 NNSIVANADVYEGPFGKVMIHPNRVMAGSGALA---RNAFFVDPEFLQFGWLRKIAEDK----KVAK----------TGD 297 (330) Q Consensus 235 ~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a---~~~~~ld~~~~~~~~Lr~~~~~e----~laK----------tGd 297 (330) .. +=+| +.|+.+.+||.+...+ ..+++.|++.+-+. .|.-...+ ..-+ .-| T Consensus 338 ~g----------~l~G-~Pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~-~~~~~~~~~~~~~~~~~~~~~~~~~f~~~ 405 (435) T protein:vir:14 338 NG----------MLKG-YPVGKTTQVPINLGETGKESEIYFTDFGDVFIG-EEETLEIDYSKEATYKDADGHMVSAFQRD 405 (435) T ss_pred CC----------eeec-ceeEeeccccccccCCCccceEEEeecccEEEE-EecccEEEEeccccccccccchhhhhhcC Confidence 11 1123 4677778888642221 24777887764432 22211111 0011 113 Q ss_pred ceeeEEEEEEEEEEecchheeEEecccccc Q lcl|NC_020858. 
298 AEKFMLIGEGALKPKNEKGLGVAADLYGLT 327 (330) Q Consensus 298 ~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~ 327 (330) ...+..+.-+.+.+.+|+|..++.|+..=+ T Consensus 406 ~~~~r~~~r~d~~~~~~~a~~~l~~~~~~~ 435 (435) T protein:vir:14 406 QTLIRVIAKNDFGPRHVESIAVLAGVAWGA 435 (435) T ss_pred hhheeeeeeeCceeecccceEEEecCCCCC Confidence 345556777889999999999999988777 No 68 >protein:vir:3870 Length: 400 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:82 # MgeName: A2 # Cross-refs: genbank:acc:NP_680487;swissprot:trembl:q8ltc0;genbank:gi:22296527;interpro:IPR006444;uniprot:Q8LTC0;genbank:GeneID:951713 Probab=96.71 E-value=0.00028 Score=40.03 Aligned_cols=263 Identities=11% Similarity=-0.041 Sum_probs=119.9 Q ss_pred CCccccceeecc--ccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcccccccccccccccc-ccCc Q lcl|NC_020858. 1 MAVVTNTFQSTG--AKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDA-TVSP 77 (330) Q Consensus 1 Ma~~t~~~~t~~--~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~-~~~~ 77 (330) .....+..++.. ..-.-+++.+.|...-....|+..++...+..+....|..-....+...-..||...+... .... T Consensus 129 ~~~~~~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~E~~~~~~~~~~~f~ 208 (400) T protein:vir:38 129 ASDAVNAGVKAADAASTIPETISNTPQRELQTVVDLKPFTNVFQASTQKGTYPTVANATTKMVTVAELEKNPAMAKPEFK 208 (400) T ss_pred HHHHHhhcccccCCcccccHHHHHHHHHHHHhhhhhhhcceeEeccCcceEEEEEecCCCccccccccccccccccccce Confidence 000111111111 1112235677777666777888888776666555555554443333333345776654321 1111 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhc--cccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEA--GRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~--G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) ...-+.. -+...+.||. +.+... ...+.+..++.+ .+.+-++.+++.|...... 
T Consensus 209 ~i~~~~~-k~~~~~~is~--ell~ds~~~~~~~i~~~l~~---~~~~~~~~~i~~~~~~~~~------------------ 264 (400) T protein:vir:38 209 PVNWSVE-TYRQALPVSQ--ESIDDSAIDLVGLIAQNGQQ---IKVNTTNGAVATLLKGFTA------------------ 264 (400) T ss_pred eeEeehh-heeeehhhHH--HHHhhhHHHHHHHHHHHHHH---HHHHHHHHhhhhccccccc------------------ Confidence 1111111 1223344443 333322 222333333333 3344556666655431100 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec---- Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS---- 231 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~---- 231 (330) ....+.+.+.+++...-....+ ..+++||....+|..+ ++.+. |+... T Consensus 265 -----------------------~~~~~~~~~~~~~~~~~~~~~~-a~~v~~~~~~~~l~~l-kd~~G---~~i~~~~~~ 316 (400) T protein:vir:38 265 -----------------------KTISSVDDLKHINNVDLDPAYS-RVIIASQSFYNFLDTV-KDGNG---RYLLQDSIL 316 (400) T ss_pred -----------------------cccccHHHHHHHHHhhhhhhhC-cEEEEcHHHHHHHHHh-hccCC---CeeeecCcC Confidence 0023445566666655444333 3567899998888875 44321 12111 Q ss_pred CCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEE Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKP 311 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~ 311 (330) ++...+| +| +.|+.+..||........+++.|++..-+-+.|........--.........+..++..+ T Consensus 317 ~~~~~~l----------~G-~pv~~~~~~~~~~~g~~~~~~gd~s~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~d~~~ 385 (400) T protein:vir:38 317 TPSGKSV----------LG-MPIAVVSDDTLGAAGEAHAFLGDIKRAILFANRADFMVRWVDDQIYGQFLQAGMRFGVSV 385 (400) T ss_pred CCCcccc----------cc-ceeEEecccccCCCCceEEEEEeccccEEEEeecceEEEEecccccceeEEEEEEeccEE Confidence 1111011 12 245555567765543444667787632111212110101111122334567788899999 Q ss_pred ecchheeEEeccccccccC Q lcl|NC_020858. 
312 KNEKGLGVAADLYGLTAST 330 (330) Q Consensus 312 ~N~~a~g~i~gLt~~~~~~ 330 (330) .+|.++..|+--.. . T Consensus 386 ~~~~a~~~l~~~~~----a 400 (400) T protein:vir:38 386 ADEKAGYFLTYTPK----A 400 (400) T ss_pred ecccceEEEEeecC----C Confidence 99999887664222 2 No 69 >protein:vir:2344 Length: 397 # NCBI annotation: gp14 # Family: family:all:507 # MgeID: mge:51 # MgeName: Bxb1 # Cross-refs: genbank:acc:NP_075281;genbank:gi:12657868;genbank:GeneID:920118 Probab=96.71 E-value=0.00039 Score=39.25 Aligned_cols=282 Identities=11% Similarity=0.004 Sum_probs=135.5 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) |+..+.+- ..--..+.+.+.|+..-....|++++.......+.........- .+...-.-||+..+..... + T Consensus 10 ~~~~~t~~---~~g~l~~~~~~~ii~~l~~~s~i~~l~~~~~~~~~~~~ip~~~~-~~~a~wv~Eg~~~~~s~~~----f 81 (397) T protein:vir:23 10 IAQTKDTM---FTGYLDPVQAKDYFAEAEKTSIVQRVAQKIPMGATGIVIPHWTG-DVSAQWIGEGDMKPITKGN----M 81 (397) T ss_pred HhhccCCC---CccccchhHHHHHHHHHHhccchhhhcceeeccCCceEEEEEcC-CcceEEecCCccccccccc----e Confidence 43332110 01011223344444444567888887655554443333332222 2222223577766554322 2 Q ss_pred cceE-EEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 81 GNYT-QIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 81 ~N~t-QIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) .+++ .+.+-...|.=|.+.++... .+-.++=...-.+.+.+-+|.++|+|... +....|+..-. +. 
T Consensus 82 ~~v~l~~~k~~~~v~iS~ell~ds~-~~l~~~i~~~l~~aia~~~d~a~l~G~gt----~~~~~~~~~~~--~~------ 148 (397) T protein:vir:23 82 TKRDVHPAKIATIFVASAETVRANP-ANYLGTMRTKVATAIAMAFDNAALHGTNA----PSAFQGYLDQS--NK------ 148 (397) T ss_pred eEEEEeeEEEEEeehhhHHHHhcch-HHHHHHHHHHHHHHHHHHHHHHHhhcccC----Ccccccccccc--cc------ Confidence 2222 22233333333444444322 23344444555666889999999988643 21122211110 00 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeEE Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSIV 239 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~ 239 (330) ...+....+.+.+.+++.++..++.....+++|+....+|.++ ++.+. |+........ + T Consensus 149 ---------------~~~~~~~~~~~~~~~~~~~l~~~~~~~a~~vmn~~~~~~L~~l-kd~~G---~~i~~~~~~~--~ 207 (397) T protein:vir:23 149 ---------------TQSISPNAYQGLGVSGLTKLVTDGKKWTHTLLDDTVEPVLNGS-VDANG---RPLFVESTYE--S 207 (397) T ss_pred ---------------eeeecccchhHHHHHHHHhhhhcccCCCEEEEcHHHHHHHHHh-hccCC---ceeecccccc--c Confidence 0011123455667778888888888878899999999999886 33321 1211111000 0 Q ss_pred EEEEEEEcCCeE---EEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---cccccccccc-------------ccee Q lcl|NC_020858. 240 ANADVYEGPFGK---VMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKKVAKTG-------------DAEK 300 (330) Q Consensus 240 ~~v~~~~tdfG~---v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~laKtG-------------d~~k 300 (330) .. +.-.-|+ +.++.+.+||++... +++-|...+-+..-+++ ..++..-+.| |... T Consensus 208 ~~---~~~~~~tl~G~Pv~~s~~~~~g~~~---~~~gDfs~~~i~~~~~i~i~~~~e~~~~~~~~~~~~~~~lf~~d~v~ 281 (397) T protein:vir:23 208 LT---TPFREGRILGRPTILSDHVAEGDVV---GYAGDFSQIIWGQVGGLSFDVTDQATLNLGSQESPNFVSLWQHNLVA 281 (397) T ss_pred cc---ccccCceeeeeeEEEeCCCCCCceE---EEEeecceEEEEEEeceEEEEeeeeeeeeccccccceeeeeecccee Confidence 00 0001122 467788889876543 34556554332221111 0111111111 2345 Q ss_pred eEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
301 FMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 301 ~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) +.++..+...+++|.|..++..-..-...+ T Consensus 282 ~ra~~r~d~~v~~~~a~~~~~~~~~~~~~~ 311 (397) T protein:vir:23 282 VRVEAEYGLLINDVNAFVKLTFDPVLTTYA 311 (397) T ss_pred EEEEeeeccceecccceEEEeeccccceee Confidence 556677889999999998887644433332 No 70 >protein:vir:81227 Length: 413 # NCBI annotation: gp6, major capsid protein # Family: family:all:585 # MgeID: mge:1893 # MgeName: BFK20 # Cross-refs: genbank:acc:YP_001456736;genbank:gi:157168379;hssp:P49861;interpro:IPR006444;uniprot:Q9MBJ9;genbank:GeneID:5580350 Probab=96.68 E-value=0.00042 Score=39.12 Aligned_cols=287 Identities=7% Similarity=-0.037 Sum_probs=131.5 Q ss_pred CCccccce-e-eccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcc-c--cccccccccccccc- Q lcl|NC_020858. 1 MAVVTNTF-Q-STGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPG-A--NITLEGDEYTFDAT- 74 (330) Q Consensus 1 Ma~~t~~~-~-t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~-~--na~~EG~d~~~~~~- 74 (330) +.....+. + +......-+.+++.|+..-....|+.+++.....++....|......... . .-..||...+.... T Consensus 113 ~~~~~~~~~~~~~~~~~vp~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~ 192 (413) T protein:vir:81 113 ASDPASTATLTDEFQGGYGTTWNRNIIYRRREKLVVADLMDNLTMTNTTIKYLMEKANRVVEGGFKTVAEGGKKPYMRFA 192 (413) T ss_pred hhhhhhhcccccccccccchhhHHHHHHHHhhhhhHHhhcceeeccCCceeEEEeccccccccccceecCcccccccCcc Confidence 11111111 1 11111123456666666667777888877766666655566544432221 1 12247766543321 Q ss_pred cCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHH-HHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccc Q lcl|NC_020858. 75 VSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKL-KKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNV 153 (330) Q Consensus 75 ~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~-k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~ 153 (330) ......-+. .-+...+.||.. .++... . +...+. +-...+.+-+|++||+|.... . ...||..... . 
T Consensus 193 ~f~~i~~~~-~k~~~~~~iS~e--ll~ds~--~-l~~~i~~~la~~~~~~~d~~~l~G~G~~-~---~~~Gi~~~~~--~ 260 (413) T protein:vir:81 193 DFDIVTESL-SKIAGLTKITDE--MIEDYD--F-LVSYINARLLEELAIEEERQLLLGDGTG-N---NLTGLLKRDG--I 260 (413) T ss_pred cceeeEeee-eeEEEeehhhHH--HHHHHH--H-HHHHHHHHHHHHHHHHHHHHHhccCCCC-C---cccccccccc--c Confidence 111111111 112223455543 333332 2 333333 445678889999999985321 1 1334432210 0 Q ss_pred ccccccccccccccccccccccccccccccHHHHHHHHHHHHhc-CCceeEEEeChHHHHHHHHhhccceeeeeeeeecC Q lcl|NC_020858. 154 SRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS-GANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASN 232 (330) Q Consensus 154 ~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~-Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~ 232 (330) . .. +.++. .-..+.+..++-++..+ +..++.+++||....+|..+ ++... |+.... T Consensus 261 ~-----------------~~-~~~~~-~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l-kd~~G---~~l~~~ 317 (413) T protein:vir:81 261 Q-----------------TL-AVSNK-DELADSIYKAMTNISLATPFQADALVINPLDYQELRLA-KDANG---QYYGGG 317 (413) T ss_pred c-----------------cc-ccccc-chhHHHHHHHHHHhhhhccCCCcEEEEcHHHHHHHHHh-hccCC---ceeccc Confidence 0 00 00010 11234455555555444 34555688999988888876 33321 222111 Q ss_pred CcceeEEE-EEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccc-----cccceeeEEEEE Q lcl|NC_020858. 233 GKNNSIVA-NADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAK-----TGDAEKFMLIGE 306 (330) Q Consensus 233 ~~~~~~~~-~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laK-----tGd~~k~~i~~E 306 (330) ......+. ....-.+=+| +.|+.+.+||++ .+++.|++..-+-+.|.-..-+-..- +-+...+..+.. T Consensus 318 ~~~~~~~~~~~~~~~~l~G-~pv~~s~~~~~~-----~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~r~~~r 391 (413) T protein:vir:81 318 VFQGQYGSGGIMLDPAPWG-LRTVQSQVVPVG-----KPVVGAFRSAASVLRKGGVRIDSTNTNVDDFENNLITVRAEER 391 (413) T ss_pred cccccccccccccCceecc-eeeEEcCCCCcc-----cEEEEecccEEEEEEecceEEEEeccccchhhcCcEEEEEEEe Confidence 10000000 0000012234 367888889865 45788887532223232111111111 123445666667 Q ss_pred EEEEEecchheeEEecccccccc Q lcl|NC_020858. 
307 GALKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 307 ~tLe~~N~~a~g~i~gLt~~~~~ 329 (330) +.+.+.+|.|..+++ ++.-.++ T Consensus 392 ~d~~~~~~~a~~~l~-~~~~~~p 413 (413) T protein:vir:81 392 VGLMVTFPEAIVQLD-VAEVVTP 413 (413) T ss_pred eccEEecccceEEEE-ecCCCCC Confidence 899999999998875 2332222 No 71 >protein:vir:101607 Length: 379 # NCBI annotation: major capsid protein precursor # Family: family:all:585 # MgeID: mge:1646 # MgeName: 11b # Cross-refs: genbank:acc:YP_112497;genbank:gi:53793597;uniprot:Q5ZGF6;genbank:GeneID:3101715 Probab=96.68 E-value=0.00042 Score=39.11 Aligned_cols=268 Identities=9% Similarity=-0.048 Sum_probs=128.6 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccC-ccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAA-PGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~-~~~na~~EG~d~~~~~~~~~~~ 79 (330) .+.-+.+..+....+..+++...|+..-....|+.+++...+..+....+......+ +...-..||.+.+......... T Consensus 104 ~~~~~~~~~~~~~~~ip~~~~~~ii~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~Eg~~~~~~~~~f~~i 183 (379) T protein:vir:10 104 KAVGDMTLPVNLTGAQPKDYNFDVVLNPSQMLNVSDIVGAVSISGGTYTFVRENGAGEGAIGAQVEGATKGQKDYDISMI 183 (379) T ss_pred hhhcccccCCCCccccchhhhhHHHHhHHhhhhHHhhceeeeccCCceEEEEeecCCCcccccccCCccccccccceeee Confidence 111111222222334555677777777677778877766555555555555443222 2222245777765443221111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHH-HHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLK-KGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k-~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) .-+. .-+...+.||. +.+.... + +...+.. -...+.+-++.+|++|....+.. 
T Consensus 184 ~~~~-~k~~~~~~iS~--ell~D~~--~-l~~~i~~~la~~~~~~~~~~~~~g~~~~~~~-------------------- 237 (379) T protein:vir:10 184 DVNT-DFIAGFTRYSK--KMANNLP--F-LTSFIPNALRRDYAKAENAAFNAVLAANATA-------------------- 237 (379) T ss_pred Eeee-eeEEeeehhhH--HHHhhHH--H-HHHHHHHHHHHHHHHHHHHHHhccccccccc-------------------- Confidence 1111 12222334443 3333332 1 2222222 22334555666776653311100 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeE Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSI 238 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~ 238 (330) +. ...+ ...+-+.+.+++.++-.++.....+++||....+|..+ ++.+. ++....+-...- T Consensus 238 ~~-----------~~~~----~~~~~d~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l-kd~~G---~~l~~~~~~~~~ 298 (379) T protein:vir:10 238 ST-----------EIIT----NKNKVEMLINEIAKQENLDFPVTAIVLRPTDYYDILVT-QKSVG---AGYGLPGVVTQD 298 (379) T ss_pred cc-----------cccc----CcccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh-hccCC---ceeccCCccCCC Confidence 00 0000 11233556666666667777777899999988888876 33331 221111000000 Q ss_pred EEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccc------cccceeeEEEEEEEEEEe Q lcl|NC_020858. 239 VANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAK------TGDAEKFMLIGEGALKPK 312 (330) Q Consensus 239 ~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laK------tGd~~k~~i~~E~tLe~~ 312 (330) +. ..+=+| +.|+.+.+||++ .+++.|.+...+..-++. . -.+.. .-+...+..+..+++.++ T Consensus 299 ~~----~~~l~G-~pvv~s~~~~ag-----~~~~gdf~~~~~~~~~~~-~-i~~~~~~~~~f~~~~~~~r~~~R~~~~v~ 366 (379) T protein:vir:10 299 NG----VLRING-IPLFRATWLAAN-----KYYVGDWTRVTKVTTEGL-S-LEFSEVEGTNFVKNNITARIEAQVALAVE 366 (379) T ss_pred CC----cceecc-eeeEecCCCCCC-----ceEEeecccEEEEEEece-E-EEEeecccccccCCcEEEEEEEEeccEEe Confidence 00 001112 577888999865 457888887655432221 0 11111 122334444556899999 Q ss_pred cchheeEEeccccc Q lcl|NC_020858. 
313 NEKGLGVAADLYGL 326 (330) Q Consensus 313 N~~a~g~i~gLt~~ 326 (330) +|.|+.. ..|+++ T Consensus 367 ~p~a~v~-~~~~~~ 379 (379) T protein:vir:10 367 QPAALIF-GDFTAV 379 (379) T ss_pred cCccEEE-EEecCC Confidence 9999754 457777 No 72 >protein:vir:100247 Length: 425 # NCBI annotation: gp76 # Family: family:all:21 # MgeID: mge:1619 # MgeName: Bcep176 # Cross-refs: genbank:acc:YP_355412;genbank:gi:77864702;genbank:GeneID:3725969 Probab=96.68 E-value=0.00042 Score=39.11 Aligned_cols=295 Identities=11% Similarity=-0.010 Sum_probs=137.8 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccc-c-Cce Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDAT-V-SPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~-~-~~~ 78 (330) .+..+..-.+......-+++.+.|...-...+|+.++....+..+......... ..+...-..||+..+.... . ..+ T Consensus 127 ~~al~~~t~~~gG~lvP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~~~~~~~~-~~~~a~wv~E~~~~~~~~~~~f~~v 205 (425) T protein:vir:10 127 QAALNKGEDSEGGYLTPIEWDRTITNKLVLISPMRQLCRVQPVSKAGFSKLFNM-GGTTSGWVGEASQRPQTNAATFQPL 205 (425) T ss_pred HHHhhcCcCCCCceeccHhHHHHHHHHHHhhhhhhhhceeeeccCCceEEEEEc-CCcceeeecccccccccccccccee Confidence 111111101111111234666666666677778888765544443322222211 2222223357766543221 1 111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) .+ ..++-..-|.=|.+.++... .+..++=...-...+.+-+|.+||+|.-. + +.-||+..+..... 
T Consensus 206 ~~----~~~k~~~~i~iS~ell~ds~-~~l~~~i~~~la~ai~~~~d~~~l~G~G~--~---~p~Gil~~~~~~~~---- 271 (425) T protein:vir:10 206 SF----ASGEIYANPAATQQILDDAE-IDLESWLATEVQTEFAKQEGKAFLAGDGT--N---KPNGLLTYIAGGAN---- 271 (425) T ss_pred ee----eheeeEeehHhHHHHHhcch-hHHHHHHHHHHHHHHHHHHHhhhhcccCC--C---Ccceeeeccccccc---- Confidence 11 11222223334455555443 23334444455566678899999998642 1 34466554421111 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeE Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSI 238 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~ 238 (330) +. ....+.....+.++...++-++|.++..++-.+-.....+++||....+|..+ +|.+. |+...+.-.. T Consensus 272 ~~----~~~~~~~~~~~~~~~~~~~~d~l~~l~~~l~~~~~~~a~~vmn~~~~~~L~~l-kD~~G---~~l~~~~~~~-- 341 (425) T protein:vir:10 272 AA----KHPFGAIEVVNSGAAADITSDGIIDLVYDLPSAFTGNARFAMNRNTQRQVRKL-KDGQG---NYLWQPSYVA-- 341 (425) T ss_pred cc----cccccccccccccccccccHHHHHHHHhhhhhhhccCCEEEEchHHHHHHHHh-hcCCC---ceeeccCccC-- Confidence 00 00011111122233445777778777776654433333578999999888876 34331 2221111000 Q ss_pred EEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCC-c-cccccccccccceeeEEEEEEEEEEecchh Q lcl|NC_020858. 239 VANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRK-I-AEDKKVAKTGDAEKFMLIGEGALKPKNEKG 316 (330) Q Consensus 239 ~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~-~-~~~e~laKtGd~~k~~i~~E~tLe~~N~~a 316 (330) +. ..+=+| +.|+.+.+||........+++.|.+..-+-+-|. + ...++.. .-+...+..+.-+...+.+|+| T Consensus 342 g~----~~~l~G-~PV~~~~~~p~~~~~~~~i~~Gd~~~~~~i~~~~~~~v~~d~~~-~~~~~~~~~~~r~d~~v~~~~A 415 (425) T protein:vir:10 342 GQ----PATLAG-YPVTEVPDMPDVAANSTPILFGDFQQTYLIIDRIGVRVLRDPYT-AKPYVLFYTTKRVGGGLLNPEP 415 (425) T ss_pred CC----Cceecc-eeeEEecCcCCccCCccEEEEEehhccEEEEEecceEEEecccc-cCCcEEEEEEEEeccEeecccc Confidence 00 001133 4678888899765555557777876421111121 1 0112222 2344555566668999999999 Q ss_pred eeEEeccccccccC Q lcl|NC_020858. 
317 LGVAADLYGLTAST 330 (330) Q Consensus 317 ~g~i~gLt~~~~~~ 330 (330) ..+|.- .+|= T Consensus 416 ~~~l~~----~as~ 425 (425) T protein:vir:10 416 MRAMKV----AASE 425 (425) T ss_pred eEEEEe----eccC Confidence 866532 1122 No 73 >protein:vir:3991 Length: 404 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:319 # MgeName: BK5-T # Cross-refs: genbank:acc:NP_116499;genbank:gi:14251132;genbank:GeneID:921252 Probab=96.48 E-value=0.00058 Score=38.31 Aligned_cols=272 Identities=8% Similarity=0.000 Sum_probs=125.7 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccc--cccccccccccccCce Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANI--TLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na--~~EG~d~~~~~~~~~~ 78 (330) |...+. +......-+.+.+.|...-....|+++++......+....|..-...+....+ ..||+..+......-. T Consensus 116 ~~~~t~---~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~~f~ 192 (404) T protein:vir:39 116 ETSGSD---SAAGLTIPQDIRTMINTLVRQYDSLQQYVRVESVSTSNGSRVYEKWTDVTPLTVMDAEDGKIPDLDNPRLT 192 (404) T ss_pred hhcccc---cCCceeccHHHHHHHHHHHHhhhhHHhhcceeeccCCcceEEEEeecCCccceeeecCcccccccccccee Confidence 221111 11112234577788887778889999887765544333333222121111122 3467665431111101 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) .+.=-..-+...+.||.. .+.... .+..+|=...-...+.+-+|.++|+|....... 
T Consensus 193 ~i~~~~~k~~~~~~iS~e--ll~ds~-~~l~~~i~~~l~~~~~~~~d~~il~g~g~~~~~-------------------- 249 (404) T protein:vir:39 193 IIKYLIKRYAGIITATNT--LLKDTA-ENILAWLSSWIAKKVVVTRNQAIIAAMGTVPKK-------------------- 249 (404) T ss_pred eEEeeeeeEEeeehhHHH--HHhhch-HHHHHHHHHHHHHHHHHHHHHHHHhcccccccc-------------------- Confidence 111111112233445543 333222 233344444455566677899999875321000 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHH-hcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGY-QSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~-~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) ....+.+.+.+++.... .+......+++||....+|..+ ++... |+.....-.. T Consensus 250 --------------------~~~~~~~~i~~~~~~~~~~~~~~~a~~v~n~~~~~~L~~l-kd~~G---~~l~~~~~~~- 304 (404) T protein:vir:39 250 --------------------PTIAKFDDVITMINTSVDPAIIATSSLLTNQSGLNKLALV-KTAEG---KYLLEPDPTK- 304 (404) T ss_pred --------------------cccccHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh-hccCC---ceeeccCcCC- Confidence 00133445555554322 2222223588999999888876 33321 2221111000 Q ss_pred EEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchh-hhhcccCCccccccccccc-----cceeeEEEEEEEEE Q lcl|NC_020858. 238 IVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKVAKTG-----DAEKFMLIGEGALK 310 (330) Q Consensus 238 ~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKtG-----d~~k~~i~~E~tLe 310 (330) .. -.+=+|. |.+.-+.++|........+++.|+.. +.+.. |.-..-+-...++ +...+.++..+... T Consensus 305 --~~---~~~l~G~pV~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~-~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~ 378 (404) T protein:vir:39 305 --PN---SYLIKGKKVIVVADRWLPNSGSTVYPLYYGDMSQAITLFD-RENMSLLPTNIGAGAFETDTTKIRVIDRFDVK 378 (404) T ss_pred --CC---cceecceeEEEecccccCccCCCccEEEEEeccccEEEEe-ecceEEEEeccchhhhhhceeeEEEEeeeccE Confidence 00 0011453 45555778887665555688889873 33322 2111111111111 23345556668999 Q ss_pred EecchheeEEeccccccccC Q lcl|NC_020858. 
311 PKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 311 ~~N~~a~g~i~gLt~~~~~~ 330 (330) +.+|.|..++..-.- +.+. T Consensus 379 ~~~~~a~~~~~~~~~-a~~~ 397 (404) T protein:vir:39 379 TTDSEALVAGSFTAI-ADQV 397 (404) T ss_pred EecccceEEEEeecc-ccCC Confidence 999999876663332 3222 No 74 >protein:vir:7409 Length: 408 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:146 # MgeName: P335 # Cross-refs: genbank:acc:NP_839926;genbank:gi:30089896;genbank:GeneID:1260683 Probab=96.45 E-value=0.00062 Score=38.18 Aligned_cols=267 Identities=10% Similarity=0.056 Sum_probs=124.8 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccc--eeeeeeeeccCccccccccccccccccccCce Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTT--HPEWTTDELAAPGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~--~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~ 78 (330) |...+.. ....-.-+.+.+.|...-...+|+..+++....++. .+.|....-..+...-..||++.+.... . T Consensus 116 ~~~~~~~---~gg~~vP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~E~~~~~~~~~---~ 189 (408) T protein:vir:74 116 ETSGSDS---AAGLTIPQDIRTMINTLVRQYDSLQQYVRVESVSTSSGSRVYEKWTDVTPLKAMDEEDGKIPDLDN---P 189 (408) T ss_pred hcccccC---CCceeechhHhhHHHHHHhhhcchhhhcceeeccCCcceEEEEeecCCcccccccccccccccccc---c Confidence 2111110 011112346777777777888889888776554332 3444443322222223346766543211 1 Q ss_pred EecceE-EEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 79 RLGNYT-QIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 79 ~~~N~t-QIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) .+.+++ ..++-...+.=|.+.+.... .+...|=...-...+.+-+|.++|+|...... 
.+ T Consensus 190 ~~~~i~~~~~k~~~~~~iS~ell~ds~-~~l~~~i~~~l~~~~~~~~d~~il~G~G~~~~-----------------~~- 250 (408) T protein:vir:74 190 RLTIIKYLIKRYAGIITATNTLLKDTA-ENILAWLSSWIAKKVVVTRNQAIIAAMGTVPK-----------------KP- 250 (408) T ss_pred ceeeEEeeeeeEEeeehhHHHHHhhch-HHHHHHHHHHHHHHHHHHHHHHHhhccccccc-----------------cc- Confidence 111111 11111222222333443322 22333333344555667788999987532110 00 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHH-HHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecC---- Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQ-QGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASN---- 232 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~-~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~---- 232 (330) ...+.+.+.+++. .+-.+......++|||....+|..+ ++... |+.... T Consensus 251 ----------------------~~~~~~~i~~~~~~~l~~~~~~~a~~v~n~~~~~~l~~l-kd~~G---~~l~~~~~~~ 304 (408) T protein:vir:74 251 ----------------------TIANFDDVITMINTSVDPAIIATSSLLTNQSGLNKLALV-KTAEG---KYLLEPDPTK 304 (408) T ss_pred ----------------------ccccHHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHh-hcCCC---ceEeccCcCC Confidence 0134455655553 3332222223578899999999886 33321 222211 Q ss_pred CcceeEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchh-hhhcccCCccccccccccc-ccee----eEEEE Q lcl|NC_020858. 233 GKNNSIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKVAKTG-DAEK----FMLIG 305 (330) Q Consensus 233 ~~~~~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKtG-d~~k----~~i~~ 305 (330) +...+ =+|. |.+..|.+||........+++.|+.. +.+. .|.-..-+-....+ +.++ +.++. T Consensus 305 ~~~~~----------l~G~pV~~~~~~~~~~~~~~~~~i~~gd~~~~~~~~-~~~~~~i~~~~~~~~~f~~~~~~~r~~~ 373 (408) T protein:vir:74 305 PNSYL----------IKGKQVIVVADRWLPNSGSTVYPLYYGDMSQAITLF-DRENMSLLPTNIGAGAFETDTTKIRVID 373 (408) T ss_pred CCCce----------ecceeeEEecCcccccccCCcceEEEEehhccEEEE-EecceEEEEeccccchhhcceeeEEEEE Confidence 11111 1453 56666888987654444567778773 3332 22111111111122 2233 44555 Q ss_pred EEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
306 EGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 306 E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) .+...+.+|.|..+++- ++.+..- T Consensus 374 r~d~~~~~~~a~~~~~~-~~~~~~~ 397 (408) T protein:vir:74 374 RFDVKATDSEALVAGSF-TAIADQV 397 (408) T ss_pred eeCcEEecccceEEEEe-ecccCCC Confidence 58899999999877753 3322211 No 75 >protein:vir:100172 Length: 394 # NCBI annotation: putative major head protein # Family: family:all:21 # MgeID: mge:1524 # MgeName: phi AT3 # Cross-refs: genbank:acc:YP_025031;genbank:gi:48697264;genbank:GeneID:2948270 Probab=96.38 E-value=0.00068 Score=37.93 Aligned_cols=277 Identities=12% Similarity=0.021 Sum_probs=124.2 Q ss_pred CCccccceeecc-ccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcccccccccccccccc-ccCce Q lcl|NC_020858. 1 MAVVTNTFQSTG-AKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDA-TVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~-~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~-~~~~~ 78 (330) .....++.++.+ ..-.-+++...|...-...+|+.+++......+....|..-....+...-..||.+.+... ..... T Consensus 107 ~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~E~~~~~~~~~~~~~~ 186 (394) T protein:vir:10 107 IDNAAGHVTSTEAGVLIPEEIIYDPTAEVNSVVDLSTLVTKTPVTTPKGTYPILKRATDRFSSVAELAENPALAEPEFEQ 186 (394) T ss_pred hhhhhcccccccCceeccHHHHHHHHHHHHhhhhhhhhceeeeccCCceEEEEEecCCCcccccccccccccccccccee Confidence 000111111111 1112235666677666778888888776666555555554444333333445776655322 11111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhcc--ccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAG--RATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G--~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g 156 (330) ..-+.- -+..-+.|| .+.+.... ..+.+..++ .+.+.+-+|.++++|... +.+. 
+ T Consensus 187 v~l~~~-k~~~~~~iS--~ell~ds~~~l~~~i~~~l---a~~~~~~~~~~il~g~g~--~~~~---------------~ 243 (394) T protein:vir:10 187 VDWSVS-TYRGAIPLS--EEAIADSAVDLTSLVGQSI---NEKSVNTYNAMIAPVLQS--FTAK---------------A 243 (394) T ss_pred EEeeee-eeEeeehhH--HHHHhhhhHHHHHHHHHHH---HHHHHHHHHHHHhhcccc--cccc---------------c Confidence 111111 122233344 34444322 222333333 344566678888876542 1100 0 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcce Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNN 236 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~ 236 (330) .....+-+.|.+++...-....+ ..+++||....+|..+ ++.+. |+........ T Consensus 244 ---------------------~~~~~~~d~l~~~~~~~~~~~~~-a~~vmn~~~~~~l~~l-kd~~G---~~i~~~~~~~ 297 (394) T protein:vir:10 244 ---------------------TTTDTLVDSLKHILNVDLDPAYS-RALVVTQSLFNTLDTL-KDKNG---RYLLHDASDS 297 (394) T ss_pred ---------------------ccccccHHHHHHHHHhhhhhhcc-CEEEecHHHHHHHHHh-hccCC---Ceeeeccccc Confidence 00112345566655433333223 3588999999998886 33321 2211111000 Q ss_pred eEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchh-hhhcccCCccccccccccccceeeEEEEEEEEEEecc Q lcl|NC_020858. 237 SIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKNE 314 (330) Q Consensus 237 ~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N~ 314 (330) .... ....+=+|. |.++.+.++|..+. ...+++.|++. +.+ +.|........--......+..+..+...+.+| T Consensus 298 ~~~~--~~~~~L~G~PV~~~~~~~~~~~~~-~~~i~~gd~s~~~~~-~~~~~~~v~~~~~~~~~~~~~~~~r~d~~~~~~ 373 (394) T protein:vir:10 298 ITDG--TAKGTVLGVPVYVVGDALLGSAAG-DQKAFVGDLKRGVLF-ADRQQVTLAWEDSKIYGRYLGAAFRFGVKQADS 373 (394) T ss_pred cccC--CcccccccceeEEecccccCCCCC-ceEEEEeeccccEEE-EeecceEEEEecccccceeEEEEEEeccEEecc Confidence 0000 001122453 55555666665422 22356667663 222 222110100000011223456677789999999 Q ss_pred hheeEEeccccccccC Q lcl|NC_020858. 
315 KGLGVAADLYGLTAST 330 (330) Q Consensus 315 ~a~g~i~gLt~~~~~~ 330 (330) +|...|+.-.--+.+| T Consensus 374 ~ai~~~~~~~~~~~~~ 389 (394) T protein:vir:10 374 NAGYFVTNTDAASGST 389 (394) T ss_pred ccEEEEEeecccCCCC Confidence 9998887544433344 No 76 >protein:vir:80684 Length: 315 # NCBI annotation: gp6 # Family: family:all:966 # MgeID: mge:1884 # MgeName: PA6 # Cross-refs: genbank:acc:YP_001285582;genbank:gi:148727088;genbank:GeneID:5247055 Probab=96.30 E-value=0.00076 Score=37.67 Aligned_cols=289 Identities=10% Similarity=0.006 Sum_probs=129.9 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) ||..+..-.. .-.-+.+.+.|...-....|+..+.......+..+......- .+..--..||+..+.......... T Consensus 1 Ma~~~~~~gg---~~vP~~~~~~ii~~l~~~s~i~~l~~~i~~~~~~~~ip~~~~-~~~a~wv~Eg~~~~~s~~~f~~v~ 76 (315) T protein:vir:80 1 MADDFLSAGK---LELPGSMIGAVRDRAIDSGVLAKLSPEQPTIFGPVKGAVFSG-VPRAKIVGEGEVKPSASVDVSAFT 76 (315) T ss_pred CCCCcCCcCc---eEcchHHHHHHHHHHHhhchhhhhcceeecCCCceEEEEEeC-CcceEEeeCCccccccccceeeeE Confidence 8866533221 223345666666666667776664322222221121111111 111122357776655432211111 Q ss_pred cceEEEEeeeeeehhHHHHHhhcc--ccchHHHHHH-HHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEAG--RATKVKEQKL-KKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~G--~~~e~a~q~~-k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) =+- .-+.--+.|| .+.++.-. ...++...+. .-...+.+-+|.++++|.....+. ...|+...+.+... 
T Consensus 77 l~~-~kl~~~~~iS--~ell~~s~~~~~~~l~~~i~~~la~ai~~~~d~a~~~G~~~~~~~--~~~~~~~~~~~~~~--- 148 (315) T protein:vir:80 77 AQP-IKVVTQQRVS--DEFMWADADYRLGVLQDLISPALGASIGRAVDLIAFHGIDPATGK--AASAVHTSLNKTKN--- 148 (315) T ss_pred eee-eeEEeeehhh--HHHhhcCchhHHHHHHHHHHHHHHHHHHHHHhhheeeccCCCCCc--cccccccccccccc--- Confidence 111 1122223333 33322211 1123433333 345567888999999986322221 23333322210000 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCc-eeEEEeChHHHHHHHHhhccceeee-eeeeecCCcc Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGAN-FKHVFVSPYVKSVFVTFMSDTNVAS-FRYAASNGKN 235 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~-~~~i~v~~~~k~~is~f~~~~~~~~-~r~~~~~~~~ 235 (330) .... ...+.++|.+++.++-.++.. .+-.++||..+..+.++-....... .++.. . . T Consensus 149 ----------------~~~~--~~~~~~d~~~~~~~~~~~~~~~~~~~imn~~~~~~L~~l~~~~g~~~~g~~~~-~--~ 207 (315) T protein:vir:80 149 ----------------IVDA--TDSATADLVKAVGLIAGAGLQVPNGVALDPAFSFALSTEVYPKGSPLAGQPMY-P--A 207 (315) T ss_pred ----------------eeec--cccchHHHHHHHHHHhhccCccceEEEEcHHHHHHHHHHhhccCCcccccccc-c--c Confidence 0000 112345666777665444332 3457899999999888732111000 00000 0 0 Q ss_pred eeEEEEEEEEEcCCeEEEEEEcCcCCCccccc----cEEEEEcchhhhhcccCCccccccccccc------------cce Q lcl|NC_020858. 236 NSIVANADVYEGPFGKVMIHPNRVMAGSGALA----RNAFFVDPEFLQFGWLRKIAEDKKVAKTG------------DAE 299 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a----~~~~~ld~~~~~~~~Lr~~~~~e~laKtG------------d~~ 299 (330) ...+ . ..+=+| +.|+.+++||.....+ ..+|+.|.+++.+...+++ ..+ +...+ |.. T Consensus 208 ~~~g-~---~~tl~G-~PV~~~~~~~~~~~~~~~~~~~~~~GDfs~~~~g~~~~~-~i~-i~~~~~~~~~~~~~~~~~~v 280 (315) T protein:vir:80 208 AGFA-G---LDNWRG-LNVGASSTVSGAPEMSPASGVKAIVGDFSRVHWGFQRNF-PIE-LIEYGDPDQTGRDLKGHNEV 280 (315) T ss_pred cccC-C---Cceecc-eeeEecCcCCcccccccccccEEEEeecccEEEEEecCe-eEE-EeccccccCcccchhhcCcE Confidence 0000 0 001123 5788889998764332 2467778887665543322 111 11111 223 Q ss_pred eeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
300 KFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 300 k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) .+.....++..+++|.|..+|++-.= ++.| T Consensus 281 ~~r~~~r~~~~v~~~~a~~~l~~~~a-~~~~ 310 (315) T protein:vir:80 281 MVRAEAVLYVAIESLDSFAVVKEKAA-PKPN 310 (315) T ss_pred EEEEEEEecceeecccceEEEeeccC-CCCC Confidence 44455668899999999988765443 2333 No 77 >protein:vir:100884 Length: 389 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1473 # MgeName: Lc-Nu # Cross-refs: genbank:acc:YP_358764;genbank:gi:78000028;genbank:GeneID:3726155 Probab=96.26 E-value=0.00081 Score=37.53 Aligned_cols=272 Identities=11% Similarity=0.017 Sum_probs=124.4 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) |+..+.+ ...+-.-+.+...|...-....|+..++.....++....|....-.........||...+.... ..+ T Consensus 109 ~~~~t~~---~gg~~vP~~~~~~i~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~E~~~~~~~~~---~~~ 182 (389) T protein:vir:10 109 TSKVTST---EAGVLIPEEIIYDPTAEVNSVVDLSTLVTKTPVTTPKGTYPILKRATDRFSSVAELAENPKLAE---PEF 182 (389) T ss_pred hcccccC---CcceeehHHHHHHHHHHHHhhhhHHhhcceeeccCCeeEEEEEecCCCcccccccccccccccc---ccc Confidence 4433211 1111122356666666667788888877666655554555433322222223457666543221 122 Q ss_pred cceE---EEEeeeeeehhHHHHHhhcc--ccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 81 GNYT---QIMRKSGIISGTQNITDEAG--RATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 81 ~N~t---QIf~~~v~VS~T~~av~~~G--~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) ..++ .-+..-+.|| .+.+.... ....+..++ ...+.+-++..+++|...... 
T Consensus 183 ~~i~~~~~k~~~~~~iS--~ell~ds~~~l~~~i~~~l---a~~~~~~~~~~i~~g~~~~~~------------------ 239 (389) T protein:vir:10 183 NKVDWSVATYRGAIPLS--EEAIADSAVDLTALVGQSI---KEKSVNTYNAMIAPVLQSFTA------------------ 239 (389) T ss_pred eeeeeeheeeEeeehhh--HHHHhhhhHHHHHHHHHHH---HHHHHHHHHHHHhhhhccccc------------------ Confidence 2222 2223334444 44444332 222333333 344556677777766432100 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcc Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKN 235 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~ 235 (330) .++....+.+.|.+++...-..+.+ ..+++|+....+|..+- |.+. |+....+-. T Consensus 240 --------------------~~~~~~~~~d~l~~~~~~~~~~~~~-a~~~~n~~~~~~L~~lk-d~~G---~~i~~~~~~ 294 (389) T protein:vir:10 240 --------------------KKTTTDTLVDSLKHILNVDLDPAYS-RALVVTQSLFNTLDTLK-DKNG---RYLLHDASD 294 (389) T ss_pred --------------------ccccccccHHHHHHHHHhhhhhhhC-cEEEecHHHHHHHHHhh-ccCC---CeeeecCcc Confidence 0111234556677776644443322 35889999999998863 3321 222211100 Q ss_pred eeEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEEecc Q lcl|NC_020858. 236 NSIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKNE 314 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N~ 314 (330) .. .....-.+=+|. |.++.+.++|..+. ...+++-|++..-+-+.|.-..-+..--........++..+.+.+.+| T Consensus 295 ~~--~~~~~~~~l~G~pV~~~~~~~~~~~~~-~~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~r~d~~~~~~ 371 (389) T protein:vir:10 295 SI--TDGTAKGTILGVPVYVVGDTLLGSLAG-DQKAFVGDLKRGVLFTDRQQVTLAWEDSKIYGKYLGAAFRFGVQKADS 371 (389) T ss_pred cc--cccccccccccceeEEecccccCCCCC-ceEEEEeeccccEEEEeecceEEEeeccccccceEEEEEEeccEEecc Confidence 00 000001112454 45555555554321 123555666532111222100100000011223456666789999999 Q ss_pred hheeEEeccccccccC Q lcl|NC_020858. 
315 KGLGVAADLYGLTAST 330 (330) Q Consensus 315 ~a~g~i~gLt~~~~~~ 330 (330) .|+-.++ ++...+++ T Consensus 372 ~a~~~~~-~~~~~~~~ 386 (389) T protein:vir:10 372 KAGYFVT-NTDVPGSA 386 (389) T ss_pred cceEEEE-eeccCCCC Confidence 9987664 66666666 No 78 >protein:vir:3613 Length: 272 # NCBI annotation: MHP # Family: family:all:522 # MgeID: mge:74 # MgeName: TP901-1 # Cross-refs: genbank:acc:NP_112699;genbank:gi:13786567;genbank:GeneID:921035 Probab=96.06 E-value=0.0011 Score=36.91 Aligned_cols=260 Identities=15% Similarity=0.094 Sum_probs=130.2 Q ss_pred CCccccceeeccccccccccce----------eeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELAD----------VVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d----------~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||... |..+. .-.-|=+++ .+..+.+.+.++ -|+.--+=+.+.|. .+.++ .-..||.+.+ T Consensus 1 ma~~~-T~~~d--~iiPev~~~~v~~~~~~~~~~~~~~~~~~~l---~g~~G~ti~iP~~~--~~gda--~~~~eg~~i~ 70 (272) T protein:vir:36 1 MSKQK-TTLAD--LVNPEVLAPIVSYELNKALRFAPLAQVDTTL---QGQPGNTLKFPAFT--YIGDA--ADVAEGGEIS 70 (272) T ss_pred CCCcc-eehhh--hhchHHHHHHHHHHHHhhhhhcccccccccc---ccCCCCEEEEeeec--cCccc--cccCCCCccC Confidence 88654 22222 111121221 223444444433 23211122345674 34332 2356787776 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ....+.....--+.|. .+.|+|++-+... .+ +|.+.+-..+....+.|.+++.++..-.+.. 
T Consensus 71 ~~~lt~~~~~~~i~~~-~k~~~vtD~~~~~--~~-~d~~~~~~~~~a~~~a~~~d~~i~~~l~~~~-------------- 132 (272) T protein:vir:36 71 LDKIGTTTKSVTIKKA-AKGTEITDEAALS--GY-GDPIGESNKQLGLSLANKVDDDLLSAAKTTS-------------- 132 (272) T ss_pred hhhcCCcceeEeeehh-hccccccHHHHhh--cc-chHHHHHHHHHHHHHHHHHHHHHHHHhcccc-------------- Confidence 6655554444444553 5678888765553 22 4555555556666677888877753211000 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) ......++.+.|.++++++=+++...+.++|||.....+.+...-.. . .. T Consensus 133 -------------------------~~~~~~~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~~----~-~~ 182 (272) T protein:vir:36 133 -------------------------QTVSTKANVDGVQAALDIFNDEDAQAYVLIVNPKDAAKIRKDANAKN----I-GS 182 (272) T ss_pred -------------------------ccccccccHHHHHHHHHHhhhcCCCceEEEEcHHHHHHHhccccccc----c-cc Confidence 00112467788999999999999899999999998777655422111 0 01 Q ss_pred cCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCC-ccccccccccc-cceeeEEEEEEE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRK-IAEDKKVAKTG-DAEKFMLIGEGA 308 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~-~~~~e~laKtG-d~~k~~i~~E~t 308 (330) ..+..-..... | ..|--++||.+..||.++..... +++-+.-+.+..-++ .-|.+..++.+ |.-+. .--|+ T Consensus 183 ~~~~~~~~~G~---i-g~~~G~~Vv~s~~~p~~~~~~~~-~~~~~gA~~~~~~~~~~vE~~R~~~~~~d~i~~--~~~y~ 255 (272) T protein:vir:36 183 EVGANALINGT---Y-ADVLGAQIVRSKKLAEGSALMFK-IVSNSPALKLVLKRGVQVETDRDIVTKTTVITA--DEHYA 255 (272) T ss_pred cccccceeeec---c-ceecCeeEEEeCCCCCCceeEEE-EEecccceeeeecCCcccccccchhhcCcEEEE--EEEEE Confidence 11111001111 1 11222699999999987654322 233333333221122 11222222222 22222 33389 Q ss_pred EEEecchheeEEeccccc Q lcl|NC_020858. 
309 LKPKNEKGLGVAADLYGL 326 (330) Q Consensus 309 Le~~N~~a~g~i~gLt~~ 326 (330) +.+.||.+..+++ ..|. T Consensus 256 ~~v~~~~~vv~~t-~~g~ 272 (272) T protein:vir:36 256 AYLYDLTKVVNIT-FTGV 272 (272) T ss_pred EEEEcCccEEEEe-ecCC Confidence 9999999876653 3444 No 79 >protein:vir:8420 Length: 477 # NCBI annotation: gp15 # Family: family:all:21 # MgeID: mge:155 # MgeName: Omega # Cross-refs: genbank:acc:NP_818316;genbank:gi:29566752;genbank:GeneID:1260033 Probab=96.03 E-value=0.00066 Score=38.02 Aligned_cols=300 Identities=9% Similarity=0.020 Sum_probs=128.7 Q ss_pred CCccccceeecccc-c--cccc-cceeeEecCCcccceeeeeccceecc--ceeeeeeeeccCccccccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAK-G--NREE-LADVVSRITPEDTPIYSMIEKVSFDT--THPEWTTDELAAPGANITLEGDEYTFDAT 74 (330) Q Consensus 1 Ma~~t~~~~t~~~~-g--~~ed-l~d~I~~i~p~dTP~~s~ig~~~~~~--~~~~W~td~L~~~~~na~~EG~d~~~~~~ 74 (330) +.... ..++.... | --++ +.+.|...--..+|+.++++...... -.+.|....-.++..--..||+..+.... T Consensus 151 ~~~~~-~~~~~~~~gg~lv~~~~~~~~ii~~l~~~~~i~~~~~~~~~~~~~~~~~ip~~~~~~~~a~~~~Eg~~~~~~~~ 229 (477) T protein:vir:84 151 GEEYR-DLDRNGGTGGYAVPPLWMMNRFIELARAGRTYANLCPTEPLPGGTSSINIPKILTGTSTAIQAADNAALTAPSA 229 (477) T ss_pred hhhhc-cccccCCCcceeeccchhHHHHHHHhhhcchHHHhhceeeecCCcceeEEEEEecCcceeeeeccCcccccccc Confidence 00000 11111111 1 1123 44555554455677777666544322 23555543322222222346654433211 Q ss_pred c-CceEecceEEEEee-eeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcc Q lcl|NC_020858. 75 V-SPERLGNYTQIMRK-SGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTN 152 (330) Q Consensus 75 ~-~~~~~~N~tQIf~~-~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn 152 (330) . ....+..++--.+| ..-|.=|.+.+..... +..+|=..+-...+.+-+|.+||+|.-. 
.-+.-||+.+-..+ T Consensus 230 ~~s~~~f~~i~~~~~k~~~~~~iS~ell~ds~~-~l~~~i~~~l~~~~~~~~d~~~l~G~Gt----~~~p~Gi~~~~~~~ 304 (477) T protein:vir:84 230 HEVDLTDGFVQANVKTIAGQQGIAIQLLDQAAV-SVDEFVFRDLAADYANKLNVQVISGTGS----NNQVVGVRATAGIT 304 (477) T ss_pred cccccceeeEEEeeeeEEeeeHHHHHHHhccch-hHHHHHHHHHHHHHHHHHHHHHhccCCC----CCccceeeeccccc Confidence 1 11112222211111 2223335555555442 3334444455566888899999988532 11234555432111 Q ss_pred ccccccccccccccccccccccccccccc---ccHHHHHHHHHHHHhcCC-ceeEEEeChHHHHHHHHhhccceeeeeee Q lcl|NC_020858. 153 VSRGATGANGGYNTGTGLTVAPTDGTQRA---FSKAIMDDVMQQGYQSGA-NFKHVFVSPYVKSVFVTFMSDTNVASFRY 228 (330) Q Consensus 153 ~~~g~~g~~~~~~~~~~~~~~~t~gt~~~---lTe~~l~~~~~~~~~~Gg-~~~~i~v~~~~k~~is~f~~~~~~~~~r~ 228 (330) ... ....++.-+ ...+.|.+++..+..+.. ....++++|....++..+. +.+. |+ T Consensus 305 ~~~-----------------~~~~~~t~~~~~~~~~~i~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~lk-d~~G---~~ 363 (477) T protein:vir:84 305 QVT-----------------ATSAGSALEKHQIIYQKIADAIQRVHTSRFLEPEVIVMHPRRWASFHAIF-AGDD---RP 363 (477) T ss_pred ccc-----------------ccccccchhhHHHHHHHHHHHHhhccccccCCccEEEEcHHHHHHHHHhh-ccCC---Ce Confidence 110 101111111 122333333334443322 3446788998888888864 3321 22 Q ss_pred eecCC--cceeEEEEEE-EEEcCCeE---EEEEEcCcCCCccccc---cEEEEEcchhhhhcccCCc---cccccccccc Q lcl|NC_020858. 229 AASNG--KNNSIVANAD-VYEGPFGK---VMIHPNRVMAGSGALA---RNAFFVDPEFLQFGWLRKI---AEDKKVAKTG 296 (330) Q Consensus 229 ~~~~~--~~~~~~~~v~-~~~tdfG~---v~iv~nR~mp~~~~~a---~~~~~ld~~~~~~~~Lr~~---~~~e~laKtG 296 (330) ..... ....++.... .-..+.|+ +.++.+..||++...+ ..+++.|.+.+-+.. ++. 
...+..+ .- T Consensus 364 l~~~~~~~~~~~~~~~~~~~~~~~~~l~G~pVv~s~~~p~~~~~~~d~~~i~~gd~~~~~i~~-~~~~~~~~~~~~~-~~ 441 (477) T protein:vir:84 364 LIVPSGPGFNNLGVLTEVASQRVVGQMHGLPVVTDPTLPTTLGTGTDQDVIHVLRASDLALFE-SSVRMRALQETRA-EN 441 (477) T ss_pred eeecCcccccccccccccccccccchhcccceEecCcccccccccCCcceEEEEEeceEEEEe-eceeEEecccccc-cc Confidence 11100 0000111000 01112233 4788888999753222 246777777654432 121 1111122 11 Q ss_pred cceeeEEEEEEE-EEEecchheeEEeccccccccC Q lcl|NC_020858. 297 DAEKFMLIGEGA-LKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 297 d~~k~~i~~E~t-Le~~N~~a~g~i~gLt~~~~~~ 330 (330) ....+++++.+. +-+|.|+++.+|+| +++++.| T Consensus 442 ~~~~~~v~~~~~~~~~r~~~afv~~t~-~~~~~~~ 475 (477) T protein:vir:84 442 LSVLLQVYGYLAFTAARFPQSVVEIGG-TALTAPT 475 (477) T ss_pred ceeeeeehhhhhhhhhccccceEEeec-ccccccc Confidence 223344455443 45578999987776 6888888 No 80 >protein:vir:4197 Length: 314 # NCBI annotation: putative structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:88 # MgeName: psiM100 # Cross-refs: genbank:acc:NP_071822;genbank:gi:11863105;genbank:GeneID:1257607 Probab=95.97 E-value=0.0012 Score=36.62 Aligned_cols=291 Identities=11% Similarity=0.049 Sum_probs=124.0 Q ss_pred CCccccceeecccc--ccccccceeeEecCCcccceeeeecccee-cc-ceeeeeeeeccCcccccccccccc---cccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAK--GNREELADVVSRITPEDTPIYSMIEKVSF-DT-THPEWTTDELAAPGANITLEGDEY---TFDA 73 (330) Q Consensus 1 Ma~~t~~~~t~~~~--g~~edl~d~I~~i~p~dTP~~s~ig~~~~-~~-~~~~W~td~L~~~~~na~~EG~d~---~~~~ 73 (330) -.-++.++++.+.- --.|+-.+.+...=-+.+||++.+-...+ .+ ...-|....-... .+...|+.+. +... 
T Consensus 7 ~~~~~k~it~~d~~gG~L~P~~~~~~i~~l~e~s~i~~~a~vi~t~~s~~~~i~~i~~g~~~-~~~~~~~~~~~~~~~~~ 85 (314) T protein:vir:41 7 PFQITPKIDVPDLGKGILAVQRFGEFVREVRENSAIIKDARVLNALKSYEVDISRISLGVEL-EPGRNTSGTKVAPTADE 85 (314) T ss_pred HHHhhcccccccCCCceeChHHHHHHHHHHHhccchhhheeeecccCccceeecccccCccc-ccccccccCCccCCccc Confidence 01111111111111 01222223333434577888887653221 11 1122222211100 1112222221 2122 Q ss_pred ccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcC-Cccc--ccchhHHHHHh Q lcl|NC_020858. 74 TVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASV-GGAT--RESGSLPTWVK 150 (330) Q Consensus 74 ~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~-~~~~--r~~~Gi~~~i~ 150 (330) .......=+.-+.-. -+.||.-.-.=+..| .+.-.|-+.+-.+.+.+|+|.++++|.... .+.| .+..|++.... T Consensus 86 ~tf~~~~l~~~kl~~-~v~is~e~L~D~a~~-~~le~~i~~~~Ae~~g~~~~~~~~nGdg~~~s~~~~~~~p~G~l~~a~ 163 (314) T protein:vir:41 86 VTVSTNTLEMKELVT-KVVLEDEALEDNIEQ-SAFEQTITSLLASGVTYDLECFFLHADSSLTTGRELYRINDGWMKLAG 163 (314) T ss_pred ccccceeeeeEEEEE-eecccHHHHHhhhch-hhHHHHHHHHHHHHHHHHHHHHhhccccCCcCcccchhcchhhhhhcc Confidence 222222223333333 466775444322223 344444445667789999999999997532 2222 36677655421 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHH---Hh-cCCceeEEEeChHHHHHHHHhhccceeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQG---YQ-SGANFKHVFVSPYVKSVFVTFMSDTNVASF 226 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~---~~-~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~ 226 (330) ..+. . ..+...+++.+.|.+++..+ |- ++++. ..++++....++.++..+...... 
T Consensus 164 ~~~~------------------~-~~~~~~~~~~~~~~~l~~sl~~~yr~~~~~~-~~~m~~~t~~~~r~~l~~~~~~l~ 223 (314) T protein:vir:41 164 NQYT------------------D-AEPEDENWPLNLFDGMMDELDTRYLQLKPRM-KFYVSNEIYNGYRKQLLVRETGLG 223 (314) T ss_pred ccee------------------e-cCccccccHHHHHHHHHHhcCchhhcCCCce-EEEecHHHHHHHHHHHhccCCccc Confidence 1100 0 11122356788888888887 43 34444 455688888888877554432211 Q ss_pred eeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc-cccccccccccceeeEEE- Q lcl|NC_020858. 227 RYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI-AEDKKVAKTGDAEKFMLI- 304 (330) Q Consensus 227 r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~-~~~e~laKtGd~~k~~i~- 304 (330) +.....+...+ +.-+.|+...+||.-..-+..+++-|+.++-+..-+.+ .+.+..++.+ ..++... T Consensus 224 ~~~~~~~~~~~-----------l~G~PV~~~~~~~~~~~~~~~i~fgd~~nlv~~~~~~ir~~~~~~a~~~-~~~~~~~~ 291 (314) T protein:vir:41 224 DSALIGATGLQ-----------YDGIPIQYVPALDALGDDKARALLTVPTNLVYGFWRNIRIEPKRDAAMR-RTEYIASL 291 (314) T ss_pred chhhhCCCCce-----------ecceeeEecccccccCCCCceEEEechhheEEEeeceeEEeecccCcCC-eEEEEEEE Confidence 11111111111 22345566677766544456788999998744321111 1223333221 1111111 Q ss_pred -EEEEEEEecchheeEEeccccc Q lcl|NC_020858. 305 -GEGALKPKNEKGLGVAADLYGL 326 (330) Q Consensus 305 -~E~tLe~~N~~a~g~i~gLt~~ 326 (330) .-+.++.-+..+.++|..-++= T Consensus 292 r~d~~~~~~~aa~~~~~~~~~~~ 314 (314) T protein:vir:41 292 RADCNYEDENAAVAAVIDMSSGG 314 (314) T ss_pred EeceEEEEcCcEEEEEeeccCCC Confidence 1222333222222222222111 No 81 >protein:vir:105334 Length: 276 # NCBI annotation: putative phage major capsid protein # Family: family:all:522 # MgeID: mge:1679 # MgeName: PH15 # Cross-refs: genbank:acc:YP_950669;genbank:gi:119967839;genbank:GeneID:4643213 Probab=95.93 E-value=0.0012 Score=36.52 Aligned_cols=264 Identities=15% Similarity=0.094 Sum_probs=140.4 Q ss_pred CCccccceeeccccccccccc----------eeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELA----------DVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~----------d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||..+ |..+. +-.-|=++ ..+..+...++-+ -|+.--+=+.+.| ..+.++. -..||.+.+ T Consensus 1 Ma~~~-T~l~d--~i~Pev~~~~v~~~~~~~~~~~~~~~~~~~l---~g~~G~ti~iP~~--~~igda~--~~~eg~~i~ 70 (276) T protein:vir:10 1 MAQGT-TTKST--QIVPEVLAPMMQAELDKKLRFAQFADIDSTL---VGQPGDTLTFPAF--VYSGDAT--VVPEGQKIP 70 (276) T ss_pred CCcce-eehhh--hhchHHHHHHHHHHHHhhhhhcccceecccc---cCCCCCEEEeeee--cCCCccc--cccCCCccC Confidence 88543 21111 11111111 1223333333322 2321112234567 3444333 256888877 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ...-+.....--+.| ..+.|++++-+.. ..+ +|.+..-......-+.|.+...++.--+.... T Consensus 71 ~~~lt~~~~~a~i~~-~~k~~~~tD~a~~--~~~-~dp~~~~~~~~~~~~a~~~d~~~~~~l~~~~~------------- 133 (276) T protein:vir:10 71 VDKIETNRREAKIHK-IGKGTDITDEALL--SGY-GDPQGEAVRQHGLAIANKVDNDVLEALRGTKL------------- 133 (276) T ss_pred ccccccceeeEEeeh-ccccccccHHHHH--hhc-cchHHHHHHHHHHHHHHHHHHHHHHHHhcccc------------- Confidence 666555554444444 3567777766554 322 56677777777777888888777632111000 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) ..+..++|.+.|.++++++=+++...+.++|||.+.-.+-+.....+... 
T Consensus 134 -------------------------~~~~~~~t~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~----- 183 (276) T protein:vir:10 134 -------------------------TVSADIGTLAGLEAAIDTFDDEDLEPMVLFINPKDAGKLRSSASDNFTRA----- 183 (276) T ss_pred -------------------------cccccccCHHHHHHHHHHhccccCcccEEEEcHHHHHHHHHhcccccccc----- Confidence 00112477889999999998888888999999998777755432222110 Q ss_pred cCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc-cccccccccccceeeEEEEEEEE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI-AEDKKVAKTGDAEKFMLIGEGAL 309 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~-~~~e~laKtGd~~k~~i~~E~tL 309 (330) .+.....+... .|-+..| ++|+.++.+|.+ ..+++.+..+.+..-++. -|.+..++.+ ++......=|+. T Consensus 184 s~~g~~~~~~G--~ig~~~G-~~Vi~s~~~p~~-----t~~l~~~gAi~~~~~~~~~vE~dRd~~~~-~d~i~~~~~y~~ 254 (276) T protein:vir:10 184 TELGDNIIVKG--AFGEALG-AVIVRSKKLDEG-----EAILAKRGAVKLITKRDFFLETDRDPSTK-TTALYSDKHYVA 254 (276) T ss_pred ccccccceecc--ccceecc-eeEEEcCCCCcc-----eEEEEeccceeeeecCCceeecccchhhc-ccEEEEeeEEEE Confidence 00001100000 1222223 688999999865 457889888887554442 1222233332 222223334889 Q ss_pred EEecchheeEEeccccccccC Q lcl|NC_020858. 310 KPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 310 e~~N~~a~g~i~gLt~~~~~~ 330 (330) .+.||.+..+|+-=+|-+.+- T Consensus 255 ~~~~~~~vv~~t~~~~~~~~~ 275 (276) T protein:vir:10 255 YLYDESKAVKVTKGAGTTDSG 275 (276) T ss_pred EEEcCcceEEEecCCcCCcCC Confidence 999999999998666544443 No 82 >protein:vir:96123 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1602 # MgeName: 37 # Cross-refs: genbank:acc:YP_240078;genbank:gi:66395742;genbank:GeneID:5133103 Probab=95.83 E-value=0.0014 Score=36.23 Aligned_cols=263 Identities=14% Similarity=0.061 Sum_probs=139.0 Q ss_pred CCccccceeeccccccccccc----------eeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELA----------DVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~----------d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||..+ |-.+. .-..|=++ ..+..+.+.++.+ -|+.--+=+.+.|. .+.++. -..||.+.+ T Consensus 1 ma~~~-T~~~d--~i~Pev~s~~v~~~~~~~~~~~~~~~~~~~l---~g~~G~tv~ip~~~--~~g~~~--~~~~g~~i~ 70 (274) T protein:vir:96 1 MAQGT-TKVSN--LIVPEVLAPMMQAELDKKLRFAQFADIDSTL---VGQPGDTLTFPAFT--YSGDAQ--VIAEGEKIP 70 (274) T ss_pred CCccc-cchhh--hhhhHHHHHHHHHHHHhhhhhcccccccccc---cCCCCCEEEEEeec--cCCCcc--ccCCCCcCc Confidence 98766 22111 11111111 1223333344332 23211122345674 222222 235777776 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ..........--+.|+ .+.|++++-+.+. . .+|.+..-..+....+.+.++..++.--+... T Consensus 71 ~~~it~~~~~~~i~~~-~~~~~i~D~~~~~--~-~~d~~~~~~~~~~~~~a~~~d~~i~~~l~~a~-------------- 132 (274) T protein:vir:96 71 VDQIGTSKREAKVRKI-GKGTELTDEAVLS--G-FGDPQGEAVRQHGLAIANKVDNDVLEALKGAT-------------- 132 (274) T ss_pred hhhcccceeEEEEEee-eceeeecHHHHHh--h-cchHHHHHHHHHHHHHHHHHHHHHHHHHhcCC-------------- Confidence 6665555544455664 5778888766543 2 24677776777777888999888774211100 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) ..+...+++.+.|.++++++=+++...+.++|||.+...+.+.....+... . 
T Consensus 133 ------------------------~~~~~~~~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~----~ 184 (274) T protein:vir:96 133 ------------------------LTVEADITKLDGLQTAIDKFNDEDLEPMVLFVNPLDAGGLRTSASDNFTRP----T 184 (274) T ss_pred ------------------------CCcCcccccHHHHHHHHHHhcccCCCceEEEeCHHHHHHHHhccccccccc----c Confidence 001112467888999999998888889999999998777755322222110 0 Q ss_pred cCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCC-ccccccccccccceeeEEEEEEEE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRK-IAEDKKVAKTGDAEKFMLIGEGAL 309 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~-~~~~e~laKtGd~~k~~i~~E~tL 309 (330) ..+. . ....-....|--++|+.+..||.+ ..|++.+..+.+..-++ ..+.+..++.+ ++......=|+. T Consensus 185 ~~g~-~---~~~~g~ig~~~G~~Vi~s~~~p~~-----t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~-~d~i~~~~~yg~ 254 (274) T protein:vir:96 185 QLGD-N---IIVKGAFGEALGAVIVRSNKLNKG-----EALLAKKGAVKLITKRDFFLEKDRDASRK-STALYSDKHYVA 254 (274) T ss_pred cccc-c---ceeecccceecCeeEEEcCCCCcc-----eEEEEeCcceeeeecCCcccccccchhhc-ccEEEEeeEEEE Confidence 0011 1 011111112223689999999965 45788888777754333 11222223322 223333344899 Q ss_pred EEecchheeEEecccccccc Q lcl|NC_020858. 310 KPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 310 e~~N~~a~g~i~gLt~~~~~ 329 (330) .+.||.+..+++-=++=--- T Consensus 255 ~~~~~~~vv~~t~~~~~~~~ 274 (274) T protein:vir:96 255 YLYDESKVVKITKGAGDEVM 274 (274) T ss_pred EEEcCccEEEEEcCcccccC Confidence 99999888776532221111 No 83 >protein:vir:80376 Length: 435 # NCBI annotation: gp6, major capsid head protein # Family: family:all:21 # MgeID: mge:1881 # MgeName: phi644-2 # Cross-refs: genbank:acc:YP_001111085;genbank:gi:134288639;genbank:GeneID:4960624 Probab=95.76 E-value=0.0015 Score=36.07 Aligned_cols=284 Identities=11% Similarity=0.042 Sum_probs=130.2 Q ss_pred CCccccceeecc-ccccccccceeeEecCCcccceeeee-ccceeccceeeeeeeeccCccccccccccccccccccCce Q lcl|NC_020858. 
1 MAVVTNTFQSTG-AKGNREELADVVSRITPEDTPIYSMI-EKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~-~~g~~edl~d~I~~i~p~dTP~~s~i-g~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~ 78 (330) ++..+.+ +.. ....-+.+.+.|...-...+|+..+- -..+..+..+.|..-+-. +...-..||...+........ T Consensus 130 ~~~~~~~--~~~gg~lvP~~~~~~ii~~l~~~~~i~~~~~~~v~~~~~~~~~p~~~~~-~~a~~v~E~~~~~~~~~~f~~ 206 (435) T protein:vir:80 130 MSLNTLS--PGAGGVLVPENLSSEVIELLRPKSVVRKLGARTLPLSNGNITIPRLKGG-AIVGYIGADTDIPTTQQQFDD 206 (435) T ss_pred hhhcccC--CCCCccccchhHHHHHHHHHhhhchhhhccceeeecCCCceEEEEEeCC-cceeeeccCccccccccceee Confidence 2221111 110 01112244444444334556665531 122233333444333221 111223577665543221111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhcccc-chHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAGRA-TKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~-~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ..-+ ..-+...+.|| .+.+...+.. +..++=...-...+.+-+|.+||+|.... -+..||..+..... T Consensus 207 i~~~-~~k~~~~~~is--~ell~ds~~~~~l~~~i~~~l~~a~~~~~d~a~l~G~G~~----~~p~Gi~~~~~~~~---- 275 (435) T protein:vir:80 207 LKLT-AKKMAALVPIA--NDLIKYAGVNPNVDQIVVGDLTAAIGAREDKAFIRDDGTA----NTPKGLRFWALPGN---- 275 (435) T ss_pred EEEe-eEEEEEeehhh--HHHHHhhcccHHHHHHHHHHHHHHHHHHHHHHhhccCCCC----Ccccceeecccccc---- Confidence 1111 12222333343 4445544432 33333334455668899999999985321 12335443321100 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCc--eeEEEeChHHHHHHHHhhccceeeeeeeeecCCcc Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGAN--FKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKN 235 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~--~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~ 235 (330) +.....++....+..++.+++..+..+... ...+++||....++..+- +.+. ++....... 
T Consensus 276 -------------~~~~~~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~lk-d~~G---~~l~~~~~~ 338 (435) T protein:vir:80 276 -------------VITASDGSTLQKIETDLGKAILALENADANLTQPGWIMAPRTFRFLEGLR-DGNG---NKVYPELAN 338 (435) T ss_pred -------------eeecccccchhhHHHHHHHHHHHhhccccccccCEEEEcHHHHHHHHhhh-ccCC---ceeccCCCC Confidence 111122222233345566666666654332 335688999998888863 3332 222111111 Q ss_pred eeEEEEEEEEEcCCeEEEEEEcCcCCCccccc---cEEEEEcchhhhhcccCCcccccc----ccccc----------cc Q lcl|NC_020858. 236 NSIVANADVYEGPFGKVMIHPNRVMAGSGALA---RNAFFVDPEFLQFGWLRKIAEDKK----VAKTG----------DA 298 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a---~~~~~ld~~~~~~~~Lr~~~~~e~----laKtG----------d~ 298 (330) .++ +| +.|+.+.+||.....+ ..+++.|++++-+. .|+-..-+. .-+++ +. T Consensus 339 ~~l----------~G-~pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~-~~~~~~i~~~~~~~~~~~~~~~~~~f~~n~ 406 (435) T protein:vir:80 339 GML----------KG-YPVGKTTQVPINLGEAGKESEIYFTDFGDVFIG-EEETLEIDYSKEATYKDADGHMVSAFQRDQ 406 (435) T ss_pred CeE----------ee-eeeEEeccccccccCCCCcceEEEEEcccEEEE-eecceEEEEeccccccccccchhhhhhcCc Confidence 111 23 4677777887643222 24677787764433 222111110 01111 23 Q ss_pred eeeEEEEEEEEEEecchheeEEecccccc Q lcl|NC_020858. 299 EKFMLIGEGALKPKNEKGLGVAADLYGLT 327 (330) Q Consensus 299 ~k~~i~~E~tLe~~N~~a~g~i~gLt~~~ 327 (330) ..+.++..+.+.+.+|+|..+|.|+..=+ T Consensus 407 ~~~r~~~r~d~~~~~~~a~~~l~~~~~~~ 435 (435) T protein:vir:80 407 TLIRVIAKNDFGPRHVESIAVLSGVAWGA 435 (435) T ss_pred ceeeeeeeeCcEeecccceEEEeccCCCC Confidence 44456667889999999999999988766 No 84 >protein:vir:96833 Length: 275 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1642 # MgeName: EW # Cross-refs: genbank:acc:YP_240157;genbank:gi:66395822;genbank:GeneID:5133174 Probab=95.75 E-value=0.0015 Score=36.04 Aligned_cols=258 Identities=15% Similarity=0.150 Sum_probs=136.7 Q ss_pred CCccccceeeccccccccccceee----------EecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVV----------SRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I----------~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||-++.|..+.-. +-|=+++.+ ..+...++-+ -|+.--+=+.+.|. .+.++. -..||.+.+ T Consensus 1 ~~~~~~T~l~d~i--~PEv~~~~v~~~~~~~~~~~~~~~~~~~l---~g~~G~tv~iP~~~--~ig~a~--~~~~g~~i~ 71 (275) T protein:vir:96 1 MALENMTKLANMV--NPEVLAPMMQAELDKKLKFAQFADIDNTL---VGQPGNTITFPAFV--YSGDAK--VVPEGEEIP 71 (275) T ss_pred CCCcccchhhhhh--chHHHHHHHHHHHHHhhhhcccceecccc---cCCCCCEEEeeeec--cCCccc--cccCCCCcc Confidence 7776644333321 222222222 2222222221 23211122345674 344333 245777776 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ....+.....--+.|. .+.|++++-+.. ..+ +|.+..-......-+.+.++..++.--+... T Consensus 72 ~~~lt~~~~~~~i~~~-~~~~~i~D~~~~--~~~-~d~~~~~~~~~a~~~a~~~d~~ll~~l~~a~-------------- 133 (275) T protein:vir:96 72 IDLIETKKRQATIRKI-GKGTVLTDEALL--SGY-GDPKGEAVRQHGLAIANKVDNDVLEALQGAT-------------- 133 (275) T ss_pred hhhcccceeeEEeehh-cccccccHHHHH--hhc-cchHHHHHHHHHHHHHHHHHHHHHHHHhccc-------------- Confidence 6655554444334443 556777765433 332 4667766677777788888887763211100 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) ......+++.+.|.++++++-+++...+.++|||.+...+-+.....+.. . 
T Consensus 134 ------------------------~~~~~~~~~~d~i~dA~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~-----~ 184 (275) T protein:vir:96 134 ------------------------LKVEADITKLAGLQTAIDKFNDEDLEPMVLFVNPLDAGKLRASATDNFTR-----A 184 (275) T ss_pred ------------------------ccccccccCHHHHHHHHHHhccccCCccEEEeCHHHHHHHHhcccccccc-----c Confidence 00011257889999999999888888999999999877765533222210 1 Q ss_pred cCCcceeEEEEEEEEEcCCeE---EEEEEcCcCCCccccccEEEEEcchhhhhcccCCc-cccccccccccceeeEEEEE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGK---VMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI-AEDKKVAKTGDAEKFMLIGE 306 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~---v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~-~~~e~laKtGd~~k~~i~~E 306 (330) ....+..+ ...-+|+ ++|+.++.+|.+ ..+++.+..+.+..-++. -|.+..++.+ ++......= T Consensus 185 ~~~g~~~~------~~G~ig~~~G~~Vi~s~~~p~~-----t~~i~~~gA~~~~~~~~~~vE~~Rd~~~~-~d~i~~~~~ 252 (275) T protein:vir:96 185 TLLGDNVI------VKGAFGEALGAIIVRSNKIKEG-----EAILAKRGAVKLITKRDFFLETERHASHK-STALFSDKH 252 (275) T ss_pred ccccccce------eccccceecCeeEEEeCCCCcc-----eEEEEeccceeeeecCCcccccccchhhc-CcEEEEeEE Confidence 11111111 1122333 588888888865 457888887777543331 1222222222 222223334 Q ss_pred EEEEEecchheeEEeccccccccC Q lcl|NC_020858. 307 GALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 307 ~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) |++.+.||.+..+++= ++++ T Consensus 253 y~~~~~~~~~vv~~t~----~~~~ 272 (275) T protein:vir:96 253 YVAYLYDESKVVKITK----SASG 272 (275) T ss_pred EEEEEEcCccEEEEEe----cccc Confidence 7999999988877642 4444 No 85 >protein:vir:99920 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:1611 # MgeName: Halo # Cross-refs: genbank:acc:YP_655524;genbank:gi:109392294;genbank:GeneID:4157089 Probab=95.72 E-value=0.0016 Score=35.97 Aligned_cols=283 Identities=12% Similarity=0.040 Sum_probs=121.7 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccc--eeeeeeeeccCccccccccccccccccccCce Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTT--HPEWTTDELAAPGANITLEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~--~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~ 78 (330) ||..+.. .....-+.+.+.|...-....|+..+..+....+. .+-+.+.. +..--..||.+.+........ T Consensus 1 Mat~tt~----~g~~vP~~~~~~ii~~~~~~s~l~~~~~~i~~~~~~~~~p~~~~~---~~a~wv~Eg~~~~~~~~~f~~ 73 (311) T protein:vir:99 1 MATFGTG----NLKNLPRNIADGMVKDVVQGSTVAVLSARKPQRFGNEDIITFNGR---PKAEFVGEGQQKSSTTGEFDF 73 (311) T ss_pred CceecCC----CceeccHHHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEeCC---ceeEEeecCcccccccceeeE Confidence 9865521 22222346777777776777787776544433322 23233221 111123588776654332222 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) ..-+...+ .--+.||.-.......-..+..++-..+-.+.+.+.+|.++|+|...-.+. ...|+..++.. + T Consensus 74 v~l~~~k~-~~~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~g~~~g~--~~~g~~~~~~~-----~- 144 (311) T protein:vir:99 74 VTSTPKKA-QVTMRFNEEVQWADEDYQLGVLQTLSEAGAEALARALDLGLYHRINPLTGT--VIPGWSNYLGA-----A- 144 (311) T ss_pred EEEeeEEE-EEeehhhHHHhhcccccHHHHHHHHHHHHHHHHHHHHHHHhhcccCcccCc--ccccccccccc-----c- Confidence 21122222 223444433221111111233444455666778999999999986421111 11122222100 0 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCc--eeEEEeChHHHHHHHHhhcccee-eeeeeeecCCcc Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGAN--FKHVFVSPYVKSVFVTFMSDTNV-ASFRYAASNGKN 235 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~--~~~i~v~~~~k~~is~f~~~~~~-~~~r~~~~~~~~ 235 (330) +..+.. +..+ ......++..++..+-.++.. .+-+++||..+.+|..+ +|.+. ...+....++.. 
T Consensus 145 ---------~~~~~~-~~~~-~~~~~~~i~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~l-kd~~G~~l~~~~~~~~~~ 212 (311) T protein:vir:99 145 ---------SKRVEL-TADT-IANPDLAIEAAVGLLVANGHPTPVNGLALHPSIAWGLSTA-RYTDGRKKFPELGLGIGV 212 (311) T ss_pred ---------cceeec-cccc-cchhHHHHHHHHHHHhhhccCCCccEEEEcHHHHHHHHhh-hccCCCeeecCcccCCCC Confidence 000000 0011 112345667777776666544 33488999999999885 44331 111111111111 Q ss_pred eeEEEEEEEEEcCCeEEEEEEcCcCCCcccc-----------ccEEEEEcch-hhhhcccCCccccccccccc------- Q lcl|NC_020858. 236 NSIVANADVYEGPFGKVMIHPNRVMAGSGAL-----------ARNAFFVDPE-FLQFGWLRKIAEDKKVAKTG------- 296 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~-----------a~~~~~ld~~-~~~~~~Lr~~~~~e~laKtG------- 296 (330) .++ +| +.++.+.++|..... -..+++-|.+ .+++...+.. .- .+.+.+ T Consensus 213 ~~l----------~G-~Pv~~s~~i~~~~~~~~~~~~~~~~~~~~~~~Gdf~~~~~~~~~~~~-~~-~~~~~~~~~~~~~ 279 (311) T protein:vir:99 213 SSF----------EG-IDASVSDTVNGGDEADPDDEDLDAARAVRGIVGDFANGIHWGVQRDI-PV-ELIKYGDPDGQGD 279 (311) T ss_pred cee----------cc-eeeEeecccccccccccccchhhccCcceEEEeeccccEEEEEecCc-eE-EEeecCCCCcchh Confidence 111 11 133333333322110 1123444543 2223221211 11 111222 Q ss_pred ----cceeeEEEEEEEEEEecchheeEEecccc Q lcl|NC_020858. 297 ----DAEKFMLIGEGALKPKNEKGLGVAADLYG 325 (330) Q Consensus 297 ----d~~k~~i~~E~tLe~~N~~a~g~i~gLt~ 325 (330) |...+.....++..+++|+ +.++.+-.= T Consensus 280 ~~~~d~~~~r~~~r~d~~v~~~~-~v~~~~~~A 311 (311) T protein:vir:99 280 LKRHNQIALRLEIVYGWYVFTDR-FVVIENAVA 311 (311) T ss_pred hhhcCcEEEEEEEeecceecChh-HeeeecccC Confidence 2233444667888999974 445444333 No 86 >protein:vir:9704 Length: 394 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:174 # MgeName: 315.2 # Cross-refs: genbank:acc:NP_795466;genbank:gi:28876225;genbank:GeneID:1257769 Probab=95.72 E-value=0.0016 Score=35.95 Aligned_cols=261 Identities=13% Similarity=0.022 Sum_probs=116.5 Q ss_pred CCccccceeecc-ccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcccccccccccccccc-ccCce Q lcl|NC_020858. 
1 MAVVTNTFQSTG-AKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDA-TVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~-~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~-~~~~~ 78 (330) ..... ..++.+ ....-+++.+.|...-...+|+..+.......+....|..-...++...-..||+..+... ..... T Consensus 125 ~~~~~-~~t~~~gg~liP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~E~~~~~~~~~~~~~~ 203 (394) T protein:vir:97 125 EPQKD-GIKKENAKPVSSEEILYTPAREVKTVVDLKPFTTVYQAKKASGKYPVLQRATTKMVTVAELEKNPALAKPDFKD 203 (394) T ss_pred hhhcc-ccccccccccChHHHHHHHHHHhhhhhhhhhhceeeeccCcceEEEEEecCCCccceeccccccccccccccee Confidence 00000 111111 1112235666666666677888877666555555566665443333333345887665321 11111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhcc--ccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAG--RATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G--~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g 156 (330) ..-+.-. +..-+.||. +.+.... ..+.+..++ .+.+.+-++.++|+|.....+ T Consensus 204 v~l~~~k-~~~~i~is~--ell~ds~~~~~~~i~~~l---a~~~~~~~~~~i~~g~~~~~~------------------- 258 (394) T protein:vir:97 204 VAWNIDT-YRGAIPLSQ--ESIDDADVDLVGIVSESI---SQIKVNTTNDAIAKVLKSFTT------------------- 258 (394) T ss_pred EEeehhh-eeeehhhHH--HHHhhhhHHHHHHHHHHH---HHHHHHHHHHHHhhccccccc------------------- Confidence 1112211 223344443 3333332 223333333 344456677788775321100 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhcccee-eeeeeeecCCcc Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNV-ASFRYAASNGKN 235 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~-~~~r~~~~~~~~ 235 (330) ....+-+.+.+++........+. .+++||....+|..+. |.+. ...+....++.. 
T Consensus 259 ----------------------~~~~~~~~~~~~~~~~~~~~~~a-~~v~n~~~~~~l~~lk-d~~G~~i~~~~~~~~~~ 314 (394) T protein:vir:97 259 ----------------------KTVKNLDEIKALLNGGFDPAYNV-SLIVSQSFYQTLDTLK-DGNGRYLLQDDITAVSG 314 (394) T ss_pred ----------------------cccccHHHHHHHHHhhhhhhhCC-EEEEcHHHHHHHHHhh-ccCCCeeeecCcCCCCC Confidence 00123455666666555433332 4779999988888763 3321 111110111111 Q ss_pred eeEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchh-hhhcccCCccccccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 236 NSIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) .+ =+|. |.++.+..++.. .+++.|.+. +.+.. |.-..-+..--..+......+..+...+.+ T Consensus 315 ~~----------l~G~pv~~~~~~~~~~~-----~~~~gd~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~r~d~~v~~ 378 (394) T protein:vir:97 315 KV----------LLGKPVFVLSDEVLGAN-----KAFIGDFKRGVLFAD-RKDLGLRWADNEIYGQYLQAVLRFGVSKVD 378 (394) T ss_pred ce----------eccceeEEecccccCCc-----cEEEeeccccEEEEE-ecceEEEEecccccceeEEEEEEEccEEec Confidence 11 1342 344445555543 345666542 22221 210010000011122345667789999999 Q ss_pred chheeEEeccccccccC Q lcl|NC_020858. 314 EKGLGVAADLYGLTAST 330 (330) Q Consensus 314 ~~a~g~i~gLt~~~~~~ 330 (330) |.|+..|.- + .+++ T Consensus 379 ~~a~~~~~~-~--~~~~ 392 (394) T protein:vir:97 379 DKAGYYVTF-T--PEPL 392 (394) T ss_pred ccceEEEEe-c--cccc Confidence 999876654 2 2333 No 87 >protein:vir:104256 Length: 458 # NCBI annotation: major head protein precursor # Family: family:all:27070 # MgeID: mge:1504 # MgeName: T5 # Cross-refs: genbank:acc:YP_006977;genbank:gi:46401878;genbank:GeneID:2777673 Probab=95.67 E-value=0.0017 Score=35.84 Aligned_cols=293 Identities=9% Similarity=0.058 Sum_probs=125.2 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcccccccccccccccccc--Cce Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATV--SPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~--~~~ 78 (330) .+..+.+-.+......-+.+.+.|...--...|+..+.......+..+......-.. ...-..||...+..... ... T Consensus 159 ~a~~~~~~~~~g~~~ip~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~-~a~~v~e~~~~~~~~~~~~~~~ 237 (458) T protein:vir:10 159 KAVNQSSSVEVSSESYETIFSQRIIRDLQKELVVGALFEELPMSSKILTMLVEPDAG-KATWVAASTYGTDTTTGEEVKG 237 (458) T ss_pred hhhhhcccCccccceehhhHhHHHHHHHHhhhhHHhhcceeecCCcceEEEEecCCc-ceeecccccccccccccccccc Confidence 111111111111111122455555555456677766655544444433322221111 11112355443322111 011 Q ss_pred EecceE-EEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 79 RLGNYT-QIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 79 ~~~N~t-QIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) .+..++ ..++=...|.=|.+.+.... .+...|=...-...|.+-+|.+||+|... + +.-||+....... T Consensus 238 ~~~~i~~~~~k~~~~v~is~ell~ds~-~~~~~~i~~~l~~~i~~~~d~~~l~G~G~--~---~p~Gi~~~~~~~~---- 307 (458) T protein:vir:10 238 ALKEIHFSTYKLAAKSFITDETEEDAI-FSLLPLLRKRLIEAHAVSIEEAFMTGDGS--G---KPKGLLTLASEDS---- 307 (458) T ss_pred cceeeEeeeeeEEeeehhhHHHHhcch-HHHHHHHHHHHHHHHHHHHHHHhhcCCCC--C---ccceeeecccccc---- Confidence 122221 11111222222333433332 22333434444566777899999998642 2 3345544421100 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceee-eeeeeecCCcce Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVA-SFRYAASNGKNN 236 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~-~~r~~~~~~~~~ 236 (330) ...+...+.+....++.+.|.+++.++-.++.....+++||....+|..+ ++.+.. ..+....... 
T Consensus 308 ----------~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~l~~l-kd~~G~~i~~~~~~~~~-- 374 (458) T protein:vir:10 308 ----------AKVVTEAKADGSVLVTAKTISKLRRKLGRHGLKLSKLVLIVSMDAYYDLL-EDEEWQDVAQVGNDSVK-- 374 (458) T ss_pred ----------cceeecccccccccccHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHhh-cccCCceeecccccccc-- Confidence 00111122333445777888888888776665556788999988888775 333211 1110000000 Q ss_pred eEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcc-hhhhhcccCC-c-cccccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 237 SIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDP-EFLQFGWLRK-I-AEDKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 237 ~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~-~~~~~~~Lr~-~-~~~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) .... -.+=+| +.|+.+.+||...... ..++.|. +++.+.. |. . ...++.+.+ +..++....-+++-+.. T Consensus 375 -~~~~---~~~l~G-~pv~~~~~~p~~~~~~-~~~~~~f~~~~~~~~-~~~~~v~~d~~~~~-~~~~~~~~~r~~~~v~~ 446 (458) T protein:vir:10 375 -LQGQ---VGRIYG-LPVVVSEYFPAKANSA-EFAVIVYKDNFVMPR-QRAVTVERERQAGK-QRDAYYVTQRVNLQRYF 446 (458) T ss_pred -ccCc---Cceecc-eeeEEccccccccCCc-ceEEEEecccEEEEE-eeceEEEeecccCC-CceEEEEEEEecceEec Confidence 0000 011234 5778888999865433 3445555 3333322 21 1 011222221 22333333336688888 Q ss_pred chheeEEecccccccc Q lcl|NC_020858. 314 EKGLGVAADLYGLTAS 329 (330) Q Consensus 314 ~~a~g~i~gLt~~~~~ 329 (330) |.|+-++ =+++| T Consensus 447 ~~a~v~~----~~aa~ 458 (458) T protein:vir:10 447 ANGVVSG----TYAAS 458 (458) T ss_pred ccceEEE----eeccC Confidence 8665332 23444 No 88 >protein:vir:6242 Length: 390 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:131 # MgeName: phi-BT1 # Cross-refs: genbank:acc:NP_813696;swissprot:trembl:q859c1;genbank:gi:29366756;interpro:IPR006444;uniprot:Q859C1;genbank:GeneID:1258897 Probab=95.51 E-value=0.0019 Score=35.46 Aligned_cols=273 Identities=10% Similarity=0.004 Sum_probs=119.8 Q ss_pred CCc--cccceeeccccccccccceeeEecCCcccceeeeeccc-eecc-ceeeeeeeeccCccccccccccccccccccC Q lcl|NC_020858. 
1 MAV--VTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKV-SFDT-THPEWTTDELAAPGANITLEGDEYTFDATVS 76 (330) Q Consensus 1 Ma~--~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~-~~~~-~~~~W~td~L~~~~~na~~EG~d~~~~~~~~ 76 (330) ++. ...+-.....+...+...+.|..+-. +.+++..+... ...+ ....|...+-.+.+ .-..||+..+..... T Consensus 106 ~~~~~~~~t~~~~g~~~~~~~~~~~i~~~~~-~~~~l~~~~~~~~~~~~~~~~~p~~~~~~~a-~wv~E~~~~~~~~~~- 182 (390) T protein:vir:62 106 FAPEKRDGTKAGNPNVLSRTLYGQLIAQAVE-RSAIMRGGATTFTTSDANPLDFTVITGRSSA-SIVGETAEIPESYPA- 182 (390) T ss_pred hhhhhhcccccCCCccccccchHHHHHHHHh-hhhhhhhcceeeecCCCceeEEEEEcCCcce-eeecccccccccccc- Confidence 111 11111111112222234444554433 34434333322 2212 22455544332221 223577776553322 Q ss_pred ceEecceEEEEee-eeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 77 PERLGNYTQIMRK-SGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 77 ~~~~~N~tQIf~~-~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) +.+++=-.++ ...+-=|.+.+..... +..++=...-...+.+-+|.+||+|.- +| -||+.... . T Consensus 183 ---f~~i~~~~~k~~~~~~iS~ell~ds~~-~l~~~i~~~l~~~i~~~~d~~~l~G~G----~p---~Gi~~~~~----~ 247 (390) T protein:vir:62 183 ---TAQRSMGGFKYGFASVVSYEFATDQVL-DLVGFLVSDAGPAIGDAMGRHFITGTG----QP---RGILTDAS----P 247 (390) T ss_pred ---eeeeEeeeeeEEeehHHHHHHHhhhhH-HHHHHHHHHHHHHHHHHHHhhhhccCC----cc---cccccccc----c Confidence 2222222222 1222234555544432 222222223345567779999998852 33 24443220 0 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec---- Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS---- 231 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~---- 231 (330) . ....+.++...++.+.|.++..++..+-..-...++|+.....|..+ +|.+. |+... 
T Consensus 248 ~--------------~~~~~~~~~~~~~~~~l~~~~~~l~~~~~~~a~~vmn~~~~~~L~~l-kd~~g---~~l~~~~~~ 309 (390) T protein:vir:62 248 A--------------TATFLATDTDSKVSDALIDLFHEVPSAYRANAKYVVNDLRAAQMRKL-KDANG---QYLWQSGLT 309 (390) T ss_pred c--------------ccceecccccccchHHHHHHHHhhhhhhhcCCEEEEchHHHHHHHHh-hccCC---CeeecCCcC Confidence 0 00111223335677777777766643322222468899999888876 33321 22211 Q ss_pred CCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccc---cccccccceeeEEEEEEE Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDK---KVAKTGDAEKFMLIGEGA 308 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e---~laKtGd~~k~~i~~E~t 308 (330) .+...+ =+| +.|+.+.+||.+. +++.|+++.-+.. +.-...+ ...-+=|...+....-+. T Consensus 310 ~g~~~~----------l~G-~Pv~~~~~~p~~~-----i~~gd~s~~~i~~-~~~~~v~~~~~~~~~~~~~~~~~~~r~d 372 (390) T protein:vir:62 310 VGAPSL----------FNG-KVVETDDGMPADK-----ILFADLSKYRVRF-AGSLRVDRSVDAKFSTDQIVYRFLQRAD 372 (390) T ss_pred CCccce----------ecc-cceEEecCCCCcc-----EEEeeccceeEEe-ecceEEEeeccccccCCcEEEEEEEEeC Confidence 111111 123 3566677888654 5678887644332 2111111 111112333344444588 Q ss_pred EEEecchheeEEeccccccccC Q lcl|NC_020858. 309 LKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 309 Le~~N~~a~g~i~gLt~~~~~~ 330 (330) ..+.+|+|..+|. ++++. T Consensus 373 ~~~~~~~A~~~l~----~~~~a 390 (390) T protein:vir:62 373 GLLVDARGAKVLT----VTPGA 390 (390) T ss_pred cEeechhheEEEE----eecCC Confidence 9999999977766 33333 No 89 >protein:vir:1328 Length: 392 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:28 # MgeName: phi-C31 # Cross-refs: genbank:acc:NP_047927;swissprot:trembl:q9zwv6;genbank:gi:9631145;uniprot:Q9ZWV6;genbank:GeneID:2715889 Probab=95.46 E-value=0.002 Score=35.34 Aligned_cols=280 Identities=12% Similarity=0.081 Sum_probs=121.3 Q ss_pred CCccccceeecc--ccccccccceeeEecCCcccceeeeecc-cee-ccceeeeeeeeccCccccccccccccccccccC Q lcl|NC_020858. 
1 MAVVTNTFQSTG--AKGNREELADVVSRITPEDTPIYSMIEK-VSF-DTTHPEWTTDELAAPGANITLEGDEYTFDATVS 76 (330) Q Consensus 1 Ma~~t~~~~t~~--~~g~~edl~d~I~~i~p~dTP~~s~ig~-~~~-~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~ 76 (330) .+.-....++.. .....+-..+.|..+ ....+++..+.. ... ....+.|....-.+.+ .-..||+..+...... T Consensus 106 ~~~~~~~~t~~~~g~~~~~~~~~~~i~~~-~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~a-~~v~E~~~~~~~~~~f 183 (392) T protein:vir:13 106 FAPEKRDGTKAGNPNVLSRTLYGQLIAQA-VERSAIMRGGASTFTTSDANPMDFTVITGRATA-GIVGETAEIPESYPAT 183 (392) T ss_pred hhhhhhcccccCCCccccccchHHHHHHH-HhhhhhhhhcceeeecCCCceeEEEEEcCCcce-eeecccccccccccce Confidence 000000001111 111122233334344 333444433332 222 2234566555432222 2235777765432221 Q ss_pred ceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccc Q lcl|NC_020858. 77 PERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 77 ~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g 156 (330) ... ..-.++=...|-=|.+.+..... +-.++=...-...+.+-+|.+||+|... + +.-||+.... T Consensus 184 ~~v---~~~~~k~~~~~~iS~ell~ds~~-~l~~~i~~~l~~~i~~~~d~~~l~G~Gt--~---~p~Gil~~~~------ 248 (392) T protein:vir:13 184 TQR---SMGGFKYGFASVVSYEFATDQVL-DLVGFLVSDAGPAIGDAMGRHFLTGTGT--G---QPRGILTDAT------ 248 (392) T ss_pred eeE---EeeeeeEEeeehhHHHHHhcchH-HHHHHHHHHHHHHHHHHHHHHHhcccCC--c---cccccccccc------ Confidence 111 11222222333334444444331 2223333344456677799999998642 1 2235544321 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhcccee-eeeeeeecCCcc Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNV-ASFRYAASNGKN 235 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~-~~~r~~~~~~~~ 235 (330) +. ....+.++...++.+.|.++...+-..-.....+++|+.....|..+ +|... ...+.-...+.. 
T Consensus 249 --~~----------~~~~~~~~~~~~~~d~l~~~~~~l~~~~~~~a~~v~n~~~~~~l~~l-kd~~G~~l~~~~~~~g~~ 315 (392) T protein:vir:13 249 --GA----------NAAFGEADADSKVSDALIDLFHEVPSAYRKNAKFVVNDLRAAQMRKL-KDANGQYLWQSALTVGAP 315 (392) T ss_pred --cc----------cccccccccccccHHHHHHHHHhhhhhhhcCCEEEEcHHHHHHHHHh-hccCCceeecCCcCCCCC Confidence 00 00111222335667777777665543211222467899998888886 34321 111111111111 Q ss_pred eeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcc-c-cccccccccceeeEEEEEEEEEEec Q lcl|NC_020858. 236 NSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIA-E-DKKVAKTGDAEKFMLIGEGALKPKN 313 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~-~-~e~laKtGd~~k~~i~~E~tLe~~N 313 (330) .+| +| +.|+.+.+||++. +++.|++.+.+..-.++. . .....-.-|...+..+.-+..++.+ T Consensus 316 ~~l----------~G-~Pv~~~~~~~~~~-----i~~Gdf~~~~i~~~~~~~i~~~~~~~~~~~~~~~r~~~r~d~~~~~ 379 (392) T protein:vir:13 316 DTF----------NG-KVVETDDGMPADK-----VLFADLSKYRVRFAGSLRVDRSVDAKFSTDQIVYRFLQRADGLLVD 379 (392) T ss_pred cee----------cc-eeeEEcCCCCCCc-----EEEeeccceeEEeecceEEEeeccccccCCcEEEEEEEEeccEEec Confidence 111 23 5677788898654 457788765443212210 0 0111112233445555567888999 Q ss_pred chheeEEeccccccccC Q lcl|NC_020858. 314 EKGLGVAADLYGLTAST 330 (330) Q Consensus 314 ~~a~g~i~gLt~~~~~~ 330 (330) |.|.-++. ++++. T Consensus 380 ~~A~~~~~----~~~aa 392 (392) T protein:vir:13 380 ARGAKVLT----VTPAA 392 (392) T ss_pred ccceEEEE----eeccC Confidence 99866443 23333 No 90 >protein:vir:95376 Length: 425 # NCBI annotation: phage major capsid protein # Family: family:all:635 # MgeID: mge:1567 # MgeName: GBSV1 # Cross-refs: genbank:acc:YP_764476;genbank:gi:115334630;genbank:GeneID:5179263 Probab=95.39 E-value=0.0016 Score=35.97 Aligned_cols=273 Identities=12% Similarity=0.066 Sum_probs=122.4 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccc-cCceE Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDAT-VSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~-~~~~~ 79 (330) ++..+ ++......-+.+.+.|+..=....|++.++...... -.+.|... ...+...-..||++.+.... ..... T Consensus 138 ~~~~~---~~~gg~~vP~~~~~~Ii~~l~~~~~i~~~~~~~~~~-g~~~ip~~-~~~~~a~~v~E~~~~~~~~~~~f~~i 212 (425) T protein:vir:95 138 RNLRA---VAGGELTIPEVVVNRIMDIMGDYTTLYPLVDKIRVK-GTTRILVD-TDTSPATWIEQSGALPTGDVGTIASI 212 (425) T ss_pred Hhhcc---cccCceeccHHHHHHHHHHHHhhhhHHHhhceeecC-ceeEEEEe-cCCcccccccccccccccccccccee Confidence 11111 011111112356666665556677887776544332 23344332 22333333457776543321 11111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHH-HHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKK-GVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~-~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) .-+.. -+...+.|| .+.++.... .+...+... ...+.+-+|.++|+|.......| -||+..+..... T Consensus 213 ~l~~~-k~~~~~~iS--~ell~ds~~--~l~~~i~~~l~~~i~~~~d~~il~G~G~~~~~p---~Gil~~~~~~~~---- 280 (425) T protein:vir:95 213 DFDGF-KVGKVTFVD--NYLLQDSII--NLDDYVTKKIARAIAKALDLAIVKGTGAANKQP---LGIIPSLPPENQ---- 280 (425) T ss_pred eeehe-eeeeeehhh--HHHHhccHH--HHHHHHHHHHHHHHHHHHHHHhhccCCCCcccc---ceeecccccccc---- Confidence 11111 122333444 444444432 233334433 34478899999999975332222 244433211100 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhc---CCceeEEEeChHHHHHHHHh--hccceeeeeeeeec-- Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS---GANFKHVFVSPYVKSVFVTF--MSDTNVASFRYAAS-- 231 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~---Gg~~~~i~v~~~~k~~is~f--~~~~~~~~~r~~~~-- 231 (330) +.+.....+-+.+.++...+-.+ .++...+|.++..+..+..+ .+|.+. |+... 
T Consensus 281 ----------------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~~l~~l~~~kd~~g---~~i~~~~ 341 (425) T protein:vir:95 281 ----------------VTVEADNNLLKNLVKQIGLIDTGDDSVGEIVAVMKRSTYYNRLVEFSIQVDSNG---NVVGKLP 341 (425) T ss_pred ----------------cccccccchHHHHHHHHHhhhhhccccCceEEEEeChHHHHHHHHHHhhcCCCC---ceeeccC Confidence 00111234555566655544432 23343455555555545433 233321 22111 Q ss_pred CCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccc---cccccccceeeEEEEEEE Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDK---KVAKTGDAEKFMLIGEGA 308 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e---~laKtGd~~k~~i~~E~t 308 (330) ++... +=+| ..++.+.+||.+. +++.|+++.-+.. |.-...+ +..-+-+...+..+..+. T Consensus 342 ~~~~~----------~l~G-~pvv~~~~~~~~~-----i~~Gd~~~~~~~~-~~~~~i~~~~~~~f~~~~~~~~~~~r~d 404 (425) T protein:vir:95 342 NLRTP----------DLLG-LRVVFNNFLDDDT-----VLFGEFEQYTLVE-RENITIDSSTHVKFTEDQTAFRGKGRFD 404 (425) T ss_pred CCCCc----------cccc-eeeEEcCcCCCcc-----EEEEecccEEEEe-ecceEEEeecccccccCceEEEEEEeeC Confidence 11110 1123 4678888999764 4567777643332 2211111 111123455666666788 Q ss_pred EEEecchheeEEeccccccccC Q lcl|NC_020858. 309 LKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 309 Le~~N~~a~g~i~gLt~~~~~~ 330 (330) ..+++|.|..++. ++.++ T Consensus 405 ~~~~~~~a~~~~~----i~~~~ 422 (425) T protein:vir:95 405 GKPVKPEAFVLVT----ITDPV 422 (425) T ss_pred cEeecccceEEEE----ecCcC Confidence 9999999987763 33333 No 91 >protein:vir:108211 Length: 318 # NCBI annotation: gp9 # Family: family:all:6420 # MgeID: mge:2004 # MgeName: Giles # Cross-refs: genbank:acc:YP_001552338;genbank:gi:160700658;genbank:GeneID:5758931 Probab=95.29 E-value=0.0022 Score=35.17 Aligned_cols=283 Identities=13% Similarity=0.062 Sum_probs=123.7 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceee----------eecccee-ccceeeeeeee--ccCcccccccccc Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYS----------MIEKVSF-DTTHPEWTTDE--LAAPGANITLEGD 67 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s----------~ig~~~~-~~~~~~W~td~--L~~~~~na~~EG~ 67 (330) |..||+-.-+++- ....+++.|-.=+-..+=+.- ++-+..+ .+-.+...+++ +-+-+.....||. T Consensus 1 ~~~~~~i~s~~~~--~~itv~~ll~~P~~I~~~i~e~~~~~~iad~lf~~~~a~~~~~v~f~~~~p~~~~~d~e~VaEgg 78 (318) T protein:vir:10 1 MTAPTGIVSVSDG--PAITVRELVGNPLWIPTALKKMMVNQFISESLFRNGGANPNGVVAYNEGNPSFLEDDVADVAEFG 78 (318) T ss_pred CCCCCcceeeecC--CceehHHhhCCchhHHHHHHHHHhccchhhhhhhcccccccceeEEEecccccccCcHhhccCcc Confidence 9999865555543 467777775422211111111 1112222 23344554443 3333344457999 Q ss_pred ccccccccC-ceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcC-----CCcCCccccc Q lcl|NC_020858. 68 EYTFDATVS-PERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVAT-----NASVGGATRE 141 (330) Q Consensus 68 d~~~~~~~~-~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g-----~~~~~~~~r~ 141 (330) +++...... ..++-... =.-..+.||+=++. .+..+.+..++.|...-+.|..+...+.- .+....+. T Consensus 79 EiP~~~~~~G~~~ia~~~-K~G~~~~vS~Em~~---~n~~~~v~r~~~~l~Nti~r~~d~~a~dal~sa~t~~~~~s~-- 152 (318) T protein:vir:10 79 EIPVSAGARGLPRTAFAV-KKALGVRVSKEMID---ENRVGAVNDQMLQLRNTFIRANDRSAKALLQSPIVPTLAVPT-- 152 (318) T ss_pred cccccCCCCCchhhhhhe-hhccceeccHHHHh---hcChhHHHHHHHHHHHHHHHHHHHHHHHHHhccccccccCCc-- Confidence 976554433 33332221 22334555544333 44457888888888888888888876541 11111110 Q ss_pred chhHHHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccc Q lcl|NC_020858. 142 SGSLPTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDT 221 (330) Q Consensus 142 ~~Gi~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~ 221 (330) ++.+..+.-.+...+ .....-+.-...+.++.+.--.-|=.+++|++||.....|.+ +. 
T Consensus 153 -----------------~w~~~~~~~~d~~~A-~e~v~~a~~~~~~a~~~~~~~~~GY~pdtIVlhP~~~~~l~~---n~ 211 (318) T protein:vir:10 153 -----------------AWDNGGKVRTDIAIA-IEQISTAAPTAYPAGVGSSDEYFGFIPDTIVMHYALLPILMD---NE 211 (318) T ss_pred -----------------CCCCcccccccchhh-hhhhhhhhhhhhhhhhhhhhhccCccceeeEECHHHHHHHhc---ch Confidence 000000000000000 000000000000000000001234478899999998777632 11 Q ss_pred eeeeeeeeecCCcceeEEEEEEEEEcC-----CeEEEEEEcCcCCCccccccEEEEEcchh---------hhhcccCCcc Q lcl|NC_020858. 222 NVASFRYAASNGKNNSIVANADVYEGP-----FGKVMIHPNRVMAGSGALARNAFFVDPEF---------LQFGWLRKIA 287 (330) Q Consensus 222 ~~~~~r~~~~~~~~~~~~~~v~~~~td-----fG~v~iv~nR~mp~~~~~a~~~~~ld~~~---------~~~~~Lr~~~ 287 (330) .. |. .+.++.+.....+ -|.+. +| ++++.+|++|.+. +++|+-.- +..-.+|+. T Consensus 212 ~~---~~-~y~~~a~~~~~~~-~~tg~~~g~~lG-l~vi~s~~~p~~~-----alvlq~g~vG~~~d~~pl~~t~~~~e- 279 (318) T protein:vir:10 212 NF---MK-VYERNANYVSTAP-DWTGNFPGSVMG-LNVIRSRTFPIDR-----VLIMERGTVGFYSDTRPLQFTALYPE- 279 (318) T ss_pred hh---hh-hhhccchhhhhcc-cccccccceeec-eEEeecCccCCCe-----eEEEecCCcceeeccccceeeecccC- Confidence 10 00 0000000000000 12233 45 8999999999754 46666432 222222210 Q ss_pred ccccccccccceeeEE--EEEEEEEEecchheeEEeccccccc Q lcl|NC_020858. 288 EDKKVAKTGDAEKFML--IGEGALKPKNEKGLGVAADLYGLTA 328 (330) Q Consensus 288 ~~e~laKtGd~~k~~i--~~E~tLe~~N~~a~g~i~gLt~~~~ 328 (330) -..+ ..|..+.|.+ ..=-+.=|..|||...|+||-- . 
T Consensus 280 gg~~--~g~~~~s~~~~~~~~~~~~V~~PkA~~~itgi~~--~ 318 (318) T protein:vir:10 280 GNGP--NGGPTESYRADASHKRALAVDQPKAALWLTGIVT--P 318 (318) T ss_pred CCCC--CCCcchhhheehheeeeeeeeCcceeEEEeeccC--C Confidence 0001 1222233321 1112466889999999888742 2 No 92 >protein:vir:102119 Length: 404 # NCBI annotation: phage major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1641 # MgeName: phiSM101 # Cross-refs: genbank:acc:YP_699941;genbank:gi:110804052;genbank:GeneID:4206662 Probab=95.24 E-value=0.0025 Score=34.89 Aligned_cols=279 Identities=12% Similarity=-0.003 Sum_probs=131.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeec-cCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDEL-AAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L-~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) |...+. +......-+++.+.|...--..+|++.+++..........+....+ ..+..-...||...+....++. T Consensus 110 ~~~~~~---~~gg~~vP~~~~~~ii~~~~~~~~l~~l~~~~~~~~~~g~~~~~~~~~~~~~~~v~e~~~~~~~~~~~~-- 184 (404) T protein:vir:10 110 ISENID---EDGGYAVPEDIQTKINTRLKDTTDLYNMVDYEPVFTRSGSRTYEKRSKQKPMKPLSENQQIPTNGDNGK-- 184 (404) T ss_pred hccccC---CCCceeechhHHHHHHHHHhhhhhHhhhhceeeccCCccceEEEEecCCcceeeccccccccccccccc-- Confidence 322221 1111112346767777776788899998877655433222221111 1222233456766544322221 Q ss_pred ecceEE---EEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccc Q lcl|NC_020858. 80 LGNYTQ---IMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 80 ~~N~tQ---If~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g 156 (330) +.+++- -+...+.|| .+.+.... .+..+|=...-...+.+-+|.++|+|..... 
..+||......+ T Consensus 185 f~~i~~~~~k~~~~~~iS--~ell~ds~-~~l~~~i~~~la~~~~~~~~~~il~G~g~~~----~~~gi~~~~~~~---- 253 (404) T protein:vir:10 185 LERFNFKLKDLADFMSIP--NDLLKFAD-KSLEDWIINWFVDKVRITRNAEILYGAGGDE----HATGIMTANKFK---- 253 (404) T ss_pred eeeeEeeheeeEeeehhh--HHHHhhcH-HHHHHHHHHHHHHHHHHHHHHHHhhcCCCCC----cccceeeccccc---- Confidence 122111 112233344 34444332 2333344445556677889999998854211 122332211000 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHHHHhcCCcee-EEEeChHHHHHHHHhhccceeeeeeeeec---- Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFK-HVFVSPYVKSVFVTFMSDTNVASFRYAAS---- 231 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~-~i~v~~~~k~~is~f~~~~~~~~~r~~~~---- 231 (330) ..+ .....+.+++.+++...-.++...+ .+++||....+|..+ ++.+. |+... T Consensus 254 ----------------~~~--~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l-kd~~G---~~l~~~~~~ 311 (404) T protein:vir:10 254 ----------------KIT--LPKSPALKDFKKCKNVELLNVFKATSSWIVNQDGFNYLDSL-EDKTG---RPYLQPDPK 311 (404) T ss_pred ----------------eee--ccccccHHHHHHHHHhhhhccccCCCEEEEcHHHHHHHHHh-hccCC---ceeeccCcC Confidence 001 1113456677777665444443332 578999998888886 33321 22211 Q ss_pred CCcceeEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchh-hhhcccCCccccccc-cccc----cceeeEEE Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKV-AKTG----DAEKFMLI 304 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~l-aKtG----d~~k~~i~ 304 (330) ++...+ =+|. |.++.+ .||..+.....+++.|++. +.+.. |.-..-+.. -... +...+..+ T Consensus 312 ~~~~~~----------l~G~PV~~~~~-~~~~~~~~~~~~~~gd~s~~~~~~~-~~~~~i~~~~~~~~~~~~~~~~~~~~ 379 (404) T protein:vir:10 312 DPTQYR----------FLGLPVIELPN-DLLLSTESAIPVLLGDTKEAYKYVS-DGAYELATTNIGAGAFETNTTKARII 379 (404) T ss_pred CCCCcc----------ccceeeEEecc-cccCCCCCccEEEEEeccccEEEEE-ecceEEEEeccccchhhcCceEEEEE Confidence 111101 1232 333333 4555554455577788764 33322 321111110 0112 33345666 Q ss_pred EEEEEEEecchheeEEecccccccc Q lcl|NC_020858. 
305 GEGALKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 305 ~E~tLe~~N~~a~g~i~gLt~~~~~ 329 (330) ..+...+++|.|+.++.-=+--+.+ T Consensus 380 ~r~d~~v~~~~a~~~~~~~~aa~~~ 404 (404) T protein:vir:10 380 MRIDGNVKDSEALLIAEIPVESVQA 404 (404) T ss_pred EeeccEEecccceEEEEeecccCCC Confidence 7789999999999877754444444 No 93 >protein:vir:5739 Length: 366 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:122 # MgeName: PY54 # Cross-refs: genbank:acc:NP_892050;genbank:gi:33770513;interpro:IPR006444;uniprot:Q7Y410;genbank:GeneID:1732928 Probab=95.01 E-value=0.003 Score=34.45 Aligned_cols=278 Identities=10% Similarity=0.066 Sum_probs=125.8 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeee-eccceeccceeeeeeeeccCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSM-IEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~-ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) |+..+.+ +......-+.+.+.|...-...+|+..+ .......+-.+.|...+- .+...-..||++.+..... T Consensus 64 ~a~~~~~--~~Gg~lvP~~~~~~ii~~l~~~s~l~~lg~~~v~~~~g~~~~p~~t~-~~~a~wv~E~~~~~~s~~~---- 136 (366) T protein:vir:57 64 MAISTAA--GSGGALIPQNMQNEVIELLRDRTVVRILGARSIPLPNGNLSMPRLSG-GATAGYVGEGKDVVATGAT---- 136 (366) T ss_pred hhccccc--cCCccccchhHHHHHHHHHhhhcchhhhceeeeecCCCceEEEEEeC-CcceeeeccCccccccccc---- Confidence 4433322 1111112335666666665667777554 222222222334433321 1111223577776544322 Q ss_pred ecceEE-EEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 80 LGNYTQ-IMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 80 ~~N~tQ-If~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) +++++- .++=...+.=|.+.+.... .+-.++=..+-.+.+.+-+|.+||+|... +.. ..||+........ 
T Consensus 137 f~~i~~~~~k~~~~~~iS~ell~ds~-~~~~~~i~~~l~~a~~~~~d~a~l~G~G~-~~~---p~Gi~~~~~~~~~---- 207 (366) T protein:vir:57 137 FDDVKLSAKTMIALVPVSNQLIGRAG-FNVEQLLLGDILSAIATREDKAFLRDDGT-GDT---PKGMKAVATAANR---- 207 (366) T ss_pred eeEEEEeeEEEEEeehhhHHHHhhhh-HHHHHHHHHHHHHHHHHHHHHHhhccCCC-Ccc---ccceeeccccccc---- Confidence 222111 1111222223344444332 12223333445556778899999998642 112 2344433210000 Q ss_pred cccccccccccccccccccccccccHHH---HHHHHHHHHh-cC--CceeEEEeChHHHHHHHHhhccceeeeeeeeecC Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAI---MDDVMQQGYQ-SG--ANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASN 232 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~---l~~~~~~~~~-~G--g~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~ 232 (330) .....++ ..+.+. +.+.+...+. .. ......++++....++..+ ++... ++...+ T Consensus 208 -------------~~~~~~t--~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~vmn~~~~~~L~~l-kd~~G---~~l~~~ 268 (366) T protein:vir:57 208 -------------LVAWTGT--AINLTTIDEYLDSLILKHMDSNSNMIRCGWGLSNRTYMTLFGL-RDGNG---NKVYPE 268 (366) T ss_pred -------------eeecccc--ccchhhHHHHHHHHHHhhhccccccccCEEEecHHHHHHHHhh-hccCC---ceeccC Confidence 0001111 222222 3333333332 11 1223467999888888876 33321 122111 Q ss_pred CcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccc---cEEEEEcchhhhhcccCCccc----cccccc----------c Q lcl|NC_020858. 233 GKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALA---RNAFFVDPEFLQFGWLRKIAE----DKKVAK----------T 295 (330) Q Consensus 233 ~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a---~~~~~ld~~~~~~~~Lr~~~~----~e~laK----------t 295 (330) ....++ +| +.++.+.+||.+.... ..+++.|++.+-+.. +.-.. .+..-+ . T Consensus 269 ~~~g~l----------~G-~Pvv~s~~ip~~~~~~~~~~~i~~gdfs~~~i~~-~~~i~i~~~~ea~~~~~~g~~~~~f~ 336 (366) T protein:vir:57 269 MSQGIL----------KG-YPIQRTSAIPANLGDDGNESEIYFCDFNDVVIGE-DGMMKVDFSTEATYKDADGQLVSAFA 336 (366) T ss_pred CCCCee----------cc-eeeEEccccccccccCCCccEEEEEecceEEEEE-ecceEEEEeeccccccccccchhhhh Confidence 111111 22 5688888998753221 236778887654332 21111 111001 1 Q ss_pred ccceeeEEEEEEEEEEecchheeEEecccc Q lcl|NC_020858. 
296 GDAEKFMLIGEGALKPKNEKGLGVAADLYG 325 (330) Q Consensus 296 Gd~~k~~i~~E~tLe~~N~~a~g~i~gLt~ 325 (330) -|......+..+.+.+++|++..+++|+.= T Consensus 337 ~~~~~iR~~~~~d~~v~~~~a~~~lt~~~~ 366 (366) T protein:vir:57 337 RNQSLIRVVTEHDIGFRHPEGLVLGTGVIW 366 (366) T ss_pred cCceeEEeeeeeCcEeeccccEEEEecccC Confidence 233456667779999999999999988877 No 94 >protein:vir:1084 Length: 437 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:21 # MgeName: bIL309 # Cross-refs: genbank:acc:NP_076738;genbank:gi:13095848;genbank:GeneID:920418 Probab=94.84 E-value=0.0034 Score=34.14 Aligned_cols=268 Identities=10% Similarity=-0.014 Sum_probs=113.8 Q ss_pred CCccccceeecc-ccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTG-AKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~-~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) ........++.+ ....-+.+.+.|... ...+++..++.....++....|....-..+......||...+.... .. T Consensus 152 e~~~~~~~~~~~~g~lvp~~~~~~i~~~-~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e~~~~~e~~~---~~ 227 (437) T protein:vir:10 152 EVRDVTGIALKDGKVIIPETILTPEKEV-HQFPRLGSLVRTESVTTTTGKLPIFNNSTDLLTAHTEYGQTTKNAT---PV 227 (437) T ss_pred hhhhhhhcccccccccchHHHHHHHHHh-hhhhhhhhcceeEeeccCceeeEEeecccccccccccccccccccc---cc Confidence 000000111111 111112444555443 4555666655555444444444433322222233345554432111 11 Q ss_pred ecceEEE---EeeeeeehhHHHHHhhcc--ccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccc Q lcl|NC_020858. 80 LGNYTQI---MRKSGIISGTQNITDEAG--RATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVS 154 (330) Q Consensus 80 ~~N~tQI---f~~~v~VS~T~~av~~~G--~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~ 154 (330) ++.++-- +..-+.|| .+.+.... ..+.+...+ ...+.+-+|.++|+|..+. 
.+ T Consensus 228 ~~~v~~~~~k~~~~~~is--~ell~ds~~~~~~~i~~~l---~~~~~~~~~~~i~~g~g~~--~~--------------- 285 (437) T protein:vir:10 228 ITPILWDLKTYTGGYVFS--QELISDSSYDWQAELQSRL---IELRDNTDDSLIITALTDG--IK--------------- 285 (437) T ss_pred ceeeeeehhheeeehhhh--HHHHhhhHHHHHHHHHHHH---HHHHHHHHHHHHhhhhccc--cc--------------- Confidence 2222211 11223333 33333322 222333333 3445666788888875310 00 Q ss_pred cccccccccccccccccccccccccccccHHHHHHHHH----HHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 155 RGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQ----QGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 155 ~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~----~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) .++ ...+.+.|.+++. ..|.+++ .+++||....+|..+ +|.+. |+.. T Consensus 286 ~~~----------------------~~~~~~~~~~~~~~~l~~~~~~~~---~~~~~~~~~~~l~~l-kd~~g---~~~~ 336 (437) T protein:vir:10 286 KTT----------------------STYLLGDLKKVLNVTLKPQDSAAA---SIVMSQSAYNLFDMA-TDAMG---RPLL 336 (437) T ss_pred ccc----------------------cccchhhHHHHHHhhhhhhhhcCC---EEEEcHHHHHHHHHh-hccCC---Ceee Confidence 000 0112233444433 3343333 578999998888886 33321 2221 Q ss_pred cCCcceeEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccc--cccceeeEEEEEE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAK--TGDAEKFMLIGEG 307 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laK--tGd~~k~~i~~E~ 307 (330) ...-.. + .-.+=+|. |.+..+.++|........+++.|.+..-.-+.|.-...+ +.. .-+.+...++..+ T Consensus 337 ~~~~~~--~----~~~~l~G~pv~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~r~~~~~~-~~~~~~~~~~~~~~~~r~ 409 (437) T protein:vir:10 337 QPNVTA--A----TGYTLLGKTVVIVDDKLFPSASAGDVNIVVAPLKKAVINFKLTEITGQ-FQDTYDIWYKQLGIFLRQ 409 (437) T ss_pred ccCccC--C----CCcccccceeEEecccccCCcCCCceEEEEeeccccEEEEeeeceEEE-EecccccccceeeEEEEE Confidence 111000 0 00112563 555556677766543444777787632222223211111 111 1234555667778 Q ss_pred EEEEecchheeEEeccc---cccccC Q lcl|NC_020858. 
308 ALKPKNEKGLGVAADLY---GLTAST 330 (330) Q Consensus 308 tLe~~N~~a~g~i~gLt---~~~~~~ 330 (330) ...+.+|.|+.+|++=. -.+++| T Consensus 410 d~~~~~~~a~~~l~~~~~~~~~~~~~ 435 (437) T protein:vir:10 410 NVVQASKDLIVNLTGKLKAVTVVQST 435 (437) T ss_pred ccEEecccceEEEEeeccccccCCCC Confidence 99999999999888531 111222 No 95 >protein:vir:78223 Length: 333 # NCBI annotation: Putative major head protein # Family: family:all:966 # MgeID: mge:1849 # MgeName: Bethlehem # Cross-refs: genbank:acc:YP_001491666;genbank:gi:157786490;genbank:GeneID:5625701 Probab=94.77 E-value=0.0035 Score=34.02 Aligned_cols=291 Identities=9% Similarity=0.006 Sum_probs=129.2 Q ss_pred CCccccce---eeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcc-------ccccccccccc Q lcl|NC_020858. 1 MAVVTNTF---QSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPG-------ANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~---~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~-------~na~~EG~d~~ 70 (330) |+..++.- ++.....--+.+.+.|...-..+.|+.++.-.....+-...|...+-.+.+ .....|+...+ T Consensus 10 ~~~~~~~~g~~~~~~~~liP~~~~~~ii~~l~~~s~l~~~~~~~~~~~~~~~~p~~~~~~~a~~v~eg~~~~~~e~~~~~ 89 (333) T protein:vir:78 10 NSAGSNHQGRLAHVPSDLLPKEIVGPIFDKAQESSLVLRMGEQIPISYGETIIPTTVKRPEVGQVGVGTSNEQREGGLKP 89 (333) T ss_pred hcccccccCceecCCccccchhHHHHHHHHHHhhchhhhhcceeeccCCceEEEEEeCCceeEeecCccccccccccccc Confidence 22222111 111111112245555655556677787776555544444444443322211 11112222222 Q ss_pred cccccCceEecceEEEEee-eeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHH Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRK-SGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWV 149 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~-~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i 149 (330) .. ...+.+++=-.+| ...+.=|.+.+.... .+..+|=...-.+.+.+-+|.++|+|..... +-...|+.... 
T Consensus 90 ~~----~~~f~~i~l~~~kl~~~~~is~ell~~s~-~~~~~~i~~~la~ai~~~~d~~~l~G~g~~~--~~~~~g~~~~~ 162 (333) T protein:vir:78 90 LS----GTAWDTRSVSPIKLATIVTVSEEFARMNP-SGLYTKLQGDLAYAIGRGIDLAVFHGKSPLT--GSALQGIDTDN 162 (333) T ss_pred cc----ccceeEEEEeeEEEEEeehhhHHHHhcCH-HHHHHHHHHHHHHHHHHHHHHHHhcccCCCC--Ccccccccccc Confidence 11 1222222222222 222223334443322 2333344445556788899999999865322 11222332211 Q ss_pred hcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeChHHHHHHHHhh--ccceee-e Q lcl|NC_020858. 150 KTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSPYVKSVFVTFM--SDTNVA-S 225 (330) Q Consensus 150 ~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~~~k~~is~f~--~~~~~~-~ 225 (330) . ....+ .....+....++.+.|.+++..+-.++ .+.+.++++|..+..+.++. +|.... . T Consensus 163 ~--~~~~~--------------~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~~~~~~d~~G~~i 226 (333) T protein:vir:78 163 V--IANTT--------------NVDYLQETGDPLLDRLLDGYDLVSANTDVEFNGWAVDPRFRAHLLRAQAYRDANGNVD 226 (333) T ss_pred c--ccccc--------------cccccccccchhHHHHHHHHHhhccccccCceEEEEcchHHHHHHHHhhhcCCCCcee Confidence 0 00000 001112223456667777766655443 34556888998877776543 222110 0 Q ss_pred eeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccc----cEEEEEcchhhhhcccCCccccc---c--ccc-- Q lcl|NC_020858. 226 FRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALA----RNAFFVDPEFLQFGWLRKIAEDK---K--VAK-- 294 (330) Q Consensus 226 ~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a----~~~~~ld~~~~~~~~Lr~~~~~e---~--laK-- 294 (330) .+.....+.. .+=+| +.++.+.+||.+...+ ..+++.|.+..-+.. |.-.+-+ + ... T Consensus 227 ~~~~~~~~~~----------~~l~G-~Pv~~~~~i~~~~~~~~~~~~~~~~gD~~~~~~g~-~~~~~i~~~~~~~~~~~~ 294 (333) T protein:vir:78 227 PSRINLAAQT----------GDVLG-LPAQFGRAVGGDLGAAVDSKTRIIGGDFSQLKFGF-ADEIRIKMSDTATLTDSG 294 (333) T ss_pred ecCccccCCC----------ceeec-eeeEEccccCCCccccCCCccEEEEEecccEEEEE-eeccEEEEeccccccccc Confidence 0000001110 11233 4788888998764322 247788888765542 2111110 0 001 Q ss_pred -------cccceeeEEEEEEEEEEecchheeEEecccccccc Q lcl|NC_020858. 
295 -------TGDAEKFMLIGEGALKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 295 -------tGd~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~ 329 (330) .-|...+..+..+.+.+++|+|..+| ++-++. T Consensus 295 ~~~~~~~~~~~v~~r~~~r~d~~v~~~~a~~~l---~~~~a~ 333 (333) T protein:vir:78 295 SATVSMWQTNQIAILIEVTFGWLLGDKQAFVKF---VDDEQP 333 (333) T ss_pred cceeehhhcCcEEEEEEEEEccEEecccceEEE---eccCCC Confidence 11223345566788999999999886 334444 No 96 >protein:vir:93742 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1475 # MgeName: 55 # Cross-refs: genbank:acc:YP_240459;genbank:gi:66396126;genbank:GeneID:5133511 Probab=94.74 E-value=0.0036 Score=33.97 Aligned_cols=261 Identities=13% Similarity=0.050 Sum_probs=140.4 Q ss_pred CCccccceeecccccccccccee----------eEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADV----------VSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~----------I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||..+ |..+.-. .-|-+++. +..+.+.+.. .-|+.--+=+.+.|. .+.++. -..||.+.+ T Consensus 1 ma~~~-T~~~~~i--iPev~~~~v~~~~~~~~~~~~~~~~~~~---l~g~~G~tv~ip~~~--~~g~~~--~~~eg~~i~ 70 (274) T protein:vir:93 1 MPQGI-TKTSNQI--IPEVLAPMMQAQLEKKLRFASFAEVDST---LQGQPGDTLTFPAFV--YSGDAQ--VVAEGEKIP 70 (274) T ss_pred CCccc-eehhhee--chHHHHHHHHHHHHhhhhhccccccccc---ccCCCCCEEEEEeec--cCCCcc--cccCCCccc Confidence 88755 3222211 11111111 2233333322 223211122345674 233322 235777776 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ..........--+.| ..+.+.|++-+.+. . .++.++....+....+.+.++..++.--.... 
T Consensus 71 ~~~it~~~~~~~i~~-~~~~~~i~D~~~~~--~-~~d~~~~~~~~~~~~~a~~~d~~~~~~~~~a~-------------- 132 (274) T protein:vir:93 71 TDILETKKREAKIRK-IAKGTSITDEALLS--G-YGDPQGEQVRQHGLAHANKVDNDVLEALMGAK-------------- 132 (274) T ss_pred ccccccceeEEEeee-ecccccccHHHHHh--h-ccchHHHHHHHHHHHHHHHHHHHHHHHHhccc-------------- Confidence 665555555445556 34678888765543 2 25778888888888999999988874211100 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) ......+++.+.|.++++++=+++...+.++|||.....+-+-....+.. .. T Consensus 133 ------------------------~~~~~~~~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~----~s 184 (274) T protein:vir:93 133 ------------------------LTVNADITKLNGLQSAIDKFNDEDLEPMVLFINPLDAGKLRGDASTNFTR----AT 184 (274) T ss_pred ------------------------ccccccccCHHHHHHHHHHhhhccCCccEEEeCHHHHHHHHhhhhhcccc----cc Confidence 00011246778899999999998888999999999876664321111110 00 Q ss_pred cCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCC-ccccccccccccceeeEEEEEEEE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRK-IAEDKKVAKTGDAEKFMLIGEGAL 309 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~-~~~~e~laKtGd~~k~~i~~E~tL 309 (330) ..+.. . ...-....|--++|+.++.||.+ ..+++.+..+.+..-++ ..+.+.-++.+ ++......-|+. T Consensus 185 ~~g~~-~---~~~G~ig~~~G~~Vi~s~~~p~~-----t~~l~~~gai~~~~~~~~~vE~~Rd~~~~-~d~i~~~~~y~~ 254 (274) T protein:vir:93 185 ELGDD-I---IVKGAFGEALGAIIVRTNKLEAG-----TAILAKKGAVKLILKRDFFLEVARDASTK-TTALYSDKHYVA 254 (274) T ss_pred ccccc-c---eeecccceecCeeEEEcCCCCcc-----eEEEEeCCeEEEEecCCcccccccchhhc-ccEEEEEEEEEE Confidence 00111 0 00101111222688999999965 45788888777754343 12222233332 344444556899 Q ss_pred EEecchheeEEeccccccccC Q lcl|NC_020858. 
310 KPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 310 e~~N~~a~g~i~gLt~~~~~~ 330 (330) ++.||.+..+++-= ++|+ T Consensus 255 ~~~~~~~~v~~t~~---~~s~ 272 (274) T protein:vir:93 255 YLYDESKAVKITKG---SGSL 272 (274) T ss_pred EEEcCCceEEEeeC---cccc Confidence 99999988776633 3344 No 97 >protein:vir:739 Length: 231 # NCBI annotation: major structural protein 4 # Family: family:all:522 # MgeID: mge:14 # MgeName: Tuc2009 # Cross-refs: genbank:acc:NP_108716;genbank:gi:13487838;genbank:GeneID:920884 Probab=94.34 E-value=0.0047 Score=33.35 Aligned_cols=229 Identities=12% Similarity=0.081 Sum_probs=121.7 Q ss_pred eeeeccceeccceeeeeeeeccCccccccccccccccccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHH Q lcl|NC_020858. 35 YSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKL 114 (330) Q Consensus 35 ~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~ 114 (330) -..+..+++ =+.+.| +.++ .-..||...+....+.....-=+.|+ .|.|+|++-+.-+. + +|.+..-.. T Consensus 1 ~~~~~~Gdt-it~P~~----iGda--~~v~eG~~i~~~~l~~t~~~atIk~~-gk~~~itD~a~l~~-~--gDp~~ea~~ 69 (231) T protein:vir:73 1 ENGINLANL-CEYPND----IGDA--ADVAEGGEISLDKIGTTTKSVTIKKA-AKGTEITDEAALSG-Y--GDPIGESNK 69 (231) T ss_pred CccccCCce-EEeccc----ccch--hhhcCCCcCChhhccccceeeeEeee-ccceeeeHHHHhhc-c--CchHHHHHH Confidence 001111110 012245 2222 33468888776666655554456776 88999998888653 3 454444444 Q ss_pred HHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHH Q lcl|NC_020858. 115 KKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQG 194 (330) Q Consensus 115 k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~ 194 (330) +....|++-+.+-++.--.. + . 
+ + -..++|.+.+.++++++ T Consensus 70 Q~~~~iA~kvD~di~~~~~~---a-~--------l-------------------------~--~~~~~t~d~i~~A~~~f 110 (231) T protein:vir:73 70 QLGLSLANKVDDDLLKAAKT---T-S--------Q-------------------------T--VSTKANVDGVQAALDIF 110 (231) T ss_pred HHHHHHHHhhhHHHHHhhcc---c-c--------c-------------------------c--ccccccHHHHHHHHHHh Confidence 44444545544444421000 0 0 0 0 01257899999999999 Q ss_pred HhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEc Q lcl|NC_020858. 195 YQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVD 274 (330) Q Consensus 195 ~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld 274 (330) -+...++..++|||.....+.+|...... ........+ ++=....|--++|+.++.+|.++....+ ++.- T Consensus 111 gde~~~~~vivv~p~~~~~Lrk~~~~~~~------~~~~g~~i~---~~G~iG~i~G~~Vi~S~~~~~~~~~~~~-~i~~ 180 (231) T protein:vir:73 111 NDEDAQAYVLIVNPKDAAKIRKDANAKNI------GSEVGANAL---INGTYADVLGAQIVRSKKLAEGSALMFK-IVSN 180 (231) T ss_pred ccccccceEEEEcchHHHhhhhccchhhh------hhhhcccee---eecccceEcceEEEEcCCCCCCceeeee-EEee Confidence 88888889999999876665554322110 101111111 1111222223799999999987765433 3333 Q ss_pred chhhhhcccCCccccccccccccceeeEEEEE--EEEEEecchheeEEeccccc Q lcl|NC_020858. 275 PEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGE--GALKPKNEKGLGVAADLYGL 326 (330) Q Consensus 275 ~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E--~tLe~~N~~a~g~i~gLt~~ 326 (330) +.-+.+..-|+. .-| .....+.+.-.|++- |+..+-||....+|+ +.|. 
T Consensus 181 ~gAl~~~~k~~~-~vE-tdRd~~~k~~~i~~~~~y~v~l~~~~~vv~~t-~~g~ 231 (231) T protein:vir:73 181 SPALKLVLKRGV-QVE-TDRDIVTKTTVITADEHYAAYLYDLTKVVNIT-FTGV 231 (231) T ss_pred ccceeeeecccc-eee-ccccccccccEEEEeEEEEEEEEcCccEEEEE-eecC Confidence 555555443431 212 233444444444443 789999999876653 4555 No 98 >protein:vir:94494 Length: 274 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1508 # MgeName: 88 # Cross-refs: genbank:acc:YP_240676;genbank:gi:66396348;genbank:GeneID:5133758 Probab=93.95 E-value=0.0058 Score=32.83 Aligned_cols=261 Identities=13% Similarity=0.050 Sum_probs=139.8 Q ss_pred CCccccceeeccccccccccceee----------EecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVV----------SRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I----------~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||..+ |..+. .-+-|=+++.+ ..+...+..+ -|+.--+=+.+.|. .+.++. -..||.+.+ T Consensus 1 ma~~~-T~~~d--~iiPev~~~~v~~~~~~~l~~~~~~~~d~~l---~g~~G~tv~iP~~~--~~g~a~--~~~~g~~i~ 70 (274) T protein:vir:94 1 MPQGL-TKTSD--QIIPEVLAPMMQAQLEKKLRFASFAEVDSTL---QGQPGDTLTFPAFV--YSGDAQ--VVAEGEKIP 70 (274) T ss_pred CCccc-eehhh--eechHHHHHHHHHhhhhhhhhcccceecccc---cCCCCCEEEEeeec--CCCccc--cccCCCccc Confidence 88743 22222 11222222222 1222222221 12211122345674 333322 235777776 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ..........--+.|+ .+.++|++-+.+. . .+|.++....+...-+.+.++..++.--.+.. 
T Consensus 71 ~~~lt~~~~~~~i~~~-~~~~~i~D~~~~~--~-~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a~-------------- 132 (274) T protein:vir:94 71 TDILETKKREAKIRKI-AKGTSITDEALLS--G-YGDPQGEQVRQHGLAHANKVDNDVLEALMGAK-------------- 132 (274) T ss_pred ccccccceeEEEeeee-cceecccHHHHHh--c-cchHHHHHHHHHHHHHHHHHHHHHHHHHhccC-------------- Confidence 6666655555556664 4678888866553 2 25677777778888899999988774211100 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) ......+++.+.|.++++++=+++...+.++|||.+...+-+-....+... . T Consensus 133 ------------------------~~~~~~~~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~----s 184 (274) T protein:vir:94 133 ------------------------LTVNADITKLNGLQSAIDKFNDEDLEPMVLFVNPLDAGKLRGDASTNFTRA----T 184 (274) T ss_pred ------------------------ccccccccCHHHHHHHHHHhhccCCCceEEEeCHHHHHHHHhhhhhhcccc----C Confidence 000112467889999999999988899999999998766654211111100 0 Q ss_pred cCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCC-ccccccccccccceeeEEEEEEEE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRK-IAEDKKVAKTGDAEKFMLIGEGAL 309 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~-~~~~e~laKtGd~~k~~i~~E~tL 309 (330) ..+. .. ..+-....|--+.|+.++.||.+ ..+++.+..+.+..-++ ..|.+.-++.+ ++......-|+. T Consensus 185 ~~g~-~~---~~~G~ig~~~G~~Vi~s~~~p~~-----t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~-~d~i~~~~~y~~ 254 (274) T protein:vir:94 185 ELGD-DI---IVKGAFGEALGAIIVRTNKLEAG-----TAILAKKGAVKLILKRDFFLEVARDASTK-TTALYSDKHYVA 254 (274) T ss_pred cccc-cc---eeccccceecCeeEEEcCCCCcc-----eEEEEeCcceEeeecCCceeccccchhhc-ccEEEEEEEEEE Confidence 0011 10 00111112223689999999964 45788888877654343 11222222222 233333445899 Q ss_pred EEecchheeEEeccccccccC Q lcl|NC_020858. 
310 KPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 310 e~~N~~a~g~i~gLt~~~~~~ 330 (330) .+.||.+..+++- =++|+ T Consensus 255 ~~~~~~~vv~~t~---~~~~~ 272 (274) T protein:vir:94 255 YLYDESKAVKITK---GSGSL 272 (274) T ss_pred EEEcCCceEEEec---Ccccc Confidence 9999998887762 33444 No 99 >protein:vir:97433 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1676 # MgeName: 92 # Cross-refs: genbank:acc:YP_240749;genbank:gi:66396420;genbank:GeneID:5133789 Probab=93.95 E-value=0.0058 Score=32.83 Aligned_cols=261 Identities=13% Similarity=0.050 Sum_probs=139.8 Q ss_pred CCccccceeeccccccccccceee----------EecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVV----------SRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I----------~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||..+ |..+. .-+-|=+++.+ ..+...+..+ -|+.--+=+.+.|. .+.++. -..||.+.+ T Consensus 1 ma~~~-T~~~d--~iiPev~~~~v~~~~~~~l~~~~~~~~d~~l---~g~~G~tv~iP~~~--~~g~a~--~~~~g~~i~ 70 (274) T protein:vir:97 1 MPQGL-TKTSD--QIIPEVLAPMMQAQLEKKLRFASFAEVDSTL---QGQPGDTLTFPAFV--YSGDAQ--VVAEGEKIP 70 (274) T ss_pred CCccc-eehhh--eechHHHHHHHHHhhhhhhhhcccceecccc---cCCCCCEEEEeeec--CCCccc--cccCCCccc Confidence 88743 22222 11222222222 1222222221 12211122345674 333322 235777776 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ..........--+.|+ .+.++|++-+.+. . .+|.++....+...-+.+.++..++.--.+.. 
T Consensus 71 ~~~lt~~~~~~~i~~~-~~~~~i~D~~~~~--~-~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a~-------------- 132 (274) T protein:vir:97 71 TDILETKKREAKIRKI-AKGTSITDEALLS--G-YGDPQGEQVRQHGLAHANKVDNDVLEALMGAK-------------- 132 (274) T ss_pred ccccccceeEEEeeee-cceecccHHHHHh--c-cchHHHHHHHHHHHHHHHHHHHHHHHHHhccC-------------- Confidence 6666655555556664 4678888866553 2 25677777778888899999988774211100 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) ......+++.+.|.++++++=+++...+.++|||.+...+-+-....+... . T Consensus 133 ------------------------~~~~~~~~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~----s 184 (274) T protein:vir:97 133 ------------------------LTVNADITKLNGLQSAIDKFNDEDLEPMVLFVNPLDAGKLRGDASTNFTRA----T 184 (274) T ss_pred ------------------------ccccccccCHHHHHHHHHHhhccCCCceEEEeCHHHHHHHHhhhhhhcccc----C Confidence 000112467889999999999988899999999998766654211111100 0 Q ss_pred cCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCC-ccccccccccccceeeEEEEEEEE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRK-IAEDKKVAKTGDAEKFMLIGEGAL 309 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~-~~~~e~laKtGd~~k~~i~~E~tL 309 (330) ..+. .. ..+-....|--+.|+.++.||.+ ..+++.+..+.+..-++ ..|.+.-++.+ ++......-|+. T Consensus 185 ~~g~-~~---~~~G~ig~~~G~~Vi~s~~~p~~-----t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~-~d~i~~~~~y~~ 254 (274) T protein:vir:97 185 ELGD-DI---IVKGAFGEALGAIIVRTNKLEAG-----TAILAKKGAVKLILKRDFFLEVARDASTK-TTALYSDKHYVA 254 (274) T ss_pred cccc-cc---eeccccceecCeeEEEcCCCCcc-----eEEEEeCcceEeeecCCceeccccchhhc-ccEEEEEEEEEE Confidence 0011 10 00111112223689999999964 45788888877654343 11222222222 233333445899 Q ss_pred EEecchheeEEeccccccccC Q lcl|NC_020858. 
310 KPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 310 e~~N~~a~g~i~gLt~~~~~~ 330 (330) .+.||.+..+++- =++|+ T Consensus 255 ~~~~~~~vv~~t~---~~~~~ 272 (274) T protein:vir:97 255 YLYDESKAVKITK---GSGSL 272 (274) T ss_pred EEEcCCceEEEec---Ccccc Confidence 9999998887762 33444 No 100 >protein:vir:962 Length: 397 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:19 # MgeName: bIL285 # Cross-refs: genbank:acc:NP_076616;genbank:gi:13095724;genbank:GeneID:920264 Probab=93.68 E-value=0.0067 Score=32.50 Aligned_cols=268 Identities=13% Similarity=0.027 Sum_probs=109.1 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) +......-........-+++.+.|... -...++...+.....++....+..............||...+......-..+ T Consensus 129 ~~~~~~~~~~~~~~~vp~~~~~~i~~~-~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~E~~~~~~~~~~~~~~i 207 (397) T protein:vir:96 129 AEKRDGFTSVEGGALIPQELLQPQLEP-KDIVDLSKYVRSVPVNSASGKFPVISKSGSKMATVQQLEKNPQLANPKMVEI 207 (397) T ss_pred hhhhhcccccccccchhHHHHHHHHHh-hhhhhHHHhhhhccccccceeEEEEeccCCccccccccccccccccccccce Confidence 111111111111111223455555543 2344555555555454444555444333333233456665543211111111 Q ss_pred cceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATGA 160 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g~ 160 (330) .--..-+..-+.|| .+.++.... +..+|=..+-...+.+-++..+++|...... 
T Consensus 208 ~~~~~~~~~~~~~s--~ell~ds~~-~l~~~i~~~l~~~~~~~~~~~i~~g~g~~~~----------------------- 261 (397) T protein:vir:96 208 DYSVATRRGYIPIS--QEMIDDASY-DVTGLIADEIQDQSLNTKNADIAAVLKTATA----------------------- 261 (397) T ss_pred eecHhHhhcchhhH--HHHHhhhHH-HHHHHHHHHHHHHHHHHHHHHHhhccccccc----------------------- Confidence 11011111222333 333333321 1122222222344555566677765321000 Q ss_pred cccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeEEE Q lcl|NC_020858. 161 NGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSIVA 240 (330) Q Consensus 161 ~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~~ 240 (330) ....+-+.|.+++...-....+. .+++||....+|..+ +|.+ .|+.....-.. T Consensus 262 ------------------~~~~~~d~~~~~~~~~~~~~~~a-~~v~n~~~~~~l~~l-kd~~---G~~~~~~~~~~---- 314 (397) T protein:vir:96 262 ------------------KSVVGVDGLKDLINKEIKKVYDV-KLFISASMYSELDKL-KDKN---GRYLLQDSITA---- 314 (397) T ss_pred ------------------ccccchHHHHHHHHHhhhhhcCc-EEEEcHHHHHHHHHh-hccC---CCeEeccCccC---- Confidence 00234556666666555443332 578999999998886 3332 12221110000 Q ss_pred EEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEEecchheeE Q lcl|NC_020858. 241 NADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKNEKGLGV 319 (330) Q Consensus 241 ~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N~~a~g~ 319 (330) . .-.+=+|. |.+ .+-.+|........+++.|++..-+-+.|.-......--..+......+..+...+++|.|+.+ T Consensus 315 ~--~~~~l~G~pv~~-~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~d~~~~~~~a~~~ 391 (397) T protein:vir:96 315 A--SGKQLLGKEVVV-LDDDVIGKSVGNVVGFIGDAKAFASFFDRKQVSVSWVDNNIYGQLLAGIIRYDVKATDKKAGFY 391 (397) T ss_pred C--CcccccccceEE-ecccccCCCCCceEEEEeehhcceEeEeecceEEEEecccccceeEEEEEEEccEEecccceEE Confidence 0 00011232 222 2334454443333466677773211122321111111111122234455678899999999988 Q ss_pred Eecccc Q lcl|NC_020858. 
320 AADLYG 325 (330) Q Consensus 320 i~gLt~ 325 (330) |.-=.+ T Consensus 392 ~~~~~a 397 (397) T protein:vir:96 392 VTFTIG 397 (397) T ss_pred EEeecC Confidence 865444 No 101 >protein:vir:3845 Length: 395 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:322 # MgeName: phi adh # Cross-refs: genbank:acc:NP_050151;swissprot:trembl:q9t1f6;genbank:gi:9633043;uniprot:Q9T1F6;genbank:GeneID:1262163 Probab=92.95 E-value=0.0094 Score=31.71 Aligned_cols=269 Identities=11% Similarity=0.032 Sum_probs=119.5 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceecc---ceeeeeeeeccCccccccccccccccccccCc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDT---THPEWTTDELAAPGANITLEGDEYTFDATVSP 77 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~---~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~ 77 (330) |+..+ +-++...+---+++.+.|...-....|+..+......++ ...-|+..+..+.+ .-..||...+... . T Consensus 105 ~~~~~-~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a-~~v~E~~~~~~~~---~ 179 (395) T protein:vir:38 105 VTSGT-TGTGNAGLTIPEDIQLQIRTLTRSFTSLESLANVENVTTSHGSRVYEKLADITPLK-DLDDESALIGDND---D 179 (395) T ss_pred Hhhcc-CccCCCceecchhHhhHHHHHHHhhcchhhhcceeeccCCcceEEEEeeccCCccc-ccccccccccccc---c Confidence 33222 111111111234677778777788888888765544322 23344444433221 2234665543221 1 Q ss_pred eEecce-EEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccc Q lcl|NC_020858. 78 ERLGNY-TQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRG 156 (330) Q Consensus 78 ~~~~N~-tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g 156 (330) ..+.++ ...++-..-|-=|.+.++..+ .+...+=...-.+.+.+-+|.++++|....... 
.| T Consensus 180 ~~f~~v~~~~~k~~~~~~iS~ell~ds~-~~l~~~i~~~la~~~~~~~~~~il~g~g~~~~~----~~------------ 242 (395) T protein:vir:38 180 PELTVVKYLIHRYAGITTVTNTLLKDTV-DNIIQWLVNWAAKKDVVTRNAKILEVMGKAPKK----PT------------ 242 (395) T ss_pred cceeeEEeeeeeeEeehhhHHHHHhhhH-HHHHHHHHHHHHHHHHHHHHHHHhhcccccccc----cc------------ Confidence 112221 122222222333334444332 122333333445556677889999875421100 00 Q ss_pred cccccccccccccccccccccccccccHHHHHHHHHH-HHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee----c Q lcl|NC_020858. 157 ATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQ-GYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA----S 231 (330) Q Consensus 157 ~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~-~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~----~ 231 (330) ..+-+.+.+++.. +-.+......+++||....+|..+ ++.+. |+.. . T Consensus 243 ------------------------~~~~~~i~~~~~~~l~~~~~~~a~~v~n~~~~~~L~~l-kd~~G---~~l~~~~~~ 294 (395) T protein:vir:38 243 ------------------------ISQFDNIKDLENNTLDPAIESTSSFITNQSGYNILSKV-KDADG---RYLMQPDVT 294 (395) T ss_pred ------------------------cccHHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHh-hccCC---ceeeccCcC Confidence 1122334444432 222211223578999998888876 33321 1211 1 Q ss_pred CCcceeEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchh-hhhcccCCc-cccccccc---cccceeeEEEE Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKI-AEDKKVAK---TGDAEKFMLIG 305 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~-~~~e~laK---tGd~~k~~i~~ 305 (330) ++...+| +|. |.+..+-.+|... ....+++.|++. +.+....+. .+..+... .-+...+..+. T Consensus 295 ~~~~~~l----------~G~pV~~~~~~~~~~~~-~~~~i~~gd~~~~~~i~~~~~~~i~~~~~~~~~~~~~~~~~r~~~ 363 (395) T protein:vir:38 295 SPDKYLI----------DGKPVIRIADKWLPDVS-GSHPLYFGDLKQGITLFDRQQMQIDTTNVGAGSFEHDTTKLRFID 363 (395) T ss_pred CCCccee----------ccceeEEecccccCcCC-CcceEEEEeccccEEEEEecceEEEEeccccchhhcCceEEEEEE Confidence 1111111 232 3333343455432 233467778763 444321221 01011110 12334566677 Q ss_pred EEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
306 EGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 306 E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) .+.+.+.+|.|..+|..-+-=+.+. T Consensus 364 r~d~~~~~~~a~~~~~~~~~~~~~~ 388 (395) T protein:vir:38 364 RFDVQLIDDGAFAAASFKTVANQAQ 388 (395) T ss_pred eeccEEecccceEEEEeecccCCCC Confidence 7999999999999887533222221 No 102 >protein:vir:4092 Length: 390 # NCBI annotation: major capsid protein a # Family: family:all:635 # MgeID: mge:86 # MgeName: 2389 # Cross-refs: genbank:acc:NP_510986;swissprot:trembl:q8w604;genbank:gi:17488508;uniprot:Q8W604;genbank:GeneID:1260361 Probab=92.63 E-value=0.011 Score=31.41 Aligned_cols=275 Identities=12% Similarity=-0.003 Sum_probs=119.8 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccc-cccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFD-ATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~-~~~~~~~ 79 (330) ++..+ ++.....--+++.+.|...-....|+++++......+. ..|.-..-..+...-..||+..+.. ....... T Consensus 84 ~~~~~---~~~gg~lvP~~~~~~I~~~~~~~s~i~~~~~~~~~~~~-~~~i~~~~~~~~a~~~~E~~~~~~~~~~~f~~i 159 (390) T protein:vir:40 84 IAGNG---FAGVTALLPPTVFERVFEDLTVEHPLLSKINFVNTTAT-TEWIISVGDVATAWWGPLCAEIKEVLDNGFDKI 159 (390) T ss_pred HhccC---cccCcccccHHHHHHHHHHHHhhhhhhhhceeeecCCc-eeEEEEEcCCcceeeeccccccCccccccceee Confidence 22111 11111112246677776666777788887665544332 2222221122222223465554322 1111111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) .-.. .-+.--+.|| .+.+.... .+..+|=...-...+.+-+|.+||+|... +.| -||+.-+. 
+ T Consensus 160 ~l~~-~k~~~~i~iS--~ell~ds~-~~l~~~i~~~la~~i~~~~~~a~l~G~G~--~~P---~Gil~~~~--------~ 222 (390) T protein:vir:40 160 QTGM-YKLSAYIPVC--NAMLDLGP-SWLDQYVRTILGEAMALGLEAGIVNGSGK--DQP---IGMMRDLN--------N 222 (390) T ss_pred Eeee-eeEEEeehhh--HHHHhcch-HHHHHHHHHHHHHHHHHHHHhhhhcccCC--Ccc---ceeeeccc--------c Confidence 1111 1112223344 33333332 23444555566667889999999998642 222 24433221 0 Q ss_pred ccccccccccccccccccccccccHHHHHHHH---HHHHhcCCc----eeEEEeChHHH-HHHHHh--hccceeeeeeee Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVM---QQGYQSGAN----FKHVFVSPYVK-SVFVTF--MSDTNVASFRYA 229 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~---~~~~~~Gg~----~~~i~v~~~~k-~~is~f--~~~~~~~~~r~~ 229 (330) ... +.....++ ..++..+..+++ ...|...+. -...+||+... ..+..+ ..+.. T Consensus 223 ~~~------~~~~~~~~---~~~t~~~~~~~~~~l~~~~~~~~~~~~~~a~~i~n~~t~~~~l~~~~~~~d~~------- 286 (390) T protein:vir:40 223 VTA------GEHPVKTA---TPLTDLTPATLATKVMLPLTDNGKKSVSDAILVINPADYWSKIYAATSYMTPQ------- 286 (390) T ss_pred ccc------cccccccc---cccchhhHHHHHHHHHHHhhcchhhhhcCceEEEcchhHHHHHHHHhhccCCC------- Confidence 000 00001111 124443333333 333322221 12456776432 222221 12211 Q ss_pred ecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccc---cccceeeEEEEE Q lcl|NC_020858. 230 ASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAK---TGDAEKFMLIGE 306 (330) Q Consensus 230 ~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laK---tGd~~k~~i~~E 306 (330) | ++... ...+| +.|+.+.+||++. +++.|++.+-+ +.|.-...+...- .-|...+..+.- T Consensus 287 ---G---~~v~~----~~~~g-~pvv~~~~~p~~~-----i~~Gd~s~~~i-~~~~~~~v~~~~~~~f~~~~~~~r~~~r 349 (390) T protein:vir:40 287 ---G---VWVTG----ILPVP-LEIVQSVAVPVGK-----AVAGRAKDYFM-GIGSEQVIRTSTEYRLLDDETLYYAKQY 349 (390) T ss_pred ---C---ccccc----cCCCc-eeEEEcCCCCCCc-----EEEEeeceEEE-EeecceEEEecchhhhhcCcEEEEEEEE Confidence 1 11110 12345 5788899999754 56889987644 3453222221111 124455666777 Q ss_pred EEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
307 GALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 307 ~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) +...++++.|.-++ .++++.-+. T Consensus 350 ~dg~v~~~~A~~~l-~~~~~~~~~ 372 (390) T protein:vir:40 350 ANGRPKDNSSFLVF-DITGLEGSP 372 (390) T ss_pred eCCEEecccceEEE-EeeccCCCC Confidence 88999999987633 344443221 No 103 >protein:vir:79928 Length: 393 # NCBI annotation: major head protein # Family: family:all:30335 # MgeID: mge:1874 # MgeName: 0305phi8-36 # Cross-refs: genbank:acc:YP_001429616;genbank:gi:156564106;genbank:GeneID:5525693 Probab=92.51 E-value=0.011 Score=31.30 Aligned_cols=297 Identities=12% Similarity=0.079 Sum_probs=149.9 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccce---eccceeeeeeeeccCccccccccccccccccccCc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVS---FDTTHPEWTTDELAAPGANITLEGDEYTFDATVSP 77 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~---~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~ 77 (330) |+.++...+-+-. +++.|-.....-|-...|+-+.. ..+..|.-.- .+|+ -...||.++++....-- T Consensus 74 mtt~~a~IliP~v------is~v~~Eaaepl~~~~kl~qk~~L~~Grsm~F~~~g-~~Ra---~~IgEGgE~~~~sld~~ 143 (393) T protein:vir:79 74 MATPSAQILIPRV------IVGTMREAAEPLYIGTKMLQKIRLKSGQSMIFPSIG-IMRA---YDVAEGQEIPEDSIDWQ 143 (393) T ss_pred hcCCCcceechhh------hhhhhhhcccchhHHHHHHHHHhhhcCcceeccchh-eeee---ccccccccccccchhhh Confidence 7777766555432 23333322211111112221110 1111110000 1222 12345655555444311 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) ..-+=-+.-=+-.+.|-.|+++++..| -|.+.+-+..++-.|+|-.|-.++++-+..+.++.. 
+ +..+...+-+ T Consensus 144 T~dsv~~~~gK~G~~Ia~SqEmIsDSg-~Dvin~~l~aA~RaMaRkKee~a~n~fk~~ghtvfD--a---~st~t~ahpt 217 (393) T protein:vir:79 144 THESPEIRVGKSGIRLRFTDEMISDSQ-WDLMSMMIKQAGRAMGRHKEQKAYHQFRSHGHTVFD--N---YSTNKLAHTT 217 (393) T ss_pred cCCceeEEechhhhhhhhHHHHhhcch-HHHHHHHHHHHHHHHHhhhHHHHHhhhhcccceeee--c---cccCccceee Confidence 111000111122456667888888877 488999999999999999999999987654443221 1 1111122211 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) |-. .+.--+| .|+.++|.|++-++-.++=.+.+||++|-.+.+|-+-..=.....+-+-|++... T Consensus 218 Gr~----------~~~~qNG---TlSleDllDm~~av~~~hyt~svi~MHPLAWnv~AKna~me~~~~na~gN~~~~~-- 282 (393) T protein:vir:79 218 GLD----------KNGVQND---TFSAEDFLDLIIAVMANEYTPSDLMMHPLAWTVFAKNELMGSLQANPYGNYPAKG-- 282 (393) T ss_pred cCC----------ccccccc---cccHHHHHHHHHHHhcccCCcceEEEcCchhhhhhhhhhhcceeeccccccCccc-- Confidence 111 1112234 4899999999999999999999999999887766552211111122222222211 Q ss_pred EEE----EEEEEEcCCe-EEEEEEcCcCCCcccccc--EEEEEcchhhhhcccCC-ccccccccccccceeeEEEEEEEE Q lcl|NC_020858. 238 IVA----NADVYEGPFG-KVMIHPNRVMAGSGALAR--NAFFVDPEFLQFGWLRK-IAEDKKVAKTGDAEKFMLIGEGAL 309 (330) Q Consensus 238 ~~~----~v~~~~tdfG-~v~iv~nR~mp~~~~~a~--~~~~ld~~~~~~~~Lr~-~~~~e~laKtGd~~k~~i~~E~tL 309 (330) +.. .-++|....- .++|++.+|.|=+.. +. ..+.+|-+...+.-.+. 
+.-..-=.|+-|-++.-+.-.||+ T Consensus 283 ~~ts~algp~~i~~~~~~nlnv~~sPfvp~d~k-~~rFd~~~Vd~NnvgvlLV~D~i~tdq~ddk~rdiq~iKl~ERYG~ 361 (393) T protein:vir:79 283 APSSMALGPDSIQGRLPFNFNVNLSPFIPLDKK-SRRFDVYAVDRNNVGVLLVRDDLKTDQWDEKARGLQNIKMIERYGI 361 (393) T ss_pred cchhhhhchhhhccccccceeEEEecccccccc-cceeeEEEeecCCceEEEEecCcceeccccccccceeeeeeeeece Confidence 000 0011111100 267777888775543 22 24556666555543332 111111245677888888888999 Q ss_pred EEecch-heeEEeccccccccC Q lcl|NC_020858. 310 KPKNEK-GLGVAADLYGLTAST 330 (330) Q Consensus 310 e~~N~~-a~g~i~gLt~~~~~~ 330 (330) -+.|+- |.++..+++ .+++- T Consensus 362 gvLn~gkaiavakNI~-~~k~y 382 (393) T protein:vir:79 362 GILNEGKAIAVAKNIS-MDKSY 382 (393) T ss_pred eeeeCCceEEEEecce-eeccc Confidence 887764 777777765 23333 No 104 >protein:vir:102082 Length: 392 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1503 # MgeName: Fah # Cross-refs: genbank:acc:YP_512315;genbank:gi:89152484;genbank:GeneID:3953075 Probab=92.03 E-value=0.013 Score=30.90 Aligned_cols=272 Identities=10% Similarity=-0.005 Sum_probs=116.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeec-cCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDEL-AAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L-~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) |...+.. ......-+.+.+.|...-....|+.++.+.....+....+..-.. ..+...-..||+..+......-.. T Consensus 106 ~~~~t~~---~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~~~~~ 182 (392) T protein:vir:10 106 MSGLTGE---DGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRVLEKNSDMIPFAEITEMGEIPETDNPKFSN 182 (392) T ss_pred ccccccC---CCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEeecCCccceeeccccccccccccccee Confidence 3333211 111112235666666666677888777665554433222221111 112223346777665322111111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 
80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) +.=-..-+...+.||...-.-........+..++ ...+.+-++.++++|...... T Consensus 183 v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l---~~~i~~~~d~~~~~g~g~~~~---------------------- 237 (392) T protein:vir:10 183 VQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWL---GKKSKVTRNVLILGVIEKLTK---------------------- 237 (392) T ss_pred EEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHH---HHHHHHHHHHHHhhccccccc---------------------- Confidence 1111222344555554332211122223333333 444566778888876432100 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeE Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSI 238 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~ 238 (330) ....+-+.|.+++...-..+ .....+++||....+|.++ +|.+. |+.....-.... T Consensus 238 -------------------~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~l-kd~~G---~~l~~~~~~~~~ 294 (392) T protein:vir:10 238 -------------------QAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKL-KDKDG---KYILQSDPTQKN 294 (392) T ss_pred -------------------cCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh-hccCC---CeEeecCccCCc Confidence 00134455666554322222 2223578999999999886 43321 222111000000 Q ss_pred EEEEEEEEcCCeE--EEEEEcCcCCCcccccc--EEEEEcchh-hhhcccCCccccccccccc-cc----eeeEEEEEEE Q lcl|NC_020858. 239 VANADVYEGPFGK--VMIHPNRVMAGSGALAR--NAFFVDPEF-LQFGWLRKIAEDKKVAKTG-DA----EKFMLIGEGA 308 (330) Q Consensus 239 ~~~v~~~~tdfG~--v~iv~nR~mp~~~~~a~--~~~~ld~~~-~~~~~Lr~~~~~e~laKtG-d~----~k~~i~~E~t 308 (330) -.+=+|. |.+..+..++.....++ .+++.|++. +.+.. 
|....-+...-++ .+ ..+.++..++ T Consensus 295 ------~~tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d 367 (392) T protein:vir:10 295 ------KKLFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-REDMELASTDVGGKAFTRNTLDLRAIQRDD 367 (392) T ss_pred ------cccccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEe-ecceEEEEeccccchhhcCceEEEEEEeec Confidence 0011342 22222444433222222 255667663 32222 2111111111111 22 3344556688 Q ss_pred EEEecchheeEEeccccccccC Q lcl|NC_020858. 309 LKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 309 Le~~N~~a~g~i~gLt~~~~~~ 330 (330) ..+++|.|..++..=.--...| T Consensus 368 ~~v~~~~a~~~l~~~~~a~~~~ 389 (392) T protein:vir:10 368 VQMWDNEAAVYGEIDLSAPVEQ 389 (392) T ss_pred cEEecccceEEEEecccccccC Confidence 9999999998886555333333 No 105 >protein:vir:107593 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1491 # MgeName: Gamma # Cross-refs: genbank:acc:YP_338188;genbank:gi:77020144;genbank:GeneID:3703724 Probab=92.03 E-value=0.013 Score=30.90 Aligned_cols=272 Identities=10% Similarity=-0.005 Sum_probs=116.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeec-cCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDEL-AAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L-~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) |...+.. ......-+.+.+.|...-....|+.++.+.....+....+..-.. ..+...-..||+..+......-.. T Consensus 106 ~~~~t~~---~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~~~~~ 182 (392) T protein:vir:10 106 MSGLTGE---DGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRVLEKNSDMIPFAEITEMGEIPETDNPKFSN 182 (392) T ss_pred ccccccC---CCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEeecCCccceeeccccccccccccccee Confidence 3333211 111112235666666666677888777665554433222221111 112223346777665322111111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 
80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) +.=-..-+...+.||...-.-........+..++ ...+.+-++.++++|...... T Consensus 183 v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l---~~~i~~~~d~~~~~g~g~~~~---------------------- 237 (392) T protein:vir:10 183 VQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWL---GKKSKVTRNVLILGVIEKLTK---------------------- 237 (392) T ss_pred EEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHH---HHHHHHHHHHHHhhccccccc---------------------- Confidence 1111222344555554332211122223333333 444566778888876432100 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeE Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSI 238 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~ 238 (330) ....+-+.|.+++...-..+ .....+++||....+|.++ +|.+. |+.....-.... T Consensus 238 -------------------~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~l-kd~~G---~~l~~~~~~~~~ 294 (392) T protein:vir:10 238 -------------------QAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKL-KDKDG---KYILQSDPTQKN 294 (392) T ss_pred -------------------cCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh-hccCC---CeEeecCccCCc Confidence 00134455666554322222 2223578999999999886 43321 222111000000 Q ss_pred EEEEEEEEcCCeE--EEEEEcCcCCCcccccc--EEEEEcchh-hhhcccCCccccccccccc-cc----eeeEEEEEEE Q lcl|NC_020858. 239 VANADVYEGPFGK--VMIHPNRVMAGSGALAR--NAFFVDPEF-LQFGWLRKIAEDKKVAKTG-DA----EKFMLIGEGA 308 (330) Q Consensus 239 ~~~v~~~~tdfG~--v~iv~nR~mp~~~~~a~--~~~~ld~~~-~~~~~Lr~~~~~e~laKtG-d~----~k~~i~~E~t 308 (330) -.+=+|. |.+..+..++.....++ .+++.|++. +.+.. 
|....-+...-++ .+ ..+.++..++ T Consensus 295 ------~~tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d 367 (392) T protein:vir:10 295 ------KKLFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-REDMELASTDVGGKAFTRNTLDLRAIQRDD 367 (392) T ss_pred ------cccccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEe-ecceEEEEeccccchhhcCceEEEEEEeec Confidence 0011342 22222444433222222 255667663 32222 2111111111111 22 3344556688 Q ss_pred EEEecchheeEEeccccccccC Q lcl|NC_020858. 309 LKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 309 Le~~N~~a~g~i~gLt~~~~~~ 330 (330) ..+++|.|..++..=.--...| T Consensus 368 ~~v~~~~a~~~l~~~~~a~~~~ 389 (392) T protein:vir:10 368 VQMWDNEAAVYGEIDLSAPVEQ 389 (392) T ss_pred cEEecccceEEEEecccccccC Confidence 9999999998886555333333 No 106 >protein:vir:105004 Length: 392 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:1490 # MgeName: W Beta # Cross-refs: genbank:acc:YP_459969;genbank:gi:85701384;genbank:GeneID:3882145 Probab=92.03 E-value=0.013 Score=30.90 Aligned_cols=272 Identities=10% Similarity=-0.005 Sum_probs=116.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeec-cCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDEL-AAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L-~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) |...+.. ......-+.+.+.|...-....|+.++.+.....+....+..-.. ..+...-..||+..+......-.. T Consensus 106 ~~~~t~~---~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~~~~~ 182 (392) T protein:vir:10 106 MSGLTGE---DGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRVLEKNSDMIPFAEITEMGEIPETDNPKFSN 182 (392) T ss_pred ccccccC---CCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEeecCCccceeeccccccccccccccee Confidence 3333211 111112235666666666677888777665554433222221111 112223346777665322111111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 
80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) +.=-..-+...+.||...-.-........+..++ ...+.+-++.++++|...... T Consensus 183 v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l---~~~i~~~~d~~~~~g~g~~~~---------------------- 237 (392) T protein:vir:10 183 VQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWL---GKKSKVTRNVLILGVIEKLTK---------------------- 237 (392) T ss_pred EEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHH---HHHHHHHHHHHHhhccccccc---------------------- Confidence 1111222344555554332211122223333333 444566778888876432100 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeE Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSI 238 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~ 238 (330) ....+-+.|.+++...-..+ .....+++||....+|.++ +|.+. |+.....-.... T Consensus 238 -------------------~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~l-kd~~G---~~l~~~~~~~~~ 294 (392) T protein:vir:10 238 -------------------QAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKL-KDKDG---KYILQSDPTQKN 294 (392) T ss_pred -------------------cCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh-hccCC---CeEeecCccCCc Confidence 00134455666554322222 2223578999999999886 43321 222111000000 Q ss_pred EEEEEEEEcCCeE--EEEEEcCcCCCcccccc--EEEEEcchh-hhhcccCCccccccccccc-cc----eeeEEEEEEE Q lcl|NC_020858. 239 VANADVYEGPFGK--VMIHPNRVMAGSGALAR--NAFFVDPEF-LQFGWLRKIAEDKKVAKTG-DA----EKFMLIGEGA 308 (330) Q Consensus 239 ~~~v~~~~tdfG~--v~iv~nR~mp~~~~~a~--~~~~ld~~~-~~~~~Lr~~~~~e~laKtG-d~----~k~~i~~E~t 308 (330) -.+=+|. |.+..+..++.....++ .+++.|++. +.+.. 
|....-+...-++ .+ ..+.++..++ T Consensus 295 ------~~tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d 367 (392) T protein:vir:10 295 ------KKLFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-REDMELASTDVGGKAFTRNTLDLRAIQRDD 367 (392) T ss_pred ------cccccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEe-ecceEEEEeccccchhhcCceEEEEEEeec Confidence 0011342 22222444433222222 255667663 32222 2111111111111 22 3344556688 Q ss_pred EEEecchheeEEeccccccccC Q lcl|NC_020858. 309 LKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 309 Le~~N~~a~g~i~gLt~~~~~~ 330 (330) ..+++|.|..++..=.--...| T Consensus 368 ~~v~~~~a~~~l~~~~~a~~~~ 389 (392) T protein:vir:10 368 VQMWDNEAAVYGEIDLSAPVEQ 389 (392) T ss_pred cEEecccceEEEEecccccccC Confidence 9999999998886555333333 No 107 >protein:vir:102873 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1492 # MgeName: Cherry # Cross-refs: genbank:acc:YP_338137;genbank:gi:77020198;genbank:GeneID:3703782 Probab=92.03 E-value=0.013 Score=30.90 Aligned_cols=272 Identities=10% Similarity=-0.005 Sum_probs=116.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeec-cCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDEL-AAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L-~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) |...+.. ......-+.+.+.|...-....|+.++.+.....+....+..-.. ..+...-..||+..+......-.. T Consensus 106 ~~~~t~~---~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~~~~~ 182 (392) T protein:vir:10 106 MSGLTGE---DGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRVLEKNSDMIPFAEITEMGEIPETDNPKFSN 182 (392) T ss_pred ccccccC---CCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEeecCCccceeeccccccccccccccee Confidence 3333211 111112235666666666677888777665554433222221111 112223346777665322111111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 
80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) +.=-..-+...+.||...-.-........+..++ ...+.+-++.++++|...... T Consensus 183 v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l---~~~i~~~~d~~~~~g~g~~~~---------------------- 237 (392) T protein:vir:10 183 VQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWL---GKKSKVTRNVLILGVIEKLTK---------------------- 237 (392) T ss_pred EEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHH---HHHHHHHHHHHHhhccccccc---------------------- Confidence 1111222344555554332211122223333333 444566778888876432100 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeE Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSI 238 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~ 238 (330) ....+-+.|.+++...-..+ .....+++||....+|.++ +|.+. |+.....-.... T Consensus 238 -------------------~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~l-kd~~G---~~l~~~~~~~~~ 294 (392) T protein:vir:10 238 -------------------QAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKL-KDKDG---KYILQSDPTQKN 294 (392) T ss_pred -------------------cCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh-hccCC---CeEeecCccCCc Confidence 00134455666554322222 2223578999999999886 43321 222111000000 Q ss_pred EEEEEEEEcCCeE--EEEEEcCcCCCcccccc--EEEEEcchh-hhhcccCCccccccccccc-cc----eeeEEEEEEE Q lcl|NC_020858. 239 VANADVYEGPFGK--VMIHPNRVMAGSGALAR--NAFFVDPEF-LQFGWLRKIAEDKKVAKTG-DA----EKFMLIGEGA 308 (330) Q Consensus 239 ~~~v~~~~tdfG~--v~iv~nR~mp~~~~~a~--~~~~ld~~~-~~~~~Lr~~~~~e~laKtG-d~----~k~~i~~E~t 308 (330) -.+=+|. |.+..+..++.....++ .+++.|++. +.+.. 
|....-+...-++ .+ ..+.++..++ T Consensus 295 ------~~tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d 367 (392) T protein:vir:10 295 ------KKLFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-REDMELASTDVGGKAFTRNTLDLRAIQRDD 367 (392) T ss_pred ------cccccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEe-ecceEEEEeccccchhhcCceEEEEEEeec Confidence 0011342 22222444433222222 255667663 32222 2111111111111 22 3344556688 Q ss_pred EEEecchheeEEeccccccccC Q lcl|NC_020858. 309 LKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 309 Le~~N~~a~g~i~gLt~~~~~~ 330 (330) ..+++|.|..++..=.--...| T Consensus 368 ~~v~~~~a~~~l~~~~~a~~~~ 389 (392) T protein:vir:10 368 VQMWDNEAAVYGEIDLSAPVEQ 389 (392) T ss_pred cEEecccceEEEEecccccccC Confidence 9999999998886555333333 No 108 >protein:vir:1383 Length: 421 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:314 # MgeName: phi3626 # Cross-refs: genbank:acc:NP_612835;genbank:gi:20065969;genbank:GeneID:935826 Probab=91.50 E-value=0.016 Score=30.50 Aligned_cols=264 Identities=10% Similarity=-0.009 Sum_probs=116.0 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccc-cccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGA-NITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~-na~~EG~d~~~~~~~~~~~ 79 (330) .+..+.+ ......-+.+...|...-....|+..++......+....+.......... -...||.+.+......... T Consensus 114 ra~~t~~---~gg~liP~~~~~~Ii~~~~~~~~l~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~E~~~~~~s~~~f~~i 190 (421) T protein:vir:13 114 RDIMSST---NNGAVIPQEFVNEFEKLKEGYPSLKEHCHVIPVNRNAGKMPVRAGASVDKLANLAKDTELVKAMLKTQPM 190 (421) T ss_pred hhccccC---CcceecchhhHHHHHHHHHhhhhhhhhceeeeccCCceEEEEeecCCccceeeccccccccccccceeEE Confidence 1222211 01111223555556555566677777766555554444444433322211 1134666544332221111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhcc--ccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccc Q lcl|NC_020858. 
80 LGNYTQIMRKSGIISGTQNITDEAG--RATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGA 157 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G--~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~ 157 (330) .-.. .-+..-+.|| .+.+.... ..+.+..++.++ +.+=++..+++ + ..|+. T Consensus 191 ~~~~-~k~~~~v~iS--~ell~ds~~~l~~~i~~~la~~---~~~~~~~~i~~-~---------~~g~~----------- 243 (421) T protein:vir:13 191 AYDI-DDYGLLAPID--NSLLEDSEINFLEFVNEEFAEF---AVNTENAEIVK-Q---------AKAVL----------- 243 (421) T ss_pred Eeee-eeeEeehhhh--HHHHhhhHHHHHHHHHHHHHHH---HHHHhhhhHhh-h---------hhhcc----------- Confidence 1111 1122233344 33333322 122233333222 12222222221 0 11110 Q ss_pred ccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecC---Cc Q lcl|NC_020858. 158 TGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASN---GK 234 (330) Q Consensus 158 ~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~---~~ 234 (330) .. ....+.+.|.+++.++-.++.....+++||....+|..+ +|.+. |+...+ +. T Consensus 244 --------------~~-----~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~l~~l-kd~~G---~~i~~~~~~~~ 300 (421) T protein:vir:13 244 --------------AE-----ETINDYAGLVKTINSLVPNARKRAIIVTNSDGRAYLDGL-MDKQG---RPLLKELSDGG 300 (421) T ss_pred --------------cc-----ccccchHHHHHHHHHhhhhhcCCCEEEEcHHHHHHHHHh-hcCCC---ceeecCcCCCC Confidence 00 001345667778888877776666788999999998876 44321 222211 11 Q ss_pred ceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchh-hhhcccCCccccccccccccceee----EEEEEEEE Q lcl|NC_020858. 235 NNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKIAEDKKVAKTGDAEKF----MLIGEGAL 309 (330) Q Consensus 235 ~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKtGd~~k~----~i~~E~tL 309 (330) ..+| +| +.|+....||........+++.|+.. +.+.. |.-..-+ .....++++. ..+..+.. 
T Consensus 301 ~~tl----------~G-~pV~~~~~~~~~~~~~~~~~~gd~~~~~~~~~-~~~~~v~-~~~~~~f~~~~~~~r~~~r~d~ 367 (421) T protein:vir:13 301 DLVF----------KG-RPVIELEESIFDVGDETKFIVSDFKTLIKFMD-RKQYLID-QSKEAGYTKNETIARIIERFDV 367 (421) T ss_pred Ccee----------cc-eeeEEeccccccCCCceEEEEEeccccEEEEE-ecceEEE-eecccccccCeeEEEEEeeecc Confidence 1111 22 24455556775544444577788764 33322 2211111 1222334443 33445666 Q ss_pred EEecchhe--------eEEeccccccccC Q lcl|NC_020858. 310 KPKNEKGL--------GVAADLYGLTAST 330 (330) Q Consensus 310 e~~N~~a~--------g~i~gLt~~~~~~ 330 (330) .+.+++|. +.+..+++..++| T Consensus 368 ~~~~~~a~~~~~~~~~~a~v~~~~~~~~~ 396 (421) T protein:vir:13 368 NSPLDKSSDAEKIRKFGVIVKLQEVLKSS 396 (421) T ss_pred eeecchhhheeeecccceeeccccccCCC Confidence 77777764 4566677777777 No 109 >protein:vir:81160 Length: 371 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:1892 # MgeName: Geobacillus virus E2 # Cross-refs: genbank:acc:YP_001285811;genbank:gi:148747732;genbank:GeneID:5247203 Probab=91.39 E-value=0.016 Score=30.42 Aligned_cols=261 Identities=10% Similarity=0.005 Sum_probs=116.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceee-eeeeeccCcccccccccccccccc-ccCce Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPE-WTTDELAAPGANITLEGDEYTFDA-TVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~-W~td~L~~~~~na~~EG~d~~~~~-~~~~~ 78 (330) |..-+. +....-.-+.+.+.|...-....|++.++......+.... |....-..+...-..||++.+... ..... T Consensus 91 ~~~~t~---~~gg~~vP~~~~~~ii~~~~~~s~i~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~~f~~ 167 (371) T protein:vir:81 91 MSEGSN---QDGGYTVPQDIQTRINELRESKDALQNLITVEPVTTLSGSRVFKKRSQQTGFVEVAEGAAIGEKATPQFTL 167 (371) T ss_pred hccCCC---ccCceeecHhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEeecCCcceeeeccccccccccccceee Confidence 332221 1111112235666677666777888887766555433322 233222223333346777755322 22111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 
79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) ..-+.-. +...+.|| .+.+.... .+..+|=...-...+.+-+|.++++|....... | T Consensus 168 i~~~~~k-~~~~~~iS--~ell~ds~-~~l~~~i~~~l~~a~~~~~~~~i~~g~g~~~~~-----~-------------- 224 (371) T protein:vir:81 168 LQYQVKK-YAGFFRVT--NELLNDST-EAIVNTLVRWIGDESRVTRNGLIINVLNTKAKT-----A-------------- 224 (371) T ss_pred EEeeeeE-EEEeehhh--HHHHhhhh-HHHHHHHHHHHHHHHHHHHHHHHHhhccccccc-----c-------------- Confidence 1111111 11223333 33333322 122333333344556778899999886531110 0 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHH-HHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec----CC Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQ-GYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS----NG 233 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~-~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~----~~ 233 (330) ..+-+.+.+++.. +-.+-.....+++||....+|..+ ++.+. |+... ++ T Consensus 225 ----------------------~~~~~~i~~~~~~~l~~~~~~~a~~vmn~~~~~~L~~l-kd~~g---~~l~~~~~~~~ 278 (371) T protein:vir:81 225 ----------------------IADLDGLKQIINVQLDPVFRSTSSVIVNQDAFNWLDTL-KDQNG---QYLLQPSISSP 278 (371) T ss_pred ----------------------cccHHHHHHHHHhhcchhhhcCCEEEEcHHHHHHHHHh-hccCC---CeeeecccCCC Confidence 1222334443332 111111223578999998888876 33321 12111 11 Q ss_pred cceeEEEEEEEEEcCCeEEEEEEcCcCCCccc-------cccEEEEEcchh-hhhcccCCccccccccccc-----ccee Q lcl|NC_020858. 234 KNNSIVANADVYEGPFGKVMIHPNRVMAGSGA-------LARNAFFVDPEF-LQFGWLRKIAEDKKVAKTG-----DAEK 300 (330) Q Consensus 234 ~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~-------~a~~~~~ld~~~-~~~~~Lr~~~~~e~laKtG-----d~~k 300 (330) ...++ +| +.|+...+||.+.. ....+++.|+.. +.+..-.++ .-+-..-.+ +... 
T Consensus 279 ~~~~l----------~G-~pV~~~~~~~~~~~~~~~~~~~~~~i~~Gd~~~~~~~~~~~~~-~i~~~~~~~~~f~~~~v~ 346 (371) T protein:vir:81 279 TGRQL----------LG-LPVVIVSNKVLANRVDGGTGAQFAPIIVGDLKEAVVMFDRQRT-EIMSSNVAMDAFETDATL 346 (371) T ss_pred CCcee----------cc-eeEEEecccccCccccccccCCcceEEEEehhceEEEEeecce-EEEEeccccchhhcCceE Confidence 11111 12 35555566663321 123467777653 332211111 111111111 3345 Q ss_pred eEEEEEEEEEEecchheeEEecccccccc Q lcl|NC_020858. 301 FMLIGEGALKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 301 ~~i~~E~tLe~~N~~a~g~i~gLt~~~~~ 329 (330) +..+..+...+++|.|..++. ++.+ T Consensus 347 ~~~~~r~d~~~~~~~a~~~~~----~~~A 371 (371) T protein:vir:81 347 WRAIERMDVKMRDDEAFVFGE----VQLA 371 (371) T ss_pred EEEEEeeccEEecccceEEEE----EecC Confidence 556666899999999988776 3344 No 110 >protein:vir:105038 Length: 428 # NCBI annotation: major capsid head protein precursor # Family: family:all:21 # MgeID: mge:1465 # MgeName: phiKO2 # Cross-refs: genbank:acc:YP_006586;genbank:gi:46402092;genbank:GeneID:2777903 Probab=90.65 E-value=0.02 Score=29.93 Aligned_cols=276 Identities=11% Similarity=0.034 Sum_probs=127.1 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeee-eccceeccceeeeeeeeccCccccccccccccccccccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSM-IEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~-ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~ 79 (330) ++..+.+ +...+..-+++.+.|+..=...+|+..+ ....+..+-.+.|...+-. +...-..||...+..... T Consensus 125 ~~~~~~~--~~gg~liP~~~~~~ii~~l~~~~~l~~~~~~~~~~~~g~~~~p~~~~~-~~a~~v~Eg~~~~~~~~~---- 197 (428) T protein:vir:10 125 MAISTAA--GSGGVLIPQNIHSEVIELLRDRTIVRKLGARSIPLPNGNMSLPRLAGG-ATASYTGENQDAKVSEAR---- 197 (428) T ss_pred hhhcccc--cCCccccchhHHHHHHHHHhhhchhhhhcceeeecCCcceEEEEEeCC-cceeeeccCccccccccc---- Confidence 2222211 1111112235555565555677787665 1222333334555543322 211223577776554322 Q ss_pred ecceE---EEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh-ccccc Q lcl|NC_020858. 
80 LGNYT---QIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK-TNVSR 155 (330) Q Consensus 80 ~~N~t---QIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~-tn~~~ 155 (330) +.+++ .-+...+.|| .+.+.... .+..+|=...-...+.+-+|.++|+|... +. +.-||+.-.. ++... T Consensus 198 f~~i~~~~~k~~~~v~is--~ell~ds~-~~l~~~i~~~l~~ai~~~~d~~~l~G~G~-~~---~p~Gi~~~~~~~~~~~ 270 (428) T protein:vir:10 198 FDDVKLTAKTMIAMVPIS--NALIGRAG-FNVEQLVLQDILTAISVREDKAFMRDDGT-GD---TPIGMKARATQWNRLL 270 (428) T ss_pred eeeEEeeeEEEEEeehhh--HHHHhhhh-HHHHHHHHHHHHHHHHHHHHHHHhccCCC-Cc---cccccccccccccccc Confidence 22222 2222233444 34444332 23334444555666889999999998642 11 1224433210 00000 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHH---HhcCC---ceeEEEeChHHHHHHHHhhccceeeeeeee Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQG---YQSGA---NFKHVFVSPYVKSVFVTFMSDTNVASFRYA 229 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~---~~~Gg---~~~~i~v~~~~k~~is~f~~~~~~~~~r~~ 229 (330) . .++....+-+.++.++..+ +..+. .....++++....++..+ ++.+. |+. T Consensus 271 ----------------~---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l-kd~~G---~~i 327 (428) T protein:vir:10 271 ----------------P---WAADAAVNLDTIDTYLDSIILMSMDGNSNMISSGWGMSNRTYMKLFGL-RDGNG---NKV 327 (428) T ss_pred ----------------c---ccccccccHHHHHHHHHHHHHhhhccccccccCEEEEcHHHHHHHHHh-hccCC---cee Confidence 0 0011123333343333333 32221 122457899888888876 34331 122 Q ss_pred ecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccc---cEEEEEcchhhhhcccCCcccccc-----cc-c------ Q lcl|NC_020858. 230 ASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALA---RNAFFVDPEFLQFGWLRKIAEDKK-----VA-K------ 294 (330) Q Consensus 230 ~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a---~~~~~ld~~~~~~~~Lr~~~~~e~-----la-K------ 294 (330) ..+....+ +--+.|+.+.+||.+...+ ..+++.|++.+-+.. ++-..-+. +. . 
T Consensus 328 ~~~~~~g~-----------l~G~pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~-~~~i~i~~~~~~~~~~~~~~~~~ 395 (428) T protein:vir:10 328 YPEMAQGM-----------LKGYPIQRTSAIPANLGEGGKESEIYFADFNDVVIGE-DGNMKVDFSKEASYIDTDGKLVS 395 (428) T ss_pred ccCCCCCe-----------eeceeeEEeccccccccCCCccceEEEEecceEEEEE-ecceEEEeecccccccccccccc Confidence 11111101 1124677777888753222 236778887755532 22111100 00 0 Q ss_pred --cccceeeEEEEEEEEEEecchheeEEecccc Q lcl|NC_020858. 295 --TGDAEKFMLIGEGALKPKNEKGLGVAADLYG 325 (330) Q Consensus 295 --tGd~~k~~i~~E~tLe~~N~~a~g~i~gLt~ 325 (330) .-|...+..+..+.+.+.+|.|..+++|+.= T Consensus 396 ~f~~~~~~~R~~~r~d~~v~~p~a~~~~t~~~~ 428 (428) T protein:vir:10 396 AFSRNQSLIRVVTEHDIGFRHPEGLVLGTGVLF 428 (428) T ss_pred hhhcchhheeeeeeeCceeeccceEEEEeccCC Confidence 1122344556668899999999999999887 No 111 >protein:vir:80068 Length: 301 # NCBI annotation: gp8 # Family: family:all:463 # MgeID: mge:1876 # MgeName: B054 # Cross-refs: genbank:acc:YP_001468712;genbank:gi:157325292;genbank:GeneID:5601759 Probab=90.42 E-value=0.021 Score=29.79 Aligned_cols=288 Identities=9% Similarity=-0.016 Sum_probs=119.4 Q ss_pred CCccc-cceeeccccccccccceeeEecCCcccceeeeec---cceeccceeeeeeeeccCccccccccc-ccccccccc Q lcl|NC_020858. 1 MAVVT-NTFQSTGAKGNREELADVVSRITPEDTPIYSMIE---KVSFDTTHPEWTTDELAAPGANITLEG-DEYTFDATV 75 (330) Q Consensus 1 Ma~~t-~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig---~~~~~~~~~~W~td~L~~~~~na~~EG-~d~~~~~~~ 75 (330) |-.-. +.|. ....+-++..|+.+--..+=.-.+++ ........+.+...+.....+ ....+ .|.+-. .. T Consensus 1 ~~~~~~g~f~----~~~l~~id~~v~e~~~~~l~~r~l~~v~~~~~~~~~~~~~~~~~~~G~~~-~~~~~~~dip~~-~~ 74 (301) T protein:vir:80 1 MQGKITATIE----ARDLQAIDNVIYEPKQEELTARSVFPQKFDVNEGAESYSFDVMTRSGAAK-IIANGADDLPLV-DV 74 (301) T ss_pred CCccccchhh----HHHHHHHHHHHHHhhhhhhhhhhhcccccCCCCceEEEEEeeeccceeEE-EecCcccccccc-cc Confidence 22211 1111 11222233333332211111111221 112222222222222211111 11111 111111 11 Q ss_pred CceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 
76 SPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 76 ~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) ...+..-..-.|.+.+.++---.+......-+.-......+...+.+.++..++.|.+... .-||+.. .+... T Consensus 75 ~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aa~~~~~~~~n~~~f~G~~~~g-----~~GLlN~--p~~~~ 147 (301) T protein:vir:80 75 DMVRKSVPIYSIGIGLSYTIQDLRAARMQGTTVDAAKATTVRRAIAEKENSIAFRGEKKYA-----IKGAFEA--TGIQI 147 (301) T ss_pred cceeEEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEeeeccccc-----ceeeecC--CCccc Confidence 1122222333344555555433333333334555555667778888888888888865321 1222211 00000 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHHHhc-CC--ceeEEEeChHHHHHHHHh-hccceeeeeeeeec Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS-GA--NFKHVFVSPYVKSVFVTF-MSDTNVASFRYAAS 231 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~-Gg--~~~~i~v~~~~k~~is~f-~~~~~~~~~r~~~~ 231 (330) ...+.++. .+..+- ...|+..+ -++|..++.++|.. |+ .+.+|+++|.+-..++.= .++.. T Consensus 148 ~~~~~~~~----~~~~~w-~~~t~~ei-~~di~~~~~~l~~~s~g~~~p~~L~L~p~~~~~L~~~~~~~~~--------- 212 (301) T protein:vir:80 148 DVSPTTGV----GNVSKW-EKKTAEQI-IDEIGEAHTKITVLPGYGTASLKLCLPPKQFELINKKRYSNED--------- 212 (301) T ss_pred ccccCccc----cccccc-ccCCHHHH-HHHHHHHHHHHHHhcCceecccEEEecHHHHHhhhhccccCCC--------- Confidence 00000000 000000 00111111 25678888888874 33 467899999987777642 11111 Q ss_pred CCcceeEEEEEEEEEcCCeEEEEEEcCcCCCcc-ccccEEEEE--cchhhhhcccCCccccccccccccceeeEEEEE-E Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFGKVMIHPNRVMAGSG-ALARNAFFV--DPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGE-G 307 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~-~~a~~~~~l--d~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E-~ 307 (330) + . ...+.+...|.-+.|+.-+.+.... ...+.++++ ++++++++.=.|+.. -+....+-..+.-.+.. . 
T Consensus 213 -~-~----tvl~~l~~~~~~~~I~~~p~L~~~g~~g~~~~v~~~~~~d~~~~~v~~~~~~-~~~e~~~~~~~~~~~~r~~ 285 (301) T protein:vir:80 213 -S-R----SVLKVLQDNAWFSAIVRVPDLAGMGTAGSDSFAVIHDSNETAELIIPMDITR-HPEEYSFPRTKVPFEERTA 285 (301) T ss_pred -C-e----eHHHHHHHHcCcceEEEcceeccCCCCcccEEEEEecCCcEEEEEecCceee-ecceecCceeEeeeeeeeE Confidence 0 0 1112222333345555555554321 122334444 577777764233322 23444444333323444 4 Q ss_pred EEEEecchheeEEecc Q lcl|NC_020858. 308 ALKPKNEKGLGVAADL 323 (330) Q Consensus 308 tLe~~N~~a~g~i~gL 323 (330) +++++-|.+...+.|| T Consensus 286 Gv~i~~P~ai~~~~GI 301 (301) T protein:vir:80 286 GVVVRFPAAIVRVDGI 301 (301) T ss_pred EEEEEccceEEEEecC Confidence 7999999999999999 No 112 >protein:vir:96262 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1612 # MgeName: ROSA # Cross-refs: genbank:acc:YP_240311;genbank:gi:66395978;genbank:GeneID:5133339 Probab=86.85 E-value=0.043 Score=28.09 Aligned_cols=259 Identities=14% Similarity=0.086 Sum_probs=136.4 Q ss_pred CCccccceeeccccccccccceee----------EecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVV----------SRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I----------~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||..+ |..+... +.|-+++.+ ..+...+.-+ -|+.--+=+.+.|. .+.++. -..||.+.+ T Consensus 1 m~~~~-T~l~d~i--~Pev~~~~v~~~~~~~l~~~~~~~~~~~l---~g~~G~tv~iP~~~--~ig~a~--~~~~g~~i~ 70 (274) T protein:vir:96 1 MAQGM-TKLTNQI--VPEVLAPMMQAELEKKLRFASFAEIDNTL---VGQPGDTLTFPAFI--YSGDAK--VVAEGEKIP 70 (274) T ss_pred CCcce-eehhhee--chHHHHHHHHHHHHhhhhccccceecccc---cCCCCCEEEeeeec--CCCccc--cccCCCccc Confidence 88854 3332222 112122111 1222222211 12211122345674 343333 245777777 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 
71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ..........-.+.|. .+.|++++-+.... .+|.++.-......-+.+.++..++.--+....+ T Consensus 71 ~~~lt~~~~~~~i~~~-~~a~~i~D~~~~~~---~~d~~~~~~~~~~~~~a~~vd~~i~~~l~~a~~~------------ 134 (274) T protein:vir:96 71 TDILETKKREAKIRKI-AKGTSISDEALLSG---YGDPQGEQVRQHGLAHANKVDDDVLEALKSAKLT------------ 134 (274) T ss_pred hhhcccceeEEEeeee-ecceeehHHHHhhc---cchHHHHHHHHHHHHHHHHHHHHHHHHHhccccc------------ Confidence 7766666666666774 67888887654432 2456666666666777788887776321110000 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) ....+++.+.|.++++++=++....+.++|||.+...+-+.....+.. . T Consensus 135 --------------------------~~~~~~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~-----~ 183 (274) T protein:vir:96 135 --------------------------VEADITKLTGLQTAIDKFNDEDLEPMVLFISPLDAGKLRGDATTNFTR-----A 183 (274) T ss_pred --------------------------ccccccCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhccccccc-----c Confidence 001247788899999999888888889999999887766532222210 0 Q ss_pred cCCcceeEEEEEEEEEcCCeE---EEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEE--EE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGK---VMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFML--IG 305 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~---v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i--~~ 305 (330) .+.....+ +..-+|+ +.|+.+.-+|.. ..|++.+..+.+..-++. 
.-| .....+...-.| .- T Consensus 184 s~~g~~~~------~~G~ig~~~G~~Vi~s~~~~~~-----t~~l~~~gA~~~~~~~~~-~vE-~~Rd~~~~~d~i~~~~ 250 (274) T protein:vir:96 184 TELGDDVI------VKGAFGEALGAVIVRSNKLEAG-----TAILAKKGAVKLITKRDF-FLE-TDRDPSTKTTALYSDK 250 (274) T ss_pred ccccccce------eccccceecCeEEEEeCCCCCc-----eEEEEeccceeeeecCCc-ccc-cccccccccCEEEEeE Confidence 01111100 1112333 578888788855 346777777766433331 212 222222222222 33 Q ss_pred EEEEEEecchheeEEecccccccc Q lcl|NC_020858. 306 EGALKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 306 E~tLe~~N~~a~g~i~gLt~~~~~ 329 (330) =|+..+.||.+..+++==||=-+- T Consensus 251 ~y~~~~~~~~~~v~~tk~~~~~~~ 274 (274) T protein:vir:96 251 HYVAYLYDESKAVKITKGSGSLEM 274 (274) T ss_pred EEEEEEEcCCcEEEEEcCCccccC Confidence 388999999888887744443222 No 113 >protein:vir:95898 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1588 # MgeName: 71 # Cross-refs: genbank:acc:YP_240385;genbank:gi:66396054;genbank:GeneID:5133409 Probab=86.85 E-value=0.043 Score=28.09 Aligned_cols=259 Identities=14% Similarity=0.086 Sum_probs=136.4 Q ss_pred CCccccceeeccccccccccceee----------EecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVV----------SRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I----------~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||..+ |..+... +.|-+++.+ ..+...+.-+ -|+.--+=+.+.|. .+.++. -..||.+.+ T Consensus 1 m~~~~-T~l~d~i--~Pev~~~~v~~~~~~~l~~~~~~~~~~~l---~g~~G~tv~iP~~~--~ig~a~--~~~~g~~i~ 70 (274) T protein:vir:95 1 MAQGM-TKLTNQI--VPEVLAPMMQAELEKKLRFASFAEIDNTL---VGQPGDTLTFPAFI--YSGDAK--VVAEGEKIP 70 (274) T ss_pred CCcce-eehhhee--chHHHHHHHHHHHHhhhhccccceecccc---cCCCCCEEEeeeec--CCCccc--cccCCCccc Confidence 88854 3332222 112122111 1222222211 12211122345674 343333 245777777 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 
71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ..........-.+.|. .+.|++++-+.... .+|.++.-......-+.+.++..++.--+....+ T Consensus 71 ~~~lt~~~~~~~i~~~-~~a~~i~D~~~~~~---~~d~~~~~~~~~~~~~a~~vd~~i~~~l~~a~~~------------ 134 (274) T protein:vir:95 71 TDILETKKREAKIRKI-AKGTSISDEALLSG---YGDPQGEQVRQHGLAHANKVDDDVLEALKSAKLT------------ 134 (274) T ss_pred hhhcccceeEEEeeee-ecceeehHHHHhhc---cchHHHHHHHHHHHHHHHHHHHHHHHHHhccccc------------ Confidence 7766666666666774 67888887654432 2456666666666777788887776321110000 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAA 230 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~ 230 (330) ....+++.+.|.++++++=++....+.++|||.+...+-+.....+.. . T Consensus 135 --------------------------~~~~~~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~-----~ 183 (274) T protein:vir:95 135 --------------------------VEADITKLTGLQTAIDKFNDEDLEPMVLFISPLDAGKLRGDATTNFTR-----A 183 (274) T ss_pred --------------------------ccccccCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhccccccc-----c Confidence 001247788899999999888888889999999887766532222210 0 Q ss_pred cCCcceeEEEEEEEEEcCCeE---EEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEE--EE Q lcl|NC_020858. 231 SNGKNNSIVANADVYEGPFGK---VMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFML--IG 305 (330) Q Consensus 231 ~~~~~~~~~~~v~~~~tdfG~---v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i--~~ 305 (330) .+.....+ +..-+|+ +.|+.+.-+|.. ..|++.+..+.+..-++. 
.-| .....+...-.| .- T Consensus 184 s~~g~~~~------~~G~ig~~~G~~Vi~s~~~~~~-----t~~l~~~gA~~~~~~~~~-~vE-~~Rd~~~~~d~i~~~~ 250 (274) T protein:vir:95 184 TELGDDVI------VKGAFGEALGAVIVRSNKLEAG-----TAILAKKGAVKLITKRDF-FLE-TDRDPSTKTTALYSDK 250 (274) T ss_pred ccccccce------eccccceecCeEEEEeCCCCCc-----eEEEEeccceeeeecCCc-ccc-cccccccccCEEEEeE Confidence 01111100 1112333 578888788855 346777777766433331 212 222222222222 33 Q ss_pred EEEEEEecchheeEEecccccccc Q lcl|NC_020858. 306 EGALKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 306 E~tLe~~N~~a~g~i~gLt~~~~~ 329 (330) =|+..+.||.+..+++==||=-+- T Consensus 251 ~y~~~~~~~~~~v~~tk~~~~~~~ 274 (274) T protein:vir:95 251 HYVAYLYDESKAVKITKGSGSLEM 274 (274) T ss_pred EEEEEEEcCCcEEEEEcCCccccC Confidence 388999999888887744443222 No 114 >protein:vir:103886 Length: 302 # NCBI annotation: putative major head subunit protein # Family: family:all:776 # MgeID: mge:1522 # MgeName: D3112 # Cross-refs: genbank:acc:NP_938242;genbank:gi:38229147;genbank:GeneID:2648201 Probab=86.30 E-value=0.047 Score=27.89 Aligned_cols=285 Identities=11% Similarity=0.060 Sum_probs=140.0 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcccccccc-cccccccccc-Cce Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLE-GDEYTFDATV-SPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~E-G~d~~~~~~~-~~~ 78 (330) |.+.+..+.. -..+-+-.|.+-.....|.=..|. +....+.....+-|.-+- |. --| ..+....... ... T Consensus 1 m~it~~~l~~-l~~~~~~~~~~~y~~a~~~~~~~a-~~~~sdf~~~~~~~lg~~---p~---l~e~~Ge~~~~~l~~~~~ 72 (302) T protein:vir:10 1 MLINKQSLNA-AFVAIKTIFNNAFAAAPTTWQKIA-MEVPSNTSSNDYKWLSTF---PK---MRRWIGAKVVKNLKAYKY 72 (302) T ss_pred CcccHHHHHH-HHHHHHHHHHHHHHhhhhhhhcee-eecCCCcceeeceecCCC---CC---ccccccceeeccccccce Confidence 5554422111 111111122222222222222221 112223334445565421 11 111 2222222111 112 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcC---CCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 
79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVAT---NASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g---~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) .+.|-+ |.+.|+|+..+-.=...|.-..+..++.++..++--++=+.+|.+ .+-.++.+.--. .|... T Consensus 73 ~i~~~~--~g~~v~i~R~~i~nDdlg~~~~~~~~~G~aaa~~~~~lv~~~L~~g~~~~~~DG~~fF~~-------dH~~g 143 (302) T protein:vir:10 73 VVENED--FEATVEVDRNDIEDDQIGIYSPQAKMAGYSAAQLPDELVYEAVNGAFTKPCFDGQYFIDT-------DHPVG 143 (302) T ss_pred eEEeec--ccceecccHHhhcccccchhHHHHHHHHHHHHhhHHHHHHHHHhccCCCcccCCcceecc-------ccccc Confidence 233333 777888886555544578777888888888888888888888764 222233321100 01110 Q ss_pred ccccccccccccccccccccccccccccHHHHHH---HHHHHHhcCC-----ceeEEEeChHHHHHHHHhhccceeeeee Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDD---VMQQGYQSGA-----NFKHVFVSPYVKSVFVTFMSDTNVASFR 227 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~---~~~~~~~~Gg-----~~~~i~v~~~~k~~is~f~~~~~~~~~r 227 (330) .....+. .......+..+++++.|.. +|++.-+.+| .|+.|+|+|.+......+..... T Consensus 144 ~~~~~N~--------g~~~~~~~~~~l~~~~~~aa~~am~~~k~~~G~~L~i~P~~LiVp~~le~~A~~ll~~~~----- 210 (302) T protein:vir:10 144 DASVSNK--------GTAPLSNASQAAAKAGYGAARTAMKKFKDEEGRSLNVSPNVLLVGPALEDVAKMLLTNPK----- 210 (302) T ss_pred ccccccc--------cchhhhhcccccchHHHHHHHHHHHHHhhhcccccccCCCEEEecchhHHHHHHHhhccc----- Confidence 0000000 0011112233566555544 4556656665 46789999999988888765432 Q ss_pred eeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc--cccc-cccccccceeeEEE Q lcl|NC_020858. 228 YAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI--AEDK-KVAKTGDAEKFMLI 304 (330) Q Consensus 228 ~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~--~~~e-~laKtGd~~k~~i~ 304 (330) ..++..+ . -+|++++|.++++.+.+ .=.++-||.-++..||.+. 
|+.+ ...-+-|..++.++ T Consensus 211 --~~~g~~N-------p---~~g~~~~vv~p~L~s~~---aWyL~a~~~~i~~~~l~g~~~P~~~~~~~~~~dgv~~k~~ 275 (302) T protein:vir:10 211 --LADNTPN-------P---YVGTAELVVDGRIESDT---AWFLLDTTKPVKPFIFQPRKQPEFVSQVNLDSDDVFNLRK 275 (302) T ss_pred --cCCCCcc-------e---eccceEEEEeeccCCCC---ceEEEecCCccceEEEcCccccEEEeccCCCCCceEEEEE Confidence 1222222 1 24889999999986432 1234558888888888542 1111 11224467777788 Q ss_pred EEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 305 GEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 305 ~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) -.++..+|-..+.|-=....|=..+. T Consensus 276 ~d~Gvd~R~~~G~~~wq~a~~s~g~~ 301 (302) T protein:vir:10 276 LKFGAEARAAAGYGFWQLAYGSTGTG 301 (302) T ss_pred EEEeeeeeeecchhhhhhhhccCccC Confidence 88898888766654333333322222 No 115 >protein:vir:1268 Length: 397 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:329 # MgeName: phi-105 # Cross-refs: genbank:acc:NP_690760;genbank:gi:22855000;genbank:GeneID:955203 Probab=85.54 E-value=0.052 Score=27.62 Aligned_cols=263 Identities=9% Similarity=-0.058 Sum_probs=117.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceee-eeeeeccCcccccccccccccccc-ccCce Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPE-WTTDELAAPGANITLEGDEYTFDA-TVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~-W~td~L~~~~~na~~EG~d~~~~~-~~~~~ 78 (330) |+..+. +......-+.+.+.|...-....|++.++......+.... |....-..+...-..||+..+... ..... T Consensus 123 ~~~~~~---~~gg~lvP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~~~~~ 199 (397) T protein:vir:12 123 MSGIND---EDGGILIPEDIGRQIHEFKRQFEPLEQYVTVEPVTTRSGTRLLEKNADMVPFSPVEELGNLPEIDQPRFTK 199 (397) T ss_pred cccccc---ccCcccCchhHHHHHHHhhhhhhhHHhhcceeeccCCceeEEEEEecCCcceeeeccccccccccccccee Confidence 322221 1111112345666676666778888887665544332222 122222222233446777654322 21111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 
79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) ..-+...+ ...+.|| .+.++..+. +...|=...-...+.+-+|.++++|.... .| .| T Consensus 200 v~~~~~k~-~~~~~is--~e~l~ds~~-~l~~~i~~~l~~~~~~~~d~~il~G~g~~--~~---~g-------------- 256 (397) T protein:vir:12 200 VSYSIIDY-GGIMTLS--NSMLNDSDQ-AIMTYVAKWFAKKSVVTRNNLILAAIASL--KK---VD-------------- 256 (397) T ss_pred EEeeheee-Eeeehhh--HHHHhhchH-HHHHHHHHHHHHHHHHHHHHHHHhccccc--cc---cc-------------- Confidence 11111111 1123333 444443331 22223333445556777899999885421 11 00 Q ss_pred cccccccccccccccccccccccccHHHHHHHHH-HHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec----CC Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQ-QGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS----NG 233 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~-~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~----~~ 233 (330) .++.+.+.+++. .+-.+-.....+++||....++..+ ++.+. |+... ++ T Consensus 257 ----------------------~~~~~~i~~~~~~~l~~~~~~~a~~~~n~~~~~~L~~l-kd~~G---~~l~~~~~~~g 310 (397) T protein:vir:12 257 ----------------------IDGLDGIKKALNVTLDPMVAPGSIVLTNQDGYDWLDTL-KDGTG---RYLLQPDPTNP 310 (397) T ss_pred ----------------------cccHHHHHHHHhhccchhhhCCCEEEEcHHHHHHHHHh-hccCC---ceeecccccCC Confidence 123344555443 2211111223578999998888876 33321 22211 11 Q ss_pred cceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchh-hhhcccCCc-ccccccc---ccccceeeEEEEEEE Q lcl|NC_020858. 234 KNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEF-LQFGWLRKI-AEDKKVA---KTGDAEKFMLIGEGA 308 (330) Q Consensus 234 ~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~-~~~~~Lr~~-~~~e~la---KtGd~~k~~i~~E~t 308 (330) ...+ =+|.=-++.+..||........+++.|++. +.+..-.+. .+..+-. -.-+...+..+..+. 
T Consensus 311 ~~~~----------l~G~pv~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d 380 (397) T protein:vir:12 311 TKKL----------LDGRPVVPFTNRVLKTQKGKAPLIIGNLKEAIVLFDREQQSIASTDTGAGAFETNSTKVRGIERED 380 (397) T ss_pred CCcc----------ccceeeEEecccccccCCCccEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeec Confidence 1111 145411223445665544445577888763 444321111 0101101 112344555666689 Q ss_pred EEEecchheeEEecccccccc Q lcl|NC_020858. 309 LKPKNEKGLGVAADLYGLTAS 329 (330) Q Consensus 309 Le~~N~~a~g~i~gLt~~~~~ 329 (330) ..+.+|.|..++.- |+- T Consensus 381 ~~~~~~~a~~~~~~----t~~ 397 (397) T protein:vir:12 381 VRKWDEDAVVFGQI----TVE 397 (397) T ss_pred cEEecccceEEEEE----eeC Confidence 99999999866542 111 No 116 >protein:vir:104342 Length: 314 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1593 # MgeName: RTP # Cross-refs: genbank:acc:YP_398971;genbank:gi:81343955;genbank:GeneID:3778874 Probab=85.48 E-value=0.053 Score=27.60 Aligned_cols=276 Identities=8% Similarity=0.022 Sum_probs=113.4 Q ss_pred CCcc----ccceeeccccccccccceeeEec------CCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 1 MAVV----TNTFQSTGAKGNREELADVVSRI------TPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~----t~~~~t~~~~g~~edl~d~I~~i------~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) |-.. ++.|+. .+.+-++..|+.+ .+..-|+.+-++ .....+.+...+-...++.---.+.|.+ T Consensus 19 ~~~~~~d~~~~fl~----~ql~~id~~v~e~~~~~~~~~~~i~v~~~~~---~~~et~~~~~~e~~G~a~~~~d~~~dip 91 (314) T protein:vir:10 19 MGVEKADAAGIWAV----SQLTAALNRAYEKEYAENSVVNIFPVTNEIP---GHAKYFEYPEFDGVGIAQIIADYSDDLP 91 (314) T ss_pred hcccchhhhHHHHH----HHHHHHHHHHhhhhccccccceeeccccCCC---CceeEEEeeeeccccceeeeCCcccccc Confidence 1111 122222 2333344444322 111222222111 1111222222221111110001112221 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhc---cccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHH Q lcl|NC_020858. 
71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEA---GRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPT 147 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~---G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~ 147 (330) -....- .+. ..+|++-...+..+.+-+..+ |. +.-......+...+.+.+.+..+.|.+... .-||+. T Consensus 92 ~vd~~~-~~~--~~~i~~~~~~~~~~~~El~~a~~~g~-~l~~~k~~aA~~~~~~~~n~i~f~G~~~~g-----~~GLlN 162 (314) T protein:vir:10 92 LVDAFM-TEK--QGKVFRFGNAFLISTDEIKAGAATGQ-SLSARKQALAFEAHDNLLDKLVWSGSAPHG-----IVSVFD 162 (314) T ss_pred eeeccc-cee--EEEEEEEEeeEEecHHHHHHHHHhCC-ChHHHHHHHHHHHHHHhhceEEEeeccccc-----ceeEee Confidence 111111 111 233444444444444444433 33 333344445566666677777777755321 222221 Q ss_pred HHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhc-CC--ceeEEEeChHHHHHHHHhhccceee Q lcl|NC_020858. 148 WVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS-GA--NFKHVFVSPYVKSVFVTFMSDTNVA 224 (330) Q Consensus 148 ~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~-Gg--~~~~i~v~~~~k~~is~f~~~~~~~ 224 (330) - .++ .+.++ ...| .|+..+ -+++..++.++|.. ++ .+++|.++|.+-..++.-....... T Consensus 163 ~--p~v-~~~~~-~~~W------------aT~~ei-~~Di~~~~~~l~~~s~g~~~p~~l~Lpp~~~~~L~~~~~~~~~t 225 (314) T protein:vir:10 163 Q--PNI-NNVVA-TPNW------------SVPQNA-IDDVTAMIDAVESSTQGLHHVTDILLPASARRVMQGLVPQTNLS 225 (314) T ss_pred c--CCC-ccccC-CCCc------------ccHHHH-HHHHHHHHHHHHHhcCccccceeEEecHHHHHhhcccccCCCcc Confidence 1 111 11111 1111 121122 36678888889974 33 4678999998876665422111111 Q ss_pred eeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccE---EEEEcchhhhhcccCCccccccccccccceee Q lcl|NC_020858. 225 SFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARN---AFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKF 301 (330) Q Consensus 225 ~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~---~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~ 301 (330) . .+.+.-.|=.+.|+.-+++.+....... ++.-|+++++++.=.|+.. -+.-+.+-+.+. 
T Consensus 226 v----------------l~~l~~n~~~l~I~~~~el~~ag~~g~~~~v~y~~~~~~~~~~vp~~~~~-l~~e~~~~~~~~ 288 (314) T protein:vir:10 226 Y----------------GELFTRNNPGLTIRFLQFLDNYDGAGGKAALAFEKSPLNMSIEIPEVTNV-LPAQPKDLHFRY 288 (314) T ss_pred H----------------HHHHHHhCCCcEEEEcccccccCCCcceEEEEEecCCcEEEEecCcccee-ecceecCceEEE Confidence 1 1111122223445555555432221111 2234677777764333322 233334433333 Q ss_pred EEEEE-EEEEEecchheeEEeccccc Q lcl|NC_020858. 302 MLIGE-GALKPKNEKGLGVAADLYGL 326 (330) Q Consensus 302 ~i~~E-~tLe~~N~~a~g~i~gLt~~ 326 (330) -.+.. .+++++-|.+...+.|||=- T Consensus 289 ~~~~r~~Gv~i~~P~ai~~~dGI~~~ 314 (314) T protein:vir:10 289 PVTSKATGLIVYRPLTMAVIKGITFA 314 (314) T ss_pred cceeeeEEEEEECcceeEeeeeeecC Confidence 23344 47999999999988877755 No 117 >protein:vir:80930 Length: 278 # NCBI annotation: Cps # Family: family:all:522 # MgeID: mge:1886 # MgeName: A500 # Cross-refs: genbank:acc:YP_001468392;genbank:gi:157324966;genbank:GeneID:5601363 Probab=84.66 E-value=0.059 Score=27.33 Aligned_cols=263 Identities=16% Similarity=0.122 Sum_probs=126.9 Q ss_pred CCccccceeecccccccccccee----------eEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADV----------VSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYT 70 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~----------I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~ 70 (330) ||..+ |..+... .-|-+++. +..+.+.+..+ -|+.--+=+.+.|. .+.++. -..||.+.+ T Consensus 1 Ma~~~-T~~~~~i--iPev~s~~v~~~~~~~~v~~~~~~~~~~l---~g~~G~tv~ip~~~--~~g~a~--~~~~g~~i~ 70 (278) T protein:vir:80 1 MADLT-TKLANLI--DPEVMGPMISAKLPKAIKFGKIAPIDNSL---EGQPGSEITVPKYK--YIGDAQ--DVAEGAAID 70 (278) T ss_pred CCCcc-eehhhee--cHHHHHHHHHHHHHHhhhhcccceecccc---cCCCCCEEEEeeec--cCCcce--eecCCCcCc Confidence 88644 2122211 11212221 11222222222 12211111245563 333322 245666665 Q ss_pred cccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 
71 FDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 71 ~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) ..........--+.|. .+.|+|++-... ..+ .|.++.-..+...-+.|.+++.++..-++.... T Consensus 71 ~~~lt~~~~~~~i~~~-~~a~~v~D~~~~--~~~-~d~~~~~~~~~a~~~a~~~d~~l~~~l~~a~~~------------ 134 (278) T protein:vir:80 71 YSALETESVKHGIKKA-GKGVKLTDESVL--SGY-GDPVEEAQKQIRMAIASKVDNDILEEALTTTLE------------ 134 (278) T ss_pred ccccccceeeEeeehh-hccccccHHHHh--hcc-ccHHHHHHHHHHHHHHHHHHHHHHHHHhccccc------------ Confidence 5544444433344453 456777775433 333 567777777778888999998887542211000 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCC-ceeEEEeChHHHHHHHHhhccceeeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGA-NFKHVFVSPYVKSVFVTFMSDTNVASFRYA 229 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg-~~~~i~v~~~~k~~is~f~~~~~~~~~r~~ 229 (330) +. + +.+..+ ..-..+.|.++..++=.++. ..+.++|||.+...+.+.....+... T Consensus 135 --~~----~-------------~~t~~~-~~~~~~~~~da~~~l~~~~~~~~~~ivv~p~~~~~L~k~~~~~~~~~---- 190 (278) T protein:vir:80 135 --VK----G-------------AINIGL-IDKIENTFTDAPDAIEDESITTTGVLFLNYKDTAKLREEAAGSWTKA---- 190 (278) T ss_pred --cc----c-------------ccccch-hhhHHHHHHHHHHhhcccCCCcccEEEECHHHHHHHHhhhhhhcccc---- Confidence 00 0 000000 00123556677766655543 34468999998777765432222100 Q ss_pred ecCCcceeEEEEEEEEEcCCeE---EEEEEcCcCCCccccccEEEEEcchhhhhcccCCc-cccccccccccceeeEEEE Q lcl|NC_020858. 230 ASNGKNNSIVANADVYEGPFGK---VMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI-AEDKKVAKTGDAEKFMLIG 305 (330) Q Consensus 230 ~~~~~~~~~~~~v~~~~tdfG~---v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~-~~~e~laKtGd~~k~~i~~ 305 (330) ...+. . .. ....+|+ +.|+.+..||.+ ..+++.+..+.+..-++. -|.+..++.+ ++...... 
T Consensus 191 ~~~g~-~---~~---~~G~ig~~~G~~Vi~s~~~p~~-----t~~l~~~gAi~~~~~~~~~vE~~Rd~~~~-~d~i~~~~ 257 (278) T protein:vir:80 191 SQLGD-D---LL---VKGAFGELLGWEIVRTKKLADG-----NALAVKAGALKTFLKRNLLAESGRDMDHK-LTKFNADQ 257 (278) T ss_pred ccccc-c---ce---eeccceeecceeEEEcCCCCcc-----eEEEEeccceeeeecCCcccccccchhhc-cceeeeee Confidence 00011 0 00 1123333 689999999965 457888886655433331 1222222222 12222234 Q ss_pred EEEEEEecchheeEEeccccc Q lcl|NC_020858. 306 EGALKPKNEKGLGVAADLYGL 326 (330) Q Consensus 306 E~tLe~~N~~a~g~i~gLt~~ 326 (330) =|++.+.||.+..+|+==.|= T Consensus 258 ~yg~~v~~~~~~v~it~~a~~ 278 (278) T protein:vir:80 258 HYAVALVDETKAVKVVPVAGN 278 (278) T ss_pred EEEEEEEcCcceEEEeeccCC Confidence 489999999998888766655 No 118 >protein:vir:95318 Length: 328 # NCBI annotation: hypothetical protein # Family: family:all:1903 # MgeID: mge:1564 # MgeName: phiV10 # Cross-refs: genbank:acc:YP_512264;genbank:gi:89152431;genbank:GeneID:3952987 Probab=84.27 E-value=0.062 Score=27.21 Aligned_cols=250 Identities=12% Similarity=0.076 Sum_probs=117.4 Q ss_pred CCccccceeecccccccc---ccceeeEecCCcccceeeeeccceec-ccee-eeeeeeccCcccccccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNRE---ELADVVSRITPEDTPIYSMIEKVSFD-TTHP-EWTTDELAAPGANITLEGDEYTFDATV 75 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~e---dl~d~I~~i~p~dTP~~s~ig~~~~~-~~~~-~W~td~L~~~~~na~~EG~d~~~~~~~ 75 (330) |+...++-.|-.-.-++- .+...|...=.+..|+|..+-=..+. .+.| .=+..+|..+.=...=||- .+... T Consensus 1 m~~~~~~~~TL~e~Akr~~~d~~~~~VIE~l~~~n~IL~~lpf~e~n~gt~~~~~v~~~LP~~~fR~lN~g~---~~s~~ 77 (328) T protein:vir:95 1 MAVKGLTALTLADWGKRVDPNGKVDKIIELLGQTNPILQDMPFVEGNLPTGHRTTIRSGLPSATWRLLNYGV---QPSKS 77 (328) T ss_pred CCccccccccHHHHHhhhCcchhHHHHHHHHhccchhHhhcceeecccCCcceeeEeeccCCceeeecCCcc---Ccccc Confidence 888754433322111111 12223333334455555544221221 1212 2244445444322111221 23345 Q ss_pred CceEecceEEEEeeeeeehhHHHHHhhccccc-hHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh---- Q lcl|NC_020858. 
76 SPERLGNYTQIMRKSGIISGTQNITDEAGRAT-KVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK---- 150 (330) Q Consensus 76 ~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~-e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~---- 150 (330) .+..++-.|=||.-.++|..-... ..|..+ ..++|+..+++-+.++++.+||+|.- +..|....||...+. T Consensus 78 tt~q~t~~l~ilgg~~eVDr~la~--~~Gn~~~~ra~q~~~~~ka~~~~~~~~~iyGds--a~~p~~F~GL~~R~~~~s~ 153 (328) T protein:vir:95 78 TTVQVTDSVGMLETYAEVDKSLAD--LNGNTAEFRLSEDRAFIEAMNQQMAQTLFYGDS--SVNPQQFMGLSSRYSSLSA 153 (328) T ss_pred eeEEEEEEEEEEecceeechHHHh--hcCCHHHHHHHHHHHHHHHHHHHHHHHHhcCCc--cCChhhhcchhhhcCcccc Confidence 678888999999999999985544 455444 45889999999999999999999963 334555666654331 Q ss_pred ---ccc-cccccccc------cccccc----------------------------cccc--------------------- Q lcl|NC_020858. 151 ---TNV-SRGATGAN------GGYNTG----------------------------TGLT--------------------- 171 (330) Q Consensus 151 ---tn~-~~g~~g~~------~~~~~~----------------------------~~~~--------------------- 171 (330) .|+ ..|..|.. -.|.+. .+.. T Consensus 154 ~~a~qiidaGgtg~~~TSi~~v~~g~~~~~giyPkG~~~Gl~~~d~g~~~~~~~~g~~y~~y~~~~~w~~Gl~i~d~r~v 233 (328) T protein:vir:95 154 GNAQNIIDAGGTGTDNTSIWLVVWGENTVHGIFPKGKKAGIQMEDKGQVTLEDANGGKYEGYRTHYKWDNGLALRDWRYV 233 (328) T ss_pred ccccceeecccCCCCceEEEEEEEcCCeEEEecccccccCceeeecCceeeecCCCCeeeEEEEEEEeeeeeEEcCcccE Confidence 011 11111110 000000 0000 Q ss_pred ----ccc-cccccccccHHHHHHH-HHHHHh--cCCce-eEEEeChHHHHHHHHhhccceeeeeeeeecCCcceeEEEEE Q lcl|NC_020858. 172 ----VAP-TDGTQRAFSKAIMDDV-MQQGYQ--SGANF-KHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNSIVANA 242 (330) Q Consensus 172 ----~~~-t~gt~~~lTe~~l~~~-~~~~~~--~Gg~~-~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~~~~~v 242 (330) +.. .+-+..+ ..++|.++ ++.++- +++.. 
-.++||-..+..+..-..+.....-......++ T Consensus 234 vrI~NId~~~l~~~~-~~~~l~~lm~~a~~~ip~~~~~~~~~y~n~~v~~~L~~q~~~~~n~~~~~~~~~g~-------- 304 (328) T protein:vir:95 234 VRIANIDVSNLSEPS-SAANIAKLMVKALHRIPNRGMGRPVFYMNRTVGQALDLQSLEKTSLAISVKETEGE-------- 304 (328) T ss_pred EEEecCccccccccc-ChhhHHHHHHHHHHHhccCCCCcceeehhHHHHHHHHHHHhcCcceeeeeeccCCc-------- Confidence 000 0000000 12334433 333331 44444 469999999999887655544333333333333 Q ss_pred EEEEcCCeEEEEEEcCcCCCccccccEEEEE Q lcl|NC_020858. 243 DVYEGPFGKVMIHPNRVMAGSGALARNAFFV 273 (330) Q Consensus 243 ~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~l 273 (330) ..|.|+-|.|.. .++++..+.-++ T Consensus 305 --~~t~~~gipir~-----~dai~~tE~~vv 328 (328) T protein:vir:95 305 --WWTSFRGVPIRE-----TDALLETEARVV 328 (328) T ss_pred --ceeEECCeEEEE-----EeeeecCccccC Confidence 234444455432 233332222233 No 119 >protein:vir:5255 Length: 304 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:117 # MgeName: Aaphi23 # Cross-refs: genbank:acc:NP_852760;genbank:gi:31544035;uniprot:Q7Y5U0;genbank:GeneID:2753552 Probab=80.37 E-value=0.096 Score=26.17 Aligned_cols=282 Identities=9% Similarity=0.000 Sum_probs=119.3 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccc---eeccceeeeeeeeccCccc--ccccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKV---SFDTTHPEWTTDELAAPGA--NITLEGDEYTFDATV 75 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~---~~~~~~~~W~td~L~~~~~--na~~EG~d~~~~~~~ 75 (330) |++-+ +.+.+.+-+...|+..-=.+.-...+|+-. ......+.+...+-...++ ...-.+.|.+-.... 
T Consensus 1 ~~~la------fl~~qL~~id~~vye~~~~~~~~~~lipv~t~~~~~~~~~~~~~~d~~G~a~~~~i~~~a~dip~vd~~ 74 (304) T protein:vir:52 1 MSLLA------YVKNGLTAVSKDIAETKYPEIVFPQFVYVDQQTAVGITEKLHYGADEHGSLDDGLITVGTSTLDQVEVG 74 (304) T ss_pred CchHH------HHHHHHHHHhhhhhccccccchhhhhccccCCCCcccceEEEeeeeccCcccccccCCcCCccceeecc Confidence 66655 445577777777775322222233344321 1122233444333222222 111223333322222 Q ss_pred CceEecceEEEEeeeeeehhHHHHHh---hccccchHHHHH-HHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhc Q lcl|NC_020858. 76 SPERLGNYTQIMRKSGIISGTQNITD---EAGRATKVKEQK-LKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKT 151 (330) Q Consensus 76 ~~~~~~N~tQIf~~~v~VS~T~~av~---~~G~~~e~a~q~-~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~t 151 (330) -.... .-|+.-..++..+.+-+. ..|. .+.-++ ..+...+..++....+.|.+...+ +-||+.- . T Consensus 75 ~~~~~---~~i~~~~~~~~y~~~El~~a~~~g~--~l~~~ka~aa~~a~~~~~n~v~~~Gd~~~~g----~~GllN~--p 143 (304) T protein:vir:52 75 FTPTR---SYIVPWAKSVTWTKPELEQGKLLGL--ALNTAKIMALNKNAQQTLQKVAFLGHAKDSR----LTGLLNN--K 143 (304) T ss_pred cceeE---EEEEEEeeeeeecHHHHHHHHHhCC--CcHHHHHHHHHHHHHhhhceEEEEeeccccc----eEEEEeC--C Confidence 22222 234444555555555433 3353 344444 344457777777777777542211 2233211 1 Q ss_pred ccccccccccccccccccccccccccccccccHHHHHHHHHHHHhc-CC--ceeEEEeChHHHHHHHHh-hccceeeeee Q lcl|NC_020858. 152 NVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS-GA--NFKHVFVSPYVKSVFVTF-MSDTNVASFR 227 (330) Q Consensus 152 n~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~-Gg--~~~~i~v~~~~k~~is~f-~~~~~~~~~r 227 (330) |+...+.. +.+ ....-...|+..+ -++|++++.++|.. |+ .+++|+++|.+.+.++.- .++.+.+... 
T Consensus 144 ~v~~~~~~--~~~-----a~~~w~~~T~~eI-~~di~~~~~~i~~~s~~~~~p~tl~Lpp~~~~~l~~~~~~~~~~Tvl~ 215 (304) T protein:vir:52 144 SVEVYAIK--GAA-----QNTKVQAMDFDKA-VAFFKEIFLKGMEKTKRIEAPNTFAIDSLDLAHLALVQRANTDTTALE 215 (304) T ss_pred Ccceeeec--CCc-----cCCccccCCHHHH-HHHHHHHHHHHHhccCceecCceEEeCHHHHHHHhhccCCCCCchHHH Confidence 11110000 000 0000011122221 25677889999974 43 377899999998887642 1111111111 Q ss_pred eeecCCcceeEEEEEEEEEcCCeE-EEEE--EcCcCCCccccccEEEE--EcchhhhhcccCCccccccccccccc-eee Q lcl|NC_020858. 228 YAASNGKNNSIVANADVYEGPFGK-VMIH--PNRVMAGSGALARNAFF--VDPEFLQFGWLRKIAEDKKVAKTGDA-EKF 301 (330) Q Consensus 228 ~~~~~~~~~~~~~~v~~~~tdfG~-v~iv--~nR~mp~~~~~a~~~~~--ld~~~~~~~~Lr~~~~~e~laKtGd~-~k~ 301 (330) +...+. ...+ |. ++|+ +..+..++....+.|++ -|++++++..=.|+ .-|+-.=.+ ..+ T Consensus 216 ~l~~n~----------~~~~--g~~l~I~~v~~~~~~~g~~g~~r~vvY~~d~~~~~~~vP~p~---~~l~~q~~~~~~~ 280 (304) T protein:vir:52 216 FLTKHL----------SAAA--GRQVAIKALPSNYGTRVTDGKTRAMVYVNSKEHVIFDVPMSP---TVLDAQPKGLLAF 280 (304) T ss_pred HHHHhc----------cccc--CCcceEEEecccccccCCCCceEEEEEecChhheEEecCccc---cccchhhcCCceE Confidence 110000 0011 11 3333 22233334333444444 46778877531232 122222222 222 Q ss_pred EE--EE-EEEEEEecchheeEEec Q lcl|NC_020858. 302 ML--IG-EGALKPKNEKGLGVAAD 322 (330) Q Consensus 302 ~i--~~-E~tLe~~N~~a~g~i~g 322 (330) .+ +. =.+++|+-|.+...+=- T Consensus 281 ~vp~~~r~gGv~v~~P~a~~y~D~ 304 (304) T protein:vir:52 281 ESGLRMAFGGVTFMEPDSALYVDY 304 (304) T ss_pred EecceeeeeeEEEEccceeeeecC Confidence 22 22 25799999988765544 No 120 >protein:vir:104479 Length: 310 # NCBI annotation: gp15 # Family: family:all:1105 # MgeID: mge:1548 # MgeName: P-SSM4 # Cross-refs: genbank:acc:YP_214651;genbank:gi:61806292;genbank:GeneID:3294534 Probab=79.96 E-value=0.0059 Score=32.82 Aligned_cols=94 Identities=16% Similarity=0.120 Sum_probs=51.1 Q ss_pred CCcc--cccee----eccccccccccceeeEecCCcccceeeee-ccceecccee-eeeeeeccCccccccc-ccccccc Q lcl|NC_020858. 
1 MAVV--TNTFQ----STGAKGNREELADVVSRITPEDTPIYSMI-EKVSFDTTHP-EWTTDELAAPGANITL-EGDEYTF 71 (330) Q Consensus 1 Ma~~--t~~~~----t~~~~g~~edl~d~I~~i~p~dTP~~s~i-g~~~~~~~~~-~W~td~L~~~~~na~~-EG~d~~~ 71 (330) .++. ..++. +....+....+.+|+..|.-.-+--|+.. |+..+....| .-..-.+-.+.+++.+ ||+|+ T Consensus 208 a~Isa~~ts~~V~d~s~~~~~~~i~Id~E~i~i~~isgn~LTV~RG~~~T~aa~H~~g~~V~~in~~d~~lle~gddf-- 285 (310) T protein:vir:10 208 ATISKTATGFAVANASGINQYDNIYIGAELMRVTNKVGNNLSVIRGYEKSTPTVHSVGSNVFIVNAADNALLESDDDF-- 285 (310) T ss_pred ccccccceeeeecccccccccceEEECcEEEEEEeeccceEEEEecccCCchhhhhcCCcEEEEccCCCccCCccccc-- Confidence 1111 00111 11123455567777776655444444433 5544333333 2222234444566665 57774 Q ss_pred ccccCceEecceEEEEeeeeeehhHHHHH Q lcl|NC_020858. 72 DATVSPERLGNYTQIMRKSGIISGTQNIT 100 (330) Q Consensus 72 ~~~~~~~~~~N~tQIf~~~v~VS~T~~av 100 (330) ...+..+||+||.... .+||+++|+ T Consensus 286 ---g~~e~~s~~~d~~~~~-~~~~~~~a~ 310 (310) T protein:vir:10 286 ---GFGEIYSEYTDMKKYN-PVSGQDEAI 310 (310) T ss_pred ---cccccccccccceeec-cccceeecC Confidence 4567789999977666 999999999 No 121 >protein:vir:103285 Length: 296 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1605 # MgeName: JK06 # Cross-refs: genbank:acc:YP_277465;genbank:gi:71834107;genbank:GeneID:3562396 Probab=77.43 E-value=0.12 Score=25.54 Aligned_cols=281 Identities=9% Similarity=0.020 Sum_probs=124.5 Q ss_pred CCccccceeeccccccccccceeeEec------CCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRI------TPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDAT 74 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i------~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~ 74 (330) |-.--.--.-...+.+.+-++..|+.. .+..-|+.+- .......+.+...+-....+.---++.|.+.... 
T Consensus 1 ~~~~~a~~~~~f~~~ql~~id~~v~e~~~~~l~~~~~i~v~~~---~~~~~~~~~~~~~~~~G~a~~~~~~~~dip~v~~ 77 (296) T protein:vir:10 1 MGVDKADAAGIWTVKQLTASLNKAYETEYDQNSVVNLFPVSNE---IPGYAKYFEYPVFDGVGIAQIVADYTDDLPLVDA 77 (296) T ss_pred CcccchhhhHHHHHHHHHHHHHHHHhhhhcccccceecccccC---CCCceeEEEeeeeeccCceeEeCCCccccceeec Confidence 332211111112233444455555432 1111222221 1111122333222211111111112222221111 Q ss_pred cCceEecceEEEEeeeeeehhHHHHHhhccc-cchHHHHH-HHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcc Q lcl|NC_020858. 75 VSPERLGNYTQIMRKSGIISGTQNITDEAGR-ATKVKEQK-LKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTN 152 (330) Q Consensus 75 ~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~-~~e~a~q~-~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn 152 (330) . ..+. ..+|++-...++.+-+-+..+.. +-.++-++ ..+...+.+.+...++.|.+... .-||+.- .+ T Consensus 78 ~-~~~~--~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~ka~aA~~~~~~~~n~~~f~G~~~~g-----~~GLlN~--p~ 147 (296) T protein:vir:10 78 L-ATER--QGKVFRFGNAFLISIDEIKVGQATGQSLSTRKQSLAFEAHDKLLDKLVWSGSTAHG-----IPSVFDY--PN 147 (296) T ss_pred c-ceeE--EEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEEeeccccc-----ceeEeec--CC Confidence 1 1122 23455545555544554444322 22444444 45557777777777788854321 1122111 11 Q ss_pred cccccccccccccccccccccccccccccccHHHHHHHHHHHHhc-CC--ceeEEEeChHHHHHHHHhhccceeeeeeee Q lcl|NC_020858. 153 VSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS-GA--NFKHVFVSPYVKSVFVTFMSDTNVASFRYA 229 (330) Q Consensus 153 ~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~-Gg--~~~~i~v~~~~k~~is~f~~~~~~~~~r~~ 229 (330) +.. ..+.+.|. ..+ .. -++|..++.++|.. +| .+++++++|.+...++.-+.+... 
T Consensus 148 v~~--~~~~~~W~----------~~t--~i-~~Di~~~~~~l~~~s~g~~~p~~l~L~p~~~~~L~~~~~~~~~------ 206 (296) T protein:vir:10 148 INN--VVSGGSWS----------QPT--TA-VSDITSLLDIIETSTNGQHRATHLLLPTTARRIMQNLVPGTSV------ 206 (296) T ss_pred Ccc--ccccCCcc----------CHH--HH-HHHHHHHHHHHHHhhCceecceeEEeCHHHHHHHhhccCCCCc------ Confidence 100 00011110 011 11 35677777777864 33 467899999998887654332211 Q ss_pred ecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccc-cccEEEEE--cchhhhhcccCCccccccccccccceeeEEEEE Q lcl|NC_020858. 230 ASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGA-LARNAFFV--DPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGE 306 (330) Q Consensus 230 ~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~-~a~~~~~l--d~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E 306 (330) ...+.+...|..++|+.-+++..... ..+.|+++ +|++++++.=.|+. ..++-+.+.+-+.-++.- T Consensus 207 ----------t~l~~ik~~~~~l~i~~~~~l~~a~~~g~~~~v~~~~~~~~~~~~v~~~~~-~~~~e~~~l~~~~~~~~~ 275 (296) T protein:vir:10 207 ----------SYGEFFRQNNSGVTVEFVQYLNDYNGTGTSAAIAYEKDPNNMAIEIPEATN-ALPAQPKDLHFKIPVTSK 275 (296) T ss_pred ----------cHHHHHHHhcCCceEEEeeeeccCCCCcceEEEEEEcCCceEEEEcCccee-eecccccCceEEEeeEee Confidence 11222333444555555555543221 12345554 48888887423332 234555555555444554 Q ss_pred E-EEEEecchheeEEeccccc Q lcl|NC_020858. 307 G-ALKPKNEKGLGVAADLYGL 326 (330) Q Consensus 307 ~-tLe~~N~~a~g~i~gLt~~ 326 (330) . +++++-|.+...+.|||=- T Consensus 276 ~~Gv~i~~P~ai~~~dGI~~~ 296 (296) T protein:vir:10 276 ATGLIVYRPLTMAVMKGITFA 296 (296) T ss_pred EEEEEEECCceeEEEeeeecC Confidence 5 5999999999988877655 No 122 >protein:vir:1239 Length: 274 # NCBI annotation: similar to phage B1 major head protein # Family: family:all:522 # MgeID: mge:25 # MgeName: phi ETA # Cross-refs: genbank:acc:NP_510938;genbank:gi:17426272;genbank:GeneID:927376 Probab=77.29 E-value=0.13 Score=25.51 Aligned_cols=263 Identities=12% Similarity=0.035 Sum_probs=136.0 Q ss_pred CCccccceeecccccc------ccccce--eeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccc Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGN------REELAD--VVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFD 72 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~------~edl~d--~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~ 72 (330) ||... |..+.-.+-+ .+.+.. .++.+...++-| -|+.--+=+.+.|. .+.++. -..||.+.+.. T Consensus 1 ma~~~-T~l~d~iiPev~~~~v~~~~~~~l~~~~~~~~d~~l---~g~~G~tv~iP~~~--~ig~a~--~~~~g~~i~~~ 72 (274) T protein:vir:12 1 MAQGL-TKTSNQIIPEVLAPMMQAQLEKKLRFASFAEVDSTL---QGQPGDTLTFPAFV--YSGDAQ--VVAEGEKIPTD 72 (274) T ss_pred CCcce-eehhhhhchHHHHHHHHHHHHhhhhhcccceecccc---cCCCCCEEEEeeec--CCCccc--cccCCCccchh Confidence 88765 3222211110 001111 122333333322 23211122345674 333322 24577777666 Q ss_pred cccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcc Q lcl|NC_020858. 73 ATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTN 152 (330) Q Consensus 73 ~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn 152 (330) ..+.....--+.|. .+.|+|++-+.... + +|.++........-+.+.++..++.--..... T Consensus 73 ~lt~~~~~~~i~~~-~~~~~i~D~~~~~~--~-~d~~~~~~~q~~~~~a~~vd~~~l~~~~~a~~--------------- 133 (274) T protein:vir:12 73 ILETKKREAKIRKI-AKGTSITDEALLSG--Y-GDPQGEQVRQHGLAHANKVDNDVLEALMGAKL--------------- 133 (274) T ss_pred hcccceeeEEeeee-cceeeecHHHHHhc--c-cchHHHHHHHHHHHHHHHHHHHHHHHHhcccc--------------- Confidence 66655555556664 67899988655432 2 56666666667777888888877642111000 Q ss_pred cccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecC Q lcl|NC_020858. 153 VSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASN 232 (330) Q Consensus 153 ~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~ 232 (330) .....+++.+.|.++++++=++....+.++|||.+...+-+.....+.. 
..+ T Consensus 134 -----------------------~~~~~a~~~d~i~dA~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~fv~-----~s~ 185 (274) T protein:vir:12 134 -----------------------TVNADITKLNGLQSAIDKFNDEDLEPMVLFINPLDAGKLRGDASTNFTR-----ATE 185 (274) T ss_pred -----------------------cccccccCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhhhhhccc-----ccc Confidence 0011257889999999999888888889999999876665532222211 011 Q ss_pred CcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc-cccccccccccceeeEEEEEEEEEE Q lcl|NC_020858. 233 GKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI-AEDKKVAKTGDAEKFMLIGEGALKP 311 (330) Q Consensus 233 ~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~-~~~e~laKtGd~~k~~i~~E~tLe~ 311 (330) +... ....-....|.-+.|+.++-||..+ .|++-+.-+.+..-++. -|.+..++.+. +......=|+..+ T Consensus 186 ~g~~---~~~~G~ig~~~G~~Vi~s~~~p~~t-----~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~-d~i~~~~~y~~~~ 256 (274) T protein:vir:12 186 LGDD---IIVKGAFGEALGAIIVRSNKLEAGT-----AILAKKGAVKLILKRDFFLEVARDASTKT-TALYSDKHYVAYL 256 (274) T ss_pred cccc---ceecccceeecCeeEEEeCCCCcce-----EEEEeccceeeeecCCceeccccchhhcc-cEEEeeeEEEEEE Confidence 1111 0000011112236888888899654 46777776666433331 11112222221 1222223388999 Q ss_pred ecchheeEEeccccccccC Q lcl|NC_020858. 312 KNEKGLGVAADLYGLTAST 330 (330) Q Consensus 312 ~N~~a~g~i~gLt~~~~~~ 330 (330) .||.+..+|+- =++|+ T Consensus 257 ~~~~~vv~~t~---~~~~~ 272 (274) T protein:vir:12 257 YDESKAVKITK---GSGSL 272 (274) T ss_pred EcCCceEEEEc---CCccc Confidence 99998888763 34455 No 123 >protein:vir:6212 Length: 434 # NCBI annotation: prohead protease # Family: family:all:21 # MgeID: mge:128 # MgeName: phBC6A52 # Cross-refs: genbank:acc:NP_852592;genbank:gi:31415852;genbank:GeneID:1489210 Probab=76.88 E-value=0.13 Score=25.43 Aligned_cols=279 Identities=8% Similarity=-0.058 Sum_probs=118.2 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcccccc--ccccccccccccCce Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANIT--LEGDEYTFDATVSPE 78 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~--~EG~d~~~~~~~~~~ 78 (330) +|..+.+ +......-+++.+.|...-....|+..+....... ..+.|..-.-.+.+.... .||.+.+.....-.. T Consensus 141 ~a~~~~t--~~GG~lvP~~~~~~Ii~~l~~~~~i~~~~~~~~~~-~~~~~p~~~~~~~a~~~~~~~e~~~~~~~~~~f~~ 217 (434) T protein:vir:62 141 RALGLVT--GNGSVTIPDFLSKEIITYAQEENFLRRLGTGVKTK-ENIKYPVLVKKAEAQGHKNERTNNEMPETDIEFDE 217 (434) T ss_pred hhhcccc--cccceecchhhHHHHHHhhhhhhhhhhhcceeccC-CceEEEEEecCCcccceecccccccccccccceee Confidence 2221111 11111122456666666666677776544332222 223333322222222222 234444332211111 Q ss_pred EecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 79 RLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 79 ~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) .. ..+++-...+.=|.+.+.... .+..+|=...-...+.+-+|.+||+|...... .+|+.+- .. T Consensus 218 v~---~~~~k~~~~~~iS~ell~ds~-~~l~~~i~~~la~~~~~~~d~~~l~G~G~~~~----~~g~~~~----~~---- 281 (434) T protein:vir:62 218 IE---LSPTEFDALATVTKKLLARTG-LPIEQIVMDELKKAYVRKETQYMVNGDEANNI----NDGALAK----KA---- 281 (434) T ss_pred EE---eeheeeEeehhhHHHHHhcch-HHHHHHHHHHHHHHHHHHHHHHHhccCCCCcc----ccceeec----cc---- Confidence 11 112222223333444544443 23333444445566778899999998643211 2232210 00 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec------C Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS------N 232 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~------~ 232 (330) .+..+....+.+.|.++..++..+...-..+++|+....+|..+ +|.+. |+... 
+ T Consensus 282 ---------------~~~~~~~~~~~d~l~~l~~~l~~~~~~~a~~v~n~~~~~~L~~l-kd~~G---~~l~~~~~~~~~ 342 (434) T protein:vir:62 282 ---------------VEFKTDEKNLYDALVKMKNTPVKEVRKKARWVLNTAALTKIETM-KTDDG---FPLLRPFNQAEG 342 (434) T ss_pred ---------------ccccccccchhhHHHHHHhhcchhhhcCCEEEEcHHHHHHHHHh-hccCC---CEeeccCCCccC Confidence 01111223556777777777766543333567899888888876 44432 22211 1 Q ss_pred CcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccE-EEEEcchhhhhcccCCcccc----ccccccccceeeEEEEEE Q lcl|NC_020858. 233 GKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARN-AFFVDPEFLQFGWLRKIAED----KKVAKTGDAEKFMLIGEG 307 (330) Q Consensus 233 ~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~-~~~ld~~~~~~~~Lr~~~~~----e~laKtGd~~k~~i~~E~ 307 (330) +...+| +| +.|+.+.+||........ +++.|.+...+....+.... +..+ +-+...+.+.--+ T Consensus 343 g~~~tl----------~G-~pV~~~~~~~~~~~~~~~~i~~Gdfs~~~i~~~~g~~~i~~~~~~~~-~~~~v~~~~~~r~ 410 (434) T protein:vir:62 343 GIGYTL----------LG-FPVEEEDAIDIPDSPDTPVFYFGDFSKFYIQDVIGSLEVQKLVELFS-RTNRVGFRIWNLL 410 (434) T ss_pred CCCcee----------cc-eeeEEecCccCccCCCceEEEEeeccceEEEEeeceeEEEeehhhhc-ccCceEEEEEeee Confidence 111111 23 355666677754322212 44558876544321221111 1111 1122223333333 Q ss_pred EEE-EecchheeEEec-ccccccc Q lcl|NC_020858. 308 ALK-PKNEKGLGVAAD-LYGLTAS 329 (330) Q Consensus 308 tLe-~~N~~a~g~i~g-Lt~~~~~ 329 (330) ..+ ++.|.+..++.. +..=|++ T Consensus 411 Dgk~i~~~~~~~~~~~~~~~~~~~ 434 (434) T protein:vir:62 411 DAQLIHSPFEVPVYKYVLKAPTGA 434 (434) T ss_pred cceeecCcccceEEEEEeccCCCC Confidence 333 445888776632 2222333 No 124 >protein:vir:79642 Length: 329 # NCBI annotation: HsbB # Family: family:all:463 # MgeID: mge:1872 # MgeName: TLS # Cross-refs: genbank:acc:YP_001285525;genbank:gi:148734508;genbank:GeneID:5220000 Probab=73.86 E-value=0.17 Score=24.87 Aligned_cols=289 Identities=9% Similarity=0.038 Sum_probs=112.1 Q ss_pred CCcccccee--eccccccccccceeeEecCCcccceeeeec---cceeccceeeeeeeeccCcccccccccccccccccc Q lcl|NC_020858. 
1 MAVVTNTFQ--STGAKGNREELADVVSRITPEDTPIYSMIE---KVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATV 75 (330) Q Consensus 1 Ma~~t~~~~--t~~~~g~~edl~d~I~~i~p~dTP~~s~ig---~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~ 75 (330) |+..+..-. --....+.+-++..|+.+--.+.=....|+ ........+.+..-+-....+.---.+.|.+..... T Consensus 26 ~~~~~~~~~~~~~f~~~ql~~id~~v~e~~~~~l~~~~~i~i~~~~~~~~~~~t~~~~~~~G~a~~~~d~~~dip~vd~~ 105 (329) T protein:vir:79 26 LRGAKNDASDMGIWTSQELHKIKAQAYEKEYPAGSALRVFPVTSELSDTDKTFEYQTFDKVGHAKIIADYTDDLSTVDAL 105 (329) T ss_pred cccceeccchhhHHHHHHHHHHHHHHHhhhhcccchhhhcccccCCCCceeEEEeeeeecceeeeeecCcccccceeecc Confidence 222221100 001112222333334332111111111222 122222233333333221111000112222211111 Q ss_pred CceEecceEEEEeeeeeehhHHHHHhh---ccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcc Q lcl|NC_020858. 76 SPERLGNYTQIMRKSGIISGTQNITDE---AGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTN 152 (330) Q Consensus 76 ~~~~~~N~tQIf~~~v~VS~T~~av~~---~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn 152 (330) .... ...|++-...++.+-+-+.. .|. +.-......+...+.+.+....++|.+... .-||+.- .+ T Consensus 106 ~~~~---~~~i~~~~~~~~~~~~El~~a~~~g~-~l~~~k~~aA~~~~~~~~n~i~f~G~~~~g-----~~GLlN~--p~ 174 (329) T protein:vir:79 106 MTSE---FGKVFRLGNAFLISIDEIKAGQRTGK-SLSTRKANAAQNAHDQLVNHLVFKGSKPHK-----IISVFEH--PN 174 (329) T ss_pred ccee---EEEEEEEEEEEEecHHHHHHHHHhCC-ChHHHHHHHHHHHHHHhhccEEEeeccccc-----ceeeecC--CC Confidence 1111 12333333444444443333 343 333444445566677777777778854311 1222211 11 Q ss_pred cccccccccccccccccccccccccccccccHHHHHHHHHHHHhc-CC--ceeEEEeChHHHHHHHHhhccceeeeeeee Q lcl|NC_020858. 153 VSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS-GA--NFKHVFVSPYVKSVFVTFMSDTNVASFRYA 229 (330) Q Consensus 153 ~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~-Gg--~~~~i~v~~~~k~~is~f~~~~~~~~~r~~ 229 (330) +.....+.++ ...- ...|+..+ -++|.+++.++|.. |+ .+.+|+++|.+...++.-..+...... 
T Consensus 175 v~~~~~~~~~-------~~~w-~~kt~~ei-~~di~~~~~~l~~~s~g~~~p~~L~Lpp~~~~~L~~~~~~~~~tvl--- 242 (329) T protein:vir:79 175 LTTINSAGWN-------NAAG-TGKKPETA-QDELEQAIEKIETLTNGQHRANMILIPPSMRKVLMVRMPETTMSYL--- 242 (329) T ss_pred ccccccCCCC-------Cccc-cccCHHHH-HHHHHHHHHHHHHhcCceecccEEEecHHHHHHhhcccCCCCccHH--- Confidence 1100000000 0000 00111111 26788888999974 33 367899999887776542211111111 Q ss_pred ecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCcc-ccccEEEE--EcchhhhhcccCCccccccccccccceeeEEEEE Q lcl|NC_020858. 230 ASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSG-ALARNAFF--VDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGE 306 (330) Q Consensus 230 ~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~-~~a~~~~~--ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E 306 (330) +.+.-.|-.++|+.-+++-... ...+.|++ -|++++++..=.|+.. .+..+.+-+.+.-.+.- T Consensus 243 -------------~~lk~~~~~l~I~~~~el~~ag~~g~~~~v~y~~~~~~~~~~vp~~~~~-l~~q~~~~~~~v~~~~r 308 (329) T protein:vir:79 243 -------------DYFKQQNGGITIESISELEDIDGAGTKAALVYEKDPMNMSIEIPEAFNM-LTAQPKDLHFKVPCTSK 308 (329) T ss_pred -------------HHHHHhCCCcEEEEcccccccCCCCceEEEEEecCCceEEEecCcceee-eeceecCceEEEceeee Confidence 1122223334555555543211 11233333 5777777764334322 23444443333222333 Q ss_pred -EEEEEecchheeEEeccccc Q lcl|NC_020858. 307 -GALKPKNEKGLGVAADLYGL 326 (330) Q Consensus 307 -~tLe~~N~~a~g~i~gLt~~ 326 (330) .+++++-|.+...+.||-== T Consensus 309 ~~Gv~i~~P~ai~~~dGI~~~ 329 (329) T protein:vir:79 309 CTGLTIYRPLTLVLIKGLVVG 329 (329) T ss_pred EEEEEEECcceeeeeeeeeeC Confidence 56999999987665555311 No 125 >protein:vir:9643 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:173 # MgeName: 315.1 # Cross-refs: genbank:acc:NP_795405;genbank:gi:28876178;genbank:GeneID:1257724 Probab=69.09 E-value=0.23 Score=24.11 Aligned_cols=286 Identities=11% Similarity=-0.040 Sum_probs=121.0 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccc-cccCceE Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFD-ATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~-~~~~~~~ 79 (330) .++.+.+-.+......-+++.+.|+.-=...-|+.++.....+.+ ...|.-.+-.+.+ .=..|++..+.. ...-... T Consensus 76 ~~~~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~i~~~~~v~~~~~-~~~i~~~~~~~~a-~wv~e~~~~~~~~~~~f~~i 153 (377) T protein:vir:96 76 NDIDKNVGGKDKFKLLPEETMVQVFDDLVAEHPLLKVINFKNTSL-RLKALTAETSGTA-VWGDIFGEIKGQLKQAFKEQ 153 (377) T ss_pred HHHHhcCCCCCCceecCHHHHHHHHHHHHhhhhhhhhceeEecCC-ceEEEEecCCcce-eEeecccccccccCccceeE Confidence 111111111111111223455555554456677777665444333 2334433221111 111344433211 1111111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) .=+...+. .-+.|| .+-+...+ .+--+|=..+-...+.+-+|.+||+|.-. + +.-||+..+.........+ T Consensus 154 ~l~~~kl~-~~~~is--~~ll~ds~-~~le~~i~~~l~~~~~~~~~~a~i~G~G~--~---~P~Gil~~~~~~~~~~~~~ 224 (377) T protein:vir:96 154 DFSQFKLT-AFVVIP--KDALKFGP-KWLKQFITEQLKEAIAVALELAIVKGNGL--L---QPVGLLKDLSQPTVDQSTG 224 (377) T ss_pred eeeeeeEE-eechhh--HHHhhcch-hhHHHHHHHHHHHHHHHHHhhceEeccCC--C---cceeeeecccccccccccc Confidence 11222221 122333 33333332 34445555566677888999999998752 2 3336665542111111110 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHH---HhcCCc-------ee-EEEeChHHHHHHHHhhccceeeeeee Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQG---YQSGAN-------FK-HVFVSPYVKSVFVTFMSDTNVASFRY 228 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~---~~~Gg~-------~~-~i~v~~~~k~~is~f~~~~~~~~~r~ 228 (330) ....+........++...++.+.+.+++..+ |...+. 
.+ ..++|+.....+-..+ .+ T Consensus 225 ----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~~~~~~--------~~ 292 (377) T protein:vir:96 225 ----RDITTYKTDKEAIADLSDLDPDTAVELLVPVMKHLSVNDKKHPLKIAGQVKLLLNPEDRWTLEAKF--------TS 292 (377) T ss_pred ----ccccceeeccccccccccCChhHHHHHHHHHHHhhccccccccccccCceEEEEchhhHHhccccc--------cc Confidence 0000000011112222334445555544444 332211 11 3668876543221100 01 Q ss_pred eecCCcceeEEEEEEEEEcCCeE-EEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccc---cccccccceeeEEE Q lcl|NC_020858. 229 AASNGKNNSIVANADVYEGPFGK-VMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDK---KVAKTGDAEKFMLI 304 (330) Q Consensus 229 ~~~~~~~~~~~~~v~~~~tdfG~-v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e---~laKtGd~~k~~i~ 304 (330) .+.+|. +.+-+|. +.++.+.+||.+ ++++.|+++.-+. .|.-...+ ...=.-|..-+... T Consensus 293 ~~~~G~----------~~~~l~~p~~v~~s~~~p~~-----~i~fgdf~~Y~i~-~r~~~~i~~~~~~~~~~d~~~f~~~ 356 (377) T protein:vir:96 293 RNQFGE----------YVTVLPHGITILESLAVETG-----KAIAFVANRYDAF-MATASTIEEYDQTFAMEDLQLYLTK 356 (377) T ss_pred cCCCCC----------ceeccCCCceEEecCCCCcc-----cEEEEEcCcEEEE-EecccEEEeehhhhhhcCCeEEEEE Confidence 122221 2334442 567788889965 3567888774443 24211111 11112355566667 Q ss_pred EEEEEEEecchheeEEeccccc Q lcl|NC_020858. 305 GEGALKPKNEKGLGVAADLYGL 326 (330) Q Consensus 305 ~E~tLe~~N~~a~g~i~gLt~~ 326 (330) .-+.-++.++.|.-+ -.|++= T Consensus 357 ~r~dG~~~d~~a~~v-l~l~~~ 377 (377) T protein:vir:96 357 NYFYGKAKDNHTAAL-LTLAGG 377 (377) T ss_pred EEEcCEEecCCcEEE-EEEecC Confidence 778888889998654 446555 No 126 >protein:vir:6324 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:132 # MgeName: phiKMV # Cross-refs: genbank:acc:NP_877471;genbank:gi:33300843;uniprot:Q7Y2D3;genbank:GeneID:1482613 Probab=66.68 E-value=0.26 Score=23.76 Aligned_cols=288 Identities=13% Similarity=0.087 Sum_probs=114.3 Q ss_pred CCccccceeeccccccccccceeeEec--CCcccc------eeeeec-----cceeccceeeeeeeec-cCccccccccc Q lcl|NC_020858. 
1 MAVVTNTFQSTGAKGNREELADVVSRI--TPEDTP------IYSMIE-----KVSFDTTHPEWTTDEL-AAPGANITLEG 66 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i--~p~dTP------~~s~ig-----~~~~~~~~~~W~td~L-~~~~~na~~EG 66 (330) |..+. ++|-+.-.|.-.|+ +....+ +-.+|- |+++.. +++..+...-|....- ..| ...+.| T Consensus 1 ms~~~-~~tr~~~~~s~~d~-al~le~f~geV~~af~~~s~~~~~~~~rti~~g~s~~~~~iG~~~~~~~~p--G~~l~~ 76 (335) T protein:vir:63 1 MSFLN-DLTRPNYAGKNADV-DIHLEEHLGIVDKHFAYTSKFAPLMNIRDLRGSNVVRLDRLGNVEAKGRRA--GEELER 76 (335) T ss_pred CCCcc-cchhhhcccccchh-heehhhhhhhHHHHHHhhhhhccccceeeeccceeEEEeeeeeeeeecccC--CcCcCC Confidence 88884 56655554444454 211100 011111 111111 1111111112222111 111 111222 Q ss_pred cccccccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHh----cCCCcCCcccccc Q lcl|NC_020858. 67 DEYTFDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIV----ATNASVGGATRES 142 (330) Q Consensus 67 ~d~~~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i----~g~~~~~~~~r~~ 142 (330) .. .......-++-+-..-.+-|-.--++.++|-...|+..|+.++ |++..-.+++ .+-...... -.. T Consensus 77 ~~-----~~~~k~~itVD~ll~a~~~I~dlDe~~~~yDvRse~s~e~G~a---LA~~~D~~~~~~i~~aa~~~a~~-~~~ 147 (335) T protein:vir:63 77 SR-----VVNDKWNLTVDTLLYLRHQFDHQDEWTQSFDMRKEVAELDGQE---LARKFDQACLIQVIKAAAMDAPV-DLE 147 (335) T ss_pred CC-----ccccceEEEecceeechhhhhhHHHHhcCchhHHHHHHHHHHH---HHHHHHHHHHHHHHhhccccCcc-ccC Confidence 21 1111222233333333344555566666776666766666554 4444444432 221111100 000 Q ss_pred hhHHHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhc-----CCceeEEEeChHHHHHHHHh Q lcl|NC_020858. 143 GSLPTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS-----GANFKHVFVSPYVKSVFVTF 217 (330) Q Consensus 143 ~Gi~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~-----Gg~~~~i~v~~~~k~~is~f 217 (330) +|...-+..+.. 
-+++ .+.++ +..| .+.+.++.+++-++ |...+.++|+|.+...+-+ T Consensus 148 ~~~~~G~~~~~~-~tg~------------~~~~~--~~~l-~~a~~~a~~~L~e~dVP~~~~~dr~~vv~P~~y~~Ll~- 210 (335) T protein:vir:63 148 DAFSPGVLEKLD-LTGL------------TAKQA--ADKI-VRMHRRVVETFIDRDLGDAVYSEGLTPMSPRVFSLLLE- 210 (335) T ss_pred CCcCCCcceeee-eccC------------ccccc--HHHH-HHHHHHHHHHHHhccCCCcccCceEEEeChHHHHHHhc- Confidence 110000000000 0000 00000 1122 24566777777754 3456899999988766655 Q ss_pred hccceeeeeeeeecCCcc---eeEEEEEEEEEcCCeEEEEEEcCcCCCccccccE----------------EEEEcchhh Q lcl|NC_020858. 218 MSDTNVASFRYAASNGKN---NSIVANADVYEGPFGKVMIHPNRVMAGSGALARN----------------AFFVDPEFL 278 (330) Q Consensus 218 ~~~~~~~~~r~~~~~~~~---~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~----------------~~~ld~~~~ 278 (330) ........+.+.++.. +..+.. .=-+.|+..++||..++.+.. ++++-++-+ T Consensus 211 --~~~l~n~~~~~s~~~~~~~~g~v~~-------v~Gv~V~~sn~lP~~~~t~~~lg~a~n~~~~d~~~~~~~~~~~~Al 281 (335) T protein:vir:63 211 --HDKLMNVEYQATGATNDYVKSRVAI-------LNGVKVLETPRFATKAIAAHPLGRHFNVSAEESERQIALFLPSKTL 281 (335) T ss_pred --cccccccccccccccccccCceeEE-------eeceEEEeeccCCCCCcccccccccCCccccccceeEEEEEecceE Confidence 2211112222222211 001111 012567777888877655321 233333322 Q ss_pred hhcccCCccccccccccccceeeEEEEE--EEEEEecchheeEEeccccccccC Q lcl|NC_020858. 279 QFGWLRKIAEDKKVAKTGDAEKFMLIGE--GALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 279 ~~~~Lr~~~~~e~laKtGd~~k~~i~~E--~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) -..-+.++- .+..-....+-++|.+= ++..+++|.+.+.|+ ++||-+=. 
T Consensus 282 ~t~~~~~vt--~e~~~~~~~~~~~i~~~~a~G~g~lRPe~a~~i~-~tg~~~~~ 332 (335) T protein:vir:63 282 ITAQVAPVQ--AKLWEDNEKFSWVLDTFQMYNIGARRPDTAGAIE-LKGIGAFD 332 (335) T ss_pred EEEEEeecc--cceeeccchhhHHhHHHHHcCCcccccceEEEEE-EcCCCcee Confidence 222222110 00111222234555554 778889999888877 78876544 No 127 >protein:vir:107687 Length: 319 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1518 # MgeName: T1 # Cross-refs: genbank:acc:YP_003898;genbank:gi:45686314;genbank:GeneID:2773027 Probab=59.76 E-value=0.39 Score=22.85 Aligned_cols=280 Identities=8% Similarity=0.021 Sum_probs=112.6 Q ss_pred CCccccceee-------ccccccccccceeeEecCCcccceeeeec--------cceeccceeeeeeeeccCcccccccc Q lcl|NC_020858. 1 MAVVTNTFQS-------TGAKGNREELADVVSRITPEDTPIYSMIE--------KVSFDTTHPEWTTDELAAPGANITLE 65 (330) Q Consensus 1 Ma~~t~~~~t-------~~~~g~~edl~d~I~~i~p~dTP~~s~ig--------~~~~~~~~~~W~td~L~~~~~na~~E 65 (330) |.....++.- ..+..+.+-++..|+. ++.-.+.+ ........+.+...+-....+.---+ T Consensus 16 ~~~~~~~~~~da~~~~g~~~~~ql~~id~~v~e-----~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~~~~d~ 90 (319) T protein:vir:10 16 MYLIQAGVKQDAAATMGIWTAQELHRIKSQSYE-----EDYPVGSALRVFPVTTELSPTDKTFEYMTFDKVGTAQIIADY 90 (319) T ss_pred HHHhhccchhhhhhhhhhHHHHHHHHHHHHHHh-----hhhcceechhhcccccCCCCceEEEEeeeeccccceeeecCc Confidence 1111111100 1111122222222222 22222221 11112222222222211111100011 Q ss_pred ccccccccccCceEecceEEEEeeeeeehhHHHHHhhc--cccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccch Q lcl|NC_020858. 66 GDEYTFDATVSPERLGNYTQIMRKSGIISGTQNITDEA--GRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESG 143 (330) Q Consensus 66 G~d~~~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~--G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~ 143 (330) ..|.+...... .+. ..+|++-...++.+.+-+..+ ..-+.-......+...+.+.+....+.|.+... 
.- T Consensus 91 ~~dip~v~~~~-~~~--~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~i~f~G~~~~g-----~~ 162 (319) T protein:vir:10 91 TDDLPLVDALG-TSE--FGKVFRLGNAYLISIDEIKAGQATGRPLSTRKASACQLAHDQLVNRLVFKGSAPHK-----IV 162 (319) T ss_pred cccccceeccc-eee--EEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEEeeccccc-----ce Confidence 12222111111 111 134555444444444444443 223333444445666777777777777854321 12 Q ss_pred hHHHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhc-CC--ceeEEEeChHHHHHHHHhhcc Q lcl|NC_020858. 144 SLPTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS-GA--NFKHVFVSPYVKSVFVTFMSD 220 (330) Q Consensus 144 Gi~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~-Gg--~~~~i~v~~~~k~~is~f~~~ 220 (330) ||+.- .+..... ...+++. .+ .++.. --+++..++.++|.. ++ .++.|.++|.+...++.-..+ T Consensus 163 GLlN~--p~~~~~~--~~~~~~~-------~t-~t~~~-i~~di~~~~~~l~~~s~g~~~p~~L~L~p~~~~~L~~~~~~ 229 (319) T protein:vir:10 163 SVFNH--PNITKIT--SGKWIDV-------ST-MKPET-AEAELTQAIETIETITRGQHRATNILIPPSMRKVLAIRMPE 229 (319) T ss_pred eEEeC--CCceeee--cCCCCCc-------cc-cCHHH-HHHHHHHHHHHHHHhcCceeeceEEEecHHHHHhhhcccCC Confidence 22111 1111000 0000000 00 01001 125567778888853 34 467899999998777653322 Q ss_pred ceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccc-cccEEEEE--cchhhhhcccCCcccccccccccc Q lcl|NC_020858. 221 TNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGA-LARNAFFV--DPEFLQFGWLRKIAEDKKVAKTGD 297 (330) Q Consensus 221 ~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~-~a~~~~~l--d~~~~~~~~Lr~~~~~e~laKtGd 297 (330) .... ..+.+...|..++|+.-+++..... ..+.|+++ +|++++++.=.|+. ..++-+.+- T Consensus 230 ~~~t----------------~l~~lk~~~~~l~I~~~pel~~ag~~g~~~~v~y~~~~~~~~~~v~~~~~-~~~~e~~~l 292 (319) T protein:vir:10 230 TTMS----------------YLDYFKSQNSGIEIDSIAELEDIDGAGTKGVLVYEKNPMNMSIEIPEAFN-MLPAQPKDL 292 (319) T ss_pred CCee----------------HHHHHHHhcCCceEEEeeeecccCCCcceEEEEEecCCceEEEecCccee-eeeeeecCc Confidence 2111 1122333344455555555543211 12333333 68888886423332 224433333 Q ss_pred ceeeEEEEE-EEEEEecchheeEEecc Q lcl|NC_020858. 
298 AEKFMLIGE-GALKPKNEKGLGVAADL 323 (330) Q Consensus 298 ~~k~~i~~E-~tLe~~N~~a~g~i~gL 323 (330) +.+.-.+.- .+++++-|.+...+.|| T Consensus 293 ~~~~~~~~r~~Gv~i~~P~ai~~~dGI 319 (319) T protein:vir:10 293 HFKVPCTSKCTGLTIYRPMTIVLITGV 319 (319) T ss_pred eEEEeeeeeeEEEEEEccceeEeeecC Confidence 322222222 46999999999999999 No 128 >protein:vir:95107 Length: 270 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1549 # MgeName: X2 # Cross-refs: genbank:acc:YP_240822;genbank:gi:66394683;genbank:GeneID:5133901 Probab=56.90 E-value=0.45 Score=22.50 Aligned_cols=257 Identities=12% Similarity=0.045 Sum_probs=124.6 Q ss_pred CCccccc-eeecc----ccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCcccccccccccccccccc Q lcl|NC_020858. 1 MAVVTNT-FQSTG----AKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATV 75 (330) Q Consensus 1 Ma~~t~~-~~t~~----~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~ 75 (330) ||.+.-. ..-+. .+-++-.=...+..+.+.|+ ++-|+.--+=+.+.|. .+.++ .-..||.+.+....+ T Consensus 1 Ma~T~~~d~I~Pev~~~~V~e~~~~~~~~~~~~~~d~---~L~g~~G~ti~~P~~~--~igda--e~~~eg~~i~~~~lt 73 (270) T protein:vir:95 1 MTQTKKANLINPEVLANVVSAQMQNAIRFTPYAVTDD---TLVGQPGDTITRPKYA--YIGAA--EDLQEGVAMDTTQMS 73 (270) T ss_pred CCceehhhhcchHHHHHHHHHHHHhHHhhcccccccc---ccCCCCCCEEEeeeec--CCCcc--ccccCCCccchhhcc Confidence 7763310 00000 01111101112223333333 2233322222456673 33332 234578877655544 Q ss_pred CceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 76 SPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 76 ~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) .....-=+.|. 
.|.|++++-+..+ .+ +|.+..-..+...-+.|.+++.+|.--++...+ T Consensus 74 ~~~~~a~i~~~-gk~~~itD~a~~~--~~-~dp~~~~~~q~a~~~a~~~d~~li~~l~~a~~~----------------- 132 (270) T protein:vir:95 74 MTTTKVTVKET-GKAVEVTQTAIIT--NV-NGTLQEASRQLAMSLADKVEIDYIAELNKSKQT----------------- 132 (270) T ss_pred cchheeeeehh-hCcceecHHHHhh--hc-cchHHHHHHHHHHHHHHHHHHHHHHHhcccccc----------------- Confidence 43333223333 4677777765543 22 355555555566667788888776322110000 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcc Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKN 235 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~ 235 (330) ....+|.+.|.++++++=+++.....|+|||.....+.+ +.....-+ .+. T Consensus 133 ----------------------~~~~~t~~~~~dA~~~lgd~~~~~~~i~vhs~~~~~Lrk---~~~~~~~~----~~~- 182 (270) T protein:vir:95 133 ----------------------ATVSADATGILDAIEVFNSENDEDYVLYVNPKDYNKLVK---SLFKVGGN----VQD- 182 (270) T ss_pred ----------------------cccccCHHHHHHHHHHhccccCCCcEEEEcHHHHHHHHh---hhcccccc----ccc- Confidence 001367788999999998888889999999997766543 22111000 010 Q ss_pred eeEEEEEEEEEcCCeE---EEE-EEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEE--EEEE Q lcl|NC_020858. 236 NSIVANADVYEGPFGK---VMI-HPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIG--EGAL 309 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~---v~i-v~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~--E~tL 309 (330) . + ...+.||+ ++| |-++..+. ...+++.+.-+.+..-++. .-| ....-+...-.|++ -|++ T Consensus 183 ~-~-----~~~G~ig~~~G~~Viv~s~~~~~-----~~~~l~~~gAi~~~~~~~~-~vE-tdRd~~~~~d~i~~~~~y~v 249 (270) T protein:vir:95 183 R-A-----ISKGDLVEIVGVSDIVKSKRVSE-----NTAFLQRYGAMEIVNKKKP-EAY-TDFDILKRTHLLSTNYHYSV 249 (270) T ss_pred c-h-----hcccccceecceeEEEeCCCCCc-----eeEEEEeccceeeeecCCc-eee-eccchhhcccEEEeeeEEEE Confidence 0 0 12234454 454 44554443 3567888887777665541 112 22222222223333 3889 Q ss_pred EEecchheeEEeccccccccC Q lcl|NC_020858. 
310 KPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 310 e~~N~~a~g~i~gLt~~~~~~ 330 (330) .+.||....+|+= . =+-|| T Consensus 250 ~~~~~skvv~~t~-~-~a~~~ 268 (270) T protein:vir:95 250 NLKDETGVVKVTF-K-PSGSL 268 (270) T ss_pred EEEccceEEEEEe-c-CCCCc Confidence 9999887776641 1 11122 No 129 >protein:vir:2685 Length: 387 # NCBI annotation: hypothetical protein # Family: family:all:658 # MgeID: mge:57 # MgeName: phiSLT # Cross-refs: genbank:acc:NP_075504;genbank:gi:12719433;genbank:GeneID:920169 Probab=50.89 E-value=0.6 Score=21.81 Aligned_cols=265 Identities=12% Similarity=0.036 Sum_probs=101.8 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) -+..+++ .+......-+++.+.|+..-....|+..++...+..+...-+....... ..-..||...+.......... T Consensus 116 ~a~~~~~-~~~gG~lIP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~p~~~~~~~~--a~~v~Eg~~~~~~~~~f~~v~ 192 (387) T protein:vir:26 116 HALPTGN-DSGGDKLLPKTLSKEIVSEPFAKNQLREKARLTNIKGLEIPRVSYTLDD--DDFITDVETAKELKAKGDTVK 192 (387) T ss_pred hhhccCC-CCCCceeechhHHHHHHHHHHhhchhhhhceeeecCCceeeeeeccCCc--cccccccccccccccccceee Confidence 1111111 0111111223566666665566677776665544444333333332221 122356766555433222221 Q ss_pred cceEEEEeeeeeehhHHHHHhhc--cccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEA--GRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~--G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) -+...+ .--+.|| .+.+... .....+..++++++.. ...+.+|++|... +.| .|++ ++.. 
T Consensus 193 l~~~k~-~~~i~iS--~ell~ds~~~l~~~i~~~la~~~~~--~e~~~~~~~g~g~--g~~---~g~~----~~~~---- 254 (387) T protein:vir:26 193 FTTNKF-KVFAAIS--DTVIHGSDVDLVNWVENALQSGLAA--KERKDALAVSPKS--GLE---HMSF----YNGS---- 254 (387) T ss_pred echhee-eeechhh--HHHHhhhHHHHHHHHHHHHHHHHHH--HHHHhHhhcCCCc--ccc---ceee----eccc---- Confidence 122111 1123344 3333322 2223344444443221 1223445554421 111 0100 0000 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeCh-HHHHHHHHhhccceeeeeeeeecCCcce Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSP-YVKSVFVTFMSDTNVASFRYAASNGKNN 236 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~-~~k~~is~f~~~~~~~~~r~~~~~~~~~ 236 (330) .....+ ..+-+.|.+++..+..+- .+.. +++++ .....+..+ .+... +.. .+... T Consensus 255 -------------~~~~~~---~~~~d~i~~~~~~l~~~y~~na~-~imn~~t~~~~~~~~-~~~~~----~~~-~~~~~ 311 (387) T protein:vir:26 255 -------------VKEVEG---ADMYDAIINALADLHEDYRDNAT-IYMRYADYVKIISVL-SNGTT----NFF-DTPAE 311 (387) T ss_pred -------------cccccc---cchHHHHHHHHhccChhhhcCCE-EEEechHHHHHHHHH-hcCCC----ccc-ccCCc Confidence 000011 112345555555443321 1233 44554 444444443 33221 111 11111 Q ss_pred eEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccc-ccccccccceeeEEEEEEEEEEecch Q lcl|NC_020858. 237 SIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAED-KKVAKTGDAEKFMLIGEGALKPKNEK 315 (330) Q Consensus 237 ~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~-e~laKtGd~~k~~i~~E~tLe~~N~~ 315 (330) + =||. .|+..-.++ .+++.|+++.-..+ ++.... ..-+++ +...+++..-+..++.+|. T Consensus 312 ~----------llG~-PV~~~~~~~-------~~~~GDf~~~~~~~-~~~~~~~~~~~~~-~~~~~~~~~r~Dg~v~~~~ 371 (387) T protein:vir:26 312 K----------VFGK-PVVFTDAAV-------KPIVGDFNYFGINY-DGTTYDTDKDVKK-GEYLFVLTAWYDQQRTLDS 371 (387) T ss_pred c----------cccc-ceEEecCCC-------ceeeechhhhhhhh-hhhhheecccccC-CceEEEEEEEeCcEeechh Confidence 1 2342 222222233 35677776543322 221111 111222 4566767777999999999 Q ss_pred heeEEeccccccccC Q lcl|NC_020858. 
316 GLGVAADLYGLTAST 330 (330) Q Consensus 316 a~g~i~gLt~~~~~~ 330 (330) |.-+ ..+..=+.+| T Consensus 372 A~~~-l~~ka~~~~~ 385 (387) T protein:vir:26 372 AFRI-AKAKENTGPL 385 (387) T ss_pred heEE-EEeecCCCCC Confidence 9865 3344444444 No 130 >protein:vir:96978 Length: 387 # NCBI annotation: ORF009 # Family: family:all:658 # MgeID: mge:1643 # MgeName: 42e # Cross-refs: genbank:acc:YP_239859;genbank:gi:66395517;genbank:GeneID:5133011 Probab=50.89 E-value=0.6 Score=21.81 Aligned_cols=265 Identities=12% Similarity=0.036 Sum_probs=101.8 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) -+..+++ .+......-+++.+.|+..-....|+..++...+..+...-+....... ..-..||...+.......... T Consensus 116 ~a~~~~~-~~~gG~lIP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~p~~~~~~~~--a~~v~Eg~~~~~~~~~f~~v~ 192 (387) T protein:vir:96 116 HALPTGN-DSGGDKLLPKTLSKEIVSEPFAKNQLREKARLTNIKGLEIPRVSYTLDD--DDFITDVETAKELKAKGDTVK 192 (387) T ss_pred hhhccCC-CCCCceeechhHHHHHHHHHHhhchhhhhceeeecCCceeeeeeccCCc--cccccccccccccccccceee Confidence 1111111 0111111223566666665566677776665544444333333332221 122356766555433222221 Q ss_pred cceEEEEeeeeeehhHHHHHhhc--cccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEA--GRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~--G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) -+...+ .--+.|| .+.+... .....+..++++++.. ...+.+|++|... +.| .|++ ++.. 
T Consensus 193 l~~~k~-~~~i~iS--~ell~ds~~~l~~~i~~~la~~~~~--~e~~~~~~~g~g~--g~~---~g~~----~~~~---- 254 (387) T protein:vir:96 193 FTTNKF-KVFAAIS--DTVIHGSDVDLVNWVENALQSGLAA--KERKDALAVSPKS--GLE---HMSF----YNGS---- 254 (387) T ss_pred echhee-eeechhh--HHHHhhhHHHHHHHHHHHHHHHHHH--HHHHhHhhcCCCc--ccc---ceee----eccc---- Confidence 122111 1123344 3333322 2223344444443221 1223445554421 111 0100 0000 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeCh-HHHHHHHHhhccceeeeeeeeecCCcce Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSP-YVKSVFVTFMSDTNVASFRYAASNGKNN 236 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~-~~k~~is~f~~~~~~~~~r~~~~~~~~~ 236 (330) .....+ ..+-+.|.+++..+..+- .+.. +++++ .....+..+ .+... +.. .+... T Consensus 255 -------------~~~~~~---~~~~d~i~~~~~~l~~~y~~na~-~imn~~t~~~~~~~~-~~~~~----~~~-~~~~~ 311 (387) T protein:vir:96 255 -------------VKEVEG---ADMYDAIINALADLHEDYRDNAT-IYMRYADYVKIISVL-SNGTT----NFF-DTPAE 311 (387) T ss_pred -------------cccccc---cchHHHHHHHHhccChhhhcCCE-EEEechHHHHHHHHH-hcCCC----ccc-ccCCc Confidence 000011 112345555555443321 1233 44554 444444443 33221 111 11111 Q ss_pred eEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccc-ccccccccceeeEEEEEEEEEEecch Q lcl|NC_020858. 237 SIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAED-KKVAKTGDAEKFMLIGEGALKPKNEK 315 (330) Q Consensus 237 ~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~-e~laKtGd~~k~~i~~E~tLe~~N~~ 315 (330) + =||. .|+..-.++ .+++.|+++.-..+ ++.... ..-+++ +...+++..-+..++.+|. T Consensus 312 ~----------llG~-PV~~~~~~~-------~~~~GDf~~~~~~~-~~~~~~~~~~~~~-~~~~~~~~~r~Dg~v~~~~ 371 (387) T protein:vir:96 312 K----------VFGK-PVVFTDAAV-------KPIVGDFNYFGINY-DGTTYDTDKDVKK-GEYLFVLTAWYDQQRTLDS 371 (387) T ss_pred c----------cccc-ceEEecCCC-------ceeeechhhhhhhh-hhhhheecccccC-CceEEEEEEEeCcEeechh Confidence 1 2342 222222233 35677776543322 221111 111222 4566767777999999999 Q ss_pred heeEEeccccccccC Q lcl|NC_020858. 
316 GLGVAADLYGLTAST 330 (330) Q Consensus 316 a~g~i~gLt~~~~~~ 330 (330) |.-+ ..+..=+.+| T Consensus 372 A~~~-l~~ka~~~~~ 385 (387) T protein:vir:96 372 AFRI-AKAKENTGPL 385 (387) T ss_pred heEE-EEeecCCCCC Confidence 9865 3344444444 No 131 >protein:vir:94424 Length: 387 # NCBI annotation: ORF010 # Family: family:all:658 # MgeID: mge:1506 # MgeName: 47 # Cross-refs: genbank:acc:YP_240005;genbank:gi:66395666;genbank:GeneID:5133084 Probab=50.89 E-value=0.6 Score=21.81 Aligned_cols=265 Identities=12% Similarity=0.036 Sum_probs=101.8 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) -+..+++ .+......-+++.+.|+..-....|+..++...+..+...-+....... ..-..||...+.......... T Consensus 116 ~a~~~~~-~~~gG~lIP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~p~~~~~~~~--a~~v~Eg~~~~~~~~~f~~v~ 192 (387) T protein:vir:94 116 HALPTGN-DSGGDKLLPKTLSKEIVSEPFAKNQLREKARLTNIKGLEIPRVSYTLDD--DDFITDVETAKELKAKGDTVK 192 (387) T ss_pred hhhccCC-CCCCceeechhHHHHHHHHHHhhchhhhhceeeecCCceeeeeeccCCc--cccccccccccccccccceee Confidence 1111111 0111111223566666665566677776665544444333333332221 122356766555433222221 Q ss_pred cceEEEEeeeeeehhHHHHHhhc--cccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEA--GRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~--G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) -+...+ .--+.|| .+.+... .....+..++++++.. ...+.+|++|... +.| .|++ ++.. 
T Consensus 193 l~~~k~-~~~i~iS--~ell~ds~~~l~~~i~~~la~~~~~--~e~~~~~~~g~g~--g~~---~g~~----~~~~---- 254 (387) T protein:vir:94 193 FTTNKF-KVFAAIS--DTVIHGSDVDLVNWVENALQSGLAA--KERKDALAVSPKS--GLE---HMSF----YNGS---- 254 (387) T ss_pred echhee-eeechhh--HHHHhhhHHHHHHHHHHHHHHHHHH--HHHHhHhhcCCCc--ccc---ceee----eccc---- Confidence 122111 1123344 3333322 2223344444443221 1223445554421 111 0100 0000 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeCh-HHHHHHHHhhccceeeeeeeeecCCcce Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSP-YVKSVFVTFMSDTNVASFRYAASNGKNN 236 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~-~~k~~is~f~~~~~~~~~r~~~~~~~~~ 236 (330) .....+ ..+-+.|.+++..+..+- .+.. +++++ .....+..+ .+... +.. .+... T Consensus 255 -------------~~~~~~---~~~~d~i~~~~~~l~~~y~~na~-~imn~~t~~~~~~~~-~~~~~----~~~-~~~~~ 311 (387) T protein:vir:94 255 -------------VKEVEG---ADMYDAIINALADLHEDYRDNAT-IYMRYADYVKIISVL-SNGTT----NFF-DTPAE 311 (387) T ss_pred -------------cccccc---cchHHHHHHHHhccChhhhcCCE-EEEechHHHHHHHHH-hcCCC----ccc-ccCCc Confidence 000011 112345555555443321 1233 44554 444444443 33221 111 11111 Q ss_pred eEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccc-ccccccccceeeEEEEEEEEEEecch Q lcl|NC_020858. 237 SIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAED-KKVAKTGDAEKFMLIGEGALKPKNEK 315 (330) Q Consensus 237 ~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~-e~laKtGd~~k~~i~~E~tLe~~N~~ 315 (330) + =||. .|+..-.++ .+++.|+++.-..+ ++.... ..-+++ +...+++..-+..++.+|. T Consensus 312 ~----------llG~-PV~~~~~~~-------~~~~GDf~~~~~~~-~~~~~~~~~~~~~-~~~~~~~~~r~Dg~v~~~~ 371 (387) T protein:vir:94 312 K----------VFGK-PVVFTDAAV-------KPIVGDFNYFGINY-DGTTYDTDKDVKK-GEYLFVLTAWYDQQRTLDS 371 (387) T ss_pred c----------cccc-ceEEecCCC-------ceeeechhhhhhhh-hhhhheecccccC-CceEEEEEEEeCcEeechh Confidence 1 2342 222222233 35677776543322 221111 111222 4566767777999999999 Q ss_pred heeEEeccccccccC Q lcl|NC_020858. 
316 GLGVAADLYGLTAST 330 (330) Q Consensus 316 a~g~i~gLt~~~~~~ 330 (330) |.-+ ..+..=+.+| T Consensus 372 A~~~-l~~ka~~~~~ 385 (387) T protein:vir:94 372 AFRI-AKAKENTGPL 385 (387) T ss_pred heEE-EEeecCCCCC Confidence 9865 3344444444 No 132 >protein:vir:96762 Length: 632 # NCBI annotation: putative phage-related protein # Family: family:all:21 # MgeID: mge:1628 # MgeName: VP882 # Cross-refs: genbank:acc:YP_001039818;genbank:gi:126010917;genbank:GeneID:5076272 Probab=49.54 E-value=0.64 Score=21.66 Aligned_cols=266 Identities=9% Similarity=-0.022 Sum_probs=111.5 Q ss_pred CCccccceeeccccccccc-cceeeEecCCcccceeeeecc--ceeccceeeeeeeeccCccccccccccccccccccCc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREE-LADVVSRITPEDTPIYSMIEK--VSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSP 77 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~ed-l~d~I~~i~p~dTP~~s~ig~--~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~ 77 (330) .|..+++-.+ .....-++ +.+.|+..=-..+++..+ |- .+..+-.+.|...+-.+ ...-..||...+..... T Consensus 355 ra~~~~t~~~-gg~lvp~~~~~~~iie~lr~~s~i~~l-~~~~~~~~~g~~~ip~~~~~~-~a~wv~E~~~~~~s~~~-- 429 (632) T protein:vir:96 355 RQLEKKTAGK-GGELVATELLSEEFIDILRNKAIIGQM-GARMLPGLVGDVDIPKKTSGA-NFYWIGEDEDVQDSDFD-- 429 (632) T ss_pred hhhhcccccc-cccccccccchHHHHHHHhhcchhhhh-cceEeecCCcceEEEEEeCCc-eeEeecCCccccccccc-- Confidence 2222222111 00000111 233322221113333222 21 12222234444333211 11112466665443221 Q ss_pred eEecceE---EEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccc Q lcl|NC_020858. 78 ERLGNYT---QIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVS 154 (330) Q Consensus 78 ~~~~N~t---QIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~ 154 (330) ++.++ .-+.--+.| |.+.+.... .+..++=...-...+.+-+|.++|+|.... +. .-||+.....+.. 
T Consensus 430 --f~~i~l~~~k~~~~v~i--S~ell~ds~-~~~~~~i~~~l~~a~~~~~d~a~l~G~G~~-~~---p~Gi~~~~~~~~~ 500 (632) T protein:vir:96 430 --FTTLSFSPKTIAGAVPV--TRKLRKQSS-IHVENLIREDLIEGIGVALDLAMLTGTGLA-ND---PVGLLNMTGVPAL 500 (632) T ss_pred --eeeEEeeeeEEEEehhh--HHHHHhccc-hHHHHHHHHHHHHHHHHHHHHHhhcccCCC-Cc---cceeeecccccce Confidence 21111 112222233 344444332 122222234556678888999999986422 22 2355433211100 Q ss_pred cccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCce--eEEEeChHHHHHHHHh-hccceeeeeeeeec Q lcl|NC_020858. 155 RGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANF--KHVFVSPYVKSVFVTF-MSDTNVASFRYAAS 231 (330) Q Consensus 155 ~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~--~~i~v~~~~k~~is~f-~~~~~~~~~r~~~~ 231 (330) ..+...++.+.|.++..++-...+.. ...+++|..+..+... ..+.. .++... T Consensus 501 ---------------------~~~~~~~~~~~i~~~~~~i~~~~~~~~~~~~~~~~~~~~~l~~~~l~d~~---G~~i~~ 556 (632) T protein:vir:96 501 ---------------------TYPAGGVDWASVVDMETKISTFNADAGRLAYLTSVTQRGAAKKAQVFDNT---GERIWQ 556 (632) T ss_pred ---------------------ecccccCCHHHHHHHHHHHhhcccccCccEEEEchhHHHHHHHHhccCCC---Cceeec Confidence 01112357777888887777655432 2457888776554431 22221 122211 Q ss_pred CCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc---cccccccccccceeeEEEEEEE Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI---AEDKKVAKTGDAEKFMLIGEGA 308 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~---~~~e~laKtGd~~k~~i~~E~t 308 (330) ++ + =+| +.++.+..||++. +++.|.+.+-+.-.-+. ...+..+ .-+...+.....+. T Consensus 557 ~~---~----------l~G-~pv~~s~~ip~~~-----~~~gd~s~~~i~~~~~~~i~~~~~~~~-~~~~v~~~~~~~~d 616 (632) T protein:vir:96 557 NN---E----------VNG-YRAEASNQIPADT-----WIFGDWSQIVIAMWGVLDLKVDPYTKA-ASDGLVLRVFQDVD 616 (632) T ss_pred CC---e----------ecc-cceEeccccccCc-----EEEeecceEEEEEecceEEEEcccccc-ccCceEEEEEeecC Confidence 11 0 112 3566677788654 55677665544321110 0011111 12334555667788 Q ss_pred EEEecchheeEEeccc Q lcl|NC_020858. 
309 LKPKNEKGLGVAADLY 324 (330) Q Consensus 309 Le~~N~~a~g~i~gLt 324 (330) +.++.|.++.++.-=- T Consensus 617 ~~v~~~~af~~~k~~A 632 (632) T protein:vir:96 617 AGVRRKEAFCIAKKGA 632 (632) T ss_pred ceeechhhhhheeecC Confidence 9999999876554222 No 133 >protein:vir:9361 Length: 402 # NCBI annotation: SLT orf 37-like protein # Family: family:all:658 # MgeID: mge:166 # MgeName: phi 12 # Cross-refs: genbank:acc:NP_803339;genbank:gi:29028650;genbank:GeneID:1258088 Probab=48.26 E-value=0.67 Score=21.52 Aligned_cols=266 Identities=12% Similarity=0.024 Sum_probs=103.0 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) .+..+++ .+......-+++...|...-....|+..++...+..+...-+....... ..-..||...+.......... T Consensus 131 ~a~~~~t-~~~GG~lIP~~~~~~Ii~~~~~~~~l~~~~~v~~~~~~~~p~~~~~~~~--a~~v~Eg~~~~~~~~~f~~i~ 207 (402) T protein:vir:93 131 HALPTGN-DSGGDKLLPKTLSKEIVSEPFAKNQLREKARLTNIKGLEIPRVSYTLDD--DDFITDVETAKELKAKGDTVK 207 (402) T ss_pred hhhccCC-CcCCccccchhHHHHHHHhHHhhhhhhhhceeeecCCceeeeeeccCCc--cccccccccccccccccceee Confidence 1111111 0001111223566666655556677766665544443333232222211 223357776655433322222 Q ss_pred cceEEEEeeeeeehhHHHHHhhc--cccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEA--GRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~--G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) -+...+ .-.+.|| .+.+... .....+..++++++.. ...+.+|+.|... +.| .|++ ++.. 
T Consensus 208 ~~~~k~-~~~i~iS--~ell~Ds~~~l~~~i~~~la~~~~~--~e~~~~~~~g~g~--g~p---~g~~----~~~~---- 269 (402) T protein:vir:93 208 FTTNKF-KVFAAIS--DTVIHGSDVDLVNWVENALQSGLAA--KERKDALAVSPKS--GLE---HMSF----YNGS---- 269 (402) T ss_pred ecceee-eeechhh--HHHHhhhHHHHHHHHHHHHHHHHHH--HHHHhHhhcCCCc--ccc---ceee----eccc---- Confidence 222222 1223344 3333332 2233444444444322 1223445554321 111 1111 0000 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) .....+ .-+.+.|.+++.++-.+- .+...+|-+......+..+ .+... +.. .+...+ T Consensus 270 -------------~~~~~~---~~~~d~l~~~~~~l~~~y~~na~~imn~~t~~~~~~~~-~d~~~----~~~-~~~~~~ 327 (402) T protein:vir:93 270 -------------VKEVEG---ADMYDAIINALADLHEDYRDNATIYMRYADYVKIISVL-SNGTT----NFF-DTPAEK 327 (402) T ss_pred -------------cccccc---cchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHHH-hcCCC----ccc-ccCCcc Confidence 000011 112344555554443221 1223344344444444443 33321 111 111111 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCC-ccccccccccccceeeEEEEEEEEEEecchh Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRK-IAEDKKVAKTGDAEKFMLIGEGALKPKNEKG 316 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~-~~~~e~laKtGd~~k~~i~~E~tLe~~N~~a 316 (330) + ||. .|+..-.++ .+++.|.++.-+.+-+. +..... +++ +...++...-+..++.||.| T Consensus 328 l----------lG~-PV~~t~~~~-------~i~~GDf~~~~~~~~~~~~~~~~~-~~~-~~~~~~~~~r~Dg~v~~~~A 387 (402) T protein:vir:93 328 V----------FGK-PVVFTDAAV-------KPIVGDFNYFGINYDGTTYDTDKD-VKK-GEYLFVLTAWYDQQRTLDSA 387 (402) T ss_pred c----------ccc-ceEEecCCC-------ceeeechhhhhhhhhhhhhhhhhc-ccC-CceEEEEEEEeCcEEechhh Confidence 1 332 222222233 34567776532222111 111111 333 45566667778999999999 Q ss_pred eeEEeccccccccC Q lcl|NC_020858. 
317 LGVAADLYGLTAST 330 (330) Q Consensus 317 ~g~i~gLt~~~~~~ 330 (330) .- |.-+.+-+.+| T Consensus 388 ~~-~l~ik~~~~~~ 400 (402) T protein:vir:93 388 FR-IAKAKENTGPL 400 (402) T ss_pred eE-EEEeecCCCCC Confidence 76 44455556666 No 134 >protein:vir:9927 Length: 295 # NCBI annotation: hypothetical protein # Family: family:all:1178 # MgeID: mge:178 # MgeName: 315.6 # Cross-refs: genbank:acc:NP_795689;genbank:gi:28876459;genbank:GeneID:1258000 Probab=46.12 E-value=0.75 Score=21.28 Aligned_cols=267 Identities=12% Similarity=0.041 Sum_probs=107.7 Q ss_pred CCccccceeecccc-ccccccceeeEecCCcccceeeeeccc--ee--ccceeeeeeeeccCcccccccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAK-GNREELADVVSRITPEDTPIYSMIEKV--SF--DTTHPEWTTDELAAPGANITLEGDEYTFDATV 75 (330) Q Consensus 1 Ma~~t~~~~t~~~~-g~~edl~d~I~~i~p~dTP~~s~ig~~--~~--~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~ 75 (330) ||- +|.-++.+.+ -+.+|+.+....==-+-.++|...... .. +=+.+.|--- ..+ ....||..-+..... T Consensus 1 mAe-~nlt~~~dL~~~~sidfv~~f~~~i~~L~~~Lgi~r~~p~a~G~tIt~pK~~~t-gda---~dVaEGe~Iplskvt 75 (295) T protein:vir:99 1 MAE-KNLNTMADLGDIKSIDFVNKFSKNINDLLKLLGVTRRETLTNDLKIQTYKWEVT-LDQ---TDPGEGETIPLSKVT 75 (295) T ss_pred CCC-cccccHhhccCceeehhhHHhhhhHHHHHHHhccccccccccCCeEEeeeeeee-ccc---ccccCCcccchhhhe Confidence 887 3333333322 222232222210000111112111111 11 1123445321 222 236788887655433 Q ss_pred CceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 76 SPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 76 ~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) ......=-.+|. |.-... 
|.+|++-.|.++-.-.--.+-+.+|.+++-.-|+.--+..-.+ -.-.++ T Consensus 76 ~~~~~t~t~kik-K~rK~t-TdEAIqlsGygdpvgead~qL~~~ia~kId~D~~~~lktat~t-~tg~~l---------- 142 (295) T protein:vir:99 76 RTKDKDYTVKWF-KKRRAT-TAEAIARHGAARAITEADKRIMRELQNGIKDAFFTFLKTKPTK-VKGVGL---------- 142 (295) T ss_pred eeeeeeeEEEee-eecccc-cHHHHHhcCCCchhHHHHHHHHHHHHHhhhHHHHHHhccCcee-eehhhH---------- Confidence 221111112343 333333 8999987787764444444444555555555554321110000 000000 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcc Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKN 235 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~ 235 (330) ...-+.+.+.+...|+.-+...++|++|..... +-++...... .. T Consensus 143 -------------------------q~a~a~~~~al~~f~Ee~~~~~V~FVnP~D~a~---yl~~A~~~~~-------~a 187 (295) T protein:vir:99 143 -------------------------QKALSASWAKLATFNEFEGSPLVSFVSPLDVAN---YLGDTKVGAD-------AS 187 (295) T ss_pred -------------------------HHHHHHhhhhhhhcccccCCceEEEEehHHHHH---HHhccccccc-------hh Confidence 012233444566666655667799999966433 3333322111 11 Q ss_pred eeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEEEEEEEecch Q lcl|NC_020858. 236 NSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGEGALKPKNEK 315 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E~tLe~~N~~ 315 (330) +.||.. +++.=.|-=.||..+=+|.+. ++.--++.+.++|+.+. .-+|++...-.... .|=+++-|.-.. 
T Consensus 188 ~~fG~~--~L~nfLG~q~II~S~kv~~G~-----~~aT~~~Ni~~ay~~~~--~g~l~~~f~~~~D~-tglIg~~h~~~~ 257 (295) T protein:vir:99 188 NVFGMT--LLKNFLGMQNVIVMPSVPEGK-----IYSTAVENLVFASLNVK--GGDLGGLFADFTDE-TGLIAAARNRQL 257 (295) T ss_pred hhhhhh--hhhhhhccceEEEcccCCCce-----EEEeeccceEEEEecCC--chhhhhhhhhccCc-ccceEEEecccc Confidence 114332 233333433588888888764 55688888888886542 12344322110000 011222221111 Q ss_pred heeEEecc------------ccccccC Q lcl|NC_020858. 316 GLGVAADL------------YGLTAST 330 (330) Q Consensus 316 a~g~i~gL------------t~~~~~~ 330 (330) .|..++-| -|+-++| T Consensus 258 ~~~t~et~~~~~~~lfpE~~dgiv~~t 284 (295) T protein:vir:99 258 SNLTYESVFFGANVLFAEIPEGVVEAT 284 (295) T ss_pred ceeeehhhhHhHHHhcccccceEEEEE Confidence 22222111 1222222 No 135 >protein:vir:93616 Length: 645 # NCBI annotation: putative major head protein/prohead protease # Family: family:all:21 # MgeID: mge:157 # MgeName: phi 4795 # Cross-refs: genbank:acc:YP_001449293;genbank:gi:157166041;goa:Q6H9U8;interpro:IPR006433;uniprot:Q6H9U8;genbank:GeneID:5580438 Probab=44.37 E-value=0.81 Score=21.09 Aligned_cols=277 Identities=11% Similarity=-0.013 Sum_probs=121.7 Q ss_pred CCccccceeeccccc---cccccceeeEecCCcccceeeeeccc--ee----ccceeeeeeeeccCcccccccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKG---NREELADVVSRITPEDTPIYSMIEKV--SF----DTTHPEWTTDELAAPGANITLEGDEYTF 71 (330) Q Consensus 1 Ma~~t~~~~t~~~~g---~~edl~d~I~~i~p~dTP~~s~ig~~--~~----~~~~~~W~td~L~~~~~na~~EG~d~~~ 71 (330) +|+.+++-++...-| .-+.+.+.|+..-...+++..+-... .. .....-|++. .+...=..||...+. 
T Consensus 332 ~a~~~~~~~~~~~~Gg~~vp~~~~~~ii~~l~~~svv~~l~~~~~~~~~~~~~~~~ip~~t~---~~~a~wv~Eg~~~~~ 408 (645) T protein:vir:93 332 SAVGAGTTTDPQWAGSLSEYQEYAQDFIDYLRPQTIIGRFGQGGIPALRQVPFNIRVHAQVS---GGAAGWVGEGKTKPL 408 (645) T ss_pred hhhhccccccccccCCccCchhhHHHHHHhhhhhhhHHhhccccccccccccCceeeeeeec---CcceEEeccCccccc Confidence 333333322222222 12234444444333444443321100 00 0112223332 111111347776554 Q ss_pred ccccCceEecceEEEEee-eeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHh Q lcl|NC_020858. 72 DATVSPERLGNYTQIMRK-SGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVK 150 (330) Q Consensus 72 ~~~~~~~~~~N~tQIf~~-~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~ 150 (330) .... ++.++-=-+| ..-|.=|.+-+.... .+--+|=..+-...+.+-++.+||+|....... -...|+.. T Consensus 409 s~~~----f~~v~l~~~kla~~~~iS~ell~ds~-~~~~~~i~~~l~~aia~~~d~a~l~g~g~~~~~-~~p~gi~~--- 479 (645) T protein:vir:93 409 TKFD----FESITFSHAKVSAIAVLTEELIRFSS-PAADALVRNALAEAVVARLDTDFVDPKKAAVAD-VSPASITH--- 479 (645) T ss_pred cccc----eeEEEEeeEEEEEeehhHHHHHhhch-HHHHHHHHHHHHHHHHHHHHHHhhcCCCcccCC-ccccceec--- Confidence 3322 2222211122 222333444444443 222233334555667788889999886532111 11112211 Q ss_pred cccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCcee--EEEeChHHHHHHHHhhccceeeeeee Q lcl|NC_020858. 151 TNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFK--HVFVSPYVKSVFVTFMSDTNVASFRY 228 (330) Q Consensus 151 tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~--~i~v~~~~k~~is~f~~~~~~~~~r~ 228 (330) +..+ .++. ..+.+++..++.++..++..+. ..++||..+.++..+ ++.+... 
.+ T Consensus 480 -----~~~~-------------~~~~----~~~~~d~~~~~~~~~~a~~~~~~a~~vmn~~~~~~L~~l-kd~~G~~-~~ 535 (645) T protein:vir:93 480 -----DVKG-------------TASS----GNPDADAEAAFGQFVAANLQPTGAVWLMSSTNALALSMR-KNALGQK-EY 535 (645) T ss_pred -----cccc-------------cccc----cchHHHHHHHHHHHHhcCCCccccEEEEcHHHHHHHHhc-cccCCce-ee Confidence 0000 0111 1234567777777777776543 357899999998875 4433210 11 Q ss_pred eecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCc-------cccc----c--cccc Q lcl|NC_020858. 229 AASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKI-------AEDK----K--VAKT 295 (330) Q Consensus 229 ~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~-------~~~e----~--laKt 295 (330) ........+ +--+.|+...+||++ +++.|++.+-+...-+. ...+ + ..++ T Consensus 536 ~~~~~~~~t-----------L~G~PV~~s~~vp~~------~~~gd~s~~~ig~~~~v~i~~s~~a~~~~~~~~~~~~~~ 598 (645) T protein:vir:93 536 PDMTLLGGS-----------FQGLPVIVSQYVGDQ------LVLVNAPDIYLADDGGVAVDMSREASLEMQSEPTGDSTT 598 (645) T ss_pred cCCCCCCce-----------eeceeeEEeccCCcc------eeEeccccEEEEEecceEEEeecceeEEEeecccccccc Confidence 111111111 112566777788753 33445554333211110 0000 0 0001 Q ss_pred -----------ccceeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
296 -----------GDAEKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 296 -----------Gd~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) -|..-+..+..+.+.++.|.|..+|+|..==|++- T Consensus 599 ~~~~~~v~lf~~d~vaira~~r~d~~~~~p~a~~~lt~~~~g~~~~ 644 (645) T protein:vir:93 599 PSPVELVSMFQTGSVAIRAERWINWRRRRTAAVAVITGVNYGSASG 644 (645) T ss_pred cccccchhHhhcCceEEEEEEEEcceeeCccceEEEecccCCcccC Confidence 12233455677899999999999999987656666 No 136 >protein:vir:5974 Length: 324 # NCBI annotation: hypothetical protein # Family: family:all:1522 # MgeID: mge:125 # MgeName: SPP1 # Cross-refs: genbank:acc:NP_690674;genbank:geneid:6329212;genbank:gi:22855068;goa:Q38582;uniprot:Q38582;genbank:GeneID:955303 Probab=44.07 E-value=0.82 Score=21.06 Aligned_cols=266 Identities=10% Similarity=0.046 Sum_probs=108.7 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceee------------eec-c-ceeccceeeeeeeeccCccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYS------------MIE-K-VSFDTTHPEWTTDELAAPGANITLEG 66 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s------------~ig-~-~~~~~~~~~W~td~L~~~~~na~~EG 66 (330) ||.+ ..+.-. +-|=|.+.+..-.|.-..|+- .+. . ....-+.+-|. .|...+ ...-|+ T Consensus 1 MA~T---~lsd~i--~peVf~~yv~~~~~~~~~l~qSg~i~~~a~i~~~l~~~~~G~~i~~P~~~--~l~Gd~-~~v~~~ 72 (324) T protein:vir:59 1 MAYT---KISDVI--VPELFNPYVINTTTQLSAFFQSGIAATDDELNALAKKAGGGSTLNMPYWN--DLDGDS-QVLNDT 72 (324) T ss_pred CCce---eeecee--chhHHHHHHHhhhHHHHHHhhcccccccHHHHHHhhccCCCCEEEecccc--cCCCcc-cccCCC Confidence 8832 222211 223334444333333322210 110 0 11112344553 232211 112255 Q ss_pred cccccccccCceEecceEEEE-----eeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCccccc Q lcl|NC_020858. 67 DEYTFDATVSPERLGNYTQIM-----RKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRE 141 (330) Q Consensus 67 ~d~~~~~~~~~~~~~N~tQIf-----~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~ 141 (330) .+.+. .+++-.+|+. -|.|+++.-+..+ .| +|-+..=-.+-..-+.|++++.+|.--+..-+.. . 
T Consensus 73 ~~i~~------~~l~t~~~~a~i~~~~k~~~~tD~a~~~--sg-~dp~~~i~~q~a~~~~~~~~~~lia~l~g~~~~~-~ 142 (324) T protein:vir:59 73 DDLVP------QKINAGQDKAVLILRGNAWSSHDLAATL--SG-SDPMQAIGSRVAAYWAREMQKIVFAELAGVFSND-D 142 (324) T ss_pred cccch------hhcccceeeEEEEeecCceeehhhhhhh--cc-chHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc-c Confidence 54432 2233333322 2345555544443 33 3433332223344456777777764221110000 0 Q ss_pred chhHHHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccc Q lcl|NC_020858. 142 SGSLPTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDT 221 (330) Q Consensus 142 ~~Gi~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~ 221 (330) + .++... . ...+...|+.+.|.++++++=++......+++||.+.-.+-+-.--. T Consensus 143 ~-------~~~~~d----------------v--sa~~~~~~s~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~ 197 (324) T protein:vir:59 143 M-------KDNKLD----------------I--SGTADGIYSAETFVDASYKLGDHESLLTAIGMHSATMASAVKQDLIE 197 (324) T ss_pred c-------ccceee----------------e--eccccceecHHHHHHHHHHhCCcccCcEEEEEchHHHHHHHHhhhhh Confidence 0 001000 0 01122368999999999999899889999999998866655421100 Q ss_pred eeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCcccc----ccEEEEEcchhhhhcccCCc----ccccccc Q lcl|NC_020858. 222 NVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGAL----ARNAFFVDPEFLQFGWLRKI----AEDKKVA 293 (330) Q Consensus 222 ~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~----a~~~~~ld~~~~~~~~Lr~~----~~~e~la 293 (330) + . ...++. -.|-+..| ..||.+.-||..... .-..|++-+.-+.+...++. ..+.++. T Consensus 198 ~---~--~~s~~~--------~~i~~~~G-~~VivdD~~p~~~~~~~~~~y~s~l~~~GAi~~~~~~~~v~vE~dRd~~~ 263 (324) T protein:vir:59 198 F---V--KDSQSG--------IRFPTYMN-KRVIVDDSMPVETLEDGTKVFTSYLFGAGALGYAEGQPEVPTETARNALG 263 (324) T ss_pred h---c--cccccC--------ceeeeecc-cEEEEeCCCCccccCCCCceEEEEEEecCeEEEeecCCCcceecccCccc Confidence 0 0 111111 11344455 677778877753211 11356666655555443331 1111211 Q ss_pred ccc--cceeeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
294 KTG--DAEKFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 294 KtG--d~~k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) +.. -.++..+++=+|....... ..|.|.+- T Consensus 264 g~~~l~~r~~~~~~p~G~s~~~~~-------~~~~sPt~ 295 (324) T protein:vir:59 264 SQDILINRKHFVLHPRGVKFTENA-------MAGTTPTD 295 (324) T ss_pred cceEEEEeeEEEeEeeeEEecccc-------cCCCCCCh Confidence 110 0122333333333332221 12333222 No 137 >protein:vir:106647 Length: 303 # NCBI annotation: ORF011 # Family: family:all:1178 # MgeID: mge:1557 # MgeName: 187 # Cross-refs: genbank:acc:YP_239493;genbank:gi:66395226;genbank:GeneID:4555801 Probab=43.06 E-value=0.86 Score=20.94 Aligned_cols=251 Identities=10% Similarity=0.101 Sum_probs=115.9 Q ss_pred CCccccceeeccc-cccccccceee-------------EecCCcccceeeeeccceeccceeeeeeeeccCccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGA-KGNREELADVV-------------SRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEG 66 (330) Q Consensus 1 Ma~~t~~~~t~~~-~g~~edl~d~I-------------~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG 66 (330) |+...|..++.+. .-.-+|+.+.+ .+..|.. .| .+=..+.|...+...++ --..|| T Consensus 1 M~~e~nl~~~~dL~~a~siDF~~~f~~~i~~L~~~LGv~r~~pla------~G---t~iktyK~~~~~y~gda-~dVaEG 70 (303) T protein:vir:10 1 MSAENNLINVEALGKAKSIDFANKLGVGLNKLFEALAIQNKIPMN------VG---SALKQYRFKVEDSEKPN-GDVAEG 70 (303) T ss_pred CCCCcCCcchhhcccceeehhhhhhhhhHHHHHHHhhhhcccccc------CC---ceeeeeeeeceeecccc-ccccCC Confidence 9999988776665 34444554421 1222221 01 11124566533332222 223488 Q ss_pred cccccccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCc----CCcccccc Q lcl|NC_020858. 67 DEYTFDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNAS----VGGATRES 142 (330) Q Consensus 67 ~d~~~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~----~~~~~r~~ 142 (330) ..-+-.. ..|..-+=++==|.|.-... |.+|+.-.|.++-.-.--.+-+.+|.+++-+.|+.--+. 
...++ T Consensus 71 e~Iplsk-vt~~~~~t~~~~~kK~rK~t-TdEAIqlsGyg~aVgetd~qL~~~Iq~kIdnd~~~~lktaT~t~~~t~--- 145 (303) T protein:vir:10 71 DVIPLTK-VTREQVDITELQFAKYRKST-SAEAIQAHGYDLAINQTDNEMIKYVQKKFRAKFFETLKSAIENGKRTN--- 145 (303) T ss_pred cccchhh-heeeecceEEEEeecccccc-cHHHHHhhcCCchhHHHHHHHHHHHHhhhhHHHHHHHhhccccccccc--- Confidence 8876332 22222222222245555555 999998888877555545555556666666555432111 11110 Q ss_pred hhHHHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHH-------HHhcCCceeEEEeChHHHHHHH Q lcl|NC_020858. 143 GSLPTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQ-------GYQSGANFKHVFVSPYVKSVFV 215 (330) Q Consensus 143 ~Gi~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~-------~~~~Gg~~~~i~v~~~~k~~is 215 (330) ....+-+-|+.++-. .|+....+ ++||||..... T Consensus 146 ------------------------------------~t~~s~~glq~Al~~~~~kl~~~~ed~~~~-V~FvNP~Daa~-- 186 (303) T protein:vir:10 146 ------------------------------------KTKLSAENLQGALSKGRANLSVLLDDEITP-IAFVNPNDTAE-- 186 (303) T ss_pred ------------------------------------ceeecHHHHHHHHHhhhhhccccccccccE-EEEEchHHHHH-- Confidence 001222334444333 34444443 89999965333 Q ss_pred HhhccceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccc- Q lcl|NC_020858. 216 TFMSDTNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAK- 294 (330) Q Consensus 216 ~f~~~~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laK- 294 (330) +-++.... ...+.||. ++++.=.|- .||..+-+|.+. ++.--++.+.++|..+. - +|++ T Consensus 187 -yl~~A~i~--------~~~t~fG~--n~L~nfLG~-~II~S~kv~~G~-----~~~T~~~Ni~~ay~~~~--g-~l~~~ 246 (303) T protein:vir:10 187 -YLANGFIN--------STGAQFGV--NLLTPYVGV-KIVEFADVPQGE-----VWMTVAENLNVAYANPR--G-ELSRA 246 (303) T ss_pred -HhhcCCcc--------hhhhhhhh--hhhhhhhcc-eEEEeccCCCce-----EEEeeccceEEEEecCc--h-hhhhh Confidence 22332211 11123433 234444454 688888888765 55688888888886642 1 3443 Q ss_pred ---cccceeeEEEEEEEEEEecchheeEEecc------------ccccccC Q lcl|NC_020858. 
295 ---TGDAEKFMLIGEGALKPKNEKGLGVAADL------------YGLTAST 330 (330) Q Consensus 295 ---tGd~~k~~i~~E~tLe~~N~~a~g~i~gL------------t~~~~~~ 330 (330) +.| +-+. +|+-|.-...|..++-| =|+-++| T Consensus 247 f~~t~D-~tgl----IGv~h~~~~~~~t~eT~~~~~~~lfpE~~dgiv~~t 292 (303) T protein:vir:10 247 FAFATD-ATGF----VGVLHDIQPQRLTSDTIYASAISMFPENIDAVIKVT 292 (303) T ss_pred hhhccc-cccc----eEEEeccccceeeehhHhHhHHHhcccccceEEEEE Confidence 334 2121 22222211122222111 1222222 No 138 >protein:vir:95963 Length: 395 # NCBI annotation: ORF009 # Family: family:all:635 # MgeID: mge:1594 # MgeName: 2638A # Cross-refs: genbank:acc:YP_239802;genbank:gi:66395459;genbank:GeneID:5132880 Probab=34.98 E-value=1.3 Score=20.04 Aligned_cols=274 Identities=14% Similarity=0.041 Sum_probs=108.6 Q ss_pred CC---ccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccC- Q lcl|NC_020858. 1 MA---VVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVS- 76 (330) Q Consensus 1 Ma---~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~- 76 (330) |. ...+.++ .-+.+.+.|...=....|+.+++....+.+. ..|...+-.+.+.- ..|++... ....+ T Consensus 86 ~~~~t~~~gG~l------iP~~~~~~Ii~~l~~~s~i~~~~~v~~~~~~-~~i~~~~~~~~a~w-~~e~~~~~-~~~~~~ 156 (395) T protein:vir:95 86 INYDVGYTDEKI------LPETVVERVFDDLQKDHPLLSKINFQNAGIK-TRVIKADPAGQAVW-GKVFGEIK-GQLDAA 156 (395) T ss_pred HhhccCCCCcee------ccHHHHHHHHHHHHhhhhhhhhceeEecCCc-eEEEEecCCcceEE-eecccccC-cccccc Confidence 11 1111111 2234555555444455577666544433332 22322221111110 11222211 11111 Q ss_pred -ceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 77 -PERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 77 -~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) ...--+...+ ..-+.||.. -+.... .+--+|=..+-...+.+-+|.+||+|.-.....| -||+..+...... 
T Consensus 157 f~~i~l~~~kl-~~~~~iS~e--ll~ds~-~~ie~~i~~~la~~ia~~~~~a~i~G~G~~~~qP---~Gil~~~~~~~~~ 229 (395) T protein:vir:95 157 FREENFTQYKL-TCFVVLPDD--LSTFGP-AWIERFVRTQIQEAISVALESAIINGGGAAKTQP---VGLMKDVNTNSGA 229 (395) T ss_pred ceeeeeceeeE-EEeecccHH--HHhcch-hHHHHHHHHHHHHHHHHHHhhheeeccCCCCcCc---eeeeecccccccc Confidence 1111122222 233344433 333332 2333444445556678888999999874322222 2555433211100 Q ss_pred ccccccccccccccccccccccccc----cccHHHHHHHHHHHH--hcC------CceeEEEeChHHHHHHHHhhcccee Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQR----AFSKAIMDDVMQQGY--QSG------ANFKHVFVSPYVKSVFVTFMSDTNV 223 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~----~lTe~~l~~~~~~~~--~~G------g~~~~i~v~~~~k~~is~f~~~~~~ 223 (330) ...+ ...++.. .++...+.++...+- .++ ++. ..++|+....- +.+.. T Consensus 230 ~~~~--------------~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~-~~~mn~~t~~~---~~g~~-- 289 (395) T protein:vir:95 230 VTDK--------------ASSGTLTFADADTTILELNDVLKNLSVDEKGKELKIDGKV-ALVVNPRDSWD---VQARY-- 289 (395) T ss_pred cccc--------------cccchhhhhhhHhhHHHHHHHHHhhccccccchhhhcCce-EEEEcchhhhh---cCCcc-- Confidence 0000 0001000 111222322222221 011 111 23455533211 11110 Q ss_pred eeeeeeecCCcceeEEEEEEEEEcCCe-EEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccccc---ccccccce Q lcl|NC_020858. 224 ASFRYAASNGKNNSIVANADVYEGPFG-KVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKK---VAKTGDAE 299 (330) Q Consensus 224 ~~~r~~~~~~~~~~~~~~v~~~~tdfG-~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~---laKtGd~~ 299 (330) . +...+|. +.+-+| -+.|+.+.+||++. +++.|+++.-+ +.|.-..-+. .-=+-|.. T Consensus 290 -~--~~~~~G~----------~~~~lg~g~~v~~~~~~p~~~-----i~fgdfs~y~i-~~r~~~~i~~~~~~~~~~d~~ 350 (395) T protein:vir:95 290 -T--YLTANGG----------FVTVLPYNVTIITSEFVPEGK-----LVAFVTDRYNA-VRGGGLTVKKFDQTLALEDAV 350 (395) T ss_pred -e--eccCCCc----------ceeccCCcceEEEcCCCCCCc-----EEEEecccEEE-EEecceEEEeccchhhhCCcE Confidence 0 1112221 223333 25677899999654 56788886433 3342111111 11122455 Q ss_pred eeEEEEEEEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
300 KFMLIGEGALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 300 k~~i~~E~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) .+....-+.-++.++.|. ++-.|+-+++.- T Consensus 351 ~f~~~~r~dg~~~~~~A~-~~l~i~~~~~~~ 380 (395) T protein:vir:95 351 LFTAKTFAYGQPDDNKAS-AVYDLKVASAPR 380 (395) T ss_pred EEEEEEEECCEEeccccE-EEEEeeccCCCC Confidence 566667788899999998 466677655432 No 139 >protein:vir:93881 Length: 387 # NCBI annotation: ORF011 # Family: family:all:658 # MgeID: mge:1485 # MgeName: 3A # Cross-refs: genbank:acc:YP_239938;genbank:gi:66395599;genbank:GeneID:5130947 Probab=34.68 E-value=1.3 Score=20.00 Aligned_cols=266 Identities=11% Similarity=0.017 Sum_probs=96.7 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCceEe Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSPERL 80 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~~~~ 80 (330) .+..+++ .+......-+++.+.|...--...|+..++...+..+...-+...... ...-..||...+.......... T Consensus 116 ~al~~~t-~s~gG~~IP~~~~~~Ii~~~~~~~~l~~~~~v~~~~~~~~p~~~~~~~--~a~~v~E~~~~~~~~~~f~~v~ 192 (387) T protein:vir:93 116 HALPTGN-DSGGDKLLPKTLSKEIVSEPFAKNQLREKARLTNIKGLEIPRVSYTLD--DDDFITDVETAKELKLKGDTVK 192 (387) T ss_pred HhhccCc-CCCCceeechhHHHHHHHHHHhhchhhhheeeeecCCceEEEEeecCC--ccccccCcccccccccccceee Confidence 2211111 001111122356666665555556766555444443333223222221 1222357766555433322221 Q ss_pred cceEEEEeeeeeehhHHHHHhhcc--ccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccccccc Q lcl|NC_020858. 81 GNYTQIMRKSGIISGTQNITDEAG--RATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGAT 158 (330) Q Consensus 81 ~N~tQIf~~~v~VS~T~~av~~~G--~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~ 158 (330) -+...+ .--+.||. +.+.... ....+..++++++.. ...+.+|.+|... +.| -|++ ++.. 
T Consensus 193 ~~~~k~-~~~~~iS~--ell~Ds~~~l~~~i~~~la~~~~~--~e~~~~~~~g~g~--g~p---~g~l----~~~~---- 254 (387) T protein:vir:93 193 FTTNKF-KVFAAISD--TVIHGSDVDLVNWVENALQSGLAA--KERKDALAVSPKS--GLD---HMSF----YNGS---- 254 (387) T ss_pred eeheee-eeechhhH--HHHhhhHHHHHHHHHHHHHHHHHH--HHHHhHhhcCCCc--ccc---ceee----eccc---- Confidence 122222 22344443 3333222 223344444333321 1223345444321 111 1111 0000 Q ss_pred cccccccccccccccccccccccccHHHHHHHHHHHHhcC-CceeEEEeChHHHHHHHHhhccceeeeeeeeecCCccee Q lcl|NC_020858. 159 GANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSG-ANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKNNS 237 (330) Q Consensus 159 g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~G-g~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~~~ 237 (330) . ....+ ..+.+.|.+++.++-.+- .+. .+++++.....+-+...+... ++.. +...+ T Consensus 255 ~-------------~~v~~---~~~~d~i~~~~~~l~~~~~~~a-~~~mn~~t~~~~~~~~~d~~~---~~~~--~~~~~ 312 (387) T protein:vir:93 255 V-------------KEVEG---ADMYDAIINALADLHEDYRDNA-TIYMRYADYVKIISVLSNGTT---NFFD--TPAEK 312 (387) T ss_pred c-------------ccccc---cchHHHHHHHHhccChhhhcCC-EEEEechHHHHHHHHHhcCCC---cccc--cCCcc Confidence 0 00011 112344444443332211 122 345665443333333343321 1111 11112 Q ss_pred EEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccc-cccccccccceeeEEEEEEEEEEecchh Q lcl|NC_020858. 238 IVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAE-DKKVAKTGDAEKFMLIGEGALKPKNEKG 316 (330) Q Consensus 238 ~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~-~e~laKtGd~~k~~i~~E~tLe~~N~~a 316 (330) |- ..-++.++ .++ .+++-|+++.-+.+ +++.. ...-+++| ...++...-+..++.+|.| T Consensus 313 ll-G~PV~~~~----------~~~-------~~~~GDf~~~~~~~-~~~~~~~~~~~~~~-~~~~~~~~r~d~~v~~~eA 372 (387) T protein:vir:93 313 VF-GKPVVFTD----------AAV-------KPIVGDFNYFGINY-DGTTYDTDKDVKKG-EYLFVLTAWYDQQRTLDSA 372 (387) T ss_pred cc-ccceEEec----------CCC-------ceeeeehhhhheeh-hhheeeecccccCC-ceeEEEEeeeCceeechhh Confidence 21 11223322 232 24567776643322 22111 11112222 2334444568889999998 Q ss_pred eeEEeccccccccC Q lcl|NC_020858. 
317 LGVAADLYGLTAST 330 (330) Q Consensus 317 ~g~i~gLt~~~~~~ 330 (330) .- +.-+..=+.+| T Consensus 373 ~~-~l~~k~~~~~~ 385 (387) T protein:vir:93 373 FR-IAKAKENTGSL 385 (387) T ss_pred eE-EEEeecCCCCC Confidence 86 34454444454 No 140 >protein:vir:9875 Length: 296 # NCBI annotation: hypothetical protein # Family: family:all:1178 # MgeID: mge:177 # MgeName: 315.5 # Cross-refs: genbank:acc:NP_795637;genbank:gi:28876404;genbank:GeneID:1257935 Probab=34.48 E-value=1.3 Score=19.98 Aligned_cols=251 Identities=13% Similarity=0.086 Sum_probs=107.0 Q ss_pred CC----ccccceeeccccccccccceee--EecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccc Q lcl|NC_020858. 1 MA----VVTNTFQSTGAKGNREELADVV--SRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDAT 74 (330) Q Consensus 1 Ma----~~t~~~~t~~~~g~~edl~d~I--~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~ 74 (330) |+ ..+-=|+.-+..+.. .|.+.+ .+..|... ++.=.++..|..- + ++ .-..||..-+-... T Consensus 15 ~~dl~~~~siDf~~~f~~~i~-~L~~~LGv~r~~pla~--------GstIkt~k~~~y~-g-da--~dVaEGe~Iplskv 81 (296) T protein:vir:98 15 STDLKYPITIDVTNKFQENIS-KLLEMLGVTRKISVSE--------GMTLKTYAGYDVT-L-AE--GNVPEGEVIPLSKV 81 (296) T ss_pred hhhhhhhhhhhhHHHHhhhHH-HHHHHhhhcccccccC--------CCEEeeccceeee-e-cc--ccccCCcccchhhh Confidence 21 112223332222211 122222 11112111 1111122334321 1 11 22348888765543 Q ss_pred cCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhcccc Q lcl|NC_020858. 75 VSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVS 154 (330) Q Consensus 75 ~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~ 154 (330) .......=-.+| .|.-... |.+|++-.|.++-.-.--.+-+.+|.+++-+.|+.--+..-.+. 
T Consensus 82 t~~~~~t~t~~i-kK~rK~t-TdEAIqlsGyg~aVgetd~qL~~~iq~kId~d~~t~LktaT~t~--------------- 144 (296) T protein:vir:98 82 ERKIHSEKKIEL-KKYRKAT-TGEDIQMYGSNEAVTNTDNALVRQLQKKIRTDFVTALKTGTGTQ--------------- 144 (296) T ss_pred eeeecceEEEEe-ecccccc-CHHHHHhhcCCchhHHHHHHHHHHHHHhhhHHHHHHHhccccee--------------- Confidence 322111112234 3333333 89999888877655444444555566666555543211100000 Q ss_pred cccccccccccccccccccccccccccccHHHHHHHHH-------HHHh-cCCceeEEEeChHHHHHHHHhhccceeeee Q lcl|NC_020858. 155 RGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQ-------QGYQ-SGANFKHVFVSPYVKSVFVTFMSDTNVASF 226 (330) Q Consensus 155 ~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~-------~~~~-~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~ 226 (330) .-+-+-|+.++. ..|+ .+....++|+||... .++-++... T Consensus 145 --------------------------~~t~~~lQ~Ala~~~~~l~~~feded~~~~V~FVnP~D~---a~ylg~a~i--- 192 (296) T protein:vir:98 145 --------------------------DALGAGLQGALASAWGKLQVLFEDYGSERAIVFANSLDV---AEYIAKAGI--- 192 (296) T ss_pred --------------------------eechhhHHHHHHHHhhhhhhhccccCCCceEEEEehHHH---HHHhcCCcc--- Confidence 001112333333 3343 355677899999663 334454432 Q ss_pred eeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccccccccccccceeeEEEEE Q lcl|NC_020858. 227 RYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKKVAKTGDAEKFMLIGE 306 (330) Q Consensus 227 r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~laKtGd~~k~~i~~E 306 (330) +-++.|+. .+++.-.| ..||..+=+|.+. ++..-++.+.++|..+. . -+|++...-.-- =.|= T Consensus 193 ------t~qt~fG~--tyl~nfLG-~~II~S~kV~~G~-----~~~T~~~Ni~~ay~~~~-~-~~l~~~f~~~~d-~tgl 255 (296) T protein:vir:98 193 ------TTQTAFGL--TYLVDFTG-TVIISTNDVTKGE-----IWATVPENIIFAYINPN-N-SELAKEFNLYGD-PTGY 255 (296) T ss_pred ------chhheech--hhhhhccc-cEEEEcCcCCCce-----EEEeeecceEEEeeccc-c-cchhhhhccccc-cccc Confidence 12333444 23444557 6899999888664 56788999999997642 1 245553311100 0011 Q ss_pred EEEEEecchheeEEecc------------ccccccC Q lcl|NC_020858. 
307 GALKPKNEKGLGVAADL------------YGLTAST 330 (330) Q Consensus 307 ~tLe~~N~~a~g~i~gL------------t~~~~~~ 330 (330) +|+-|.-...|..++-| -|+-++| T Consensus 256 IGv~h~~~~~~~t~eT~~~~~~~lfpE~~dgiv~~t 291 (296) T protein:vir:98 256 IGMNHFQENTTLTIQTLLVSGMLMYPERIDGIVKVT 291 (296) T ss_pred eEEEeccccceeeehhHhHhHHHhcccccceEEEEE Confidence 22333222223222211 1222222 No 141 >protein:vir:78640 Length: 352 # NCBI annotation: phage capsid # Family: family:all:658 # MgeID: mge:1855 # MgeName: tp310-2 # Cross-refs: genbank:acc:YP_001429943;genbank:gi:156603997;genbank:GeneID:5525386 Probab=34.13 E-value=1.3 Score=19.94 Aligned_cols=262 Identities=9% Similarity=0.004 Sum_probs=103.1 Q ss_pred CC---ccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccccccCc Q lcl|NC_020858. 1 MA---VVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFDATVSP 77 (330) Q Consensus 1 Ma---~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~~~~~~ 77 (330) |. ...+.++- -+++...|...-....|+..+....++.+...-+.+.... ...-..||+..+....... T Consensus 83 l~~~~~~~gG~lI------P~~~~~~Ii~~l~~~s~l~~~~~v~~~~~~~~p~~~~~~~--~a~~v~E~~~~~~~~~~f~ 154 (352) T protein:vir:78 83 LPTGNDSGGDKLL------PKTLSKEIVSEPFAKNQLREKARLTNIKGLEIPRVSYTLD--DDDFITDVETAKELKLKGD 154 (352) T ss_pred hccCCCCCCceec------cHhHHHHHHHHHHhhcchhhheeeEecCCceEEEEecCCC--cccccccccccccccccce Confidence 21 11112222 2356666655556666776655544444433333333322 2223357776655432222 Q ss_pred eEecceEEEEeeeeeehhHHHHHhhcc--ccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccc Q lcl|NC_020858. 78 ERLGNYTQIMRKSGIISGTQNITDEAG--RATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSR 155 (330) Q Consensus 78 ~~~~N~tQIf~~~v~VS~T~~av~~~G--~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~ 155 (330) ...-+...+ .--+.|| .+.+.... ....+..++++++. +.+.+..|.+|... +.+ .+++ ++... 
T Consensus 155 ~v~~~~~k~-~~~i~is--~ell~Ds~~~l~~~i~~~la~~~~--~~e~~~~~~~g~g~--~~~--~g~l-----~~~~~ 220 (352) T protein:vir:78 155 TVKFTTNKF-KVFAAIS--DTVIHGSDVDLVNWVENALQSGLA--AKERKDALAVSPKS--GLE--HMSF-----YNGSV 220 (352) T ss_pred eeeecceeE-Eeechhh--HHHHhhhhHHHHHHHHHHHHHHHH--HHHHHhhhhcCCCC--ccc--ccce-----ecccc Confidence 221122111 1123333 34443322 23455555555543 22445555555432 111 1111 00000 Q ss_pred ccccccccccccccccccccccccccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeecCCcc Q lcl|NC_020858. 156 GATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAASNGKN 235 (330) Q Consensus 156 g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~~~~~ 235 (330) .. ..++ -+.+.|.+++.++-.+-.+-..+++++....++-++..+... +... +.. T Consensus 221 ----------------~~-~t~~---~~~d~i~~~~~~l~~~~~~~a~~~mn~~t~~~l~~~~~~~~~----~~~~-~~~ 275 (352) T protein:vir:78 221 ----------------KE-VEGA---NMYDAIINALADLHEDYRDNATIYMRYADYVKIISVLSNGTT----NFFD-TPA 275 (352) T ss_pred ----------------cc-cccc---chHHHHHHHHhccChhhhcCCEEEEehHHHHHHHHHHhccCC----cccc-cCC Confidence 00 0011 123455555554433222222456676555445454443321 1111 111 Q ss_pred eeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCccc-cccccccccceeeEEEEEEEEEEecc Q lcl|NC_020858. 236 NSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAE-DKKVAKTGDAEKFMLIGEGALKPKNE 314 (330) Q Consensus 236 ~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~-~e~laKtGd~~k~~i~~E~tLe~~N~ 314 (330) .+ =||. .|+....++ .+++-|+++.-+.+ .+... .-.-+++ +...+....-+...+.+| T Consensus 276 ~~----------llG~-PV~~~~~~~-------~~~~Gdf~~~~~~~-~~~~~~~~~~~~~-g~~~f~~~~r~Dg~~~~~ 335 (352) T protein:vir:78 276 EK----------VFGK-PVVFTDAAV-------KPIVGDFNYFGINY-DGTTYDTDKDVKK-GEYLFVLTAWYDQQRTLD 335 (352) T ss_pred cc----------cccc-ceEEecCCC-------ceeEeehhhhhhhh-hhheeeeeccccC-CeeEEEEEeeeCceeech Confidence 11 2332 222222333 34567777653322 22111 0011222 234555556688999999 Q ss_pred hheeEEeccccccccC Q lcl|NC_020858. 
315 KGLGVAADLYGLTAST 330 (330) Q Consensus 315 ~a~g~i~gLt~~~~~~ 330 (330) .|.-++. +.=-+.++ T Consensus 336 eA~~~l~-~~a~~~~~ 350 (352) T protein:vir:78 336 SAFRIAK-AKESTGSL 350 (352) T ss_pred hheEEEE-eecccCCC Confidence 9975443 22112222 No 142 >protein:vir:78935 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1860 # MgeName: LKD16 # Cross-refs: genbank:acc:YP_001522824;genbank:gi:158345059;genbank:GeneID:5687425 Probab=33.35 E-value=1.4 Score=19.85 Aligned_cols=287 Identities=13% Similarity=0.094 Sum_probs=109.3 Q ss_pred CCccccceeeccccccccccceeeEec--CCccc------ceeeeec-----cceeccceeeeeeeec-cCccccccccc Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRI--TPEDT------PIYSMIE-----KVSFDTTHPEWTTDEL-AAPGANITLEG 66 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i--~p~dT------P~~s~ig-----~~~~~~~~~~W~td~L-~~~~~na~~EG 66 (330) |..|. .+|-+.-.|--.|+ +....+ +-.+| =|+++.. +++.-+...-|....- ..| -..+.| T Consensus 1 ms~~~-~~t~~~~~~s~~d~-al~le~f~geV~~af~~~s~~~~~~~~rti~~g~s~~~~~iG~~~~~~~~p--G~~l~~ 76 (335) T protein:vir:78 1 MSFLN-DLTRPNYAGKNADV-DIHLEEHLGIVDKHFAYTSKFAPLMNIRDLRGSNVVRLDRLGNVEAKGRRA--GEELER 76 (335) T ss_pred CCccc-cccccccccccchh-hhhhhhhhhHHHHHHHHhhhhccccceeeeccceeEEEeeeeeeeeccccc--CcccCC Confidence 98884 66666554444443 211111 01111 1112111 1111111112222110 000 111222 Q ss_pred cccccccccCceEecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHh----cCC--CcCCcc-- Q lcl|NC_020858. 67 DEYTFDATVSPERLGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIV----ATN--ASVGGA-- 138 (330) Q Consensus 67 ~d~~~~~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i----~g~--~~~~~~-- 138 (330) .. .......-++-+-..-.+-|-.--++.++|-...|+..|+.+++ ++..-..++ .+- ....+. 
T Consensus 77 ~~-----~~~~k~~itID~ll~a~~~VddlDe~~~~yDvR~e~s~~~G~aL---A~~~Dq~~~~~l~~aa~~~a~~~~~~ 148 (335) T protein:vir:78 77 SR-----VVNDKWNLTVDTLLYLRHQFDHQDEWTQSFDMRKEVAELDGQEL---ARKFDQACLIQVIKAAAMDAPVDLED 148 (335) T ss_pred CC-----cccCCeEEEecceeechhhHhhHHHhhcCchhHHHHHHHHHHHH---HHHHHHHHHHHHHhhcccccccccCC Confidence 21 11122222333333333334455566666666666666665544 444444332 111 111110 Q ss_pred cccchhHHHHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhc-----CCceeEEEeChHHHHH Q lcl|NC_020858. 139 TRESGSLPTWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS-----GANFKHVFVSPYVKSV 213 (330) Q Consensus 139 ~r~~~Gi~~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~-----Gg~~~~i~v~~~~k~~ 213 (330) ...-|+.... ...+ ..+.++ +..| .+.+.++.+.+-+. +-..+.++|+|.+... T Consensus 149 ~~~~G~~~~~----~~tg--------------~~~~~~--~~~l-~~a~~~a~~~l~ekdvP~~~~~~rv~vv~P~~y~~ 207 (335) T protein:vir:78 149 AFSPGVLEKL----DLTG--------------LTAKEA--AEKI-VRMHRRVVETFIERDLGDAVYSEGLTPMSPRVFSL 207 (335) T ss_pred CcCCCcceee----eecc--------------cccccc--HHHH-HHHHHHHHHHHHhccCCCCCCCccEEEeChHHHHH Confidence 0111110000 0000 000001 1111 22334444444432 2345789999988766 Q ss_pred HHHhhccceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccE----------------EEEEcchh Q lcl|NC_020858. 214 FVTFMSDTNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARN----------------AFFVDPEF 277 (330) Q Consensus 214 is~f~~~~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~----------------~~~ld~~~ 277 (330) +-. ........+...++..-.....|- ..=-+.|+..++||..++.+.. ++++-+.- T Consensus 208 Ll~---~~~l~n~~~~~s~~~~~~~~g~v~----~v~Gv~V~~Sn~lP~~~~t~~~lg~a~n~~~~d~~~~~~~~~~~~A 280 (335) T protein:vir:78 208 LLE---HDKLMSVEYQATGATNDYVKSRVA----ILNGVKVLETPRFATKAISAHPLGRHFNVSAEEAERQIALFLPSKT 280 (335) T ss_pred Hhc---ccccccccccccccccccccceeE----EeeceEEEeeccCCCCCCccccccccCCcccccccceEEEEEecce Confidence 655 211111222222221100000000 0012567777888876544321 22333332 Q ss_pred hhhcccCCccccccccccccceeeEEEEE--EEEEEecchheeEEeccccccccC Q lcl|NC_020858. 
278 LQFGWLRKIAEDKKVAKTGDAEKFMLIGE--GALKPKNEKGLGVAADLYGLTAST 330 (330) Q Consensus 278 ~~~~~Lr~~~~~e~laKtGd~~k~~i~~E--~tLe~~N~~a~g~i~gLt~~~~~~ 330 (330) +-..-+.++ ..+..-....+-++|.+= ++..+++|.+.+.|+ ++||-+=. T Consensus 281 l~t~~~~~~--~~e~~~~~~~~~~~i~~~~a~G~g~lRPe~a~~i~-~tg~~~~~ 332 (335) T protein:vir:78 281 LITAQVAPV--QAKLWEDHDQFSWVLDTFQMYNIGARRPDTAGAIE-LKGIEAFD 332 (335) T ss_pred EEEEEEEec--ccceeeccchhhHhhhHHHHcCCcccCcceEEEEE-ecCCCccc Confidence 211111110 001111223344555554 678889999888877 88886544 No 143 >protein:vir:98635 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:1601 # MgeName: phi3396 # Cross-refs: genbank:acc:YP_001039923;genbank:gi:126011098;genbank:GeneID:4818471 Probab=33.19 E-value=1.4 Score=19.83 Aligned_cols=289 Identities=12% Similarity=-0.002 Sum_probs=109.1 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccc-cccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFD-ATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~-~~~~~~~ 79 (330) .+..+....+..-..--+++.+.|+.-=...-|+++++......+. ..|..++-.+. ..=..|++..+.. ...-... T Consensus 76 ~~~~~~~~~~~gg~~vP~~~~~~I~~~l~~~s~i~~~~~v~~~~~~-~~~~~~~~~~~-a~w~~e~~~~~~~~~~~f~~i 153 (377) T protein:vir:98 76 NDIDKNVGGKDKFKLLPEETMVQVFDDLVAEHPLLKVINFKNTSLR-LKALTAETSGT-AVWGDIFGEIKGQLKQAFKEQ 153 (377) T ss_pred HHHHhccCCCCCccccCHHHHHHHHHHHHHhhhhhhheeeEecCcc-eEEEEecCCcc-eeEeecccccCcccCccceeE Confidence 1111111111111111234555555444555677766544333222 34443321111 1111344333211 1111111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) .=+...+.. -+.|| .+-+.... .|--+|=..+-...+.+-+|.+||+|.-. 
+ +.-||+..+...+.....+ T Consensus 154 ~l~~~kl~a-~~~is--~elL~ds~-~~ie~~i~~~la~~~a~~~~~a~i~G~G~--~---qP~Gil~~~~~~~~~~~~~ 224 (377) T protein:vir:98 154 DFSQFKLTA-FVVIP--KDALKFGP-KWIKQFITEQLKEAIAVALELAIVKGDGL--L---QPVGLLKDLSQPTVDQSTG 224 (377) T ss_pred eecceeEEe-eeccc--HHhhhccH-hHHHHHHHHHHHHHHHHHHhhceEeccCC--C---cceeeeecccccccccccc Confidence 112222221 13343 33333333 23444555566677788899999998742 2 3445554432111110000 Q ss_pred c-cccccccccccccccccc-------cccccHHHHHHHHHHHHhcCCceeEEEeChHHHHHHHHhhccceeeeeeeeec Q lcl|NC_020858. 160 A-NGGYNTGTGLTVAPTDGT-------QRAFSKAIMDDVMQQGYQSGANFKHVFVSPYVKSVFVTFMSDTNVASFRYAAS 231 (330) Q Consensus 160 ~-~~~~~~~~~~~~~~t~gt-------~~~lTe~~l~~~~~~~~~~Gg~~~~i~v~~~~k~~is~f~~~~~~~~~r~~~~ 231 (330) . .................. ...+--..-...++++-+.-|++ +..++|.....+ ... ....+. T Consensus 225 ~~~~~~~~~~~~~~~l~~~~~~~~~~~a~~~m~~~t~~~~~klkd~~G~~-i~~~n~~~~~~~---~p~-----~~~~~~ 295 (377) T protein:vir:98 225 RDITTYKTDKEAIADLSDLTPDNAPKKLVPVMKHLSVNDKKRPLKIAGQV-KLILNPEDRWAL---EAQ-----FTSRNQ 295 (377) T ss_pred cccccccchhhhHhhhhhhchhHHHHHHHHHHHHHHHHHHhhhhccCCce-EEEecccchhhc---ccc-----ccccCC Confidence 0 000000000000000000 00000000011112222222222 223444221110 000 000111 Q ss_pred CCcceeEEEEEEEEEcCCe-EEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccccc---ccccccceeeEEEEEE Q lcl|NC_020858. 232 NGKNNSIVANADVYEGPFG-KVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKK---VAKTGDAEKFMLIGEG 307 (330) Q Consensus 232 ~~~~~~~~~~v~~~~tdfG-~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~---laKtGd~~k~~i~~E~ 307 (330) +|. +.+-+| .+.++-+.+||.+ ++++.|.++.-+ +.|.-...+. ..=.-|..-+....-+ T Consensus 296 ~G~----------~~t~lg~p~~vv~s~~~p~~-----~i~fgdf~~Y~i-~~r~~~~i~~~~~~~~~~d~~~f~~~~r~ 359 (377) T protein:vir:98 296 FGE----------YVTVLPHGITILESLAVETG-----KAIAFVANRYDA-FMATASTIEEYDQTFAMEDLQLYLTKNYF 359 (377) T ss_pred CCc----------cccccCCCceEEecCCCCcc-----cEEEEEecceeE-EeecceEEEeechhhhhcCceEEEEEEEE Confidence 111 334556 3567788889865 356888887444 3343111111 1112255556666678 Q ss_pred EEEEecchheeEEeccccc Q lcl|NC_020858. 
308 ALKPKNEKGLGVAADLYGL 326 (330) Q Consensus 308 tLe~~N~~a~g~i~gLt~~ 326 (330) .=++.++.|..++. |+|= T Consensus 360 dg~~~~~~a~~vl~-i~~~ 377 (377) T protein:vir:98 360 YGKAKDNHTAALLT-LAGG 377 (377) T ss_pred cCEEeccCcEEEEE-EecC Confidence 88899999976655 4433 No 144 >protein:vir:101291 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:1591 # MgeName: phiNM3 # Cross-refs: genbank:acc:YP_908831;genbank:gi:118725095;genbank:GeneID:4555862 Probab=24.38 E-value=2.2 Score=18.72 Aligned_cols=288 Identities=13% Similarity=0.024 Sum_probs=117.2 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccc-cccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFD-ATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~-~~~~~~~ 79 (330) .+..+++ .+......-+.+.+.|+..=....|+.+++......+. ..+...+-.+.+ .=..|++..+.. ...-... T Consensus 74 ~~~~~~~-~~~gg~lvP~~~~~~I~~~l~~~s~i~~~~~v~~~~~~-~~i~~~~~~~~a-~w~~e~~~~~~~~~~~f~~i 150 (381) T protein:vir:10 74 MDINKNV-NYKEEKLLPEETIDRIFEDLTTNHPLLADLGIKNAGLR-LKFLKSETSGVA-VWGKIYGEIKGQLDAAFSEE 150 (381) T ss_pred HHHhccc-CCCCceecCHHHHHHHHHHHHhhccceeheeeEecCcc-eEEEEecCCcce-eeecccccccccccccceee Confidence 1211111 00111112345666666655677788776644333221 222222111111 011233322211 1111111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) .=+...+ ..-+.||... +.... .+--+|=..+-...+.+-+|.+||+|.-. +.| -||+..+.-.+... 
++ T Consensus 151 ~l~~~kl-~~~~~is~el--L~Ds~-~~ie~~i~~~la~~~a~~~~~a~i~G~G~--~qP---~Gil~~~~~~~~~~-~g 220 (381) T protein:vir:10 151 TAIQNKL-TAFVVLPKDL--NDFGP-AWIERFVRVQIEEAFAVALETAFLKGTGK--DQP---IGLNRQVQKGVSVT-EG 220 (381) T ss_pred eecceeE-EeechhhHHH--hhcCH-HHHHHHHHHHHHHHHHHHhhheeEeccCC--CCc---eeeeeccCcccccc-cc Confidence 1122222 2334444333 22221 23334444455567788899999998652 233 35554332111100 00 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHHHhcCCc----e--e-EEEeChHHHHHHHHhhccceeeeeeeeecC Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGAN----F--K-HVFVSPYVKSVFVTFMSDTNVASFRYAASN 232 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~----~--~-~i~v~~~~k~~is~f~~~~~~~~~r~~~~~ 232 (330) ..... ...+. .+... .....+.|.++++++-..+.. + + ..++|+.....+-.+... ...+ T Consensus 221 ~~~~~-~~~~t---~t~~~-~~~~~~~l~~~~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~l~~~~~~--------~~~~ 287 (381) T protein:vir:10 221 AYPEK-EEQGT---LTFAN-PRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYTH--------LNAN 287 (381) T ss_pred ccccc-ccccc---ccccc-chhhHHHHHHHHHhhccccccccccccCceEEEEccccHHhhcccccc--------CCCC Confidence 00000 00000 00000 011224455555555332211 1 1 346787654443322110 0111 Q ss_pred CcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccccc---ccccccceeeEEEEEEEE Q lcl|NC_020858. 233 GKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKK---VAKTGDAEKFMLIGEGAL 309 (330) Q Consensus 233 ~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~---laKtGd~~k~~i~~E~tL 309 (330) |. + | ...+|| +.|+.+.+||.+. +++.|.++..+ ..|.-...+. ..=.=|...+....-+.- T Consensus 288 G~---~---v--~~l~~g-~~vv~s~~~p~~~-----iifgDfs~Y~i-~~r~~~~i~~~~~~~~~~d~~~f~a~~r~dg 352 (381) T protein:vir:10 288 GV---Y---V--TALPFN-LNVIESTVQEAGK-----VLTYVKGLYDG-YLAGGINVQKFKETLALDDMDLYTAKQFAYG 352 (381) T ss_pred Cc---e---e--ecCCCC-ceEEecCCCCcCc-----EEEEecccEEE-EEecccEEEeechhHhhcCCeEEEEEEEEcC Confidence 21 1 1 123455 4577788999654 56788877444 3343111111 111234455666777888 Q ss_pred EEecchheeEEecccc--ccccC Q lcl|NC_020858. 
310 KPKNEKGLGVAADLYG--LTAST 330 (330) Q Consensus 310 e~~N~~a~g~i~gLt~--~~~~~ 330 (330) ++.+++|..++. |+- .+.++ T Consensus 353 ~~~~~~A~~v~~-l~~~~~~~~~ 374 (381) T protein:vir:10 353 KAKDNKVAAVWK-LDLKGHKPAL 374 (381) T ss_pred EEecCceEEEEE-EEecCCCcCc Confidence 889999987754 444 44444 No 145 >protein:vir:9509 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:170 # MgeName: phiN315 # Cross-refs: genbank:acc:NP_835556;genbank:gi:30043951;genbank:GeneID:1260537 Probab=24.38 E-value=2.2 Score=18.72 Aligned_cols=288 Identities=13% Similarity=0.024 Sum_probs=117.2 Q ss_pred CCccccceeeccccccccccceeeEecCCcccceeeeeccceeccceeeeeeeeccCccccccccccccccc-cccCceE Q lcl|NC_020858. 1 MAVVTNTFQSTGAKGNREELADVVSRITPEDTPIYSMIEKVSFDTTHPEWTTDELAAPGANITLEGDEYTFD-ATVSPER 79 (330) Q Consensus 1 Ma~~t~~~~t~~~~g~~edl~d~I~~i~p~dTP~~s~ig~~~~~~~~~~W~td~L~~~~~na~~EG~d~~~~-~~~~~~~ 79 (330) .+..+++ .+......-+.+.+.|+..=....|+.+++......+. ..+...+-.+.+ .=..|++..+.. ...-... T Consensus 74 ~~~~~~~-~~~gg~lvP~~~~~~I~~~l~~~s~i~~~~~v~~~~~~-~~i~~~~~~~~a-~w~~e~~~~~~~~~~~f~~i 150 (381) T protein:vir:95 74 MDINKNV-NYKEEKLLPEETIDRIFEDLTTNHPLLADLGIKNAGLR-LKFLKSETSGVA-VWGKIYGEIKGQLDAAFSEE 150 (381) T ss_pred HHHhccc-CCCCceecCHHHHHHHHHHHHhhccceeheeeEecCcc-eEEEEecCCcce-eeecccccccccccccceee Confidence 1211111 00111112345666666655677788776644333221 222222111111 011233322211 1111111 Q ss_pred ecceEEEEeeeeeehhHHHHHhhccccchHHHHHHHHHHHHHHHHHHHHhcCCCcCCcccccchhHHHHHhccccccccc Q lcl|NC_020858. 80 LGNYTQIMRKSGIISGTQNITDEAGRATKVKEQKLKKGVELRKDVEFSIVATNASVGGATRESGSLPTWVKTNVSRGATG 159 (330) Q Consensus 80 ~~N~tQIf~~~v~VS~T~~av~~~G~~~e~a~q~~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~~~i~tn~~~g~~g 159 (330) .=+...+ ..-+.||... +.... .+--+|=..+-...+.+-+|.+||+|.-. +.| -||+..+.-.+... 
++ T Consensus 151 ~l~~~kl-~~~~~is~el--L~Ds~-~~ie~~i~~~la~~~a~~~~~a~i~G~G~--~qP---~Gil~~~~~~~~~~-~g 220 (381) T protein:vir:95 151 TAIQNKL-TAFVVLPKDL--NDFGP-AWIERFVRVQIEEAFAVALETAFLKGTGK--DQP---IGLNRQVQKGVSVT-EG 220 (381) T ss_pred eecceeE-EeechhhHHH--hhcCH-HHHHHHHHHHHHHHHHHHhhheeEeccCC--CCc---eeeeeccCcccccc-cc Confidence 1122222 2334444333 22221 23334444455567788899999998652 233 35554332111100 00 Q ss_pred ccccccccccccccccccccccccHHHHHHHHHHHHhcCCc----e--e-EEEeChHHHHHHHHhhccceeeeeeeeecC Q lcl|NC_020858. 160 ANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQSGAN----F--K-HVFVSPYVKSVFVTFMSDTNVASFRYAASN 232 (330) Q Consensus 160 ~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~Gg~----~--~-~i~v~~~~k~~is~f~~~~~~~~~r~~~~~ 232 (330) ..... ...+. .+... .....+.|.++++++-..+.. + + ..++|+.....+-.+... ...+ T Consensus 221 ~~~~~-~~~~t---~t~~~-~~~~~~~l~~~~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~l~~~~~~--------~~~~ 287 (381) T protein:vir:95 221 AYPEK-EEQGT---LTFAN-PRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYTH--------LNAN 287 (381) T ss_pred ccccc-ccccc---ccccc-chhhHHHHHHHHHhhccccccccccccCceEEEEccccHHhhcccccc--------CCCC Confidence 00000 00000 00000 011224455555555332211 1 1 346787654443322110 0111 Q ss_pred CcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEEcchhhhhcccCCcccccc---ccccccceeeEEEEEEEE Q lcl|NC_020858. 233 GKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFVDPEFLQFGWLRKIAEDKK---VAKTGDAEKFMLIGEGAL 309 (330) Q Consensus 233 ~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~ld~~~~~~~~Lr~~~~~e~---laKtGd~~k~~i~~E~tL 309 (330) |. + | ...+|| +.|+.+.+||.+. +++.|.++..+ ..|.-...+. ..=.=|...+....-+.- T Consensus 288 G~---~---v--~~l~~g-~~vv~s~~~p~~~-----iifgDfs~Y~i-~~r~~~~i~~~~~~~~~~d~~~f~a~~r~dg 352 (381) T protein:vir:95 288 GV---Y---V--TALPFN-LNVIESTVQEAGK-----VLTYVKGLYDG-YLAGGINVQKFKETLALDDMDLYTAKQFAYG 352 (381) T ss_pred Cc---e---e--ecCCCC-ceEEecCCCCcCc-----EEEEecccEEE-EEecccEEEeechhHhhcCCeEEEEEEEEcC Confidence 21 1 1 123455 4577788999654 56788877444 3343111111 111234455666777888 Q ss_pred EEecchheeEEecccc--ccccC Q lcl|NC_020858. 
310 KPKNEKGLGVAADLYG--LTAST 330 (330) Q Consensus 310 e~~N~~a~g~i~gLt~--~~~~~ 330 (330) ++.+++|..++. |+- .+.++ T Consensus 353 ~~~~~~A~~v~~-l~~~~~~~~~ 374 (381) T protein:vir:95 353 KAKDNKVAAVWK-LDLKGHKPAL 374 (381) T ss_pred EEecCceEEEEE-EEecCCCcCc Confidence 889999987754 444 44444 No 146 >protein:vir:94070 Length: 339 # NCBI annotation: putative structural protein # Family: family:all:1653 # MgeID: mge:1493 # MgeName: OP2 # Cross-refs: genbank:acc:YP_453625;genbank:gi:84662661;genbank:GeneID:5142580 Probab=21.90 E-value=2.5 Score=18.37 Aligned_cols=279 Identities=8% Similarity=-0.009 Sum_probs=105.5 Q ss_pred CCccc---cc-eeecccccc---ccc-cceeeEecCCcccceeeeeccceec---cceeeeeeeeccCcccccccccccc Q lcl|NC_020858. 1 MAVVT---NT-FQSTGAKGN---REE-LADVVSRITPEDTPIYSMIEKVSFD---TTHPEWTTDELAAPGANITLEGDEY 69 (330) Q Consensus 1 Ma~~t---~~-~~t~~~~g~---~ed-l~d~I~~i~p~dTP~~s~ig~~~~~---~~~~~W~td~L~~~~~na~~EG~d~ 69 (330) ||.-. .. .++....|. ... +...|+.+--.+.-..++++-.+.. ...+.+...+... .++.=| |. T Consensus 35 ~a~d~~~~~~~~~~~~~~~i~a~~~~~i~~~vy~~~~~~~~~~~l~pv~t~g~w~~~t~~y~~~e~~G---~a~~yg-d~ 110 (339) T protein:vir:94 35 YAMDAVNLTPTLQTTANAGIPAWMTTFVDRRVIDIQLAPMAAAKIFPEVKKGDWTTTYGVFIIAEPVG---QVATYS-DW 110 (339) T ss_pred hhccccccccccccccccchhhhhhhhhchhheeecccccchhhhcccccCCCCcccEEEEeeeeccc---ceEEcc-cc Confidence 22111 11 111111121 111 2233333322222233344433221 1123333333222 333222 22 Q ss_pred ccc-cccCceEecceEEEEeeeeeehhHHHHHhhccc-cchHHHHH-HHHHHHHHHHHHHHHhcCCCcCCcccccchhHH Q lcl|NC_020858. 70 TFD-ATVSPERLGNYTQIMRKSGIISGTQNITDEAGR-ATKVKEQK-LKKGVELRKDVEFSIVATNASVGGATRESGSLP 146 (330) Q Consensus 70 ~~~-~~~~~~~~~N~tQIf~~~v~VS~T~~av~~~G~-~~e~a~q~-~k~~~EikrD~E~a~i~g~~~~~~~~r~~~Gi~ 146 (330) .+. ..+..+... .-||++..+.++..-+-+..++. +-.++-++ ..+...+.+.+.+..+.|.+. 
...-||+ T Consensus 111 ad~Pl~~~~v~~~-~~~v~~~~~g~~y~~~E~~~A~~~g~~l~~~Ka~aA~~al~~~~N~i~~~Gd~~-----~~~~GLl 184 (339) T protein:vir:94 111 SANGMSKANVNFE-SRQNYRYQTWTEYGDLEMATYGEAGIDYVARQEISASLVMAKFANSSYLLGVAG-----IANYGLM 184 (339) T ss_pred cCCCcccccceee-EEeEEEEEEEEeecHHHHHHHHhhCCChHHHHHHHHHHHHHHhhceEEeeeecc-----cceEEEE Confidence 222 111111110 22455555555544444444321 22343333 334444455555555666542 2222321 Q ss_pred HHHhcccccccccccccccccccccccccccccccccHHHHHHHHHHHHhc-CCc-----eeEEEeChHHHHHHHHhhcc Q lcl|NC_020858. 147 TWVKTNVSRGATGANGGYNTGTGLTVAPTDGTQRAFSKAIMDDVMQQGYQS-GAN-----FKHVFVSPYVKSVFVTFMSD 220 (330) Q Consensus 147 ~~i~tn~~~g~~g~~~~~~~~~~~~~~~t~gt~~~lTe~~l~~~~~~~~~~-Gg~-----~~~i~v~~~~k~~is~f~~~ 220 (330) .- -|..... +....|. ..|+..+ -++|..++.++|.. ||. +.+|+++|.+...++.= + T Consensus 185 N~--P~l~~~v-~~s~~Wa----------~kT~~eI-~~Di~~~~~~l~~~s~g~~~~~~~~~L~LP~~~~~~L~~~--n 248 (339) T protein:vir:94 185 ND--PSLPAPV-AATVNWA----------TAAPEDI-ANDVVAMVGRLISQSGGLITGQERMVMALAPSALNNVNRT--N 248 (339) T ss_pred eC--CCccccc-cCCCCcc----------cCCHHHH-HHHHHHHHHHHHHhcCCeeeeccCcEEEecHHHHHhcccC--C Confidence 11 0111101 1111111 0111111 25667777888864 442 23689999887776541 1 Q ss_pred ceeeeeeeeecCCcceeEEEEEEEEEcCCeEEEEEEcCcCCCccccccEEEEE-----cchhhhhcccCCcccccccccc Q lcl|NC_020858. 221 TNVASFRYAASNGKNNSIVANADVYEGPFGKVMIHPNRVMAGSGALARNAFFV-----DPEFLQFGWLRKIAEDKKVAKT 295 (330) Q Consensus 221 ~~~~~~r~~~~~~~~~~~~~~v~~~~tdfG~v~iv~nR~mp~~~~~a~~~~~l-----d~~~~~~~~Lr~~~~~e~laKt 295 (330) .... ...+.+..+|-.++|+.-+.+-... .+.+.++ +++-++++.-.|+... +.-+. T Consensus 249 ~~~~---------------Tvl~~lk~n~pnl~i~~~~el~~a~--g~~~~~~~~~~~~~~~~~~~~p~~~~~l-pvq~~ 310 (339) T protein:vir:94 249 NFGL---------------SAGAKIAQTYPNIQFVAVPEFDTAS--GRLVQLWVPEVNGQPTGEVAFAEKLRSH-SIERY 310 (339) T ss_pred cCCc---------------cHHHHHHHhcCCcEEEEccccccCC--CceEEEEEEeccCCcceEEEcchhhhcc-ccEEc Confidence 1100 1111223333334455444443222 1222122 2233333321222221 11122 Q ss_pred ccc-eeeEEEEEEEEEEecchheeEEecc Q lcl|NC_020858. 
296 GDA-EKFMLIGEGALKPKNEKGLGVAADL 323 (330) Q Consensus 296 Gd~-~k~~i~~E~tLe~~N~~a~g~i~gL 323 (330) +.+ +--.+..=.|++|+-|.+.+.+.|| T Consensus 311 ~~~~~v~~~~rt~Gv~i~~P~ai~~~~GI 339 (339) T protein:vir:94 311 STTTRQKHSGATFGAVIYQPWAVTQELGV 339 (339) T ss_pred CceEEecceeeeeeEEEEccceeeeeecC Confidence 222 2222233378999999999999988 Done!