Query         lcl|NC_018271.1_cdsid_YP_006560342.1 [gene=B617_gp02] [protein=hypothetical protein] [protein_id=YP_006560342.1] [location=1232..2149]
Match_columns 305
No_of_seqs    27 out of 30
Neff          5.2
Searched_HMMs 1612
Date          Thu Nov 7 13:11:12 2013
Command       /home/guerois/workspace/virfam/python/lib/hhsearch//hhsearch2 -i .//seq/seq_2 -d /home/guerois/workspace/virfam/python/profile_database/capsid_neck_tail.hhm -glob -cpu 7 -o .//seq/HHR/seq_2_vs_rec_db.hhr

No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 protein:vir:99424 Length: 360 99.9 3.6E-24 2.2E-27 149.1 14.7 283 1-305 19-360 (360)
2 protein:vir:4197 Length: 314 # 99.7 5.7E-19 3.5E-22 120.6 20.1 275 1-305 14-311 (314)
3 protein:vir:4159 Length: 315 # 99.7 5.3E-19 3.3E-22 120.8 17.9 276 1-305 18-315 (315)
4 protein:vir:3158 Length: 321 # 99.7 1.1E-18 7.1E-22 119.0 18.2 276 1-304 15-321 (321)
5 protein:vir:8102 Length: 543 # 99.0 5.3E-11 3.3E-14 76.9 14.7 274 1-305 249-541 (543)
6 protein:vir:7771 Length: 330 # 98.9 1.9E-10 1.2E-13 73.9 16.3 278 1-305 1-322 (330)
7 protein:vir:4092 Length: 390 # 98.9 1.6E-10 1E-13 74.3 14.8 278 1-305 84-367 (390)
8 protein:vir:94142 Length: 304 98.9 3.3E-10 2E-13 72.6 15.5 277 1-305 1-304 (304)
9 protein:vir:105905 Length: 304 98.9 3.3E-10 2E-13 72.6 15.5 277 1-305 1-304 (304)
10 protein:vir:96762 Length: 632 98.8 3.1E-10 1.9E-13 72.7 14.1 272 1-305 357-632 (632)
11 protein:vir:1328 Length: 392 # 98.7 2.4E-09 1.5E-12 67.8 16.1 269 1-305 111-392 (392)
12 protein:vir:4339 Length: 395 # 98.7 1.2E-09 7.6E-13 69.4 14.2 275 1-305 113-394 (395)
13 protein:vir:191 Length: 385 # 98.6 2E-09 1.2E-12 68.3 13.6 266 1-305 105-383 (385)
14 protein:vir:1886 Length: 385 # 98.6 2E-09 1.2E-12 68.3 13.6 266 1-305 105-383 (385)
15 protein:vir:94673 Length: 419 98.6 3.8E-09 2.4E-12 66.7 14.8 279 1-305 121-416 (419)
16 protein:vir:41 Length: 299 # N 98.6 9.6E-09 6E-12 64.5 16.6 263 1-305 6-297 (299)
17 protein:vir:1433 Length: 435 # 98.6 8.6E-09 5.3E-12 64.8 14.8 280 1-305 131-432 (435)
18 protein:vir:4511 Length: 409 # 98.6 2.1E-08 1.3E-11 62.7 17.0 269 1-305 117-405 (409)
19 protein:vir:100135 Length: 418 98.5 7.6E-09 4.7E-12 65.1 14.1 267 1-305 132-414 (418)
20 protein:vir:95376 Length: 425 98.5 1.6E-08 1E-11 63.3 15.1 268 1-305 138-420 (425)
21 protein:vir:105038 Length: 428 98.5 1.5E-08 9.3E-12 63.5 14.9 272 1-305 125-427 (428)
22 protein:vir:9309 Length: 324 # 98.5 3E-08 1.9E-11 61.8 16.1 270 1-305 27-314 (324)
23 protein:vir:103955 Length: 324 98.5 3.2E-08 2E-11 61.7 16.1 270 1-305 27-314 (324)
24 protein:vir:99749 Length: 324 98.5 4.5E-08 2.8E-11 60.9 16.2 270 1-305 27-314 (324)
25 protein:vir:80376 Length: 435 98.4 1.7E-08 1.1E-11 63.2 13.7 269 1-305 130-432 (435)
26 protein:vir:6242 Length: 390 # 98.4 5.2E-08 3.3E-11 60.5 16.4 267 1-305 111-390 (390)
27 protein:vir:81070 Length: 390 98.4 3.4E-08 2.1E-11 61.6 14.8 270 1-304 113-390 (390)
28 protein:vir:5739 Length: 366 # 98.4 9.2E-08 5.7E-11 59.2 17.1 276 1-305 64-365 (366)
29 protein:vir:96392 Length: 324 98.4 7.8E-08 4.9E-11 59.5 15.8 270 1-305 27-314 (324)
30 protein:vir:78830 Length: 324 98.4 7.8E-08 4.9E-11 59.5 15.8 270 1-305 27-314 (324)
31 protein:vir:8187 Length: 311 # 98.4 9.5E-08 5.9E-11 59.1 16.2 274 1-305 1-309 (311)
32 protein:vir:4456 Length: 401 # 98.4 9.2E-08 5.7E-11 59.1 16.1 267 1-305 107-400 (401)
33 protein:vir:97053 Length: 390 98.4 4.1E-08 2.5E-11 61.1 14.0 266 1-304 113-390 (390)
34 protein:vir:96223 Length: 324 98.4 9.2E-08 5.7E-11 59.2 15.8 269 1-305 30-314 (324)
35 protein:vir:97148 Length: 324 98.4 1.1E-07 6.8E-11 58.8 16.1 270 1-305 27-314 (324)
36 protein:vir:102119 Length: 404 98.3 7.9E-08 4.9E-11 59.5 14.9 273 1-305 110-399 (404)
37 protein:vir:78223 Length: 333 98.3 2.3E-07 1.4E-10 57.0 17.0 278 1-305 10-331 (333)
38 protein:vir:104256 Length: 458 98.3 1.5E-07 9.4E-11 58.0 15.8 267 1-305 161-457 (458)
39 protein:vir:10364 Length: 390 98.3 9.3E-08 5.8E-11 59.1 14.0 270 1-304 114-390 (390)
40 protein:vir:95763 Length: 297 98.3 3.6E-07 2.3E-10 55.9 17.2 266 1-305 9-295 (297)
41 protein:vir:485 Length: 407 # 98.2 2.7E-07 1.7E-10 56.6 16.0 267 1-305 106-399 (407)
42 protein:vir:78523 Length: 338 98.2 3.7E-07 2.3E-10 55.8 16.6 279 1-305 10-334 (338)
43 protein:vir:104085 Length: 320 98.2 2.9E-07 1.8E-10 56.4 15.8 270 1-305 14-320 (320)
44 protein:vir:100247 Length: 425 98.2 3E-07 1.9E-10 56.3 15.2 272 1-305 130-423 (425)
45 protein:vir:2344 Length: 397 # 98.2 4.8E-07 3E-10 55.2 16.1 270 1-305 10-305 (397)
46 protein:vir:99920 Length: 311 98.1 9.1E-07 5.6E-10 53.7 15.6 278 1-305 1-310 (311)
47 protein:vir:9509 Length: 381 # 98.0 1.9E-07 1.2E-10 57.4 11.2 273 1-305 76-367 (381)
48 protein:vir:101291 Length: 381 98.0 1.9E-07 1.2E-10 57.4 11.2 273 1-305 76-367 (381)
49 protein:vir:80684 Length: 315 98.0 1.4E-06 8.6E-10 52.7 15.9 271 1-305 1-305 (315)
50 protein:vir:2430 Length: 318 # 98.0 1.4E-06 8.7E-10 52.7 15.4 273 1-305 14-312 (318)
51 protein:vir:97255 Length: 310 97.9 2.6E-07 1.6E-10 56.7 9.5 270 1-305 1-301 (310)
52 protein:vir:78350 Length: 383 97.8 1.2E-06 7.7E-10 53.0 11.8 272 1-305 83-374 (383)
53 protein:vir:4226 Length: 326 # 97.8 8.3E-06 5.2E-09 48.4 16.3 274 1-305 20-322 (326)
54 protein:vir:6212 Length: 434 # 97.7 1.2E-05 7.1E-09 47.7 16.0 261 1-305 141-420 (434)
55 protein:vir:9759 Length: 303 # 97.6 6.2E-06 3.8E-09 49.1 13.7 278 1-305 1-302 (303)
56 protein:vir:9574 Length: 300 # 97.6 9.3E-06 5.8E-09 48.2 14.7 271 1-305 1-299 (300)
57 protein:vir:81227 Length: 413 97.6 7.7E-06 4.8E-09 48.6 13.7 272 1-305 118-409 (413)
58 protein:vir:2504 Length: 305 # 97.6 2.4E-05 1.5E-08 45.9 16.1 269 1-305 1-305 (305)
59 protein:vir:4953 Length: 397 # 97.6 3.6E-05 2.2E-08 44.9 16.7 257 1-305 109-384 (397)
60 protein:vir:100632 Length: 381 97.6 4.8E-06 3E-09 49.8 11.9 270 1-305 76-367 (381)
61 protein:vir:94933 Length: 330 97.5 4.1E-05 2.5E-08 44.6 16.7 269 1-305 25-320 (330)
62 protein:vir:95963 Length: 395 97.5 5.9E-06 3.6E-09 49.3 12.0 269 1-305 86-375 (395)
63 protein:vir:4830 Length: 397 # 97.5 3.7E-05 2.3E-08 44.9 15.9 259 1-305 109-384 (397)
64 protein:vir:94771 Length: 298 97.5 3.1E-05 1.9E-08 45.3 15.3 274 1-305 1-298 (298)
65 protein:vir:4856 Length: 293 # 97.5 6.3E-05 3.9E-08 43.6 17.7 258 1-305 5-280 (293)
66 protein:vir:9410 Length: 415 # 97.4 3.6E-05 2.2E-08 45.0 15.5 269 1-305 120-405 (415)
67 protein:vir:102335 Length: 312 97.4 1.2E-05 7.3E-09 47.6 12.2 269 1-305 1-309 (312)
68 protein:vir:3033 Length: 272 # 97.4 3.8E-05 2.4E-08 44.8 14.6 258 1-305 1-268 (272)
69 protein:vir:9820 Length: 272 # 97.4 3.8E-05 2.4E-08 44.8 14.6 258 1-305 1-268 (272)
70 protein:vir:79987 Length: 415 97.3 7.4E-05 4.6E-08 43.2 16.0 267 1-305 120-405 (415)
71 protein:vir:98339 Length: 415 97.3 7.4E-05 4.6E-08 43.2 16.0 267 1-305 120-405 (415)
72 protein:vir:81100 Length: 415 97.3 7.4E-05 4.6E-08 43.2 16.0 267 1-305 120-405 (415)
73 protein:vir:1638 Length: 298 # 97.3 5.8E-05 3.6E-08 43.8 15.2 276 1-305 1-298 (298)
74 protein:vir:93616 Length: 645 97.3 6.5E-05 4E-08 43.5 15.3 264 1-305 334-638 (645)
75 protein:vir:4600 Length: 415 # 97.3 0.00011 6.7E-08 42.3 16.5 269 1-305 120-405 (415)
76 protein:vir:4700 Length: 415 # 97.3 0.00011 6.7E-08 42.3 16.5 269 1-305 120-405 (415)
77 protein:vir:8420 Length: 477 # 97.3 1.3E-05 8.3E-09 47.3 11.1 272 1-305 157-470 (477)
78 protein:vir:4997 Length: 397 # 97.2 0.00012 7.7E-08 42.0 17.4 253 1-305 109-384 (397)
79 protein:vir:1268 Length: 397 # 97.1 0.00014 8.4E-08 41.8 15.6 260 1-305 123-396 (397)
80 protein:vir:80128 Length: 466 97.1 2.4E-05 1.5E-08 45.9 11.3 273 1-305 147-447 (466)
81 protein:vir:1383 Length: 421 # 97.1 0.00015 9E-08 41.6 15.0 255 1-305 114-385 (421)
82 protein:vir:1025 Length: 408 # 97.0 0.00019 1.2E-07 40.9 17.2 255 1-305 116-394 (408)
83 protein:vir:105464 Length: 346 97.0 3.7E-05 2.3E-08 44.9 11.4 270 1-305 1-301 (346)
84 protein:vir:9643 Length: 377 # 97.0 5.9E-05 3.7E-08 43.8 12.2 268 1-305 79-376 (377)
85 protein:vir:98635 Length: 377 96.9 2.9E-05 1.8E-08 45.4 10.3 269 1-305 79-376 (377)
86 protein:vir:102605 Length: 273 96.9 0.00024 1.5E-07 40.4 15.3 262 1-305 1-272 (273)
87 protein:vir:105822 Length: 273 96.9 0.00024 1.5E-07 40.4 15.3 262 1-305 1-272 (273)
88 protein:vir:78090 Length: 302 96.9 0.00012 7.4E-08 42.1 13.6 265 1-305 1-299 (302)
89 protein:vir:102873 Length: 392 96.8 0.00031 1.9E-07 39.8 16.8 257 1-305 106-385 (392)
90 protein:vir:105004 Length: 392 96.8 0.00031 1.9E-07 39.8 16.8 257 1-305 106-385 (392)
91 protein:vir:107593 Length: 392 96.8 0.00031 1.9E-07 39.8 16.8 257 1-305 106-385 (392)
92 protein:vir:102082 Length: 392 96.8 0.00031 1.9E-07 39.8 16.8 257 1-305 106-385 (392)
93 protein:vir:101607 Length: 379 96.8 0.00036 2.2E-07 39.4 16.6 258 1-305 109-378 (379)
94 protein:vir:93881 Length: 387 96.7 0.00041 2.6E-07 39.1 17.0 254 1-305 118-383 (387)
95 protein:vir:9361 Length: 402 # 96.7 0.00042 2.6E-07 39.1 17.3 253 1-304 133-402 (402)
96 protein:vir:101650 Length: 497 96.6 0.00028 1.7E-07 40.1 13.3 275 1-305 151-492 (497)
97 protein:vir:7855 Length: 497 # 96.6 0.00028 1.7E-07 40.1 13.3 275 1-305 151-492 (497)
98 protein:vir:81160 Length: 371 96.5 0.00058 3.6E-07 38.3 17.2 256 1-305 91-370 (371)
99 protein:vir:1084 Length: 437 # 96.5 0.00058 3.6E-07 38.3 15.7 256 1-305 156-429 (437)
100 protein:vir:2685 Length: 387 # 96.4 0.00063 3.9E-07 38.1 17.7 254 1-305 118-380 (387)
101 protein:vir:94424 Length: 387 96.4 0.00063 3.9E-07 38.1 17.7 254 1-305 118-380 (387)
102 protein:vir:96978 Length: 387 96.4 0.00063 3.9E-07 38.1 17.7 254 1-305 118-380 (387)
103 protein:vir:7409 Length: 408 # 96.4 0.00067 4.1E-07 38.0 17.5 257 1-305 116-392 (408)
104 protein:vir:97397 Length: 517 96.2 0.00058 3.6E-07 38.3 13.1 263 1-305 235-515 (517)
105 protein:vir:7990 Length: 273 # 95.8 0.0014 8.9E-07 36.2 15.5 262 1-305 1-272 (273)
106 protein:vir:3845 Length: 395 # 95.7 0.0016 9.6E-07 36.0 17.2 256 1-305 105-384 (395)
107 protein:vir:3991 Length: 404 # 95.6 0.0017 1.1E-06 35.7 16.5 259 1-305 116-394 (404)
108 protein:vir:79008 Length: 299 95.4 0.0021 1.3E-06 35.2 13.6 267 1-305 1-298 (299)
109 protein:vir:78640 Length: 352 95.0 0.003 1.9E-06 34.4 17.5 254 1-305 83-345 (352)
110 protein:vir:93742 Length: 274 94.7 0.0036 2.3E-06 34.0 16.9 260 1-305 1-269 (274)
111 protein:vir:3783 Length: 336 # 94.7 0.0036 2.3E-06 34.0 12.7 270 1-298 13-336 (336)
112 protein:vir:79157 Length: 339 94.7 0.0037 2.3E-06 33.9 12.8 271 1-296 16-339 (339)
113 protein:vir:80930 Length: 278 94.2 0.005 3.1E-06 33.2 16.6 264 1-305 1-276 (278)
114 protein:vir:98856 Length: 343 92.9 0.0038 2.4E-06 33.8 9.1 276 1-302 16-343 (343)
115 protein:vir:78935 Length: 335 92.6 0.011 6.6E-06 31.4 12.2 263 1-305 1-299 (335)
116 protein:vir:270 Length: 341 # 92.5 0.011 6.8E-06 31.3 11.2 278 1-305 1-333 (341)
117 protein:vir:962 Length: 397 # 92.4 0.012 7.2E-06 31.2 14.0 247 1-305 132-397 (397)
118 protein:vir:100172 Length: 394 92.3 0.012 7.5E-06 31.1 17.5 254 1-305 111-385 (394)
119 protein:vir:3870 Length: 400 # 92.2 0.013 7.8E-06 31.0 17.0 251 1-305 133-400 (400)
120 protein:vir:6324 Length: 335 # 92.0 0.013 8.4E-06 30.8 11.1 262 1-305 15-299 (335)
121 protein:vir:100057 Length: 375 91.8 0.0063 3.9E-06 32.6 8.9 271 1-305 1-319 (375)
122 protein:vir:3746 Length: 336 # 91.7 0.015 9.1E-06 30.6 12.7 270 1-298 13-336 (336)
123 protein:vir:79712 Length: 285 91.3 0.016 1E-05 30.4 14.7 261 1-305 1-283 (285)
124 protein:vir:94576 Length: 347 91.0 0.01 6.2E-06 31.5 9.1 256 1-305 1-299 (347)
125 protein:vir:1153 Length: 338 # 90.1 0.022 1.4E-05 29.6 11.6 271 1-296 16-338 (338)
126 protein:vir:9704 Length: 394 # 89.9 0.024 1.5E-05 29.5 17.4 248 1-305 127-391 (394)
127 protein:vir:96123 Length: 274 89.7 0.025 1.5E-05 29.4 15.1 259 1-305 1-269 (274)
128 protein:vir:100331 Length: 342 89.5 0.026 1.6E-05 29.3 11.7 269 1-296 16-342 (342)
129 protein:vir:103323 Length: 364 89.2 0.019 1.1E-05 30.1 9.1 274 1-305 15-339 (364)
130 protein:vir:1829 Length: 355 # 88.8 0.03 1.9E-05 28.9 12.0 276 1-305 16-350 (355)
131 protein:vir:6061 Length: 357 # 87.3 0.04 2.5E-05 28.3 12.3 275 1-305 16-350 (357)
132 protein:vir:8885 Length: 347 # 85.8 0.016 1E-05 30.4 6.7 281 1-305 1-347 (347)
133 protein:vir:10450 Length: 344 85.8 0.022 1.3E-05 29.7 7.4 274 1-304 1-344 (344)
134 protein:vir:78777 Length: 358 85.4 0.053 3.3E-05 27.6 11.8 276 1-305 1-345 (358)
135 protein:vir:97031 Length: 402 84.9 0.039 2.4E-05 28.3 8.3 253 1-305 1-285 (402)
136 protein:vir:5694 Length: 357 # 84.4 0.061 3.8E-05 27.2 12.3 280 1-305 16-355 (357)
137 protein:vir:100884 Length: 389 83.9 0.065 4E-05 27.1 16.2 254 1-305 109-383 (389)
138 protein:vir:2016 Length: 357 # 83.8 0.066 4.1E-05 27.1 12.3 275 1-305 16-350 (357)
139 protein:vir:97433 Length: 274 83.2 0.071 4.4E-05 26.9 17.9 259 1-305 1-269 (274)
140 protein:vir:94494 Length: 274 83.2 0.071 4.4E-05 26.9 17.9 259 1-305 1-269 (274)
141 protein:vir:99523 Length: 311 81.4 0.086 5.3E-05 26.4 10.4 266 1-304 1-311 (311)
142 protein:vir:104011 Length: 337 81.1 0.089 5.5E-05 26.4 12.0 269 1-297 16-337 (337)
143 protein:vir:98566 Length: 355 80.5 0.094 5.8E-05 26.2 11.7 278 1-305 16-350 (355)
144 protein:vir:79171 Length: 337 79.3 0.11 6.6E-05 25.9 12.0 269 1-297 16-337 (337)
145 protein:vir:78186 Length: 337 75.2 0.15 9.2E-05 25.1 11.7 269 1-297 16-337 (337)
146 protein:vir:78920 Length: 290 74.0 0.16 0.0001 24.9 13.9 262 1-305 1-289 (290)
147 protein:vir:2201 Length: 345 # 71.3 0.2 0.00012 24.4 11.0 258 1-305 1-297 (345)
148 protein:vir:107120 Length: 329 66.8 0.26 0.00016 23.8 14.1 260 1-305 30-304 (329)
149 protein:vir:3364 Length: 347 # 61.9 0.35 0.00021 23.1 10.4 262 1-305 1-297 (347)
150 protein:vir:97331 Length: 319 61.1 0.36 0.00022 23.0 13.4 259 1-305 19-295 (319)
151 protein:vir:94800 Length: 319 61.1 0.36 0.00022 23.0 13.4 259 1-305 19-295 (319)
152 protein:vir:4074 Length: 480 # 56.8 0.45 0.00028 22.5 7.5 272 1-305 171-475 (480)
153 protein:vir:1239 Length: 274 # 54.3 0.51 0.00031 22.2 17.0 259 1-305 1-269 (274)
154 protein:vir:78739 Length: 332 53.8 0.52 0.00032 22.1 10.8 267 1-304 1-332 (332)
155 protein:vir:105334 Length: 276 53.7 0.52 0.00032 22.1 15.5 256 1-305 1-269 (276)
156 protein:vir:103759 Length: 330 51.4 0.58 0.00036 21.9 12.2 226 1-246 1-330 (330)
157 protein:vir:95898 Length: 274 47.0 0.72 0.00044 21.4 17.0 259 1-305 1-269 (274)
158 protein:vir:96262 Length: 274 47.0 0.72 0.00044 21.4 17.0 259 1-305 1-269 (274)
159 protein:vir:95318 Length: 328 39.5 1 0.00063 20.5 13.0 225 1-246 1-328 (328)
160 protein:vir:94622 Length: 341 38.9 1 0.00065 20.5 12.7 273 1-305 3-339 (341)
161 protein:vir:80068 Length: 301 37.6 1.1 0.00069 20.3 9.9 268 1-305 1-294 (301)
162 protein:vir:94711 Length: 347 37.6 1.1 0.00069 20.3 11.5 272 1-305 1-346 (347)
163 protein:vir:80180 Length: 381 36.9 1.1 0.00071 20.3 10.6 274 1-305 1-309 (381)
164 protein:vir:1541 Length: 347 # 32.4 1.4 0.00088 19.7 7.1 277 1-305 1-345 (347)
165 protein:vir:348 Length: 321 # 28.7 1.7 0.0011 19.3 9.1 265 1-289 1-321 (321)
166 protein:vir:3613 Length: 272 # 27.7 1.8 0.0011 19.2 12.8 260 1-305 1-271 (272)
167 protein:vir:7324 Length: 335 # 27.6 1.8 0.0011 19.2 12.8 224 1-247 1-335 (335)
168 protein:vir:106647 Length: 303 22.0 2.5 0.0016 18.4 6.3 238 1-305 15-284 (303)
169 protein:vir:96833 Length: 275 20.9 2.7 0.0017 18.2 15.0 256 1-305 3-272 (275)

No 1 >protein:vir:99424 Length: 360 # NCBI annotation: hypothetical protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:1595 # MgeName: BJ1 # Cross-refs: genbank:acc:YP_919080;genbank:gi:119757038;genbank:GeneID:4606077 Probab=99.86 E-value=3.6e-24 Score=149.12 Aligned_cols=283 Identities=12% Similarity=0.132 Sum_probs=183.0 Q ss_pred CceEeee----ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccc-cCCCC--CCCCccceEe- Q lcl|NC_018271. 1 MATTVDI----TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDF-VDYSC--GFTPSGEVDI- 72 (305) Q Consensus 1 ma~~~~~----~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~-q~~~~--~~~~~G~~~~- 72 (305) -..+|++ ...=..+++++++..+..+..+.++ ||+++--+++ .-+.++..|..+ .+.+. .++..+++.. 
T Consensus 19 ~k~~it~~~l~~g~L~p~~a~~Fl~~v~~~t~iL~~--~r~~~~~s~~-~ei~kig~G~r~~r~~~e~~~~~~~~~~~~~ 95 (360) T protein:vir:99 19 SQKDIGLAELDGFQLPVDVTEEFLERMQKGVQILGM--ADTMTLARLE-MEVPQFGVPRLSGHTRDEEGSRTENSEAESG 95 (360) T ss_pred HhhhccccccCceeecHHHHHHHHHHHhhccchhhh--cceeeccccc-ccccccccceeeccccccCCCCCcCCcCccc Confidence 0001111 3477899999999999999999977 9999766555 456777776655 33422 3333344433 Q ss_pred cceeeeeeeeeEeeccCHHHH-HHHHHHHhcCCCCcccccCCHHH-HHHHHHHHHHHHHhhhhhhccccCCcc------- Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDF-RQLWTAAEMGFSAFNDNGLPSTE-QGFMLTDMGNRLARKIDKDIWQGDGTT------- 143 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~-~~~w~~~~~~~g~~~~~~LP~~~-q~~~l~~l~~~ia~ei~~~~~~GD~s~------- 143 (305) +-+.++.+.+-....+.-.+. ++.|+ |... ++.++.+|+++++.+++...++||++. T Consensus 96 ~v~~~~~~~~~~~~~i~~~~~~~n~~~--------------~~~~f~~~i~~~~ae~~~~Dle~l~~~g~~ds~d~~~~~ 161 (360) T protein:vir:99 96 SVKFNATDKSYYILVEPKRDALKNTHY--------------GPDQFGDYIVDQFIERYGNDLGLMGIRAGASSGNLQSIG 161 (360) T ss_pred cCccccccceeeEeechHHHHHhhhhc--------------ccchhHHHHHHHHHHHHHHHHHHHHhhccchhcccccCc Confidence 222333443334443322221 23332 2222 356778889999999999999888652 Q ss_pred ------chhHHHHHHHhhccceE-------EeccCc----CcCC-------------hhhHHHHHHHHHHhccHHHHhCC Q lcl|NC_018271. 144 ------GNLQGILPLLEADATVI-------DVVGAS----GGIT-------------AANVEAELGKFIDAHTDEILQAP 193 (305) Q Consensus 144 ------~~fdG~lk~i~~d~~~~-------~~~~~~----~~iT-------------~anv~~~l~~~~~~iP~~~r~~~ 193 (305) +..|||+|++.++.+.+ ++.... .+++ .+.-.+.|+++.+.+|.+||+.+ T Consensus 162 ~~d~fl~~~dGwlKka~~~~~~id~a~d~t~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~lf~~~~~~Lp~kyr~~~ 241 (360) T protein:vir:99 162 GAAELDNTFKGWIARAEGDAQSVDDAGDSTRIGLEDTATADADSMPSIANTDGSGNPQPVDTSLFNETIQTLDSRYRESD 241 (360) T ss_pred ccchhhhhhHHHHHHhhcccchhhccccccccccccccccccccchhhhccccccccccchHHHHHHHHHhcchhhhcCc Confidence 22699999997665432 211110 0011 12246779999999999999876 Q ss_pred --CcEEEecHHHHHHHHHHHhhhhccCCcccCC--CcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcccc Q lcl|NC_018271. 
194 --NHVFGVSTNVIRAIKRAYGTQARSNGTFLNP--NEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIK 269 (305) Q Consensus 194 --~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~--~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~ 269 (305) +++++||.+.++.|.++|++|...-++..-. +.-.|+|++|++++.||++.+++|+..||++|+. ..+||+ T Consensus 242 ~~~~~~~~s~~~~~~yr~~L~~R~t~LGd~~l~g~~~~~~~Gipi~~v~~~pd~~~mlT~p~NLi~g~~-----~~iri~ 316 (360) T protein:vir:99 242 AYSPVLMTSPNQVQSYTMSLTEREDPLGSAVIFGDSDITPFSYDLVGVNGFPDEYMMFTDPNNLAFGLY-----EEMELD 316 (360) T ss_pred ccceEEEccCchHHHHHHHHhccCcccchhheecccccccceeeeEEcCCCCCCceEEeccCceeEEee-----eeeEEe Confidence 5699999999999999999999766664322 3346999999999999999999999999998873 356665 Q ss_pred ceeeec--cceeEEEEEEeecceeeccCC-eEEEe-----cCCC Q lcl|NC_018271. 270 DMGDVD--LSGQIRTKMVLSAGVEYAYGA-EIVLY-----TPAA 305 (305) Q Consensus 270 ~~~~~~--~~~~~f~k~~m~~d~~i~fg~-E~v~~-----~~~~ 305 (305) ....-+ ...+++.+.+|.+|+-..|++ |.|++ +|.| T Consensus 317 ~~~e~~~~~~~~~~~~~~~~~~~D~~iee~~Av~~vt~~~~~~~ 360 (360) T protein:vir:99 317 QSTDTDKVHEQRLHSRNWLEGQFDFQIKEQQAGVLVTDLETPTA 360 (360) T ss_pred ecccchhhhhhceeeeEEEEEEeeEEEEecccEEEEecCCCCCC Confidence 332111 123455566666666666666 35655 3555 No 2 >protein:vir:4197 Length: 314 # NCBI annotation: putative structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:88 # MgeName: psiM100 # Cross-refs: genbank:acc:NP_071822;genbank:gi:11863105;genbank:GeneID:1257607 Probab=99.73 E-value=5.7e-19 Score=120.62 Aligned_cols=275 Identities=13% Similarity=0.160 Sum_probs=187.7 Q ss_pred CceEeeeecc-cchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccc-cCCC---CCC-CCccceEecc Q lcl|NC_018271. 1 MATTVDITTN-YVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDF-VDYS---CGF-TPSGEVDINE 74 (305) Q Consensus 1 ma~~~~~~~~-Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~-q~~~---~~~-~~~G~~~~~~ 74 (305) | ++.+..+. -..+...+++..+...+.+.+. ++|++=++..+..++++..+-.+ .... ... 
.+-++..|++ T Consensus 14 i-t~~d~~gG~L~P~~~~~~i~~l~e~s~i~~~--a~vi~t~~s~~~~i~~i~~g~~~~~~~~~~~~~~~~~~~~~tf~~ 90 (314) T protein:vir:41 14 I-DVPDLGKGILAVQRFGEFVREVRENSAIIKD--ARVLNALKSYEVDISRISLGVELEPGRNTSGTKVAPTADEVTVST 90 (314) T ss_pred c-ccccCCCceeChHHHHHHHHHHHhccchhhh--eeeecccCccceeecccccCcccccccccccCCccCCcccccccc Confidence 3 22233333 4568888888778888888887 88876444443446655443222 2221 111 1236778999 Q ss_pred eeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc-------chhH Q lcl|NC_018271. 75 KQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT-------GNLQ 147 (305) Q Consensus 75 K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~-------~~fd 147 (305) ..|.++.+..... .++++.+-|- + -+..+.++..+|+++++......+++||++. +-.+ T Consensus 91 ~~l~~~kl~~~v~-is~e~L~D~a-----------~--~~~le~~i~~~~Ae~~g~~~~~~~~nGdg~~~s~~~~~~~p~ 156 (314) T protein:vir:41 91 NTLEMKELVTKVV-LEDEALEDNI-----------E--QSAFEQTITSLLASGVTYDLECFFLHADSSLTTGRELYRIND 156 (314) T ss_pred eeeeeEEEEEeec-ccHHHHHhhh-----------c--hhhHHHHHHHHHHHHHHHHHHHHhhccccCCcCcccchhcch Confidence 9999999888765 4444443221 0 1345678889999999999999999999852 3469 Q ss_pred HHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCC-CcEEEecHHHHHHHHHHHhhhhccCCccc--CC Q lcl|NC_018271. 148 GILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAP-NHVFGVSTNVIRAIKRAYGTQARSNGTFL--NP 224 (305) Q Consensus 148 G~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~-~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t--~~ 224 (305) ||++.+. ..++.. +..+.++..+.|.+++.++|..||+++ +++++||.+++.+|...+..+.+...+.. 
.+ T Consensus 157 G~l~~a~--~~~~~~----~~~~~~~~~~~~~~l~~sl~~~yr~~~~~~~~~m~~~t~~~~r~~l~~~~~~l~~~~~~~~ 230 (314) T protein:vir:41 157 GWMKLAG--NQYTDA----EPEDENWPLNLFDGMMDELDTRYLQLKPRMKFYVSNEIYNGYRKQLLVRETGLGDSALIGA 230 (314) T ss_pred hhhhhcc--cceeec----CccccccHHHHHHHHHHhcCchhhcCCCceEEEecHHHHHHHHHHHhccCCcccchhhhCC Confidence 9998742 222221 123456778999999999999999874 69999999999999999977766555443 33 Q ss_pred CcceecceeeeeccCC-----CCCeEEEecchHHhhhhhhhhhhhhccccceeeeccce-eEEEEEEeecceeeccCC-e Q lcl|NC_018271. 225 NEFDFEGYTLTEIKGL-----PASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSG-QIRTKMVLSAGVEYAYGA-E 297 (305) Q Consensus 225 ~~~~~kGi~iv~l~~~-----Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~-~~f~k~~m~~d~~i~fg~-E 297 (305) ....+.|++++.++.| |++.|++|+.+||++++. ..|||++--. +.. ++.|...+..|+.|.+.+ . T Consensus 231 ~~~~l~G~PV~~~~~~~~~~~~~~~i~fgd~~nlv~~~~-----~~ir~~~~~~--a~~~~~~~~~~~r~d~~~~~~~aa 303 (314) T protein:vir:41 231 TGLQYDGIPIQYVPALDALGDDKARALLTVPTNLVYGFW-----RNIRIEPKRD--AAMRRTEYIASLRADCNYEDENAA 303 (314) T ss_pred CCceecceeeEecccccccCCCCceEEEechhheEEEee-----ceeEEeeccc--CcCCeEEEEEEEEeceEEEEcCcE Confidence 4456999999999776 678999999999987553 3466654333 444 588888899999997653 3 Q ss_pred EEEecCCC Q lcl|NC_018271. 298 IVLYTPAA 305 (305) Q Consensus 298 ~v~~~~~~ 305 (305) ++.+--.+ T Consensus 304 ~~~~~~~~ 311 (314) T protein:vir:41 304 VAAVIDMS 311 (314) T ss_pred EEEEeecc Confidence 44443333 No 3 >protein:vir:4159 Length: 315 # NCBI annotation: structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:87 # MgeName: psiM2 # Cross-refs: genbank:acc:NP_046968;genbank:gi:9630538;genbank:GeneID:1261712 Probab=99.71 E-value=5.3e-19 Score=120.77 Aligned_cols=276 Identities=16% Similarity=0.156 Sum_probs=179.5 Q ss_pred CceEeeeeccc-chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccc-cCCCCCCC------CccceEe Q lcl|NC_018271. 
1 MATTVDITTNY-VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDF-VDYSCGFT------PSGEVDI 72 (305) Q Consensus 1 ma~~~~~~~~Y-~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~-q~~~~~~~------~~G~~~~ 72 (305) .-++-+....| ..+.+++++.++.-..++-+. ++|++-++..+..+.++..+.-. .+. .|+ +-++.+| T Consensus 18 ~~t~~d~~Gg~l~P~~~~~~i~~~~e~s~~l~~--~~vi~~~~~~~~~i~~~g~~~~~~~g~--~~~~~~~~~~~~~~~f 93 (315) T protein:vir:41 18 KIDVPDLGRGVLSVDRFGEFVKAVRDSAVIIPE--ARIDNALKSYEKDISRLSLVLDVGPGR--DETGQKLAPPESTAEV 93 (315) T ss_pred hcCCcCCCCceechHHHHHHHHHHHhhhhhhhh--ceeeeccccccccccccccCccccccc--ccccCcCCCCCCcccc Confidence 11222333344 577888888777777777666 88765554332223333322211 122 232 2356789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc-----chhH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT-----GNLQ 147 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~-----~~fd 147 (305) ++..|.++.+.++.. .++++..-|.. -+..|.++..+++++++...+..+++||+++ +..+ T Consensus 94 ~~~~l~~~~l~~~~~-it~elL~D~~~-------------~~~~e~~l~~~~a~~~a~~~~~~~~nGdg~s~~p~~~~~~ 159 (315) T protein:vir:41 94 KTNTLYMREMVTKVV-IHEDAIEDNIE-------------GKAFEQKIVTLLGEGISYVLEKYYLHGDTSSSDPLLRMSD 159 (315) T ss_pred ceeeeceeeeeeecc-ccHHHHHhhhc-------------cccHHHHHHHHHHHHHHHHHHHHhhccCCcCcCccccccc Confidence 999999998888754 33333332220 1236688999999999999999999999864 3459 Q ss_pred HHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhC-CCcEEEecHHHHHHHHHHHhhhhccCCc--ccCC Q lcl|NC_018271. 148 GILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQA-PNHVFGVSTNVIRAIKRAYGTQARSNGT--FLNP 224 (305) Q Consensus 148 G~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~-~~l~~f~S~~~~d~Y~d~~~~~~~k~~~--~t~~ 224 (305) ||++.+..........+... 
....+.|.+++.++|..||++ ++++++||.+.+.+|+..+..+..-..+ ...+ T Consensus 160 G~l~~a~~~~~~~~~~~~a~----~~~~d~l~~l~~sl~~~yr~~~~~~~~imn~~t~~~~rklk~~~g~~lw~~~~~~g 235 (315) T protein:vir:41 160 GWLKLASEKLTESDVDPEAE----DWPMNLFDTMIESLPTPYRNNLPNMKFYVTWDIYRAYRDALKGRETGLGDQALTGA 235 (315) T ss_pred cceecccccccccccccccc----cccHHHHHHHHHhcChHHhhcCCceEEEEcHHHHHHHHHHhccCCCccccchhhcC Confidence 99998765544333222222 223678999999999999986 5799999999999998877555433222 2334 Q ss_pred CcceecceeeeeccCC-----CCCeEEEecchHHhhhhhhhhhhhhccccceeeeccc-eeEEEEEEeecceeeccCCeE Q lcl|NC_018271. 225 NEFDFEGYTLTEIKGL-----PASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLS-GQIRTKMVLSAGVEYAYGAEI 298 (305) Q Consensus 225 ~~~~~kGi~iv~l~~~-----Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~-~~~f~k~~m~~d~~i~fg~E~ 298 (305) ....+.|++++.++.| |+..|+.|+.+|+++|+. ..++|++... +. .+++|.+.+..|..++..+-. T Consensus 236 ~~~tl~G~PV~~~~~m~~~~~~~~~ilf~d~~nl~~~~~-----~~i~i~~~~~--a~~~~~~~~~~~r~d~~~~~~~~~ 308 (315) T protein:vir:41 236 NSILYDGRPVQYVPALEALNDGKSRALFVVPTQLVYGFW-----RNIKVVPDYD--AEMRLTKYVASLRTDNHYEDEEGA 308 (315) T ss_pred CCceecccceEecccccccCCCCccEEEecccceEEEec-----cccEEEeeec--CCCCceEEEEEEEeceeEEeccce Confidence 4557999999988777 578999999999987764 3467764432 33 448888888888877664433 Q ss_pred EEecCCC Q lcl|NC_018271. 299 VLYTPAA 305 (305) Q Consensus 299 v~~~~~~ 305 (305) |+-+=-- T Consensus 309 a~~~~~v 315 (315) T protein:vir:41 309 VSATITV 315 (315) T ss_pred eEeeeeC Confidence 3322222 No 4 >protein:vir:3158 Length: 321 # NCBI annotation: capsid protein gpE # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:316 # MgeName: PhiCh1 # Cross-refs: genbank:acc:NP_665929;genbank:gi:22091115;genbank:GeneID:951342 Probab=99.70 E-value=1.1e-18 Score=118.95 Aligned_cols=276 Identities=11% Similarity=0.115 Sum_probs=176.0 Q ss_pred CceEeeee---ccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccc-cCC-CCCC-CCccceEe Q lcl|NC_018271. 
1 MATTVDIT---TNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDF-VDY-SCGF-TPSGEVDI 72 (305) Q Consensus 1 ma~~~~~~---~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~-q~~-~~~~-~~~G~~~~ 72 (305) =++.++.. ..| ..+...+++..+....++.+. |+|+| |+....-++....+... +-- ...+ ...++..+ T Consensus 15 ~~~~~~~~~~~~g~~v~~~~~~~l~~~i~e~s~~l~~--i~v~~-v~~~~~~i~~~~~~~~~~~~~~e~~~~~~~~~~~~ 91 (321) T protein:vir:31 15 EKNALTVDDLDAGGTLPDPLWDEFWTDMIEETPLLDA--IRTET-VGAKKTRIPTLNIGERHRRPQDEGEWNENESDVST 91 (321) T ss_pred HhccccccccCCcceeCHHHHHHHHHHHHHhhhhhhh--ceeee-ccCcceeeeeeccCCccccccccccccccccccee Confidence 12233222 122 346677788888877777766 88887 43333223333322222 100 1112 12356789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccch-----hH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGN-----LQ 147 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~-----fd 147 (305) ++..+.++.+.+...+ +.++..-|- .-+..+.++...++++++..+...+|+||+++.+ .+ T Consensus 92 ~~~~~~~~k~~~~~~i-t~e~L~d~a-------------~~~d~e~~i~~~ia~~~a~~~~~~~~nGd~~~~~~~~~~n~ 157 (321) T protein:vir:31 92 GTIDISTEKATVAWDL-PREVVQENP-------------EGEALADRILNLMTDAWSADVEDLAANGDEDAEDSFENQND 157 (321) T ss_pred eeeeeeeEEEEeehhc-cHHHHHhhh-------------cchhHHHHHHHHHHHHHHHHHHhheeeccccCCCcccccch Confidence 9999999999887765 344333332 1234678888999999999999999999986543 48 Q ss_pred HHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc--cCCC Q lcl|NC_018271. 148 GILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF--LNPN 225 (305) Q Consensus 148 G~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~--t~~~ 225 (305) ||++.+.+...++.. ..+.. -.+.|.+++..||..+|++++++++||.+.+.+|...+..+.+...+. +.+. 
T Consensus 158 G~l~~a~~~~~~~~~--~~~~~----~~d~l~~l~~~l~~~yr~~~~~v~im~~~~~~~~~~~l~~~~~~~~~~~l~~~~ 231 (321) T protein:vir:31 158 GFITVAEGDVETIDA--ADDIL----DNDLVIRTIAGLDSKYRARMNPALIVSEDQLLSYHYTLTDRDTPLGDNVIMGEA 231 (321) T ss_pred hhhhhhccccccccc--ccccc----CHHHHHHHHHhccHhHhcCCCeEEEechHHHHHHHHHHhcCCCccccchhhccc Confidence 999998777664431 22222 357788999999999999999999999999999998886655433222 2333 Q ss_pred cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeec--ccee--EEEEEEeecceee--------- Q lcl|NC_018271. 226 EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVD--LSGQ--IRTKMVLSAGVEY--------- 292 (305) Q Consensus 226 ~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~--~~~~--~f~k~~m~~d~~i--------- 292 (305) ...+.|++++.++.||++.|++|+-+||.+|+. ..+++.+....+ ...+ |.+-+....|+.| T Consensus 232 ~~tl~G~pvv~~~~mP~~~il~t~~~nl~~~~~-----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ve~~~a~a~~ 306 (321) T protein:vir:31 232 DVNPFSFPIIGSGLWPDDKAMFTDPQNLIYALY-----RDLEIDVLTESDKVSERDLHARYFMRGDDDFAIENTEAVVLA 306 (321) T ss_pred cccccceeEEEcCCCCCCcEEEeccccEEEEEe-----eccEEEEeecCccccccceeeEeeeeeecceeEeccccEEEE Confidence 446999999999999999999999999987752 234444433211 1122 3333334455554 Q ss_pred ---ccCCeEEEecCC Q lcl|NC_018271. 293 ---AYGAEIVLYTPA 304 (305) Q Consensus 293 ---~fg~E~v~~~~~ 304 (305) +=+.|.|.=+|| T Consensus 307 ~~i~~~~~~~~~~~~ 321 (321) T protein:vir:31 307 EGLGDPLEHLEEETS 321 (321) T ss_pred ecCCcchhcccCCCC Confidence 222344444444 No 5 >protein:vir:8102 Length: 543 # NCBI annotation: gp6 # Family: family:all:21 # MgeID: mge:152 # MgeName: Che9c # Cross-refs: genbank:acc:NP_817683;genbank:gi:29566114;genbank:GeneID:1259308 Probab=98.97 E-value=5.3e-11 Score=76.91 Aligned_cols=274 Identities=11% Similarity=0.072 Sum_probs=165.0 Q ss_pred CceEeee-eccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEeccee Q lcl|NC_018271. 
1 MATTVDI-TTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~-~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~ 76 (305) .+...+- .+.| . .++..++|..++.....+.. ++++.++ +... .+++...+....--.. ..-+.++.+|.+.. T Consensus 249 ~~~~~t~~~gg~lip~~~~~~ii~~~~~~~~~l~~-~~~~~~~-~g~~-~~~~~~~~~~a~~v~Eg~~~~~~~~~~~~i~ 325 (543) T protein:vir:81 249 RAMGLTKADGGYLVPFQLDPTVIITSNGSLNDIRR-FARQVVA-TGDV-WHGVSSAAVQWSWDAEFEEVSDDSPEFGQPE 325 (543) T ss_pred hhcccccccCcccCchhhhhHHHHHHHhhhchhhh-hcccccC-Ccce-EEEEecCCcceeecccCccccccccccceee Confidence 2222211 1222 1 24455666666555433333 3566554 3443 3555444433321111 22334677899999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) +.++++.+...++= ++ +. + .+..+.++...|...++..++..++.||++.+.+.|+++..... T Consensus 326 ~~~~k~~~~~~is~-el--------l~-------d-~~~~~~~i~~~l~~~~~~~~d~ail~G~Gt~~~p~Gi~~~~~~~ 388 (543) T protein:vir:81 326 IPVKKAQGFVPISI-EA--------LQ-------D-EANVTETVALLFAEGKDELEAVTLTTGTGQGNQPTGIVTALAGT 388 (543) T ss_pred eeeeeeEeeehhhH-HH--------Hh-------c-cHHHHHHHHHHHHHHHHHHHHHHHhccCCCCcccccchhhcccc Confidence 99999999887653 21 11 1 23567888899999999999999999999988999998875433 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC--cccCCCcceecceee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG--TFLNPNEFDFEGYTL 234 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~--~~t~~~~~~~kGi~i 234 (305) ...+ .+...+++| .+.+.++..+++..++. +-.|+||...|.+.+. ++.-.|.+. 
+..++....+.|+++ T Consensus 389 ~~~~-~~~~~~~~~----~~~~~~~~~~l~~~~~~--~~~~v~n~~~~~~l~~-lkd~~G~~l~~~~~~g~~~~l~G~pv 460 (543) T protein:vir:81 389 AAEI-APVTAETFA----LADVYAVYEQLAARHRR--QGAWLANNLIYNKIRQ-FDTQGGAGLWTTIGNGEPSQLLGRPV 460 (543) T ss_pred cccc-ccccccccc----HHHHHHHHHhhhccccC--CcEEEEcHHHHHHHHH-hhcCCCceeccCcCCCCCccccceee Confidence 3222 122233344 56666777778887764 3589999999877654 332222211 122333457999999 Q ss_pred eeccCCCCCe----------EEEecchHHhhhhhhhhhhhhccccce---eeeccceeEEEEEEeecceeeccCCeEEEe Q lcl|NC_018271. 235 TEIKGLPASR----------MVGYNRDNIVIGMSAQSDFNEIRIKDM---GDVDLSGQIRTKMVLSAGVEYAYGAEIVLY 301 (305) Q Consensus 235 v~l~~~Pd~~----------ii~T~~sNl~~gvnl~~D~n~I~I~~~---~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~ 301 (305) +-...||.+. |+..+-+++++|.+- ++. |+++.- ...+..+..-+.+.+-+|+.+..++=||+. T Consensus 461 ~~~~~~~~~~~~~~~~~~~~i~~gd~~~~~i~~~~--~~~-i~~~~~~~~~~~~~~~~~~~~~~~r~d~~v~~~~A~~~l 537 (543) T protein:vir:81 461 GEAEAMDANWNTSASADNFVLLYGNFQNYVIADRI--GMT-VEFIPHLFGTNRRPNGSRGWFAYYRMGADVVNPNAFRLL 537 (543) T ss_pred EEeccccccccccccCCcceEEEeeccceeEEeec--ccE-EEEeccccccchhhcCceEEEEEEeeccEeecccceEEE Confidence 9998888653 666777777665432 222 122110 111233455666677789999999999988 Q ss_pred cCCC Q lcl|NC_018271. 302 TPAA 305 (305) Q Consensus 302 ~~~~ 305 (305) +.++ T Consensus 538 ~~~~ 541 (543) T protein:vir:81 538 NVET 541 (543) T ss_pred Eecc Confidence 8877 No 6 >protein:vir:7771 Length: 330 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:149 # MgeName: Bxz2 # Cross-refs: genbank:acc:NP_817605;genbank:gi:29566035;genbank:GeneID:1259229 Probab=98.92 E-value=1.9e-10 Score=73.89 Aligned_cols=278 Identities=11% Similarity=-0.002 Sum_probs=164.1 Q ss_pred CceE---------eeeecc-cchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccc Q lcl|NC_018271. 1 MATT---------VDITTN-YVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGE 69 (305) Q Consensus 1 ma~~---------~~~~~~-Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~ 69 (305) ||.- -...+. 
-..+...+++........+.+. ++++| ++.....+|+...+-..+-..+ +--+-++ T Consensus 1 m~~~~~~a~~~~~t~~~g~~i~~~~~~~ii~~~~~~s~l~~~--~~~~~-~~~~~~~~p~~~~~~~a~~v~Eg~~~~~~~ 77 (330) T protein:vir:77 1 MAGSTVPSTQVALTGDFSAFLTPEQSQDYFAEIEKTSIVQRI--ARKVP-MGPTGISIPHWTGAVSASWTGEAERKPITK 77 (330) T ss_pred CcccccchhhccccCCCcceechhHHHHHHHHHHhccchhhh--cceee-ccCCceEEEEEcCCcceeEecCCCcccccc Confidence 4433 211122 2245556666666655555444 56666 4343333444432222221111 1112346 Q ss_pred eEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHH Q lcl|NC_018271. 70 VDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGI 149 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~ 149 (305) .+|.+..+.++++-+...++= ++ +. +-.+..+.++.+.|.++++..+++.++.|+++.+.+.|+ T Consensus 78 ~~f~~i~~~~~k~~~~~~is~-el--------l~-------ds~~~~~~~i~~~l~~ai~~~~~~~~l~G~g~~~~~~g~ 141 (330) T protein:vir:77 78 GSFGKQELEPVKITTIFAESA-EV--------VR-------LNPLNYLNTMRTKIAEAIALKFDAAAIHGIDKPSAFKGY 141 (330) T ss_pred ceeeEEEEeEEEEEEeehhhH-HH--------Hh-------cchHHHHHHHHHHHHHHHHHHHHHHhhcccCCCCccccc Confidence 789999999999998887542 21 11 123557788999999999999999999999999999999 Q ss_pred HHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC--------cc Q lcl|NC_018271. 150 LPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG--------TF 221 (305) Q Consensus 150 lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~--------~~ 221 (305) ++.+............+.+.+.++..+.|.++...+....+. .-.++||...|.+.+. ++.-.|.+. +. 
T Consensus 142 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~--~~~~vmn~~~~~~l~~-lkd~~G~~l~~~~~~~~~~ 218 (330) T protein:vir:77 142 LAETTKVVSLADTNLTTASGPQGNAYLAVNNALSLLVNSGKK--WTGTLLDNVTEPILNT-AVDGNGRPLFVESTYTEQV 218 (330) T ss_pred cccccccceeecccccccccccchhHHHHHHHHHhhhhcCCC--ccEEEEcHHHHHHHHH-HhccCCceeecCccccccc Confidence 888655444332222222333455556666665555554332 2489999999977654 332222211 11 Q ss_pred cCCCcceecceeeeeccCCCCC------eEEEecchHHhhhhhhhhhhhhcccccee-------------------eecc Q lcl|NC_018271. 222 LNPNEFDFEGYTLTEIKGLPAS------RMVGYNRDNIVIGMSAQSDFNEIRIKDMG-------------------DVDL 276 (305) Q Consensus 222 t~~~~~~~kGi~iv~l~~~Pd~------~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-------------------~~~~ 276 (305) .......+.|++++....||++ .++..+-+++++|. ...+.++... .++- T Consensus 219 ~~~~~~~l~G~PV~~~~~~p~~~~~~~~~~~~gd~s~~~i~~-----~~~~~i~~~~e~~~~~~~~~~~~~~~~~~~~f~ 293 (330) T protein:vir:77 219 GAIREGRILGRPTYVADNVVNGTVGNRVVGVMGDFSQVIWGQ-----IGGLSFDVTDQATLDFGEEQGGVWVPKLISLWQ 293 (330) T ss_pred cccCCceecceeeEEeccccCCCCCCccEEEEEecceEEEEE-----ecCcEEEEeecceeeecccccccccccccchhh Confidence 1223347899999999999864 37777777775442 1222222111 1122 Q ss_pred ceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 277 SGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 277 ~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) +.+.-++..+-+|+.+..++=++.-++.+ T Consensus 294 ~~~~~~r~~~r~d~~v~~~~a~~~i~~~~ 322 (330) T protein:vir:77 294 HNMVAVRCEAEFAFMVNDKDAFVKLTDQV 322 (330) T ss_pred cCcEEEEEEEEeccEEecccceEEEEecc Confidence 33455666777788888888888887777 No 7 >protein:vir:4092 Length: 390 # NCBI annotation: major capsid protein a # Family: family:all:635 # MgeID: mge:86 # MgeName: 2389 # Cross-refs: genbank:acc:NP_510986;swissprot:trembl:q8w604;genbank:gi:17488508;uniprot:Q8W604;genbank:GeneID:1260361 Probab=98.89 E-value=1.6e-10 Score=74.26 Aligned_cols=278 Identities=12% Similarity=0.084 Sum_probs=168.2 Q ss_pred CceEeeeeccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC--CCCCccceEeccee Q lcl|NC_018271. 
1 MATTVDITTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC--GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~--~~~~~G~~~~~~K~ 76 (305) ++........| --+...+|+..+.....+.+. ++++| ++.....+++........--.. .....++.+|.+.. T Consensus 84 ~~~~~~~~gg~lvP~~~~~~I~~~~~~~s~i~~~--~~~~~-~~~~~~~i~~~~~~~~a~~~~E~~~~~~~~~~~f~~i~ 160 (390) T protein:vir:40 84 IAGNGFAGVTALLPPTVFERVFEDLTVEHPLLSK--INFVN-TTATTEWIISVGDVATAWWGPLCAEIKEVLDNGFDKIQ 160 (390) T ss_pred HhccCcccCcccccHHHHHHHHHHHHhhhhhhhh--ceeee-cCCceeEEEEEcCCcceeeeccccccCccccccceeeE Confidence 22222212222 345566677777777777554 88877 3333333454332222211111 11224577899999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) |.++++-+...++ +++ . ++-++.++.++.+.|+++++..+++.++.|+++ +.-.||++.+... T Consensus 161 l~~~k~~~~i~iS-~el--------l-------~ds~~~l~~~i~~~la~~i~~~~~~a~l~G~G~-~~P~Gil~~~~~~ 223 (390) T protein:vir:40 161 TGMYKLSAYIPVC-NAM--------L-------DLGPSWLDQYVRTILGEAMALGLEAGIVNGSGK-DQPIGMMRDLNNV 223 (390) T ss_pred eeeeeEEEeehhh-HHH--------H-------hcchHHHHHHHHHHHHHHHHHHHHhhhhcccCC-Cccceeeeccccc Confidence 9999999987765 221 1 233566788999999999999999999999986 4466998866544 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHH-hCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceeee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEIL-QAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTLT 235 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r-~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~iv 235 (305) ...........++|.++....+.++...+..... ..++..++|+...+..+-+.++..-.+++.+... 
....|++++ T Consensus 224 ~~~~~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~a~~i~n~~t~~~~l~~~~~~~d~~G~~v~~--~~~~g~pvv 301 (390) T protein:vir:40 224 TAGEHPVKTATPLTDLTPATLATKVMLPLTDNGKKSVSDAILVINPADYWSKIYAATSYMTPQGVWVTG--ILPVPLEIV 301 (390) T ss_pred cccccccccccccchhhHHHHHHHHHHHhhcchhhhhcCceEEEcchhHHHHHHHHhhccCCCCccccc--cCCCceeEE Confidence 4433222234456776666666666555544322 1245789999887766666654443333333221 235699999 Q ss_pred eccCCCCCeEEEecchHHhhhhhhhhhhhhccccce-eeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 236 EIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDM-GDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 236 ~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~-~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) .-..||++.++..+-++.+++.+ .+ ++++.. ..-..+.+.-|....-+|....-.+=||+..-++ T Consensus 302 ~~~~~p~~~i~~Gd~s~~~i~~~--~~---~~v~~~~~~~f~~~~~~~r~~~r~dg~v~~~~A~~~l~~~~ 367 (390) T protein:vir:40 302 QSVAVPVGKAVAGRAKDYFMGIG--SE---QVIRTSTEYRLLDDETLYYAKQYANGRPKDNSSFLVFDITG 367 (390) T ss_pred EcCCCCCCcEEEEeeceEEEEee--cc---eEEEecchhhhhcCcEEEEEEEEeCCEEecccceEEEEeec Confidence 99999999998888888765431 23 333311 1111234455666777788888877788774333 No 8 >protein:vir:94142 Length: 304 # NCBI annotation: ORF013 # Family: family:all:507 # MgeID: mge:1494 # MgeName: 96 # Cross-refs: genbank:acc:YP_240234;genbank:gi:66395898;genbank:GeneID:5133311 Probab=98.86 E-value=3.3e-10 Score=72.59 Aligned_cols=277 Identities=10% Similarity=0.032 Sum_probs=164.2 Q ss_pred CceEe-ee---ec----c--cchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccc Q lcl|NC_018271. 1 MATTV-DI---TT----N--YVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGE 69 (305) Q Consensus 1 ma~~~-~~---~~----~--Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~ 69 (305) ||..- +. .+ . --.+..++++..+.....+.+. 
++++| ++.....+|+...+....--.+ .--+-++ T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~ip~~~~~~~a~~v~E~~~~~~~~ 77 (304) T protein:vir:94 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMANSAIMKL--AKNEP-MTAQKKKFTYLAKGVGAYWVSETERIQTSK 77 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhccchhhh--cceee-ccCCceEEEEEeCCcceEEeecCccccccc Confidence 66443 11 11 1 2234446677777767666654 66666 3333233454443322211111 1122356 Q ss_pred eEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHH Q lcl|NC_018271. 70 VDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGI 149 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~ 149 (305) .+|++..+.++++-+...++= ++ . ++-++..+.++.+.|.++++..++..+++||++.....++ T Consensus 78 ~~~~~i~~~~~k~~~~~~iS~-el--------l-------~ds~~~l~~~i~~~l~~~ia~~~d~~~l~G~g~~~~~~~~ 141 (304) T protein:vir:94 78 PEYAQAEMEAKKIGVIIPLSK-EF--------L-------KWTAKDFFNEVKPLIAEAFYKAFDQAVIFGTKSPYNTSTS 141 (304) T ss_pred ceeeEEEEEEEEEEEeehhhH-HH--------H-------hcchHHHHHHHHHHHHHHHHHHHHhhheeccCCCcccccc Confidence 789999999999998877632 21 1 1224567788999999999999999999999875443332 Q ss_pred HHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCccee Q lcl|NC_018271. 150 LPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDF 229 (305) Q Consensus 150 lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~ 229 (305) ...+.............++. ..+.+.+++.+++...++.. .++||...|.+.+. 
++.- .+.-..+++...+ T Consensus 142 ~~~~~~~~~~~~~~~~~~~~----~~~~i~~~~~~l~~~~~~~~--~~v~~~~~~~~L~~-lkd~--~G~~l~~~~~~~l 212 (304) T protein:vir:94 142 GKPLVEGAEEKGNVVTDTNN----LYVDLSALMATIEDEELDPN--GVLTTRSFRSKMRN-ALDA--NDRPLFDANGNEI 212 (304) T ss_pred cccccccccccccccccccc----hHHHHHHHHHHhhhccCCcC--EEEEcHHHHHHHHH-hhcc--CCcEeecCCCccc Confidence 22222222222222222233 35556666777777766533 79999999987763 3222 2222234455679 Q ss_pred cceeeeeccCCC----CCeEEEecchHHhhhhhhh---hhhhhccccceee---------eccceeEEEEEEeecceeec Q lcl|NC_018271. 230 EGYTLTEIKGLP----ASRMVGYNRDNIVIGMSAQ---SDFNEIRIKDMGD---------VDLSGQIRTKMVLSAGVEYA 293 (305) Q Consensus 230 kGi~iv~l~~~P----d~~ii~T~~sNl~~gvnl~---~D~n~I~I~~~~~---------~~~~~~~f~k~~m~~d~~i~ 293 (305) .|++++....+| ++.++..+-+++++|..-. .-.+++.|+.... ++-..+..+...+-.|+.+. T Consensus 213 ~G~PV~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~~~~~~r~~~r~~~~v~ 292 (304) T protein:vir:94 213 MGLPLSYTGADVYDKKKSLALMGDWDYARYGILQGIEYAISEDATLTTLQASDASGQPVSLFERDMFALRATMHIAYMNV 292 (304) T ss_pred cceeeEEecccccCCCCcEEEEEehhhEEEEEecceEEEEeecceeeeecccccCccchhhhhcCcEEEEEEEEeccEee Confidence 999998888887 4468888888887654210 0012222222221 22334567788888999999 Q ss_pred cCCeEEEecCCC Q lcl|NC_018271. 294 YGAEIVLYTPAA 305 (305) Q Consensus 294 fg~E~v~~~~~~ 305 (305) .++=||+-+++- T Consensus 293 ~~~a~~~l~~a~ 304 (304) T protein:vir:94 293 KPEAFATLKPTE 304 (304) T ss_pred cccceEEEEecC Confidence 999999999999 No 9 >protein:vir:105905 Length: 304 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:1514 # MgeName: phiETA3 # Cross-refs: genbank:acc:YP_001004375;genbank:gi:122891830;genbank:GeneID:4712376 Probab=98.86 E-value=3.3e-10 Score=72.59 Aligned_cols=277 Identities=10% Similarity=0.032 Sum_probs=164.2 Q ss_pred CceEe-ee---ec----c--cchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccc Q lcl|NC_018271. 
1 MATTV-DI---TT----N--YVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGE 69 (305) Q Consensus 1 ma~~~-~~---~~----~--Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~ 69 (305) ||..- +. .+ . --.+..++++..+.....+.+. ++++| ++.....+|+...+....--.+ .--+-++ T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~ip~~~~~~~a~~v~E~~~~~~~~ 77 (304) T protein:vir:10 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMANSAIMKL--AKNEP-MTAQKKKFTYLAKGVGAYWVSETERIQTSK 77 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhccchhhh--cceee-ccCCceEEEEEeCCcceEEeecCccccccc Confidence 66443 11 11 1 2234446677777767666654 66666 3333233454443322211111 1122356 Q ss_pred eEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHH Q lcl|NC_018271. 70 VDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGI 149 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~ 149 (305) .+|++..+.++++-+...++= ++ . ++-++..+.++.+.|.++++..++..+++||++.....++ T Consensus 78 ~~~~~i~~~~~k~~~~~~iS~-el--------l-------~ds~~~l~~~i~~~l~~~ia~~~d~~~l~G~g~~~~~~~~ 141 (304) T protein:vir:10 78 PEYAQAEMEAKKIGVIIPLSK-EF--------L-------KWTAKDFFNEVKPLIAEAFYKAFDQAVIFGTKSPYNTSTS 141 (304) T ss_pred ceeeEEEEEEEEEEEeehhhH-HH--------H-------hcchHHHHHHHHHHHHHHHHHHHHhhheeccCCCcccccc Confidence 789999999999998877632 21 1 1224567788999999999999999999999875443332 Q ss_pred HHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCccee Q lcl|NC_018271. 150 LPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDF 229 (305) Q Consensus 150 lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~ 229 (305) ...+.............++. ..+.+.+++.+++...++.. .++||...|.+.+. 
++.- .+.-..+++...+ T Consensus 142 ~~~~~~~~~~~~~~~~~~~~----~~~~i~~~~~~l~~~~~~~~--~~v~~~~~~~~L~~-lkd~--~G~~l~~~~~~~l 212 (304) T protein:vir:10 142 GKPLVEGAEEKGNVVTDTNN----LYVDLSALMATIEDEELDPN--GVLTTRSFRSKMRN-ALDA--NDRPLFDANGNEI 212 (304) T ss_pred cccccccccccccccccccc----hHHHHHHHHHHhhhccCCcC--EEEEcHHHHHHHHH-hhcc--CCcEeecCCCccc Confidence 22222222222222222233 35556666777777766533 79999999987763 3222 2222234455679 Q ss_pred cceeeeeccCCC----CCeEEEecchHHhhhhhhh---hhhhhccccceee---------eccceeEEEEEEeecceeec Q lcl|NC_018271. 230 EGYTLTEIKGLP----ASRMVGYNRDNIVIGMSAQ---SDFNEIRIKDMGD---------VDLSGQIRTKMVLSAGVEYA 293 (305) Q Consensus 230 kGi~iv~l~~~P----d~~ii~T~~sNl~~gvnl~---~D~n~I~I~~~~~---------~~~~~~~f~k~~m~~d~~i~ 293 (305) .|++++....+| ++.++..+-+++++|..-. .-.+++.|+.... ++-..+..+...+-.|+.+. T Consensus 213 ~G~PV~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~~~~~~r~~~r~~~~v~ 292 (304) T protein:vir:10 213 MGLPLSYTGADVYDKKKSLALMGDWDYARYGILQGIEYAISEDATLTTLQASDASGQPVSLFERDMFALRATMHIAYMNV 292 (304) T ss_pred cceeeEEecccccCCCCcEEEEEehhhEEEEEecceEEEEeecceeeeecccccCccchhhhhcCcEEEEEEEEeccEee Confidence 999998888887 4468888888887654210 0012222222221 22334567788888999999 Q ss_pred cCCeEEEecCCC Q lcl|NC_018271. 294 YGAEIVLYTPAA 305 (305) Q Consensus 294 fg~E~v~~~~~~ 305 (305) .++=||+-+++- T Consensus 293 ~~~a~~~l~~a~ 304 (304) T protein:vir:10 293 KPEAFATLKPTE 304 (304) T ss_pred cccceEEEEecC Confidence 999999999999 No 10 >protein:vir:96762 Length: 632 # NCBI annotation: putative phage-related protein # Family: family:all:21 # MgeID: mge:1628 # MgeName: VP882 # Cross-refs: genbank:acc:YP_001039818;genbank:gi:126010917;genbank:GeneID:5076272 Probab=98.82 E-value=3.1e-10 Score=72.74 Aligned_cols=272 Identities=13% Similarity=0.154 Sum_probs=164.2 Q ss_pred CceEeeeeccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 
1 MATTVDITTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~~~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) +.+.-..+..| ..+++.+-|..++...-.+.+...+++|+.+.+.. +|+...+-...-..+ +--+.++.+|.+..| T Consensus 357 ~~~~t~~~gg~lvp~~~~~~~iie~lr~~s~i~~l~~~~~~~~~g~~~-ip~~~~~~~a~wv~E~~~~~~s~~~f~~i~l 435 (632) T protein:vir:96 357 LEKKTAGKGGELVATELLSEEFIDILRNKAIIGQMGARMLPGLVGDVD-IPKKTSGANFYWIGEDEDVQDSDFDFTTLSF 435 (632) T ss_pred hhcccccccccccccccchHHHHHHHhhcchhhhhcceEeecCCcceE-EEEEeCCceeEeecCCccccccccceeeEEe Confidence 22211111111 12333344444444444555545788888766643 444433222210111 112235779999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhcc Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADA 157 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~ 157 (305) .++++-+...++-+=++ +. .+..+.++...|.+.++..++..++.|+++.+...|++.... T Consensus 436 ~~~k~~~~v~iS~ell~--------------ds--~~~~~~~i~~~l~~a~~~~~d~a~l~G~G~~~~p~Gi~~~~~--- 496 (632) T protein:vir:96 436 SPKTIAGAVPVTRKLRK--------------QS--SIHVENLIREDLIEGIGVALDLAMLTGTGLANDPVGLLNMTG--- 496 (632) T ss_pred eeeEEEEehhhHHHHHh--------------cc--chHHHHHHHHHHHHHHHHHHHHHhhcccCCCCccceeeeccc--- Confidence 99999988776432221 11 345677888999999999999999999998777888865421 Q ss_pred ceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHH-HHhhhhccCCcccCCCcceecceeeee Q lcl|NC_018271. 158 TVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKR-AYGTQARSNGTFLNPNEFDFEGYTLTE 236 (305) Q Consensus 158 ~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d-~~~~~~~k~~~~t~~~~~~~kGi~iv~ 236 (305) +..+....+.+|.++ +.++..+++......++..+.||...+....- .+..-.|.+ ..++ ..+.|.+++. 
T Consensus 497 -~~~~~~~~~~~~~~~----i~~~~~~i~~~~~~~~~~~~~~~~~~~~~l~~~~l~d~~G~~--i~~~--~~l~G~pv~~ 567 (632) T protein:vir:96 497 -VPALTYPAGGVDWAS----VVDMETKISTFNADAGRLAYLTSVTQRGAAKKAQVFDNTGER--IWQN--NEVNGYRAEA 567 (632) T ss_pred -ccceecccccCCHHH----HHHHHHHHhhcccccCccEEEEchhHHHHHHHHhccCCCCce--eecC--CeecccceEe Confidence 112222333455444 44455566555444456789999876654432 232222211 1222 3688999999 Q ss_pred ccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 237 IKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 237 l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ...+|++.+++.+-+.+++|.. .+.. +.+++... ..++..-|..++-+|+.+..++-||+...+| T Consensus 568 s~~ip~~~~~~gd~s~~~i~~~--~~~~-i~~~~~~~-~~~~~v~~~~~~~~d~~v~~~~af~~~k~~A 632 (632) T protein:vir:96 568 SNQIPADTWIFGDWSQIVIAMW--GVLD-LKVDPYTK-AASDGLVLRVFQDVDAGVRRKEAFCIAKKGA 632 (632) T ss_pred ccccccCcEEEeecceEEEEEe--cceE-EEEccccc-cccCceEEEEEeecCceeechhhhhheeecC Confidence 9999999999998888866542 1222 22333322 1345678889999999999999999999999 No 11 >protein:vir:1328 Length: 392 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:28 # MgeName: phi-C31 # Cross-refs: genbank:acc:NP_047927;swissprot:trembl:q9zwv6;genbank:gi:9631145;uniprot:Q9ZWV6;genbank:GeneID:2715889 Probab=98.72 E-value=2.4e-09 Score=67.81 Aligned_cols=269 Identities=14% Similarity=0.142 Sum_probs=158.0 Q ss_pred CceEeeeeccc-chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCC------CccceEec Q lcl|NC_018271. 1 MATTVDITTNY-VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFT------PSGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~~Y-~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~------~~G~~~~~ 73 (305) .+.+......+ ..++..+++..+.....++.. +.+++|.-......+|+...+. .+.|. +-++.+|. 
T Consensus 111 ~~~t~~~~g~~~~~~~~~~~i~~~~~~~~~l~~-~~~~~~~~~~~~~~~~~~~~~~-----~a~~v~E~~~~~~~~~~f~ 184 (392) T protein:vir:13 111 RDGTKAGNPNVLSRTLYGQLIAQAVERSAIMRG-GASTFTTSDANPMDFTVITGRA-----TAGIVGETAEIPESYPATT 184 (392) T ss_pred hcccccCCCccccccchHHHHHHHHhhhhhhhh-cceeeecCCCceeEEEEEcCCc-----ceeeeccccccccccccee Confidence 22222222222 233445556655554444433 3565554322222333332222 12342 23467899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..|.++++-+...++ +++.+ +=.+..+.++.+.|.+.++..++..++.||++ +.-.||++.. T Consensus 185 ~v~~~~~k~~~~~~iS-~ell~---------------ds~~~l~~~i~~~l~~~i~~~~d~~~l~G~Gt-~~p~Gil~~~ 247 (392) T protein:vir:13 185 QRSMGGFKYGFASVVS-YEFAT---------------DQVLDLVGFLVSDAGPAIGDAMGRHFLTGTGT-GQPRGILTDA 247 (392) T ss_pred eEEeeeeeEEeeehhH-HHHHh---------------cchHHHHHHHHHHHHHHHHHHHHHHHhcccCC-cccccccccc Confidence 9999999998887753 22211 11345678888999999999999999999986 4467887764 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~k 230 (305) .....+.. ....+.++ .+.+.+++.+++..+|.+ -.++||...+.+++. ++.-.|.+. +.+.+....+. 
T Consensus 248 ~~~~~~~~-~~~~~~~~----~d~l~~~~~~l~~~~~~~--a~~v~n~~~~~~l~~-lkd~~G~~l~~~~~~~g~~~~l~ 319 (392) T protein:vir:13 248 TGANAAFG-EADADSKV----SDALIDLFHEVPSAYRKN--AKFVVNDLRAAQMRK-LKDANGQYLWQSALTVGAPDTFN 319 (392) T ss_pred cccccccc-cccccccc----HHHHHHHHHhhhhhhhcC--CEEEEcHHHHHHHHH-hhccCCceeecCCcCCCCCceec Confidence 33222221 11223343 556667788888888864 479999999876653 433333221 22333345799 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeee-ccceeEEEEEEeecceeeccCCeEEEecC--CC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDV-DLSGQIRTKMVLSAGVEYAYGAEIVLYTP--AA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~-~~~~~~f~k~~m~~d~~i~fg~E~v~~~~--~~ 305 (305) |++++....+|++.|++.+-++++++. -..+.++..... .-+.++-|......|+.+.-++=|++-+- +| T Consensus 320 G~Pv~~~~~~~~~~i~~Gdf~~~~i~~-----~~~~~i~~~~~~~~~~~~~~~r~~~r~d~~~~~~~A~~~~~~~~aa 392 (392) T protein:vir:13 320 GKVVETDDGMPADKVLFADLSKYRVRF-----AGSLRVDRSVDAKFSTDQIVYRFLQRADGLLVDARGAKVLTVTPAA 392 (392) T ss_pred ceeeEEcCCCCCCcEEEeeccceeEEe-----ecceEEEeeccccccCCcEEEEEEEEeccEEecccceEEEEeeccC Confidence 999999999999999988877765443 233444432211 12234555666777888887777776554 44 No 12 >protein:vir:4339 Length: 395 # NCBI annotation: major head protein # Family: family:all:585 # MgeID: mge:93 # MgeName: D3 # Cross-refs: genbank:acc:NP_061502;genbank:gi:9635591;genbank:GeneID:1262860 Probab=98.71 E-value=1.2e-09 Score=69.45 Aligned_cols=275 Identities=11% Similarity=-0.008 Sum_probs=166.2 Q ss_pred CceEeeeec--ccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcc-ccCCCC-CCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITT--NYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDD-FVDYSC-GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~--~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~-~q~~~~-~~~~~G~~~~~~K~ 76 (305) ..++.+... .-..+...+|+........+.+. +++.|-- ......++...... ..--.+ .-.+.++.+|.+.. 
T Consensus 113 ~~~~~~~~~g~~vp~~~~~~ii~~~~~~~~l~~l--~~~~~~~-~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~~~~~i~ 189 (395) T protein:vir:43 113 AITSIDGSGGALVAPDRRPGVVAAPQRRLTIRDL--VAPGTTE-SNSVEYVRETGFVNNAAPVSEGTQKPYSDLTFELEN 189 (395) T ss_pred hhcccCCCCccccchhhHHHHHHHHHhhhhHHhh--ccceecC-CCceEEEEEecCCCceeeecCCccccccccceeEEE Confidence 212222111 23345566777777766666655 5554421 22222333222211 100011 11234677999999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) +.++++.+...++ +++. ++.| ..+.++.+.|.++++..++..++.|+++.+.+.|++...... T Consensus 190 ~~~~k~~~~~~is-~ell---------------~d~~-~l~~~v~~~la~a~~~~~d~~~l~G~g~~~~~~Gi~~~~~~~ 252 (395) T protein:vir:43 190 APVRTIAHLFKAS-RQIL---------------DDAS-ALQSYIDARARYGLMLVEECQLLYGNGTGANLHGIIPQAQAY 252 (395) T ss_pred EeeeeEEEeehhh-HHHH---------------HhHH-HHHHHHHHHHHHHHHHHHHHHHHhccCCCCcccccccccccc Confidence 9999999988765 3331 1223 467888899999999999999999999988899988753221 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC--cccCCCcceecceee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG--TFLNPNEFDFEGYTL 234 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~--~~t~~~~~~~kGi~i 234 (305) . ...+...+....++.+.++..+++..+++. -.++||...|.+++. ++.-.|.+. 
+..++....+.|+++ T Consensus 253 ~-----~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~--~~~vmn~~~~~~l~~-lkd~~G~~i~~~~~~~~~~~l~G~pV 324 (395) T protein:vir:43 253 A-----PPSGVVVTAEQRIDRIRLAILQAQLAEFPA--SGIVLNPIDWALIEL-NKDAENRYIIGSPQNGTTPTLWRLPV 324 (395) T ss_pred c-----cccccccccchhHHHHHHHHHhhccccCCC--cEEEEcHHHHHHHHH-hhccCCceeccccccCCCceecceee Confidence 1 111222334456777788888888776643 489999999877653 322222222 223344557999999 Q ss_pred eeccCCCCCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 235 TEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 235 v~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) +....+|++.++..+-++.+..++- .+.. ++++.-. ...-...+-+...+-+|+.+..++=||+-+.++ T Consensus 325 v~~~~~~~~~~~~gd~~~~~~~~~~-~~~~-i~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~~~~ta 394 (395) T protein:vir:43 325 VETQAITQDEFLTGAFSLGAQIFDR-MDIE-VLVSTENDKDFENNMVTIRAEERLAFAVYRPEAFVTGSLTA 394 (395) T ss_pred EEcCCCCCCcEEEEeccceEEEEEe-cceE-EEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEecc Confidence 9999999999888877764322211 1111 2333211 112234556667777899999999998888777 No 13 >protein:vir:191 Length: 385 # NCBI annotation: major head subunit precursor # Family: family:all:585 # MgeID: mge:6 # MgeName: HK97 # Cross-refs: genbank:acc:NP_037701;genbank:gi:9634158;genbank:GeneID:1262530 Probab=98.65 E-value=2e-09 Score=68.33 Aligned_cols=266 Identities=12% Similarity=0.038 Sum_probs=160.7 Q ss_pred CceEeeeecc-cchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCC------CccceEec Q lcl|NC_018271. 1 MATTVDITTN-YVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFT------PSGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~~-Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~------~~G~~~~~ 73 (305) |.++....+. -..++..+++..+.....+.+. ++++| ++....-.++...... .+.|. +.++.+|. 
T Consensus 105 ~~~~~~~~g~~i~~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~~~~~~~~~~----~a~~v~E~~~~~~~~~~~~ 177 (385) T protein:vir:19 105 LGSDADSAGSLIQPMQIPGIIMPGLRRLTIRDL--LAQGR-TSSNALEYVREEVFTN----NADVVAEKALKPESDITFS 177 (385) T ss_pred hccccccCCceecchhhhHHHHHhhhccchhhh--cceec-ccCcceEEEEEecCCc----ceeeeccCcccccccccee Confidence 3333332222 2345566677777776666665 56555 3333222333222111 12332 33567899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..+.++++-+...++ +++. . + .+.++.++...|.++++..+++.++.||++.+.+.|+++.. T Consensus 178 ~~~~~~~k~~~~~~is-~ell--------~-------d-~~~l~~~i~~~la~a~~~~~d~~~l~G~g~~~~~~Gi~~~~ 240 (385) T protein:vir:19 178 KQTANVKTIAHWVQAS-RQVM--------D-------D-APMLQSYINNRLMYGLALKEEGQLLNGDGTGDNLEGLNKVA 240 (385) T ss_pred EEEEeeeeEEEeehhh-HHHH--------h-------h-HHHHHHHHHHHHHHHHHHHHHHHHHhccCCCCccccccccc Confidence 9999999999887744 4431 1 1 23577889999999999999999999999998899998753 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC--cccCCCcceecc Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG--TFLNPNEFDFEG 231 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~--~~t~~~~~~~kG 231 (305) .....+ ...+.....+.+.++..+++..++.. =.++||...|.+.+. ++.-.|.+. 
+..++....+.| T Consensus 241 ~~~~~~-------~~~~~~~~~d~i~~~~~~l~~~~~~~--~~~~~~~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~l~G 310 (385) T protein:vir:19 241 TAYDTS-------LNATGDTRADIIAHAIYQVTESEFSA--SGIVLNPRDWHNIAL-LKDNEGRYIFGGPQAFTSNIMWG 310 (385) T ss_pred cccccc-------ccccccchHHHHHHHHHhhccccCCC--CEEEEcHHHHHHHHH-hhcCCCceeccCcccCCCceecc Confidence 221111 11122234666777777777776643 389999999887654 332222221 223344567999 Q ss_pred eeeeeccCCCCCeEEEecchH-Hhhhhhhhhhhhhcccc--ce-eeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYNRDN-IVIGMSAQSDFNEIRIK--DM-GDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~~sN-l~~gvnl~~D~n~I~I~--~~-~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++++....+|++.++..+-++ .+++. -..+.++ +- ....-...+-+...+-+|+.+..++=||.-+=++ T Consensus 311 ~pV~~~~~~p~~~~~~gd~~~~~~~~~-----~~~~~v~~~~~~~~~~~~~~~~~~~~~r~~~~v~~~~a~~~~~~~a 383 (385) T protein:vir:19 311 LPVVPTKAQAAGTFTVGGFDMASQVWD-----RMDATVEVSREDRDNFVKNMLTILCEERLALAHYRPTAIIKGTFSS 383 (385) T ss_pred eeeEEcCcCCCCcEEEeecccEEEEEE-----ecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEecc Confidence 999999999999888877655 32221 1122221 10 1111234456666777788888888887766555 No 14 >protein:vir:1886 Length: 385 # NCBI annotation: major capsid subunit precursor # Family: family:all:585 # MgeID: mge:41 # MgeName: HK022 # Cross-refs: genbank:acc:NP_037666;genbank:gi:9634124;genbank:GeneID:1262513 Probab=98.65 E-value=2e-09 Score=68.33 Aligned_cols=266 Identities=12% Similarity=0.038 Sum_probs=160.7 Q ss_pred CceEeeeecc-cchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCC------CccceEec Q lcl|NC_018271. 1 MATTVDITTN-YVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFT------PSGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~~-Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~------~~G~~~~~ 73 (305) |.++....+. -..++..+++..+.....+.+. ++++| ++....-.++...... .+.|. +.++.+|. 
T Consensus 105 ~~~~~~~~g~~i~~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~~~~~~~~~~----~a~~v~E~~~~~~~~~~~~ 177 (385) T protein:vir:18 105 LGSDADSAGSLIQPMQIPGIIMPGLRRLTIRDL--LAQGR-TSSNALEYVREEVFTN----NADVVAEKALKPESDITFS 177 (385) T ss_pred hccccccCCceecchhhhHHHHHhhhccchhhh--cceec-ccCcceEEEEEecCCc----ceeeeccCcccccccccee Confidence 3333332222 2345566677777776666665 56555 3333222333222111 12332 33567899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..+.++++-+...++ +++. . + .+.++.++...|.++++..+++.++.||++.+.+.|+++.. T Consensus 178 ~~~~~~~k~~~~~~is-~ell--------~-------d-~~~l~~~i~~~la~a~~~~~d~~~l~G~g~~~~~~Gi~~~~ 240 (385) T protein:vir:18 178 KQTANVKTIAHWVQAS-RQVM--------D-------D-APMLQSYINNRLMYGLALKEEGQLLNGDGTGDNLEGLNKVA 240 (385) T ss_pred EEEEeeeeEEEeehhh-HHHH--------h-------h-HHHHHHHHHHHHHHHHHHHHHHHHHhccCCCCccccccccc Confidence 9999999999887744 4431 1 1 23577889999999999999999999999998899998753 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC--cccCCCcceecc Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG--TFLNPNEFDFEG 231 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~--~~t~~~~~~~kG 231 (305) .....+ ...+.....+.+.++..+++..++.. =.++||...|.+.+. ++.-.|.+. 
+..++....+.| T Consensus 241 ~~~~~~-------~~~~~~~~~d~i~~~~~~l~~~~~~~--~~~~~~~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~l~G 310 (385) T protein:vir:18 241 TAYDTS-------LNATGDTRADIIAHAIYQVTESEFSA--SGIVLNPRDWHNIAL-LKDNEGRYIFGGPQAFTSNIMWG 310 (385) T ss_pred cccccc-------ccccccchHHHHHHHHHhhccccCCC--CEEEEcHHHHHHHHH-hhcCCCceeccCcccCCCceecc Confidence 221111 11122234666777777777776643 389999999887654 332222221 223344567999 Q ss_pred eeeeeccCCCCCeEEEecchH-Hhhhhhhhhhhhhcccc--ce-eeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYNRDN-IVIGMSAQSDFNEIRIK--DM-GDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~~sN-l~~gvnl~~D~n~I~I~--~~-~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++++....+|++.++..+-++ .+++. -..+.++ +- ....-...+-+...+-+|+.+..++=||.-+=++ T Consensus 311 ~pV~~~~~~p~~~~~~gd~~~~~~~~~-----~~~~~v~~~~~~~~~~~~~~~~~~~~~r~~~~v~~~~a~~~~~~~a 383 (385) T protein:vir:18 311 LPVVPTKAQAAGTFTVGGFDMASQVWD-----RMDATVEVSREDRDNFVKNMLTILCEERLALAHYRPTAIIKGTFSS 383 (385) T ss_pred eeeEEcCcCCCCcEEEeecccEEEEEE-----ecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEecc Confidence 999999999999888877655 32221 1122221 10 1111234456666777788888888887766555 No 15 >protein:vir:94673 Length: 419 # NCBI annotation: major capsid protein # Family: family:all:585 # MgeID: mge:1527 # MgeName: mu1/6 # Cross-refs: genbank:acc:YP_579208;genbank:gi:93007444;genbank:GeneID:5076792 Probab=98.64 E-value=3.8e-09 Score=66.73 Aligned_cols=279 Identities=11% Similarity=0.031 Sum_probs=163.3 Q ss_pred CceEe----eeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcc---ccCCCCCCCCc------ Q lcl|NC_018271. 1 MATTV----DITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDD---FVDYSCGFTPS------ 67 (305) Q Consensus 1 ma~~~----~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~---~q~~~~~~~~~------ 67 (305) ++... .....-.++...+++..+......+ ..++++.|--... ...++...... 
...-.+.|.+- T Consensus 121 ~~~~~~~~~~~~~~~~p~~~~~~i~~~~~~~~~i-~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~ 198 (419) T protein:vir:94 121 RDAPAGTITNPNVPHLPQLVPGIVPTTPDLPLLV-ADLLDQQNADYNV-LEYIRDTSGTAGAGSTWNKAAVVPEGTAKPQ 198 (419) T ss_pred cccccccccCCcccccchhhhHHHHHHHhhhhhh-hhcceeeeccCCc-eeeeeeccccccccccCcccceecCCccccc Confidence 11111 1111223455555555544333223 3457766633222 22222221111 11223567554 Q ss_pred cceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhH Q lcl|NC_018271. 68 GEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQ 147 (305) Q Consensus 68 G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fd 147 (305) ++.+|.+..+.++++-+...++ +++. ++.| ..+.++...|+++++..++..++.||++ +... T Consensus 199 ~~~~~~~i~~~~~k~~~~~~is-~ell---------------~d~~-~l~~~i~~~la~a~~~~~d~aii~G~G~-~~p~ 260 (419) T protein:vir:94 199 STLSFDTITTTLKTVAHWLPIT-RQAA---------------DDNS-QLMGYIQGRLTYGLRFLRDRQLLNGNGS-TEMQ 260 (419) T ss_pred cccceeeEEeeeeeEEEeehhh-HHHH---------------HhHH-HHHHHHHHHHHHHHHHHHHHHHHhccCc-cccc Confidence 3557999999999999988755 3321 1222 3677888889999999999999999997 4677 Q ss_pred HHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCC Q lcl|NC_018271. 148 GILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNP 224 (305) Q Consensus 148 G~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~ 224 (305) |+++... 
..++.........|.....+.+.++...+....++ ++ .++||...|...........+.+ .+..++ T Consensus 261 Gi~~~~~--~~~~~~~~~~~~~t~~~~~~~l~~~~~~~~~~~~~-~~-~~v~n~~~~~~l~~~k~~~~~~~~~~~~~~~~ 336 (419) T protein:vir:94 261 GILTTPG--IGTYQQPKPTAPATDEPPLVDIRRAKTVAEIAGFP-PD-GVVVHPQDWESIELDQAPGSGVFRVIANVQGE 336 (419) T ss_pred ceecccc--cccccccccccccccchhHHHHHHHHHhhhhccCC-CC-EEEEcHHHHHHHHHHhhcCCCceeecCCcccC Confidence 8877522 22222222333456666788888888888777654 23 79999999888764432222111 122233 Q ss_pred CcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEEecC Q lcl|NC_018271. 225 NEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTP 303 (305) Q Consensus 225 ~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~ 303 (305) ....+.|++++....+|++.++..+-++.+++++- .+.. +++++-. ...-...+.+...+-+|+.+..++=||+-+= T Consensus 337 ~~~~l~G~pV~~~~~~~~~~~~~gd~~~~~~~~~~-~~~~-v~~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~~~~ 414 (419) T protein:vir:94 337 ATPRIWGLNVVSTVAIAQGTALVGGFRQGATLWSR-QGIT-VLMTDSHADFFTANTLVILAEFRANLAVYQPKAFVRVTF 414 (419) T ss_pred CCccccceeeEEcCCCCCccEEEeeccceEEEEEe-cceE-EEEeccccchhhcCcEEEEEEEeeccEEeccccEEEEEe Confidence 35579999999999999998888776654333321 1221 2222111 1122455667777888999988888888655 Q ss_pred CC Q lcl|NC_018271. 304 AA 305 (305) Q Consensus 304 ~~ 305 (305) ++ T Consensus 415 ~a 416 (419) T protein:vir:94 415 AA 416 (419) T ss_pred cc Confidence 55 No 16 >protein:vir:41 Length: 299 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:2 # MgeName: A118 # Cross-refs: genbank:acc:NP_463467;swissprot:trembl:q9t1b7;genbank:gi:16798789;uniprot:Q9T1B7;genbank:GeneID:922353 Probab=98.62 E-value=9.6e-09 Score=64.53 Aligned_cols=263 Identities=11% Similarity=0.073 Sum_probs=160.2 Q ss_pred CceEeeeeccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 
1 MATTVDITTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) |+++......+ -.+...+++..+.....+-+. ++++| ++.....++... +. .+.|.+ -++.+| T Consensus 6 ~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~--~~~~~-~~~~~~~~~~~~-~~-----~a~~v~E~~~~~~~~~~f 76 (299) T protein:vir:41 6 DTTTMQSAKTGSIPINISEQIITGVKNGSAAMKL--AKAVP-MTKPEEEFTFMS-GV-----GAFWVDEAERIQTSKPTF 76 (299) T ss_pred CcccccCCCceecchhHHHHHHHHHHhcchhhhh--ceeee-cCCCcEEEEEEc-CC-----ceeeeecCccccccccce Confidence 55554432222 244556666666666666555 56666 344434444322 11 123322 245789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPL 152 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~ 152 (305) ++..|.++++-+...++=+- + ++=++..+.++...|.++++..++..++.|+++... .|+++. T Consensus 77 ~~v~l~~~k~~~~~~is~el---------l-------~ds~~~~~~~i~~~l~~a~~~~~d~a~l~G~g~~~~-~gil~~ 139 (299) T protein:vir:41 77 TKAKMRSKKMGVIIPTTKEN---------L-------NYSVTNFFSLMQAEIVEAFYKKFDQAVFTGVESPYN-WNILKS 139 (299) T ss_pred eEEEEeeEEEEEeehhhHHH---------H-------hcCHHHHHHHHHHHHHHHHHHHHHHHHhhcccCccc-cccccc Confidence 99999999999998765321 1 122356778999999999999999999999987433 477765 Q ss_pred HhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC--CcccCCCcceec Q lcl|NC_018271. 153 LEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN--GTFLNPNEFDFE 230 (305) Q Consensus 153 i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~--~~~t~~~~~~~k 230 (305) .....++. ..++++. +.+.++..++....++. =.+.||...|.+++. ++...|.+ .+....+...+. 
T Consensus 140 ~~~~~~~~----~~~~~~~----~~l~~~~~~l~~~~~~~--~~~v~n~~~~~~L~~-lkd~~G~~l~~~~~~~~~~~l~ 208 (299) T protein:vir:41 140 ATDASNLV----EETANKY----DDLNEAIGLIEAEDLEP--NGIATIRKQRVKYRS-TKDGNGMPIFNTATSNGVDDVL 208 (299) T ss_pred ccccceee----ccccccH----HHHHHHHHhhhcccCCc--CEEEEcHHHHHHHHH-hhccCCceeecCCcCCCCceec Confidence 44333322 2234443 44445555666665543 279999999887764 33222221 111223345799 Q ss_pred ceeeeeccCCCCCe----EEEecchHHhhhhhhhhhhhhcccccee---------------eeccceeEEEEEEeeccee Q lcl|NC_018271. 231 GYTLTEIKGLPASR----MVGYNRDNIVIGMSAQSDFNEIRIKDMG---------------DVDLSGQIRTKMVLSAGVE 291 (305) Q Consensus 231 Gi~iv~l~~~Pd~~----ii~T~~sNl~~gvnl~~D~n~I~I~~~~---------------~~~~~~~~f~k~~m~~d~~ 291 (305) |++++....+|.+. ++..+-+++++|.. .+ +.++... +++-...+-++..+-+|+. T Consensus 209 G~PV~~~~~~~~~~~~~~~~~gdfs~~~i~~~--~~---~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~d~~ 283 (299) T protein:vir:41 209 GLPIAYTPKYTFGDKDISELVGDWNQAYYGIL--RG---VEYEILTEATLTTVADETGKPLNLAERDMAAIKATFEVGFM 283 (299) T ss_pred ceeeEEecccCCCCCceEEEEEecccEEEEEe--cC---cEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccE Confidence 99999999998663 77777777765432 11 2222111 1122334556667777999 Q ss_pred eccCCeEEEecCCC Q lcl|NC_018271. 292 YAYGAEIVLYTPAA 305 (305) Q Consensus 292 i~fg~E~v~~~~~~ 305 (305) +..++=|+.-++++ T Consensus 284 v~~~~A~~~l~~~a 297 (299) T protein:vir:41 284 VVKDEAFSAVQPKA 297 (299) T ss_pred EecccceEEEEecc Confidence 99999999999999 No 17 >protein:vir:1433 Length: 435 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:30 # MgeName: phiE125 # Cross-refs: genbank:acc:NP_536362;genbank:gi:17975167;genbank:GeneID:929171 Probab=98.56 E-value=8.6e-09 Score=64.81 Aligned_cols=280 Identities=15% Similarity=0.153 Sum_probs=156.6 Q ss_pred CceEee-eecc-cchhH-HHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEeccee Q lcl|NC_018271. 
1 MATTVD-ITTN-YVGEV-AGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~-~~~~-Y~Ge~-l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~ 76 (305) ..++.. ..+. -..+. ..+|+..+.....+ .+...+++|+-+.+. .+|+...+-...-..+ +--+-++.+|.+.. T Consensus 131 ~~~~~t~~~gg~~vP~~~~~~ii~~l~~~~~i-~~~~~~~~~~~~~~~-~~p~~~~~~~a~~v~E~~~~~~~~~~f~~i~ 208 (435) T protein:vir:14 131 SLNTLSPGAGGVLVPENLSSEVIELLRPKSVV-RKLGARTLPLSNGNI-TIPRLKGGAIVGYIGADTDIPTTQQQFDDLK 208 (435) T ss_pred hcccCCcCCCccccchhHHHHHHHHHhhhchh-hhhcceeeecCCCce-EEEEEeCCcceeeeccCccccccccceeEEE Confidence 111111 1111 12222 23444444444433 333477888776653 3444332221110010 11123567899999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) +.++++-+...++ +++ +.+...++..+.++...|.++++..+++.++.||++.+...|++..... T Consensus 209 ~~~~k~~~~~~iS-~el-------------l~ds~~~~~l~~~i~~~l~~ai~~~~d~a~l~G~G~~~~p~Gi~~~~~~- 273 (435) T protein:vir:14 209 LTAKKMAALVPIA-NDL-------------IKYAGVNPNVDQIVVGDLTAAIGAREDKAFIRDDGTANTPKGLRFWALP- 273 (435) T ss_pred eeeEEEEEeehhh-HHH-------------HHhhccCHHHHHHHHHHHHHHHHHHHHHHhhccCCCCccccceeecccc- Confidence 9999999987765 221 1122345678888899999999999999999999998889999764221 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceeeee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTLTE 236 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~iv~ 236 (305) ..+....+..|...+...+.++...+......-.+..++||...|.+. +.++.-.|.+. ........+.|++++. 
T Consensus 274 ---~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~v~n~~~~~~L-~~lkd~~G~~l-~~~~~~g~l~G~Pv~~ 348 (435) T protein:vir:14 274 ---SNVITASDASTLQKIETDLGKVILALENADANLTQPGWIMAPRTFRFL-EGLRDGNGNKV-YPELANGMLKGYPVGK 348 (435) T ss_pred ---cceeccccccchhhHHHHHHHHHHHhhhccccccCCEEEEcHHHHHHH-HHhhccCCcee-ccCCCCCeeecceeEe Confidence 112222233455555666666666655432111345799999999665 34433333222 1222234789999998 Q ss_pred ccCCCCC--------eEEEecchHHhhhhhhhhhhhhcccccee----------eeccceeEEEEEEeecceeeccCCeE Q lcl|NC_018271. 237 IKGLPAS--------RMVGYNRDNIVIGMSAQSDFNEIRIKDMG----------DVDLSGQIRTKMVLSAGVEYAYGAEI 298 (305) Q Consensus 237 l~~~Pd~--------~ii~T~~sNl~~gvnl~~D~n~I~I~~~~----------~~~~~~~~f~k~~m~~d~~i~fg~E~ 298 (305) ...+|+. .|+..+-+..+++... +.. +++.+-. ..+.....-+...+-+|+.+..++=| T Consensus 349 ~~~~p~~~~~~~~~~~i~~gd~s~~~i~~~~--~~~-~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~ 425 (435) T protein:vir:14 349 TTQVPINLGETGKESEIYFTDFGDVFIGEEE--TLE-IDYSKEATYKDADGHMVSAFQRDQTLIRVIAKNDFGPRHVESI 425 (435) T ss_pred eccccccccCCCccceEEEeecccEEEEEec--ccE-EEEeccccccccccchhhhhhcChhheeeeeeeCceeecccce Confidence 8888873 4666666666544221 111 1111100 11222334555666679999888888 Q ss_pred EEecCCC Q lcl|NC_018271. 299 VLYTPAA 305 (305) Q Consensus 299 v~~~~~~ 305 (305) |+=|.++ T Consensus 426 ~~l~~~~ 432 (435) T protein:vir:14 426 AVLAGVA 432 (435) T ss_pred EEEecCC Confidence 8888877 No 18 >protein:vir:4511 Length: 409 # NCBI annotation: capsid # Family: family:all:21 # MgeID: mge:97 # MgeName: V # Cross-refs: genbank:acc:NP_599037;genbank:gi:19548995;genbank:GeneID:935211 Probab=98.56 E-value=2.1e-08 Score=62.66 Aligned_cols=269 Identities=10% Similarity=0.071 Sum_probs=155.3 Q ss_pred CceEeeeeccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCC------CccceEe Q lcl|NC_018271. 
1 MATTVDITTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFT------PSGEVDI 72 (305) Q Consensus 1 ma~~~~~~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~------~~G~~~~ 72 (305) |.++.+....| --+...+|+..+.....+.+ +++++|--......++....... .+.|. +..+..| T Consensus 117 ~~~~~~~~gg~liP~~~~~~ii~~~~~~~~l~~--~~~~~~~~~~~~~~~~~~~~~~~----~~~~v~E~~~~~~~~~~f 190 (409) T protein:vir:45 117 QGVAQDEKGGYTVPETFLAKVVEKMKSYGGIAS--VAQILTTSDGRTMEWATADGTSE----VGVLLGENEEAGEEDTDF 190 (409) T ss_pred ccCccCcCCceeccHhHHHHHHHHHHhhhhhhh--hceeeecCCCceEEEEeeccCcc----cccccccccccccccccc Confidence 33333222222 12333455555555555543 47777643333222222221111 12333 3356678 Q ss_pred cceeeeeeeeeE-eeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc--chhHHH Q lcl|NC_018271. 73 NEKQLTLKKIKS-DKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT--GNLQGI 149 (305) Q Consensus 73 ~~K~L~~~~~k~-~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~--~~fdG~ 149 (305) .+..|.+.+.-+ ...+ ++ +++. +-.+..+.++...|.++++..++..++.||++. +...|+ T Consensus 191 ~~~~l~~~k~~~~~i~i-s~--------ell~-------ds~~~l~~~i~~~la~a~~~~~~~a~l~G~G~~~~~~p~Gi 254 (409) T protein:vir:45 191 GMGSLGALKMTSKIIRV-SN--------ELLQ-------DSAIDMEAYLARRIAERIGRGEARYLIQGTGAGTPKQPKGL 254 (409) T ss_pred ceeeeeeeeeeeeehhh-hH--------HHHh-------ccHHHHHHHHHHHHHHHHHHHHHHHhhccCCCCCcccccee Confidence 888888766543 2222 11 2222 223567888999999999999999999999864 447888 Q ss_pred HHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCc Q lcl|NC_018271. 150 LPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNE 226 (305) Q Consensus 150 lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~ 226 (305) +........ ....+++| .+.+.++..+++..++.+....++|+...|.+.+ .++.-.|.+ .+.+++.. 
T Consensus 255 l~~~~~~~~----~~~~~~~~----~d~i~~l~~~l~~~~~~~a~~~~~~n~~~~~~l~-~lkd~~G~~i~~~~~~~~~~ 325 (409) T protein:vir:45 255 AASVTGTTQ----TAAANAVK----WQEILALKHSIDPAYRRGPKFRLAFNDNTLKLIS-EMEDGQGRPLWLPDIVGVAP 325 (409) T ss_pred eeccccccc----cccccccc----hHHHHHHHHhhhhhhccCCeEEEEECHHHHHHHH-HhhcCCCceeeccCcCCCCC Confidence 765332211 22233444 4556777888999998877677899999986554 443333332 23344555 Q ss_pred ceecceeeeeccCCCC-----CeEEEecchHHhhhhhhhhhhhhccccceeeec-cceeEEEEEEeecceeeccCCeEEE Q lcl|NC_018271. 227 FDFEGYTLTEIKGLPA-----SRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVD-LSGQIRTKMVLSAGVEYAYGAEIVL 300 (305) Q Consensus 227 ~~~kGi~iv~l~~~Pd-----~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~-~~~~~f~k~~m~~d~~i~fg~E~v~ 300 (305) ..+.|++++....||+ ..|+..+-++.+++ +...+.++.+.... -..++-+...+-+|+.+.-++=|++ T Consensus 326 ~~l~G~PV~~~~~~p~~~~~~~~i~~Gd~~~~~i~-----~~~~~~~~~~~d~~~~~~~~~~~~~~r~d~~~~~~~A~~~ 400 (409) T protein:vir:45 326 ASVLNVPYVIDQEIDDIGAGKKFMFCGDFDRFIIR-----RVRYMILKRLVERYAEYDQTGFLAFHRFDCILEDTSAIKA 400 (409) T ss_pred ceecceeeEEecCcCCccCCccEEEEeehhhhhee-----eccceEEEEeecccccCCcEEEEEEEEeccEeechhheEE Confidence 6799999998888885 45666676676533 22333232111111 1345566777778999988888888 Q ss_pred ecCCC Q lcl|NC_018271. 301 YTPAA 305 (305) Q Consensus 301 ~~~~~ 305 (305) .+-.+ T Consensus 401 l~~k~ 405 (409) T protein:vir:45 401 LVGKG 405 (409) T ss_pred EEecc Confidence 77655 No 19 >protein:vir:100135 Length: 418 # NCBI annotation: gp5 # Family: family:all:585 # MgeID: mge:1639 # MgeName: phi1026b # Cross-refs: genbank:acc:NP_945035;genbank:gi:38707895;genbank:GeneID:2744182 Probab=98.55 E-value=7.6e-09 Score=65.11 Aligned_cols=267 Identities=11% Similarity=0.032 Sum_probs=150.9 Q ss_pred CceEeeeec-----ccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCC------Cccc Q lcl|NC_018271. 
1 MATTVDITT-----NYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFT------PSGE 69 (305) Q Consensus 1 ma~~~~~~~-----~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~------~~G~ 69 (305) +.......+ .--.++..+++..+..-..+.+. ++++| ++..+...++...... .+.|. +.++ T Consensus 132 ~~~~~~~~~~~~g~lvp~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~~~~~~~~~~----~a~~v~E~~~~~~~~ 204 (418) T protein:vir:10 132 VPATVGSGVSGSNSLVVADRQAGIIAPPQRKMTIRDL--LMPGQ-TSSSSIEYTVETGFTN----NAAAVAEGAQKPTSD 204 (418) T ss_pred hhhhccCCCCCCccccchhHHHHHHHHHhhhhhHHhh--cceee-ccCCceeEEEEecCCC----ceeeeccCccccccc Confidence 111111111 12224445555555555555444 56555 3233222333222111 22342 3356 Q ss_pred eEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHH Q lcl|NC_018271. 70 VDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGI 149 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~ 149 (305) .+|++..+.++++.+...++ +++.+ +- +..+.++.+.|.++++..++..++.|+++.+...|+ T Consensus 205 ~~f~~v~~~~~k~~~~~~is-~ell~---------------ds-~~l~~~i~~~l~~a~~~~~d~a~l~G~g~~~~p~Gi 267 (418) T protein:vir:10 205 LKFNLKNQPVRTIAHLFKAS-RQILD---------------DA-PALQSYIDGRARYGLQLTEEGQILKGDGTGANILGI 267 (418) T ss_pred cceeeEEEeeeeEEEeehhh-HHHHH---------------hH-HHHHHHHHHHHHHHHHHHHHHHHhccCCCCcccccc Confidence 79999999999999987755 22311 11 246678889999999999999999999998888899 Q ss_pred HHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC--cccCCCcc Q lcl|NC_018271. 150 LPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG--TFLNPNEF 227 (305) Q Consensus 150 lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~--~~t~~~~~ 227 (305) ++.......+ ....+.. . ++.+.++...+....++ +-.++||...|...+. ++.-.|.+. +..++... 
T Consensus 268 ~~~~~~~~~~---~~~~~~~---~-~~~i~~~~~~~~~~~~~--~~~~v~n~~~~~~L~~-lkd~~G~~i~~~~~~~~~~ 337 (418) T protein:vir:10 268 LPQASAFMPS---ITLANAT---P-IDKIRLALLQAVLAEFP--ATGIVLNPIDWASIEL-TKDSQGRYIVGNPVNGTTP 337 (418) T ss_pred cccccccccc---ccccccc---c-HHHHHHHHHhhccccCC--CCEEEEcHHHHHHHHH-hhcCCCceeccccccCCCc Confidence 8863222111 1111112 2 34444555555555443 2379999999876643 332222222 22344456 Q ss_pred eecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcc--cccee-eeccceeEEEEEEeecceeeccCCeEEEecCC Q lcl|NC_018271. 228 DFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIR--IKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPA 304 (305) Q Consensus 228 ~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~--I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~ 304 (305) .+.|++++....+|++.++..+-++.+..+ |-..+. +++-. ...-...+-+...+-+|+.+..++=||+.+.. T Consensus 338 ~l~G~pV~~~~~~p~~~~~~gd~s~~~~~~----~~~~~~i~~~~~~~~~f~~~~~~~r~~~~~d~~~~~~~a~~~~~~~ 413 (418) T protein:vir:10 338 RLWNLPVVETQAMTANEFLVGAFSMAAQIF----DRMEIEVLLSTENVDDFEKNMVSIRAEERLALAVYRPESFVTGALV 413 (418) T ss_pred eecceeeEEcCCCCCCcEEEeeccceEEEE----EecceEEEEecccchhhhcCceEEEEEEeeccEEecccceEEEEec Confidence 799999999999999998888877643211 111222 22111 11123445566667778888888888776554 Q ss_pred C Q lcl|NC_018271. 305 A 305 (305) Q Consensus 305 ~ 305 (305) + T Consensus 414 ~ 414 (418) T protein:vir:10 414 E 414 (418) T ss_pred c Confidence 4 No 20 >protein:vir:95376 Length: 425 # NCBI annotation: phage major capsid protein # Family: family:all:635 # MgeID: mge:1567 # MgeName: GBSV1 # Cross-refs: genbank:acc:YP_764476;genbank:gi:115334630;genbank:GeneID:5179263 Probab=98.51 E-value=1.6e-08 Score=63.30 Aligned_cols=268 Identities=15% Similarity=0.039 Sum_probs=162.1 Q ss_pred CceEeeeeccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccce-E Q lcl|NC_018271. 
1 MATTVDITTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEV-D 71 (305) Q Consensus 1 ma~~~~~~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~-~ 71 (305) .+.+-...+.| -.+...+|+..+.....+.+. +++.| ++.++. +|+.... -.+.|.+ .++. . T Consensus 138 ~~~~~~~~gg~~vP~~~~~~Ii~~l~~~~~i~~~--~~~~~-~~g~~~-ip~~~~~-----~~a~~v~E~~~~~~~~~~~ 208 (425) T protein:vir:95 138 RNLRAVAGGELTIPEVVVNRIMDIMGDYTTLYPL--VDKIR-VKGTTR-ILVDTDT-----SPATWIEQSGALPTGDVGT 208 (425) T ss_pred HhhcccccCceeccHHHHHHHHHHHHhhhhHHHh--hceee-cCceeE-EEEecCC-----ccccccccccccccccccc Confidence 22211112223 233455566666666665543 78776 666543 4543222 1234433 3343 6 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc-chhHHHH Q lcl|NC_018271. 72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT-GNLQGIL 150 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~-~~fdG~l 150 (305) |++..|.++++-+...++=. ++ ++=++..+.++...|.+.++..++..++.|+++. +...|++ T Consensus 209 f~~i~l~~~k~~~~~~iS~e---------ll-------~ds~~~l~~~i~~~l~~~i~~~~d~~il~G~G~~~~~p~Gil 272 (425) T protein:vir:95 209 IASIDFDGFKVGKVTFVDNY---------LL-------QDSIINLDDYVTKKIARAIAKALDLAIVKGTGAANKQPLGII 272 (425) T ss_pred cceeeeeheeeeeeehhhHH---------HH-------hccHHHHHHHHHHHHHHHHHHHHHHHhhccCCCCccccceee Confidence 88889999988887766532 11 2225567788889999999999999999999985 5588998 Q ss_pred HHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc----cCCCc Q lcl|NC_018271. 151 PLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF----LNPNE 226 (305) Q Consensus 151 k~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~----t~~~~ 226 (305) +.+.....+ ....++.| .+.+.++...+...++..++..++|+..+|......+...-.+++.+ ..+.. 
T Consensus 273 ~~~~~~~~~---~~~~~~~~----~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~~l~~l~~~kd~~g~~i~~~~~~~~ 345 (425) T protein:vir:95 273 PSLPPENQV---TVEADNNL----LKNLVKQIGLIDTGDDSVGEIVAVMKRSTYYNRLVEFSIQVDSNGNVVGKLPNLRT 345 (425) T ss_pred ccccccccc---ccccccch----HHHHHHHHHhhhhhccccCceEEEEeChHHHHHHHHHHhhcCCCCceeeccCCCCC Confidence 765544332 22223334 34445566666666666677889999887655444454333333332 22333 Q ss_pred ceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeec-cceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 227 FDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVD-LSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 227 ~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~-~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ..+.|.+++.-..+|++.|+..+-++.+++. ...+.|+...... -..+.-+...+-+|..+.-++=||+.+... T Consensus 346 ~~l~G~pvv~~~~~~~~~i~~Gd~~~~~~~~-----~~~~~i~~~~~~~f~~~~~~~~~~~r~d~~~~~~~a~~~~~i~~ 420 (425) T protein:vir:95 346 PDLLGLRVVFNNFLDDDTVLFGEFEQYTLVE-----RENITIDSSTHVKFTEDQTAFRGKGRFDGKPVKPEAFVLVTITD 420 (425) T ss_pred ccccceeeEEcCcCCCccEEEEecccEEEEe-----ecceEEEeecccccccCceEEEEEEeeCcEeecccceEEEEecC Confidence 4588999999999999998888877765553 2233333222211 123445555666788888888888876554 No 21 >protein:vir:105038 Length: 428 # NCBI annotation: major capsid head protein precursor # Family: family:all:21 # MgeID: mge:1465 # MgeName: phiKO2 # Cross-refs: genbank:acc:YP_006586;genbank:gi:46402092;genbank:GeneID:2777903 Probab=98.51 E-value=1.5e-08 Score=63.47 Aligned_cols=272 Identities=14% Similarity=0.152 Sum_probs=150.1 Q ss_pred CceEeee-ecccc-hhH-HHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceE Q lcl|NC_018271. 1 MATTVDI-TTNYV-GEV-AGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVD 71 (305) Q Consensus 1 ma~~~~~-~~~Y~-Ge~-l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~ 71 (305) ++..... ++.|. ++. ..+|+..+.....+-+- ..+++|+-+.+. .+|+...+-. +.|.+ -++.. 
T Consensus 125 ~~~~~~~~~gg~liP~~~~~~ii~~l~~~~~l~~~-~~~~~~~~~g~~-~~p~~~~~~~-----a~~v~Eg~~~~~~~~~ 197 (428) T protein:vir:10 125 MAISTAAGSGGVLIPQNIHSEVIELLRDRTIVRKL-GARSIPLPNGNM-SLPRLAGGAT-----ASYTGENQDAKVSEAR 197 (428) T ss_pred hhhcccccCCccccchhHHHHHHHHHhhhchhhhh-cceeeecCCcce-EEEEEeCCcc-----eeeeccCccccccccc Confidence 2222221 12332 333 34555554444444333 377788765553 3454432211 23322 35678 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) |++..|.++++-+...++-+=. . +=.+..+.++.+.|.++++..++..++.||++.+.+.|+++ T Consensus 198 f~~i~~~~~k~~~~v~is~ell---------~-------ds~~~l~~~i~~~l~~ai~~~~d~~~l~G~G~~~~p~Gi~~ 261 (428) T protein:vir:10 198 FDDVKLTAKTMIAMVPISNALI---------G-------RAGFNVEQLVLQDILTAISVREDKAFMRDDGTGDTPIGMKA 261 (428) T ss_pred eeeEEeeeEEEEEeehhhHHHH---------h-------hhhHHHHHHHHHHHHHHHHHHHHHHHhccCCCCcccccccc Confidence 9999999999999877654311 1 11245678888999999999999999999999888999998 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHH--HhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCccee Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFI--DAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDF 229 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~--~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~ 229 (305) ........+. .......+...+...++.+. ...+...+ .+..++||...|.. 
-+.++.-.|.+.- .....+.+ T Consensus 262 ~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~v~n~~~~~~-L~~lkd~~G~~i~-~~~~~g~l 336 (428) T protein:vir:10 262 RATQWNRLLP-WAADAAVNLDTIDTYLDSIILMSMDGNSNM--ISSGWGMSNRTYMK-LFGLRDGNGNKVY-PEMAQGML 336 (428) T ss_pred cccccccccc-ccccccccHHHHHHHHHHHHHhhhcccccc--ccCEEEEcHHHHHH-HHHhhccCCceec-cCCCCCee Confidence 7544333221 11111222222222222221 12222222 34689999999864 3445433333221 22334479 Q ss_pred cceeeeeccCCCCC--------eEEEecchHHhhhhhhhhhhhhcccc--cee----------eeccceeEEEEEEeecc Q lcl|NC_018271. 230 EGYTLTEIKGLPAS--------RMVGYNRDNIVIGMSAQSDFNEIRIK--DMG----------DVDLSGQIRTKMVLSAG 289 (305) Q Consensus 230 kGi~iv~l~~~Pd~--------~ii~T~~sNl~~gvnl~~D~n~I~I~--~~~----------~~~~~~~~f~k~~m~~d 289 (305) .|++++....+|++ .|+..+-+++++|.. ..++++ +-. ..+......+...+-+| T Consensus 337 ~G~pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~~-----~~i~i~~~~~~~~~~~~~~~~~~f~~~~~~~R~~~r~d 411 (428) T protein:vir:10 337 KGYPIQRTSAIPANLGEGGKESEIYFADFNDVVIGED-----GNMKVDFSKEASYIDTDGKLVSAFSRNQSLIRVVTEHD 411 (428) T ss_pred eceeeEEeccccccccCCCccceEEEEecceEEEEEe-----cceEEEeecccccccccccccchhhcchhheeeeeeeC Confidence 99999888888865 355566666655432 222222 100 11122234445556668 Q ss_pred eeeccCCeEEEecCCC Q lcl|NC_018271. 290 VEYAYGAEIVLYTPAA 305 (305) Q Consensus 290 ~~i~fg~E~v~~~~~~ 305 (305) +.+..++=||+=|-.. T Consensus 412 ~~v~~p~a~~~~t~~~ 427 (428) T protein:vir:10 412 IGFRHPEGLVLGTGVL 427 (428) T ss_pred ceeeccceEEEEeccC Confidence 8888887777766555 No 22 >protein:vir:9309 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:165 # MgeName: phi 11 # Cross-refs: genbank:acc:NP_803287;genbank:gi:29028597;genbank:GeneID:1258044 Probab=98.49 E-value=3e-08 Score=61.80 Aligned_cols=270 Identities=11% Similarity=0.093 Sum_probs=159.1 Q ss_pred CceEee--eecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 
1 MATTVD--ITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~--~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) +.++.. ....--.+...+++..+.....+.+. .+++| ++.....+|+...+...+-..+ +--+-++.+|++..+ T Consensus 27 ~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~l--~~~~~-~~~~~~~ip~~~~~~~a~~v~Eg~~~~~~~~~f~~i~~ 103 (324) T protein:vir:93 27 DNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQL--GKYEP-MEGTEKKFTFWADKPGAYWVGEGQKIETSKATWVNATM 103 (324) T ss_pred ccccccCCCcceechhHHHHHHHHHHhhchhhhh--cceee-ccCCceEEEEEecCcceeeecCCccccccccceeEEEE Confidence 111111 11122344555666555555555544 56666 5444444555443333221111 111234678999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhcc Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADA 157 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~ 157 (305) .++++-+...++=+-.+ +=.+..+.++.+.|.++++..++..++.|+++.+...|++..+.... T Consensus 104 ~~~k~~~~~~iS~ell~----------------ds~~~l~~~i~~~l~~aia~~~d~a~l~G~g~~~~~~~~~~~~~~~~ 167 (324) T protein:vir:93 104 RAFKLGVILPVTKEFLN----------------YTYSQFFEEMKPMIAEAFYKKFDEAGILNQGNNPFGKSIAQSIEKTN 167 (324) T ss_pred EeEEEEEeehhhHHHHh----------------cchHHHHHHHHHHHHHHHHHHHHHHHhcCCCCCCcCccccccccccc Confidence 99999888766432111 11345667888999999999999999999987766667665443221 Q ss_pred ceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceeeeec Q lcl|NC_018271. 158 TVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTLTEI 237 (305) Q Consensus 158 ~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~iv~l 237 (305) . ...+++| .+.+.++..+++...+.. . .++||...|.+.+. ++.-.|.+ -...+....+.|++++-. 
T Consensus 168 ~-----~~~~~~~----~~~i~~~~~~l~~~~~~~-~-~~v~n~~~~~~L~~-l~d~~G~~-~~~~~~~~~l~G~PVv~~ 234 (324) T protein:vir:93 168 K-----VIKGDFT----QDNIIDLEALLEDDELEA-N-AFISKTQNRSLLRK-IVDPETKE-RIYDRNSDSLDGLPVVNL 234 (324) T ss_pred e-----ecccccc----HHHHHHHHHhhhhccCCC-C-EEEEcHHHHHHHHH-hhCCCCCe-eecCCCCCcccceeeEee Confidence 1 1223445 445556666777765543 3 79999999887653 33222222 123344557999998765 Q ss_pred cC--CCCCeEEEecchHHhhhhhhhhhhhhcccccee-------------eeccceeEEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 238 KG--LPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-------------DVDLSGQIRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 238 ~~--~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-------------~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) .+ .+++.+++.+.+++++|.. .++. |++.+=. +++-..+..|+..+-+|+.+..++=||.-+ T Consensus 235 ~~~~~~~~~i~~gdfs~~~~~~~--~~~~-i~~~~~~~~~~~~~~~~~~~~~f~~n~~~~r~~~r~d~~v~~~~a~~~l~ 311 (324) T protein:vir:93 235 KSSNLKRGELITGDFDKLIYGIP--QLIE-YKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLV 311 (324) T ss_pred cCCCCCcceEEEEecceEEEEEe--cCcE-EEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccceEEEe Confidence 44 4677899999888876542 2222 2222110 122334566777778899999998899888 Q ss_pred CCC Q lcl|NC_018271. 303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) ++. T Consensus 312 ~a~ 314 (324) T protein:vir:93 312 PAD 314 (324) T ss_pred ccc Confidence 877 No 23 >protein:vir:103955 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1662 # MgeName: phiNM # Cross-refs: genbank:acc:YP_873992;genbank:gi:118430767;genbank:GeneID:4525449 Probab=98.49 E-value=3.2e-08 Score=61.66 Aligned_cols=270 Identities=11% Similarity=0.088 Sum_probs=159.1 Q ss_pred CceEee--eecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 1 MATTVD--ITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~--~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) +.++.. ....--.++..+++..+.....+.+. 
.+++| ++..+..+|+...+...+-..+ +--+.++.+|.+..+ T Consensus 27 ~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~~--~~~~~-~~~~~~~~p~~~~~~~a~~v~Eg~~~~~~~~~~~~v~~ 103 (324) T protein:vir:10 27 DNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQL--GKYEP-MEGTEKKFTFWADKPGAYWVGEGQKIETSKATWVNATM 103 (324) T ss_pred cceeccCCCcceechhHHHHHHHHHHhhchhhhh--cceee-ccCCceEEEEEeCCcceeEeccCccccccccceeEEEE Confidence 111111 11122345566677677666666554 67777 4444444555443322221111 112345678999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhcc Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADA 157 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~ 157 (305) .++++-+...++=+ +++ +-++..+.++.+.|.++++..++..++.|+++...-.|++..+.... T Consensus 104 ~~~k~~~~~~iS~e---------ll~-------ds~~~l~~~i~~~l~~ai~~~~d~a~l~G~g~~~~~~~i~~~~~~~~ 167 (324) T protein:vir:10 104 RAFKLGVILPVTKE---------FLN-------YTYSQFFEEMKPMIAEAFYKKFDEAGILNQGNNPFGKSIAQSIEKTN 167 (324) T ss_pred eeEEEEEeehhhHH---------HHh-------cchHHHHHHHHHHHHHHHHHHHHHHhhhcCCCCccCccccccccccc Confidence 99998888775432 111 22456778888999999999999999999987665566554332211 Q ss_pred ceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceeeeec Q lcl|NC_018271. 158 TVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTLTEI 237 (305) Q Consensus 158 ~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~iv~l 237 (305) . ..++++| .+.+.++..+++...+.. + .++||...|.+.+. ++.-.|.+ ....+....+.|++++-. 
T Consensus 168 ~-----~~~~~~t----~~~i~~~~~~l~~~~~~~-~-~~v~n~~~~~~L~~-l~d~~g~~-~~~~~~~~~l~G~PV~~~ 234 (324) T protein:vir:10 168 K-----VIKGDFT----QDNIIDLEALLEDDELEA-N-AFISKTQNRSLLRK-IVDPETKE-RIYDRNSDTLDGLPVVNL 234 (324) T ss_pred e-----eccccCC----HHHHHHHHHhhhhccCCC-C-EEEEcHHHHHHHHH-hhccCCce-eecCCCCccccceeEEee Confidence 1 1234455 455556667777766543 3 69999999987653 32222211 222334456899987655 Q ss_pred cC--CCCCeEEEecchHHhhhhhhhhhhhhcccc-ce-----e-------eeccceeEEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 238 KG--LPASRMVGYNRDNIVIGMSAQSDFNEIRIK-DM-----G-------DVDLSGQIRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 238 ~~--~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~-~~-----~-------~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) .. .+++.+++.+-+++++|+.- ++. |++. +. . +++-+....+...+-+|+.+..++=|+..+ T Consensus 235 ~~~~~~~~~~~~gd~~~~~~~~~~--~~~-i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~A~~~l~ 311 (324) T protein:vir:10 235 KSSNLKRGELITGDFDKLIYGIPQ--LIE-YKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLV 311 (324) T ss_pred cCCCCCcceEEEEecccEEEEEec--CcE-EEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEecccceEEEE Confidence 44 46778999999998776522 222 2211 10 0 122334566677777799888887788887 Q ss_pred CCC Q lcl|NC_018271. 303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) .+. T Consensus 312 ~a~ 314 (324) T protein:vir:10 312 PAD 314 (324) T ss_pred ecc Confidence 766 No 24 >protein:vir:99749 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1497 # MgeName: phiETA2 # Cross-refs: genbank:acc:YP_001004307;genbank:gi:122891761;genbank:GeneID:4712304 Probab=98.46 E-value=4.5e-08 Score=60.87 Aligned_cols=270 Identities=11% Similarity=0.084 Sum_probs=158.6 Q ss_pred CceEee--eecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 1 MATTVD--ITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~--~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) +.++.. ....--.++..+|+..+.....+.+. 
.+++| ++..+..+|+...+...+--.+ +--+.++.+|.+..| T Consensus 27 ~~~~~~~~~~~lip~~~~~~ii~~~~~~s~l~~~--~~~~~-~~~~~~~~p~~~~~~~a~~v~Eg~~~~~~~~~~~~v~~ 103 (324) T protein:vir:99 27 DNVMMHEKKDGTLLNDFTTPILQEVMENSKIMRL--GKYEP-MEGTEKKFTFWADKPGAYWVGEGQKIETSKATWVNATM 103 (324) T ss_pred cceeccCCCcceechhHHHHHHHHHHhhchhhhh--cceee-ccCCceEEEEEecCcceeEeccCccccccccceeEEEE Confidence 111111 11122344556666666555555443 67777 4444444555443222211111 112345678999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhcc Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADA 157 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~ 157 (305) .++++-+...++=+ ++. +-++..+.++.+.|.++++..++..++.|+++...-.|+...+.... T Consensus 104 ~~~k~~~~~~iS~e---------ll~-------ds~~~l~~~i~~~l~~ai~~~~d~~~l~G~g~~~~~~~~~~~~~~~~ 167 (324) T protein:vir:99 104 RAFKLGVILPVTKE---------FLN-------YTYSQFFEEMKPMIAEAFYKKFDEAGILNQGNNPFGKSIAQSIEKTN 167 (324) T ss_pred eeEEEEEeehhhHH---------HHh-------cchHHHHHHHHHHHHHHHHHHHHHHhhhcCCCCccCccccccccccc Confidence 99999888776532 111 22356778888999999999999999999987665556554332221 Q ss_pred ceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceeeeec Q lcl|NC_018271. 158 TVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTLTEI 237 (305) Q Consensus 158 ~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~iv~l 237 (305) . ..++++| .+.+.++..+++...+.. + .++||...|.+.+. ++.-.|.+ .........+.|++++-. 
T Consensus 168 ~-----~~~~~~~----~~~i~~~~~~l~~~~~~~-~-~~v~n~~~~~~L~~-l~d~~g~~-~~~~~~~~~l~G~PVv~~ 234 (324) T protein:vir:99 168 K-----VIKGDFT----QDNIIDLEALLEDDELEA-N-AFISKTQNRSLLRK-IVDPETKE-RIYDRNSDTLDGLPVVNL 234 (324) T ss_pred e-----eccccCC----HHHHHHHHHhhhhccCCC-C-EEEEcHHHHHHHHH-hhcCCCce-eecCCCCccccceeEEee Confidence 1 1233455 455666677777766543 3 69999999987653 32222211 222333446899988766 Q ss_pred cCC--CCCeEEEecchHHhhhhhhhhhhhhccccce-----------e--eeccceeEEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 238 KGL--PASRMVGYNRDNIVIGMSAQSDFNEIRIKDM-----------G--DVDLSGQIRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 238 ~~~--Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~-----------~--~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) ..+ +++.+++.+-+++++|... ++. |++.+= + .++-+....+...+-+|+.+..++=|+..| T Consensus 235 ~~~~~~~~~~i~gd~~~~~~~~~~--~~~-i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~lt 311 (324) T protein:vir:99 235 KSSNLKRGELITGDFDKLIYGIPQ--LIE-YKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLV 311 (324) T ss_pred cCCCCCcceEEEEecccEEEEEec--CcE-EEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEecccceEEEE Confidence 555 5678999999998765432 221 222110 0 112334566677777799999988899988 Q ss_pred CCC Q lcl|NC_018271. 303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) ++. T Consensus 312 ~a~ 314 (324) T protein:vir:99 312 PAD 314 (324) T ss_pred ecc Confidence 876 No 25 >protein:vir:80376 Length: 435 # NCBI annotation: gp6, major capsid head protein # Family: family:all:21 # MgeID: mge:1881 # MgeName: phi644-2 # Cross-refs: genbank:acc:YP_001111085;genbank:gi:134288639;genbank:GeneID:4960624 Probab=98.45 E-value=1.7e-08 Score=63.16 Aligned_cols=269 Identities=14% Similarity=0.170 Sum_probs=149.2 Q ss_pred Cce-Eee-eeccc-chhH-HHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCC------CCccce Q lcl|NC_018271. 1 MAT-TVD-ITTNY-VGEV-AGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGF------TPSGEV 70 (305) Q Consensus 1 ma~-~~~-~~~~Y-~Ge~-l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~------~~~G~~ 70 (305) ++. +.. ..+.| ..+. 
..+++..+.....+ .....+++|+-+.+. .+|+...+-. +.| -+-++. T Consensus 130 ~~~~~~~~~~gg~lvP~~~~~~ii~~l~~~~~i-~~~~~~~v~~~~~~~-~~p~~~~~~~-----a~~v~E~~~~~~~~~ 202 (435) T protein:vir:80 130 MSLNTLSPGAGGVLVPENLSSEVIELLRPKSVV-RKLGARTLPLSNGNI-TIPRLKGGAI-----VGYIGADTDIPTTQQ 202 (435) T ss_pred hhhcccCCCCCccccchhHHHHHHHHHhhhchh-hhccceeeecCCCce-EEEEEeCCcc-----eeeeccCcccccccc Confidence 111 111 11122 1222 23344444444433 332367778776553 3443332211 233 233678 Q ss_pred EecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHH Q lcl|NC_018271. 71 DINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGIL 150 (305) Q Consensus 71 ~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~l 150 (305) +|.+..+.++++-+...++=+ + +. +...-+..+.++.+.|.++++..++..++.||++.+...||+ T Consensus 203 ~f~~i~~~~~k~~~~~~is~e-l--------l~-----ds~~~~~l~~~i~~~l~~a~~~~~d~a~l~G~G~~~~p~Gi~ 268 (435) T protein:vir:80 203 QFDDLKLTAKKMAALVPIAND-L--------IK-----YAGVNPNVDQIVVGDLTAAIGAREDKAFIRDDGTANTPKGLR 268 (435) T ss_pred ceeeEEEeeEEEEEeehhhHH-H--------HH-----hhcccHHHHHHHHHHHHHHHHHHHHHHhhccCCCCCccccee Confidence 999999999999998886621 1 11 112234567888999999999999999999999888888987 Q ss_pred HHHhhccceEEeccCcCcCCh----hhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCc Q lcl|NC_018271. 151 PLLEADATVIDVVGASGGITA----ANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNE 226 (305) Q Consensus 151 k~i~~d~~~~~~~~~~~~iT~----anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~ 226 (305) ....... +.......|. .++...+..+..+.+ .+ .+-.+.||...+.+. +.++.-.|.+.- ..... 
T Consensus 269 ~~~~~~~----~~~~~~~~~~~~~~~d~~~~~~~~~~~~~--~~--~~~~~vmn~~~~~~L-~~lkd~~G~~l~-~~~~~ 338 (435) T protein:vir:80 269 FWALPGN----VITASDGSTLQKIETDLGKAILALENADA--NL--TQPGWIMAPRTFRFL-EGLRDGNGNKVY-PELAN 338 (435) T ss_pred ecccccc----eeecccccchhhHHHHHHHHHHHhhcccc--cc--ccCEEEEcHHHHHHH-HhhhccCCceec-cCCCC Confidence 7542211 1111112222 233333333332211 12 345899999999544 344433333321 12223 Q ss_pred ceecceeeeeccCCCCC--------eEEEecchHHhhhhhhhhhhhhcccccee------------eeccceeEEEEEEe Q lcl|NC_018271. 227 FDFEGYTLTEIKGLPAS--------RMVGYNRDNIVIGMSAQSDFNEIRIKDMG------------DVDLSGQIRTKMVL 286 (305) Q Consensus 227 ~~~kGi~iv~l~~~Pd~--------~ii~T~~sNl~~gvnl~~D~n~I~I~~~~------------~~~~~~~~f~k~~m 286 (305) ..+.|++++....+|+. .|+..+-+++++|.+ ..++|+..- ..+......+.-.. T Consensus 339 ~~l~G~pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~~-----~~~~i~~~~~~~~~~~~~~~~~~f~~n~~~~r~~~ 413 (435) T protein:vir:80 339 GMLKGYPVGKTTQVPINLGEAGKESEIYFTDFGDVFIGEE-----ETLEIDYSKEATYKDADGHMVSAFQRDQTLIRVIA 413 (435) T ss_pred CeEeeeeeEEeccccccccCCCCcceEEEEEcccEEEEee-----cceEEEEeccccccccccchhhhhhcCcceeeeee Confidence 47999999888888863 577778788776542 223332110 11112233445556 Q ss_pred ecceeeccCCeEEEecCCC Q lcl|NC_018271. 287 SAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 287 ~~d~~i~fg~E~v~~~~~~ 305 (305) -+|+.+..++=||+=|..+ T Consensus 414 r~d~~~~~~~a~~~l~~~~ 432 (435) T protein:vir:80 414 KNDFGPRHVESIAVLSGVA 432 (435) T ss_pred eeCcEeecccceEEEeccC Confidence 6688888888888888777 No 26 >protein:vir:6242 Length: 390 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:131 # MgeName: phi-BT1 # Cross-refs: genbank:acc:NP_813696;swissprot:trembl:q859c1;genbank:gi:29366756;interpro:IPR006444;uniprot:Q859C1;genbank:GeneID:1258897 Probab=98.45 E-value=5.2e-08 Score=60.50 Aligned_cols=267 Identities=13% Similarity=0.103 Sum_probs=150.6 Q ss_pred CceEeeeecccc-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEec Q lcl|NC_018271. 
1 MATTVDITTNYV-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~~Y~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~~ 73 (305) ++.+......|. .+...+++...+....++ +.+.+|+|--....+.+|+...+. .+.|.+ .++.+|. T Consensus 111 ~~~t~~~~g~~~~~~~~~~~i~~~~~~~~~l-~~~~~~~~~~~~~~~~~p~~~~~~-----~a~wv~E~~~~~~~~~~f~ 184 (390) T protein:vir:62 111 RDGTKAGNPNVLSRTLYGQLIAQAVERSAIM-RGGATTFTTSDANPLDFTVITGRS-----SASIVGETAEIPESYPATA 184 (390) T ss_pred hcccccCCCccccccchHHHHHHHHhhhhhh-hhcceeeecCCCceeEEEEEcCCc-----ceeeeccccccccccccee Confidence 111111111221 122334444433333333 223454442211112233332221 234432 3577899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..|.++++-+...++=+-++. =.+..+.++.+.|.++++..++..++.|+++ =.||+... T Consensus 185 ~i~~~~~k~~~~~~iS~ell~d----------------s~~~l~~~i~~~l~~~i~~~~d~~~l~G~G~---p~Gi~~~~ 245 (390) T protein:vir:62 185 QRSMGGFKYGFASVVSYEFATD----------------QVLDLVGFLVSDAGPAIGDAMGRHFITGTGQ---PRGILTDA 245 (390) T ss_pred eeEeeeeeEEeehHHHHHHHhh----------------hhHHHHHHHHHHHHHHHHHHHHhhhhccCCc---cccccccc Confidence 9999999999887665222211 1235667888899999999999999999986 35777765 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~k 230 (305) ......+. ...++.+|. +.+-+++.+++..++. +-.++||...+...+ .++...|.+. +...+....+. 
T Consensus 246 ~~~~~~~~-~~~~~~~~~----~~l~~~~~~l~~~~~~--~a~~vmn~~~~~~L~-~lkd~~g~~l~~~~~~~g~~~~l~ 317 (390) T protein:vir:62 246 SPATATFL-ATDTDSKVS----DALIDLFHEVPSAYRA--NAKYVVNDLRAAQMR-KLKDANGQYLWQSGLTVGAPSLFN 317 (390) T ss_pred ccccccee-cccccccch----HHHHHHHHhhhhhhhc--CCEEEEchHHHHHHH-HhhccCCCeeecCCcCCCccceec Confidence 44333221 222334553 4455666777777775 348999999986543 4544333221 12233345799 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeee-ccceeEEEEEEeecceeeccCCeEEEe--cCCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDV-DLSGQIRTKMVLSAGVEYAYGAEIVLY--TPAA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~-~~~~~~f~k~~m~~d~~i~fg~E~v~~--~~~~ 305 (305) |.+++-..++|+..|+..+-+..+++. ...+.++..... .-+.++-|..++..|+.+.-++=+++- +|+| T Consensus 318 G~Pv~~~~~~p~~~i~~gd~s~~~i~~-----~~~~~v~~~~~~~~~~~~~~~~~~~r~d~~~~~~~A~~~l~~~~~a 390 (390) T protein:vir:62 318 GKVVETDDGMPADKILFADLSKYRVRF-----AGSLRVDRSVDAKFSTDQIVYRFLQRADGLLVDARGAKVLTVTPGA 390 (390) T ss_pred ccceEEecCCCCccEEEeeccceeEEe-----ecceEEEeeccccccCCcEEEEEEEEeCcEeechhheEEEEeecCC Confidence 999999999999988887766654332 223333322211 123446667778888888777777654 5666 No 27 >protein:vir:81070 Length: 390 # NCBI annotation: p09 # Family: family:all:585 # MgeID: mge:1889 # MgeName: Xop411 # Cross-refs: genbank:acc:YP_001285679;genbank:gi:148727187;genbank:GeneID:5247115 Probab=98.43 E-value=3.4e-08 Score=61.56 Aligned_cols=270 Identities=12% Similarity=0.028 Sum_probs=155.7 Q ss_pred CceEeeee--cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcc-ccCCCC-CCCCccceEeccee Q lcl|NC_018271. 1 MATTVDIT--TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDD-FVDYSC-GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~--~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~-~q~~~~-~~~~~G~~~~~~K~ 76 (305) +.++.... ..-..++..+++.....-..+.+ +++++|-- ......++...... ..-..+ +-.+.++.+|++.. 
T Consensus 113 ~~~~~~~~~g~~~~~~~~~~ii~~~~~~~~l~~--~~~~~~~~-~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~~~~i~ 189 (390) T protein:vir:81 113 ASTDAAGSAGALTTPNRLPGFITPPDARLTVRD--LIGSGRTD-SALIEYVQETGFVNNAAIVAEGALKPESSLKFAKKT 189 (390) T ss_pred hccccccCCcceechhhhHHHHHHHhhhhhhhh--hcceeecc-CCceEEEEEecCCcceeeecCCcccccccceeeEEE Confidence 11111111 11222344555555544444443 36666633 23222333222211 110111 11234567899999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) ++++++-+...++ +++. . + .+.++.++...|.+.++..++..++.||++.+.+.|++...... T Consensus 190 ~~~~k~~~~~~is-~ell--------~-------d-~~~~~~~i~~~l~~~~~~~~d~a~l~G~g~~~~~~Gi~~~~~~~ 252 (390) T protein:vir:81 190 DTTHVIAHTMKAT-RQIL--------S-------D-APQLASYMNNRLIRGLKVKEDAEILRGTGANDGLLGLIPQATTY 252 (390) T ss_pred EeeeEEEEeehhh-HHHH--------H-------h-HHHHHHHHHHHHHHHHHHHHHHHHHhcCCCCCcccceeeccccc Confidence 9999999987753 3331 1 1 12467888899999999999999999999988899998653322 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC--CcccCCCcceecceee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN--GTFLNPNEFDFEGYTL 234 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~--~~~t~~~~~~~kGi~i 234 (305) .. ..+.+.+...+.+.++..++....+.. 
=.++||...|.+.+- ++.-.|.+ .+...+....+.|+++ T Consensus 253 ~~-------~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~v~~~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~l~G~pv 322 (390) T protein:vir:81 253 AA-------PTTIAGATRVDQLRLAMLQASLAEYNP--SGIVINPIDWAAIEL-AKDANNQYLIGNARGTLTPTLWGLPV 322 (390) T ss_pred cc-------ccccccchhHHHHHHHHHhhccccCCC--CEEEEcHHHHHHHHH-hhcCCCceeecCcccccCceecceee Confidence 11 112233334566777777777766543 279999999876653 32222221 1222333457999999 Q ss_pred eeccCCCCCeEEEecchHHhhhhhhhhhhhhcccc--ceeeeccceeEEEEEEeecceeeccCCeEEEecCC Q lcl|NC_018271. 235 TEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIK--DMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPA 304 (305) Q Consensus 235 v~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~--~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~ 304 (305) +....+|++.++..+-++.+..+ |-..++++ +-....-...+-+...+-+|+.+..++=||+=|=| T Consensus 323 ~~~~~~p~~~~~~gd~~~~~~~~----~~~~~~v~~~~~~~~~~~~~v~~r~~~r~d~~v~~~~a~v~~t~a 390 (390) T protein:vir:81 323 VATQAMAPGEFLVGAFDLAAQIF----DQWDARVEIGYVGEDFQRNMITVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred EEcCCCCCCcEEEEehhceEEEE----EecceEEEEecccchhhcCcEEEEEEEeeccEEecccceEEEEeC Confidence 99999999988887766642211 11223333 11111123345566677778888888888888877 No 28 >protein:vir:5739 Length: 366 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:122 # MgeName: PY54 # Cross-refs: genbank:acc:NP_892050;genbank:gi:33770513;interpro:IPR006444;uniprot:Q7Y410;genbank:GeneID:1732928 Probab=98.42 E-value=9.2e-08 Score=59.17 Aligned_cols=276 Identities=15% Similarity=0.160 Sum_probs=157.6 Q ss_pred CceEeeee-ccc-chhH-HHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEeccee Q lcl|NC_018271. 1 MATTVDIT-TNY-VGEV-AGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~-~~Y-~Ge~-l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~ 76 (305) ||...... ..| ..+. ..+++..+.....+- ....|++|+-+.+ +.+|+...+....--.+ +--+.++.+|.+.. 
T Consensus 64 ~a~~~~~~~Gg~lvP~~~~~~ii~~l~~~s~l~-~lg~~~v~~~~g~-~~~p~~t~~~~a~wv~E~~~~~~s~~~f~~i~ 141 (366) T protein:vir:57 64 MAISTAAGSGGALIPQNMQNEVIELLRDRTVVR-ILGARSIPLPNGN-LSMPRLSGGATAGYVGEGKDVVATGATFDDVK 141 (366) T ss_pred hhccccccCCccccchhHHHHHHHHHhhhcchh-hhceeeeecCCCc-eEEEEEeCCcceeeeccCccccccccceeEEE Confidence 33322221 233 2333 334554444433332 3237888876555 33454432222210000 11223467899999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) +.++++-+...++ +++ + ++-.+..+.++.+.|.++++..++..++.||++.+...|++...... T Consensus 142 ~~~~k~~~~~~iS-~el--------l-------~ds~~~~~~~i~~~l~~a~~~~~d~a~l~G~G~~~~p~Gi~~~~~~~ 205 (366) T protein:vir:57 142 LSAKTMIALVPVS-NQL--------I-------GRAGFNVEQLLLGDILSAIATREDKAFLRDDGTGDTPKGMKAVATAA 205 (366) T ss_pred EeeEEEEEeehhh-HHH--------H-------hhhhHHHHHHHHHHHHHHHHHHHHHHhhccCCCCccccceeeccccc Confidence 9999999988764 222 1 12235567888899999999999999999999888899998765443 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccH--HHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTD--EILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTL 234 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~--~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~i 234 (305) ..++.. ..+..+.+.+...++.+...... ..+ .+-.+.||...+.+.+ .++.-.|.+. 
+.......+.|+++ T Consensus 206 ~~~~~~--~~t~~~~~~~~~~~~~~~~~~~~~~~~~--~~a~~vmn~~~~~~L~-~lkd~~G~~l-~~~~~~g~l~G~Pv 279 (366) T protein:vir:57 206 NRLVAW--TGTAINLTTIDEYLDSLILKHMDSNSNM--IRCGWGLSNRTYMTLF-GLRDGNGNKV-YPEMSQGILKGYPI 279 (366) T ss_pred cceeec--cccccchhhHHHHHHHHHHhhhcccccc--ccCEEEecHHHHHHHH-hhhccCCcee-ccCCCCCeecceee Confidence 332221 12234444444444444333222 222 2458999999986544 4443333322 12333457999999 Q ss_pred eeccCCCCC--------eEEEecchHHhhhhhhhhhhhhcccc--------ce----eeeccceeEEEEEEeecceeecc Q lcl|NC_018271. 235 TEIKGLPAS--------RMVGYNRDNIVIGMSAQSDFNEIRIK--------DM----GDVDLSGQIRTKMVLSAGVEYAY 294 (305) Q Consensus 235 v~l~~~Pd~--------~ii~T~~sNl~~gvnl~~D~n~I~I~--------~~----~~~~~~~~~f~k~~m~~d~~i~f 294 (305) +....||++ .|+..+-+++++|... .+.|+ +- ..++-..+.-++..+-+|+.+.. T Consensus 280 v~s~~ip~~~~~~~~~~~i~~gdfs~~~i~~~~-----~i~i~~~~ea~~~~~~g~~~~~f~~~~~~iR~~~~~d~~v~~ 354 (366) T protein:vir:57 280 QRTSAIPANLGDDGNESEIYFCDFNDVVIGEDG-----MMKVDFSTEATYKDADGQLVSAFARNQSLIRVVTEHDIGFRH 354 (366) T ss_pred EEccccccccccCCCccEEEEEecceEEEEEec-----ceEEEEeeccccccccccchhhhhcCceeEEeeeeeCcEeec Confidence 999899873 4777777777655322 12221 10 12233445677778888999988 Q ss_pred CCeEEEecCCC Q lcl|NC_018271. 295 GAEIVLYTPAA 305 (305) Q Consensus 295 g~E~v~~~~~~ 305 (305) .+=||+-|=.- T Consensus 355 ~~a~~~lt~~~ 365 (366) T protein:vir:57 355 PEGLVLGTGVI 365 (366) T ss_pred cccEEEEeccc Confidence 87777666555 No 29 >protein:vir:96392 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1613 # MgeName: 53 # Cross-refs: genbank:acc:YP_239648;genbank:gi:66395381;genbank:GeneID:5132868 Probab=98.38 E-value=7.8e-08 Score=59.54 Aligned_cols=270 Identities=12% Similarity=0.097 Sum_probs=160.7 Q ss_pred CceEeeeec--ccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 
1 MATTVDITT--NYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~~~~--~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) +.++..-.. .--.+...+|+..+.....+.+. ++++| ++.....+|+...+-..+-..+ +--+.++.+|.+..+ T Consensus 27 ~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l--~~~~~-~~~~~~~~p~~~~~~~a~~v~Eg~~~~~~~~~~~~v~~ 103 (324) T protein:vir:96 27 DNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQL--GKYEP-MEGTEKKFTFWADKPGAYWVGEGQKIETSKATWVNATM 103 (324) T ss_pred ccccccCcCccccchhHHHHHHHHHHhhchhhhh--cceee-ccCCceEEEEEecCcceeEecCCccccccccceeEEEE Confidence 222222122 22345566777777777776664 77777 5444334555443222211111 222345678999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhcc Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADA 157 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~ 157 (305) .++++-+...++=+ +.. +-.+..+.++.+.|+++++..++..++.|+++.+.-.|+...+.... T Consensus 104 ~~~k~~~~~~is~e---------ll~-------ds~~~l~~~i~~~la~ai~~~~d~a~l~G~g~~~~~~gi~~~~~~~~ 167 (324) T protein:vir:96 104 RAFKLGVILPVTKE---------FLN-------YTYSQFFEEMKPMIAEAFYKKFDEAGILNQGNNPFGKSIAQSIEKTN 167 (324) T ss_pred eeEEEEEeehhhHH---------HHh-------cchHHHHHHHHHHHHHHHHHHHHHHHhccCCCCCcCccccccccccc Confidence 99999888875422 111 11345677888999999999999999999987666666655432211 Q ss_pred ceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceeeeec Q lcl|NC_018271. 158 TVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTLTEI 237 (305) Q Consensus 158 ~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~iv~l 237 (305) . ...+++| .+.+.++..+++...+.. + .++||...|.+.+. ++.-.|.+ -...+....+.|++++.. 
T Consensus 168 ~-----~~~~~~t----~~~i~~~~~~l~~~~~~~-~-~~vmn~~~~~~L~~-l~d~~G~~-~~~~~~~~~l~G~PV~~~ 234 (324) T protein:vir:96 168 K-----VIKGDFT----QDNIIDLEALLEDDELEA-N-AFISKTQNRSLLRK-IVDPETKE-RIYDRNSDSLDGLPVVNL 234 (324) T ss_pred e-----ecccccc----HHHHHHHHHhhhhccCCC-C-EEEEcHHHHHHHHH-hhccCCCe-eecCCCCCcccceeeEee Confidence 1 1223445 444555666677766643 3 69999999886543 32222221 122334456999987765 Q ss_pred cC--CCCCeEEEecchHHhhhhhhhhhhhhccccc------ee-------eeccceeEEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 238 KG--LPASRMVGYNRDNIVIGMSAQSDFNEIRIKD------MG-------DVDLSGQIRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 238 ~~--~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~------~~-------~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) .. .+++.+++.+-+++++|..- ++. +++.+ .. +++-..+..|...+-+|+.+..++=+|+-+ T Consensus 235 ~~~~~~~~~~~~gd~~~~~~g~~~--~~~-i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~~~~A~~~l~ 311 (324) T protein:vir:96 235 KSSNLKRGELITGDFDKLIYGIPQ--LIE-YKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLV 311 (324) T ss_pred CCCCCCcceEEEEecceEEEEEec--CcE-EEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEecccceEEEe Confidence 44 57778999998888766522 221 22211 00 112334566777788899999988899888 Q ss_pred CCC Q lcl|NC_018271. 303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) ++- T Consensus 312 ~a~ 314 (324) T protein:vir:96 312 PAD 314 (324) T ss_pred ccc Confidence 866 No 30 >protein:vir:78830 Length: 324 # NCBI annotation: major head protein # Family: family:all:507 # MgeID: mge:1858 # MgeName: 80alpha # Cross-refs: genbank:acc:YP_001285361;genbank:gi:148717889;genbank:GeneID:5246961 Probab=98.38 E-value=7.8e-08 Score=59.54 Aligned_cols=270 Identities=12% Similarity=0.097 Sum_probs=160.7 Q ss_pred CceEeeeec--ccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 1 MATTVDITT--NYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~~~~--~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) +.++..-.. 
.--.+...+|+..+.....+.+. ++++| ++.....+|+...+-..+-..+ +--+.++.+|.+..+ T Consensus 27 ~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l--~~~~~-~~~~~~~~p~~~~~~~a~~v~Eg~~~~~~~~~~~~v~~ 103 (324) T protein:vir:78 27 DNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQL--GKYEP-MEGTEKKFTFWADKPGAYWVGEGQKIETSKATWVNATM 103 (324) T ss_pred ccccccCcCccccchhHHHHHHHHHHhhchhhhh--cceee-ccCCceEEEEEecCcceeEecCCccccccccceeEEEE Confidence 222222122 22345566777777777776664 77777 5444334555443222211111 222345678999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhcc Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADA 157 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~ 157 (305) .++++-+...++=+ +.. +-.+..+.++.+.|+++++..++..++.|+++.+.-.|+...+.... T Consensus 104 ~~~k~~~~~~is~e---------ll~-------ds~~~l~~~i~~~la~ai~~~~d~a~l~G~g~~~~~~gi~~~~~~~~ 167 (324) T protein:vir:78 104 RAFKLGVILPVTKE---------FLN-------YTYSQFFEEMKPMIAEAFYKKFDEAGILNQGNNPFGKSIAQSIEKTN 167 (324) T ss_pred eeEEEEEeehhhHH---------HHh-------cchHHHHHHHHHHHHHHHHHHHHHHHhccCCCCCcCccccccccccc Confidence 99999888875422 111 11345677888999999999999999999987666666655432211 Q ss_pred ceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceeeeec Q lcl|NC_018271. 158 TVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTLTEI 237 (305) Q Consensus 158 ~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~iv~l 237 (305) . ...+++| .+.+.++..+++...+.. + .++||...|.+.+. ++.-.|.+ -...+....+.|++++.. 
T Consensus 168 ~-----~~~~~~t----~~~i~~~~~~l~~~~~~~-~-~~vmn~~~~~~L~~-l~d~~G~~-~~~~~~~~~l~G~PV~~~ 234 (324) T protein:vir:78 168 K-----VIKGDFT----QDNIIDLEALLEDDELEA-N-AFISKTQNRSLLRK-IVDPETKE-RIYDRNSDSLDGLPVVNL 234 (324) T ss_pred e-----ecccccc----HHHHHHHHHhhhhccCCC-C-EEEEcHHHHHHHHH-hhccCCCe-eecCCCCCcccceeeEee Confidence 1 1223445 444555666677766643 3 69999999886543 32222221 122334456999987765 Q ss_pred cC--CCCCeEEEecchHHhhhhhhhhhhhhccccc------ee-------eeccceeEEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 238 KG--LPASRMVGYNRDNIVIGMSAQSDFNEIRIKD------MG-------DVDLSGQIRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 238 ~~--~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~------~~-------~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) .. .+++.+++.+-+++++|..- ++. +++.+ .. +++-..+..|...+-+|+.+..++=+|+-+ T Consensus 235 ~~~~~~~~~~~~gd~~~~~~g~~~--~~~-i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~~~~A~~~l~ 311 (324) T protein:vir:78 235 KSSNLKRGELITGDFDKLIYGIPQ--LIE-YKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLV 311 (324) T ss_pred CCCCCCcceEEEEecceEEEEEec--CcE-EEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEecccceEEEe Confidence 44 57778999998888766522 221 22211 00 112334566777788899999988899888 Q ss_pred CCC Q lcl|NC_018271. 303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) ++- T Consensus 312 ~a~ 314 (324) T protein:vir:78 312 PAD 314 (324) T ss_pred ccc Confidence 866 No 31 >protein:vir:8187 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:153 # MgeName: Che9d # Cross-refs: genbank:acc:NP_817980;genbank:gi:29566414;genbank:GeneID:2700968 Probab=98.38 E-value=9.5e-08 Score=59.08 Aligned_cols=274 Identities=10% Similarity=0.062 Sum_probs=157.4 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEecc Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDINE 74 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~~~ 74 (305) ||++.+-...=--+...+|+..+.....+.+. 
.+++| ++.....+|+...+.. +.|.+ -++.+|.+ T Consensus 1 mat~~~gg~lvP~~~~~~ii~~~~~~s~i~~~--~~~i~-~~~~~~~~p~~~~~~~-----a~wv~Eg~~~~~~~~~f~~ 72 (311) T protein:vir:81 1 MVALATGTFQLPKHLVPGVWQKAQGQSVLARL--SMAEP-QEFGEQQYMTLTAPPR-----GEVVGEGAQKSESTATFAP 72 (311) T ss_pred CceecCCceEcchhHHHHHHHHHHhcchhhhh--cceee-cCCCceEEEEEeCCce-----eEEeecCcccccccceeeE Confidence 99887632111144455666666665555554 34444 2222233444322221 23422 35778999 Q ss_pred eeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCC--ccchhHHHHHH Q lcl|NC_018271. 75 KQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDG--TTGNLQGILPL 152 (305) Q Consensus 75 K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~--s~~~fdG~lk~ 152 (305) ..|.++++-+...++-+ +.+ .+. +=....+.++.+.|.++++..+++.++.|++ +-+.+.|+.+. T Consensus 73 v~l~~~kl~~~~~iS~e-ll~------~~~------d~~~~l~~~i~~~la~ai~~~~d~a~l~G~~~~~~~~~~gi~~~ 139 (311) T protein:vir:81 73 VTAIPRKVQVTQRFSQE-VKW------ADE------SRQLGVLQTMADLSGVALGRALDLIGIHGINPLTGAALSGSPAK 139 (311) T ss_pred EEEeeEEEEEeehhhHH-Hhh------cCc------ccHHHHHHHHHHHHHHHHHHHHHHhhhccccCCCCccccccccc Confidence 99999999988766532 211 110 1134456778889999999999999999975 45668999998 Q ss_pred HhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCccee Q lcl|NC_018271. 153 LEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDF 229 (305) Q Consensus 153 i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~ 229 (305) +.+...++..... +.......+.++...+... +.+++ .+.||...+.+. +-++.-.|.+. 
..+.+....+ T Consensus 140 ~~~~~~~~~~~~~----~~~~~~~~i~~~~~~~~~~-~~~~~-~~vmn~~~~~~l-~~lkd~~G~~l~~~~~~~~~~~tl 212 (311) T protein:vir:81 140 ILDTTNIVELTTG----TSATPDLAVEAAVGLVLGD-NLSPD-GVALDNTFSFML-ATQRDSQGRKLYPELGFGTDVASF 212 (311) T ss_pred ccccceeeeeccc----ccchHHHHHHHHHHHhhhc-CCCce-EEEEcHHHHHHH-HhhhccCCCeeecCccccCCCcee Confidence 8776665543222 2233444556666555433 22332 499999998665 33433333221 1123345679 Q ss_pred cceeeeeccCCCCCeEE------------------EecchHHhhhhhhhhhhhhccccce------eeeccceeEEEEEE Q lcl|NC_018271. 230 EGYTLTEIKGLPASRMV------------------GYNRDNIVIGMSAQSDFNEIRIKDM------GDVDLSGQIRTKMV 285 (305) Q Consensus 230 kGi~iv~l~~~Pd~~ii------------------~T~~sNl~~gvnl~~D~n~I~I~~~------~~~~~~~~~f~k~~ 285 (305) .|.+++--..+|++... ..+-+++++|... +. .+++.+. .+++-+..+-++.. T Consensus 213 ~G~Pv~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~gDfs~~~i~~~~--~~-~~~~~~~~~~~~~~~~~~~~~v~~r~~ 289 (311) T protein:vir:81 213 AGLNAAVSDTVRGGPEAVTASTGVYRTTNPNVKAIAGDFSAFRWGVQV--SI-PLELIEFGDPDGLGDLKRQNQIAIRAE 289 (311) T ss_pred cceeEEecccccccccccccccchhcccCCccEEEEEecccEEEEEec--cc-eEEEeccCCCCcchhhhhcCcEEEEEE Confidence 99888766667655433 3333343332211 11 0111111 12233455677778 Q ss_pred eecceeeccCCeEEEecCCC Q lcl|NC_018271. 286 LSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 286 m~~d~~i~fg~E~v~~~~~~ 305 (305) +-+|+.+..++=||.-+++. T Consensus 290 ~r~d~~v~~~~a~~~l~~a~ 309 (311) T protein:vir:81 290 VVYGIGIMSTDAFAVVRDAD 309 (311) T ss_pred EEeccEeecccceEEEEeec Confidence 88899999998899999988 No 32 >protein:vir:4456 Length: 401 # NCBI annotation: Major capsid protein precursor # Family: family:all:21 # MgeID: mge:96 # MgeName: ST64B # Cross-refs: genbank:acc:NP_700379;genbank:gi:23505451;genbank:GeneID:955658 Probab=98.37 E-value=9.2e-08 Score=59.15 Aligned_cols=267 Identities=14% Similarity=0.112 Sum_probs=156.5 Q ss_pred CceEeeeecccc--hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCCc-------cceE Q lcl|NC_018271. 
1 MATTVDITTNYV--GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTPS-------GEVD 71 (305) Q Consensus 1 ma~~~~~~~~Y~--Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~~-------G~~~ 71 (305) |++..+....|. -++..+|+..+.....+.+ +++++| ++.....++....+. .+.|.+- .... T Consensus 107 ~~~~~~~~GG~~iP~~~~~~ii~~~~~~~~l~~--~~~~~~-~~~~~~~~~~~~~~~-----~a~wv~E~~~~~~~~~~~ 178 (401) T protein:vir:44 107 LQVGTDEDGGYAVPEELDRSILSLLKDEVVMRQ--EATVIT-VGGSDYKKLVNLGGT-----ASGWVGETDTRSQTATSR 178 (401) T ss_pred hhcCCCCCCceeccHhHHHHHHHHHHhhhhhhh--hceeee-cCCCceEEEEecCCc-----cceeeccccccCcccccc Confidence 555444333443 3445556665655555544 366655 323322233222211 1234321 2347 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) |++..+.++++.+...++-+-. . +=++..+.++.+.|.+.++..++..++.||++ +.-.||+. T Consensus 179 ~~~v~~~~~k~~~~~~iS~ell---------~-------ds~~~l~~~i~~~la~ai~~~~~~~~l~G~G~-~~p~Gil~ 241 (401) T protein:vir:44 179 LGLIEPFMGEIYGNPQATQKML---------D-------DAFFNVEAWINSELATEFAEQEEIAFTTGDGT-KKPKGFLA 241 (401) T ss_pred ceeeeeehhheeeehhhhHHHH---------h-------cchHHHHHHHHHHHHHHHHHHHHhhhhccCCC-Cccceeec Confidence 8888888888888776653221 1 22566778889999999999999999999997 67889887 Q ss_pred HHhhccceE--------E-eccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---C Q lcl|NC_018271. 152 LLEADATVI--------D-VVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---G 219 (305) Q Consensus 152 ~i~~d~~~~--------~-~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~ 219 (305) ......... . .++..+.+| .+.+-++..+++..+|.+ -+++||...|.+.+- ++.-.|.+ . 
T Consensus 242 ~~~~~~~~~~~~~~~~~~~~t~~~~~~~----~d~i~~~~~~l~~~~~~~--a~~v~n~~~~~~L~~-lkd~~G~~l~~~ 314 (401) T protein:vir:44 242 YESTEESDKARAFGKLQHIVSGEATAVT----ADAIIKLIYTLRKAHRTG--AKFMMNNNSLFAIRL-LKDTEGNYLWRP 314 (401) T ss_pred cccccccccccccccccccccccccccC----HHHHHHHHHhcchhhhcC--CEEEEcHHHHHHHHH-hhccCCceeecC Confidence 654322210 0 111122233 455667777888888864 489999999977653 43333222 1 Q ss_pred cccCCCcceecceeeeeccCCCC-----CeEEEec-chHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeec Q lcl|NC_018271. 220 TFLNPNEFDFEGYTLTEIKGLPA-----SRMVGYN-RDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYA 293 (305) Q Consensus 220 ~~t~~~~~~~kGi~iv~l~~~Pd-----~~ii~T~-~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~ 293 (305) +.+++....+.|.+++....+|+ ..|+..+ ..+.+++ + --.+++. ..+..-+..+.+...+-+|..+. T Consensus 315 ~~~~g~~~~l~G~PVv~~~~~p~~~~~~~~i~~Gd~~~~~~i~----~-~~~~~~~-~~~~~~~~~v~~~a~~r~d~~~~ 388 (401) T protein:vir:44 315 GLELGQPSSLAGYGIAENEQMPDIAADAKAIAFGNFKRGYTIV----D-RIGTRIL-RDPYTNKPFVGFYTTKRTGGMLV 388 (401) T ss_pred CcCCCCCceecceeeEEecCcCCccCCccEEEEeehhccEEEE----E-ecceEEe-eeccccCCcEEEEEEEEeccEEe Confidence 22344556799999998888874 2233333 2233211 1 1112221 01212245577788888899999 Q ss_pred cCCeEEEecCCC Q lcl|NC_018271. 294 YGAEIVLYTPAA 305 (305) Q Consensus 294 fg~E~v~~~~~~ 305 (305) .++=||+.+.+| T Consensus 389 ~~~a~~~l~~~a 400 (401) T protein:vir:44 389 DSQAIKLLKIAA 400 (401) T ss_pred cccceEEEEeec Confidence 999999999988 No 33 >protein:vir:97053 Length: 390 # NCBI annotation: putative head protein # Family: family:all:585 # MgeID: mge:1653 # MgeName: OP1 # Cross-refs: genbank:acc:YP_453565;genbank:gi:84662600;genbank:GeneID:5142468 Probab=98.37 E-value=4.1e-08 Score=61.10 Aligned_cols=266 Identities=11% Similarity=0.008 Sum_probs=155.3 Q ss_pred CceEeeeec--ccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 
1 MATTVDITT--NYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~~--~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) +.++..... --..+...+++..+.....+.+. ++++|-- ......++...... .+.|.+ -++.+| T Consensus 113 ~~~~~~~~~g~lip~~~~~~ii~~~~~~~~i~~~--~~~~~~~-~~~~~~~~~~~~~~----~a~~v~Eg~~~~~~~~~~ 185 (390) T protein:vir:97 113 ASTDAAGSAGALTTPNRLPGFITPPDARLTVRDL--IGSGRTD-SALIEYVQETGFVN----NAAIVAEGALKPESSLKF 185 (390) T ss_pred hhcccccccccccchhhhHHHHHHHhhhhhhHhh--cceeecc-CCceEEEEEecCCc----ceeeecCCccccccccce Confidence 222222111 12334556666666555555543 6766633 22222333322211 123332 346789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPL 152 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~ 152 (305) .+..+.++++-+...++ +++. . + .+.++.++.+.|.++++..++..++.|+++.+.+.|++.. T Consensus 186 ~~i~~~~~k~~~~~~is-~ell--------~-------d-s~~l~~~i~~~la~a~~~~~d~a~l~G~g~~~~p~Gi~~~ 248 (390) T protein:vir:97 186 AKKTDTTHVIAHTMKAT-RQIL--------S-------D-APQLASYMNNRLIRGLKVKEDAEILRGTGANDGLLGLIPQ 248 (390) T ss_pred eEEEEeeeeEEEeehhh-HHHH--------H-------h-HHHHHHHHHHHHHHHHHHHHHHHHhhcCCCCccccceeec Confidence 99999999999887643 3331 1 1 1246788889999999999999999999998889999875 Q ss_pred HhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC--CcccCCCcceec Q lcl|NC_018271. 153 LEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN--GTFLNPNEFDFE 230 (305) Q Consensus 153 i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~--~~~t~~~~~~~k 230 (305) ...... . ...+.....+.+.++...+...++.. -.++||...|-+.+ .++.-.|.+ .+..++...++. 
T Consensus 249 ~~~~~~----~---~~~~~~~~~d~~~~~~~~~~~~~~~~--~~~v~n~~~~~~L~-~lkd~~G~~l~~~~~~~~~~~l~ 318 (390) T protein:vir:97 249 ATTYAA----P---TTIAGATRVDQLRLAMLQASLAEYPA--SGIVINPIDWAAIE-LAKDANNQYLIGNARGTLTPTLW 318 (390) T ss_pred cccccc----c---ccccccchHHHHHHHHHhhccccCCC--CEEEEcHHHHHHHH-HhhcCCCceeecCccCCCCceec Confidence 322111 1 11223334566667777777776643 37999999987665 343333222 122334455799 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccce--eeeccceeEEEEEEeecceeeccCCeEEEecCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDM--GDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPA 304 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~--~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~ 304 (305) |++++....+|++.++..+-++.+..+ |-..+.++.. ....-+...-+...+-+|+.+..++=||+=|=| T Consensus 319 G~pV~~~~~~~~~~~~~gd~~~~~~~~----~~~~~~i~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~v~~~~a 390 (390) T protein:vir:97 319 GLPVVATQAMAPGEFLVGAFDLAAQIF----DQWDARVEIGYVNDDFQRNMVTVLAEERLALVVYRPEALITGSFA 390 (390) T ss_pred ceeeEEcCCCCCCcEEEEeccceEEEE----EecceEEEEeecccccccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 999999999999988888766532211 1122222211 111123444555666678888777777777766 No 34 >protein:vir:96223 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1607 # MgeName: 69 # Cross-refs: genbank:acc:YP_239571;genbank:gi:66395304;genbank:GeneID:5132771 Probab=98.36 E-value=9.2e-08 Score=59.17 Aligned_cols=269 Identities=11% Similarity=0.105 Sum_probs=156.7 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceeeee Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQLTL 79 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L~~ 79 (305) |++. +....--.++..+|+..+.....+.+. 
++++| ++..+..+|+....-..+-..+ +--+.++.+|.+..+.+ T Consensus 30 ~~~~-~~~~lip~~~~~~ii~~~~~~s~l~~l--~~~~~-~~~~~~~~p~~~~~~~a~~v~Eg~~~~~~~~~f~~v~~~~ 105 (324) T protein:vir:96 30 MMHE-KKDGTLLNDFTTPILQEVMENSKIMQL--GKYEP-MEGTEKKFTFWADKPGAYWVGEGQKIETSKATWVNATMRA 105 (324) T ss_pred cccC-CCcceechhHHHHHHHHHHhhchhhhh--cceee-ccCCceEEEEEecCcceeeecCCccccccccceeEEEEEe Confidence 2111 001122344556666666666665544 66666 5444444555443322221111 11223567999999999 Q ss_pred eeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhccce Q lcl|NC_018271. 80 KKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADATV 159 (305) Q Consensus 80 ~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~~~ 159 (305) +++-+...++= ++ +. +=.+..+.++.+.|.++++..++..++.|+++...-.|+...+.... T Consensus 106 ~k~~~~~~is~-el--------l~-------ds~~~l~~~i~~~l~~aia~~~d~~~l~G~g~~~~~~~~~~~~~~~~-- 167 (324) T protein:vir:96 106 FKLGVILPVTK-EF--------LN-------YTYSQFFEEMKPMIAEAFYKKFDEAGILNQGNNPFGKSIAQSIKKTN-- 167 (324) T ss_pred EEEEEeehhhH-HH--------Hh-------cchHHHHHHHHHHHHHHHHHHHHHHhhhcCCCCCcCccccccccccc-- Confidence 99998877542 21 11 11245667888999999999999999999987655455444322211 Q ss_pred EEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceeeeeccC Q lcl|NC_018271. 160 IDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTLTEIKG 239 (305) Q Consensus 160 ~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~iv~l~~ 239 (305) ....+++|.+ .+.++..+++...+. ++ .++||...+.+.+. ++.-.|.+ ....+....+.|++++-... 
T Consensus 168 ---~~~~~~~~~~----~i~~~~~~i~~~~~~-~~-~~i~n~~~~~~L~~-lkd~~G~~-~~~~~~~~~l~G~PV~~~~~ 236 (324) T protein:vir:96 168 ---KVIKGDFTQD----NIIDLEALLEDDELE-AN-AFISKTQNRSLLRK-IVDPETKE-RIYDRNSDSLDGLPVVNLKS 236 (324) T ss_pred ---eecccccchH----HHHHHHHhhhhccCC-CC-EEEEcHHHHHHHHH-hhCCCCCe-eecCCCCCcccceeeEeecC Confidence 1122345544 445566667776554 33 69999999886553 32222221 12333445699998765444 Q ss_pred --CCCCeEEEecchHHhhhhhhhhhhhhccccc------ee-------eeccceeEEEEEEeecceeeccCCeEEEecCC Q lcl|NC_018271. 240 --LPASRMVGYNRDNIVIGMSAQSDFNEIRIKD------MG-------DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPA 304 (305) Q Consensus 240 --~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~------~~-------~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~ 304 (305) .+++.+++.+.+++++|.. .++. +++.+ .. +++...+..++..+-+|+.+..++=+|+-+++ T Consensus 237 ~~~~~~~~~~gd~s~~~~~~~--~~~~-i~~~~~~~~~~~~~~~~~~~~~~~~n~v~~r~~~r~d~~v~~~~a~~~l~~a 313 (324) T protein:vir:96 237 SNLKRGELITGDFDKLIYGIP--QLIE-YKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLVPA 313 (324) T ss_pred CCCCcceEEEEecceEEEEEe--cCcE-EEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccceEEEecc Confidence 4677899999999876542 2222 22211 10 12233456667777779999999889999888 Q ss_pred C Q lcl|NC_018271. 305 A 305 (305) Q Consensus 305 ~ 305 (305) . T Consensus 314 ~ 314 (324) T protein:vir:96 314 D 314 (324) T ss_pred c Confidence 7 No 35 >protein:vir:97148 Length: 324 # NCBI annotation: ORF010 # Family: family:all:507 # MgeID: mge:1654 # MgeName: 85 # Cross-refs: genbank:acc:YP_239726;genbank:gi:66394880;genbank:GeneID:5130881 Probab=98.36 E-value=1.1e-07 Score=58.75 Aligned_cols=270 Identities=12% Similarity=0.082 Sum_probs=157.9 Q ss_pred CceEeeeec--ccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 1 MATTVDITT--NYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~~~~--~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) +..+..-+. .--.++..+|+..+.....+.+. 
.+++| ++..+..+|+.......+--.+ +--+.++.+|.+..+ T Consensus 27 ~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~--~~~~~-~~~~~~~ip~~~~~~~a~~v~Eg~~~~~~~~~f~~v~~ 103 (324) T protein:vir:97 27 DNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQL--GKYEP-MEGTEKKFTFWADKPGAYWVGEGQKIETSKATWVNATM 103 (324) T ss_pred ccccccCCCcceechhHHHHHHHHHHhhcchhhh--cceee-ccCCceEEEEEecCcceeEeccCccccccccceeEEEE Confidence 111111112 22344556677667666665554 67776 4444444555443322211111 112235778999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhcc Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADA 157 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~ 157 (305) .++++-+...++=+ ++ ++-++..+..+.+.|.++++..+++.++.|+++.+...|++..+.... T Consensus 104 ~~~k~~~~~~is~e---------ll-------~ds~~~l~~~i~~~l~~aia~~~d~a~l~G~g~~~~~~gi~~~~~~~~ 167 (324) T protein:vir:97 104 RAFKLGVILPVTKE---------FL-------NYTYSQFFEEMKPMIAEAFYKKFDEAGILNQGNNPFGKSIAQSIEKTN 167 (324) T ss_pred eeEEEEEeehhhHH---------HH-------hcchHHHHHHHHHHHHHHHHHHHHHHhhccCCCCccCccccccccccc Confidence 99999888875422 11 122456778888999999999999999999987665556554332211 Q ss_pred ceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceeeeec Q lcl|NC_018271. 158 TVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTLTEI 237 (305) Q Consensus 158 ~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~iv~l 237 (305) ....+++|.+ .+.+++.+++...+.. + .+.||...|.+.+ .++.-.|.+ .........+.|.+++-. 
T Consensus 168 -----~~~~~~~~~~----~i~~~~~~l~~~~~~~-~-~~v~n~~~~~~L~-~lkd~~g~~-~~~~~~~~tl~G~PV~~~ 234 (324) T protein:vir:97 168 -----KVIKGDFTQD----NIIDLEALLEDDELEA-N-AFISKTQNRSLLR-KIVDPETKE-RIYDRNSDTLDGLPVVNL 234 (324) T ss_pred -----eeccccCCHH----HHHHHHHhhhhccCCC-C-EEEEcHHHHHHHH-HhhcCCCce-eecCCCCccccceeeEee Confidence 1123445544 4556677777776643 2 7999999987654 232222211 112233446899987765 Q ss_pred cCC--CCCeEEEecchHHhhhhhhhhhhhhccccce------e-------eeccceeEEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 238 KGL--PASRMVGYNRDNIVIGMSAQSDFNEIRIKDM------G-------DVDLSGQIRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 238 ~~~--Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~------~-------~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) ..+ +++.++..+-+++++|.. .++. |++.+= . +++-..+.-|...+-+|+.+..++=+++.+ T Consensus 235 ~~~~~~~~~~~~gd~~~~~i~~~--~~~~-i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~~~~a~~~l~ 311 (324) T protein:vir:97 235 KSSNLKRGELITGDFDKLIYGIP--QLIE-YKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLV 311 (324) T ss_pred cCCCCCcceEEEEecccEEEEEe--cCcE-EEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccceEEEE Confidence 554 677889899899876642 2222 222110 0 111223455666677799888888888888 Q ss_pred CCC Q lcl|NC_018271. 303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) ++. T Consensus 312 ~~~ 314 (324) T protein:vir:97 312 PAD 314 (324) T ss_pred ecc Confidence 766 No 36 >protein:vir:102119 Length: 404 # NCBI annotation: phage major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1641 # MgeName: phiSM101 # Cross-refs: genbank:acc:YP_699941;genbank:gi:110804052;genbank:GeneID:4206662 Probab=98.34 E-value=7.9e-08 Score=59.52 Aligned_cols=273 Identities=7% Similarity=0.080 Sum_probs=149.2 Q ss_pred CceEeeeecccc--hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcccc---CCCCCCCCccceEecce Q lcl|NC_018271. 1 MATTVDITTNYV--GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFV---DYSCGFTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~~~Y~--Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q---~~~~~~~~~G~~~~~~K 75 (305) |.......+.|. 
-++..+++..+.....+.+..-+..+++-..+.. .++........ +......+.++.+|.+. T Consensus 110 ~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~l~~~~~~~~~~g~~~-~~~~~~~~~~~~v~e~~~~~~~~~~~~f~~i 188 (404) T protein:vir:10 110 ISENIDEDGGYAVPEDIQTKINTRLKDTTDLYNMVDYEPVFTRSGSRT-YEKRSKQKPMKPLSENQQIPTNGDNGKLERF 188 (404) T ss_pred hccccCCCCceeechhHHHHHHHHHhhhhhHhhhhceeeccCCccceE-EEEecCCcceeeccccccccccccccceeee Confidence 444333333332 2233455555555555555422332333333322 23222222221 11222333456789999 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) .+.++++.+...++-+- + ++-++..+.++...|.++++..++..++.|+++.+.+.|++..... T Consensus 189 ~~~~~k~~~~~~iS~el---------l-------~ds~~~l~~~i~~~la~~~~~~~~~~il~G~g~~~~~~gi~~~~~~ 252 (404) T protein:vir:10 189 NFKLKDLADFMSIPNDL---------L-------KFADKSLEDWIINWFVDKVRITRNAEILYGAGGDEHATGIMTANKF 252 (404) T ss_pred EeeheeeEeeehhhHHH---------H-------hhcHHHHHHHHHHHHHHHHHHHHHHHHhhcCCCCCcccceeecccc Confidence 99999999887665421 1 1234577788999999999999999999999988888898754211 Q ss_pred ccceEEeccCcCcCChhhHHHHHHH-HHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceecc Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGK-FIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFEG 231 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~-~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~kG 231 (305) .++ ...+..+ .+.+.. +...++..++. +-.++||...|.+.+- ++.-.|.+. 
+..++....+.| T Consensus 253 --~~~---~~~~~~~----~~~~~~~~~~~l~~~~~~--~~~~v~n~~~~~~L~~-lkd~~G~~l~~~~~~~~~~~~l~G 320 (404) T protein:vir:10 253 --KKI---TLPKSPA----LKDFKKCKNVELLNVFKA--TSSWIVNQDGFNYLDS-LEDKTGRPYLQPDPKDPTQYRFLG 320 (404) T ss_pred --cee---ecccccc----HHHHHHHHHhhhhccccC--CCEEEEcHHHHHHHHH-hhccCCceeeccCcCCCCCccccc Confidence 111 1112222 333333 33356666654 4589999999877653 433233222 223344457999 Q ss_pred eeeeeccC-CCCC-----eEEEecchH-Hhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEEecC Q lcl|NC_018271. 232 YTLTEIKG-LPAS-----RMVGYNRDN-IVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTP 303 (305) Q Consensus 232 i~iv~l~~-~Pd~-----~ii~T~~sN-l~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~ 303 (305) .+++.++. +|.+ .++..+-++ .+++.+ .... +++++-. +..-....-+...+-+|+.+..++=||+-+. T Consensus 321 ~PV~~~~~~~~~~~~~~~~~~~gd~s~~~~~~~~--~~~~-i~~~~~~~~~~~~~~~~~~~~~r~d~~v~~~~a~~~~~~ 397 (404) T protein:vir:10 321 LPVIELPNDLLLSTESAIPVLLGDTKEAYKYVSD--GAYE-LATTNIGAGAFETNTTKARIIMRIDGNVKDSEALLIAEI 397 (404) T ss_pred eeeEEecccccCCCCCccEEEEEeccccEEEEEe--cceE-EEEeccccchhhcCceEEEEEEeeccEEecccceEEEEe Confidence 99876643 4432 355555444 322221 1112 2222111 1112344555566667888888888888876 Q ss_pred CC Q lcl|NC_018271. 304 AA 305 (305) Q Consensus 304 ~~ 305 (305) ++ T Consensus 398 ~~ 399 (404) T protein:vir:10 398 PV 399 (404) T ss_pred ec Confidence 66 No 37 >protein:vir:78223 Length: 333 # NCBI annotation: Putative major head protein # Family: family:all:966 # MgeID: mge:1849 # MgeName: Bethlehem # Cross-refs: genbank:acc:YP_001491666;genbank:gi:157786490;genbank:GeneID:5625701 Probab=98.31 E-value=2.3e-07 Score=56.98 Aligned_cols=278 Identities=15% Similarity=0.139 Sum_probs=157.8 Q ss_pred CceEeeeeccc--------chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcccc----C--C--C-CC Q lcl|NC_018271. 
1 MATTVDITTNY--------VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFV----D--Y--S-CG 63 (305) Q Consensus 1 ma~~~~~~~~Y--------~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q----~--~--~-~~ 63 (305) |+...+-...| -.++..+|+..+.-...+.+. .+++| ++-....+|+....-..+ + . . .+ T Consensus 10 ~~~~~~~~g~~~~~~~~liP~~~~~~ii~~l~~~s~l~~~--~~~~~-~~~~~~~~p~~~~~~~a~~v~eg~~~~~~e~~ 86 (333) T protein:vir:78 10 NSAGSNHQGRLAHVPSDLLPKEIVGPIFDKAQESSLVLRM--GEQIP-ISYGETIIPTTVKRPEVGQVGVGTSNEQREGG 86 (333) T ss_pred hcccccccCceecCCccccchhHHHHHHHHHHhhchhhhh--cceee-ccCCceEEEEEeCCceeEeecCcccccccccc Confidence 22222222222 234456666666666655554 55555 333333444433322111 0 0 1 12 Q ss_pred CCCccceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc Q lcl|NC_018271. 64 FTPSGEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT 143 (305) Q Consensus 64 ~~~~G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~ 143 (305) +.+-++.+|.+..|.++++-+...++-+ +.+ +=++..+.++.+.|.++++..++..+++|+++. T Consensus 87 ~~~~~~~~f~~i~l~~~kl~~~~~is~e---------ll~-------~s~~~~~~~i~~~la~ai~~~~d~~~l~G~g~~ 150 (333) T protein:vir:78 87 LKPLSGTAWDTRSVSPIKLATIVTVSEE---------FAR-------MNPSGLYTKLQGDLAYAIGRGIDLAVFHGKSPL 150 (333) T ss_pred cccccccceeEEEEeeEEEEEeehhhHH---------HHh-------cCHHHHHHHHHHHHHHHHHHHHHHHHhcccCCC Confidence 3345678999999999999998876532 111 224567788889999999999999999999864 Q ss_pred --chhHHHHHHHhhccce-EEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHh--hhhccC Q lcl|NC_018271. 144 --GNLQGILPLLEADATV-IDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYG--TQARSN 218 (305) Q Consensus 144 --~~fdG~lk~i~~d~~~-~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~--~~~~k~ 218 (305) ..+.|+.+.......+ +......+.+ ..+.+.+++..++.....+.+ .+.|+...|........ 
.-.|.+ T Consensus 151 ~~~~~~g~~~~~~~~~~~~~~~~~~~~~~----~~~~i~~~~~~~~~~~~~~~~-~~vmn~~~~~~L~~~~~~~d~~G~~ 225 (333) T protein:vir:78 151 TGSALQGIDTDNVIANTTNVDYLQETGDP----LLDRLLDGYDLVSANTDVEFN-GWAVDPRFRAHLLRAQAYRDANGNV 225 (333) T ss_pred CCcccccccccccccccccccccccccch----hHHHHHHHHHhhccccccCce-EEEEcchHHHHHHHHhhhcCCCCce Confidence 4477766532211111 1111112222 355566666666665443333 69999988876643322 111221 Q ss_pred C---cccCCCcceecceeeeeccCCCCC---------eEEEecchHHhhhhhhhhhhhhccccce----------eeecc Q lcl|NC_018271. 219 G---TFLNPNEFDFEGYTLTEIKGLPAS---------RMVGYNRDNIVIGMSAQSDFNEIRIKDM----------GDVDL 276 (305) Q Consensus 219 ~---~~t~~~~~~~kGi~iv~l~~~Pd~---------~ii~T~~sNl~~gvnl~~D~n~I~I~~~----------~~~~~ 276 (305) . ....+....+.|++++--..+|++ .+++.+-+++++|.. .++. +++.+- ...+- T Consensus 226 i~~~~~~~~~~~~l~G~Pv~~~~~i~~~~~~~~~~~~~~~~gD~~~~~~g~~--~~~~-i~~~~~~~~~~~~~~~~~~~~ 302 (333) T protein:vir:78 226 DPSRINLAAQTGDVLGLPAQFGRAVGGDLGAAVDSKTRIIGGDFSQLKFGFA--DEIR-IKMSDTATLTDSGSATVSMWQ 302 (333) T ss_pred eecCccccCCCceeeceeeEEccccCCCccccCCCccEEEEEecccEEEEEe--eccE-EEEeccccccccccceeehhh Confidence 1 122233457999999888778754 588888888776542 1211 111111 11222 Q ss_pred ceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 277 SGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 277 ~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ..+.-+...+-+|+.+.-++=+|+-+++. T Consensus 303 ~~~v~~r~~~r~d~~v~~~~a~~~l~~~~ 331 (333) T protein:vir:78 303 TNQIAILIEVTFGWLLGDKQAFVKFVDDE 331 (333) T ss_pred cCcEEEEEEEEEccEEecccceEEEeccC Confidence 34455667777899998889999999888 No 38 >protein:vir:104256 Length: 458 # NCBI annotation: major head protein precursor # Family: family:all:27070 # MgeID: mge:1504 # MgeName: T5 # Cross-refs: genbank:acc:YP_006977;genbank:gi:46401878;genbank:GeneID:2777673 Probab=98.31 E-value=1.5e-07 Score=57.98 Aligned_cols=267 Identities=13% Similarity=0.095 Sum_probs=147.7 Q ss_pred CceEeeee--cccc-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC----------- Q lcl|NC_018271. 
1 MATTVDIT--TNYV-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP----------- 66 (305) Q Consensus 1 ma~~~~~~--~~Y~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~----------- 66 (305) ++...... .... -+...+|+..+.....+.+. ++++| ++.+....++...+-. +.|.+ T Consensus 161 ~~~~~~~~~g~~~ip~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~~~~~~~~~~-----a~~v~e~~~~~~~~~~ 232 (458) T protein:vir:10 161 VNQSSSVEVSSESYETIFSQRIIRDLQKELVVGAL--FEELP-MSSKILTMLVEPDAGK-----ATWVAASTYGTDTTTG 232 (458) T ss_pred hhhcccCccccceehhhHhHHHHHHHHhhhhHHhh--cceee-cCCcceEEEEecCCcc-----eeeccccccccccccc Confidence 21221111 1111 23444555555544444333 44444 2223223333222211 23322 Q ss_pred -ccceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccch Q lcl|NC_018271. 67 -SGEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGN 145 (305) Q Consensus 67 -~G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~ 145 (305) .+..+|.+..|.++++-+...++=. + + ++=++.++.++...|...++..++..++.||++ +. T Consensus 233 ~~~~~~~~~i~~~~~k~~~~v~is~e-l--------l-------~ds~~~~~~~i~~~l~~~i~~~~d~~~l~G~G~-~~ 295 (458) T protein:vir:10 233 EEVKGALKEIHFSTYKLAAKSFITDE-T--------E-------EDAIFSLLPLLRKRLIEAHAVSIEEAFMTGDGS-GK 295 (458) T ss_pred ccccccceeeEeeeeeEEeeehhhHH-H--------H-------hcchHHHHHHHHHHHHHHHHHHHHHHhhcCCCC-Cc Confidence 2355788899999998888766532 1 1 222466788889999999999999999999987 55 Q ss_pred hHHHHHHHhhccceE--Eecc-CcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC--- Q lcl|NC_018271. 146 LQGILPLLEADATVI--DVVG-ASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG--- 219 (305) Q Consensus 146 fdG~lk~i~~d~~~~--~~~~-~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~--- 219 (305) ..||++........+ .... ..+.+|. +.+-+++..++..++. +-.|+||...|.... .++.-.|.+. 
T Consensus 296 p~Gi~~~~~~~~~~~~~~~~~~~~~~~~~----~~i~~~~~~l~~~~~~--~~~~v~~~~~~~~l~-~lkd~~G~~i~~~ 368 (458) T protein:vir:10 296 PKGLLTLASEDSAKVVTEAKADGSVLVTA----KTISKLRRKLGRHGLK--LSKLVLIVSMDAYYD-LLEDEEWQDVAQV 368 (458) T ss_pred cceeeecccccccceeecccccccccccH----HHHHHHHHhhhhhhcC--CCEEEEcHHHHHHHH-hhcccCCceeecc Confidence 678877654333221 1111 1123443 3344566777888775 358999999986543 3433222221 Q ss_pred ----cccCCCcceecceeeeeccCCCCC-----eEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecce Q lcl|NC_018271. 220 ----TFLNPNEFDFEGYTLTEIKGLPAS-----RMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGV 290 (305) Q Consensus 220 ----~~t~~~~~~~kGi~iv~l~~~Pd~-----~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~ 290 (305) ...++....+.|++++....||+. ++|+-..++.+++.+ .+.. +.++++.. +...-|-.....|+ T Consensus 369 ~~~~~~~~~~~~~l~G~pv~~~~~~p~~~~~~~~~~~~f~~~~~~~~~--~~~~-v~~d~~~~---~~~~~~~~~~r~~~ 442 (458) T protein:vir:10 369 GNDSVKLQGQVGRIYGLPVVVSEYFPAKANSAEFAVIVYKDNFVMPRQ--RAVT-VERERQAG---KQRDAYYVTQRVNL 442 (458) T ss_pred ccccccccCcCceecceeeEEccccccccCCcceEEEEecccEEEEEe--eceE-EEeecccC---CCceEEEEEEEecc Confidence 112222346899999999999874 344333344433322 1111 22343322 33344445556688 Q ss_pred eeccCCeEEEecCCC Q lcl|NC_018271. 291 EYAYGAEIVLYTPAA 305 (305) Q Consensus 291 ~i~fg~E~v~~~~~~ 305 (305) .+.+++=||.=|.|| T Consensus 443 ~v~~~~a~v~~~~aa 457 (458) T protein:vir:10 443 QRYFANGVVSGTYAA 457 (458) T ss_pred eEecccceEEEeecc Confidence 899999999999999 No 39 >protein:vir:10364 Length: 390 # NCBI annotation: head protein; major capsid subunit precursor # Family: family:all:585 # MgeID: mge:183 # MgeName: Xp10 # Cross-refs: genbank:acc:NP_858956;genbank:gi:32128421;genbank:GeneID:2648357 Probab=98.28 E-value=9.3e-08 Score=59.14 Aligned_cols=270 Identities=11% Similarity=0.011 Sum_probs=151.7 Q ss_pred CceEeeeeccc-chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhc-cccCCCC-CCCCccceEecceee Q lcl|NC_018271. 
1 MATTVDITTNY-VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTD-DFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~~~~~Y-~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~-~~q~~~~-~~~~~G~~~~~~K~L 77 (305) +..+......+ ..+...+++..+.....+.+. ++++|- +......++..... ...--.+ .-.+.++.+|.+..+ T Consensus 114 ~~~~~~~~g~~~~~~~~~~ii~~~~~~~~l~~~--~~~~~~-~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~~~~i~~ 190 (390) T protein:vir:10 114 STDAAGSAGALTTPNRLPGFITQPDARLTVRDL--IGSGRT-DSALIEYVQETGFVNNAAIVAEGALKPESSLKFAKKTD 190 (390) T ss_pred hcccccccccccchhHHHHHHHHHHhhchhhhh--cceeec-cCCceEEEEEecCCcceeeecCCccccccccceeEEEE Confidence 11111111222 234556777777666666553 777763 33322233322221 1110011 122345778999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhcc Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADA 157 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~ 157 (305) .++++-+...++ +++. ++-| ..+.++...|+++++..++..++.|+++.+.+.|++....... T Consensus 191 ~~~k~~~~~~is-~ell---------------~d~~-~l~~~i~~~l~~~~~~~~~~~il~G~G~~~~p~Gi~~~~~~~~ 253 (390) T protein:vir:10 191 TTHVIAHTMKAT-RQIL---------------SDAP-QLASYMNNRLIRGLKVKEDAEILRGTGANDGLLGLIPQATTYA 253 (390) T ss_pred eeEEEEEeehhh-HHHH---------------HhHH-HHHHHHHHHHHHHHHHHHHHHHhhcCCCCcccccccccccccc Confidence 999998876644 2231 1112 4667888999999999999999999999888999986532111 Q ss_pred ceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC--cccCCCcceecceeee Q lcl|NC_018271. 158 TVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG--TFLNPNEFDFEGYTLT 235 (305) Q Consensus 158 ~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~--~~t~~~~~~~kGi~iv 235 (305) . +.+.+.....+.+.++...+...++.. =.++||...|.+.+ .++.-.|.+. 
+...+....+.|++++ T Consensus 254 ~-------~~~~~~~~~~~~~~~~~~~l~~~~~~~--~~~v~n~~~~~~L~-~lkd~~g~~l~~~~~~~~~~~l~G~pv~ 323 (390) T protein:vir:10 254 A-------PTTIAGATRVDQLRLAMLQASLAEYPA--SGIVINPIDWAAIE-LAKDANNQYLIGNARGTLTPTLWGLPVV 323 (390) T ss_pred c-------cccccccchHHHHHHHHHhhccccCCC--CEEEEcHHHHHHHH-HhhcCCCceeecCCcCcCCceecceeeE Confidence 1 111222334566666667776665543 37999999987654 3332222211 1112223468999999 Q ss_pred eccCCCCCeEEEecchHHhhhhhhhhhhhhcccc--ceeeeccceeEEEEEEeecceeeccCCeEEEecCC Q lcl|NC_018271. 236 EIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIK--DMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPA 304 (305) Q Consensus 236 ~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~--~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~ 304 (305) ....||++.+++.+-++.+.. -|-..+.++ +-...+-....-+...+-+|+.+.-++=||+=|=| T Consensus 324 ~~~~~p~~~~~~gdf~~~~~~----~~~~~~~i~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~~~~a 390 (390) T protein:vir:10 324 ATQAMAPGEFLVGAFDLAAQI----FDQWDARVEIGYVNDDFQRNMVTVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred EcCCCCCCcEEEEeccceEEE----EEecceEEEEeecccccccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 999999998888776543211 112223332 11111112334444566667777777667766666 No 40 >protein:vir:95763 Length: 297 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1578 # MgeName: SMP # Cross-refs: genbank:acc:YP_950590;genbank:gi:119953785;genbank:GeneID:5076833 Probab=98.27 E-value=3.6e-07 Score=55.89 Aligned_cols=266 Identities=13% Similarity=0.068 Sum_probs=158.8 Q ss_pred CceEeeeec--ccchhHHHHHHHHhhccccchhcCceEEecCCCCcc-cccchhhhhccccCCCC-CCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITT--NYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENN-LFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~--~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~-~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~ 76 (305) |+++..-.. .--.++..+++..+.....+.+.+ +++| ++-+. ..+++...+-..+-..+ +--+.++.+|.+.. 
T Consensus 9 ~~~~~t~~~~~lvP~~~~~~ii~~~~~~s~l~~~~--~~~~-~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~f~~v~ 85 (297) T protein:vir:95 9 ENVLVSQKKDGTLHKEFTDIIMKEVAQNSLVMQLG--QYQE-MEGEQEKTVYVQTDGISAYWVNETEKIKTDKPEVVPVT 85 (297) T ss_pred ccccccCCCcceechhHHHHHHHHHHhhchhhhhc--ceee-cCCCccEEEEEEcCCceeEEeecCccccccccceeEEE Confidence 444433222 234555677777777777777764 4443 22221 22343332222221111 11123467899999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) |+++++-+...++=+-. ++-.+..+.++.+.|.++++..++..++.|+++.. =.|+++.+... T Consensus 86 l~~~k~~~~~~is~ell----------------~ds~~~l~~~i~~~la~ai~~~~d~a~l~G~g~~~-~~gi~~~~~~~ 148 (297) T protein:vir:95 86 LKAHKLGIILVTSREAL----------------NYTWKKFFEDMKPQIVEAFYKKIDEAGLLGHDTPF-ANSVAKAAKDA 148 (297) T ss_pred EeeEEEEEeehhhHHHH----------------hcCHHHHHHHHHHHHHHHHHHHHHHHHhcccCCcc-ccccccccccc Confidence 99999998877542211 12235577888899999999999999999998643 35665543322 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecceeeee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEGYTLTE 236 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kGi~iv~ 236 (305) .. ...+.+|.+++++ +..+++...+.. -.++||...+.+.+. ++.-.|. -..++....+.|++++. 
T Consensus 149 ~~-----~~~~~~t~~~i~~----~~~~l~~~~~~~--~~~v~~~~~~~~L~~-l~d~~G~--~i~~~~~~~l~G~Pv~~ 214 (297) T protein:vir:95 149 NK-----VIGGPINYDNILK----LQDALYDADVEP--NAFVSKIQNRSALRE-ARDGNKV--SIYDKAANTIDGITTVD 214 (297) T ss_pred ce-----ecccccCHHHHHH----HHHHhhhccCCc--CEEEEcHHHHHHHHH-hhccCCc--eeecCCCCcccceeeEe Confidence 11 1234566555554 444455554432 379999999887753 4332222 22344455788999876 Q ss_pred ccC--CCCCeEEEecchHHhhhhhhhhhhhhcccccee---------------eeccceeEEEEEEeecceeeccCCeEE Q lcl|NC_018271. 237 IKG--LPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG---------------DVDLSGQIRTKMVLSAGVEYAYGAEIV 299 (305) Q Consensus 237 l~~--~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~---------------~~~~~~~~f~k~~m~~d~~i~fg~E~v 299 (305) ..+ .+++.+++.+-+++++|... .++++.+. +++-+....++..+-+|+.+..++=++ T Consensus 215 ~~~~~~~~~~~~~gd~s~~~~~~~~-----~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~d~~v~~~~a~~ 289 (297) T protein:vir:95 215 LKSARFEKGDLLAGDFDNLIYGVPY-----NITYKISEEGQISTITNADGTPINLFEQEMIAIRATMDIAVMITKTDAFA 289 (297) T ss_pred ecCCCCCCceEEEEecccEEEEEec-----CeEEEEeeccccccccccCccchhhhhcCcEEEEEEEEeccEeecccceE Confidence 554 47889999998888765421 12222111 112335577788888899999999999 Q ss_pred EecCCC Q lcl|NC_018271. 300 LYTPAA 305 (305) Q Consensus 300 ~~~~~~ 305 (305) .=+|+. T Consensus 290 ~l~~at 295 (297) T protein:vir:95 290 KLTPAE 295 (297) T ss_pred EEeecC Confidence 999998 No 41 >protein:vir:485 Length: 407 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:11 # MgeName: P27 # Cross-refs: genbank:acc:NP_543092;swissprot:trembl:q8w627;genbank:gi:18249904;uniprot:Q8W627;genbank:GeneID:929693 Probab=98.25 E-value=2.7e-07 Score=56.59 Aligned_cols=267 Identities=14% Similarity=0.096 Sum_probs=153.8 Q ss_pred CceEeeeecccc-h-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC-------ccceE Q lcl|NC_018271. 
1 MATTVDITTNYV-G-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP-------SGEVD 71 (305) Q Consensus 1 ma~~~~~~~~Y~-G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~-------~G~~~ 71 (305) |.+.......|. . +...+|+..+.....+.+ +++++|--+... .++....+.. +.|.+ ....+ T Consensus 106 ~~~~t~~~gG~~iP~~~~~~I~~~~~~~~~l~~--~~~~~~~~~~~~-~~~~~~~~~~-----a~~v~E~~~~~~~~~~~ 177 (407) T protein:vir:48 106 LQVGNDEDGGYAIPEELDRTILTLLKDEVVMRQ--EATVITLGGSDY-KKLVNLGGTT-----SGWVGETDARPETATSK 177 (407) T ss_pred hhcccCCCCcccccHhHHHHHHHHHHhhhhhhh--hceeeecCCCce-EEEEecCCcc-----eeeeccccccccccccc Confidence 433333233331 2 334556655555554443 367666443332 2332222211 23322 23357 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) |.+..+.++++.+...++-+ ++. +=++..+..+.+.|.++++..++..++.||++ +...|++. T Consensus 178 f~~i~~~~~k~~~~~~iS~e---------ll~-------ds~~~l~~~i~~~l~~~i~~~~~~a~l~G~G~-~~p~Gil~ 240 (407) T protein:vir:48 178 LGLIEPFMGEIYGNPQATQK---------MLD-------DAFFNVEDWINSELALEFAEQEEIAFTSGDGS-KKPKGFLA 240 (407) T ss_pred ceeEEeeeeeeEeehhhHHH---------HHh-------cchHHHHHHHHHHHHHHHHHHHHhhhhccCCC-Cccceeee Confidence 88999999999998766543 122 22456778888999999999999999999997 66889887 Q ss_pred HHhhccceE-------E--eccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---C Q lcl|NC_018271. 152 LLEADATVI-------D--VVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---G 219 (305) Q Consensus 152 ~i~~d~~~~-------~--~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~ 219 (305) ......... . ....++.+| .+.+.++..+++..+|.+. +++||...|...+ .++.-.|.+ . 
T Consensus 241 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~----~d~i~~l~~~l~~~~~~~a--~~v~n~~~~~~L~-~lkD~~Gr~l~~~ 313 (407) T protein:vir:48 241 YESTDEDDKTRAFGKLQHIASGAASGVT----ADAIIKLIYTLRKAHRSGA--KFMMNNSSLFAIR-LLKDNDGNYLWRP 313 (407) T ss_pred cccccccccccccccccccccccccccC----hHHHHHHHHhhchhhhcCC--EEEEcHHHHHHHH-HhhccCCceeecc Confidence 643322110 0 011112222 5667778888999988653 7999999986654 344333322 1 Q ss_pred cccCCCcceecceeeeeccCCCC-----CeEEEecch-HHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeec Q lcl|NC_018271. 220 TFLNPNEFDFEGYTLTEIKGLPA-----SRMVGYNRD-NIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYA 293 (305) Q Consensus 220 ~~t~~~~~~~kGi~iv~l~~~Pd-----~~ii~T~~s-Nl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~ 293 (305) +.+++....+.|.+++....||+ ..|+..+=+ +.+++ |--.++|.+ .+..-....-+..++-+|..+. T Consensus 314 ~~~~g~~~~l~G~PV~~~~~~p~~~~~~~~i~~Gd~~~~~~i~-----~~~~~~i~~-d~~~~~~~~~~~~~~r~d~~v~ 387 (407) T protein:vir:48 314 GIELGQPSSLAGYGIVENEQMPDIAADAKAIAFGNFKRGYTIV-----DRIGTRILR-DPYTNKPFVGFYTTKRTGGMLV 387 (407) T ss_pred CcCCCCCceecceeeEEecCcCCccCCccEEEEEeccccEEEE-----EeeceEEEe-eccccCCcEEEEEEEEeccEEe Confidence 22344455799999999988885 234444433 23211 111133321 1111134455667777899998 Q ss_pred cCCeEEEecCCC Q lcl|NC_018271. 294 YGAEIVLYTPAA 305 (305) Q Consensus 294 fg~E~v~~~~~~ 305 (305) .++=||+.+=++ T Consensus 388 ~~~a~~~l~~~a 399 (407) T protein:vir:48 388 DSQAIKLMKIGA 399 (407) T ss_pred cccceEEEEeec Confidence 888888876555 No 42 >protein:vir:78523 Length: 338 # NCBI annotation: Putative head structural protein # Family: family:all:507 # MgeID: mge:1853 # MgeName: U2 # Cross-refs: genbank:acc:YP_001491585;genbank:gi:157786408;genbank:GeneID:5625675 Probab=98.24 E-value=3.7e-07 Score=55.82 Aligned_cols=279 Identities=13% Similarity=0.076 Sum_probs=159.0 Q ss_pred CceEeeeeccc-------c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcccc---CCCCCCCC--- Q lcl|NC_018271. 
1 MATTVDITTNY-------V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFV---DYSCGFTP--- 66 (305) Q Consensus 1 ma~~~~~~~~Y-------~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q---~~~~~~~~--- 66 (305) |+...+-+..+ . .+...+++..+.....+.+. .+++| ++.....+|+.......+ +-...|.+ T Consensus 10 ~~~~~~~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~l--~~~~~-~~~~~~~ip~~~~~~~a~~v~~~~~~~~~Eg~ 86 (338) T protein:vir:78 10 NTAGSNHQGRLAHVPSDLLPKEIVGPIFDKAQESSLVLRL--GENIP-ISYGETIIPTTVKRPEVGQVGVGTSNEQREGG 86 (338) T ss_pred hhcccccccceecccccccchHHHHHHHHHHHhhchhhhh--cceee-ccCCceEEEEEecCccceeecccccccccccc Confidence 33333322222 2 22335555555555555444 56655 554444455544433321 11123322 Q ss_pred ---ccceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc Q lcl|NC_018271. 67 ---SGEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT 143 (305) Q Consensus 67 ---~G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~ 143 (305) -++.+|.+..|.++++.+...++-+ + +. +-++..+.++.+.|+++++..+++.++.||++. T Consensus 87 ~~~~~~~~f~~v~l~~~k~~~~~~is~e-l--------l~-------ds~~~~~~~i~~~la~a~~~~~d~~~l~G~g~~ 150 (338) T protein:vir:78 87 TKPLSGTAWDTRSVAPIKLATIVTVSEE-F--------AR-------MNPSGLYTKLQADLAYAIGRGIDLAVFHGKSPL 150 (338) T ss_pred cccccccceeEEEEEEEEEEEeehhhHH-H--------Hh-------cCHHHHHHHHHHHHHHHHHHHHHHHhhcccCCC Confidence 3467899999999999988876532 1 11 123556788889999999999999999999863 Q ss_pred --chhHHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHH--hhhhccC- Q lcl|NC_018271. 144 --GNLQGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAY--GTQARSN- 218 (305) Q Consensus 144 --~~fdG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~--~~~~~k~- 218 (305) +.+.|+++-......+. .. ..........+.+.++...+........ -.++||...+.+..... 
+.-.|.+ T Consensus 151 ~~~~~~gi~~~~~~~~~~~-~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~m~~~~~~~L~~~~~l~d~~g~~l 226 (338) T protein:vir:78 151 TGSALQGIDTNNVIVNTTN-VD--YLQTGTTPLLDRFLDGYDLVSANTDVDF-NGWAADPRYRARLLRSQAYRDANGNVD 226 (338) T ss_pred ccccccccccccccccccc-cc--cccccchhhHHHHHHHHHHhhhhccccc-eEEEEchHHHHHHHHHhhhccCCCcee Confidence 45778776433322211 01 0111123346667777766665433222 37999998887764432 1111111 Q ss_pred --CcccCCCcceecceeeeeccCCCC---------CeEEEecchHHhhhhhhhhhhhhcccc-ce------------eee Q lcl|NC_018271. 219 --GTFLNPNEFDFEGYTLTEIKGLPA---------SRMVGYNRDNIVIGMSAQSDFNEIRIK-DM------------GDV 274 (305) Q Consensus 219 --~~~t~~~~~~~kGi~iv~l~~~Pd---------~~ii~T~~sNl~~gvnl~~D~n~I~I~-~~------------~~~ 274 (305) .....+....+.|++++--..||+ ..+++.+-++.++|.. .+.. +++. +. -.+ T Consensus 227 ~~~~~~~~~~~~l~G~PV~~~~~ip~~~~~~~~~~~~~~~gdfs~~~~~~~--~~~~-i~~~~~~~~~~~~~~~~~~~~~ 303 (338) T protein:vir:78 227 PTRINLAASAGDLLGLPVQFGKAVGGDLGAATDSKVRVVGGDFSQLKYGFA--DEIR-VKMSDTATLTDNTSPTPQTVSM 303 (338) T ss_pred ecccccCCCCceeeeeeEEEccccCccccccCCcccEEEEEecceEEEEee--cccE-EEEeecccccccccccccchhh Confidence 112334456799999987777775 3466667666654421 1111 1111 10 012 Q ss_pred ccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 275 DLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 275 ~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) +-..+..+++.+-+|+.+..++=||.=++++ T Consensus 304 ~~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~ 334 (338) T protein:vir:78 304 WQTNQIAILIEVTFGWLLGDKQAFVKFVDDE 334 (338) T ss_pred hhcCcEEEEEEEEeccEeecccceEEEeccc Confidence 2334566777788899999999999999887 No 43 >protein:vir:104085 Length: 320 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:1656 # MgeName: Che12 # Cross-refs: genbank:acc:YP_655596;genbank:gi:109392467;genbank:GeneID:4156953 Probab=98.23 E-value=2.9e-07 Score=56.40 Aligned_cols=270 Identities=13% Similarity=0.027 Sum_probs=151.4 Q ss_pred CceEeeeec--ccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 
1 MATTVDITT--NYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~~~~--~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) ||++..-.. .=-.+...+++..+.....+.+. ++++|- +......|+...+-...--.+ +--+-++.+|++..+ T Consensus 14 ~~~t~~~~~~~~ip~~~~~~ii~~~~~~s~l~~~--~~~~~~-~~~~~~~p~~~~~~~a~~v~E~~~~~~~~~~f~~v~~ 90 (320) T protein:vir:10 14 IAQTGDTMFKGYLEPEQAKDYFAEAEKTSIVQQF--AQKVPM-GTTGQKIPHWIGDVSAQWIGEGDMKPITKGNMTSQNI 90 (320) T ss_pred hhccccccccccccHHHHHHHHHHHHhccchhhh--cceeec-cCCceEEEEEeCCcceEEecCCccccccccceeEEEE Confidence 666544322 22345566677777767766655 566652 222233444433222211111 112235678999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccc--hhHHHHHHHhh Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTG--NLQGILPLLEA 155 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~--~fdG~lk~i~~ 155 (305) .+++..+.+.++=+-. ++-++..+..+.+.|.++++..+++.++.|+++.. .+.|+.+. T Consensus 91 ~~~k~~~~~~is~ell----------------~ds~~~l~~~i~~~l~~a~a~~~d~a~l~G~g~~~~~~~~~~~~~--- 151 (320) T protein:vir:10 91 APHKIATIFVASAETV----------------RANPANYLGTMRTKVATAFAMAFDSAALNGTDSPFPTYLAQTTKS--- 151 (320) T ss_pred eeEEEEEeehhhHHHH----------------hcChHHHHHHHHHHHHHHHHHHHHHHhhcccCCCCCccccccccc--- Confidence 9999999988653311 22346678889999999999999999999998632 23333222 Q ss_pred ccceEEeccCcCcCChhhHH---HHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc--------cCC Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVE---AELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF--------LNP 224 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~---~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~--------t~~ 224 (305) ..+ .. .+..+.+++. +.+.++...++...+. +-.++||...|.+++ .++...|.+.-. ... 
T Consensus 152 -~~~---~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~v~n~~~~~~L~-~lkd~~G~~l~~~~~~~~~~~~~ 223 (320) T protein:vir:10 152 -VSL---AD-PGGATASDLTAYDAVAVNGLSLLVNAKKK--WTHTLLDDIVEPILN-GAKDKNGRPLFIESTYTDENSPF 223 (320) T ss_pred -ccc---ee-cccccccccccHHHHHHHHHhhhhcccCC--CcEEEEcHHHHHHHH-HhhccCCceeeccccccCccccc Confidence 111 11 1122333322 2333444444444443 459999999998886 344433322111 111 Q ss_pred CcceecceeeeeccCCCCCe--EEEecchHHhhhhhhhhhhhhcccccee---------------eeccceeEEEEEEee Q lcl|NC_018271. 225 NEFDFEGYTLTEIKGLPASR--MVGYNRDNIVIGMSAQSDFNEIRIKDMG---------------DVDLSGQIRTKMVLS 287 (305) Q Consensus 225 ~~~~~kGi~iv~l~~~Pd~~--ii~T~~sNl~~gvnl~~D~n~I~I~~~~---------------~~~~~~~~f~k~~m~ 287 (305) +..++.|++++....+|++. ++..+-+++++|.. ..+.|+... +++-..+.-+...+- T Consensus 224 ~~~~i~g~pv~~~~~~~~~~~~~~~gd~~~~~~~~~-----~~~~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~~ 298 (320) T protein:vir:10 224 RAGRIVSRPTILSDHVADGTTVGYMGDFRNVIWGQV-----GGLSFDVTDQATLNLGTPTEPNFVSLWQHNLVAVRVEAE 298 (320) T ss_pred cCceeeeeeeEecCCCCCCceEEEEeecceEEEEEe-----cCeEEEEeecceeeeccccccccchhhhcCcEEEEEEEe Confidence 12468899999999999874 55677778876542 122222110 111223345566666 Q ss_pred cceeeccCCeEEEec----CCC Q lcl|NC_018271. 288 AGVEYAYGAEIVLYT----PAA 305 (305) Q Consensus 288 ~d~~i~fg~E~v~~~----~~~ 305 (305) +|+.+..++=|+.=+ |.| T Consensus 299 ~d~~v~~~~a~~~l~~~~ap~~ 320 (320) T protein:vir:10 299 YAFHNNDKDAFVKLTNVVTPDA 320 (320) T ss_pred eccEEecccceEEEEeccCCCC Confidence 788888887776655 444 No 44 >protein:vir:100247 Length: 425 # NCBI annotation: gp76 # Family: family:all:21 # MgeID: mge:1619 # MgeName: Bcep176 # Cross-refs: genbank:acc:YP_355412;genbank:gi:77864702;genbank:GeneID:3725969 Probab=98.20 E-value=3e-07 Score=56.35 Aligned_cols=272 Identities=12% Similarity=0.120 Sum_probs=152.5 Q ss_pred CceEeeeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCCc-------cceE Q lcl|NC_018271. 
1 MATTVDITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTPS-------GEVD 71 (305) Q Consensus 1 ma~~~~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~~-------G~~~ 71 (305) |.......+.| . -++..+|+..+.....+.+- ++++|--+.. ..+++...+. .+.|.+- ...+ T Consensus 130 l~~~t~~~gG~lvP~~~~~~ii~~~~~~s~l~~l--~~~~~~~~~~-~~~~~~~~~~-----~a~wv~E~~~~~~~~~~~ 201 (425) T protein:vir:10 130 LNKGEDSEGGYLTPIEWDRTITNKLVLISPMRQL--CRVQPVSKAG-FSKLFNMGGT-----TSGWVGEASQRPQTNAAT 201 (425) T ss_pred hhcCcCCCCceeccHhHHHHHHHHHHhhhhhhhh--ceeeeccCCc-eEEEEEcCCc-----ceeeeccccccccccccc Confidence 22222111222 1 22334455555544444443 5655533232 2333322221 2234322 2346 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) |.+..+.++++-+...++-+ .+. +=.+..+..+.+.|++.++..++..++.||++ +.-.||++ T Consensus 202 f~~v~~~~~k~~~~i~iS~e---------ll~-------ds~~~l~~~i~~~la~ai~~~~d~~~l~G~G~-~~p~Gil~ 264 (425) T protein:vir:10 202 FQPLSFASGEIYANPAATQQ---------ILD-------DAEIDLESWLATEVQTEFAKQEGKAFLAGDGT-NKPNGLLT 264 (425) T ss_pred cceeeeeheeeEeehHhHHH---------HHh-------cchhHHHHHHHHHHHHHHHHHHHhhhhcccCC-CCcceeee Confidence 88888888888887665422 222 22466788889999999999999999999996 56889988 Q ss_pred HHhhccceEEe-----ccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccC Q lcl|NC_018271. 
152 LLEADATVIDV-----VGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLN 223 (305) Q Consensus 152 ~i~~d~~~~~~-----~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~ 223 (305) .+.....+..- .......+.....+.+.+++.+++..+|.+ -.++||...|...+ .++.-.|.+ .+.++ T Consensus 265 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~l~~l~~~l~~~~~~~--a~~vmn~~~~~~L~-~lkD~~G~~l~~~~~~~ 341 (425) T protein:vir:10 265 YIAGGANAAKHPFGAIEVVNSGAAADITSDGIIDLVYDLPSAFTGN--ARFAMNRNTQRQVR-KLKDGQGNYLWQPSYVA 341 (425) T ss_pred ccccccccccccccccccccccccccccHHHHHHHHhhhhhhhccC--CEEEEchHHHHHHH-HhhcCCCceeeccCccC Confidence 76544332110 000011112223455667788888888854 48999999996654 444333322 12344 Q ss_pred CCcceecceeeeeccCCCC-----CeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeE Q lcl|NC_018271. 224 PNEFDFEGYTLTEIKGLPA-----SRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEI 298 (305) Q Consensus 224 ~~~~~~kGi~iv~l~~~Pd-----~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~ 298 (305) +....+.|.+++-...||+ ..|+..+=++.+..++- ..+. +..++. ...+.+-+...+-+|..+.-++=| T Consensus 342 g~~~~l~G~PV~~~~~~p~~~~~~~~i~~Gd~~~~~~i~~~-~~~~-v~~d~~---~~~~~~~~~~~~r~d~~v~~~~A~ 416 (425) T protein:vir:10 342 GQPATLAGYPVTEVPDMPDVAANSTPILFGDFQQTYLIIDR-IGVR-VLRDPY---TAKPYVLFYTTKRVGGGLLNPEPM 416 (425) T ss_pred CCCceecceeeEEecCcCCccCCccEEEEEehhccEEEEEe-cceE-EEeccc---ccCCcEEEEEEEEeccEeecccce Confidence 5556799999998888883 34555554554221211 0111 122222 223455666677778888777777 Q ss_pred EEecCCC Q lcl|NC_018271. 
299 VLYTPAA 305 (305) Q Consensus 299 v~~~~~~ 305 (305) ++-+=+| T Consensus 417 ~~l~~~a 423 (425) T protein:vir:10 417 RAMKVAA 423 (425) T ss_pred EEEEeec Confidence 7777666 No 45 >protein:vir:2344 Length: 397 # NCBI annotation: gp14 # Family: family:all:507 # MgeID: mge:51 # MgeName: Bxb1 # Cross-refs: genbank:acc:NP_075281;genbank:gi:12657868;genbank:GeneID:920118 Probab=98.18 E-value=4.8e-07 Score=55.22 Aligned_cols=270 Identities=14% Similarity=0.069 Sum_probs=154.2 Q ss_pred CceEeeee-ccc-chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 1 MATTVDIT-TNY-VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~~~-~~Y-~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) |+.+-.-. +.| ..++..+++..+.....+-+ +.+++| ++.....+|+...+-..+--.+ +--+.++.+|.+..+ T Consensus 10 ~~~~~t~~~~g~l~~~~~~~ii~~l~~~s~i~~--l~~~~~-~~~~~~~ip~~~~~~~a~wv~Eg~~~~~s~~~f~~v~l 86 (397) T protein:vir:23 10 IAQTKDTMFTGYLDPVQAKDYFAEAEKTSIVQR--VAQKIP-MGATGIVIPHWTGDVSAQWIGEGDMKPITKGNMTKRDV 86 (397) T ss_pred HhhccCCCCccccchhHHHHHHHHHHhccchhh--hcceee-ccCCceEEEEEcCCcceEEecCCccccccccceeEEEE Confidence 55443322 122 34556677777776655544 467777 5555444555443332221111 112235678999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhcc Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADA 157 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~ 157 (305) .++++.+...++-+-.+ +-.+..+.++.+.|.++++..+++.++.|+++.....|+..... 
T Consensus 87 ~~~k~~~~v~iS~ell~----------------ds~~~l~~~i~~~l~~aia~~~d~a~l~G~gt~~~~~~~~~~~~--- 147 (397) T protein:vir:23 87 HPAKIATIFVASAETVR----------------ANPANYLGTMRTKVATAIAMAFDNAALHGTNAPSAFQGYLDQSN--- 147 (397) T ss_pred eeEEEEEeehhhHHHHh----------------cchHHHHHHHHHHHHHHHHHHHHHHHhhcccCCccccccccccc--- Confidence 99999998876533111 22355678888999999999999999999998666555533211 Q ss_pred ceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCC-----Cccee Q lcl|NC_018271. 158 TVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNP-----NEFDF 229 (305) Q Consensus 158 ~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~-----~~~~~ 229 (305) ... ...+..+. +.+.+...++...++. +-.+.||...+.+.+. ++.-.|.+. +...+ ...++ T Consensus 148 ~~~---~~~~~~~~----~~~~~~~~~l~~~~~~--~a~~vmn~~~~~~L~~-lkd~~G~~i~~~~~~~~~~~~~~~~tl 217 (397) T protein:vir:23 148 KTQ---SISPNAYQ----GLGVSGLTKLVTDGKK--WTHTLLDDTVEPVLNG-SVDANGRPLFVESTYESLTTPFREGRI 217 (397) T ss_pred cee---eecccchh----HHHHHHHHhhhhcccC--CCEEEEcHHHHHHHHH-hhccCCceeecccccccccccccCcee Confidence 111 11122332 3333444455666554 3489999999877664 332222211 11111 12368 Q ss_pred cceeeeeccCCCCC--eEEEecchHHhhhhhhhhhhhhccccc-------------eeeeccceeEEEEEEeecceeecc Q lcl|NC_018271. 230 EGYTLTEIKGLPAS--RMVGYNRDNIVIGMSAQSDFNEIRIKD-------------MGDVDLSGQIRTKMVLSAGVEYAY 294 (305) Q Consensus 230 kGi~iv~l~~~Pd~--~ii~T~~sNl~~gvnl~~D~n~I~I~~-------------~~~~~~~~~~f~k~~m~~d~~i~f 294 (305) .|++++....+|++ .++..+-+++++|.. .+.. +++.| .-+++-..+..|...+.+|+.+.. T Consensus 218 ~G~Pv~~s~~~~~g~~~~~~gDfs~~~i~~~--~~i~-i~~~~e~~~~~~~~~~~~~~~lf~~d~v~~ra~~r~d~~v~~ 294 (397) T protein:vir:23 218 LGRPTILSDHVAEGDVVGYAGDFSQIIWGQV--GGLS-FDVTDQATLNLGSQESPNFVSLWQHNLVAVRVEAEYGLLIND 294 (397) T ss_pred eeeeEEEeCCCCCCceEEEEeecceEEEEEE--eceE-EEEeeeeeeeeccccccceeeeeeccceeEEEEeeeccceec Confidence 99999999999877 446677788765431 1111 11111 112334455667777888888877 Q ss_pred CCeEEEecCCC Q lcl|NC_018271. 
295 GAEIVLYTPAA 305 (305) Q Consensus 295 g~E~v~~~~~~ 305 (305) ++=++.-+-.. T Consensus 295 ~~a~~~~~~~~ 305 (397) T protein:vir:23 295 VNAFVKLTFDP 305 (397) T ss_pred ccceEEEeecc Confidence 66665544333 No 46 >protein:vir:99920 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:1611 # MgeName: Halo # Cross-refs: genbank:acc:YP_655524;genbank:gi:109392294;genbank:GeneID:4157089 Probab=98.07 E-value=9.1e-07 Score=53.71 Aligned_cols=278 Identities=14% Similarity=0.113 Sum_probs=145.1 Q ss_pred CceEeeeec-ccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEec Q lcl|NC_018271. 1 MATTVDITT-NYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~-~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~~ 73 (305) ||++-.-.. .--.++..+|+..+.....+.+- .+++| ++.....+|+...+- .+.|.+ -++.+|. T Consensus 1 Mat~tt~~g~~vP~~~~~~ii~~~~~~s~l~~~--~~~i~-~~~~~~~~p~~~~~~-----~a~wv~Eg~~~~~~~~~f~ 72 (311) T protein:vir:99 1 MATFGTGNLKNLPRNIADGMVKDVVQGSTVAVL--SARKP-QRFGNEDIITFNGRP-----KAEFVGEGQQKSSTTGEFD 72 (311) T ss_pred CceecCCCceeccHHHHHHHHHHHHhhchhhhh--cceee-ccCCceEEEEEeCCc-----eeEEeecCcccccccceee Confidence 998765322 22345556666666666555444 44443 222223344433221 123432 3577899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCc--cchhHHHHH Q lcl|NC_018271. 74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGT--TGNLQGILP 151 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s--~~~fdG~lk 151 (305) +..|.++++.+...++=+ +.+ .+ ++--..++.++.+.|.++++..+++.++.|+++ .+...|... 
T Consensus 73 ~v~l~~~k~~~~~~iS~e-ll~------~~------~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~g~~~g~~~~g~~~ 139 (311) T protein:vir:99 73 FVTSTPKKAQVTMRFNEE-VQW------AD------EDYQLGVLQTLSEAGAEALARALDLGLYHRINPLTGTVIPGWSN 139 (311) T ss_pred EEEEeeEEEEEeehhhHH-Hhh------cc------cccHHHHHHHHHHHHHHHHHHHHHHHhhcccCcccCcccccccc Confidence 999999999997766532 111 11 011244567788899999999999999999875 334566555 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcce Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFD 228 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~ 228 (305) .+...+.++...... .+...+.+.+++..+.........-.+.||...+...+- ++...|.+. ....+.... T Consensus 140 ~~~~~~~~~~~~~~~----~~~~~~~i~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~-lkd~~G~~l~~~~~~~~~~~~ 214 (311) T protein:vir:99 140 YLGAASKRVELTADT----IANPDLAIEAAVGLLVANGHPTPVNGLALHPSIAWGLST-ARYTDGRKKFPELGLGIGVSS 214 (311) T ss_pred ccccccceeeccccc----cchhHHHHHHHHHHHhhhccCCCccEEEEcHHHHHHHHh-hhccCCCeeecCcccCCCCce Confidence 554444444322222 222333444444443333111111249999999877643 443333332 122333457 Q ss_pred ecceeeeeccCCCCCeEEEecchHHhhh---hhhhhhhhh-cc--------cc--ce------eeeccceeEEEEEEeec Q lcl|NC_018271. 229 FEGYTLTEIKGLPASRMVGYNRDNIVIG---MSAQSDFNE-IR--------IK--DM------GDVDLSGQIRTKMVLSA 288 (305) Q Consensus 229 ~kGi~iv~l~~~Pd~~ii~T~~sNl~~g---vnl~~D~n~-I~--------I~--~~------~~~~~~~~~f~k~~m~~ 288 (305) +.|++++.-..+|.+.......+=+..+ .-+..|++. ++ ++ +. .+++...+.-++..+-+ T Consensus 215 l~G~Pv~~s~~i~~~~~~~~~~~~~~~~~~~~~~~Gdf~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~~~~r~~~r~ 294 (311) T protein:vir:99 215 FEGIDASVSDTVNGGDEADPDDEDLDAARAVRGIVGDFANGIHWGVQRDIPVELIKYGDPDGQGDLKRHNQIALRLEIVY 294 (311) T ss_pred ecceeeEeecccccccccccccchhhccCcceEEEeeccccEEEEEecCceEEEeecCCCCcchhhhhcCcEEEEEEEee Confidence 9998887765555443322221100000 001122211 11 11 11 01222344455566667 Q ss_pred ceeeccCCeEEEecCCC Q lcl|NC_018271. 
289 GVEYAYGAEIVLYTPAA 305 (305) Q Consensus 289 d~~i~fg~E~v~~~~~~ 305 (305) |+.+.- +++|..+.++ T Consensus 295 d~~v~~-~~~v~~~~~~ 310 (311) T protein:vir:99 295 GWYVFT-DRFVVIENAV 310 (311) T ss_pred cceecC-hhHeeeeccc Confidence 888755 6899999998 No 47 >protein:vir:9509 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:170 # MgeName: phiN315 # Cross-refs: genbank:acc:NP_835556;genbank:gi:30043951;genbank:GeneID:1260537 Probab=98.03 E-value=1.9e-07 Score=57.40 Aligned_cols=273 Identities=12% Similarity=0.085 Sum_probs=158.2 Q ss_pred CceEeeeecccc--hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcccc--CCCCCCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITTNYV--GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFV--DYSCGFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y~--Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q--~~~~~~~~~G~~~~~~K~ 76 (305) |.........|. -+...+|+..+..-..+.+. ++|.| ++.+. .+++........ +-.....+..+.+|.+.. T Consensus 76 ~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~i~~~--~~v~~-~~~~~-~i~~~~~~~~a~w~~e~~~~~~~~~~~f~~i~ 151 (381) T protein:vir:95 76 INKNVNYKEEKLLPEETIDRIFEDLTTNHPLLAD--LGIKN-AGLRL-KFLKSETSGVAVWGKIYGEIKGQLDAAFSEET 151 (381) T ss_pred HhcccCCCCceecCHHHHHHHHHHHHhhccceeh--eeeEe-cCcce-EEEEecCCcceeeecccccccccccccceeee Confidence 222222122222 34556666666666666554 77654 65553 344433222211 111112223456889999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) |.++++-++..++- + + +++=++..+.++.+.|+++++.-++..+++||++ +.--||++.+... 
T Consensus 152 l~~~kl~~~~~is~-e--------------l-L~Ds~~~ie~~i~~~la~~~a~~~~~a~i~G~G~-~qP~Gil~~~~~~ 214 (381) T protein:vir:95 152 AIQNKLTAFVVLPK-D--------------L-NDFGPAWIERFVRVQIEEAFAVALETAFLKGTGK-DQPIGLNRQVQKG 214 (381) T ss_pred ecceeEEeechhhH-H--------------H-hhcCHHHHHHHHHHHHHHHHHHHhhheeEeccCC-CCceeeeeccCcc Confidence 99999988776542 1 1 1233667789999999999999999999999997 4556888765543 Q ss_pred cceEEec----cCcCcCC---hhhHHHHHHHHHHhccHHHHh-----CCCcEEEecHHHHHHHHHHHhhhhccCCcccCC Q lcl|NC_018271. 157 ATVIDVV----GASGGIT---AANVEAELGKFIDAHTDEILQ-----APNHVFGVSTNVIRAIKRAYGTQARSNGTFLNP 224 (305) Q Consensus 157 ~~~~~~~----~~~~~iT---~anv~~~l~~~~~~iP~~~r~-----~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~ 224 (305) ..+..-+ .+.+++| ...+.+.+.++.++++...+. .++..+.|+...+-.....+..+. .++ T Consensus 215 ~~~~~g~~~~~~~~~t~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~l~~~~~~~~------~~G 288 (381) T protein:vir:95 215 VSVTEGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYTHLN------ANG 288 (381) T ss_pred cccccccccccccccccccccchhhHHHHHHHHHhhccccccccccccCceEEEEccccHHhhccccccCC------CCC Confidence 2221100 0112333 344567788888888764321 245678899776544322221111 122 Q ss_pred Ccce--ecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEEe Q lcl|NC_018271. 225 NEFD--FEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLY 301 (305) Q Consensus 225 ~~~~--~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~ 301 (305) ++.. -.|++++.-..+|++.|+..+-++.+++.+ ..++|+... ....+.+..|...+-+|....=++=+|++ T Consensus 289 ~~v~~l~~g~~vv~s~~~p~~~iifgDfs~Y~i~~r-----~~~~i~~~~~~~~~~d~~~f~a~~r~dg~~~~~~A~~v~ 363 (381) T protein:vir:95 289 VYVTALPFNLNVIESTVQEAGKVLTYVKGLYDGYLA-----GGINVQKFKETLALDDMDLYTAKQFAYGKAKDNKVAAVW 363 (381) T ss_pred ceeecCCCCceEEecCCCCcCcEEEEecccEEEEEe-----cccEEEeechhHhhcCCeEEEEEEEEcCEEecCceEEEE Confidence 2211 136677777889999999998888655432 234444321 22334445566777778888888888898 Q ss_pred cCCC Q lcl|NC_018271. 
302 TPAA 305 (305) Q Consensus 302 ~~~~ 305 (305) +.+. T Consensus 364 ~l~~ 367 (381) T protein:vir:95 364 KLDL 367 (381) T ss_pred EEEe Confidence 8766 No 48 >protein:vir:101291 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:1591 # MgeName: phiNM3 # Cross-refs: genbank:acc:YP_908831;genbank:gi:118725095;genbank:GeneID:4555862 Probab=98.03 E-value=1.9e-07 Score=57.40 Aligned_cols=273 Identities=12% Similarity=0.085 Sum_probs=158.2 Q ss_pred CceEeeeecccc--hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcccc--CCCCCCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITTNYV--GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFV--DYSCGFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y~--Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q--~~~~~~~~~G~~~~~~K~ 76 (305) |.........|. -+...+|+..+..-..+.+. ++|.| ++.+. .+++........ +-.....+..+.+|.+.. T Consensus 76 ~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~i~~~--~~v~~-~~~~~-~i~~~~~~~~a~w~~e~~~~~~~~~~~f~~i~ 151 (381) T protein:vir:10 76 INKNVNYKEEKLLPEETIDRIFEDLTTNHPLLAD--LGIKN-AGLRL-KFLKSETSGVAVWGKIYGEIKGQLDAAFSEET 151 (381) T ss_pred HhcccCCCCceecCHHHHHHHHHHHHhhccceeh--eeeEe-cCcce-EEEEecCCcceeeecccccccccccccceeee Confidence 222222122222 34556666666666666554 77654 65553 344433222211 111112223456889999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) |.++++-++..++- + + +++=++..+.++.+.|+++++.-++..+++||++ +.--||++.+... 
T Consensus 152 l~~~kl~~~~~is~-e--------------l-L~Ds~~~ie~~i~~~la~~~a~~~~~a~i~G~G~-~qP~Gil~~~~~~ 214 (381) T protein:vir:10 152 AIQNKLTAFVVLPK-D--------------L-NDFGPAWIERFVRVQIEEAFAVALETAFLKGTGK-DQPIGLNRQVQKG 214 (381) T ss_pred ecceeEEeechhhH-H--------------H-hhcCHHHHHHHHHHHHHHHHHHHhhheeEeccCC-CCceeeeeccCcc Confidence 99999988776542 1 1 1233667789999999999999999999999997 4556888765543 Q ss_pred cceEEec----cCcCcCC---hhhHHHHHHHHHHhccHHHHh-----CCCcEEEecHHHHHHHHHHHhhhhccCCcccCC Q lcl|NC_018271. 157 ATVIDVV----GASGGIT---AANVEAELGKFIDAHTDEILQ-----APNHVFGVSTNVIRAIKRAYGTQARSNGTFLNP 224 (305) Q Consensus 157 ~~~~~~~----~~~~~iT---~anv~~~l~~~~~~iP~~~r~-----~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~ 224 (305) ..+..-+ .+.+++| ...+.+.+.++.++++...+. .++..+.|+...+-.....+..+. .++ T Consensus 215 ~~~~~g~~~~~~~~~t~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~l~~~~~~~~------~~G 288 (381) T protein:vir:10 215 VSVTEGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYTHLN------ANG 288 (381) T ss_pred cccccccccccccccccccccchhhHHHHHHHHHhhccccccccccccCceEEEEccccHHhhccccccCC------CCC Confidence 2221100 0112333 344567788888888764321 245678899776544322221111 122 Q ss_pred Ccce--ecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEEe Q lcl|NC_018271. 225 NEFD--FEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLY 301 (305) Q Consensus 225 ~~~~--~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~ 301 (305) ++.. -.|++++.-..+|++.|+..+-++.+++.+ ..++|+... ....+.+..|...+-+|....=++=+|++ T Consensus 289 ~~v~~l~~g~~vv~s~~~p~~~iifgDfs~Y~i~~r-----~~~~i~~~~~~~~~~d~~~f~a~~r~dg~~~~~~A~~v~ 363 (381) T protein:vir:10 289 VYVTALPFNLNVIESTVQEAGKVLTYVKGLYDGYLA-----GGINVQKFKETLALDDMDLYTAKQFAYGKAKDNKVAAVW 363 (381) T ss_pred ceeecCCCCceEEecCCCCcCcEEEEecccEEEEEe-----cccEEEeechhHhhcCCeEEEEEEEEcCEEecCceEEEE Confidence 2211 136677777889999999998888655432 234444321 22334445566777778888888888898 Q ss_pred cCCC Q lcl|NC_018271. 
302 TPAA 305 (305) Q Consensus 302 ~~~~ 305 (305) +.+. T Consensus 364 ~l~~ 367 (381) T protein:vir:10 364 KLDL 367 (381) T ss_pred EEEe Confidence 8766 No 49 >protein:vir:80684 Length: 315 # NCBI annotation: gp6 # Family: family:all:966 # MgeID: mge:1884 # MgeName: PA6 # Cross-refs: genbank:acc:YP_001285582;genbank:gi:148727088;genbank:GeneID:5247055 Probab=98.03 E-value=1.4e-06 Score=52.69 Aligned_cols=271 Identities=13% Similarity=0.108 Sum_probs=153.9 Q ss_pred CceEeeeeccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 1 MATTVDITTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) ||....-.+.| -.+...+|+..+.....+.+. .+++|--... ..+|+...+-. +.|.+ .++.+| T Consensus 1 Ma~~~~~~gg~~vP~~~~~~ii~~l~~~s~i~~l--~~~i~~~~~~-~~ip~~~~~~~-----a~wv~Eg~~~~~s~~~f 72 (315) T protein:vir:80 1 MADDFLSAGKLELPGSMIGAVRDRAIDSGVLAKL--SPEQPTIFGP-VKGAVFSGVPR-----AKIVGEGEVKPSASVDV 72 (315) T ss_pred CCCCcCCcCceEcchHHHHHHHHHHHhhchhhhh--cceeecCCCc-eEEEEEeCCcc-----eEEeeCCccccccccce Confidence 99877654433 234446666666666555554 4555543222 33444332222 23432 357899 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc--chhHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT--GNLQGIL 150 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~--~~fdG~l 150 (305) ++..|.++++.+...++-+=.+. + ..+....++..+.+.|++.++..++..++.|++.. ....|+. 
T Consensus 73 ~~v~l~~~kl~~~~~iS~ell~~-------s-----~~~~~~~l~~~i~~~la~ai~~~~d~a~~~G~~~~~~~~~~~~~ 140 (315) T protein:vir:80 73 SAFTAQPIKVVTQQRVSDEFMWA-------D-----ADYRLGVLQDLISPALGASIGRAVDLIAFHGIDPATGKAASAVH 140 (315) T ss_pred eeeEeeeeeEEeeehhhHHHhhc-------C-----chhHHHHHHHHHHHHHHHHHHHHHhhheeeccCCCCCccccccc Confidence 99999999999988765321111 1 12344557788889999999999999999997642 3355544 Q ss_pred HHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccH-HHHhCCCcEEEecHHHHHHHHHHHhhhhccC--Cc-----cc Q lcl|NC_018271. 151 PLLEADATVIDVVGASGGITAANVEAELGKFIDAHTD-EILQAPNHVFGVSTNVIRAIKRAYGTQARSN--GT-----FL 222 (305) Q Consensus 151 k~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~-~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~--~~-----~t 222 (305) ..+...+.++. . ...+. +.+.+++.++.. .++. .+ .+.||...+.+.+ .++...++. .. .. T Consensus 141 ~~~~~~~~~~~---~-~~~~~----~d~~~~~~~~~~~~~~~-~~-~~imn~~~~~~L~-~l~~~~g~~~~g~~~~~~~~ 209 (315) T protein:vir:80 141 TSLNKTKNIVD---A-TDSAT----ADLVKAVGLIAGAGLQV-PN-GVALDPAFSFALS-TEVYPKGSPLAGQPMYPAAG 209 (315) T ss_pred cccccccceee---c-cccch----HHHHHHHHHHhhccCcc-ce-EEEEcHHHHHHHH-HHhhccCCcccccccccccc Confidence 44322222221 1 11222 334445544432 2222 22 6999988876653 332222221 11 12 Q ss_pred CCCcceecceeeeeccCCCCC---------eEEEecchHHhhhhhhhhhhhhcccccee-------eeccceeEEEEEEe Q lcl|NC_018271. 223 NPNEFDFEGYTLTEIKGLPAS---------RMVGYNRDNIVIGMSAQSDFNEIRIKDMG-------DVDLSGQIRTKMVL 286 (305) Q Consensus 223 ~~~~~~~kGi~iv~l~~~Pd~---------~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-------~~~~~~~~f~k~~m 286 (305) .+....+.|++++--..+|++ .++..+=+++++|+. +++. ++|.+-. +++-..++-++..+ T Consensus 210 ~g~~~tl~G~PV~~~~~~~~~~~~~~~~~~~~~~GDfs~~~~g~~--~~~~-i~i~~~~~~~~~~~~~~~~~~v~~r~~~ 286 (315) T protein:vir:80 210 FAGLDNWRGLNVGASSTVSGAPEMSPASGVKAIVGDFSRVHWGFQ--RNFP-IELIEYGDPDQTGRDLKGHNEVMVRAEA 286 (315) T ss_pred cCCCceecceeeEecCcCCcccccccccccEEEEeecccEEEEEe--cCee-EEEeccccccCcccchhhcCcEEEEEEE Confidence 223457999998877777754 466677777766552 1221 2332221 12223446667777 Q ss_pred ecceeeccCCeEEEecCCC Q lcl|NC_018271. 
287 SAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 287 ~~d~~i~fg~E~v~~~~~~ 305 (305) -.|+.+..++=||+-+.++ T Consensus 287 r~~~~v~~~~a~~~l~~~~ 305 (315) T protein:vir:80 287 VLYVAIESLDSFAVVKEKA 305 (315) T ss_pred EecceeecccceEEEeecc Confidence 8899999999999998766 No 50 >protein:vir:2430 Length: 318 # NCBI annotation: major head subunit # Family: family:all:507 # MgeID: mge:52 # MgeName: D29 # Cross-refs: genbank:acc:NP_046832;genbank:gi:9630400;genbank:GeneID:1261582 Probab=98.01 E-value=1.4e-06 Score=52.68 Aligned_cols=273 Identities=11% Similarity=0.030 Sum_probs=150.0 Q ss_pred CceEeeeec-ccch-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 1 MATTVDITT-NYVG-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~~~~-~Y~G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) |+++-+... .... +...+++..+.....+.+. ++++| ++..+..+|+...+-..+--.+ +.-+-++.+|++..+ T Consensus 14 ~~~~~~~~~~~~ip~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~ip~~~~~~~a~~v~Eg~~~~~~~~~f~~i~~ 90 (318) T protein:vir:24 14 IAQTGDTMFKGYLEPEQAKDYFAEAEKTSIVQQF--AQKVP-MGTTGQKIPHWVGDVSAQWIGEGDMKPITKGNMTSQTI 90 (318) T ss_pred hhcccCcccceeechhHHHHHHHHHHhhchhhhh--cceee-ccCCceEEEEEeCCcceEEecCCccccccccceeEEEE Confidence 555444322 2222 3334444444444444333 67766 3344344554443222221111 111235678999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhcc Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADA 157 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~ 157 (305) .++++-+...++=+-.+ +-++..+..+.+.|.++++..++..++.|+++.. -.|+.... 
T Consensus 91 ~~~k~~~~~~iS~e~l~----------------ds~~~~~~~i~~~l~~~~~~~~d~a~l~G~g~~~-~~~~~~~~---- 149 (318) T protein:vir:24 91 APHKIATIFVASAETVR----------------ANPANYLGTMRTKVATAFAMAFDGAAMHGTDSPF-PTYIGQTT---- 149 (318) T ss_pred eeEEEEEeehhhHHHhh----------------cChHHHHHHHHHHHHHHHHHHHHHhhhcccCCCC-Cccccccc---- Confidence 99999998876532111 2235577888899999999999999999998632 23444332 Q ss_pred ceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCc-----cee Q lcl|NC_018271. 158 TVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNE-----FDF 229 (305) Q Consensus 158 ~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~-----~~~ 229 (305) .++......... ......+.+....+....+. +..+.||...|.+++. ++.-.|.+. ....+.. .++ T Consensus 150 ~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~--~~~~v~n~~~~~~L~~-lkd~~G~~l~~~~~~~~~~~~~~~~~i 224 (318) T protein:vir:24 150 KAISIADTTGAT--TVYDQVAVNGLSLLVNDGKK--WTHTLLDDITEPILNG-AKDQNGRPLFIESTYGEAASPFRSGRI 224 (318) T ss_pred cccccccccccc--chHHHHHHHHHHhhccccCC--CCEEEEcHHHHHHHHH-hhccCCceeecCccccCccccccCceE Confidence 222211111111 11222333344444444443 4589999999987763 433222211 1222222 246 Q ss_pred cceeeeeccCCCCC--eEEEecchHHhhhhhhhhhhhhccccc-------------eeeeccceeEEEEEEeecceeecc Q lcl|NC_018271. 230 EGYTLTEIKGLPAS--RMVGYNRDNIVIGMSAQSDFNEIRIKD-------------MGDVDLSGQIRTKMVLSAGVEYAY 294 (305) Q Consensus 230 kGi~iv~l~~~Pd~--~ii~T~~sNl~~gvnl~~D~n~I~I~~-------------~~~~~~~~~~f~k~~m~~d~~i~f 294 (305) .|++++....+|++ .++..+-+.+++|.. .++. +++.+ ..+++-..+..++..+-+|+.+.. T Consensus 225 ~g~pv~~~~~~~~~~~~~~~gdfs~~~~~~~--~~l~-i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~ 301 (318) T protein:vir:24 225 VARPTILSDHVVEGTTVGFMGDFSQLIWGQI--GGLS-FDVTDQATLNLGTVESPNFVSLWQHNLVAVRVEAEYAFHCND 301 (318) T ss_pred EEEeeEEeCCCCCCccEEEEeecceEEEEEe--cCeE-EEEeeccceeccccccccchhhhhcCcEEEEEEEEEccEEec Confidence 67777777888855 566777777765531 1221 11111 012233456778888888999999 Q ss_pred CCeEEEecCCC Q lcl|NC_018271. 
295 GAEIVLYTPAA 305 (305) Q Consensus 295 g~E~v~~~~~~ 305 (305) ++=|+.-++.+ T Consensus 302 ~~a~~~i~~~~ 312 (318) T protein:vir:24 302 AEAFVALTNVV 312 (318) T ss_pred ccceEEEEeec Confidence 98899999977 No 51 >protein:vir:97255 Length: 310 # NCBI annotation: hypothetical protein ORF017 # Family: family:all:1120 # MgeID: mge:1657 # MgeName: M6 # Cross-refs: genbank:acc:YP_001294525;genbank:gi:149408246;genbank:GeneID:5237120 Probab=97.89 E-value=2.6e-07 Score=56.73 Aligned_cols=270 Identities=14% Similarity=0.173 Sum_probs=142.7 Q ss_pred CceEeee--ecccchhHHHHHHHHhhccccchhcCceE-EecCCCCcccccchhhhhccc--cCCCCCCC----CccceE Q lcl|NC_018271. 1 MATTVDI--TTNYVGEVAGGYFLEMVKEANTISDNLIR-VIPNVPENNLFLRRMNTTDDF--VDYSCGFT----PSGEVD 71 (305) Q Consensus 1 ma~~~~~--~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~-v~~~v~~~~~~~~~~~~~~~~--q~~~~~~~----~~G~~~ 71 (305) |+..--+ ...+......+++-.++.-+++....--+ |..|. .-..|.++.-.. .+-...++ +.+..+ T Consensus 1 mpaltLaea~k~~~d~l~~~ViE~~~~~s~lL~~LpF~~veg~~----~~ynR~~~~~~~~~~~v~~~~~~~g~~~~~~t 76 (310) T protein:vir:97 1 MASVTLAESAKLAQDELVAGVIENIITVNRMFDVLPFDSIEGNS----LAYNRENVLGDVIMAGVGTTFSGAGAGKAAAT 76 (310) T ss_pred CcccchHHHhhcCcchHHHHHHHHHhccchHHHhCCcccccCCc----ceeeEeeccCCcccccccccccCCCccccccc Confidence 8733333 35888888999999999999988753221 22221 112223221111 11111222 223344 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) ++++.-+.+-+-..+.+.- ..+.-.. 
++ |..+-..-+++-.+.+++...+.+++||++.+.|+|.++ T Consensus 77 ~~~~~~~L~i~~g~~~Vd~------~i~dl~~-~~------~~dq~~~Ql~~~iea~~~~~e~~lINGD~a~n~F~GL~~ 143 (310) T protein:vir:97 77 FTKVNSNLTTIMGDAEVNG------LIQATRS-GD------GNDQTAVQIASKAKSAGRKYQDQLINGNGAGNEFAGLIQ 143 (310) T ss_pred cceeeeeeeeeeehhhhhh------HHHhhhc-CC------hHHHHHHHHHHHHHHHHHHHHHHhhccccCCCcccchhh Confidence 5555444444433333221 1111112 21 323222334666778999999999999998888999999 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCccc----CCCc- Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFL----NPNE- 226 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t----~~~~- 226 (305) +++. .+.+..-+..+.+|.+...+.+..+|+. +..+ -.|+||.....++....++..+.+.-.. .++. T Consensus 144 ~~~~-~q~i~~~~~gg~~t~d~LDeLl~~v~~~-----~g~p-~~~l~~~~~~r~i~A~~R~~~~~g~~~~~~~~~G~~v 216 (310) T protein:vir:97 144 LCAS-GQKATTGATGSAISFAILDELMDLVVDK-----DGQV-DYLTMHARTLRSYKALLRALGGASINEVVELPSGAEV 216 (310) T ss_pred cCCc-cceeecCCCCCCCCHHHHHHHHHHHhcC-----CCCC-CEEEecHHHHHHHHHHHHHhcCCCCCCccccCCCCEE Confidence 9854 3444422223556765444444444432 2223 3799999999999888877766665443 2222 Q ss_pred ceecceeeeeccCCCCCeEE--EecchHHhhhhhhhhhhh-------------hccccceeeec--cceeEEEEEEeecc Q lcl|NC_018271. 227 FDFEGYTLTEIKGLPASRMV--GYNRDNIVIGMSAQSDFN-------------EIRIKDMGDVD--LSGQIRTKMVLSAG 289 (305) Q Consensus 227 ~~~kGi~iv~l~~~Pd~~ii--~T~~sNl~~gvnl~~D~n-------------~I~I~~~~~~~--~~~~~f~k~~m~~d 289 (305) ..|.|++|.|...+|.+-.- ++.++++ ++|.+-.|.. 
-+.+..+..++ +--+|+.++.- T Consensus 217 ~~~~GiPi~~~d~ip~~~~~~~~~gtTsI-ya~r~Ge~~~~~Gv~Gl~~~~~~glsVr~~G~~~~~~v~~~~V~~Y~--- 292 (310) T protein:vir:97 217 PAYSGTPIFRNDYIPTNQTKGGTTGCTTI-FAGTLDDGSRTHGIAGLTATQAAGIQVVDVGESEDSDEHIWRVKWYC--- 292 (310) T ss_pred eeeCCeEEEEeCccCCCccccccCCceeE-EEEeeCccccccceeccccCCccceeEEeCCcccCCcceeEEEEEee--- Confidence 25999999999888876332 1224444 2344433321 12222222111 11224443332 Q ss_pred eeeccCCeEEEecCCC Q lcl|NC_018271. 290 VEYAYGAEIVLYTPAA 305 (305) Q Consensus 290 ~~i~fg~E~v~~~~~~ 305 (305) =+++-+|.| T Consensus 293 -------~~av~~~~A 301 (310) T protein:vir:97 293 -------GLALFSEKG 301 (310) T ss_pred -------eEEEecccc Confidence 235567777 No 52 >protein:vir:78350 Length: 383 # NCBI annotation: Cps # Family: family:all:635 # MgeID: mge:1850 # MgeName: B025 # Cross-refs: genbank:acc:YP_001468644;genbank:gi:157325222;genbank:GeneID:5601696 Probab=97.79 E-value=1.2e-06 Score=52.98 Aligned_cols=272 Identities=14% Similarity=0.109 Sum_probs=154.1 Q ss_pred CceEeeeecccc--hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcccc--CCCCCCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITTNYV--GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFV--DYSCGFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y~--Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q--~~~~~~~~~G~~~~~~K~ 76 (305) |.+.......|. -+..++|+..+..-..+.+. ++|.| ++.+. .+++........ +-........+.+|.+.. T Consensus 83 ~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~l~~~--~~v~~-~~~~~-~i~~~~~~~~a~w~~e~~~~~~~~~~~f~~i~ 158 (383) T protein:vir:78 83 INKEVGYKEETLLPQTVVDEIFEDLTTEHPFLAS--IGMRT-TGLRT-KFLKSETSGVAVWGKIFGEIKGQLDATFSDEE 158 (383) T ss_pred HhccCCCCCccccCHHHHHHHHHHHHhhccceee--eeeEe-cCCce-EEEEEcCCcceEEeecccccccccCcceeeEe Confidence 333333222232 34566777666666666665 77664 66553 344433322221 001111223466788899 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 
77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) |.++++-++..++ .+ + +++=++.++.++.+.|+++++.-+++.++.||++ +.--||++.+... T Consensus 159 l~~~kl~~~i~is-~e--------------l-l~Ds~~~ie~~i~~~l~~~~a~~~~~a~i~G~G~-~qP~Gil~~~~~~ 221 (383) T protein:vir:78 159 SIQNKLTAFVVVP-KD--------------L-EKFGPAWVKRFVVTQIEEAFAVALESAYIVGDGN-DKPIGLNRKVGKG 221 (383) T ss_pred ecceeeEeeccch-HH--------------H-hhccHHHHHHHHHHHHHHHHHHHHhhheEeccCC-CCceeeeeccCCc Confidence 9998887776552 11 1 2344677889999999999999999999999985 4577888765443 Q ss_pred cceEEe-cc---CcCcCCh---hhHHHHHHHHHHhccHH-----HHhCCCcEEEecHHHH-HHHHHHHhhhhccCCcccC Q lcl|NC_018271. 157 ATVIDV-VG---ASGGITA---ANVEAELGKFIDAHTDE-----ILQAPNHVFGVSTNVI-RAIKRAYGTQARSNGTFLN 223 (305) Q Consensus 157 ~~~~~~-~~---~~~~iT~---anv~~~l~~~~~~iP~~-----~r~~~~l~~f~S~~~~-d~Y~d~~~~~~~k~~~~t~ 223 (305) ..+..- .+ +.+++|. ......+..+++..+-. .+..++++.+|+...| +..- .+..+ ..+ T Consensus 222 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~n~~~~~~~~~-~~~~~------~~~ 294 (383) T protein:vir:78 222 STVVDGVYAEKAATGTLTFANPKTTVNELTDVYKYHSVKENGHPLNVAGKVTLLVNPTDAWDVKK-QYTSL------NAN 294 (383) T ss_pred ccccccccccccccchhhhhhhHHHHHHHHHHHhccchhcccchhhhcCceEEEEcCcchhhhcc-chhcc------CCC Confidence 332211 11 1122333 34555666666665542 2223567888887543 2211 11111 112 Q ss_pred CCcceec--ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEE Q lcl|NC_018271. 224 PNEFDFE--GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVL 300 (305) Q Consensus 224 ~~~~~~k--Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~ 300 (305) +...... |++++.-..+|++.|+..+-+..+++.+ ..++|+... 
.-....+..|...+-+|...--.+=+|+ T Consensus 295 G~~~t~l~~~~~iv~s~~~p~~~iifgdfs~Y~i~~r-----~~~~i~~~~~~~f~~d~~~f~~~~r~dG~~~~~~A~~v 369 (383) T protein:vir:78 295 GVYVTALPFNLNIIESLFVPEKKAISYVAERYDALIG-----GPLDIGTYDQTLAIEDLNLYAAKQFAYGKAKDDKAAAV 369 (383) T ss_pred CceeeecCCCceEEecCCCCcccEEEeeccceEEEec-----ccceEEecchhhhhcCceEEEEEEEEcCEEecCCeEEE Confidence 2222222 5556666789999888888888665432 234444221 1223344566677778888888888888 Q ss_pred ecCCC Q lcl|NC_018271. 301 YTPAA 305 (305) Q Consensus 301 ~~~~~ 305 (305) ++.+- T Consensus 370 l~~~~ 374 (383) T protein:vir:78 370 WTLNI 374 (383) T ss_pred EEEEe Confidence 87554 No 53 >protein:vir:4226 Length: 326 # NCBI annotation: observed 35.2Kd protein # Family: family:all:507 # MgeID: mge:89 # MgeName: L5 # Cross-refs: genbank:acc:NP_039681;swissprot:sw:q05223;genbank:gi:9625447;uniprot:Q05223;genbank:GeneID:2942929 Probab=97.78 E-value=8.3e-06 Score=48.43 Aligned_cols=274 Identities=10% Similarity=0.008 Sum_probs=149.6 Q ss_pred CceEeeeecccc-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceeee Q lcl|NC_018271. 1 MATTVDITTNYV-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQLT 78 (305) Q Consensus 1 ma~~~~~~~~Y~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L~ 78 (305) +.+.......|. .+...+++..+.....+.+ +++++| ++.+....|+...+-.++-..+ +--+-++.+|.+..+. T Consensus 20 ~~~~~~~~g~~ip~~~~~~ii~~~~~~s~i~~--~~~~~~-~~~~~~~~p~~~~~~~a~~v~Eg~~~~~~~~~f~~i~~~ 96 (326) T protein:vir:42 20 AQTGDSMFEGYLEPEQAQDYFAEAEKISIVQQ--FAQKIP-MGTTGQKIPHWTGDVSASWIGEGDMKPITKGNMTSQTIA 96 (326) T ss_pred eeccccCCcceechhhHHHHHHHHHhcchhhh--hcceee-ccCCceEEEEEeCCcceEEecCCccccccccceeEEEEe Confidence 222222122233 3444556665555555544 367666 4344444555443333221111 1122356789999999 Q ss_pred eeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhccc Q lcl|NC_018271. 
79 LKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADAT 158 (305) Q Consensus 79 ~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~~ 158 (305) ++++.+...++-+- +. +=++..+.++.+.|.++++..+++.++.|+++. .-.|++........ T Consensus 97 ~~k~~~~v~iS~el---------l~-------~s~~~~~~~i~~~l~~a~~~~~d~a~l~G~gs~-~p~gi~~~~~~~~~ 159 (326) T protein:vir:42 97 PHKIATIFVASAET---------VR-------ANPANYLGTMRTKVATAFAMAFDNAAINGTDSP-FPTFLAQTTKEVSL 159 (326) T ss_pred eEEEEEeehhhHHH---------Hh-------cCHHHHHHHHHHHHHHHHHHHHHHHhhcccCCC-ccccccccccccce Confidence 99999988776431 11 113556788899999999999999999999863 33555543222221 Q ss_pred eEEecc--CcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCC-----cce Q lcl|NC_018271. 159 VIDVVG--ASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPN-----EFD 228 (305) Q Consensus 159 ~~~~~~--~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~-----~~~ 228 (305) +. ..+ .....+.+++ .............. .+-.++||...+.+.+ .++.-.|.+. ...++. ..+ T Consensus 160 ~~-~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~-~~a~~v~n~~~~~~L~-~lkd~~G~~l~~~~~~~~~~~~~~~~~ 233 (326) T protein:vir:42 160 VD-PDGTGSNADLTVYDA---VAVNALSLLVNAGK-KWTHTLLDDITEPILN-GAKDKSGRPLFIESTYTEENSPFRLGR 233 (326) T ss_pred ee-cccccccccchhHHH---HHHHHHhhhhhhcc-CccEEEEeHHHHHHHH-HhhccCCceeeccccccCccccccCce Confidence 11 111 1111222221 11122222233222 2458999999997776 3443333211 111121 236 Q ss_pred ecceeeeeccCCCCCeE--EEecchHHhhhhhhhhhhhhccccce---------------eeeccceeEEEEEEeeccee Q lcl|NC_018271. 229 FEGYTLTEIKGLPASRM--VGYNRDNIVIGMSAQSDFNEIRIKDM---------------GDVDLSGQIRTKMVLSAGVE 291 (305) Q Consensus 229 ~kGi~iv~l~~~Pd~~i--i~T~~sNl~~gvnl~~D~n~I~I~~~---------------~~~~~~~~~f~k~~m~~d~~ 291 (305) +.|++++....+|++.. +..+-+++++|.. ..+.++.. -+++...+.-+++.+-+|+. 
T Consensus 234 l~G~pv~~~~~~~~~~~~~~~Gd~s~~~~~~~-----~~~~v~~~~e~~~~~~~~~~~~~~~~~~~d~~~~r~~~~~d~~ 308 (326) T protein:vir:42 234 IVARPTILSDHVASGTVVGYQGDFRQLVWGQV-----GGLSFDVTDQATLNLGTPQAPNFVSLWQHNLVAVRVEAEYAFH 308 (326) T ss_pred eeeeeEEEcCCCCCCceEEEEeecceEEEEEe-----cceEEEEeecceeeecccccccchhhhhcCcEEEEEEEEeccE Confidence 89999999999998854 4567777765421 11222211 01223345667777888999 Q ss_pred eccCCeEEEecCCC Q lcl|NC_018271. 292 YAYGAEIVLYTPAA 305 (305) Q Consensus 292 i~fg~E~v~~~~~~ 305 (305) +..++=|+.-++.+ T Consensus 309 v~~~~a~~~l~~~~ 322 (326) T protein:vir:42 309 CNDKDAFVKLTNVD 322 (326) T ss_pred EecccceEEEeecc Confidence 98888888877776 No 54 >protein:vir:6212 Length: 434 # NCBI annotation: prohead protease # Family: family:all:21 # MgeID: mge:128 # MgeName: phBC6A52 # Cross-refs: genbank:acc:NP_852592;genbank:gi:31415852;genbank:GeneID:1489210 Probab=97.71 E-value=1.2e-05 Score=47.66 Aligned_cols=261 Identities=10% Similarity=0.076 Sum_probs=134.5 Q ss_pred CceEee-eeccc-chhHH-HHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCC----CCCCCccceEec Q lcl|NC_018271. 1 MATTVD-ITTNY-VGEVA-GGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS----CGFTPSGEVDIN 73 (305) Q Consensus 1 ma~~~~-~~~~Y-~Ge~l-~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~----~~~~~~G~~~~~ 73 (305) .|.+.. ....| ..+.. .+|+..+..-..+.+ +.+++|.- .. ..+|+...+....... ..-.+.++.+|. T Consensus 141 ~a~~~~t~~GG~lvP~~~~~~Ii~~l~~~~~i~~--~~~~~~~~-~~-~~~p~~~~~~~a~~~~~~~e~~~~~~~~~~f~ 216 (434) T protein:vir:62 141 RALGLVTGNGSVTIPDFLSKEIITYAQEENFLRR--LGTGVKTK-EN-IKYPVLVKKAEAQGHKNERTNNEMPETDIEFD 216 (434) T ss_pred hhhcccccccceecchhhHHHHHHhhhhhhhhhh--hcceeccC-Cc-eEEEEEecCCcccceeccccccccccccccee Confidence 222111 11122 12332 334433333333322 24544432 22 2233322222111110 112234678999 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 
74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..+.++++-+...++-+ ++ ++-++.++.++.+.|+++++..++..+++||++.+...|++... T Consensus 217 ~v~~~~~k~~~~~~iS~e---------ll-------~ds~~~l~~~i~~~la~~~~~~~d~~~l~G~G~~~~~~g~~~~~ 280 (434) T protein:vir:62 217 EIELSPTEFDALATVTKK---------LL-------ARTGLPIEQIVMDELKKAYVRKETQYMVNGDEANNINDGALAKK 280 (434) T ss_pred eEEeeheeeEeehhhHHH---------HH-------hcchHHHHHHHHHHHHHHHHHHHHHHHhccCCCCccccceeecc Confidence 999999999887765432 11 22255678899999999999999999999999877777765321 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCc-----ccCCCcce Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGT-----FLNPNEFD 228 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~-----~t~~~~~~ 228 (305) ++ +...+.+...+.+-++..+++..+|++ -+++||...|...+ .++.-.|.+.- ...+.... T Consensus 281 -----~~-----~~~~~~~~~~d~l~~l~~~l~~~~~~~--a~~v~n~~~~~~L~-~lkd~~G~~l~~~~~~~~~g~~~t 347 (434) T protein:vir:62 281 -----AV-----EFKTDEKNLYDALVKMKNTPVKEVRKK--ARWVLNTAALTKIE-TMKTDDGFPLLRPFNQAEGGIGYT 347 (434) T ss_pred -----cc-----cccccccchhhHHHHHHhhcchhhhcC--CEEEEcHHHHHHHH-HhhccCCCEeeccCCCccCCCCce Confidence 11 111233445677778888999998864 48999999997654 45443333321 11222346 Q ss_pred ecceeeeeccCCCCCe------EEEecchHHhhhhhhhhhhhhccccceeeec-cceeEEEEEEeecceeeccCCeEEEe Q lcl|NC_018271. 229 FEGYTLTEIKGLPASR------MVGYNRDNIVIGMSAQSDFNEIRIKDMGDVD-LSGQIRTKMVLSAGVEYAYGAEIVLY 301 (305) Q Consensus 229 ~kGi~iv~l~~~Pd~~------ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~-~~~~~f~k~~m~~d~~i~fg~E~v~~ 301 (305) +.|.+++....+|..- |+..+-+..+++- ...-+.|+...... 
-..++.|..+.-+|-- ++| T Consensus 348 l~G~pV~~~~~~~~~~~~~~~~i~~Gdfs~~~i~~----~~g~~~i~~~~~~~~~~~~v~~~~~~r~Dgk-------~i~ 416 (434) T protein:vir:62 348 LLGFPVEEEDAIDIPDSPDTPVFYFGDFSKFYIQD----VIGSLEVQKLVELFSRTNRVGFRIWNLLDAQ-------LIH 416 (434) T ss_pred ecceeeEEecCccCccCCCceEEEEeeccceEEEE----eeceeEEEeehhhhcccCceEEEEEeeecce-------eec Confidence 9999998888886432 3334434332111 01111122111111 1122333333333322 345 Q ss_pred cCCC Q lcl|NC_018271. 302 TPAA 305 (305) Q Consensus 302 ~~~~ 305 (305) +|.+ T Consensus 417 ~~~~ 420 (434) T protein:vir:62 417 SPFE 420 (434) T ss_pred Cccc Confidence 5555 No 55 >protein:vir:9759 Length: 303 # NCBI annotation: putative structural protein # Family: family:all:966 # MgeID: mge:175 # MgeName: 315.3 # Cross-refs: genbank:acc:NP_795521;genbank:gi:28876283;genbank:GeneID:1257824 Probab=97.65 E-value=6.2e-06 Score=49.14 Aligned_cols=278 Identities=12% Similarity=0.054 Sum_probs=146.3 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceeeee Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQLTL 79 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L~~ 79 (305) ||+.-+....=--++..+|+..+.....+.+.+ +++| ++..+..+|+...+-..+-..+ +--+-++.+|++..|.+ T Consensus 1 m~t~t~gg~liP~~~~~~ii~~l~~~s~i~~l~--~~~~-~~~~~~~ip~~~~~~~a~wv~E~~~~~~s~~~f~~v~l~~ 77 (303) T protein:vir:97 1 MGTETSKASLFDKHLVSDLINKVKGHSSLAKLS--SQKP-IPFNGSKEFTFTLDSDIDVVAENGKKTHGGLSLEPVTIVP 77 (303) T ss_pred CcccCCCCeEcchhHHHHHHHHHHhhchhhhhc--ceee-cCCCceEEEEEecCcceEEeecCccccccccceeeEEeee Confidence 998655322223334455555555555555553 4433 3223334454433222211111 11223567899999999 Q ss_pred eeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhccc- Q lcl|NC_018271. 
80 KKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADAT- 158 (305) Q Consensus 80 ~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~~- 158 (305) +++.+...++-+=.+ .+ .+--+.....+.+.|.++++..+++.++.|+.+.+...+..+-+..-.. T Consensus 78 ~kl~~~~~iS~ell~-------~~------~d~~~~l~~~i~~~la~a~~~~ld~a~l~G~~~~~g~~~~~~~~~~~~~~ 144 (303) T protein:vir:97 78 IKVEYGARLSDEFLY-------AT------EEEKIDILKAFNEGFAKKLARGIDLMAMHGINPRTKKASDVIGTNHFDSK 144 (303) T ss_pred EEEEEeehhhHHHhh-------cC------ccchHHHHHHHHHHHHHHHHHHHHhhhhcccccCCccccccccccccccc Confidence 999998876533111 11 0123455577778999999999999999997654332222221111000 Q ss_pred eEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc----cCCCcceecceee Q lcl|NC_018271. 159 VIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF----LNPNEFDFEGYTL 234 (305) Q Consensus 159 ~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~----t~~~~~~~kGi~i 234 (305) +...+.. -+.....+.+.++...+....++ ++ .+.||...+.+.+. ++.-.|.+.-. .......+.|+++ T Consensus 145 ~~~~~~~---~~~~~~~~~i~~~~~~~~~~~~~-~~-~~vmn~~~~~~L~~-lkd~~g~~~~~~~~~~~~~~~~l~G~Pv 218 (303) T protein:vir:97 145 VTQVVKF---TESEDADANIEAAVNLIQGAEGV-VT-GLAMDTEFSTALAK-VTNGEMGPKMYPELAWGANPDSINGLKS 218 (303) T ss_pred ccccccc---ccccchHHHHHHHHHHHhhcCCC-cc-EEEEcHHHHHHHHH-hhccCCCeEEecCccCCCCCceecceee Confidence 1111111 11122345555566665554333 22 69999999887753 33322222111 1223457999998 Q ss_pred eeccCCCCCe--------EEEecchHH-hhhhhhhhhhhhcccccee---------eeccceeEEEEEEeecceeeccCC Q lcl|NC_018271. 235 TEIKGLPASR--------MVGYNRDNI-VIGMSAQSDFNEIRIKDMG---------DVDLSGQIRTKMVLSAGVEYAYGA 296 (305) Q Consensus 235 v~l~~~Pd~~--------ii~T~~sNl-~~gvnl~~D~n~I~I~~~~---------~~~~~~~~f~k~~m~~d~~i~fg~ 296 (305) +.-..+|+.. ++..+=++. .+|+ -..++++-.. 
+++-..+..+...+-+|+.+..++ T Consensus 219 ~~s~~v~~~~~~~~~~~~~~~Gdf~~~~~~~~-----~~~~~~~~~~~~~~d~~~~~~~~~n~~~~r~~~r~~~~v~~p~ 293 (303) T protein:vir:97 219 SVNTTVGAGADEAESKDLVIIGDFESMFKWGY-----AKQIPMEIIKYGDPDNSGKDLKGYNQIYLRAEAYIGWGILDAK 293 (303) T ss_pred EEecccCCccccCCCccEEEEeeccccEEEEE-----ecCcEEEEeeccCCCCcchhhhhcCcEEEEEEEEeccEeeccc Confidence 8877777543 333332222 1222 2223332111 112233466666777899999999 Q ss_pred eEEEecCCC Q lcl|NC_018271. 297 EIVLYTPAA 305 (305) Q Consensus 297 E~v~~~~~~ 305 (305) =||.-+++- T Consensus 294 af~~l~~~~ 302 (303) T protein:vir:97 294 SFARVTKGE 302 (303) T ss_pred ceEEeeCCC Confidence 999999998 No 56 >protein:vir:9574 Length: 300 # NCBI annotation: gp40 # Family: family:all:966 # MgeID: mge:171 # MgeName: SM1 # Cross-refs: genbank:acc:NP_862879;genbank:gi:32469471;genbank:GeneID:1461316 Probab=97.65 E-value=9.3e-06 Score=48.18 Aligned_cols=271 Identities=12% Similarity=0.063 Sum_probs=152.3 Q ss_pred CceEeeeeccc-chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCC------CccceEec Q lcl|NC_018271. 1 MATTVDITTNY-VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFT------PSGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~~Y-~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~------~~G~~~~~ 73 (305) ||.+.+..... -.+...+|+..+....-+.+..-++..++ ....+|+...+.. +.|. +-++.+|. T Consensus 1 ma~~t~~~G~lip~~~~~~ii~~l~~~s~i~~l~~~~~~~~---~~~~~p~~~~~~~-----a~wv~Eg~~~~~s~~~f~ 72 (300) T protein:vir:95 1 MSEAQLSKGNLFNPELVTKVINKVKGHSSIAKLSPQKPIPF---NGQREFVFDFDSD-----IDIVAENGKKTHGGVSLD 72 (300) T ss_pred CcccccCCcceechhhHHHHHHHHHhhhhhhhhcceeeccC---CceEEEEEecCcc-----eEEeeCCcccccccccce Confidence 99988865433 34455555555555555544332332222 2222443322211 2332 23567899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCc----cchhHHH Q lcl|NC_018271. 
74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGT----TGNLQGI 149 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s----~~~fdG~ 149 (305) +..|.++++-+...++ +++.+ +.. + =.+..+.++.+.|.++++..+++.++.|+.+ ...+.|. T Consensus 73 ~v~l~~~k~~~~~~iS-~ell~-----~~~-d------~~~~l~~~i~~~l~~aia~~~d~~~l~G~~~~~g~~~~~~~~ 139 (300) T protein:vir:95 73 PVTIVPLKVEYGARVS-DEFLH-----ASE-E------AKVDMLTDFVEGFSKKLARGLDIMSIHGINPRTKQASTIIGD 139 (300) T ss_pred eeEeeeEEEEEeehhh-HHHhc-----cCC-C------CHHHHHHHHHHHHHHHHHHHHHHhhhhcccCCCCCCcccccc Confidence 9999999999977754 22211 111 1 1244556777889999999999999998543 2333332 Q ss_pred HHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCc Q lcl|NC_018271. 150 LPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNE 226 (305) Q Consensus 150 lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~ 226 (305) ... +..+...+...++. ..+.+.++...+....++ ++ .+.||...+.+.. .++.-.|.+. ..+.+.. T Consensus 140 ~~~---~~~~~~~~~~~~~~----~~~~i~~~~~~~~~~~~~-~~-~~vmn~~~~~~L~-~lkd~~G~~i~~~~~~~~~~ 209 (300) T protein:vir:95 140 NCF---DKKVTQTVPFKDTN----PDESMEDAVGMIDGSERD-IT-GAILDPIFTTALS-KMKNAEGGKLYPELAWGGVP 209 (300) T ss_pred ccc---ccccceeecccccc----hHHHHHHHHHHhhhcCCC-cc-EEEECHHHHHHHH-HhhccCCCeeccCccccCCC Confidence 222 11112222222233 355666666666665443 33 6999999987653 3332222221 2223345 Q ss_pred ceecceeeeeccCCCCC------eEEEecchHHh-hhhhhhhhhhhccccceee-------eccceeEEEEEEeecceee Q lcl|NC_018271. 227 FDFEGYTLTEIKGLPAS------RMVGYNRDNIV-IGMSAQSDFNEIRIKDMGD-------VDLSGQIRTKMVLSAGVEY 292 (305) Q Consensus 227 ~~~kGi~iv~l~~~Pd~------~ii~T~~sNl~-~gvnl~~D~n~I~I~~~~~-------~~~~~~~f~k~~m~~d~~i 292 (305) ..+.|++++--..+|.+ .++..+=++.+ +|+.- +.+ +++.+... 
++-..++.+...+-.|+.+ T Consensus 210 ~~l~G~Pv~~s~~v~~~~~~~~~~~~~GDf~~~~~~~~~~--~~~-~~v~~~~~~d~~~~~~f~~~~v~~r~~~r~d~~v 286 (300) T protein:vir:95 210 DAINGLAVDKNRTVSYSQTDPKNTAIVGDFETMFKWGYAK--EVP-MEIIKYGDPDNSGRDLKGYNQIYIRCEAYIGWGI 286 (300) T ss_pred ceecceeeEEecCCCCCCCCCccEEEEeeccceEEEEEec--ccE-EEEeeccCCCCcchhhhhcCcEEEEEEEeeccee Confidence 67999998766666543 46666656543 44321 221 33333221 2333447777788889999 Q ss_pred ccCCeEEEecCCC Q lcl|NC_018271. 293 AYGAEIVLYTPAA 305 (305) Q Consensus 293 ~fg~E~v~~~~~~ 305 (305) ..++=||.-+..+ T Consensus 287 ~~~~a~~~l~~~~ 299 (300) T protein:vir:95 287 MDAASFARIVKTG 299 (300) T ss_pred ecccceEEEecCC Confidence 9999999999999 No 57 >protein:vir:81227 Length: 413 # NCBI annotation: gp6, major capsid protein # Family: family:all:585 # MgeID: mge:1893 # MgeName: BFK20 # Cross-refs: genbank:acc:YP_001456736;genbank:gi:157168379;hssp:P49861;interpro:IPR006444;uniprot:Q9MBJ9;genbank:GeneID:5580350 Probab=97.61 E-value=7.7e-06 Score=48.60 Aligned_cols=272 Identities=10% Similarity=0.033 Sum_probs=142.1 Q ss_pred CceEeeee--cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccc-eE Q lcl|NC_018271. 1 MATTVDIT--TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGE-VD 71 (305) Q Consensus 1 ma~~~~~~--~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~-~~ 71 (305) ++...... ...-.+...+++........+.+. ++++| ++......++....+.. .-.+.|.+ .++ .+ T Consensus 118 ~~~~~~~~~~~~vp~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~~~~~~~~~~~-~~~a~~v~Eg~~~~~~~~~~ 193 (413) T protein:vir:81 118 STATLTDEFQGGYGTTWNRNIIYRRREKLVVADL--MDNLT-MTNTTIKYLMEKANRVV-EGGFKTVAEGGKKPYMRFAD 193 (413) T ss_pred hhcccccccccccchhhHHHHHHHHhhhhhHHhh--cceee-ccCCceeEEEecccccc-ccccceecCcccccccCccc Confidence 21111111 122333344455444445554443 44433 22222223333322221 11223432 344 47 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 
72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) |.+..+.++++-+...++= ++ +. +-| ..+.++...|+++++..+++.++.|+++.+.+.|++. T Consensus 194 f~~i~~~~~k~~~~~~iS~-el--------l~-------ds~-~l~~~i~~~la~~~~~~~d~~~l~G~G~~~~~~Gi~~ 256 (413) T protein:vir:81 194 FDIVTESLSKIAGLTKITD-EM--------IE-------DYD-FLVSYINARLLEELAIEEERQLLLGDGTGNNLTGLLK 256 (413) T ss_pred ceeeEeeeeeEEEeehhhH-HH--------HH-------HHH-HHHHHHHHHHHHHHHHHHHHHHhccCCCCCccccccc Confidence 8899999999998877652 22 11 113 3667888999999999999999999998888999976 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc-------cCC Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF-------LNP 224 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~-------t~~ 224 (305) ..... +.. . -+.....+.+.+....+.......++ .++||...|.+.+ .++.-.|.+.-. .++ T Consensus 257 ~~~~~--~~~---~---~~~~~~~~~i~~~~~~~~~~~~~~~~-~~vmn~~~~~~l~-~lkd~~G~~l~~~~~~~~~~~~ 326 (413) T protein:vir:81 257 RDGIQ--TLA---V---SNKDELADSIYKAMTNISLATPFQAD-ALVINPLDYQELR-LAKDANGQYYGGGVFQGQYGSG 326 (413) T ss_pred ccccc--ccc---c---cccchhHHHHHHHHHHhhhhccCCCc-EEEEcHHHHHHHH-HhhccCCceecccccccccccc Confidence 52111 111 1 11122233333333222222111233 5999999988754 443333322110 111 Q ss_pred ---CcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEE Q lcl|NC_018271. 225 ---NEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVL 300 (305) Q Consensus 225 ---~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~ 300 (305) ....+.|.+++....+|++.++..+-++.+..++- .++. +++.+.. 
...-+..+-+...+-+|+.+..++=|++ T Consensus 327 ~~~~~~~l~G~pv~~s~~~~~~~~~~gd~~~~~~~~~~-~~~~-v~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~ 404 (413) T protein:vir:81 327 GIMLDPAPWGLRTVQSQVVPVGKPVVGAFRSAASVLRK-GGVR-IDSTNTNVDDFENNLITVRAEERVGLMVTFPEAIVQ 404 (413) T ss_pred ccccCceecceeeEEcCCCCcccEEEEecccEEEEEEe-cceE-EEEeccccchhhcCcEEEEEEEeeccEEecccceEE Confidence 12358899999999999998888776653221111 1222 3333221 1122345666677777888888877766 Q ss_pred ecCCC Q lcl|NC_018271. 301 YTPAA 305 (305) Q Consensus 301 ~~~~~ 305 (305) -+=++ T Consensus 405 l~~~~ 409 (413) T protein:vir:81 405 LDVAE 409 (413) T ss_pred EEecC Confidence 54333 No 58 >protein:vir:2504 Length: 305 # NCBI annotation: major capsid subunit gp9 # Family: family:all:507 # MgeID: mge:53 # MgeName: TM4 # Cross-refs: genbank:acc:NP_569745;genbank:gi:18496895;genbank:GeneID:932268 Probab=97.59 E-value=2.4e-05 Score=45.95 Aligned_cols=269 Identities=12% Similarity=0.138 Sum_probs=144.1 Q ss_pred CceEeeeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC-----------c Q lcl|NC_018271. 1 MATTVDITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP-----------S 67 (305) Q Consensus 1 ma~~~~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~-----------~ 67 (305) ||++-.....| . .+...+|+..+.....+.+. ++++|- +.++..+|+...+.. +.|.+ . T Consensus 1 ma~~t~~~gg~liP~~~~~~Ii~~~~~~s~l~~l--~~~~~~-~~~~~~~p~~~~~~~-----a~wv~E~~~~~~~~~~~ 72 (305) T protein:vir:25 1 MADISRAEVASLIQEAYSDTLLAAAKQGSTVLSA--FQNVNM-GTKTTHLPVLATLPE-----ADWVGESATDPKGVKPT 72 (305) T ss_pred CCCccCCccceecCHHHHHHHHHHHHhhchhhhh--cceeec-cCCcEEEEEEeCCcc-----eEEeecccccccccccc Confidence 99888765432 3 44457777777777777665 676663 334344554443322 23422 2 Q ss_pred cceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchh- Q lcl|NC_018271. 
68 GEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNL- 146 (305) Q Consensus 68 G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~f- 146 (305) ++.+|.+..|.++++-+...++=+ + . ++-++..+.++.+.|.++++..+++.++.|+++.... T Consensus 73 s~~~f~~i~~~~~k~~~~~~is~e-l--------l-------~ds~~~~~~~i~~~l~~~~a~~~d~a~~~G~g~~~~~~ 136 (305) T protein:vir:25 73 SKVTWANRTLVAEEIAVIIPVHEN-V--------I-------DDATVAVLTEVAELGGQAIGKKLDQAVIFGTDKPASWV 136 (305) T ss_pred cccceeeEEeeeEEEEEeehhhHH-H--------H-------hcchHHHHHHHHHHHHHHHHHHHhhhheeccCCCCCcc Confidence 356899999999999888876532 1 1 2224567788999999999999999999999864332 Q ss_pred -HHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCC Q lcl|NC_018271. 147 -QGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPN 225 (305) Q Consensus 147 -dG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~ 225 (305) .+++.......+++ ....+..+..++...+..+...+-..... . -.+.|+...+.+.+ .++. ++..+.- . T Consensus 137 ~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~-~~~v~~~~~~~~l~-~lkd---~~G~~i~-~ 207 (305) T protein:vir:25 137 SPALIPAAVTAGQAV--EVVGGVANESDIVGATNRAAKAVASAGWA-P-DTLLSSLALRYEVA-NIRD---ANGNPVF-R 207 (305) T ss_pred ccccccccccccccc--cccccchhhhHHHHHHHHHHHhhhhcccc-c-ceeEecHHHHHHHH-Hhhc---cCCceee-c Confidence 11111111111111 11112223333444444433333222111 1 14899988877653 2322 1111111 1 Q ss_pred cceecceeeeeccCCC----CCeEEEecchHHhhhhhhhhhhhhcccccee---------eeccceeEEEEEEeecceee Q lcl|NC_018271. 226 EFDFEGYTLTEIKGLP----ASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG---------DVDLSGQIRTKMVLSAGVEY 292 (305) Q Consensus 226 ~~~~kGi~iv~l~~~P----d~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~---------~~~~~~~~f~k~~m~~d~~i 292 (305) ...+.|.+++-...+| +..++..+-+++++|... +.. +++.+=. 
+++-..++-+...+-.|+.+ T Consensus 208 ~~~l~G~Pv~~~~~~~~~~~~~~~~~gd~s~~~i~~~~--~~~-i~~~~~~~~~~~~~~~~~~~~~~~~~R~~~r~~~~v 284 (305) T protein:vir:25 208 DDSFAGFRTFFNRNGAWDADAAIEVIADSSRVKIGVRQ--DIT-VKFLDQATLGTGENQINLAERDMVALRLKARFAYVL 284 (305) T ss_pred CCcccccceEEcCccCCCCCccEEEEEecceEEEEEec--CeE-EEEeeeeeeecCCceeeeeecCcEEEEEEEeeccee Confidence 1368898877666664 347778888888766432 111 1111100 01111234455555567777 Q ss_pred ccCCeEEE--------ecCCC Q lcl|NC_018271. 293 AYGAEIVL--------YTPAA 305 (305) Q Consensus 293 ~fg~E~v~--------~~~~~ 305 (305) ..++-+|. =||++ T Consensus 285 ~~p~a~v~~~~~~~~~~~pa~ 305 (305) T protein:vir:25 285 GVSATAQGANKTPVAVVAPAA 305 (305) T ss_pred eCcccEEEEccccccccCCCC Confidence 66665554 25666 No 59 >protein:vir:4953 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:108 # MgeName: Sfi19 # Cross-refs: genbank:acc:NP_049929;genbank:gi:9632900;genbank:GeneID:1262076 Probab=97.55 E-value=3.6e-05 Score=44.94 Aligned_cols=257 Identities=8% Similarity=0.074 Sum_probs=140.6 Q ss_pred CceEeeeeccc-ch-hHHHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCCC--CCCC-ccceEec Q lcl|NC_018271. 1 MATTVDITTNY-VG-EVAGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYSC--GFTP-SGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~~Y-~G-e~l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~~--~~~~-~G~~~~~ 73 (305) |+........| .+ +...+|+..+.....+.+. ++++| +-..+..+++ .......-..-+ .-.+ .+..+|+ T Consensus 109 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~~--~~~~~~~~~~~~~~~~~-~~~~~~~a~~v~E~~~~~~~~~~~~~ 185 (397) T protein:vir:49 109 KTDASGSDAGLTIPQDIQTAIHTLVSQYDSLQEY--VNVENVTTLTGSRVYEK-WTDITGLANIDDEAGKIADVDDPKLS 185 (397) T ss_pred hhccccccCcccccHhHHHHHHHHHHhhhhHHhh--hceeecccCccceEEEe-eccCCcceeeecCcccccccccccee Confidence 44433333333 22 3445666666666666555 44433 3333322222 222222111111 1112 2456899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 
74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..+.++++-+...++-. + + ++-++..+.++.+.|.++++..++..++.|+++... T Consensus 186 ~i~~~~~k~~~~~~iS~e-l--------l-------~ds~~~l~~~i~~~l~~~~~~~~d~ai~~G~g~~~~-------- 241 (397) T protein:vir:49 186 LIKYTIKRYAGISTVTNS-L--------L-------ADSAENILAWLSGWIAKKVVVTRNKAILEAIAALPT-------- 241 (397) T ss_pred eEEeeeeeEEeeehhHHH-H--------H-------hhhHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccc-------- Confidence 999999999998775432 2 1 122355677888999999999999999999876321 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~k 230 (305) ..+.++ .+.+.++..+++..++.+ -.++||...|...+- ++.-.|.+. +...+....+. T Consensus 242 -----------~~~~~~----~d~i~~~~~~l~~~~~~~--a~~vmn~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~~l~ 303 (397) T protein:vir:49 242 -----------KPTLTK----WDDIIDLEAKVDPAIKQT--SFFLTNTSGFTALKK-VKNALGDYLMERDVKSPTGYSID 303 (397) T ss_pred -----------cccccc----HHHHHHHHHhhhhhhcCC--CEEEEcHHHHHHHHH-hhcCCCceeeccCcCCCCCceec Confidence 122233 334556677777777753 489999999876654 332222221 22334455799 Q ss_pred ceeeeecc--CCCCC-----eEEEecchHH-hhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEEe Q lcl|NC_018271. 231 GYTLTEIK--GLPAS-----RMVGYNRDNI-VIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLY 301 (305) Q Consensus 231 Gi~iv~l~--~~Pd~-----~ii~T~~sNl-~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~ 301 (305) |++++.+. ++|++ .++..+-++. +++.+ .+.. +++++.. 
+...+..+.+...+-+|+.+..++=||+- T Consensus 304 G~PV~~~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~--~~~~-i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~ 380 (397) T protein:vir:49 304 GFAVKEVADRWLANGTGGAMPLYFGDLKQAVTLFDR--QHMS-LLSTNIGGGAFETDTTKVRVIDRFDVVATDTEAFVPA 380 (397) T ss_pred ceeeEEecccccccccCCceeEEEeeccceEEEEee--cceE-EEEeccccchhhcCceeEEEEeeeCcEEecccceEEE Confidence 99887653 34543 3555555553 22111 1111 2222221 22233445556666778888888888877 Q ss_pred cCCC Q lcl|NC_018271. 302 TPAA 305 (305) Q Consensus 302 ~~~~ 305 (305) +-++ T Consensus 381 ~~~~ 384 (397) T protein:vir:49 381 SFKA 384 (397) T ss_pred Eeec Confidence 6544 No 60 >protein:vir:100632 Length: 381 # NCBI annotation: 77ORF006 # Family: family:all:635 # MgeID: mge:1476 # MgeName: 77 # Cross-refs: genbank:acc:NP_958606;genbank:gi:41189521;genbank:GeneID:2743778 Probab=97.55 E-value=4.8e-06 Score=49.76 Aligned_cols=270 Identities=12% Similarity=0.111 Sum_probs=153.5 Q ss_pred CceEeeeecccc--hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCC-------CCccceE Q lcl|NC_018271. 1 MATTVDITTNYV--GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGF-------TPSGEVD 71 (305) Q Consensus 1 ma~~~~~~~~Y~--Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~-------~~~G~~~ 71 (305) |.........|. -+..++|+..+..-..+.+. ++|+| ++... .+++..... .+.| .+..+.+ T Consensus 76 ~~~~t~~~Gg~lvP~~~~~~I~~~l~~~spir~~--a~v~~-~~~~~-~i~~~~~~~-----~a~W~~e~~~~~~~~~~~ 146 (381) T protein:vir:10 76 INKSVGYKEEKLLPEETIDRIFEDLTTNHPLLAD--LGIKN-AGLRL-KFLKSETSG-----VAVWGKIYGEIKGQLDAA 146 (381) T ss_pred HhhcCCCCCceecCHHHHHHHHHHHHhhcceeee--eeeEe-cCcce-EEEeecCCc-----ceEEeecccccccccCcc Confidence 222222122222 34556666666666666554 77664 55443 233332211 1234 2233568 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 
72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) |.+..|.++++-++..++ .++ +++-++.++.++.+.|+++++.-++..+++||++ +.--||++ T Consensus 147 f~~i~l~~~kl~a~i~is-~el---------------L~Ds~~~le~~i~~~la~~~a~~~~~afi~GdG~-~qP~Gil~ 209 (381) T protein:vir:10 147 FSEETAIQNKLTAFVVLP-KDL---------------NDFGPAWIERFVRVQIEEAFAVALETAFLKGTGK-DQPIGLNR 209 (381) T ss_pred ceeEeecceeEEeecccc-HHH---------------HhccHHHHHHHHHHHHHHHHHHHhhceeEecccC-CCceeeee Confidence 889999999988876653 211 2334667788999999999999999999999997 45679987 Q ss_pred HHhhccceEEecc----CcCcCChh---hHHHHHHHHHHhccHHH-----HhCCCcEEEecHHHHHHHHHHHhhhhccCC Q lcl|NC_018271. 152 LLEADATVIDVVG----ASGGITAA---NVEAELGKFIDAHTDEI-----LQAPNHVFGVSTNVIRAIKRAYGTQARSNG 219 (305) Q Consensus 152 ~i~~d~~~~~~~~----~~~~iT~a---nv~~~l~~~~~~iP~~~-----r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~ 219 (305) .+.....+..-+. +.+++|.. .....+.++.+.++... ...++..+.|+...+-+..-.+..+. .++ T Consensus 210 ~~~~~~~~~~g~~~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~vmn~~t~~~l~~~~~~~~-~~G 288 (381) T protein:vir:10 210 QVQKGVSVTDGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYTHLN-ANG 288 (381) T ss_pred cCCccccccccccccccccccccccchhhHHHHHHHHHHhhhhhhccccccccCceEEEEchhhHHhhccccccCC-CCC Confidence 6544333221110 11234443 34556677777765532 11245678888876554432221111 111 Q ss_pred cccCCCcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeE Q lcl|NC_018271. 220 TFLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEI 298 (305) Q Consensus 220 ~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~ 298 (305) .+.. . .-.|.+++.-..||++.|+..+-++.+++.. ..++|+... 
......+.-|....-+|-...-.+=+ T Consensus 289 ~~v~--~-lp~g~~vv~~~~~p~~~i~fGDfs~Y~i~~r-----~~~~i~~~~~~~~~~d~~~f~a~~r~dG~~~~~~A~ 360 (381) T protein:vir:10 289 VYVT--A-LPFNLNVIESTVQEAGKVLTYVKGLYDGYLA-----GGINVQKFKETLALDDMDLYTAKQFAYGKAKDNKVA 360 (381) T ss_pred ceee--c-CCCCceeEEcCCCCcCcEEEEEcccEEEEEe-----cccEEEeechhhhhcCceEEEEEEEEcCEEecCCcE Confidence 1110 0 1136677777889999999999888654432 234444331 22233445566777778888888888 Q ss_pred EEecCCC Q lcl|NC_018271. 299 VLYTPAA 305 (305) Q Consensus 299 v~~~~~~ 305 (305) |+++=+. T Consensus 361 ~v~~l~~ 367 (381) T protein:vir:10 361 AVWKLDL 367 (381) T ss_pred EEEEEee Confidence 8877544 No 61 >protein:vir:94933 Length: 330 # NCBI annotation: putative phage structural protein # Family: family:all:1120 # MgeID: mge:1538 # MgeName: Xp15 # Cross-refs: genbank:acc:YP_239278;genbank:gi:66392060;genbank:GeneID:5076578 Probab=97.53 E-value=4.1e-05 Score=44.64 Aligned_cols=269 Identities=15% Similarity=0.137 Sum_probs=137.0 Q ss_pred CceEeeeec--ccchhHHHHHHHHhhccccchhcCce-EEecCCCCcccccchhhhh--ccccCCCCCCCCccceEecce Q lcl|NC_018271. 1 MATTVDITT--NYVGEVAGGYFLEMVKEANTISDNLI-RVIPNVPENNLFLRRMNTT--DDFVDYSCGFTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~~--~Y~Ge~l~~~~~~~~~g~~~v~~g~I-~v~~~v~~~~~~~~~~~~~--~~~q~~~~~~~~~G~~~~~~K 75 (305) |++.-.+.. -.......+++-.+..-+++..+.-. .|..|. .-.+|.++. -.+.+...++++.+...|.+. T Consensus 25 m~alTLaea~~l~~d~~~~~VIE~l~~~s~iL~~lpf~~ve~~~----~~~~r~~~lp~a~~r~~n~~~~~~~~~Tf~q~ 100 (330) T protein:vir:94 25 MPTVTLAESAKLSQDHLVSGLIETIVEVNPLYEMMPFTEIEGNA----LAYNRENVLGDVQFLAVGGTITAKNPATFTKV 100 (330) T ss_pred hhhhhhhHHhhcCchhhHHHHHHhhhccchHHhhcccccccCCc----ceeeeeecCCcceeeeccccccccCcceeeee Confidence 775544442 44555566677777777777655321 122222 111222221 112233445555444555555 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 
76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) +..+..+-..+++ .-+...+. |+ |..+-...+++..+.+++...+.+++||++-+.|+|+++++.. T Consensus 101 t~~l~~l~~~~~V-------d~~iadl~-g~------~~d~~~~q~~~~ieal~~~~e~~linGDs~~~~F~GL~~~~~~ 166 (330) T protein:vir:94 101 TSELTTLIGDAEV-------NGLIQATR-SD------FMDQTSVQVASKAKSIGRQYQASMITGDGTGNSFQGMMGLVAA 166 (330) T ss_pred eechhhhhhhHHH-------HHHHHHhc-CC------HHHHHHHHHHHHHHHHHHHHHHHhhccCCCCccccchhhcCCc Confidence 5543322222211 11222222 21 3344445567778889999999999999887889999999854 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc--cC--CCcc-eec Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF--LN--PNEF-DFE 230 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~--t~--~~~~-~~k 230 (305) .+.++.-+..+.+|.++..+.+..+|+. + ...-.|+||.....++....++-.+.++.. .+ ++.. .|. T Consensus 167 -~q~i~tg~~gg~~T~d~LDeLl~~v~~~-~-----g~~~~~l~n~a~~r~I~a~~R~~~~~~v~~~~~~~~G~~v~~~~ 239 (330) T protein:vir:94 167 -SQTISAGANGGTLTFELLDQLLDLVKDK-D-----GQVDYLMSSFAMRRKYFSLLRALGGAAIGEVMTLPSGRQIPTYR 239 (330) T ss_pred -ccEEecCCCCCCCCHHHHHHHHHHhcCC-C-----CCCcEEEechhHHHHHHHHHHhccCCCCCCcccccCCCEEeeeC Confidence 4445432234567765544444444332 1 123488899999999988876554444322 22 3332 599 Q ss_pred ceeeeeccCCCCCeEE--EecchHHhhhhhhhhhh-------------hhccccceeeeccc--eeEEEEEEeecceeec Q lcl|NC_018271. 231 GYTLTEIKGLPASRMV--GYNRDNIVIGMSAQSDF-------------NEIRIKDMGDVDLS--GQIRTKMVLSAGVEYA 293 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii--~T~~sNl~~gvnl~~D~-------------n~I~I~~~~~~~~~--~~~f~k~~m~~d~~i~ 293 (305) |++|.+..-+|.+--- ++.++++ ++|++-+|. 
--|.+..+..++-+ -+|..++.-+ T Consensus 240 GvPi~~~d~ip~~~~~~~~~~ttsI-yav~~G~~~~~qgV~Gl~~~g~~glsVr~~G~~~~k~v~~~~v~~y~~------ 312 (330) T protein:vir:94 240 GVPWFVNDFIPSNMTQGTATNATAI-FAGTFDDGSNKYGIAGLTARGSAGLRVQNVGAKENADETITRVKMYCG------ 312 (330) T ss_pred CeEEEecccccCCCCcccCCCceeE-EEEeecccccccceEeecCCCCCcceeeeCCCccccceeeEEEEEeee------ Confidence 9999999888864110 0112222 123332221 13333333221111 2244433322 Q ss_pred cCCeEEEecCCC Q lcl|NC_018271. 294 YGAEIVLYTPAA 305 (305) Q Consensus 294 fg~E~v~~~~~~ 305 (305) +++-+|.| T Consensus 313 ----~av~~~~a 320 (330) T protein:vir:94 313 ----FANFSQLG 320 (330) T ss_pred ----eEEechhh Confidence 23455555 No 62 >protein:vir:95963 Length: 395 # NCBI annotation: ORF009 # Family: family:all:635 # MgeID: mge:1594 # MgeName: 2638A # Cross-refs: genbank:acc:YP_239802;genbank:gi:66395459;genbank:GeneID:5132880 Probab=97.52 E-value=5.9e-06 Score=49.27 Aligned_cols=269 Identities=13% Similarity=0.113 Sum_probs=146.6 Q ss_pred CceEeeeeccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCC-------CccceE Q lcl|NC_018271. 1 MATTVDITTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFT-------PSGEVD 71 (305) Q Consensus 1 ma~~~~~~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~-------~~G~~~ 71 (305) |.......+.| --+...+|+..+.....+.+. ++|.| ++.. +.+++...... +.|. +..+.+ T Consensus 86 ~~~~t~~~gG~liP~~~~~~Ii~~l~~~s~i~~~--~~v~~-~~~~-~~i~~~~~~~~-----a~w~~e~~~~~~~~~~~ 156 (395) T protein:vir:95 86 INYDVGYTDEKILPETVVERVFDDLQKDHPLLSK--INFQN-AGIK-TRVIKADPAGQ-----AVWGKVFGEIKGQLDAA 156 (395) T ss_pred HhhccCCCCceeccHHHHHHHHHHHHhhhhhhhh--ceeEe-cCCc-eEEEEecCCcc-----eEEeecccccCcccccc Confidence 11111111111 123456666666666666554 66654 4444 23333222221 2331 234678 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccc-hhHHHH Q lcl|NC_018271. 
72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTG-NLQGIL 150 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~-~fdG~l 150 (305) |.+..|.++++-+...++= . -+++-++.++.++...|+++++..+++.++.|+++.. .=.||+ T Consensus 157 f~~i~l~~~kl~~~~~iS~---------------e-ll~ds~~~ie~~i~~~la~~ia~~~~~a~i~G~G~~~~qP~Gil 220 (395) T protein:vir:95 157 FREENFTQYKLTCFVVLPD---------------D-LSTFGPAWIERFVRTQIQEAISVALESAIINGGGAAKTQPVGLM 220 (395) T ss_pred ceeeeeceeeEEEeecccH---------------H-HHhcchhHHHHHHHHHHHHHHHHHHhhheeeccCCCCcCceeee Confidence 9999999999988877631 1 1234467788999999999999999999999999864 356888 Q ss_pred HHHhhccceEEeccCcCcCChhh---HHHHHHHHHHhccH-----HHHhCCCcEEEecHHHHHHHHHHHhhhhccCCccc Q lcl|NC_018271. 151 PLLEADATVIDVVGASGGITAAN---VEAELGKFIDAHTD-----EILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFL 222 (305) Q Consensus 151 k~i~~d~~~~~~~~~~~~iT~an---v~~~l~~~~~~iP~-----~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t 222 (305) +.+...+.+..-...+++.|.++ ..+.+.+++..+.. +.+..++..+.|+...+-.....|.-+ .. T Consensus 221 ~~~~~~~~~~~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~mn~~t~~~~~g~~~~~------~~ 294 (395) T protein:vir:95 221 KDVNTNSGAVTDKASSGTLTFADADTTILELNDVLKNLSVDEKGKELKIDGKVALVVNPRDSWDVQARYTYL------TA 294 (395) T ss_pred ecccccccccccccccchhhhhhhHhhHHHHHHHHHhhccccccchhhhcCceEEEEcchhhhhcCCcceec------cC Confidence 76544433221111223334433 34555666655532 223345678999977653322222111 12 Q ss_pred CCCccee--cceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeee-ccceeEEEEEEeecceeeccCCeEE Q lcl|NC_018271. 223 NPNEFDF--EGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDV-DLSGQIRTKMVLSAGVEYAYGAEIV 299 (305) Q Consensus 223 ~~~~~~~--kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~-~~~~~~f~k~~m~~d~~i~fg~E~v 299 (305) ++.+... .|++++.-..||++.|+..+-++.+++.+ ..++|+...-. 
....+..|...+-+|-...-.+=+| T Consensus 295 ~G~~~~~lg~g~~v~~~~~~p~~~i~fgdfs~y~i~~r-----~~~~i~~~~~~~~~~d~~~f~~~~r~dg~~~~~~A~~ 369 (395) T protein:vir:95 295 NGGFVTVLPYNVTIITSEFVPEGKLVAFVTDRYNAVRG-----GGLTVKKFDQTLALEDAVLFTAKTFAYGQPDDNKASA 369 (395) T ss_pred CCcceeccCCcceEEEcCCCCCCcEEEEecccEEEEEe-----cceEEEeccchhhhCCcEEEEEEEEECCEEeccccEE Confidence 3333333 36667777789999888888777654432 22344322111 1122333555566676665555555 Q ss_pred EecCCC Q lcl|NC_018271. 300 LYTPAA 305 (305) Q Consensus 300 ~~~~~~ 305 (305) +.+=.. T Consensus 370 ~l~i~~ 375 (395) T protein:vir:95 370 VYDLKV 375 (395) T ss_pred EEEeec Confidence 544322 No 63 >protein:vir:4830 Length: 397 # NCBI annotation: MPL-7201 # Family: family:all:21 # MgeID: mge:105 # MgeName: 7201 # Cross-refs: genbank:acc:NP_038327;genbank:gi:9634653;genbank:GeneID:1262632 Probab=97.48 E-value=3.7e-05 Score=44.86 Aligned_cols=259 Identities=8% Similarity=0.063 Sum_probs=139.3 Q ss_pred CceEeeeeccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC--CCCC-ccceEecce Q lcl|NC_018271. 1 MATTVDITTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC--GFTP-SGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~--~~~~-~G~~~~~~K 75 (305) |+........| --++..+|+..+.....+.+...+..+++-..+..+.++.. .......-. +-.+ .+..+|.+. T Consensus 109 ~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~-~~~~a~~v~E~~~~~~~~~~~~~~v 187 (397) T protein:vir:48 109 KTDASGSDAGLTIPQDIQTAIHTLVRQYDSLQEYVNVENVTTLTGSRVYEKWAD-ITGLAKLDDEAGSIGTNDDPKLYPI 187 (397) T ss_pred hhccCCccccccccHHHHHHHHHHHHHHHHHHhhhceeeccCCcceEEEEeecC-CCcceeeeccccccccccccceeeE Confidence 33332222222 12444666666666666666533333334333322222211 111111111 1122 235689999 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 
76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) .|.++++-+...++-. + + ++-++..+.++.+.|.++++..++..++.|+++... T Consensus 188 ~~~~~k~~~~~~iS~e-l--------l-------~ds~~~l~~~v~~~l~~~~~~~~d~~il~G~g~~~~---------- 241 (397) T protein:vir:48 188 RYAIKRYAGISTVTNS-L--------L-------ADSAENILAWLSGWIAKKVVVTRNKAILEAIATLPT---------- 241 (397) T ss_pred EeeheeeeeehhhHHH-H--------H-------hhchHHHHHHHHHHHHHHHHHHHHHHHhhccccccc---------- Confidence 9999999888765432 1 1 122345667888899999999999999999876321 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceecce Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFEGY 232 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~kGi 232 (305) ..+.+|. +.+.++..+++..++. +-.++||...|...+- ++.-.|.+. +..++....+.|+ T Consensus 242 ---------~~~~~~~----d~i~~~~~~l~~~~~~--~a~~v~n~~~~~~L~~-lkd~~G~~i~~~~~~~~~~~~l~G~ 305 (397) T protein:vir:48 242 ---------KPTLTKW----DDIIDLQAKVDPAIKQ--TSFFLTNTSGFTALKK-VKNAFGDYLMERDVKSPTGYSIDGF 305 (397) T ss_pred ---------ccccccH----HHHHHHHHHhhhhhcC--CCEEEECHHHHHHHHH-hhcCCCceeeccCcCCCCCceeccc Confidence 1223343 3455566777777774 3599999999876543 432222221 2233445579998 Q ss_pred eeeecc-------CCCCCeEEEecchHHh-hhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEEecC Q lcl|NC_018271. 233 TLTEIK-------GLPASRMVGYNRDNIV-IGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTP 303 (305) Q Consensus 233 ~iv~l~-------~~Pd~~ii~T~~sNl~-~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~ 303 (305) +++.+. ..++..++..+-++.+ ++.+. +.. +++++.. 
+......+-+...+-+|+.+..++=||.-|- T Consensus 306 PV~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~--~~~-i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~ 382 (397) T protein:vir:48 306 AVKEVADRWLANASSGAMPLYFGDLKQAVTLFDRQ--QMS-LLSTNIGGGAFETDTTKIRVIDRFDVVATDTESFVPASF 382 (397) T ss_pred eeEEecccccCCcCCCceEEEEEeccceEEEEeec--ceE-EEEeccchhhhhcCceeEEEEeeeccEEecccceEEEEe Confidence 876543 2234455655555542 22211 111 2222221 2223344555566777888888887877664 Q ss_pred CC Q lcl|NC_018271. 304 AA 305 (305) Q Consensus 304 ~~ 305 (305) .+ T Consensus 383 ~~ 384 (397) T protein:vir:48 383 KA 384 (397) T ss_pred cc Confidence 44 No 64 >protein:vir:94771 Length: 298 # NCBI annotation: major head protein # Family: family:all:966 # MgeID: mge:1529 # MgeName: phi LC3 # Cross-refs: genbank:acc:NP_996706;genbank:gi:45597421;genbank:GeneID:2769044 Probab=97.46 E-value=3.1e-05 Score=45.32 Aligned_cols=274 Identities=12% Similarity=0.080 Sum_probs=145.6 Q ss_pred CceEeeee--cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceee Q lcl|NC_018271. 1 MATTVDIT--TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQL 77 (305) Q Consensus 1 ma~~~~~~--~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L 77 (305) ||+.-... ..|..+..+ .+.....+.+. .+++|--... ..+|+...+-...-..+ +--+-++.+|.+..| T Consensus 1 ma~~gG~lip~~~~~~ii~----~~~~~s~i~~~--~~~~~~~~~~-~~~p~~~~~~~a~~v~Eg~~~~~~~~~f~~v~l 73 (298) T protein:vir:94 1 MVLNKGTLFDPELVTDLIS----KVAGKSSIARL--SAQKPIPFNG-EKVFTFTMDSEIDVVAESGKKTHGGVTLAPQTM 73 (298) T ss_pred CeeccccccChhHHHHHHH----HHHhhchhhhh--cceeeccCCc-eEEEEEecCcceEEeeCCccccccccceeEEEE Confidence 98865432 244444333 33333333333 4444422222 33454432222110111 112234778999999 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCc----cchhHHHHHHH Q lcl|NC_018271. 
78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGT----TGNLQGILPLL 153 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s----~~~fdG~lk~i 153 (305) .++++-+...++-+-.+ .+..+ ....+..+.+.|+++++..++..++.|... ....-|.... T Consensus 74 ~~~k~~~~~~iS~ell~-------~~~~~------~~~l~~~i~~~la~ai~~~~d~~~l~G~~~~~g~~~~~~~~~~~- 139 (298) T protein:vir:94 74 VPIKVEYGARISDEFMY-------ASDEE------KINILQAFNDGFAKKVARGIDLMAFHGVNPRLGTASAVIGTNHF- 139 (298) T ss_pred eeeEEEEeeehhHHHhc-------cCCcc------HHHHHHHHHHHHHHHHHHHHHHHhhcccccCCCccccccccccc- Confidence 99999887776532111 11111 233456777889999999999999988432 2222221111 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~k 230 (305) ...+.+.+... ...+++.+.+.++..+++...++. -.+.||...+.+.+. ++...|.+. ..+.+....+. T Consensus 140 --~~~~~~~~~~~--~~~~~~~~~i~~~~~~~~~~~~~~--~~~vmn~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~tl~ 212 (298) T protein:vir:94 140 --DSKVTQKVEAP--RGIADPNGAIENAVELLTGVDADV--TGIAINPSFRSALAK-QKDLQGNALFPELKWGATPDTIN 212 (298) T ss_pred --ccccccccccc--cccccHHHHHHHHHHhhhhcCCCc--cEEEEcHHHHHHHHH-hhccCCCeeecCcccCCCCceec Confidence 11112212111 112345667777777777764432 269999999977644 332222221 22334456799 Q ss_pred ceeeeeccCCCC------CeEEEecchHHh-hhhhhhhhhhhccccceee-------eccceeEEEEEEeecceeeccCC Q lcl|NC_018271. 231 GYTLTEIKGLPA------SRMVGYNRDNIV-IGMSAQSDFNEIRIKDMGD-------VDLSGQIRTKMVLSAGVEYAYGA 296 (305) Q Consensus 231 Gi~iv~l~~~Pd------~~ii~T~~sNl~-~gvnl~~D~n~I~I~~~~~-------~~~~~~~f~k~~m~~d~~i~fg~ 296 (305) |++++-...+|+ ..++..+-++++ +|+. ++.. +++.+... 
++-..+..++..+-.|+.+..++ T Consensus 213 G~PV~~~~~v~~~~~~~~~~~~~Gdfs~~~~~~~~--~~~~-~~~~~~~~~d~~~~~~f~~~~v~~r~~~r~~~~~~~~~ 289 (298) T protein:vir:94 213 GLPVDVNKTVSDMSLTQRDRAIIGDFANGFKWGYA--KEVP-LEVIQYGDPDNSGLDLKGYNQVYIRAELFLGWGILDAT 289 (298) T ss_pred ceeeEEecccccccCCCccEEEEeeccceEEEEEe--cCce-EEEeecCCCcCcchhhhhcCcEEEEEEEEeccEeeccc Confidence 999887776764 367777777653 4432 2221 23332211 12234445555666699998888 Q ss_pred eEEEecCCC Q lcl|NC_018271. 297 EIVLYTPAA 305 (305) Q Consensus 297 E~v~~~~~~ 305 (305) =||+-+.+- T Consensus 290 a~~~l~~~t 298 (298) T protein:vir:94 290 KFARVTEAN 298 (298) T ss_pred ceEEEEecC Confidence 888888888 No 65 >protein:vir:4856 Length: 293 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:106 # MgeName: DT1 # Cross-refs: genbank:acc:NP_049396;genbank:gi:9632424;genbank:GeneID:1258532 Probab=97.46 E-value=6.3e-05 Score=43.61 Aligned_cols=258 Identities=10% Similarity=0.045 Sum_probs=138.2 Q ss_pred CceEeeeeccc-chhH-HHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCC-C-CCCC-ccceEec Q lcl|NC_018271. 1 MATTVDITTNY-VGEV-AGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYS-C-GFTP-SGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~~Y-~Ge~-l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~-~-~~~~-~G~~~~~ 73 (305) |+....-...| .++. ..+|+..+.....+.+. ++++| +-..+..++++...+... ++- + +-.+ .+..+|. T Consensus 5 ~~~~t~~~gg~liP~~~~~~Ii~~~~~~~~l~~~--~~~~~~~~~~g~~~~~~~~~~~~~a-~~v~Eg~~~~~~~~~~~~ 81 (293) T protein:vir:48 5 KTDHSGSDAGLTIPQDIRTAINTLVRQYDSLQEY--VNVENVTTLTGSRVYEKWTDITGLA-NIDDEAGKIADIDDPKLS 81 (293) T ss_pred ecccccCcCceEechhHHHHHHHHHHhhhhhhhh--ceeeeccCCcceEEEEeecCCCcce-eeecCCccccccccccee Confidence 44443333322 2444 34455444444444333 55544 222222222221111111 111 1 1112 2456899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 
74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..|.++++.+...++=+-.+ +-.+..+..+.+.|.++++..++..++.|.++.. T Consensus 82 ~i~l~~~k~~~~~~iS~ell~----------------ds~~~l~~~i~~~la~~~~~~~~~~i~~g~~~~~--------- 136 (293) T protein:vir:48 82 LIKYTIKRYAGISTVTNSLLA----------------DSAENILAWLSGWIAKKVVVTRNKAILGVVDKLP--------- 136 (293) T ss_pred EEEEeeeEEEEeehhhHHHHh----------------hhhHHHHHHHHHHHHHHHHHHHHhHHhhcccccc--------- Confidence 999999999998876533221 2234566778888888888888888887765411 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~k 230 (305) +..+.+| .+.+.+++.+++..++.+ -.++||...|...+ .++.-.|.+. +.+++...++. T Consensus 137 ----------~~~~~~~----~d~i~~~~~~l~~~~~~~--a~~vmn~~~~~~L~-~lkd~~g~~l~~~~~~~~~~~~l~ 199 (293) T protein:vir:48 137 ----------TKPTLTK----WDDIIDLEAKVDPAIKQT--SFFLTNTSGFTALK-KVKNALGDYLMERDVKSPTGYSIA 199 (293) T ss_pred ----------ccccccC----HHHHHHHHHhhhhhhcCC--CEEEEcHHHHHHHH-HhhccCCceEeecCcCCCCCceec Confidence 1123344 445666777787777753 48999999986554 3433333221 23445556899 Q ss_pred ceeeeeccC--CCC-----CeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 231 GYTLTEIKG--LPA-----SRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 231 Gi~iv~l~~--~Pd-----~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) |++++.+.+ +|+ ..++..+-++.+..++. .+.. +++++.. 
+..-+....+...+-+|+.+..++=|+..+ T Consensus 200 G~Pv~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~-~~~~-i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~l~ 277 (293) T protein:vir:48 200 GFAVKEISDRWLPNASSGVMPLYFGDLKQAVTLFDR-QQMS-LLSTNIGGGAFETDTTKVRVIDRFDVVATDTEAFVPAS 277 (293) T ss_pred ceeeEEecccccCCccCCceEEEEEeccceEEEEEe-cceE-EEEecccchhhhcCeEEEEEEEeeCcEEecccceEEEE Confidence 988765432 343 24555555554322211 1111 2222221 222344566777888899998888887766 Q ss_pred CCC Q lcl|NC_018271. 303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) -++ T Consensus 278 ~~~ 280 (293) T protein:vir:48 278 FKA 280 (293) T ss_pred eec Confidence 433 No 66 >protein:vir:9410 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:167 # MgeName: phi 13 # Cross-refs: genbank:acc:NP_803388;genbank:gi:29028700;genbank:GeneID:1258136 Probab=97.45 E-value=3.6e-05 Score=44.96 Aligned_cols=269 Identities=8% Similarity=-0.000 Sum_probs=136.8 Q ss_pred CceEeeee-ccc-ch-hHHHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCCC-CCCC-ccceEec Q lcl|NC_018271. 1 MATTVDIT-TNY-VG-EVAGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYSC-GFTP-SGEVDIN 73 (305) Q Consensus 1 ma~~~~~~-~~Y-~G-e~l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~~-~~~~-~G~~~~~ 73 (305) .+...+-. ..+ .. ....+++..+.....+.+. ++++| +-..+.. .++...+..+..-.+ +-.+ .+..+|. T Consensus 120 ~~~~~~~~~g~~~iP~~~~~~ii~~~~~~~~l~~~--~~~~~~~~~~~~~~-~~~~~~~~~~~~v~Eg~~~~~~~~~~~~ 196 (415) T protein:vir:94 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKY--VTVKRVTNGSGKYP-VVRQSEVAALEKVEELEENPELAVKPFF 196 (415) T ss_pred hhhccccccccccCcHHHHHHHHHHHHhhhhhhhh--cceeeccCCceeEE-EEeecCCccceeccccccccccccccce Confidence 11111110 111 11 2233444444444444443 44444 2212211 111111111111111 1111 2345788 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 
74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..+.++++-+.+.++=+ .+ ++-++..+.++...|.++++..++..++.|+++.....+..-. T Consensus 197 ~i~~~~~k~~~~~~is~e---------ll-------~ds~~~~~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~- 259 (415) T protein:vir:94 197 QLAYDINTHRGYFRISRE---------AI-------EDAKVNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGF- 259 (415) T ss_pred eeEeeheeeeeechhhHH---------HH-------hhchHHHHHHHHHHHHHHHHHHHHHHHhhccccCccccccccc- Confidence 999999999887765422 11 2234567788889999999999999999998765443222111 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~~k 230 (305) ..+......++..+ .+.+.++..++...+++ +-.++||...|.+.+- ++.-.|.+ .+.+++....+. T Consensus 260 ---~~~~~~~~~~~~~~----~~~i~~~~~~~~~~~~~--~~~~vmn~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~~l~ 329 (415) T protein:vir:94 260 ---EKEGKKLEVKKAKS----LDDIKDAINLNVKPNYE--HNVAIVSQTMFAKLDK-MKDKLGNYLIQPDVKEKTQQRLL 329 (415) T ss_pred ---cccccccccccccc----hHHHHHHHHhhhhhccC--CCEEEEcHHHHHHHHH-hhccCCCeeeccCcCCCCCceec Confidence 11111122222333 55566667777666664 3489999999877653 33322221 234455566899 Q ss_pred ceeeeeccCCCCCe-----EEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEec--C Q lcl|NC_018271. 231 GYTLTEIKGLPASR-----MVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYT--P 303 (305) Q Consensus 231 Gi~iv~l~~~Pd~~-----ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~--~ 303 (305) |++++....+|.+. |+..+-++++..++- .++. ++.+++. ..... 
+...+-+|+.+..++=||+-+ + T Consensus 330 G~pV~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~-~~~~-v~~~~~~---~~~~~-~r~~~r~d~~~~~~~a~~~~~~~~ 403 (415) T protein:vir:94 330 GAKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDR-SQYQ-ASWTDYM---HFGEC-LMIAVRQDCRILDYKSAIVIEYDD 403 (415) T ss_pred ceeeEEecccccCCCCccEEEEEehhccEEEEee-cceE-EEEeccc---cCceE-EEEEEEeccEEeccccEEEEEEec Confidence 99999888887443 666676675432211 1121 2222221 11221 234455688888777777653 3 Q ss_pred CC Q lcl|NC_018271. 304 AA 305 (305) Q Consensus 304 ~~ 305 (305) ++ T Consensus 404 ~~ 405 (415) T protein:vir:94 404 SE 405 (415) T ss_pred cC Confidence 33 No 67 >protein:vir:102335 Length: 312 # NCBI annotation: putative capsid protein # Family: family:all:701 # MgeID: mge:1566 # MgeName: phi CD119 # Cross-refs: genbank:acc:YP_529560;genbank:gi:90592716;genbank:GeneID:3974467 Probab=97.39 E-value=1.2e-05 Score=47.60 Aligned_cols=269 Identities=13% Similarity=0.073 Sum_probs=126.0 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccc-hhcCceEEecCCCCcccccchhhhhccccCCC--CC--CCCccceEec-- Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANT-ISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS--CG--FTPSGEVDIN-- 73 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~-v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~--~~~~G~~~~~-- 73 (305) ||++++--+-|+++..+ -+...+..+.+ .+.+.|+. -| -++..+|++.+ .++.+|+ .+ |+. |+++.+ T Consensus 1 Mantl~ya~~~~~~LD~-~~~~~~~s~~l~~~~~~v~~-~g--gktVkIp~i~~-~gl~DY~R~~g~~~~~-g~v~~~~e 74 (312) T protein:vir:10 1 MANTLAYGQVLQQGLDK-QATQELLTGWMDSNAKQIKY-EG--GKEVKIGKLST-DGLGDYSRGSANAYVG-GDVKFEYE 74 (312) T ss_pred CCcchhHHHHHHHHHHH-HHHhhhccccccCCCceEEE-ec--CcEEEEEeeec-ccccccccccCCcccc-ccccccce Confidence 99999877889998544 44444444454 45566764 23 56677898886 7788776 34 754 566555 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 
74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) .++|.-.|.- .|.+.|.|. .-+.+.... ++-.++..+ ......-++--|.-+.+.+ T Consensus 75 t~tl~qDR~~-~F~vD~mDv-------------------DETn~~~s~---anv~~ef~r-~~vvPEiDayrfskla~~a 130 (312) T protein:vir:10 75 TKTMTQDRGR-KFTLDAMDV-------------------DETNFLVTA---TTVMGEFQR-LKVIPEIDAYRLSRLATIA 130 (312) T ss_pred eEEeeecccc-eeeccccch-------------------hhHhhHHHH---HHHHHHHHH-hhhcchhhHHHHHHHHhhh Confidence 4444433321 111222221 112122211 111111111 1111111122244333333 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCC----ccee Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPN----EFDF 229 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~----~~~~ 229 (305) .......++. .+.++|++|++++|+++...+.+. +-..++.+|||+..+...+++...+. +..+..+++ -..+ T Consensus 131 ~~~~~~~~~~-~~~~~T~~ni~~~i~~~~~~lde~-~vp~~rvl~vTp~~~~lLk~~~~~~~-~~~~~~~~~i~~~V~~i 207 (312) T protein:vir:10 131 IGIKGDTNVE-YSYSVNSSTIINKIKTGIKIIREN-GYNGPLVCHLTYDSMFAIEEKVLEKL-TAVTFAQGGIQTQVPSI 207 (312) T ss_pred hccccccccc-cccccCHHHHHHHHHHHHHHHHHc-cCCCceEEEeChHHHHHHhhhhhcee-cccccccceeeeeeeee Confidence 2222222222 345689999999999999888875 32347899999999988887642222 332333333 2458 Q ss_pred cceeeeeccCCCCCeEE----Eec------------------chHHhhhhhh----hhhhhhccccceeeeccceeEEEE Q lcl|NC_018271. 230 EGYTLTEIKGLPASRMV----GYN------------------RDNIVIGMSA----QSDFNEIRIKDMGDVDLSGQIRTK 283 (305) Q Consensus 230 kGi~iv~l~~~Pd~~ii----~T~------------------~sNl~~gvnl----~~D~n~I~I~~~~~~~~~~~~f~k 283 (305) .|++|+.| |++|+- .|+ .=|+++--.. 
..=.+.++|..=+..+-.-.|..+ T Consensus 208 Dgv~Ii~V---Ps~r~~t~~~f~dG~t~~~~~gg~~~~~~ak~INfiiv~~~a~i~~~K~~~~~if~P~~~~~~d~~~~~ 284 (312) T protein:vir:10 208 DGCALIKT---PQNRMYSSILLNDGTTSNQTAGGYLKGTKALDTNFIIAPVDVPLAITKQDKMRIFDPETNQTANAWSMD 284 (312) T ss_pred cccEEEEc---hhhhccceeeeccCcccccccCceeecCcccccceEEeCCceeeceeeeeeeeeeCCCCCCCcceeeee Confidence 99999886 767662 110 1144321000 000122222111000001113333 Q ss_pred EEeecceeeccCCe---EEEecCCC Q lcl|NC_018271. 284 MVLSAGVEYAYGAE---IVLYTPAA 305 (305) Q Consensus 284 ~~m~~d~~i~fg~E---~v~~~~~~ 305 (305) ...=.|.-+.=... .|-++.|- T Consensus 285 ~R~Y~D~fv~~nk~~~Iyv~~k~a~ 309 (312) T protein:vir:10 285 YRRYHDLWVTDNKANSVYANFKDAK 309 (312) T ss_pred eeeeeeeeeeccccCeEEEEeeccc Confidence 33333333310000 12222222 No 68 >protein:vir:3033 Length: 272 # NCBI annotation: major capsid protein # Family: family:all:522 # MgeID: mge:61 # MgeName: PhiNIH1.1 # Cross-refs: genbank:acc:NP_438146;genbank:gi:16271809;genbank:GeneID:929235 Probab=97.36 E-value=3.8e-05 Score=44.81 Aligned_cols=258 Identities=14% Similarity=0.087 Sum_probs=133.7 Q ss_pred CceEeee-ecccchhHHHHHHHHhhccccchhcCceEE---ecCCCCcccccchhhhhccccCCCC-CCCCccceEecce Q lcl|NC_018271. 1 MATTVDI-TTNYVGEVAGGYFLEMVKEANTISDNLIRV---IPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~-~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v---~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K 75 (305) ||++.+- ..-+..|+..+++...+...- +-.++..+ +.|.+.++.-+|+.......+.+.+ +--+-++.++++. T Consensus 1 MA~~~T~~~~~~iPev~s~~v~~~~~~~~-~~~~~~~~~~~~~g~~G~tv~iP~~~~~~~a~~v~eg~~i~~~~~~~~~~ 79 (272) T protein:vir:30 1 MAVGTTKMAQMLDPEVLADMIDAEVGKAI-RFAPLAEVDTTLEGQPGTTLTVPKWDYIGDAEDVAEGEAIPMTQLGFKKT 79 (272) T ss_pred CCCccccchheechHHHHHHHHHHHHHHh-hhhccccccccccCCCCCEEEEEEecCCCCcccccCCCcccccccccceE Confidence 9965432 236778887777766555432 22333333 2455555544555433333333332 1122356788888 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 
76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) .+.+++....+.+.=.+. .+-. | .......+.++..++..+...++. .+.. T Consensus 80 ~~~~~~~~~~~~itd~~~------~~s~---------~-d~~~~~~~~~~~~~a~~~d~~i~~-------------~~~~ 130 (272) T protein:vir:30 80 TMTIKKAGKGVEITDEAI------LSGY---------G-DPVGQAAKQIVEAIDHKVDADVLD-------------ALSK 130 (272) T ss_pred EEEeeeeeeeeeecHHHH------hhcc---------c-cHHHHHHHHHHHHHHHHHHHHHHH-------------Hhcc Confidence 999888776666552221 1111 1 122344466666676666654431 1111 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCc-----ccCCCcceec Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGT-----FLNPNEFDFE 230 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~-----~t~~~~~~~k 230 (305) ...+ .++..|... +.+.+..+.+... ..-.++||+..|-.+...-...+.+..+ ..++-...+. T Consensus 131 a~~~-----~~~~~t~d~----i~da~~~l~~~~~--~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~~~~~~~g~ig~i~ 199 (272) T protein:vir:30 131 STQT-----VEATATVDG----VSKALDIFNDEDD--AETVIVMNPADASTLRLDAAKEWLGATEVGANRVVSGVYGEVL 199 (272) T ss_pred cccc-----cccccCHHH----HHHHHHHHhccCC--CccEEEEcHHHHHHHHHhccccccccccccccccccccchhhc Confidence 1111 112234333 3344444444321 2247999999887764322111111111 2233345799 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |++++.-..+|++.+++...+++.++. ..+.. 
++-+|-.+ +..-.+.-.+-+++.+..++-+|..|=.+ T Consensus 200 G~~Vi~s~~~p~~t~~~~~~~a~~~~~--~~~~~-ve~~r~~~---~~~~~i~~~~~~~~~v~~~~~vv~~t~~~ 268 (272) T protein:vir:30 200 GVQIVRSRKCPKGTAYMVRKGALRIML--KRNTM-VETDRDIT---KAINQIVANKHYGVYLYKAEKAVKITLKD 268 (272) T ss_pred CeeEEEcCCCCcceEEEEcCCeEEEEe--cCCce-eeeccccc---cceeEEEEEEEEEEEEEcCCceEEEEecc Confidence 999999999999999998888775543 23322 23222211 11122222233377788888888887655 No 69 >protein:vir:9820 Length: 272 # NCBI annotation: putative major capsid/head protein # Family: family:all:522 # MgeID: mge:176 # MgeName: 315.4 # Cross-refs: genbank:acc:NP_795582;genbank:gi:28876339;genbank:GeneID:1257858 Probab=97.36 E-value=3.8e-05 Score=44.81 Aligned_cols=258 Identities=14% Similarity=0.087 Sum_probs=133.7 Q ss_pred CceEeee-ecccchhHHHHHHHHhhccccchhcCceEE---ecCCCCcccccchhhhhccccCCCC-CCCCccceEecce Q lcl|NC_018271. 1 MATTVDI-TTNYVGEVAGGYFLEMVKEANTISDNLIRV---IPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~-~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v---~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K 75 (305) ||++.+- ..-+..|+..+++...+...- +-.++..+ +.|.+.++.-+|+.......+.+.+ +--+-++.++++. T Consensus 1 MA~~~T~~~~~~iPev~s~~v~~~~~~~~-~~~~~~~~~~~~~g~~G~tv~iP~~~~~~~a~~v~eg~~i~~~~~~~~~~ 79 (272) T protein:vir:98 1 MAVGTTKMAQMLDPEVLADMIDAEVGKAI-RFAPLAEVDTTLEGQPGTTLTVPKWDYIGDAEDVAEGEAIPMTQLGFKKT 79 (272) T ss_pred CCCccccchheechHHHHHHHHHHHHHHh-hhhccccccccccCCCCCEEEEEEecCCCCcccccCCCcccccccccceE Confidence 9965432 236778887777766555432 22333333 2455555544555433333333332 1122356788888 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) .+.+++....+.+.=.+. .+-. | .......+.++..++..+...++. .+.. 
T Consensus 80 ~~~~~~~~~~~~itd~~~------~~s~---------~-d~~~~~~~~~~~~~a~~~d~~i~~-------------~~~~ 130 (272) T protein:vir:98 80 TMTIKKAGKGVEITDEAI------LSGY---------G-DPVGQAAKQIVEAIDHKVDADVLD-------------ALSK 130 (272) T ss_pred EEEeeeeeeeeeecHHHH------hhcc---------c-cHHHHHHHHHHHHHHHHHHHHHHH-------------Hhcc Confidence 999888776666552221 1111 1 122344466666676666654431 1111 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCc-----ccCCCcceec Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGT-----FLNPNEFDFE 230 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~-----~t~~~~~~~k 230 (305) ...+ .++..|... +.+.+..+.+... ..-.++||+..|-.+...-...+.+..+ ..++-...+. T Consensus 131 a~~~-----~~~~~t~d~----i~da~~~l~~~~~--~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~~~~~~~g~ig~i~ 199 (272) T protein:vir:98 131 STQT-----VEATATVDG----VSKALDIFNDEDD--AETVIVMNPADASTLRLDAAKEWLGATEVGANRVVSGVYGEVL 199 (272) T ss_pred cccc-----cccccCHHH----HHHHHHHHhccCC--CccEEEEcHHHHHHHHHhccccccccccccccccccccchhhc Confidence 1111 112234333 3344444444321 2247999999887764322111111111 2233345799 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |++++.-..+|++.+++...+++.++. ..+.. 
++-+|-.+ +..-.+.-.+-+++.+..++-+|..|=.+ T Consensus 200 G~~Vi~s~~~p~~t~~~~~~~a~~~~~--~~~~~-ve~~r~~~---~~~~~i~~~~~~~~~v~~~~~vv~~t~~~ 268 (272) T protein:vir:98 200 GVQIVRSRKCPKGTAYMVRKGALRIML--KRNTM-VETDRDIT---KAINQIVANKHYGVYLYKAEKAVKITLKD 268 (272) T ss_pred CeeEEEcCCCCcceEEEEcCCeEEEEe--cCCce-eeeccccc---cceeEEEEEEEEEEEEEcCCceEEEEecc Confidence 999999999999999998888775543 23322 23222211 11122222233377788888888887655 No 70 >protein:vir:79987 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:1875 # MgeName: tp310-3 # Cross-refs: genbank:acc:YP_001430002;genbank:gi:156604057;genbank:GeneID:5525447 Probab=97.33 E-value=7.4e-05 Score=43.23 Aligned_cols=267 Identities=7% Similarity=-0.032 Sum_probs=137.8 Q ss_pred CceEeeee-ccc-ch-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccc--cCCCC--CCCCccceEec Q lcl|NC_018271. 1 MATTVDIT-TNY-VG-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDF--VDYSC--GFTPSGEVDIN 73 (305) Q Consensus 1 ma~~~~~~-~~Y-~G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~--q~~~~--~~~~~G~~~~~ 73 (305) ++..+.-. ..+ .+ +...+++..+.....+.+. ++++| |+......+..+..... ..-.+ .....+..+|+ T Consensus 120 ~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~~~~~~~~~~~~~~~v~E~~~~~~~~~~~~~ 196 (415) T protein:vir:79 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKY--VTVKR-VTNGSGKYPVVRQSEVAALEKVEELEENPELAVKPFF 196 (415) T ss_pred hhccccccccccccchHHHHHHHHHHHhhhhhhhh--eeeee-ccCCceeEEEEeecCCccceeeccccccCccccccee Confidence 22222111 111 11 3344455445555555444 55544 22111112222221111 11111 11122446899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..|.++++-+...++=. ++. +=++..+.++.+.|..+++..++..++.|+++.+...+.... 
T Consensus 197 ~v~~~~~k~~~~~~iS~e---------ll~-------ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~- 259 (415) T protein:vir:79 197 QLAYDINTHRGYFRISRE---------AIE-------DAKVNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGF- 259 (415) T ss_pred eEEeeeeeeEeeehhhHH---------HHh-------hchHHHHHHHHHHHHHHHHHHHHHHHhhccccCccccccccc- Confidence 999999999888765422 111 223456778889999999999999999998765443332221 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~k 230 (305) ..+.+.+..++..+ .+.+.++..+++..+++ +-.++||...|.+.+. ++.-.|.+. +.+++....+. T Consensus 260 ---~~~~~~~~~~~~~~----~~~i~~~~~~~~~~~~~--~~~~v~n~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~~l~ 329 (415) T protein:vir:79 260 ---EKEGKKLEVKKAKS----LDDIKDAINLNVKPNYE--HNVAIVSQTMFAKLDK-MKDKLGNYLIQPDVKEKTQQRLL 329 (415) T ss_pred ---cccccccccccccc----hhHHHHHHHhhhhhccC--CCEEEEcHHHHHHHHH-hhccCCceeeccCcCCCCCceec Confidence 11111122223333 45555666676666654 3489999999887653 433222221 33445556899 Q ss_pred ceeeeeccCCCCC-----eEEEecchHHhhhhhhhhhhhhccc--cceeeeccceeEEEEEEeecceeeccCCeEEEecC Q lcl|NC_018271. 231 GYTLTEIKGLPAS-----RMVGYNRDNIVIGMSAQSDFNEIRI--KDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTP 303 (305) Q Consensus 231 Gi~iv~l~~~Pd~-----~ii~T~~sNl~~gvnl~~D~n~I~I--~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~ 303 (305) |++++....+|.+ .++..+=++++..+ |-..+++ .+.. ..... +...+-+|..+..++=||+-+= T Consensus 330 G~pV~~~~~~~~~~~~~~~~~~Gd~~~~~~~~----~~~~~~v~~~~~~---~~~~~-~~~~~r~d~~v~~~~a~~~~~~ 401 (415) T protein:vir:79 330 GAKIEILPDEVLGQKGNNTLIIGNLKDAIVLF----DRSQYQASWTDYM---HFGEC-LMIAVRQDCRILDYKSAIVIEY 401 (415) T ss_pred ceeeEEecccccCCCCccEEEEEehhccEEEE----eecceEEEEeccc---cCceE-EEEEEEeccEEeccccEEEEEE Confidence 9999888888743 35666555543211 1222222 2221 12221 2345567888888777777643 Q ss_pred --CC Q lcl|NC_018271. 
304 --AA 305 (305) Q Consensus 304 --~~ 305 (305) ++ T Consensus 402 ~~~~ 405 (415) T protein:vir:79 402 DDSE 405 (415) T ss_pred eccC Confidence 33 No 71 >protein:vir:98339 Length: 415 # NCBI annotation: putative capsid protein # Family: family:all:21 # MgeID: mge:1581 # MgeName: phiPVL(108) # Cross-refs: genbank:acc:YP_918931;genbank:gi:119443693;genbank:GeneID:4594501 Probab=97.33 E-value=7.4e-05 Score=43.23 Aligned_cols=267 Identities=7% Similarity=-0.032 Sum_probs=137.8 Q ss_pred CceEeeee-ccc-ch-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccc--cCCCC--CCCCccceEec Q lcl|NC_018271. 1 MATTVDIT-TNY-VG-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDF--VDYSC--GFTPSGEVDIN 73 (305) Q Consensus 1 ma~~~~~~-~~Y-~G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~--q~~~~--~~~~~G~~~~~ 73 (305) ++..+.-. ..+ .+ +...+++..+.....+.+. ++++| |+......+..+..... ..-.+ .....+..+|+ T Consensus 120 ~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~~~~~~~~~~~~~~~v~E~~~~~~~~~~~~~ 196 (415) T protein:vir:98 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKY--VTVKR-VTNGSGKYPVVRQSEVAALEKVEELEENPELAVKPFF 196 (415) T ss_pred hhccccccccccccchHHHHHHHHHHHhhhhhhhh--eeeee-ccCCceeEEEEeecCCccceeeccccccCccccccee Confidence 22222111 111 11 3344455445555555444 55544 22111112222221111 11111 11122446899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..|.++++-+...++=. ++. +=++..+.++.+.|..+++..++..++.|+++.+...+.... 
T Consensus 197 ~v~~~~~k~~~~~~iS~e---------ll~-------ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~- 259 (415) T protein:vir:98 197 QLAYDINTHRGYFRISRE---------AIE-------DAKVNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGF- 259 (415) T ss_pred eEEeeeeeeEeeehhhHH---------HHh-------hchHHHHHHHHHHHHHHHHHHHHHHHhhccccCccccccccc- Confidence 999999999888765422 111 223456778889999999999999999998765443332221 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~k 230 (305) ..+.+.+..++..+ .+.+.++..+++..+++ +-.++||...|.+.+. ++.-.|.+. +.+++....+. T Consensus 260 ---~~~~~~~~~~~~~~----~~~i~~~~~~~~~~~~~--~~~~v~n~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~~l~ 329 (415) T protein:vir:98 260 ---EKEGKKLEVKKAKS----LDDIKDAINLNVKPNYE--HNVAIVSQTMFAKLDK-MKDKLGNYLIQPDVKEKTQQRLL 329 (415) T ss_pred ---cccccccccccccc----hhHHHHHHHhhhhhccC--CCEEEEcHHHHHHHHH-hhccCCceeeccCcCCCCCceec Confidence 11111122223333 45555666676666654 3489999999887653 433222221 33445556899 Q ss_pred ceeeeeccCCCCC-----eEEEecchHHhhhhhhhhhhhhccc--cceeeeccceeEEEEEEeecceeeccCCeEEEecC Q lcl|NC_018271. 231 GYTLTEIKGLPAS-----RMVGYNRDNIVIGMSAQSDFNEIRI--KDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTP 303 (305) Q Consensus 231 Gi~iv~l~~~Pd~-----~ii~T~~sNl~~gvnl~~D~n~I~I--~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~ 303 (305) |++++....+|.+ .++..+=++++..+ |-..+++ .+.. ..... +...+-+|..+..++=||+-+= T Consensus 330 G~pV~~~~~~~~~~~~~~~~~~Gd~~~~~~~~----~~~~~~v~~~~~~---~~~~~-~~~~~r~d~~v~~~~a~~~~~~ 401 (415) T protein:vir:98 330 GAKIEILPDEVLGQKGNNTLIIGNLKDAIVLF----DRSQYQASWTDYM---HFGEC-LMIAVRQDCRILDYKSAIVIEY 401 (415) T ss_pred ceeeEEecccccCCCCccEEEEEehhccEEEE----eecceEEEEeccc---cCceE-EEEEEEeccEEeccccEEEEEE Confidence 9999888888743 35666555543211 1222222 2221 12221 2345567888888777777643 Q ss_pred --CC Q lcl|NC_018271. 
304 --AA 305 (305) Q Consensus 304 --~~ 305 (305) ++ T Consensus 402 ~~~~ 405 (415) T protein:vir:98 402 DDSE 405 (415) T ss_pred eccC Confidence 33 No 72 >protein:vir:81100 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:1891 # MgeName: tp310-1 # Cross-refs: genbank:acc:YP_001429874;genbank:gi:156603927;genbank:GeneID:5525320 Probab=97.33 E-value=7.4e-05 Score=43.23 Aligned_cols=267 Identities=7% Similarity=-0.032 Sum_probs=137.8 Q ss_pred CceEeeee-ccc-ch-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccc--cCCCC--CCCCccceEec Q lcl|NC_018271. 1 MATTVDIT-TNY-VG-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDF--VDYSC--GFTPSGEVDIN 73 (305) Q Consensus 1 ma~~~~~~-~~Y-~G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~--q~~~~--~~~~~G~~~~~ 73 (305) ++..+.-. ..+ .+ +...+++..+.....+.+. ++++| |+......+..+..... ..-.+ .....+..+|+ T Consensus 120 ~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~~~~~~~~~~~~~~~v~E~~~~~~~~~~~~~ 196 (415) T protein:vir:81 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKY--VTVKR-VTNGSGKYPVVRQSEVAALEKVEELEENPELAVKPFF 196 (415) T ss_pred hhccccccccccccchHHHHHHHHHHHhhhhhhhh--eeeee-ccCCceeEEEEeecCCccceeeccccccCccccccee Confidence 22222111 111 11 3344455445555555444 55544 22111112222221111 11111 11122446899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..|.++++-+...++=. ++. +=++..+.++.+.|..+++..++..++.|+++.+...+.... 
T Consensus 197 ~v~~~~~k~~~~~~iS~e---------ll~-------ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~g~~~~~~~~~- 259 (415) T protein:vir:81 197 QLAYDINTHRGYFRISRE---------AIE-------DAKVNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGF- 259 (415) T ss_pred eEEeeeeeeEeeehhhHH---------HHh-------hchHHHHHHHHHHHHHHHHHHHHHHHhhccccCccccccccc- Confidence 999999999888765422 111 223456778889999999999999999998765443332221 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~k 230 (305) ..+.+.+..++..+ .+.+.++..+++..+++ +-.++||...|.+.+. ++.-.|.+. +.+++....+. T Consensus 260 ---~~~~~~~~~~~~~~----~~~i~~~~~~~~~~~~~--~~~~v~n~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~~l~ 329 (415) T protein:vir:81 260 ---EKEGKKLEVKKAKS----LDDIKDAINLNVKPNYE--HNVAIVSQTMFAKLDK-MKDKLGNYLIQPDVKEKTQQRLL 329 (415) T ss_pred ---cccccccccccccc----hhHHHHHHHhhhhhccC--CCEEEEcHHHHHHHHH-hhccCCceeeccCcCCCCCceec Confidence 11111122223333 45555666676666654 3489999999887653 433222221 33445556899 Q ss_pred ceeeeeccCCCCC-----eEEEecchHHhhhhhhhhhhhhccc--cceeeeccceeEEEEEEeecceeeccCCeEEEecC Q lcl|NC_018271. 231 GYTLTEIKGLPAS-----RMVGYNRDNIVIGMSAQSDFNEIRI--KDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTP 303 (305) Q Consensus 231 Gi~iv~l~~~Pd~-----~ii~T~~sNl~~gvnl~~D~n~I~I--~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~ 303 (305) |++++....+|.+ .++..+=++++..+ |-..+++ .+.. ..... +...+-+|..+..++=||+-+= T Consensus 330 G~pV~~~~~~~~~~~~~~~~~~Gd~~~~~~~~----~~~~~~v~~~~~~---~~~~~-~~~~~r~d~~v~~~~a~~~~~~ 401 (415) T protein:vir:81 330 GAKIEILPDEVLGQKGNNTLIIGNLKDAIVLF----DRSQYQASWTDYM---HFGEC-LMIAVRQDCRILDYKSAIVIEY 401 (415) T ss_pred ceeeEEecccccCCCCccEEEEEehhccEEEE----eecceEEEEeccc---cCceE-EEEEEEeccEEeccccEEEEEE Confidence 9999888888743 35666555543211 1222222 2221 12221 2345567888888777777643 Q ss_pred --CC Q lcl|NC_018271. 
304 --AA 305 (305) Q Consensus 304 --~~ 305 (305) ++ T Consensus 402 ~~~~ 405 (415) T protein:vir:81 402 DDSE 405 (415) T ss_pred eccC Confidence 33 No 73 >protein:vir:1638 Length: 298 # NCBI annotation: Structural protein # Family: family:all:966 # MgeID: mge:33 # MgeName: r1t # Cross-refs: genbank:acc:NP_695059;genbank:gi:23455750;genbank:GeneID:955469 Probab=97.31 E-value=5.8e-05 Score=43.82 Aligned_cols=276 Identities=12% Similarity=0.049 Sum_probs=148.0 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC-CCCCccceEecceeeee Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQLTL 79 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~L~~ 79 (305) ||+.--... -.+...+|+-.+....-+.+. .+++|--+.. +.+|+...+-...-..+ +--+-++.+|.+..|.+ T Consensus 1 ma~~gG~lv--p~~~~~~ii~~~~~~s~i~~l--~~~~~~~~~~-~~ip~~~~~~~a~~v~E~~~~~~~~~~f~~v~l~~ 75 (298) T protein:vir:16 1 MVLNKGTLF--DPTLVTDLISKVAGKSSIARL--SAQKPIPFNG-EKVFTFTMDSEIDVVAESGKKTHGGVTLAPQTMVP 75 (298) T ss_pred CcccCccee--chhHHHHHHHHHHhhhhhhhh--cceeeccCCc-eEEEEEecCcceEEecCCccccccccceeEEEEee Confidence 987654321 123444555444444434333 4444422222 33554433222211111 11223467899999999 Q ss_pred eeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCC----ccchhHHHHHHHhh Q lcl|NC_018271. 80 KKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDG----TTGNLQGILPLLEA 155 (305) Q Consensus 80 ~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~----s~~~fdG~lk~i~~ 155 (305) +++-+...++-+=.+. +. +--...+.++.+.|.++++..+++.++.|.+ +...+-|..... 
T Consensus 76 ~k~a~~~~iS~ell~~-------s~------d~~~~l~~~i~~~la~ai~~~~d~~~l~G~~~~~g~~~~~~~~~~~~-- 140 (298) T protein:vir:16 76 IKVEYGARISDEFMYA-------SD------EEKINILQEFNDGFAKKVARGIDLMAFHGVNPRLGTASAVIGTNHFD-- 140 (298) T ss_pred eeEEEeehhhHHHhhc-------Cc------ccHHHHHHHHHHHHHHHHHHHHHHHhhccccCCCCcccccccccccc-- Confidence 9999887765332211 10 1123445677788999999999999999853 223333322111 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceecce Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFEGY 232 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~kGi 232 (305) ..+...+... ....+..+.+.++..++....++. =.+.||...+.+.+. ++.-.|.+. ....+....+.|+ T Consensus 141 -~~~~~~~~~~--~~~~~~~~~i~~~~~~~~~~~~~~--~~~vmn~~~~~~l~~-lkd~~G~~i~~~~~~~~~~~~l~G~ 214 (298) T protein:vir:16 141 -SKVTQKVEAP--RGIADPNGAIENAVELLTGVDADV--TGIAINPSFRSALAK-QKDLQDNALFPELKWGATPDTINGL 214 (298) T ss_pred -cccccccccc--cccccHHHHHHHHHHHhhhcCCCc--cEEEEcHHHHHHHHH-hhccCCCeeecCcccCCCCceecce Confidence 1111111111 112334556666666666554432 269999998876643 332222221 1223445689999 Q ss_pred eeeeccCCCC------CeEEEecchHHh-hhhhhhhhhhhccccceee-------eccceeEEEEEEeecceeeccCCeE Q lcl|NC_018271. 233 TLTEIKGLPA------SRMVGYNRDNIV-IGMSAQSDFNEIRIKDMGD-------VDLSGQIRTKMVLSAGVEYAYGAEI 298 (305) Q Consensus 233 ~iv~l~~~Pd------~~ii~T~~sNl~-~gvnl~~D~n~I~I~~~~~-------~~~~~~~f~k~~m~~d~~i~fg~E~ 298 (305) +++....+|+ ..+|..+=+|.+ +++. .+. .+++.+... ++-..+..+...+-.|+.+..++=+ T Consensus 215 PV~~~~~v~~~~~~~~~~~~~GDfs~~~~~~~~--~~~-~~~~~~~~~~~~~~~~~f~~~~v~~ra~~r~d~~v~~~~a~ 291 (298) T protein:vir:16 215 PVDVNKTVSDMSLTQRDRAIIGDFANGFKWGYA--KEV-PLEVIQYGDPDNSGLDLKGYNQVYIRAELFLGWGILDATKF 291 (298) T ss_pred eeEEecccccccCCCccEEEEeeccceEEEEEe--cCc-eEEEeeccCCcCcchhhhhcCcEEEEEEEEEccEeecccce Confidence 9988877775 367777766653 3331 111 233433221 1223445666677779999888888 Q ss_pred EEecCCC Q lcl|NC_018271. 
299 VLYTPAA 305 (305) Q Consensus 299 v~~~~~~ 305 (305) ++-+++- T Consensus 292 ~~l~~at 298 (298) T protein:vir:16 292 ARVTEAN 298 (298) T ss_pred EEEeecC Confidence 8888888 No 74 >protein:vir:93616 Length: 645 # NCBI annotation: putative major head protein/prohead protease # Family: family:all:21 # MgeID: mge:157 # MgeName: phi 4795 # Cross-refs: genbank:acc:YP_001449293;genbank:gi:157166041;goa:Q6H9U8;interpro:IPR006433;uniprot:Q6H9U8;genbank:GeneID:5580438 Probab=97.30 E-value=6.5e-05 Score=43.55 Aligned_cols=264 Identities=11% Similarity=0.066 Sum_probs=129.2 Q ss_pred CceEee----------eecccchhHHHHHHHHhhccccchhcCceEEecC---CCCcccccchhhhhccccCCCCCCCC- Q lcl|NC_018271. 1 MATTVD----------ITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPN---VPENNLFLRRMNTTDDFVDYSCGFTP- 66 (305) Q Consensus 1 ma~~~~----------~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~---v~~~~~~~~~~~~~~~~q~~~~~~~~- 66 (305) ++.... +-+.|.++..+-+.- .-++.+...+++++ ++... -+|+...+ -.+.|.+ T Consensus 334 ~~~~~~~~~~~~Gg~~vp~~~~~~ii~~l~~-----~svv~~l~~~~~~~~~~~~~~~-~ip~~t~~-----~~a~wv~E 402 (645) T protein:vir:93 334 VGAGTTTDPQWAGSLSEYQEYAQDFIDYLRP-----QTIIGRFGQGGIPALRQVPFNI-RVHAQVSG-----GAAGWVGE 402 (645) T ss_pred hhccccccccccCCccCchhhHHHHHHhhhh-----hhhHHhhccccccccccccCce-eeeeeecC-----cceEEecc Confidence 111111 111233333322211 11222211222333 22222 23332221 1234533 Q ss_pred -----ccceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCC Q lcl|NC_018271. 67 -----SGEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDG 141 (305) Q Consensus 67 -----~G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~ 141 (305) -++.+|++..|.|+++-+...++-+=+ . 
+=.+..+.++...|...++..+...++.|++ T Consensus 403 g~~~~~s~~~f~~v~l~~~kla~~~~iS~ell---------~-------ds~~~~~~~i~~~l~~aia~~~d~a~l~g~g 466 (645) T protein:vir:93 403 GKTKPLTKFDFESITFSHAKVSAIAVLTEELI---------R-------FSSPAADALVRNALAEAVVARLDTDFVDPKK 466 (645) T ss_pred CccccccccceeEEEEeeEEEEEeehhHHHHH---------h-------hchHHHHHHHHHHHHHHHHHHHHHHhhcCCC Confidence 246799999999999999887653211 1 1124456777889999999999999999887 Q ss_pred ccch---hHHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC Q lcl|NC_018271. 142 TTGN---LQGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN 218 (305) Q Consensus 142 s~~~---fdG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~ 218 (305) +... =-|++ ....++ .. ...+.++ +..+...+...-..-++-.++||...+.+... ++.-.|.+ T Consensus 467 ~~~~~~~p~gi~----~~~~~~---~~-~~~~~~d----~~~~~~~~~~a~~~~~~a~~vmn~~~~~~L~~-lkd~~G~~ 533 (645) T protein:vir:93 467 AAVADVSPASIT----HDVKGT---AS-SGNPDAD----AEAAFGQFVAANLQPTGAVWLMSSTNALALSM-RKNALGQK 533 (645) T ss_pred cccCCcccccee----cccccc---cc-ccchHHH----HHHHHHHHHhcCCCccccEEEEcHHHHHHHHh-ccccCCce Confidence 6311 11211 111111 11 1122333 33444443332111134579999988766532 22111111 Q ss_pred C-cccCCCcceecceeeeeccCCCCCeEEEecchHHhhhhhhh-----hhhhhccccc-------------eeeecccee Q lcl|NC_018271. 219 G-TFLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQ-----SDFNEIRIKD-------------MGDVDLSGQ 279 (305) Q Consensus 219 ~-~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~-----~D~n~I~I~~-------------~~~~~~~~~ 279 (305) . ......-..+.|++++.-..+|++.+++ +.+.+++|.... ++.--++|.+ +-+++-... 
T Consensus 534 ~~~~~~~~~~tL~G~PV~~s~~vp~~~~~g-d~s~~~ig~~~~v~i~~s~~a~~~~~~~~~~~~~~~~~~~~v~lf~~d~ 612 (645) T protein:vir:93 534 EYPDMTLLGGSFQGLPVIVSQYVGDQLVLV-NAPDIYLADDGGVAVDMSREASLEMQSEPTGDSTTPSPVELVSMFQTGS 612 (645) T ss_pred eecCCCCCCceeeceeeEEeccCCcceeEe-ccccEEEEEecceEEEeecceeEEEeecccccccccccccchhHhhcCc Confidence 0 0011122479999999999999987765 445555443210 0000111111 001233344 Q ss_pred EEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 280 IRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 280 ~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) .=++..+-.|+.+..++=||+=|..- T Consensus 613 vaira~~r~d~~~~~p~a~~~lt~~~ 638 (645) T protein:vir:93 613 VAIRAERWINWRRRRTAAVAVITGVN 638 (645) T ss_pred eEEEEEEEEcceeeCccceEEEeccc Confidence 55566677799998888888877655 No 75 >protein:vir:4600 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:101 # MgeName: PVL # Cross-refs: genbank:acc:NP_058445;genbank:gi:9635171;genbank:GeneID:1262708 Probab=97.28 E-value=0.00011 Score=42.33 Aligned_cols=269 Identities=7% Similarity=-0.020 Sum_probs=140.3 Q ss_pred CceEeeee-cc-cch-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcc--ccCCCC-CCCC-ccceEec Q lcl|NC_018271. 1 MATTVDIT-TN-YVG-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDD--FVDYSC-GFTP-SGEVDIN 73 (305) Q Consensus 1 ma~~~~~~-~~-Y~G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~--~q~~~~-~~~~-~G~~~~~ 73 (305) ++..+.-. .. =.+ ....+++..+.....+.+. ++++|-=... ..+++.+.... ..--.+ +-.+ .+..+|. T Consensus 120 ~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~--~~~~~~~~~~-~~~~~~~~~~~~~~~~v~Eg~~~~~~~~~~~~ 196 (415) T protein:vir:46 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKY--VTVKRVTNGS-GKYPVVRQSEVAALEKVEELEENPELAVKPFF 196 (415) T ss_pred hhccccccCCcccccHHHHHHHHHHHHhhhhhhhh--cceeeccCCc-eeEEEEEecCCcceeeccccccccccccccee Confidence 22221111 10 112 2234444445555555444 4443321111 11222221111 111111 1122 2456889 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 
74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..+.++++-+...++=. ++ ++-++..+.++...|..+++..++..++.|+++.....+..-. T Consensus 197 ~v~~~~~k~~~~~~iS~e---------ll-------~ds~~~l~~~i~~~l~~~i~~~~d~~il~g~g~g~~~~~~~~~- 259 (415) T protein:vir:46 197 QLAYDINTHRGYFRISRE---------AI-------EDAKVNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGF- 259 (415) T ss_pred eEEeeeeeeEeeehhhHH---------HH-------hhchHHHHHHHHHHHHHHHHHHHHHHHhhccccCCcccccccc- Confidence 999999999988765421 11 1224567788999999999999999999998765443332211 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~k 230 (305) ..+...+..++..| .+.+.++..+++..+++ +-.++||...|...+. ++.-.|.+. +.+++....+. T Consensus 260 ---~~~~~~~~~~~~~~----~~~i~~~~~~~~~~~~~--~~~~v~n~~~~~~L~~-lkd~~G~~i~~~~~~~~~~~~l~ 329 (415) T protein:vir:46 260 ---EKEGKKLEVKKAKS----LDDIKDAINLNVKPNYE--HNVAIVSQTMFAKLDK-MKDKLGNYLIQPDVKEKTQQRLL 329 (415) T ss_pred ---ccccceeccccccc----hHHHHHHHHhhhhhccC--CCEEEEcHHHHHHHHH-hhccCCCeeeccCcCCCCCcccc Confidence 11112222333444 45556677777777664 3489999999876643 433222221 33445556799 Q ss_pred ceeeeeccCCCCC-----eEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEe--cC Q lcl|NC_018271. 231 GYTLTEIKGLPAS-----RMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLY--TP 303 (305) Q Consensus 231 Gi~iv~l~~~Pd~-----~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~--~~ 303 (305) |++++....+|.. .++..+-++++..++- .++. ++.+++. ..... 
+...+-+|+.+..++=||+- || T Consensus 330 G~pV~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~-~~~~-v~~~~~~---~~~~~-~~~~~r~d~~v~~~~a~~~~~~~~ 403 (415) T protein:vir:46 330 GAKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDR-SQYQ-ASWTDYM---HFGEC-LMIAVRQDCRILDYKSAIVIEYDD 403 (415) T ss_pred ceeeEEeccccccCCCccEEEEEehhccEEEEee-cceE-EEeeccc---cCceE-EEEEEEeccEEeccccEEEEEeec Confidence 9999988888732 4666666665332211 1221 2222221 22222 23556678888887776654 45 Q ss_pred CC Q lcl|NC_018271. 304 AA 305 (305) Q Consensus 304 ~~ 305 (305) ++ T Consensus 404 ~~ 405 (415) T protein:vir:46 404 SE 405 (415) T ss_pred cC Confidence 55 No 76 >protein:vir:4700 Length: 415 # NCBI annotation: phi PVL ORF 7 homologue # Family: family:all:21 # MgeID: mge:102 # MgeName: phiPV83 # Cross-refs: genbank:acc:NP_061632;genbank:gi:9635719;genbank:GeneID:1262976 Probab=97.28 E-value=0.00011 Score=42.33 Aligned_cols=269 Identities=7% Similarity=-0.020 Sum_probs=140.3 Q ss_pred CceEeeee-cc-cch-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcc--ccCCCC-CCCC-ccceEec Q lcl|NC_018271. 1 MATTVDIT-TN-YVG-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDD--FVDYSC-GFTP-SGEVDIN 73 (305) Q Consensus 1 ma~~~~~~-~~-Y~G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~--~q~~~~-~~~~-~G~~~~~ 73 (305) ++..+.-. .. =.+ ....+++..+.....+.+. ++++|-=... ..+++.+.... ..--.+ +-.+ .+..+|. T Consensus 120 ~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~--~~~~~~~~~~-~~~~~~~~~~~~~~~~v~Eg~~~~~~~~~~~~ 196 (415) T protein:vir:47 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKY--VTVKRVTNGS-GKYPVVRQSEVAALEKVEELEENPELAVKPFF 196 (415) T ss_pred hhccccccCCcccccHHHHHHHHHHHHhhhhhhhh--cceeeccCCc-eeEEEEEecCCcceeeccccccccccccccee Confidence 22221111 10 112 2234444445555555444 4443321111 11222221111 111111 1122 2456889 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 
74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..+.++++-+...++=. ++ ++-++..+.++...|..+++..++..++.|+++.....+..-. T Consensus 197 ~v~~~~~k~~~~~~iS~e---------ll-------~ds~~~l~~~i~~~l~~~i~~~~d~~il~g~g~g~~~~~~~~~- 259 (415) T protein:vir:47 197 QLAYDINTHRGYFRISRE---------AI-------EDAKVNVLQELKLWMARTIAATRNKAIIDVITKGSTGSTSSGF- 259 (415) T ss_pred eEEeeeeeeEeeehhhHH---------HH-------hhchHHHHHHHHHHHHHHHHHHHHHHHhhccccCCcccccccc- Confidence 999999999988765421 11 1224567788999999999999999999998765443332211 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~~~k 230 (305) ..+...+..++..| .+.+.++..+++..+++ +-.++||...|...+. ++.-.|.+. +.+++....+. T Consensus 260 ---~~~~~~~~~~~~~~----~~~i~~~~~~~~~~~~~--~~~~v~n~~~~~~L~~-lkd~~G~~i~~~~~~~~~~~~l~ 329 (415) T protein:vir:47 260 ---EKEGKKLEVKKAKS----LDDIKDAINLNVKPNYE--HNVAIVSQTMFAKLDK-MKDKLGNYLIQPDVKEKTQQRLL 329 (415) T ss_pred ---ccccceeccccccc----hHHHHHHHHhhhhhccC--CCEEEEcHHHHHHHHH-hhccCCCeeeccCcCCCCCcccc Confidence 11112222333444 45556677777777664 3489999999876643 433222221 33445556799 Q ss_pred ceeeeeccCCCCC-----eEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEe--cC Q lcl|NC_018271. 231 GYTLTEIKGLPAS-----RMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLY--TP 303 (305) Q Consensus 231 Gi~iv~l~~~Pd~-----~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~--~~ 303 (305) |++++....+|.. .++..+-++++..++- .++. ++.+++. ..... 
+...+-+|+.+..++=||+- || T Consensus 330 G~pV~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~-~~~~-v~~~~~~---~~~~~-~~~~~r~d~~v~~~~a~~~~~~~~ 403 (415) T protein:vir:47 330 GAKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDR-SQYQ-ASWTDYM---HFGEC-LMIAVRQDCRILDYKSAIVIEYDD 403 (415) T ss_pred ceeeEEeccccccCCCccEEEEEehhccEEEEee-cceE-EEeeccc---cCceE-EEEEEEeccEEeccccEEEEEeec Confidence 9999988888732 4666666665332211 1221 2222221 22222 23556678888887776654 45 Q ss_pred CC Q lcl|NC_018271. 304 AA 305 (305) Q Consensus 304 ~~ 305 (305) ++ T Consensus 404 ~~ 405 (415) T protein:vir:47 404 SE 405 (415) T ss_pred cC Confidence 55 No 77 >protein:vir:8420 Length: 477 # NCBI annotation: gp15 # Family: family:all:21 # MgeID: mge:155 # MgeName: Omega # Cross-refs: genbank:acc:NP_818316;genbank:gi:29566752;genbank:GeneID:1260033 Probab=97.25 E-value=1.3e-05 Score=47.29 Aligned_cols=272 Identities=10% Similarity=0.097 Sum_probs=136.8 Q ss_pred CceEeeeecccc-hhHHHHHHHHhhcc-ccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC-----------c Q lcl|NC_018271. 1 MATTVDITTNYV-GEVAGGYFLEMVKE-ANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP-----------S 67 (305) Q Consensus 1 ma~~~~~~~~Y~-Ge~l~~~~~~~~~g-~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~-----------~ 67 (305) +.++-....... .+...+.|.+.+.. ..+.+......+|+-..+ ..+|+...+... +.|.+ . T Consensus 157 ~~~~~~~gg~lv~~~~~~~~ii~~l~~~~~i~~~~~~~~~~~~~~~-~~ip~~~~~~~~----a~~~~Eg~~~~~~~~~~ 231 (477) T protein:vir:84 157 LDRNGGTGGYAVPPLWMMNRFIELARAGRTYANLCPTEPLPGGTSS-INIPKILTGTST----AIQAADNAALTAPSAHE 231 (477) T ss_pred ccccCCCcceeeccchhHHHHHHHhhhcchHHHhhceeeecCCcce-eEEEEEecCcce----eeeeccCcccccccccc Confidence 111111111111 23333333333333 333332222333454344 234443322111 23333 2 Q ss_pred cceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhH Q lcl|NC_018271. 
68 GEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQ 147 (305) Q Consensus 68 G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fd 147 (305) ++..|....+.++++.+...++-+ ++ ++-.+.++.++...|..+++..++..++.|+++.+... T Consensus 232 s~~~f~~i~~~~~k~~~~~~iS~e---------ll-------~ds~~~l~~~i~~~l~~~~~~~~d~~~l~G~Gt~~~p~ 295 (477) T protein:vir:84 232 VDLTDGFVQANVKTIAGQQGIAIQ---------LL-------DQAAVSVDEFVFRDLAADYANKLNVQVISGTGSNNQVV 295 (477) T ss_pred cccceeeEEEeeeeEEeeeHHHHH---------HH-------hccchhHHHHHHHHHHHHHHHHHHHHHhccCCCCCccc Confidence 355689999999999998876422 11 12245677889999999999999999999999988899 Q ss_pred HHHHHHhhccceEEeccCcCcCChhh---HHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC------ Q lcl|NC_018271. 148 GILPLLEADATVIDVVGASGGITAAN---VEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN------ 218 (305) Q Consensus 148 G~lk~i~~d~~~~~~~~~~~~iT~an---v~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~------ 218 (305) |++... ....+... ....|.+. ..+.+-+...+++..++.. .-.++|+...|-.++. ++.-.|-+ T Consensus 296 Gi~~~~--~~~~~~~~--~~~~t~~~~~~~~~~i~~~~~~~~~~~~~~-~~~~v~~~~~~~~l~~-lkd~~G~~l~~~~~ 369 (477) T protein:vir:84 296 GVRATA--GITQVTAT--SAGSALEKHQIIYQKIADAIQRVHTSRFLE-PEVIVMHPRRWASFHA-IFAGDDRPLIVPSG 369 (477) T ss_pred eeeecc--cccccccc--ccccchhhHHHHHHHHHHHHhhccccccCC-ccEEEEcHHHHHHHHH-hhccCCCeeeecCc Confidence 999752 12212111 12233333 3344444455566565543 3478999988765443 32221111 Q ss_pred ----------CcccCCCcceecceeeeeccCCCCC--------eEEEecchHHhhhhhhhhhhhhccccceeeec-ccee Q lcl|NC_018271. 219 ----------GTFLNPNEFDFEGYTLTEIKGLPAS--------RMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVD-LSGQ 279 (305) Q Consensus 219 ----------~~~t~~~~~~~kGi~iv~l~~~Pd~--------~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~-~~~~ 279 (305) ....++-...+.|++++-...||++ .|+..+-+.++++.. .++++...-.. 
.+.+ T Consensus 370 ~~~~~~~~~~~~~~~~~~~~l~G~pVv~s~~~p~~~~~~~d~~~i~~gd~~~~~i~~~------~~~~~~~~~~~~~~~~ 443 (477) T protein:vir:84 370 PGFNNLGVLTEVASQRVVGQMHGLPVVTDPTLPTTLGTGTDQDVIHVLRASDLALFES------SVRMRALQETRAENLS 443 (477) T ss_pred ccccccccccccccccccchhcccceEecCcccccccccCCcceEEEEEeceEEEEee------ceeEEeccccccccce Confidence 0111222346899999999899965 355555556654321 12222111100 1111 Q ss_pred EEEEEE-eecceeeccCCeEEEecCCC Q lcl|NC_018271. 280 IRTKMV-LSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 280 ~f~k~~-m~~d~~i~fg~E~v~~~~~~ 305 (305) -.|+-+ +-++..+-|++=||.-|=.+ T Consensus 444 ~~~~v~~~~~~~~~r~~~afv~~t~~~ 470 (477) T protein:vir:84 444 VLLQVYGYLAFTAARFPQSVVEIGGTA 470 (477) T ss_pred eeeeehhhhhhhhhccccceEEeeccc Confidence 111110 11122334577777777666 No 78 >protein:vir:4997 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:109 # MgeName: Sfi21 # Cross-refs: genbank:acc:NP_049971;genbank:gi:9632943;genbank:GeneID:1262106 Probab=97.23 E-value=0.00012 Score=42.00 Aligned_cols=253 Identities=9% Similarity=0.081 Sum_probs=134.9 Q ss_pred CceEeeeeccc-chhH-HHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCCCCCCC------c-cc Q lcl|NC_018271. 1 MATTVDITTNY-VGEV-AGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYSCGFTP------S-GE 69 (305) Q Consensus 1 ma~~~~~~~~Y-~Ge~-l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~~~~~~------~-G~ 69 (305) |+......+.| ..+. ..+|+........+.+. ++++| +-..+..++.+-. .... +.|.+ . .. T Consensus 109 ~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~--~~~~~~~~~~~~~~~~~~~~-~~~~----a~~v~E~~~~~~~~~ 181 (397) T protein:vir:49 109 KTDGSGSDAGLTIPQDIRTAINTLVRQFDSLQEY--VNVENVTTLTGSRVYEKWAD-ITGL----AKLDDEGGQIGQNDD 181 (397) T ss_pred hhccCCccCcceecHHHHHHHHHHHHhhhhHhhh--cceeeccCCcceEEEEeecc-CCcc----eeeeccccccccccc Confidence 44333322222 2233 34454444444444433 44433 2222211111111 1111 12322 2 23 Q ss_pred eEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHH Q lcl|NC_018271. 
70 VDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGI 149 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~ 149 (305) .+|.+..+.++++-+...++=. + + ++-+...+.++.+.|.++++..++..++.|+++.. T Consensus 182 ~~~~~v~~~~~k~~~~~~iS~e-l--------l-------~ds~~~l~~~i~~~l~~~~~~~~d~ail~G~g~~~----- 240 (397) T protein:vir:49 182 PKLSLIRYAIKRYAGISTVTNS-L--------L-------ADSAENILAWLSGWIAKKVVVTRNKAILEAIGTLP----- 240 (397) T ss_pred cceeeeEeeeeeeEeehhhHHH-H--------H-------hhhhHHHHHHHHHHHHHHHHHHHHHHHHhcccccc----- Confidence 4788889999998888765432 1 1 12244567888899999999999999999988631 Q ss_pred HHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCc Q lcl|NC_018271. 150 LPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNE 226 (305) Q Consensus 150 lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~ 226 (305) +.++.+|..+ +.++..+++..++.+ -.++||...|...+- ++.-.|.+. +..++.. T Consensus 241 --------------~~~~~~~~d~----i~~~~~~l~~~~~~~--a~~v~n~~~~~~l~~-lkd~~g~~l~~~~~~~g~~ 299 (397) T protein:vir:49 241 --------------NKPTLAKWDD----IIDLQAKVDPAIKQT--SLFLTNTSGFTALKK-VKNAMGDYLMERDVKSPTG 299 (397) T ss_pred --------------ccccccCHHH----HHHHHHhhhhhhcCC--CEEEEcHHHHHHHHH-hhccCCceeecccccCCCC Confidence 1233445433 445666777777653 489999999876543 433333321 2334445 Q ss_pred ceecceeeeecc--CCCC-----CeEEEecchH-Hhhhhhhhhhhhhccccce-eeeccceeEEEEEEeecceeeccCCe Q lcl|NC_018271. 227 FDFEGYTLTEIK--GLPA-----SRMVGYNRDN-IVIGMSAQSDFNEIRIKDM-GDVDLSGQIRTKMVLSAGVEYAYGAE 297 (305) Q Consensus 227 ~~~kGi~iv~l~--~~Pd-----~~ii~T~~sN-l~~gvnl~~D~n~I~I~~~-~~~~~~~~~f~k~~m~~d~~i~fg~E 297 (305) ..+.|++++.+. .+|+ ..++..+-++ .+++..- ... +++++. 
.+...+...-+...+-+|+.+..++= T Consensus 300 ~~l~G~pV~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~--~~~-i~~~~~~~~~~~~~~~~~~~~~r~d~~~~~~~a 376 (397) T protein:vir:49 300 YSIDGFVVKEISDRFLPNGTGGAMPLYFGDLKQAVTLFDRQ--HLS-LLSTNIGGGAFETDTTKVRVIDRFDVVSTDTEA 376 (397) T ss_pred ceecceeeEEecccccccccCCceeEEEeeccceEEEEeec--ccE-EEEeccccchhhcCeeeEEEEEeeccEEecccc Confidence 679999876653 2343 3345445443 4333211 111 222221 12233455666677778888888888 Q ss_pred EEEecCCC Q lcl|NC_018271. 298 IVLYTPAA 305 (305) Q Consensus 298 ~v~~~~~~ 305 (305) ||+-+=++ T Consensus 377 ~~~~~~~~ 384 (397) T protein:vir:49 377 FVPASFKA 384 (397) T ss_pred eEEEEecc Confidence 87776333 No 79 >protein:vir:1268 Length: 397 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:329 # MgeName: phi-105 # Cross-refs: genbank:acc:NP_690760;genbank:gi:22855000;genbank:GeneID:955203 Probab=97.14 E-value=0.00014 Score=41.79 Aligned_cols=260 Identities=6% Similarity=-0.036 Sum_probs=135.8 Q ss_pred CceEeeeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCC-CCC-ccceEeccee Q lcl|NC_018271. 1 MATTVDITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCG-FTP-SGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~-~~~-~G~~~~~~K~ 76 (305) |+.+......+ . -+...+|+..+.....+.+.+.+..+++-+.+. .+++........--..+ -.+ .+..+|++.. T Consensus 123 ~~~~~~~~gg~lvP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~-~~~~~~~~~~a~~v~Eg~~~~~~~~~~~~~v~ 201 (397) T protein:vir:12 123 MSGINDEDGGILIPEDIGRQIHEFKRQFEPLEQYVTVEPVTTRSGTR-LLEKNADMVPFSPVEELGNLPEIDQPRFTKVS 201 (397) T ss_pred ccccccccCcccCchhHHHHHHHhhhhhhhHHhhcceeeccCCceeE-EEEEecCCcceeeecccccccccccccceeEE Confidence 33332222222 1 222344444455555555443222223332332 23322222222111111 112 2345788889 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 
77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) +.++++-+...++-. +. ++-++..+.++.+.|.++++..++..++.|+++.. T Consensus 202 ~~~~k~~~~~~is~e---------~l-------~ds~~~l~~~i~~~l~~~~~~~~d~~il~G~g~~~------------ 253 (397) T protein:vir:12 202 YSIIDYGGIMTLSNS---------ML-------NDSDQAIMTYVAKWFAKKSVVTRNNLILAAIASLK------------ 253 (397) T ss_pred eeheeeEeeehhhHH---------HH-------hhchHHHHHHHHHHHHHHHHHHHHHHHHhcccccc------------ Confidence 999888887654322 11 22345677888999999999999999999987621 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcceeccee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFDFEGYT 233 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~~kGi~ 233 (305) ..+.+|. .+..+.+...++..++. +-.++||...|.+.+. ++.-.|.+ .+.+++....+.|++ T Consensus 254 --------~~g~~~~---~~i~~~~~~~l~~~~~~--~a~~~~n~~~~~~L~~-lkd~~G~~l~~~~~~~g~~~~l~G~p 319 (397) T protein:vir:12 254 --------KVDIDGL---DGIKKALNVTLDPMVAP--GSIVLTNQDGYDWLDT-LKDGTGRYLLQPDPTNPTKKLLDGRP 319 (397) T ss_pred --------ccccccH---HHHHHHHhhccchhhhC--CCEEEEcHHHHHHHHH-hhccCCceeecccccCCCCcccccee Confidence 1122232 23333455677777774 4689999999877643 32222322 233455556799999 Q ss_pred eeeccC-CC-----CCeEEEecchHHhhhhhhhhhhhhccccce-eeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 234 LTEIKG-LP-----ASRMVGYNRDNIVIGMSAQSDFNEIRIKDM-GDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 234 iv~l~~-~P-----d~~ii~T~~sNl~~gvnl~~D~n~I~I~~~-~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++...+ +| +..++..+-++.+..+.- .+.. 
+++++- ...+-...+-+...+-+|+.+..++=||+=+=++ T Consensus 320 v~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~-~~~~-i~~~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~t~ 396 (397) T protein:vir:12 320 VVPFTNRVLKTQKGKAPLIIGNLKEAIVLFDR-EQQS-IASTDTGAGAFETNSTKVRGIEREDVRKWDEDAVVFGQITV 396 (397) T ss_pred eEEecccccccCCCccEEEEEehhceEEEEee-cceE-EEEeccccchhhcCceEEEEEEeeccEEecccceEEEEEee Confidence 876543 33 334677776665433211 1111 222211 1111233445555666688777777777766666 No 80 >protein:vir:80128 Length: 466 # NCBI annotation: Phage capsid protein # Family: family:all:635 # MgeID: mge:1877 # MgeName: bacteriophage bv1 # Cross-refs: genbank:acc:YP_001425603;genbank:gi:155042936;genbank:GeneID:5469556 Probab=97.12 E-value=2.4e-05 Score=45.92 Aligned_cols=273 Identities=13% Similarity=0.077 Sum_probs=137.0 Q ss_pred CceEeee--eccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccce Q lcl|NC_018271. 1 MATTVDI--TTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEV 70 (305) Q Consensus 1 ma~~~~~--~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~ 70 (305) .+..... .+.+ -..+.++|+..+..-..+.+. ++|.| ++... .++.... .-.+.|.+ .++. T Consensus 147 ~~~~~~~~~g~~~~vP~~~~~~i~~~l~~~~~l~~~--~~v~~-~~g~~-~~~~~~~-----~~~a~wv~E~~~~~~~~~ 217 (466) T protein:vir:80 147 LAQQKRAVSGAELTIPDVMLELLRDNMHRYSKLISK--VRLRP-LKGTA-RQNIAGA-----IPEGVWTEAVANLNELSL 217 (466) T ss_pred HhhhhhhhccccccccHHHHHHHHHhhhhhhhhhhh--eeeee-cCcee-EeeeecC-----Ccceeecccccccccccc Confidence 1111110 0111 123445555555555555544 55544 32221 1111110 01235544 2456 Q ss_pred EecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHH Q lcl|NC_018271. 71 DINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGIL 150 (305) Q Consensus 71 ~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~l 150 (305) .|.+..+.++++-+...++= + -+++-++..+.++...|+++++..++..++.||++.. 
--||+ T Consensus 218 ~f~~i~~~~~k~~~~~~iS~-e---------------ll~ds~~~l~~~i~~~la~~~~~~~~~ail~G~G~~~-P~Gil 280 (466) T protein:vir:80 218 SFSQIEVDGYKVGGFIPIPN-S---------------TLEDSDLNLADEILDAIGQAIGFALDKAILYGTGTKM-PVGIV 280 (466) T ss_pred cccceeecceeeeeehhhhH-H---------------HHhcchHHHHHHHHHHHHHHHHHHHhhheeeccCCCC-cceee Confidence 68888888888877655532 1 1234456778899999999999999999999999754 44888 Q ss_pred HHHhhccceEEeccCc---CcCChhhHH----------HHHHHHHHhc--cHHHHhCCCcEEEecHHHHHHHHHHHhhhh Q lcl|NC_018271. 151 PLLEADATVIDVVGAS---GGITAANVE----------AELGKFIDAH--TDEILQAPNHVFGVSTNVIRAIKRAYGTQA 215 (305) Q Consensus 151 k~i~~d~~~~~~~~~~---~~iT~anv~----------~~l~~~~~~i--P~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~ 215 (305) +.+........-.... ..++..+.. ..+.+....+ -+.-..+++....|+...+..+..-... . T Consensus 281 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~w~~~~~~~~~l~~~~~~-~ 359 (466) T protein:vir:80 281 TRLAQTTQPPNWGTKAPAWTNLSTTNLLKIDPTGKSAEEFFSELVLKLSKARANYSNGMKFWAMSSNTHAVLMSKAIT-F 359 (466) T ss_pred ecccccccccccccccccccccchhhhhhhhhhccchhhHHHHHHHHHHhhhccccCCceeEEecchhHHHhhccccc-c Confidence 7654322211100000 112211111 1111211111 0111123555678888887766333211 1 Q ss_pred ccC-Cc-ccCCCcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeee-ccceeEEEEEEeecceee Q lcl|NC_018271. 216 RSN-GT-FLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDV-DLSGQIRTKMVLSAGVEY 292 (305) Q Consensus 216 ~k~-~~-~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~-~~~~~~f~k~~m~~d~~i 292 (305) +.. .. ..-.+..++.|.+++.-..+|++.++..+.+..+++.+. .++|....-. ....+..|...+-.|... T Consensus 360 ~~~g~~~~~~~~~~~i~G~pvv~s~~~~~~~~~~g~~~~y~i~~r~-----~~~i~~~~~~~f~~d~~~~r~~~r~dg~~ 434 (466) T protein:vir:80 360 NSAGALVASLNNTMPIVGGDIVILDFIPDNDIIGGYGSLYLLAERA-----DIKLAQSEHVRFIEDQTVFKGTARYDGKP 434 (466) T ss_pred cCCccccccCCCcccccccceeecCccCccceeeeccccEEEEeec-----ceEEEechhhhhhcCcEEEEEEEEEccEE Confidence 111 11 111233468899999999999999988887776554331 2223211110 112334455555567777 Q ss_pred ccCCeEEEecCCC Q lcl|NC_018271. 
293 AYGAEIVLYTPAA 305 (305) Q Consensus 293 ~fg~E~v~~~~~~ 305 (305) ..++=||+-+=+. T Consensus 435 ~~~~afv~~~~~~ 447 (466) T protein:vir:80 435 VFGEGFVAVNIAN 447 (466) T ss_pred eccCceEEEEecC Confidence 7776666664332 No 81 >protein:vir:1383 Length: 421 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:314 # MgeName: phi3626 # Cross-refs: genbank:acc:NP_612835;genbank:gi:20065969;genbank:GeneID:935826 Probab=97.05 E-value=0.00015 Score=41.62 Aligned_cols=255 Identities=8% Similarity=0.014 Sum_probs=127.8 Q ss_pred CceEeeeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhh--ccccCCCC-CCCCccceEecce Q lcl|NC_018271. 1 MATTVDITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTT--DDFVDYSC-GFTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~--~~~q~~~~-~~~~~G~~~~~~K 75 (305) .|..-...+.| . -++..+|+..+..-..+.+ +++++| |+......+..... ...+.-.+ ...+.++.+|.+. T Consensus 114 ra~~t~~~gg~liP~~~~~~Ii~~~~~~~~l~~--l~~~~~-~~~~~~~~~~~~~~~~~~~~~~~E~~~~~~s~~~f~~i 190 (421) T protein:vir:13 114 RDIMSSTNNGAVIPQEFVNEFEKLKEGYPSLKE--HCHVIP-VNRNAGKMPVRAGASVDKLANLAKDTELVKAMLKTQPM 190 (421) T ss_pred hhccccCCcceecchhhHHHHHHHHHhhhhhhh--hceeee-ccCCceEEEEeecCCccceeeccccccccccccceeEE Confidence 11111111111 1 2333455555554444433 366665 32222222222111 11221111 2344567899999 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) .+.++++.+...++=. 
+ + ++-++..+.++.+.|.++++..++..+.+ ...|++ T Consensus 191 ~~~~~k~~~~v~iS~e-l--------l-------~ds~~~l~~~i~~~la~~~~~~~~~~i~~------~~~g~~----- 243 (421) T protein:vir:13 191 AYDIDDYGLLAPIDNS-L--------L-------EDSEINFLEFVNEEFAEFAVNTENAEIVK------QAKAVL----- 243 (421) T ss_pred EeeeeeeEeehhhhHH-H--------H-------hhhHHHHHHHHHHHHHHHHHHHhhhhHhh------hhhhcc----- Confidence 9999999988765432 2 1 22245567777788887777655543331 122322 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC--cccCCCcceeccee Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG--TFLNPNEFDFEGYT 233 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~--~~t~~~~~~~kGi~ 233 (305) ..++..+ .+.+.++..+++..+++ +-.++||...|-+.+ .++.-.|.+. +..++....+.|.+ T Consensus 244 --------~~~~~~~----~d~i~~~~~~l~~~~~~--~a~~v~n~~~~~~l~-~lkd~~G~~i~~~~~~~~~~tl~G~p 308 (421) T protein:vir:13 244 --------AEETIND----YAGLVKTINSLVPNARK--RAIIVTNSDGRAYLD-GLMDKQGRPLLKELSDGGDLVFKGRP 308 (421) T ss_pred --------ccccccc----hHHHHHHHHHhhhhhcC--CCEEEEcHHHHHHHH-HhhcCCCceeecCcCCCCCceeccee Confidence 1112222 34455566667777765 358999999987665 3433333222 22234445799999 Q ss_pred eeeccCCCCC-----eEEEecchH-Hhhhhhhhhhhhhccccceeee-ccceeEEEEEEeecceeeccCC---eEEEecC Q lcl|NC_018271. 234 LTEIKGLPAS-----RMVGYNRDN-IVIGMSAQSDFNEIRIKDMGDV-DLSGQIRTKMVLSAGVEYAYGA---EIVLYTP 303 (305) Q Consensus 234 iv~l~~~Pd~-----~ii~T~~sN-l~~gvnl~~D~n~I~I~~~~~~-~~~~~~f~k~~m~~d~~i~fg~---E~v~~~~ 303 (305) ++....+|.. .++..+-++ .+++. -..++|+..... +....+-+...+-+|+...-.+ -+++.+| T Consensus 309 V~~~~~~~~~~~~~~~~~~gd~~~~~~~~~-----~~~~~v~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 383 (421) T protein:vir:13 309 VIELEESIFDVGDETKFIVSDFKTLIKFMD-----RKQYLIDQSKEAGYTKNETIARIIERFDVNSPLDKSSDAEKIRKF 383 (421) T ss_pred eEEeccccccCCCceEEEEEeccccEEEEE-----ecceEEEeecccccccCeeEEEEEeeecceeecchhhheeeeccc Confidence 9988888743 356666665 33332 223333322111 1234455666666777764322 2344454 Q ss_pred CC Q lcl|NC_018271. 
304 AA 305 (305) Q Consensus 304 ~~ 305 (305) ++ T Consensus 384 ~a 385 (421) T protein:vir:13 384 GV 385 (421) T ss_pred ce Confidence 44 No 82 >protein:vir:1025 Length: 408 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:20 # MgeName: bIL286 # Cross-refs: genbank:acc:NP_076679;genbank:gi:13095788;genbank:GeneID:920362 Probab=97.05 E-value=0.00019 Score=40.94 Aligned_cols=255 Identities=6% Similarity=0.033 Sum_probs=136.5 Q ss_pred CceEeeeeccc-ch-hHHHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCCCCCCC-------ccc Q lcl|NC_018271. 1 MATTVDITTNY-VG-EVAGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYSCGFTP-------SGE 69 (305) Q Consensus 1 ma~~~~~~~~Y-~G-e~l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~~~~~~-------~G~ 69 (305) |.........| .. +...+|+..+.....+.+. ++++| +-..+. ..++....... +.|.+ .+. T Consensus 116 ~~~~t~~~gg~~vP~~~~~~Ii~~~~~~~~l~~~--~~~~~~~~~~~~~-~~~~~~~~~~~----a~~v~E~~~~~~~~~ 188 (408) T protein:vir:10 116 ETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQY--VRVESVSTSNGSR-VYEKWTDVTPL----TVMDAEDGKIPDLDN 188 (408) T ss_pred hhcccccCCceeccHhHHHHHHHHHHhhchhhhh--cceeeccCCcceE-EEeeccccccc----eeeecCccccccccC Confidence 22222211111 12 3334566556666665554 45444 222221 12222222111 23322 234 Q ss_pred eEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHH Q lcl|NC_018271. 70 VDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGI 149 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~ 149 (305) .+|.+..+.++++.+...++-+ ++. +-++..+.++...|.++++..++..++.|+++.. 
T Consensus 189 ~~~~~i~~~~~k~~~~~~iS~e---------ll~-------ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~~~----- 247 (408) T protein:vir:10 189 PQLTIIKYLIKRYAGIITATNT---------SLK-------DTAENILAWLSSWIAKKVVVTRNQAIIEVMKAAP----- 247 (408) T ss_pred cceeeEEeeeeeEEeeehhHHH---------HHh-------hchHHHHHHHHHHHHHHHHHHHHHHHhhcccccc----- Confidence 6899999999999988765432 222 2245677888899999999999999999988621 Q ss_pred HHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCc Q lcl|NC_018271. 150 LPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNE 226 (305) Q Consensus 150 lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~ 226 (305) +..+..| ..+.+..+...++..++.+ -.++||...|...+. ++.-.|.+ .+.+++.. T Consensus 248 --------------~~~~~~~---~~~l~~~~~~~~~~~~~~~--a~~v~n~~~~~~l~~-lkd~~G~~i~~~~~~~~~~ 307 (408) T protein:vir:10 248 --------------KKPTIAK---FDDVITMINTAVDPAIIAT--SSLLTNQSGLNKLAL-VKTAEGKYLLEPDPTKPNS 307 (408) T ss_pred --------------ccccccc---HHHHHHHHHHhhhhhhccC--CEEEEcHHHHHHHHH-hhccCCceEeccCcCCCCC Confidence 1112223 2333444556777888754 589999999877654 32222222 22345556 Q ss_pred ceecceeeeecc--CCCCC-----eEEEecchHHhhhhhhhhhhhhccccce-eeeccceeEEEEEEeecceeeccCCeE Q lcl|NC_018271. 227 FDFEGYTLTEIK--GLPAS-----RMVGYNRDNIVIGMSAQSDFNEIRIKDM-GDVDLSGQIRTKMVLSAGVEYAYGAEI 298 (305) Q Consensus 227 ~~~kGi~iv~l~--~~Pd~-----~ii~T~~sNl~~gvnl~~D~n~I~I~~~-~~~~~~~~~f~k~~m~~d~~i~fg~E~ 298 (305) ..+.|.+++... .+|+. .|+..+-++.++.++- .+.. +++++. ...+-...+-+...+-+|+.+..++=| T Consensus 308 ~~l~G~PV~~~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~-~~~~-v~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~ 385 (408) T protein:vir:10 308 YLIKGKQVIVVADRWLPNTGSTVYPLYYGDMSQAITLFDR-ENMS-LLPTNIGAGAFETDTTKIRVIDRFDVKATDSEAL 385 (408) T ss_pred ceecceeeEEecccccCccCCCceEEEEEehhccEEEEEe-cceE-EEEcccccchhhcCceEEEEEEeeccEEeccccE Confidence 689999887654 34542 3666665654321111 1111 222211 121223445555666678888877777 Q ss_pred EEec--CCC Q lcl|NC_018271. 
299 VLYT--PAA 305 (305) Q Consensus 299 v~~~--~~~ 305 (305) |+-+ |.+ T Consensus 386 ~~~~~~~~~ 394 (408) T protein:vir:10 386 VAGSFSAIA 394 (408) T ss_pred EEEEeeccc Confidence 7544 433 No 83 >protein:vir:105464 Length: 346 # NCBI annotation: putative phage major capsid protein # Family: family:all:701 # MgeID: mge:1502 # MgeName: KC5a # Cross-refs: genbank:acc:YP_529874;genbank:gi:90592614;genbank:GeneID:3974528 Probab=97.00 E-value=3.7e-05 Score=44.87 Aligned_cols=270 Identities=11% Similarity=0.079 Sum_probs=128.1 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCC--CCCCCccceEecceeee Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS--CGFTPSGEVDINEKQLT 78 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~~~~~G~~~~~~K~L~ 78 (305) ||. +-.+-|+.+..+.+-..+++...+.+.-...-+-...-++..+|++.+.-++.+|+ .++++-|+++.+..++. T Consensus 1 Mai--nya~~~~~~Ld~~~~~~~lts~~l~~~~~~~~v~~~ggktVkIp~is~tsGl~DY~R~~g~~~~g~v~~~~et~t 78 (346) T protein:vir:10 1 MTI--NYAEKYQAAVQQAFYDGHLYSAELWNSPSNSIIKFDGAKHIKVPRLEITSGRKDRQRRTITTPVANYSNDWDSYE 78 (346) T ss_pred Ccc--hhHHHHHHHHHHHHHhhhccchhhcccccccceEecCCCEEEEEEeeeecccccccccCCcccccccccceeEEE Confidence 884 43468888888887666777666654422211112235667788887667888885 58877676655555444 Q ss_pred --eeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 79 --LKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 79 --~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) -.|.- .|.+.|-|... +.+... ++.-.++..+... ...-.+--|.-+.+...+. 
T Consensus 79 l~qDR~~-~F~vD~mDvDE-------------------Tn~~~~---~anv~~ef~r~~v-vPEiDayrfskLa~~a~~~ 134 (346) T protein:vir:10 79 LKNERYW-STLVDPSDIDE-------------------TNMVVS---LANITKQFNLDSK-MPEKDRYMFSHLYSGKEAA 134 (346) T ss_pred eeccccc-eecccccchHH-------------------HHHHhH---HHHHHHHHHHHhh-cchhhHHHHHHHHHhhhhh Confidence 32221 12222222111 111111 1111111111111 1111111133333222211 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc-----cCCCcceecc Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF-----LNPNEFDFEG 231 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~-----t~~~~~~~kG 231 (305) ... ...+.++|++|++++++++..++.+.--...++.+|||+.++...+++-+ ..+..+. .+++-..+.| T Consensus 135 ~~~---~~~~~a~T~~ni~~~i~~~~~~lde~~vp~~~rvl~vTp~~~~lLk~s~~--f~k~~~v~~~~~i~~~V~siDG 209 (346) T protein:vir:10 135 HDG---GITTNTLDEKNILPAFDNMMLDFDEARIPSTNRILYVTPKTNAILKRAEA--MNRALTLKDPNNIQRTVYSLDD 209 (346) T ss_pred ccc---cccccccCHHHHHHHHHHHHHHHHHccCCCCCeEEEECHHHHHHHhhchh--heeccccccccccceeeeeecC Confidence 111 11234689999999999999999886333456999999999998865441 2222222 2344457889 Q ss_pred eeeeeccCCCCCeEE-----Ee------c--chHHhhhhh----hhhhhhhccccceeeeccceeEEEEEEeecceeecc Q lcl|NC_018271. 232 YTLTEIKGLPASRMV-----GY------N--RDNIVIGMS----AQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAY 294 (305) Q Consensus 232 i~iv~l~~~Pd~~ii-----~T------~--~sNl~~gvn----l~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~f 294 (305) ++|+.| |++++- .. . .=|+++.-. ...=.+.++|-.=.+ +-++.|..+...=.|.-+.= T Consensus 210 v~Ii~V---Ps~r~~t~~~f~~G~~~~t~ak~INfiiv~~~A~ia~~K~~~~~if~P~~-~~~g~~l~~~R~Y~D~fv~~ 285 (346) T protein:vir:10 210 VTIRVV---PSDLMQTAYDFSDGSKIIDTAKQIEMFLIYNGVQIAPEKYSFVGFDQPSA-ATSGNYLYYEQSYDDVLLLN 285 (346) T ss_pred eEEEEc---chhhcccchhhccCccccCCccceeEEEECCceeeeeeeeeeeEeeCCCC-Ccccceeeeeeeeeeeeeec Confidence 998876 545442 00 0 013321100 001112222221122 22233555444444544421 Q ss_pred CCe---EEEec--CCC Q lcl|NC_018271. 
295 GAE---IVLYT--PAA 305 (305) Q Consensus 295 g~E---~v~~~--~~~ 305 (305) ... .|-++ |+. T Consensus 286 nk~~~Iyv~~~~a~~~ 301 (346) T protein:vir:10 286 TKTKGIQFVVSDKPKK 301 (346) T ss_pred cccceEEEeeeccccc Confidence 111 11121 211 No 84 >protein:vir:9643 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:173 # MgeName: 315.1 # Cross-refs: genbank:acc:NP_795405;genbank:gi:28876178;genbank:GeneID:1257724 Probab=96.96 E-value=5.9e-05 Score=43.75 Aligned_cols=268 Identities=10% Similarity=0.007 Sum_probs=148.7 Q ss_pred CceEeeeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCC-------CCccceE Q lcl|NC_018271. 1 MATTVDITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGF-------TPSGEVD 71 (305) Q Consensus 1 ma~~~~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~-------~~~G~~~ 71 (305) +...-.....| . -+..++|+..+..-..+.+. ++|.| ++.. +.+++..... .+.| .+..+.+ T Consensus 79 ~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~i~~~--~~v~~-~~~~-~~i~~~~~~~-----~a~wv~e~~~~~~~~~~~ 149 (377) T protein:vir:96 79 DKNVGGKDKFKLLPEETMVQVFDDLVAEHPLLKV--INFKN-TSLR-LKALTAETSG-----TAVWGDIFGEIKGQLKQA 149 (377) T ss_pred HhcCCCCCCceecCHHHHHHHHHHHHhhhhhhhh--ceeEe-cCCc-eEEEEecCCc-----ceeEeecccccccccCcc Confidence 22111111122 1 12445555555444555554 55543 5443 2233322221 2234 2234678 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) |.+..|.++++-++..++ +++ +++=++..+.++...|.++++.-++..+++||++. 
.--||++ T Consensus 150 f~~i~l~~~kl~~~~~is-~~l---------------l~ds~~~le~~i~~~l~~~~~~~~~~a~i~G~G~~-~P~Gil~ 212 (377) T protein:vir:96 150 FKEQDFSQFKLTAFVVIP-KDA---------------LKFGPKWLKQFITEQLKEAIAVALELAIVKGNGLL-QPVGLLK 212 (377) T ss_pred ceeEeeeeeeEEeechhh-HHH---------------hhcchhhHHHHHHHHHHHHHHHHHhhceEeccCCC-cceeeee Confidence 999999999998876643 222 12335667789999999999999999999999964 4668887 Q ss_pred HHhhccceE-------Eec------cCcCcCChhhHHHHHHHHHHhccHH-----HHhCCCcEEEecHHHHHHHHHHHhh Q lcl|NC_018271. 152 LLEADATVI-------DVV------GASGGITAANVEAELGKFIDAHTDE-----ILQAPNHVFGVSTNVIRAIKRAYGT 213 (305) Q Consensus 152 ~i~~d~~~~-------~~~------~~~~~iT~anv~~~l~~~~~~iP~~-----~r~~~~l~~f~S~~~~d~Y~d~~~~ 213 (305) .+....... -++ +..+.+++..+.+.+.++.+.+... .+..++..++|+..++-.-.-.|.- T Consensus 213 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~~~~~~~~ 292 (377) T protein:vir:96 213 DLSQPTVDQSTGRDITTYKTDKEAIADLSDLDPDTAVELLVPVMKHLSVNDKKHPLKIAGQVKLLLNPEDRWTLEAKFTS 292 (377) T ss_pred ccccccccccccccccceeeccccccccccCChhHHHHHHHHHHHhhccccccccccccCceEEEEchhhHHhccccccc Confidence 653322110 000 0112244455666666665554321 1123567899998765321111211 Q ss_pred hhccCCcccCCCcceecc--eeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecce Q lcl|NC_018271. 214 QARSNGTFLNPNEFDFEG--YTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGV 290 (305) Q Consensus 214 ~~~k~~~~t~~~~~~~kG--i~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~ 290 (305) + ..++.+....| ++++.-..+|++.|+..+-++.+++. -..++|+... .-....+.-|...+-+|- T Consensus 293 ~------~~~G~~~~~l~~p~~v~~s~~~p~~~i~fgdf~~Y~i~~-----r~~~~i~~~~~~~~~~d~~~f~~~~r~dG 361 (377) T protein:vir:96 293 R------NQFGEYVTVLPHGITILESLAVETGKAIAFVANRYDAFM-----ATASTIEEYDQTFAMEDLQLYLTKNYFYG 361 (377) T ss_pred c------CCCCCceeccCCCceEEecCCCCcccEEEEEcCcEEEEE-----ecccEEEeehhhhhhcCCeEEEEEEEEcC Confidence 1 12233333444 45555677899989999888865443 3345554332 112234455667777788 Q ss_pred eeccCCeEEEecCCC Q lcl|NC_018271. 
291 EYAYGAEIVLYTPAA 305 (305) Q Consensus 291 ~i~fg~E~v~~~~~~ 305 (305) ...-++=+|+++=+- T Consensus 362 ~~~d~~a~~vl~l~~ 376 (377) T protein:vir:96 362 KAKDNHTAALLTLAG 376 (377) T ss_pred EEecCCcEEEEEEec Confidence 888888888888887 No 85 >protein:vir:98635 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:1601 # MgeName: phi3396 # Cross-refs: genbank:acc:YP_001039923;genbank:gi:126011098;genbank:GeneID:4818471 Probab=96.94 E-value=2.9e-05 Score=45.43 Aligned_cols=269 Identities=12% Similarity=-0.016 Sum_probs=149.4 Q ss_pred CceEeeeeccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCC-------CccceE Q lcl|NC_018271. 1 MATTVDITTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFT-------PSGEVD 71 (305) Q Consensus 1 ma~~~~~~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~-------~~G~~~ 71 (305) +.........| --++.++|+..+..-..+.+. ++|.| ++... .+++...... +.|. +..+.+ T Consensus 79 ~~~~~~~~gg~~vP~~~~~~I~~~l~~~s~i~~~--~~v~~-~~~~~-~~~~~~~~~~-----a~w~~e~~~~~~~~~~~ 149 (377) T protein:vir:98 79 DKNVGGKDKFKLLPEETMVQVFDDLVAEHPLLKV--INFKN-TSLRL-KALTAETSGT-----AVWGDIFGEIKGQLKQA 149 (377) T ss_pred HhccCCCCCccccCHHHHHHHHHHHHHhhhhhhh--eeeEe-cCcce-EEEEecCCcc-----eeEeecccccCcccCcc Confidence 22222212222 133555666555555566655 66544 44443 2333222111 2342 234567 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 
72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) |.+..|.++++-++..++ +++ +++=++..+.++.+.|+++++.-++..+++||++ +.--||++ T Consensus 150 f~~i~l~~~kl~a~~~is-~el---------------L~ds~~~ie~~i~~~la~~~a~~~~~a~i~G~G~-~qP~Gil~ 212 (377) T protein:vir:98 150 FKEQDFSQFKLTAFVVIP-KDA---------------LKFGPKWIKQFITEQLKEAIAVALELAIVKGDGL-LQPVGLLK 212 (377) T ss_pred ceeEeecceeEEeeeccc-HHh---------------hhccHhHHHHHHHHHHHHHHHHHHhhceEeccCC-Ccceeeee Confidence 888899999988887653 221 1233566788999999999999999999999996 45778887 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC------------- Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN------------- 218 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~------------- 218 (305) .+........ ....+.|.....+.+.++...+|..+|++ ..+.|+....... +.+..-.|.+ T Consensus 213 ~~~~~~~~~~--~~~~~~~~~~~~~~~~~l~~~~~~~~~~~--a~~~m~~~t~~~~-~klkd~~G~~i~~~n~~~~~~~~ 287 (377) T protein:vir:98 213 DLSQPTVDQS--TGRDITTYKTDKEAIADLSDLTPDNAPKK--LVPVMKHLSVNDK-KRPLKIAGQVKLILNPEDRWALE 287 (377) T ss_pred cccccccccc--cccccccccchhhhHhhhhhhchhHHHHH--HHHHHHHHHHHHH-hhhhccCCceEEEecccchhhcc Confidence 6433222111 11112222223345666667777776653 3444544443222 1111111111 Q ss_pred ---Ccc-cCCCcceecce--eeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeeccee Q lcl|NC_018271. 219 ---GTF-LNPNEFDFEGY--TLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVE 291 (305) Q Consensus 219 ---~~~-t~~~~~~~kGi--~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~ 291 (305) ... .++.+....|+ +++.-..+|++.|+..+-++.+++.+ ..++|+... ......+..|..++-+|.. 
T Consensus 288 p~~~~~~~~G~~~t~lg~p~~vv~s~~~p~~~i~fgdf~~Y~i~~r-----~~~~i~~~~~~~~~~d~~~f~~~~r~dg~ 362 (377) T protein:vir:98 288 AQFTSRNQFGEYVTVLPHGITILESLAVETGKAIAFVANRYDAFMA-----TASTIEEYDQTFAMEDLQLYLTKNYFYGK 362 (377) T ss_pred ccccccCCCCccccccCCCceEEecCCCCcccEEEEEecceeEEee-----cceEEEeechhhhhcCceEEEEEEEEcCE Confidence 001 11222234444 45555678999999999888765543 234454332 1122344667777788888 Q ss_pred eccCCeEEEecCCC Q lcl|NC_018271. 292 YAYGAEIVLYTPAA 305 (305) Q Consensus 292 i~fg~E~v~~~~~~ 305 (305) ..-++=+|+++=+. T Consensus 363 ~~~~~a~~vl~i~~ 376 (377) T protein:vir:98 363 AKDNHTAALLTLAG 376 (377) T ss_pred EeccCcEEEEEEec Confidence 88888899998888 No 86 >protein:vir:102605 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1661 # MgeName: Llij # Cross-refs: genbank:acc:YP_655002;genbank:gi:109392192;genbank:GeneID:4157227 Probab=96.93 E-value=0.00024 Score=40.43 Aligned_cols=262 Identities=14% Similarity=0.116 Sum_probs=138.7 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCC--CCCCCccceEecceeee Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS--CGFTPSGEVDINEKQLT 78 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~~~~~G~~~~~~K~L~ 78 (305) ||+.+.+..=|++++++.+=-....+ .+++...-- -+.+.++..+|+...... .+|. .....-.+..-++..+. T Consensus 1 MA~~~~~pe~~~~~v~~~~~~~lv~~-~l~~~~~~~--~~~~Gdtv~ip~~~~~~~-~d~~~~~~~~~~~~~~~~~~~~t 76 (273) T protein:vir:10 1 MAFNNFIPELWSDMLLEEWTAQTVFA-NLVNREYEG--TASKGNVVHIAGVVAPTV-KDYKAAGRQTSADAISDTGVDLL 76 (273) T ss_pred CcchhhhHHHHHHHHHHHHHhhhccc-hhhcccccc--ccccCceEEEeecccccc-cccccCCCccCccccccceEEEE Confidence 99976655679999998854444443 355443211 123345555665443332 3332 12222234455666666 Q ss_pred eeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhccc Q lcl|NC_018271. 
79 LKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADAT 158 (305) Q Consensus 79 ~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~~ 158 (305) ..+++..-.... |.+.. +.. +++ + . .++..+..++.++..++. ..+..... T Consensus 77 id~~~~~~~~i~-d~d~~----~~~-~~~--~-------~-~~~~~~~alA~~vD~~i~-------------~~~~~a~~ 127 (273) T protein:vir:10 77 IDQEKSIDFLVD-DIDRV----QVA-GSL--E-------A-YTRAGATALATDTDKFIA-------------DMLVDNGT 127 (273) T ss_pred EeeeeecceEee-cHHHh----hhh-ccH--H-------H-HHHHHHHHHHHHHHHHHH-------------HHHhcccc Confidence 655443332222 44433 222 221 2 1 223334445555543221 12222222 Q ss_pred eEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHH--HH-HhhhhccCC--cccCCCcceeccee Q lcl|NC_018271. 159 VIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIK--RA-YGTQARSNG--TFLNPNEFDFEGYT 233 (305) Q Consensus 159 ~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~--d~-~~~~~~k~~--~~t~~~~~~~kGi~ 233 (305) . + ..+..+|+.|+.+.|.++...+.+.--...+-.++||+..|.... +. .......+. ...++..+++.|+. T Consensus 128 ~--~-~~~~~~~~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~L~~~~~~~~~~~~~~~~~~l~~G~ig~i~G~~ 204 (273) T protein:vir:10 128 A--L-TGSAPTDADDAFDLIAKALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGLRAGTIGNLLGAR 204 (273) T ss_pred c--c-ccccccchhHHHHHHHHHHHHhhhcCCCcCCCEEEECHHHHHHHhcchhhhhhhhccccccceeeeeeeEEeceE Confidence 1 1 123456788888888888888776532233568899999888763 23 322222222 22355667899999 Q ss_pred eeeccCCCCC---eEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 234 LTEIKGLPAS---RMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 234 iv~l~~~Pd~---~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |+.-..+|.+ .+++..++-+.++. . . ++++..+..+-. ...|... 
|-||.-+-.++-+|+.+.++ T Consensus 205 v~~s~~lp~~~~~~~~~~~~~A~~~a~-q-~--~~~e~~r~~~~~-~~~v~~~--~~yg~~v~~~~~~~~l~~~g 272 (273) T protein:vir:10 205 IVESNNLRDTDDEQFVAFHPSAAAYVS-Q-I--DTVEALRDQDSF-SDRIRAL--HVYGGKVVRPTGVVVFNKTG 272 (273) T ss_pred EEEecccccCCccEEEEEeccceeeee-e-e--ehhhcccCCCcc-eeeeeee--eeeeeeEeccceEEEEeccC Confidence 9998888864 46666666654433 2 1 233444343322 2234443 44566666788888888877 No 87 >protein:vir:105822 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1636 # MgeName: PMC # Cross-refs: genbank:acc:YP_655767;genbank:gi:109522090;genbank:GeneID:4157630 Probab=96.93 E-value=0.00024 Score=40.43 Aligned_cols=262 Identities=14% Similarity=0.116 Sum_probs=138.7 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCC--CCCCCccceEecceeee Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS--CGFTPSGEVDINEKQLT 78 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~~~~~G~~~~~~K~L~ 78 (305) ||+.+.+..=|++++++.+=-....+ .+++...-- -+.+.++..+|+...... .+|. .....-.+..-++..+. T Consensus 1 MA~~~~~pe~~~~~v~~~~~~~lv~~-~l~~~~~~~--~~~~Gdtv~ip~~~~~~~-~d~~~~~~~~~~~~~~~~~~~~t 76 (273) T protein:vir:10 1 MAFNNFIPELWSDMLLEEWTAQTVFA-NLVNREYEG--TASKGNVVHIAGVVAPTV-KDYKAAGRQTSADAISDTGVDLL 76 (273) T ss_pred CcchhhhHHHHHHHHHHHHHhhhccc-hhhcccccc--ccccCceEEEeecccccc-cccccCCCccCccccccceEEEE Confidence 99976655679999998854444443 355443211 123345555665443332 3332 12222234455666666 Q ss_pred eeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhccc Q lcl|NC_018271. 79 LKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADAT 158 (305) Q Consensus 79 ~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~~ 158 (305) ..+++..-.... |.+.. +.. +++ + . .++..+..++.++..++. ..+..... 
T Consensus 77 id~~~~~~~~i~-d~d~~----~~~-~~~--~-------~-~~~~~~~alA~~vD~~i~-------------~~~~~a~~ 127 (273) T protein:vir:10 77 IDQEKSIDFLVD-DIDRV----QVA-GSL--E-------A-YTRAGATALATDTDKFIA-------------DMLVDNGT 127 (273) T ss_pred EeeeeecceEee-cHHHh----hhh-ccH--H-------H-HHHHHHHHHHHHHHHHHH-------------HHHhcccc Confidence 655443332222 44433 222 221 2 1 223334445555543221 12222222 Q ss_pred eEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHH--HH-HhhhhccCC--cccCCCcceeccee Q lcl|NC_018271. 159 VIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIK--RA-YGTQARSNG--TFLNPNEFDFEGYT 233 (305) Q Consensus 159 ~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~--d~-~~~~~~k~~--~~t~~~~~~~kGi~ 233 (305) . + ..+..+|+.|+.+.|.++...+.+.--...+-.++||+..|.... +. .......+. ...++..+++.|+. T Consensus 128 ~--~-~~~~~~~~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~L~~~~~~~~~~~~~~~~~~l~~G~ig~i~G~~ 204 (273) T protein:vir:10 128 A--L-TGSAPTDADDAFDLIAKALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGLRAGTIGNLLGAR 204 (273) T ss_pred c--c-ccccccchhHHHHHHHHHHHHhhhcCCCcCCCEEEECHHHHHHHhcchhhhhhhhccccccceeeeeeeEEeceE Confidence 1 1 123456788888888888888776532233568899999888763 23 322222222 22355667899999 Q ss_pred eeeccCCCCC---eEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 234 LTEIKGLPAS---RMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 234 iv~l~~~Pd~---~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |+.-..+|.+ .+++..++-+.++. . . ++++..+..+-. ...|... 
|-||.-+-.++-+|+.+.++ T Consensus 205 v~~s~~lp~~~~~~~~~~~~~A~~~a~-q-~--~~~e~~r~~~~~-~~~v~~~--~~yg~~v~~~~~~~~l~~~g 272 (273) T protein:vir:10 205 IVESNNLRDTDDEQFVAFHPSAAAYVS-Q-I--DTVEALRDQDSF-SDRIRAL--HVYGGKVVRPTGVVVFNKTG 272 (273) T ss_pred EEEecccccCCccEEEEEeccceeeee-e-e--ehhhcccCCCcc-eeeeeee--eeeeeeEeccceEEEEeccC Confidence 9998888864 46666666654433 2 1 233444343322 2234443 44566666788888888877 No 88 >protein:vir:78090 Length: 302 # NCBI annotation: Cps # Family: family:all:701 # MgeID: mge:1844 # MgeName: P35 # Cross-refs: genbank:acc:YP_001468790;genbank:gi:157325371;genbank:GeneID:5601852 Probab=96.93 E-value=0.00012 Score=42.09 Aligned_cols=265 Identities=10% Similarity=0.104 Sum_probs=126.5 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccc-hhcCceEEecCCCCcccccchhhhh----ccccCCC--CCCCCccceEec Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANT-ISDNLIRVIPNVPENNLFLRRMNTT----DDFVDYS--CGFTPSGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~-v~~g~I~v~~~v~~~~~~~~~~~~~----~~~q~~~--~~~~~~G~~~~~ 73 (305) ||++++--+-|.++..+.+-..+.+ +.+ .+-+.|+ .-| -++..+|++.+. -+|.+|+ .+|. .|.++.+ T Consensus 1 Mantl~ya~~~~~~Ld~~~~~~~~t-~~l~~~~~~v~-~~G--ak~vkIp~is~~~~~TsGl~dy~R~~g~~-~g~v~~~ 75 (302) T protein:vir:78 1 MANSLALAQIYQDNIDKAIAVNSKS-AFLEANPNNVQ-YNG--GNTIKIADISFGSGTTGDLKAYNRSTGFT-QGSVTLA 75 (302) T ss_pred CCchhHHHHHHHHHHHHHHHhhhce-eecccCCceEE-Eec--CcEEEEEEEEeeccccccccccccccCcc-ccceeee Confidence 9999876678888877766555544 454 4555666 233 466778888863 4777775 4665 4777554 Q ss_pred cee--eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 74 EKQ--LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 74 ~K~--L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) ..+ |.-.|.- .|.+.|.|. .-+-+..... 
.+-.|.........-++--|.-+.+ T Consensus 76 ~et~tlt~DR~~-~f~vD~mDv-------------------dETn~~~~~a----ni~~ef~r~~vvPEiDayrfskla~ 131 (302) T protein:vir:78 76 WSDYTLDYDLAQ-SFQIDAMDV-------------------DETKNLATVG----NVLSEYQRTKIVPAIDKYRFTKLAN 131 (302) T ss_pred eeeEEeeeccce-eeeccccch-------------------hhhhhhhHHH----HHHHHHHHhhhcchhhHHHHHHHHH Confidence 443 3333221 111222221 1111111111 1112222122222222333554444 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCc-------ccCC Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGT-------FLNP 224 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~-------~t~~ 224 (305) ...+.....+.+ ..++|.+|+++.++.+...+.+ +.++.+|||+.++...+++- ...+..+ ..+. T Consensus 132 ~a~~~~~~~~~~--~~~~t~~nvl~~i~~~~~~~~e----~~~~vl~vtp~~~~~Lk~a~--~~~~~~~~~~~~~~~i~~ 203 (302) T protein:vir:78 132 DGTGVGGVIDLS--KPDASAQALMGDIATAMELVDD----SNQLILVTSPTTLAGLLNTA--LIRESKNTQVLRRGEVDT 203 (302) T ss_pred hhhccCcccccc--ccchhHHHHHHHHHHHHHHhhc----cCCeEEEEChHHHHHHhcch--hhccceeccccccccccc Confidence 433322222211 2357899999999988765555 57899999999999988642 1111111 1234 Q ss_pred CcceecceeeeeccCCCCCeEEEe-------------cchHHhhhhh----hhhhhhhccccceeeeccceeEEEEEEee Q lcl|NC_018271. 225 NEFDFEGYTLTEIKGLPASRMVGY-------------NRDNIVIGMS----AQSDFNEIRIKDMGDVDLSGQIRTKMVLS 287 (305) Q Consensus 225 ~~~~~kGi~iv~l~~~Pd~~ii~T-------------~~sNl~~gvn----l~~D~n~I~I~~~~~~~~~~~~f~k~~m~ 287 (305) +-..+.|++|+.| |++|+-.. ..=|+++--. ...=.+.++|-.=.--+-+..|..+...= T Consensus 204 ~V~~lDgv~Ii~V---Ps~r~~t~~~f~~G~~~~~~ak~INfiiv~~~a~ia~~K~~~~~if~P~~~~~gd~~l~~~R~Y 280 (302) T protein:vir:78 204 KITFIQDVEVLQV---PSEYLYDKVAPKVGVPDYTGAKKIPYMIFKRDAPTGIVKTDKVRVFEPDTNQSADAYKVDLRLY 280 (302) T ss_pred eeeeecccEEEEc---hhhhcccceeccCCccccCCccceeEEEECCCeeeeeeeeeeeEeeCCCCCCCcceeeeeeeeE Confidence 4557999998877 55654211 0113321100 00011222221111001111234433333 Q ss_pred cceee-ccCCeEEEecCCC Q lcl|NC_018271. 
288 AGVEY-AYGAEIVLYTPAA 305 (305) Q Consensus 288 ~d~~i-~fg~E~v~~~~~~ 305 (305) .|.-+ --...-++.++.+ T Consensus 281 ~D~fV~~nk~~gI~~~~~~ 299 (302) T protein:vir:78 281 HDLIVPKNQRPGIIKASFG 299 (302) T ss_pred eeeeeeccccCeEEEeecc Confidence 34333 2222344455444 No 89 >protein:vir:102873 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1492 # MgeName: Cherry # Cross-refs: genbank:acc:YP_338137;genbank:gi:77020198;genbank:GeneID:3703782 Probab=96.83 E-value=0.00031 Score=39.80 Aligned_cols=257 Identities=7% Similarity=0.001 Sum_probs=129.7 Q ss_pred CceEeeeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC--CCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC--GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~--~~~~~G~~~~~~K~ 76 (305) |+..-.-...| . .+...+|+..+.....+.+..-+..+++-..+.. +++...+..+.--.+ .....+..+|++.. T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~-~~~~~~~~~a~~v~E~~~~~~~~~~~~~~v~ 184 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRV-LEKNSDMIPFAEITEMGEIPETDNPKFSNVQ 184 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEE-EEeecCCccceeecccccccccccccceeEE Confidence 22221111111 1 1233445545555555555433333333333322 222222111111111 11223456899999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) |.++++.+...++-.=+. +-.+..+.++.+.|.+.++..++..++.|+++.. 
T Consensus 185 l~~~k~~~~~~iS~ell~----------------ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~~~------------ 236 (392) T protein:vir:10 185 YAVKDRAGILPLSRSLLQ----------------DSDQNILKYVTKWLGKKSKVTRNVLILGVIEKLT------------ 236 (392) T ss_pred eeeeeEEEeehhhHHHHh----------------hhHHHHHHHHHHHHHHHHHHHHHHHHhhcccccc------------ Confidence 999999988877643221 1134566788899999999999999998887621 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcceeccee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFDFEGYT 233 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~~kGi~ 233 (305) .++.+|..++++. +...++..++. +-.++||...|...+- ++.-.|.+ .+.+++....+.|.+ T Consensus 237 --------~~~~~~~d~i~~~---~~~~l~~~~~~--~a~~vm~~~~~~~L~~-lkd~~G~~l~~~~~~~~~~~tllG~~ 302 (392) T protein:vir:10 237 --------KQAIKSLDDIKDV---LNVKLDPAISP--NAILLTNQDGFNYLDK-LKDKDGKYILQSDPTQKNKKLFAGTN 302 (392) T ss_pred --------ccCccCHHHHHHH---HHHhhhhhhcc--CCEEEEcHHHHHHHHH-hhccCCCeEeecCccCCccccccCcc Confidence 1222333333332 33467777764 4689999999877643 33222222 123344455688876 Q ss_pred eeec-cCC-C--------CCeEEEecchHHh-hhhhhhhhhhhcc--cccee-eeccceeEEEEEEeecceeeccCCeEE Q lcl|NC_018271. 234 LTEI-KGL-P--------ASRMVGYNRDNIV-IGMSAQSDFNEIR--IKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIV 299 (305) Q Consensus 234 iv~l-~~~-P--------d~~ii~T~~sNl~-~gvnl~~D~n~I~--I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v 299 (305) +|-. .+. | +..++..+-++.+ ++. -..+. +++.. ...-+..+-+...+-.|+.+.-++=|| T Consensus 303 ~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-----~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~ 377 (392) T protein:vir:10 303 PVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-----REDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAV 377 (392) T ss_pred cEEEecccccCCCcccCCceEEEEEehhceEEEEe-----ecceEEEEeccccchhhcCceEEEEEEeeccEEecccceE Confidence 4432 222 2 2234444444432 111 11222 22211 111123344566666688887777777 Q ss_pred E--ecCCC Q lcl|NC_018271. 
300 L--YTPAA 305 (305) Q Consensus 300 ~--~~~~~ 305 (305) . .+|++ T Consensus 378 ~l~~~~~a 385 (392) T protein:vir:10 378 YGEIDLSA 385 (392) T ss_pred EEEecccc Confidence 7 55666 No 90 >protein:vir:105004 Length: 392 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:1490 # MgeName: W Beta # Cross-refs: genbank:acc:YP_459969;genbank:gi:85701384;genbank:GeneID:3882145 Probab=96.83 E-value=0.00031 Score=39.80 Aligned_cols=257 Identities=7% Similarity=0.001 Sum_probs=129.7 Q ss_pred CceEeeeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC--CCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC--GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~--~~~~~G~~~~~~K~ 76 (305) |+..-.-...| . .+...+|+..+.....+.+..-+..+++-..+.. +++...+..+.--.+ .....+..+|++.. T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~-~~~~~~~~~a~~v~E~~~~~~~~~~~~~~v~ 184 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRV-LEKNSDMIPFAEITEMGEIPETDNPKFSNVQ 184 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEE-EEeecCCccceeecccccccccccccceeEE Confidence 22221111111 1 1233445545555555555433333333333322 222222111111111 11223456899999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) |.++++.+...++-.=+. +-.+..+.++.+.|.+.++..++..++.|+++.. 
T Consensus 185 l~~~k~~~~~~iS~ell~----------------ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~~~------------ 236 (392) T protein:vir:10 185 YAVKDRAGILPLSRSLLQ----------------DSDQNILKYVTKWLGKKSKVTRNVLILGVIEKLT------------ 236 (392) T ss_pred eeeeeEEEeehhhHHHHh----------------hhHHHHHHHHHHHHHHHHHHHHHHHHhhcccccc------------ Confidence 999999988877643221 1134566788899999999999999998887621 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcceeccee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFDFEGYT 233 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~~kGi~ 233 (305) .++.+|..++++. +...++..++. +-.++||...|...+- ++.-.|.+ .+.+++....+.|.+ T Consensus 237 --------~~~~~~~d~i~~~---~~~~l~~~~~~--~a~~vm~~~~~~~L~~-lkd~~G~~l~~~~~~~~~~~tllG~~ 302 (392) T protein:vir:10 237 --------KQAIKSLDDIKDV---LNVKLDPAISP--NAILLTNQDGFNYLDK-LKDKDGKYILQSDPTQKNKKLFAGTN 302 (392) T ss_pred --------ccCccCHHHHHHH---HHHhhhhhhcc--CCEEEEcHHHHHHHHH-hhccCCCeEeecCccCCccccccCcc Confidence 1222333333332 33467777764 4689999999877643 33222222 123344455688876 Q ss_pred eeec-cCC-C--------CCeEEEecchHHh-hhhhhhhhhhhcc--cccee-eeccceeEEEEEEeecceeeccCCeEE Q lcl|NC_018271. 234 LTEI-KGL-P--------ASRMVGYNRDNIV-IGMSAQSDFNEIR--IKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIV 299 (305) Q Consensus 234 iv~l-~~~-P--------d~~ii~T~~sNl~-~gvnl~~D~n~I~--I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v 299 (305) +|-. .+. | +..++..+-++.+ ++. -..+. +++.. ...-+..+-+...+-.|+.+.-++=|| T Consensus 303 ~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-----~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~ 377 (392) T protein:vir:10 303 PVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-----REDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAV 377 (392) T ss_pred cEEEecccccCCCcccCCceEEEEEehhceEEEEe-----ecceEEEEeccccchhhcCceEEEEEEeeccEEecccceE Confidence 4432 222 2 2234444444432 111 11222 22211 111123344566666688887777777 Q ss_pred E--ecCCC Q lcl|NC_018271. 
300 L--YTPAA 305 (305) Q Consensus 300 ~--~~~~~ 305 (305) . .+|++ T Consensus 378 ~l~~~~~a 385 (392) T protein:vir:10 378 YGEIDLSA 385 (392) T ss_pred EEEecccc Confidence 7 55666 No 91 >protein:vir:107593 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1491 # MgeName: Gamma # Cross-refs: genbank:acc:YP_338188;genbank:gi:77020144;genbank:GeneID:3703724 Probab=96.83 E-value=0.00031 Score=39.80 Aligned_cols=257 Identities=7% Similarity=0.001 Sum_probs=129.7 Q ss_pred CceEeeeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC--CCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC--GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~--~~~~~G~~~~~~K~ 76 (305) |+..-.-...| . .+...+|+..+.....+.+..-+..+++-..+.. +++...+..+.--.+ .....+..+|++.. T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~-~~~~~~~~~a~~v~E~~~~~~~~~~~~~~v~ 184 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRV-LEKNSDMIPFAEITEMGEIPETDNPKFSNVQ 184 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEE-EEeecCCccceeecccccccccccccceeEE Confidence 22221111111 1 1233445545555555555433333333333322 222222111111111 11223456899999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) |.++++.+...++-.=+. +-.+..+.++.+.|.+.++..++..++.|+++.. 
T Consensus 185 l~~~k~~~~~~iS~ell~----------------ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~~~------------ 236 (392) T protein:vir:10 185 YAVKDRAGILPLSRSLLQ----------------DSDQNILKYVTKWLGKKSKVTRNVLILGVIEKLT------------ 236 (392) T ss_pred eeeeeEEEeehhhHHHHh----------------hhHHHHHHHHHHHHHHHHHHHHHHHHhhcccccc------------ Confidence 999999988877643221 1134566788899999999999999998887621 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcceeccee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFDFEGYT 233 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~~kGi~ 233 (305) .++.+|..++++. +...++..++. +-.++||...|...+- ++.-.|.+ .+.+++....+.|.+ T Consensus 237 --------~~~~~~~d~i~~~---~~~~l~~~~~~--~a~~vm~~~~~~~L~~-lkd~~G~~l~~~~~~~~~~~tllG~~ 302 (392) T protein:vir:10 237 --------KQAIKSLDDIKDV---LNVKLDPAISP--NAILLTNQDGFNYLDK-LKDKDGKYILQSDPTQKNKKLFAGTN 302 (392) T ss_pred --------ccCccCHHHHHHH---HHHhhhhhhcc--CCEEEEcHHHHHHHHH-hhccCCCeEeecCccCCccccccCcc Confidence 1222333333332 33467777764 4689999999877643 33222222 123344455688876 Q ss_pred eeec-cCC-C--------CCeEEEecchHHh-hhhhhhhhhhhcc--cccee-eeccceeEEEEEEeecceeeccCCeEE Q lcl|NC_018271. 234 LTEI-KGL-P--------ASRMVGYNRDNIV-IGMSAQSDFNEIR--IKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIV 299 (305) Q Consensus 234 iv~l-~~~-P--------d~~ii~T~~sNl~-~gvnl~~D~n~I~--I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v 299 (305) +|-. .+. | +..++..+-++.+ ++. -..+. +++.. ...-+..+-+...+-.|+.+.-++=|| T Consensus 303 ~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-----~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~ 377 (392) T protein:vir:10 303 PVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-----REDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAV 377 (392) T ss_pred cEEEecccccCCCcccCCceEEEEEehhceEEEEe-----ecceEEEEeccccchhhcCceEEEEEEeeccEEecccceE Confidence 4432 222 2 2234444444432 111 11222 22211 111123344566666688887777777 Q ss_pred E--ecCCC Q lcl|NC_018271. 
300 L--YTPAA 305 (305) Q Consensus 300 ~--~~~~~ 305 (305) . .+|++ T Consensus 378 ~l~~~~~a 385 (392) T protein:vir:10 378 YGEIDLSA 385 (392) T ss_pred EEEecccc Confidence 7 55666 No 92 >protein:vir:102082 Length: 392 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1503 # MgeName: Fah # Cross-refs: genbank:acc:YP_512315;genbank:gi:89152484;genbank:GeneID:3953075 Probab=96.83 E-value=0.00031 Score=39.80 Aligned_cols=257 Identities=7% Similarity=0.001 Sum_probs=129.7 Q ss_pred CceEeeeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCC--CCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC--GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~--~~~~~G~~~~~~K~ 76 (305) |+..-.-...| . .+...+|+..+.....+.+..-+..+++-..+.. +++...+..+.--.+ .....+..+|++.. T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~-~~~~~~~~~a~~v~E~~~~~~~~~~~~~~v~ 184 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRV-LEKNSDMIPFAEITEMGEIPETDNPKFSNVQ 184 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEE-EEeecCCccceeecccccccccccccceeEE Confidence 22221111111 1 1233445545555555555433333333333322 222222111111111 11223456899999 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) |.++++.+...++-.=+. +-.+..+.++.+.|.+.++..++..++.|+++.. 
T Consensus 185 l~~~k~~~~~~iS~ell~----------------ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~~~------------ 236 (392) T protein:vir:10 185 YAVKDRAGILPLSRSLLQ----------------DSDQNILKYVTKWLGKKSKVTRNVLILGVIEKLT------------ 236 (392) T ss_pred eeeeeEEEeehhhHHHHh----------------hhHHHHHHHHHHHHHHHHHHHHHHHHhhcccccc------------ Confidence 999999988877643221 1134566788899999999999999998887621 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcceeccee Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFDFEGYT 233 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~~kGi~ 233 (305) .++.+|..++++. +...++..++. +-.++||...|...+- ++.-.|.+ .+.+++....+.|.+ T Consensus 237 --------~~~~~~~d~i~~~---~~~~l~~~~~~--~a~~vm~~~~~~~L~~-lkd~~G~~l~~~~~~~~~~~tllG~~ 302 (392) T protein:vir:10 237 --------KQAIKSLDDIKDV---LNVKLDPAISP--NAILLTNQDGFNYLDK-LKDKDGKYILQSDPTQKNKKLFAGTN 302 (392) T ss_pred --------ccCccCHHHHHHH---HHHhhhhhhcc--CCEEEEcHHHHHHHHH-hhccCCCeEeecCccCCccccccCcc Confidence 1222333333332 33467777764 4689999999877643 33222222 123344455688876 Q ss_pred eeec-cCC-C--------CCeEEEecchHHh-hhhhhhhhhhhcc--cccee-eeccceeEEEEEEeecceeeccCCeEE Q lcl|NC_018271. 234 LTEI-KGL-P--------ASRMVGYNRDNIV-IGMSAQSDFNEIR--IKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIV 299 (305) Q Consensus 234 iv~l-~~~-P--------d~~ii~T~~sNl~-~gvnl~~D~n~I~--I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v 299 (305) +|-. .+. | +..++..+-++.+ ++. -..+. +++.. ...-+..+-+...+-.|+.+.-++=|| T Consensus 303 ~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-----~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~ 377 (392) T protein:vir:10 303 PVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-----REDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAV 377 (392) T ss_pred cEEEecccccCCCcccCCceEEEEEehhceEEEEe-----ecceEEEEeccccchhhcCceEEEEEEeeccEEecccceE Confidence 4432 222 2 2234444444432 111 11222 22211 111123344566666688887777777 Q ss_pred E--ecCCC Q lcl|NC_018271. 
300 L--YTPAA 305 (305) Q Consensus 300 ~--~~~~~ 305 (305) . .+|++ T Consensus 378 ~l~~~~~a 385 (392) T protein:vir:10 378 YGEIDLSA 385 (392) T ss_pred EEEecccc Confidence 7 55666 No 93 >protein:vir:101607 Length: 379 # NCBI annotation: major capsid protein precursor # Family: family:all:585 # MgeID: mge:1646 # MgeName: 11b # Cross-refs: genbank:acc:YP_112497;genbank:gi:53793597;uniprot:Q5ZGF6;genbank:GeneID:3101715 Probab=96.75 E-value=0.00036 Score=39.45 Aligned_cols=258 Identities=12% Similarity=0.059 Sum_probs=134.9 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEecc Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDINE 74 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~~~ 74 (305) +.+..+...---.++..+|+..+..-..+.+ +++++|--... ...++...... -...|.+ .++.+|.+ T Consensus 109 ~~~~~~~~~~ip~~~~~~ii~~~~~~~~i~~--~~~~~~~~~~~-~~~~~~~~~~~---~~~~~v~Eg~~~~~~~~~f~~ 182 (379) T protein:vir:10 109 MTLPVNLTGAQPKDYNFDVVLNPSQMLNVSD--IVGAVSISGGT-YTFVRENGAGE---GAIGAQVEGATKGQKDYDISM 182 (379) T ss_pred cccCCCCccccchhhhhHHHHhHHhhhhHHh--hceeeeccCCc-eEEEEeecCCC---cccccccCCccccccccceee Confidence 2222221111112233344444433333332 25555543222 22332221111 1123322 34679999 Q ss_pred eeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHh Q lcl|NC_018271. 75 KQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLE 154 (305) Q Consensus 75 K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~ 154 (305) ..+.++++-+...++- ++ + ++-| ..+.++...|.+.++..++..+..|+++.+. 
T Consensus 183 i~~~~~k~~~~~~iS~-el--------l-------~D~~-~l~~~i~~~la~~~~~~~~~~~~~g~~~~~~--------- 236 (379) T protein:vir:10 183 IDVNTDFIAGFTRYSK-KM--------A-------NNLP-FLTSFIPNALRRDYAKAENAAFNAVLAANAT--------- 236 (379) T ss_pred eEeeeeeEEeeehhhH-HH--------H-------hhHH-HHHHHHHHHHHHHHHHHHHHHHhcccccccc--------- Confidence 9999999999876542 22 1 2223 4677888888888998888888888765210 Q ss_pred hccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC-----cccCCCccee Q lcl|NC_018271. 155 ADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG-----TFLNPNEFDF 229 (305) Q Consensus 155 ~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~-----~~t~~~~~~~ 229 (305) .. .. +.+....++.+.+...++....+++. .++||...|.+.+- ++.-.|.+. ....+....+ T Consensus 237 ---~~--~~----~~~~~~~~d~i~~~~~~~~~~~~~~~--~~vmn~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~~~~l 304 (379) T protein:vir:10 237 ---AS--TE----IITNKNKVEMLINEIAKQENLDFPVT--AIVLRPTDYYDILV-TQKSVGAGYGLPGVVTQDNGVLRI 304 (379) T ss_pred ---cc--cc----cccCcccHHHHHHHHHhhhhccCCCC--EEEEcHHHHHHHHH-hhccCCceeccCCccCCCCCccee Confidence 00 01 11112223444444444544445432 69999998866532 322222221 1122334579 Q ss_pred cceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 230 EGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 230 kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) .|++++.-..+|.+.++..+-+...+.++- +. .|++.+.. 
..+-.....|...+-+|+.+..++=||.=+=++ T Consensus 305 ~G~pvv~s~~~~ag~~~~gdf~~~~~~~~~--~~-~i~~~~~~~~~f~~~~~~~r~~~R~~~~v~~p~a~v~~~~~~ 378 (379) T protein:vir:10 305 NGIPLFRATWLAANKYYVGDWTRVTKVTTE--GL-SLEFSEVEGTNFVKNNITARIEAQVALAVEQPAALIFGDFTA 378 (379) T ss_pred cceeeEecCCCCCCceEEeecccEEEEEEe--ce-EEEEeecccccccCCcEEEEEEEEeccEEecCccEEEEEecC Confidence 999999999999998888776664322211 11 12222111 112344566667777788888887787755555 No 94 >protein:vir:93881 Length: 387 # NCBI annotation: ORF011 # Family: family:all:658 # MgeID: mge:1485 # MgeName: 3A # Cross-refs: genbank:acc:YP_239938;genbank:gi:66395599;genbank:GeneID:5130947 Probab=96.68 E-value=0.00041 Score=39.13 Aligned_cols=254 Identities=9% Similarity=-0.006 Sum_probs=134.7 Q ss_pred CceEeeeeccc--chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 1 MATTVDITTNY--VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~~~Y--~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) |.........| --++..+|+..+..-..+.+ +++|.+ +..- ..|+...+ .-.+.|.. .++.+| T Consensus 118 l~~~t~s~gG~~IP~~~~~~Ii~~~~~~~~l~~--~~~v~~-~~~~--~~p~~~~~----~~~a~~v~E~~~~~~~~~~f 188 (387) T protein:vir:93 118 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLRE--KARLTN-IKGL--EIPRVSYT----LDDDDFITDVETAKELKLKG 188 (387) T ss_pred hccCcCCCCceeechhHHHHHHHHHHhhchhhh--heeeee-cCCc--eEEEEeec----CCccccccCccccccccccc Confidence 22222111222 12334556666655555543 355543 2211 12322111 11134433 246789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhh-hhhhccccCCccchhHHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARK-IDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~e-i~~~~~~GD~s~~~fdG~lk 151 (305) ++..+.++++-++..++= +++. +=.+..+.++.+.|+++++.. ....+..|+++- ...|++. 
T Consensus 189 ~~v~~~~~k~~~~~~iS~---------ell~-------Ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~g-~p~g~l~ 251 (387) T protein:vir:93 189 DTVKFTTNKFKVFAAISD---------TVIH-------GSDVDLVNWVENALQSGLAAKERKDALAVSPKSG-LDHMSFY 251 (387) T ss_pred ceeeeeheeeeeechhhH---------HHHh-------hhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCcc-ccceeee Confidence 999999999988766651 1222 112445677888888887644 444555676552 2344332 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecc Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEG 231 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kG 231 (305) - .. ....|.++..+.+.+++.+++..+|++ -.++|+...|.....-+.+. +.....+....+.| T Consensus 252 ~----~~-------~~~v~~~~~~d~i~~~~~~l~~~~~~~--a~~~mn~~t~~~~~~~~~d~---~~~~~~~~~~~llG 315 (387) T protein:vir:93 252 N----GS-------VKEVEGADMYDAIINALADLHEDYRDN--ATIYMRYADYVKIISVLSNG---TTNFFDTPAEKVFG 315 (387) T ss_pred c----cc-------cccccccchHHHHHHHHhccChhhhcC--CEEEEechHHHHHHHHHhcC---CCcccccCCccccc Confidence 1 11 112233445677888999999999864 47899988877766655432 23333444557889 Q ss_pred eeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEe---cCCC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLY---TPAA 305 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~---~~~~ 305 (305) .+++-..++|+ ++..+=+..+.++ +.+.+++.... 
..+.+.|-...-.|..+..++=||+= +|++ T Consensus 316 ~PV~~~~~~~~--~~~GDf~~~~~~~------~~~~~~~~~~~-~~~~~~~~~~~r~d~~v~~~eA~~~l~~k~~~~ 383 (387) T protein:vir:93 316 KPVVFTDAAVK--PIVGDFNYFGINY------DGTTYDTDKDV-KKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTG 383 (387) T ss_pred cceEEecCCCc--eeeeehhhhheeh------hhheeeecccc-cCCceeEEEEeeeCceeechhheEEEEeecCCC Confidence 99998888775 4556655554332 22333333221 12333333344557777666666544 2222 No 95 >protein:vir:9361 Length: 402 # NCBI annotation: SLT orf 37-like protein # Family: family:all:658 # MgeID: mge:166 # MgeName: phi 12 # Cross-refs: genbank:acc:NP_803339;genbank:gi:29028650;genbank:GeneID:1258088 Probab=96.67 E-value=0.00042 Score=39.09 Aligned_cols=253 Identities=9% Similarity=0.018 Sum_probs=136.5 Q ss_pred CceEeeeeccc-chh-HHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 1 MATTVDITTNY-VGE-VAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~~~Y-~Ge-~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) |-+.......| .++ ...+|+..+..-..+.+ +++|++ ++. ...|+..... -++.|.+ .++.+| T Consensus 133 ~~~~t~~~GG~lIP~~~~~~Ii~~~~~~~~l~~--~~~v~~-~~~--~~~p~~~~~~----~~a~~v~Eg~~~~~~~~~f 203 (402) T protein:vir:93 133 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLRE--KARLTN-IKG--LEIPRVSYTL----DDDDFITDVETAKELKAKG 203 (402) T ss_pred hccCCCcCCccccchhHHHHHHHhHHhhhhhhh--hceeee-cCC--ceeeeeeccC----Ccccccccccccccccccc Confidence 22212212223 122 33445555555555543 345433 221 1123322111 1134533 246789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhh-hhhhccccCCccchhHHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARK-IDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~e-i~~~~~~GD~s~~~fdG~lk 151 (305) ++..+.++++-++..++-+ .+. +=.+.++.++.+.|+++++.. .+..+..|+++ +...|++. 
T Consensus 204 ~~i~~~~~k~~~~i~iS~e---------ll~-------Ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~-g~p~g~~~ 266 (402) T protein:vir:93 204 DTVKFTTNKFKVFAAISDT---------VIH-------GSDVDLVNWVENALQSGLAAKERKDALAVSPKS-GLEHMSFY 266 (402) T ss_pred ceeeecceeeeeechhhHH---------HHh-------hhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCc-cccceeee Confidence 9999999999887766522 111 223456678888888887653 44455667665 22334332 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecc Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEG 231 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kG 231 (305) - . ..+..|.++..+.+.++..+++..+|++ -+++|+...+.....-+.+ .+.....+....+.| T Consensus 267 ~----~-------~~~~~~~~~~~d~l~~~~~~l~~~y~~n--a~~imn~~t~~~~~~~~~d---~~~~~~~~~~~~llG 330 (402) T protein:vir:93 267 N----G-------SVKEVEGADMYDAIINALADLHEDYRDN--ATIYMRYADYVKIISVLSN---GTTNFFDTPAEKVFG 330 (402) T ss_pred c----c-------ccccccccchHHHHHHHHhccChhhhcC--CEEEEechHHHHHHHHHhc---CCCcccccCCccccc Confidence 1 1 1112344455778888999999999864 4799998887766555432 333444445557889 Q ss_pred eeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEe--------cC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLY--------TP 303 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~--------~~ 303 (305) .+++....+|+ ++..+=++.+.+++ .+.++.... -..+.+.|....-.|-.+..++=|++= || T Consensus 331 ~PV~~t~~~~~--i~~GDf~~~~~~~~------~~~~~~~~~-~~~~~~~~~~~~r~Dg~v~~~~A~~~l~ik~~~~~~~ 401 (402) T protein:vir:93 331 KPVVFTDAAVK--PIVGDFNYFGINYD------GTTYDTDKD-VKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGPLP 401 (402) T ss_pred cceEEecCCCc--eeeechhhhhhhhh------hhhhhhhhc-ccCCceEEEEEEEeCcEEechhheEEEEeecCCCCCC Confidence 99998888775 45566666554332 223333222 123455555555567777665555543 23 Q ss_pred C Q lcl|NC_018271. 
304 A 304 (305) Q Consensus 304 ~ 304 (305) + T Consensus 402 ~ 402 (402) T protein:vir:93 402 S 402 (402) T ss_pred C Confidence 3 No 96 >protein:vir:101650 Length: 497 # NCBI annotation: gp13 # Family: family:all:585 # MgeID: mge:1515 # MgeName: 244 # Cross-refs: genbank:acc:YP_654768;genbank:gi:109302766;genbank:GeneID:4156084 Probab=96.58 E-value=0.00028 Score=40.07 Aligned_cols=275 Identities=12% Similarity=0.099 Sum_probs=136.4 Q ss_pred CceEeeee--cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 1 MATTVDIT--TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~--~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) |+...+-. ...-.++..+|+..+.....+. .+++++|--... ...++...+. -.+.|.+ -++.+| T Consensus 151 ~~~~~~~~gg~~vp~~~~~~ii~~~~~~~~i~--~l~~~~~~~~~~-~~~~~~~~~~----~~a~wv~E~~~~~~s~~~f 223 (497) T protein:vir:10 151 NPFGSTGTFAPGILPTFLPGIVEQLFYELSLA--DLISSRPVTSPN-LSYLTESAAH----NNAAAVAEAGTYPFSSEEF 223 (497) T ss_pred hhcccCcccccccchhhhHHHHHHHHhhhhHH--hhccccccCCCc-eEEEEEcCCC----CcceeeccCcccccccccc Confidence 32222211 1233445556665554443333 335555533222 2233322221 1233432 356789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPL 152 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~ 152 (305) .+..+.++++-+...++ +++ + ++-| .++.++...|.+.++..++..++.||++. ...||++. 
T Consensus 224 ~~i~~~~~k~a~~~~iS-~el-------------l--~d~~-~l~~~i~~~l~~~i~~~~d~~~l~G~G~~-~p~Gil~~ 285 (497) T protein:vir:10 224 ARVYEQVGKVANALTIT-DEG-------------L--RDAP-ELFNFVQGRLLEGIQRKEEVQLLAGGGYP-GVNGLLQR 285 (497) T ss_pred eeeEeeeeeeEeecHhH-HHH-------------H--HhHH-HHHHHHHHHHHHHHHHHHHHHhhcCCCcc-cccccccc Confidence 99999999998876543 333 1 2224 36778889999999999999999999874 47777765 Q ss_pred HhhccceE---------------Eecc--CcCcCChhh---------------------------HHHHHHHHHHhccH- Q lcl|NC_018271. 153 LEADATVI---------------DVVG--ASGGITAAN---------------------------VEAELGKFIDAHTD- 187 (305) Q Consensus 153 i~~d~~~~---------------~~~~--~~~~iT~an---------------------------v~~~l~~~~~~iP~- 187 (305) ........ +... ....+.... ....+..+.++++. T Consensus 286 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 365 (497) T protein:vir:10 286 STGFTASSASSLFGATSATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAFVDI 365 (497) T ss_pred cccccccccccchhhhhhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHHhhh Confidence 42211100 0000 000000000 11122222333222 Q ss_pred --HHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc---------cCCCcceecceeeeeccCCCCCeEEEecchHHhhh Q lcl|NC_018271. 188 --EILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF---------LNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIG 256 (305) Q Consensus 188 --~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~---------t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~g 256 (305) .....++ .+.||...|...+ -+..-.|.+.-. .......+.|++++-...||.+.++..+=+...+ T Consensus 366 ~~~~~~~~~-~~vmn~~~~~~l~-~lkd~~G~~i~~~~~~~~~~~~~~~~~~l~G~pV~~t~~~~~~~~~~Gd~~~~~~- 442 (497) T protein:vir:10 366 QLTLFQTPN-AVVMNPRDWELLR-LTKDANGQYMGGNFFGNAYGNPVNGGKNIWGVPVVTTPLIPLGTILVGHFAPSVI- 442 (497) T ss_pred hhhcccCCC-eEEEchHHHHHHH-HhhcCCCceeccCcccccccccccCCceeeceeeEecCCCCCCceEEeecccceE- Confidence 2222333 6889988887754 232222222110 0011226889999999999988766544322111 Q ss_pred hhhhhhhhhccccce---eeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 
257 MSAQSDFNEIRIKDM---GDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 257 vnl~~D~n~I~I~~~---~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++-|-..+.|+-. ...+-+...-|...+-.|+.+..++=||+=+-.+ T Consensus 443 --~i~~r~~~~v~~~~~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~ 492 (497) T protein:vir:10 443 --QTARREGVTMQMTNSNGTDFVDGKVTVRAEERLGLLVYRPSAFQLIQLKK 492 (497) T ss_pred --EEEEecccEEEeecccchhhhcCcEEEEEEEeecceeeccccEEEEEecC Confidence 1111122222210 0111123344555566688888888888877666 No 97 >protein:vir:7855 Length: 497 # NCBI annotation: gp12 # Family: family:all:585 # MgeID: mge:150 # MgeName: CJW1 # Cross-refs: genbank:acc:NP_817462;genbank:gi:29565891;genbank:GeneID:1259081 Probab=96.58 E-value=0.00028 Score=40.07 Aligned_cols=275 Identities=12% Similarity=0.099 Sum_probs=136.4 Q ss_pred CceEeeee--cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 1 MATTVDIT--TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~--~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) |+...+-. ...-.++..+|+..+.....+. .+++++|--... ...++...+. -.+.|.+ -++.+| T Consensus 151 ~~~~~~~~gg~~vp~~~~~~ii~~~~~~~~i~--~l~~~~~~~~~~-~~~~~~~~~~----~~a~wv~E~~~~~~s~~~f 223 (497) T protein:vir:78 151 NPFGSTGTFAPGILPTFLPGIVEQLFYELSLA--DLISSRPVTSPN-LSYLTESAAH----NNAAAVAEAGTYPFSSEEF 223 (497) T ss_pred hhcccCcccccccchhhhHHHHHHHHhhhhHH--hhccccccCCCc-eEEEEEcCCC----CcceeeccCcccccccccc Confidence 32222211 1233445556665554443333 335555533222 2233322221 1233432 356789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPL 152 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~ 152 (305) .+..+.++++-+...++ +++ + ++-| .++.++...|.+.++..++..++.||++. ...||++. 
T Consensus 224 ~~i~~~~~k~a~~~~iS-~el-------------l--~d~~-~l~~~i~~~l~~~i~~~~d~~~l~G~G~~-~p~Gil~~ 285 (497) T protein:vir:78 224 ARVYEQVGKVANALTIT-DEG-------------L--RDAP-ELFNFVQGRLLEGIQRKEEVQLLAGGGYP-GVNGLLQR 285 (497) T ss_pred eeeEeeeeeeEeecHhH-HHH-------------H--HhHH-HHHHHHHHHHHHHHHHHHHHHhhcCCCcc-cccccccc Confidence 99999999998876543 333 1 2224 36778889999999999999999999874 47777765 Q ss_pred HhhccceE---------------Eecc--CcCcCChhh---------------------------HHHHHHHHHHhccH- Q lcl|NC_018271. 153 LEADATVI---------------DVVG--ASGGITAAN---------------------------VEAELGKFIDAHTD- 187 (305) Q Consensus 153 i~~d~~~~---------------~~~~--~~~~iT~an---------------------------v~~~l~~~~~~iP~- 187 (305) ........ +... ....+.... ....+..+.++++. T Consensus 286 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 365 (497) T protein:vir:78 286 STGFTASSASSLFGATSATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAFVDI 365 (497) T ss_pred cccccccccccchhhhhhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHHhhh Confidence 42211100 0000 000000000 11122222333222 Q ss_pred --HHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc---------cCCCcceecceeeeeccCCCCCeEEEecchHHhhh Q lcl|NC_018271. 188 --EILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF---------LNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIG 256 (305) Q Consensus 188 --~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~---------t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~g 256 (305) .....++ .+.||...|...+ -+..-.|.+.-. .......+.|++++-...||.+.++..+=+...+ T Consensus 366 ~~~~~~~~~-~~vmn~~~~~~l~-~lkd~~G~~i~~~~~~~~~~~~~~~~~~l~G~pV~~t~~~~~~~~~~Gd~~~~~~- 442 (497) T protein:vir:78 366 QLTLFQTPN-AVVMNPRDWELLR-LTKDANGQYMGGNFFGNAYGNPVNGGKNIWGVPVVTTPLIPLGTILVGHFAPSVI- 442 (497) T ss_pred hhhcccCCC-eEEEchHHHHHHH-HhhcCCCceeccCcccccccccccCCceeeceeeEecCCCCCCceEEeecccceE- Confidence 2222333 6889988887754 232222222110 0011226889999999999988766544322111 Q ss_pred hhhhhhhhhccccce---eeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 
257 MSAQSDFNEIRIKDM---GDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 257 vnl~~D~n~I~I~~~---~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++-|-..+.|+-. ...+-+...-|...+-.|+.+..++=||+=+-.+ T Consensus 443 --~i~~r~~~~v~~~~~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~ 492 (497) T protein:vir:78 443 --QTARREGVTMQMTNSNGTDFVDGKVTVRAEERLGLLVYRPSAFQLIQLKK 492 (497) T ss_pred --EEEEecccEEEeecccchhhhcCcEEEEEEEeecceeeccccEEEEEecC Confidence 1111122222210 0111123344555566688888888888877666 No 98 >protein:vir:81160 Length: 371 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:1892 # MgeName: Geobacillus virus E2 # Cross-refs: genbank:acc:YP_001285811;genbank:gi:148747732;genbank:GeneID:5247203 Probab=96.49 E-value=0.00058 Score=38.35 Aligned_cols=256 Identities=10% Similarity=-0.000 Sum_probs=136.1 Q ss_pred CceEeeeecc--cchhHHHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCCC-CCCC-ccceEecc Q lcl|NC_018271. 1 MATTVDITTN--YVGEVAGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYSC-GFTP-SGEVDINE 74 (305) Q Consensus 1 ma~~~~~~~~--Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~~-~~~~-~G~~~~~~ 74 (305) |.......+. =--++..+|+..+.....+.+. ++++| +-..... +++........--.+ .-.+ .+..+|.+ T Consensus 91 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~i~~~--~~~~~~~~~~~~~~-~~~~~~~~~a~~v~Eg~~~~~~~~~~f~~ 167 (371) T protein:vir:81 91 MSEGSNQDGGYTVPQDIQTRINELRESKDALQNL--ITVEPVTTLSGSRV-FKKRSQQTGFVEVAEGAAIGEKATPQFTL 167 (371) T ss_pred hccCCCccCceeecHhHHHHHHHHHHhhhhhhhh--ceeeeccCCceeEE-EEeecCCcceeeeccccccccccccceee Confidence 3222211110 1113344555555555555544 34333 3222211 221111111111111 1112 24578999 Q ss_pred eeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHh Q lcl|NC_018271. 75 KQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLE 154 (305) Q Consensus 75 K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~ 154 (305) ..+.++++-+...++-.-+ ++-++..+.++.+.|.++++..++..++.|+++.. 
T Consensus 168 i~~~~~k~~~~~~iS~ell----------------~ds~~~l~~~i~~~l~~a~~~~~~~~i~~g~g~~~---------- 221 (371) T protein:vir:81 168 LQYQVKKYAGFFRVTNELL----------------NDSTEAIVNTLVRWIGDESRVTRNGLIINVLNTKA---------- 221 (371) T ss_pred EEeeeeEEEEeehhhHHHH----------------hhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccccc---------- Confidence 9999999999876543321 22345677889999999999999999999987621 Q ss_pred hccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcceecc Q lcl|NC_018271. 155 ADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFDFEG 231 (305) Q Consensus 155 ~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~~kG 231 (305) ..+..+.. +....+...++..++. +-.++||...|.+.+- ++.-.|.+ .+.+.+....+.| T Consensus 222 ----------~~~~~~~~---~i~~~~~~~l~~~~~~--~a~~vmn~~~~~~L~~-lkd~~g~~l~~~~~~~~~~~~l~G 285 (371) T protein:vir:81 222 ----------KTAIADLD---GLKQIINVQLDPVFRS--TSSVIVNQDAFNWLDT-LKDQNGQYLLQPSISSPTGRQLLG 285 (371) T ss_pred ----------ccccccHH---HHHHHHHhhcchhhhc--CCEEEEcHHHHHHHHH-hhccCCCeeeecccCCCCCceecc Confidence 11122222 2223344567777774 3589999999877553 32222221 1234444567999 Q ss_pred eeeeeccCCCCC------------eEEEecchHHhhhhhhhhhhhhccccce---eeeccceeEEEEEEeecceeeccCC Q lcl|NC_018271. 232 YTLTEIKGLPAS------------RMVGYNRDNIVIGMSAQSDFNEIRIKDM---GDVDLSGQIRTKMVLSAGVEYAYGA 296 (305) Q Consensus 232 i~iv~l~~~Pd~------------~ii~T~~sNl~~gvnl~~D~n~I~I~~~---~~~~~~~~~f~k~~m~~d~~i~fg~ 296 (305) .+++....+|.+ .|+..+=++.+. +-|-..+.++.. .+.+-+...-+...+-+|+.+..++ T Consensus 286 ~pV~~~~~~~~~~~~~~~~~~~~~~i~~Gd~~~~~~----~~~~~~~~i~~~~~~~~~f~~~~v~~~~~~r~d~~~~~~~ 361 (371) T protein:vir:81 286 LPVVIVSNKVLANRVDGGTGAQFAPIIVGDLKEAVV----MFDRQRTEIMSSNVAMDAFETDATLWRAIERMDVKMRDDE 361 (371) T ss_pred eeEEEecccccCccccccccCCcceEEEEehhceEE----EEeecceEEEEeccccchhhcCceEEEEEEeeccEEeccc Confidence 999988877633 233333333211 111122333211 1112244556666777788888888 Q ss_pred eEEEecCCC Q lcl|NC_018271. 
297 EIVLYTPAA 305 (305) Q Consensus 297 E~v~~~~~~ 305 (305) =+|+-+-++ T Consensus 362 a~~~~~~~~ 370 (371) T protein:vir:81 362 AFVFGEVQL 370 (371) T ss_pred ceEEEEEec Confidence 888877666 No 99 >protein:vir:1084 Length: 437 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:21 # MgeName: bIL309 # Cross-refs: genbank:acc:NP_076738;genbank:gi:13095848;genbank:GeneID:920418 Probab=96.49 E-value=0.00058 Score=38.35 Aligned_cols=256 Identities=14% Similarity=0.017 Sum_probs=122.0 Q ss_pred CceEeeeeccc-chhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccC-CCC-CCCC-ccceEeccee Q lcl|NC_018271. 1 MATTVDITTNY-VGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVD-YSC-GFTP-SGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y-~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~-~~~-~~~~-~G~~~~~~K~ 76 (305) ++..-.....| ..+.....+-....-..+.. +++++|.-+.. ...+..+....... ... +..+ .++..|.+.. T Consensus 156 ~~~~~~~~~g~lvp~~~~~~i~~~~~~~~l~~--~~~~~~~~~~~-~~~~~~~~~~~~~~~~~e~~~~~e~~~~~~~~v~ 232 (437) T protein:vir:10 156 VTGIALKDGKVIIPETILTPEKEVHQFPRLGS--LVRTESVTTTT-GKLPIFNNSTDLLTAHTEYGQTTKNATPVITPIL 232 (437) T ss_pred hhhcccccccccchHHHHHHHHHhhhhhhhhh--cceeEeeccCc-eeeEEeeccccccccccccccccccccccceeee Confidence 22111111222 12233333333322222222 25555432222 11222222211111 111 1112 2456788888 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) +.++++-+...++-. + + ++-++..+.++...|.++++..++..++.|+++.. 
T Consensus 233 ~~~~k~~~~~~is~e-l--------l-------~ds~~~~~~~i~~~l~~~~~~~~~~~i~~g~g~~~------------ 284 (437) T protein:vir:10 233 WDLKTYTGGYVFSQE-L--------I-------SDSSYDWQAELQSRLIELRDNTDDSLIITALTDGI------------ 284 (437) T ss_pred eehhheeeehhhhHH-H--------H-------hhhHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc------------ Confidence 888888776655432 1 1 22245677788899999999999999999987621 Q ss_pred cceEEeccCcCcCChhhHHHHHHH-HHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcceecce Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGK-FIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFDFEGY 232 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~-~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~~kGi 232 (305) +....+.+.. .+.+ +..+++..++.+ -.++||...|...+. ++.-.|.+ .+.+++....+.|. T Consensus 285 ------~~~~~~~~~~----~~~~~~~~~l~~~~~~~--~~~~~~~~~~~~l~~-lkd~~g~~~~~~~~~~~~~~~l~G~ 351 (437) T protein:vir:10 285 ------KKTTSTYLLG----DLKKVLNVTLKPQDSAA--ASIVMSQSAYNLFDM-ATDAMGRPLLQPNVTAATGYTLLGK 351 (437) T ss_pred ------cccccccchh----hHHHHHHhhhhhhhhcC--CEEEEcHHHHHHHHH-hhccCCCeeeccCccCCCCcccccc Confidence 1111222222 2333 334788888864 389999999876533 33222222 22344555679999 Q ss_pred eeeeccCC--CC-----CeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEec--- Q lcl|NC_018271. 233 TLTEIKGL--PA-----SRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYT--- 302 (305) Q Consensus 233 ~iv~l~~~--Pd-----~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~--- 302 (305) +++...++ |. ..++..+=++.+..++ -..+.+.--...+....++ ...+-+|+.+.-++=||+-| T Consensus 352 pv~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~----r~~~~~~~~~~~~~~~~~~-~~~~r~d~~~~~~~a~~~l~~~~ 426 (437) T protein:vir:10 352 TVVIVDDKLFPSASAGDVNIVVAPLKKAVINFK----LTEITGQFQDTYDIWYKQL-GIFLRQNVVQASKDLIVNLTGKL 426 (437) T ss_pred eeEEecccccCCcCCCceEEEEeeccccEEEEe----eeceEEEEeccccccccee-eEEEEEccEEecccceEEEEeec Confidence 98876543 42 2366666555432221 1122221000111111111 11233477776666666654 Q ss_pred CCC Q lcl|NC_018271. 303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) |+. 
T Consensus 427 ~~~ 429 (437) T protein:vir:10 427 KAV 429 (437) T ss_pred ccc Confidence 222 No 100 >protein:vir:2685 Length: 387 # NCBI annotation: hypothetical protein # Family: family:all:658 # MgeID: mge:57 # MgeName: phiSLT # Cross-refs: genbank:acc:NP_075504;genbank:gi:12719433;genbank:GeneID:920169 Probab=96.44 E-value=0.00063 Score=38.15 Aligned_cols=254 Identities=9% Similarity=0.002 Sum_probs=138.1 Q ss_pred CceEeeeecccc-h-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 1 MATTVDITTNYV-G-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~~~Y~-G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) |.+.......|. . +...+|+..+..-..+.+. ++|.| +.. ...|+..... -.+.|.+ .++.+| T Consensus 118 ~~~~~~~~gG~lIP~~~~~~Ii~~~~~~~~l~~~--~~~~~-~~~--~~~p~~~~~~----~~a~~v~Eg~~~~~~~~~f 188 (387) T protein:vir:26 118 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREK--ARLTN-IKG--LEIPRVSYTL----DDDDFITDVETAKELKAKG 188 (387) T ss_pred hccCCCCCCceeechhHHHHHHHHHHhhchhhhh--ceeee-cCC--ceeeeeeccC----Ccccccccccccccccccc Confidence 222111112221 2 2344566666665555444 44432 211 1223322111 1234433 246789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhh-hhhhccccCCccchhHHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARK-IDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~e-i~~~~~~GD~s~~~fdG~lk 151 (305) ++..|.++++-++..++=. .+. +=.+.++.++.+.|+++++.. .+..+..|+++- ...|++. 
T Consensus 189 ~~v~l~~~k~~~~i~iS~e---------ll~-------ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~g-~~~g~~~ 251 (387) T protein:vir:26 189 DTVKFTTNKFKVFAAISDT---------VIH-------GSDVDLVNWVENALQSGLAAKERKDALAVSPKSG-LEHMSFY 251 (387) T ss_pred ceeeechheeeeechhhHH---------HHh-------hhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCcc-ccceeee Confidence 9999999999887666511 122 113455677888888887643 455566676652 2333331 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecc Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEG 231 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kG 231 (305) - . ..+..|.++..+.+.+++.+++..+|++ -.++|+...|..-..-+.. .+.....+....+.| T Consensus 252 ~----~-------~~~~~~~~~~~d~i~~~~~~l~~~y~~n--a~~imn~~t~~~~~~~~~~---~~~~~~~~~~~~llG 315 (387) T protein:vir:26 252 N----G-------SVKEVEGADMYDAIINALADLHEDYRDN--ATIYMRYADYVKIISVLSN---GTTNFFDTPAEKVFG 315 (387) T ss_pred c----c-------ccccccccchHHHHHHHHhccChhhhcC--CEEEEechHHHHHHHHHhc---CCCcccccCCccccc Confidence 1 1 1112344455788888999999999865 4789998877665554432 233344444556889 Q ss_pred eeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) .+++....+|+ ++..+=+..+.++ +.+.+++... 
-..+.+.|....-.|..+..++=||+=.=.+ T Consensus 316 ~PV~~~~~~~~--~~~GDf~~~~~~~------~~~~~~~~~~-~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka 380 (387) T protein:vir:26 316 KPVVFTDAAVK--PIVGDFNYFGINY------DGTTYDTDKD-VKKGEYLFVLTAWYDQQRTLDSAFRIAKAKE 380 (387) T ss_pred cceEEecCCCc--eeeechhhhhhhh------hhhhheeccc-ccCCceEEEEEEEeCcEeechhheEEEEeec Confidence 99998888775 4555555554333 2233443322 1234555556666788887766666643222 No 101 >protein:vir:94424 Length: 387 # NCBI annotation: ORF010 # Family: family:all:658 # MgeID: mge:1506 # MgeName: 47 # Cross-refs: genbank:acc:YP_240005;genbank:gi:66395666;genbank:GeneID:5133084 Probab=96.44 E-value=0.00063 Score=38.15 Aligned_cols=254 Identities=9% Similarity=0.002 Sum_probs=138.1 Q ss_pred CceEeeeecccc-h-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 1 MATTVDITTNYV-G-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~~~Y~-G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) |.+.......|. . +...+|+..+..-..+.+. ++|.| +.. ...|+..... -.+.|.+ .++.+| T Consensus 118 ~~~~~~~~gG~lIP~~~~~~Ii~~~~~~~~l~~~--~~~~~-~~~--~~~p~~~~~~----~~a~~v~Eg~~~~~~~~~f 188 (387) T protein:vir:94 118 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREK--ARLTN-IKG--LEIPRVSYTL----DDDDFITDVETAKELKAKG 188 (387) T ss_pred hccCCCCCCceeechhHHHHHHHHHHhhchhhhh--ceeee-cCC--ceeeeeeccC----Ccccccccccccccccccc Confidence 222111112221 2 2344566666665555444 44432 211 1223322111 1234433 246789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhh-hhhhccccCCccchhHHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARK-IDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~e-i~~~~~~GD~s~~~fdG~lk 151 (305) ++..|.++++-++..++=. .+. +=.+.++.++.+.|+++++.. .+..+..|+++- ...|++. 
T Consensus 189 ~~v~l~~~k~~~~i~iS~e---------ll~-------ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~g-~~~g~~~ 251 (387) T protein:vir:94 189 DTVKFTTNKFKVFAAISDT---------VIH-------GSDVDLVNWVENALQSGLAAKERKDALAVSPKSG-LEHMSFY 251 (387) T ss_pred ceeeechheeeeechhhHH---------HHh-------hhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCcc-ccceeee Confidence 9999999999887666511 122 113455677888888887643 455566676652 2333331 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecc Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEG 231 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kG 231 (305) - . ..+..|.++..+.+.+++.+++..+|++ -.++|+...|..-..-+.. .+.....+....+.| T Consensus 252 ~----~-------~~~~~~~~~~~d~i~~~~~~l~~~y~~n--a~~imn~~t~~~~~~~~~~---~~~~~~~~~~~~llG 315 (387) T protein:vir:94 252 N----G-------SVKEVEGADMYDAIINALADLHEDYRDN--ATIYMRYADYVKIISVLSN---GTTNFFDTPAEKVFG 315 (387) T ss_pred c----c-------ccccccccchHHHHHHHHhccChhhhcC--CEEEEechHHHHHHHHHhc---CCCcccccCCccccc Confidence 1 1 1112344455788888999999999865 4789998877665554432 233344444556889 Q ss_pred eeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) .+++....+|+ ++..+=+..+.++ +.+.+++... 
-..+.+.|....-.|..+..++=||+=.=.+ T Consensus 316 ~PV~~~~~~~~--~~~GDf~~~~~~~------~~~~~~~~~~-~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka 380 (387) T protein:vir:94 316 KPVVFTDAAVK--PIVGDFNYFGINY------DGTTYDTDKD-VKKGEYLFVLTAWYDQQRTLDSAFRIAKAKE 380 (387) T ss_pred cceEEecCCCc--eeeechhhhhhhh------hhhhheeccc-ccCCceEEEEEEEeCcEeechhheEEEEeec Confidence 99998888775 4555555554333 2233443322 1234555556666788887766666643222 No 102 >protein:vir:96978 Length: 387 # NCBI annotation: ORF009 # Family: family:all:658 # MgeID: mge:1643 # MgeName: 42e # Cross-refs: genbank:acc:YP_239859;genbank:gi:66395517;genbank:GeneID:5133011 Probab=96.44 E-value=0.00063 Score=38.15 Aligned_cols=254 Identities=9% Similarity=0.002 Sum_probs=138.1 Q ss_pred CceEeeeecccc-h-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 1 MATTVDITTNYV-G-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~~~Y~-G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) |.+.......|. . +...+|+..+..-..+.+. ++|.| +.. ...|+..... -.+.|.+ .++.+| T Consensus 118 ~~~~~~~~gG~lIP~~~~~~Ii~~~~~~~~l~~~--~~~~~-~~~--~~~p~~~~~~----~~a~~v~Eg~~~~~~~~~f 188 (387) T protein:vir:96 118 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREK--ARLTN-IKG--LEIPRVSYTL----DDDDFITDVETAKELKAKG 188 (387) T ss_pred hccCCCCCCceeechhHHHHHHHHHHhhchhhhh--ceeee-cCC--ceeeeeeccC----Ccccccccccccccccccc Confidence 222111112221 2 2344566666665555444 44432 211 1223322111 1234433 246789 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhh-hhhhccccCCccchhHHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARK-IDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~e-i~~~~~~GD~s~~~fdG~lk 151 (305) ++..|.++++-++..++=. .+. +=.+.++.++.+.|+++++.. .+..+..|+++- ...|++. 
T Consensus 189 ~~v~l~~~k~~~~i~iS~e---------ll~-------ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~g-~~~g~~~ 251 (387) T protein:vir:96 189 DTVKFTTNKFKVFAAISDT---------VIH-------GSDVDLVNWVENALQSGLAAKERKDALAVSPKSG-LEHMSFY 251 (387) T ss_pred ceeeechheeeeechhhHH---------HHh-------hhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCcc-ccceeee Confidence 9999999999887666511 122 113455677888888887643 455566676652 2333331 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecc Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEG 231 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kG 231 (305) - . ..+..|.++..+.+.+++.+++..+|++ -.++|+...|..-..-+.. .+.....+....+.| T Consensus 252 ~----~-------~~~~~~~~~~~d~i~~~~~~l~~~y~~n--a~~imn~~t~~~~~~~~~~---~~~~~~~~~~~~llG 315 (387) T protein:vir:96 252 N----G-------SVKEVEGADMYDAIINALADLHEDYRDN--ATIYMRYADYVKIISVLSN---GTTNFFDTPAEKVFG 315 (387) T ss_pred c----c-------ccccccccchHHHHHHHHhccChhhhcC--CEEEEechHHHHHHHHHhc---CCCcccccCCccccc Confidence 1 1 1112344455788888999999999865 4789998877665554432 233344444556889 Q ss_pred eeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) .+++....+|+ ++..+=+..+.++ +.+.+++... 
-..+.+.|....-.|..+..++=||+=.=.+ T Consensus 316 ~PV~~~~~~~~--~~~GDf~~~~~~~------~~~~~~~~~~-~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka 380 (387) T protein:vir:96 316 KPVVFTDAAVK--PIVGDFNYFGINY------DGTTYDTDKD-VKKGEYLFVLTAWYDQQRTLDSAFRIAKAKE 380 (387) T ss_pred cceEEecCCCc--eeeechhhhhhhh------hhhhheeccc-ccCCceEEEEEEEeCcEeechhheEEEEeec Confidence 99998888775 4555555554333 2233443322 1234555556666788887766666643222 No 103 >protein:vir:7409 Length: 408 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:146 # MgeName: P335 # Cross-refs: genbank:acc:NP_839926;genbank:gi:30089896;genbank:GeneID:1260683 Probab=96.39 E-value=0.00067 Score=38.00 Aligned_cols=257 Identities=8% Similarity=0.052 Sum_probs=136.3 Q ss_pred CceEeeeeccc-ch-hHHHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCCCC--CCC-ccceEec Q lcl|NC_018271. 1 MATTVDITTNY-VG-EVAGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYSCG--FTP-SGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~~Y-~G-e~l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~~~--~~~-~G~~~~~ 73 (305) |.........| .+ ++...|+..+.....+.+. ++++| +-..+.. .++..........-++ -.+ .+..+|. T Consensus 116 ~~~~~~~~gg~~vP~~~~~~Ii~~~~~~~~l~~~--~~~~~~~~~~~~~~-~~~~~~~~~~~~~v~E~~~~~~~~~~~~~ 192 (408) T protein:vir:74 116 ETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQY--VRVESVSTSSGSRV-YEKWTDVTPLKAMDEEDGKIPDLDNPRLT 192 (408) T ss_pred hcccccCCCceeechhHhhHHHHHHhhhcchhhh--cceeeccCCcceEE-EEeecCCccccccccccccccccccccee Confidence 32222222222 12 2334555555555555544 44443 3323321 2222111111111111 112 2457899 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..+.++++-+...++=. +++ +-++.++.++.+.|.++++..++..++.|+++.. 
T Consensus 193 ~i~~~~~k~~~~~~iS~e---------ll~-------ds~~~l~~~i~~~l~~~~~~~~d~~il~G~G~~~--------- 247 (408) T protein:vir:74 193 IIKYLIKRYAGIITATNT---------LLK-------DTAENILAWLSSWIAKKVVVTRNQAIIAAMGTVP--------- 247 (408) T ss_pred eEEeeeeeEEeeehhHHH---------HHh-------hchHHHHHHHHHHHHHHHHHHHHHHHhhcccccc--------- Confidence 999999999988775422 222 2345677889999999999999999999988621 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~~k 230 (305) +..+..|..++ ++.+...++..++. +-.++||...|.+.+- ++.-.|.+ .+.+++....+. T Consensus 248 ----------~~~~~~~~~~i---~~~~~~~l~~~~~~--~a~~v~n~~~~~~l~~-lkd~~G~~l~~~~~~~~~~~~l~ 311 (408) T protein:vir:74 248 ----------KKPTIANFDDV---ITMINTSVDPAIIA--TSSLLTNQSGLNKLAL-VKTAEGKYLLEPDPTKPNSYLIK 311 (408) T ss_pred ----------cccccccHHHH---HHHHHHhhhhhhcC--CCEEEEcHHHHHHHHH-hhcCCCceEeccCcCCCCCceec Confidence 11223343333 33445678888875 3589999999887763 33222222 123444456899 Q ss_pred ceeeeecc--CCC-----CCeEEEecchHHhhhhhhhhhhhhcccc--cee-eeccceeEEEEEEeecceeeccCCeEEE Q lcl|NC_018271. 231 GYTLTEIK--GLP-----ASRMVGYNRDNIVIGMSAQSDFNEIRIK--DMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVL 300 (305) Q Consensus 231 Gi~iv~l~--~~P-----d~~ii~T~~sNl~~gvnl~~D~n~I~I~--~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~ 300 (305) |.+++... .+| +..++..+-+..+..+ |-..++++ +.. ..+-...+-+...+-+|+.+..++=||+ T Consensus 312 G~pV~~~~~~~~~~~~~~~~~i~~gd~~~~~~~~----~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~~ 387 (408) T protein:vir:74 312 GKQVIVVADRWLPNSGSTVYPLYYGDMSQAITLF----DRENMSLLPTNIGAGAFETDTTKIRVIDRFDVKATDSEALVA 387 (408) T ss_pred ceeeEEecCcccccccCCcceEEEEehhccEEEE----EecceEEEEeccccchhhcceeeEEEEEeeCcEEecccceEE Confidence 99877653 234 3345555544432111 11222222 111 1112334555566667888988888888 Q ss_pred ecCCC Q lcl|NC_018271. 
301 YTPAA 305 (305) Q Consensus 301 ~~~~~ 305 (305) -+-.+ T Consensus 388 ~~~~~ 392 (408) T protein:vir:74 388 GSFTA 392 (408) T ss_pred EEeec Confidence 77333 No 104 >protein:vir:97397 Length: 517 # NCBI annotation: major capsid protein # Family: family:all:11745 # MgeID: mge:1675 # MgeName: Q54 # Cross-refs: genbank:acc:YP_762590;genbank:gi:115304291;genbank:GeneID:5130600 Probab=96.23 E-value=0.00058 Score=38.33 Aligned_cols=263 Identities=15% Similarity=0.121 Sum_probs=109.1 Q ss_pred CceEeeeec-ccchhHHHH-HHHHhhccccchh--cCceEEecCCCCcccccchhhhhccccCCCCCCCCc------cce Q lcl|NC_018271. 1 MATTVDITT-NYVGEVAGG-YFLEMVKEANTIS--DNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTPS------GEV 70 (305) Q Consensus 1 ma~~~~~~~-~Y~Ge~l~~-~~~~~~~g~~~v~--~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~~------G~~ 70 (305) .+...++.+ ...|-..+. ++.+......... ...+++ .++ +....+.- .+ .-.+.|... ++. T Consensus 235 ~~~~~~~~~~~~~~~~~p~~~~~~i~~~~~~~~~i~~~~~~-~~i--~~~~~~~~--~~---~~~a~~~~eG~~kp~s~~ 306 (517) T protein:vir:97 235 AAWTAELKERGISGMPAPAGILKRIQDAVNDEGSLLPFIRH-ENL--PTLVVGGD--NA---LTQGTGHTTGTDKTESNI 306 (517) T ss_pred ceeeeecccccccccccchHHHHHHHHhhhhhccceeeeee-ccc--cceeeecc--cc---cceeeeeecCCccccccc Confidence 111111110 000100111 1111111000000 000111 001 00000000 00 001222222 345 Q ss_pred EecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHH Q lcl|NC_018271. 71 DINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGIL 150 (305) Q Consensus 71 ~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~l 150 (305) +|..+.+.++.+=....++=+ +.. ... 
+ +.-+.+|.++...|..+++...+..++.||++.....|++ T Consensus 307 tf~~~~~~~~~ia~~~~~S~q---------ll~-Ds~-~-dd~~~l~s~i~~~l~~~l~~~ee~a~l~GdGtg~~~~gi~ 374 (517) T protein:vir:97 307 TLQTRVLTPQYVYKYIKLPKI---------VMN-SNA-T-DIAGAILTYVMNRLPDMVIMAVNRAIIMGGVTGVSETQIY 374 (517) T ss_pred ceeeEEeeHhhhhhhhhhhHH---------HHH-Hhh-h-ccHHHHHHHHHHHHHHHHHHHHHHHHhcccCCCccccccc Confidence 566666655544333322211 111 110 1 2235678899999999999999999999999877777777 Q ss_pred HHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCc---ccCCCcc Q lcl|NC_018271. 151 PLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGT---FLNPNEF 227 (305) Q Consensus 151 k~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~---~t~~~~~ 227 (305) .+...... . +..+.....+.+..+..+ +++..+=.+.|+..+|.+.+--- .-.|.|.- .+.+... T Consensus 375 ~~a~~~~~----~---~~~~~~~~~d~i~~l~~a----~~~a~~a~~vmn~~t~~~I~klK-D~~G~Yl~~~~~~~~~~~ 442 (517) T protein:vir:97 375 PVVGDAWA----T---NVTGTTNIQELLEKLSVA----TPKAADSTLVIHRNDLAAIRFLK-DKNGNYVFPVGVSNQTIA 442 (517) T ss_pred cccccccc----c---cccccchHHHHHHHHHHH----hhhccCCEEEECHHHHHHHHHhh-cCCCCeeccCcCCccccc Confidence 65321111 1 111122233444444433 34334457999999987764322 22222211 1222223 Q ss_pred eecce-eeeeccCCCCCeEEEecchHHhhh---hhhhhhhhhccccceeeecccee-EEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 228 DFEGY-TLTEIKGLPASRMVGYNRDNIVIG---MSAQSDFNEIRIKDMGDVDLSGQ-IRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 228 ~~kGi-~iv~l~~~Pd~~ii~T~~sNl~~g---vnl~~D~n~I~I~~~~~~~~~~~-~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) ...|+ .+.|..+++ ...+.....+.+++ +...+||+ . ..+.. 
|.++|+++-.+--+.--=+.+|+ T Consensus 443 ~l~G~~~~~~~~~~~-~~~~~~~~~y~i~~~~g~~~~~~fd------~---~~n~~~f~~~~~~~g~i~~~~r~a~~~~~ 512 (517) T protein:vir:97 443 THFGFNRLVQSVAVD-EKTAVSLSGYVTNGSRGMEFEQGTI------L---VENNKEYLFEMPISGSLEYKGTTAYGTYT 512 (517) T ss_pred ccCCccccccccccC-ceeEeeccccEEEeecceeeeeeee------c---ccCceeEeeeeeeccccccccceEEEEEc Confidence 34452 344433332 23444444443322 22223332 1 12222 66777765544443333367899 Q ss_pred CCC Q lcl|NC_018271. 303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) |.+ T Consensus 513 p~~ 515 (517) T protein:vir:97 513 PPV 515 (517) T ss_pred CCC Confidence 988 No 105 >protein:vir:7990 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:151 # MgeName: Che8 # Cross-refs: genbank:acc:NP_817344;genbank:gi:29565772;genbank:GeneID:1258978 Probab=95.81 E-value=0.0014 Score=36.18 Aligned_cols=262 Identities=14% Similarity=0.126 Sum_probs=136.2 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCC--CCCCCccceEecceeee Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS--CGFTPSGEVDINEKQLT 78 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~~~~~G~~~~~~K~L~ 78 (305) ||+...+..=|++++++.+--....+ .+++... .. -|-+.++..+|+...... .+|. .....-.+.+-++..+. T Consensus 1 MA~~~~~pei~~~~v~~~~~~~lv~~-~l~~~~~-~~-~~~~GdTv~ip~~~~~~~-~d~~~~~~~~~~~~~~~~~~~~t 76 (273) T protein:vir:79 1 MAFNNFIPELWSDMLLEEWTAQTVFA-NLVNREY-EG-IASKGNVVHIAGVVAPTV-KDYKAAGRQTSADAISDTGVDLL 76 (273) T ss_pred CcchhhhHHHHHHHHHHHHHhhccch-hhhhccc-cc-cccCCcEEEEeecCcccc-cccccCCCccCccccccceEEEE Confidence 99976655689998888865555444 3554431 11 133345555665443322 2232 12222234455666666 Q ss_pred eeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhccc Q lcl|NC_018271. 79 LKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEADAT 158 (305) Q Consensus 79 ~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d~~ 158 (305) ..+++..-... 
.|.+.. +.. +++ ++ . ++..+..++.++...+ +..+..... T Consensus 77 id~~~~~~~~i-~d~d~~----~~~-~~~--~~--~------~~~~~~ala~~vD~~i-------------~~~~~~a~~ 127 (273) T protein:vir:79 77 IDQEKSIDFLV-DDIDRV----QVA-GSL--EA--Y------TRAGATALATDTDKFI-------------ADMLVDNGT 127 (273) T ss_pred Eeeecccceee-ccHHHH----hhc-ccH--HH--H------HHHHHHHHHHHHHHHH-------------HHHHhhccc Confidence 65544322222 244322 222 322 21 1 2233444555444222 222222221 Q ss_pred eEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHH--HH-HhhhhccCC--cccCCCcceeccee Q lcl|NC_018271. 159 VIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIK--RA-YGTQARSNG--TFLNPNEFDFEGYT 233 (305) Q Consensus 159 ~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~--d~-~~~~~~k~~--~~t~~~~~~~kGi~ 233 (305) . +. .....|+.++.+.+.++...+.+.--...+-.++|++..|.... +. .......+. ...++..+++.|+. T Consensus 128 ~--~~-~~~~~~~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~Ll~~~~~~~~~~~~~~~~~l~~G~ig~~~G~~ 204 (273) T protein:vir:79 128 A--LT-GSAPSDADDAFDLIASALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGLRAGTIGNLLGAR 204 (273) T ss_pred c--cc-cccccchhhHHHHHHHHHHHhhhccCCccCcEEEECHHHHHHHhhchhhhhhhhhcccccceeeeEeeEEeceE Confidence 1 11 12345667778888887777666522223468899998877552 22 333332222 12356677899999 Q ss_pred eeeccCCCCC---eEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 234 LTEIKGLPAS---RMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 234 iv~l~~~Pd~---~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |+.-..+|.+ .+++..++-+.++. .+ ++++.++..+- ....|.. 
+|-||.-+-.++-+|+.+.++ T Consensus 205 i~~s~~lp~~~~~~~~a~~~~A~~~a~-~~---~~~e~~r~~~~-~~~~v~~--~~~yg~~v~~p~~vv~~~~~g 272 (273) T protein:vir:79 205 IVESNNLRDTDDEQFVAFHPSAAAYVS-QI---DTVEALRDQDS-FSDRIRA--LHVYGGKVVRPTGVVVFNKTG 272 (273) T ss_pred EEecccccccCceEEEEEeccceeeee-eh---hhhhcccCccc-ceeeeee--eeeeeeEEecCceEEEEeccC Confidence 9999889865 46666666665443 21 23333333331 1222333 455677777889999998887 No 106 >protein:vir:3845 Length: 395 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:322 # MgeName: phi adh # Cross-refs: genbank:acc:NP_050151;swissprot:trembl:q9t1f6;genbank:gi:9633043;uniprot:Q9T1F6;genbank:GeneID:1262163 Probab=95.73 E-value=0.0016 Score=35.98 Aligned_cols=256 Identities=7% Similarity=-0.018 Sum_probs=129.6 Q ss_pred CceEe--eeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC-------ccc Q lcl|NC_018271. 1 MATTV--DITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP-------SGE 69 (305) Q Consensus 1 ma~~~--~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~-------~G~ 69 (305) |+... .....| . -+...+|+..+..-..+.+. ++++| |+...-.+++....+.- -.+.|.+ ... T Consensus 105 ~~~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~~~~~~~~~~~--~~a~~v~E~~~~~~~~~ 179 (395) T protein:vir:38 105 VTSGTTGTGNAGLTIPEDIQLQIRTLTRSFTSLESL--ANVEN-VTTSHGSRVYEKLADIT--PLKDLDDESALIGDNDD 179 (395) T ss_pred HhhccCccCCCceecchhHhhHHHHHHHhhcchhhh--cceee-ccCCcceEEEEeeccCC--ccccccccccccccccc Confidence 22111 111111 1 13334455455544444443 33332 11111111111111110 0123322 224 Q ss_pred eEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHH Q lcl|NC_018271. 70 VDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGI 149 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~ 149 (305) .+|.+..+.++++-+...++-. .+. +=++..+.++...|+++++..++..++.|+++.... 
T Consensus 180 ~~f~~v~~~~~k~~~~~~iS~e---------ll~-------ds~~~l~~~i~~~la~~~~~~~~~~il~g~g~~~~~--- 240 (395) T protein:vir:38 180 PELTVVKYLIHRYAGITTVTNT---------LLK-------DTVDNIIQWLVNWAAKKDVVTRNAKILEVMGKAPKK--- 240 (395) T ss_pred cceeeEEeeeeeeEeehhhHHH---------HHh-------hhHHHHHHHHHHHHHHHHHHHHHHHHhhcccccccc--- Confidence 6788999999999988765432 111 224456788889999999999999999998863221 Q ss_pred HHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCc Q lcl|NC_018271. 150 LPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNE 226 (305) Q Consensus 150 lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~ 226 (305) .+..+..++. +.+...++..+|. +-.|+||...|.+... ++.-.|.+ .+..++.. T Consensus 241 ----------------~~~~~~~~i~---~~~~~~l~~~~~~--~a~~v~n~~~~~~L~~-lkd~~G~~l~~~~~~~~~~ 298 (395) T protein:vir:38 241 ----------------PTISQFDNIK---DLENNTLDPAIES--TSSFITNQSGYNILSK-VKDADGRYLMQPDVTSPDK 298 (395) T ss_pred ----------------cccccHHHHH---HHHHHhhhhhhcC--CCEEEEcHHHHHHHHH-hhccCCceeeccCcCCCCc Confidence 1222322222 2233467777775 3589999999877643 33222222 12334445 Q ss_pred ceecceeeeeccCC--C----CCeEEEecchHH-hhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeE Q lcl|NC_018271. 227 FDFEGYTLTEIKGL--P----ASRMVGYNRDNI-VIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEI 298 (305) Q Consensus 227 ~~~kGi~iv~l~~~--P----d~~ii~T~~sNl-~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~ 298 (305) ..+.|++++....+ | +..|+..+-++. +++.. .+.. +++.+.. ..+-...+-+....-+|+.+..++=| T Consensus 299 ~~l~G~pV~~~~~~~~~~~~~~~~i~~gd~~~~~~i~~~--~~~~-i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~ 375 (395) T protein:vir:38 299 YLIDGKPVIRIADKWLPDVSGSHPLYFGDLKQGITLFDR--QQMQ-IDTTNVGAGSFEHDTTKLRFIDRFDVQLIDDGAF 375 (395) T ss_pred ceeccceeEEecccccCcCCCcceEEEEeccccEEEEEe--cceE-EEEeccccchhhcCceEEEEEEeeccEEecccce Confidence 67999998877543 2 334666665553 22221 1111 2222211 11223345555666668888777777 Q ss_pred EEecCC--C Q lcl|NC_018271. 
299 VLYTPA--A 305 (305) Q Consensus 299 v~~~~~--~ 305 (305) +.-+=. + T Consensus 376 ~~~~~~~~~ 384 (395) T protein:vir:38 376 AAASFKTVA 384 (395) T ss_pred EEEEeeccc Confidence 665422 2 No 107 >protein:vir:3991 Length: 404 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:319 # MgeName: BK5-T # Cross-refs: genbank:acc:NP_116499;genbank:gi:14251132;genbank:GeneID:921252 Probab=95.63 E-value=0.0017 Score=35.73 Aligned_cols=259 Identities=8% Similarity=0.057 Sum_probs=130.5 Q ss_pred CceEeeeeccc-ch-hHHHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCCC--CCCC-ccceEec Q lcl|NC_018271. 1 MATTVDITTNY-VG-EVAGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYSC--GFTP-SGEVDIN 73 (305) Q Consensus 1 ma~~~~~~~~Y-~G-e~l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~~--~~~~-~G~~~~~ 73 (305) |.........| .+ ++..+|+..+.....+.+. ++++| +-..+.. .++.......-..-+ +-.+ .+..+|. T Consensus 116 ~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~--~~~~~~~~~~~~~~-~~~~~~~~~~a~~v~Eg~~~~~~~~~~f~ 192 (404) T protein:vir:39 116 ETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQY--VRVESVSTSNGSRV-YEKWTDVTPLTVMDAEDGKIPDLDNPRLT 192 (404) T ss_pred hhcccccCCceeccHHHHHHHHHHHHhhhhHHhh--cceeeccCCcceEE-EEeecCCccceeeecCcccccccccccee Confidence 22221111122 12 3345555555555555555 44444 2222211 221111111111111 1112 2456889 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +..+.++++-+...++=. ++. +=++..+.++...|.++++..+++.++.|+++.. 
T Consensus 193 ~i~~~~~k~~~~~~iS~e---------ll~-------ds~~~l~~~i~~~l~~~~~~~~d~~il~g~g~~~--------- 247 (404) T protein:vir:39 193 IIKYLIKRYAGIITATNT---------LLK-------DTAENILAWLSSWIAKKVVVTRNQAIIAAMGTVP--------- 247 (404) T ss_pred eEEeeeeeEEeeehhHHH---------HHh-------hchHHHHHHHHHHHHHHHHHHHHHHHHhcccccc--------- Confidence 999999999888765421 222 1235567888899999999999999999987621 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcceec Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFDFE 230 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~~k 230 (305) +..+..+.+++.+ .+...++..++. +-.++||...|.+..- ++.-.|.+ .+.+++....+. T Consensus 248 ----------~~~~~~~~~~i~~---~~~~~~~~~~~~--~a~~v~n~~~~~~L~~-lkd~~G~~l~~~~~~~~~~~~l~ 311 (404) T protein:vir:39 248 ----------KKPTIAKFDDVIT---MINTSVDPAIIA--TSSLLTNQSGLNKLAL-VKTAEGKYLLEPDPTKPNSYLIK 311 (404) T ss_pred ----------cccccccHHHHHH---HHHHhhhhhhcc--CCEEEEcHHHHHHHHH-hhccCCceeeccCcCCCCcceec Confidence 1222334333333 334467777775 3589999999876663 33222222 123444556799 Q ss_pred ceeeeecc--CCC-----CCeEEEecchHHhhhhhhhhhhhhcccccee-eeccceeEEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 231 GYTLTEIK--GLP-----ASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG-DVDLSGQIRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 231 Gi~iv~l~--~~P-----d~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~-~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) |.+++... .+| +..+++.+-++.++.++- .+.. +.+++.. +..-...+.+...+-+|+.+..++=||.-+ T Consensus 312 G~pV~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~-~~~~-i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~ 389 (404) T protein:vir:39 312 GKKVIVVADRWLPNSGSTVYPLYYGDMSQAITLFDR-ENMS-LLPTNIGAGAFETDTTKIRVIDRFDVKTTDSEALVAGS 389 (404) T ss_pred ceeEEEecccccCccCCCccEEEEEeccccEEEEee-cceE-EEEeccchhhhhhceeeEEEEeeeccEEecccceEEEE Confidence 98877542 234 334677766654322111 1111 2232211 111223344445555577777777665544 Q ss_pred --CCC Q lcl|NC_018271. 
303 --PAA 305 (305) Q Consensus 303 --~~~ 305 (305) +++ T Consensus 390 ~~~~a 394 (404) T protein:vir:39 390 FTAIA 394 (404) T ss_pred eeccc Confidence 222 No 108 >protein:vir:79008 Length: 299 # NCBI annotation: putative main capsid protein # Family: family:all:701 # MgeID: mge:1861 # MgeName: phiC2 # Cross-refs: genbank:acc:YP_001110725;genbank:gi:134287342;genbank:GeneID:4955182 Probab=95.41 E-value=0.0021 Score=35.25 Aligned_cols=267 Identities=13% Similarity=0.073 Sum_probs=117.2 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCce-EEecCCCCcccccchhhhhccccCCC---CCCCCccce--Eecc Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLI-RVIPNVPENNLFLRRMNTTDDFVDYS---CGFTPSGEV--DINE 74 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I-~v~~~v~~~~~~~~~~~~~~~~q~~~---~~~~~~G~~--~~~~ 74 (305) || ++|..+-|++..++.+-....+| .|...... .+..+ -.++..+|++.+ ..+.+|+ .+|.+ |.+ .+.. T Consensus 1 MA-~~n~a~~~~~~Ld~~~~~~l~~~-~L~~~~~~~~v~~~-gg~tVkI~~i~~-~gl~DY~R~~~g~~~-g~~~~~~~t 75 (299) T protein:vir:79 1 MA-ALNYAKEYSNVLAQAYPYTLNFG-DLYATPNNGRYRWT-GSKTIEIPTIST-TGRVDSNRDTIAVAQ-RNYDNAWEP 75 (299) T ss_pred Cc-cchhHHHHHHHHHHHHHhhceee-eeccCcccceeeec-CCCEEEEecccc-ccccccccCCCcccc-cccCcceeE Confidence 99 67766788888666654444443 44333211 12122 256666888876 6788885 25544 344 4444 Q ss_pred eeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHh Q lcl|NC_018271. 75 KQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLE 154 (305) Q Consensus 75 K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~ 154 (305) .+|.-.|.- .|.+.+-|..+. . +. ++.. ..+-+-...+++-|+. +.-|..+..... 
T Consensus 76 ~~ldqdr~~-~f~vD~~Dvdet------~-~~-----~~~a--~v~~~~~~~~v~pEiD---------ay~~skl~~~a~ 131 (299) T protein:vir:79 76 KVLTNQRKW-STLVHPADINQT------N-YV-----ASIG--NITKVYNEEQKFPEMD---------AYCISKIYADWT 131 (299) T ss_pred EEeeccccc-eeccchhhHHHH------h-hh-----hHHH--HHHHHHHHHHhhhHhh---------HHHHHHHHHhhh Confidence 444444332 233444453331 1 11 1110 1111111122334432 111333333221 Q ss_pred hccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc------cCCCcce Q lcl|NC_018271. 155 ADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF------LNPNEFD 228 (305) Q Consensus 155 ~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~------t~~~~~~ 228 (305) +... ...+.++|++|+.+.++++...+.+.=-...+..++||+..+...+..- +..+..+. .++.-+. T Consensus 132 ~~g~----~~~~~~~T~~n~y~~i~~~~~~lde~~vP~~~rvl~vtp~~~~~L~~~~--~f~k~~~~~~~~~~~~g~Vg~ 205 (299) T protein:vir:79 132 ALGN----TADTTVLTTTNVLEVFDKLMEKMTEARVPENGRILYVTPVVNTLIKNAK--EIQRTVNIKDAGTSLNRQTTD 205 (299) T ss_pred hcCC----cccccccCHHHHHHHHHHHHHHHHhcCCCCCCeEEEeCHHHHHHHhhch--hhhcccccccccceeeeeeee Confidence 1111 2234568999999999999998887633234689999999999887543 22222222 2333356 Q ss_pred ecceeeeeccC--CCCCeEEEec--------chHHhhhhh----hhhhhhhccccce-eeeccceeEEEEEEeecceeec Q lcl|NC_018271. 229 FEGYTLTEIKG--LPASRMVGYN--------RDNIVIGMS----AQSDFNEIRIKDM-GDVDLSGQIRTKMVLSAGVEYA 293 (305) Q Consensus 229 ~kGi~iv~l~~--~Pd~~ii~T~--------~sNl~~gvn----l~~D~n~I~I~~~-~~~~~~~~~f~k~~m~~d~~i~ 293 (305) +.|++|+.+++ |+...=|..- .=|+++.-. ...=.+.++|..= ++ =++.|..+-..=.|.-+ T Consensus 206 idG~~Ii~Vps~r~~t~~~~~~G~~~~~~ak~in~ii~~~~a~~~~~K~~~~~~~~P~~~--~~~~~~~~~r~y~d~~v- 282 (299) T protein:vir:79 206 IDTVKIIKVPSNLMKTAYDFTTGWKVGAGAKQIFMSLVHPSAIITPVSYQFSKLDEPTAV--TEGKYFYFEESFEDVFI- 282 (299) T ss_pred ecceEEEEechhhcCccceeccCccccCcccccceEEEcCCeeeeeEeeeeEEeecCCCC--Cccceeeeeeeeeeeee- Confidence 88988877522 3322111110 013321100 0000111221110 11 11122222111112211 Q ss_pred cCC--e--EEEecCCC Q lcl|NC_018271. 
294 YGA--E--IVLYTPAA 305 (305) Q Consensus 294 fg~--E--~v~~~~~~ 305 (305) +.. . .|-+..+- T Consensus 283 ~~nk~~~i~~~~~~a~ 298 (299) T protein:vir:79 283 LNKKADAIQFVVEGAG 298 (299) T ss_pred eccccCeEEEEeeecC Confidence 111 1 12222222 No 109 >protein:vir:78640 Length: 352 # NCBI annotation: phage capsid # Family: family:all:658 # MgeID: mge:1855 # MgeName: tp310-2 # Cross-refs: genbank:acc:YP_001429943;genbank:gi:156603997;genbank:GeneID:5525386 Probab=94.98 E-value=0.003 Score=34.39 Aligned_cols=254 Identities=8% Similarity=-0.006 Sum_probs=138.0 Q ss_pred CceEeeeeccc-ch-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC------ccceEe Q lcl|NC_018271. 1 MATTVDITTNY-VG-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP------SGEVDI 72 (305) Q Consensus 1 ma~~~~~~~~Y-~G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~------~G~~~~ 72 (305) |.........| .. ++..+|+..+.....+. ++++|.+ +... . .++.... .-.+.|.+ .++.+| T Consensus 83 l~~~~~~~gG~lIP~~~~~~Ii~~l~~~s~l~--~~~~v~~-~~~~-~-~p~~~~~----~~~a~~v~E~~~~~~~~~~f 153 (352) T protein:vir:78 83 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLR--EKARLTN-IKGL-E-IPRVSYT----LDDDDFITDVETAKELKLKG 153 (352) T ss_pred hccCCCCCCceeccHhHHHHHHHHHHhhcchh--hheeeEe-cCCc-e-EEEEecC----CCcccccccccccccccccc Confidence 21111111111 12 33445555555555543 3466543 3221 2 2222111 11234533 246899 Q ss_pred cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHh-hhhhhccccCCccchhHHHHH Q lcl|NC_018271. 73 NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLAR-KIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 73 ~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~-ei~~~~~~GD~s~~~fdG~lk 151 (305) ++..+.++++-+...++-+ .+. +=.+.++.++.+.|+++++. |....+..|+++- ...|++. 
T Consensus 154 ~~v~~~~~k~~~~i~is~e---------ll~-------Ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~~-~~~g~l~ 216 (352) T protein:vir:78 154 DTVKFTTNKFKVFAAISDT---------VIH-------GSDVDLVNWVENALQSGLAAKERKDALAVSPKSG-LEHMSFY 216 (352) T ss_pred eeeeecceeEEeechhhHH---------HHh-------hhhHHHHHHHHHHHHHHHHHHHHHhhhhcCCCCc-cccccee Confidence 9999999999988776422 111 22355667888888888764 4555666777652 2334331 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcccCCCcceecc Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLNPNEFDFEG 231 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~~~~~~~kG 231 (305) - .. ....|.++..+.+.++..+++..+|++ -.++|+...+.+-..-+.+ .+.....+....+.| T Consensus 217 ~----~~-------~~~~t~~~~~d~i~~~~~~l~~~~~~~--a~~~mn~~t~~~l~~~~~~---~~~~~~~~~~~~llG 280 (352) T protein:vir:78 217 N----GS-------VKEVEGANMYDAIINALADLHEDYRDN--ATIYMRYADYVKIISVLSN---GTTNFFDTPAEKVFG 280 (352) T ss_pred c----cc-------cccccccchHHHHHHHHhccChhhhcC--CEEEEehHHHHHHHHHHhc---cCCcccccCCccccc Confidence 1 11 112344445778888889999999874 4799998877665444322 233333444456789 Q ss_pred eeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) .+++-..++|+ ++..+=+..+.++ +.+.++++.. 
..++...|...+-+|-.+..++=|++=+=++ T Consensus 281 ~PV~~~~~~~~--~~~Gdf~~~~~~~------~~~~~~~~~~-~~~g~~~f~~~~r~Dg~~~~~eA~~~l~~~a 345 (352) T protein:vir:78 281 KPVVFTDAAVK--PIVGDFNYFGINY------DGTTYDTDKD-VKKGEYLFVLTAWYDQQRTLDSAFRIAKAKE 345 (352) T ss_pred cceEEecCCCc--eeEeehhhhhhhh------hhheeeeecc-ccCCeeEEEEEeeeCceeechhheEEEEeec Confidence 99988877775 3445545444322 2334444433 1345566666666777777666666553222 No 110 >protein:vir:93742 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1475 # MgeName: 55 # Cross-refs: genbank:acc:YP_240459;genbank:gi:66396126;genbank:GeneID:5133511 Probab=94.73 E-value=0.0036 Score=33.96 Aligned_cols=260 Identities=12% Similarity=0.095 Sum_probs=131.5 Q ss_pred CceEeeeec-ccchhHHHHHHHHhhccccchhcCce--EEecCCCCcccccchhhhhccccCCCC-CCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITT-NYVGEVAGGYFLEMVKEANTISDNLI--RVIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~-~Y~Ge~l~~~~~~~~~g~~~v~~g~I--~v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~ 76 (305) ||+...-.+ -...|+...++..-+...-......+ .-++|.+.++..+|+.+.....|.+.. +--+-+..+.++.+ T Consensus 1 ma~~~T~~~~~iiPev~~~~v~~~~~~~~~~~~~~~~~~~l~g~~G~tv~ip~~~~~g~~~~~~eg~~i~~~~it~~~~~ 80 (274) T protein:vir:93 1 MPQGITKTSNQIIPEVLAPMMQAQLEKKLRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQVVAEGEKIPTDILETKKRE 80 (274) T ss_pred CCccceehhheechHHHHHHHHHHHHhhhhhcccccccccccCCCCCEEEEEeeccCCCcccccCCCcccccccccceeE Confidence 999776433 35555555554443333211111111 113455555554555443334555542 11223455677777 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) +..++.. ..|-..|... .+-. ++ | - ....+.++..++..+...++ ..+... 
T Consensus 81 ~~i~~~~--~~~~i~D~~~----~~~~-~d------~-~--~~~~~~~~~~~a~~~d~~~~-------------~~~~~a 131 (274) T protein:vir:93 81 AKIRKIA--KGTSITDEAL----LSGY-GD------P-Q--GEQVRQHGLAHANKVDNDVL-------------EALMGA 131 (274) T ss_pred EEeeeec--ccccccHHHH----Hhhc-cc------h-H--HHHHHHHHHHHHHHHHHHHH-------------HHHhcc Confidence 7775543 3444444332 1111 21 1 1 12234455555555553332 111111 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhc---cCCc--ccCCCcceecc Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQAR---SNGT--FLNPNEFDFEG 231 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~---k~~~--~t~~~~~~~kG 231 (305) ...+ ...+++...+++++. .+.+. ....-.++||+.+|-.....-..++- .+.+ ..++-.++|.| T Consensus 132 ~~~~----~~~~~~~d~i~dA~~----~l~d~--~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G 201 (274) T protein:vir:93 132 KLTV----NADITKLNGLQSAID----KFNDE--DLEPMVLFINPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALG 201 (274) T ss_pred cccc----cccccCHHHHHHHHH----Hhhhc--cCCccEEEeCHHHHHHHHhhhhhcccccccccccceeecccceecC Confidence 1111 122344444444433 33332 12234899999988777533212221 1111 12344668999 Q ss_pred eeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++++--.++|++++++..++++-++. ..+.. ++-+|-.+ . 
..--+.-++-|++-+..++-+|+=|.++ T Consensus 202 ~~Vi~s~~~p~~t~~l~~~gai~~~~--~~~~~-vE~~Rd~~--~-~~d~i~~~~~y~~~~~~~~~~v~~t~~~ 269 (274) T protein:vir:93 202 AIIVRTNKLEAGTAILAKKGAVKLIL--KRDFF-LEVARDAS--T-KTTALYSDKHYVAYLYDESKAVKITKGS 269 (274) T ss_pred eeEEEcCCCCcceEEEEeCCeEEEEe--cCCcc-cccccchh--h-cccEEEEEEEEEEEEEcCCceEEEeeCc Confidence 99999999999999999999887653 23322 22222222 1 1133444455699999999999999888 No 111 >protein:vir:3783 Length: 336 # NCBI annotation: capsid # Family: family:all:201 # MgeID: mge:328 # MgeName: HP2 # Cross-refs: genbank:acc:NP_536823;genbank:gi:17981832;genbank:GeneID:929211 Probab=94.73 E-value=0.0036 Score=33.95 Aligned_cols=270 Identities=12% Similarity=0.091 Sum_probs=144.6 Q ss_pred CceEeee---------ecccchhHHHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCCCCCCCccc Q lcl|NC_018271. 1 MATTVDI---------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYSCGFTPSGE 69 (305) Q Consensus 1 ma~~~~~---------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~~~~~~~G~ 69 (305) +|..... +-+=++.+.-+|-..+..+.+.-++ |.|+| .++...+ . +...-.+-+++. | | T Consensus 13 ~A~~ngv~~a~~~~~~~Fsv~P~v~q~L~~~i~ess~FL~~--INvv~V~e~~Ge~v--~-lg~~g~iagrtd--t--~- 82 (336) T protein:vir:37 13 LAKHFNQPLDSVLRGESFALKAPEAALLGENIQQRSDFLKG--INMVQVAHTKGTKL--F-GATEKGVTGRKQ--T--G- 82 (336) T ss_pred HHHHhCCChhhhcccceeecCHHHHHHHHHHHHHHHHHhhc--CceeecccccceEE--e-eccCcccccccC--C--C- Confidence 2222211 1244566666666666667776666 55432 1111100 0 111111111111 0 1 Q ss_pred eEecceeeeeeeeeEeec----cCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHH-HHHHHHhhhhhhccccCC--- Q lcl|NC_018271. 70 VDINEKQLTLKKIKSDKE----VCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTD-MGNRLARKIDKDIWQGDG--- 141 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~----~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~-l~~~ia~ei~~~~~~GD~--- 141 (305) ..-....|+...|.|.-+ .-|=+-...|-. .| ..|...... 
+.+++|-+.=-..|+|-+ T Consensus 83 r~r~~~~l~~~~Y~c~qTn~dt~i~y~~LD~WA~------------~~-d~~~~~~~~~~~r~iALD~i~IGfnG~s~A~ 149 (336) T protein:vir:37 83 RNLATLDHSQNGYELSETDSGILVNWSLFDSFAI------------FK-DRLVELYSEYFQNQVALDILQIGWNGQSVAT 149 (336) T ss_pred CCccccCCCCCccEEEEeeeeeeccHHHHHHHhc------------Ch-hHHHHHHHHHHHHHHhcchhhhcccceeecc Confidence 001111233444444211 112223334421 12 112222222 334466666666677853 Q ss_pred -ccch-----hHHHHHHHhhccceEEeccC----------cCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHH Q lcl|NC_018271. 142 -TTGN-----LQGILPLLEADATVIDVVGA----------SGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIR 205 (305) Q Consensus 142 -s~~~-----fdG~lk~i~~d~~~~~~~~~----------~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d 205 (305) |++- =-||++++.+.+..-.+... .+.-...|.-+.+.++.+.||+.+|+.+.|+.+|..++.. T Consensus 150 ~TdnPllqDVNkGWlQ~~Re~a~~~v~~~~~~~~g~i~~~G~~gdy~NLDalV~D~~~~I~~~~~~d~dLVvivG~dLla 229 (336) T protein:vir:37 150 NTTKTDLSDVNKGWLKLLQEQRAANFMTESTKSSGKITIFGDNADYANLDDLAFDLKQGLDFRHQNRNDLVFLVGADLVS 229 (336) T ss_pred CCCCccccccchhHHHHHHhccchhhcccccccCCceEEecCCCCcccHHHHHHHHHhccchHHhcCCCeEEEEchhhhh Confidence 2211 26999999876553111111 0112368899999999999999999999999999987643 Q ss_pred HHHHHHhhhhccCCc-ccCC-------CcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhccccce Q lcl|NC_018271. 206 AIKRAYGTQARSNGT-FLNP-------NEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEIRIKDM 271 (305) Q Consensus 206 ~Y~d~~~~~~~k~~~-~t~~-------~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I~I~~~ 271 (305) ++|-..+.+..+ +++- -..++-|.+-+.++-+|++.|+.|.=+||-+-+ |.+ .++++.+++ T Consensus 230 ---~~~~~l~~~~~~~PtE~~Aa~~~~~~k~iGGlpa~~~PffP~~~~lVT~L~NLsIY~--Q~gs~RR~~~d~p~r~ri 304 (336) T protein:vir:37 230 ---KETKLIQQKHGLTPTEKAALGSHNLMGSFGGMNAITPPNFPARAAAVTTLKNLSVYT--EAESVRRSLRNDEDKKGL 304 (336) T ss_pred ---hhhhhhhhhcCCCHHHHHHHHHHHHHHhhCCceEEEccccCCCceEEeeccccEEEE--ecCcEEEEEEEccccccc Confidence 344333433222 2211 124789999999999999999999999996432 222 244455555 Q ss_pred -----eeeccceeEEEEEEeecceeeccCCeE Q lcl|NC_018271. 
272 -----GDVDLSGQIRTKMVLSAGVEYAYGAEI 298 (305) Q Consensus 272 -----~~~~~~~~~f~k~~m~~d~~i~fg~E~ 298 (305) .|-.|-.+.+-++-+=-..++.++.|+ T Consensus 305 e~y~s~Ne~YvVEd~~~~a~iE~i~v~~~~e~ 336 (336) T protein:vir:37 305 VTSYYRQEGYVVEDLGLMTAIDHTKVKLNGEV 336 (336) T ss_pred cchhhhcceeeeeccccEEEeeeeeeeccccC Confidence 355566666666666677888999999 No 112 >protein:vir:79157 Length: 339 # NCBI annotation: P2 family phage major capsid protein # Family: family:all:201 # MgeID: mge:1863 # MgeName: RSA1 # Cross-refs: genbank:acc:YP_001165257;genbank:gi:145708082;genbank:GeneID:5247168 Probab=94.70 E-value=0.0037 Score=33.91 Aligned_cols=271 Identities=12% Similarity=0.092 Sum_probs=138.4 Q ss_pred CceEeee---e--cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccch--hhhhccccCCCCCCCCccceEec Q lcl|NC_018271. 1 MATTVDI---T--TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRR--MNTTDDFVDYSCGFTPSGEVDIN 73 (305) Q Consensus 1 ma~~~~~---~--~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~--~~~~~~~q~~~~~~~~~G~~~~~ 73 (305) +|..... . -+=.+.+.-+|-..+..+.+..+. |.|+| ++.+.=-. +...-.+-+++ -|+.++.+.. T Consensus 16 ~A~~ngv~~~~~~FsV~P~v~q~L~~~i~ess~FL~~--INvv~---V~e~~Ge~v~lg~~g~iagrt--dt~~~~R~~~ 88 (339) T protein:vir:79 16 IAKLNGVERVDEKFSVAPSVQQKLETKVQESSDFLKS--INFYG---VPEQEGEKIGLGVSGPVASTT--DTTQQDRETS 88 (339) T ss_pred HHHHhCcccccceeeecHHHHHHHHHHHHHHHHHhcc--Ccccc---cccceeeEEeeccCcceeecc--cCCCCCcccc Confidence 2222211 1 133455555555556666666555 44421 11111000 11111111221 1223333322 Q ss_pred c-eeeeeeeeeEeec-cC---HHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc----- Q lcl|NC_018271. 74 E-KQLTLKKIKSDKE-VC---KEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT----- 143 (305) Q Consensus 74 ~-K~L~~~~~k~~~~-~~---P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~----- 143 (305) . 
-.|+...|.+.-+ || |=+-...|- +.+ .-|.-+-..+.+++|-+.=-..|+|-+-+ T Consensus 89 ~~~~l~~~~Y~c~qTn~dt~i~Y~~lD~WA----~~~---------dF~~r~~~~i~~~~ALD~i~IGfNGts~A~~Td~ 155 (339) T protein:vir:79 89 DISTMDGRRYRCEQTNSDTHITYQKLDAWA----KFA---------DFQTRIRDAIIKRQALDRIMIGFNGVSRAATSDR 155 (339) T ss_pred cccccCCCccEEEEeeeeceecHHHHHHHh----cCh---------hHHHHHHHHHHHHHhhccceecccceeeecCCCh Confidence 2 2455555555322 22 222233343 212 12233334555666666666667885321 Q ss_pred ------ch-hHHHHHHHhhccce------------EEeccCcCcCChhhHHHHHHHHHHh-ccHHHHhCCCcEEEecHHH Q lcl|NC_018271. 144 ------GN-LQGILPLLEADATV------------IDVVGASGGITAANVEAELGKFIDA-HTDEILQAPNHVFGVSTNV 203 (305) Q Consensus 144 ------~~-fdG~lk~i~~d~~~------------~~~~~~~~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~~f~S~~~ 203 (305) .| =-||++++.+.+.. +++.+.. -...|.-+.+.++.+. ||+.+|+++.|+.+|+.++ T Consensus 156 ~~nPllqDVN~GWlQ~~Re~ap~rV~~~g~~~s~~i~~~G~g--gdy~NLDalV~d~~~~lId~~~~~d~dLVvivG~dL 233 (339) T protein:vir:79 156 VANPMLQDVNKGWLQNLREQAPQRVMKEGKAAAGKITVGGAG--ADYGNLDALVYDITNHLVEPWYAEDPDLVVVCGRNL 233 (339) T ss_pred hhCcCccccchhHHHHHHhhhhhhhhccceeccceeEeccCC--CCcccHHHHHHHHHhccCChHHhcCCCEEEEEchhh Confidence 11 26999999774432 1121221 2468999999999975 7999999999999999887 Q ss_pred HH-HHHHHHhhhhccCCcccCCC----cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhcccccee Q lcl|NC_018271. 204 IR-AIKRAYGTQARSNGTFLNPN----EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEIRIKDMG 272 (305) Q Consensus 204 ~d-~Y~d~~~~~~~k~~~~t~~~----~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I~I~~~~ 272 (305) .. +|-.-+ ++..+..+..... ..++-|.+-+.++-+|++.|+.|.=+||-+= .|.+ .++++.+++. 
T Consensus 234 la~k~~~l~-n~~~~ptE~~Aa~~i~s~k~iGGl~a~~~PfFP~~~llVT~L~NLsIY--~Q~gs~RR~~~d~p~r~rie 310 (339) T protein:vir:79 234 LSDKYFPLV-NRDRDPVQQIAADLIISQKRIGNLPAIRVPYFPANGLLVTRLDNLSIY--YQEGGRRRTILDNAKRDRIE 310 (339) T ss_pred hhhHhhhHh-hcCCChHHHHHHHHHHHhhhhCCceeEEccccCCCceEEeechhcEEE--EecCcEEEEEEecccccccc Confidence 65 444444 2222221111111 2368899999999999999999999999532 2222 2445555553 Q ss_pred -----eeccceeEEEEEEeecceeeccCC Q lcl|NC_018271. 273 -----DVDLSGQIRTKMVLSAGVEYAYGA 296 (305) Q Consensus 273 -----~~~~~~~~f~k~~m~~d~~i~fg~ 296 (305) |-.|..+.+-++-+=-.+.|+.++ T Consensus 311 ~y~s~Ne~YvVEd~~~~a~iEni~~~~aa 339 (339) T protein:vir:79 311 NYESSNDAYVIEDLACAAMAENIALAAAA 339 (339) T ss_pred chhhccceeeeeccccEEEeeeeecccCC Confidence 444444433333333344444444 No 113 >protein:vir:80930 Length: 278 # NCBI annotation: Cps # Family: family:all:522 # MgeID: mge:1886 # MgeName: A500 # Cross-refs: genbank:acc:YP_001468392;genbank:gi:157324966;genbank:GeneID:5601363 Probab=94.24 E-value=0.005 Score=33.20 Aligned_cols=264 Identities=14% Similarity=0.079 Sum_probs=127.7 Q ss_pred CceEeee-e-----cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCC-CCCccceEec Q lcl|NC_018271. 1 MATTVDI-T-----TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCG-FTPSGEVDIN 73 (305) Q Consensus 1 ma~~~~~-~-----~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~-~~~~G~~~~~ 73 (305) ||+...- . .-|+..+.+++......++-...++. +.|.+.++..+|+.+.....+.++.+ --+-+..+.+ T Consensus 1 Ma~~~T~~~~~iiPev~s~~v~~~~~~~~v~~~~~~~~~~---l~g~~G~tv~ip~~~~~g~a~~~~~g~~i~~~~lt~~ 77 (278) T protein:vir:80 1 MADLTTKLANLIDPEVMGPMISAKLPKAIKFGKIAPIDNS---LEGQPGSEITVPKYKYIGDAQDVAEGAAIDYSALETE 77 (278) T ss_pred CCCcceehhheecHHHHHHHHHHHHHHhhhhcccceeccc---ccCCCCCEEEEeeeccCCcceeecCCCcCcccccccc Confidence 9974321 1 24555555555444444433222322 23444444444444333334544421 1112344555 Q ss_pred ceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHH Q lcl|NC_018271. 
74 EKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLL 153 (305) Q Consensus 74 ~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i 153 (305) +.++..++... .|--.|.. .....++ + .....+.++..++.++....+. .+ T Consensus 78 ~~~~~i~~~~~--a~~v~D~~-----~~~~~~d-------~--~~~~~~~~a~~~a~~~d~~l~~-------------~l 128 (278) T protein:vir:80 78 SVKHGIKKAGK--GVKLTDES-----VLSGYGD-------P--VEEAQKQIRMAIASKVDNDILE-------------EA 128 (278) T ss_pred eeeEeeehhhc--cccccHHH-----Hhhcccc-------H--HHHHHHHHHHHHHHHHHHHHHH-------------HH Confidence 55555544322 23333322 1112121 1 1233355555666666544331 11 Q ss_pred hhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhc---cCCc--ccCCCcce Q lcl|NC_018271. 154 EADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQAR---SNGT--FLNPNEFD 228 (305) Q Consensus 154 ~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~---k~~~--~t~~~~~~ 228 (305) +.... ++.++.+..+..++.+.|-+...++...--.. .-+++||+..|-.....-..++- .+.+ ..++-.+. T Consensus 129 ~~a~~--~~~~~~t~~~~~~~~~~~~da~~~l~~~~~~~-~~~ivv~p~~~~~L~k~~~~~~~~~~~~g~~~~~~G~ig~ 205 (278) T protein:vir:80 129 LTTTL--EVKGAINIGLIDKIENTFTDAPDAIEDESITT-TGVLFLNYKDTAKLREEAAGSWTKASQLGDDLLVKGAFGE 205 (278) T ss_pred hcccc--ccccccccchhhhHHHHHHHHHHhhcccCCCc-ccEEEECHHHHHHHHhhhhhhccccccccccceeecccee Confidence 11111 11222222334455666666666655431111 12689999988655432211221 1111 23444668 Q ss_pred ecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 229 FEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 229 ~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |.|++|+--..+|++.+++..++.+-++. 
..+.+ ++-+|-.+- ..=-+.-++-+++-+..++-+|+-|..| T Consensus 206 ~~G~~Vi~s~~~p~~t~~l~~~gAi~~~~--~~~~~-vE~~Rd~~~---~~d~i~~~~~yg~~v~~~~~~v~it~~a 276 (278) T protein:vir:80 206 LLGWEIVRTKKLADGNALAVKAGALKTFL--KRNLL-AESGRDMDH---KLTKFNADQHYAVALVDETKAVKVVPVA 276 (278) T ss_pred ecceeEEEcCCCCcceEEEEeccceeeee--cCCcc-cccccchhh---ccceeeeeeEEEEEEEcCcceEEEeecc Confidence 99999999999999999988887664332 23333 333332221 1122233344588888999999999999 No 114 >protein:vir:98856 Length: 343 # NCBI annotation: hypothetical protein # Family: family:all:201 # MgeID: mge:1495 # MgeName: F108 # Cross-refs: genbank:acc:YP_654732;genbank:gi:109302917;genbank:GeneID:4156061 Probab=92.88 E-value=0.0038 Score=33.84 Aligned_cols=276 Identities=10% Similarity=0.060 Sum_probs=134.4 Q ss_pred CceEeee---------ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccc-cCCCCCC-CCccc Q lcl|NC_018271. 1 MATTVDI---------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDF-VDYSCGF-TPSGE 69 (305) Q Consensus 1 ma~~~~~---------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~-q~~~~~~-~~~G~ 69 (305) +|-.... +-+=.+.+.-+|-..+..+.+.-++ |.|+| | ..+ .-..+.+. -+-.++= +..|. T Consensus 16 ~A~~ngv~~~~~~~~~~FsV~P~v~q~L~~~i~ess~FL~~--INvv~-V--~q~---~g~v~~~~~sg~~t~r~~t~~~ 87 (343) T protein:vir:98 16 AAEYYGANPALALAGKQFSIEAPKESVLLGAIQQRSNFLEK--INCVF-S--ERY---QRAIDLRSNRKRHYGAHDRRTP 87 (343) T ss_pred HHHHhCCccchhccCceeeecHHHHHHHHHHHHHHHHHhhc--Cceec-c--hhh---cceEEEeecCccccCccccCCC Confidence 2222211 1244566666666666667776666 66544 3 111 00110000 0000000 00111 Q ss_pred eEecce-eeeeeeeeEeec-c---CHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCC--- Q lcl|NC_018271. 70 VDINEK-QLTLKKIKSDKE-V---CKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDG--- 141 (305) Q Consensus 70 ~~~~~K-~L~~~~~k~~~~-~---~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~--- 141 (305) +.++ ..++..|.|.-+ | -|=+-...|-. 
.|-+-|.-+-..+.+++|-+.=...|+|-+ T Consensus 88 --~~~~~~~~~~~Y~c~qTn~dt~i~Y~~lD~WA~------------~~deF~~r~~~~i~~~~ALD~i~IGfNGts~A~ 153 (343) T protein:vir:98 88 --IQQRWTRQVMSMNVSRQIQACLIPWAKLDQWGH------------LKDKFASLYAEFVQNQIALDMIKIGFYGTSVGT 153 (343) T ss_pred --ccccccCCCCccEEEEeeeeeeccHHHHHHhhc------------ChhHHHHHHHHHHHHHHhhccceecccceeecc Confidence 1111 112222222111 1 11112223311 122223333344555566555556667743 Q ss_pred -ccch-----hHHHHHHHhhccce--E---------EeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHH Q lcl|NC_018271. 142 -TTGN-----LQGILPLLEADATV--I---------DVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVI 204 (305) Q Consensus 142 -s~~~-----fdG~lk~i~~d~~~--~---------~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~ 204 (305) |++- =-||++++.+.+.. + ++.+.. -+..|.-+.+.+..+.||+.+|+++.|+.+|..++. T Consensus 154 ~T~nPllqDVN~GWLQ~~Re~ap~rVm~~~~~~~~~~~~G~g--gdy~NLDalV~D~~~~I~~~~~~d~dLVvivG~dLl 231 (343) T protein:vir:98 154 DTSDPNLADVNKGWIQFVRENKATQILTQGATSGEIRLFGEG--ADYVNLDELAYDLKQGLDARHRDAGDLVFLVGADLV 231 (343) T ss_pred CCCCcchhhcchHHHHHHHhcchhhhhccceeccceeEecCC--CCcccHHHHHHHHHhcCchHHhcCCCEEEEEchhhh Confidence 3222 26999999875542 1 122222 246889999999999999999999999999998764 Q ss_pred H-HHHHHHhhhhccCCcccCC----CcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhcccccee- Q lcl|NC_018271. 205 R-AIKRAYGTQARSNGTFLNP----NEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEIRIKDMG- 272 (305) Q Consensus 205 d-~Y~d~~~~~~~k~~~~t~~----~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I~I~~~~- 272 (305) . +|---|.....+....... -..++-|.+-+.++-+|++.|+.|.=+||-+= .|.+ .++++.+++. 
T Consensus 232 a~~~~~l~n~~~~~ptEk~Aa~~~~~~k~iGGl~a~~~PfFP~~~llVT~L~NLsIY--~Q~gs~RR~~~d~p~r~rie~ 309 (343) T protein:vir:98 232 AKEASLVYKGNGLIATEKAALNTHDLMKSFGGMPAMIVPNMPPRAAIVTSLSNLSIY--TQEGSMRRGMKDDDDKKAVRD 309 (343) T ss_pred hhhhhhhhhhcCCChHHHHHHHHHHHHHhhCCCeeEEccccCCCceEEeeccccEEE--EecCcEEEEEEeccccccccc Confidence 3 3333332211111111111 12478899999999999999999999999532 2222 2445555553 Q ss_pred ----eeccceeEEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 273 ----DVDLSGQIRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 273 ----~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) |-.|-.+.+-++-+=-..++..++.-=.|- T Consensus 310 y~s~Ne~YvVEd~~~~a~iE~i~v~~~~~~g~w~ 343 (343) T protein:vir:98 310 SYYRNEAYAVEDCGKFMAVDFTKVKLSSGKGTWK 343 (343) T ss_pred hhhhcceeeeeccccEEEeeeeeeeecCCCCCCC Confidence 444444434333333333333333222333 No 115 >protein:vir:78935 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1860 # MgeName: LKD16 # Cross-refs: genbank:acc:YP_001522824;genbank:gi:158345059;genbank:GeneID:5687425 Probab=92.60 E-value=0.011 Score=31.38 Aligned_cols=263 Identities=13% Similarity=0.083 Sum_probs=125.3 Q ss_pred Cc--------------eEeee-ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCC Q lcl|NC_018271. 1 MA--------------TTVDI-TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFT 65 (305) Q Consensus 1 ma--------------~~~~~-~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~ 65 (305) |- ....+ ..-|.|||++.+.-..++ .++++++.--..+....+++..... +..+++-. T Consensus 1 ms~~~~~t~~~~~~s~~d~al~le~f~geV~~af~~~s~~------~~~~~~rti~~g~s~~~~~iG~~~~-~~~~pG~~ 73 (335) T protein:vir:78 1 MSFLNDLTRPNYAGKNADVDIHLEEHLGIVDKHFAYTSKF------APLMNIRDLRGSNVVRLDRLGNVEA-KGRRAGEE 73 (335) T ss_pred CCccccccccccccccchhhhhhhhhhhHHHHHHHHhhhh------ccccceeeeccceeEEEeeeeeeee-cccccCcc Confidence 22 22222 357888888887765443 3556666544455444554433322 44455555 Q ss_pred Cccce-EecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhcc----ccC Q lcl|NC_018271. 
66 PSGEV-DINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIW----QGD 140 (305) Q Consensus 66 ~~G~~-~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~----~GD 140 (305) ..|.. +-++++|..-.... .+-+-..-=.+|-. ..+-....++++..+|....+.++ ++- T Consensus 74 l~~~~~~~~k~~itID~ll~-----a~~~VddlDe~~~~----------yDvR~e~s~~~G~aLA~~~Dq~~~~~l~~aa 138 (335) T protein:vir:78 74 LERSRVVNDKWNLTVDTLLY-----LRHQFDHQDEWTQS----------FDMRKEVAELDGQELARKFDQACLIQVIKAA 138 (335) T ss_pred cCCCCcccCCeEEEecceee-----chhhHhhHHHhhcC----------chhHHHHHHHHHHHHHHHHHHHHHHHHHhhc Confidence 55544 33555666665552 11111111112222 122222345666677766666543 222 Q ss_pred Cc-------cchhHHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCC---CcEEEecHHHHHHHHHH Q lcl|NC_018271. 141 GT-------TGNLQGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAP---NHVFGVSTNVIRAIKRA 210 (305) Q Consensus 141 ~s-------~~~fdG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~---~l~~f~S~~~~d~Y~d~ 210 (305) .+ ....+|+.+. +.+++.+....+..+++.+.+....+.++-..+. ...++|++..|...... T Consensus 139 ~~~a~~~~~~~~~~G~~~~-------~~~tg~~~~~~~~~l~~a~~~a~~~l~ekdvP~~~~~~rv~vv~P~~y~~Ll~~ 211 (335) T protein:vir:78 139 AMDAPVDLEDAFSPGVLEK-------LDLTGLTAKEAAEKIVRMHRRVVETFIERDLGDAVYSEGLTPMSPRVFSLLLEH 211 (335) T ss_pred ccccccccCCCcCCCccee-------eeeccccccccHHHHHHHHHHHHHHHHhccCCCCCCCccEEEeChHHHHHHhcc Confidence 11 1123565332 2233332223455567777776666555433221 36899999999887654 Q ss_pred --Hhhh-hc-cC--CcccCCCcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEE Q lcl|NC_018271. 
211 --YGTQ-AR-SN--GTFLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKM 284 (305) Q Consensus 211 --~~~~-~~-k~--~~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~ 284 (305) +-++ ++ ++ .++.+++-...-|++|+...++|.+-+-...-+|- +|+.+= ++.+ --.-+|=++ T Consensus 212 ~~l~n~~~~~s~~~~~~~~g~v~~v~Gv~V~~Sn~lP~~~~t~~~lg~a---------~n~~~~-d~~~--~~~~~~~~~ 279 (335) T protein:vir:78 212 DKLMSVEYQATGATNDYVKSRVAILNGVKVLETPRFATKAISAHPLGRH---------FNVSAE-EAER--QIALFLPSK 279 (335) T ss_pred cccccccccccccccccccceeEEeeceEEEeeccCCCCCCcccccccc---------CCcccc-cccc--eEEEEEecc Confidence 4333 22 22 23455566689999999999999876444332222 232221 1112 112233333 Q ss_pred EeecceeeccCCeEEEecCCC Q lcl|NC_018271. 285 VLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 285 ~m~~d~~i~fg~E~v~~~~~~ 305 (305) .+.....+...-|+ .|.+-- T Consensus 280 Al~t~~~~~~~~e~-~~~~~~ 299 (335) T protein:vir:78 280 TLITAQVAPVQAKL-WEDHDQ 299 (335) T ss_pred eEEEEEEEecccce-eeccch Confidence 33333333333332 111111 No 116 >protein:vir:270 Length: 341 # NCBI annotation: putative major capsid protein # Family: family:all:201 # MgeID: mge:7 # MgeName: K139 # Cross-refs: genbank:acc:NP_536650;genbank:gi:17975128;genbank:GeneID:929084 Probab=92.54 E-value=0.011 Score=31.33 Aligned_cols=278 Identities=13% Similarity=0.104 Sum_probs=127.3 Q ss_pred CceEeeeecccchhHHHHHHHHhhccc--cchhcCceEEecCCCCcccccchh-------hhhccc-----cCCCCCCCC Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEA--NTISDNLIRVIPNVPENNLFLRRM-------NTTDDF-----VDYSCGFTP 66 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~--~~v~~g~I~v~~~v~~~~~~~~~~-------~~~~~~-----q~~~~~~~~ 66 (305) |+-+.+ +-+-+.+.+++...+..+ +-+.+ .-.|-|-|.-+ +...+ +..+++ ++..-++.. 
T Consensus 1 m~~~m~---~~tr~~~~~y~~~~A~~ngv~~~~~-~FsV~P~v~q~--L~~~i~ess~FL~~Invv~V~e~~Ge~v~lg~ 74 (341) T protein:vir:27 1 MSQILT---QSAREYMDNFAQQLAKSYGVSNVAE-LFNVSPQLETK--LRAAITESAEFLKMITVTTVDQIEGQVVDVGV 74 (341) T ss_pred Cccccc---HHHHHHHHHHHHHHHHHcCcccccc-eEeecHHHHHH--HHHHHHhhHHhhhcCccccccceeeeEeeccc Confidence 553333 223333444444432222 11221 13333333110 11111 111111 122222222 Q ss_pred ccce---Eecce-----eeeeeeeeEeec-cCHH-HHH--HHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_018271. 67 SGEV---DINEK-----QLTLKKIKSDKE-VCKE-DFR--QLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDK 134 (305) Q Consensus 67 ~G~~---~~~~K-----~L~~~~~k~~~~-~~P~-d~~--~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~ 134 (305) +|-+ +-++| .|+...|.+.-+ ||-. .|+ ..|-. . |. .+.-|.-+-..+.+++|-+.=. T Consensus 75 ~g~iagrtdt~R~~r~~~l~~~~Y~c~qtn~dt~i~y~~lDaWA~---~-g~------~~dF~~r~~~~i~~~~ALD~i~ 144 (341) T protein:vir:27 75 SGLYTGRKAGGRFTKQVGVGGHKYKLAETDSCAAITWAMLCQWAN---Q-GG------RDQFMKHLTEFSNQMFALDIMR 144 (341) T ss_pred ccceeeccCCCceecccccCCcceEEEEeeeeeeecHHHHHHHHh---c-CC------ChHHHHHHHHHHHHHHhhhhhh Confidence 2111 01111 122333333111 1000 011 12211 1 21 1222333445556666766666 Q ss_pred hccccCCcc-----------ch-hHHHHHHHhhccceEEeccC----cCcCChhhHHHHHHHHHHh-ccHHHHhCCCcEE Q lcl|NC_018271. 135 DIWQGDGTT-----------GN-LQGILPLLEADATVIDVVGA----SGGITAANVEAELGKFIDA-HTDEILQAPNHVF 197 (305) Q Consensus 135 ~~~~GD~s~-----------~~-fdG~lk~i~~d~~~~~~~~~----~~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~~ 197 (305) ..|+|-+-+ .| =-||++++.+.+..=.+.+. .+.-+..|.-+.+.+..+. 
||+.+|+++.|++ T Consensus 145 IGfnGts~A~~Td~~anPllqDVNkGWlQ~~Re~a~~rVl~~~~~~~g~~gdy~nLDAlV~D~~~~lI~~~~~~d~dLVv 224 (341) T protein:vir:27 145 IGWNGVSAEADTDPSANPLGQDVNEGWIAFVKNRKASQVVDVDVYFDETNGDYRTLDAMASDIINNQIHPMFRNDPRLTV 224 (341) T ss_pred hcccceeeccCCChhhcccccccchhHHHHHHhhcccceeccceeeccCCCccccHHHHHHHHHhcccChHHhcCCCEEE Confidence 777885421 11 26999999886653212211 1223468899999998875 7999999999999 Q ss_pred EecHHHHH-HHHHHHhhhhccCCcccCCC-----cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhh Q lcl|NC_018271. 198 GVSTNVIR-AIKRAYGTQARSNGTFLNPN-----EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNE 265 (305) Q Consensus 198 f~S~~~~d-~Y~d~~~~~~~k~~~~t~~~-----~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~ 265 (305) +|+.++.. +|---| ++....++.- ..++-|.+.+.++-+|++.|++|+=+||-+-+ |.+ .++ T Consensus 225 ivG~dLla~k~~~l~----n~~~~ptE~~Aa~~i~k~iGGlpa~~~PffP~~~~lVT~L~NLsIY~--Q~gs~RR~~~d~ 298 (341) T protein:vir:27 225 FVGSGLIGAAQAKLY----DKADKPSEQIAAQKLDKTIAGRPAYVPPFLPDNAMVVTIPENLQVLT--QHGTAQRKAKHE 298 (341) T ss_pred EEchhhhhhhhhhhh----ccCCCCHHHHHHHHHHHhhCCCeEEEccccCCCceEEeeccceEEEE--ecCcEEEEEEec Confidence 99976654 443333 2222222211 13788999999999999999999999996432 222 255 Q ss_pred ccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 
266 IRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 266 I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++.+++.+ |.+-|-.|=+=+++ .+.| --|--+-+| T Consensus 299 p~r~rie~--yes~YvVEdyg~~~-~~~~--~~vkl~~~~ 333 (341) T protein:vir:27 299 SDRKRSKT--HTGAWKVTQWVCWK-RSPL--TTQKKSTSA 333 (341) T ss_pred cccccccc--hhhhheeehhhhhh-hccc--cccccCccc Confidence 55666666 55544332211100 0000 001111122 No 117 >protein:vir:962 Length: 397 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:19 # MgeName: bIL285 # Cross-refs: genbank:acc:NP_076616;genbank:gi:13095724;genbank:GeneID:920264 Probab=92.40 E-value=0.012 Score=31.21 Aligned_cols=247 Identities=7% Similarity=0.044 Sum_probs=118.9 Q ss_pred CceEeeeecccc--hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCC------CC-ccceE Q lcl|NC_018271. 1 MATTVDITTNYV--GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGF------TP-SGEVD 71 (305) Q Consensus 1 ma~~~~~~~~Y~--Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~------~~-~G~~~ 71 (305) ++..-.....|. .+....+. ....-.++... ++++|- +......+...... -+..| .+ .++.+ T Consensus 132 ~~~~~~~~~~~~vp~~~~~~i~-~~~~~~~l~~~--~~~~~~-~~~~~~~~~~~~~~----~~~~~~~E~~~~~~~~~~~ 203 (397) T protein:vir:96 132 RDGFTSVEGGALIPQELLQPQL-EPKDIVDLSKY--VRSVPV-NSASGKFPVISKSG----SKMATVQQLEKNPQLANPK 203 (397) T ss_pred hhcccccccccchhHHHHHHHH-HhhhhhhHHHh--hhhccc-cccceeEEEEeccC----Ccccccccccccccccccc Confidence 222222222222 12222222 23333444333 443331 11111122222111 11222 22 24567 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) |.+..+.++++-+...++ +++ + ++-++..+.++...|+++++..++..+..|+++.. 
T Consensus 204 ~~~i~~~~~~~~~~~~~s-~el--------l-------~ds~~~l~~~i~~~l~~~~~~~~~~~i~~g~g~~~------- 260 (397) T protein:vir:96 204 MVEIDYSVATRRGYIPIS-QEM--------I-------DDASYDVTGLIADEIQDQSLNTKNADIAAVLKTAT------- 260 (397) T ss_pred ccceeecHhHhhcchhhH-HHH--------H-------hhhHHHHHHHHHHHHHHHHHHHHHHHHhhcccccc------- Confidence 777777777766655432 211 1 12245667788888889999888888888877521 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC---CcccCCCcce Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN---GTFLNPNEFD 228 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~---~~~t~~~~~~ 228 (305) .++.+|...+.+.+. ..++.. + +-.++||...|.+..- ++.-.|.+ .+.+++.... T Consensus 261 -------------~~~~~~~d~~~~~~~---~~~~~~-~---~a~~v~n~~~~~~l~~-lkd~~G~~~~~~~~~~~~~~~ 319 (397) T protein:vir:96 261 -------------AKSVVGVDGLKDLIN---KEIKKV-Y---DVKLFISASMYSELDK-LKDKNGRYLLQDSITAASGKQ 319 (397) T ss_pred -------------cccccchHHHHHHHH---Hhhhhh-c---CcEEEEcHHHHHHHHH-hhccCCCeEeccCccCCCccc Confidence 123344333333322 223332 2 3489999999866643 43222222 2344555667 Q ss_pred ecceeeeeccCC------CCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEec Q lcl|NC_018271. 229 FEGYTLTEIKGL------PASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYT 302 (305) Q Consensus 229 ~kGi~iv~l~~~------Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~ 302 (305) +.|.+++..... ++..++..+=++.+. +.|-..+.+..... ..... -+...+-+|+.+..++=||+=+ T Consensus 320 l~G~pv~~~~~~~~~~~~~~~~~~~gd~~~~~~----~~~~~~~~~~~~~~-~~~~~-~~~~~~r~d~~~~~~~a~~~~~ 393 (397) T protein:vir:96 320 LLGKEVVVLDDDVIGKSVGNVVGFIGDAKAFAS----FFDRKQVSVSWVDN-NIYGQ-LLAGIIRYDVKATDKKAGFYVT 393 (397) T ss_pred ccccceEEecccccCCCCCceEEEEeehhcceE----eEeecceEEEEecc-cccce-eEEEEEEEccEEecccceEEEE Confidence 999998865433 233466666555421 12223344432211 11122 2345567788887777777765 Q ss_pred -CCC Q lcl|NC_018271. 
303 -PAA 305 (305) Q Consensus 303 -~~~ 305 (305) ++| T Consensus 394 ~~~a 397 (397) T protein:vir:96 394 FTIG 397 (397) T ss_pred eecC Confidence 333 No 118 >protein:vir:100172 Length: 394 # NCBI annotation: putative major head protein # Family: family:all:21 # MgeID: mge:1524 # MgeName: phi AT3 # Cross-refs: genbank:acc:YP_025031;genbank:gi:48697264;genbank:GeneID:2948270 Probab=92.26 E-value=0.012 Score=31.08 Aligned_cols=254 Identities=9% Similarity=0.047 Sum_probs=124.7 Q ss_pred CceEeeeeccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCC-CC-CCCC-ccceEecce Q lcl|NC_018271. 1 MATTVDITTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDY-SC-GFTP-SGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~-~~-~~~~-~G~~~~~~K 75 (305) +.......+.| . -+...+|+.....-..+.+. ++++| |+......+..+.+..-... .+ +-.+ .+...|.+. T Consensus 111 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~~--~~~~~-~~~~~~~~~~~~~~~~~~~~~~E~~~~~~~~~~~~~~v 187 (394) T protein:vir:10 111 AGHVTSTEAGVLIPEEIIYDPTAEVNSVVDLSTL--VTKTP-VTTPKGTYPILKRATDRFSSVAELAENPALAEPEFEQV 187 (394) T ss_pred hcccccccCceeccHHHHHHHHHHHHhhhhhhhh--ceeee-ccCCceEEEEEecCCCccccccccccccccccccceeE Confidence 22221112223 1 13344455555444555433 55444 22221222222221111111 11 1122 245689999 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) .|.++++-+...++=.-++ +-.+..+.++.+.|.++++..++..++.|.++.. 
T Consensus 188 ~l~~~k~~~~~~iS~ell~----------------ds~~~l~~~i~~~la~~~~~~~~~~il~g~g~~~----------- 240 (394) T protein:vir:10 188 DWSVSTYRGAIPLSEEAIA----------------DSAVDLTSLVGQSINEKSVNTYNAMIAPVLQSFT----------- 240 (394) T ss_pred EeeeeeeEeeehhHHHHHh----------------hhhHHHHHHHHHHHHHHHHHHHHHHHhhcccccc----------- Confidence 9999999988765432111 2235667788899999999999988888876521 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHH-HhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC------cccCCC-cc Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFI-DAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG------TFLNPN-EF 227 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~-~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~------~~t~~~-~~ 227 (305) +...+.....+.+.++. ..++..+ +-.++||...|.+..- ++.-.|.+. ..+.++ .. T Consensus 241 ----------~~~~~~~~~~d~l~~~~~~~~~~~~----~a~~vmn~~~~~~l~~-lkd~~G~~i~~~~~~~~~~~~~~~ 305 (394) T protein:vir:10 241 ----------AKATTTDTLVDSLKHILNVDLDPAY----SRALVVTQSLFNTLDT-LKDKNGRYLLHDASDSITDGTAKG 305 (394) T ss_pred ----------cccccccccHHHHHHHHHhhhhhhc----cCEEEecHHHHHHHHH-hhccCCCeeeeccccccccCCccc Confidence 01111111233444443 4555553 2379999999888653 322222111 111222 34 Q ss_pred eecceeeeeccC--CC----CCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEE- Q lcl|NC_018271. 228 DFEGYTLTEIKG--LP----ASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVL- 300 (305) Q Consensus 228 ~~kGi~iv~l~~--~P----d~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~- 300 (305) .+.|++++.+.+ +| +..|+..+-++.++.++ -..+.+. +++-....+. +...+-+|+.+..++-|++ T Consensus 306 ~L~G~PV~~~~~~~~~~~~~~~~i~~gd~s~~~~~~~----~~~~~v~-~~~~~~~~~~-~~~~~r~d~~~~~~~ai~~~ 379 (394) T protein:vir:10 306 TVLGVPVYVVGDALLGSAAGDQKAFVGDLKRGVLFAD----RQQVTLA-WEDSKIYGRY-LGAAFRFGVKQADSNAGYFV 379 (394) T ss_pred ccccceeEEecccccCCCCCceEEEEeeccccEEEEe----ecceEEE-Eeccccccee-EEEEEEeccEEeccccEEEE Confidence 699999876643 23 22466666665432221 1223332 1111122222 2334566888877777666 Q ss_pred -ecCCC Q lcl|NC_018271. 
301 -YTPAA 305 (305) Q Consensus 301 -~~~~~ 305 (305) ++|++ T Consensus 380 ~~~~~~ 385 (394) T protein:vir:10 380 TNTDAA 385 (394) T ss_pred Eeeccc Confidence 55665 No 119 >protein:vir:3870 Length: 400 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:82 # MgeName: A2 # Cross-refs: genbank:acc:NP_680487;swissprot:trembl:q8ltc0;genbank:gi:22296527;interpro:IPR006444;uniprot:Q8LTC0;genbank:GeneID:951713 Probab=92.17 E-value=0.013 Score=31.01 Aligned_cols=251 Identities=10% Similarity=0.057 Sum_probs=127.0 Q ss_pred CceEee-eecccc-h-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcc-ccCCCC-CCCC-ccceEecc Q lcl|NC_018271. 1 MATTVD-ITTNYV-G-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDD-FVDYSC-GFTP-SGEVDINE 74 (305) Q Consensus 1 ma~~~~-~~~~Y~-G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~-~q~~~~-~~~~-~G~~~~~~ 74 (305) +...+. ....|. . +...+++..+..-..+.+ +++++|-=... ...|....... ...... +-.+ .++.+|.+ T Consensus 133 ~~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~--~~~~~~~~~~~-~~~~~~~~~~~~~~~~~E~~~~~~~~~~~f~~ 209 (400) T protein:vir:38 133 VNAGVKAADAASTIPETISNTPQRELQTVVDLKP--FTNVFQASTQK-GTYPTVANATTKMVTVAELEKNPAMAKPEFKP 209 (400) T ss_pred HhhcccccCCcccccHHHHHHHHHHHHhhhhhhh--cceeEeccCcc-eEEEEEecCCCcccccccccccccccccccee Confidence 111111 111221 1 223334433333333333 36665532111 12332222111 111111 1122 24668889 Q ss_pred eeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHh Q lcl|NC_018271. 75 KQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLE 154 (305) Q Consensus 75 K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~ 154 (305) ..|.++++-+...++-.=+. +-.+..+.++.+.|.++++..++..++.|.++.. 
T Consensus 210 i~~~~~k~~~~~~is~ell~----------------ds~~~~~~~i~~~l~~~~~~~~~~~i~~~~~~~~---------- 263 (400) T protein:vir:38 210 VNWSVETYRQALPVSQESID----------------DSAIDLVGLIAQNGQQIKVNTTNGAVATLLKGFT---------- 263 (400) T ss_pred eEeehhheeeehhhHHHHHh----------------hhHHHHHHHHHHHHHHHHHHHHHHhhhhcccccc---------- Confidence 99999888887766543221 1134566778888999998888888888776410 Q ss_pred hccceEEeccCcCcCChhhHHHHHHHHHH-hccHHHHhCCCcEEEecHHHHHHHHHHHhhhhcc---CCcccCCCcceec Q lcl|NC_018271. 155 ADATVIDVVGASGGITAANVEAELGKFID-AHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARS---NGTFLNPNEFDFE 230 (305) Q Consensus 155 ~d~~~~~~~~~~~~iT~anv~~~l~~~~~-~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k---~~~~t~~~~~~~k 230 (305) ..+..|. +.+.++.. .++.. .+-.++||...|.+.+. ++.-.|. ..+.+++....+. T Consensus 264 ----------~~~~~~~----~~~~~~~~~~~~~~----~~a~~v~~~~~~~~l~~-lkd~~G~~i~~~~~~~~~~~~l~ 324 (400) T protein:vir:38 264 ----------AKTISSV----DDLKHINNVDLDPA----YSRVIIASQSFYNFLDT-VKDGNGRYLLQDSILTPSGKSVL 324 (400) T ss_pred ----------ccccccH----HHHHHHHHhhhhhh----hCcEEEEcHHHHHHHHH-hhccCCCeeeecCcCCCCccccc Confidence 1122232 22333333 23322 23489999999988653 3322222 2233455556799 Q ss_pred ceeeeeccCCC-----CCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEE--EecC Q lcl|NC_018271. 231 GYTLTEIKGLP-----ASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIV--LYTP 303 (305) Q Consensus 231 Gi~iv~l~~~P-----d~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v--~~~~ 303 (305) |.+++....+| +..|+..+-++.+.+++- ..+.+... .-++... -|...+-+|+.+.-++-|| =+|| T Consensus 325 G~pv~~~~~~~~~~~g~~~~~~gd~s~~~~~~~~----~~~~~~~~-~~~~~~~-~~~~~~r~d~~~~~~~a~~~l~~~~ 398 (400) T protein:vir:38 325 GMPIAVVSDDTLGAAGEAHAFLGDIKRAILFANR----ADFMVRWV-DDQIYGQ-FLQAGMRFGVSVADEKAGYFLTYTP 398 (400) T ss_pred cceeEEecccccCCCCceEEEEEeccccEEEEee----cceEEEEe-cccccce-eEEEEEEeccEEecccceEEEEeec Confidence 99998887776 335677676765444322 12222211 1112222 1344566788888877754 4788 Q ss_pred CC Q lcl|NC_018271. 
304 AA 305 (305) Q Consensus 304 ~~ 305 (305) +| T Consensus 399 ~a 400 (400) T protein:vir:38 399 KA 400 (400) T ss_pred CC Confidence 88 No 120 >protein:vir:6324 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:132 # MgeName: phiKMV # Cross-refs: genbank:acc:NP_877471;genbank:gi:33300843;uniprot:Q7Y2D3;genbank:GeneID:1482613 Probab=91.95 E-value=0.013 Score=30.84 Aligned_cols=262 Identities=13% Similarity=0.090 Sum_probs=125.2 Q ss_pred CceEeee-ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcc-ccCCCCCCCCccce-Eecceee Q lcl|NC_018271. 1 MATTVDI-TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDD-FVDYSCGFTPSGEV-DINEKQL 77 (305) Q Consensus 1 ma~~~~~-~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~-~q~~~~~~~~~G~~-~~~~K~L 77 (305) -+...++ ..-|.|||+..+.-..++ .++.+++.--..+....+++ ++. ++.++.+-...|.. +-++++| T Consensus 15 s~~d~al~le~f~geV~~af~~~s~~------~~~~~~rti~~g~s~~~~~i--G~~~~~~~~pG~~l~~~~~~~~k~~i 86 (335) T protein:vir:63 15 KNADVDIHLEEHLGIVDKHFAYTSKF------APLMNIRDLRGSNVVRLDRL--GNVEAKGRRAGEELERSRVVNDKWNL 86 (335) T ss_pred ccchhheehhhhhhhHHHHHHhhhhh------ccccceeeeccceeEEEeee--eeeeeecccCCcCcCCCCccccceEE Confidence 2222222 357888888887765444 35566665444554434443 332 34444444444444 3344466 Q ss_pred eeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhcc----ccCCc-c-----chh- Q lcl|NC_018271. 78 TLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIW----QGDGT-T-----GNL- 146 (305) Q Consensus 78 ~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~----~GD~s-~-----~~f- 146 (305) ..-.... ++.++.+-++. 
.-+..+-....++++..+|+...+.++ ++-.+ + +-| T Consensus 87 tVD~ll~------------a~~~I~dlDe~---~~~yDvRse~s~e~G~aLA~~~D~~~~~~i~~aa~~~a~~~~~~~~~ 151 (335) T protein:vir:63 87 TVDTLLY------------LRHQFDHQDEW---TQSFDMRKEVAELDGQELARKFDQACLIQVIKAAAMDAPVDLEDAFS 151 (335) T ss_pred Eecceee------------chhhhhhHHHH---hcCchhHHHHHHHHHHHHHHHHHHHHHHHHHhhccccCccccCCCcC Confidence 6655542 11112221111 112222223446666677766665443 22211 0 111 Q ss_pred HHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCC---CcEEEecHHHHHHHHHH--Hhhh-hc-c-- Q lcl|NC_018271. 147 QGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAP---NHVFGVSTNVIRAIKRA--YGTQ-AR-S-- 217 (305) Q Consensus 147 dG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~---~l~~f~S~~~~d~Y~d~--~~~~-~~-k-- 217 (305) +|+.. .+.+++.+..-.+..+++.+.+....+.++--... ...++|++..|...... +-++ ++ + T Consensus 152 ~G~~~-------~~~~tg~~~~~~~~~l~~a~~~a~~~L~e~dVP~~~~~dr~~vv~P~~y~~Ll~~~~l~n~~~~~s~~ 224 (335) T protein:vir:63 152 PGVLE-------KLDLTGLTAKQAADKIVRMHRRVVETFIDRDLGDAVYSEGLTPMSPRVFSLLLEHDKLMNVEYQATGA 224 (335) T ss_pred CCcce-------eeeeccCcccccHHHHHHHHHHHHHHHHhccCCCcccCceEEEeChHHHHHHhccccccccccccccc Confidence 34322 22233322222355677777777777666533222 26899999999877554 4333 22 2 Q ss_pred CCcccCCCcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCe Q lcl|NC_018271. 
218 NGTFLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAE 297 (305) Q Consensus 218 ~~~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E 297 (305) ..++.+++-...-|++|+...++|.+-+-.++-+ +.+|+.+ .++.+ .-.-+|-+..+...-.+...-| T Consensus 225 ~~~~~~g~v~~v~Gv~V~~sn~lP~~~~t~~~lg---------~a~n~~~-~d~~~--~~~~~~~~~Al~t~~~~~vt~e 292 (335) T protein:vir:63 225 TNDYVKSRVAILNGVKVLETPRFATKAIAAHPLG---------RHFNVSA-EESER--QIALFLPSKTLITAQVAPVQAK 292 (335) T ss_pred cccccCceeEEeeceEEEeeccCCCCCccccccc---------ccCCccc-cccce--eEEEEEecceEEEEEEeecccc Confidence 2345566667899999999999998765544322 2234433 22222 2222333333333333333222 Q ss_pred EEEecCCC Q lcl|NC_018271. 298 IVLYTPAA 305 (305) Q Consensus 298 ~v~~~~~~ 305 (305) + .|++-- T Consensus 293 ~-~~~~~~ 299 (335) T protein:vir:63 293 L-WEDNEK 299 (335) T ss_pred e-eeccch Confidence 1 111111 No 121 >protein:vir:100057 Length: 375 # NCBI annotation: T7-like capsid protein # Family: family:all:975 # MgeID: mge:1604 # MgeName: P-SSP7 # Cross-refs: genbank:acc:YP_214206;genbank:gi:61806429;genbank:GeneID:3294737 Probab=91.81 E-value=0.0063 Score=32.63 Aligned_cols=271 Identities=13% Similarity=0.095 Sum_probs=123.8 Q ss_pred Cc----------eEee------------e-ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccc Q lcl|NC_018271. 1 MA----------TTVD------------I-TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDF 57 (305) Q Consensus 1 ma----------~~~~------------~-~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~ 57 (305) |+ ++.+ + ..-|.|||+..+--. .+-.++++++.--..++...+++.... 
+ T Consensus 1 ~~~~~~~~~~~~n~~t~~~~~~~~~~~al~le~f~geV~~~f~~~------si~~~~~~~rti~~Gksv~f~~iG~~t-~ 73 (375) T protein:vir:10 1 MANANQVALGRSNLSTGTGYGGATDKYALYLKLFSGEMFKGFQHE------TIARDLVTKRTLKNGKSLQFIYTGRMT-S 73 (375) T ss_pred CccccccccCccccCCccccccccchHHHHHHHHhHHHHHHHHHH------HhhhccccccccccCceEEEEeeeeeE-E Confidence 21 1111 1 235677775554433 233455666544445544455443322 2 Q ss_pred cCCCCC--CCCcc--ceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhh Q lcl|NC_018271. 58 VDYSCG--FTPSG--EVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKID 133 (305) Q Consensus 58 q~~~~~--~~~~G--~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~ 133 (305) +.++.+ +.+.. +.+-++++|..-+.+.+ ..++.+-++ -+.+..+.....++.+..+++... T Consensus 74 ~~~t~G~~i~~~~~~d~~~te~~l~ID~~~y~------------~~~VdDiD~---aqa~~Dlr~e~s~~~G~aLA~~~D 138 (375) T protein:vir:10 74 SFHTPGTPILGNADKAPPVAEKTIVMDDLLIS------------SAFVYDLDE---TLAHYELRGEISKKIGYALAEKYD 138 (375) T ss_pred eeecCCcCcCCccccCCCCCceEEEecchhhh------------hhhHhhHHH---HhcCchhHHHHHHHHHHHHHHHHH Confidence 444432 33322 33456777766655432 222222111 123333444455566666666655 Q ss_pred hhcc----ccCCc-------cchhHHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHH Q lcl|NC_018271. 134 KDIW----QGDGT-------TGNLQGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTN 202 (305) Q Consensus 134 ~~~~----~GD~s-------~~~fdG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~ 202 (305) +.+. +|-.+ ...+.|...+..... ......+|+.++.+.+.+....+.++--..-+-.+.||+. 
T Consensus 139 ~~i~~~l~kaa~~~~p~~~~~~~~~Gg~~i~~~sg-----~~~~~~~ta~~~~~ai~~a~~~Lde~~VP~~~R~~vv~P~ 213 (375) T protein:vir:10 139 RLIFRSITRGARSASPVSATNFVEPGGTQIRVGSG-----TNESDAFTASALVNAFYDAAAAMDEKGVSSQGRCAVLNPR 213 (375) T ss_pred HHHHHHHHHhhhhccccccccccccCcceeeeccc-----cccccccCHHHHHHHHHHHHHHHhhcCCCCCCCEEEeChH Confidence 4443 22111 111233222111110 1112346788999999999888777644434578999999 Q ss_pred HHHHHHHH-----HhhhhccCCccc-CCCcceecceeeeeccCCCCCeEEE---ecchHHhhhhhhhhhhhhccccceee Q lcl|NC_018271. 203 VIRAIKRA-----YGTQARSNGTFL-NPNEFDFEGYTLTEIKGLPASRMVG---YNRDNIVIGMSAQSDFNEIRIKDMGD 273 (305) Q Consensus 203 ~~d~Y~d~-----~~~~~~k~~~~t-~~~~~~~kGi~iv~l~~~Pd~~ii~---T~~sNl~~gvnl~~D~n~I~I~~~~~ 273 (305) .|..-... +.++...+.... ++.-....|+.|+...++|...+-. ....|-....++.+-..-+...-..+ T Consensus 214 ~y~~Ll~~~d~~~~~n~d~~~~~~~~~g~v~~i~Gv~V~~Sn~lP~~~~~~~~~g~~~~~~a~~~~~~~~~~~~~~~~~~ 293 (375) T protein:vir:10 214 QYYALIQDIGSNGLVNRDVQGSALQSGNGVIEIAGIHIYKSMNIPFLGKYGVKYGGTTGETSPGNLGSHIGPTPENANAT 293 (375) T ss_pred HHHHHHhcCCccceeeecccccceeccceEEEEeceEEEEeccccccccccccccccccccchhhhhccccccCCcceee Confidence 98777543 322222121112 2223468899999999999664321 11112222222222222222222222 Q ss_pred eccceeEEEEEEeecceeeccC-CeEEEecCCC Q lcl|NC_018271. 274 VDLSGQIRTKMVLSAGVEYAYG-AEIVLYTPAA 305 (305) Q Consensus 274 ~~~~~~~f~k~~m~~d~~i~fg-~E~v~~~~~~ 305 (305) -.-..+|.- |+.. ++ -=-++|+|+| T Consensus 294 ~g~~~~y~~------d~~~-~~~~~~~~~~~~A 319 (375) T protein:vir:10 294 GGVNNDYGT------NAEL-GAKSCGLIFQKEA 319 (375) T ss_pred ccccccccc------cccc-cCceEEEEEchhh Confidence 111123433 2211 11 2237788888 No 122 >protein:vir:3746 Length: 336 # NCBI annotation: orf15 # Family: family:all:201 # MgeID: mge:79 # MgeName: HP1 # Cross-refs: genbank:acc:NP_043487;genbank:gi:9628622;genbank:GeneID:1261135 Probab=91.71 E-value=0.015 Score=30.65 Aligned_cols=270 Identities=13% Similarity=0.090 Sum_probs=141.6 Q ss_pred CceEeee---------ecccchhHHHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCCCCCCCccc Q lcl|NC_018271. 
1 MATTVDI---------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYSCGFTPSGE 69 (305) Q Consensus 1 ma~~~~~---------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~~~~~~~G~ 69 (305) +|..... +-+=++.+.-+|-..+..+.+.-++ |.|+| .++...+ . +...-.+-+++.. | T Consensus 13 ~A~~ngv~~a~~~~~~~Fsv~P~v~q~L~~~i~ess~FL~~--INvv~V~e~~Ge~v--~-lg~~g~iagrtdt----~- 82 (336) T protein:vir:37 13 LAKHFNQPLDSVLRGESFALKAPEAALLGENIQQRSDFLKQ--INMIQVAHTKGQKL--F-GATEKGVTGRKQT----G- 82 (336) T ss_pred HHHHhCCChhhhccCceeecCHHHHHHHHHHHHHHHHHhhc--CceeecccccceEe--e-eccCcccccccCC----C- Confidence 2222211 1234455666666666666666666 44432 1111100 0 1111111111110 1 Q ss_pred eEecceeeeeeeeeEeec----cCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHH-HHHHHHhhhhhhccccCC--- Q lcl|NC_018271. 70 VDINEKQLTLKKIKSDKE----VCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTD-MGNRLARKIDKDIWQGDG--- 141 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~----~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~-l~~~ia~ei~~~~~~GD~--- 141 (305) ..-..-.|+...|.|.-+ .-|=+-...|-. .| ..|...... +.+++|-+.=-..|+|-+ T Consensus 83 R~~~~~~l~~~~Y~c~qTn~dt~i~y~~LD~WA~------------~~-df~~~~~~~~~~r~iALD~i~IGfnG~s~A~ 149 (336) T protein:vir:37 83 RNLANLDHTQNGFELAETDSGIIVPWALFDSFAI------------FK-DRLVELYSEYFQNQVALDILQIGWNGQSVAD 149 (336) T ss_pred ccccccCcCCcccEEEEeeeeeeecHHHHHHHhc------------Ch-hHHHHHHHHHHHHHHhhchhhhcccceeecc Confidence 100111233333333111 112222334421 11 112222222 333456666666677743 Q ss_pred -ccch-----hHHHHHHHhhccceEEeccC----------cCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHH Q lcl|NC_018271. 142 -TTGN-----LQGILPLLEADATVIDVVGA----------SGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIR 205 (305) Q Consensus 142 -s~~~-----fdG~lk~i~~d~~~~~~~~~----------~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d 205 (305) |++- =-||++++.+.+..-.+... .+.-...|.-+.+.++.+.||+.+|+.+.|+.+|..++.. 
T Consensus 150 ~TdnPllqDVNkGWlQ~~Re~a~~~v~~~~~~~~g~i~~~G~~gdy~NLDalV~D~~~~I~~~~~~d~dLVvivG~dLla 229 (336) T protein:vir:37 150 NTTKADLSDVNKGWLKLLQEQRAANFMTESTKSSGKITIFGDNADYANLDDLAFDLKQGLDFRHQNRNDLVFLVGADLVS 229 (336) T ss_pred CCCCCcccccchhHHHHHHhccchhhcccccccCCceEEecCCCCcccHHHHHHHHHhcCchHHhcCCCeEEEEchhhhh Confidence 2221 26999999876553211111 0112368899999999999999999999999999987643 Q ss_pred HHHHHHhhhhccCCc-ccCC-------CcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhccccce Q lcl|NC_018271. 206 AIKRAYGTQARSNGT-FLNP-------NEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEIRIKDM 271 (305) Q Consensus 206 ~Y~d~~~~~~~k~~~-~t~~-------~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I~I~~~ 271 (305) ++|-..+.+..+ +++- -..++-|.+-+.++-+|++.|+.|.=+||-+-+ |.+ .++++.+++ T Consensus 230 ---~~~~~l~~~~~~~PtE~~Aa~~~~~~k~iGGlpa~~~PffP~~~~lVT~L~NLsIY~--Q~gs~RR~~~d~p~r~ri 304 (336) T protein:vir:37 230 ---KETKLIQQKHGLTPTEKAALGSHNLMGSFGGMNAITPPNFPARAAAVTTLKNLSVYT--EAESVRRSLRNDEDKKGL 304 (336) T ss_pred ---hhhhhhhhhcCCCHHHHHHHHHHHHHHhhCCceeEEccccCCCceEEeechhcEEEE--ecCcEEEEEEEccccccc Confidence 334333433222 2211 124789999999999999999999999996432 222 244555555 Q ss_pred -----eeeccceeEEEEEEeecceeeccCCeE Q lcl|NC_018271. 272 -----GDVDLSGQIRTKMVLSAGVEYAYGAEI 298 (305) Q Consensus 272 -----~~~~~~~~~f~k~~m~~d~~i~fg~E~ 298 (305) .|-.|-.+.+-++-+=-..++.++.|+ T Consensus 305 e~y~s~Ne~YvVEd~~~~a~iE~i~v~~~~e~ 336 (336) T protein:vir:37 305 VTSYYRQEGYVVEDLGLMTAIDHTKVKLNGEV 336 (336) T ss_pred cchhhhcceeeeeccccEEEeeeeeeeecCcC Confidence 355566666666666667888999999 No 123 >protein:vir:79712 Length: 285 # NCBI annotation: major capsid protein gp34 # Family: family:all:701 # MgeID: mge:1873 # MgeName: LL-H # Cross-refs: genbank:acc:YP_001285883;genbank:gi:148750840;genbank:GeneID:5220414 Probab=91.32 E-value=0.016 Score=30.37 Aligned_cols=261 Identities=14% Similarity=0.122 Sum_probs=116.3 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhc--CceEEecCCCCcccccchhhhhccccCCC--CCCCCccce--Eecc Q lcl|NC_018271. 
1 MATTVDITTNYVGEVAGGYFLEMVKEANTISD--NLIRVIPNVPENNLFLRRMNTTDDFVDYS--CGFTPSGEV--DINE 74 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~--g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~~~~~G~~--~~~~ 74 (305) || +|-.+-|.+...+.+.....++.-+.+. |.++ ++ -.++..+|++.+..++.+|+ .+|+. |++ .|.. T Consensus 1 Ma--in~~~k~~~~ld~~~~~~~~~~~l~~~~n~~~~~--~~-gak~VkIp~ist~~gl~dY~R~~g~~~-g~v~~~~et 74 (285) T protein:vir:79 1 MT--VVLDSKDLARIDEEYKADSQVWSYLTGGNGVTQR--FR-GHNEVRINKLSGFVDATAYKRGQDNAR-KTISVGKET 74 (285) T ss_pred Cc--chhhHHHHHHHHHHHHHhhhhhhhcccCCcceeE--ec-CCCEEEEeeecccccccccccccCccc-cccceeeeE Confidence 88 4446789999998888776666544332 2222 22 24666788887777787775 45533 444 4444 Q ss_pred eeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHh Q lcl|NC_018271. 75 KQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLE 154 (305) Q Consensus 75 K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~ 154 (305) ++|.-.|.- .|.+.|-|..+ + +... ++ .+-.|.........-++--|.-+ +. T Consensus 75 ~tl~~DR~~-~f~iD~mDvdE-------------------n-~~~~---~~-ni~~ef~~~~vvPEiDayrfskl---a~ 126 (285) T protein:vir:79 75 VKLTHEDWF-GYDLDQFDMDE-------------------N-GAYT---VE-NVVREHNKMITIPHRDKVAVQKL---FD 126 (285) T ss_pred EEeeccccc-eecccccchhh-------------------h-hhhh---HH-HHHHHHHhhhhcchhhHHHHHHH---Hh Confidence 444443321 12222222211 0 1111 11 11111111111222122223322 22 Q ss_pred hccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccC--CcccCCC----cce Q lcl|NC_018271. 155 ADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSN--GTFLNPN----EFD 228 (305) Q Consensus 155 ~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~--~~~t~~~----~~~ 228 (305) ..+. . .++++|++|+++++++...++.+. .-..++.+|||+..+...+.+-.-.-... .+...++ -.. 
T Consensus 127 ~a~~---~--~~~~~T~~nv~~~i~~~~~~lde~-~vp~~rvl~vTp~~~~~Lk~s~~~~r~~~~~~~~~~~~i~~~V~~ 200 (285) T protein:vir:79 127 SAAK---K--ATDSITKDNALDAYDTAEAYMFDN-EVPGGFVMFVSSAYYTALKQSAAVTRTFSTDGTMVINGIDRRVAQ 200 (285) T ss_pred hccc---c--cccccCHHHHHHHHHHHHHHHHHc-CCCCceEEEEChHHHHHHHhhhhhheecccccceeccceeeeecc Confidence 2222 1 234689999999999999888775 32256899999999998876643111111 1111222 235 Q ss_pred ecc-eeeeeccCCCCCeEEEe---cchHHhhhhh----hhhhhhhccccce-eeeccceeEEEEEEeecceeecc-CCeE Q lcl|NC_018271. 229 FEG-YTLTEIKGLPASRMVGY---NRDNIVIGMS----AQSDFNEIRIKDM-GDVDLSGQIRTKMVLSAGVEYAY-GAEI 298 (305) Q Consensus 229 ~kG-i~iv~l~~~Pd~~ii~T---~~sNl~~gvn----l~~D~n~I~I~~~-~~~~~~~~~f~k~~m~~d~~i~f-g~E~ 298 (305) +.| ++|+.+ |++++=.. ..=|+++--. ...=.+.+++-+= ++. -...|..+...=.|.-+.= -..- T Consensus 201 lDg~v~ii~V---ps~r~kt~~~~k~Infiiv~~~a~i~~~K~~~~~~f~P~~~~-~~d~~~~~~R~Y~d~fv~~nk~~~ 276 (285) T protein:vir:79 201 LDGGVPIVRV---SSDRLKGLGITNHVNFILTPLSAIAPIVKYDSVSVIDPSTDR-SGNRWTIKGLSYYDAIVLDNAKKG 276 (285) T ss_pred ccceeEEEEc---chhhccCcCcchhccEEEecCceeccceeeeeeEeECCCCCC-Ccceeeeeeeeeeeeeehhhccce Confidence 777 777776 55554110 0012211000 0000112221111 110 1112333333333333211 1111 Q ss_pred EEecCCC Q lcl|NC_018271. 299 VLYTPAA 305 (305) Q Consensus 299 v~~~~~~ 305 (305) ++-+..| T Consensus 277 Iy~~~~a 283 (285) T protein:vir:79 277 IYVAATA 283 (285) T ss_pred eeeeecc Confidence 2222222 No 124 >protein:vir:94576 Length: 347 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1516 # MgeName: Berlin # Cross-refs: genbank:acc:YP_919012;genbank:gi:119637776;genbank:GeneID:5179336 Probab=90.95 E-value=0.01 Score=31.54 Aligned_cols=256 Identities=15% Similarity=0.102 Sum_probs=115.3 Q ss_pred CceEeeee--------cc------------cchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCC Q lcl|NC_018271. 
1 MATTVDIT--------TN------------YVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDY 60 (305) Q Consensus 1 ma~~~~~~--------~~------------Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~ 60 (305) ||++...+ .+ |.|||+..+.-. .+-.++++++.--+.++..++++..... +.+ T Consensus 1 ma~~~~~~~~~t~~g~~~~~~d~~al~ie~~~geV~~~f~~~------s~~~~~~~~rti~~G~sv~~~~iG~~~~-~~~ 73 (347) T protein:vir:94 1 MANMNGGQQMGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRT------SVTMNKHLVRSIQSGKSAQFPVLGRTKA-AYL 73 (347) T ss_pred CCccccccccccccccCCcccchHHHHHHHHhHHHHHHHHHH------HhhhhhhhheeccccceEEeeeccceeE-eee Confidence 88665443 12 344443333222 2333445544222244444454443333 333 Q ss_pred CCCCCCc---cceEecceeeeeeeeeEeeccCHHHHHHHHHH----HhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhh Q lcl|NC_018271. 61 SCGFTPS---GEVDINEKQLTLKKIKSDKEVCKEDFRQLWTA----AEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKID 133 (305) Q Consensus 61 ~~~~~~~---G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~----~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~ 133 (305) +.+-+.. .+.+-++++|..... .|.+.+-. +|.. .++ + ....+..+..++.... T Consensus 74 ~~G~~l~~~~~~~~~~e~~ltID~~---------~y~~~~VddiD~~q~~-~D~--r-------s~~~~~~g~ALA~~~D 134 (347) T protein:vir:94 74 QPGENLDDKRKDMKHTEKTINIDGL---------LTADVLIYDIEDAMNH-YDV--R-------SEYTAQLGESLAMAAD 134 (347) T ss_pred ecCcCCCCCcCCccccceEEEEcch---------hhhhhhhhhHHHHhcC-cch--H-------HHHHHHHHHHHHHHHH Confidence 3333222 234556655554443 33333222 3333 211 1 2233555555665555 Q ss_pred hhcc----cc-CC---ccchhHHHHHHHhhccceEEecc-----CcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEec Q lcl|NC_018271. 134 KDIW----QG-DG---TTGNLQGILPLLEADATVIDVVG-----ASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVS 200 (305) Q Consensus 134 ~~~~----~G-D~---s~~~fdG~lk~i~~d~~~~~~~~-----~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S 200 (305) +.++ .+ +. +...+.|.-. .-.+++.. 
.+...++.++++.|.+....+.++--..-+..++++ T Consensus 135 ~~i~~~l~~~a~~~~~~~~~~~g~~~-----~~~v~i~~~~~~~~~~~~~~~~~~d~i~~a~~~Lde~dVP~~~R~~vv~ 209 (347) T protein:vir:94 135 GAVLAEMAKLCNLPTANNENIAGLGK-----AHVLEVGDQATLQGDQVKLGQAIIAQLTLARAKLTGNYVPSSDRVFYTT 209 (347) T ss_pred HHHHHHHHHhhccccccccccccCCc-----ceeEeeeccccccccccccHHHHHHHHHHHHHHhhhcCCCCCCCEEEeC Confidence 4432 11 11 1111111000 00111111 122345677788888777776664222235788899 Q ss_pred HHHHHHHHHHHhhhhcc---CCcccCCCcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccc Q lcl|NC_018271. 201 TNVIRAIKRAYGTQARS---NGTFLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLS 277 (305) Q Consensus 201 ~~~~d~Y~d~~~~~~~k---~~~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~ 277 (305) +..|............. ..+..++....+.|++|+...++|.+.+..+.-. .|+-..+..+... .+-+ T Consensus 210 P~~y~~LLk~~~~~~~~~~~~~~~~~G~V~~v~G~~V~~Sn~~p~~~~~~~~~~---~~~~~~~~~~~~~------~~~~ 280 (347) T protein:vir:94 210 PDNYSAILAALMPNAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGDNRAE---EGVAPTNQKHAFP------DTAS 280 (347) T ss_pred hHHHHHHHHhhcccccccccccccccceeEEeeceEEEEcCccccccCcccccc---ccccccccccccc------cccc Confidence 98887765433222211 1122344456799999999999998776433321 1122222222222 2234 Q ss_pred eeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 
278 GQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 278 ~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++|+- |+.- .=-|+|+|+| T Consensus 281 ~~y~~------d~~~---~~~l~~~~~A 299 (347) T protein:vir:94 281 GDTRV------ALDN---VVGLFNHRSA 299 (347) T ss_pred ccccc------cccc---eEEEEechhh Confidence 44543 3222 1257788888 No 125 >protein:vir:1153 Length: 338 # NCBI annotation: predicted major capsid protein # Family: family:all:201 # MgeID: mge:24 # MgeName: phi CTX # Cross-refs: genbank:acc:NP_490602;genbank:gi:17313222;genbank:GeneID:927319 Probab=90.14 E-value=0.022 Score=29.62 Aligned_cols=271 Identities=11% Similarity=0.070 Sum_probs=135.4 Q ss_pred CceEeee-----ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccch--hhhhccccCCCCCCCCccceEec Q lcl|NC_018271. 1 MATTVDI-----TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRR--MNTTDDFVDYSCGFTPSGEVDIN 73 (305) Q Consensus 1 ma~~~~~-----~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~--~~~~~~~q~~~~~~~~~G~~~~~ 73 (305) +|..... +-+=++.+.-++-..+..+.+.-++ |.|+| | ..+.=-. +...-.+-+++ -|+.++-.+. T Consensus 16 ~A~~ngv~~~~~~FsV~P~v~q~L~~~i~ess~FL~~--Invv~-V--~e~~Ge~v~lg~~g~iagrt--dT~~~~~R~~ 88 (338) T protein:vir:11 16 LAKLNGVNSAVQTFAVEPSVQQKLEQRIQESSEFLKQ--INVYG-V--DELQGEKIGIGVSGTIASRT--DTTGDGVRKP 88 (338) T ss_pred HHHHhCCCcccceeeeCHHHHHHHHHHHHHHHHhhcc--Cceec-c--cceeeeEeeeccCccccccc--cCCCCCcccc Confidence 3322211 1133455555555556666666555 44422 1 1110000 11111111221 1111111111 Q ss_pred --ceeeeeeeeeEeec----cCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc---- Q lcl|NC_018271. 
74 --EKQLTLKKIKSDKE----VCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT---- 143 (305) Q Consensus 74 --~K~L~~~~~k~~~~----~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~---- 143 (305) --.|+...|.|.-+ .-|=+-...|- +.+ .-|.-+-..+.+++|-+.=-..|+|-+-+ T Consensus 89 ~~~~~l~~~~Y~c~qtn~dt~i~y~~LD~WA----~~~---------dF~~r~~~~i~k~~ALD~i~IGfnG~s~A~~Td 155 (338) T protein:vir:11 89 RDVSALDNQRYECKHTDFDTAITYAMLDAWA----KFP---------EFQALLRDAILKRQALDRLMIGFNGTSAAATTN 155 (338) T ss_pred ccccccCCCccEEEEeeeeeeecHHHHHHHh----cCh---------hHHHHHHHHHHHHHhhchhhhcccceeeccCCC Confidence 11234444444221 11222233343 212 12334445556667766666777885421 Q ss_pred -------ch-hHHHHHHHhhccceEEe----------ccCcCcCChhhHHHHHHHHHHh-ccHHHHhCCCcEEEecHHHH Q lcl|NC_018271. 144 -------GN-LQGILPLLEADATVIDV----------VGASGGITAANVEAELGKFIDA-HTDEILQAPNHVFGVSTNVI 204 (305) Q Consensus 144 -------~~-fdG~lk~i~~d~~~~~~----------~~~~~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~~f~S~~~~ 204 (305) .| =-||++++.+.+..-.+ .+..+.-...|.-+.+.+..+. ||+.+|+++.|+.+|+.++. T Consensus 156 ~~~nPllqDVNkGWlQ~~Re~ap~rv~~~~~~~~~i~i~~g~~gdy~nLDalV~d~~~~lI~~~~~~d~dLVvivG~dLl 235 (338) T protein:vir:11 156 RAANPLLQDVNIGWFQQYRNNAPARVLKEGKTTGKVVVGNGADADYKNLDALVFDVVSSLIDPWHRRDPGLVVILGRELV 235 (338) T ss_pred hhhCcCccccchhHHHHHHhhhhhhhhhcccccceeeecCCCCCccccHHHHHHHHHhccCChHHhcCCCEEEEEchhhh Confidence 11 26999999875443111 1111112368899999999975 79999999999999998764 Q ss_pred H-HHHHHHhhhhccCCcccCCC----cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhcccccee- Q lcl|NC_018271. 205 R-AIKRAYGTQARSNGTFLNPN----EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEIRIKDMG- 272 (305) Q Consensus 205 d-~Y~d~~~~~~~k~~~~t~~~----~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I~I~~~~- 272 (305) . +|--.+. +..+..+..... ..++-|.+-+.++-+|++.|+.|.=+||-+-+ |.+ .++++.+++. 
T Consensus 236 adk~~~l~n-~~~~ptE~~Aa~~~~s~k~iGGlpa~~~PffP~~~~lVT~L~NLsIY~--Q~gs~RR~~~d~p~r~rie~ 312 (338) T protein:vir:11 236 HDKYFPMVN-KDQPATEKIATDLILSQKRMGGLPPVEVPYVPEKGLMVTTLKNLSLYW--QIGGRRRYLKEVPEKNRIEN 312 (338) T ss_pred HHHHhHHHh-cCCChHHHHHHHHHHHhhhhCCceeEEccccCCCceEEeeccccEEEE--ecCcEEEEEEeccccccccc Confidence 3 4444442 221211111111 24789999999999999999999999996442 222 2445555553 Q ss_pred ----eeccceeEEEEEEeecceeeccCC Q lcl|NC_018271. 273 ----DVDLSGQIRTKMVLSAGVEYAYGA 296 (305) Q Consensus 273 ----~~~~~~~~f~k~~m~~d~~i~fg~ 296 (305) |-.|-.+.+-++-+= + +|.+++ T Consensus 313 y~s~Ne~YvVEd~~~~a~i-e-ni~~~~ 338 (338) T protein:vir:11 313 YESSNDAYVVEDYGLGCLV-E-NIEVAE 338 (338) T ss_pred hhhhccceeeeccccEEEe-e-cceecC Confidence 444444433322221 2 555566 No 126 >protein:vir:9704 Length: 394 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:174 # MgeName: 315.2 # Cross-refs: genbank:acc:NP_795466;genbank:gi:28876225;genbank:GeneID:1257769 Probab=89.89 E-value=0.024 Score=29.48 Aligned_cols=248 Identities=10% Similarity=0.024 Sum_probs=123.7 Q ss_pred CceEeee-eccc-c-hhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCC-------ccce Q lcl|NC_018271. 1 MATTVDI-TTNY-V-GEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTP-------SGEV 70 (305) Q Consensus 1 ma~~~~~-~~~Y-~-Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~-------~G~~ 70 (305) .+..+.- .+.| . -+....|+..+.....+.+ +++++|.=... ...|....+. -+..|.+ .+.. T Consensus 127 ~~~~~t~~~gg~liP~~~~~~ii~~~~~~~~l~~--~~~~~~~~~~~-~~~~~~~~~~----~~~~~v~E~~~~~~~~~~ 199 (394) T protein:vir:97 127 QKDGIKKENAKPVSSEEILYTPAREVKTVVDLKP--FTTVYQAKKAS-GKYPVLQRAT----TKMVTVAELEKNPALAKP 199 (394) T ss_pred hccccccccccccChHHHHHHHHHHhhhhhhhhh--hceeeeccCcc-eEEEEEecCC----Cccceecccccccccccc Confidence 2211111 1222 1 2234445555555454444 36776632222 2233222111 1123322 2456 Q ss_pred EecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHH Q lcl|NC_018271. 
71 DINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGIL 150 (305) Q Consensus 71 ~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~l 150 (305) +|.+..|.++++-+...++=. ++. +-++..+.++...|+++++..++..++.|.++.. T Consensus 200 ~~~~v~l~~~k~~~~i~is~e---------ll~-------ds~~~~~~~i~~~la~~~~~~~~~~i~~g~~~~~------ 257 (394) T protein:vir:97 200 DFKDVAWNIDTYRGAIPLSQE---------SID-------DADVDLVGIVSESISQIKVNTTNDAIAKVLKSFT------ 257 (394) T ss_pred cceeEEeehhheeeehhhHHH---------HHh-------hhhHHHHHHHHHHHHHHHHHHHHHHHhhcccccc------ Confidence 788888888888776655432 111 2234566778888888888888888877765410 Q ss_pred HHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cccCCCcc Q lcl|NC_018271. 151 PLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TFLNPNEF 227 (305) Q Consensus 151 k~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~t~~~~~ 227 (305) ..++.+..++...+. ..++..+ +=.++||...|.... .++.-.|.+. +.+++... T Consensus 258 --------------~~~~~~~~~~~~~~~---~~~~~~~----~a~~v~n~~~~~~l~-~lkd~~G~~i~~~~~~~~~~~ 315 (394) T protein:vir:97 258 --------------TKTVKNLDEIKALLN---GGFDPAY----NVSLIVSQSFYQTLD-TLKDGNGRYLLQDDITAVSGK 315 (394) T ss_pred --------------ccccccHHHHHHHHH---hhhhhhh----CCEEEEcHHHHHHHH-HhhccCCCeeeecCcCCCCCc Confidence 112233333333332 2233221 236999999987654 3433333222 33445556 Q ss_pred eecceeeeecc--CCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEE--ecC Q lcl|NC_018271. 228 DFEGYTLTEIK--GLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVL--YTP 303 (305) Q Consensus 228 ~~kGi~iv~l~--~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~--~~~ 303 (305) .+.|++++-.. .+|+..+++.+=++.+..++. ..+.++-. .-++... .|...+-+|+.+.-++=||. 
.|| T Consensus 316 ~l~G~pv~~~~~~~~~~~~~~~gd~~~~~~~~~~----~~~~~~~~-~~~~~~~-~~~~~~r~d~~v~~~~a~~~~~~~~ 389 (394) T protein:vir:97 316 VLLGKPVFVLSDEVLGANKAFIGDFKRGVLFADR----KDLGLRWA-DNEIYGQ-YLQAVLRFGVSKVDDKAGYYVTFTP 389 (394) T ss_pred eeccceeEEecccccCCccEEEeeccccEEEEEe----cceEEEEe-cccccce-eEEEEEEEccEEecccceEEEEecc Confidence 89999877654 468888887775554322211 11222211 1111122 23444556777777776664 455 Q ss_pred CC Q lcl|NC_018271. 304 AA 305 (305) Q Consensus 304 ~~ 305 (305) .+ T Consensus 390 ~~ 391 (394) T protein:vir:97 390 EP 391 (394) T ss_pred cc Confidence 55 No 127 >protein:vir:96123 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1602 # MgeName: 37 # Cross-refs: genbank:acc:YP_240078;genbank:gi:66395742;genbank:GeneID:5133103 Probab=89.71 E-value=0.025 Score=29.39 Aligned_cols=259 Identities=13% Similarity=0.092 Sum_probs=128.7 Q ss_pred CceEeeee-cccchhHHHHHHHHhhccccchhcCceEE---ecCCCCcccccchhhhhccccCCCC-CCCCccceEecce Q lcl|NC_018271. 1 MATTVDIT-TNYVGEVAGGYFLEMVKEANTISDNLIRV---IPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~-~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v---~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K 75 (305) ||+...-. .-+..|+...++...+.+.- +-.+++.+ ++|.+.++..+|+.+..-..+.++. .--+-++...++. T Consensus 1 ma~~~T~~~d~i~Pev~s~~v~~~~~~~~-~~~~~~~~~~~l~g~~G~tv~ip~~~~~g~~~~~~~g~~i~~~~it~~~~ 79 (274) T protein:vir:96 1 MAQGTTKVSNLIVPEVLAPMMQAELDKKL-RFAQFADIDSTLVGQPGDTLTFPAFTYSGDAQVIAEGEKIPVDQIGTSKR 79 (274) T ss_pred CCccccchhhhhhhHHHHHHHHHHHHhhh-hhcccccccccccCCCCCEEEEEeeccCCCccccCCCCcCchhhccccee Confidence 99866542 36777777766655443222 22222222 2344455444555443334455543 1122345566666 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) .+..++. ...|--.|... .+ ..++. 
.....+.++..++..+...++ ..+.. T Consensus 80 ~~~i~~~--~~~~~i~D~~~----~~-~~~d~---------~~~~~~~~~~~~a~~~d~~i~-------------~~l~~ 130 (274) T protein:vir:96 80 EAKVRKI--GKGTELTDEAV----LS-GFGDP---------QGEAVRQHGLAIANKVDNDVL-------------EALKG 130 (274) T ss_pred EEEEEee--eceeeecHHHH----Hh-hcchH---------HHHHHHHHHHHHHHHHHHHHH-------------HHHhc Confidence 6666543 23344445442 22 21221 111224444445554443322 11111 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCc-----ccCCCcceec Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGT-----FLNPNEFDFE 230 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~-----~t~~~~~~~k 230 (305) .+.. ....++|...++++.. .+-+. ....-.++||+..|-.....-..++-+..+ ..++....|. T Consensus 131 a~~~----~~~~~~~~d~i~dA~~----~l~d~--~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~~~~~g~ig~~~ 200 (274) T protein:vir:96 131 ATLT----VEADITKLDGLQTAID----KFNDE--DLEPMVLFVNPLDAGGLRTSASDNFTRPTQLGDNIIVKGAFGEAL 200 (274) T ss_pred CCCC----cCcccccHHHHHHHHH----Hhccc--CCCceEEEeCHHHHHHHHhcccccccccccccccceeecccceec Confidence 1111 1123344444443333 33332 123348999999887774432112211111 1234466899 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |++|+--..+|.+.+++-.++.+-++. ..|.. 
++-+|-.+ +..--+.-++-|++-+.-++-+|+-|.++ T Consensus 201 G~~Vi~s~~~p~~t~~l~~~gA~~~~~--~~~~~-vE~~Rd~~---~~~d~i~~~~~yg~~~~~~~~vv~~t~~~ 269 (274) T protein:vir:96 201 GAVIVRSNKLNKGEALLAKKGAVKLIT--KRDFF-LEKDRDAS---RKSTALYSDKHYVAYLYDESKVVKITKGA 269 (274) T ss_pred CeeEEEcCCCCcceEEEEeCcceeeee--cCCcc-cccccchh---hcccEEEEeeEEEEEEEcCccEEEEEcCc Confidence 999999999999998888777665443 23322 23232222 11122333344588888899999999998 No 128 >protein:vir:100331 Length: 342 # NCBI annotation: major capsid protein N # Family: family:all:201 # MgeID: mge:1484 # MgeName: phi-MhaA1-PHL101 # Cross-refs: genbank:acc:YP_655472;genbank:gi:109289940;genbank:GeneID:4157374 Probab=89.54 E-value=0.026 Score=29.29 Aligned_cols=269 Identities=12% Similarity=0.104 Sum_probs=137.1 Q ss_pred CceEee-------e--ecccchhHHHHHHHHhhccccchhcCceEEec--CCCCcccccchhhhhccccCCCCCCCCccc Q lcl|NC_018271. 1 MATTVD-------I--TTNYVGEVAGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMNTTDDFVDYSCGFTPSGE 69 (305) Q Consensus 1 ma~~~~-------~--~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~~~~~~q~~~~~~~~~G~ 69 (305) +|.... + +-+=.+.+.-+|-..+..+.+.-+. |.|+| .++.-.+ . +...-.+-+++.. +..++ T Consensus 16 ~A~~ngv~~~~~~~~~~FsV~P~v~q~L~~~i~ess~FL~~--INvv~V~e~~Ge~i--~-lg~~g~iagrtdT-~~~~~ 89 (342) T protein:vir:10 16 QAELNNLPFNALATGIKFTVQPSVQQKLYEKVRESSDFLKS--ISFVFVDEQTGETL--G-LDSAHTVASTTDT-SGDGE 89 (342) T ss_pred HHHHhCCChhHccccceeecChHHHHHHHHHHHHHHHHhcc--CcccccccceeeEE--e-cccCccccccccc-CCCCC Confidence 332221 1 1244566666666666666666666 55433 1111100 0 1111111111111 11122 Q ss_pred eEecc-eeeeeeeeeEeec-cC---HHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc- Q lcl|NC_018271. 70 VDINE-KQLTLKKIKSDKE-VC---KEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT- 143 (305) Q Consensus 70 ~~~~~-K~L~~~~~k~~~~-~~---P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~- 143 (305) .+... 
-.|+...|.+.-+ || |=+-...|- +.+ .-|.-+-..+.+++|-+.=...|+|-+-+ T Consensus 90 R~~~~~~~l~~~~Y~c~qTn~dt~i~Y~~lD~WA----~~~---------dF~~r~~~~i~~~~ALD~i~IGfNGts~A~ 156 (342) T protein:vir:10 90 RKTTSIAKLVKQTYHCQQINFDTHINYKQLDMWA----KFP---------DFQQKVANVAAKQRKRDLIMIGFNGTSRAA 156 (342) T ss_pred cccccccccCCCccEEEEeeecccccHHHHHHHh----cCh---------hHHHHHHHHHHHHHhhccceecccceeecc Confidence 22222 2334444444221 22 222233343 212 12233334555666766666677885321 Q ss_pred ----------ch-hHHHHHHHhhccceEEe----------ccCcCcCChhhHHHHHHHHHHh-ccHHHHhCCCcEEEecH Q lcl|NC_018271. 144 ----------GN-LQGILPLLEADATVIDV----------VGASGGITAANVEAELGKFIDA-HTDEILQAPNHVFGVST 201 (305) Q Consensus 144 ----------~~-fdG~lk~i~~d~~~~~~----------~~~~~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~~f~S~ 201 (305) .| =-||++++.+.+..-.+ .+. .-+..|.-+.+.+..+. ||+.+|+.+.|+.+|+. T Consensus 157 ~Td~~~nPllqDVN~GWlQ~~Re~ap~rv~~~~~~~~~i~iG~--~gdy~NLDalV~D~~~~lI~~~~~~d~dLVvivG~ 234 (342) T protein:vir:10 157 TSDRNSNPLLQDVAKGWLQKMREDAKERVMNGESTDNQVLVGK--GQEYANLDALVMDATEELIDEWHRDDTDLVVITGR 234 (342) T ss_pred CCChhhCcCccccchHHHHHHHhhhhhhhcccceeccceeecC--CCCcccHHHHHHHHHhccCChHHhcCCCEEEEEch Confidence 11 26999999876543111 111 12568999999999986 79999999999999998 Q ss_pred HHHH-HHHHHHhhhhccCCcccCC-------CcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhcc Q lcl|NC_018271. 202 NVIR-AIKRAYGTQARSNGTFLNP-------NEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEIR 267 (305) Q Consensus 202 ~~~d-~Y~d~~~~~~~k~~~~t~~-------~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I~ 267 (305) ++.. 
+|---+ .+....++- ...++-|.+-+.++-+|++.|+.|.=+||-+= .|.+ .++++ T Consensus 235 dLladk~~~l~----n~~~~ptE~~Aa~~i~s~k~iGGl~a~~~PfFP~~~ilVT~L~NLsIY--~Q~gs~RR~~~d~p~ 308 (342) T protein:vir:10 235 KLLADKYFPIV----NQQNAPTEELAADIVISQKRIGGLKAVRVPFFPANAILITKLENLAIY--VQEGTTRKHIENVPK 308 (342) T ss_pred hhhHHHHHHHH----hcCCChHHHHHHHHHHhhhhhcCceeEEccccCCCceEEeeccccEEE--EecCcEEEEEEeccc Confidence 8765 443333 222222221 12478899999999999999999999999532 2222 24455 Q ss_pred cccee-----eeccceeEEEEEEeecceeeccCC Q lcl|NC_018271. 268 IKDMG-----DVDLSGQIRTKMVLSAGVEYAYGA 296 (305) Q Consensus 268 I~~~~-----~~~~~~~~f~k~~m~~d~~i~fg~ 296 (305) .+++. |-.|-.+.+-++-+=-.+.|+=++ T Consensus 309 r~rie~y~s~Ne~YvVEd~~~~a~iE~i~i~~~~ 342 (342) T protein:vir:10 309 KDRIETYESENIDYVVEDYGCAALIENITLKDKE 342 (342) T ss_pred cccccchhhhccceeeeccccEEEeecceecCCC Confidence 55553 334444433322222233333222 No 129 >protein:vir:103323 Length: 364 # NCBI annotation: major capsid-like protein # Family: family:all:2806 # MgeID: mge:1609 # MgeName: Era103 # Cross-refs: genbank:acc:YP_001039668;genbank:gi:125999997;genbank:GeneID:4818399 Probab=89.21 E-value=0.019 Score=30.08 Aligned_cols=274 Identities=10% Similarity=0.053 Sum_probs=122.0 Q ss_pred CceEeee-ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCCccce-Eecceeee Q lcl|NC_018271. 1 MATTVDI-TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTPSGEV-DINEKQLT 78 (305) Q Consensus 1 ma~~~~~-~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~~G~~-~~~~K~L~ 78 (305) =+....+ ..-|.|||+..+.-..++- ++++++.--..++...+++..... +.++.+-...|.. +-+++.|. 
T Consensus 15 ~~~~~al~le~f~geV~taf~~~s~~~------~~~~~rti~~gkS~q~~~iG~~~~-~~~~~G~~ld~~~~~~~k~~it 87 (364) T protein:vir:10 15 SGEVDSLLIEKFNNRVHEQYLKGENLL------QWFDVQEVVGTNSVSNKYIGETEL-QVLSPGKSPDASPTEFDKNRLV 87 (364) T ss_pred ccchhhhhhhhhhhhHHHHHHHHHhhc------CcceeeeecccceEEeeeeeeeEE-eeeccCcccCCCCcccCcEEEE Confidence 1222222 2578888888776654442 445655444445444555544433 4444433334433 33444666 Q ss_pred eeeeeEeecc--CHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhcc----cc---CCccchhHHH Q lcl|NC_018271. 79 LKKIKSDKEV--CKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIW----QG---DGTTGNLQGI 149 (305) Q Consensus 79 ~~~~k~~~~~--~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~----~G---D~s~~~fdG~ 149 (305) .-..+..=-| .-.|+.++| ..+ +. +. -+++++.++..+.+.++ .+ +....+.+++ T Consensus 88 ID~ll~a~~~V~diDe~q~~~-------D~v--R~-e~------s~e~G~ALA~~~Dq~i~~~v~~aa~a~~~~~~~~~~ 151 (364) T protein:vir:10 88 VDTTVIARNTVAHFHDVQNDI-------DGL--KS-KL------SVNQAKKLKKMEDSMVIQQLVLGGISNTEAIRKNPR 151 (364) T ss_pred ecceeeechhhhhHHHHhcCc-------cch--hH-HH------HHHHHHHHHHHHHHHHHHHHHhhhhhcccccccCCc Confidence 6555432111 111222222 101 10 11 13334444433333222 11 1111111111 Q ss_pred HHHHhhccceEEeccC--cCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHH--Hhhh-hc--cCCccc Q lcl|NC_018271. 150 LPLLEADATVIDVVGA--SGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRA--YGTQ-AR--SNGTFL 222 (305) Q Consensus 150 lk~i~~d~~~~~~~~~--~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~--~~~~-~~--k~~~~t 222 (305) . ...+..+.+.+. +...++.++++.+.+....+.++--..-+..++|++..|-.-... +-++ ++ .+.++. 
T Consensus 152 ~---~~~g~~i~~~~~a~~~~~~~~~l~~ai~~a~~~LdEkdVP~~~R~~vv~P~~y~~Ll~~~~lvn~d~~~~~~~~~~ 228 (364) T protein:vir:10 152 V---AGHGFSIHIVGLASSFLTSPQYMMAAIEMAMEQQTEQEVDTSELCGLMPWTAFNCLRDADRIVDKSYTIAASDNTV 228 (364) T ss_pred c---cCCcceeeecccCcchhhhHHHHHHHHHHHHHHHhhcCCCccccEEEeChHHHHHHhcCCccccccccccCCCccc Confidence 1 111222333222 223455677777777776666654433457999999888665443 3222 22 334455 Q ss_pred CCCcceecceeeeeccCCCCCeEEEec-------------chHHhhhhhhhhhhhhccccce--------------eeec Q lcl|NC_018271. 223 NPNEFDFEGYTLTEIKGLPASRMVGYN-------------RDNIVIGMSAQSDFNEIRIKDM--------------GDVD 275 (305) Q Consensus 223 ~~~~~~~kGi~iv~l~~~Pd~~ii~T~-------------~sNl~~gvnl~~D~n~I~I~~~--------------~~~~ 275 (305) +++-...-|++|+....+|..-...+. .+| .++...|+.+-++--| ..++ T Consensus 229 ~G~v~~v~Gv~Vv~Sn~lP~~~~~~~~t~~~t~h~ls~~~~g~---~y~v~~d~~~~~~~~f~~~Al~tv~~~~~t~e~~ 305 (364) T protein:vir:10 229 DGFVLKSWNTPIVPSNRFPKLSDNTEGTGNTKHHKLSNAGNGN---RYDVTAGQTSAQAVLFTQDALLVGRTISITGDIF 305 (364) T ss_pred cceeEEEeceEEEeccccccccccccccccccccccccccCCc---ccccccccceeEEEEEecceEEEEEEecceeeee Confidence 666667999999999999965222211 112 1333344433222211 1233 Q ss_pred cceeE---E--EEEEeecceeeccCCeEEE-ecCCC Q lcl|NC_018271. 276 LSGQI---R--TKMVLSAGVEYAYGAEIVL-YTPAA 305 (305) Q Consensus 276 ~~~~~---f--~k~~m~~d~~i~fg~E~v~-~~~~~ 305 (305) +..+. + .++-||++.. +++=+|+ .+.++ T Consensus 306 ~~~~~~~~~ida~~a~G~g~l--RPeaa~~i~~~~~ 339 (364) T protein:vir:10 306 YEKKEKTWYIDTFLAEGAIPD--RWEAVAVVTAADT 339 (364) T ss_pred eccceeeeeeeeehcccCccc--CccceEEEEecCC Confidence 33332 2 2333444433 3344443 34433 No 130 >protein:vir:1829 Length: 355 # NCBI annotation: major capsid protein # Family: family:all:201 # MgeID: mge:324 # MgeName: 186 # Cross-refs: genbank:acc:NP_052253;genbank:gi:9634060;genbank:GeneID:1262428 Probab=88.78 E-value=0.03 Score=28.91 Aligned_cols=276 Identities=14% Similarity=0.101 Sum_probs=141.6 Q ss_pred CceEeee-------ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccch--hhhhccccCCCCCCCCccceE Q lcl|NC_018271. 
1 MATTVDI-------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRR--MNTTDDFVDYSCGFTPSGEVD 71 (305) Q Consensus 1 ma~~~~~-------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~--~~~~~~~q~~~~~~~~~G~~~ 71 (305) +|....+ +-+=.+.+.-+|-..+..+.+.-++ |.|+| | ..+.=-. +...-.+-+++.. +++.+.+ T Consensus 16 ~A~~ngv~~~~~~~~Fsv~P~v~q~L~~~i~ess~FL~~--INvv~-V--~e~~Ge~i~lgv~g~iagrtdT-~~~~~R~ 89 (355) T protein:vir:18 16 LAKLNGISVDDVSKKFTVEPSVTQTLMNTVQASSAFLQM--INILP-V--AEMKGEKIGVGVTGTIASTTDT-SGDKERQ 89 (355) T ss_pred HHHHhCCChhHccceeccCHHHHHHHHHHHHHHHHHhhc--Cceec-c--ccceeeEEeeccCcceeecccc-CCCCCcc Confidence 3322221 1244566666666667777777666 55432 1 1111000 1111111222111 1122222 Q ss_pred ecce-eeeeeeeeEeec----cCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc--- Q lcl|NC_018271. 72 INEK-QLTLKKIKSDKE----VCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT--- 143 (305) Q Consensus 72 ~~~K-~L~~~~~k~~~~----~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~--- 143 (305) .... .|+...|.|.-+ .-|=+-...|-.. + .-|.-+-..+.+++|-+.=-..|+|-+-+ T Consensus 90 ~~~~~~l~~~~Y~c~qtn~dt~i~y~~LD~WA~~----~---------dF~~r~~~~i~k~~ALD~i~IGfNG~s~A~~T 156 (355) T protein:vir:18 90 TADFTALESNKYECNQINFDFHLTYKRLDLWARF----Q---------DFQRRIRDAIVQRQALDFIMAGFNGTTRADTS 156 (355) T ss_pred cccccccCCCccEEEEeeeeeeecHHHHHHHhcC----h---------hHHHHHHHHHHHHHhhchhhhcccceeeeccC Confidence 2221 244444444321 1223334455321 1 22344445566677777767777885421 Q ss_pred --------ch-hHHHHHHHhhccceEE----------------eccCcCcCChhhHHHHHHHHHHh-ccHHHHhCCCcEE Q lcl|NC_018271. 144 --------GN-LQGILPLLEADATVID----------------VVGASGGITAANVEAELGKFIDA-HTDEILQAPNHVF 197 (305) Q Consensus 144 --------~~-fdG~lk~i~~d~~~~~----------------~~~~~~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~~ 197 (305) .| =-||++++.+.+..-. .++. .-+..|.-+.+.++.+. ||+.+|+.+.|+. 
T Consensus 157 d~~~nPllqDVNkGWlQ~~Re~ap~rV~~~~~~~~~~~~~~~i~~G~--~gdy~NLDAlV~d~~~~lI~~~~~~d~dLVv 234 (355) T protein:vir:18 157 DRVKNPMLQDVAVGWLQKYRNEAPARVMSNITDADGKVVSAVIRVGK--NGDYENLDALVMDGTNTLIDEIYQDDPKLVA 234 (355) T ss_pred ChhhCcCccccchhHHHHHHhcchhhhhccccccccccccceeeecC--CCCcccHHHHHHHHHhccCChHHhcCCCEEE Confidence 11 2699999988554210 1111 12468899999999986 7999999999999 Q ss_pred EecHHHHH-HHHHHHhhhhccCCcccCCC----cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhc Q lcl|NC_018271. 198 GVSTNVIR-AIKRAYGTQARSNGTFLNPN----EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEI 266 (305) Q Consensus 198 f~S~~~~d-~Y~d~~~~~~~k~~~~t~~~----~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I 266 (305) +|+.++.. +|---|.+ .++..+..... ..++-|.+-+.++-+|++.|+.|.=+||-+-+ |.+ .+++ T Consensus 235 ivG~dLla~k~~~l~n~-~~~ptE~~Aa~~i~s~k~iGGlpa~~~PffP~~~~lVT~L~NLsIY~--Q~gs~RR~~~d~p 311 (355) T protein:vir:18 235 IVGRKLLADKYFPLVNK-QQENTESLAADIIISQKRIGNLPAVRVPYFPANAVFVTTLENLSIYF--MDESHRRSIDENP 311 (355) T ss_pred EEchhhhHHHHhHHhhc-cCChHHHHHHHHHHHHHhhCCceeEEccccCCCceEEeeccccEEEE--ecCcEEEEEEecc Confidence 99977543 55444432 12221111111 24789999999999999999999999996442 222 2445 Q ss_pred ccccee-----eeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 267 RIKDMG-----DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 267 ~I~~~~-----~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) +.+++. 
|-.|-.+.+-++-+= + +|.+++-- .|++ T Consensus 312 ~r~rie~y~s~Ne~YvVEd~~~~a~i-e-ni~~~~~~---~~~~ 350 (355) T protein:vir:18 312 KKDRVENYESMNIDYVVEAYAAGCLL-E-NITLGDFT---APAA 350 (355) T ss_pred ccccccchhhhcceeeeeccccEEEE-e-eeeecCCC---Cccc Confidence 555553 444444433322221 1 33332211 2333 No 131 >protein:vir:6061 Length: 357 # NCBI annotation: gpN # Family: family:all:201 # MgeID: mge:126 # MgeName: WPhi # Cross-refs: genbank:acc:NP_878202;genbank:gi:33438901;genbank:GeneID:1457736 Probab=87.27 E-value=0.04 Score=28.26 Aligned_cols=275 Identities=15% Similarity=0.131 Sum_probs=138.3 Q ss_pred CceEeee-------ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccch--hhhhccccCCCCCCCCcc-ce Q lcl|NC_018271. 1 MATTVDI-------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRR--MNTTDDFVDYSCGFTPSG-EV 70 (305) Q Consensus 1 ma~~~~~-------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~--~~~~~~~q~~~~~~~~~G-~~ 70 (305) +|..... +-+=.+.+.-+|-..+..+.+..+. |.|+| +..+.=-. +...-.+-++ .-|++| +. T Consensus 16 ~A~~ngv~~~d~~~~FsV~P~v~q~L~~~i~ess~FL~~--INvv~---V~e~~Ge~i~lg~~g~iagr--tdT~~~~~R 88 (357) T protein:vir:60 16 VAELNGIDAGDVSKKFTVEPSVTQTLMNTMQESSDFLTR--INIVP---VSEMKGEKIGIGVTGSIAST--TDTAGGTER 88 (357) T ss_pred HHHHhCCChHHhcceeecCHHHHHHHHHHHHHHHHHhcc--CCccc---cccceeeEEecccCcccccc--cccCCCCCc Confidence 3322211 1244555555566666666666665 44422 11111000 1111111112 112221 22 Q ss_pred Eecc-eeeeeeeeeEeec-cC---HHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc-- Q lcl|NC_018271. 71 DINE-KQLTLKKIKSDKE-VC---KEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT-- 143 (305) Q Consensus 71 ~~~~-K~L~~~~~k~~~~-~~---P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~-- 143 (305) +... -.|+...|.|.-+ || |=+-...|-.. 
| .-|.-+-..+.+++|-+.=...|+|-+-+ T Consensus 89 ~~~~~~~l~~~~Y~c~qTn~dt~i~Y~~lD~WA~~------------~-dF~~r~~~~i~~~~ALD~i~IGfNGts~A~~ 155 (357) T protein:vir:60 89 QPKDFSKLASNKYECDQINFDFYIRYKTLDLWARY------------Q-DFQLRVRNAIIKRQSLDLIMAGFNGVRRAET 155 (357) T ss_pred ccccccccCCCccEEEEeeeeccccHHHHHHHhcC------------h-hHHHHHHHHHHHHHhhccceecccceeeecc Confidence 2222 2445555554322 22 22333445221 1 22333445556667766666677885321 Q ss_pred ---------ch-hHHHHHHHhhccceE----------------EeccCcCcCChhhHHHHHHHHHHh-ccHHHHhCCCcE Q lcl|NC_018271. 144 ---------GN-LQGILPLLEADATVI----------------DVVGASGGITAANVEAELGKFIDA-HTDEILQAPNHV 196 (305) Q Consensus 144 ---------~~-fdG~lk~i~~d~~~~----------------~~~~~~~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~ 196 (305) .| =-||++++.+.+..- ...+.. -+..|.-+.+.+..+. ||+.+|+.+.|+ T Consensus 156 Td~~~nPllqDVN~GWlQ~~Re~ap~rVm~~~~~~~g~~~~~~i~~G~~--gdy~NLDalV~D~~~~lI~~~~~~d~dLV 233 (357) T protein:vir:60 156 SDRSSNQMLQDVAVGWLQKYRNEAPARVMSKVTDEEGHTTSEVIRVGKG--GDYASLDALVMDATNNLIEPWYQEDPDLV 233 (357) T ss_pred CChhhCcCccccchhHHHHHHhhchhhhhccccccCCccccceeeecCC--CCcccHHHHHHHHHhccCChHHhcCCCEE Confidence 11 269999998755321 011221 2468999999999986 799999999999 Q ss_pred EEecHHHHH-HHHHHHhhhhccCCcccCCC----cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhh Q lcl|NC_018271. 197 FGVSTNVIR-AIKRAYGTQARSNGTFLNPN----EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNE 265 (305) Q Consensus 197 ~f~S~~~~d-~Y~d~~~~~~~k~~~~t~~~----~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~ 265 (305) .+|+.++.. +|-.-| ++.++..+..... 
..++-|.+-+.++-+|++.|+.|.=+||-+= .|.+ .++ T Consensus 234 vivG~dLla~k~~~l~-n~~~~pTE~~Aa~~i~s~k~iGGl~a~~~PfFP~~~llVT~L~NLsIY--~Q~gs~RR~~~d~ 310 (357) T protein:vir:60 234 VIVGRQLLADKYFPIV-NREQDNSEMLAADVIISQKRIGNLPAVRVPYFPADAMLITKLENLSIY--YMDDSHRRVIEEN 310 (357) T ss_pred EEEchhhhhHHhhhHh-hcCCChHHHHHHHHHHHhhhhcCcceEEccccCCCceEEeeccccEEE--EecCcEEEEEEec Confidence 999988765 444333 2222211111111 2368899999999999999999999999532 2222 244 Q ss_pred cccccee-----eeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 266 IRIKDMG-----DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 266 I~I~~~~-----~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++.+++. |-.|-.+.+-++-+=-.+.|+.++ .|+. T Consensus 311 p~r~riE~y~s~Ne~YvVEd~~~~a~iE~i~~~~~~-----~pa~ 350 (357) T protein:vir:60 311 PKLDRVENYESMNIDYVVEDYAAGCLVEKIKVGDFS-----TPAK 350 (357) T ss_pred cccccccchhhhcceeeeeccccEEEeeeeeeccCc-----cccc Confidence 5555553 444444433333332233333222 1332 No 132 >protein:vir:8885 Length: 347 # NCBI annotation: major capsid protein A # Family: family:all:975 # MgeID: mge:161 # MgeName: gh-1 # Cross-refs: genbank:acc:NP_813774;genbank:gi:29366729;genbank:GeneID:1258837 Probab=85.83 E-value=0.016 Score=30.40 Aligned_cols=281 Identities=15% Similarity=0.093 Sum_probs=117.7 Q ss_pred CceEeeee--------cccchhH----HHHHHHHh--hccccchhcCceEEecCCCCcccccchhhhhccccCCCC--CC Q lcl|NC_018271. 1 MATTVDIT--------TNYVGEV----AGGYFLEM--VKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSC--GF 64 (305) Q Consensus 1 ma~~~~~~--------~~Y~Ge~----l~~~~~~~--~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~--~~ 64 (305) ||++-..+ .+++|.. +++|--+. .+-...+-.+++++++....++..++++..... +.++. .. 
T Consensus 1 ~a~~~~~~~~~~~~g~~~~~~d~~al~ie~~~geV~~~f~~~s~~~~~~~~r~i~~G~sv~~~~iG~~~~-~~~~~g~~l 79 (347) T protein:vir:88 1 MANATGGQQIGANQGKGQSAADKLALFLKVFGGEVLTAFVRRSVTMDKHMVRTIQNGKSASFPVMGRTKG-YYLAPGENL 79 (347) T ss_pred CCCcccchhhhccCCCCccccchHHHHHHHHHHHHHHHHHHHhhhhhccccccccCcceEEEeeecceee-eeeccccCC Confidence 88654332 1233331 11111111 111234556678887777777666776655544 33322 22 Q ss_pred CCc-cceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhcc----cc Q lcl|NC_018271. 65 TPS-GEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIW----QG 139 (305) Q Consensus 65 ~~~-G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~----~G 139 (305) +.. -+.+-++++|.....+.+=.+.. |+ =.+|.. .++ + ....+..+..++....+.++ ++ T Consensus 80 ~~~~~~~~~~~~~i~ID~~~y~~~~Vd-d~----D~~q~~-~D~--r-------~~~~~~~g~aLA~~~D~~i~~~l~~~ 144 (347) T protein:vir:88 80 DDKRKDIKHSEKVIQIDGLLTSDVLIY-DI----EDAMNH-YDV--R-------AEYSAQLGEALAIAADGAVLAEMAKL 144 (347) T ss_pred CCCCCCCccceEEEEEechhhhhhhhh-hH----HHHhhc-CCc--h-------HHHHHHHHHHHHHHHHHHHHHHHHHh Confidence 221 23455666665555443221111 11 123333 222 1 12223344444444443332 33 Q ss_pred CC----ccchhHHHHHHHhhccceEEeccC-cCcCChh---hHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHH Q lcl|NC_018271. 140 DG----TTGNLQGILPLLEADATVIDVVGA-SGGITAA---NVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAY 211 (305) Q Consensus 140 D~----s~~~fdG~lk~i~~d~~~~~~~~~-~~~iT~a---nv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~ 211 (305) -. +...+.|..+.. .+.+... ..+.+.+ .+.+.|-+....+.++--..-+..+.+++..|....... 
T Consensus 145 a~~~~~~~~~~~g~~~~~-----~~~~~~~~~~~~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~ 219 (347) T protein:vir:88 145 CNLPAASNENIAGLGQAV-----VLNIGAAADLVDVEARGKAILKGLTLARARLTKNYVPAGDRRFYCAPEDYSAILSAL 219 (347) T ss_pred hccccccccccCCccccc-----cccccccccccchhhhHHHHHHHHHHHHHHHhhcCCCCCCCEEEeCHHHHHHHhcch Confidence 21 112233322111 1111111 1111122 234445444444444322222478999998876654422 Q ss_pred ---hhhhccCCcccCCCcceecceeeeeccCCCCCeEEEecchHH---------------------------------hh Q lcl|NC_018271. 212 ---GTQARSNGTFLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNI---------------------------------VI 255 (305) Q Consensus 212 ---~~~~~k~~~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl---------------------------------~~ 255 (305) ...+....+..++....+.|++|....++|-+-.-.++..+. ++ T Consensus 220 ~~~~~~~~~~~~~~~G~vg~i~G~~V~~s~nlp~~~~~~~~~~~~~~~t~~~~~~~~~~~~~~~~d~~~~~~l~~~~~a~ 299 (347) T protein:vir:88 220 MPNAANYAALIDPETGNIRNVMGFEVIEVPHLTVGGAGDNNPADGVAPTNQKHIFPATATGDDRVAQNNVVGLFNHRSAV 299 (347) T ss_pred hhhhhhhccccchhcceeeeeccceEEEeecccccccccccccccccccccccccccccccccccccCcEEEEEechhhh Confidence 122223334455555679999999999998432222221111 11 Q ss_pred h-hhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 256 G-MSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 256 g-vnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) | |.+ -|+. +++.+-.+-+. 
-.+..++.||+++.=|=..=.+.-|++| T Consensus 300 g~v~~-~d~~-~e~~r~~~~~~-d~i~~~~~~G~~~~rPe~a~~~~~~~a~ 347 (347) T protein:vir:88 300 GTVKL-KDMA-LERARRPEFQA-DQIIGKYAMGHGGLRPEAAGALVFTPAA 347 (347) T ss_pred hheec-ccce-eeeeechhhHH-HHhhhhhhhcCceeccceEEEEEeCCCC Confidence 1 111 0100 11111111001 1244566666666544333345566777 No 133 >protein:vir:10450 Length: 344 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:184 # MgeName: phiA1122 # Cross-refs: genbank:acc:NP_848297;genbank:gi:30387487;genbank:GeneID:1733971 Probab=85.76 E-value=0.022 Score=29.70 Aligned_cols=274 Identities=14% Similarity=0.081 Sum_probs=121.5 Q ss_pred CceEeee---ec----c--------------cchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccC Q lcl|NC_018271. 1 MATTVDI---TT----N--------------YVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVD 59 (305) Q Consensus 1 ma~~~~~---~~----~--------------Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~ 59 (305) ||++... ++ . |.|||+..+.-.. +-.++++++.--..++...+++.... ++. T Consensus 1 ma~~~~~~~~n~~~~~~~~~~~~~~al~ie~~~geV~~~f~~~s------~~~~~~~~r~i~~g~s~~~~~iG~~~-~~~ 73 (344) T protein:vir:10 1 MANMTGGQQLGTNQGKDVMAAGDKLALFLKVFGGEVLTAFARTS------VTTSRHMVRSISSGKSAQFPVLGRTQ-AAY 73 (344) T ss_pred CccccccccCCcccCCccCCccchhHHHHHHHHHHHHHHHHHHh------hhcccceeeeecccceEEEEeeceeE-EEe Confidence 9987433 21 2 5566655544432 33355666543334544455554433 234 Q ss_pred CCCCCCCc---cceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhc Q lcl|NC_018271. 60 YSCGFTPS---GEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDI 136 (305) Q Consensus 60 ~~~~~~~~---G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~ 136 (305) ++.+-... .+.+=++++|..-..+.+=-+. .|+. .+|.. 
.++ + ....+..+..++....+.+ T Consensus 74 ~~~G~~l~~t~~~~~~~e~~l~ID~~~y~~~~V-dDiD----~~q~~-~D~--r-------~~~~~~~G~aLA~~~D~~i 138 (344) T protein:vir:10 74 LAPGENLDDIRKDIKHTEKVITIDGLLTADVLI-YDIE----DAMNH-YDV--R-------SEYTSQLGESLAMAADGAV 138 (344) T ss_pred eecCCCCCCCCCCcccceEEEEEcchhhhhhhh-hhHH----HHhcC-cch--H-------HHHHHHHHHHHHHHHHHHH Confidence 43322222 2345556666554433221111 1111 13333 222 2 2223444444444443322 Q ss_pred ----cccCCcc---c-----hhHHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHH Q lcl|NC_018271. 137 ----WQGDGTT---G-----NLQGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVI 204 (305) Q Consensus 137 ----~~GD~s~---~-----~fdG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~ 204 (305) .++...+ + ..+|++....+++. ...+...++.++.+.+.+....+.++--...+..+.|++..| T Consensus 139 ~~~la~~a~~~~~~~~~~~g~~~~~~~~~~~~~~----~~t~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y 214 (344) T protein:vir:10 139 LAEIAGLCNVESQYNENITGLGTATVIETTQDKT----TLTDQVALGKEIIAALTKARAALTKNYVPSSDRVFYCDPDSY 214 (344) T ss_pred HHHHHhhhccccccccccccccccceeecccccc----cccchhhhHHHHHHHHHHHHHHHhhcCCCccCCEEEeChHHH Confidence 2332211 1 12222221111111 001123445667777777777666653333356888999998 Q ss_pred HHHHHHH---hhhhccCCcccCCCcceecceeeeeccCCCCCeE-----EEecchHHhhhhhh---hhhhhh-------- Q lcl|NC_018271. 205 RAIKRAY---GTQARSNGTFLNPNEFDFEGYTLTEIKGLPASRM-----VGYNRDNIVIGMSA---QSDFNE-------- 265 (305) Q Consensus 205 d~Y~d~~---~~~~~k~~~~t~~~~~~~kGi~iv~l~~~Pd~~i-----i~T~~sNl~~gvnl---~~D~n~-------- 265 (305) ....+.- ...++...+..++....+.|++|+...++|..-+ ..|-..|.+-+.+. .-|+.. 
T Consensus 215 ~~Ll~~~~~~~~~~~~~~~~~~G~V~~v~G~~V~~Sn~lp~~~~~~~~~~~tg~~~~~~~~~~~~~~~~~s~~~~l~~h~ 294 (344) T protein:vir:10 215 SAILAALMPNAANYAALIDPEKGSIRNVMGFEVVEVPHLTAGGAGTSREGTTGQKHAFPATKSGNDKVAKDNVIGLFMHR 294 (344) T ss_pred HHHhhcccccccccccccceeeeEEEEEeceEEEeccccccccCCcccccccCccccccCCcccceeeecceeEEEeech Confidence 8776543 1223333333445455689999999999986421 12222222111000 001100 Q ss_pred ----------ccccceeeecccee-----EEEEEEeecceeeccCCeEEEecCC Q lcl|NC_018271. 266 ----------IRIKDMGDVDLSGQ-----IRTKMVLSAGVEYAYGAEIVLYTPA 304 (305) Q Consensus 266 ----------I~I~~~~~~~~~~~-----~f~k~~m~~d~~i~fg~E~v~~~~~ 304 (305) ++++.+ ..++ +..++.||+++.=|=..=+|-.++- T Consensus 295 ~A~~~v~~~~~~~e~~----r~~~~~~d~i~g~~~~G~~vlRPe~a~~v~~~~~ 344 (344) T protein:vir:10 295 SAVGTVKLRDLALERA----RRANFQADQIIAKYAMGHGGLRPEAAGAVVFKTK 344 (344) T ss_pred hhhhhhhhccceeecc----cchhHHHHHHHHHhhcccceecccceEEEEeecC Confidence 112211 1222 4456666666655444444666666 No 134 >protein:vir:78777 Length: 358 # NCBI annotation: putative major capsid protein # Family: family:all:201 # MgeID: mge:1857 # MgeName: phiO18P # Cross-refs: genbank:acc:YP_001285647;genbank:gi:148727153;genbank:GeneID:5220125 Probab=85.37 E-value=0.053 Score=27.56 Aligned_cols=276 Identities=13% Similarity=0.073 Sum_probs=126.3 Q ss_pred CceEeeee--------------------------cccchhHHHHHHHHhhccccchhcCceEEec--CCCCcccccchhh Q lcl|NC_018271. 1 MATTVDIT--------------------------TNYVGEVAGGYFLEMVKEANTISDNLIRVIP--NVPENNLFLRRMN 52 (305) Q Consensus 1 ma~~~~~~--------------------------~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~--~v~~~~~~~~~~~ 52 (305) |+-+-+.. -+.++.+.-+|-..+..+.+..++ |.|+| .++.-.+ .+. 
T Consensus 1 m~~~M~~~tr~~~~~y~~~~A~~ngv~~~~~~~~Fsv~p~v~q~L~~~i~ess~FL~~--INvv~V~e~~Ge~v---~lg 75 (358) T protein:vir:78 1 MSQTLTVQAEQRLNKYCDALAKAYGIDISKLDKQFSVTGPVETTLRSALLASVEFLGL--ITCLDVDQIKGQVV---QVG 75 (358) T ss_pred CcccccHHHHHHHHHHHHHHHHHhCCChhHccceeeeChHHHHHHHHHHHHHHHHhhc--CcccccccceeeEE---eec Confidence 43332211 133344444444444444444333 22211 0000000 000 Q ss_pred hhccccCCCCCCCCccceEecceeeeeeeeeEeec----cCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHH Q lcl|NC_018271. 53 TTDDFVDYSCGFTPSGEVDINEKQLTLKKIKSDKE----VCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRL 128 (305) Q Consensus 53 ~~~~~q~~~~~~~~~G~~~~~~K~L~~~~~k~~~~----~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~i 128 (305) ..-.+-+++.. + -.-+.-.|+...|.|.-+ .-|=+-...| +..++++ .-|.-+-..+.+++ T Consensus 76 ~~g~iagrt~t----r-~~~~~~~l~~~~Y~c~qTn~dt~i~Y~~lD~W-A~f~~~~---------dF~~r~~~~i~~~~ 140 (358) T protein:vir:78 76 VGQLYTGRKKG----G-RFKGKVGVDGNTYELTETDSCASLDWATLCTW-ANAGSEG---------EFIKLVGEFVNKAF 140 (358) T ss_pred CCcccceecCC----C-ccccccccCCCccEEEEeceeeeccHHHHHHH-HhCCChh---------HHHHHHHHHHHHHH Confidence 00000011100 0 001111222222222211 1111222334 2222111 12233334455566 Q ss_pred HhhhhhhccccCCcc-----------ch-hHHHHHHHhhccceEEe----------ccCcCcCChhhHHHHHHHHH-Hhc Q lcl|NC_018271. 129 ARKIDKDIWQGDGTT-----------GN-LQGILPLLEADATVIDV----------VGASGGITAANVEAELGKFI-DAH 185 (305) Q Consensus 129 a~ei~~~~~~GD~s~-----------~~-fdG~lk~i~~d~~~~~~----------~~~~~~iT~anv~~~l~~~~-~~i 185 (305) |-+.=...|+|-+-+ .| =-||++++.+.+..-.+ .+..+.-+..|.-+.+.+.. +.| T Consensus 141 ALD~i~IGfNGts~A~~Td~~~nPllqDVN~GWlQ~~Re~a~~~v~~~~~~~~~i~ig~g~~Gdy~NLDalV~D~~~~lI 220 (358) T protein:vir:78 141 ALDMLRVGWNGVSAADDTDPTANPLGQDVNKGWHQLAREWKGGSQIIKAAAGEKIYFDPDGKGEYKTLDEMASDLINTTI 220 (358) T ss_pred hhccceecccceeeccCCChhhCcCccccchHHHHHHHhhchhhhhccccccCceeecCCCCCccccHHHHHHHHHhccC Confidence 666666667785321 11 26999999875553111 11111124689999999987 579 Q ss_pred cHHHHhCCCcEEEecHHHHH-HHHHHHhhhhccCCcccCCC--cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh Q lcl|NC_018271. 
186 TDEILQAPNHVFGVSTNVIR-AIKRAYGTQARSNGTFLNPN--EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD 262 (305) Q Consensus 186 P~~~r~~~~l~~f~S~~~~d-~Y~d~~~~~~~k~~~~t~~~--~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D 262 (305) |+.+|+++.|+.+|+.++.. +|-.-+. +.++..+..... ..++-|.+-+.++-+|++.|+.|.=+||-+= .|.+ T Consensus 221 ~~~~~~d~dLVvivG~dLla~k~~~l~n-~~~~pTE~~Aa~~i~k~iGGlpa~~~PfFP~~~ilVT~L~NLsIY--~Q~g 297 (358) T protein:vir:78 221 DPLFQQDPRLVVLVGTDLVAAAQAKLYS-EATKPSEQIAAQQLAKSIAGRKAYIPPFFPGKRMVVTTLDNLHCY--TQRG 297 (358) T ss_pred ChHHhcCCCEEEEEchhhhhHHhhhHhh-cCCCcHHHHHHHHHHHHhCCCeEEEccccCCCceEEeeccccEEE--EecC Confidence 99999999999999988765 4444442 222211111111 1368889999999999999999999999532 2222 Q ss_pred ------hhhcccccee-----eeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 263 ------FNEIRIKDMG-----DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 263 ------~n~I~I~~~~-----~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) .++++.+++. |-.|-.+.+-++-+=- +.|+.+.+..| T Consensus 298 s~RR~~~d~p~r~riE~y~s~Ne~YvVEd~~~~a~iE------~i~v~~~~~pa 345 (358) T protein:vir:78 298 TRKRKADDNQDSKSFDNQYWRMEGYALGEHKAYGGFE------EADIEIGADPA 345 (358) T ss_pred cEEEEEEeccccccccchhhhcceeeeeccccEEEEe------eeeeeeCCCCC Confidence 2445555554 3334444333222211 22444555333 No 135 >protein:vir:97031 Length: 402 # NCBI annotation: 31 # Family: family:all:2806 # MgeID: mge:1644 # MgeName: K1-5 # Cross-refs: genbank:acc:YP_654132;genbank:gi:108862016;genbank:GeneID:5075980 Probab=84.87 E-value=0.039 Score=28.30 Aligned_cols=253 Identities=13% Similarity=0.105 Sum_probs=112.1 Q ss_pred Cc--------------eEeee-ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCC Q lcl|NC_018271. 1 MA--------------TTVDI-TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFT 65 (305) Q Consensus 1 ma--------------~~~~~-~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~ 65 (305) |. ....+ ..-|.|||+..+.-..++ .++++++.--..++...+++..... +.++.+-. 
T Consensus 1 Ms~~n~~t~~~~~~s~~~~al~le~f~geV~taF~~~si~------~~~~~vrti~~GkS~qf~~iG~~~a-~y~~~G~~ 73 (402) T protein:vir:97 1 MSTPNTLTNVAVSASGEVDSLLIEKFNGKVNEQYLKGENI------LSYFDVQTVTGTNTVSNKYLGETEL-QVLAPGQS 73 (402) T ss_pred CCCcccccccccccccchhhhhhhhhhhhHHHHHHHHHhh------cCcceeeeecccceEEEEEEeeeEE-eeeccccc Confidence 32 22222 246788888777665443 2456665544455444555544443 55554444 Q ss_pred CccceEecce-eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccc-----c Q lcl|NC_018271. 66 PSGEVDINEK-QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQ-----G 139 (305) Q Consensus 66 ~~G~~~~~~K-~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~-----G 139 (305) ..|.-.-++| .|..-..... +-+-..-=.+|-+-..+ +. +. -+++++.++..+.+++++ | T Consensus 74 ldg~~~~~~k~~ItID~lL~a-----~~~V~diDeaq~~yD~v--Rs-e~------s~e~G~ALA~~~Dq~ii~~i~~aa 139 (402) T protein:vir:97 74 PNATPTQADKNQLVIDTTVIA-----RNTVAHIHDVQGDIDSL--KP-KL------AMNQAKQLKRLEDQMAIQQMLLGG 139 (402) T ss_pred cCCCCcccccEEEEeCceeec-----hhhhhhHHHHHhcccch--hH-HH------HHHHHHHHHHHHHHHHHHHHHHhh Confidence 4444333333 3554444321 11111111223221111 10 11 134455555444443331 1 Q ss_pred --CCccchhHHHHHHHhhccceEEecc--CcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHH--Hhh Q lcl|NC_018271. 140 --DGTTGNLQGILPLLEADATVIDVVG--ASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRA--YGT 213 (305) Q Consensus 140 --D~s~~~fdG~lk~i~~d~~~~~~~~--~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~--~~~ 213 (305) ++..-..++-. ........+++ .....++.++++.+.+-...+.++--..-...++|++..|...... 
+.+ T Consensus 140 ~a~t~~~~~~~~~---~~~g~s~~~~~t~~~a~~~~~~l~~ai~~a~~~LdEkdVP~~dRv~vv~P~~y~~Ll~~~rl~n 216 (402) T protein:vir:97 140 IANTKAERNKPRV---KGHGFSINVNVTESEALANPQYVMAAVEYALEQQLEQEVDISDVAIMMPWKFFNALRDADRIVD 216 (402) T ss_pred ccccccccccCcc---cccccccccccccchhhcCHHHHHHHHHHHHHHHHhcCCCccccEEEeChHHHHHHhhcccccc Confidence 11100000000 00111111222 1223456667766666665555543222347999999888877654 322 Q ss_pred h-hc--cCCcccCCCcceecceeeeeccCCCCCeEEEecc--hHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeec Q lcl|NC_018271. 214 Q-AR--SNGTFLNPNEFDFEGYTLTEIKGLPASRMVGYNR--DNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSA 288 (305) Q Consensus 214 ~-~~--k~~~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~--sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~ 288 (305) + ++ .+.++.+++-...-|++|++...+|.+--..|.. +|-- .-+.++. .. T Consensus 217 ~d~~~~~~g~~~~G~v~~v~Gv~Vv~SnnlP~~a~~it~~~ls~a~----~G~~y~~--t~------------------- 271 (402) T protein:vir:97 217 KTYTISQSGATINGFVLSSYNCPVIPSNRFPTFAQDQAHHLLSNED----NGYRYDP--IA------------------- 271 (402) T ss_pred hhhccccCCccccceeEEEeceEEEecCccccccccccccccccCC----CCccCCc--Cc------------------- Confidence 2 22 3344566666779999999999999642211211 1110 0111221 11 Q ss_pred ceeeccCCeEEEecCCC Q lcl|NC_018271. 289 GVEYAYGAEIVLYTPAA 305 (305) Q Consensus 289 d~~i~fg~E~v~~~~~~ 305 (305) |+.- .=.++|+|+| T Consensus 272 d~t~---~~~~~f~~~A 285 (402) T protein:vir:97 272 EMNG---AVAVLFTSDA 285 (402) T ss_pred ccce---eEEEEEecce Confidence 2211 1245778877 No 136 >protein:vir:5694 Length: 357 # NCBI annotation: gpN # Family: family:all:201 # MgeID: mge:120 # MgeName: L-413C # Cross-refs: genbank:acc:NP_839853;genbank:gi:30065708;genbank:GeneID:1260602 Probab=84.38 E-value=0.061 Score=27.25 Aligned_cols=280 Identities=14% Similarity=0.107 Sum_probs=136.9 Q ss_pred CceEeee-------ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccch--hhhhccccCCCCCCCCcc-ce Q lcl|NC_018271. 
1 MATTVDI-------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRR--MNTTDDFVDYSCGFTPSG-EV 70 (305) Q Consensus 1 ma~~~~~-------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~--~~~~~~~q~~~~~~~~~G-~~ 70 (305) +|..... +-+=.+.+.-+|-..+..+.+..+. |.|+| +..+.=-. +...-.+-++ .-|++| +. T Consensus 16 ~A~~ngv~~~d~~~~FsV~P~v~q~L~~~i~ess~FL~~--INvv~---V~e~~Ge~i~lg~~g~iagr--tdT~~~~~R 88 (357) T protein:vir:56 16 VAELNGIDAGDVSKKFTVEPSVTQTLMNTMQESSDFLTR--INIVP---VSEMKGEKIGIGVTGSIAST--TDTAGGTER 88 (357) T ss_pred HHHHhCCChHHhcceeecCHHHHHHHHHHHHHHHHHhcc--CCccc---cccceeeEEecccCcccccc--ccCCCCCCc Confidence 2222211 1244455555555666666666555 44421 11111000 1111111111 111221 11 Q ss_pred Eecc-eeeeeeeeeEeec-cC---HHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc-- Q lcl|NC_018271. 71 DINE-KQLTLKKIKSDKE-VC---KEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT-- 143 (305) Q Consensus 71 ~~~~-K~L~~~~~k~~~~-~~---P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~-- 143 (305) +... -.|+...|.|.-+ || |=+-...|-.. | .-|.-+-..+.+++|-+.=...|+|-+-+ T Consensus 89 ~~~~~~~l~~~~Y~c~qTn~dt~i~Y~~lD~WA~~------------~-dF~~r~~~~i~~~~ALD~i~IGfNGts~A~~ 155 (357) T protein:vir:56 89 QPKDFSKLASNKYECDQINFDFYIRYKTLDLWARY------------Q-DFQLRVRNAIIKRQSLDFIMAGFNGVKRAET 155 (357) T ss_pred ccccccccCCCccEEEEeeecccccHHHHHHHhcC------------h-hHHHHHHHHHHHHHhhccceecccceeeecc Confidence 1111 2445555554322 22 22333445211 1 22333444556667766666677885321 Q ss_pred ---------ch-hHHHHHHHhhccceE----------------EeccCcCcCChhhHHHHHHHHHHh-ccHHHHhCCCcE Q lcl|NC_018271. 144 ---------GN-LQGILPLLEADATVI----------------DVVGASGGITAANVEAELGKFIDA-HTDEILQAPNHV 196 (305) Q Consensus 144 ---------~~-fdG~lk~i~~d~~~~----------------~~~~~~~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~ 196 (305) .| =-||++++.+.+..- ...+.. -+..|.-+.+.+..+. 
||+.+|+.+.|+ T Consensus 156 Td~~~nPllqDVN~GWlQ~~Re~ap~rVm~~~~~~~g~~~~~~i~~G~~--gdy~NLDalV~D~~~~lI~~~~~~d~dLV 233 (357) T protein:vir:56 156 SDRSSNPMLQDVAVGWLQKYRNEAPARVMSKVTDEEGHTTSEVIRVGKG--GDYASLDALVMDATNNLIEPWYQEDPDLV 233 (357) T ss_pred CChhhCcCccccchhHHHHHHhhchhhhhccccccCCccccceeeecCC--CCcccHHHHHHHHHhccCChHHhcCCCEE Confidence 11 269999998755321 011221 2468999999999986 799999999999 Q ss_pred EEecHHHHH-HHHHHHhhhhccCCcccCCC----cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhh Q lcl|NC_018271. 197 FGVSTNVIR-AIKRAYGTQARSNGTFLNPN----EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNE 265 (305) Q Consensus 197 ~f~S~~~~d-~Y~d~~~~~~~k~~~~t~~~----~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~ 265 (305) .+|+.++.. +|-.-|. +.++..+..... ..++-|.+-+.++-+|++.|+.|.=+||-+= .|.+ .++ T Consensus 234 vivG~dLla~k~~~l~n-~~~~pTE~~Aa~~i~s~k~iGGl~a~~~PfFP~~~llVT~L~NLsIY--~Q~gs~RR~~~d~ 310 (357) T protein:vir:56 234 VIVGRQLLADKYFPIVN-KEQDNSEMLAADVIISQKRIGNLPAVRVPYFPADAMLITKLENLSIY--YMDDSHRRVIEEN 310 (357) T ss_pred EEEchhhhhhhhhhHhh-ccCChHHHHHHHHHHHhhhhCCceeEEccccCCCceEEeeccccEEE--EecCcEEEEEEec Confidence 999988765 4433332 222211111111 2368899999999999999999999999532 2222 244 Q ss_pred cccccee-----eeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 266 IRIKDMG-----DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 266 I~I~~~~-----~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++.+++. |-.|-.+.+-++-+=-...|+.+++-.-=.|.. 
T Consensus 311 p~r~riE~y~s~Ne~YvVEd~~~~a~iE~i~i~~~~~~~~~~~~~ 355 (357) T protein:vir:56 311 PKLDRVENYESMNIDYVVEDYAAGCLVEKIKVGDFSTPAKATEEP 355 (357) T ss_pred cccccccchhhhcceeeeeccccEEEeeeeeeccCCCCcccCCCC Confidence 5555554 444444433332222233333222211111111 No 137 >protein:vir:100884 Length: 389 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1473 # MgeName: Lc-Nu # Cross-refs: genbank:acc:YP_358764;genbank:gi:78000028;genbank:GeneID:3726155 Probab=83.92 E-value=0.065 Score=27.11 Aligned_cols=254 Identities=9% Similarity=0.039 Sum_probs=123.8 Q ss_pred CceEeeeeccc-ch-hHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhh-hccccCCCC-CCCC-ccceEecce Q lcl|NC_018271. 1 MATTVDITTNY-VG-EVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNT-TDDFVDYSC-GFTP-SGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~~~Y-~G-e~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~-~~~~q~~~~-~~~~-~G~~~~~~K 75 (305) |+........| .+ ++..+|+.....-..+.+. ++++|-= ......+..+. ......... +-.+ .++.+|.+. T Consensus 109 ~~~~t~~~gg~~vP~~~~~~i~~~~~~~~~l~~~--~~~~~~~-~~~~~~~~~~~~~~~~~~~~E~~~~~~~~~~~~~~i 185 (389) T protein:vir:10 109 TSKVTSTEAGVLIPEEIIYDPTAEVNSVVDLSTL--VTKTPVT-TPKGTYPILKRATDRFSSVAELAENPKLAEPEFNKV 185 (389) T ss_pred hcccccCCcceeehHHHHHHHHHHHHhhhhHHhh--cceeecc-CCeeEEEEEecCCCccccccccccccccccccceee Confidence 33332222222 22 3344555555555554433 5655521 11112222221 111111111 1122 246788999 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) .+.++++-+...++=.-+ ++-++..+.++.+.|.++++...+..++.|+++.. 
T Consensus 186 ~~~~~k~~~~~~iS~ell----------------~ds~~~l~~~i~~~la~~~~~~~~~~i~~g~~~~~----------- 238 (389) T protein:vir:10 186 DWSVATYRGAIPLSEEAI----------------ADSAVDLTALVGQSIKEKSVNTYNAMIAPVLQSFT----------- 238 (389) T ss_pred eeeheeeEeeehhhHHHH----------------hhhhHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc----------- Confidence 999999887776553211 22245566788888999999888888887766421 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHH-HhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---cc---cC-CCcc Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFI-DAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---TF---LN-PNEF 227 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~-~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~~---t~-~~~~ 227 (305) +...+.+.-.+.+.++. ..++..+ +=.++||...|...+- ++.-.|.+. +. +. +... T Consensus 239 ----------~~~~~~~~~~d~l~~~~~~~~~~~~----~a~~~~n~~~~~~L~~-lkd~~G~~i~~~~~~~~~~~~~~~ 303 (389) T protein:vir:10 239 ----------AKKTTTDTLVDSLKHILNVDLDPAY----SRALVVTQSLFNTLDT-LKDKNGRYLLHDASDSITDGTAKG 303 (389) T ss_pred ----------cccccccccHHHHHHHHHhhhhhhh----CcEEEecHHHHHHHHH-hhccCCCeeeecCccccccccccc Confidence 01111222244444444 3455543 1279999999866653 322222211 11 11 2234 Q ss_pred eecceeeeeccC-CC-CC----eEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEe Q lcl|NC_018271. 228 DFEGYTLTEIKG-LP-AS----RMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLY 301 (305) Q Consensus 228 ~~kGi~iv~l~~-~P-d~----~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~ 301 (305) .+.|++++.+.+ +| .. .++..+=++.++. -|-..+.|..... ......| .-.+-+|+.+.-++=||.. T Consensus 304 ~l~G~pV~~~~~~~~~~~~~~~~~~~gd~~~~~~~----~~~~~~~i~~~~~-~~~~~~~-~~~~r~d~~~~~~~a~~~~ 377 (389) T protein:vir:10 304 TILGVPVYVVGDTLLGSLAGDQKAFVGDLKRGVLF----TDRQQVTLAWEDS-KIYGKYL-GAAFRFGVQKADSKAGYFV 377 (389) T ss_pred ccccceeEEecccccCCCCCceEEEEeeccccEEE----EeecceEEEeecc-ccccceE-EEEEEeccEEecccceEEE Confidence 699999876643 33 22 3555565554221 1222233332111 1112222 2334568888777777766 Q ss_pred c--CCC Q lcl|NC_018271. 
302 T--PAA 305 (305) Q Consensus 302 ~--~~~ 305 (305) + +.+ T Consensus 378 ~~~~~~ 383 (389) T protein:vir:10 378 TNTDVP 383 (389) T ss_pred EeeccC Confidence 5 322 No 138 >protein:vir:2016 Length: 357 # NCBI annotation: gpN # Family: family:all:201 # MgeID: mge:315 # MgeName: P2 # Cross-refs: genbank:acc:NP_046760;genbank:gi:9630331;genbank:GeneID:1261541 Probab=83.80 E-value=0.066 Score=27.07 Aligned_cols=275 Identities=15% Similarity=0.131 Sum_probs=136.4 Q ss_pred CceEeee-------ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccch--hhhhccccCCCCCCCCcc-ce Q lcl|NC_018271. 1 MATTVDI-------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRR--MNTTDDFVDYSCGFTPSG-EV 70 (305) Q Consensus 1 ma~~~~~-------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~--~~~~~~~q~~~~~~~~~G-~~ 70 (305) +|..... +-+=.+.+.-+|-..+..+.+..+. |.|+| +..+.=-. +...-.+-++ .-|++| +. T Consensus 16 ~A~~ngv~~~d~~~~FsV~P~v~q~L~~~i~ess~FL~~--INvv~---V~e~~Ge~i~lg~~g~iagr--tdT~~~~~R 88 (357) T protein:vir:20 16 VAELNGIDAGDVSKKFTVEPSVTQTLMNTMQESSDFLTR--INIVP---VSEMKGEKIGIGVTGSIAST--TDTAGGTER 88 (357) T ss_pred HHHHhCCChHHhcceeecCHHHHHHHHHHHHHHHHHhcc--CCccc---cccceeeEEecccCcccccc--ccCCCCCCc Confidence 2222211 1244455555555556666666555 44421 11111000 1111111111 111221 11 Q ss_pred Eecc-eeeeeeeeeEeec-cC---HHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc-- Q lcl|NC_018271. 71 DINE-KQLTLKKIKSDKE-VC---KEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT-- 143 (305) Q Consensus 71 ~~~~-K~L~~~~~k~~~~-~~---P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~-- 143 (305) +... -.|+...|.|.-+ || |=+-...|-.. 
| .-|.-+-..+.+++|-+.=...|+|-+-+ T Consensus 89 ~~~~~~~l~~~~Y~c~qTn~dt~i~Y~~lD~WA~~------------~-dF~~r~~~~i~~~~ALD~i~IGfNGts~A~~ 155 (357) T protein:vir:20 89 QPKDFSKLASNKYECDQINFDFYIRYKTLDLWARY------------Q-DFQLRIRNAIIKRQSLDFIMAGFNGVKRAET 155 (357) T ss_pred ccccccccCCCccEEEEeeecccccHHHHHHHhcC------------h-hHHHHHHHHHHHHHhhccceecccceeeecc Confidence 1111 2445555554322 22 22333445211 1 22333444556666766666677885321 Q ss_pred ---------ch-hHHHHHHHhhccceEE----------------eccCcCcCChhhHHHHHHHHHHh-ccHHHHhCCCcE Q lcl|NC_018271. 144 ---------GN-LQGILPLLEADATVID----------------VVGASGGITAANVEAELGKFIDA-HTDEILQAPNHV 196 (305) Q Consensus 144 ---------~~-fdG~lk~i~~d~~~~~----------------~~~~~~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~ 196 (305) .| =-||++++.+.+..-. ..+.. -+..|.-+.+.+..+. ||+.+|+.+.|+ T Consensus 156 Td~~~nPllqDVN~GWlQ~~Re~ap~rVm~~~~~~~g~~~~~~i~~G~~--gdy~NLDalV~D~~~~lI~~~~~~d~dLV 233 (357) T protein:vir:20 156 SDRSSNPMLQDVAVGWLQKYRNEAPARVMSKVTDEEGRTTSEVIRVGKG--GDYASLDALVMDATNNLIEPWYQEDPDLV 233 (357) T ss_pred CChhhCcCccccchhHHHHHHhhchhhhhccccccccccccceeeecCC--CCcccHHHHHHHHHhccCChHHhcCCCEE Confidence 11 2699999987553210 11111 2468899999999986 799999999999 Q ss_pred EEecHHHHH-HHHHHHhhhhccCCcccCCC----cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhh Q lcl|NC_018271. 197 FGVSTNVIR-AIKRAYGTQARSNGTFLNPN----EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNE 265 (305) Q Consensus 197 ~f~S~~~~d-~Y~d~~~~~~~k~~~~t~~~----~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~ 265 (305) .+|+.++.. +|-.-|. +.++..+..... 
..++-|.+-+.++-+|++.|+.|.=+||-+= .|.+ .++ T Consensus 234 vivG~dLla~k~~~l~n-~~~~ptE~~Aa~~i~s~k~iGGl~a~~~PfFP~~~ilVT~L~NLsIY--~Q~gs~RR~~~d~ 310 (357) T protein:vir:20 234 VIVGRQLLADKYFPIVN-KEQDNSEMLAADVIISQKRIGNLPAVRVPYFPADAMLITKLENLSIY--YMDDSHRRVIEEN 310 (357) T ss_pred EEEchhhhhhhhhhHhh-ccCChHHHHHHHHHHHhhhhCCceeEEccccCCCceEEeeccccEEE--EecCcEEEEEEec Confidence 999988765 4433332 222211111111 2368899999999999999999999999532 2222 244 Q ss_pred cccccee-----eeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 266 IRIKDMG-----DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 266 I~I~~~~-----~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++.+++. |-.|-.+.+-++-+=-.+.|+..+ .|+. T Consensus 311 p~r~riE~y~s~Ne~YvVEd~~~~a~iE~i~~~~~~-----~p~~ 350 (357) T protein:vir:20 311 PKLDRVENYESMNIDYVVEDYAAGCLVEKIKVGDFS-----TPAK 350 (357) T ss_pred cccccccchhhhcceeeeeccccEEEeeeeeecccc-----CCcc Confidence 5555554 444444433333222223332222 2333 No 139 >protein:vir:97433 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1676 # MgeName: 92 # Cross-refs: genbank:acc:YP_240749;genbank:gi:66396420;genbank:GeneID:5133789 Probab=83.15 E-value=0.071 Score=26.89 Aligned_cols=259 Identities=14% Similarity=0.137 Sum_probs=129.4 Q ss_pred CceEeeee-cccchhHHHHHHHHhhccccchhcCceEE---ecCCCCcccccchhhhhccccCCCC-CCCCccceEecce Q lcl|NC_018271. 1 MATTVDIT-TNYVGEVAGGYFLEMVKEANTISDNLIRV---IPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~-~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v---~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K 75 (305) ||+...=. .-...|+...++..-+.+. ++-.++..+ +.|.+.++.-+|+.+.....+.+.. .--+-+..+.++. 
T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~-l~~~~~~~~d~~l~g~~G~tv~iP~~~~~g~a~~~~~g~~i~~~~lt~~~~ 79 (274) T protein:vir:97 1 MPQGLTKTSDQIIPEVLAPMMQAQLEKK-LRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQVVAEGEKIPTDILETKKR 79 (274) T ss_pred CCccceehhheechHHHHHHHHHhhhhh-hhhcccceecccccCCCCCEEEEeeecCCCccccccCCCccccccccccee Confidence 99865422 2445555555554433322 222222222 2455555544554443334555542 1112234455566 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) ++..++. ...|-..|... .+ ..|+ |- ....+.++..++..+...+. ..+.. T Consensus 80 ~~~i~~~--~~~~~i~D~~~----~~-~~~d------p~---~~~~~~~a~a~a~~vd~~~~-------------~~l~~ 130 (274) T protein:vir:97 80 EAKIRKI--AKGTSITDEAL----LS-GYGD------PQ---GEQVRQHGLAHANKVDNDVL-------------EALMG 130 (274) T ss_pred EEEeeee--cceecccHHHH----Hh-ccch------HH---HHHHHHHHHHHHHHHHHHHH-------------HHHhc Confidence 6666443 33455555431 11 2122 11 11224444455555543322 11111 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc-----cCCCcceec Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF-----LNPNEFDFE 230 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~-----t~~~~~~~k 230 (305) ...++ ....++...+++++..+ .+. ....-.++||+.+|-.....-..++-+..+. .++-.+.|. 
T Consensus 131 a~~~~----~~~~~~~d~i~dA~~~l----~d~--~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~ 200 (274) T protein:vir:97 131 AKLTV----NADITKLNGLQSAIDKF----NDE--DLEPMVLFVNPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEAL 200 (274) T ss_pred cCccc----cccccCHHHHHHHHHHh----hcc--CCCceEEEeCHHHHHHHHhhhhhhccccCcccccceeccccceec Confidence 11111 12234444444444333 222 1223489999998877754322222221111 233456899 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |++|+.-.++|++.+++..++.+-++. ..|.. ++-+|-.+ ...-.+...+-|++.+..++-+|+-|+.. T Consensus 201 G~~Vi~s~~~p~~t~~l~~~gA~~~~~--~~~~~-vE~~Rd~~---~~~d~i~~~~~y~~~~~~~~~vv~~t~~~ 269 (274) T protein:vir:97 201 GAIIVRTNKLEAGTAILAKKGAVKLIL--KRDFF-LEVARDAS---TKTTALYSDKHYVAYLYDESKAVKITKGS 269 (274) T ss_pred CeeEEEcCCCCcceEEEEeCcceEeee--cCCce-eccccchh---hcccEEEEEEEEEEEEEcCCceEEEecCc Confidence 999999999999999998888876543 23332 33332222 11133444455689999999999999887 No 140 >protein:vir:94494 Length: 274 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1508 # MgeName: 88 # Cross-refs: genbank:acc:YP_240676;genbank:gi:66396348;genbank:GeneID:5133758 Probab=83.15 E-value=0.071 Score=26.89 Aligned_cols=259 Identities=14% Similarity=0.137 Sum_probs=129.4 Q ss_pred CceEeeee-cccchhHHHHHHHHhhccccchhcCceEE---ecCCCCcccccchhhhhccccCCCC-CCCCccceEecce Q lcl|NC_018271. 1 MATTVDIT-TNYVGEVAGGYFLEMVKEANTISDNLIRV---IPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~-~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v---~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K 75 (305) ||+...=. .-...|+...++..-+.+. ++-.++..+ +.|.+.++.-+|+.+.....+.+.. .--+-+..+.++. 
T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~-l~~~~~~~~d~~l~g~~G~tv~iP~~~~~g~a~~~~~g~~i~~~~lt~~~~ 79 (274) T protein:vir:94 1 MPQGLTKTSDQIIPEVLAPMMQAQLEKK-LRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQVVAEGEKIPTDILETKKR 79 (274) T ss_pred CCccceehhheechHHHHHHHHHhhhhh-hhhcccceecccccCCCCCEEEEeeecCCCccccccCCCccccccccccee Confidence 99865422 2445555555554433322 222222222 2455555544554443334555542 1112234455566 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) ++..++. ...|-..|... .+ ..|+ |- ....+.++..++..+...+. ..+.. T Consensus 80 ~~~i~~~--~~~~~i~D~~~----~~-~~~d------p~---~~~~~~~a~a~a~~vd~~~~-------------~~l~~ 130 (274) T protein:vir:94 80 EAKIRKI--AKGTSITDEAL----LS-GYGD------PQ---GEQVRQHGLAHANKVDNDVL-------------EALMG 130 (274) T ss_pred EEEeeee--cceecccHHHH----Hh-ccch------HH---HHHHHHHHHHHHHHHHHHHH-------------HHHhc Confidence 6666443 33455555431 11 2122 11 11224444455555543322 11111 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc-----cCCCcceec Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF-----LNPNEFDFE 230 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~-----t~~~~~~~k 230 (305) ...++ ....++...+++++..+ .+. ....-.++||+.+|-.....-..++-+..+. .++-.+.|. 
T Consensus 131 a~~~~----~~~~~~~d~i~dA~~~l----~d~--~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~ 200 (274) T protein:vir:94 131 AKLTV----NADITKLNGLQSAIDKF----NDE--DLEPMVLFVNPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEAL 200 (274) T ss_pred cCccc----cccccCHHHHHHHHHHh----hcc--CCCceEEEeCHHHHHHHHhhhhhhccccCcccccceeccccceec Confidence 11111 12234444444444333 222 1223489999998877754322222221111 233456899 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |++|+.-.++|++.+++..++.+-++. ..|.. ++-+|-.+ ...-.+...+-|++.+..++-+|+-|+.. T Consensus 201 G~~Vi~s~~~p~~t~~l~~~gA~~~~~--~~~~~-vE~~Rd~~---~~~d~i~~~~~y~~~~~~~~~vv~~t~~~ 269 (274) T protein:vir:94 201 GAIIVRTNKLEAGTAILAKKGAVKLIL--KRDFF-LEVARDAS---TKTTALYSDKHYVAYLYDESKAVKITKGS 269 (274) T ss_pred CeeEEEcCCCCcceEEEEeCcceEeee--cCCce-eccccchh---hcccEEEEEEEEEEEEEcCCceEEEecCc Confidence 999999999999999998888876543 23332 33332222 11133444455689999999999999887 No 141 >protein:vir:99523 Length: 311 # NCBI annotation: putative protein # Family: family:all:701 # MgeID: mge:1559 # MgeName: Lj928 # Cross-refs: genbank:acc:NP_958538;genbank:gi:41179320;genbank:GeneID:2717161 Probab=81.44 E-value=0.086 Score=26.43 Aligned_cols=266 Identities=14% Similarity=0.109 Sum_probs=111.1 Q ss_pred CceEee-----eecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCC--CCCCCccceEec Q lcl|NC_018271. 1 MATTVD-----ITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS--CGFTPSGEVDIN 73 (305) Q Consensus 1 ma~~~~-----~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~~~~~G~~~~~ 73 (305) |-+.-| --+-|++. |.+-|...+.++.+.+.. 
..++.| -++..+|++.+ .++.+|+ .+|+ .|+++.+ T Consensus 1 ~~~~an~mAlnya~~~~~~-Ld~~~~~~~~t~~l~~~~-~~~~~G--ak~VkIp~i~~-~gl~dY~R~~g~~-~g~v~~~ 74 (311) T protein:vir:99 1 MPTDAETRGFNYVTKDGNL-LDQKITAGLFTAALGTPE-VDLVNG--GRSFTLKTIST-SGLKDHTRGKGFN-SGTISDE 74 (311) T ss_pred CCCcchhhHHHHHHHHHHH-HHHHHHhhhcccceecCc-hheeec--CCEEEEEeeee-ccccccccccCcc-ccceeee Confidence 443333 23456665 555556666666666655 334445 45677898885 7887775 5776 4766554 Q ss_pred cee--eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 74 EKQ--LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 74 ~K~--L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) .++ |.-.|. ..|.+.|.|...+ ....+..+.+=+-...++.=|+. +--|.-+.+ T Consensus 75 ~et~tl~~DR~-~~f~vD~mDvdET--------------n~~~~~ani~~~f~r~~vvPEiD---------ayrfskla~ 130 (311) T protein:vir:99 75 KTIYTMGQDRD-VEFYLDRQDVDET--------------DNELAMANISNVFITEHVQPELD---------SYRFSKIAT 130 (311) T ss_pred eeEEEeeeccc-eeeecchhchhhh--------------hhhhHHHHHHHHHHHhhhcchhh---------HHHHHHHHh Confidence 444 433332 2233344343221 00000001110000001111121 111211111 Q ss_pred HHhh-------ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc--- Q lcl|NC_018271. 152 LLEA-------DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF--- 221 (305) Q Consensus 152 ~i~~-------d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~--- 221 (305) ...+ .+........+.++|.+|+.+.++.....+-+ + ...++.+|||+..+...+++- ...+..+. 
T Consensus 131 ~a~~~~~~~~~~~~~~~~~~~~~~lt~~nvl~~l~~~~~~~~~-v-~~~~rvl~vTp~~~~lLk~~~--~~~r~~~~~~~ 206 (311) T protein:vir:99 131 SFDNLDGTDTEGTLLAKTHKTEETLDETNAYSQLKTGIGKVRK-Y-GTQNLVGYVSSEVMDALERSK--EFTRNITNQNV 206 (311) T ss_pred hhhcccccccchhhhccccccccccCHHHHHHHHHHHHHHHHh-c-CCCCeEEEEChHHHHHHhhch--hhheeeecccc Confidence 1100 00000111123568999999999988877754 2 235699999999998777532 11111111 Q ss_pred ----cCCCcceecceeeeeccCCCCCeEEE----ecc---------hHHhhhhh----hhhhhhhccccce-eeecccee Q lcl|NC_018271. 222 ----LNPNEFDFEGYTLTEIKGLPASRMVG----YNR---------DNIVIGMS----AQSDFNEIRIKDM-GDVDLSGQ 279 (305) Q Consensus 222 ----t~~~~~~~kGi~iv~l~~~Pd~~ii~----T~~---------sNl~~gvn----l~~D~n~I~I~~~-~~~~~~~~ 279 (305) -+.+-..+.|++|+.+ +|++|+-. |.. =|+++--. ...=.+.+++..= ++. ++. T Consensus 207 ~~~~i~~~V~~lDgv~Ii~V--~ps~r~~t~~~ft~G~~~~~~ak~INfiiv~~~a~i~~~K~~~v~~f~P~~~~--~gd 282 (311) T protein:vir:99 207 GTTALESRITSIDGVQLIEV--YESNRFMTKYDFTDGAKPTEDAKAINFLVVAKPAVISIVKENAVFLFAPGQHT--DGD 282 (311) T ss_pred cccccccccceecCeEEEEe--cCchhhcchhhhcCCccccCcccccceEEeCCCeeeeeeeeeeeeeeCCCCCC--Ccc Confidence 2333446888887776 45555431 110 02211000 0000111111110 010 111 Q ss_pred -EEEEEEeecceeeccCCe---EEEecCC Q lcl|NC_018271. 280 -IRTKMVLSAGVEYAYGAE---IVLYTPA 304 (305) Q Consensus 280 -~f~k~~m~~d~~i~fg~E---~v~~~~~ 304 (305) |..+...=.|.-+.=... .|-+.-| T Consensus 283 ~~l~~~R~Y~D~fv~~nk~~~Iyv~~k~A 311 (311) T protein:vir:99 283 GYLYQNRLYHDLFIKKHKRDGIFVSVKKA 311 (311) T ss_pred eeeeeeeeeeeeeeeccccCeEEEeeecC Confidence 333333222322210000 0111111 No 142 >protein:vir:104011 Length: 337 # NCBI annotation: P2 family phage major capsid protein # Family: family:all:201 # MgeID: mge:1665 # MgeName: phi52237 # Cross-refs: genbank:acc:YP_293748;genbank:gi:72537718;genbank:GeneID:3608142 Probab=81.10 E-value=0.089 Score=26.35 Aligned_cols=269 Identities=12% Similarity=0.058 Sum_probs=133.1 Q ss_pred CceEeee---e--cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccch--hhhhccccCCCCCCCCccceE-e Q lcl|NC_018271. 
1 MATTVDI---T--TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRR--MNTTDDFVDYSCGFTPSGEVD-I 72 (305) Q Consensus 1 ma~~~~~---~--~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~--~~~~~~~q~~~~~~~~~G~~~-~ 72 (305) +|..... + -+=++.+.-+|-..+..+.+.-++ |.|+| | ..+.=-. +...-.+-+++. |+.++.+ - T Consensus 16 ~A~~ngv~~~~~~FsV~P~v~q~L~~~i~ess~FL~~--Invv~-V--~e~~Ge~v~lg~~g~iagrt~--t~~~~R~~~ 88 (337) T protein:vir:10 16 IAKLNDTGDVSKKFAVEPTVQQRLETKMQESSEFLKR--INVLP-V--TELEGEKLGLSVSGPIASRTD--TTKAARQPI 88 (337) T ss_pred HHHhcChhhhcceeeecHHHHHHHHHHHHHHHHhhcc--Cceec-c--ccceeeEEeeccCcceeeeec--CCCCccccc Confidence 3322221 1 133444555555555555555555 44422 1 1110000 011111111111 1111110 0 Q ss_pred cceeeeeeeeeEeec----cCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc----- Q lcl|NC_018271. 73 NEKQLTLKKIKSDKE----VCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT----- 143 (305) Q Consensus 73 ~~K~L~~~~~k~~~~----~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~----- 143 (305) +--.|+...|.|.-+ .-|=+-...|- +.+ .-|.-+-..+.+++|-+.=-..|+|-+-+ T Consensus 89 ~~~~l~~~~Y~c~qtn~dt~i~y~~LD~WA----~~~---------dF~~r~~~~i~~~~ALD~i~IGfnG~s~A~~Td~ 155 (337) T protein:vir:10 89 DPTALDSNRYRCEKTDYDTAIPYRKLDMWA----KFA---------DFQQRIRDVILNQGALDRIMIGWNGVKAAATTDR 155 (337) T ss_pred cccccCCCccEEEEeeeeeeccHHHHHHHh----cCh---------hHHHHHHHHHHHHHhhchhhhcccceeeccCCCh Confidence 001234444444211 11222233343 212 12333445556667766666777885421 Q ss_pred ------ch-hHHHHHHHhhccceEEeccC---c------CcCChhhHHHHHHHHHHh-ccHHHHhCCCcEEEecHHHHH- Q lcl|NC_018271. 144 ------GN-LQGILPLLEADATVIDVVGA---S------GGITAANVEAELGKFIDA-HTDEILQAPNHVFGVSTNVIR- 205 (305) Q Consensus 144 ------~~-fdG~lk~i~~d~~~~~~~~~---~------~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~~f~S~~~~d- 205 (305) .| =-||++++.+.+..-.+... + +.-+..|.-+.+.+..+. ||+.+|+++.|+.+|+.++.. 
T Consensus 156 ~~nPllqDVNkGWlQ~~Re~ap~rV~~~~~~~~~~i~iG~~gdy~nLDalV~D~~~~lI~~~~~~d~~LVvivG~dLlad 235 (337) T protein:vir:10 156 QANPLLQDVNIGWLQQYRERAAQRVLHEGAKQAGKVLVGKAGDYENLDALVMDIVSSMIDPWFQEDTGLVVICGRELLHD 235 (337) T ss_pred hhCcCccccchhHHHHHHhcchhhhhccccccCcceeecCCCCcccHHHHHHHHHhccCChHHhcCCCEEEEEchhhhhH Confidence 11 26999999884442111111 0 112468899999999985 899999999999999987765 Q ss_pred HHHHHHhhhhccCCcccC---CC----cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhcccccee Q lcl|NC_018271. 206 AIKRAYGTQARSNGTFLN---PN----EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEIRIKDMG 272 (305) Q Consensus 206 ~Y~d~~~~~~~k~~~~t~---~~----~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I~I~~~~ 272 (305) +|---+ .+....++ .. ..++-|.+-+.++-+|++.|+.|.=+||-+-+ |.+ .++++.+++. T Consensus 236 k~~~l~----n~~~~ptE~~Aa~~i~s~k~iGGlpa~~~PffP~~~~lVT~L~NLsIY~--Q~gs~RR~~~d~p~r~rie 309 (337) T protein:vir:10 236 KYFPIV----NATQAPTERLAADLIVSQKRIGNLPAVRVPFFPKRALMVTKLSNLSIYY--QEGARRRTLKEVPERDRIE 309 (337) T ss_pred HhhHHh----ccCCCcHHHHHHHHHHHhhhhCCceeEEccccCCCceEEeechhcEEEE--ecCcEEEEEEEcccccccc Confidence 333333 22222222 11 24788999999999999999999999996432 222 2445555553 Q ss_pred -----eeccceeEEEEEEeecceeeccCCe Q lcl|NC_018271. 273 -----DVDLSGQIRTKMVLSAGVEYAYGAE 297 (305) Q Consensus 273 -----~~~~~~~~f~k~~m~~d~~i~fg~E 297 (305) |-.|..+.+-++-+= + ||-+++- T Consensus 310 ~y~s~Ne~YvVEd~~~~a~i-e-nI~~~~a 337 (337) T protein:vir:10 310 NYESSNDAYVVEDFGCGCVA-E-NIELAAA 337 (337) T ss_pred chhhccceeeeeccccEEEE-e-ceeecCC Confidence 334444433333222 2 4444443 No 143 >protein:vir:98566 Length: 355 # NCBI annotation: gp5 # Family: family:all:201 # MgeID: mge:1533 # MgeName: PSP3 # Cross-refs: genbank:acc:NP_958060;genbank:gi:41057357;genbank:GeneID:2744237 Probab=80.52 E-value=0.094 Score=26.21 Aligned_cols=278 Identities=14% Similarity=0.123 Sum_probs=140.2 Q ss_pred CceEeee-------ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccch--hhhhccccCCCCCCCCccceE Q lcl|NC_018271. 
1 MATTVDI-------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRR--MNTTDDFVDYSCGFTPSGEVD 71 (305) Q Consensus 1 ma~~~~~-------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~--~~~~~~~q~~~~~~~~~G~~~ 71 (305) +|....+ +-+=++.+.-+|-..+..+.+.-++ |.|+| | ..+.=-. +...-.+-+++.. +++++.+ T Consensus 16 ~A~~ngv~~~~~~~~FsV~P~v~q~L~~~i~ess~FL~~--INvv~-V--~e~~Ge~i~lgv~g~iagrtdT-~~~~~R~ 89 (355) T protein:vir:98 16 VAELNNISTDDVSKKFTVEPSVTQTLMNTVQASSAFLKT--INILP-V--AEMKGEKIGVGVTGTIASTTDT-SGDKERQ 89 (355) T ss_pred HHHHhCCChhHccceeecCHHHHHHHHHHHHHHHHHhhc--Cceec-c--ccceeeEeeeccCccccccccC-CCCCCcc Confidence 3333222 1244566666666666667776666 55432 1 1111000 1111112222111 1122222 Q ss_pred ecc-eeeeeeeeeEeec----cCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc--- Q lcl|NC_018271. 72 INE-KQLTLKKIKSDKE----VCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT--- 143 (305) Q Consensus 72 ~~~-K~L~~~~~k~~~~----~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~--- 143 (305) ... -.|+...|.|.-+ .-|=+-...|-.. + .-|.-+-..+.+++|-+.=-..|+|-+-+ T Consensus 90 ~~~~~~l~~~~Y~c~qtn~dt~i~y~~LD~WA~~----~---------dF~~r~~~~i~k~~ALD~i~IGfNG~s~A~~T 156 (355) T protein:vir:98 90 TADFTALESSKYECNQINFDFHLKYKTLDLWARF----Q---------DFQRRIRDAIVKRQALDLIMAGFNGTTRADTS 156 (355) T ss_pred cccccccCCCccEEEEeeeeeeecHHHHHHHhcC----h---------hHHHHHHHHHHHHHhhchhhhcccceeeeccC Confidence 222 1234444444321 1223334455321 1 22344445566677777767777885421 Q ss_pred --------ch-hHHHHHHHhhccceEEecc--------------CcCcCChhhHHHHHHHHHHh-ccHHHHhCCCcEEEe Q lcl|NC_018271. 144 --------GN-LQGILPLLEADATVIDVVG--------------ASGGITAANVEAELGKFIDA-HTDEILQAPNHVFGV 199 (305) Q Consensus 144 --------~~-fdG~lk~i~~d~~~~~~~~--------------~~~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~~f~ 199 (305) .| =-||++++.+.+..-.+.. ..+.-+..|.-+.+.+..+. 
||+.+|+.+.|+.+| T Consensus 157 d~~~nPllqDVNkGWlQ~~Re~ap~~v~~~~~~~~~~~~~~~i~~G~~gdy~NLDAlV~D~~~~lI~~~~~~d~dLVviv 236 (355) T protein:vir:98 157 DRTKNTLLQDVAVGWLQKYRNEAPARVMSNITDADGKVVSAVIRVGKNGDYENIDALVMDATNNLIDEVYQDDPNLVAIV 236 (355) T ss_pred ChhhCcCccccchhHHHHHHhcchhhhhhhhcccCccccccceeeCCCCCcccHHHHHHHHHhccCChHHhcCCCEEEEE Confidence 11 2699999988554210000 00112468899999999986 799999999999999 Q ss_pred cHHHHH-HHHHHHhhhhccCCcccCCC----cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhccc Q lcl|NC_018271. 200 STNVIR-AIKRAYGTQARSNGTFLNPN----EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEIRI 268 (305) Q Consensus 200 S~~~~d-~Y~d~~~~~~~k~~~~t~~~----~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I~I 268 (305) +.++.. +|-.-|.+ ..+..+..... ..++-|.+-+.++-+|++.|+.|.=+||-+-+ |.+ .++++. T Consensus 237 G~dLla~k~~~l~n~-~~~ptE~~Aa~~i~s~k~iGGlpa~~~PffP~~~~lVT~L~NLsIY~--Q~gs~RR~~~d~p~r 313 (355) T protein:vir:98 237 GRKLLADKYFPLVNK-QQENSESLAADIIISQKRIGNLPAVRVPYFPANAVLVTTLENLSIYF--MDESHRRSIDENPKK 313 (355) T ss_pred chhhhHHHhhhHhhc-cCCcHHHHHHHHHHHhhhhCCceeEEccccCCCceEEeeccccEEEE--ecCcEEEEEEecccc Confidence 977543 55544432 12211111111 24788999999999999999999999996432 222 244555 Q ss_pred ccee-----eeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 269 KDMG-----DVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 269 ~~~~-----~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) +++. 
|-.|-.+.+-++-+= + +|.+++-- .|++ T Consensus 314 ~rie~y~s~Ne~YvVEd~~~~a~i-e-nI~~~~~~---~~~~ 350 (355) T protein:vir:98 314 DRVENYESMNIDYVVEVYAAGCLL-E-NITLGDFT---APAA 350 (355) T ss_pred ccccchhhhcceeeeeccccEEEe-e-ceeeeCCC---CCcc Confidence 5553 444444433322221 1 33222111 1222 No 144 >protein:vir:79171 Length: 337 # NCBI annotation: gp2, phage major capsid protein, P2 family # Family: family:all:201 # MgeID: mge:1866 # MgeName: phiE202 # Cross-refs: genbank:acc:YP_001111033;genbank:gi:134288740;genbank:GeneID:4960690 Probab=79.31 E-value=0.11 Score=25.94 Aligned_cols=269 Identities=12% Similarity=0.050 Sum_probs=133.3 Q ss_pred CceEeee---e--cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccch--hhhhccccCCCCCCCCccceE-e Q lcl|NC_018271. 1 MATTVDI---T--TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRR--MNTTDDFVDYSCGFTPSGEVD-I 72 (305) Q Consensus 1 ma~~~~~---~--~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~--~~~~~~~q~~~~~~~~~G~~~-~ 72 (305) +|..... + -+=++.+.-+|-..+..+.+.-++ |.|+| | ..+.=-. +...-.+-+++. |+.++.+ - T Consensus 16 ~A~~ngv~~~~~~FsV~P~v~q~L~~~i~ess~FL~~--Invv~-V--~e~~Ge~v~lg~~g~iagrt~--t~~~~R~~~ 88 (337) T protein:vir:79 16 IAKLNDTGDVSKKFAVEPTVQQRLETKMQESSEFLKR--INVLP-V--TELEGEKLGLSVSGPIASRTD--TTKAARQPI 88 (337) T ss_pred HHHhcChhhhcceeeecHHHHHHHHHHHHHHHHhhcc--Cceec-c--ccceeeEEeeccCcceeeeec--CCCCccccc Confidence 3222221 1 133444555555555555555555 44422 1 1110000 011111111111 1111110 0 Q ss_pred cceeeeeeeeeEeec----cCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc----- Q lcl|NC_018271. 
73 NEKQLTLKKIKSDKE----VCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT----- 143 (305) Q Consensus 73 ~~K~L~~~~~k~~~~----~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~----- 143 (305) +--.|+...|.|.-+ .-|=+-...|- +.+ .-|.-+-..+.+++|-+.=-..|+|-+-+ T Consensus 89 ~~~~l~~~~Y~c~qtn~dt~i~y~~LD~WA----~~~---------dF~~r~~~~i~~~~ALD~i~IGfnG~s~A~~Td~ 155 (337) T protein:vir:79 89 DPTALDSNRYRCEKTDYDTAIPYRKLDAWA----KFA---------DFQQRIRDVILNQGALDRIMIGWNGVKAAATTDR 155 (337) T ss_pred cccccCCCccEEEEeeeeeeccHHHHHHHh----cCh---------hHHHHHHHHHHHHHhhchhhhcccceeeccCCCh Confidence 001234444444211 11222233343 212 12333445556667766666777885421 Q ss_pred ------ch-hHHHHHHHhhccceEEeccC---c------CcCChhhHHHHHHHHHHh-ccHHHHhCCCcEEEecHHHHH- Q lcl|NC_018271. 144 ------GN-LQGILPLLEADATVIDVVGA---S------GGITAANVEAELGKFIDA-HTDEILQAPNHVFGVSTNVIR- 205 (305) Q Consensus 144 ------~~-fdG~lk~i~~d~~~~~~~~~---~------~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~~f~S~~~~d- 205 (305) .| =-||++++.+.+..-.+... + +.-+..|.-+.+.+..+. ||+.+|+.+.|+.+|+.++.. T Consensus 156 ~~nPllqDVNkGWlQ~~Re~ap~rV~~~~~~~~~~i~iG~~gdy~nLDalV~D~~~~lI~~~~~~d~~LVvivG~dLlad 235 (337) T protein:vir:79 156 QANPLLQDVNIGWLQQYRERAAQRVLHEGAKQAGKVLVGKAGDYENLDALVMDIVSSMIDPWFQEDTGLVAICGRELLHD 235 (337) T ss_pred hhCcCccccchhHHHHHHhcchhhhhccccccCcceeecCCCCcccHHHHHHHHHhccCChHHhcCCCEEEEEchhhhhH Confidence 11 26999999884442111111 1 112468899999999985 899999999999999987765 Q ss_pred HHHHHHhhhhccCCcccC---CC----cceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhcccccee Q lcl|NC_018271. 206 AIKRAYGTQARSNGTFLN---PN----EFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEIRIKDMG 272 (305) Q Consensus 206 ~Y~d~~~~~~~k~~~~t~---~~----~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I~I~~~~ 272 (305) +|---+ .+....++ .. ..++-|.+-+.++-+|++.|+.|.=+||-+-+ |.+ .++++.+++. 
T Consensus 236 k~~~l~----n~~~~ptE~~Aa~~i~s~k~iGGlpa~~~PffP~~~~lVT~L~NLsIY~--Q~gs~RR~~~d~p~r~rie 309 (337) T protein:vir:79 236 KYFPIV----NATQAPTERLAADLIVSQKRIGNLPAVRVPFFPKRALMVTKLSNLSIYY--QEGARRRTLKEVPERDRIE 309 (337) T ss_pred HhhHHh----ccCCCcHHHHHHHHHHHhhhhCCceeEEccccCCCceEEeechhcEEEE--ecCcEEEEEEEcccccccc Confidence 333333 22222222 11 24788999999999999999999999996432 222 2445555553 Q ss_pred -----eeccceeEEEEEEeecceeeccCCe Q lcl|NC_018271. 273 -----DVDLSGQIRTKMVLSAGVEYAYGAE 297 (305) Q Consensus 273 -----~~~~~~~~f~k~~m~~d~~i~fg~E 297 (305) |-.|..+.+-++-+= + ||-+++- T Consensus 310 ~y~s~Ne~YvVEd~~~~a~i-e-nI~~~~a 337 (337) T protein:vir:79 310 NYESSNDAYVVEDFGCGCVA-E-NIELAAA 337 (337) T ss_pred chhhccceeeeeccccEEEE-e-ceeecCC Confidence 334444433333222 2 4444443 No 145 >protein:vir:78186 Length: 337 # NCBI annotation: gp2, phage major capsid protein, P2 family # Family: family:all:201 # MgeID: mge:1848 # MgeName: phiE12-2 # Cross-refs: genbank:acc:YP_001111152;genbank:gi:134288735;genbank:GeneID:4960646 Probab=75.23 E-value=0.15 Score=25.12 Aligned_cols=269 Identities=11% Similarity=0.043 Sum_probs=132.0 Q ss_pred CceEeee---e--cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccch--hhhhccccCCCCCCCCccceE-e Q lcl|NC_018271. 1 MATTVDI---T--TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRR--MNTTDDFVDYSCGFTPSGEVD-I 72 (305) Q Consensus 1 ma~~~~~---~--~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~--~~~~~~~q~~~~~~~~~G~~~-~ 72 (305) +|..... . -+=.+.+.-+|-..+..+.+.-+. |.|+| +..+.=-. +...-.+-+++ -|+.++.+ - T Consensus 16 ~A~~ngv~~~~~~FsV~P~v~q~L~~~i~ess~FL~~--INvv~---V~e~~Ge~v~lg~~g~iagrt--dt~~~~R~~~ 88 (337) T protein:vir:78 16 IAKLNDTGDVSKKFAVEPTVQQRLETKMQESSEFLKR--INVLP---VTELEGEKLGLSVSGPIASRT--DTTKAARQPI 88 (337) T ss_pred HHHhcChhhhcceeecChHHHHHHHHHHHHHHHHhcc--CCccc---cccceeeEEecccCcceeeee--cCCCcccccc Confidence 2222211 1 133444555555555555555555 44321 11110000 00111111111 11111110 0 Q ss_pred cceeeeeeeeeEeec-c---CHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc----- Q lcl|NC_018271. 
73 NEKQLTLKKIKSDKE-V---CKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT----- 143 (305) Q Consensus 73 ~~K~L~~~~~k~~~~-~---~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~----- 143 (305) +--.|+...|.|.-+ | -|=+-...|- +.+ .-|.-+-..+.+++|-+.=...|+|-+-+ T Consensus 89 ~~~~l~~~~Y~c~qTn~dt~i~Y~~lD~WA----~~~---------dF~~r~~~~i~~~~ALD~i~IGfNGts~A~~Td~ 155 (337) T protein:vir:78 89 DPTALDSNRYRCEKTDYDTAIPYRKLDMWA----KFA---------DFQQRIRDVILNQGALDRIMIGWNGVKAAATTDR 155 (337) T ss_pred cccccCCCccEEEEeceecccCHHHHHHHh----cCh---------hHHHHHHHHHHHHHhhccceecccceeeccCCCh Confidence 011244555554221 1 1112223342 212 12233334455566666666667885321 Q ss_pred ------ch-hHHHHHHHhhccceEEeccC---c------CcCChhhHHHHHHHHHHh-ccHHHHhCCCcEEEecHHHHH- Q lcl|NC_018271. 144 ------GN-LQGILPLLEADATVIDVVGA---S------GGITAANVEAELGKFIDA-HTDEILQAPNHVFGVSTNVIR- 205 (305) Q Consensus 144 ------~~-fdG~lk~i~~d~~~~~~~~~---~------~~iT~anv~~~l~~~~~~-iP~~~r~~~~l~~f~S~~~~d- 205 (305) .| =-||++++.+.+..-.+... + +.-+..|.-+.+.+..+. ||+.+|+++.|+.+|+.++.. T Consensus 156 ~~nPllqDVN~GWlQ~~Re~ap~rVl~~~~~~~~~i~iG~~gdy~NLDalV~d~~~~lI~~~~~~d~dLVvivG~dLlad 235 (337) T protein:vir:78 156 QANPLLQDVNIGWLQQYRERAAQRVLHEGAKQAGKVLIGKAGDYENLDALVMDIVSSMIDPWFQEDTGLVVICGRELLHD 235 (337) T ss_pred hhCcCccccchHHHHHHHhcchhhhhccccccCCceeecCCCCcccHHHHHHHHHhccCChHHhcCCCEEEEEchhhhHH Confidence 11 26999999875443111111 0 112468999999999985 899999999999999988765 Q ss_pred HHHHHHhhhhccCCcccCC-------CcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhh------hhhcccccee Q lcl|NC_018271. 206 AIKRAYGTQARSNGTFLNP-------NEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSD------FNEIRIKDMG 272 (305) Q Consensus 206 ~Y~d~~~~~~~k~~~~t~~-------~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D------~n~I~I~~~~ 272 (305) +|.--+ .+....++- ...++-|.+-+.++-+|++.|+.|.=+||-+= .|.+ .++++.+++. 
T Consensus 236 k~~~l~----n~~~~ptE~~Aa~~i~s~k~iGGl~a~~~PfFP~~~ilVT~L~NLsIY--~Q~gs~RR~~~d~p~r~rie 309 (337) T protein:vir:78 236 KYFPIV----NATQAPTERLAADLIVSQKRIGNLPAVRVPFFPKRALMVTKLSNLSIY--YQEGARRRTLKEVPERDRIE 309 (337) T ss_pred HHHHHH----hcCCCcHHHHHHHHHHHhhhhcCcceEEccccCCCceEEeechhcEEE--EecCcEEEEEEecccccccc Confidence 333333 222222221 12478899999999999999999999999532 2222 2445555553 Q ss_pred -----eeccceeEEEEEEeecceeeccCCe Q lcl|NC_018271. 273 -----DVDLSGQIRTKMVLSAGVEYAYGAE 297 (305) Q Consensus 273 -----~~~~~~~~f~k~~m~~d~~i~fg~E 297 (305) |-.|..+.+-++-+= + ||-+++- T Consensus 310 ~y~s~Ne~YvVEd~~~~a~i-E-nI~~~~a 337 (337) T protein:vir:78 310 NYESSNDAYVVEDFGCGCVA-E-NIELAAA 337 (337) T ss_pred chhhccceeeeeccccEEEE-e-ceeecCC Confidence 334444433333222 2 4444443 No 146 >protein:vir:78920 Length: 290 # NCBI annotation: Cps # Family: family:all:701 # MgeID: mge:1859 # MgeName: A006 # Cross-refs: genbank:acc:YP_001468846;genbank:gi:157325479;genbank:GeneID:5601917 Probab=73.96 E-value=0.16 Score=24.89 Aligned_cols=262 Identities=13% Similarity=0.105 Sum_probs=112.4 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCC--CCCCCccceEeccee-- Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS--CGFTPSGEVDINEKQ-- 76 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~~~~~G~~~~~~K~-- 76 (305) ||. +-.+-|++..++.+ ...+..+++.+.. +. +.-.++..+|++.+ .++.+|+ .+| ..|+++.+..+ T Consensus 1 Mai--n~a~~~~~~Ld~~~-~~~~~t~~l~~~~-~~---~~ggktVkI~~i~~-~gl~DY~R~~g~-~~g~v~~~~et~t 71 (290) T protein:vir:78 1 MAI--NYVDKYGKELDQKL-VFGTYTNELETPN-LL---WLDAKTFKIQTITT-TGLKAHTRNKGY-NEGSASNTNKSYT 71 (290) T ss_pred Cch--hHHHHHHHHHHHHH-Hhhheeeeccccc-ee---eccCCEEEEeeecc-CcccccccCCCc-ccCccccceeeEE Confidence 984 33467888855554 4444444554442 22 33467778998886 8888886 344 23455444333 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 
77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) |.-.|.-. |.+.|-|.... +.-.+..+.+-+-...+++-|+... -|.-+.+. T Consensus 72 l~qdR~~~-F~vD~~DvDEt--------------~~~~~~~nv~~ef~~~~v~PEiDay---------r~skla~~---- 123 (290) T protein:vir:78 72 IDFDRDVE-FFVDVMDVDET--------------GQALSAANVTKEFNSRHAGPEMDAY---------RFSKLATA---- 123 (290) T ss_pred eeccccce-eeccccchhHH--------------hhhhhHHHHHHHHHHHHhhhhhhHH---------HHHHHHhh---- Confidence 33322211 11112121110 0111112222222222344444311 12222222 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHH--Hhhhhcc---CCcccCCCcceecc Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRA--YGTQARS---NGTFLNPNEFDFEG 231 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~--~~~~~~k---~~~~t~~~~~~~kG 231 (305) +.+.+.. .+.++|++|+.++++++..++.+ + ...++.+|||+..+...+++ .+....- +...-+++-..+.| T Consensus 124 a~~~~~~-~~~t~t~~n~~~~i~~~~~~lde-v-p~~~rvl~vtp~~~~lL~~~~~f~r~~~~~~~~~~~i~~~V~~idG 200 (290) T protein:vir:78 124 AKTNSNS-VAEEITKDNVFTKLKAAIRKVKK-Y-GTQNLVMYVSPDVMAALELSDDFVRAINVQNIGPSSIETRITAIDG 200 (290) T ss_pred hhccCcc-cccccCHHHHHHHHHHHHHHHHh-c-CCCCeEEEECHHHHHHHhhChhhhccccccccccccccceeeeecC Confidence 2221211 23468999999999999988865 3 24568999999999988643 3211110 11112333456888 Q ss_pred eeeeeccCCCC-CeEEE----ec---------chHHhhhhh----hhhhhhhccccceeeeccceeEEEEEEeecceeec Q lcl|NC_018271. 232 YTLTEIKGLPA-SRMVG----YN---------RDNIVIGMS----AQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYA 293 (305) Q Consensus 232 i~iv~l~~~Pd-~~ii~----T~---------~sNl~~gvn----l~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~ 293 (305) ++|+.+ |+ +|+-. +. .=|+++.-. ...=.+.+++..=.--+-.-.|..+...=.|.-+. 
T Consensus 201 ~~ii~v---ps~~r~~t~~~f~~G~~~~~~ak~in~ii~~~~a~i~~~K~~~~~~~~P~~~~~~d~~~~~~r~y~d~~v~ 277 (290) T protein:vir:78 201 TRIVEV---EAEDRFYDTFDFTDGYKPAAGAKKLNFLLVNKGSVVGGAKHASIYLHAPGSVGQGDGWLYQYRVYHDIFVL 277 (290) T ss_pred cEEEEe---cccchhhhhhhhcccccccCCccceeEEEEcCCceeeeeeeeEEEeeCCCCCcCcceeeeeeeeeeeeeee Confidence 888876 43 23210 00 002211000 00001111211100000001133333333333322 Q ss_pred cCCeEEEecCCC Q lcl|NC_018271. 294 YGAEIVLYTPAA 305 (305) Q Consensus 294 fg~E~v~~~~~~ 305 (305) =...--+|.-++ T Consensus 278 ~nk~~~i~~~~~ 289 (290) T protein:vir:78 278 DQQKDGVIASTE 289 (290) T ss_pred ccccCeeEEEee Confidence 111111222222 No 147 >protein:vir:2201 Length: 345 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:49 # MgeName: T7 # Cross-refs: genbank:acc:NP_041998;swissprot:sw:p19726;genbank:gi:9627469;goa:P19726;uniprot:P19726;genbank:GeneID:1261026 Probab=71.28 E-value=0.2 Score=24.44 Aligned_cols=258 Identities=14% Similarity=0.085 Sum_probs=119.1 Q ss_pred CceEeeee---------------------cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccC Q lcl|NC_018271. 1 MATTVDIT---------------------TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVD 59 (305) Q Consensus 1 ma~~~~~~---------------------~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~ 59 (305) ||++...+ .-|.|||+..+--. .+-.++++++.--..|....+++..... +. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~al~le~f~geV~~~f~~~------s~~~~~~~~r~i~~gks~~~~~iG~~~~-~~ 73 (345) T protein:vir:22 1 MASMTGGQQMGTNQGKGVVAAGDKLALFLKVFGGEVLTAFART------SVTTSRHMVRSISSGKSAQFPVLGRTQA-AY 73 (345) T ss_pred CcccccchhcccccccccccCCchhHHHHHHHhHHHHHHHHHH------hhhcccceeeeccccceEEEeeecceEE-Ee Confidence 76655421 23445554333222 2334567765433355443554433332 33 Q ss_pred CCC--CCCCcc-ceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhc Q lcl|NC_018271. 
60 YSC--GFTPSG-EVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDI 136 (305) Q Consensus 60 ~~~--~~~~~G-~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~ 136 (305) ++. .++..+ +.+-+|+.|..-..+..=-+. .|+ =.+|.. ..+.....+.++..+|..+.+.+ T Consensus 74 ~~~G~~l~~~~~~~~~~e~~ltID~~~y~~~~V-ddi----D~~q~~----------~D~r~~~s~~~G~aLA~~~D~~i 138 (345) T protein:vir:22 74 LAPGENLDDKRKDIKHTEKVITIDGLLTADVLI-YDI----EDAMNH----------YDVRSEYTSQLGESLAMAADGAV 138 (345) T ss_pred eecCCCCCCCCCCcccceEEEEecchhhhhhhH-hhH----HHHhcC----------chhHHHHHHHHHHHHHHHHHHHH Confidence 333 333332 456677666655544321111 111 112222 22223344555555655555444 Q ss_pred c----ccCC-c-------cchhHHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHH Q lcl|NC_018271. 137 W----QGDG-T-------TGNLQGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVI 204 (305) Q Consensus 137 ~----~GD~-s-------~~~fdG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~ 204 (305) . ++-. + +..-+|++..+.++..... ....++.++.+.|.+....+.++--...+..++||+..| T Consensus 139 ~~~l~k~a~~~~~~~~~~~~~~~~~~~~~~~~g~~~t----~~~~~~~~~~~ai~~a~~~Lde~~VP~~~R~~vv~P~~y 214 (345) T protein:vir:22 139 LAEIAGLCNVESKYNENIEGLGTATVIETTQNKAALT----DQVALGKEIIAALTKARAALTKNYVPAADRVFYCDPDSY 214 (345) T ss_pred HHHHHHhhccccccccccccccccccccccccccccc----ccccCHHHHHHHHHHHHHHhhhcCCCccCCEEEeChHHH Confidence 3 2211 0 1224556555444333211 223456788888888877776654333357899999999 Q ss_pred HHHHHHH---hhhhccCCcccCCCcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEE Q lcl|NC_018271. 205 RAIKRAY---GTQARSNGTFLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIR 281 (305) Q Consensus 205 d~Y~d~~---~~~~~k~~~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f 281 (305) ..-.+.- ...++...+..++....+.|++|+....+|.. +..+.+..- +.....+. . .....|. 
T Consensus 215 ~~Ll~~~~~~~~~~~~~~~~~~G~V~~i~G~~V~~sn~lp~~-~~~~~~~~~---~~~~~~~~--------~-~~g~~~~ 281 (345) T protein:vir:22 215 SAILAALMPNAANYAALIDPEKGSIRNVMGFEVVEVPHLTAG-GAGTAREGT---TGQKHVFP--------A-NKGEGNV 281 (345) T ss_pred HHHhccccccccccccccccccceEEEEeceEEEeccccccc-ccCccccCc---cccccccc--------c-cccceee Confidence 8776543 12233333344555667899999999999853 222221110 00000000 0 0000110 Q ss_pred EEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 282 TKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 282 ~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ..++.+. =-|+|+|+| T Consensus 282 ----~~~~~~~----~~l~~h~~A 297 (345) T protein:vir:22 282 ----KVAKDNV----IGLFMHRSA 297 (345) T ss_pred ----eeccCce----EEEEEehhh Confidence 0111111 146666666 No 148 >protein:vir:107120 Length: 329 # NCBI annotation: conserved phage protein # Family: family:all:701 # MgeID: mge:1571 # MgeName: CNPH82 # Cross-refs: genbank:acc:YP_950606;genbank:gi:119953686;genbank:GeneID:4643129 Probab=66.81 E-value=0.26 Score=23.77 Aligned_cols=260 Identities=12% Similarity=0.059 Sum_probs=126.5 Q ss_pred Cc------eEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCC--CCCCCccceEe Q lcl|NC_018271. 1 MA------TTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS--CGFTPSGEVDI 72 (305) Q Consensus 1 ma------~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~~~~~G~~~~ 72 (305) .| +++--..-|.+...+.+-..+.++.-+++.- +. +...++..+|++.+ ..+.+|+ .+++. |+++. T Consensus 30 ~~~~~~~~nt~~l~~k~~~~LD~~~~~~~~s~~~~~N~~-~e---~~~g~tVkIp~i~~-~gl~DY~R~~g~~~-g~vt~ 103 (329) T protein:vir:10 30 FANKSVEPGDTLLKNKHVGILEKVTAANSYSAPAVISND-AI---FMQGRSFTVIKGDV-TELKDYKRNATNEF-DHPQI 103 (329) T ss_pred hcCCccCCchhHHHHHHHHHHHHHHHhhceeeeeecccc-ee---eccCcEEEEeeecc-cccccccCCCCccc-ccccc Confidence 22 2222234666666655444455555555532 22 33577777898877 6788886 45554 45555 Q ss_pred cceeeeeeeee-EeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 
73 NEKQLTLKKIK-SDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 73 ~~K~L~~~~~k-~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) +..++..-.-+ ..|.+.+-|-.++ . +.+ +....+-+-...+++-|+.... | -+ T Consensus 104 ~~~t~tidqdR~~~F~VD~~D~dEt------n-~~l-------~a~~i~~~~~~~~v~pEiDay~---------~---sk 157 (329) T protein:vir:10 104 QETTYFLDQEKYWGRFVDALDRRDT------E-GNI-------DINYVVAKQASEVVAPYLDNLR---------F---AT 157 (329) T ss_pred ceeEEEeecccceeeecchhhHhhh------h-hhh-------hHHHHHHHHHHHHhhhHHHHHH---------H---HH Confidence 54444443322 1223333332221 1 111 1111111222233444443111 1 12 Q ss_pred HHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhh-hccCCcc--cCCCcce Q lcl|NC_018271. 152 LLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQ-ARSNGTF--LNPNEFD 228 (305) Q Consensus 152 ~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~-~~k~~~~--t~~~~~~ 228 (305) .....+. + .+.++|++|+.+.+.++...+.+. .-..+..+|||+..+...+..-.-. .+.+.+. .++.-+. T Consensus 158 la~~a~~--~---~~~~~t~~nay~~i~~a~~~Lde~-~vp~~Rvl~VtP~~~~~Lk~~~~f~~~~~~~~~~~~~g~Vg~ 231 (329) T protein:vir:10 158 LARNKAK--H---LTVGSGADAQYDAVLDVSVELDEI-GAGASRILFVTPKFYKGIKKFVIELPQGDNRQQVLGKGVQGE 231 (329) T ss_pred HHhhccc--c---cccccCHHHHHHHHHHHHHHHHhc-CCCCCcEEEeCHHHHHHHHhhhhhhccccccccceeeeeeee Confidence 2111111 1 233578999999999999988876 2124689999999999887644211 1111111 1333456 Q ss_pred ecceeeeeccCC--CCCeEEEecchHHhhhhhhhhhhhhccccc-eeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 229 FEGYTLTEIKGL--PASRMVGYNRDNIVIGMSAQSDFNEIRIKD-MGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 229 ~kGi~iv~l~~~--Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~-~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) +.|++|+.+++- .+--+++.+++=. ....- .++.++.+ -++.+. |..+-++=+|.-+.-+...-+|+... 
T Consensus 232 idG~~Ii~vps~~~k~in~ii~~~~A~-~~~~K---~~~~~~~~p~~~~~a---~~v~gr~yyd~~V~~~k~~~I~~~~~ 304 (329) T protein:vir:10 232 LDGFTIVKVPSKMLQGVEAMAVIGEVM-ASPIQ---ANEAKLNSNVPGMFG---TLAEQMLYTGAFVPEHLQKYIFTIGG 304 (329) T ss_pred ecCeEEEEecCCcccceeEEEEcCCce-eeeee---eeeeeeeCCCCccch---heeeeeeeeeeEEEccccCEEEEecc Confidence 899998877443 2323344444322 12211 23344322 233221 45555555677665555444454322 No 149 >protein:vir:3364 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:67 # MgeName: T3 # Cross-refs: genbank:acc:NP_523335;genbank:gi:17570826;genbank:GeneID:927448 Probab=61.87 E-value=0.35 Score=23.12 Aligned_cols=262 Identities=12% Similarity=0.053 Sum_probs=111.1 Q ss_pred CceEeeee---c-----ccchhHH--------HHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCC Q lcl|NC_018271. 1 MATTVDIT---T-----NYVGEVA--------GGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGF 64 (305) Q Consensus 1 ma~~~~~~---~-----~Y~Ge~l--------~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~ 64 (305) ||++...+ | .++|.+. .++++... ...+-.+++++......++..++++..... +.++.+- T Consensus 1 ~~~~~~~~~~~t~~g~~~~~~~~~al~ie~~~g~V~~~f~--~~s~~~~~v~~r~~~~G~sv~i~~iG~~t~-~~~~~g~ 77 (347) T protein:vir:33 1 MANIQGGQQIGTNQGKGQSAADKLALFLKVFGGEVLTAFA--RTSVTMPRHMLRSIASGKSAQFPVIGRTKA-AYLKPGE 77 (347) T ss_pred CCCCccCcccccccccCCcccchHHHHHHHHHHHHHHHHH--HHHhhhhhhccccccccceeEeeeccceee-eeecCCC Confidence 88776554 1 2222211 11211111 123455556665555566655666655554 4443221 Q ss_pred C--Cc-cceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhcc---- Q lcl|NC_018271. 65 T--PS-GEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIW---- 137 (305) Q Consensus 65 ~--~~-G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~---- 137 (305) . .+ -+.+-++++|..-.. .|.+ .++.+-++. 
+.+..+.....+..+..++....+.++ T Consensus 78 ~l~~~~~~~~~~e~~ltiD~~---------~y~~---~~VddiD~~---q~~~D~~~~~~~~~g~aLA~~~D~~i~~~l~ 142 (347) T protein:vir:33 78 NLDDKRKDIKHTEKVIHIDGL---------LTAD---VLIYDIEDA---MNHYDVRAEYTAQLGESLAMAADGAVLAELA 142 (347) T ss_pred CCCCCCCCCccceEEEEechh---------hhhh---HHHhhHHHH---hcCCchhHHHHHHHHHHHHHHHHHHHHHHHH Confidence 1 11 223445555543332 2222 222221111 122222233445555566665555443 Q ss_pred -ccCC---ccchhHHHHHHHhhccceEEeccCcC-----cCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHH Q lcl|NC_018271. 138 -QGDG---TTGNLQGILPLLEADATVIDVVGASG-----GITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIK 208 (305) Q Consensus 138 -~GD~---s~~~fdG~lk~i~~d~~~~~~~~~~~-----~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~ 208 (305) -++. ++..-.|+- .........+.++ ..++.++.+.+.+....+.++=-..-+..+++++..|.... T Consensus 143 ~~~~~~~~~~~~~~~~~----~~~~~~~~~~~tg~~~d~~~~a~~i~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll 218 (347) T protein:vir:33 143 GLVNLPDGSNENIEGLG----KPTVLTLVKPTTGSLTDPVELGKAIIAQLTIARASLTKNYVPAADRTFYTTPDNYSAIL 218 (347) T ss_pred Hhhhhhccccccccccc----ccccccccccccccccchhhhHHHHHHHHHHHHHHHhhcCCCccCcEEEeCHHHHHHHh Confidence 1111 111101110 0000000111111 12345677777777777666422223578999998887664 Q ss_pred HH--Hhhh-hccCCcccCCCcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEE Q lcl|NC_018271. 209 RA--YGTQ-ARSNGTFLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMV 285 (305) Q Consensus 209 d~--~~~~-~~k~~~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~ 285 (305) .. +... +.......++....+.|++|+...++|.+-+.....++. .| .++...-+ .+..+.. T Consensus 219 ~~~~~~~~d~~~~~~~~~G~V~~i~G~~V~~Sn~lp~~~~~~~~~~~~-ag-----~~~~~~~~------~~~~~~~--- 283 (347) T protein:vir:33 219 AALMPNAANYQALLDPERGTIRNVMGFEVVEVPHLTAGGAGDTREDAP-AD-----QKHAFPAT------SSTTVKV--- 283 (347) T ss_pred ccccccccccccccccccceeEEEeceeEEEecccccCcccccccccc-cc-----ccccccCC------cccceec--- Confidence 32 2222 111122334445578999999999999875443322222 11 11211111 1111111 Q ss_pred eecceeeccCCeEEEecCCC Q lcl|NC_018271. 
286 LSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 286 m~~d~~i~fg~E~v~~~~~~ 305 (305) +|-. -=.++|.|+| T Consensus 284 ---a~~~---~~gl~~h~~A 297 (347) T protein:vir:33 284 ---ALDN---VVGLFQHRSA 297 (347) T ss_pred ---cccc---eeeeeecchh Confidence 1111 1135566666 No 150 >protein:vir:97331 Length: 319 # NCBI annotation: ORF011 # Family: family:all:701 # MgeID: mge:1666 # MgeName: 52A # Cross-refs: genbank:acc:YP_240611;genbank:gi:66396278;genbank:GeneID:5133687 Probab=61.13 E-value=0.36 Score=23.02 Aligned_cols=259 Identities=12% Similarity=0.079 Sum_probs=126.4 Q ss_pred Cce------EeeeecccchhHHHHHHHHhhccc-cchhcCceEEecCCCCcccccchhhhhccccCCC--CCCCCccceE Q lcl|NC_018271. 1 MAT------TVDITTNYVGEVAGGYFLEMVKEA-NTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS--CGFTPSGEVD 71 (305) Q Consensus 1 ma~------~~~~~~~Y~Ge~l~~~~~~~~~g~-~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~~~~~G~~~ 71 (305) .|+ ++--...|++. |++.+....... -+++.- +. +...++..+|++.+ ..+.+|+ .+++. |+++ T Consensus 19 ~~~~~~~~nt~~l~~k~~~~-LD~~~~~~~~s~~~~~N~~-~e---~~gg~tVkIp~i~~-~gl~DY~R~~g~~~-g~vt 91 (319) T protein:vir:97 19 FANKSVEPGQTLLKNKHVGI-LERVTAVNAYSTPALISND-AI---FMEGRSFTVMKGDT-TELKDYKRNATNEF-DHPK 91 (319) T ss_pred hhccCCCcchHHHHHHHHHH-HHHHHHHhhhhhhcccCcc-eE---eccCcEEEEeeecc-cccccccCCCCccc-CCcc Confidence 221 11112357776 445555544433 334432 32 23577778999887 6888886 35544 4555 Q ss_pred ecceeeeeeeee-EeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHH Q lcl|NC_018271. 72 INEKQLTLKKIK-SDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGIL 150 (305) Q Consensus 72 ~~~K~L~~~~~k-~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~l 150 (305) .+..++..-.-+ ..|.+.+-|-.+ .. +.+ +.-..+-+-....++-|+.... |. 
T Consensus 92 ~~~~t~tidqdR~~~F~VD~~D~~E------tn-~~l-------~a~~i~~~~~~~~v~PEiDay~---------~s--- 145 (319) T protein:vir:97 92 IEETTYFLDQEKYWGRFVDALDRKD------TE-GNI-------DINYVVARQGAEVVAPYLDNLR---------FA--- 145 (319) T ss_pred cceeEEEeecccccccccchhhHhh------hh-chh-------hHHHHHHHHHHHHhhhhhhHHH---------HH--- Confidence 555554443222 122233333222 11 111 1111111222223444443111 11 Q ss_pred HHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCc---ccCCCcc Q lcl|NC_018271. 151 PLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGT---FLNPNEF 227 (305) Q Consensus 151 k~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~---~t~~~~~ 227 (305) +.....+. + .+.++|++|+.+.+.++...+.+.=-. .+..+|||+..+.....+-+-.-..... ..++.-+ T Consensus 146 kla~~a~~--~---~~~~~t~~n~y~~i~~a~~~Lde~~VP-~~Rvl~Vtp~~~~~L~~~~~f~~~~~~~~~~~~~g~Vg 219 (319) T protein:vir:97 146 TLARNKAK--H---LTVGTGSDAQYDAVLDVSVELDEIKAP-ENRVLFVSPTFYKGIKKFVIALPQGDTRQQVLGKGVQG 219 (319) T ss_pred HHHhhccc--c---cccccCHHHHHHHHHHHHHHHHhcCCC-CCcEEEeCHHHHHHHHhhhhhhccccccccceeeeece Confidence 22222111 1 234578999999999998887764112 3578999999999986655322111111 1233345 Q ss_pred eecceeeeeccC--CCCCeEEEecchHHhhhhhhhhhhhhccccc-eeeeccceeEEEEEEeecceeeccCCeEEEec-- Q lcl|NC_018271. 228 DFEGYTLTEIKG--LPASRMVGYNRDNIVIGMSAQSDFNEIRIKD-MGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYT-- 302 (305) Q Consensus 228 ~~kGi~iv~l~~--~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~-~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~-- 302 (305) .+.|++|+.+++ +.+--+++.+++=....+- .++.++-+ -++.. + |..+-++=+|.-+.-+...-+|+ T Consensus 220 ~idG~~Vi~vps~~~k~in~i~~h~~A~~~~~k----~~~~~~~~p~~~~~--a-~~v~gr~y~d~~V~~~k~~~Iy~~~ 292 (319) T protein:vir:97 220 ELDGFVIVKVPTKLLQGLQAIAVVGEVLASPIQ----ADLAKTNSNIPGMF--G-TLAEQLLYTGAFVPEHLQKYIFTIG 292 (319) T ss_pred eecCeEEEEecccccccceEEEEcCCeeeeeee----eeeeeccCCCcccc--c-eeeeeeeeeeeEEeccccceEEEee Confidence 689999877633 2333356666554432221 23333322 23321 1 44555555677776666555565 Q ss_pred CCC Q lcl|NC_018271. 
303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) +++ T Consensus 293 ~~~ 295 (319) T protein:vir:97 293 GTE 295 (319) T ss_pred cCC Confidence 222 No 151 >protein:vir:94800 Length: 319 # NCBI annotation: ORF012 # Family: family:all:701 # MgeID: mge:1531 # MgeName: 29 # Cross-refs: genbank:acc:YP_240536;genbank:gi:66396203;genbank:GeneID:5133580 Probab=61.13 E-value=0.36 Score=23.02 Aligned_cols=259 Identities=12% Similarity=0.079 Sum_probs=126.4 Q ss_pred Cce------EeeeecccchhHHHHHHHHhhccc-cchhcCceEEecCCCCcccccchhhhhccccCCC--CCCCCccceE Q lcl|NC_018271. 1 MAT------TVDITTNYVGEVAGGYFLEMVKEA-NTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS--CGFTPSGEVD 71 (305) Q Consensus 1 ma~------~~~~~~~Y~Ge~l~~~~~~~~~g~-~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~--~~~~~~G~~~ 71 (305) .|+ ++--...|++. |++.+....... -+++.- +. +...++..+|++.+ ..+.+|+ .+++. |+++ T Consensus 19 ~~~~~~~~nt~~l~~k~~~~-LD~~~~~~~~s~~~~~N~~-~e---~~gg~tVkIp~i~~-~gl~DY~R~~g~~~-g~vt 91 (319) T protein:vir:94 19 FANKSVEPGQTLLKNKHVGI-LERVTAVNAYSTPALISND-AI---FMEGRSFTVMKGDT-TELKDYKRNATNEF-DHPK 91 (319) T ss_pred hhccCCCcchHHHHHHHHHH-HHHHHHHhhhhhhcccCcc-eE---eccCcEEEEeeecc-cccccccCCCCccc-CCcc Confidence 221 11112357776 445555544433 334432 32 23577778999887 6888886 35544 4555 Q ss_pred ecceeeeeeeee-EeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHH Q lcl|NC_018271. 72 INEKQLTLKKIK-SDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGIL 150 (305) Q Consensus 72 ~~~K~L~~~~~k-~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~l 150 (305) .+..++..-.-+ ..|.+.+-|-.+ .. +.+ +.-..+-+-....++-|+.... |. 
T Consensus 92 ~~~~t~tidqdR~~~F~VD~~D~~E------tn-~~l-------~a~~i~~~~~~~~v~PEiDay~---------~s--- 145 (319) T protein:vir:94 92 IEETTYFLDQEKYWGRFVDALDRKD------TE-GNI-------DINYVVARQGAEVVAPYLDNLR---------FA--- 145 (319) T ss_pred cceeEEEeecccccccccchhhHhh------hh-chh-------hHHHHHHHHHHHHhhhhhhHHH---------HH--- Confidence 555554443222 122233333222 11 111 1111111222223444443111 11 Q ss_pred HHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCc---ccCCCcc Q lcl|NC_018271. 151 PLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGT---FLNPNEF 227 (305) Q Consensus 151 k~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~---~t~~~~~ 227 (305) +.....+. + .+.++|++|+.+.+.++...+.+.=-. .+..+|||+..+.....+-+-.-..... ..++.-+ T Consensus 146 kla~~a~~--~---~~~~~t~~n~y~~i~~a~~~Lde~~VP-~~Rvl~Vtp~~~~~L~~~~~f~~~~~~~~~~~~~g~Vg 219 (319) T protein:vir:94 146 TLARNKAK--H---LTVGTGSDAQYDAVLDVSVELDEIKAP-ENRVLFVSPTFYKGIKKFVIALPQGDTRQQVLGKGVQG 219 (319) T ss_pred HHHhhccc--c---cccccCHHHHHHHHHHHHHHHHhcCCC-CCcEEEeCHHHHHHHHhhhhhhccccccccceeeeece Confidence 22222111 1 234578999999999998887764112 3578999999999986655322111111 1233345 Q ss_pred eecceeeeeccC--CCCCeEEEecchHHhhhhhhhhhhhhccccc-eeeeccceeEEEEEEeecceeeccCCeEEEec-- Q lcl|NC_018271. 228 DFEGYTLTEIKG--LPASRMVGYNRDNIVIGMSAQSDFNEIRIKD-MGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYT-- 302 (305) Q Consensus 228 ~~kGi~iv~l~~--~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~-~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~-- 302 (305) .+.|++|+.+++ +.+--+++.+++=....+- .++.++-+ -++.. + |..+-++=+|.-+.-+...-+|+ T Consensus 220 ~idG~~Vi~vps~~~k~in~i~~h~~A~~~~~k----~~~~~~~~p~~~~~--a-~~v~gr~y~d~~V~~~k~~~Iy~~~ 292 (319) T protein:vir:94 220 ELDGFVIVKVPTKLLQGLQAIAVVGEVLASPIQ----ADLAKTNSNIPGMF--G-TLAEQLLYTGAFVPEHLQKYIFTIG 292 (319) T ss_pred eecCeEEEEecccccccceEEEEcCCeeeeeee----eeeeeccCCCcccc--c-eeeeeeeeeeeEEeccccceEEEee Confidence 689999877633 2333356666554432221 23333322 23321 1 44555555677776666555565 Q ss_pred CCC Q lcl|NC_018271. 
303 PAA 305 (305) Q Consensus 303 ~~~ 305 (305) +++ T Consensus 293 ~~~ 295 (319) T protein:vir:94 293 GTE 295 (319) T ss_pred cCC Confidence 222 No 152 >protein:vir:4074 Length: 480 # NCBI annotation: major capsid (head) protein # Family: family:all:11745 # MgeID: mge:85 # MgeName: c2 # Cross-refs: genbank:acc:NP_043553;genbank:gi:9628687;genbank:GeneID:1261180 Probab=56.77 E-value=0.45 Score=22.49 Aligned_cols=272 Identities=10% Similarity=0.042 Sum_probs=103.3 Q ss_pred CceEe-------------eeecccchhHHHHHHHHh-hcccc---chhcCceEEecCCCCccccc-----chhhhhccc- Q lcl|NC_018271. 1 MATTV-------------DITTNYVGEVAGGYFLEM-VKEAN---TISDNLIRVIPNVPENNLFL-----RRMNTTDDF- 57 (305) Q Consensus 1 ma~~~-------------~~~~~Y~Ge~l~~~~~~~-~~g~~---~v~~g~I~v~~~v~~~~~~~-----~~~~~~~~~- 57 (305) +.... .....+.....++-+.+. ..+.+ .-..|. +.|.+.....+. ++....... T Consensus 171 ~~~~~~~~~~~~~~~~e~r~~~~~~~~~~e~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~ 248 (480) T protein:vir:40 171 REASIPSEKPEDAERKFMRELGSKMAEMPEQGFLREFANGADLNVVNSLGS--ITSKYARKSGIYDGAMKARFQGLTLAE 248 (480) T ss_pred hhhhccccchhhhhhHHHHHHHHHhccchhhhhhhhhhhhccccccccccc--cccchhhheeechhhhhhhhhcceeee Confidence 10000 000111111112212111 11111 001111 112111111000 000000000 Q ss_pred c-CCCCCCCCccceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhc Q lcl|NC_018271. 58 V-DYSCGFTPSGEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDI 136 (305) Q Consensus 58 q-~~~~~~~~~G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~ 136 (305) . .....|.+-+...-.. 
..+...- .....| +.-.-...+++--.-.+++-| .++.|+..+|..+++...+..+ T Consensus 249 ~g~~~~~~~~e~~~~~~~--~~~~~~~-~~~~~~--~~v~~l~~~~k~t~~lLDDa~-~l~~~i~~~l~~~~~~~ee~a~ 322 (480) T protein:vir:40 249 DGVDDTFISGTFKAGTDK--NKSQTAT-KRSLRP--QMAEAYLQMDKATVRGVNDSG-ALSEYVMSEMVNRVIQKVEYNM 322 (480) T ss_pred ccccceeeeeeeeccccc--ccccccc-cchhhH--HHHHHHHHhHHHHHHHhhhhH-HHHHHHHHHHHHHHHHHHHHHh Confidence 0 1112333222111110 0111100 001111 111122222321101112223 5889999999999999999999 Q ss_pred cccCCcc-chhHHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhh Q lcl|NC_018271. 137 WQGDGTT-GNLQGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQA 215 (305) Q Consensus 137 ~~GD~s~-~~fdG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~ 215 (305) ++||++- +.+-|+.+. .+ +.+..+|. .+.+.++..++++.+|++ +-.|+||..+|.+.+ -++.-. T Consensus 323 l~G~g~g~~~~~g~~~~-~~--------~~~~~~~~---~d~id~L~~al~~~y~~~-a~~~vmn~~t~~~I~-klKD~~ 388 (480) T protein:vir:40 323 ILGSVDGSNGFYGLKTA-TD--------GWTKQIEY---TDLFEGITDAVAECSISD-AITIVMSPQTFAELR-KAKGTD 388 (480) T ss_pred hccCCCCccccccceee-cc--------cccccchh---HHHHHHHHHhhhHHhhCC-CCEEEECHHHHHHHH-HhhcCC Confidence 9997543 335554221 10 11112333 456777888999998874 447899999987643 233333 Q ss_pred ccCC---cccCCCcceecceeeee-ccCCCCC-eEEEecchHHhh---hhhhhhhhhhccccceeeeccceeEEEEEEee Q lcl|NC_018271. 216 RSNG---TFLNPNEFDFEGYTLTE-IKGLPAS-RMVGYNRDNIVI---GMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLS 287 (305) Q Consensus 216 ~k~~---~~t~~~~~~~kGi~iv~-l~~~Pd~-~ii~T~~sNl~~---gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~ 287 (305) |.|. ..+.+......|.+++- -..+|++ .++.+...=.+. ++...+|++ |+. 
-..+|.++..++ T Consensus 389 G~Yi~q~~~~~~~~~~llG~pvv~~~~~~~~~~~~~~~~~~~~~~~d~~~~~~~~~~-~~~-------~~~~~~~e~~v~ 460 (480) T protein:vir:40 389 GHSRFNELATKEQIAQSFGAVNLETRVWMPKDEVAVYNHDEYVLIGDLNVENYNDFD-LRY-------NVEQWLSETLVG 460 (480) T ss_pred CCeeccCcccccCcceecccceeeeeccccCCcceeeeCCccEEEEecccceecccc-ccc-------chhhhhhhhhhc Confidence 3332 22344455688887543 3455644 344433211110 233334442 111 123455555444 Q ss_pred cceeeccCCeEEEecCCC Q lcl|NC_018271. 288 AGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 288 ~d~~i~fg~E~v~~~~~~ 305 (305) --+ ..+ |.+.|.--- T Consensus 461 g~~--~~~-~~~~~~~~~ 475 (480) T protein:vir:40 461 GSI--RGK-NRSAYLKKK 475 (480) T ss_pred eee--Ecc-ccEEEEEec Confidence 222 111 222221111 No 153 >protein:vir:1239 Length: 274 # NCBI annotation: similar to phage B1 major head protein # Family: family:all:522 # MgeID: mge:25 # MgeName: phi ETA # Cross-refs: genbank:acc:NP_510938;genbank:gi:17426272;genbank:GeneID:927376 Probab=54.26 E-value=0.51 Score=22.20 Aligned_cols=259 Identities=14% Similarity=0.101 Sum_probs=124.4 Q ss_pred CceEeeee-cccchhHHHHHHHHhhccccchhcCceEE---ecCCCCcccccchhhhhccccCCCCC-CCCccceEecce Q lcl|NC_018271. 1 MATTVDIT-TNYVGEVAGGYFLEMVKEANTISDNLIRV---IPNVPENNLFLRRMNTTDDFVDYSCG-FTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~-~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v---~~~v~~~~~~~~~~~~~~~~q~~~~~-~~~~G~~~~~~K 75 (305) ||+...=. .-...|+...++..-+.+. ++-.+++.+ +.|.+.++..+|+.+.....+.++.+ --.-+..+.++. T Consensus 1 ma~~~T~l~d~iiPev~~~~v~~~~~~~-l~~~~~~~~d~~l~g~~G~tv~iP~~~~ig~a~~~~~g~~i~~~~lt~~~~ 79 (274) T protein:vir:12 1 MAQGLTKTSNQIIPEVLAPMMQAQLEKK-LRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQVVAEGEKIPTDILETKKR 79 (274) T ss_pred CCcceeehhhhhchHHHHHHHHHHHHhh-hhhcccceecccccCCCCCEEEEeeecCCCccccccCCCccchhhccccee Confidence 99865422 2344444444443332222 222222222 24555665555554433345555431 111234455555 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 
76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) ++..++ +...|--.|..... -. |+ |- ....+.++..++..+..++. ..+.. T Consensus 80 ~~~i~~--~~~~~~i~D~~~~~----~~-~d------~~---~~~~~q~~~~~a~~vd~~~l-------------~~~~~ 130 (274) T protein:vir:12 80 EAKIRK--IAKGTSITDEALLS----GY-GD------PQ---GEQVRQHGLAHANKVDNDVL-------------EALMG 130 (274) T ss_pred eEEeee--ecceeeecHHHHHh----cc-cc------hH---HHHHHHHHHHHHHHHHHHHH-------------HHHhc Confidence 555533 34456666654321 11 32 11 11224444455555443322 11111 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc-----cCCCcceec Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF-----LNPNEFDFE 230 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~-----t~~~~~~~k 230 (305) ... ....+++|...+++++ ..+... ....-.++||+.++-.....-..++-+..+. .++-.+.|. T Consensus 131 a~~----~~~~~a~~~d~i~dA~----~~lgd~--~~~~~~ivv~p~~~~~L~k~~~~~fv~~s~~g~~~~~~G~ig~~~ 200 (274) T protein:vir:12 131 AKL----TVNADITKLNGLQSAI----DKFNDE--DLEPMVLFINPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEAL 200 (274) T ss_pred ccc----cccccccCHHHHHHHH----HHhccc--cccccEEEeCHHHHHHHHhhhhhhccccccccccceecccceeec Confidence 111 1122344544444444 333332 1223479999998876544322222221111 233356799 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |++++.-..+|.++.++.....+-++. ..+.+ ++-+|-.+-.. =.+.-.+-+++.+.-++-+|+-|.+. 
T Consensus 201 G~~Vi~s~~~p~~t~~l~~~gA~~~~~--~~~~~-vE~~Rd~~~~~---d~i~~~~~y~~~~~~~~~vv~~t~~~ 269 (274) T protein:vir:12 201 GAIIVRSNKLEAGTAILAKKGAVKLIL--KRDFF-LEVARDASTKT---TALYSDKHYVAYLYDESKAVKITKGS 269 (274) T ss_pred CeeEEEeCCCCcceEEEEeccceeeee--cCCce-eccccchhhcc---cEEEeeeEEEEEEEcCCceEEEEcCC Confidence 999999999999999988888776654 23433 34333322111 11222233477777888888888776 No 154 >protein:vir:78739 Length: 332 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:1856 # MgeName: Syn5 # Cross-refs: genbank:acc:YP_001285448;genbank:gi:148724482;genbank:GeneID:5220210 Probab=53.76 E-value=0.52 Score=22.14 Aligned_cols=267 Identities=13% Similarity=0.084 Sum_probs=120.6 Q ss_pred Cce-------------------Eee--e-ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhcccc Q lcl|NC_018271. 1 MAT-------------------TVD--I-TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFV 58 (305) Q Consensus 1 ma~-------------------~~~--~-~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q 58 (305) |-+ +.+ + ..-|.||+++.+.-..+ -.+++++..-...++..++++..... + T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~d~~~al~le~~~geV~~~f~~~s~------~~~~~~~r~i~~G~tv~i~~ig~~~~-~ 73 (332) T protein:vir:78 1 MTTLSNFSLPNQANGGARNADYDVRYATALKLFSGEVFTAFNNASI------FKGLVRSYDLRGGKSKQFMFTGKLSA-G 73 (332) T ss_pred CcccccccCCccccCCccccccccchhhhhhhhhhhHHHHHHHHhh------hhhccccccccccceEEEEeccceeE-e Confidence 211 111 1 24677777666654332 33455555444456555565543333 4 Q ss_pred CCCC--CCCCccceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhc Q lcl|NC_018271. 59 DYSC--GFTPSGEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDI 136 (305) Q Consensus 59 ~~~~--~~~~~G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~ 136 (305) +++. ++++..+.+=++++|..-.. 
.|.+.+-..+-+ .+.+..+.....+..+..+++.+.+.+ T Consensus 74 ~~~~g~~l~~~~~~~~~~~~l~ID~~---------ky~~~~VddiD~------~q~~~dl~~~~~~~~g~aLA~~~D~~i 138 (332) T protein:vir:78 74 YHTPGTPIVGDAGIKANEKTLVMDDL---------LVSSQFVYSLDE------IFSQYSTRAEVSKQIGEALATHYDERI 138 (332) T ss_pred eecCCCCCCCCCCCCCceEEEEEehh---------hhhHHHHHhHHH------HhcCcchHHHHHHHHHHHHHHHHHHHH Confidence 5543 44454444555555544433 333333221111 112222333334455555555544333 Q ss_pred c----ccCCc---cchhHHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHH Q lcl|NC_018271. 137 W----QGDGT---TGNLQGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKR 209 (305) Q Consensus 137 ~----~GD~s---~~~fdG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d 209 (305) . .+-.+ .....| ...++ .++.++.|+.++++.+-+....+.++--...+..+.+|+..|-.-.. T Consensus 139 ~~~l~~aa~~~~~~~~~~g--------~~~~~-~~~~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~ 209 (332) T protein:vir:78 139 ARVLAKASAEASPVTGEPG--------GFHVN-IGAGNTNDAQAIVDGFFEAAAVLDERSAPQEGRVAVLSPRQYYSLIS 209 (332) T ss_pred HHHHHhhhcccCccccccc--------ccccc-cCCccccCHHHHHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHHh Confidence 2 22111 011111 11122 22334567888999998888887775333345678889988877754 Q ss_pred HH----hhhh--ccCCcccCCC-cceecceeeeeccCCCCCeEEEecc------hHHh----------------hhhhhh Q lcl|NC_018271. 210 AY----GTQA--RSNGTFLNPN-EFDFEGYTLTEIKGLPASRMVGYNR------DNIV----------------IGMSAQ 260 (305) Q Consensus 210 ~~----~~~~--~k~~~~t~~~-~~~~kGi~iv~l~~~Pd~~ii~T~~------sNl~----------------~gvnl~ 260 (305) .. .++. +......+++ ...+.|++|+...++|...+-.... .|-. 
+|.=-+ T Consensus 210 ~~d~~~~n~~~~~~~~~~~~g~~i~~i~G~~V~~Sn~lp~~~g~~~~~~~~~~~~n~~~~~~~~~~~~~~h~~a~~~v~~ 289 (332) T protein:vir:78 210 SVDTNILNREIGNSQGDMNSGKGLYSIAGIRILKSNNLAGLYGQDLSSAAVTGENNDYQVDASALAGLIFHREAAGCIQS 289 (332) T ss_pred hcCceeeeeeccccccceecceeeeEEeeeEEEecCccccCcccccccccccccccccccccccceEEeecccceeeeee Confidence 32 2221 1222233443 3468899999999999654322211 1111 110000 Q ss_pred hhhhhccccceeeecccee-----EEEEEEeecceeeccCCeEEEecCC Q lcl|NC_018271. 261 SDFNEIRIKDMGDVDLSGQ-----IRTKMVLSAGVEYAYGAEIVLYTPA 304 (305) Q Consensus 261 ~D~n~I~I~~~~~~~~~~~-----~f~k~~m~~d~~i~fg~E~v~~~~~ 304 (305) -| +||... .-+...+ +...+.||+ -+-.++-+|+-+-| T Consensus 290 ~~---~~~~~t-~~~~~~~~~~d~i~~~~~~G~--~v~rPe~~v~l~~a 332 (332) T protein:vir:78 290 VA---PTIQTT-SGDFNVQYQGDLIVGKLAMGC--GSLRTSVAGSFQAA 332 (332) T ss_pred ec---cchhhh-hcccchhhhHhhhhhhhhhcC--ceecccceEEEeeC Confidence 00 111110 0011222 333334444 33455555555555 No 155 >protein:vir:105334 Length: 276 # NCBI annotation: putative phage major capsid protein # Family: family:all:522 # MgeID: mge:1679 # MgeName: PH15 # Cross-refs: genbank:acc:YP_950669;genbank:gi:119967839;genbank:GeneID:4643213 Probab=53.70 E-value=0.52 Score=22.13 Aligned_cols=256 Identities=18% Similarity=0.172 Sum_probs=124.5 Q ss_pred CceEeee-ecccchhHHHHHHHHhhccccchhcCceE---EecCCCCcccccchhhhhccccCCCCCCC-CccceEecce Q lcl|NC_018271. 1 MATTVDI-TTNYVGEVAGGYFLEMVKEANTISDNLIR---VIPNVPENNLFLRRMNTTDDFVDYSCGFT-PSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~-~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~---v~~~v~~~~~~~~~~~~~~~~q~~~~~~~-~~G~~~~~~K 75 (305) ||+...= ..=...|+...++...+.+...... +.. -+.|.+.++..+|+.+.....+.+..+-+ +-+..+.++. 
T Consensus 1 Ma~~~T~l~d~i~Pev~~~~v~~~~~~~~~~~~-~~~~~~~l~g~~G~ti~iP~~~~igda~~~~eg~~i~~~~lt~~~~ 79 (276) T protein:vir:10 1 MAQGTTTKSTQIVPEVLAPMMQAELDKKLRFAQ-FADIDSTLVGQPGDTLTFPAFVYSGDATVVPEGQKIPVDKIETNRR 79 (276) T ss_pred CCcceeehhhhhchHHHHHHHHHHHHhhhhhcc-cceecccccCCCCCEEEeeeecCCCccccccCCCccCcccccccee Confidence 9965432 2345667666666655444433222 222 23455666555554333334444432111 1123344444 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) ....+ ++...|.-.|... ....++ | ....++.++..++..+..++. ..+.. T Consensus 80 ~a~i~--~~~k~~~~tD~a~-----~~~~~d------p---~~~~~~~~~~~~a~~~d~~~~-------------~~l~~ 130 (276) T protein:vir:10 80 EAKIH--KIGKGTDITDEAL-----LSGYGD------P---QGEAVRQHGLAIANKVDNDVL-------------EALRG 130 (276) T ss_pred eEEee--hccccccccHHHH-----Hhhccc------h---HHHHHHHHHHHHHHHHHHHHH-------------HHHhc Confidence 44443 2344444444432 222232 1 122345666666666654333 22222 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhcc---CC--cccCCCcceec Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARS---NG--TFLNPNEFDFE 230 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k---~~--~~t~~~~~~~k 230 (305) ....+ ..+++|.+.+++ .+..+-+. ....-.++||+..|-.-+.....++-. +. -..++-...|. 
T Consensus 131 ~~~~~----~~~~~t~d~i~~----A~~~lgd~--~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~ 200 (276) T protein:vir:10 131 TKLTV----SADIGTLAGLEA----AIDTFDDE--DLEPMVLFINPKDAGKLRSSASDNFTRATELGDNIIVKGAFGEAL 200 (276) T ss_pred ccccc----cccccCHHHHHH----HHHHhccc--cCcccEEEEcHHHHHHHHHhccccccccccccccceeccccceec Confidence 11111 233456544444 44444332 122348899999986664322111111 11 11344466899 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeee---ccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDV---DLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~---~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |++++--..+|++++++-.+.-+-++ ...+.. ++-+|..+- .+.+++ -+.+.+.-++-+|.=|++. T Consensus 201 G~~Vi~s~~~p~~t~~l~~~gAi~~~--~~~~~~-vE~dRd~~~~~d~i~~~~------~y~~~~~~~~~vv~~t~~~ 269 (276) T protein:vir:10 201 GAVIVRSKKLDEGEAILAKRGAVKLI--TKRDFF-LETDRDPSTKTTALYSDK------HYVAYLYDESKAVKVTKGA 269 (276) T ss_pred ceeEEEcCCCCcceEEEEeccceeee--ecCCce-eecccchhhcccEEEEee------EEEEEEEcCcceEEEecCC Confidence 99999999999998887776655433 234433 333332211 122332 2367777777888888776 No 156 >protein:vir:103759 Length: 330 # NCBI annotation: hypothetical protein # Family: family:all:1903 # MgeID: mge:1645 # MgeName: BcepC6B # Cross-refs: genbank:acc:YP_024928;genbank:gi:48697198;genbank:GeneID:2846083 Probab=51.42 E-value=0.58 Score=21.87 Aligned_cols=226 Identities=15% Similarity=0.135 Sum_probs=114.3 Q ss_pred CceEeee--------ecccchhHHHHHHHHhhccccchhcCceEEecCCC---CcccccchhhhhccccCCCCCCCCccc Q lcl|NC_018271. 1 MATTVDI--------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVP---ENNLFLRRMNTTDDFVDYSCGFTPSGE 69 (305) Q Consensus 1 ma~~~~~--------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~---~~~~~~~~~~~~~~~q~~~~~~~~~G~ 69 (305) ||+.-.- .-+..-.....+.-.+...+++-.+ +..+.+.. +.++.-..+. 
.-.+.....+++++- T Consensus 1 m~~~~~~a~TL~e~AKr~~~d~~~~~IIE~l~~tn~IL~~--lpf~e~N~~tg~~t~vrt~LP-~~~fR~lN~g~~~s~- 76 (330) T protein:vir:10 1 MATLSTNNPTMADVAKRLDPNGKVDIIVEMLNQTNPVLQD--MTAIEGNLPTGHRTSVRTGLP-TPTWRKLYGGVLPNK- 76 (330) T ss_pred CCcCCCCcccHHHHHhhcCcchhHHHHHHHHhcCchHHhh--cchhhccCCcccceeEEeecC-CchhhhcCCcccccc- Confidence 7765321 1122222334677777777887776 44444321 1111100011 122445555555543 Q ss_pred eEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc--chhH Q lcl|NC_018271. 70 VDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT--GNLQ 147 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~--~~fd 147 (305) .++.+++=...-+=...++.- .+ .++. |+. .+ ......+...+..+.++...++.||++. ..|+ T Consensus 77 ~tt~qvt~~l~ilgg~~eVDr-~l------a~~~-Gn~--a~----~ra~e~~~~ikam~q~~~~~~iyGD~a~~p~~F~ 142 (330) T protein:vir:10 77 SSTAQVTDNCGMLEAYAEVDK-AL------ADLN-GNT--AA----FRLSEDRAQIEGMNQEVAQTLFYGNDGIAPAEFT 142 (330) T ss_pred ceEEEEEEEeEEecchhhhhh-HH------Hhhc-CCH--HH----HHHHHHHHHHHHHHHHHHHHhccCCCCCChhhcc Confidence 444444444444444444332 11 3444 422 11 1122335556778889999999999764 5599 Q ss_pred HHHHHHhh----c-cceEE------------------------------------------------------------- Q lcl|NC_018271. 148 GILPLLEA----D-ATVID------------------------------------------------------------- 161 (305) Q Consensus 148 G~lk~i~~----d-~~~~~------------------------------------------------------------- 161 (305) |+.+++.. + .++++ T Consensus 143 GL~kR~~~~ta~~~~qvIdaGGtG~~~TSi~~v~wg~~~~~giyPkG~kaGl~~~d~g~~~~~~~dg~gg~y~~~~~~~~ 222 (330) T protein:vir:10 143 GLSPRYNSLSAENKDNVIDAGGTGSDNASAWLVVWGPNTCHSIYPKGSKAGLSVEDKGQVTIENADGNGGRMEGYRTHYK 222 (330) T ss_pred chhhhcCCCCCCchhheeeccccccCceEEEEEEEcCCeEEEEcccCccccceeeeccceeeecccCCCCceeEEeeeee Confidence 99998831 0 01110 Q ss_pred ---------------ecc--CcCcCChhhHH---HHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc Q lcl|NC_018271. 
162 ---------------VVG--ASGGITAANVE---AELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF 221 (305) Q Consensus 162 ---------------~~~--~~~~iT~anv~---~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~ 221 (305) +.. .+..-|.++.. +.+-+-.+.||...+ +...|||...+....+-+..+.-+..... T Consensus 223 w~~Gl~i~d~r~vvRI~NIdvs~l~~~~~~~~li~lm~~A~~~ip~~~~--g~~~~y~n~~v~~~L~~q~~~k~n~~l~~ 300 (330) T protein:vir:10 223 WDIGLTLRDWRYVARVCNIDVSDLATSANAQALIKYMIMAAERIPQLGM--GRAVWYMNRNLREKLRLGIVDKIANNLTW 300 (330) T ss_pred eeeeeEEeCcccEEEEeecccccCCCCccHHHHHHHHHHHHHhccCCCC--CcceeeechHHHHHHHHHHhhcccceeee Confidence 000 01111222333 333333455775422 34689999999998887765543333333 Q ss_pred cCCCc---ceecceeeeeccCC--CCCeEE Q lcl|NC_018271. 222 LNPNE---FDFEGYTLTEIKGL--PASRMV 246 (305) Q Consensus 222 t~~~~---~~~kGi~iv~l~~~--Pd~~ii 246 (305) .+... ..|.|++|.-+..| -++.+| T Consensus 301 ~~~~g~~~t~~~gipir~~Dail~tE~~vv 330 (330) T protein:vir:10 301 ETVSGERVMTFDGIPVQRTDALLNTESRVV 330 (330) T ss_pred eecCCeeeEEECCeEEEEEeeeecCccccC Confidence 33322 35888888877776 466666 No 157 >protein:vir:95898 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1588 # MgeName: 71 # Cross-refs: genbank:acc:YP_240385;genbank:gi:66396054;genbank:GeneID:5133409 Probab=47.03 E-value=0.72 Score=21.38 Aligned_cols=259 Identities=14% Similarity=0.108 Sum_probs=125.5 Q ss_pred CceEeeee-cccchhHHHHHHHHhhccccchhcCceE---EecCCCCcccccchhhhhccccCCCCC-CCCccceEecce Q lcl|NC_018271. 1 MATTVDIT-TNYVGEVAGGYFLEMVKEANTISDNLIR---VIPNVPENNLFLRRMNTTDDFVDYSCG-FTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~-~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~---v~~~v~~~~~~~~~~~~~~~~q~~~~~-~~~~G~~~~~~K 75 (305) ||+...=. .=..+|+...++..-+.+.-.. .++.. -+.|.+.++..+|..+..-..+.++.+ --.-+....+.. 
T Consensus 1 m~~~~T~l~d~i~Pev~~~~v~~~~~~~l~~-~~~~~~~~~l~g~~G~tv~iP~~~~ig~a~~~~~g~~i~~~~lt~~~~ 79 (274) T protein:vir:95 1 MAQGMTKLTNQIVPEVLAPMMQAELEKKLRF-ASFAEIDNTLVGQPGDTLTFPAFIYSGDAKVVAEGEKIPTDILETKKR 79 (274) T ss_pred CCcceeehhheechHHHHHHHHHHHHhhhhc-cccceecccccCCCCCEEEeeeecCCCccccccCCCccchhhccccee Confidence 99855432 2445565555554433322111 12211 234555666555554433334555421 111133444445 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) .+..++ +...|.-.|.. .....|+. - ...++.++..++.++...++ ..+.. T Consensus 80 ~~~i~~--~~~a~~i~D~~-----~~~~~~d~-------~--~~~~~~~~~~~a~~vd~~i~-------------~~l~~ 130 (274) T protein:vir:95 80 EAKIRK--IAKGTSISDEA-----LLSGYGDP-------Q--GEQVRQHGLAHANKVDDDVL-------------EALKS 130 (274) T ss_pred EEEeee--eecceeehHHH-----HhhccchH-------H--HHHHHHHHHHHHHHHHHHHH-------------HHHhc Confidence 554433 34455555553 22222321 1 12234445555555543322 11111 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc-----cCCCcceec Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF-----LNPNEFDFE 230 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~-----t~~~~~~~k 230 (305) ....+ ..++++...+++++..| -.. ....-.++||+..+-.-...-..++-+..+. .++-.+.|. 
T Consensus 131 a~~~~----~~~~~~~d~i~~A~~~l----gd~--~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~ 200 (274) T protein:vir:95 131 AKLTV----EADITKLTGLQTAIDKF----NDE--DLEPMVLFISPLDAGKLRGDATTNFTRATELGDDVIVKGAFGEAL 200 (274) T ss_pred ccccc----cccccCHHHHHHHHHHh----ccc--cccccEEEeCHHHHHHHHhhccccccccccccccceeccccceec Confidence 11111 12344544444444333 222 1233489999998877654332222222111 233356799 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |++|+.-.++|+++.++....-+-++. ..+.+ ++-+|-.+- ..=.+.-.+-+++.+.-++-+|+-|+.+ T Consensus 201 G~~Vi~s~~~~~~t~~l~~~gA~~~~~--~~~~~-vE~~Rd~~~---~~d~i~~~~~y~~~~~~~~~~v~~tk~~ 269 (274) T protein:vir:95 201 GAVIVRSNKLEAGTAILAKKGAVKLIT--KRDFF-LETDRDPST---KTTALYSDKHYVAYLYDESKAVKITKGS 269 (274) T ss_pred CeEEEEeCCCCCceEEEEeccceeeee--cCCcc-ccccccccc---ccCEEEEeEEEEEEEEcCCcEEEEEcCC Confidence 999999999999998888877665543 23332 333332221 1111222233488888899999999888 No 158 >protein:vir:96262 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1612 # MgeName: ROSA # Cross-refs: genbank:acc:YP_240311;genbank:gi:66395978;genbank:GeneID:5133339 Probab=47.03 E-value=0.72 Score=21.38 Aligned_cols=259 Identities=14% Similarity=0.108 Sum_probs=125.5 Q ss_pred CceEeeee-cccchhHHHHHHHHhhccccchhcCceE---EecCCCCcccccchhhhhccccCCCCC-CCCccceEecce Q lcl|NC_018271. 1 MATTVDIT-TNYVGEVAGGYFLEMVKEANTISDNLIR---VIPNVPENNLFLRRMNTTDDFVDYSCG-FTPSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~-~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~---v~~~v~~~~~~~~~~~~~~~~q~~~~~-~~~~G~~~~~~K 75 (305) ||+...=. .=..+|+...++..-+.+.-.. .++.. -+.|.+.++..+|..+..-..+.++.+ --.-+....+.. 
T Consensus 1 m~~~~T~l~d~i~Pev~~~~v~~~~~~~l~~-~~~~~~~~~l~g~~G~tv~iP~~~~ig~a~~~~~g~~i~~~~lt~~~~ 79 (274) T protein:vir:96 1 MAQGMTKLTNQIVPEVLAPMMQAELEKKLRF-ASFAEIDNTLVGQPGDTLTFPAFIYSGDAKVVAEGEKIPTDILETKKR 79 (274) T ss_pred CCcceeehhheechHHHHHHHHHHHHhhhhc-cccceecccccCCCCCEEEeeeecCCCccccccCCCccchhhccccee Confidence 99855432 2445565555554433322111 12211 234555666555554433334555421 111133444445 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) .+..++ +...|.-.|.. .....|+. - ...++.++..++.++...++ ..+.. T Consensus 80 ~~~i~~--~~~a~~i~D~~-----~~~~~~d~-------~--~~~~~~~~~~~a~~vd~~i~-------------~~l~~ 130 (274) T protein:vir:96 80 EAKIRK--IAKGTSISDEA-----LLSGYGDP-------Q--GEQVRQHGLAHANKVDDDVL-------------EALKS 130 (274) T ss_pred EEEeee--eecceeehHHH-----HhhccchH-------H--HHHHHHHHHHHHHHHHHHHH-------------HHHhc Confidence 554433 34455555553 22222321 1 12234445555555543322 11111 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCcc-----cCCCcceec Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTF-----LNPNEFDFE 230 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~-----t~~~~~~~k 230 (305) ....+ ..++++...+++++..| -.. ....-.++||+..+-.-...-..++-+..+. .++-.+.|. 
T Consensus 131 a~~~~----~~~~~~~d~i~~A~~~l----gd~--~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~ 200 (274) T protein:vir:96 131 AKLTV----EADITKLTGLQTAIDKF----NDE--DLEPMVLFISPLDAGKLRGDATTNFTRATELGDDVIVKGAFGEAL 200 (274) T ss_pred ccccc----cccccCHHHHHHHHHHh----ccc--cccccEEEeCHHHHHHHHhhccccccccccccccceeccccceec Confidence 11111 12344544444444333 222 1233489999998877654332222222111 233356799 Q ss_pred ceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 231 GYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 231 Gi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) |++|+.-.++|+++.++....-+-++. ..+.+ ++-+|-.+- ..=.+.-.+-+++.+.-++-+|+-|+.+ T Consensus 201 G~~Vi~s~~~~~~t~~l~~~gA~~~~~--~~~~~-vE~~Rd~~~---~~d~i~~~~~y~~~~~~~~~~v~~tk~~ 269 (274) T protein:vir:96 201 GAVIVRSNKLEAGTAILAKKGAVKLIT--KRDFF-LETDRDPST---KTTALYSDKHYVAYLYDESKAVKITKGS 269 (274) T ss_pred CeEEEEeCCCCCceEEEEeccceeeee--cCCcc-ccccccccc---ccCEEEEeEEEEEEEEcCCcEEEEEcCC Confidence 999999999999998888877665543 23332 333332221 1111222233488888899999999888 No 159 >protein:vir:95318 Length: 328 # NCBI annotation: hypothetical protein # Family: family:all:1903 # MgeID: mge:1564 # MgeName: phiV10 # Cross-refs: genbank:acc:YP_512264;genbank:gi:89152431;genbank:GeneID:3952987 Probab=39.50 E-value=1 Score=20.55 Aligned_cols=225 Identities=13% Similarity=0.123 Sum_probs=122.5 Q ss_pred CceEeee--------ecccchhHHHHHHHHhhccccchhcCceEEecCC---CCcccccchhhhhccccCCCCCCCCccc Q lcl|NC_018271. 1 MATTVDI--------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNV---PENNLFLRRMNTTDDFVDYSCGFTPSGE 69 (305) Q Consensus 1 ma~~~~~--------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v---~~~~~~~~~~~~~~~~q~~~~~~~~~G~ 69 (305) |++.... .-...-.....++-.+...+++-.+ +..+.+. .+.++...-+. +-.+....+++++. . 
T Consensus 1 m~~~~~~~~TL~e~Akr~~~d~~~~~VIE~l~~~n~IL~~--lpf~e~n~gt~~~~~v~~~LP-~~~fR~lN~g~~~s-~ 76 (328) T protein:vir:95 1 MAVKGLTALTLADWGKRVDPNGKVDKIIELLGQTNPILQD--MPFVEGNLPTGHRTTIRSGLP-SATWRLLNYGVQPS-K 76 (328) T ss_pred CCccccccccHHHHHhhhCcchhHHHHHHHHhccchhHhh--cceeecccCCcceeeEeeccC-CceeeecCCccCcc-c Confidence 7765322 1133445566788888888998887 4444443 12222111111 11233444444443 3 Q ss_pred eEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHH-HHHHHHHHHHHHhhhhhhccccCCcc--chh Q lcl|NC_018271. 70 VDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQ-GFMLTDMGNRLARKIDKDIWQGDGTT--GNL 146 (305) Q Consensus 70 ~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q-~~~l~~l~~~ia~ei~~~~~~GD~s~--~~f 146 (305) .++.+++=...-+-...++.- +.. .+. |+. +.+ ....+...+....+....+|.||++. ..| T Consensus 77 ~tt~q~t~~l~ilgg~~eVDr-~la------~~~-Gn~-------~~~ra~q~~~~~ka~~~~~~~~~iyGdsa~~p~~F 141 (328) T protein:vir:95 77 STTVQVTDSVGMLETYAEVDK-SLA------DLN-GNT-------AEFRLSEDRAFIEAMNQQMAQTLFYGDSSVNPQQF 141 (328) T ss_pred ceeEEEEEEEEEEecceeech-HHH------hhc-CCH-------HHHHHHHHHHHHHHHHHHHHHHHhcCCccCChhhh Confidence 466777777777766666653 222 223 432 122 22335556778889999999999764 469 Q ss_pred HHHHHHHhhcc-----ceE------------------------------------------------------------- Q lcl|NC_018271. 147 QGILPLLEADA-----TVI------------------------------------------------------------- 160 (305) Q Consensus 147 dG~lk~i~~d~-----~~~------------------------------------------------------------- 160 (305) ||..++..... +.+ T Consensus 142 ~GL~~R~~~~s~~~a~qiidaGgtg~~~TSi~~v~~g~~~~~giyPkG~~~Gl~~~d~g~~~~~~~~g~~y~~y~~~~~w 221 (328) T protein:vir:95 142 MGLSSRYSSLSAGNAQNIIDAGGTGTDNTSIWLVVWGENTVHGIFPKGKKAGIQMEDKGQVTLEDANGGKYEGYRTHYKW 221 (328) T ss_pred cchhhhcCccccccccceeecccCCCCceEEEEEEEcCCeEEEecccccccCceeeecCceeeecCCCCeeeEEEEEEEe Confidence 99999884210 010 Q ss_pred -------------EeccC-----cCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCCccc Q lcl|NC_018271. 
161 -------------DVVGA-----SGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNGTFL 222 (305) Q Consensus 161 -------------~~~~~-----~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t 222 (305) ++... ++....++.++.+-+-...||...+ +...|||...+....+-+..+.-+-....+ T Consensus 222 ~~Gl~i~d~r~vvrI~NId~~~l~~~~~~~~l~~lm~~a~~~ip~~~~--~~~~~y~n~~v~~~L~~q~~~~~n~~~~~~ 299 (328) T protein:vir:95 222 DNGLALRDWRYVVRIANIDVSNLSEPSSAANIAKLMVKALHRIPNRGM--GRPVFYMNRTVGQALDLQSLEKTSLAISVK 299 (328) T ss_pred eeeeEEcCcccEEEEecCcccccccccChhhHHHHHHHHHHHhccCCC--CcceeehhHHHHHHHHHHHhcCcceeeeee Confidence 01110 1112345566666666666775533 357999999998877766533322222222 Q ss_pred CCC---cceecceeeeeccCC--CCCeEE Q lcl|NC_018271. 223 NPN---EFDFEGYTLTEIKGL--PASRMV 246 (305) Q Consensus 223 ~~~---~~~~kGi~iv~l~~~--Pd~~ii 246 (305) +.. ...|.|++|.-+..+ -++.+| T Consensus 300 ~~~g~~~t~~~gipir~~dai~~tE~~vv 328 (328) T protein:vir:95 300 ETEGEWWTSFRGVPIRETDALLETEARVV 328 (328) T ss_pred ccCCcceeEECCeEEEEEeeeecCccccC Confidence 222 234888887777666 455655 No 160 >protein:vir:94622 Length: 341 # NCBI annotation: PfWMP4_37 # Family: family:all:2203 # MgeID: mge:1525 # MgeName: Pf-WMP4 # Cross-refs: genbank:acc:YP_762667;genbank:gi:115304375;genbank:GeneID:5142322 Probab=38.89 E-value=1 Score=20.48 Aligned_cols=273 Identities=12% Similarity=0.073 Sum_probs=110.4 Q ss_pred CceEeee---e----cccchhHHHHHHHHhhccccchhcCceEEecC--CCCcccccchhhhhccccCCCCC-CCCccce Q lcl|NC_018271. 1 MATTVDI---T----TNYVGEVAGGYFLEMVKEANTISDNLIRVIPN--VPENNLFLRRMNTTDDFVDYSCG-FTPSGEV 70 (305) Q Consensus 1 ma~~~~~---~----~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~--v~~~~~~~~~~~~~~~~q~~~~~-~~~~G~~ 70 (305) |++|..- + ++|..|+-...+.+.... .++=..++|-.++ .+.++..+|+..... .+++..+ .-+-.+. 
T Consensus 3 ~~~~~~~~~~~t~~v~~fipei~s~~i~~~l~~-~~v~~~~~~d~~~~~~~Gdtv~ip~~g~~~-~~d~~~~~~i~~~~~ 80 (341) T protein:vir:94 3 LGNTITGPSINTQRGQQFIPEQWLSEVQMFRKA-KMLDTSVVKTWGAQVKKGDTFHVPRISELG-VEDKATDVPVGVQPV 80 (341) T ss_pred chhhhccccccchhHHHHHHHHHHHHHHHHHHh-hcchhhccccccccccCCceEEEeccCcce-eeeecCCCccccccc Confidence 4444432 1 134333333322222222 1222223332222 234555566655433 4455321 1122344 Q ss_pred EecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHH Q lcl|NC_018271. 71 DINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGIL 150 (305) Q Consensus 71 ~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~l 150 (305) +-+++.|...+.+. ..+-..|.+..+. . .++ + ...++..+..+++.+.+.+.. T Consensus 81 ~~~~~~itiD~~~~-~~~~i~d~d~~~~----~-~d~--~-------~~~~~~~~~aLA~~~D~~i~~------------ 133 (341) T protein:vir:94 81 NDTDFVITVDTDRT-TAVALDDLLEIQA----S-YDL--R-------APYLEAMGYALAKDMTGSILG------------ 133 (341) T ss_pred cCceEEEEEeeeee-cceeechHHHHhh----c-cch--H-------HHHHHHHHHHHHHHHHHHHHH------------ Confidence 45566666644432 2333334444322 2 211 1 222233344455554433221 Q ss_pred HHHhhcc-ceE--EeccCcCcCC---hhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHH--HHhhhhccC-Ccc Q lcl|NC_018271. 151 PLLEADA-TVI--DVVGASGGIT---AANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKR--AYGTQARSN-GTF 221 (305) Q Consensus 151 k~i~~d~-~~~--~~~~~~~~iT---~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d--~~~~~~~k~-~~~ 221 (305) .+++.+ .+. .+.......| .....+.+-+....+.++--...+..++|++..|..... .+......+ ... 
T Consensus 134 -~~a~~~~~~~~~~~~~~~~~~t~~~~~~~~~~i~~a~~~Lde~~VP~~gR~lvv~P~~~~~Ll~~~~~~~~~~~g~~~l 212 (341) T protein:vir:94 134 -LRAAVQNTASQNVFSSSNGAITGNGQAFSFAVFLAARRLLLEADVPEEKIVLLISPGQESALFTIPQFISKDFINNAPI 212 (341) T ss_pred -HhhhccccccCccccCccccccCchhhhhHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHhhchhhhhhhccccchh Confidence 111100 000 0000010111 122334444444444333111224688899988877743 332222111 123 Q ss_pred cCCCcceecceeeeeccCCCCCeEEEec------------------------------------chHHhhhhhhhhhh-- Q lcl|NC_018271. 222 LNPNEFDFEGYTLTEIKGLPASRMVGYN------------------------------------RDNIVIGMSAQSDF-- 263 (305) Q Consensus 222 t~~~~~~~kGi~iv~l~~~Pd~~ii~T~------------------------------------~sNl~~gvnl~~D~-- 263 (305) .++..+++.|+.|..-..+|.....+.+ ..|.+.++-+.+-. T Consensus 213 ~~G~ig~i~G~~V~~Sn~lp~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~gl~~~~~av~~~k~~~~~~~ 292 (341) T protein:vir:94 213 AQGQIGSLMGVRVIRTSLIGNNSATGWRNGAPTIAPAEATPGFTGSRYLPKQDSFTSLPATFTGNSRPVHTAVMCHMDWA 292 (341) T ss_pred heeeeeeEeceEEEEeccccccccccccccccceecccccccccccccccccccccccEEEEEEecccccceeeecchhh Confidence 3444568999999998888865433221 11222222221110 Q ss_pred --hhccccceeeecccee-----EEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 264 --NEIRIKDMGDVDLSGQ-----IRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 264 --n~I~I~~~~~~~~~~~-----~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) .++|..+. ..++..+ +.-++.||+++.=|=+. 
+.++++++ T Consensus 293 ~~~~~~~~~~-~~~~~~~~~~~~i~~~~~~G~~~lrp~~~-v~~~~~~~ 339 (341) T protein:vir:94 293 AAVVSKAPRV-TQSFENREQVWLMVGRQAYGARLYRPLHA-VNIHTTGD 339 (341) T ss_pred hccccccccc-cccchhhhhhhhhhhhhhhcccccCccee-EEEecCcC Confidence 00110000 0112211 33567788888776654 66778877 No 161 >protein:vir:80068 Length: 301 # NCBI annotation: gp8 # Family: family:all:463 # MgeID: mge:1876 # MgeName: B054 # Cross-refs: genbank:acc:YP_001468712;genbank:gi:157325292;genbank:GeneID:5601759 Probab=37.61 E-value=1.1 Score=20.34 Aligned_cols=268 Identities=13% Similarity=0.070 Sum_probs=124.4 Q ss_pred CceEeeee-----cccchhHHHHHHHHhhccccchhcCceEEecCCCC-ccc-ccchhhhhccccCCCC--CCCCccceE Q lcl|NC_018271. 1 MATTVDIT-----TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPE-NNL-FLRRMNTTDDFVDYSC--GFTPSGEVD 71 (305) Q Consensus 1 ma~~~~~~-----~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~-~~~-~~~~~~~~~~~q~~~~--~~~~~G~~~ 71 (305) |-++.+.. .+|.-+..-+.+...+.+.+++ -|...++. .++ ..+.....-..+-+.. ..-|.-++. T Consensus 1 ~~~~~~g~f~~~~l~~id~~v~e~~~~~l~~r~l~-----~v~~~~~~~~~~~~~~~~~~~G~~~~~~~~~~dip~~~~~ 75 (301) T protein:vir:80 1 MQGKITATIEARDLQAIDNVIYEPKQEELTARSVF-----PQKFDVNEGAESYSFDVMTRSGAAKIIANGADDLPLVDVD 75 (301) T ss_pred CCccccchhhHHHHHHHHHHHHHhhhhhhhhhhhc-----ccccCCCCceEEEEEeeeccceeEEEecCccccccccccc Confidence 55444331 1333333333444445555543 11111111 111 1111111112232322 112556788 Q ss_pred ecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHH Q lcl|NC_018271. 72 INEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILP 151 (305) Q Consensus 72 ~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk 151 (305) ++.+..+...+-..+.+.-+|++..=+ .. + .|+.+.+... ...++...+..++.||...+ +.|++. 
T Consensus 76 ~~~~~~~i~~~~~~~~~~~~El~~a~~---~g---~---~l~~~k~~aa----~~~~~~~~n~~~f~G~~~~g-~~GLlN 141 (301) T protein:vir:80 76 MVRKSVPIYSIGIGLSYTIQDLRAARM---QG---T---TVDAAKATTV----RRAIAEKENSIAFRGEKKYA-IKGAFE 141 (301) T ss_pred ceeEEEEEEEEEeeeeecHHHHHHHHH---hC---C---ChHHHHHHHH----HHHHHHhhceEEeeeccccc-ceeeec Confidence 899999999999999999888876422 11 2 2444444433 45678888899999987532 444443 Q ss_pred HHhhccceEEecc----C-c--CcCChhhHHHHHHHHHHhccHHHH-hCCCcEEEecHHHHHHHHHHHhhhhccCCcccC Q lcl|NC_018271. 152 LLEADATVIDVVG----A-S--GGITAANVEAELGKFIDAHTDEIL-QAPNHVFGVSTNVIRAIKRAYGTQARSNGTFLN 223 (305) Q Consensus 152 ~i~~d~~~~~~~~----~-~--~~iT~anv~~~l~~~~~~iP~~~r-~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~~~t~ 223 (305) - ....+...+. . + ...|+.++.+.|...+.++-..-. ......+.+|++.|.....-+.... .+....+ T Consensus 142 ~--p~~~~~~~~~~~~~~~~~w~~~t~~ei~~di~~~~~~l~~~s~g~~~p~~L~L~p~~~~~L~~~~~~~~-~~~tvl~ 218 (301) T protein:vir:80 142 A--TGIQIDVSPTTGVGNVSKWEKKTAEQIIDEIGEAHTKITVLPGYGTASLKLCLPPKQFELINKKRYSNE-DSRSVLK 218 (301) T ss_pred C--CCcccccccCcccccccccccCCHHHHHHHHHHHHHHHHHhcCceecccEEEecHHHHHhhhhccccCC-CCeeHHH Confidence 2 0111111000 0 0 234677777777777777654310 0122689999998877654332111 1111111 Q ss_pred CCcceecceeeeeccCCC------CCeEEEec--chHHhhhhhhhhhhhhccccceeeecccee-EEEEEEeecceeecc Q lcl|NC_018271. 224 PNEFDFEGYTLTEIKGLP------ASRMVGYN--RDNIVIGMSAQSDFNEIRIKDMGDVDLSGQ-IRTKMVLSAGVEYAY 294 (305) Q Consensus 224 ~~~~~~kGi~iv~l~~~P------d~~ii~T~--~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~-~f~k~~m~~d~~i~f 294 (305) -=+.-+-+.+|++++.+- .+++++-. ..|+-+.+- -|+...- ++..+. |......+ . T Consensus 219 ~l~~~~~~~~I~~~p~L~~~g~~g~~~~v~~~~~~d~~~~~v~--~~~~~~~------~e~~~~~~~~~~~~r----~-- 284 (301) T protein:vir:80 219 VLQDNAWFSAIVRVPDLAGMGTAGSDSFAVIHDSNETAELIIP--MDITRHP------EEYSFPRTKVPFEER----T-- 284 (301) T ss_pred HHHHHcCcceEEEcceeccCCCCcccEEEEEecCCcEEEEEec--Cceeeec------ceecCceeEeeeeee----e-- Confidence 001123446788887762 23344332 344422211 1111111 111111 11111111 1 Q ss_pred CCeEEEecCCC Q lcl|NC_018271. 
295 GAEIVLYTPAA 305 (305) Q Consensus 295 g~E~v~~~~~~ 305 (305) .+++++-|.| T Consensus 285 -~Gv~i~~P~a 294 (301) T protein:vir:80 285 -AGVVVRFPAA 294 (301) T ss_pred -EEEEEEccce Confidence 3567788888 No 162 >protein:vir:94711 Length: 347 # NCBI annotation: capsid # Family: family:all:975 # MgeID: mge:1528 # MgeName: K1F # Cross-refs: genbank:acc:YP_338120;genbank:gi:77118198;genbank:GeneID:3707734 Probab=37.56 E-value=1.1 Score=20.33 Aligned_cols=272 Identities=17% Similarity=0.136 Sum_probs=113.8 Q ss_pred CceEee--e-----------------ecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCC Q lcl|NC_018271. 1 MATTVD--I-----------------TTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS 61 (305) Q Consensus 1 ma~~~~--~-----------------~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~ 61 (305) ||++.- + ...|.||+...+.-. .+-++++++...-..++..++++..... ++++ T Consensus 1 m~~~~~~~~~t~~g~~~~~~d~~al~ik~f~~eV~~~f~~~------s~~~~~~~~r~i~~G~sv~i~~iG~~tv-~~~t 73 (347) T protein:vir:94 1 MANVPGQKIGTDQGKGKSSSDALALFLKVFAGEVLTAFTRR------SVTADKHIVRTIQNGKSAQFPVMGRTSG-VYLA 73 (347) T ss_pred CCCCCccccccccccCCccccHHHHHHHHHhHHHHHHHHHH------HhhhcccccccccccceEEEecccceee-eeec Confidence 554311 1 124666665543222 2345667777655566554555543333 3333 Q ss_pred CCCCCcc---ceEecceeeeeeeeeEeeccCHHHHHHHHH----HHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_018271. 62 CGFTPSG---EVDINEKQLTLKKIKSDKEVCKEDFRQLWT----AAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDK 134 (305) Q Consensus 62 ~~~~~~G---~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~----~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~ 134 (305) .+-...| +.+-++++|.....+ |.+.+- .+|.. 
.++ .....++.+..++....+ T Consensus 74 ~G~~l~~~~~~~~~~e~~itID~~~---------~~~~~VddiD~~q~~-~D~---------~~~~~~~~g~aLa~~~D~ 134 (347) T protein:vir:94 74 PGERLSDKRKGIKHTEKVITIDGLL---------TADVMIFDIEDAMNH-YDV---------AGEYSNQLGEALAIAADG 134 (347) T ss_pred CCCCcCCCCCCCCcceEEEEecchh---------hhhHHhhhHHHHhcC-cch---------HHHHHHHHHHHHHHHHHH Confidence 2222111 234444444443333 222222 23333 222 222334455555555554 Q ss_pred hcc------ccCC--ccchhHHHHHHHhhccceEEeccCcC----cCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHH Q lcl|NC_018271. 135 DIW------QGDG--TTGNLQGILPLLEADATVIDVVGASG----GITAANVEAELGKFIDAHTDEILQAPNHVFGVSTN 202 (305) Q Consensus 135 ~~~------~GD~--s~~~fdG~lk~i~~d~~~~~~~~~~~----~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~ 202 (305) .+. .... +.....|+- ...++.+..... ..+..++++.+.+....+.++--...+..+.+++. T Consensus 135 ~i~~~~~~~aa~~~~~~~~~~g~~-----~~s~~~~~~~~~~~~~~~~~~~~~~~i~~a~~~Lde~~VP~~~R~~vv~P~ 209 (347) T protein:vir:94 135 AVLAEMAILCNLPAASNENIAGLG-----TASVLEVGKKADLDTPAKLGEAIIGQLTIARAKLTSNYVPAGDRYFYTTPD 209 (347) T ss_pred HHHHHHHHHhccccccccccCCCc-----ccceeeccccccccchhhhHHHHHHHHHHHHHHHhhcCCCCCCcEEEeCHH Confidence 432 1111 111122221 011111111111 12345566666555555555422223468889988 Q ss_pred HHHHHHHHHh---hhhccCCcccCCCcceecceeeeeccCCCCCeE----------EEecchHHhhhh---hhhhhhhh- Q lcl|NC_018271. 203 VIRAIKRAYG---TQARSNGTFLNPNEFDFEGYTLTEIKGLPASRM----------VGYNRDNIVIGM---SAQSDFNE- 265 (305) Q Consensus 203 ~~d~Y~d~~~---~~~~k~~~~t~~~~~~~kGi~iv~l~~~Pd~~i----------i~T~~sNl~~gv---nl~~D~n~- 265 (305) .|..-..... ..+....+..++....+.|++|+...++|.+-+ +.....|.+.+- .+..|+.. 
T Consensus 210 ~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~i~G~~V~~Sn~lp~~~~t~~~~~~~~~~~aG~~~~~~~~~~~~~~~~~~~~ 289 (347) T protein:vir:94 210 NYSAILAALMPNAANYAALIDPETGNIRNVMGFVVVEVPHLVQGGAGETRGDDGITIASGQKHAFPATASSDVKVTMDNV 289 (347) T ss_pred HHHHHhccchhhhhhccccccccccceEEEeceEEEecCcccccccccccccCcceecCcccccccccchhhhcccccce Confidence 8765533221 112222334456567899999999999995422 222222332110 11111110 Q ss_pred -----------------ccccceeeeccce-eEEEEEEeecceeeccCCeEE-EecCCC Q lcl|NC_018271. 266 -----------------IRIKDMGDVDLSG-QIRTKMVLSAGVEYAYGAEIV-LYTPAA 305 (305) Q Consensus 266 -----------------I~I~~~~~~~~~~-~~f~k~~m~~d~~i~fg~E~v-~~~~~~ 305 (305) ++++.+-..+.-. .+..++.||+++.= ++-+| +-.++| T Consensus 290 ~~l~~h~~A~~~v~~~~~~~e~~r~~~~~~d~i~~~~~~G~~~~r--P~~a~~~~~~~A 346 (347) T protein:vir:94 290 VGLFSHRSAVGTVKLRDLALERDRDVDAQGDLIVGKYAMGHGGLR--PEAAGALVFSPA 346 (347) T ss_pred eEEEeehhhhhhhhcccccccchhchhhHHHHhhhhhhhcCcccc--cceeEEEEecCC Confidence 1222111111111 24556666666554 33332 344455 No 163 >protein:vir:80180 Length: 381 # NCBI annotation: capsid protein # Family: family:all:2203 # MgeID: mge:1878 # MgeName: Pf-WMP3 # Cross-refs: genbank:acc:YP_001285797;genbank:gi:148747831;genbank:GeneID:5220456 Probab=36.87 E-value=1.1 Score=20.25 Aligned_cols=274 Identities=11% Similarity=-0.027 Sum_probs=98.1 Q ss_pred CceEe----------ee----e---cccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCC Q lcl|NC_018271. 1 MATTV----------DI----T---TNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCG 63 (305) Q Consensus 1 ma~~~----------~~----~---~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~ 63 (305) ||+.- +- + .=|++++++.+--.....+ ++.. +-..+...++.-+|+.... 
..++++.+ T Consensus 1 ~~~~~~~~~~~~~~~~~t~~~~fiPev~s~~v~~~l~~~lv~~~-l~~~---~~~~~~~GdTV~ip~~g~~-~a~d~~~g 75 (381) T protein:vir:80 1 MATIQGTGGYKGSAVDLSNVQVFIPEVWSSEVRMFRDQKFAALE-ATKK---IPFEGKKGDLIHIPNISRA-AVYDKQPQ 75 (381) T ss_pred CceecccccccCcccchhhHHhhhhHHHHHHHHHHHHHhhhhhh-cccc---ccceeecCceEEeeccCcc-eeeeecCC Confidence 54432 11 0 1344555554433333322 2222 1122333455446665543 24445422 Q ss_pred -CCCccceEecceeeeeeeeeEe-eccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhcc---- Q lcl|NC_018271. 64 -FTPSGEVDINEKQLTLKKIKSD-KEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIW---- 137 (305) Q Consensus 64 -~~~~G~~~~~~K~L~~~~~k~~-~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~---- 137 (305) .-.-.+..-+++.+...+++.. +.+. |.+ ..+.. ++ ......+.+...++..+.+.+. T Consensus 76 ~~i~~~~~~~~~~~itID~~~~~~~~Id--d~D----~~~~~-~D---------~~~~~~~~~~~aLA~~~D~~i~~~~~ 139 (381) T protein:vir:80 76 TPVNLQARTDSEFTFTVTKYKESSFMIE--DIV----NTQAS-YT---------LRQYYTKEAGYALARDMDNFALAHRA 139 (381) T ss_pred CcccccccCCceEEEEEeeeeecceeec--hHH----HHhhc-cC---------hHHHHHHHHHHHHHHHHHHHHHHHHh Confidence 1111233334455555444432 2222 211 11222 21 1122234444455554444333 Q ss_pred ccCCccch--hHHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHH--Hhh Q lcl|NC_018271. 138 QGDGTTGN--LQGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRA--YGT 213 (305) Q Consensus 138 ~GD~s~~~--fdG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~--~~~ 213 (305) ....+... +.+ ........+....++.+.....+.|-+....+.++--...+..++|++..|...... +.. 
T Consensus 140 ~~~~~~~~~~~t~-----~~~i~~~~~~~~~t~~~~~~t~~~i~~a~~~Lde~~VP~egR~lvv~P~~~~~Ll~~~~~~~ 214 (381) T protein:vir:80 140 VINAFPSQRIYSY-----DTTLGDGTVNAHLTGTPAPLTYAALLLAKQKLDEADVPQEGRIVMVSPAQYIDLLSINQFIS 214 (381) T ss_pred hcccccccccccc-----cccccccccccccccchhhHHHHHHHHHHHHHhhcCCCcCCcEEEeCHHHHHHHhhchhhhh Confidence 11111000 000 000000011111122233444555555555544431112346899999888876543 222 Q ss_pred -hhccCCcccCCCcceecceeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceee Q lcl|NC_018271. 214 -QARSNGTFLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEY 292 (305) Q Consensus 214 -~~~k~~~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i 292 (305) .++......++..+++.|++|+.-..+|.+.........-. ..-.. ..+.-.+..--+......+..+-.+|.-+ T Consensus 215 ad~~~~~~l~~G~Ig~i~G~~Vv~Sn~lp~~~~t~~~~~aga-p~~~~---~~~~~~~~~g~~s~~a~av~~~k~yd~~~ 290 (381) T protein:vir:80 215 VDFSQVKPVTSGVVGTILGMEVIVTTQIGINSLTGYVNGQGA-PTQPT---PGVLGSPYLPDQAGTANVVNTGSASDLAV 290 (381) T ss_pred hhhccchhhhceeeeEEcceEEEeecccccccccceeeeccc-ccccc---ccccccccccccccceeeeeeeeeeceee Confidence 22222223455567899999999999997654332211110 00000 00000000000000001111111111111 Q ss_pred ccCCeE-------EEecCCC Q lcl|NC_018271. 293 AYGAEI-------VLYTPAA 305 (305) Q Consensus 293 ~fg~E~-------v~~~~~~ 305 (305) ..+-. +.++|+. T Consensus 291 -~~~~~~~~~~~g~~~~~~~ 309 (381) T protein:vir:80 291 -SLSYFGLPVFSGAGATAAD 309 (381) T ss_pred -eeeeccceeeecceeeecC Confidence 00111 2233333 No 164 >protein:vir:1541 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:31 # MgeName: phiYeO3-12 # Cross-refs: genbank:acc:NP_052109;swissprot:trembl:q9t107;genbank:gi:9634035;uniprot:Q9T107;genbank:GeneID:1262383 Probab=32.42 E-value=1.4 Score=19.74 Aligned_cols=277 Identities=13% Similarity=0.110 Sum_probs=107.7 Q ss_pred CceEeeee---c--cc---chhHH--------HHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCC- Q lcl|NC_018271. 
1 MATTVDIT---T--NY---VGEVA--------GGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCG- 63 (305) Q Consensus 1 ma~~~~~~---~--~Y---~Ge~l--------~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~- 63 (305) ||++...+ | .. +|++. .++++.-. .-.+-.+++++......++..++++..... +.++.+ T Consensus 1 ma~~~~~~~~~t~~~~~~~~~~~~a~~ie~f~g~V~~~f~--~~s~~~~~~~~~~~~~G~sv~i~~ig~~t~-~~~~~g~ 77 (347) T protein:vir:15 1 MANIQGGQQIGTNQGKGQSAADKLALFLKVFGGEVLTAFA--RTSVTMPRHMLRSIASGKSAQFPVIGRTKA-AYLKPGE 77 (347) T ss_pred CCccccCCccccccccCCCcchHHHHHHHHHHHHHHHHHH--HhhhhhhccccccccccceeEeeeccceee-eeeccCC Confidence 99887653 1 12 22222 11122111 223445556665555566555666554332 444332 Q ss_pred -CCCc-cceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhcc---- Q lcl|NC_018271. 64 -FTPS-GEVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIW---- 137 (305) Q Consensus 64 -~~~~-G~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~---- 137 (305) .+.+ -+.+-+++.|..-..+..=.+. .|+ -.+| .+..+.....+..+..++....+.++ T Consensus 78 ~l~~~~~~~~~~e~~ltID~~~~~~~~V-ddl----D~~q----------~~~D~~~~~~~~~g~aLA~~~D~~i~~~l~ 142 (347) T protein:vir:15 78 NLDDKRKDIKHTEKVIHIDGLLTADVLI-YDI----EDAM----------NHYDVRAEYTAQLGESLAMAADGAVLAELA 142 (347) T ss_pred CCCCCCCCCccceEEEEechhhhhhHHh-hhH----HHHh----------cCCcchHHHHHHHHHHHHHHHHHHHHHHHH Confidence 2211 2345555556544443321111 111 1122 22223333444555555555554443 Q ss_pred cc-C---CccchhHHHHHHHhhccceEEeccCcCcC-----ChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHH Q lcl|NC_018271. 138 QG-D---GTTGNLQGILPLLEADATVIDVVGASGGI-----TAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIK 208 (305) Q Consensus 138 ~G-D---~s~~~fdG~lk~i~~d~~~~~~~~~~~~i-----T~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~ 208 (305) .+ + .+...+.+.-.. .. .......++.. ++.++.+.+.+....+.++=-..-+..+++++..|.... 
T Consensus 143 ~~~~~~~~~~~~~~~~g~~---~~-~~~~~~~~~~~~~~~~~~~~i~d~~~~a~~~Lde~~VP~~gR~~vv~P~~y~~LL 218 (347) T protein:vir:15 143 GLVNLPDASNENIEGLGKP---TV-LTLVKPTTGDLTDPVELGKAIIAQLTIARASLTKNYVPAADRTFYTTPDNYSAIL 218 (347) T ss_pred HHhhccccccccccccCcc---cc-ccccccccccchhhhhHHHHHHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHh Confidence 11 1 111111110000 00 00001111111 134567777666655555422223468889998887765 Q ss_pred HHHhh---hhccCCcccCCCcceecceeeeeccCCCCCeEE-----Eec-chHH-------------------------h Q lcl|NC_018271. 209 RAYGT---QARSNGTFLNPNEFDFEGYTLTEIKGLPASRMV-----GYN-RDNI-------------------------V 254 (305) Q Consensus 209 d~~~~---~~~k~~~~t~~~~~~~kGi~iv~l~~~Pd~~ii-----~T~-~sNl-------------------------~ 254 (305) ..-.- .+....+..++.-....|++|....++|..-.- ++. ..|- + T Consensus 219 ~~~~~~~~d~~~~~~~~~G~Vg~i~G~~V~~Sn~lp~~~~t~~~~~~~~g~~~~~~~~~~~~~~~~f~~~~~l~~h~~A~ 298 (347) T protein:vir:15 219 AALMPNAANYQALIDHERGTIRNVMGFEVVEVPHLTAGGAGDTREDAPADQKHAFPATSSTTVKVALDNVVGLFQHRSAV 298 (347) T ss_pred cccccccccccccccccceEEEEEeceEEEecccccccccccccccccccccccccccccceeeeccccceeeeecccee Confidence 44321 111112233444456889999999999853210 000 0010 0 Q ss_pred hhhhhhhhhhhccccceeeeccc-eeEEEEEEeecceeeccCCeEE-EecCCC Q lcl|NC_018271. 
255 IGMSAQSDFNEIRIKDMGDVDLS-GQIRTKMVLSAGVEYAYGAEIV-LYTPAA 305 (305) Q Consensus 255 ~gvnl~~D~n~I~I~~~~~~~~~-~~~f~k~~m~~d~~i~fg~E~v-~~~~~~ 305 (305) .-|.+++ ++++..-...+- -.+..++.||+++.=| +=+| +.=|-- T Consensus 299 g~v~~~~----~~~e~~~~~~~~~d~i~~~~~~G~~vlrP--~~av~~~~~~~ 345 (347) T protein:vir:15 299 GTVKLKD----LALERARRANYQADQIIAKYAMGHGGLRP--EAAGAIVLPKV 345 (347) T ss_pred eeeEeec----eeeeecccchhhhhhhehhhhcCCceecc--ccEEEEecCCC Confidence 0011110 122211110011 1134445555554332 1111 112222 No 165 >protein:vir:348 Length: 321 # NCBI annotation: major virion structural protein # Family: family:all:3198 # MgeID: mge:9 # MgeName: Mx8 # Cross-refs: genbank:acc:NP_203462;genbank:gi:15320618;genbank:GeneID:921734 Probab=28.71 E-value=1.7 Score=19.29 Aligned_cols=265 Identities=15% Similarity=0.104 Sum_probs=108.7 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCCCCCCCccceEecceeeee- Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYSCGFTPSGEVDINEKQLTL- 79 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~~~G~~~~~~K~L~~- 79 (305) |-. .+|.+|.+.++ +--+++ +..+|.+..-++.++.-.--.|-++.+.+=--...|.+ ..+- T Consensus 1 mp~----------~~lsel~t~tl---~~rs~~---~~D~v~~~n~LL~~L~~kG~~~~~~gg~~I~~~l~y~~-~s~~~ 63 (321) T protein:vir:34 1 MPF----------PNISDIITTTI---ESRSGV---IADNVTKNNAILARLAKRGKPRLVSGGYTILEELSFSG-NSNGG 63 (321) T ss_pred CCC----------chHHHHHHHHH---Hhhcch---hhhhhhcccHHHHHHHhcCcccccCCCeeEEEEEeecc-Cccee Confidence 443 12445554431 111121 13445444433333322211121111111001111110 0000 Q ss_pred -eeeeEeeccCHHHHHH----HHHHHhcCCCCcccc---cCCHHHHHHHHHHHHHHH-------HhhhhhhccccCCcc- Q lcl|NC_018271. 80 -KKIKSDKEVCKEDFRQ----LWTAAEMGFSAFNDN---GLPSTEQGFMLTDMGNRL-------ARKIDKDIWQGDGTT- 143 (305) Q Consensus 80 -~~~k~~~~~~P~d~~~----~w~~~~~~~g~~~~~---~LP~~~q~~~l~~l~~~i-------a~ei~~~~~~GD~s~- 143 (305) +..-=-+.++|+|..+ -|+.. - +.+... 
.|--+-.+.++.+|..++ ++++.....+ |+|+ T Consensus 64 wy~Gyd~l~~~p~d~~~~Aef~wk~a--a-~~~~isg~e~l~n~g~~~~idll~~~~~~ae~t~~n~l~~~l~s-dGTa~ 139 (321) T protein:vir:34 64 WYSGYDVLPTAPQDVISSAEYALKQY--A-VPVVISGLEMLQNSGKEAQLDLLEARMNVAEATMANDISAALYG-DGTAF 139 (321) T ss_pred EEEeeeeeccchhhhccccccchhhe--e-EeeEEehhHHhhccchHHHHHHHHHHHHHHHHHHHhhhhHhhhc-ccccc Confidence 1111235567777443 24331 1 222221 122111223334444443 2444444443 3332 Q ss_pred --chhHHHHHHHhhccce-----E----------EeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHH Q lcl|NC_018271. 144 --GNLQGILPLLEADATV-----I----------DVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRA 206 (305) Q Consensus 144 --~~fdG~lk~i~~d~~~-----~----------~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~ 206 (305) ....|+=-.+..+.++ | .+.....+.|..+++.++..+|-+|-..-. -|+ -+.||.+.|++ T Consensus 140 g~~~i~GL~~lv~~~p~tGtvGGIdra~~~~WRn~~~d~~~~~t~~tl~~~m~~~w~~~~Rg~~-~PD-lii~~~~~y~~ 217 (321) T protein:vir:34 140 GGRAINGLDGAVPVDPTVGTYGGINRALWPFWRSQVEDMAAVATINTIQPAMTKLWSRCVRGAD-MPD-LIMSGNDAWTT 217 (321) T ss_pred ccchhhhhhhhcccCCCCceeccccccchhhhhhhhhhhhhcccHHHHHHHHHHHHHhhccCCC-Ccc-EEEechHHHHH Confidence 2344443333323111 1 122223456889999999999999874311 133 78899999999 Q ss_pred HHHHHhhhhccCCcccCCCc----ceecceeeeecc----CCCCCeEEEecchHHhhhhhhhhhhhhcccccee--eecc Q lcl|NC_018271. 207 IKRAYGTQARSNGTFLNPNE----FDFEGYTLTEIK----GLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMG--DVDL 276 (305) Q Consensus 207 Y~d~~~~~~~k~~~~t~~~~----~~~kGi~iv~l~----~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~--~~~~ 276 (305) |...+.....-..+ ...+. ..|+|..|+-=. 
.+|+++|+.-+.+=|-+-..--+++..|.=.+++ |+|+ T Consensus 218 y~~s~q~~qR~~~~-~~a~~Gf~~Lky~~~div~D~~~g~~~pan~~yfiNT~yl~~r~h~~~~~~pi~p~r~~~~NqdA 296 (321) T protein:vir:34 218 YSNSLQVLQRFTSA-EEANLGFRSLKFLSTDVVLDGGIGGFAGANTMYFLNTKYLHFRPHKDRNMVPLSPSRRAAFNQDA 296 (321) T ss_pred HHHhhheeeeeccc-ccccccceeeeeeeEEEEEeCCCCCCccccceeeeecceEEEEEcCCCceeecCcccccccchhH Confidence 99999655532222 22222 259998887665 4699988866533221110000111111111111 2222 Q ss_pred cee------------EEEEEEeecc Q lcl|NC_018271. 277 SGQ------------IRTKMVLSAG 289 (305) Q Consensus 277 ~~~------------~f~k~~m~~d 289 (305) -.+ =.|.-+|++- T Consensus 297 ~~q~I~~~GnL~~sn~~~~~vL~~~ 321 (321) T protein:vir:34 297 EAQILAWAGNLTCSGAQFQGRLIAE 321 (321) T ss_pred HhhhhhhhheeeeecccceeEEeeC Confidence 222 1122223322 No 166 >protein:vir:3613 Length: 272 # NCBI annotation: MHP # Family: family:all:522 # MgeID: mge:74 # MgeName: TP901-1 # Cross-refs: genbank:acc:NP_112699;genbank:gi:13786567;genbank:GeneID:921035 Probab=27.65 E-value=1.8 Score=19.15 Aligned_cols=260 Identities=17% Similarity=0.139 Sum_probs=114.0 Q ss_pred CceEeeeec-ccchhHHHHHHHHhhccccchhcCce---EEecCCCCcccccchhhhhccccCCCCCCC-CccceEecce Q lcl|NC_018271. 1 MATTVDITT-NYVGEVAGGYFLEMVKEANTISDNLI---RVIPNVPENNLFLRRMNTTDDFVDYSCGFT-PSGEVDINEK 75 (305) Q Consensus 1 ma~~~~~~~-~Y~Ge~l~~~~~~~~~g~~~v~~g~I---~v~~~v~~~~~~~~~~~~~~~~q~~~~~~~-~~G~~~~~~K 75 (305) ||+...=.+ --..|+..+++...+.+.- +-.++. ..+.|-+.++..+|+.+.....+.+..+-+ +-+..+.++. T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~~-~~~~~~~~~~~l~g~~G~ti~iP~~~~~gda~~~~eg~~i~~~~lt~~~~ 79 (272) T protein:vir:36 1 MSKQKTTLADLVNPEVLAPIVSYELNKAL-RFAPLAQVDTTLQGQPGNTLKFPAFTYIGDAADVAEGGEISLDKIGTTTK 79 (272) T ss_pred CCCcceehhhhhchHHHHHHHHHHHHhhh-hhccccccccccccCCCCEEEEeeeccCccccccCCCCccChhhcCCcce Confidence 997654322 3335555555444333221 111111 123455555554554443333344432211 1234455666 Q ss_pred eeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhh Q lcl|NC_018271. 
76 QLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEA 155 (305) Q Consensus 76 ~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~ 155 (305) ++..++... .|--.|... ....++ | .....+.++..++..+..+++ ..+.. T Consensus 80 ~~~i~~~~k--~~~vtD~~~-----~~~~~d------~---~~~~~~~~a~~~a~~~d~~i~-------------~~l~~ 130 (272) T protein:vir:36 80 SVTIKKAAK--GTEITDEAA-----LSGYGD------P---IGESNKQLGLSLANKVDDDLL-------------SAAKT 130 (272) T ss_pred eEeeehhhc--cccccHHHH-----hhccch------H---HHHHHHHHHHHHHHHHHHHHH-------------HHhcc Confidence 666654332 333334321 112122 1 122334455555555554332 11111 Q ss_pred ccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhh--ccCCc--ccCCCcceecc Q lcl|NC_018271. 156 DATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQA--RSNGT--FLNPNEFDFEG 231 (305) Q Consensus 156 d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~--~k~~~--~t~~~~~~~kG 231 (305) ... . .+..++.+.+.+. +..+-+. ..+.-.++||+..|-..+..-.... ..+.+ ..++...+|.| T Consensus 131 ~~~--~---~~~~~~~d~i~~A----~~~lgd~--~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~~~~~~~~G~ig~~~G 199 (272) T protein:vir:36 131 TSQ--T---VSTKANVDGVQAA----LDIFNDE--DAQAYVLIVNPKDAAKIRKDANAKNIGSEVGANALINGTYADVLG 199 (272) T ss_pred ccc--c---ccccccHHHHHHH----HHHhhhc--CCCceEEEEcHHHHHHHhcccccccccccccccceeeeccceecC Confidence 111 0 1123343334333 3333332 1123479999988776643221111 11111 12334568999 Q ss_pred eeeeeccCCCCCeEEEec--chHHhhhhhhhhhhhhccccceeeeccceeEEEEEEeecceeeccCCeEEEecCCC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYN--RDNIVIGMSAQSDFNEIRIKDMGDVDLSGQIRTKMVLSAGVEYAYGAEIVLYTPAA 305 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~--~sNl~~gvnl~~D~n~I~I~~~~~~~~~~~~f~k~~m~~d~~i~fg~E~v~~~~~~ 305 (305) ++++.-.++|++..+.+. -.+=.+|+=...|.+ ++-+|-.+-... .+.=++-+++.+..++-+|.-|=.. 
T Consensus 200 ~~Vv~s~~~p~~~~~~~~~~~~~gA~~~~~~~~~~-vE~~R~~~~~~d---~i~~~~~y~~~v~~~~~vv~~t~~g 271 (272) T protein:vir:36 200 AQIVRSKKLAEGSALMFKIVSNSPALKLVLKRGVQ-VETDRDIVTKTT---VITADEHYAAYLYDLTKVVNITFTG 271 (272) T ss_pred eeEEEeCCCCCCceeEEEEEecccceeeeecCCcc-cccccchhhcCc---EEEEEEEEEEEEEcCccEEEEeecC Confidence 999999999988654332 222233332334433 333332221111 1112233477777777777777666 No 167 >protein:vir:7324 Length: 335 # NCBI annotation: hypothetical protein # Family: family:all:1903 # MgeID: mge:143 # MgeName: epsilon15 # Cross-refs: genbank:acc:NP_848215;genbank:gi:30387386;genbank:GeneID:2641870 Probab=27.62 E-value=1.8 Score=19.15 Aligned_cols=224 Identities=11% Similarity=0.116 Sum_probs=113.5 Q ss_pred CceEeee--------eccc-chhHHHHHHHHhhccccchhcCceEEecCCC---CcccccchhhhhccccCCCCCCCCcc Q lcl|NC_018271. 1 MATTVDI--------TTNY-VGEVAGGYFLEMVKEANTISDNLIRVIPNVP---ENNLFLRRMNTTDDFVDYSCGFTPSG 68 (305) Q Consensus 1 ma~~~~~--------~~~Y-~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~---~~~~~~~~~~~~~~~q~~~~~~~~~G 68 (305) |++...- .-+. .|++. .+.-.+...+++-.+ +..+.+.. +.++.-..+. .-.+.....++++.- T Consensus 1 m~~~~~~a~TL~E~Akr~~~d~~~~-~IIE~l~~tneIL~~--lpf~e~N~~tg~~~~vrt~LP-~~~fR~lN~g~~~s~ 76 (335) T protein:vir:73 1 MALIGQTLPSLLDIYNRTDKNGRIA-RIVEQLAKTNDILTD--AIYVPCNDGSKHKTTIRAGIP-EPVWRRYNQGVQPTK 76 (335) T ss_pred CCcCCCCchhHHHHHhhcCcchhHH-HHHHHHhcCchHHhh--cchhcccCCcccceeEEEecC-CchhhhcCCcccccc Confidence 7665321 1122 33333 466667777777766 44443321 1111100011 122445555555543 Q ss_pred ceEecceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCcc--chh Q lcl|NC_018271. 69 EVDINEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTT--GNL 146 (305) Q Consensus 69 ~~~~~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~--~~f 146 (305) .++.+++=...-+=...++. +.+ .++. |+. .+......+...+..+.++...++.||++. 
..| T Consensus 77 -~tt~qvt~~l~ilgg~~eVD-r~L------a~~~-Gn~------a~~ra~e~~~~ikam~q~~~~~~iyGDsa~~p~~F 141 (335) T protein:vir:73 77 -TQTVPVTDTTGMLYDLGFVD-KAL------ADRS-NNA------AAFRVSENMGKLQGFNNKVARYSIYGNTDAEPEAF 141 (335) T ss_pred -ceEEEEEEEEEEecchhhhh-HHH------Hhhc-CCH------HHHHHHHHHHHHHHHHHHHHHHhccCCcCCChhhc Confidence 44555555555444444443 222 2444 432 111222334556778889999999999764 559 Q ss_pred HHHHHHHhhcc--------ceEE--------------------------------------------------------- Q lcl|NC_018271. 147 QGILPLLEADA--------TVID--------------------------------------------------------- 161 (305) Q Consensus 147 dG~lk~i~~d~--------~~~~--------------------------------------------------------- 161 (305) ||..+++..-. +.++ T Consensus 142 dGL~kR~~~~st~~a~~a~~iIdaGGtG~~~TSi~~v~wg~~~~~giyPkG~kaGl~~~d~g~~~~~d~~G~~y~~~~~~ 221 (335) T protein:vir:73 142 MGLAPRFNTLSTSKAASAENVFSAGGSGSTNTSIWFMSWGENTAHMIYPEGMVAGFQHEDLGDDLVSDGNGGQFRAYRDE 221 (335) T ss_pred cchhhhhcCccccccCcccceeeccccccCceEEEEEEEcCCeeEEEcccCccccceeeeccceeeecCCCCEEeEEEee Confidence 99999973100 0010 Q ss_pred -----------------ecc--CcC----cCChhhHHH-HHHHHHH-hccHHHHhCCCcEEEecHHHHHHHHHHHhhhhc Q lcl|NC_018271. 162 -----------------VVG--ASG----GITAANVEA-ELGKFID-AHTDEILQAPNHVFGVSTNVIRAIKRAYGTQAR 216 (305) Q Consensus 162 -----------------~~~--~~~----~iT~anv~~-~l~~~~~-~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~ 216 (305) +.. .+. ..+.++.++ .++++++ .||. ++. +.-.|||...+....+-++.+. T Consensus 222 ~~w~~Gl~i~d~r~vvRI~NIdvs~l~~d~~~~~~l~~lmi~a~~~~~ip~-~~~-~~~~~y~n~~v~~~L~~q~~~~-- 297 (335) T protein:vir:73 222 FKWDIGLSVRDWRSISRICNIDVTTLTKDASTGADLISMMVDAYYARDVAM-LGD-GKEVIYANKTIHAWLHKQAMNA-- 297 (335) T ss_pred eeeeeeeEEeCcccEEEEeecccccccccccchhhHHhhHHHHHHHHhccC-CCC-CceEEEechHHHHHHHHHHhcc-- Confidence 000 001 111222222 3344443 4554 233 3369999999999888888555 Q ss_pred cCCccc--CC--Cc-ceecceeeeeccCC--CCCeEEE Q lcl|NC_018271. 217 SNGTFL--NP--NE-FDFEGYTLTEIKGL--PASRMVG 247 (305) Q Consensus 217 k~~~~t--~~--~~-~~~kGi~iv~l~~~--Pd~~ii~ 247 (305) +++..+ +. 
.. ..|.|++|.-+..| -++.+++ T Consensus 298 ~n~~l~~~~~~g~~~t~~~gipir~~Dail~tE~~v~~ 335 (335) T protein:vir:73 298 KNVNLTIEEYGGKKIVSFLGIPIRRVDAILNTESAVTA 335 (335) T ss_pred CceeeeeeccCCceeEEECCeEEEEEeeeecCcccccC Confidence 333222 21 11 35888888887777 5667777 No 168 >protein:vir:106647 Length: 303 # NCBI annotation: ORF011 # Family: family:all:1178 # MgeID: mge:1557 # MgeName: 187 # Cross-refs: genbank:acc:YP_239493;genbank:gi:66395226;genbank:GeneID:4555801 Probab=22.00 E-value=2.5 Score=18.39 Aligned_cols=238 Identities=13% Similarity=0.144 Sum_probs=96.8 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceEEecCCCCcccccchhhhhccccCCC-CCCCC--------ccce- Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIRVIPNVPENNLFLRRMNTTDDFVDYS-CGFTP--------SGEV- 70 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~v~~~v~~~~~~~~~~~~~~~~q~~~-~~~~~--------~G~~- 70 (305) -|.+|++.+.|++- +.+|+-.+ |-.|.. | |..+..||=|+ ..|+. -|++ T Consensus 15 ~a~siDF~~~f~~~-i~~L~~~L---------Gv~r~~----------p-la~Gt~iktyK~~~~~y~gda~dVaEGe~I 73 (303) T protein:vir:10 15 KAKSIDFANKLGVG-LNKLFEAL---------AIQNKI----------P-MNVGSALKQYRFKVEDSEKPNGDVAEGDVI 73 (303) T ss_pred cceeehhhhhhhhh-HHHHHHHh---------hhhccc----------c-ccCCceeeeeeeeceeeccccccccCCccc Confidence 23333333322221 11111110 111111 1 22333333222 12222 1222 Q ss_pred Ee--------cceeeeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCc Q lcl|NC_018271. 71 DI--------NEKQLTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGT 142 (305) Q Consensus 71 ~~--------~~K~L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s 142 (305) -+ +.+.++.+++.. ...-. +=|.+ |- ..|..+ . 
-+.|...++..|.++ T Consensus 74 plskvt~~~~~t~~~~~kK~rK--~tTdE-------AIqls-Gy----g~aVge--t-d~qL~~~Iq~kIdnd------- 129 (303) T protein:vir:10 74 PLTKVTREQVDITELQFAKYRK--STSAE-------AIQAH-GY----DLAINQ--T-DNEMIKYVQKKFRAK------- 129 (303) T ss_pred chhhheeeecceEEEEeecccc--cccHH-------HHHhh-cC----CchhHH--H-HHHHHHHHHhhhhHH------- Confidence 11 122222222211 11111 11344 52 334332 1 133344455555422 Q ss_pred cchhHHHHHHHhhccceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhh--hhcc-CC Q lcl|NC_018271. 143 TGNLQGILPLLEADATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGT--QARS-NG 219 (305) Q Consensus 143 ~~~fdG~lk~i~~d~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~--~~~k-~~ 219 (305) |+..++..+...+-. .+...+.+++..++..+|.++..-...+.+.++|||+.+.-+|.-.=.. +... |. T Consensus 130 ------~~~~lktaT~t~~~t-~~t~~s~~glq~Al~~~~~kl~~~~ed~~~~V~FvNP~Daa~yl~~A~i~~~~t~fG~ 202 (303) T protein:vir:10 130 ------FFETLKSAIENGKRT-NKTKLSAENLQGALSKGRANLSVLLDDEITPIAFVNPNDTAEYLANGFINSTGAQFGV 202 (303) T ss_pred ------HHHHHhhcccccccc-cceeecHHHHHHHHHhhhhhccccccccccEEEEEchHHHHHHhhcCCcchhhhhhhh Confidence 233333333222211 1223567889999999998876654445678999999999888532110 0000 11 Q ss_pred cccCCCcceecceeeeeccCCCCCeEEEecchHHhhhh-hhhhhhhhccccceee-----eccceeEEEEEEeecceeec Q lcl|NC_018271. 220 TFLNPNEFDFEGYTLTEIKGLPASRMVGYNRDNIVIGM-SAQSDFNEIRIKDMGD-----VDLSGQIRTKMVLSAGVEYA 293 (305) Q Consensus 220 ~~t~~~~~~~kGi~iv~l~~~Pd~~ii~T~~sNl~~gv-nl~~D~n~I~I~~~~~-----~~~~~~~f~k~~m~~d~~i~ 293 (305) ++.. .|.|..|+--..+|++.+++|+..|+.+++ +. + .++++ +|-.+.+=+ .-..+..= T Consensus 203 n~L~----nfLG~~II~S~kv~~G~~~~T~~~Ni~~ay~~~----~----g~l~~~f~~t~D~tglIGv--~h~~~~~~- 267 (303) T protein:vir:10 203 NLLT----PYVGVKIVEFADVPQGEVWMTVAENLNVAYANP----R----GELSRAFAFATDATGFVGV--LHDIQPQR- 267 (303) T ss_pred hhhh----hhhcceEEEeccCCCceEEEeeccceEEEEecC----c----hhhhhhhhhccccccceEE--Eeccccce- Confidence 1111 399999998999999999999999997653 11 1 12221 111111100 00000000 Q ss_pred cCCeEEEec-----CCC Q lcl|NC_018271. 
294 YGAEIVLYT-----PAA 305 (305) Q Consensus 294 fg~E~v~~~-----~~~ 305 (305) ---|.++.+ |.- T Consensus 268 ~t~eT~~~~~~~lfpE~ 284 (303) T protein:vir:10 268 LTSDTIYASAISMFPEN 284 (303) T ss_pred eeehhHhHhHHHhcccc Confidence 000111111 111 No 169 >protein:vir:96833 Length: 275 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1642 # MgeName: EW # Cross-refs: genbank:acc:YP_240157;genbank:gi:66395822;genbank:GeneID:5133174 Probab=20.87 E-value=2.7 Score=18.22 Aligned_cols=256 Identities=13% Similarity=0.130 Sum_probs=112.4 Q ss_pred CceEeeeecccchhHHHHHHHHhhccccchhcCceE---EecCCCCcccccchhhhhccccCCCC-CCCCccceEeccee Q lcl|NC_018271. 1 MATTVDITTNYVGEVAGGYFLEMVKEANTISDNLIR---VIPNVPENNLFLRRMNTTDDFVDYSC-GFTPSGEVDINEKQ 76 (305) Q Consensus 1 ma~~~~~~~~Y~Ge~l~~~~~~~~~g~~~v~~g~I~---v~~~v~~~~~~~~~~~~~~~~q~~~~-~~~~~G~~~~~~K~ 76 (305) ||+.-....--..|+...+...-+.+.... .++.. -+.|-+.++..+|+.+.....+.+.. +--+-+..+.++.. T Consensus 3 ~~~~T~l~d~i~PEv~~~~v~~~~~~~~~~-~~~~~~~~~l~g~~G~tv~iP~~~~ig~a~~~~~g~~i~~~~lt~~~~~ 81 (275) T protein:vir:96 3 LENMTKLANMVNPEVLAPMMQAELDKKLKF-AQFADIDNTLVGQPGNTITFPAFVYSGDAKVVPEGEEIPIDLIETKKRQ 81 (275) T ss_pred CcccchhhhhhchHHHHHHHHHHHHHhhhh-cccceecccccCCCCCEEEeeeeccCCccccccCCCCcchhhcccceee Confidence 655432222333444444443333322111 22222 13455556554554443333444432 11112333444444 Q ss_pred eeeeeeeEeeccCHHHHHHHHHHHhcCCCCcccccCCHHHHHHHHHHHHHHHHhhhhhhccccCCccchhHHHHHHHhhc Q lcl|NC_018271. 77 LTLKKIKSDKEVCKEDFRQLWTAAEMGFSAFNDNGLPSTEQGFMLTDMGNRLARKIDKDIWQGDGTTGNLQGILPLLEAD 156 (305) Q Consensus 77 L~~~~~k~~~~~~P~d~~~~w~~~~~~~g~~~~~~LP~~~q~~~l~~l~~~ia~ei~~~~~~GD~s~~~fdG~lk~i~~d 156 (305) +..++ +...|--.|..... ..|+ |. ...++.++..++..+..+.+ ..+... 
T Consensus 82 ~~i~~--~~~~~~i~D~~~~~-----~~~d------~~---~~~~~~~a~~~a~~~d~~ll-------------~~l~~a 132 (275) T protein:vir:96 82 ATIRK--IGKGTVLTDEALLS-----GYGD------PK---GEAVRQHGLAIANKVDNDVL-------------EALQGA 132 (275) T ss_pred EEeeh--hcccccccHHHHHh-----hccc------hH---HHHHHHHHHHHHHHHHHHHH-------------HHHhcc Confidence 44432 34444444443221 1122 11 11224444445554443332 111111 Q ss_pred cceEEeccCcCcCChhhHHHHHHHHHHhccHHHHhCCCcEEEecHHHHHHHHHHHhhhhccCC---c--ccCCCcceecc Q lcl|NC_018271. 157 ATVIDVVGASGGITAANVEAELGKFIDAHTDEILQAPNHVFGVSTNVIRAIKRAYGTQARSNG---T--FLNPNEFDFEG 231 (305) Q Consensus 157 ~~~~~~~~~~~~iT~anv~~~l~~~~~~iP~~~r~~~~l~~f~S~~~~d~Y~d~~~~~~~k~~---~--~t~~~~~~~kG 231 (305) ... ..++.+|...+++.+..| -+. ....-.++||+.++-..+.....++-+.. + ..++-...|.| T Consensus 133 ~~~----~~~~~~~~d~i~dA~~~l----gd~--~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~~~~~G~ig~~~G 202 (275) T protein:vir:96 133 TLK----VEADITKLAGLQTAIDKF----NDE--DLEPMVLFVNPLDAGKLRASATDNFTRATLLGDNVIVKGAFGEALG 202 (275) T ss_pred ccc----ccccccCHHHHHHHHHHh----ccc--cCCccEEEeCHHHHHHHHhcccccccccccccccceeccccceecC Confidence 111 123345555555544443 222 12234899999988766443222222111 1 12344568999 Q ss_pred eeeeeccCCCCCeEEEecchHHhhhhhhhhhhhhccccceeee---ccceeEEEEEEeecceeeccCCeEEE--ecCCC Q lcl|NC_018271. 232 YTLTEIKGLPASRMVGYNRDNIVIGMSAQSDFNEIRIKDMGDV---DLSGQIRTKMVLSAGVEYAYGAEIVL--YTPAA 305 (305) Q Consensus 232 i~iv~l~~~Pd~~ii~T~~sNl~~gvnl~~D~n~I~I~~~~~~---~~~~~~f~k~~m~~d~~i~fg~E~v~--~~~~~ 305 (305) ++|+.-..+|.+.+++..+.-+-++. ..+.+ ++-+|-.+- .+.+++. |++-+.-++-+|. -+||. T Consensus 203 ~~Vi~s~~~p~~t~~i~~~gA~~~~~--~~~~~-vE~~Rd~~~~~d~i~~~~~------y~~~~~~~~~vv~~t~~~~~ 272 (275) T protein:vir:96 203 AIIVRSNKIKEGEAILAKRGAVKLIT--KRDFF-LETERHASHKSTALFSDKH------YVAYLYDESKVVKITKSASG 272 (275) T ss_pred eeEEEeCCCCcceEEEEeccceeeee--cCCcc-cccccchhhcCcEEEEeEE------EEEEEEcCccEEEEEecccc Confidence 99999999999988887776554433 23332 333322211 1222222 2445555555544 56766 Done!