Query         lcl|NC_015254.1_cdsid_YP_004301541.1 [gene=BrPhBL3_gp09] [protein=gp9] [protein_id=YP_004301541.1] [location=6506..7546]
Match_columns 346
No_of_seqs    145 out of 183
Neff          6.9
Searched_HMMs 1612
Date          Thu Nov 7 13:25:41 2013
Command       /home/guerois/workspace/virfam/python/lib/hhsearch//hhsearch2 -i .//seq/seq_9 -d /home/guerois/workspace/virfam/python/profile_database/capsid_neck_tail.hhm -glob -cpu 7 -o .//seq/HHR/seq_9_vs_rec_db.hhr

 No Hit                             Prob E-value P-value Score   SS Cols Query HMM Template HMM
  1 protein:vir:102944 Length: 330 100.0  2E-117  1E-120 660.5 30.5  321 13-333    1-330 (330)
  2 protein:vir:5974 Length: 324 # 100.0  6E-117  4E-120 657.7 31.9  316 15-333    1-324 (324)
  3 protein:vir:80446 Length: 367  100.0  1E-116  8E-120 656.0 26.0  321 11-331    1-367 (367)
  4 protein:vir:1583 Length: 351 # 100.0  2E-114  1E-117 644.2 31.6  329 15-346    1-346 (351)
  5 protein:vir:94989 Length: 349  100.0  3E-112  2E-115 632.1 29.3  319 15-333    1-349 (349)
  6 protein:vir:78387 Length: 349  100.0  6E-112  4E-115 630.2 29.0  315 15-333    1-349 (349)
  7 protein:vir:95131 Length: 325  100.0   2E-78 1.2E-81 446.6 24.0  304 18-330    1-325 (325)
  8 protein:vir:95107 Length: 270  100.0 3.1E-69 1.9E-72 396.2 22.8  265 15-307    1-270 (270)
  9 protein:vir:96792 Length: 315  100.0 5.4E-64 3.3E-67 367.4 25.4  293 15-334    1-315 (315)
 10 protein:vir:105334 Length: 276 100.0   1E-63 6.4E-67 365.9 23.8  265 13-305    1-276 (276)
 11 protein:vir:95898 Length: 274  100.0 1.1E-62 6.8E-66 360.3 22.6  267 13-317    1-274 (274)
 12 protein:vir:96262 Length: 274  100.0 1.1E-62 6.8E-66 360.3 22.6  267 13-317    1-274 (274)
 13 protein:vir:1239 Length: 274 # 100.0   4E-62 2.5E-65 357.2 24.7  267 13-317    1-274 (274)
 14 protein:vir:3613 Length: 272 # 100.0 9.8E-62 6.1E-65 355.1 23.6  262 13-300    1-272 (272)
 15 protein:vir:96833 Length: 275  100.0   9E-61 5.6E-64 349.8 23.6  264 11-302    1-275 (275)
 16 protein:vir:94494 Length: 274  100.0 6.5E-60   4E-63 345.1 23.1  267 13-309    1-274 (274)
 17 protein:vir:97433 Length: 274  100.0 6.5E-60   4E-63 345.1 23.1  267 13-309    1-274 (274)
 18 protein:vir:96123 Length: 274  100.0 1.8E-57 1.1E-60 331.7 24.2  263 13-307    1-274 (274)
 19 protein:vir:93742 Length: 274  100.0   1E-55 6.5E-59 322.0 24.7  263 13-309    1-274 (274)
 20 protein:vir:80930 Length: 278  100.0 1.3E-54 7.9E-58 316.1 22.0  266 13-300    1-278 (278)
 21 protein:vir:3033 Length: 272 # 100.0 4.6E-48 2.8E-51 280.1 24.7  261 13-301    1-272 (272)
 22 protein:vir:9820 Length: 272 # 100.0 4.6E-48 2.8E-51 280.1 24.7  261 13-301    1-272 (272)
 23 protein:vir:739 Length: 231 #  100.0 1.5E-48 9.2E-52 282.8 19.6  224 57-300    1-231 (231)
 24 protein:vir:7990 Length: 273 #  99.9 3.7E-24 2.3E-27 149.0 21.9  259 20-301    1-273 (273)
 25 protein:vir:9927 Length: 295 #  99.9 1.7E-24 1.1E-27 150.9 13.5  265 12-338    1-295 (295)
 26 protein:vir:105822 Length: 273  99.9 9.6E-23   6E-26 141.3 21.8  259 20-301    1-273 (273)
 27 protein:vir:102605 Length: 273  99.9 9.6E-23   6E-26 141.3 21.8  259 20-301    1-273 (273)
 28 protein:vir:9875 Length: 296 #  99.7   3E-20 1.8E-23 127.6 14.6  263 1-304     1-296 (296)
 29 protein:vir:94622 Length: 341   99.7   9E-20 5.6E-23 125.0 17.1  311 7-334     1-341 (341)
 30 protein:vir:106647 Length: 303  99.7 4.1E-19 2.5E-22 121.4 15.3  261 17-303    1-303 (303)
 31 protein:vir:80180 Length: 381   99.7 2.2E-18 1.4E-21 117.3 19.2  323 1-345     1-381 (381)
 32 protein:vir:99075 Length: 392   99.6 7.7E-17 4.8E-20 108.9 18.5  306 20-346    1-325 (392)
 33 protein:vir:99749 Length: 324   99.6 1.9E-16 1.2E-19 106.8 18.7  287 1-307     1-324 (324)
 34 protein:vir:9309 Length: 324 #  99.6 2.1E-16 1.3E-19 106.6 17.9  287 1-307     1-324 (324)
 35 protein:vir:103955 Length: 324  99.6 2.5E-16 1.6E-19 106.1 17.6  287 1-307     1-324 (324)
 36 protein:vir:96392 Length: 324   99.5 2.1E-15 1.3E-18 101.1 18.1  283 1-305     1-324 (324)
 37 protein:vir:78830 Length: 324   99.5 2.1E-15 1.3E-18 101.1 18.1  283 1-305     1-324 (324)
 38 protein:vir:96223 Length: 324   99.5 2.3E-15 1.4E-18 100.8 18.0  289 1-307     1-324 (324)
 39 protein:vir:97148 Length: 324   99.5   5E-15 3.1E-18  99.0 18.4  287 1-307     1-324 (324)
 40 protein:vir:95763 Length: 297   99.5 4.1E-14 2.5E-17  94.0 20.9  286 7-331     1-297 (297)
 41 protein:vir:108211 Length: 318  99.4 9.4E-15 5.8E-18  97.5 16.0  276 1-302     1-318 (318)
 42 protein:vir:7771 Length: 330 #  99.4 3.2E-13   2E-16  89.1 20.1  286 7-305     1-330 (330)
 43 protein:vir:41 Length: 299 # N  99.4   6E-13 3.7E-16  87.6 21.1  273 10-314    1-299 (299)
 44 protein:vir:107120 Length: 329  99.3 5.8E-13 3.6E-16  87.7 20.3  292 1-317     12-329 (329)
 45 protein:vir:80213 Length: 334   99.3 9.8E-14 6.1E-17  91.9 15.7  286 1-302     1-334 (334)
 46 protein:vir:94142 Length: 304   99.3 4.9E-13 3.1E-16  88.1 18.8  268 7-304     1-304 (304)
 47 protein:vir:105905 Length: 304  99.3 4.9E-13 3.1E-16  88.1 18.8  268 7-304     1-304 (304)
 48 protein:vir:4856 Length: 293 #  99.3 3.4E-12 2.1E-15  83.5 22.4  280 8-342     1-293 (293)
 49 protein:vir:94576 Length: 347   99.3 4.6E-13 2.8E-16  88.2 16.9  289 1-304     4-347 (347)
 50 protein:vir:1541 Length: 347 #  99.3 9.4E-13 5.8E-16  86.5 18.3  286 1-306     1-347 (347)
 51 protein:vir:81100 Length: 415   99.3 1.8E-12 1.1E-15  85.0 19.7  297 1-341     110-415 (415)
 52 protein:vir:98339 Length: 415   99.3 1.8E-12 1.1E-15  85.0 19.7  297 1-341     110-415 (415)
 53 protein:vir:79987 Length: 415   99.3 1.8E-12 1.1E-15  85.0 19.7  297 1-341     110-415 (415)
 54 protein:vir:94711 Length: 347   99.3 2.7E-13 1.7E-16  89.5 15.1  285 1-306     3-347 (347)
 55 protein:vir:4339 Length: 395 #  99.3 1.7E-12   1E-15  85.1 19.0  276 1-303     101-395 (395)
 56 protein:vir:3364 Length: 347 #  99.3 1.1E-12 6.9E-16  86.1 17.5  286 1-306     1-347 (347)
 57 protein:vir:78739 Length: 332   99.3 2.1E-13 1.3E-16  90.1 13.3  287 1-310     1-332 (332)
 58 protein:vir:100135 Length: 418  99.2 2.4E-12 1.5E-15  84.3 19.1  276 1-299     116-418 (418)
 59 protein:vir:97331 Length: 319   99.2 2.9E-12 1.8E-15  83.8 19.5  289 1-315     1-319 (319)
 60 protein:vir:94800 Length: 319   99.2 2.9E-12 1.8E-15  83.8 19.5  289 1-315     1-319 (319)
 61 protein:vir:4600 Length: 415 #  99.2 2.8E-12 1.7E-15  83.9 19.3  298 1-341     101-415 (415)
 62 protein:vir:4700 Length: 415 #  99.2 2.8E-12 1.7E-15  83.9 19.3  298 1-341     101-415 (415)
 63 protein:vir:9410 Length: 415 #  99.2 3.1E-12 1.9E-15  83.7 19.4  300 1-341     97-415 (415)
 64 protein:vir:1886 Length: 385 #  99.2 2.5E-12 1.6E-15  84.2 18.8  275 1-301     90-385 (385)
 65 protein:vir:191 Length: 385 #   99.2 2.5E-12 1.6E-15  84.2 18.8  275 1-301     90-385 (385)
 66 protein:vir:2344 Length: 397 #  99.2 4.6E-12 2.9E-15  82.7 20.0  312 1-346     3-355 (397)
 67 protein:vir:97053 Length: 390   99.2 1.9E-12 1.2E-15  84.9 17.5  272 1-298     102-390 (390)
 68 protein:vir:81070 Length: 390   99.2 2.1E-12 1.3E-15  84.6 17.8  272 1-298     95-390 (390)
 69 protein:vir:104256 Length: 458  99.2 3.6E-12 2.3E-15  83.3 18.6  285 1-301     144-458 (458)
 70 protein:vir:485 Length: 407 #   99.2 3.9E-12 2.4E-15  83.1 18.4  291 1-305     90-407 (407)
 71 protein:vir:9759 Length: 303 #  99.2   6E-12 3.7E-15  82.1 19.0  271 13-297    1-303 (303)
 72 protein:vir:8102 Length: 543 #  99.2 6.6E-12 4.1E-15  81.9 18.7  283 1-331     237-543 (543)
 73 protein:vir:1638 Length: 298 #  99.2 1.3E-11 8.3E-15  80.2 20.0  268 16-297    1-298 (298)
 74 protein:vir:78223 Length: 333   99.2 1.3E-11   8E-15  80.3 19.2  282 1-302     4-333 (333)
 75 protein:vir:94771 Length: 298   99.2 1.7E-11   1E-14  79.6 19.7  265 16-303    1-298 (298)
 76 protein:vir:3136 Length: 322 #  99.2   5E-12 3.1E-15  82.5 16.6  278 11-305    1-322 (322)
 77 protein:vir:100247 Length: 425  99.2   7E-12 4.3E-15  81.8 17.3  285 1-300     102-425 (425)
 78 protein:vir:4953 Length: 397 #  99.2 1.8E-11 1.1E-14  79.5 19.5  284 1-309     94-397 (397)
 79 protein:vir:8885 Length: 347 #  99.2 2.4E-12 1.5E-15  84.3 14.6  289 1-304     1-347 (347)
 80 protein:vir:80684 Length: 315   99.1   2E-11 1.2E-14  79.3 19.4  291 13-345    1-315 (315)
 81 protein:vir:108303 Length: 418  99.1 5.4E-11 3.3E-14  76.9 21.8  303 16-346    1-377 (418)
 82 protein:vir:1328 Length: 392 #  99.1 1.3E-11 7.9E-15  80.3 18.2  279 1-299     93-392 (392)
 83 protein:vir:2430 Length: 318 #  99.1 2.7E-11 1.7E-14  78.5 20.0  287 1-338     1-318 (318)
 84 protein:vir:10364 Length: 390   99.1 1.3E-11 8.2E-15  80.2 18.1  272 1-298     96-390 (390)
 85 protein:vir:4830 Length: 397 #  99.1   2E-11 1.3E-14  79.2 19.1  281 1-309     98-397 (397)
 86 protein:vir:102655 Length: 322  99.1 1.7E-11   1E-14  79.7 18.6  289 1-333     3-322 (322)
 87 protein:vir:2201 Length: 345 #  99.1 4.8E-12   3E-15  82.6 14.9  285 1-332     1-345 (345)
 88 protein:vir:9574 Length: 300 #  99.1 3.1E-11 1.9E-14  78.2 18.8  270 13-299    1-300 (300)
 89 protein:vir:4456 Length: 401 #  99.1 1.3E-11 8.3E-15  80.2 16.5  286 1-298     91-401 (401)
 90 protein:vir:78523 Length: 338   99.1 5.5E-11 3.4E-14  76.8 19.7  288 1-304     4-338 (338)
 91 protein:vir:5739 Length: 366 #  99.1   2E-11 1.3E-14  79.2 17.2  284 1-313     52-366 (366)
 92 protein:vir:104085 Length: 320  99.1 2.8E-11 1.8E-14  78.4 17.9  292 1-336     1-320 (320)
 93 protein:vir:4226 Length: 326 #  99.1 5.6E-11 3.5E-14  76.8 19.2  283 1-301     3-326 (326)
 94 protein:vir:2504 Length: 305 #  99.1 6.7E-11 4.1E-14  76.4 19.6  274 13-304    1-305 (305)
 95 protein:vir:6212 Length: 434 #  99.1 7.2E-11 4.5E-14  76.2 19.6  293 1-335     114-434 (434)
 96 protein:vir:6242 Length: 390 #  99.1 2.3E-11 1.4E-14  78.9 16.8  279 1-331     97-390 (390)
 97 protein:vir:8187 Length: 311 #  99.1 1.1E-10 6.6E-14  75.2 19.7  273 15-301    1-311 (311)
 98 protein:vir:1383 Length: 421 #  99.0 1.5E-10   9E-14  74.5 20.2  293 1-346     105-421 (421)
 99 protein:vir:10450 Length: 344   99.0 1.2E-11 7.5E-15  80.4 14.1  283 6-330     1-344 (344)
100 protein:vir:4997 Length: 397 #  99.0 2.4E-10 1.5E-13  73.3 20.5  284 1-309     94-397 (397)
101 protein:vir:78935 Length: 335   99.0 3.4E-11 2.1E-14  78.0 15.5  287 1-306     1-335 (335)
102 protein:vir:4511 Length: 409 #  99.0 1.1E-10 6.8E-14  75.2 17.8  281 1-299     99-409 (409)
103 protein:vir:94673 Length: 419   99.0 1.4E-10 8.9E-14  74.6 18.1  281 1-303     102-419 (419)
104 protein:vir:6324 Length: 335 #  99.0   1E-10 6.3E-14  75.4 16.7  284 1-304     1-335 (335)
105 protein:vir:99675 Length: 324   99.0 5.4E-11 3.3E-14  76.9 14.9  262 49-333    1-324 (324)
106 protein:vir:81160 Length: 371   98.9 2.8E-10 1.7E-13  73.0 17.8  269 1-298     79-371 (371)
107 protein:vir:8420 Length: 477 #  98.9 8.7E-11 5.4E-14  75.7 14.6  293 1-305     140-477 (477)
108 protein:vir:3991 Length: 404 #  98.9 7.1E-10 4.4E-13  70.7 19.4  285 1-346     101-404 (404)
109 protein:vir:7409 Length: 408 #  98.9   1E-09 6.3E-13  69.9 20.2  291 1-345     97-408 (408)
110 protein:vir:3845 Length: 395 #  98.9 7.4E-10 4.6E-13  70.6 18.5  285 1-346     93-395 (395)
111 protein:vir:3870 Length: 400 #  98.9 5.6E-10 3.5E-13  71.3 17.3  269 1-304     112-400 (400)
112 protein:vir:101607 Length: 379  98.9   8E-10   5E-13  70.5 17.6  271 1-300     91-379 (379)
113 protein:vir:1433 Length: 435 #  98.8 6.6E-10 4.1E-13  70.9 16.2  287 1-315     101-435 (435)
114 protein:vir:1025 Length: 408 #  98.8 4.5E-09 2.8E-12  66.4 20.6  280 1-307     100-408 (408)
115 protein:vir:99920 Length: 311   98.8 2.5E-09 1.5E-12  67.8 19.1  271 13-297    1-311 (311)
116 protein:vir:102119 Length: 404  98.8 2.3E-09 1.4E-12  67.9 18.3  288 1-303     92-404 (404)
117 protein:vir:96762 Length: 632   98.8 6.8E-10 4.2E-13  70.8 15.0  274 1-297     330-632 (632)
118 protein:vir:80376 Length: 435   98.8 1.6E-09 9.8E-13  68.8 16.8  277 1-295     105-435 (435)
119 protein:vir:105038 Length: 428  98.8 1.6E-09   1E-12  68.8 16.6  291 1-313     108-428 (428)
120 protein:vir:174 Length: 423 #   98.8 6.8E-09 4.2E-12  65.4 19.7  312 20-346    1-373 (423)
121 protein:vir:100884 Length: 389  98.8 7.3E-09 4.5E-12  65.2 19.7  275 1-304     83-389 (389)
122 protein:vir:4092 Length: 390 #  98.8 1.1E-09 6.7E-13  69.7 15.1  284 1-306     68-390 (390)
123 protein:vir:95376 Length: 425   98.7 2.3E-09 1.4E-12  68.0 16.4  278 1-303     110-425 (425)
124 protein:vir:100057 Length: 375  98.7 8.7E-09 5.4E-12  64.8 18.8  289 1-307     5-375 (375)
125 protein:vir:103323 Length: 364  98.7 2.6E-08 1.6E-11  62.2 21.3  297 1-323     1-364 (364)
126 protein:vir:100172 Length: 394  98.7 1.9E-08 1.2E-11  62.9 19.8  284 1-345     96-394 (394)
127 protein:vir:9704 Length: 394 #  98.7 1.1E-08 6.6E-12  64.3 18.4  267 1-307     118-394 (394)
128 protein:vir:105374 Length: 423  98.7 1.6E-08 9.6E-12  63.4 18.7  312 20-346    1-373 (423)
129 protein:vir:81227 Length: 413   98.6 2.9E-08 1.8E-11  61.9 18.5  277 1-302     101-413 (413)
130 protein:vir:1268 Length: 397 #  98.6 3.2E-08   2E-11  61.6 18.7  271 1-300     103-397 (397)
131 protein:vir:93616 Length: 645   98.6 1.9E-08 1.2E-11  63.0 17.3  287 1-299     320-645 (645)
132 protein:vir:1084 Length: 437 #  98.6 1.3E-08 8.1E-12  63.8 16.5  277 1-305     132-437 (437)
133 protein:vir:4197 Length: 314 #  98.5 8.1E-08   5E-11  59.5 19.7  279 1-299     4-314 (314)
134 protein:vir:105522 Length: 423  98.5 6.7E-08 4.2E-11  59.9 19.2  299 20-346    1-373 (423)
135 protein:vir:93881 Length: 387   98.5 1.6E-08 9.9E-12  63.3 15.0  274 1-303     103-387 (387)
136 protein:vir:3525 Length: 423 #  98.5 2.9E-07 1.8E-10  56.4 20.7  314 13-346    1-373 (423)
137 protein:vir:97031 Length: 402   98.4   6E-08 3.7E-11  60.2 16.7  317 1-346     1-377 (402)
138 protein:vir:962 Length: 397 #   98.4 6.2E-08 3.9E-11  60.1 16.6  266 1-298     120-397 (397)
139 protein:vir:94424 Length: 387   98.4   2E-08 1.2E-11  62.8 13.7  274 1-303     99-387 (387)
140 protein:vir:2685 Length: 387 #  98.4   2E-08 1.2E-11  62.8 13.7  274 1-303     99-387 (387)
141 protein:vir:96978 Length: 387   98.4   2E-08 1.2E-11  62.8 13.7  274 1-303     99-387 (387)
142 protein:vir:9361 Length: 402 #  98.4 5.4E-08 3.3E-11  60.4 15.3  274 1-303     114-402 (402)
143 protein:vir:102873 Length: 392  98.3 2.7E-07 1.7E-10  56.6 18.0  284 1-344     85-392 (392)
144 protein:vir:102082 Length: 392  98.3 2.7E-07 1.7E-10  56.6 18.0  284 1-344     85-392 (392)
145 protein:vir:107593 Length: 392  98.3 2.7E-07 1.7E-10  56.6 18.0  284 1-344     85-392 (392)
146 protein:vir:105004 Length: 392  98.3 2.7E-07 1.7E-10  56.6 18.0  284 1-344     85-392 (392)
147 protein:vir:4159 Length: 315 #  98.2 6.4E-07   4E-10  54.5 17.8  283 1-302     1-315 (315)
148 protein:vir:78640 Length: 352   98.2 3.8E-07 2.4E-10  55.8 15.8  275 1-303     64-352 (352)
149 protein:vir:105645 Length: 400  98.2 3.1E-07 1.9E-10  56.3 14.6  317 1-346     1-375 (400)
150 protein:vir:3158 Length: 321 #  98.0 2.7E-06 1.7E-09  51.1 17.6  283 1-312     1-321 (321)
151 protein:vir:93696 Length: 364   98.0 4.3E-07 2.7E-10  55.5 13.3  288 7-309     1-364 (364)
152 protein:vir:80128 Length: 466   98.0   1E-06 6.3E-10  53.4 14.4  296 1-346     123-463 (466)
153 protein:vir:79008 Length: 299   97.9 1.1E-05 6.6E-09  47.8 19.9  267 20-299    1-299 (299)
154 protein:vir:7019 Length: 401 #  97.9 1.8E-06 1.1E-09  52.1 13.9  322 1-346     1-375 (401)
155 protein:vir:7855 Length: 497 #  97.7 2.5E-05 1.5E-08  45.8 20.4  286 1-301     114-497 (497)
156 protein:vir:101650 Length: 497  97.7 2.5E-05 1.5E-08  45.8 20.4  286 1-301     114-497 (497)
157 protein:vir:1781 Length: 221 #  97.7 5.5E-06 3.4E-09  49.4 14.4  192 101-305   1-221 (221)
158 protein:vir:79928 Length: 393   97.6   4E-05 2.5E-08  44.7 18.0  289 1-312     59-393 (393)
159 protein:vir:95963 Length: 395   97.4 4.1E-05 2.5E-08  44.7 15.8  294 1-346     71-392 (395)
160 protein:vir:105610 Length: 430  97.4   7E-05 4.3E-08  43.4 16.8  296 11-320    1-430 (430)
161 protein:vir:102335 Length: 312  97.2 0.00014 8.7E-08  41.7 18.6  265 20-299    1-312 (312)
162 protein:vir:95875 Length: 401   97.1 4.7E-05 2.9E-08  44.3 13.0  296 9-333     1-401 (401)
163 protein:vir:2770 Length: 318 #  97.1 0.00015 9.6E-08  41.5 15.3  241 9-282     1-318 (318)
164 protein:vir:101291 Length: 381  97.0 0.00021 1.3E-07  40.7 16.1  291 1-343     61-381 (381)
165 protein:vir:9509 Length: 381 #  97.0 0.00021 1.3E-07  40.7 16.1  291 1-343     61-381 (381)
166 protein:vir:104439 Length: 404  96.9 9.7E-05   6E-08  42.6 12.6  294 1-307     1-404 (404)
167 protein:vir:819 Length: 404 #   96.9 9.7E-05   6E-08  42.6 12.6  294 1-307     1-404 (404)
168 protein:vir:3298 Length: 404 #  96.9 9.7E-05   6E-08  42.6 12.6  294 1-307     1-404 (404)
169 protein:vir:10123 Length: 404   96.9 9.7E-05   6E-08  42.6 12.6  294 1-307     1-404 (404)
170 protein:vir:78920 Length: 290   96.5 0.00058 3.6E-07  38.3 19.1  260 7-297     1-290 (290)
171 protein:vir:9643 Length: 377 #  96.3 0.00054 3.4E-07  38.5 13.4  276 1-298     63-377 (377)
172 protein:vir:100632 Length: 381  95.7  0.0015 9.5E-07  36.0 14.6  288 1-311     56-381 (381)
173 protein:vir:78350 Length: 383   93.9  0.0059 3.6E-06  32.8 13.9  286 1-344     68-383 (383)
174 protein:vir:98635 Length: 377   93.6  0.0057 3.5E-06  32.9 11.1  273 1-299     63-377 (377)
175 protein:vir:105464 Length: 346  93.2  0.0086 5.3E-06  31.9 20.4  309 7-346     1-346 (346)
176 protein:vir:97255 Length: 310   92.6   0.011 6.6E-06  31.4 19.0  275 1-312     1-310 (310)
177 protein:vir:78090 Length: 302   91.6   0.015 9.4E-06  30.6 17.4  269 20-308    1-302 (302)
178 protein:vir:79712 Length: 285   89.9   0.024 1.5E-05  29.5 17.2  267 20-333    1-285 (285)
179 protein:vir:95451 Length: 313   85.5   0.053 3.3E-05  27.6 12.7  272 9-313     1-313 (313)
180 protein:vir:106590 Length: 349  72.1    0.19 0.00012  24.6 14.2  297 1-328     2-349 (349)
181 protein:vir:80068 Length: 301   69.1    0.23 0.00014  24.1 11.9  261 9-290     1-301 (301)
182 protein:vir:8324 Length: 410 #  68.2    0.24 0.00015  24.0 13.0  271 1-306     113-410 (410)
183 protein:vir:4074 Length: 480 #  67.0    0.26 0.00016  23.8 10.1  275 1-302     153-480 (480)
184 protein:vir:98871 Length: 314   64.2     0.3 0.00019  23.4 13.5  278 1-335     1-314 (314)
185 protein:vir:97397 Length: 517   63.7    0.31 0.00019  23.4 14.4  274 1-299     220-517 (517)
186 protein:vir:94933 Length: 330   63.0    0.32  0.0002  23.3 20.2  280 1-299     1-330 (330)
187 protein:vir:99523 Length: 311   61.7    0.35 0.00022  23.1 19.5  271 9-297     1-311 (311)
188 protein:vir:96490 Length: 348   58.7    0.41 0.00025  22.7 15.9  298 17-331    1-348 (348)
189 protein:vir:103886 Length: 302  58.7    0.41 0.00025  22.7 17.5  271 22-305    1-302 (302)
190 protein:vir:107687 Length: 319  48.8    0.66 0.00041  21.6 11.8  272 1-290     1-319 (319)
191 protein:vir:79548 Length: 652   44.5     0.8  0.0005  21.1 15.0  279 1-297     351-652 (652)
192 protein:vir:104342 Length: 314  34.3     1.3 0.00081  20.0  9.2  275 5-293     1-314 (314)
193 protein:vir:95512 Length: 693   27.6     1.8  0.0011  19.1 14.8  277 1-304     386-693 (693)
194 protein:vir:2736 Length: 348 #  25.2     2.1  0.0013  18.8 11.7  297 17-331    1-348 (348)
195 protein:vir:4902 Length: 348 #  25.2     2.1  0.0013  18.8 15.7  295 17-331    1-348 (348)
196 protein:vir:99424 Length: 360   21.7     2.6  0.0016  18.3 17.0  281 1-336     11-360 (360)

No 1
>protein:vir:102944 Length: 330 # NCBI annotation: major head protein # Family: family:all:1522 # MgeID: mge:1461 # MgeName: EJ-1 # Cross-refs: genbank:acc:NP_945286;genbank:gi:39653721;uniprot:Q708M6;genbank:GeneID:2672858
Probab=100.00 E-value=1.9e-117 Score=660.54 Aligned_cols=321 Identities=41% Similarity=0.726 Sum_probs=301.4

Q ss_pred
cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|+.+|+|+|||+||||++||+++.+++++|+||||++++++|++++++||++++||||++|+|++|+++||+++|++++ T Consensus 1 Ma~~~T~l~d~i~pevf~~yv~~~~~~~~~l~qSG~i~~~~~i~~~~~~~G~~i~~P~~~~l~G~~~~~~dg~~~i~~~k 80 (330) T protein:vir:10 1 MANELTKILDTITPQQYNAYMQQYTAAKSAFVQSGIAVSDERVSKNITSGGLLVNMPFWNDLTGDSEVLGNGDKALETGK 80 (330) T ss_pred CCCCceEeeeeechhHHHHHHHHHhHHhhhhhhcccccccHHHHHHhhcCCCEEEecccccCCCcccccCCCccccchhh Confidence 37889999999999999999999999999999999999999999999899999999999999999999999877899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcc----eeeecccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNK----LDVSTETG 168 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~----~dis~~~~ 168 (346) |+++++++++++|+|||+++||+.+++|+|||++|++||++||+|++|++||++|+|+|++..++... .+.+..++ T Consensus 81 i~t~~~~a~i~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~w~~~~q~~lla~l~gvf~~~~~~~~~~~~~~~~~~~~~ 160 (330) T protein:vir:10 81 ITAGADIACVLYRGRGWAANELTGVVAGSDPVRAILNRIGAYWLREDQKALIATLNGIFATGTAGEKGALEETHVSDQSK 160 (330) T ss_pred cccceeEEEEEeecceeeehhhhhhhcchhHHHHHHHHHHHHhhhhHHHHHHHHHHhhhhhhhcccchhhhhhheecccc Confidence 99999999999999999999999999999999999999999999999999999999999987654431 22344456 Q ss_pred ccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC-ceeeEEeceEEEEeCCCccCCCceEEEE Q lcl|NC_015254. 
169 DDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-KKFPTYMGKRVIVDDGLPAKDGVYTSYI 247 (346) Q Consensus 169 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-~~i~~~~G~~VVvdD~~p~~~g~ytt~l 247 (346) +.+.|++++|++|+++|||+.+++++++|||++|++|++++|+++++++++ +.|++|+|+||||||+||+..++|++|+ T Consensus 161 ~~a~~s~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~i~~~~G~~VivdD~~p~~~~~yt~yl 240 (330) T protein:vir:10 161 ASTGIDAGMVLDAKQLLGDSADQVTAIAMHSAVYTKLQKDNLIQYIQPTTATINIPTYLGYRVIIDDGIAPTGDIYTSYL 240 (330) T ss_pred cccccCHHHHHHHHHHhccccccceEEEEcHHHHHHHHHhhhhhhhcccccCcccccccceEEEEeCCCCCCCCceeEEE Confidence 778899999999999999999999999999999999999999999999986 5799999999999999999999999999 Q ss_pred EcCCeeEEeecCC--ccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccc--cCCCCChHHhcCCcCceeeecccccc Q lcl|NC_015254. 248 FGEGAFGLGNGEA--PVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSV--AGHSPTNTEIEKGNNWKAVYESKNIR 323 (346) Q Consensus 248 ~~~GAi~~~~~~~--~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~--~~~sPt~a~L~~~~NW~~v~~~K~i~ 323 (346) |++|||+|.++++ .+++|++|++++|+|+|++|+||++||+||||+++.+ ++.|||++||++++||+||||||+|| T Consensus 241 ~~~GAi~~~~~~~~~~v~~EtdRd~~~g~~~l~~r~~~~~hp~G~s~~~~~~~~~~~sPt~~~L~~~~NW~~v~~~k~i~ 320 (330) T protein:vir:10 241 FRTGSIGLNTGNPSGLTTFETSREAAKGNDMIYTRRALVMHPYGVKWTGAEVDAGNITPSNADLAKFKNWKRVYEPKNIG 320 (330) T ss_pred EecCceeeecccCCccccccccCCccccceEEEEeeEEEeeeeeeeecccccccCcCCcChHHhcCCcCcccccChhhcc Confidence 9999999998765 4799999999999999999999999999999987753 67899999999999999999999999 Q ss_pred eEEEEEeccc Q lcl|NC_015254. 
324 IVAFVHKNGV 333 (346) Q Consensus 324 iv~~~~k~~~ 333 (346) ||+|||||+- T Consensus 321 iv~~~~~~~~ 330 (330) T protein:vir:10 321 IIALKHKIGK 330 (330) T ss_pred eEEEEEecCC Confidence 9999999998 No 2 >protein:vir:5974 Length: 324 # NCBI annotation: hypothetical protein # Family: family:all:1522 # MgeID: mge:125 # MgeName: SPP1 # Cross-refs: genbank:acc:NP_690674;genbank:geneid:6329212;genbank:gi:22855068;goa:Q38582;uniprot:Q38582;genbank:GeneID:955303 Probab=100.00 E-value=6.1e-117 Score=657.73 Aligned_cols=316 Identities=50% Similarity=0.884 Sum_probs=302.2 Q ss_pred CceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhh--CCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 15 GKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAK--SGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 15 ~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~--~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =++|+|+|||+||||++||+++++++++|+|||+++++++++.+++ .||+++++|||++|+|++++++++ ++|++++ T Consensus 1 MA~T~lsd~i~peVf~~yv~~~~~~~~~l~qSg~i~~~a~i~~~l~~~~~G~~i~~P~~~~l~Gd~~~v~~~-~~i~~~~ 79 (324) T protein:vir:59 1 MAYTKISDVIVPELFNPYVINTTTQLSAFFQSGIAATDDELNALAKKAGGGSTLNMPYWNDLDGDSQVLNDT-DDLVPQK 79 (324) T ss_pred CCceeeeceechhHHHHHHHhhhHHHHHHhhcccccccHHHHHHhhccCCCCEEEecccccCCCcccccCCC-cccchhh Confidence 2379999999999999999999999999999999999999999874 489999999999999999999887 5899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) |++++++++|++|+|||+++|++++++|+|||++|++|+++||++++|++||++|+|+|+++.++++++|+++.+ +.. 
T Consensus 80 l~t~~~~a~i~~~~k~~~~tD~a~~~sg~dp~~~i~~q~a~~~~~~~~~~lia~l~g~~~~~~~~~~~~dvsa~~--~~~ 157 (324) T protein:vir:59 80 INAGQDKAVLILRGNAWSSHDLAATLSGSDPMQAIGSRVAAYWAREMQKIVFAELAGVFSNDDMKDNKLDISGTA--DGI 157 (324) T ss_pred cccceeeEEEEeecCceeehhhhhhhccchHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccceeeeeccc--cce Confidence 999999999999999999999999999999999999999999999999999999999999999999999998753 467 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC-ceeeEEeceEEEEeCCCccC-----CCceEEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-KKFPTYMGKRVIVDDGLPAK-----DGVYTSY 246 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-~~i~~~~G~~VVvdD~~p~~-----~g~ytt~ 246 (346) |++++|++|+++|||+.+++++++|||++|++|++++++++++++++ +.|++|+|+||||||+||+. .++|++| T Consensus 158 ~s~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~i~~~~G~~VivdD~~p~~~~~~~~~~y~s~ 237 (324) T protein:vir:59 158 YSAETFVDASYKLGDHESLLTAIGMHSATMASAVKQDLIEFVKDSQSGIRFPTYMNKRVIVDDSMPVETLEDGTKVFTSY 237 (324) T ss_pred ecHHHHHHHHHHhCCcccCcEEEEEchHHHHHHHHhhhhhhccccccCceeeeecccEEEEeCCCCccccCCCCceEEEE Confidence 99999999999999999999999999999999999999999999985 58999999999999999974 3589999 Q ss_pred EEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCceeeecccccceEE Q lcl|NC_015254. 
247 IFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVA 326 (346) Q Consensus 247 l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~ 326 (346) +|++|||+|.++++++++|++|++++|+|+|++|+||++||+||||+++.+++.|||++||++++||+||||||+||||+ T Consensus 238 l~~~GAi~~~~~~~~v~vE~dRd~~~g~~~l~~r~~~~~~p~G~s~~~~~~~~~sPt~~~L~~~~NW~~v~~~k~i~i~~ 317 (324) T protein:vir:59 238 LFGAGALGYAEGQPEVPTETARNALGSQDILINRKHFVLHPRGVKFTENAMAGTTPTDEELANGANWQRVYDPKKIRIVQ 317 (324) T ss_pred EEecCeEEEeecCCCcceecccCccccceEEEEeeEEEeEeeeEEecccccCCCCCChhhhcCCcccccccCccccceEE Confidence 99999999999999999999999999999999999999999999999988899999999999999999999999999999 Q ss_pred EEEeccc Q lcl|NC_015254. 327 FVHKNGV 333 (346) Q Consensus 327 ~~~k~~~ 333 (346) ||||+++ T Consensus 318 ~~~~~~~ 324 (324) T protein:vir:59 318 FKHRLQA 324 (324) T ss_pred EEeeccC Confidence 9999999 No 3 >protein:vir:80446 Length: 367 # NCBI annotation: BcepGomrgp07 # Family: family:all:1522 # MgeID: mge:1882 # MgeName: BcepGomr # Cross-refs: genbank:acc:YP_001210227;genbank:gi:146329919;genbank:GeneID:5123555 Probab=100.00 E-value=1.2e-116 Score=656.04 Aligned_cols=321 Identities=30% Similarity=0.553 Sum_probs=300.0 Q ss_pred eecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCC--cccc Q lcl|NC_015254. 
11 KFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDG--EGAL 88 (346) Q Consensus 11 ~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg--~~~i 88 (346) |=..++.|+|+|||+||||++|+.++.+++++|+||||++++++|+.++++||++++||||++|+|+++++.++ .+++ T Consensus 1 M~~~~~~T~l~Dii~pEvF~~Yv~~~~~e~~~l~qSGiv~~d~~l~~~~~~gG~~v~iPf~~~L~g~~~n~~~d~~~~~~ 80 (367) T protein:vir:80 1 MPDFNNQVRLVDAVIPEVYTSYTAIDRPELTAFFLSGAVASNDFLSQFLSAPGRLINIPFWRDLDSLEPNYGSDNPNVEA 80 (367) T ss_pred CcchhhhhhhhhccchhhhhHHHhhhhhhhhhhhhcceeecCHHHHHHhhcCCCEEEeeeeccCCCCccccCCCCCcccc Confidence 54567789999999999999999999999999999999999999999999999999999999999988876433 3479 Q ss_pred chhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhh------------- Q lcl|NC_015254. 89 TPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGA------------- 155 (346) Q Consensus 89 t~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~------------- 155 (346) +|.+|+++++++++++|+|||+++||+.+++|+|||++|++||++||.|+.|+.||++|+|+|+++. T Consensus 81 t~~kittg~~~a~v~~r~kaw~~~Dla~~lsG~dpm~~Ia~qva~yW~r~~q~~Lla~L~Gvf~~~~a~~~~~~~~~~~~ 160 (367) T protein:vir:80 81 PIDGLGSGEMKTTKTWLNKAYGAMDLTAELAGSNPMTRIRNRFGVYWTRQWQRRIIAMAVGVYKSNLAGNFATIKTRGRV 160 (367) T ss_pred cccccccchheeeeehhcccchhhhHHHHhhCchHHHHHHHHHHHHhhhhhHHHHHHHHHHhhccccccchhhhhhhhcc Confidence 9999999999999999999999999999999999999999999999999999999999999999764 Q ss_pred -------hhhcceeeeccccc-cccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC-ceeeEEe Q lcl|NC_015254. 156 -------LDSNKLDVSTETGD-DSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-KKFPTYM 226 (346) Q Consensus 156 -------~~~~~~dis~~~~~-~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-~~i~~~~ 226 (346) ++.|++|||+.++. 
+..|++++|++|.++|||+.++|++++|||++|++|++++||+|++++++ ..|++|+ T Consensus 161 ~a~~~~~~~~~~~Dis~~t~~~~~~~s~~~~~~A~~~lGD~~~~l~~i~mHS~V~~~L~~~~li~~i~~sd~~~~i~ty~ 240 (367) T protein:vir:80 161 PAEVLGTAGDMVIDISGQTNPADAVFNREAFVDAAFTMGDHVGSIAAIAVHSMVYKRMTNNDEIEFIPDSKGQLTIPTYM 240 (367) T ss_pred ccccccccCceeeeeeccCCCccceecHHHHHHHHHHhccccccccEEEEchHHHHHHHhccccccccCCCCccccceec Confidence 35689999999874 47899999999999999999999999999999999999999999999996 5899999 Q ss_pred ceEEEEeCCCccC----CCceEEEEEcCCeeEEeecCCccceeeeecCC----cceeEEEEeeEEeeeeeeeeecccccc Q lcl|NC_015254. 227 GKRVIVDDGLPAK----DGVYTSYIFGEGAFGLGNGEAPVPTETDREKL----KGNDILINRQHFLLHPRGIAWQEKSVA 298 (346) Q Consensus 227 G~~VVvdD~~p~~----~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~----~g~~~l~~r~~~~~~~~G~s~~~~~~~ 298 (346) |+||||||+||+. .++|+||+|++|||+|+++.+.+++|++||++ +|+|+|++|+||++||+|+||+++.+. T Consensus 241 G~~VIvDD~~Pv~~~~a~~~yttYlfg~GAi~~~~~~~~~~~E~~Rd~~~~~~gG~d~L~~Rr~~~~hP~G~s~~~~~v~ 320 (367) T protein:vir:80 241 GKVVIVDDGMPVFGTGADKTYLSILFGGAAFGYADGAPQVPVAVGRRELRGNGSGLEYILERKEWIVHPGGFNWLDADVT 320 (367) T ss_pred ceeEEEeCCCcccccCCCceEEEEEEecceeeecccCCccceecccchhhhcCCceEEEEeeeeEEeecceeeecccccc Confidence 9999999999984 56999999999999999999999999999995 488999999999999999999987643 Q ss_pred --------------CCCCChHHhcCCcCceeeecccccceEEEEEec Q lcl|NC_015254. 
299 --------------GHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKN 331 (346) Q Consensus 299 --------------~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~ 331 (346) +.|||++||++++||+||||||+||||+||||= T Consensus 321 ~~~~~~~~~~~~~~~~sPt~~eLa~~~NW~~v~d~K~I~iv~~it~g 367 (367) T protein:vir:80 321 IPDNTGSPSGITSGPPAITLANLANPDNWERVTYRKNVPMAFLVTKG 367 (367) T ss_pred cccccccccccccccCCCChHHhcCCcccccccchhhcceEEEEecC Confidence 468999999999999999999999999999997 No 4 >protein:vir:1583 Length: 351 # NCBI annotation: minor capsid protein # Family: family:all:1522 # MgeID: mge:32 # MgeName: phig1e # Cross-refs: genbank:acc:NP_695165;swissprot:trembl:o03966;genbank:gi:23455804;uniprot:O03966;genbank:GeneID:955561 Probab=100.00 E-value=1.8e-114 Score=644.15 Aligned_cols=329 Identities=28% Similarity=0.474 Sum_probs=303.8 Q ss_pred CceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhcc Q lcl|NC_015254. 15 GKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNIS 94 (346) Q Consensus 15 ~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt 94 (346) =++|+|+|||+||||++||+++++++++|+||||++++++|++++++||++++||||++|+|++|+++++ ++|++++|+ T Consensus 1 MA~T~lsd~i~PEvf~~yv~~~~~~~~~l~qSG~i~~~~~l~~~~~~~G~~it~P~~~~l~Gd~~~~~~~-~~i~~~kit 79 (351) T protein:vir:15 1 MAETHLSDLIVPEVFGNYVVNQIIKTNRFVQSGILTPDPDLGPHLLEAGTRITVPFLNDLTGDPDNWTDS-DDIDVNNLT 79 (351) T ss_pred CCceeeeeeechhHHHHHHhhhhHHhhhHhhcccccccHHHHHHhhcCCCEEEecccccCCCcccccCCC-cccchheec Confidence 2379999999999999999999999999999999999999999999999999999999999999998887 589999999 Q ss_pred cceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhh-hhcceeeeccccccccc Q lcl|NC_015254. 95 AAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGAL-DSNKLDVSTETGDDSYF 173 (346) Q Consensus 95 ~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~-~~~~~dis~~~~~~~~~ 173 (346) +++++++|++|+|||+++|++.+++|+|||++|++||++||+|++|++||++|+|+|+++.. 
+++++|++..+++++.| T Consensus 80 t~~~~a~i~~~~kg~~~tD~a~~~sg~dp~~~i~~q~a~~w~~~~q~~lla~l~gv~~~~~~~~~~~~d~t~~~~~~~~i 159 (351) T protein:vir:15 80 SGKQQGIKFYQTKAYGYTDLGTMISGAPVQETIGNRFAAFWQRADQKTLLSVLKGVMGVTKIANSKVYDQTKVSPSEPMF 159 (351) T ss_pred ccceeEEEEeeccceehhhhhHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhchhhcccceecccccccccccc Confidence 99999999999999999999999999999999999999999999999999999999998764 45789999999999999 Q ss_pred cHHHHHHHHHHhCccc-cCceEEEEchHHHHHHHhhhhhhhcccccC-ceeeEEeceEEEEeCCCccC-----CCceEEE Q lcl|NC_015254. 174 TGDTFLSATYKLGDAE-GKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-KKFPTYMGKRVIVDDGLPAK-----DGVYTSY 246 (346) Q Consensus 174 ~~~~l~~A~~~~GD~~-~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-~~i~~~~G~~VVvdD~~p~~-----~g~ytt~ 246 (346) ++++|++|+++|||+. +.+++++|||++|++|+++++++|++++++ +.|++|+|+||||||+||+. .++|++| T Consensus 160 s~~~l~~A~~~~GD~~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~i~t~~G~~VivdD~~p~~~~~~~~~~ytsy 239 (351) T protein:vir:15 160 GAKGFTGAIGLMGDLQDTAFGAIAVNSATYSLMKVQGLIETIQPQNGATPFEAYNGLRIVLDDDIEIDLTDKTKPVSTSY 239 (351) T ss_pred CHHHHHHHHHHhccccccceEEEEEChHHHHHHHhhhhhhhccccccCcccceecceEEEEcCCCccccCCCCCceeEEE Confidence 9999999999999975 569999999999999999999999999986 57999999999999999984 3589999 Q ss_pred EEcCCeeEEeecCCccceeeeecC--CcceeEEEEeeEEeeeeeeeeecccc--ccCCCCChHHhcCCcCceee--eccc Q lcl|NC_015254. 247 IFGEGAFGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHPRGIAWQEKS--VAGHSPTNTEIEKGNNWKAV--YESK 320 (346) Q Consensus 247 l~~~GAi~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~~G~s~~~~~--~~~~sPt~a~L~~~~NW~~v--~~~K 320 (346) +|++|||+|.++++ ++|++||+ ++|+|+|++|+||++||+||||+++. 
.++.|||++||++++||+|| |||| T Consensus 240 l~~~GAi~~~~~~~--~ve~~rd~~~~~g~d~l~~r~~~~~hp~G~s~~~~~~~~~~~sPt~~~L~~~~NW~~v~~~d~k 317 (351) T protein:vir:15 240 IFAPGAVRYSTNMR--STETKYDPLINGGQDVIVQKRVGTIHVAGTSIKASFSPSKASFPTIDELAKSSTWEVVDGIDVR 317 (351) T ss_pred EEecceeeeecCCc--CcceeecccCCCCceEEEEeeeeeeeeeeeeecccccccCcCCcChHHhcCCcccccccCCCcc Confidence 99999999988776 46888876 46899999999999999999998654 46789999999999999999 8999 Q ss_pred ccceEEEEEecc---cccccCCCCCCCCC Q lcl|NC_015254. 321 NIRIVAFVHKNG---VPGKKKETAPEGIK 346 (346) Q Consensus 321 ~i~iv~~~~k~~---~~~~~~~~~~~~~~ 346 (346) +||||+||||++ +|++..+++|+--. T Consensus 318 ~I~iv~~~~~~~~~~~~~~~~~~~~~~~~ 346 (351) T protein:vir:15 318 SIGVVAYTAQLDPALTPGAQMPAADTSTD 346 (351) T ss_pred ccceEEEEEecCcccccCCcCcCCCCccc Confidence 999999999997 47777888876544 No 5 >protein:vir:94989 Length: 349 # NCBI annotation: hypothetical protein # Family: family:all:1522 # MgeID: mge:1547 # MgeName: KS7 # Cross-refs: genbank:acc:YP_224029;genbank:gi:62327316;genbank:GeneID:5176817 Probab=100.00 E-value=2.9e-112 Score=632.10 Aligned_cols=319 Identities=30% Similarity=0.538 Sum_probs=291.5 Q ss_pred CceeeeeeccchH--HHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCc---cccc Q lcl|NC_015254. 15 GKNTRIADVIVPE--VFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGE---GALT 89 (346) Q Consensus 15 ~~~T~l~d~i~Pe--v~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~---~~it 89 (346) =++|+|+|+|+|| ||++||.++..|+++|+||||++++++|+.++++||++++||||++|+|++|.++++. 
++++ T Consensus 1 Ma~T~l~D~iipe~~vf~~Yv~~~~~e~~~l~qSGii~~d~~l~~~~~~gG~~~~iPf~~~l~g~~e~n~~~dt~~~~~t 80 (349) T protein:vir:94 1 MAITTIGNIVTGNIPVLASYMTEDPVEKTAFFNSGILTPTPYAAEIARGPSNIANLPFWKAIDTSIEPNYSNDVYQDIAT 80 (349) T ss_pred CCceEEeeeeccChHHHHHHHHHhHHHhhhhhhccceeccHHHHHHHhcCCCEEEeeeeecCCCCcccccCCCCcccccc Confidence 3589999999998 8999999999999999999999999999999999999999999999999977655443 3699 Q ss_pred hhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhc----ceeeec Q lcl|NC_015254. 90 PGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSN----KLDVST 165 (346) Q Consensus 90 ~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~----~~dis~ 165 (346) |.+|+++++++++++|+|+|+++||+++++|+|||++|++||++||.|+.|+.||++|+|+|+++.++.. ..+++. T Consensus 81 ~~kit~~~~~a~~~~r~kaw~~~Dla~~lsG~dpm~~Ia~~va~yW~r~~q~~Lia~L~Gvf~~~~~~~~~~~~~~~~~~ 160 (349) T protein:vir:94 81 PRAIQTGEMMARVAYLNEGFGQADLTVELTSQNPLQSVASRLDNFWQRQAQRRLIATALGLYNDNVSATDAYHEQNDMVV 160 (349) T ss_pred cccccccceeeeeeeeccccchhHHHHHhhCchHHHHHHHHHHHHHhhHHHHHHHHHHHhhhcccccccccccccCceeE Confidence 9999999999999999999999999999999999999999999999999999999999999998765432 123444 Q ss_pred cccccccccHHHHHHHHHHhCcc-----ccCceEEEEchHHHHHHHhhhhhhhcccccCc-eeeEEeceEEEEeCCCccC Q lcl|NC_015254. 166 ETGDDSYFTGDTFLSATYKLGDA-----EGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNK-KFPTYMGKRVIVDDGLPAK 239 (346) Q Consensus 166 ~~~~~~~~~~~~l~~A~~~~GD~-----~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~-~i~~~~G~~VVvdD~~p~~ 239 (346) ++++++.++++.|++|+++|||. .++|++++|||++|++|++++||+|++++++. .|++|+|+||||||+||+. 
T Consensus 161 d~~~~a~~~~~~~~~A~~~~Gdaa~Gd~~~~lt~i~mHS~v~~~L~~~~li~~i~~s~~~~~i~ty~G~~VivDD~~Pv~ 240 (349) T protein:vir:94 161 DVSATSGFDAGAFIDATQTMGDALMGNGGEVLGAIAMHSFVYAQARKAQLIDFIRDAENNTMFATYQGYRVIVDDSMTVV 240 (349) T ss_pred EecccCCCChhhHHHHHHHHHHHhccccccceeEEEEchHHHHHHHhcchhhhccCcccCcccceecCcEEEEeCCCccc Confidence 44556679999999999998886 68999999999999999999999999999864 8999999999999999984 Q ss_pred ----CCceEEEEEcCCeeEEeecCCccceeeeecCC----cceeEEEEeeEEeeeeeeeeeccccccC-------CCCCh Q lcl|NC_015254. 240 ----DGVYTSYIFGEGAFGLGNGEAPVPTETDREKL----KGNDILINRQHFLLHPRGIAWQEKSVAG-------HSPTN 304 (346) Q Consensus 240 ----~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~----~g~~~l~~r~~~~~~~~G~s~~~~~~~~-------~sPt~ 304 (346) .++|+||+|++|||+|+++.+++++|++||++ +|+|+|++||||++||+||||+++.+++ .|||+ T Consensus 241 ~~g~~~~yttylfg~GAi~~~~~~~~~~~E~~rd~~~g~~~G~d~L~~R~~~~~hp~G~s~~~a~v~~~~~~~~~~sPt~ 320 (349) T protein:vir:94 241 GQDTSRKFISIIFGQGAIGYGEGNPEMPLEYEREASRANGGGVETLWTRKTWLLHPFGYSFTSAVITGNGTETIARSASW 320 (349) T ss_pred cCCCCceEEEEEeecceEEeecCCCCcceeeecccccCCcceeEEEEEeeEEEeeeeeeeecccccCCCccccccCCCCh Confidence 35999999999999999999999999999995 4789999999999999999999876553 68999 Q ss_pred HHhcCCcCceeeecccccceEEEEEeccc Q lcl|NC_015254. 305 TEIEKGNNWKAVYESKNIRIVAFVHKNGV 333 (346) Q Consensus 305 a~L~~~~NW~~v~~~K~i~iv~~~~k~~~ 333 (346) +||++++||+||||||+||||+||||+++ T Consensus 321 aeLa~~~NW~~v~~~K~I~iv~~~~~~~a 349 (349) T protein:vir:94 321 QDLANAANWNRVVDRKHVPIAFLVTGVGA 349 (349) T ss_pred HHhcCCcCcccccChhhcceEEEEeccCC Confidence 99999999999999999999999999999 No 6 >protein:vir:78387 Length: 349 # NCBI annotation: putative coat protein # Family: family:all:1522 # MgeID: mge:1851 # MgeName: SETP3 # Cross-refs: genbank:acc:YP_001110837;genbank:gi:134288598;genbank:GeneID:5179650 Probab=100.00 E-value=6.4e-112 Score=630.17 Aligned_cols=315 Identities=31% Similarity=0.536 Sum_probs=288.7 Q ss_pred CceeeeeeccchH--HHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCccccc--CCC-ccccc Q lcl|NC_015254. 
15 GKNTRIADVIVPE--VFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEIL--DDG-EGALT 89 (346) Q Consensus 15 ~~~T~l~d~i~Pe--v~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~--~dg-~~~it 89 (346) =++|+|+|+|+|| ||++||.++..|+++|+||||++++++|+.++++||++++||||++|+|++|.+ .|+ .++++ T Consensus 1 Ma~T~l~D~iipe~~vf~~Yv~~~~~e~~~l~qSGii~~d~~l~~~~~~gG~~~~iPf~~~L~g~~e~nv~~D~~~~~~t 80 (349) T protein:vir:78 1 MAITTIGDIVTGNIPVLASYMTEDPVEKTAFFDSGILTSTPYAAEIANGPSNIANLPFWKAIDTSIEPNYSNDVYQDIAT 80 (349) T ss_pred CCceEEeeeeccCHHHHHHHHHHhhHHhhhhhhccceeccHHHHHHhhcCCCEEEeeeeecCCCCcccccCCCCcccccc Confidence 3589999999999 899999999999999999999999999999999999999999999999987653 233 35789 Q ss_pred hhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh--------hcce Q lcl|NC_015254. 90 PGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD--------SNKL 161 (346) Q Consensus 90 ~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~--------~~~~ 161 (346) +++|+++++++++++|+|||+++||+++++|+|||++|++||++||.|+.|+.||++|+|+|+++.++ ++++ T Consensus 81 ~~kitt~~~~a~~~~r~kaw~~~Dla~~lsG~dpm~~Ia~~va~yW~r~~q~~Lia~L~Gvf~~~~~a~~~~~~~~~~t~ 160 (349) T protein:vir:78 81 PRAIQTGEMMARVAYLNEGFGQADLTVELTSQNPLQSVASRLDNFWQRQAQRRLIATALGLYNDNVSATDAYHEQNDMVV 160 (349) T ss_pred cccccccceeeeeeeeccccchhHHHHHhhCchHHHHHHHHHHHHHhhHHHHHHHHHHHHhhcccccccchhhhccccee Confidence 99999999999999999999999999999999999999999999999999999999999999976443 3345 Q ss_pred eeeccccccccccHHHHHHHHHHhCcc-----ccCceEEEEchHHHHHHHhhhhhhhcccccC-ceeeEEeceEEEEeCC Q lcl|NC_015254. 162 DVSTETGDDSYFTGDTFLSATYKLGDA-----EGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-KKFPTYMGKRVIVDDG 235 (346) Q Consensus 162 dis~~~~~~~~~~~~~l~~A~~~~GD~-----~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-~~i~~~~G~~VVvdD~ 235 (346) |++ +++.++++.|++|+++|||. 
.++|++++|||++|++|++++||+|++++++ ..|++|+|+||||||+ T Consensus 161 d~s----~~a~~~~~~~~dA~~~lgda~~Gd~~~~lt~i~mHS~v~~~L~~~~li~~i~~s~~~~~i~ty~G~~VivDD~ 236 (349) T protein:vir:78 161 DVS----ATLGFDAGAFIDATQTMGDALMGNGGEVLGAIAMHSFVYAQARKAQLIDFIRDAENNTMFATYQGYRVIVDDS 236 (349) T ss_pred eec----cccCCChhhhhhhHHHHHHHhccccccceeEEEEchHHHHHHHhhhhhhhccCcccCcccceecCeEEEEeCC Confidence 544 44558999999999998886 6899999999999999999999999999986 4899999999999999 Q ss_pred CccCC----CceEEEEEcCCeeEEeecCCccceeeeecCC----cceeEEEEeeEEeeeeeeeeecccccc-------CC Q lcl|NC_015254. 236 LPAKD----GVYTSYIFGEGAFGLGNGEAPVPTETDREKL----KGNDILINRQHFLLHPRGIAWQEKSVA-------GH 300 (346) Q Consensus 236 ~p~~~----g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~----~g~~~l~~r~~~~~~~~G~s~~~~~~~-------~~ 300 (346) ||+.. ++|+||+|++|||+|.++++++++|++||++ +|+|+|++||||++||+||||+++.++ +. T Consensus 237 ~Pv~~~g~~~~yttylfg~GAi~~~~~~~~~~~et~rd~~~g~~~G~d~l~~R~~~~~hp~G~s~~~a~v~~~~~~~~~~ 316 (349) T protein:vir:78 237 MTVVGQGAQRKFISIIFGQGAIGYGEGNPVMPLEYEREASRANGGGVETLWTRKTWLLHPFGYRFTSAVITGNGTETIAR 316 (349) T ss_pred CccccCCCCceEEEEEeecceEEEccCCCccceeeecccccCCcceeEEEEEeeEEEeeeeeeeeccccccCCccccccC Confidence 99853 4999999999999999999999999999995 478999999999999999999987655 37 Q ss_pred CCChHHhcCCcCceeeecccccceEEEEEeccc Q lcl|NC_015254. 
301 SPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGV 333 (346) Q Consensus 301 sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~ 333 (346) |||++||++++||+||||||+||||+||||+++ T Consensus 317 sPt~aeLa~~~NW~~v~~~K~I~iv~~~~~~~a 349 (349) T protein:vir:78 317 SASWQDLANATNWNRVVDRKHVPIAFLVTGVGA 349 (349) T ss_pred CCChHHhcCCcCcccccChhhcceEEEEeccCC Confidence 899999999999999999999999999999999 No 7 >protein:vir:95131 Length: 325 # NCBI annotation: hypothetical protein ORF010 # Family: family:all:47 # MgeID: mge:1552 # MgeName: PA73 # Cross-refs: genbank:acc:YP_001293417;genbank:gi:148912838;genbank:GeneID:5228206 Probab=100.00 E-value=2e-78 Score=446.60 Aligned_cols=304 Identities=15% Similarity=0.159 Sum_probs=243.7 Q ss_pred eeeeec--cchHHHHHHHhhHhHHHHhHhhc--cccccchhHHHHhhCCCcEEEecccccCCCcc-c-ccCCCccccchh Q lcl|NC_015254. 18 TRIADV--IVPEVFNKYVTERTAESSALLQS--GIISNDKDLDELAKSGGNMINMPFWQDLTGED-E-ILDDGEGALTPG 91 (346) Q Consensus 18 T~l~d~--i~Pev~~~yv~~~~~~~~~~~qS--gi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~a-e-~~~dg~~~it~~ 91 (346) =-|+|| |||+++..|++.-.-...+|-++ |+++-.... ..|+++++|||++|.|+. | +++++++++++. T Consensus 1 m~lsD~~vfN~~~~~a~~e~~~q~~~~fn~as~gai~l~~~~-----~~Gd~~~~pf~~~l~g~~~~~~~~~~~~~vt~~ 75 (325) T protein:vir:95 1 MALSDLAVYSEYAYSAFSETLRQQVDLFNTATGGAIMLQSAA-----HQGDFSDVAFFAKVTGGLVRRRNAYGSGTVAEK 75 (325) T ss_pred CchhhhhhhhhhhhhhhhhhhhhhHhhhhhcccceeEecccc-----ccCceeeccccccccccccccccCCCCceeccc Confidence 345666 45666666665422223334332 444322211 149999999999987754 2 234456789999 Q ss_pred hcccceeEEEEEeecCcceechHHHhhhcchHHHH----HHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccc Q lcl|NC_015254. 92 NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRA----IGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTET 167 (346) Q Consensus 92 ~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~----i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~ 167 (346) +|+++++++++++|++||..+|+++++++.|||.+ |++|+++||.+.+++.+|+.|.++++.+. 
+.++|++..+ T Consensus 76 kitt~~~~av~~~r~~g~~~~d~~~~~~g~~~~~~~~~~Ig~~~a~~~~~~~l~~~~~~l~~a~~~~~--~~v~dis~~~ 153 (325) T protein:vir:95 76 VLKHLVDTSVKVAAGTPPVRLDPGQFRWIQQNPEVAGAAMGQQLAVDTMADMLNVGLGSVYSALSQVS--DVVYDATANT 153 (325) T ss_pred eeccccceeeEEecccCcccccHHHHhhcCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcccc--cceeeeeccc Confidence 99999999999999999999999999999999874 77778888888888888888888766654 5577888877 Q ss_pred cc-cccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhccc--ccCc-eeeEEeceEEEEeCCCccC---- Q lcl|NC_015254. 168 GD-DSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLD--SDNK-KFPTYMGKRVIVDDGLPAK---- 239 (346) Q Consensus 168 ~~-~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~--s~~~-~i~~~~G~~VVvdD~~p~~---- 239 (346) +. ++.+++++|++|+++|||+.++|++++|||++|++|++++|+++.+. .++. .|++|+|+||||||+||+. T Consensus 154 ~~~~~~~s~~~l~~A~~klGD~~~~l~~~~MHS~v~~~L~~~~L~~~~~~~~~~g~~~i~t~~G~~VIVdD~~p~~~~g~ 233 (325) T protein:vir:95 154 DAADKLPTWNNLNNGQAKFGDQSSQIAAWIMHSTPMHKLYGSNLTNGERLFTYGTVNVVRDPFGKLLVMTDSPNLFAAGT 233 (325) T ss_pred CcccccccHHHHHHHHHHhcccccceeEEEEchHHHHHHHHhhccccccccccCCcccccccCCcEEEEeCCCCCCCccC Confidence 64 45689999999999999999999999999999999999999987664 4443 4789999999999999974 Q ss_pred CCceEEEEEcCCeeEEeecCCc--cceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCceeee Q lcl|NC_015254. 240 DGVYTSYIFGEGAFGLGNGEAP--VPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAVY 317 (346) Q Consensus 240 ~g~ytt~l~~~GAi~~~~~~~~--~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~ 317 (346) .++|++|+|++|||.|.+++++ ++.|..|+...+.. 
+.+|++|++||+||||+ ++++++|||++||++++||+||| T Consensus 234 ~~~ytty~lg~GAi~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~tf~lhp~G~sw~-~s~~g~sPt~aeL~~~~NW~rv~ 311 (325) T protein:vir:95 234 PNVYHILGLVPGGVLIGQNNDFDANEETKNGDENIIRT-YQAEWSYNIGVKGFAWD-KANGGKSPTDAALFTSTNWDKYA 311 (325) T ss_pred ceeEEEEEEecCeEEecCCCCccccccccCcccceeee-eeeeeeEEeecceeeee-cccccCCcChHhhcCCcCcceec Confidence 4599999999999999998874 55677776544444 46888899999999995 46678999999999999999999 Q ss_pred cc-cccceEEEEEe Q lcl|NC_015254. 318 ES-KNIRIVAFVHK 330 (346) Q Consensus 318 ~~-K~i~iv~~~~k 330 (346) ++ |..+.|.+||+ T Consensus 312 ~~~K~tagv~~~~~ 325 (325) T protein:vir:95 312 TSHKDLAGVVVKTN 325 (325) T ss_pred CCCccccceeEeeC Confidence 65 99999999999 No 8 >protein:vir:95107 Length: 270 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1549 # MgeName: X2 # Cross-refs: genbank:acc:YP_240822;genbank:gi:66394683;genbank:GeneID:5133901 Probab=100.00 E-value=3.1e-69 Score=396.16 Aligned_cols=265 Identities=17% Similarity=0.165 Sum_probs=228.1 Q ss_pred CceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhcc Q lcl|NC_015254. 15 GKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNIS 94 (346) Q Consensus 15 ~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt 94 (346) =+.|+++|||+||||++||.+++.++++|.+++.+ +. .|.++||++|+||||++| |++|++.||+ +|++++|+ T Consensus 1 Ma~T~~~d~I~Pev~~~~V~e~~~~~~~~~~~~~~--d~---~L~g~~G~ti~~P~~~~i-gdae~~~eg~-~i~~~~lt 73 (270) T protein:vir:95 1 MTQTKKANLINPEVLANVVSAQMQNAIRFTPYAVT--DD---TLVGQPGDTITRPKYAYI-GAAEDLQEGV-AMDTTQMS 73 (270) T ss_pred CCceehhhhcchHHHHHHHHHHHHhHHhhcccccc--cc---ccCCCCCCEEEeeeecCC-CccccccCCC-ccchhhcc Confidence 35699999999999999999999999999777654 32 355789999999999976 8999999986 89999999 Q ss_pred cceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccccc Q lcl|NC_015254. 
95 AAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFT 174 (346) Q Consensus 95 ~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~ 174 (346) ++++.+++++++|+|+++|++.+++++||++++++|++.+|+|++|++|+++|+|++.+. ...++ T Consensus 74 ~~~~~a~i~~~gk~~~itD~a~~~~~~dp~~~~~~q~a~~~a~~~d~~li~~l~~a~~~~---------------~~~~t 138 (270) T protein:vir:95 74 MTTTKVTVKETGKAVEVTQTAIITNVNGTLQEASRQLAMSLADKVEIDYIAELNKSKQTA---------------TVSAD 138 (270) T ss_pred cchheeeeehhhCcceecHHHHhhhccchHHHHHHHHHHHHHHHHHHHHHHHhccccccc---------------ccccC Confidence 999999999999999999999999999999999999999999999999999999865432 12368 Q ss_pred HHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC----ceeeEEeceEEEEeCCCccCCCceEEEEEcC Q lcl|NC_015254. 175 GDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN----KKFPTYMGKRVIVDDGLPAKDGVYTSYIFGE 250 (346) Q Consensus 175 ~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~----~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~ 250 (346) ++.|++|.++|||+.+.+.+++|||++|+.|||+++++..++.++ |.|++|+|+||||+|++|. +|++|++++ T Consensus 139 ~~~~~dA~~~lgd~~~~~~~i~vhs~~~~~Lrk~~~~~~~~~~~~~~~~G~ig~~~G~~Viv~s~~~~---~~~~~l~~~ 215 (270) T protein:vir:95 139 ATGILDAIEVFNSENDEDYVLYVNPKDYNKLVKSLFKVGGNVQDRAISKGDLVEIVGVSDIVKSKRVS---ENTAFLQRY 215 (270) T ss_pred HHHHHHHHHHhccccCCCcEEEEcHHHHHHHHhhhcccccccccchhcccccceecceeEEEeCCCCC---ceeEEEEec Confidence 899999999999999999999999999999999999888777764 5799999999999999874 468999999 Q ss_pred CeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeee-eccccccCCCCChHHh Q lcl|NC_015254. 251 GAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIA-WQEKSVAGHSPTNTEI 307 (346) Q Consensus 251 GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s-~~~~~~~~~sPt~a~L 307 (346) ||+++.++++ +.+|++||+.++.+.|++|+||++|+..=+ +..-+. 
.+++|- |+ T Consensus 216 gAi~~~~~~~-~~vEtdRd~~~~~d~i~~~~~y~v~~~~~skvv~~t~-~~a~~~-~~ 270 (270) T protein:vir:95 216 GAMEIVNKKK-PEAYTDFDILKRTHLLSTNYHYSVNLKDETGVVKVTF-KPSGSL-EM 270 (270) T ss_pred cceeeeecCC-ceeeeccchhhcccEEEeeeEEEEEEEccceEEEEEe-cCCCCc-CC Confidence 9999988775 569999999999999999999999998722 211110 011111 11 No 9 >protein:vir:96792 Length: 315 # NCBI annotation: major capsid protein # Family: family:all:47 # MgeID: mge:1629 # MgeName: phiHSIC # Cross-refs: genbank:acc:YP_224246;genbank:gi:62362381;genbank:GeneID:3345731 Probab=100.00 E-value=5.4e-64 Score=367.44 Aligned_cols=293 Identities=14% Similarity=0.152 Sum_probs=217.7 Q ss_pred CceeeeeeccchHHHHHHHhhHhHHHHh-----Hh-hc-cccccchhHHHHhhCC--CcEEEecccccCCCccc-ccCCC Q lcl|NC_015254. 15 GKNTRIADVIVPEVFNKYVTERTAESSA-----LL-QS-GIISNDKDLDELAKSG--GNMINMPFWQDLTGEDE-ILDDG 84 (346) Q Consensus 15 ~~~T~l~d~i~Pev~~~yv~~~~~~~~~-----~~-qS-gi~~~~~~~~~l~~~~--G~ti~~P~~~~l~g~ae-~~~dg 84 (346) =++|++|||+ ||++|+...+.|++. |- +| |++ .|.+.| ||+..+|||+ +.|..+ ++.++ T Consensus 1 ~~~t~~sdl~---vfn~~~~~a~~e~~~~~~~~Fnaas~Gai-------~l~~~~~~GDf~~~~ff~-i~~~~~~rnv~~ 69 (315) T protein:vir:96 1 MATTVNSDLV---IYNDTAQTAYLERNMDNLAVFNENSRAAI-------GLNSELIEGDLKLRSFYK-VGGAIADRDVNS 69 (315) T ss_pred Cceeeeccee---eehhhhhhhHHhhhHHHHHHhhhhcCCcc-------cccccccccccccccccc-cccchhhcccCC Confidence 4699999984 466666555555432 32 11 211 122333 9999999999 655433 34466 Q ss_pred ccccchhhcccceeEEEEEeecCcceechHHHh-hhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceee Q lcl|NC_015254. 85 EGALTPGNISAAKDIARLHMRGKAWRTNDLAKA-LSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDV 163 (346) Q Consensus 85 ~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~-~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~di 163 (346) .+++++.+|++++++++++.++.+.-..+.+++ ..+.||+..++.....+|.+.++..+...|+|+++. 
...++.++ T Consensus 70 ~~~~t~~kit~~~dvaVk~~~~~~~~~~~~~~~a~~g~dp~~~~~~i~~~~~~~~l~~~l~~~l~~~~aa--i~~~t~~~ 147 (315) T protein:vir:96 70 TATVAGTKIAADEMVSVKVPWKYGPYETTEEAFKRRARSPEEFSMLIGQDMADATMAGWIGYALNALQGA--IGSNAGMN 147 (315) T ss_pred CccccceecccccceeEEEeecCCchhccHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhh--hccccccc Confidence 789999999999999999988887433334443 468899987666666666666666666666655542 22344554 Q ss_pred eccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccCcee----eEEeceEEEEeCCCccC Q lcl|NC_015254. 164 STETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNKKF----PTYMGKRVIVDDGLPAK 239 (346) Q Consensus 164 s~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~~i----~~~~G~~VVvdD~~p~~ 239 (346) .. ++.+.++.++|++|+++|||+.++|++++|||++|++|++++|+++++...++.+ ++|+|+||||||+||+ T Consensus 148 ~~--~~~a~~~~~~l~dA~~klGD~~~~l~~~vMHS~v~~~L~~q~L~~~~~~~~~~~~~~~~~~~lGkrViVdD~~P~- 224 (315) T protein:vir:96 148 VS--GELATEGKKVLTKGLRTMGDKASSIAIWVMDSTSYFDIVDEAIDNKLYEEAGVVVYGGTPGTLGKPVLVTDQCPA- 224 (315) T ss_pred cc--ccccccCHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhhhhhcccccceeEecCcCcccccEEEEECCCCc- Confidence 43 4567899999999999999999999999999999999999999998876666544 6789999999999996 Q ss_pred CCceEEEEEcCCeeEEeecCCc--cceeeeecCCcceeEEEEeeE----EeeeeeeeeeccccccCCCCChHHhcCCcCc Q lcl|NC_015254. 
240 DGVYTSYIFGEGAFGLGNGEAP--VPTETDREKLKGNDILINRQH----FLLHPRGIAWQEKSVAGHSPTNTEIEKGNNW 313 (346) Q Consensus 240 ~g~ytt~l~~~GAi~~~~~~~~--~~vE~dRd~~~g~~~l~~r~~----~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW 313 (346) |++|+|++|||+|.+++++ +++|+ .|++.|+.|++ |++||+||||+ ..++.|||++||++++|| T Consensus 225 ---~~~~gl~~GAi~~~~~~~~~~~~~~~-----~g~e~l~~~~r~e~tf~l~p~G~sw~--~~~~~sPt~aeLat~~NW 294 (315) T protein:vir:96 225 ---TKIFGLVAGAVMITESQAPGMRSYQI-----DDQENLAIGFRAEGTANVEVLGYKWK--TKTNVNPASATLATTTNW 294 (315) T ss_pred ---ceeeeeecceeeecCCCccccccccC-----CCcceeEEEEeeeeEeeeeeeeEEee--cCCCcCCChHHhcCCcCc Confidence 7899999999999988874 34444 45678887765 99999999997 346789999999999999 Q ss_pred eeeecc-cccceEEEEEecccc Q lcl|NC_015254. 314 KAVYES-KNIRIVAFVHKNGVP 334 (346) Q Consensus 314 ~~v~~~-K~i~iv~~~~k~~~~ 334 (346) +|||++ |.-..|-+|-. +.| T Consensus 295 ekV~~~~K~tagv~~~~~-~~~ 315 (315) T protein:vir:96 295 EKYATDDKATAGFIITLT-TTP 315 (315) T ss_pred ccccCCCcccceEEEEec-CCC Confidence 999988 55555544422 122 No 10 >protein:vir:105334 Length: 276 # NCBI annotation: putative phage major capsid protein # Family: family:all:522 # MgeID: mge:1679 # MgeName: PH15 # Cross-refs: genbank:acc:YP_950669;genbank:gi:119967839;genbank:GeneID:4643213 Probab=100.00 E-value=1e-63 Score=365.90 Aligned_cols=265 Identities=17% Similarity=0.188 Sum_probs=230.4 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|+.+|+|+|||+||||++||.+++.++++|.++ +..+.+ +.++||++|+||+|++| |+++++.||. 
+|++++ T Consensus 1 Ma~~~T~l~d~i~Pev~~~~v~~~~~~~~~~~~~--~~~~~~---l~g~~G~ti~iP~~~~i-gda~~~~eg~-~i~~~~ 73 (276) T protein:vir:10 1 MAQGTTTKSTQIVPEVLAPMMQAELDKKLRFAQF--ADIDST---LVGQPGDTLTFPAFVYS-GDATVVPEGQ-KIPVDK 73 (276) T ss_pred CCcceeehhhhhchHHHHHHHHHHHHhhhhhccc--ceeccc---ccCCCCCEEEeeeecCC-CccccccCCC-ccCccc Confidence 2777999999999999999999999999999554 444543 45679999999999999 8999999985 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) |+++++.+++++++|+|.++|++.+.+++||++++.+|++.+|++++|+++++.|++... .+ +... T Consensus 74 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~~~~~~~~~~a~~~d~~~~~~l~~~~~---------~~-----~~~~ 139 (276) T protein:vir:10 74 IETNRREAKIHKIGKGTDITDEALLSGYGDPQGEAVRQHGLAIANKVDNDVLEALRGTKL---------TV-----SADI 139 (276) T ss_pred cccceeeEEeehccccccccHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhcccc---------cc-----cccc Confidence 999999999999999999999999999999999999999999999999999999986322 11 1234 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhccccc-------CceeeEEeceEEEEeCCCccCCCceEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSD-------NKKFPTYMGKRVIVDDGLPAKDGVYTS 245 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~-------~~~i~~~~G~~VVvdD~~p~~~g~ytt 245 (346) ++++.|++|.++|||+...+.+++|||++|+.|+|+++++|++.++ +|.|++|+|+|||+||++|. 
|++ T Consensus 140 ~t~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~p~----~t~ 215 (276) T protein:vir:10 140 GTLAGLEAAIDTFDDEDLEPMVLFINPKDAGKLRSSASDNFTRATELGDNIIVKGAFGEALGAVIVRSKKLDE----GEA 215 (276) T ss_pred cCHHHHHHHHHHhccccCcccEEEEcHHHHHHHHHhccccccccccccccceeccccceecceeEEEcCCCCc----ceE Confidence 7899999999999999999999999999999999999999998775 24589999999999999984 589 Q ss_pred EEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeee----eeeccccccCCCCChH Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRG----IAWQEKSVAGHSPTNT 305 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G----~s~~~~~~~~~sPt~a 305 (346) |++++||+++..++ ++.+|++|++.++.+.++.|+||++++.= ..-+ ..+|..|+.+ T Consensus 216 ~l~~~gAi~~~~~~-~~~vE~dRd~~~~~d~i~~~~~y~~~~~~~~~vv~~t--~~~~~~~~~~ 276 (276) T protein:vir:10 216 ILAKRGAVKLITKR-DFFLETDRDPSTKTTALYSDKHYVAYLYDESKAVKVT--KGAGTTDSGA 276 (276) T ss_pred EEEeccceeeeecC-CceeecccchhhcccEEEEeeEEEEEEEcCcceEEEe--cCCcCCcCCC Confidence 99999999998765 56799999999999999999999988742 2222 2246678888 No 11 >protein:vir:95898 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1588 # MgeName: 71 # Cross-refs: genbank:acc:YP_240385;genbank:gi:66396054;genbank:GeneID:5133409 Probab=100.00 E-value=1.1e-62 Score=360.28 Aligned_cols=267 Identities=16% Similarity=0.166 Sum_probs=228.1 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|+..|+|+|+|+||||++||.+++.++++| ++++..+.. +.++||++|+||+|++| |+++++.+|. 
++++++ T Consensus 1 m~~~~T~l~d~i~Pev~~~~v~~~~~~~l~~--~~~~~~~~~---l~g~~G~tv~iP~~~~i-g~a~~~~~g~-~i~~~~ 73 (274) T protein:vir:95 1 MAQGMTKLTNQIVPEVLAPMMQAELEKKLRF--ASFAEIDNT---LVGQPGDTLTFPAFIYS-GDAKVVAEGE-KIPTDI 73 (274) T ss_pred CCcceeehhheechHHHHHHHHHHHHhhhhc--cccceeccc---ccCCCCCEEEeeeecCC-CccccccCCC-ccchhh Confidence 2678999999999999999999999888777 666666643 45679999999999998 8888988885 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) |+++++.+++++++|+|.++|++.+.+++||++++.+|++.+|++++|+++++.|++... .+ +... T Consensus 74 lt~~~~~~~i~~~~~a~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~vd~~i~~~l~~a~~---------~~-----~~~~ 139 (274) T protein:vir:95 74 LETKKREAKIRKIAKGTSISDEALLSGYGDPQGEQVRQHGLAHANKVDDDVLEALKSAKL---------TV-----EADI 139 (274) T ss_pred cccceeEEEeeeeecceeehHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHhcccc---------cc-----cccc Confidence 999999999999999999999999999999999999999999999999999999976321 11 1234 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC-------ceeeEEeceEEEEeCCCccCCCceEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-------KKFPTYMGKRVIVDDGLPAKDGVYTS 245 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-------~~i~~~~G~~VVvdD~~p~~~g~ytt 245 (346) ++++.|++|.++|||+.+..++++|||.+|+.|+++++++|+++++. 
|.|++|+|++||+||++| +|++ T Consensus 140 ~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~~----~~t~ 215 (274) T protein:vir:95 140 TKLTGLQTAIDKFNDEDLEPMVLFISPLDAGKLRGDATTNFTRATELGDDVIVKGAFGEALGAVIVRSNKLE----AGTA 215 (274) T ss_pred cCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhccccccccccccccceeccccceecCeEEEEeCCCC----CceE Confidence 78999999999999999999999999999999999999999988762 458999999999999997 4689 Q ss_pred EEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCceeee Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAVY 317 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~ 317 (346) |++++||+++..++ .+.+|++||+.++.|.|+.|+||++|+.- |+-.-..+..+|.+-- T Consensus 216 ~l~~~gA~~~~~~~-~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~------------~~~~v~~tk~~~~~~~ 274 (274) T protein:vir:95 216 ILAKKGAVKLITKR-DFFLETDRDPSTKTTALYSDKHYVAYLYD------------ESKAVKITKGSGSLEM 274 (274) T ss_pred EEEeccceeeeecC-CcccccccccccccCEEEEeEEEEEEEEc------------CCcEEEEEcCCccccC Confidence 99999999998755 56799999999999999999999888752 3333333334444322 No 12 >protein:vir:96262 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1612 # MgeName: ROSA # Cross-refs: genbank:acc:YP_240311;genbank:gi:66395978;genbank:GeneID:5133339 Probab=100.00 E-value=1.1e-62 Score=360.28 Aligned_cols=267 Identities=16% Similarity=0.166 Sum_probs=228.1 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|+..|+|+|+|+||||++||.+++.++++| ++++..+.. +.++||++|+||+|++| |+++++.+|. 
++++++ T Consensus 1 m~~~~T~l~d~i~Pev~~~~v~~~~~~~l~~--~~~~~~~~~---l~g~~G~tv~iP~~~~i-g~a~~~~~g~-~i~~~~ 73 (274) T protein:vir:96 1 MAQGMTKLTNQIVPEVLAPMMQAELEKKLRF--ASFAEIDNT---LVGQPGDTLTFPAFIYS-GDAKVVAEGE-KIPTDI 73 (274) T ss_pred CCcceeehhheechHHHHHHHHHHHHhhhhc--cccceeccc---ccCCCCCEEEeeeecCC-CccccccCCC-ccchhh Confidence 2678999999999999999999999888777 666666643 45679999999999998 8888988885 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) |+++++.+++++++|+|.++|++.+.+++||++++.+|++.+|++++|+++++.|++... .+ +... T Consensus 74 lt~~~~~~~i~~~~~a~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~vd~~i~~~l~~a~~---------~~-----~~~~ 139 (274) T protein:vir:96 74 LETKKREAKIRKIAKGTSISDEALLSGYGDPQGEQVRQHGLAHANKVDDDVLEALKSAKL---------TV-----EADI 139 (274) T ss_pred cccceeEEEeeeeecceeehHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHhcccc---------cc-----cccc Confidence 999999999999999999999999999999999999999999999999999999976321 11 1234 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC-------ceeeEEeceEEEEeCCCccCCCceEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-------KKFPTYMGKRVIVDDGLPAKDGVYTS 245 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-------~~i~~~~G~~VVvdD~~p~~~g~ytt 245 (346) ++++.|++|.++|||+.+..++++|||.+|+.|+++++++|+++++. 
|.|++|+|++||+||++| +|++ T Consensus 140 ~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~~----~~t~ 215 (274) T protein:vir:96 140 TKLTGLQTAIDKFNDEDLEPMVLFISPLDAGKLRGDATTNFTRATELGDDVIVKGAFGEALGAVIVRSNKLE----AGTA 215 (274) T ss_pred cCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhccccccccccccccceeccccceecCeEEEEeCCCC----CceE Confidence 78999999999999999999999999999999999999999988762 458999999999999997 4689 Q ss_pred EEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCceeee Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAVY 317 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~ 317 (346) |++++||+++..++ .+.+|++||+.++.|.|+.|+||++|+.- |+-.-..+..+|.+-- T Consensus 216 ~l~~~gA~~~~~~~-~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~------------~~~~v~~tk~~~~~~~ 274 (274) T protein:vir:96 216 ILAKKGAVKLITKR-DFFLETDRDPSTKTTALYSDKHYVAYLYD------------ESKAVKITKGSGSLEM 274 (274) T ss_pred EEEeccceeeeecC-CcccccccccccccCEEEEeEEEEEEEEc------------CCcEEEEEcCCccccC Confidence 99999999998755 56799999999999999999999888752 3333333334444322 No 13 >protein:vir:1239 Length: 274 # NCBI annotation: similar to phage B1 major head protein # Family: family:all:522 # MgeID: mge:25 # MgeName: phi ETA # Cross-refs: genbank:acc:NP_510938;genbank:gi:17426272;genbank:GeneID:927376 Probab=100.00 E-value=4e-62 Score=357.19 Aligned_cols=267 Identities=16% Similarity=0.169 Sum_probs=224.5 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|+..|+++|+|+||||++||.+++.++++| ++++..+.+ +.++||++|+||+|++| |+++++.+|. 
++++++ T Consensus 1 ma~~~T~l~d~iiPev~~~~v~~~~~~~l~~--~~~~~~d~~---l~g~~G~tv~iP~~~~i-g~a~~~~~g~-~i~~~~ 73 (274) T protein:vir:12 1 MAQGLTKTSNQIIPEVLAPMMQAQLEKKLRF--ASFAEVDST---LQGQPGDTLTFPAFVYS-GDAQVVAEGE-KIPTDI 73 (274) T ss_pred CCcceeehhhhhchHHHHHHHHHHHHhhhhh--cccceeccc---ccCCCCCEEEEeeecCC-CccccccCCC-ccchhh Confidence 2778999999999999999999999887766 666666654 45679999999999988 8888988885 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) |+++++.+++++++++|+++|++.+.+++|||+++.+|++.+|++++|+++++.+++. +.+. +... T Consensus 74 lt~~~~~~~i~~~~~~~~i~D~~~~~~~~d~~~~~~~q~~~~~a~~vd~~~l~~~~~a---------~~~~-----~~~a 139 (274) T protein:vir:12 74 LETKKREAKIRKIAKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGA---------KLTV-----NADI 139 (274) T ss_pred cccceeeEEeeeecceeeecHHHHHhcccchHHHHHHHHHHHHHHHHHHHHHHHHhcc---------cccc-----cccc Confidence 9999999999999999999999999999999999999999999999999999998742 1111 2334 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC-------ceeeEEeceEEEEeCCCccCCCceEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-------KKFPTYMGKRVIVDDGLPAKDGVYTS 245 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-------~~i~~~~G~~VVvdD~~p~~~g~ytt 245 (346) ++++.|++|.++|||+.+.+++++|||.+|+.|+++++++|+++++. 
|.|++|+|++||+||++|+ |++ T Consensus 140 ~~~d~i~dA~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~fv~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~p~----~t~ 215 (274) T protein:vir:12 140 TKLNGLQSAIDKFNDEDLEPMVLFINPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAIIVRSNKLEA----GTA 215 (274) T ss_pred cCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhhhhhccccccccccceecccceeecCeeEEEeCCCCc----ceE Confidence 78999999999999999999999999999999999999999988763 4589999999999999985 589 Q ss_pred EEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCceeee Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAVY 317 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~ 317 (346) |++++||+++..++ .+.+|++||+.++.|.|++|+||++++.- |+-.-.-+...|..-- T Consensus 216 ~l~~~gA~~~~~~~-~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~------------~~~vv~~t~~~~~~~~ 274 (274) T protein:vir:12 216 ILAKKGAVKLILKR-DFFLEVARDASTKTTALYSDKHYVAYLYD------------ESKAVKITKGSGSLEM 274 (274) T ss_pred EEEeccceeeeecC-CceeccccchhhcccEEEeeeEEEEEEEc------------CCceEEEEcCCccccC Confidence 99999999998755 56799999999999999999999888752 1111111111111100 No 14 >protein:vir:3613 Length: 272 # NCBI annotation: MHP # Family: family:all:522 # MgeID: mge:74 # MgeName: TP901-1 # Cross-refs: genbank:acc:NP_112699;genbank:gi:13786567;genbank:GeneID:921035 Probab=100.00 E-value=9.8e-62 Score=355.06 Aligned_cols=262 Identities=14% Similarity=0.189 Sum_probs=228.5 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|+..|+++|+|+||||++||.+++.++++|.++++ .+.. +.+++|++|+||+|++| |+++++.||. 
++++++ T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~~~~~~~~~--~~~~---l~g~~G~ti~iP~~~~~-gda~~~~eg~-~i~~~~ 73 (272) T protein:vir:36 1 MSKQKTTLADLVNPEVLAPIVSYELNKALRFAPLAQ--VDTT---LQGQPGNTLKFPAFTYI-GDAADVAEGG-EISLDK 73 (272) T ss_pred CCCcceehhhhhchHHHHHHHHHHHHhhhhhccccc--cccc---cccCCCCEEEEeeeccC-ccccccCCCC-ccChhh Confidence 377899999999999999999999999998866544 3433 45678999999999998 8889989985 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) |+++++.+++++++|+|.++|++.+.+++||+.++++|++.+|++++|+++++.|+|... . ..+. T Consensus 74 lt~~~~~~~i~~~~k~~~vtD~~~~~~~~d~~~~~~~~~a~~~a~~~d~~i~~~l~~~~~---------~------~~~~ 138 (272) T protein:vir:36 74 IGTTTKSVTIKKAAKGTEITDEAALSGYGDPIGESNKQLGLSLANKVDDDLLSAAKTTSQ---------T------VSTK 138 (272) T ss_pred cCCcceeEeeehhhccccccHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhccccc---------c------cccc Confidence 999999999999999999999999999999999999999999999999999999876322 1 1234 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC------ceeeEEeceEEEEeCCCccCCCceEEE Q lcl|NC_015254. 
173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN------KKFPTYMGKRVIVDDGLPAKDGVYTSY 246 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~------~~i~~~~G~~VVvdD~~p~~~g~ytt~ 246 (346) ++++.|++|.++|||+.....+++|||++|..|++++.+++...+.+ |.|++|+|+|||+||+||.+.+.|++| T Consensus 139 ~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~~~~~~~~G~ig~~~G~~Vv~s~~~p~~~~~~~~~ 218 (272) T protein:vir:36 139 ANVDGVQAALDIFNDEDAQAYVLIVNPKDAAKIRKDANAKNIGSEVGANALINGTYADVLGAQIVRSKKLAEGSALMFKI 218 (272) T ss_pred ccHHHHHHHHHHhhhcCCCceEEEEcHHHHHHHhcccccccccccccccceeeeccceecCeeEEEeCCCCCCceeEEEE Confidence 68899999999999999999999999999999999998887765542 568999999999999999999999999 Q ss_pred EEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeee----eeeccccccCC Q lcl|NC_015254. 247 IFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRG----IAWQEKSVAGH 300 (346) Q Consensus 247 l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G----~s~~~~~~~~~ 300 (346) ++++||+++..++ .+.+|++|+++++.|.+++|+||++|+.- ...+- +|. T Consensus 219 ~~~~gA~~~~~~~-~~~vE~~R~~~~~~d~i~~~~~y~~~v~~~~~vv~~t~---~g~ 272 (272) T protein:vir:36 219 VSNSPALKLVLKR-GVQVETDRDIVTKTTVITADEHYAAYLYDLTKVVNITF---TGV 272 (272) T ss_pred EecccceeeeecC-CcccccccchhhcCcEEEEEEEEEEEEEcCccEEEEee---cCC Confidence 9999999997765 56799999999999999999999988753 33332 233 No 15 >protein:vir:96833 Length: 275 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1642 # MgeName: EW # Cross-refs: genbank:acc:YP_240157;genbank:gi:66395822;genbank:GeneID:5133174 Probab=100.00 E-value=9e-61 Score=349.78 Aligned_cols=264 Identities=17% Similarity=0.175 Sum_probs=224.1 Q ss_pred eecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccch Q lcl|NC_015254. 11 KFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTP 90 (346) Q Consensus 11 ~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~ 90 (346) |=-+ +.|+++|||+||||++||.+++.++++| ++++..+.. +.++||++|+||+|++| |+++++.+|. 
+|++ T Consensus 1 ~~~~-~~T~l~d~i~PEv~~~~v~~~~~~~~~~--~~~~~~~~~---l~g~~G~tv~iP~~~~i-g~a~~~~~g~-~i~~ 72 (275) T protein:vir:96 1 MALE-NMTKLANMVNPEVLAPMMQAELDKKLKF--AQFADIDNT---LVGQPGNTITFPAFVYS-GDAKVVPEGE-EIPI 72 (275) T ss_pred CCCc-ccchhhhhhchHHHHHHHHHHHHHhhhh--cccceeccc---ccCCCCCEEEeeeeccC-CccccccCCC-Ccch Confidence 3223 3599999999999999999999999998 445555543 45779999999999998 8889988885 8999 Q ss_pred hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccc Q lcl|NC_015254. 91 GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDD 170 (346) Q Consensus 91 ~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~ 170 (346) ++|+++++.+++++++++|.++|++.+.+++||++++.+|++.+|++++|+++++.|++... . ... T Consensus 73 ~~lt~~~~~~~i~~~~~~~~i~D~~~~~~~~d~~~~~~~~~a~~~a~~~d~~ll~~l~~a~~---------~-----~~~ 138 (275) T protein:vir:96 73 DLIETKKRQATIRKIGKGTVLTDEALLSGYGDPKGEAVRQHGLAIANKVDNDVLEALQGATL---------K-----VEA 138 (275) T ss_pred hhcccceeeEEeehhcccccccHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhcccc---------c-----ccc Confidence 99999999999999999999999999999999999999999999999999999999975321 1 123 Q ss_pred ccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhccccc-------CceeeEEeceEEEEeCCCccCCCce Q lcl|NC_015254. 
171 SYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSD-------NKKFPTYMGKRVIVDDGLPAKDGVY 243 (346) Q Consensus 171 ~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~-------~~~i~~~~G~~VVvdD~~p~~~g~y 243 (346) ..++++.|++|.++|||+.+.+++++|||.+|+.|+|++.++|++.++ +|.|++|+|++||+||++|+ + T Consensus 139 ~~~~~d~i~dA~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~~~~~G~ig~~~G~~Vi~s~~~p~----~ 214 (275) T protein:vir:96 139 DITKLAGLQTAIDKFNDEDLEPMVLFVNPLDAGKLRASATDNFTRATLLGDNVIVKGAFGEALGAIIVRSNKIKE----G 214 (275) T ss_pred cccCHHHHHHHHHHhccccCCccEEEeCHHHHHHHHhcccccccccccccccceeccccceecCeeEEEeCCCCc----c Confidence 447899999999999999999999999999999999999899887765 34689999999999999985 4 Q ss_pred EEEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeee----eeeccccccCCCC Q lcl|NC_015254. 244 TSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRG----IAWQEKSVAGHSP 302 (346) Q Consensus 244 tt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G----~s~~~~~~~~~sP 302 (346) ++|++++||+++..++ .+.+|++|++.++.|.++.|+||++|+.- ...+. +.+|... T Consensus 215 t~~i~~~gA~~~~~~~-~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~~~~vv~~t~-~~~~~~~ 275 (275) T protein:vir:96 215 EAILAKRGAVKLITKR-DFFLETERHASHKSTALFSDKHYVAYLYDESKVVKITK-SASGLGV 275 (275) T ss_pred eEEEEeccceeeeecC-CcccccccchhhcCcEEEEeEEEEEEEEcCccEEEEEe-cccccCC Confidence 7899999999998754 56799999999999999999999999863 12221 1122222 No 16 >protein:vir:94494 Length: 274 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1508 # MgeName: 88 # Cross-refs: genbank:acc:YP_240676;genbank:gi:66396348;genbank:GeneID:5133758 Probab=100.00 E-value=6.5e-60 Score=345.07 Aligned_cols=267 Identities=16% Similarity=0.178 Sum_probs=224.5 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|++.|+++|+|+||||++||.+++.++++| ++++..+.. 
+.++||++|+||+|+++ |+++++.+|+ ++++++ T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~l~~--~~~~~~d~~---l~g~~G~tv~iP~~~~~-g~a~~~~~g~-~i~~~~ 73 (274) T protein:vir:94 1 MPQGLTKTSDQIIPEVLAPMMQAQLEKKLRF--ASFAEVDST---LQGQPGDTLTFPAFVYS-GDAQVVAEGE-KIPTDI 73 (274) T ss_pred CCccceehhheechHHHHHHHHHhhhhhhhh--cccceeccc---ccCCCCCEEEEeeecCC-CccccccCCC-cccccc Confidence 3788999999999999999999999777655 777777654 34679999999999988 8888888885 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) ++++++.+++++++++|+++|++.+.+++|||+++.+|++.+|++++|+++++.|++. ...+ .... T Consensus 74 lt~~~~~~~i~~~~~~~~i~D~~~~~~~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a---------~~~~-----~~~~ 139 (274) T protein:vir:94 74 LETKKREAKIRKIAKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGA---------KLTV-----NADI 139 (274) T ss_pred cccceeEEEeeeecceecccHHHHHhccchHHHHHHHHHHHHHHHHHHHHHHHHHhcc---------Cccc-----cccc Confidence 9999999999999999999999999999999999999999999999999999998752 1111 2345 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC-------ceeeEEeceEEEEeCCCccCCCceEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-------KKFPTYMGKRVIVDDGLPAKDGVYTS 245 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-------~~i~~~~G~~VVvdD~~p~~~g~ytt 245 (346) ++++.|++|.++|||+...+++++|||.+|..|+++++++|++.++. 
|.|++|+|++||+||++| +|++ T Consensus 140 ~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~p----~~t~ 215 (274) T protein:vir:94 140 TKLNGLQSAIDKFNDEDLEPMVLFVNPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAIIVRTNKLE----AGTA 215 (274) T ss_pred cCHHHHHHHHHHhhccCCCceEEEeCHHHHHHHHhhhhhhccccCcccccceeccccceecCeeEEEcCCCC----cceE Confidence 78999999999999999999999999999999999999999988763 458999999999999998 4689 Q ss_pred EEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcC Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEK 309 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~ 309 (346) |++++||+++.+++ .+.+|++||+.++.|.|+.|+||++++.== ..+.-..++-+-|+- T Consensus 216 ~l~~~gA~~~~~~~-~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~~----~~vv~~t~~~~~~~~ 274 (274) T protein:vir:94 216 ILAKKGAVKLILKR-DFFLEVARDASTKTTALYSDKHYVAYLYDE----SKAVKITKGSGSLEM 274 (274) T ss_pred EEEeCcceEeeecC-CceeccccchhhcccEEEEEEEEEEEEEcC----CceEEEecCcccccC Confidence 99999999998765 567999999999999999999998876420 000000111111111 No 17 >protein:vir:97433 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1676 # MgeName: 92 # Cross-refs: genbank:acc:YP_240749;genbank:gi:66396420;genbank:GeneID:5133789 Probab=100.00 E-value=6.5e-60 Score=345.07 Aligned_cols=267 Identities=16% Similarity=0.178 Sum_probs=224.5 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|++.|+++|+|+||||++||.+++.++++| ++++..+.. 
+.++||++|+||+|+++ |+++++.+|+ ++++++ T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~l~~--~~~~~~d~~---l~g~~G~tv~iP~~~~~-g~a~~~~~g~-~i~~~~ 73 (274) T protein:vir:97 1 MPQGLTKTSDQIIPEVLAPMMQAQLEKKLRF--ASFAEVDST---LQGQPGDTLTFPAFVYS-GDAQVVAEGE-KIPTDI 73 (274) T ss_pred CCccceehhheechHHHHHHHHHhhhhhhhh--cccceeccc---ccCCCCCEEEEeeecCC-CccccccCCC-cccccc Confidence 3788999999999999999999999777655 777777654 34679999999999988 8888888885 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) ++++++.+++++++++|+++|++.+.+++|||+++.+|++.+|++++|+++++.|++. ...+ .... T Consensus 74 lt~~~~~~~i~~~~~~~~i~D~~~~~~~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a---------~~~~-----~~~~ 139 (274) T protein:vir:97 74 LETKKREAKIRKIAKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGA---------KLTV-----NADI 139 (274) T ss_pred cccceeEEEeeeecceecccHHHHHhccchHHHHHHHHHHHHHHHHHHHHHHHHHhcc---------Cccc-----cccc Confidence 9999999999999999999999999999999999999999999999999999998752 1111 2345 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC-------ceeeEEeceEEEEeCCCccCCCceEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-------KKFPTYMGKRVIVDDGLPAKDGVYTS 245 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-------~~i~~~~G~~VVvdD~~p~~~g~ytt 245 (346) ++++.|++|.++|||+...+++++|||.+|..|+++++++|++.++. 
|.|++|+|++||+||++| +|++ T Consensus 140 ~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~p----~~t~ 215 (274) T protein:vir:97 140 TKLNGLQSAIDKFNDEDLEPMVLFVNPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAIIVRTNKLE----AGTA 215 (274) T ss_pred cCHHHHHHHHHHhhccCCCceEEEeCHHHHHHHHhhhhhhccccCcccccceeccccceecCeeEEEcCCCC----cceE Confidence 78999999999999999999999999999999999999999988763 458999999999999998 4689 Q ss_pred EEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcC Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEK 309 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~ 309 (346) |++++||+++.+++ .+.+|++||+.++.|.|+.|+||++++.== ..+.-..++-+-|+- T Consensus 216 ~l~~~gA~~~~~~~-~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~~----~~vv~~t~~~~~~~~ 274 (274) T protein:vir:97 216 ILAKKGAVKLILKR-DFFLEVARDASTKTTALYSDKHYVAYLYDE----SKAVKITKGSGSLEM 274 (274) T ss_pred EEEeCcceEeeecC-CceeccccchhhcccEEEEEEEEEEEEEcC----CceEEEecCcccccC Confidence 99999999998765 567999999999999999999998876420 000000111111111 No 18 >protein:vir:96123 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1602 # MgeName: 37 # Cross-refs: genbank:acc:YP_240078;genbank:gi:66395742;genbank:GeneID:5133103 Probab=100.00 E-value=1.8e-57 Score=331.69 Aligned_cols=263 Identities=17% Similarity=0.214 Sum_probs=223.3 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|+.+|+++|||+||||++|+.+++.++++| ++++..+.+ +.++||+++++|+|+.+ |+++++.+|. 
++++++ T Consensus 1 ma~~~T~~~d~i~Pev~s~~v~~~~~~~~~~--~~~~~~~~~---l~g~~G~tv~ip~~~~~-g~~~~~~~g~-~i~~~~ 73 (274) T protein:vir:96 1 MAQGTTKVSNLIVPEVLAPMMQAELDKKLRF--AQFADIDST---LVGQPGDTLTFPAFTYS-GDAQVIAEGE-KIPVDQ 73 (274) T ss_pred CCccccchhhhhhhHHHHHHHHHHHHhhhhh--ccccccccc---ccCCCCCEEEEEeeccC-CCccccCCCC-cCchhh Confidence 2778899999999999999999999888877 666666644 44679999999999976 8888888885 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) ++++++.+++++++++|.++|++.+.+++|||+++++|++.+|++++|+++++.|++.. . +..... T Consensus 74 it~~~~~~~i~~~~~~~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~~d~~i~~~l~~a~---------~-----~~~~~~ 139 (274) T protein:vir:96 74 IGTSKREAKVRKIGKGTELTDEAVLSGFGDPQGEAVRQHGLAIANKVDNDVLEALKGAT---------L-----TVEADI 139 (274) T ss_pred cccceeEEEEEeeeceeeecHHHHHhhcchHHHHHHHHHHHHHHHHHHHHHHHHHhcCC---------C-----CcCccc Confidence 99999999999999999999999999999999999999999999999999999987521 1 122345 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC-------ceeeEEeceEEEEeCCCccCCCceEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-------KKFPTYMGKRVIVDDGLPAKDGVYTS 245 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-------~~i~~~~G~~VVvdD~~p~~~g~ytt 245 (346) ++++.|++|.++|||+...++.++|||.+|+.|+++++++|+++++. |.|++|+|++||+||++|. 
+++ T Consensus 140 ~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~~~~~g~ig~~~G~~Vi~s~~~p~----~t~ 215 (274) T protein:vir:96 140 TKLDGLQTAIDKFNDEDLEPMVLFVNPLDAGGLRTSASDNFTRPTQLGDNIIVKGAFGEALGAVIVRSNKLNK----GEA 215 (274) T ss_pred ccHHHHHHHHHHhcccCCCceEEEeCHHHHHHHHhcccccccccccccccceeecccceecCeeEEEcCCCCc----ceE Confidence 78999999999999999999999999999999999999999887753 4589999999999999985 478 Q ss_pred EEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeee---e-eeeccccccCCCCChHHh Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPR---G-IAWQEKSVAGHSPTNTEI 307 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~---G-~s~~~~~~~~~sPt~a~L 307 (346) |++++|||++..++ ++.+|++|++.++.|.++.|+||++++. | +..+.. .+.+ -+ T Consensus 216 ~l~~~gA~~~~~~~-~~~vE~~Rd~~~~~d~i~~~~~yg~~~~~~~~vv~~t~~-~~~~-----~~ 274 (274) T protein:vir:96 216 LLAKKGAVKLITKR-DFFLEKDRDASRKSTALYSDKHYVAYLYDESKVVKITKG-AGDE-----VM 274 (274) T ss_pred EEEeCcceeeeecC-CcccccccchhhcccEEEEeeEEEEEEEcCccEEEEEcC-cccc-----cC Confidence 99999999998765 4679999999999999999999987763 2 223211 1111 11 No 19 >protein:vir:93742 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1475 # MgeName: 55 # Cross-refs: genbank:acc:YP_240459;genbank:gi:66396126;genbank:GeneID:5133511 Probab=100.00 E-value=1e-55 Score=322.00 Aligned_cols=263 Identities=17% Similarity=0.180 Sum_probs=222.4 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|++.|+++|+|+||||++|+.+++.++++|.++ +..+.. 
+.+++|++|+||+|++| |+++++.+|+ ++++++ T Consensus 1 ma~~~T~~~~~iiPev~~~~v~~~~~~~~~~~~~--~~~~~~---l~g~~G~tv~ip~~~~~-g~~~~~~eg~-~i~~~~ 73 (274) T protein:vir:93 1 MPQGITKTSNQIIPEVLAPMMQAQLEKKLRFASF--AEVDST---LQGQPGDTLTFPAFVYS-GDAQVVAEGE-KIPTDI 73 (274) T ss_pred CCccceehhheechHHHHHHHHHHHHhhhhhccc--cccccc---ccCCCCCEEEEEeeccC-CCcccccCCC-cccccc Confidence 3889999999999999999999999999887554 444433 45679999999999988 7888888885 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) ++++++.+++++++++|.++|++.+.+++||++++.+|++.+|++++|+++++.|++.. ..+ .... T Consensus 74 it~~~~~~~i~~~~~~~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~~d~~~~~~~~~a~---------~~~-----~~~~ 139 (274) T protein:vir:93 74 LETKKREAKIRKIAKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAK---------LTV-----NADI 139 (274) T ss_pred cccceeEEEeeeecccccccHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhccc---------ccc-----cccc Confidence 99999999999999999999999999999999999999999999999999999986531 111 2334 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC-------ceeeEEeceEEEEeCCCccCCCceEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN-------KKFPTYMGKRVIVDDGLPAKDGVYTS 245 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~-------~~i~~~~G~~VVvdD~~p~~~g~ytt 245 (346) ++++.|++|.++|||+...+++++|||.+|..|+++++++|++.++. |.|++|+|++||+||.+|. 
|++ T Consensus 140 ~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~~G~ig~~~G~~Vi~s~~~p~----~t~ 215 (274) T protein:vir:93 140 TKLNGLQSAIDKFNDEDLEPMVLFINPLDAGKLRGDASTNFTRATELGDDIIVKGAFGEALGAIIVRTNKLEA----GTA 215 (274) T ss_pred cCHHHHHHHHHHhhhccCCccEEEeCHHHHHHHHhhhhhcccccccccccceeecccceecCeeEEEcCCCCc----ceE Confidence 78999999999999999999999999999999999999999887763 3589999999999999984 579 Q ss_pred EEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeee----eeeccccccCCCCChHHhcC Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRG----IAWQEKSVAGHSPTNTEIEK 309 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G----~s~~~~~~~~~sPt~a~L~~ 309 (346) |++++|||++..++ ++.+|++|++.++.+.+..++||++++.= ..-+... +-|+- T Consensus 216 ~l~~~gai~~~~~~-~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~~~~~v~~t~~~--------~s~~~ 274 (274) T protein:vir:93 216 ILAKKGAVKLILKR-DFFLEVARDASTKTTALYSDKHYVAYLYDESKAVKITKGS--------GSLEM 274 (274) T ss_pred EEEeCCeEEEEecC-CcccccccchhhcccEEEEEEEEEEEEEcCCceEEEeeCc--------cccCC Confidence 99999999998765 56799999999999999999999887642 1111111 11111 No 20 >protein:vir:80930 Length: 278 # NCBI annotation: Cps # Family: family:all:522 # MgeID: mge:1886 # MgeName: A500 # Cross-refs: genbank:acc:YP_001468392;genbank:gi:157324966;genbank:GeneID:5601363 Probab=100.00 E-value=1.3e-54 Score=316.06 Aligned_cols=266 Identities=15% Similarity=0.155 Sum_probs=219.9 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|+.+|+++|+|+||||++||.+++.++++|.++... +. .+.+++|++++||+|++| |+++++.+|. 
++++++ T Consensus 1 Ma~~~T~~~~~iiPev~s~~v~~~~~~~~v~~~~~~~--~~---~l~g~~G~tv~ip~~~~~-g~a~~~~~g~-~i~~~~ 73 (278) T protein:vir:80 1 MADLTTKLANLIDPEVMGPMISAKLPKAIKFGKIAPI--DN---SLEGQPGSEITVPKYKYI-GDAQDVAEGA-AIDYSA 73 (278) T ss_pred CCCcceehhheecHHHHHHHHHHHHHHhhhhccccee--cc---cccCCCCCEEEEeeeccC-CcceeecCCC-cCcccc Confidence 3778999999999999999999999999888665433 32 244678999999999998 8888888885 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) |+++++.+++++++++|+++|++.+.+++||++++++|++.+|++++|+++++.|+|........ .+ ..... T Consensus 74 lt~~~~~~~i~~~~~a~~v~D~~~~~~~~d~~~~~~~~~a~~~a~~~d~~l~~~l~~a~~~~~~~-----~t---~~~~~ 145 (278) T protein:vir:80 74 LETESVKHGIKKAGKGVKLTDESVLSGYGDPVEEAQKQIRMAIASKVDNDILEEALTTTLEVKGA-----IN---IGLID 145 (278) T ss_pred cccceeeEeeehhhccccccHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHHhccccccccc-----cc---cchhh Confidence 99999999999999999999999999999999999999999999999999999998754322111 11 11122 Q ss_pred ccHHHHHHHHHHhCcccc-CceEEEEchHHHHHHHhhhhhhhccccc-------CceeeEEeceEEEEeCCCccCCCceE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEG-KLTGIAMHSQTEMNLRKQGLIEFMLDSD-------NKKFPTYMGKRVIVDDGLPAKDGVYT 244 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~-~~~~ivmhS~~~~~L~~~~li~~~~~s~-------~~~i~~~~G~~VVvdD~~p~~~g~yt 244 (346) ..++.|.+|..+|+++.. ....++|||.+|+.|++++++++++.++ +|.|++|+|++|++||++|. 
++ T Consensus 146 ~~~~~~~da~~~l~~~~~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~~g~~~~~~G~ig~~~G~~Vi~s~~~p~----~t 221 (278) T protein:vir:80 146 KIENTFTDAPDAIEDESITTTGVLFLNYKDTAKLREEAAGSWTKASQLGDDLLVKGAFGELLGWEIVRTKKLAD----GN 221 (278) T ss_pred hHHHHHHHHHHhhcccCCCcccEEEECHHHHHHHHhhhhhhccccccccccceeeccceeecceeEEEcCCCCc----ce Confidence 346889999999988764 3457999999999999999999887665 24589999999999999985 47 Q ss_pred EEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeee-e---eeeccccccCC Q lcl|NC_015254. 245 SYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPR-G---IAWQEKSVAGH 300 (346) Q Consensus 245 t~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~-G---~s~~~~~~~~~ 300 (346) +|++++|||++..++ ++.+|++|+++++.+.|+.|+||++++. . +..+.. +|. T Consensus 222 ~~l~~~gAi~~~~~~-~~~vE~~Rd~~~~~d~i~~~~~yg~~v~~~~~~v~it~~--a~~ 278 (278) T protein:vir:80 222 ALAVKAGALKTFLKR-NLLAESGRDMDHKLTKFNADQHYAVALVDETKAVKVVPV--AGN 278 (278) T ss_pred EEEEeccceeeeecC-CcccccccchhhccceeeeeeEEEEEEEcCcceEEEeec--cCC Confidence 899999999998776 4679999999999999999999998874 1 223211 111 No 21 >protein:vir:3033 Length: 272 # NCBI annotation: major capsid protein # Family: family:all:522 # MgeID: mge:61 # MgeName: PhiNIH1.1 # Cross-refs: genbank:acc:NP_438146;genbank:gi:16271809;genbank:GeneID:929235 Probab=100.00 E-value=4.6e-48 Score=280.12 Aligned_cols=261 Identities=20% Similarity=0.279 Sum_probs=220.8 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|+.+|+++|+|+||+|.+|+.+++.++++|.+ ++..+.. +.+.+|+++++|+|..+ |+++.+.||. 
.+++++ T Consensus 1 MA~~~T~~~~~~iPev~s~~v~~~~~~~~~~~~--~~~~~~~---~~g~~G~tv~iP~~~~~-~~a~~v~eg~-~i~~~~ 73 (272) T protein:vir:30 1 MAVGTTKMAQMLDPEVLADMIDAEVGKAIRFAP--LAEVDTT---LEGQPGTTLTVPKWDYI-GDAEDVAEGE-AIPMTQ 73 (272) T ss_pred CCCccccchheechHHHHHHHHHHHHHHhhhhc--ccccccc---ccCCCCCEEEEEEecCC-CCcccccCCC-cccccc Confidence 377889999999999999999999999988744 4444432 34568999999999986 7888888985 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) ++.++..+++++++++|.++|+....+..|++.++.+|++.+|++++|+++++.|+|.... .... T Consensus 74 ~~~~~~~~~~~~~~~~~~itd~~~~~s~~d~~~~~~~~~~~~~a~~~d~~i~~~~~~a~~~---------------~~~~ 138 (272) T protein:vir:30 74 LGFKKTTMTIKKAGKGVEITDEAILSGYGDPVGQAAKQIVEAIDHKVDADVLDALSKSTQT---------------VEAT 138 (272) T ss_pred cccceEEEEeeeeeeeeeecHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHhcccccc---------------cccc Confidence 9999999999999999999999999999999999999999999999999999998764211 1123 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhccccc-------CceeeEEeceEEEEeCCCccCCCceEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSD-------NKKFPTYMGKRVIVDDGLPAKDGVYTS 245 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~-------~~~i~~~~G~~VVvdD~~p~~~g~ytt 245 (346) .+++.|++|.++|||+......++|||.+|..|++++++++...++ ++.+++++|+|||+|+.||. 
+++ T Consensus 139 ~t~d~i~da~~~l~~~~~~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~~~~~~~g~ig~i~G~~Vi~s~~~p~----~t~ 214 (272) T protein:vir:30 139 ATVDGVSKALDIFNDEDDAETVIVMNPADASTLRLDAAKEWLGATEVGANRVVSGVYGEVLGVQIVRSRKCPK----GTA 214 (272) T ss_pred cCHHHHHHHHHHHhccCCCccEEEEcHHHHHHHHHhccccccccccccccccccccchhhcCeeEEEcCCCCc----ceE Confidence 5789999999999999999999999999999999999998887664 24578999999999999985 478 Q ss_pred EEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeee----eeeccccccCCC Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRG----IAWQEKSVAGHS 301 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G----~s~~~~~~~~~s 301 (346) |++++||+++...+ .+.+|++|++.++.+.++.|+||++|+.- ++++-.. +++- T Consensus 215 ~~~~~~a~~~~~~~-~~~ve~~r~~~~~~~~i~~~~~~~~~v~~~~~vv~~t~~~-a~~~ 272 (272) T protein:vir:30 215 YMVRKGALRIMLKR-NTMVETDRDITKAINQIVANKHYGVYLYKAEKAVKITLKD-AAKK 272 (272) T ss_pred EEEcCCeEEEEecC-CceeeeccccccceeEEEEEEEEEEEEEcCCceEEEEecc-cccC Confidence 99999999998755 45799999999999999999999998753 4443322 2222 No 22 >protein:vir:9820 Length: 272 # NCBI annotation: putative major capsid/head protein # Family: family:all:522 # MgeID: mge:176 # MgeName: 315.4 # Cross-refs: genbank:acc:NP_795582;genbank:gi:28876339;genbank:GeneID:1257858 Probab=100.00 E-value=4.6e-48 Score=280.12 Aligned_cols=261 Identities=20% Similarity=0.279 Sum_probs=220.8 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|+.+|+++|+|+||+|.+|+.+++.++++|.+ ++..+.. +.+.+|+++++|+|..+ |+++.+.||. 
.+++++ T Consensus 1 MA~~~T~~~~~~iPev~s~~v~~~~~~~~~~~~--~~~~~~~---~~g~~G~tv~iP~~~~~-~~a~~v~eg~-~i~~~~ 73 (272) T protein:vir:98 1 MAVGTTKMAQMLDPEVLADMIDAEVGKAIRFAP--LAEVDTT---LEGQPGTTLTVPKWDYI-GDAEDVAEGE-AIPMTQ 73 (272) T ss_pred CCCccccchheechHHHHHHHHHHHHHHhhhhc--ccccccc---ccCCCCCEEEEEEecCC-CCcccccCCC-cccccc Confidence 377889999999999999999999999988744 4444432 34568999999999986 7888888985 899999 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccc Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSY 172 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~ 172 (346) ++.++..+++++++++|.++|+....+..|++.++.+|++.+|++++|+++++.|+|.... .... T Consensus 74 ~~~~~~~~~~~~~~~~~~itd~~~~~s~~d~~~~~~~~~~~~~a~~~d~~i~~~~~~a~~~---------------~~~~ 138 (272) T protein:vir:98 74 LGFKKTTMTIKKAGKGVEITDEAILSGYGDPVGQAAKQIVEAIDHKVDADVLDALSKSTQT---------------VEAT 138 (272) T ss_pred cccceEEEEeeeeeeeeeecHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHhcccccc---------------cccc Confidence 9999999999999999999999999999999999999999999999999999998764211 1123 Q ss_pred ccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhccccc-------CceeeEEeceEEEEeCCCccCCCceEE Q lcl|NC_015254. 173 FTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSD-------NKKFPTYMGKRVIVDDGLPAKDGVYTS 245 (346) Q Consensus 173 ~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~-------~~~i~~~~G~~VVvdD~~p~~~g~ytt 245 (346) .+++.|++|.++|||+......++|||.+|..|++++++++...++ ++.+++++|+|||+|+.||. 
+++ T Consensus 139 ~t~d~i~da~~~l~~~~~~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~~~~~~~g~ig~i~G~~Vi~s~~~p~----~t~ 214 (272) T protein:vir:98 139 ATVDGVSKALDIFNDEDDAETVIVMNPADASTLRLDAAKEWLGATEVGANRVVSGVYGEVLGVQIVRSRKCPK----GTA 214 (272) T ss_pred cCHHHHHHHHHHHhccCCCccEEEEcHHHHHHHHHhccccccccccccccccccccchhhcCeeEEEcCCCCc----ceE Confidence 5789999999999999999999999999999999999998887664 24578999999999999985 478 Q ss_pred EEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeee----eeeccccccCCC Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRG----IAWQEKSVAGHS 301 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G----~s~~~~~~~~~s 301 (346) |++++||+++...+ .+.+|++|++.++.+.++.|+||++|+.- ++++-.. +++- T Consensus 215 ~~~~~~a~~~~~~~-~~~ve~~r~~~~~~~~i~~~~~~~~~v~~~~~vv~~t~~~-a~~~ 272 (272) T protein:vir:98 215 YMVRKGALRIMLKR-NTMVETDRDITKAINQIVANKHYGVYLYKAEKAVKITLKD-AAKK 272 (272) T ss_pred EEEcCCeEEEEecC-CceeeeccccccceeEEEEEEEEEEEEEcCCceEEEEecc-cccC Confidence 99999999998755 45799999999999999999999998753 4443322 2222 No 23 >protein:vir:739 Length: 231 # NCBI annotation: major structural protein 4 # Family: family:all:522 # MgeID: mge:14 # MgeName: Tuc2009 # Cross-refs: genbank:acc:NP_108716;genbank:gi:13487838;genbank:GeneID:920884 Probab=100.00 E-value=1.5e-48 Score=282.80 Aligned_cols=224 Identities=13% Similarity=0.156 Sum_probs=188.1 Q ss_pred HHhhCCCcEEEecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHH Q lcl|NC_015254. 57 ELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWN 136 (346) Q Consensus 57 ~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~ 136 (346) +-.-..|+||++|.| | |+++++.||. 
.+++++|+++++.+++++++|+|.++|++.+.+++||++++++|++++++ T Consensus 1 ~~~~~~Gdtit~P~~--i-Gda~~v~eG~-~i~~~~l~~t~~~atIk~~gk~~~itD~a~l~~~gDp~~ea~~Q~~~~iA 76 (231) T protein:vir:73 1 ENGINLANLCEYPND--I-GDAADVAEGG-EISLDKIGTTTKSVTIKKAAKGTEITDEAALSGYGDPIGESNKQLGLSLA 76 (231) T ss_pred CccccCCceEEeccc--c-cchhhhcCCC-cCChhhccccceeeeEeeeccceeeeHHHHhhccCchHHHHHHHHHHHHH Confidence 112347999999988 6 8999999996 89999999999999999999999999999999999999999999999999 Q ss_pred HHHHHHHHHHHHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhccc Q lcl|NC_015254. 137 RRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLD 216 (346) Q Consensus 137 ~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~ 216 (346) +++|++++++|++... . .+..++++.+++|.++|||+.+...+++|||+++++||+..-..+... T Consensus 77 ~kvD~di~~~~~~a~l---------~------~~~~~t~d~i~~A~~~fgde~~~~~vivv~p~~~~~Lrk~~~~~~~~~ 141 (231) T protein:vir:73 77 NKVDDDLLKAAKTTSQ---------T------VSTKANVDGVQAALDIFNDEDAQAYVLIVNPKDAAKIRKDANAKNIGS 141 (231) T ss_pred HhhhHHHHHhhccccc---------c------ccccccHHHHHHHHHHhccccccceEEEEcchHHHhhhhccchhhhhh Confidence 9999999999875321 1 122479999999999999999999999999999999999654433332 Q ss_pred c--c----CceeeEEeceEEEEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeee Q lcl|NC_015254. 217 S--D----NKKFPTYMGKRVIVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGI 290 (346) Q Consensus 217 s--~----~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~ 290 (346) . 
+ +|.||+++|++||+||++|.+.+.+..|+.++||+++..+++ +.+|++||+.++.+.+++++||++|++-= T Consensus 142 ~~g~~i~~~G~iG~i~G~~Vi~S~~~~~~~~~~~~~i~~~gAl~~~~k~~-~~vEtdRd~~~k~~~i~~~~~y~v~l~~~ 220 (231) T protein:vir:73 142 EVGANALINGTYADVLGAQIVRSKKLAEGSALMFKIVSNSPALKLVLKRG-VQVETDRDIVTKTTVITADEHYAAYLYDL 220 (231) T ss_pred hhccceeeecccceEcceEEEEcCCCCCCceeeeeEEeeccceeeeeccc-ceeeccccccccccEEEEeEEEEEEEEcC Confidence 1 2 467999999999999999999999999999999999987665 56999999999999999999999998631 Q ss_pred e-eccccccCC Q lcl|NC_015254. 291 A-WQEKSVAGH 300 (346) Q Consensus 291 s-~~~~~~~~~ 300 (346) + .-.-+..|. T Consensus 221 ~~vv~~t~~g~ 231 (231) T protein:vir:73 221 TKVVNITFTGV 231 (231) T ss_pred ccEEEEEeecC Confidence 1 000011222 No 24 >protein:vir:7990 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:151 # MgeName: Che8 # Cross-refs: genbank:acc:NP_817344;genbank:gi:29565772;genbank:GeneID:1258978 Probab=99.88 E-value=3.7e-24 Score=149.02 Aligned_cols=259 Identities=14% Similarity=0.119 Sum_probs=184.8 Q ss_pred ee-eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhccccee Q lcl|NC_015254. 20 IA-DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISAAKD 98 (346) Q Consensus 20 l~-d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~ 98 (346) ++ +.+.||+|++++.+++.+.+.|.+ ++..+-+ .....|++|++|.|..+ +..+...++ +.++++.++..+. T Consensus 1 MA~~~~~pei~~~~v~~~~~~~lv~~~--l~~~~~~---~~~~~GdTv~ip~~~~~-~~~d~~~~~-~~~~~~~~~~~~~ 73 (273) T protein:vir:79 1 MAFNNFIPELWSDMLLEEWTAQTVFAN--LVNREYE---GIASKGNVVHIAGVVAP-TVKDYKAAG-RQTSADAISDTGV 73 (273) T ss_pred CcchhhhHHHHHHHHHHHHHhhccchh--hhhcccc---ccccCCcEEEEeecCcc-cccccccCC-CccCccccccceE Confidence 33 346899999999999998887633 3333322 23557999999999987 444444454 3678889999888 Q ss_pred EEEEE-eecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccccccccHHH Q lcl|NC_015254. 
99 IARLH-MRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFTGDT 177 (346) Q Consensus 99 ~a~~~-~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~ 177 (346) ..++. .+..++.++|+.......|. .++.+|.+...++++|+++++.+.+.-... . ..+.....-.++. T Consensus 74 ~~tid~~~~~~~~i~d~d~~~~~~~~-~~~~~~~~~ala~~vD~~i~~~~~~a~~~~-------~--~~~~~~~~~~~~~ 143 (273) T protein:vir:79 74 DLLIDQEKSIDFLVDDIDRVQVAGSL-EAYTRAGATALATDTDKFIADMLVDNGTAL-------T--GSAPSDADDAFDL 143 (273) T ss_pred EEEEeeecccceeeccHHHHhhcccH-HHHHHHHHHHHHHHHHHHHHHHHhhccccc-------c--cccccchhhHHHH Confidence 88874 57899999999888777775 679999999999999999999886521111 0 0011111124678 Q ss_pred HHHHHHHhCccc--cCceEEEEchHHHHHHHhhh--hhhhccccc-----CceeeEEeceEEEEeCCCccCCCceEEEEE Q lcl|NC_015254. 178 FLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQG--LIEFMLDSD-----NKKFPTYMGKRVIVDDGLPAKDGVYTSYIF 248 (346) Q Consensus 178 l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~--li~~~~~s~-----~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~ 248 (346) |.+|..+|+++. ..-+.++++|..|..|++.. +.+.....+ .|.|+.+.|++|++|+.+|...+ ++++.+ T Consensus 144 i~~a~~~ld~~~vP~~~R~lvv~p~~~~~Ll~~~~~~~~~~~~~~~~~l~~G~ig~~~G~~i~~s~~lp~~~~-~~~~a~ 222 (273) T protein:vir:79 144 IASALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGLRAGTIGNLLGARIVESNNLRDTDD-EQFVAF 222 (273) T ss_pred HHHHHHHhhhccCCccCcEEEECHHHHHHHhhchhhhhhhhhcccccceeeeEeeEEeceEEEecccccccCc-eEEEEE Confidence 999999998765 34578999999999999763 323222211 46789999999999999998766 567788 Q ss_pred cCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeee---eeeeeccccccCCC Q lcl|NC_015254. 249 GEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHP---RGIAWQEKSVAGHS 301 (346) Q Consensus 249 ~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~---~G~s~~~~~~~~~s 301 (346) .++|+++.. 
+ ...+|..|++.+..+.+..+.+|++.+ .|+---.++ + | T Consensus 223 ~~~A~~~a~-~-~~~~e~~r~~~~~~~~v~~~~~yg~~v~~p~~vv~~~~~-g--~ 273 (273) T protein:vir:79 223 HPSAAAYVS-Q-IDTVEALRDQDSFSDRIRALHVYGGKVVRPTGVVVFNKT-G--S 273 (273) T ss_pred eccceeeee-e-hhhhhcccCcccceeeeeeeeeeeeEEecCceEEEEecc-C--C Confidence 999998743 3 457899999998888888888776544 443322111 1 1 No 25 >protein:vir:9927 Length: 295 # NCBI annotation: hypothetical protein # Family: family:all:1178 # MgeID: mge:178 # MgeName: 315.6 # Cross-refs: genbank:acc:NP_795689;genbank:gi:28876459;genbank:GeneID:1258000 Probab=99.86 E-value=1.7e-24 Score=150.89 Aligned_cols=265 Identities=13% Similarity=0.086 Sum_probs=164.0 Q ss_pred ecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhc-cccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccch Q lcl|NC_015254. 12 FAAGKNTRIADVIVPEVFNKYVTERTAESSALLQS-GIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTP 90 (346) Q Consensus 12 ~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qS-gi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~ 90 (346) -|-.+.|++.|+.+|++. +++.+-....++|.+- ||....| -.-|++|++|+|.++ |+++++.||+ .|+. T Consensus 1 mAe~nlt~~~dL~~~~si-dfv~~f~~~i~~L~~~Lgi~r~~p------~a~G~tIt~pK~~~t-gda~dVaEGe-~Ipl 71 (295) T protein:vir:99 1 MAEKNLNTMADLGDIKSI-DFVNKFSKNINDLLKLLGVTRRET------LTNDLKIQTYKWEVT-LDQTDPGEGE-TIPL 71 (295) T ss_pred CCCcccccHhhccCceee-hhhHHhhhhHHHHHHHhccccccc------cccCCeEEeeeeeee-cccccccCCc-ccch Confidence 334458899999999987 4444433333334332 2322111 123999999999998 9999999996 8999 Q ss_pred hhcccce---eEEEEEeecCcceechHH-HhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecc Q lcl|NC_015254. 91 GNISAAK---DIARLHMRGKAWRTNDLA-KALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTE 166 (346) Q Consensus 91 ~~lt~~~---~~a~~~~~~k~~~~tD~a-~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~ 166 (346) ++++..+ ..+.+++.+|+. ||+| ++.+++||+++..+|+.+++++++++++++.|+.. 
+..++ T Consensus 72 skvt~~~~~t~t~kikK~rK~t--TdEAIqlsGygdpvgead~qL~~~ia~kId~D~~~~lkta---------t~t~t-- 138 (295) T protein:vir:99 72 SKVTRTKDKDYTVKWFKKRRAT--TAEAIARHGAARAITEADKRIMRELQNGIKDAFFTFLKTK---------PTKVK-- 138 (295) T ss_pred hhheeeeeeeeEEEeeeecccc--cHHHHHhcCCCchhHHHHHHHHHHHHHhhhHHHHHHhccC---------ceeee-- Confidence 9999764 566667777764 9999 58999999999999999999999999999999731 22222 Q ss_pred ccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccCc--eeeEEeceE-EEEeCCCccCCCc- Q lcl|NC_015254. 167 TGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNK--KFPTYMGKR-VIVDDGLPAKDGV- 242 (346) Q Consensus 167 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~--~i~~~~G~~-VVvdD~~p~~~g~- 242 (346) ++.-...++.+..+++.|.|+.+.-.++||||+++++||+.+-+++...++=| -+..++|++ ||+|.++|.+.-. T Consensus 139 -g~~lq~a~a~~~~al~~f~Ee~~~~~V~FVnP~D~a~yl~~A~~~~~~a~~fG~~~L~nfLG~q~II~S~kv~~G~~~a 217 (295) T protein:vir:99 139 -GVGLQKALSASWAKLATFNEFEGSPLVSFVSPLDVANYLGDTKVGADASNVFGMTLLKNFLGMQNVIVMPSVPEGKIYS 217 (295) T ss_pred -hhhHHHHHHHhhhhhhhcccccCCceEEEEehHHHHHHHhccccccchhhhhhhhhhhhhhccceEEEcccCCCceEEE Confidence 12223467778888999999998889999999999999999988887765422 345799997 9999999876420 Q ss_pred ------eEEEEEcC-----CeeEEeecCCccceeeeecCCc----ceeE------EEEeeEEeeeeeeeeeccccccCCC Q lcl|NC_015254. 243 ------YTSYIFGE-----GAFGLGNGEAPVPTETDREKLK----GNDI------LINRQHFLLHPRGIAWQEKSVAGHS 301 (346) Q Consensus 243 ------ytt~l~~~-----GAi~~~~~~~~~~vE~dRd~~~----g~~~------l~~r~~~~~~~~G~s~~~~~~~~~s 301 (346) ...|.-.. ++|.+..... -.+-+-++... .++. |..++.=.+-.-=+ T Consensus 218 T~~~Ni~~ay~~~~~g~l~~~f~~~~D~t-glIg~~h~~~~~~~t~et~~~~~~~lfpE~~dgiv~~tI----------- 285 (295) T protein:vir:99 218 TAVENLVFASLNVKGGDLGGLFADFTDET-GLIAAARNRQLSNLTYESVFFGANVLFAEIPEGVVEATI----------- 285 (295) T ss_pred eeccceEEEEecCCchhhhhhhhhccCcc-cceEEEeccccceeeehhhhHhHHHhcccccceEEEEEE----------- Confidence 01111111 1221111000 00111111111 1111 11111111111001 Q ss_pred CChHHhcCCcCceeeecccccceEEEEEecccccccC Q lcl|NC_015254. 
302 PTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKK 338 (346) Q Consensus 302 Pt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~ 338 (346) +--++|+.-+ T Consensus 286 ---------------------------~~~~~~~~~~ 295 (295) T protein:vir:99 286 ---------------------------EAAAVPGIGG 295 (295) T ss_pred ---------------------------ecCcCCCCCC Confidence 1111111111 No 26 >protein:vir:105822 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1636 # MgeName: PMC # Cross-refs: genbank:acc:YP_655767;genbank:gi:109522090;genbank:GeneID:4157630 Probab=99.85 E-value=9.6e-23 Score=141.29 Aligned_cols=259 Identities=13% Similarity=0.123 Sum_probs=182.0 Q ss_pred ee-eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhccccee Q lcl|NC_015254. 20 IA-DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISAAKD 98 (346) Q Consensus 20 l~-d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~ 98 (346) ++ +.+.||+|++.+.+++.+.+.|.+ ++..+-+. ....|+++++|.+..+ +..+...++ +.++++.++..+. T Consensus 1 MA~~~~~pe~~~~~v~~~~~~~lv~~~--l~~~~~~~---~~~~Gdtv~ip~~~~~-~~~d~~~~~-~~~~~~~~~~~~~ 73 (273) T protein:vir:10 1 MAFNNFIPELWSDMLLEEWTAQTVFAN--LVNREYEG---TASKGNVVHIAGVVAP-TVKDYKAAG-RQTSADAISDTGV 73 (273) T ss_pred CcchhhhHHHHHHHHHHHHHhhhccch--hhcccccc---ccccCceEEEeecccc-cccccccCC-CccCccccccceE Confidence 33 446799999999999988876633 33333221 1346999999999987 444333333 3578888998888 Q ss_pred EEEE-EeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccccccccHHH Q lcl|NC_015254. 99 IARL-HMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFTGDT 177 (346) Q Consensus 99 ~a~~-~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~ 177 (346) ..++ +.+..++.++|+.......| ++++.+|.+.+.++++|+++++.+.+...... ..+..+..-.++. 
T Consensus 74 ~~tid~~~~~~~~i~d~d~~~~~~~-~~~~~~~~~~alA~~vD~~i~~~~~~a~~~~~---------~~~~~~~~~~~~~ 143 (273) T protein:vir:10 74 DLLIDQEKSIDFLVDDIDRVQVAGS-LEAYTRAGATALATDTDKFIADMLVDNGTALT---------GSAPTDADDAFDL 143 (273) T ss_pred EEEEeeeeecceEeecHHHhhhhcc-HHHHHHHHHHHHHHHHHHHHHHHHhccccccc---------cccccchhHHHHH Confidence 8777 45799999999888877777 46799999999999999999998865321110 0111111123688 Q ss_pred HHHHHHHhCccc--cCceEEEEchHHHHHHHhhh-hh-hhccccc-----CceeeEEeceEEEEeCCCccCCCceEEEEE Q lcl|NC_015254. 178 FLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQG-LI-EFMLDSD-----NKKFPTYMGKRVIVDDGLPAKDGVYTSYIF 248 (346) Q Consensus 178 l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~-li-~~~~~s~-----~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~ 248 (346) |.+|..+|.++. ..-+.++++|..|..|++.. .+ +.....+ .|.|+.+.|++|++|+.+|...+ ++.+.+ T Consensus 144 i~~a~~~ld~~~vP~~~R~lvv~p~~~~~L~~~~~~~~~~~~~~~~~~l~~G~ig~i~G~~v~~s~~lp~~~~-~~~~~~ 222 (273) T protein:vir:10 144 IAKALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGLRAGTIGNLLGARIVESNNLRDTDD-EQFVAF 222 (273) T ss_pred HHHHHHHhhhcCCCcCCCEEEECHHHHHHHhcchhhhhhhhccccccceeeeeeeEEeceEEEEecccccCCc-cEEEEE Confidence 999999998765 34578999999999999864 23 2222111 36789999999999999998765 567788 Q ss_pred cCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeee---eeeeeccccccCCC Q lcl|NC_015254. 249 GEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHP---RGIAWQEKSVAGHS 301 (346) Q Consensus 249 ~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~---~G~s~~~~~~~~~s 301 (346) .++|+++.. 
+ ...+|..|++.+..+.+..+.+|++.+ -|.---.++ + | T Consensus 223 ~~~A~~~a~-q-~~~~e~~r~~~~~~~~v~~~~~yg~~v~~~~~~~~l~~~-g--~ 273 (273) T protein:vir:10 223 HPSAAAYVS-Q-IDTVEALRDQDSFSDRIRALHVYGGKVVRPTGVVVFNKT-G--S 273 (273) T ss_pred eccceeeee-e-eehhhcccCCCcceeeeeeeeeeeeeEeccceEEEEecc-C--C Confidence 999998753 3 457899999998888888888776543 443221111 1 1 No 27 >protein:vir:102605 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1661 # MgeName: Llij # Cross-refs: genbank:acc:YP_655002;genbank:gi:109392192;genbank:GeneID:4157227 Probab=99.85 E-value=9.6e-23 Score=141.29 Aligned_cols=259 Identities=13% Similarity=0.123 Sum_probs=182.0 Q ss_pred ee-eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhccccee Q lcl|NC_015254. 20 IA-DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISAAKD 98 (346) Q Consensus 20 l~-d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~ 98 (346) ++ +.+.||+|++.+.+++.+.+.|.+ ++..+-+. ....|+++++|.+..+ +..+...++ +.++++.++..+. T Consensus 1 MA~~~~~pe~~~~~v~~~~~~~lv~~~--l~~~~~~~---~~~~Gdtv~ip~~~~~-~~~d~~~~~-~~~~~~~~~~~~~ 73 (273) T protein:vir:10 1 MAFNNFIPELWSDMLLEEWTAQTVFAN--LVNREYEG---TASKGNVVHIAGVVAP-TVKDYKAAG-RQTSADAISDTGV 73 (273) T ss_pred CcchhhhHHHHHHHHHHHHHhhhccch--hhcccccc---ccccCceEEEeecccc-cccccccCC-CccCccccccceE Confidence 33 446799999999999988876633 33333221 1346999999999987 444333333 3578888998888 Q ss_pred EEEE-EeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccccccccHHH Q lcl|NC_015254. 99 IARL-HMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFTGDT 177 (346) Q Consensus 99 ~a~~-~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~ 177 (346) ..++ +.+..++.++|+.......| ++++.+|.+.+.++++|+++++.+.+...... ..+..+..-.++. 
T Consensus 74 ~~tid~~~~~~~~i~d~d~~~~~~~-~~~~~~~~~~alA~~vD~~i~~~~~~a~~~~~---------~~~~~~~~~~~~~ 143 (273) T protein:vir:10 74 DLLIDQEKSIDFLVDDIDRVQVAGS-LEAYTRAGATALATDTDKFIADMLVDNGTALT---------GSAPTDADDAFDL 143 (273) T ss_pred EEEEeeeeecceEeecHHHhhhhcc-HHHHHHHHHHHHHHHHHHHHHHHHhccccccc---------cccccchhHHHHH Confidence 8777 45799999999888877777 46799999999999999999998865321110 0111111123688 Q ss_pred HHHHHHHhCccc--cCceEEEEchHHHHHHHhhh-hh-hhccccc-----CceeeEEeceEEEEeCCCccCCCceEEEEE Q lcl|NC_015254. 178 FLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQG-LI-EFMLDSD-----NKKFPTYMGKRVIVDDGLPAKDGVYTSYIF 248 (346) Q Consensus 178 l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~-li-~~~~~s~-----~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~ 248 (346) |.+|..+|.++. ..-+.++++|..|..|++.. .+ +.....+ .|.|+.+.|++|++|+.+|...+ ++.+.+ T Consensus 144 i~~a~~~ld~~~vP~~~R~lvv~p~~~~~L~~~~~~~~~~~~~~~~~~l~~G~ig~i~G~~v~~s~~lp~~~~-~~~~~~ 222 (273) T protein:vir:10 144 IAKALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGLRAGTIGNLLGARIVESNNLRDTDD-EQFVAF 222 (273) T ss_pred HHHHHHHhhhcCCCcCCCEEEECHHHHHHHhcchhhhhhhhccccccceeeeeeeEEeceEEEEecccccCCc-cEEEEE Confidence 999999998765 34578999999999999864 23 2222111 36789999999999999998765 567788 Q ss_pred cCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeee---eeeeeccccccCCC Q lcl|NC_015254. 249 GEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHP---RGIAWQEKSVAGHS 301 (346) Q Consensus 249 ~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~---~G~s~~~~~~~~~s 301 (346) .++|+++.. 
+ ...+|..|++.+..+.+..+.+|++.+ -|.---.++ + | T Consensus 223 ~~~A~~~a~-q-~~~~e~~r~~~~~~~~v~~~~~yg~~v~~~~~~~~l~~~-g--~ 273 (273) T protein:vir:10 223 HPSAAAYVS-Q-IDTVEALRDQDSFSDRIRALHVYGGKVVRPTGVVVFNKT-G--S 273 (273) T ss_pred eccceeeee-e-eehhhcccCCCcceeeeeeeeeeeeeEeccceEEEEecc-C--C Confidence 999998753 3 457899999998888888888776543 443221111 1 1 No 28 >protein:vir:9875 Length: 296 # NCBI annotation: hypothetical protein # Family: family:all:1178 # MgeID: mge:177 # MgeName: 315.5 # Cross-refs: genbank:acc:NP_795637;genbank:gi:28876404;genbank:GeneID:1257935 Probab=99.74 E-value=3e-20 Score=127.65 Aligned_cols=263 Identities=12% Similarity=0.140 Sum_probs=148.1 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhc-cccccchhHHHHhhCCCcEE-EecccccCCCcc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQS-GIISNDKDLDELAKSGGNMI-NMPFWQDLTGED 78 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qS-gi~~~~~~~~~l~~~~G~ti-~~P~~~~l~g~a 78 (346) |-+. ..|--...|...|+=.+ .=-+|+.+-....+.|.+- ||...-| + .-|++| +.|.|.++ |++ T Consensus 1 ~~~~-----~~~~e~nlt~~~dl~~~-~siDf~~~f~~~i~~L~~~LGv~r~~p----l--a~GstIkt~k~~~y~-gda 67 (296) T protein:vir:98 1 MVTS-----RTYPEENLIKSTDLKYP-ITIDVTNKFQENISKLLEMLGVTRKIS----V--SEGMTLKTYAGYDVT-LAE 67 (296) T ss_pred CCCc-----cccCcCCCcchhhhhhh-hhhhhHHHHhhhHHHHHHHhhhccccc----c--cCCCEEeeccceeee-ecc Confidence 3221 23444445555555222 1124444433333333321 2222111 1 139999 77889998 899 Q ss_pred cccCCCccccchhhcccce---eEEEEEeecCcceechHH-HhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhh Q lcl|NC_015254. 79 EILDDGEGALTPGNISAAK---DIARLHMRGKAWRTNDLA-KALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASG 154 (346) Q Consensus 79 e~~~dg~~~it~~~lt~~~---~~a~~~~~~k~~~~tD~a-~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~ 154 (346) +++.||+ .|+.+++++.+ ..+.+++.+|+. 
||+| ++.+++||+++.-+|+.+++++++++++++.|++..+ T Consensus 68 ~dVaEGe-~Iplskvt~~~~~t~t~~ikK~rK~t--TdEAIqlsGyg~aVgetd~qL~~~iq~kId~d~~t~LktaT~-- 142 (296) T protein:vir:98 68 GNVPEGE-VIPLSKVERKIHSEKKIELKKYRKAT--TGEDIQMYGSNEAVTNTDNALVRQLQKKIRTDFVTALKTGTG-- 142 (296) T ss_pred ccccCCc-ccchhhheeeecceEEEEeecccccc--CHHHHHhhcCCchhHHHHHHHHHHHHHhhhHHHHHHHhcccc-- Confidence 9999996 89999999864 666778888885 9999 5899999999999999999999999999999974321 Q ss_pred hhhhcceeeeccccccccccHHHHHHHH--------HHhCccccCceEEEEchHHHHHHHhhhhhhhcccccCceee-EE Q lcl|NC_015254. 155 ALDSNKLDVSTETGDDSYFTGDTFLSAT--------YKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNKKFP-TY 225 (346) Q Consensus 155 ~~~~~~~dis~~~~~~~~~~~~~l~~A~--------~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~~i~-~~ 225 (346) .++ -+.+.|..|+ .+|+|+.+.-.++||||..++++++..-+..-... ++.+. .+ T Consensus 143 -------t~~--------~t~~~lQ~Ala~~~~~l~~~feded~~~~V~FVnP~D~a~ylg~a~it~qt~f-G~tyl~nf 206 (296) T protein:vir:98 143 -------TQD--------ALGAGLQGALASAWGKLQVLFEDYGSERAIVFANSLDVAEYIAKAGITTQTAF-GLTYLVDF 206 (296) T ss_pred -------eee--------echhhHHHHHHHHhhhhhhhccccCCCceEEEEehHHHHHHhcCCccchhhee-chhhhhhc Confidence 111 1234444444 89999988889999999999999988755321111 34444 49 Q ss_pred eceEEEEeCCCccCCCceEE--------EEEcC-C----eeEEeecCCccceeeeecCCc----ceeEEEE-eeEEeeee Q lcl|NC_015254. 226 MGKRVIVDDGLPAKDGVYTS--------YIFGE-G----AFGLGNGEAPVPTETDREKLK----GNDILIN-RQHFLLHP 287 (346) Q Consensus 226 ~G~~VVvdD~~p~~~g~ytt--------~l~~~-G----Ai~~~~~~~~~~vE~dRd~~~----g~~~l~~-r~~~~~~~ 287 (346) +|..||+|.++|.+. .|.| |.-.. | +|.+.... .-.+-+-+++.. 
.++.+-+ -.-|-=.+ T Consensus 207 LG~~II~S~kV~~G~-~~~T~~~Ni~~ay~~~~~~~l~~~f~~~~d~-tglIGv~h~~~~~~~t~eT~~~~~~~lfpE~~ 284 (296) T protein:vir:98 207 TGTVIISTNDVTKGE-IWATVPENIIFAYINPNNSELAKEFNLYGDP-TGYIGMNHFQENTTLTIQTLLVSGMLMYPERI 284 (296) T ss_pred cccEEEEcCcCCCce-EEEeeecceEEEeecccccchhhhhcccccc-ccceEEEeccccceeeehhHhHhHHHhccccc Confidence 999999999999653 1211 11111 1 11111000 001111111111 1111000 00000000 Q ss_pred eeeeeccccccCCCCCh Q lcl|NC_015254. 288 RGIAWQEKSVAGHSPTN 304 (346) Q Consensus 288 ~G~s~~~~~~~~~sPt~ 304 (346) -|+=- +..+|.- T Consensus 285 dgiv~-----~tI~~~~ 296 (296) T protein:vir:98 285 DGIVK-----VTLTPGV 296 (296) T ss_pred ceEEE-----EEecCCC Confidence 11100 0011111 No 29 >protein:vir:94622 Length: 341 # NCBI annotation: PfWMP4_37 # Family: family:all:2203 # MgeID: mge:1525 # MgeName: Pf-WMP4 # Cross-refs: genbank:acc:YP_762667;genbank:gi:115304375;genbank:GeneID:5142322 Probab=99.74 E-value=9e-20 Score=124.99 Aligned_cols=311 Identities=11% Similarity=0.061 Sum_probs=173.5 Q ss_pred cceeeecCCceee------eeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 7 MNLQKFAAGKNTR------IADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 7 ~~~q~~~a~~~T~------l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) |- | +|+.|. ...-|+||+|+.++.+.+.+.+.|.+ . .++-+. ....|++|++|.++.. ...+ T Consensus 1 ~~---~-~~~~~~~~~~t~~v~~fipei~s~~i~~~l~~~~v~~~--~-~~d~~~---~~~~Gdtv~ip~~g~~--~~~d 68 (341) T protein:vir:94 1 MA---L-GNTITGPSINTQRGQQFIPEQWLSEVQMFRKAKMLDTS--V-VKTWGA---QVKKGDTFHVPRISEL--GVED 68 (341) T ss_pred Cc---c-hhhhccccccchhHHHHHHHHHHHHHHHHHHhhcchhh--c-cccccc---cccCCceEEEeccCcc--eeee Confidence 22 2 333333 45568899999999999988887633 2 233211 1235999999998765 3455 Q ss_pred cCCCccccchhhcccceeEEEE-EeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhc Q lcl|NC_015254. 
81 LDDGEGALTPGNISAAKDIARL-HMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSN 159 (346) Q Consensus 81 ~~dg~~~it~~~lt~~~~~a~~-~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~ 159 (346) ...+ +.+++++++..+...++ +.+..++.++|+....+..|++.++.+|.+.++++++|+++++.+.+.......... T Consensus 69 ~~~~-~~i~~~~~~~~~~~itiD~~~~~~~~i~d~d~~~~~~d~~~~~~~~~~~aLA~~~D~~i~~~~a~~~~~~~~~~~ 147 (341) T protein:vir:94 69 KATD-VPVGVQPVNDTDFVITVDTDRTTAVALDDLLEIQASYDLRAPYLEAMGYALAKDMTGSILGLRAAVQNTASQNVF 147 (341) T ss_pred ecCC-CccccccccCceEEEEEeeeeecceeechHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHhhhccccccCccc Confidence 5555 47899999998888887 667899999999999999999999999999999999999999877543221111100 Q ss_pred ceeeeccccccccccHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhh-hhccccc----CceeeEEeceEEEE Q lcl|NC_015254. 160 KLDVSTETGDDSYFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLI-EFMLDSD----NKKFPTYMGKRVIV 232 (346) Q Consensus 160 ~~dis~~~~~~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li-~~~~~s~----~~~i~~~~G~~VVv 232 (346) ...-...++....++++.|.+|..+|.++. ..-..++++|..|..|+++..+ ..-...+ .|.|+.+.|++|++ T Consensus 148 ~~~~~~~t~~~~~~~~~~i~~a~~~Lde~~VP~~gR~lvv~P~~~~~Ll~~~~~~~~~~~g~~~l~~G~ig~i~G~~V~~ 227 (341) T protein:vir:94 148 SSSNGAITGNGQAFSFAVFLAARRLLLEADVPEEKIVLLISPGQESALFTIPQFISKDFINNAPIAQGQIGSLMGVRVIR 227 (341) T ss_pred cCccccccCchhhhhHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHhhchhhhhhhccccchhheeeeeeEeceEEEE Confidence 000011223344577899999999998764 2447899999999999986433 2211111 36689999999999 Q ss_pred eCCCccCCCceEEEEEcCCeeEEeecCC-ccceeeeecCCccee---EEEE--eeEEeeeeeeeeeccc-------cccC Q lcl|NC_015254. 233 DDGLPAKDGVYTSYIFGEGAFGLGNGEA-PVPTETDREKLKGND---ILIN--RQHFLLHPRGIAWQEK-------SVAG 299 (346) Q Consensus 233 dD~~p~~~g~ytt~l~~~GAi~~~~~~~-~~~vE~dRd~~~g~~---~l~~--r~~~~~~~~G~s~~~~-------~~~~ 299 (346) |+.+|...+. .|..+.+-.......+ ....|..|...+-.+ -|.. +..+.+-+.--.|--. ..+. 
T Consensus 228 Sn~lp~~~~~--~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~gl~~~~~av~~~k~~~~~~~~~~~~~~~~~~~~ 305 (341) T protein:vir:94 228 TSLIGNNSAT--GWRNGAPTIAPAEATPGFTGSRYLPKQDSFTSLPATFTGNSRPVHTAVMCHMDWAAAVVSKAPRVTQS 305 (341) T ss_pred eccccccccc--cccccccceecccccccccccccccccccccccEEEEEEecccccceeeecchhhhcccccccccccc Confidence 9999976541 2222222111111111 111222222111000 0111 1111111111011000 0000 Q ss_pred CCC-ChHHhcCCcC--ceeeecccccceEEEEEecccc Q lcl|NC_015254. 300 HSP-TNTEIEKGNN--WKAVYESKNIRIVAFVHKNGVP 334 (346) Q Consensus 300 ~sP-t~a~L~~~~N--W~~v~~~K~i~iv~~~~k~~~~ 334 (346) -.| .-+++-.+.+ =-++..|+. .|.|++--++. T Consensus 306 ~~~~~~~~~i~~~~~~G~~~lrp~~--~v~~~~~~~~~ 341 (341) T protein:vir:94 306 FENREQVWLMVGRQAYGARLYRPLH--AVNIHTTGDTV 341 (341) T ss_pred chhhhhhhhhhhhhhhcccccCcce--eEEEecCcCCC Confidence 011 1122222332 001222322 23343333222 No 30 >protein:vir:106647 Length: 303 # NCBI annotation: ORF011 # Family: family:all:1178 # MgeID: mge:1557 # MgeName: 187 # Cross-refs: genbank:acc:YP_239493;genbank:gi:66395226;genbank:GeneID:4555801 Probab=99.70 E-value=4.1e-19 Score=121.41 Aligned_cols=261 Identities=11% Similarity=0.146 Sum_probs=152.3 Q ss_pred eeeeeeccchHHHHH-----HHhhHhHHHHhHhhc-cccccchhHHHHhhCCCcEEE---ecccccCCCcccccCCCccc Q lcl|NC_015254. 17 NTRIADVIVPEVFNK-----YVTERTAESSALLQS-GIISNDKDLDELAKSGGNMIN---MPFWQDLTGEDEILDDGEGA 87 (346) Q Consensus 17 ~T~l~d~i~Pev~~~-----yv~~~~~~~~~~~qS-gi~~~~~~~~~l~~~~G~ti~---~P~~~~l~g~ae~~~dg~~~ 87 (346) .+...|+++++.+++ |+.+-......|.+- ||...-| ++ -|.+|+ +|.|.++ |++.++.||+ . 
T Consensus 1 M~~e~nl~~~~dL~~a~siDF~~~f~~~i~~L~~~LGv~r~~p----la--~Gt~iktyK~~~~~y~-gda~dVaEGe-~ 72 (303) T protein:vir:10 1 MSAENNLINVEALGKAKSIDFANKLGVGLNKLFEALAIQNKIP----MN--VGSALKQYRFKVEDSE-KPNGDVAEGD-V 72 (303) T ss_pred CCCCcCCcchhhcccceeehhhhhhhhhHHHHHHHhhhhcccc----cc--CCceeeeeeeeceeec-cccccccCCc-c Confidence 566667777777763 444322222333222 2222111 11 266665 5555676 8999999996 8 Q ss_pred cchhhcccc---eeEEEEEeecCcceechHHH-hhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceee Q lcl|NC_015254. 88 LTPGNISAA---KDIARLHMRGKAWRTNDLAK-ALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDV 163 (346) Q Consensus 88 it~~~lt~~---~~~a~~~~~~k~~~~tD~a~-~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~di 163 (346) |+.++++.. ...+.+++.+|+. ||+|. +.+++||+++.-+|+.+++++++++++++.|+...+.... T Consensus 73 Iplskvt~~~~~t~~~~~kK~rK~t--TdEAIqlsGyg~aVgetd~qL~~~Iq~kIdnd~~~~lktaT~t~~~------- 143 (303) T protein:vir:10 73 IPLTKVTREQVDITELQFAKYRKST--SAEAIQAHGYDLAINQTDNEMIKYVQKKFRAKFFETLKSAIENGKR------- 143 (303) T ss_pred cchhhheeeecceEEEEeecccccc--cHHHHHhhcCCchhHHHHHHHHHHHHhhhhHHHHHHHhhccccccc------- Confidence 999999975 4566777888866 99995 8999999999999999999999999999999843211000 Q ss_pred eccccccccccHHHHHHHHHHhC------ccccCceEEEEchHHHHHHHhhhhhhhccccc-C-ceeeEEeceEEEEeCC Q lcl|NC_015254. 164 STETGDDSYFTGDTFLSATYKLG------DAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSD-N-KKFPTYMGKRVIVDDG 235 (346) Q Consensus 164 s~~~~~~~~~~~~~l~~A~~~~G------D~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~-~-~~i~~~~G~~VVvdD~ 235 (346) ......+++.|..|+..|- |+.+.-.++||||++++++|+.+-+. 
.+.++ | .-+..++|..||+|.+ T Consensus 144 ----t~~t~~s~~glq~Al~~~~~kl~~~~ed~~~~V~FvNP~Daa~yl~~A~i~-~~~t~fG~n~L~nfLG~~II~S~k 218 (303) T protein:vir:10 144 ----TNKTKLSAENLQGALSKGRANLSVLLDDEITPIAFVNPNDTAEYLANGFIN-STGAQFGVNLLTPYVGVKIVEFAD 218 (303) T ss_pred ----ccceeecHHHHHHHHHhhhhhccccccccccEEEEEchHHHHHHhhcCCcc-hhhhhhhhhhhhhhhcceEEEecc Confidence 0112257889999998764 45555579999999999999877655 33233 2 2245699999999999 Q ss_pred CccCCCc-----eE--EEEEcC----CeeEEeecCCccceeeeecCCc----ceeE------EEEeeEEeeeeeeeeecc Q lcl|NC_015254. 236 LPAKDGV-----YT--SYIFGE----GAFGLGNGEAPVPTETDREKLK----GNDI------LINRQHFLLHPRGIAWQE 294 (346) Q Consensus 236 ~p~~~g~-----yt--t~l~~~----GAi~~~~~~~~~~vE~dRd~~~----g~~~------l~~r~~~~~~~~G~s~~~ 294 (346) +|.+.-. .. .|.-.. .+|.+..-.. -.+-+-+++.. .++. |..++.=.+-. ...+ T Consensus 219 v~~G~~~~T~~~Ni~~ay~~~~g~l~~~f~~t~D~t-glIGv~h~~~~~~~t~eT~~~~~~~lfpE~~dgiv~--~ti~- 294 (303) T protein:vir:10 219 VPQGEVWMTVAENLNVAYANPRGELSRAFAFATDAT-GFVGVLHDIQPQRLTSDTIYASAISMFPENIDAVIK--VTIK- 294 (303) T ss_pred CCCceEEEeeccceEEEEecCchhhhhhhhhccccc-cceEEEeccccceeeehhHhHhHHHhcccccceEEE--EEEe- Confidence 9876421 11 122111 2222211110 01111222211 1111 11111111111 1111 Q ss_pred ccccCCCCC Q lcl|NC_015254. 295 KSVAGHSPT 303 (346) Q Consensus 295 ~~~~~~sPt 303 (346) +.-.++-|+ T Consensus 295 ~~e~~~~~~ 303 (303) T protein:vir:10 295 KDEAGELPS 303 (303) T ss_pred ccccCCCCC Confidence 112234566 No 31 >protein:vir:80180 Length: 381 # NCBI annotation: capsid protein # Family: family:all:2203 # MgeID: mge:1878 # MgeName: Pf-WMP3 # Cross-refs: genbank:acc:YP_001285797;genbank:gi:148747831;genbank:GeneID:5220456 Probab=99.70 E-value=2.2e-18 Score=117.34 Aligned_cols=323 Identities=15% Similarity=0.089 Sum_probs=179.7 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 
1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) |-.--.=+=+|=.+. .|.....|+||+|.+.+.+.+.+.+.|.. + +.+. .+...+|+++++|.++.. ...+ T Consensus 1 ~~~~~~~~~~~~~~~-~~t~~~~fiPev~s~~v~~~l~~~lv~~~--l-~~~~---~~~~~~GdTV~ip~~g~~--~a~d 71 (381) T protein:vir:80 1 MATIQGTGGYKGSAV-DLSNVQVFIPEVWSSEVRMFRDQKFAALE--A-TKKI---PFEGKKGDLIHIPNISRA--AVYD 71 (381) T ss_pred CceecccccccCccc-chhhHHhhhhHHHHHHHHHHHHHhhhhhh--c-cccc---cceeecCceEEeeccCcc--eeee Confidence 321111111122133 34444667799999999999888776632 2 2221 223457999999999865 3455 Q ss_pred cCCCccccchhhcccceeEEEE-EeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhh- Q lcl|NC_015254. 81 LDDGEGALTPGNISAAKDIARL-HMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDS- 158 (346) Q Consensus 81 ~~dg~~~it~~~lt~~~~~a~~-~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~- 158 (346) ..++ +.+++++++..+...++ ..+..++.++|+.......||+.++.+|++.++++++|+.+++.+..+........ T Consensus 72 ~~~g-~~i~~~~~~~~~~~itID~~~~~~~~Idd~D~~~~~~D~~~~~~~~~~~aLA~~~D~~i~~~~~~~~~~~~~~~~ 150 (381) T protein:vir:80 72 KQPQ-TPVNLQARTDSEFTFTVTKYKESSFMIEDIVNTQASYTLRQYYTKEAGYALARDMDNFALAHRAVINAFPSQRIY 150 (381) T ss_pred ecCC-CcccccccCCceEEEEEeeeeecceeechHHHHhhccChHHHHHHHHHHHHHHHHHHHHHHHHhhcccccccccc Confidence 5665 47899999998887777 55778899999999988889999999999999999999999988764432211100 Q ss_pred -cc------eeeeccccccccccHHHHHHHHHHhCcccc--CceEEEEchHHHHHHHhhh-hhhhcccc----cCceeeE Q lcl|NC_015254. 159 -NK------LDVSTETGDDSYFTGDTFLSATYKLGDAEG--KLTGIAMHSQTEMNLRKQG-LIEFMLDS----DNKKFPT 224 (346) Q Consensus 159 -~~------~dis~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~~~~~L~~~~-li~~~~~s----~~~~i~~ 224 (346) .. ......++....++++.|.+|..+|.++.- .-..++++|..|..|++.. ++...... ..|.|+. 
T Consensus 151 t~~~~i~~~~~~~~~t~~~~~~t~~~i~~a~~~Lde~~VP~egR~lvv~P~~~~~Ll~~~~~~~ad~~~~~~l~~G~Ig~ 230 (381) T protein:vir:80 151 SYDTTLGDGTVNAHLTGTPAPLTYAALLLAKQKLDEADVPQEGRIVMVSPAQYIDLLSINQFISVDFSQVKPVTSGVVGT 230 (381) T ss_pred cccccccccccccccccchhhHHHHHHHHHHHHHhhcCCCcCCcEEEeCHHHHHHHhhchhhhhhhhccchhhhceeeeE Confidence 00 000111223344678999999999987652 3468999999999999863 33222111 1467899 Q ss_pred EeceEEEEeCCCccCCCceEEEEEcCCeeEEeecC-CccceeeeecCCcceeEEEE---------eeEEeeee-eeeeec Q lcl|NC_015254. 225 YMGKRVIVDDGLPAKDGVYTSYIFGEGAFGLGNGE-APVPTETDREKLKGNDILIN---------RQHFLLHP-RGIAWQ 293 (346) Q Consensus 225 ~~G~~VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~-~~~~vE~dRd~~~g~~~l~~---------r~~~~~~~-~G~s~~ 293 (346) ++|++|++|..+|...+. .+.+..|+-...... ...+++- +.......+.. ..+..++. .|..|+ T Consensus 231 i~G~~Vv~Sn~lp~~~~t--~~~~~agap~~~~~~~~~~~~~g--~~s~~a~av~~~k~yd~~~~~~~~~~~~~~g~~~~ 306 (381) T protein:vir:80 231 ILGMEVIVTTQIGINSLT--GYVNGQGAPTQPTPGVLGSPYLP--DQAGTANVVNTGSASDLAVSLSYFGLPVFSGAGAT 306 (381) T ss_pred EcceEEEeeccccccccc--ceeeecccccccccccccccccc--ccccceeeeeeeeeeceeeeeeeccceeeecceee Confidence 999999999999986552 233333332211100 0001110 00111111111 12222222 333343 Q ss_pred cccccCCCCChHHhcCC----------cCceeeecccccceEEEEE---------------ecc------cccccCCCCC Q lcl|NC_015254. 294 EKSVAGHSPTNTEIEKG----------NNWKAVYESKNIRIVAFVH---------------KNG------VPGKKKETAP 342 (346) Q Consensus 294 ~~~~~~~sPt~a~L~~~----------~NW~~v~~~K~i~iv~~~~---------------k~~------~~~~~~~~~~ 342 (346) .. ...+|..-.... .-|+-+...++ +++ |+. -|..-..=-. T Consensus 307 ~~---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 378 (381) T protein:vir:80 307 AA---DGGQTLGSFGGANRWATAVVCHPDWLAVGVQQN-----VKSESSRETMYLADAFVTSCVYGAKVFRPDHCVLLHT 378 (381) T ss_pred ec---CCCceeeeehhhhhhhhhcccccccccccceeE-----eecccchhheeehhhhhhhhhhccccccchhhhhhhh Confidence 11 122333222112 22333222211 122 111 0112222223 Q ss_pred CCC Q lcl|NC_015254. 
343 EGI 345 (346) Q Consensus 343 ~~~ 345 (346) .|+ T Consensus 379 ~~~ 381 (381) T protein:vir:80 379 SGI 381 (381) T ss_pred cCC Confidence 344 No 32 >protein:vir:99075 Length: 392 # NCBI annotation: gp30 # Family: family:all:10837 # MgeID: mge:1671 # MgeName: Wildcat # Cross-refs: genbank:acc:YP_655895;genbank:gi:109521467;genbank:GeneID:4158040 Probab=99.61 E-value=7.7e-17 Score=108.92 Aligned_cols=306 Identities=11% Similarity=0.035 Sum_probs=165.2 Q ss_pred ee-eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc--cCCCccccchhhcccc Q lcl|NC_015254. 20 IA-DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI--LDDGEGALTPGNISAA 96 (346) Q Consensus 20 l~-d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~--~~dg~~~it~~~lt~~ 96 (346) ++ .+|+||+|++.+.+.+.+.+.|.+ + +....-..+.+..||+|++|.++......-. .+.....+++++++.. T Consensus 1 Ma~~~~~p~~~a~~~l~~l~~~lv~~~--l-v~~~~~~~~~~~~GdtV~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 77 (392) T protein:vir:99 1 MANAFSKPTAVVDTAIQMLQNELILTN--L-VWLNGIGDFAHKFNDTITVRVPAPSRGHTRKLRGAGAERNLTVSDFTED 77 (392) T ss_pred CccccccHHHHHHHHHHHHHhhccchh--h-hccccccccccCCCCeEEEeecccccceeeeccccccCCcccccccccc Confidence 33 358999999999999988887733 2 2222222344467999999999876433211 1112346888899888 Q ss_pred eeEEEE-EeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccccccccH Q lcl|NC_015254. 97 KDIARL-HMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFTG 175 (346) Q Consensus 97 ~~~a~~-~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~~ 175 (346) +...++ +...+++.++|+...+...|+++++.+|.+..++++++.+++..+.+........ 
.........+ T Consensus 78 ~~~~~id~~k~~~~~i~d~e~~~~~~~~~~~~~~~a~~ala~~vd~~i~~~~~~a~~~~~~~--------~~~~~~~~~~ 149 (392) T protein:vir:99 78 SFPVTLTDVAYHLGVLTDEELTFDLESFATQILPRQVRGVADILEEGVRDMIVGAPYEAAGA--------VHEVAPDEFF 149 (392) T ss_pred eEEEEEeeeeecceeechHHHhhhhhhhHHHHHHHHHHHHHHHHHHHHHHHHhccccccccc--------ccccChhhhH Confidence 877776 7789999999999999999999999999999999999999998887532211111 1111223467 Q ss_pred HHHHHHHHHhCccc-cCceEEEEchHHHHHHHhhh-hhhhcccc-------cCceeeEEeceEEEEeCCCccCCCceEEE Q lcl|NC_015254. 176 DTFLSATYKLGDAE-GKLTGIAMHSQTEMNLRKQG-LIEFMLDS-------DNKKFPTYMGKRVIVDDGLPAKDGVYTSY 246 (346) Q Consensus 176 ~~l~~A~~~~GD~~-~~~~~ivmhS~~~~~L~~~~-li~~~~~s-------~~~~i~~~~G~~VVvdD~~p~~~g~ytt~ 246 (346) +.|.+|..+|.+.. ..-+.++++|..+..|+++. ++...... .++.|+.++|++|+++..+|...+ + T Consensus 150 ~~i~~a~~~L~~~~vP~~R~~vv~p~~~~~l~~~~~~~~~~~~g~~~~~~l~~G~vg~i~G~~v~~s~~~~~~t~----~ 225 (392) T protein:vir:99 150 KGVNGARRALNELYIPQGRVLVVGTAVTEQILNDDRFIKYESQGQSAVSALQEARLGRIYGYEIVESTLIPHGDA----Y 225 (392) T ss_pred HHHHHHHHHHhhcCCCCCCEEEEcHHHHHHHhcccceeecccccchhhhhhhcceeeeeeeeEEEeecccccccc----e Confidence 89999999998754 23378999999999999874 33322221 246789999999999999987753 4 Q ss_pred EEcCCeeEEeecCCccceeeeec--CCcceeEEEEeeEEeeeeeeeeeccccc--cCCCCCh-HHhcCCcCceeeecccc Q lcl|NC_015254. 247 IFGEGAFGLGNGEAPVPTETDRE--KLKGNDILINRQHFLLHPRGIAWQEKSV--AGHSPTN-TEIEKGNNWKAVYESKN 321 (346) Q Consensus 247 l~~~GAi~~~~~~~~~~vE~dRd--~~~g~~~l~~r~~~~~~~~G~s~~~~~~--~~~sPt~-a~L~~~~NW~~v~~~K~ 321 (346) .+.+.++.+....+..+ +-... ...+..-+..++ .. -+...|..... ....... ..-..+......+..+. 
T Consensus 226 a~~~~a~~~at~a~v~~-~~~~~~~s~s~~~~v~~~~--~~-~~~~t~~s~~~~v~~~~g~~~v~~~~~~~~~~~~~~~~ 301 (392) T protein:vir:99 226 LYHPTAFIMATRAPAPP-MGAVRSTAISGDQRIAMRW--LV-DYDSTITSNRSLIDTYFGLKVVEDPNGVGFVRARKIHL 301 (392) T ss_pred eeecccccccccccccc-ccccceeEEecccceecce--ee-cccceeeccccccceeEEEEEEeeccccceeeeeeeee Confidence 45555554433322111 00000 000000011110 00 00111110000 0000000 00000000111100000 Q ss_pred cce-EEEEEecccccccCCCCCCCCC Q lcl|NC_015254. 322 IRI-VAFVHKNGVPGKKKETAPEGIK 346 (346) Q Consensus 322 i~i-v~~~~k~~~~~~~~~~~~~~~~ 346 (346) .+. +.+ +.+... .....-..|.. T Consensus 302 ~~~~v~v-~~v~~~-~~~~~~~~~~~ 325 (392) T protein:vir:99 302 IPGSIEV-APEAGA-NATITAAAGED 325 (392) T ss_pred ecceeee-eeeecc-cceeEeeeccc Confidence 000 000 000000 00000000110 No 33 >protein:vir:99749 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1497 # MgeName: phiETA2 # Cross-refs: genbank:acc:YP_001004307;genbank:gi:122891761;genbank:GeneID:4712304 Probab=99.59 E-value=1.9e-16 Score=106.80 Aligned_cols=287 Identities=17% Similarity=0.146 Sum_probs=172.0 Q ss_pred Cc--cceecceeeec----------CCceeeee--eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEE Q lcl|NC_015254. 1 MI--KKLRMNLQKFA----------AGKNTRIA--DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMI 66 (346) Q Consensus 1 ~~--~~~~~~~q~~~----------a~~~T~l~--d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti 66 (346) |- .|+++|+|.|+ ++.++... ...+|+.+..-+.+...+.+.+.+. ......++.++ T Consensus 1 ~~k~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~lip~~~~~~ii~~~~~~s~l~~~---------~~~~~~~~~~~ 71 (324) T protein:vir:99 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMRL---------GKYEPMEGTEK 71 (324) T ss_pred CCCchHhhHHHHHHHHHhhhhhhccccceeccCCCcceechhHHHHHHHHHHhhchhhhh---------cceeeccCCce Confidence 54 45588888776 33333333 3467887777776766665554331 11223356779 Q ss_pred EecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 
67 NMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS 146 (346) Q Consensus 67 ~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~ 146 (346) ++|.+... +.+..+.|+. .++..+++..+.....++.+..+.++++...-+..|..+.+.+++++.+.+.+++.+|. T Consensus 72 ~~p~~~~~-~~a~~v~Eg~-~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~ai~~~~d~~~l~- 148 (324) T protein:vir:99 72 KFTFWADK-PGAYWVGEGQ-KIETSKATWVNATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL- 148 (324) T ss_pred EEEEEecC-cceeEeccCc-cccccccceeEEEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhh- Confidence 99998753 5667778875 67888888888777888888888888876666666888999999999999999997763 Q ss_pred HHhhhhhhhhhhcceee-eccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh--hcccccCceee Q lcl|NC_015254. 147 LNGITASGALDSNKLDV-STETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE--FMLDSDNKKFP 223 (346) Q Consensus 147 L~G~~~~~~~~~~~~di-s~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~~~s~~~~i~ 223 (346) |.-...........+ .........++++.|.++...+.+......+|+||+.++..|++..--+ ++. .++.-+ T Consensus 149 --G~g~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l~d~~g~~~~--~~~~~~ 224 (324) T protein:vir:99 149 --NQGNNPFGKSIAQSIEKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIVDPETKERI--YDRNSD 224 (324) T ss_pred --cCCCCccCccccccccccceeccccCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhhcCCCceee--cCCCCc Confidence 211110000000000 1112233457899999999999887777778999999999998652111 111 112335 Q ss_pred EEeceEEEEeCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecCC----------------cceeEEEEeeEEeee Q lcl|NC_015254. 224 TYMGKRVIVDDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREKL----------------KGNDILINRQHFLLH 286 (346) Q Consensus 224 ~~~G~~VVvdD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~~----------------~g~~~l~~r~~~~~~ 286 (346) +++|+||++++.++.+.+. ++++. .-+.+...+ .+.+|..|+.. .+...+....+|... 
T Consensus 225 ~l~G~PVv~~~~~~~~~~~---~i~gd~~~~~~~~~~-~~~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~ 300 (324) T protein:vir:99 225 TLDGLPVVNLKSSNLKRGE---LITGDFDKLIYGIPQ-LIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALH 300 (324) T ss_pred cccceeEEeecCCCCCcce---EEEEecccEEEEEec-CcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEccE Confidence 7999999999988877652 23322 122343333 45677766642 122333333333322 Q ss_pred e---eeeeeccccccCCCCChHHh Q lcl|NC_015254. 287 P---RGIAWQEKSVAGHSPTNTEI 307 (346) Q Consensus 287 ~---~G~s~~~~~~~~~sPt~a~L 307 (346) | ..|.--.....+..++.+|. T Consensus 301 v~~~~a~~~lt~a~~~~~~~~~~~ 324 (324) T protein:vir:99 301 IADDKAFAKLVPADKKTDSVPGEV 324 (324) T ss_pred EecccceEEEEeccCCCCCCCCCC Confidence 2 22221111222233344444 No 34 >protein:vir:9309 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:165 # MgeName: phi 11 # Cross-refs: genbank:acc:NP_803287;genbank:gi:29028597;genbank:GeneID:1258044 Probab=99.58 E-value=2.1e-16 Score=106.55 Aligned_cols=287 Identities=17% Similarity=0.133 Sum_probs=169.9 Q ss_pred Ccc--ceecceeeec----------CCceeee--eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEE Q lcl|NC_015254. 1 MIK--KLRMNLQKFA----------AGKNTRI--ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMI 66 (346) Q Consensus 1 ~~~--~~~~~~q~~~----------a~~~T~l--~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti 66 (346) |-+ |+|+|+++|+ |+++|.. +.-.+|+.+..-+.+...+.+.+.+. ......++..+ T Consensus 1 ~~~~~~~~~~~~~f~~~~~~~~~~~a~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~l---------~~~~~~~~~~~ 71 (324) T protein:vir:93 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQL---------GKYEPMEGTEK 71 (324) T ss_pred CchhHHHHHHHHHHHHhhhhhhhcccccccccCCCcceechhHHHHHHHHHHhhchhhhh---------cceeeccCCce Confidence 644 4588888887 3333322 34467888777777776666555331 11223356778 Q ss_pred EecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 
67 NMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS 146 (346) Q Consensus 67 ~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~ 146 (346) ++|.+..- ..+..+.|+. .++..+++-.+.....++.+.-+.++++...-+..|....+.+++++++++..++.+|. T Consensus 72 ~ip~~~~~-~~a~~v~Eg~-~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~aia~~~d~a~l~- 148 (324) T protein:vir:93 72 KFTFWADK-PGAYWVGEGQ-KIETSKATWVNATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL- 148 (324) T ss_pred EEEEEecC-cceeeecCCc-cccccccceeEEEEEeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhc- Confidence 99998754 5566678875 67888888887777888888888998877666666889999999999999999997653 Q ss_pred HHhhhhhhhhhhc-ceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh--hcccccCceee Q lcl|NC_015254. 147 LNGITASGALDSN-KLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE--FMLDSDNKKFP 223 (346) Q Consensus 147 L~G~~~~~~~~~~-~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~~~s~~~~i~ 223 (346) |.-........ .............++++.|.++...+.+......+|+||+.++..|++..--+ ++. .++.-+ T Consensus 149 --G~g~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l~d~~G~~~~--~~~~~~ 224 (324) T protein:vir:93 149 --NQGNNPFGKSIAQSIEKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIVDPETKERI--YDRNSD 224 (324) T ss_pred --CCCCCCcCccccccccccceeccccccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhhCCCCCeee--cCCCCC Confidence 32111110000 00001112233457899999999999887777789999999999998752111 111 123346 Q ss_pred EEeceEEEEeCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecCC----------------cceeEEEEeeEEeee Q lcl|NC_015254. 224 TYMGKRVIVDDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREKL----------------KGNDILINRQHFLLH 286 (346) Q Consensus 224 ~~~G~~VVvdD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~~----------------~g~~~l~~r~~~~~~ 286 (346) +++|+||++++..+...+. .+++. .-+.+..-+ .+.+|..|+.. .++..+....+|.+. 
T Consensus 225 ~l~G~PVv~~~~~~~~~~~---i~~gdfs~~~~~~~~-~~~i~~~~~~~~~~~~~~~~~~~~~f~~n~~~~r~~~r~d~~ 300 (324) T protein:vir:93 225 SLDGLPVVNLKSSNLKRGE---LITGDFDKLIYGIPQ-LIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALH 300 (324) T ss_pred cccceeeEeecCCCCCcce---EEEEecceEEEEEec-CcEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccE Confidence 7999999998877665542 22322 122333322 34566666642 122333333333322 Q ss_pred ee---eeeeccccccCCCCChHHh Q lcl|NC_015254. 287 PR---GIAWQEKSVAGHSPTNTEI 307 (346) Q Consensus 287 ~~---G~s~~~~~~~~~sPt~a~L 307 (346) +. .|.--....++..+|..|. T Consensus 301 v~~~~a~~~l~~a~~~~~~~~~~~ 324 (324) T protein:vir:93 301 IADDKAFAKLVPADKRTDSVPGEV 324 (324) T ss_pred EecccceEEEecccccCCCCCCCC Confidence 21 1221111111222222222 No 35 >protein:vir:103955 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1662 # MgeName: phiNM # Cross-refs: genbank:acc:YP_873992;genbank:gi:118430767;genbank:GeneID:4525449 Probab=99.57 E-value=2.5e-16 Score=106.10 Aligned_cols=287 Identities=18% Similarity=0.152 Sum_probs=168.7 Q ss_pred Cc--cceecceeeecCC----------ceeee--eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEE Q lcl|NC_015254. 1 MI--KKLRMNLQKFAAG----------KNTRI--ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMI 66 (346) Q Consensus 1 ~~--~~~~~~~q~~~a~----------~~T~l--~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti 66 (346) |- .|+++|+|.|+.+ .++.. +...+|+.+..-+.+...+.+.+.+.. .....++..+ T Consensus 1 ~~~~~~~~~~~~~f~~~~~~~~~~~a~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~~~---------~~~~~~~~~~ 71 (324) T protein:vir:10 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQLG---------KYEPMEGTEK 71 (324) T ss_pred CCCchHHHHHHHHHHHHhhccceecccceeccCCCcceechhHHHHHHHHHHhhchhhhhc---------ceeeccCCce Confidence 43 4558889988732 22323 234678777676666666655553321 1223346679 Q ss_pred EecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 
67 NMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS 146 (346) Q Consensus 67 ~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~ 146 (346) ++|.+... +.++.+.|++ .++..+.+..+.....++.+..+.++++...-+.-|..+.+.+++++++.+..++.+|. T Consensus 72 ~~p~~~~~-~~a~~v~Eg~-~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~ai~~~~d~a~l~- 148 (324) T protein:vir:10 72 KFTFWADK-PGAYWVGEGQ-KIETSKATWVNATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL- 148 (324) T ss_pred EEEEEeCC-cceeEeccCc-cccccccceeEEEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhh- Confidence 99998753 5677778885 67777788777777778888888888877666666888999999999999999997664 Q ss_pred HHhhhhhhhhhhcceee-eccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh--hcccccCceee Q lcl|NC_015254. 147 LNGITASGALDSNKLDV-STETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE--FMLDSDNKKFP 223 (346) Q Consensus 147 L~G~~~~~~~~~~~~di-s~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~~~s~~~~i~ 223 (346) |.-...........+ .........++++.|.++..++.+......+|+||+.++..|++..--+ ++. .++.-+ T Consensus 149 --G~g~~~~~~~i~~~~~~~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l~d~~g~~~~--~~~~~~ 224 (324) T protein:vir:10 149 --NQGNNPFGKSIAQSIEKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIVDPETKERI--YDRNSD 224 (324) T ss_pred --cCCCCccCccccccccccceeccccCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhhccCCceee--cCCCCc Confidence 211110000000000 1112233467899999999999887777778999999999998652111 111 122335 Q ss_pred EEeceEEEEeCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecCC----------------cceeEEEEeeEEeee Q lcl|NC_015254. 224 TYMGKRVIVDDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREKL----------------KGNDILINRQHFLLH 286 (346) Q Consensus 224 ~~~G~~VVvdD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~~----------------~g~~~l~~r~~~~~~ 286 (346) +++|+||++++.++.+.+. ++++. .-+.+...+ .+.+|..++.. .+...+....+|... 
T Consensus 225 ~l~G~PV~~~~~~~~~~~~---~~~gd~~~~~~~~~~-~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~r~d~~ 300 (324) T protein:vir:10 225 TLDGLPVVNLKSSNLKRGE---LITGDFDKLIYGIPQ-LIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALH 300 (324) T ss_pred cccceeEEeecCCCCCcce---EEEEecccEEEEEec-CcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEccE Confidence 7999999999888776653 23322 122343323 35566666542 122233333333322 Q ss_pred ee---eeeeccccccCCCCChHHh Q lcl|NC_015254. 287 PR---GIAWQEKSVAGHSPTNTEI 307 (346) Q Consensus 287 ~~---G~s~~~~~~~~~sPt~a~L 307 (346) +. .|.--.....+..+|-+|. T Consensus 301 v~~~~A~~~l~~a~~~~~~~~~~~ 324 (324) T protein:vir:10 301 IADDKAFAKLVPADKKTDSVPGEV 324 (324) T ss_pred EecccceEEEEeccCCCCCCCCCC Confidence 22 2221111111112233333 No 36 >protein:vir:96392 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1613 # MgeName: 53 # Cross-refs: genbank:acc:YP_239648;genbank:gi:66395381;genbank:GeneID:5132868 Probab=99.51 E-value=2.1e-15 Score=101.07 Aligned_cols=283 Identities=17% Similarity=0.140 Sum_probs=170.4 Q ss_pred Cc--cceecceeeecCC----------cee--eeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEE Q lcl|NC_015254. 1 MI--KKLRMNLQKFAAG----------KNT--RIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMI 66 (346) Q Consensus 1 ~~--~~~~~~~q~~~a~----------~~T--~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti 66 (346) |- +|++||+++|+.+ +.+ .-...++|+.+..-+.+...+.+.+.+- ......+|..+ T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l---------~~~~~~~~~~~ 71 (324) T protein:vir:96 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQL---------GKYEPMEGTEK 71 (324) T ss_pred CCcchhhhHHHHHHHHHhhhhhhhccccccccCcCccccchhHHHHHHHHHHhhchhhhh---------cceeeccCCce Confidence 53 5569999999832 232 2234578888877676766666555331 11223356778 Q ss_pred EecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 
67 NMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS 146 (346) Q Consensus 67 ~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~ 146 (346) ++|.+..- +.+..+.|++ .++..+++..+.....++.+..+.++++...-+..|..+.+.+++++++++..++.+|. T Consensus 72 ~~p~~~~~-~~a~~v~Eg~-~~~~~~~~~~~v~~~~~k~~~~~~is~ell~ds~~~l~~~i~~~la~ai~~~~d~a~l~- 148 (324) T protein:vir:96 72 KFTFWADK-PGAYWVGEGQ-KIETSKATWVNATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL- 148 (324) T ss_pred EEEEEecC-cceeEecCCc-cccccccceeEEEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhc- Confidence 99998653 5666678875 67878888777777777888778888876666666888999999999999999996663 Q ss_pred HHhhhhhhhhhh-cceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccCceeeEE Q lcl|NC_015254. 147 LNGITASGALDS-NKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNKKFPTY 225 (346) Q Consensus 147 L~G~~~~~~~~~-~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~~i~~~ 225 (346) |.-....... ....-.........++++.|.++..++.+......+|+||++++..|++..--+.-..-.++.-+++ T Consensus 149 --G~g~~~~~~gi~~~~~~~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~vmn~~~~~~L~~l~d~~G~~~~~~~~~~~l 226 (324) T protein:vir:96 149 --NQGNNPFGKSIAQSIEKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIVDPETKERIYDRNSDSL 226 (324) T ss_pred --cCCCCCcCccccccccccceeccccccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhhccCCCeeecCCCCCcc Confidence 3211111000 0000011112234578999999999998877788899999999999986532111000112344689 Q ss_pred eceEEEEeCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecCC----------------cceeEEEEeeEEeeeee Q lcl|NC_015254. 226 MGKRVIVDDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREKL----------------KGNDILINRQHFLLHPR 288 (346) Q Consensus 226 ~G~~VVvdD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~~----------------~g~~~l~~r~~~~~~~~ 288 (346) +|+||+++..++...+. ++++. .-+.++..+ .+.+|.+++.. ..+..+...+++.+.+. 
T Consensus 227 ~G~PV~~~~~~~~~~~~---~~~gd~~~~~~g~~~-~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~ 302 (324) T protein:vir:96 227 DGLPVVNLKSSNLKRGE---LITGDFDKLIYGIPQ-LIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIA 302 (324) T ss_pred cceeeEeeCCCCCCcce---EEEEecceEEEEEec-CcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEe Confidence 99999998877766542 22322 222344323 45677766642 12333444444443332 Q ss_pred ---------eeeeccccccCCCCChH Q lcl|NC_015254. 289 ---------GIAWQEKSVAGHSPTNT 305 (346) Q Consensus 289 ---------G~s~~~~~~~~~sPt~a 305 (346) |..|... ..|..- T Consensus 303 ~~~A~~~l~~a~~~~~----~~~~~~ 324 (324) T protein:vir:96 303 DDKAFAKLVPADKRTD----SVPGEV 324 (324) T ss_pred cccceEEEecccccCC----CCCCCC Confidence 2222111 122222 No 37 >protein:vir:78830 Length: 324 # NCBI annotation: major head protein # Family: family:all:507 # MgeID: mge:1858 # MgeName: 80alpha # Cross-refs: genbank:acc:YP_001285361;genbank:gi:148717889;genbank:GeneID:5246961 Probab=99.51 E-value=2.1e-15 Score=101.07 Aligned_cols=283 Identities=17% Similarity=0.140 Sum_probs=170.4 Q ss_pred Cc--cceecceeeecCC----------cee--eeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEE Q lcl|NC_015254. 1 MI--KKLRMNLQKFAAG----------KNT--RIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMI 66 (346) Q Consensus 1 ~~--~~~~~~~q~~~a~----------~~T--~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti 66 (346) |- +|++||+++|+.+ +.+ .-...++|+.+..-+.+...+.+.+.+- ......+|..+ T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l---------~~~~~~~~~~~ 71 (324) T protein:vir:78 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQL---------GKYEPMEGTEK 71 (324) T ss_pred CCcchhhhHHHHHHHHHhhhhhhhccccccccCcCccccchhHHHHHHHHHHhhchhhhh---------cceeeccCCce Confidence 53 5569999999832 232 2234578888877676766666555331 11223356778 Q ss_pred EecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 
67 NMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS 146 (346) Q Consensus 67 ~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~ 146 (346) ++|.+..- +.+..+.|++ .++..+++..+.....++.+..+.++++...-+..|..+.+.+++++++++..++.+|. T Consensus 72 ~~p~~~~~-~~a~~v~Eg~-~~~~~~~~~~~v~~~~~k~~~~~~is~ell~ds~~~l~~~i~~~la~ai~~~~d~a~l~- 148 (324) T protein:vir:78 72 KFTFWADK-PGAYWVGEGQ-KIETSKATWVNATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL- 148 (324) T ss_pred EEEEEecC-cceeEecCCc-cccccccceeEEEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhc- Confidence 99998653 5666678875 67878888777777777888778888876666666888999999999999999996663 Q ss_pred HHhhhhhhhhhh-cceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccCceeeEE Q lcl|NC_015254. 147 LNGITASGALDS-NKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNKKFPTY 225 (346) Q Consensus 147 L~G~~~~~~~~~-~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~~i~~~ 225 (346) |.-....... ....-.........++++.|.++..++.+......+|+||++++..|++..--+.-..-.++.-+++ T Consensus 149 --G~g~~~~~~gi~~~~~~~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~vmn~~~~~~L~~l~d~~G~~~~~~~~~~~l 226 (324) T protein:vir:78 149 --NQGNNPFGKSIAQSIEKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIVDPETKERIYDRNSDSL 226 (324) T ss_pred --cCCCCCcCccccccccccceeccccccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhhccCCCeeecCCCCCcc Confidence 3211111000 0000011112234578999999999998877788899999999999986532111000112344689 Q ss_pred eceEEEEeCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecCC----------------cceeEEEEeeEEeeeee Q lcl|NC_015254. 226 MGKRVIVDDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREKL----------------KGNDILINRQHFLLHPR 288 (346) Q Consensus 226 ~G~~VVvdD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~~----------------~g~~~l~~r~~~~~~~~ 288 (346) +|+||+++..++...+. ++++. .-+.++..+ .+.+|.+++.. ..+..+...+++.+.+. 
T Consensus 227 ~G~PV~~~~~~~~~~~~---~~~gd~~~~~~g~~~-~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~ 302 (324) T protein:vir:78 227 DGLPVVNLKSSNLKRGE---LITGDFDKLIYGIPQ-LIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIA 302 (324) T ss_pred cceeeEeeCCCCCCcce---EEEEecceEEEEEec-CcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEe Confidence 99999998877766542 22322 222344323 45677766642 12333444444443332 Q ss_pred ---------eeeeccccccCCCCChH Q lcl|NC_015254. 289 ---------GIAWQEKSVAGHSPTNT 305 (346) Q Consensus 289 ---------G~s~~~~~~~~~sPt~a 305 (346) |..|... ..|..- T Consensus 303 ~~~A~~~l~~a~~~~~----~~~~~~ 324 (324) T protein:vir:78 303 DDKAFAKLVPADKRTD----SVPGEV 324 (324) T ss_pred cccceEEEecccccCC----CCCCCC Confidence 2222111 122222 No 38 >protein:vir:96223 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1607 # MgeName: 69 # Cross-refs: genbank:acc:YP_239571;genbank:gi:66395304;genbank:GeneID:5132771 Probab=99.51 E-value=2.3e-15 Score=100.81 Aligned_cols=289 Identities=17% Similarity=0.134 Sum_probs=166.8 Q ss_pred Cc--cceecceeeecCC----------cee--eeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEE Q lcl|NC_015254. 1 MI--KKLRMNLQKFAAG----------KNT--RIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMI 66 (346) Q Consensus 1 ~~--~~~~~~~q~~~a~----------~~T--~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti 66 (346) |- .|+++|+|+|+.. .++ .-..-++|+.+..-+.+...+.+.+.+. ......+|..+ T Consensus 1 ~~~~~~~~~~~~~f~~~~~~~~~~~a~~~~~~~~~~~lip~~~~~~ii~~~~~~s~l~~l---------~~~~~~~~~~~ 71 (324) T protein:vir:96 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQL---------GKYEPMEGTEK 71 (324) T ss_pred CCcchhhhHHHHHHHHhhhhhhhcccccccccCCCcceechhHHHHHHHHHHhhchhhhh---------cceeeccCCce Confidence 42 5668999998732 222 1234366777766666666555544332 11223356778 Q ss_pred EecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 
67 NMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS 146 (346) Q Consensus 67 ~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~ 146 (346) ++|.+..- +.+..+.|+. .++..+++-.+.....++.+.-+.++++...-+..+..+.+.+++++++.+..++.+|. T Consensus 72 ~~p~~~~~-~~a~~v~Eg~-~~~~~~~~f~~v~~~~~k~~~~~~is~ell~ds~~~l~~~i~~~l~~aia~~~d~~~l~- 148 (324) T protein:vir:96 72 KFTFWADK-PGAYWVGEGQ-KIETSKATWVNATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL- 148 (324) T ss_pred EEEEEecC-cceeeecCCc-cccccccceeEEEEEeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhh- Confidence 99998653 5566678875 67878888887777788888888888876666666888999999999999999996663 Q ss_pred HHhhhhhhhhhhcceee-eccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccCceeeEE Q lcl|NC_015254. 147 LNGITASGALDSNKLDV-STETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNKKFPTY 225 (346) Q Consensus 147 L~G~~~~~~~~~~~~di-s~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~~i~~~ 225 (346) |.-............ .........++++.|.++..++.+......+|+||+.++..|++..--+.-..-.++.-+++ T Consensus 149 --G~g~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~i~~~~~~~~~~i~n~~~~~~L~~lkd~~G~~~~~~~~~~~l 226 (324) T protein:vir:96 149 --NQGNNPFGKSIAQSIKKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIVDPETKERIYDRNSDSL 226 (324) T ss_pred --cCCCCCcCccccccccccceecccccchHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhhCCCCCeeecCCCCCcc Confidence 321111100000000 01112234578999999999998877777899999999999986532111000112344689 Q ss_pred eceEEEEeCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecCCc----------------ceeEEEEeeEEeeeee Q lcl|NC_015254. 226 MGKRVIVDDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREKLK----------------GNDILINRQHFLLHPR 288 (346) Q Consensus 226 ~G~~VVvdD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~~~----------------g~~~l~~r~~~~~~~~ 288 (346) +|+||+++..++.+.+. ++++. .-+.++..+ .+.+|.+|+... +...+...++|.+.+. 
T Consensus 227 ~G~PV~~~~~~~~~~~~---~~~gd~s~~~~~~~~-~~~i~~~~~~~~~~~~~~~~~~~~~~~~n~v~~r~~~r~d~~v~ 302 (324) T protein:vir:96 227 DGLPVVNLKSSNLKRGE---LITGDFDKLIYGIPQ-LIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIA 302 (324) T ss_pred cceeeEeecCCCCCcce---EEEEecceEEEEEec-CcEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEe Confidence 99999998877766542 33321 223343323 355666666421 1233333333333322 Q ss_pred ---eeeeccccccCCCCChHHh Q lcl|NC_015254. 289 ---GIAWQEKSVAGHSPTNTEI 307 (346) Q Consensus 289 ---G~s~~~~~~~~~sPt~a~L 307 (346) .|..-..+..+..-|..|. T Consensus 303 ~~~a~~~l~~a~~~~~~~~~~~ 324 (324) T protein:vir:96 303 DDKAFAKLVPADKRTDSVPGEV 324 (324) T ss_pred cccceEEEecccccCCCCCCCC Confidence 1111111111111111111 No 39 >protein:vir:97148 Length: 324 # NCBI annotation: ORF010 # Family: family:all:507 # MgeID: mge:1654 # MgeName: 85 # Cross-refs: genbank:acc:YP_239726;genbank:gi:66394880;genbank:GeneID:5130881 Probab=99.49 E-value=5e-15 Score=98.98 Aligned_cols=287 Identities=17% Similarity=0.144 Sum_probs=169.5 Q ss_pred Cc--cceecceeeec----------CCcee--eeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEE Q lcl|NC_015254. 1 MI--KKLRMNLQKFA----------AGKNT--RIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMI 66 (346) Q Consensus 1 ~~--~~~~~~~q~~~----------a~~~T--~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti 66 (346) |- +++++++++|+ +..++ .-....+|+.+..-+.+...+.+.+.+- ......++..+ T Consensus 1 ~~~~~~~~~~~~~f~~~~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~---------~~~~~~~~~~~ 71 (324) T protein:vir:97 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQL---------GKYEPMEGTEK 71 (324) T ss_pred CccchhHHHHHHHHHHhhhhhhhhccccccccCCCcceechhHHHHHHHHHHhhcchhhh---------cceeeccCCce Confidence 64 45588888886 43333 2345578998877777776665554331 12223357789 Q ss_pred EecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 
67 NMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS 146 (346) Q Consensus 67 ~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~ 146 (346) ++|.+... +.+..+.|++ .++..+++..+.....++.+.-+.++++...-+.-+..+.+.++++++++++.++.+|. T Consensus 72 ~ip~~~~~-~~a~~v~Eg~-~~~~~~~~f~~v~~~~~k~~~~~~is~ell~ds~~~l~~~i~~~l~~aia~~~d~a~l~- 148 (324) T protein:vir:97 72 KFTFWADK-PGAYWVGEGQ-KIETSKATWVNATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL- 148 (324) T ss_pred EEEEEecC-cceeEeccCc-cccccccceeEEEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhc- Confidence 99999764 5566678885 68888888888877888888888888876666666888999999999999999997664 Q ss_pred HHhhhhhhhhhhcceee-eccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh--hcccccCceee Q lcl|NC_015254. 147 LNGITASGALDSNKLDV-STETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE--FMLDSDNKKFP 223 (346) Q Consensus 147 L~G~~~~~~~~~~~~di-s~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~~~s~~~~i~ 223 (346) |.-............ .........++++.|.++..++.+......+|+||+.++..|++..--+ ++. ..+.-+ T Consensus 149 --G~g~~~~~~gi~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~lkd~~g~~~~--~~~~~~ 224 (324) T protein:vir:97 149 --NQGNNPFGKSIAQSIEKTNKVIKGDFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKIVDPETKERI--YDRNSD 224 (324) T ss_pred --cCCCCccCccccccccccceeccccCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhhcCCCceee--cCCCCc Confidence 211110000000000 1112233457899999999999887777789999999999988643111 111 123346 Q ss_pred EEeceEEEEeCCCccCCCceEEEEEcCC-eeEEeecCCccceeeeecCC----------------cceeEEEEeeEEeee Q lcl|NC_015254. 224 TYMGKRVIVDDGLPAKDGVYTSYIFGEG-AFGLGNGEAPVPTETDREKL----------------KGNDILINRQHFLLH 286 (346) Q Consensus 224 ~~~G~~VVvdD~~p~~~g~ytt~l~~~G-Ai~~~~~~~~~~vE~dRd~~----------------~g~~~l~~r~~~~~~ 286 (346) +++|+||++++..+.+.+. ++|+.- -+.+...+ .+.+|.+|+.. .....+....+|.+. 
T Consensus 225 tl~G~PV~~~~~~~~~~~~---~~~gd~~~~~i~~~~-~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~ 300 (324) T protein:vir:97 225 TLDGLPVVNLKSSNLKRGE---LITGDFDKLIYGIPQ-LIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALH 300 (324) T ss_pred cccceeeEeecCCCCCcce---EEEEecccEEEEEec-CcEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccE Confidence 7999999999887776652 333321 22233322 45566666542 112222233333222 Q ss_pred e---eeeeeccccccCCCCChHHh Q lcl|NC_015254. 287 P---RGIAWQEKSVAGHSPTNTEI 307 (346) Q Consensus 287 ~---~G~s~~~~~~~~~sPt~a~L 307 (346) + ..|.--.....+..-|-+|. T Consensus 301 v~~~~a~~~l~~~~~~~~~~~~~~ 324 (324) T protein:vir:97 301 IADDKAFAKLVPADKKTDSVPGEV 324 (324) T ss_pred EecccceEEEEeccCCCCCCCCCC Confidence 2 12211111111111122222 No 40 >protein:vir:95763 Length: 297 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1578 # MgeName: SMP # Cross-refs: genbank:acc:YP_950590;genbank:gi:119953785;genbank:GeneID:5076833 Probab=99.45 E-value=4.1e-14 Score=93.98 Aligned_cols=286 Identities=14% Similarity=0.062 Sum_probs=161.3 Q ss_pred cceeeecCCceeeee--eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCC Q lcl|NC_015254. 7 MNLQKFAAGKNTRIA--DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDG 84 (346) Q Consensus 7 ~~~q~~~a~~~T~l~--d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg 84 (346) ||+|.|.+.+.|.-+ ...+|+.+..=+.+...+.+.+.+..-+.+ ..++..+.+|....- ..+..+.|+ T Consensus 1 m~~~~~~~~~~~~t~~~~~lvP~~~~~~ii~~~~~~s~l~~~~~~~~--------~~~~~~~~~~~~~~~-~~a~~v~Eg 71 (297) T protein:vir:95 1 MTVQTFNPENVLVSQKKDGTLHKEFTDIIMKEVAQNSLVMQLGQYQE--------MEGEQEKTVYVQTDG-ISAYWVNET 71 (297) T ss_pred CCccccccccccccCCCcceechhHHHHHHHHHHhhchhhhhcceee--------cCCCccEEEEEEcCC-ceeEEeecC Confidence 999999887776554 447888887766776666665544221111 112334566765542 445567787 Q ss_pred ccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeee Q lcl|NC_015254. 
85 EGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVS 164 (346) Q Consensus 85 ~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis 164 (346) + .++..+.+..+.....++.+..+.++++...-+.-|..+.|.+++++++.++.++.+|. |.=+.........--. T Consensus 72 ~-~~~~~~~~f~~v~l~~~k~~~~~~is~ell~ds~~~l~~~i~~~la~ai~~~~d~a~l~---G~g~~~~~gi~~~~~~ 147 (297) T protein:vir:95 72 E-KIKTDKPEVVPVTLKAHKLGIILVTSREALNYTWKKFFEDMKPQIVEAFYKKIDEAGLL---GHDTPFANSVAKAAKD 147 (297) T ss_pred c-cccccccceeEEEEeeEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHHhc---ccCCcccccccccccc Confidence 5 67777777777777777888888888877766667889999999999999999998773 3111111000000001 Q ss_pred ccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcccccCceeeEEeceEEEEeCCCccCCCc Q lcl|NC_015254. 165 TETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLDSDNKKFPTYMGKRVIVDDGLPAKDGV 242 (346) Q Consensus 165 ~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~s~~~~i~~~~G~~VVvdD~~p~~~g~ 242 (346) ........++++.|.++..++.+......+|+||++.+..|++..-- .++. .+..++++|+||+++..++.+.+. T Consensus 148 ~~~~~~~~~t~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~l~d~~G~~i~---~~~~~~l~G~Pv~~~~~~~~~~~~ 224 (297) T protein:vir:95 148 ANKVIGGPINYDNILKLQDALYDADVEPNAFVSKIQNRSALREARDGNKVSIY---DKAANTIDGITTVDLKSARFEKGD 224 (297) T ss_pred cceecccccCHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHHhhccCCceee---cCCCCcccceeeEeecCCCCCCce Confidence 11223345789999999999988887888999999999999864210 0111 123467899999998877776653 Q ss_pred eEEEEEcCC-eeEEeecCCccceeeeecCCcc--ee---EEEEe-eEEeeeeeeeeeccccccCCCCChHHhcCCcCcee Q lcl|NC_015254. 243 YTSYIFGEG-AFGLGNGEAPVPTETDREKLKG--ND---ILINR-QHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKA 315 (346) Q Consensus 243 ytt~l~~~G-Ai~~~~~~~~~~vE~dRd~~~g--~~---~l~~r-~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~ 315 (346) ++++.- .+.+...+ .+.+|..|+.... .+ ..+.. .+-.+..|...+-+-. 
T Consensus 225 ---~~~gd~s~~~~~~~~-~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~d~~------------------- 281 (297) T protein:vir:95 225 ---LLAGDFDNLIYGVPY-NITYKISEEGQISTITNADGTPINLFEQEMIAIRATMDIAVM------------------- 281 (297) T ss_pred ---EEEEecccEEEEEec-CeEEEEeeccccccccccCccchhhhhcCcEEEEEEEEeccE------------------- Confidence 333321 12233322 3445555543210 00 00000 0011122221111100 Q ss_pred eecccccceEEEEEec Q lcl|NC_015254. 316 VYESKNIRIVAFVHKN 331 (346) Q Consensus 316 v~~~K~i~iv~~~~k~ 331 (346) +.+++++..+...|+. T Consensus 282 v~~~~a~~~l~~at~~ 297 (297) T protein:vir:95 282 ITKTDAFAKLTPAERV 297 (297) T ss_pred eecccceEEEeecCCC Confidence 1122222211111111 No 41 >protein:vir:108211 Length: 318 # NCBI annotation: gp9 # Family: family:all:6420 # MgeID: mge:2004 # MgeName: Giles # Cross-refs: genbank:acc:YP_001552338;genbank:gi:160700658;genbank:GeneID:5758931 Probab=99.43 E-value=9.4e-15 Score=97.49 Aligned_cols=276 Identities=13% Similarity=0.120 Sum_probs=162.4 Q ss_pred CccceecceeeecCCceeeeeeccc-hHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCC----CcEEE----eccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIV-PEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSG----GNMIN----MPFW 71 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~-Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~----G~ti~----~P~~ 71 (346) |-.- -++---..+..-.++|+++ |+++-.++.+.. ++. |+..- +++.. +-.+. .|+| T Consensus 1 ~~~~--~~i~s~~~~~~itv~~ll~~P~~I~~~i~e~~-~~~-~iad~----------lf~~~~a~~~~~v~f~~~~p~~ 66 (318) T protein:vir:10 1 MTAP--TGIVSVSDGPAITVRELVGNPLWIPTALKKMM-VNQ-FISES----------LFRNGGANPNGVVAYNEGNPSF 66 (318) T ss_pred CCCC--CcceeeecCCceehHHhhCCchhHHHHHHHHH-hcc-chhhh----------hhhcccccccceeEEEeccccc Confidence 3222 1111111233445678877 998888886654 222 32211 22221 22333 3666 Q ss_pred ccCCCcccccCCCccccchhhccc-ceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHh- Q lcl|NC_015254. 
72 QDLTGEDEILDDGEGALTPGNISA-AKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNG- 149 (346) Q Consensus 72 ~~l~g~ae~~~dg~~~it~~~lt~-~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G- 149 (346) - .++.|.+.|+. .++....+. ....++.+++|+++.++|++....+.++++...+|++....|+.++.++.+|.- T Consensus 67 ~--~~d~e~VaEgg-EiP~~~~~~G~~~ia~~~K~G~~~~vS~Em~~~n~~~~v~r~~~~l~Nti~r~~d~~a~dal~sa 143 (318) T protein:vir:10 67 L--EDDVADVAEFG-EIPVSAGARGLPRTAFAVKKALGVRVSKEMIDENRVGAVNDQMLQLRNTFIRANDRSAKALLQSP 143 (318) T ss_pred c--cCcHhhccCcc-cccccCCCCCchhhhhhehhccceeccHHHHhhcChhHHHHHHHHHHHHHHHHHHHHHHHHHhcc Confidence 4 57899999985 677777777 445567789999999999999999999999999999999999999999998842 Q ss_pred hhhhhhhhhcceeeeccccccccccH-HHHHHHHHH-----hCcccc----CceEEEEchHHHHHHHhhhhhhhccc--c Q lcl|NC_015254. 150 ITASGALDSNKLDVSTETGDDSYFTG-DTFLSATYK-----LGDAEG----KLTGIAMHSQTEMNLRKQGLIEFMLD--S 217 (346) Q Consensus 150 ~~~~~~~~~~~~dis~~~~~~~~~~~-~~l~~A~~~-----~GD~~~----~~~~ivmhS~~~~~L~~~~li~~~~~--s 217 (346) ....-.....-... .+.-.+ .+.+ +.+..|... ++.+.. ....|||||..++.|+++..+-.... . T Consensus 144 ~t~~~~~s~~w~~~-~~~~~d-~~~A~e~v~~a~~~~~~a~~~~~~~~~GY~pdtIVlhP~~~~~l~~n~~~~~~y~~~a 221 (318) T protein:vir:10 144 IVPTLAVPTAWDNG-GKVRTD-IAIAIEQISTAAPTAYPAGVGSSDEYFGFIPDTIVMHYALLPILMDNENFMKVYERNA 221 (318) T ss_pred ccccccCCcCCCCc-cccccc-chhhhhhhhhhhhhhhhhhhhhhhhccCccceeeEECHHHHHHHhcchhhhhhhhccc Confidence 11111111000000 000000 0111 122222221 222221 34689999999999988754422111 1 Q ss_pred c--------Ccee-eEEeceEEEEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecC----Cccee------EEE Q lcl|NC_015254. 218 D--------NKKF-PTYMGKRVIVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREK----LKGND------ILI 278 (346) Q Consensus 218 ~--------~~~i-~~~~G~~VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~----~~g~~------~l~ 278 (346) + -+.| +.++|++||+|..+|.+ +.|++-.|.+++.--..++.++.-|.. ++|.+ ... 
T Consensus 222 ~~~~~~~~~tg~~~g~~lGl~vi~s~~~p~~----~alvlq~g~vG~~~d~~pl~~t~~~~egg~~~g~~~~s~~~~~~~ 297 (318) T protein:vir:10 222 NYVSTAPDWTGNFPGSVMGLNVIRSRTFPID----RVLIMERGTVGFYSDTRPLQFTALYPEGNGPNGGPTESYRADASH 297 (318) T ss_pred hhhhhcccccccccceeeceEEeecCccCCC----eeEEEecCCcceeeccccceeeecccCCCCCCCCcchhhheehhe Confidence 1 1233 46899999999999976 369999999997543334445555643 33332 122 Q ss_pred EeeEEeeeeeeeeeccccccCCCC Q lcl|NC_015254. 279 NRQHFLLHPRGIAWQEKSVAGHSP 302 (346) Q Consensus 279 ~r~~~~~~~~G~s~~~~~~~~~sP 302 (346) -|-.++..|+..-|... =.+| T Consensus 298 ~~~~~V~~PkA~~~itg---i~~~ 318 (318) T protein:vir:10 298 KRALAVDQPKAALWLTG---IVTP 318 (318) T ss_pred eeeeeeeCcceeEEEee---ccCC Confidence 33566677888777521 1356 No 42 >protein:vir:7771 Length: 330 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:149 # MgeName: Bxz2 # Cross-refs: genbank:acc:NP_817605;genbank:gi:29566035;genbank:GeneID:1259229 Probab=99.36 E-value=3.2e-13 Score=89.07 Aligned_cols=286 Identities=13% Similarity=0.062 Sum_probs=149.9 Q ss_pred cceeeecCCceee---eeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCC Q lcl|NC_015254. 7 MNLQKFAAGKNTR---IADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDD 83 (346) Q Consensus 7 ~~~q~~~a~~~T~---l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~d 83 (346) |=-+.+.+..++. -..++.|++...++ +...+.+.+.+-. .....++..+.+|.+..- ..+.-+.| T Consensus 1 m~~~~~~a~~~~~t~~~g~~i~~~~~~~ii-~~~~~~s~l~~~~---------~~~~~~~~~~~~p~~~~~-~~a~~v~E 69 (330) T protein:vir:77 1 MAGSTVPSTQVALTGDFSAFLTPEQSQDYF-AEIEKTSIVQRIA---------RKVPMGPTGISIPHWTGA-VSASWTGE 69 (330) T ss_pred CcccccchhhccccCCCcceechhHHHHHH-HHHHhccchhhhc---------ceeeccCCceEEEEEcCC-cceeEecC Confidence 2222233332222 23567777765544 4444444443311 112234666889998753 44555678 Q ss_pred CccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH------HHHhhhhhhhhh Q lcl|NC_015254. 
84 GEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIA------SLNGITASGALD 157 (346) Q Consensus 84 g~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla------~L~G~~~~~~~~ 157 (346) ++ .++..+.+-.+-....++.+.-+.++++...-+.-|..+.+.+++++.+++++++.+|. -..|++...... T Consensus 70 g~-~~~~~~~~f~~i~~~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~ai~~~~~~~~l~G~g~~~~~~g~~~~~~~~ 148 (330) T protein:vir:77 70 AE-RKPITKGSFGKQELEPVKITTIFAESAEVVRLNPLNYLNTMRTKIAEAIALKFDAAAIHGIDKPSAFKGYLAETTKV 148 (330) T ss_pred CC-ccccccceeeEEEEeEEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhcccCCCCcccccccccccc Confidence 74 67777777777767777777778888876666666888899999999999999997762 111222111111 Q ss_pred hcceeeeccc-cccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh--hccccc--C-----ceeeEEec Q lcl|NC_015254. 158 SNKLDVSTET-GDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE--FMLDSD--N-----KKFPTYMG 227 (346) Q Consensus 158 ~~~~dis~~~-~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~~~s~--~-----~~i~~~~G 227 (346) .........+ .......++.|.+++.++........+|+||+.++..|++..--+ ++.... + ..-.+++| T Consensus 149 ~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~l~~~~~~~~~~~~~~~~~l~G 228 (330) T protein:vir:77 149 VSLADTNLTTASGPQGNAYLAVNNALSLLVNSGKKWTGTLLDNVTEPILNTAVDGNGRPLFVESTYTEQVGAIREGRILG 228 (330) T ss_pred ceeecccccccccccchhHHHHHHHHHhhhhcCCCccEEEEcHHHHHHHHHHhccCCceeecCccccccccccCCceecc Confidence 1111111111 111233467888888888887777789999999999998753111 111110 1 12258999 Q ss_pred eEEEEeCCCccCCC-ceEEEEEcCCe-eEEeecCCccceeeeecCC--c------------------ceeEEEEeeEEe- Q lcl|NC_015254. 228 KRVIVDDGLPAKDG-VYTSYIFGEGA-FGLGNGEAPVPTETDREKL--K------------------GNDILINRQHFL- 284 (346) Q Consensus 228 ~~VVvdD~~p~~~g-~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~~--~------------------g~~~l~~r~~~~- 284 (346) +||++++.||.... ....++++.-. +.+...+ .+.++..++.. . +...+....++. 
T Consensus 229 ~PV~~~~~~p~~~~~~~~~~~~gd~s~~~i~~~~-~~~i~~~~e~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~ 307 (330) T protein:vir:77 229 RPTYVADNVVNGTVGNRVVGVMGDFSQVIWGQIG-GLSFDVTDQATLDFGEEQGGVWVPKLISLWQHNMVAVRCEAEFAF 307 (330) T ss_pred eeeEEeccccCCCCCCccEEEEEecceEEEEEec-CcEEEEeecceeeecccccccccccccchhhcCcEEEEEEEEecc Confidence 99999999996432 22334444322 2233222 23444444321 0 111111111111 Q ss_pred --eeeeeeeeccccccCCCCChH Q lcl|NC_015254. 285 --LHPRGIAWQEKSVAGHSPTNT 305 (346) Q Consensus 285 --~~~~G~s~~~~~~~~~sPt~a 305 (346) .||..+.-.....+|..|.-+ T Consensus 308 ~v~~~~a~~~i~~~~~~~~~~~~ 330 (330) T protein:vir:77 308 MVNDKDAFVKLTDQVAGTDPEEE 330 (330) T ss_pred EEecccceEEEEeccCCcCCCCC Confidence 111222211111122222221 No 43 >protein:vir:41 Length: 299 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:2 # MgeName: A118 # Cross-refs: genbank:acc:NP_463467;swissprot:trembl:q9t1b7;genbank:gi:16798789;uniprot:Q9T1B7;genbank:GeneID:922353 Probab=99.35 E-value=6e-13 Score=87.58 Aligned_cols=273 Identities=11% Similarity=0.040 Sum_probs=159.8 Q ss_pred eeecCCceeeeee--ccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccc Q lcl|NC_015254. 10 QKFAAGKNTRIAD--VIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGA 87 (346) Q Consensus 10 q~~~a~~~T~l~d--~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~ 87 (346) ==|.+++.|..++ ..+|+.+..-+.++..+.+.+.+-. .....++...++|.+..- .+.-+.|++ . T Consensus 1 ~g~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~~---------~~~~~~~~~~~~~~~~~~--~a~~v~E~~-~ 68 (299) T protein:vir:41 1 MGFNPDTTTMQSAKTGSIPINISEQIITGVKNGSAAMKLA---------KAVPMTKPEEEFTFMSGV--GAFWVDEAE-R 68 (299) T ss_pred CCcCCCcccccCCCceecchhHHHHHHHHHHhcchhhhhc---------eeeecCCCcEEEEEEcCC--ceeeeecCc-c Confidence 1244665565554 4678877776777766666553321 122335777889988643 344567874 5 Q ss_pred cchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhc-ceeeecc Q lcl|NC_015254. 
88 LTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSN-KLDVSTE 166 (346) Q Consensus 88 it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~-~~dis~~ 166 (346) ++..+.+-++-....++.+.-+.++++...-+.-|..+.+.+++++++.+..++.+|. |.-......-. ....... T Consensus 69 ~~~~~~~f~~v~l~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~a~~~~~d~a~l~---G~g~~~~~gil~~~~~~~~ 145 (299) T protein:vir:41 69 IQTSKPTFTKAKMRSKKMGVIIPTTKENLNYSVTNFFSLMQAEIVEAFYKKFDQAVFT---GVESPYNWNILKSATDASN 145 (299) T ss_pred ccccccceeEEEEeeEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHHhh---cccCcccccccccccccce Confidence 7777777776667777778788999988777777889999999999999999997663 32111100000 0000111 Q ss_pred ccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh--hcc-cccCceeeEEeceEEEEeCCCccCCCce Q lcl|NC_015254. 167 TGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE--FML-DSDNKKFPTYMGKRVIVDDGLPAKDGVY 243 (346) Q Consensus 167 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~~-~s~~~~i~~~~G~~VVvdD~~p~~~g~y 243 (346) +.....++++.|+++..++-++...-.+|+||+..+..|++..--+ ++. ....++.++++|+||++++.||.+.+.- T Consensus 146 ~~~~~~~~~~~l~~~~~~l~~~~~~~~~~v~n~~~~~~L~~lkd~~G~~l~~~~~~~~~~~l~G~PV~~~~~~~~~~~~~ 225 (299) T protein:vir:41 146 LVEETANKYDDLNEAIGLIEAEDLEPNGIATIRKQRVKYRSTKDGNGMPIFNTATSNGVDDVLGLPIAYTPKYTFGDKDI 225 (299) T ss_pred eeccccccHHHHHHHHHhhhcccCCcCEEEEcHHHHHHHHHhhccCCceeecCCcCCCCceecceeeEEecccCCCCCce Confidence 1223456889999999998887777789999999999999753111 111 1112334689999999999999765432 Q ss_pred EEEEEcCCe-eEEeecCCccceeeeecCCcc----------------eeEEEEeeEEee---eeeeeeeccccccCCCCC Q lcl|NC_015254. 244 TSYIFGEGA-FGLGNGEAPVPTETDREKLKG----------------NDILINRQHFLL---HPRGIAWQEKSVAGHSPT 303 (346) Q Consensus 244 tt~l~~~GA-i~~~~~~~~~~vE~dRd~~~g----------------~~~l~~r~~~~~---~~~G~s~~~~~~~~~sPt 303 (346) .++|+.-+ +.+.. +..+.+|..|+.... ...+....++.. ||.-+.-.. 
T Consensus 226 -~~~~gdfs~~~i~~-~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~d~~v~~~~A~~~l~--------- 294 (299) T protein:vir:41 226 -SELVGDWNQAYYGI-LRGVEYEILTEATLTTVADETGKPLNLAERDMAAIKATFEVGFMVVKDEAFSAVQ--------- 294 (299) T ss_pred -EEEEEecccEEEEE-ecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccceEEEE--------- Confidence 23333222 12333 234567777765311 111222222211 112222111 Q ss_pred hHHhcCCcCce Q lcl|NC_015254. 304 NTEIEKGNNWK 314 (346) Q Consensus 304 ~a~L~~~~NW~ 314 (346) ...|+ T Consensus 295 ------~~aa~ 299 (299) T protein:vir:41 295 ------PKAGN 299 (299) T ss_pred ------eccCC Confidence 11111 No 44 >protein:vir:107120 Length: 329 # NCBI annotation: conserved phage protein # Family: family:all:701 # MgeID: mge:1571 # MgeName: CNPH82 # Cross-refs: genbank:acc:YP_950606;genbank:gi:119953686;genbank:GeneID:4643129 Probab=99.34 E-value=5.8e-13 Score=87.68 Aligned_cols=292 Identities=11% Similarity=-0.042 Sum_probs=164.3 Q ss_pred Ccc-------ceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIK-------KLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~-------~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) ||| |||+|||.| |+.....-.+.--|.|.+.+.+.+...+ + -+..+++ ..+ -..+|++|.+|.+.. T Consensus 12 ~~~~~~~~~~~~~~~~~~~-~~~~~~~nt~~l~~k~~~~LD~~~~~~~-~-s~~~~~N-~~~---e~~~g~tVkIp~i~~ 84 (329) T protein:vir:10 12 MNKEIKNATGKLKLNLQHF-ANKSVEPGDTLLKNKHVGILEKVTAANS-Y-SAPAVIS-NDA---IFMQGRSFTVIKGDV 84 (329) T ss_pred hhhhhhcccceeEEehhhh-cCCccCCchhHHHHHHHHHHHHHHHhhc-e-eeeeecc-cce---eeccCcEEEEeeecc Confidence 887 578999999 5555555666566777777777654432 1 1112222 111 134799999999975 Q ss_pred CCCcccccCCCccccchhhcccceeEEEE-EeecCcceechHHHhhhcch--HHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_015254. 
74 LTGEDEILDDGEGALTPGNISAAKDIARL-HMRGKAWRTNDLAKALSGDD--PMRAIGDLVVEYWNRRRQAVLIASLNGI 150 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~-~~~~k~~~~tD~a~~~~g~d--p~~~i~~q~a~~~~~~~~~~lla~L~G~ 150 (346) . |-. +.+.+ +..+.+.++.....-++ +.|+.+|.+.|+...-+... .....+++........+|...++.|.+. T Consensus 85 ~-gl~-DY~R~-~g~~~g~vt~~~~t~tidqdR~~~F~VD~~D~dEtn~~l~a~~i~~~~~~~~v~pEiDay~~skla~~ 161 (329) T protein:vir:10 85 T-ELK-DYKRN-ATNEFDHPQIQETTYFLDQEKYWGRFVDALDRRDTEGNIDINYVVAKQASEVVAPYLDNLRFATLARN 161 (329) T ss_pred c-ccc-cccCC-CCccccccccceeEEEeecccceeeecchhhHhhhhhhhhHHHHHHHHHHHHhhhHHHHHHHHHHHhh Confidence 3 432 43322 35677777766555443 35777778877776655332 3344556666677777888888877432 Q ss_pred hhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccc-cCceEEEEchHHHHHHHhhhhhhhccccc-----CceeeE Q lcl|NC_015254. 151 TASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAE-GKLTGIAMHSQTEMNLRKQGLIEFMLDSD-----NKKFPT 224 (346) Q Consensus 151 ~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~~~~~L~~~~li~~~~~s~-----~~~i~~ 224 (346) .+. ..+ ++.++.-.++.|.+|..+|.+.. ..-..++|.|.+|.-|.+...+....... .+.|+. T Consensus 162 a~~----~~~------~~~t~~nay~~i~~a~~~Lde~~vp~~Rvl~VtP~~~~~Lk~~~~f~~~~~~~~~~~~~g~Vg~ 231 (329) T protein:vir:10 162 KAK----HLT------VGSGADAQYDAVLDVSVELDEIGAGASRILFVTPKFYKGIKKFVIELPQGDNRQQVLGKGVQGE 231 (329) T ss_pred ccc----ccc------cccCHHHHHHHHHHHHHHHHhcCCCCCcEEEeCHHHHHHHHhhhhhhccccccccceeeeeeee Confidence 111 111 11112224788999999997753 24468999999999999865543222211 356889 Q ss_pred EeceEEEEeCCCccCCCceEEEEE-cCCeeEEeecCCccceeeeec-CCcceeEEEEeeEEeeeee-----eeeeccccc Q lcl|NC_015254. 225 YMGKRVIVDDGLPAKDGVYTSYIF-GEGAFGLGNGEAPVPTETDRE-KLKGNDILINRQHFLLHPR-----GIAWQEKSV 297 (346) Q Consensus 225 ~~G~~VVvdD~~p~~~g~ytt~l~-~~GAi~~~~~~~~~~vE~dRd-~~~g~~~l~~r~~~~~~~~-----G~s~~~~~~ 297 (346) +.|++|+...+ ...+-..|++ -++|+.....-. .+|..|+ +....+.+..|..|.+-+. |+--..+.. 
T Consensus 232 idG~~Ii~vps---~~~k~in~ii~~~~A~~~~~K~~--~~~~~~p~~~~~a~~v~gr~yyd~~V~~~k~~~I~~~~~~a 306 (329) T protein:vir:10 232 LDGFTIVKVPS---KMLQGVEAMAVIGEVMASPIQAN--EAKLNSNVPGMFGTLAEQMLYTGAFVPEHLQKYIFTIGGKE 306 (329) T ss_pred ecCeEEEEecC---CcccceeEEEEcCCceeeeeeee--eeeeeCCCCccchheeeeeeeeeeEEEccccCEEEEecccC Confidence 99999997533 2222334554 467776643322 3555553 4443456666666655442 321111111 Q ss_pred cCCCCCh---HHhcCCcCceeee Q lcl|NC_015254. 298 AGHSPTN---TEIEKGNNWKAVY 317 (346) Q Consensus 298 ~~~sPt~---a~L~~~~NW~~v~ 317 (346) ...+.+. .-+++++.|+.=+ T Consensus 307 ~~~~~~~~~~~~~~~~~~~~~~~ 329 (329) T protein:vir:10 307 VETNRDGVDAHADETNASADTGA 329 (329) T ss_pred cccCCCCCCccccccccccccCC Confidence 1111111 1333444443322 No 45 >protein:vir:80213 Length: 334 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1879 # MgeName: LKA1 # Cross-refs: genbank:acc:YP_001522884;genbank:gi:158345177;genbank:GeneID:5687476 Probab=99.33 E-value=9.8e-14 Score=91.89 Aligned_cols=286 Identities=13% Similarity=0.002 Sum_probs=175.9 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) |..=-.-+|-+=+-+.+---.+++. |+|...|...+.+.+.|..-- .+.+ -.+|+++++|..+.. .++. T Consensus 1 m~~~~~~~~t~~~~~~~~~~~~l~l-e~~~geV~~af~~~s~~~~~~------~~r~--i~~G~s~~~~~iG~~--~~~~ 69 (334) T protein:vir:80 1 MTYPAANTHTRPGWGGANSDVSLHI-EEHLGLVDASFMYSSKFASWM------NVRS--LRGTNQLRVDRVGAS--TIAG 69 (334) T ss_pred CCCCcCCCccccccccccchheehh-hhhhhHHHHHHHHhhhhhccc------eeee--ccccceEEEeeecce--eeee Confidence 7665443333321111111124544 999999988888887774321 1111 146999999987765 2334 Q ss_pred cCCCccccchhhcccceeEEEEEe-ecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-Hhhhhhhhh-- Q lcl|NC_015254. 
81 LDDGEGALTPGNISAAKDIARLHM-RGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL-NGITASGAL-- 156 (346) Q Consensus 81 ~~dg~~~it~~~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L-~G~~~~~~~-- 156 (346) ..-| +.++.+.+.+.+..-++=. .-..+.+.|+....+..|...+++++.+.++++.+|..++..| +|....... T Consensus 70 ~~~g-~~l~~~~~~~~~~~l~ID~~l~~~~~VddiD~~q~~~D~rse~~~~~G~aLA~~~D~~~~~~l~kaa~~~~~~~~ 148 (334) T protein:vir:80 70 RKAG-EELVVQKNVSDKLNLTVDTVLYARHFFDKFDEWTSNLDVRKETAREDGIALARQYDQACIIQLQKCGDFLAPAHL 148 (334) T ss_pred ecCC-CCCCCCCcccCceEEEEeeeeehhhhHhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccc Confidence 4455 4788888888776655554 4446899999999999999999999999999999998777654 454433221 Q ss_pred -----hhcceeeeccccc--cccccH----HHHHHHHHHhCcccc-----CceEEEEchHHHHHHHhhh-hhhhccc--- Q lcl|NC_015254. 157 -----DSNKLDVSTETGD--DSYFTG----DTFLSATYKLGDAEG-----KLTGIAMHSQTEMNLRKQG-LIEFMLD--- 216 (346) Q Consensus 157 -----~~~~~dis~~~~~--~~~~~~----~~l~~A~~~~GD~~~-----~~~~ivmhS~~~~~L~~~~-li~~~~~--- 216 (346) .+.... +..++. ...-++ +.+.+|.+.|.+++- .-.+++|.|+.|..|++.. +++.... T Consensus 149 ~~~~~~G~~~~-~~~~g~~~~~~~~~~~l~~a~~~a~~~L~e~dvp~~~~~~R~~vv~P~~y~~Ll~~~r~~n~d~~~s~ 227 (334) T protein:vir:80 149 KPAFHDGILLP-STISGLAADAAADADVLVAAHRQGVEAMVFRDLGDQLMSEGVTLLDPVIFSFLLEHDRLMNVEFGAKE 227 (334) T ss_pred cccccCCccee-ecccccccchhhhHHHHHHHHHHHHHHHHhcCCCCCcCCceEEEeChHHHHHHhcccccccceecccc Confidence 111111 111111 112233 344456666654432 2379999999999999874 4443221 Q ss_pred c----cCceeeEEeceEEEEeCCCccCC-------CceE----------EEEEcCCeeEEeecCCccceeeeecCCccee Q lcl|NC_015254. 217 S----DNKKFPTYMGKRVIVDDGLPAKD-------GVYT----------SYIFGEGAFGLGNGEAPVPTETDREKLKGND 275 (346) Q Consensus 217 s----~~~~i~~~~G~~VVvdD~~p~~~-------g~yt----------t~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~ 275 (346) + .++.++.++|++|+.|..+|... +.|. ...+.+.|++..... 
++..|..|++....+ T Consensus 228 ~~~~~~~g~i~~v~G~~V~~Sn~~P~~~~t~~~~g~~~~~~agd~t~~~~~~~~~~Al~t~~~~-~~~~e~~~~~~~~~d 306 (334) T protein:vir:80 228 GGNSFVGGRIAMLNGVRVVETPRFPQSAITANALGADFNVTDAEVRRKMITFIPSMALISAQVH-PVSAQFWEEKKDFGH 306 (334) T ss_pred ccccccceeEEEEeceEEEeecCCCCccccccccccccccccccccceEEEEEeCceEEEEEEe-ecceeeeechhhHHH Confidence 1 13568999999999999999652 2222 234567888876655 456899999987777 Q ss_pred EEEEeeEEeeeee---eeeeccccccCCCC Q lcl|NC_015254. 276 ILINRQHFLLHPR---GIAWQEKSVAGHSP 302 (346) Q Consensus 276 ~l~~r~~~~~~~~---G~s~~~~~~~~~sP 302 (346) .+.+.+.|+..++ +..-.+ ....+| T Consensus 307 ~i~~~~a~G~g~lRPeaa~vv~--~~~~~~ 334 (334) T protein:vir:80 307 YLDTFQSYNIGQRRPDAVAVHD--ITVTNP 334 (334) T ss_pred HHHHHHHcCCceeccceEEEEE--EeeecC Confidence 6666655554432 211111 112356 No 46 >protein:vir:94142 Length: 304 # NCBI annotation: ORF013 # Family: family:all:507 # MgeID: mge:1494 # MgeName: 96 # Cross-refs: genbank:acc:YP_240234;genbank:gi:66395898;genbank:GeneID:5133311 Probab=99.32 E-value=4.9e-13 Score=88.06 Aligned_cols=268 Identities=11% Similarity=0.069 Sum_probs=156.9 Q ss_pred cceeeecCCceeeee--eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCC Q lcl|NC_015254. 7 MNLQKFAAGKNTRIA--DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDG 84 (346) Q Consensus 7 ~~~q~~~a~~~T~l~--d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg 84 (346) |=.|.+.+.++|.-+ -..+|+.+..-+.+...+.+.+.+..- ....++..+++|.+..- ..+..+.|+ T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~~~l~~~~~---------~~~~~~~~~~ip~~~~~-~~a~~v~E~ 70 (304) T protein:vir:94 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMANSAIMKLAK---------NEPMTAQKKKFTYLAKG-VGAYWVSET 70 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhccchhhhcc---------eeeccCCceEEEEEeCC-cceEEeecC Confidence 777777666555443 346788787766666666555543211 11234666889999743 445556777 Q ss_pred ccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh-----h- Q lcl|NC_015254. 
85 EGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD-----S- 158 (346) Q Consensus 85 ~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~-----~- 158 (346) + .++..+.+..+-....++.+.-+.++++...-+.-|..+.+.+++++.+++..+..++. |. +..... . T Consensus 71 ~-~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~ia~~~d~~~l~---G~-g~~~~~~~~~~~~ 145 (304) T protein:vir:94 71 E-RIQTSKPEYAQAEMEAKKIGVIIPLSKEFLKWTAKDFFNEVKPLIAEAFYKAFDQAVIF---GT-KSPYNTSTSGKPL 145 (304) T ss_pred c-ccccccceeeEEEEEEEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhee---cc-CCCcccccccccc Confidence 4 56666777666666677777778888877666667888999999999999999987653 21 110000 0 Q ss_pred -cceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccCc-----eeeEEeceEEEE Q lcl|NC_015254. 159 -NKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNK-----KFPTYMGKRVIV 232 (346) Q Consensus 159 -~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~-----~i~~~~G~~VVv 232 (346) ........+.+...+.++.|.++..++.+......+|+||+.++..|++.. ++++. ..++++|+||++ T Consensus 146 ~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~lk------d~~G~~l~~~~~~~l~G~PV~~ 219 (304) T protein:vir:94 146 VEGAEEKGNVVTDTNNLYVDLSALMATIEDEELDPNGVLTTRSFRSKMRNAL------DANDRPLFDANGNEIMGLPLSY 219 (304) T ss_pred cccccccccccccccchHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHHhh------ccCCcEeecCCCccccceeeEE Confidence 000111112233456789999999999888777789999999999998642 23322 236899999999 Q ss_pred eCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecCC------------------cceeEEEEeeEEeeee---eee Q lcl|NC_015254. 233 DDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREKL------------------KGNDILINRQHFLLHP---RGI 290 (346) Q Consensus 233 dD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~~------------------~g~~~l~~r~~~~~~~---~G~ 290 (346) ++.+|...+.... +|+. --+.+..-+ .+.++..|+.. 
.++..+....+|...| ..| T Consensus 220 ~~~~~~~~~~~~~-~~gd~~~~~~~~~~-~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~~~~~~r~~~r~~~~v~~~~a~ 297 (304) T protein:vir:94 220 TGADVYDKKKSLA-LMGDWDYARYGILQ-GIEYAISEDATLTTLQASDASGQPVSLFERDMFALRATMHIAYMNVKPEAF 297 (304) T ss_pred ecccccCCCCcEE-EEEehhhEEEEEec-ceEEEEeecceeeeecccccCccchhhhhcCcEEEEEEEEeccEeecccce Confidence 9999976554432 2221 112233222 23344444431 0112222222222221 222 Q ss_pred eeccccccCCCCCh Q lcl|NC_015254. 291 AWQEKSVAGHSPTN 304 (346) Q Consensus 291 s~~~~~~~~~sPt~ 304 (346) .-- .+++ T Consensus 298 ~~l-------~~a~ 304 (304) T protein:vir:94 298 ATL-------KPTE 304 (304) T ss_pred EEE-------EecC Confidence 111 1111 No 47 >protein:vir:105905 Length: 304 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:1514 # MgeName: phiETA3 # Cross-refs: genbank:acc:YP_001004375;genbank:gi:122891830;genbank:GeneID:4712376 Probab=99.32 E-value=4.9e-13 Score=88.06 Aligned_cols=268 Identities=11% Similarity=0.069 Sum_probs=156.9 Q ss_pred cceeeecCCceeeee--eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCC Q lcl|NC_015254. 7 MNLQKFAAGKNTRIA--DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDG 84 (346) Q Consensus 7 ~~~q~~~a~~~T~l~--d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg 84 (346) |=.|.+.+.++|.-+ -..+|+.+..-+.+...+.+.+.+..- ....++..+++|.+..- ..+..+.|+ T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~~~l~~~~~---------~~~~~~~~~~ip~~~~~-~~a~~v~E~ 70 (304) T protein:vir:10 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMANSAIMKLAK---------NEPMTAQKKKFTYLAKG-VGAYWVSET 70 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhccchhhhcc---------eeeccCCceEEEEEeCC-cceEEeecC Confidence 777777666555443 346788787766666666555543211 11234666889999743 445556777 Q ss_pred ccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh-----h- Q lcl|NC_015254. 
85 EGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD-----S- 158 (346) Q Consensus 85 ~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~-----~- 158 (346) + .++..+.+..+-....++.+.-+.++++...-+.-|..+.+.+++++.+++..+..++. |. +..... . T Consensus 71 ~-~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~ia~~~d~~~l~---G~-g~~~~~~~~~~~~ 145 (304) T protein:vir:10 71 E-RIQTSKPEYAQAEMEAKKIGVIIPLSKEFLKWTAKDFFNEVKPLIAEAFYKAFDQAVIF---GT-KSPYNTSTSGKPL 145 (304) T ss_pred c-ccccccceeeEEEEEEEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhee---cc-CCCcccccccccc Confidence 4 56666777666666677777778888877666667888999999999999999987653 21 110000 0 Q ss_pred -cceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccCc-----eeeEEeceEEEE Q lcl|NC_015254. 159 -NKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNK-----KFPTYMGKRVIV 232 (346) Q Consensus 159 -~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~-----~i~~~~G~~VVv 232 (346) ........+.+...+.++.|.++..++.+......+|+||+.++..|++.. ++++. ..++++|+||++ T Consensus 146 ~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~lk------d~~G~~l~~~~~~~l~G~PV~~ 219 (304) T protein:vir:10 146 VEGAEEKGNVVTDTNNLYVDLSALMATIEDEELDPNGVLTTRSFRSKMRNAL------DANDRPLFDANGNEIMGLPLSY 219 (304) T ss_pred cccccccccccccccchHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHHhh------ccCCcEeecCCCccccceeeEE Confidence 000111112233456789999999999888777789999999999998642 23322 236899999999 Q ss_pred eCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecCC------------------cceeEEEEeeEEeeee---eee Q lcl|NC_015254. 233 DDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREKL------------------KGNDILINRQHFLLHP---RGI 290 (346) Q Consensus 233 dD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~~------------------~g~~~l~~r~~~~~~~---~G~ 290 (346) ++.+|...+.... +|+. --+.+..-+ .+.++..|+.. 
.++..+....+|...| ..| T Consensus 220 ~~~~~~~~~~~~~-~~gd~~~~~~~~~~-~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~~~~~~r~~~r~~~~v~~~~a~ 297 (304) T protein:vir:10 220 TGADVYDKKKSLA-LMGDWDYARYGILQ-GIEYAISEDATLTTLQASDASGQPVSLFERDMFALRATMHIAYMNVKPEAF 297 (304) T ss_pred ecccccCCCCcEE-EEEehhhEEEEEec-ceEEEEeecceeeeecccccCccchhhhhcCcEEEEEEEEeccEeecccce Confidence 9999976554432 2221 112233222 23344444431 0112222222222221 222 Q ss_pred eeccccccCCCCCh Q lcl|NC_015254. 291 AWQEKSVAGHSPTN 304 (346) Q Consensus 291 s~~~~~~~~~sPt~ 304 (346) .-- .+++ T Consensus 298 ~~l-------~~a~ 304 (304) T protein:vir:10 298 ATL-------KPTE 304 (304) T ss_pred EEE-------EecC Confidence 111 1111 No 48 >protein:vir:4856 Length: 293 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:106 # MgeName: DT1 # Cross-refs: genbank:acc:NP_049396;genbank:gi:9632424;genbank:GeneID:1258532 Probab=99.30 E-value=3.4e-12 Score=83.48 Aligned_cols=280 Identities=11% Similarity=0.054 Sum_probs=161.1 Q ss_pred ceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccc Q lcl|NC_015254. 8 NLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGA 87 (346) Q Consensus 8 ~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~ 87 (346) =|+--+..+ +.-.-..+|+.+..-+.+...+.+.+.+-.-+.+ .......+.+|.+..-++.+..+.|+. . T Consensus 1 ~l~~~~~~t-~~~gg~liP~~~~~~Ii~~~~~~~~l~~~~~~~~-------~~~~~g~~~~~~~~~~~~~a~~v~Eg~-~ 71 (293) T protein:vir:48 1 MLDSKTDHS-GSDAGLTIPQDIRTAINTLVRQYDSLQEYVNVEN-------VTTLTGSRVYEKWTDITGLANIDDEAG-K 71 (293) T ss_pred Cceeecccc-cCcCceEechhHHHHHHHHHHhhhhhhhhceeee-------ccCCcceEEEEeecCCCcceeeecCCc-c Confidence 122222222 2222356788888777777777666644211111 122234667777766556677778875 4 Q ss_pred cc-hhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecc Q lcl|NC_015254. 
88 LT-PGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTE 166 (346) Q Consensus 88 it-~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~ 166 (346) ++ .++.+-.+-.-..++.+..+.++++...-+.-|..+.+.+++++.+.+..++.++..+... T Consensus 72 ~~~~~~~~~~~i~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~i~~g~~~~---------------- 135 (293) T protein:vir:48 72 IADIDDPKLSLIKYTIKRYAGISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILGVVDKL---------------- 135 (293) T ss_pred cccccccceeEEEEeeeEEEEeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHhHHhhccccc---------------- Confidence 44 3455656556666677777788887766666678889999999999999998776543210 Q ss_pred ccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEEeceEEEEeCCCcc--CC Q lcl|NC_015254. 167 TGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTYMGKRVIVDDGLPA--KD 240 (346) Q Consensus 167 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~~G~~VVvdD~~p~--~~ 240 (346) +.....++++.|.++..++......-.+|+||+.++..|++..-- .++. ...++.-++++|+||++.+..+. .. T Consensus 136 ~~~~~~~~~d~i~~~~~~l~~~~~~~a~~vmn~~~~~~L~~lkd~~g~~l~~~~~~~~~~~~l~G~Pv~~~~~~~~~~~~ 215 (293) T protein:vir:48 136 PTKPTLTKWDDIIDLEAKVDPAIKQTSFFLTNTSGFTALKKVKNALGDYLMERDVKSPTGYSIAGFAVKEISDRWLPNAS 215 (293) T ss_pred cccccccCHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHhhccCCceEeecCcCCCCCceecceeeEEecccccCCcc Confidence 112244688999999999877777778999999999999875311 1222 12244557899999988655443 22 Q ss_pred CceEEEEEcC--CeeEEeecCCccceeeeecC----CcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCce Q lcl|NC_015254. 241 GVYTSYIFGE--GAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWK 314 (346) Q Consensus 241 g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~ 314 (346) ..-.+++|+. -++.+...+ .+.++.+|.. ..+...+....++.+. 
T Consensus 216 ~~~~~~~~gd~~~~~~~~~~~-~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~---------------------------- 266 (293) T protein:vir:48 216 SGVMPLYFGDLKQAVTLFDRQ-QMSLLSTNIGGGAFETDTTKVRVIDRFDVV---------------------------- 266 (293) T ss_pred CCceEEEEEeccceEEEEEec-ceEEEEecccchhhhcCeEEEEEEEeeCcE---------------------------- Confidence 2222345543 234333322 3445554432 1222333333332221 Q ss_pred eeecccccceEEEEEecccccccCCCCC Q lcl|NC_015254. 315 AVYESKNIRIVAFVHKNGVPGKKKETAP 342 (346) Q Consensus 315 ~v~~~K~i~iv~~~~k~~~~~~~~~~~~ 342 (346) +.+++.|..+.+.+....+++.+..|- T Consensus 267 -~~~~~a~~~l~~~~~~~~~~~~~~~~~ 293 (293) T protein:vir:48 267 -ATDTEAFVPASFKAIADQKGNIGSTAV 293 (293) T ss_pred -EecccceEEEEeeccccCCccccccCC Confidence 233344444444444445555555444 No 49 >protein:vir:94576 Length: 347 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1516 # MgeName: Berlin # Cross-refs: genbank:acc:YP_919012;genbank:gi:119637776;genbank:GeneID:5179336 Probab=99.28 E-value=4.6e-13 Score=88.24 Aligned_cols=289 Identities=13% Similarity=0.033 Sum_probs=173.8 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) |..--+||.|-=-++...-.-.+++ |+|...|...+.+.+.|...- ...+ + .+|+++.+|..+... ++. T Consensus 4 ~~~~~~~~t~~g~~~~~~d~~al~i-e~~~geV~~~f~~~s~~~~~~------~~rt-i-~~G~sv~~~~iG~~~--~~~ 72 (347) T protein:vir:94 4 MNGGQQMGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKH------LVRS-I-QSGKSAQFPVLGRTK--AAY 72 (347) T ss_pred cccccccccccccCCcccchHHHHH-HHHhHHHHHHHHHHHhhhhhh------hhee-c-cccceEEeeecccee--Eee Confidence 2222244444333333333333666 999999998888888774321 1111 1 369999999888763 344 Q ss_pred cCCCcccc-chhhcccceeEEEEEee-cCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh- Q lcl|NC_015254. 
81 LDDGEGAL-TPGNISAAKDIARLHMR-GKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD- 157 (346) Q Consensus 81 ~~dg~~~i-t~~~lt~~~~~a~~~~~-~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~- 157 (346) ...|.... +.+.+...+..-++=.. -..+.+.|+....+..|++.+++++.+.++++.+|+.++..|.-..+..... T Consensus 73 ~~~G~~l~~~~~~~~~~e~~ltID~~~y~~~~VddiD~~q~~~D~rs~~~~~~g~ALA~~~D~~i~~~l~~~a~~~~~~~ 152 (347) T protein:vir:94 73 LQPGENLDDKRKDMKHTEKTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPTANN 152 (347) T ss_pred eecCcCCCCCcCCccccceEEEEcchhhhhhhhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccc Confidence 45554222 22456666665444433 4457899999999999999999999999999999998887654222211110 Q ss_pred ------hcc--eeeecc---ccc---cccccHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhhhhcc-c---- Q lcl|NC_015254. 158 ------SNK--LDVSTE---TGD---DSYFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLIEFML-D---- 216 (346) Q Consensus 158 ------~~~--~dis~~---~~~---~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li~~~~-~---- 216 (346) ... ..+... .+. .+.--++.|.+|.++|.++. ..-..+++.|+.|..|.+.......- . T Consensus 153 ~~~~g~~~~~~v~i~~~~~~~~~~~~~~~~~~d~i~~a~~~Lde~dVP~~~R~~vv~P~~y~~LLk~~~~~~~~~~~~~~ 232 (347) T protein:vir:94 153 ENIAGLGKAHVLEVGDQATLQGDQVKLGQAIIAQLTLARAKLTGNYVPSSDRVFYTTPDNYSAILAALMPNAANYQALID 232 (347) T ss_pred cccccCCcceeEeeeccccccccccccHHHHHHHHHHHHHHhhhcCCCCCCCEEEeChHHHHHHHHhhcccccccccccc Confidence 001 111110 000 01112466778888886654 34478999999999999742211111 1 Q ss_pred ccCceeeEEeceEEEEeCCCccCC------------------------CceE-------EEEEcCCeeEEeecCCcccee Q lcl|NC_015254. 217 SDNKKFPTYMGKRVIVDDGLPAKD------------------------GVYT-------SYIFGEGAFGLGNGEAPVPTE 265 (346) Q Consensus 217 s~~~~i~~~~G~~VVvdD~~p~~~------------------------g~yt-------t~l~~~GAi~~~~~~~~~~vE 265 (346) -..|.|+.++|++|+.+..+|... +.|. 
..++-+-|++.....+ ..+| T Consensus 233 ~~~G~V~~v~G~~V~~Sn~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~d~~~~~~l~~~~~A~~tv~~~~-~~~e 311 (347) T protein:vir:94 233 PSTGSIRNVMGFEVIEVPHLTAGGAGDNRAEEGVAPTNQKHAFPDTASGDTRVALDNVVGLFNHRSAVGTVKLKD-MALE 311 (347) T ss_pred cccceeEEeeceEEEEcCccccccCcccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcc-ccee Confidence 124679999999999999998532 1231 3455666776655554 3589 Q ss_pred eeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCCh Q lcl|NC_015254. 266 TDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTN 304 (346) Q Consensus 266 ~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~ 304 (346) ..|++....+.+.+++.|+..|+= | +..+.-..+.- T Consensus 312 ~~~~~~~~~~~i~~~~a~G~g~~r--P-e~a~~i~~~~a 347 (347) T protein:vir:94 312 RARRANFQADQIIAKYAMGHGGLR--P-EACGALVFKKA 347 (347) T ss_pred eeechhhhhhhhhhhhhhcCcccc--c-ceeEEEEecCC Confidence 999999999999888888877642 1 00000000000 No 50 >protein:vir:1541 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:31 # MgeName: phiYeO3-12 # Cross-refs: genbank:acc:NP_052109;swissprot:trembl:q9t107;genbank:gi:9634035;uniprot:Q9T107;genbank:GeneID:1262383 Probab=99.28 E-value=9.4e-13 Score=86.53 Aligned_cols=286 Identities=14% Similarity=0.066 Sum_probs=159.8 Q ss_pred Ccc---ceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCc Q lcl|NC_015254. 1 MIK---KLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGE 77 (346) Q Consensus 1 ~~~---~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ 77 (346) |=. .=..|.|-=-++...-.--+++ |+|...|...+.+.+.|.. .-... . -.+|+++.+|..+... 
T Consensus 1 ma~~~~~~~~~t~~~~~~~~~~~~a~~i-e~f~g~V~~~f~~~s~~~~------~~~~~-~-~~~G~sv~i~~ig~~t-- 69 (347) T protein:vir:15 1 MANIQGGQQIGTNQGKGQSAADKLALFL-KVFGGEVLTAFARTSVTMP------RHMLR-S-IASGKSAQFPVIGRTK-- 69 (347) T ss_pred CCccccCCccccccccCCCcchHHHHHH-HHHHHHHHHHHHHhhhhhh------ccccc-c-ccccceeEeeecccee-- Confidence 211 1111211111111111111333 5666666666555554422 11111 1 1369999999998763 Q ss_pred ccccCCCcccc--chhhcccceeEEEEEe-ecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhh Q lcl|NC_015254. 78 DEILDDGEGAL--TPGNISAAKDIARLHM-RGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASG 154 (346) Q Consensus 78 ae~~~dg~~~i--t~~~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~ 154 (346) .+....|. .+ +++.++..+..-++=. .-.++.+.|+....+..|++.+++++.+.++++++|+.++..|.++.... T Consensus 70 ~~~~~~g~-~l~~~~~~~~~~e~~ltID~~~~~~~~VddlD~~q~~~D~~~~~~~~~g~aLA~~~D~~i~~~l~~~~~~~ 148 (347) T protein:vir:15 70 AAYLKPGE-NLDDKRKDIKHTEKVIHIDGLLTADVLIYDIEDAMNHYDVRAEYTAQLGESLAMAADGAVLAELAGLVNLP 148 (347) T ss_pred eeeeccCC-CCCCCCCCCccceEEEEechhhhhhHHhhhHHHHhcCCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc Confidence 44444553 44 3455666665544433 34467889999999999999999999999999999999999887654322 Q ss_pred hhhhcce------------eeecccccccccc----HHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhh-hhhcc Q lcl|NC_015254. 155 ALDSNKL------------DVSTETGDDSYFT----GDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGL-IEFML 215 (346) Q Consensus 155 ~~~~~~~------------dis~~~~~~~~~~----~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~l-i~~~~ 215 (346) ....... ...+....++... ++.+.+|..+|..+. ..-..++|.|..|..|++..- +..-. 
T Consensus 149 ~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~i~d~~~~a~~~Lde~~VP~~gR~~vv~P~~y~~LL~~~~~~~~d~ 228 (347) T protein:vir:15 149 DASNENIEGLGKPTVLTLVKPTTGDLTDPVELGKAIIAQLTIARASLTKNYVPAADRTFYTTPDNYSAILAALMPNAANY 228 (347) T ss_pred ccccccccccCccccccccccccccchhhhhHHHHHHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHhcccccccccc Confidence 1111000 0000001111112 455666677776543 344789999999999998743 22222 Q ss_pred cc----cCceeeEEeceEEEEeCCCccCCCc-----------e-----------------EEEEEcCCeeEEeecCCccc Q lcl|NC_015254. 216 DS----DNKKFPTYMGKRVIVDDGLPAKDGV-----------Y-----------------TSYIFGEGAFGLGNGEAPVP 263 (346) Q Consensus 216 ~s----~~~~i~~~~G~~VVvdD~~p~~~g~-----------y-----------------tt~l~~~GAi~~~~~~~~~~ 263 (346) .+ ..|.|+.++|++|+.|+.+|...+. | -.+++-+-|++....++ +. T Consensus 229 ~~~~~~~~G~Vg~i~G~~V~~Sn~lp~~~~t~~~~~~~~g~~~~~~~~~~~~~~~~f~~~~~l~~h~~A~g~v~~~~-~~ 307 (347) T protein:vir:15 229 QALIDHERGTIRNVMGFEVVEVPHLTAGGAGDTREDAPADQKHAFPATSSTTVKVALDNVVGLFQHRSAVGTVKLKD-LA 307 (347) T ss_pred cccccccceEEEEEeceEEEecccccccccccccccccccccccccccccceeeeccccceeeeeccceeeeeEeec-ee Confidence 21 1467899999999999999964321 0 01334455666655443 46 Q ss_pred eeeeecCCcceeEEEEeeEEeeeee---e-eeeccccccCCCCChHH Q lcl|NC_015254. 264 TETDREKLKGNDILINRQHFLLHPR---G-IAWQEKSVAGHSPTNTE 306 (346) Q Consensus 264 vE~dRd~~~g~~~l~~r~~~~~~~~---G-~s~~~~~~~~~sPt~a~ 306 (346) +|..|++....|.+...+.|+..+. 
+ ..+.- |--+| T Consensus 308 ~e~~~~~~~~~d~i~~~~~~G~~vlrP~~av~~~~-------~~~~~ 347 (347) T protein:vir:15 308 LERARRANYQADQIIAKYAMGHGGLRPEAAGAIVL-------PKVSE 347 (347) T ss_pred eeecccchhhhhhhehhhhcCCceeccccEEEEec-------CCCCC Confidence 8888988776666665555543332 1 11111 11112 No 51 >protein:vir:81100 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:1891 # MgeName: tp310-1 # Cross-refs: genbank:acc:YP_001429874;genbank:gi:156603927;genbank:GeneID:5525320 Probab=99.28 E-value=1.8e-12 Score=84.99 Aligned_cols=297 Identities=10% Similarity=-0.018 Sum_probs=153.5 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCc--EEEecccccCCCcc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGN--MINMPFWQDLTGED 78 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~--ti~~P~~~~l~g~a 78 (346) ++.....+.... +..+|.-.-.++|+.+.+-+.+...+.+.+.+- . .....++. .+.+|.+... ... T Consensus 110 ~~~~~~~~~~~~-~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~------~---~~~~~~~~~~~~~~~~~~~~-~~~ 178 (415) T protein:vir:81 110 TEYLETRNDIQG-GSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKY------V---TVKRVTNGSGKYPVVRQSEV-AAL 178 (415) T ss_pred HHHHhhhhhhhh-ccccccccccccchHHHHHHHHHHHhhhhhhhh------e---eeeeccCCceeEEEEeecCC-ccc Confidence 111111111111 222233356689988877777665555544221 1 11111233 4444555443 234 Q ss_pred cccCCCccccchh-hcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh Q lcl|NC_015254. 79 EILDDGEGALTPG-NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD 157 (346) Q Consensus 79 e~~~dg~~~it~~-~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~ 157 (346) ..+.|+. .++.. ..+-.+-....++.+.-..+++....-+.-|..+.+.+++++.+.+..++.++..+. .-...... 
T Consensus 179 ~~v~E~~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g-~g~~~~~~ 256 (415) T protein:vir:81 179 EKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVIT-KGSTGSTS 256 (415) T ss_pred eeecccc-ccCcccccceeeEEeeeeeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc-cCcccccc Confidence 4456664 44432 334444445566666667777776555555778889999999999999987765431 10000000 Q ss_pred hcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEEeceEEEEe Q lcl|NC_015254. 158 SNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTYMGKRVIVD 233 (346) Q Consensus 158 ~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~~G~~VVvd 233 (346) .........+.....++++.|.+++.++.+....-.+|+||+.++..|++..-- .++. ...++..++++|+||+++ T Consensus 257 ~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~lkd~~G~~l~~~~~~~~~~~~l~G~pV~~~ 336 (415) T protein:vir:81 257 SGFEKEGKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKMKDKLGNYLIQPDVKEKTQQRLLGAKIEIL 336 (415) T ss_pred ccccccccccccccccchhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHhhccCCceeeccCcCCCCCceecceeeEEe Confidence 011111122233455789999999999988777778999999999999874211 1222 222455678999999999 Q ss_pred CCCccCCCceEEEEEc--CCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCc Q lcl|NC_015254. 234 DGLPAKDGVYTSYIFG--EGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGN 311 (346) Q Consensus 234 D~~p~~~g~ytt~l~~--~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~ 311 (346) +.+|.....-.+++|+ ..++.+... ..+.+++.+.... .+.+..-.++-.. T Consensus 337 ~~~~~~~~~~~~~~~Gd~~~~~~~~~~-~~~~v~~~~~~~~-~~~~~~~~r~d~~------------------------- 389 (415) T protein:vir:81 337 PDEVLGQKGNNTLIIGNLKDAIVLFDR-SQYQASWTDYMHF-GECLMIAVRQDCR------------------------- 389 (415) T ss_pred cccccCCCCccEEEEEehhccEEEEee-cceEEEEeccccC-ceEEEEEEEeccE------------------------- Confidence 9998754322345665 334433332 2345555443321 2222222222211 Q ss_pred CceeeecccccceEEEEEecccccccCCCC Q lcl|NC_015254. 
312 NWKAVYESKNIRIVAFVHKNGVPGKKKETA 341 (346) Q Consensus 312 NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~ 341 (346) +.+++.+-++.+.+....+|--+-.+ T Consensus 390 ----v~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:81 390 ----ILDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred ----EeccccEEEEEEeccCCCCCccccCC Confidence 12233333333333332222222222 No 52 >protein:vir:98339 Length: 415 # NCBI annotation: putative capsid protein # Family: family:all:21 # MgeID: mge:1581 # MgeName: phiPVL(108) # Cross-refs: genbank:acc:YP_918931;genbank:gi:119443693;genbank:GeneID:4594501 Probab=99.28 E-value=1.8e-12 Score=84.99 Aligned_cols=297 Identities=10% Similarity=-0.018 Sum_probs=153.5 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCc--EEEecccccCCCcc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGN--MINMPFWQDLTGED 78 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~--ti~~P~~~~l~g~a 78 (346) ++.....+.... +..+|.-.-.++|+.+.+-+.+...+.+.+.+- . .....++. .+.+|.+... ... T Consensus 110 ~~~~~~~~~~~~-~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~------~---~~~~~~~~~~~~~~~~~~~~-~~~ 178 (415) T protein:vir:98 110 TEYLETRNDIQG-GSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKY------V---TVKRVTNGSGKYPVVRQSEV-AAL 178 (415) T ss_pred HHHHhhhhhhhh-ccccccccccccchHHHHHHHHHHHhhhhhhhh------e---eeeeccCCceeEEEEeecCC-ccc Confidence 111111111111 222233356689988877777665555544221 1 11111233 4444555443 234 Q ss_pred cccCCCccccchh-hcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh Q lcl|NC_015254. 79 EILDDGEGALTPG-NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD 157 (346) Q Consensus 79 e~~~dg~~~it~~-~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~ 157 (346) ..+.|+. .++.. ..+-.+-....++.+.-..+++....-+.-|..+.+.+++++.+.+..++.++..+. .-...... 
T Consensus 179 ~~v~E~~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g-~g~~~~~~ 256 (415) T protein:vir:98 179 EKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVIT-KGSTGSTS 256 (415) T ss_pred eeecccc-ccCcccccceeeEEeeeeeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc-cCcccccc Confidence 4456664 44432 334444445566666667777776555555778889999999999999987765431 10000000 Q ss_pred hcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEEeceEEEEe Q lcl|NC_015254. 158 SNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTYMGKRVIVD 233 (346) Q Consensus 158 ~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~~G~~VVvd 233 (346) .........+.....++++.|.+++.++.+....-.+|+||+.++..|++..-- .++. ...++..++++|+||+++ T Consensus 257 ~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~lkd~~G~~l~~~~~~~~~~~~l~G~pV~~~ 336 (415) T protein:vir:98 257 SGFEKEGKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKMKDKLGNYLIQPDVKEKTQQRLLGAKIEIL 336 (415) T ss_pred ccccccccccccccccchhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHhhccCCceeeccCcCCCCCceecceeeEEe Confidence 011111122233455789999999999988777778999999999999874211 1222 222455678999999999 Q ss_pred CCCccCCCceEEEEEc--CCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCc Q lcl|NC_015254. 234 DGLPAKDGVYTSYIFG--EGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGN 311 (346) Q Consensus 234 D~~p~~~g~ytt~l~~--~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~ 311 (346) +.+|.....-.+++|+ ..++.+... ..+.+++.+.... .+.+..-.++-.. T Consensus 337 ~~~~~~~~~~~~~~~Gd~~~~~~~~~~-~~~~v~~~~~~~~-~~~~~~~~r~d~~------------------------- 389 (415) T protein:vir:98 337 PDEVLGQKGNNTLIIGNLKDAIVLFDR-SQYQASWTDYMHF-GECLMIAVRQDCR------------------------- 389 (415) T ss_pred cccccCCCCccEEEEEehhccEEEEee-cceEEEEeccccC-ceEEEEEEEeccE------------------------- Confidence 9998754322345665 334433332 2345555443321 2222222222211 Q ss_pred CceeeecccccceEEEEEecccccccCCCC Q lcl|NC_015254. 
312 NWKAVYESKNIRIVAFVHKNGVPGKKKETA 341 (346) Q Consensus 312 NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~ 341 (346) +.+++.+-++.+.+....+|--+-.+ T Consensus 390 ----v~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:98 390 ----ILDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred ----EeccccEEEEEEeccCCCCCccccCC Confidence 12233333333333332222222222 No 53 >protein:vir:79987 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:1875 # MgeName: tp310-3 # Cross-refs: genbank:acc:YP_001430002;genbank:gi:156604057;genbank:GeneID:5525447 Probab=99.28 E-value=1.8e-12 Score=84.99 Aligned_cols=297 Identities=10% Similarity=-0.018 Sum_probs=153.5 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCc--EEEecccccCCCcc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGN--MINMPFWQDLTGED 78 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~--ti~~P~~~~l~g~a 78 (346) ++.....+.... +..+|.-.-.++|+.+.+-+.+...+.+.+.+- . .....++. .+.+|.+... ... T Consensus 110 ~~~~~~~~~~~~-~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~------~---~~~~~~~~~~~~~~~~~~~~-~~~ 178 (415) T protein:vir:79 110 TEYLETRNDIQG-GSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKY------V---TVKRVTNGSGKYPVVRQSEV-AAL 178 (415) T ss_pred HHHHhhhhhhhh-ccccccccccccchHHHHHHHHHHHhhhhhhhh------e---eeeeccCCceeEEEEeecCC-ccc Confidence 111111111111 222233356689988877777665555544221 1 11111233 4444555443 234 Q ss_pred cccCCCccccchh-hcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh Q lcl|NC_015254. 79 EILDDGEGALTPG-NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD 157 (346) Q Consensus 79 e~~~dg~~~it~~-~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~ 157 (346) ..+.|+. .++.. ..+-.+-....++.+.-..+++....-+.-|..+.+.+++++.+.+..++.++..+. .-...... 
T Consensus 179 ~~v~E~~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g-~g~~~~~~ 256 (415) T protein:vir:79 179 EKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVIT-KGSTGSTS 256 (415) T ss_pred eeecccc-ccCcccccceeeEEeeeeeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc-cCcccccc Confidence 4456664 44432 334444445566666667777776555555778889999999999999987765431 10000000 Q ss_pred hcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEEeceEEEEe Q lcl|NC_015254. 158 SNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTYMGKRVIVD 233 (346) Q Consensus 158 ~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~~G~~VVvd 233 (346) .........+.....++++.|.+++.++.+....-.+|+||+.++..|++..-- .++. ...++..++++|+||+++ T Consensus 257 ~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~lkd~~G~~l~~~~~~~~~~~~l~G~pV~~~ 336 (415) T protein:vir:79 257 SGFEKEGKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKMKDKLGNYLIQPDVKEKTQQRLLGAKIEIL 336 (415) T ss_pred ccccccccccccccccchhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHhhccCCceeeccCcCCCCCceecceeeEEe Confidence 011111122233455789999999999988777778999999999999874211 1222 222455678999999999 Q ss_pred CCCccCCCceEEEEEc--CCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCc Q lcl|NC_015254. 234 DGLPAKDGVYTSYIFG--EGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGN 311 (346) Q Consensus 234 D~~p~~~g~ytt~l~~--~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~ 311 (346) +.+|.....-.+++|+ ..++.+... ..+.+++.+.... .+.+..-.++-.. T Consensus 337 ~~~~~~~~~~~~~~~Gd~~~~~~~~~~-~~~~v~~~~~~~~-~~~~~~~~r~d~~------------------------- 389 (415) T protein:vir:79 337 PDEVLGQKGNNTLIIGNLKDAIVLFDR-SQYQASWTDYMHF-GECLMIAVRQDCR------------------------- 389 (415) T ss_pred cccccCCCCccEEEEEehhccEEEEee-cceEEEEeccccC-ceEEEEEEEeccE------------------------- Confidence 9998754322345665 334433332 2345555443321 2222222222211 Q ss_pred CceeeecccccceEEEEEecccccccCCCC Q lcl|NC_015254. 
312 NWKAVYESKNIRIVAFVHKNGVPGKKKETA 341 (346) Q Consensus 312 NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~ 341 (346) +.+++.+-++.+.+....+|--+-.+ T Consensus 390 ----v~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:79 390 ----ILDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred ----EeccccEEEEEEeccCCCCCccccCC Confidence 12233333333333332222222222 No 54 >protein:vir:94711 Length: 347 # NCBI annotation: capsid # Family: family:all:975 # MgeID: mge:1528 # MgeName: K1F # Cross-refs: genbank:acc:YP_338120;genbank:gi:77118198;genbank:GeneID:3707734 Probab=99.27 E-value=2.7e-13 Score=89.47 Aligned_cols=285 Identities=12% Similarity=0.040 Sum_probs=163.6 Q ss_pred CccceecceeeecCCceeeeee----ccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIAD----VIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d----~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) |...-+++.|-=-+++..---+ .|.|||+..| .+.+.|.. .-.... -.+|+++++|..+... T Consensus 3 ~~~~~~~~t~~g~~~~~~d~~al~ik~f~~eV~~~f-----~~~s~~~~------~~~~r~--i~~G~sv~i~~iG~~t- 68 (347) T protein:vir:94 3 NVPGQKIGTDQGKGKSSSDALALFLKVFAGEVLTAF-----TRRSVTAD------KHIVRT--IQNGKSAQFPVMGRTS- 68 (347) T ss_pred CCCccccccccccCCccccHHHHHHHHHhHHHHHHH-----HHHHhhhc------cccccc--ccccceEEEeccccee- Confidence 6666677766543333322223 4566666554 33333321 111111 1369999999988763 Q ss_pred cccccCCCcccc--chhhcccceeEEEEEee-cCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_015254. 77 EDEILDDGEGAL--TPGNISAAKDIARLHMR-GKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITAS 153 (346) Q Consensus 77 ~ae~~~dg~~~i--t~~~lt~~~~~a~~~~~-~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~ 153 (346) ......|+ ++ +++.+...+..-++-.. -..+.+.|+.....-.|++.+.+++.+.++++.+|..++..|..+.+. 
T Consensus 69 -v~~~t~G~-~l~~~~~~~~~~e~~itID~~~~~~~~VddiD~~q~~~D~~~~~~~~~g~aLa~~~D~~i~~~~~~~aa~ 146 (347) T protein:vir:94 69 -GVYLAPGE-RLSDKRKGIKHTEKVITIDGLLTADVMIFDIEDAMNHYDVAGEYSNQLGEALAIAADGAVLAEMAILCNL 146 (347) T ss_pred -eeeecCCC-CcCCCCCCCCcceEEEEecchhhhhHHhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHHhcc Confidence 34444453 55 44456666554444333 235688899999999999999999999999999999999877544332 Q ss_pred hhhhh---------cceeeeccccc-ccccc----HHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhhhhcccc Q lcl|NC_015254. 154 GALDS---------NKLDVSTETGD-DSYFT----GDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLIEFMLDS 217 (346) Q Consensus 154 ~~~~~---------~~~dis~~~~~-~~~~~----~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li~~~~~s 217 (346) ..... +...+...... +...+ ++.|.+|...|.+.. ..-..++|.|..|..|.+...+...... T Consensus 147 ~~~~~~~~~g~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~a~~~Lde~~VP~~~R~~vv~P~~~~~Ll~~~~~~~~~~~ 226 (347) T protein:vir:94 147 PAASNENIAGLGTASVLEVGKKADLDTPAKLGEAIIGQLTIARAKLTSNYVPAGDRYFYTTPDNYSAILAALMPNAANYA 226 (347) T ss_pred ccccccccCCCcccceeeccccccccchhhhHHHHHHHHHHHHHHHhhcCCCCCCcEEEeCHHHHHHHhccchhhhhhcc Confidence 22111 11111000000 01111 355667777776544 2347899999999999876544322211 Q ss_pred -----cCceeeEEeceEEEEeCCCccCCC-------------------------ce-------EEEEEcCCeeEEeecCC Q lcl|NC_015254. 218 -----DNKKFPTYMGKRVIVDDGLPAKDG-------------------------VY-------TSYIFGEGAFGLGNGEA 260 (346) Q Consensus 218 -----~~~~i~~~~G~~VVvdD~~p~~~g-------------------------~y-------tt~l~~~GAi~~~~~~~ 260 (346) ..|.|+.++|++|+.|+.+|.... +| ...+|-+-|++..... T Consensus 227 ~~~~~~~G~Vg~i~G~~V~~Sn~lp~~~~t~~~~~~~~~~~aG~~~~~~~~~~~~~~~~~~~~~~l~~h~~A~~~v~~~- 305 (347) T protein:vir:94 227 ALIDPETGNIRNVMGFVVVEVPHLVQGGAGETRGDDGITIASGQKHAFPATASSDVKVTMDNVVGLFSHRSAVGTVKLR- 305 (347) T ss_pred ccccccccceEEEeceEEEecCcccccccccccccCcceecCcccccccccchhhhcccccceeEEEeehhhhhhhhcc- Confidence 246789999999999999995311 01 1233455566654433 Q ss_pred ccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHH Q lcl|NC_015254. 
261 PVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTE 306 (346) Q Consensus 261 ~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~ 306 (346) ++.+|..|++....+.+...+.|+..+. .|. ..+.=+-+ -|| T Consensus 306 ~~~~e~~r~~~~~~d~i~~~~~~G~~~~--rP~-~a~~~~~~-~A~ 347 (347) T protein:vir:94 306 DLALERDRDVDAQGDLIVGKYAMGHGGL--RPE-AAGALVFS-PAE 347 (347) T ss_pred cccccchhchhhHHHHhhhhhhhcCccc--ccc-eeEEEEec-CCC Confidence 3468889999887777776666655442 110 00000001 111 No 55 >protein:vir:4339 Length: 395 # NCBI annotation: major head protein # Family: family:all:585 # MgeID: mge:93 # MgeName: D3 # Cross-refs: genbank:acc:NP_061502;genbank:gi:9635591;genbank:GeneID:1262860 Probab=99.26 E-value=1.7e-12 Score=85.14 Aligned_cols=276 Identities=9% Similarity=0.037 Sum_probs=150.6 Q ss_pred CccceecceeeecCCceeee-eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRI-ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDE 79 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l-~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae 79 (346) |.+...+.+..-+...++.- .-++.|+ +..-+.+...+.+.|.+- -.....+|..+++|.....++.+. T Consensus 101 ~~~~~~~~~~~~~~~~~~~~~g~~vp~~-~~~~ii~~~~~~~~l~~l---------~~~~~~~~~~~~~~~~~~~~~~a~ 170 (395) T protein:vir:43 101 LRGSHRVSMPRSAITSIDGSGGALVAPD-RRPGVVAAPQRRLTIRDL---------VAPGTTESNSVEYVRETGFVNNAA 170 (395) T ss_pred hhhhhhhhhhhhhhcccCCCCccccchh-hHHHHHHHHHhhhhHHhh---------ccceecCCCceEEEEEecCCCcee Confidence 21211222211111111111 2245555 444455555555544321 111123566788898866555666 Q ss_pred ccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH------HHhhhhh Q lcl|NC_015254. 80 ILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS------LNGITAS 153 (346) Q Consensus 80 ~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~------L~G~~~~ 153 (346) .+.|+. .++..+.+..+-....++.+....+++....-+ .+....+.+++++.+.+..+..+|.- ++|++.. 
T Consensus 171 ~v~E~~-~~~~~~~~~~~i~~~~~k~~~~~~is~ell~d~-~~l~~~v~~~la~a~~~~~d~~~l~G~g~~~~~~Gi~~~ 248 (395) T protein:vir:43 171 PVSEGT-QKPYSDLTFELENAPVRTIAHLFKASRQILDDA-SALQSYIDARARYGLMLVEECQLLYGNGTGANLHGIIPQ 248 (395) T ss_pred eecCCc-cccccccceeEEEEeeeeEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHHhccCCCCcccccccc Confidence 677875 566666766666666666666677777654333 35667789999999999999977641 1122221 Q ss_pred hhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc-cccCceeeEEeceEE Q lcl|NC_015254. 154 GALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML-DSDNKKFPTYMGKRV 230 (346) Q Consensus 154 ~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~-~s~~~~i~~~~G~~V 230 (346) ..... ............++.+.++...+......-.+|+||+.++..|++..-- .++. ...++.-++++|+|| T Consensus 249 ~~~~~----~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~i~~~~~~~~~~~l~G~pV 324 (395) T protein:vir:43 249 AQAYA----PPSGVVVTAEQRIDRIRLAILQAQLAEFPASGIVLNPIDWALIELNKDAENRYIIGSPQNGTTPTLWRLPV 324 (395) T ss_pred ccccc----cccccccccchhHHHHHHHHHhhccccCCCcEEEEcHHHHHHHHHhhccCCceeccccccCCCceecceee Confidence 11111 1111122233567889999988877777778999999999998765311 1221 112344568999999 Q ss_pred EEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecC----CcceeEEEEeeEEeeeee---eeeeccccccCCC Q lcl|NC_015254. 231 IVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLHPR---GIAWQEKSVAGHS 301 (346) Q Consensus 231 VvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~~~---G~s~~~~~~~~~s 301 (346) ++++.||.+. ++|+. .++.+.. +..+.+|.++.. ..+...+....++.++++ .|..-.-+ T Consensus 325 v~~~~~~~~~-----~~~gd~~~~~~~~~-~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~~~~t----- 393 (395) T protein:vir:43 325 VETQAITQDE-----FLTGAFSLGAQIFD-RMDIEVLVSTENDKDFENNMVTIRAEERLAFAVYRPEAFVTGSLT----- 393 (395) T ss_pred EEcCCCCCCc-----EEEEeccceEEEEE-ecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEec----- Confidence 9999999764 33332 2222222 223456655543 244555666666665553 33332111 Q ss_pred CC Q lcl|NC_015254. 
302 PT 303 (346) Q Consensus 302 Pt 303 (346) ++ T Consensus 394 aa 395 (395) T protein:vir:43 394 AS 395 (395) T ss_pred cC Confidence 11 No 56 >protein:vir:3364 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:67 # MgeName: T3 # Cross-refs: genbank:acc:NP_523335;genbank:gi:17570826;genbank:GeneID:927448 Probab=99.25 E-value=1.1e-12 Score=86.11 Aligned_cols=286 Identities=14% Similarity=0.097 Sum_probs=168.4 Q ss_pred Cc---cceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCc Q lcl|NC_015254. 1 MI---KKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGE 77 (346) Q Consensus 1 ~~---~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ 77 (346) |- .-=++|.|.=-++...---.+++ |+|...|...+.+.+.|...--. + . -.+|+++.+|..+... T Consensus 1 ~~~~~~~~~~~t~~g~~~~~~~~~al~i-e~~~g~V~~~f~~~s~~~~~v~~-r--~-----~~~G~sv~i~~iG~~t-- 69 (347) T protein:vir:33 1 MANIQGGQQIGTNQGKGQSAADKLALFL-KVFGGEVLTAFARTSVTMPRHML-R--S-----IASGKSAQFPVIGRTK-- 69 (347) T ss_pred CCCCccCcccccccccCCcccchHHHHH-HHHHHHHHHHHHHHHhhhhhhcc-c--c-----ccccceeEeeecccee-- Confidence 21 11134433332343333334777 99999999888888766332111 1 1 1369999999988763 Q ss_pred ccccCCCcccc--chhhcccceeEEEEEee-cCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh- Q lcl|NC_015254. 78 DEILDDGEGAL--TPGNISAAKDIARLHMR-GKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITAS- 153 (346) Q Consensus 78 ae~~~dg~~~i--t~~~lt~~~~~a~~~~~-~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~- 153 (346) .+....|. .+ ++..+...+..-++=.. -..+.+.|+....+..|++.+++++.+.++++++|..++..|..+... 
T Consensus 70 ~~~~~~g~-~l~~~~~~~~~~e~~ltiD~~~y~~~~VddiD~~q~~~D~~~~~~~~~g~aLA~~~D~~i~~~l~~~~~~~ 148 (347) T protein:vir:33 70 AAYLKPGE-NLDDKRKDIKHTEKVIHIDGLLTADVLIYDIEDAMNHYDVRAEYTAQLGESLAMAADGAVLAELAGLVNLP 148 (347) T ss_pred eeeecCCC-CCCCCCCCCccceEEEEechhhhhhHHHhhHHHHhcCCchhHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh Confidence 34444453 44 33445655554443332 335789999999999999999999999999999999998766432211 Q ss_pred -hhhh------hcc-eee-ecccc--ccc----cccHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhhh-hcc Q lcl|NC_015254. 154 -GALD------SNK-LDV-STETG--DDS----YFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLIE-FML 215 (346) Q Consensus 154 -~~~~------~~~-~di-s~~~~--~~~----~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li~-~~~ 215 (346) ..+. ..+ ..+ ...++ .++ .--++.|.+|..+|.++. ..-+.++|.|..|..|++..-+. .-. T Consensus 149 ~~~~~~~~~~~~~~~~~~~~~~tg~~~d~~~~a~~i~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~~d~ 228 (347) T protein:vir:33 149 DGSNENIEGLGKPTVLTLVKPTTGSLTDPVELGKAIIAQLTIARASLTKNYVPAADRTFYTTPDNYSAILAALMPNAANY 228 (347) T ss_pred cccccccccccccccccccccccccccchhhhHHHHHHHHHHHHHHHhhcCCCccCcEEEeCHHHHHHHhcccccccccc Confidence 1100 000 000 00011 111 112567778888887654 24478999999999999764332 211 Q ss_pred cc----cCceeeEEeceEEEEeCCCccCCCce----------------------------EEEEEcCCeeEEeecCCccc Q lcl|NC_015254. 216 DS----DNKKFPTYMGKRVIVDDGLPAKDGVY----------------------------TSYIFGEGAFGLGNGEAPVP 263 (346) Q Consensus 216 ~s----~~~~i~~~~G~~VVvdD~~p~~~g~y----------------------------tt~l~~~GAi~~~~~~~~~~ 263 (346) .+ ..|.|+.++|++|+.|+.+|...+.. -.+++-+.|++...... +. T Consensus 229 ~~~~~~~~G~V~~i~G~~V~~Sn~lp~~~~~~~~~~~~ag~~~~~~~~~~~~~~~a~~~~~gl~~h~~A~g~v~~~~-~~ 307 (347) T protein:vir:33 229 QALLDPERGTIRNVMGFEVVEVPHLTAGGAGDTREDAPADQKHAFPATSSTTVKVALDNVVGLFQHRSAVGTVKLKD-LA 307 (347) T ss_pred ccccccccceeEEEeceeEEEecccccCccccccccccccccccccCCcccceeccccceeeeeecchhheeeeeec-ee Confidence 11 13678999999999999998643210 12455666776655443 46 Q ss_pred eeeeecCCcceeEEEEeeEEeeeee---e-eeeccccccCCCCChHH Q lcl|NC_015254. 
264 TETDREKLKGNDILINRQHFLLHPR---G-IAWQEKSVAGHSPTNTE 306 (346) Q Consensus 264 vE~dRd~~~g~~~l~~r~~~~~~~~---G-~s~~~~~~~~~sPt~a~ 306 (346) +|..|++..-.|.+...+.|+..+. + ..++- |--+| T Consensus 308 ~e~~r~~~~~~d~i~~~~~~G~~vlrP~~av~i~~-------~~~~~ 347 (347) T protein:vir:33 308 LERARRANYQADQIIAKYAMGHGGLRPEAAGAIVL-------PKVSE 347 (347) T ss_pred eeeccchhhhhHhhhhhhhcCCceecccceEEEec-------CCCCC Confidence 8888988766666655554443332 1 11111 11111 No 57 >protein:vir:78739 Length: 332 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:1856 # MgeName: Syn5 # Cross-refs: genbank:acc:YP_001285448;genbank:gi:148724482;genbank:GeneID:5220210 Probab=99.25 E-value=2.1e-13 Score=90.11 Aligned_cols=287 Identities=15% Similarity=0.140 Sum_probs=162.0 Q ss_pred Cccceecceeeec---CCceeeeee--ccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCC Q lcl|NC_015254. 1 MIKKLRMNLQKFA---AGKNTRIAD--VIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLT 75 (346) Q Consensus 1 ~~~~~~~~~q~~~---a~~~T~l~d--~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~ 75 (346) |-.---|-.-+.+ +....-=.| +++ |+|...|.+.+.+.+.|.. .-..... .+|+++++|..+.. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~d~~~al~l-e~~~geV~~~f~~~s~~~~------~~~~r~i--~~G~tv~i~~ig~~- 70 (332) T protein:vir:78 1 MTTLSNFSLPNQANGGARNADYDVRYATAL-KLFSGEVFTAFNNASIFKG------LVRSYDL--RGGKSKQFMFTGKL- 70 (332) T ss_pred CcccccccCCccccCCccccccccchhhhh-hhhhhhHHHHHHHHhhhhh------ccccccc--cccceEEEEeccce- Confidence 3221122111111 000011022 444 7777777777777666532 1111111 26999999998865 Q ss_pred CcccccCCCccccchh-hcccceeEEEEEe-ecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_015254. 76 GEDEILDDGEGALTPG-NISAAKDIARLHM-RGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITAS 153 (346) Q Consensus 76 g~ae~~~dg~~~it~~-~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~ 153 (346) .......|. .+.+. .+.+.+..-++=. ...++.+.|+....+..|.+.+++++.+.+.++++|..++..|...... 
T Consensus 71 -~~~~~~~g~-~l~~~~~~~~~~~~l~ID~~ky~~~~VddiD~~q~~~dl~~~~~~~~g~aLA~~~D~~i~~~l~~aa~~ 148 (332) T protein:vir:78 71 -SAGYHTPGT-PIVGDAGIKANEKTLVMDDLLVSSQFVYSLDEIFSQYSTRAEVSKQIGEALATHYDERIARVLAKASAE 148 (332) T ss_pred -eEeeecCCC-CCCCCCCCCCceEEEEEehhhhhHHHHHhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Confidence 334444454 56654 4776666544443 5556889999999999999999999999999999999999987543322 Q ss_pred hhhh-----hcceeeeccccccccccHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhh---hhhhhcc-ccc---- Q lcl|NC_015254. 154 GALD-----SNKLDVSTETGDDSYFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQ---GLIEFML-DSD---- 218 (346) Q Consensus 154 ~~~~-----~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~---~li~~~~-~s~---- 218 (346) .... +....+......++.--++.|.+|..+|.++. ..-..+++.|..|..|.+. .+++.-. .++ T Consensus 149 ~~~~~~~~g~~~~~~~~~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~d~~~~n~~~~~~~~~~~ 228 (332) T protein:vir:78 149 ASPVTGEPGGFHVNIGAGNTNDAQAIVDGFFEAAAVLDERSAPQEGRVAVLSPRQYYSLISSVDTNILNREIGNSQGDMN 228 (332) T ss_pred cCcccccccccccccCCccccCHHHHHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHHhhcCceeeeeecccccccee Confidence 1110 01111111111112223467788888886544 2336889999999999873 2333211 111 Q ss_pred Cc-eeeEEeceEEEEeCCCccCCCc--------------------eEEEEEcCCeeEEeecCCc-c-ceeeeecCCccee Q lcl|NC_015254. 219 NK-KFPTYMGKRVIVDDGLPAKDGV--------------------YTSYIFGEGAFGLGNGEAP-V-PTETDREKLKGND 275 (346) Q Consensus 219 ~~-~i~~~~G~~VVvdD~~p~~~g~--------------------ytt~l~~~GAi~~~~~~~~-~-~vE~dRd~~~g~~ 275 (346) ++ .++.++|++|+.|+.+|...|. ..++++-+-|+++....++ + ..|-+|++....+ T Consensus 229 ~g~~i~~i~G~~V~~Sn~lp~~~g~~~~~~~~~~~~n~~~~~~~~~~~~~~h~~a~~~v~~~~~~~~~t~~~~~~~~~~d 308 (332) T protein:vir:78 229 SGKGLYSIAGIRILKSNNLAGLYGQDLSSAAVTGENNDYQVDASALAGLIFHREAAGCIQSVAPTIQTTSGDFNVQYQGD 308 (332) T ss_pred cceeeeEEeeeEEEecCccccCcccccccccccccccccccccccceEEeecccceeeeeeeccchhhhhcccchhhhHh Confidence 23 4889999999999999965432 2356777788877665543 1 1344566665444 Q ss_pred EEEEeeEEeeeeeeeeeccccccCCCCChHHhcCC Q lcl|NC_015254. 
276 ILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKG 310 (346) Q Consensus 276 ~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~ 310 (346) .+... |.+|..--.+.. -.+|.++ T Consensus 309 ~i~~~-----~~~G~~v~rPe~------~v~l~~a 332 (332) T protein:vir:78 309 LIVGK-----LAMGCGSLRTSV------AGSFQAA 332 (332) T ss_pred hhhhh-----hhhcCceecccc------eEEEeeC Confidence 44433 344432211110 1111111 No 58 >protein:vir:100135 Length: 418 # NCBI annotation: gp5 # Family: family:all:585 # MgeID: mge:1639 # MgeName: phi1026b # Cross-refs: genbank:acc:NP_945035;genbank:gi:38707895;genbank:GeneID:2744182 Probab=99.25 E-value=2.4e-12 Score=84.33 Aligned_cols=276 Identities=10% Similarity=0.074 Sum_probs=154.7 Q ss_pred Cccceecceee-----ec--CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIKKLRMNLQK-----FA--AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~~~~~~~q~-----~~--a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) +-....++... .. .+.++.-...++|+.+..-+.+...+.+.+.+- + .....++..+++|.+.. T Consensus 116 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~lvp~~~~~~ii~~~~~~~~l~~~--~-------~~~~~~~~~~~~~~~~~ 186 (418) T protein:vir:10 116 ARKSVRVRVDRKSIMNVPATVGSGVSGSNSLVVADRQAGIIAPPQRKMTIRDL--L-------MPGQTSSSSIEYTVETG 186 (418) T ss_pred HhhhhhhhhHHHHHHHhhhhccCCCCCCccccchhHHHHHHHHHhhhhhHHhh--c-------ceeeccCCceeEEEEec Confidence 11111111111 11 222333345567777776666666666555431 1 11123566788898876 Q ss_pred CCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH------H Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS------L 147 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~------L 147 (346) .+..+..+.|+. .++..+.+-.+-....++.+....+++.....+ .+..+.+.+++++..++..++.+|.- . 
T Consensus 187 ~~~~a~~v~E~~-~~~~~~~~f~~v~~~~~k~~~~~~is~ell~ds-~~l~~~i~~~l~~a~~~~~d~a~l~G~g~~~~p 264 (418) T protein:vir:10 187 FTNNAAAVAEGA-QKPTSDLKFNLKNQPVRTIAHLFKASRQILDDA-PALQSYIDGRARYGLQLTEEGQILKGDGTGANI 264 (418) T ss_pred CCCceeeeccCc-cccccccceeeEEEeeeeEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHhccCCCCccc Confidence 655565667774 566666665555555566665666776654444 47777899999999999999977641 2 Q ss_pred HhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhccc-ccCceeeE Q lcl|NC_015254. 148 NGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLD-SDNKKFPT 224 (346) Q Consensus 148 ~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~-s~~~~i~~ 224 (346) .|++....... .........+++.+.++...+.+......+|+||+.++..|++..-- .++.. ..++.-++ T Consensus 265 ~Gi~~~~~~~~------~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~lkd~~G~~i~~~~~~~~~~~ 338 (418) T protein:vir:10 265 LGILPQASAFM------PSITLANATPIDKIRLALLQAVLAEFPATGIVLNPIDWASIELTKDSQGRYIVGNPVNGTTPR 338 (418) T ss_pred ccccccccccc------ccccccccccHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHhhcCCCceeccccccCCCce Confidence 22222211111 11122233567889999988877777778999999999998864311 12221 12344578 Q ss_pred EeceEEEEeCCCccCCCceEEEEEcCC--eeEEeecCCccceeeeecCC----cceeEEEEeeEEee---eeeeeeeccc Q lcl|NC_015254. 225 YMGKRVIVDDGLPAKDGVYTSYIFGEG--AFGLGNGEAPVPTETDREKL----KGNDILINRQHFLL---HPRGIAWQEK 295 (346) Q Consensus 225 ~~G~~VVvdD~~p~~~g~ytt~l~~~G--Ai~~~~~~~~~~vE~dRd~~----~g~~~l~~r~~~~~---~~~G~s~~~~ 295 (346) ++|+||++++.||.+. ++|+.- ++.+.... .+.++.++... .+...+....++.+ ||.+|.+-.. T Consensus 339 l~G~pV~~~~~~p~~~-----~~~gd~s~~~~~~~~~-~~~i~~~~~~~~~f~~~~~~~r~~~~~d~~~~~~~a~~~~~~ 412 (418) T protein:vir:10 339 LWNLPVVETQAMTANE-----FLVGAFSMAAQIFDRM-EIEVLLSTENVDDFEKNMVSIRAEERLALAVYRPESFVTGAL 412 (418) T ss_pred ecceeeEEcCCCCCCc-----EEEeeccceEEEEEec-ceEEEEecccchhhhcCceEEEEEEeeccEEecccceEEEEe Confidence 9999999999999763 444432 23332222 34455554332 34444545444443 3445554322 Q ss_pred c--ccC Q lcl|NC_015254. 
296 S--VAG 299 (346) Q Consensus 296 ~--~~~ 299 (346) + ++| T Consensus 413 ~~~~~g 418 (418) T protein:vir:10 413 VEQAGG 418 (418) T ss_pred ccCCCC Confidence 2 222 No 59 >protein:vir:97331 Length: 319 # NCBI annotation: ORF011 # Family: family:all:701 # MgeID: mge:1666 # MgeName: 52A # Cross-refs: genbank:acc:YP_240611;genbank:gi:66396278;genbank:GeneID:5133687 Probab=99.25 E-value=2.9e-12 Score=83.82 Aligned_cols=289 Identities=12% Similarity=-0.019 Sum_probs=155.5 Q ss_pred Cccc-------eecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIKK-------LRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~~-------~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) |||. ||+|||.|+.. .--.-.+.=-|-|.+.+.+..... .+.... +++ ..+ -..+|++|+||.+.. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~-~~~~nt~~l~~k~~~~LD~~~~~~-~~s~~~-~~N-~~~---e~~gg~tVkIp~i~~ 73 (319) T protein:vir:97 1 MNKTIKNATGMLKLNLQHFANK-SVEPGQTLLKNKHVGILERVTAVN-AYSTPA-LIS-NDA---IFMEGRSFTVMKGDT 73 (319) T ss_pred CCcccccccceeEeehhhhhcc-CCCcchHHHHHHHHHHHHHHHHHh-hhhhhc-ccC-cce---EeccCcEEEEeeecc Confidence 9984 68999999543 222222222234555555433222 221111 122 111 234799999999986 Q ss_pred CCCcccccCCCccccchhhcccceeEEEE-EeecCcceechHHHhhhcc--hHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARL-HMRGKAWRTNDLAKALSGD--DPMRAIGDLVVEYWNRRRQAVLIASLNGI 150 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~-~~~~k~~~~tD~a~~~~g~--dp~~~i~~q~a~~~~~~~~~~lla~L~G~ 150 (346) . |- .+.+. .+..+.+.++.....-++ +.|+.+|.+.|+...-+.. 
......+++........+|...++.|.+- T Consensus 74 ~-gl-~DY~R-~~g~~~g~vt~~~~t~tidqdR~~~F~VD~~D~~Etn~~l~a~~i~~~~~~~~v~PEiDay~~skla~~ 150 (319) T protein:vir:97 74 T-EL-KDYKR-NATNEFDHPKIEETTYFLDQEKYWGRFVDALDRKDTEGNIDINYVVARQGAEVVAPYLDNLRFATLARN 150 (319) T ss_pred c-cc-ccccC-CCCcccCCcccceeEEEeecccccccccchhhHhhhhchhhHHHHHHHHHHHHhhhhhhHHHHHHHHhh Confidence 3 43 23232 235677777766665443 4577788888888766544 34445566666677777888888776432 Q ss_pred hhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccc-cCceEEEEchHHHHHHHhhhhhhhccccc-----CceeeE Q lcl|NC_015254. 151 TASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAE-GKLTGIAMHSQTEMNLRKQGLIEFMLDSD-----NKKFPT 224 (346) Q Consensus 151 ~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~~~~~L~~~~li~~~~~s~-----~~~i~~ 224 (346) .+. +.. . +.++.=.++.|.++..+|-+.. ..-..++|.|.+|.-|++...+....... .+.|+. T Consensus 151 a~~------~~~--~--~~t~~n~y~~i~~a~~~Lde~~VP~~Rvl~Vtp~~~~~L~~~~~f~~~~~~~~~~~~~g~Vg~ 220 (319) T protein:vir:97 151 KAK------HLT--V--GTGSDAQYDAVLDVSVELDEIKAPENRVLFVSPTFYKGIKKFVIALPQGDTRQQVLGKGVQGE 220 (319) T ss_pred ccc------ccc--c--ccCHHHHHHHHHHHHHHHHhcCCCCCcEEEeCHHHHHHHHhhhhhhccccccccceeeeecee Confidence 111 110 0 1111124788999999996643 23478999999999998875443222111 356899 Q ss_pred EeceEEEEeCCCccCCCceEEEEEc-CCeeEEeecCCccceeeeec-CCcceeEEEEeeEEeeeee-----eeeecccc- Q lcl|NC_015254. 225 YMGKRVIVDDGLPAKDGVYTSYIFG-EGAFGLGNGEAPVPTETDRE-KLKGNDILINRQHFLLHPR-----GIAWQEKS- 296 (346) Q Consensus 225 ~~G~~VVvdD~~p~~~g~ytt~l~~-~GAi~~~~~~~~~~vE~dRd-~~~g~~~l~~r~~~~~~~~-----G~s~~~~~- 296 (346) +.|++|+.. |...++-..|+++ ++|+.....-. .+|..|. +....+.+..|..|.+-+. |+ |.... 
T Consensus 221 idG~~Vi~v---ps~~~k~in~i~~h~~A~~~~~k~~--~~~~~~p~~~~~a~~v~gr~y~d~~V~~~k~~~I-y~~~~~ 294 (319) T protein:vir:97 221 LDGFVIVKV---PTKLLQGLQAIAVVGEVLASPIQAD--LAKTNSNIPGMFGTLAEQLLYTGAFVPEHLQKYI-FTIGGT 294 (319) T ss_pred ecCeEEEEe---cccccccceEEEEcCCeeeeeeeee--eeeccCCCccccceeeeeeeeeeeEEeccccceE-EEeecC Confidence 999999974 3333333456655 66766543222 3454443 4443466666666655543 22 22111 Q ss_pred ------ccCCCCChHHhcCCcCcee Q lcl|NC_015254. 297 ------VAGHSPTNTEIEKGNNWKA 315 (346) Q Consensus 297 ------~~~~sPt~a~L~~~~NW~~ 315 (346) .+++.|++.+-.-....+. T Consensus 295 ~~~~~~~~~~~~~~~~~~~~~~~~~ 319 (319) T protein:vir:97 295 EVATKRDGVDAHADNVAKPSGSLEM 319 (319) T ss_pred CcccCCCccccccccccCCcccccC Confidence 1122233222211111111 No 60 >protein:vir:94800 Length: 319 # NCBI annotation: ORF012 # Family: family:all:701 # MgeID: mge:1531 # MgeName: 29 # Cross-refs: genbank:acc:YP_240536;genbank:gi:66396203;genbank:GeneID:5133580 Probab=99.25 E-value=2.9e-12 Score=83.82 Aligned_cols=289 Identities=12% Similarity=-0.019 Sum_probs=155.5 Q ss_pred Cccc-------eecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIKK-------LRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~~-------~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) |||. ||+|||.|+.. .--.-.+.=-|-|.+.+.+..... .+.... +++ ..+ -..+|++|+||.+.. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~-~~~~nt~~l~~k~~~~LD~~~~~~-~~s~~~-~~N-~~~---e~~gg~tVkIp~i~~ 73 (319) T protein:vir:94 1 MNKTIKNATGMLKLNLQHFANK-SVEPGQTLLKNKHVGILERVTAVN-AYSTPA-LIS-NDA---IFMEGRSFTVMKGDT 73 (319) T ss_pred CCcccccccceeEeehhhhhcc-CCCcchHHHHHHHHHHHHHHHHHh-hhhhhc-ccC-cce---EeccCcEEEEeeecc Confidence 9984 68999999543 222222222234555555433222 221111 122 111 234799999999986 Q ss_pred CCCcccccCCCccccchhhcccceeEEEE-EeecCcceechHHHhhhcc--hHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_015254. 
74 LTGEDEILDDGEGALTPGNISAAKDIARL-HMRGKAWRTNDLAKALSGD--DPMRAIGDLVVEYWNRRRQAVLIASLNGI 150 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~-~~~~k~~~~tD~a~~~~g~--dp~~~i~~q~a~~~~~~~~~~lla~L~G~ 150 (346) . |- .+.+. .+..+.+.++.....-++ +.|+.+|.+.|+...-+.. ......+++........+|...++.|.+- T Consensus 74 ~-gl-~DY~R-~~g~~~g~vt~~~~t~tidqdR~~~F~VD~~D~~Etn~~l~a~~i~~~~~~~~v~PEiDay~~skla~~ 150 (319) T protein:vir:94 74 T-EL-KDYKR-NATNEFDHPKIEETTYFLDQEKYWGRFVDALDRKDTEGNIDINYVVARQGAEVVAPYLDNLRFATLARN 150 (319) T ss_pred c-cc-ccccC-CCCcccCCcccceeEEEeecccccccccchhhHhhhhchhhHHHHHHHHHHHHhhhhhhHHHHHHHHhh Confidence 3 43 23232 235677777766665443 4577788888888766544 34445566666677777888888776432 Q ss_pred hhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccc-cCceEEEEchHHHHHHHhhhhhhhccccc-----CceeeE Q lcl|NC_015254. 151 TASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAE-GKLTGIAMHSQTEMNLRKQGLIEFMLDSD-----NKKFPT 224 (346) Q Consensus 151 ~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~~~~~L~~~~li~~~~~s~-----~~~i~~ 224 (346) .+. +.. . +.++.=.++.|.++..+|-+.. ..-..++|.|.+|.-|++...+....... .+.|+. T Consensus 151 a~~------~~~--~--~~t~~n~y~~i~~a~~~Lde~~VP~~Rvl~Vtp~~~~~L~~~~~f~~~~~~~~~~~~~g~Vg~ 220 (319) T protein:vir:94 151 KAK------HLT--V--GTGSDAQYDAVLDVSVELDEIKAPENRVLFVSPTFYKGIKKFVIALPQGDTRQQVLGKGVQGE 220 (319) T ss_pred ccc------ccc--c--ccCHHHHHHHHHHHHHHHHhcCCCCCcEEEeCHHHHHHHHhhhhhhccccccccceeeeecee Confidence 111 110 0 1111124788999999996643 23478999999999998875443222111 356899 Q ss_pred EeceEEEEeCCCccCCCceEEEEEc-CCeeEEeecCCccceeeeec-CCcceeEEEEeeEEeeeee-----eeeecccc- Q lcl|NC_015254. 225 YMGKRVIVDDGLPAKDGVYTSYIFG-EGAFGLGNGEAPVPTETDRE-KLKGNDILINRQHFLLHPR-----GIAWQEKS- 296 (346) Q Consensus 225 ~~G~~VVvdD~~p~~~g~ytt~l~~-~GAi~~~~~~~~~~vE~dRd-~~~g~~~l~~r~~~~~~~~-----G~s~~~~~- 296 (346) +.|++|+.. |...++-..|+++ ++|+.....-. .+|..|. +....+.+..|..|.+-+. |+ |.... 
T Consensus 221 idG~~Vi~v---ps~~~k~in~i~~h~~A~~~~~k~~--~~~~~~p~~~~~a~~v~gr~y~d~~V~~~k~~~I-y~~~~~ 294 (319) T protein:vir:94 221 LDGFVIVKV---PTKLLQGLQAIAVVGEVLASPIQAD--LAKTNSNIPGMFGTLAEQLLYTGAFVPEHLQKYI-FTIGGT 294 (319) T ss_pred ecCeEEEEe---cccccccceEEEEcCCeeeeeeeee--eeeccCCCccccceeeeeeeeeeeEEeccccceE-EEeecC Confidence 999999974 3333333456655 66766543222 3454443 4443466666666655543 22 22111 Q ss_pred ------ccCCCCChHHhcCCcCcee Q lcl|NC_015254. 297 ------VAGHSPTNTEIEKGNNWKA 315 (346) Q Consensus 297 ------~~~~sPt~a~L~~~~NW~~ 315 (346) .+++.|++.+-.-....+. T Consensus 295 ~~~~~~~~~~~~~~~~~~~~~~~~~ 319 (319) T protein:vir:94 295 EVATKRDGVDAHADNVAKPSGSLEM 319 (319) T ss_pred CcccCCCccccccccccCCcccccC Confidence 1122233222211111111 No 61 >protein:vir:4600 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:101 # MgeName: PVL # Cross-refs: genbank:acc:NP_058445;genbank:gi:9635171;genbank:GeneID:1262708 Probab=99.25 E-value=2.8e-12 Score=83.94 Aligned_cols=298 Identities=10% Similarity=0.018 Sum_probs=154.7 Q ss_pred Cccceecceeee--------cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEec--c Q lcl|NC_015254. 1 MIKKLRMNLQKF--------AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMP--F 70 (346) Q Consensus 1 ~~~~~~~~~q~~--------~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P--~ 70 (346) .....+-+++.+ ++..+|.-.-..+|+.+.+.+.+...+.+.+.+..-+ ...++...++| . T Consensus 101 ~~~~~~~~~~~~~~~~~~~~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~---------~~~~~~~~~~~~~~ 171 (415) T protein:vir:46 101 VTSQEVRDFTEYLETRNDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTV---------KRVTNGSGKYPVVR 171 (415) T ss_pred hhHHHHHHHHHHHhhhhhhhhccccccCCcccccHHHHHHHHHHHHhhhhhhhhcce---------eeccCCceeEEEEE Confidence 011111111111 1222344456789999988888877776666442111 11123333444 4 Q ss_pred cccCCCcccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015254. 
71 WQDLTGEDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNG 149 (346) Q Consensus 71 ~~~l~g~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G 149 (346) +... ..+..+.|+. .++. +..+-.+-....+..+....+++....-+.-|....+.++++..+.+..++.+|..+. T Consensus 172 ~~~~-~~~~~v~Eg~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~il~g~g- 248 (415) T protein:vir:46 172 QSEV-AALEKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVIT- 248 (415) T ss_pred ecCC-cceeeccccc-ccccccccceeeEEeeeeeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc- Confidence 4433 2334456664 3432 2333344444455566666777765554555778889999999999999997765431 Q ss_pred hhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEE Q lcl|NC_015254. 150 ITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTY 225 (346) Q Consensus 150 ~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~ 225 (346) .-.....................++++.|.++...+.+....-.+|+||+..+..|++..-- .++. ...++.-+++ T Consensus 249 ~g~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~lkd~~G~~i~~~~~~~~~~~~l 328 (415) T protein:vir:46 249 KGSTGSTSSGFEKEGKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKMKDKLGNYLIQPDVKEKTQQRL 328 (415) T ss_pred cCCccccccccccccceeccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHhhccCCCeeeccCcCCCCCccc Confidence 10000011111111222234455788999999999888777778999999999999864211 1222 2224455789 Q ss_pred eceEEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCC Q lcl|NC_015254. 226 MGKRVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPT 303 (346) Q Consensus 226 ~G~~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt 303 (346) +|+||++++.+|.....-..++|+. -++.+...+ .+.++..+... ..+.+....++.+. 
T Consensus 329 ~G~pV~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~-~~~v~~~~~~~-~~~~~~~~~r~d~~----------------- 389 (415) T protein:vir:46 329 LGAKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRS-QYQASWTDYMH-FGECLMIAVRQDCR----------------- 389 (415) T ss_pred cceeeEEeccccccCCCccEEEEEehhccEEEEeec-ceEEEeecccc-CceEEEEEEEeccE----------------- Confidence 9999999999997543223456652 233333222 33444433221 12223222222211 Q ss_pred hHHhcCCcCceeeecccccceEEEEEecccccccCCCC Q lcl|NC_015254. 304 NTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETA 341 (346) Q Consensus 304 ~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~ 341 (346) +++++.+-.+.+.+..+-+|--+-.+ T Consensus 390 ------------v~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:46 390 ------------ILDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred ------------EeccccEEEEEeeccCCCCCCccCCC Confidence 22334443333433333333322222 No 62 >protein:vir:4700 Length: 415 # NCBI annotation: phi PVL ORF 7 homologue # Family: family:all:21 # MgeID: mge:102 # MgeName: phiPV83 # Cross-refs: genbank:acc:NP_061632;genbank:gi:9635719;genbank:GeneID:1262976 Probab=99.25 E-value=2.8e-12 Score=83.94 Aligned_cols=298 Identities=10% Similarity=0.018 Sum_probs=154.7 Q ss_pred Cccceecceeee--------cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEec--c Q lcl|NC_015254. 1 MIKKLRMNLQKF--------AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMP--F 70 (346) Q Consensus 1 ~~~~~~~~~q~~--------~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P--~ 70 (346) .....+-+++.+ ++..+|.-.-..+|+.+.+.+.+...+.+.+.+..-+ ...++...++| . 
T Consensus 101 ~~~~~~~~~~~~~~~~~~~~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~---------~~~~~~~~~~~~~~ 171 (415) T protein:vir:47 101 VTSQEVRDFTEYLETRNDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTV---------KRVTNGSGKYPVVR 171 (415) T ss_pred hhHHHHHHHHHHHhhhhhhhhccccccCCcccccHHHHHHHHHHHHhhhhhhhhcce---------eeccCCceeEEEEE Confidence 011111111111 1222344456789999988888877776666442111 11123333444 4 Q ss_pred cccCCCcccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015254. 71 WQDLTGEDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNG 149 (346) Q Consensus 71 ~~~l~g~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G 149 (346) +... ..+..+.|+. .++. +..+-.+-....+..+....+++....-+.-|....+.++++..+.+..++.+|..+. T Consensus 172 ~~~~-~~~~~v~Eg~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~il~g~g- 248 (415) T protein:vir:47 172 QSEV-AALEKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVIT- 248 (415) T ss_pred ecCC-cceeeccccc-ccccccccceeeEEeeeeeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc- Confidence 4433 2334456664 3432 2333344444455566666777765554555778889999999999999997765431 Q ss_pred hhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEE Q lcl|NC_015254. 150 ITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTY 225 (346) Q Consensus 150 ~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~ 225 (346) .-.....................++++.|.++...+.+....-.+|+||+..+..|++..-- .++. 
...++.-+++ T Consensus 249 ~g~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~lkd~~G~~i~~~~~~~~~~~~l 328 (415) T protein:vir:47 249 KGSTGSTSSGFEKEGKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKMKDKLGNYLIQPDVKEKTQQRL 328 (415) T ss_pred cCCccccccccccccceeccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHhhccCCCeeeccCcCCCCCccc Confidence 10000011111111222234455788999999999888777778999999999999864211 1222 2224455789 Q ss_pred eceEEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCC Q lcl|NC_015254. 226 MGKRVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPT 303 (346) Q Consensus 226 ~G~~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt 303 (346) +|+||++++.+|.....-..++|+. -++.+...+ .+.++..+... ..+.+....++.+. T Consensus 329 ~G~pV~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~-~~~v~~~~~~~-~~~~~~~~~r~d~~----------------- 389 (415) T protein:vir:47 329 LGAKIEILPDEVLGQKGNNTLIIGNLKDAIVLFDRS-QYQASWTDYMH-FGECLMIAVRQDCR----------------- 389 (415) T ss_pred cceeeEEeccccccCCCccEEEEEehhccEEEEeec-ceEEEeecccc-CceEEEEEEEeccE----------------- Confidence 9999999999997543223456652 233333222 33444433221 12223222222211 Q ss_pred hHHhcCCcCceeeecccccceEEEEEecccccccCCCC Q lcl|NC_015254. 
304 NTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETA 341 (346) Q Consensus 304 ~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~ 341 (346) +++++.+-.+.+.+..+-+|--+-.+ T Consensus 390 ------------v~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:47 390 ------------ILDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred ------------EeccccEEEEEeeccCCCCCCccCCC Confidence 22334443333433333333322222 No 63 >protein:vir:9410 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:167 # MgeName: phi 13 # Cross-refs: genbank:acc:NP_803388;genbank:gi:29028700;genbank:GeneID:1258136 Probab=99.24 E-value=3.1e-12 Score=83.68 Aligned_cols=300 Identities=10% Similarity=0.001 Sum_probs=154.3 Q ss_pred Cccce------------ecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEe Q lcl|NC_015254. 1 MIKKL------------RMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINM 68 (346) Q Consensus 1 ~~~~~------------~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~ 68 (346) +.+++ +..-..-++...|.-.-..+|+.+..-+.+...+.+.+.+-.-+.+ ..++...+.+ T Consensus 97 ~~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~~~-------~~~~~~~~~~ 169 (415) T protein:vir:94 97 QNTKVTSQEVRDFTEYLETRNDIQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKR-------VTNGSGKYPV 169 (415) T ss_pred hhhhhhHHHHHHHHHHhhhhhhhhhhccccccccccCcHHHHHHHHHHHHhhhhhhhhcceee-------ccCCceeEEE Confidence 11110 0000011122223334567898887777777666665533111101 1223335555 Q ss_pred cccccCCCcccccCCCccccchh-hcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 69 PFWQDLTGEDEILDDGEGALTPG-NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL 147 (346) Q Consensus 69 P~~~~l~g~ae~~~dg~~~it~~-~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L 147 (346) |.+... +....+.|+. .++.. 
..+-.+-....++.+.-+.+++....-+.-|..+.+.+++++.+.+..++.+|..+ T Consensus 170 ~~~~~~-~~~~~v~Eg~-~~~~~~~~~~~~i~~~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~~~~~~~~~~il~g~ 247 (415) T protein:vir:94 170 VRQSEV-AALEKVEELE-ENPELAVKPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVI 247 (415) T ss_pred EeecCC-ccceeccccc-cccccccccceeeEeeheeeeeechhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhcc Confidence 655543 3444566764 44432 33344444455566666777777655555677888999999999999998777543 Q ss_pred HhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceee Q lcl|NC_015254. 148 NGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFP 223 (346) Q Consensus 148 ~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~ 223 (346) . .-......................+++.|.++...+.+....-.+|+|||..+..|++..-- .++. ...++..+ T Consensus 248 g-~g~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~l~~~~~~~~~~~ 326 (415) T protein:vir:94 248 T-KGSTGSTSSGFEKEGKKLEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKMKDKLGNYLIQPDVKEKTQQ 326 (415) T ss_pred c-cCccccccccccccccccccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHhhccCCCeeeccCcCCCCCc Confidence 1 10000000011111112223345788999999999888777778999999999999875211 1222 12234557 Q ss_pred EEeceEEEEeCCCccCCCceEEEEEc--CCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCC Q lcl|NC_015254. 224 TYMGKRVIVDDGLPAKDGVYTSYIFG--EGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHS 301 (346) Q Consensus 224 ~~~G~~VVvdD~~p~~~g~ytt~l~~--~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~s 301 (346) +++|+||++++.+|.....-..++|+ ..++.+.. +..+.+++.+.. 
...+.+....++-+.| T Consensus 327 ~l~G~pV~~~~~~~~~~~~~~~i~~gd~~~~~~~~~-~~~~~v~~~~~~-~~~~~~r~~~r~d~~~-------------- 390 (415) T protein:vir:94 327 RLLGAKIEILPDEVLGQKGNNTLIIGNLKDAIVLFD-RSQYQASWTDYM-HFGECLMIAVRQDCRI-------------- 390 (415) T ss_pred eecceeeEEecccccCCCCccEEEEEehhccEEEEe-ecceEEEEeccc-cCceEEEEEEEeccEE-------------- Confidence 89999999999998754322345555 23333322 223445544432 1223333222222221 Q ss_pred CChHHhcCCcCceeeecccccceEEEEEecccccccCCCC Q lcl|NC_015254. 302 PTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETA 341 (346) Q Consensus 302 Pt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~ 341 (346) .+++++-.+.+.+....+|--+-.+ T Consensus 391 ---------------~~~~a~~~~~~~~~~~~~~~~~~~~ 415 (415) T protein:vir:94 391 ---------------LDYKSAIVIEYDDSERGEGDLGLEA 415 (415) T ss_pred ---------------eccccEEEEEEeccCCCCCccccCC Confidence 1233333333333333222222222 No 64 >protein:vir:1886 Length: 385 # NCBI annotation: major capsid subunit precursor # Family: family:all:585 # MgeID: mge:41 # MgeName: HK022 # Cross-refs: genbank:acc:NP_037666;genbank:gi:9634124;genbank:GeneID:1262513 Probab=99.24 E-value=2.5e-12 Score=84.15 Aligned_cols=275 Identities=13% Similarity=0.106 Sum_probs=152.5 Q ss_pred CccceecceeeecC--Cceeee-eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCc Q lcl|NC_015254. 1 MIKKLRMNLQKFAA--GKNTRI-ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGE 77 (346) Q Consensus 1 ~~~~~~~~~q~~~a--~~~T~l-~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ 77 (346) +.....+..-.+.+ ...+.- ..++.|++. ..+.+...+.+.+.+-. .....++..+++|.+...++. 
T Consensus 90 ~~~~~~~~~~~~~~~~~~~~~~~g~~i~~~~~-~~ii~~~~~~~~l~~~~---------~~~~~~~~~~~~~~~~~~~~~ 159 (385) T protein:vir:18 90 DGKQGTFGAKTFNKSLGSDADSAGSLIQPMQI-PGIIMPGLRRLTIRDLL---------AQGRTSSNALEYVREEVFTNN 159 (385) T ss_pred HHhhccchhhHHHhhhccccccCCceecchhh-hHHHHHhhhccchhhhc---------ceecccCcceEEEEEecCCcc Confidence 11111111111111 111111 224555544 44555544444443311 111224667899998765556 Q ss_pred ccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH------HHhhh Q lcl|NC_015254. 78 DEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS------LNGIT 151 (346) Q Consensus 78 ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~------L~G~~ 151 (346) +..+.|+. .++..+.+-.+.....++.+....++++...-+ .+..+.+.+++++.+.+.++..+|.- +.|+. T Consensus 160 a~~v~E~~-~~~~~~~~~~~~~~~~~k~~~~~~is~ell~d~-~~l~~~i~~~la~a~~~~~d~~~l~G~g~~~~~~Gi~ 237 (385) T protein:vir:18 160 ADVVAEKA-LKPESDITFSKQTANVKTIAHWVQASRQVMDDA-PMLQSYINNRLMYGLALKEEGQLLNGDGTGDNLEGLN 237 (385) T ss_pred eeeeccCc-cccccccceeEEEEeeeeEEEeehhhHHHHhhH-HHHHHHHHHHHHHHHHHHHHHHHHhccCCCCcccccc Confidence 66677874 677777777777777777777777887754333 45677799999999999999876641 11111 Q ss_pred hhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc-cccCceeeEEece Q lcl|NC_015254. 152 ASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML-DSDNKKFPTYMGK 228 (346) Q Consensus 152 ~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~-~s~~~~i~~~~G~ 228 (346) ...... +..........++.|.++..++......-.+|+||+.++..|++..-- .++. 
...++.-++++|+ T Consensus 238 ~~~~~~------~~~~~~~~~~~~d~i~~~~~~l~~~~~~~~~~~~~~~~~~~l~~lkd~~G~~l~~~~~~~~~~~l~G~ 311 (385) T protein:vir:18 238 KVATAY------DTSLNATGDTRADIIAHAIYQVTESEFSASGIVLNPRDWHNIALLKDNEGRYIFGGPQAFTSNIMWGL 311 (385) T ss_pred cccccc------cccccccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHhhcCCCceeccCcccCCCceecce Confidence 111100 011112234578999999999887777788999999999999875311 1221 1123445789999 Q ss_pred EEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecC----CcceeEEEEeeEEeeee---eeeeeccccccC Q lcl|NC_015254. 229 RVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLHP---RGIAWQEKSVAG 299 (346) Q Consensus 229 ~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~~---~G~s~~~~~~~~ 299 (346) ||++++.||.+. .+|+. .++.+.... .+.++..+.. ..+...+....++.+++ ..|..-.-+.+ T Consensus 312 pV~~~~~~p~~~-----~~~gd~~~~~~~~~~~-~~~v~~~~~~~~~~~~~~~~~~~~~r~~~~v~~~~a~~~~~~~aa- 384 (385) T protein:vir:18 312 PVVPTKAQAAGT-----FTVGGFDMASQVWDRM-DATVEVSREDRDNFVKNMLTILCEERLALAHYRPTAIIKGTFSSG- 384 (385) T ss_pred eeEEcCcCCCCc-----EEEeecccEEEEEEec-ceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEeccC- Confidence 999999999763 33432 233333322 2345544433 23445556666665544 34433221111 Q ss_pred CC Q lcl|NC_015254. 300 HS 301 (346) Q Consensus 300 ~s 301 (346) + T Consensus 385 -~ 385 (385) T protein:vir:18 385 -S 385 (385) T ss_pred -C Confidence 1 No 65 >protein:vir:191 Length: 385 # NCBI annotation: major head subunit precursor # Family: family:all:585 # MgeID: mge:6 # MgeName: HK97 # Cross-refs: genbank:acc:NP_037701;genbank:gi:9634158;genbank:GeneID:1262530 Probab=99.24 E-value=2.5e-12 Score=84.15 Aligned_cols=275 Identities=13% Similarity=0.106 Sum_probs=152.5 Q ss_pred CccceecceeeecC--Cceeee-eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCc Q lcl|NC_015254. 
1 MIKKLRMNLQKFAA--GKNTRI-ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGE 77 (346) Q Consensus 1 ~~~~~~~~~q~~~a--~~~T~l-~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ 77 (346) +.....+..-.+.+ ...+.- ..++.|++. ..+.+...+.+.+.+-. .....++..+++|.+...++. T Consensus 90 ~~~~~~~~~~~~~~~~~~~~~~~g~~i~~~~~-~~ii~~~~~~~~l~~~~---------~~~~~~~~~~~~~~~~~~~~~ 159 (385) T protein:vir:19 90 DGKQGTFGAKTFNKSLGSDADSAGSLIQPMQI-PGIIMPGLRRLTIRDLL---------AQGRTSSNALEYVREEVFTNN 159 (385) T ss_pred HHhhccchhhHHHhhhccccccCCceecchhh-hHHHHHhhhccchhhhc---------ceecccCcceEEEEEecCCcc Confidence 11111111111111 111111 224555544 44555544444443311 111224667899998765556 Q ss_pred ccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH------HHhhh Q lcl|NC_015254. 78 DEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS------LNGIT 151 (346) Q Consensus 78 ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~------L~G~~ 151 (346) +..+.|+. .++..+.+-.+.....++.+....++++...-+ .+..+.+.+++++.+.+.++..+|.- +.|+. T Consensus 160 a~~v~E~~-~~~~~~~~~~~~~~~~~k~~~~~~is~ell~d~-~~l~~~i~~~la~a~~~~~d~~~l~G~g~~~~~~Gi~ 237 (385) T protein:vir:19 160 ADVVAEKA-LKPESDITFSKQTANVKTIAHWVQASRQVMDDA-PMLQSYINNRLMYGLALKEEGQLLNGDGTGDNLEGLN 237 (385) T ss_pred eeeeccCc-cccccccceeEEEEeeeeEEEeehhhHHHHhhH-HHHHHHHHHHHHHHHHHHHHHHHHhccCCCCcccccc Confidence 66677874 677777777777777777777777887754333 45677799999999999999876641 11111 Q ss_pred hhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc-cccCceeeEEece Q lcl|NC_015254. 152 ASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML-DSDNKKFPTYMGK 228 (346) Q Consensus 152 ~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~-~s~~~~i~~~~G~ 228 (346) ...... +..........++.|.++..++......-.+|+||+.++..|++..-- .++. 
...++.-++++|+ T Consensus 238 ~~~~~~------~~~~~~~~~~~~d~i~~~~~~l~~~~~~~~~~~~~~~~~~~l~~lkd~~G~~l~~~~~~~~~~~l~G~ 311 (385) T protein:vir:19 238 KVATAY------DTSLNATGDTRADIIAHAIYQVTESEFSASGIVLNPRDWHNIALLKDNEGRYIFGGPQAFTSNIMWGL 311 (385) T ss_pred cccccc------cccccccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHhhcCCCceeccCcccCCCceecce Confidence 111100 011112234578999999999887777788999999999999875311 1221 1123445789999 Q ss_pred EEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecC----CcceeEEEEeeEEeeee---eeeeeccccccC Q lcl|NC_015254. 229 RVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLHP---RGIAWQEKSVAG 299 (346) Q Consensus 229 ~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~~---~G~s~~~~~~~~ 299 (346) ||++++.||.+. .+|+. .++.+.... .+.++..+.. ..+...+....++.+++ ..|..-.-+.+ T Consensus 312 pV~~~~~~p~~~-----~~~gd~~~~~~~~~~~-~~~v~~~~~~~~~~~~~~~~~~~~~r~~~~v~~~~a~~~~~~~aa- 384 (385) T protein:vir:19 312 PVVPTKAQAAGT-----FTVGGFDMASQVWDRM-DATVEVSREDRDNFVKNMLTILCEERLALAHYRPTAIIKGTFSSG- 384 (385) T ss_pred eeEEcCcCCCCc-----EEEeecccEEEEEEec-ceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEeccC- Confidence 999999999763 33432 233333322 2345544433 23445556666665544 34433221111 Q ss_pred CC Q lcl|NC_015254. 300 HS 301 (346) Q Consensus 300 ~s 301 (346) + T Consensus 385 -~ 385 (385) T protein:vir:19 385 -S 385 (385) T ss_pred -C Confidence 1 No 66 >protein:vir:2344 Length: 397 # NCBI annotation: gp14 # Family: family:all:507 # MgeID: mge:51 # MgeName: Bxb1 # Cross-refs: genbank:acc:NP_075281;genbank:gi:12657868;genbank:GeneID:920118 Probab=99.24 E-value=4.6e-12 Score=82.74 Aligned_cols=312 Identities=13% Similarity=0.076 Sum_probs=160.1 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 
1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) =|.+.+..++ ..++.-+.++.||+..+++..- .+.+.+.+ +......++..+++|.+..- ..+.- T Consensus 3 ~~~e~~~~~~----~~t~~~~g~l~~~~~~~ii~~l-~~~s~i~~---------l~~~~~~~~~~~~ip~~~~~-~~a~w 67 (397) T protein:vir:23 3 FSADHSQIAQ----TKDTMFTGYLDPVQAKDYFAEA-EKTSIVQR---------VAQKIPMGATGIVIPHWTGD-VSAQW 67 (397) T ss_pred cCHHHHHHhh----ccCCCCccccchhHHHHHHHHH-Hhccchhh---------hcceeeccCCceEEEEEcCC-cceEE Confidence 1222222111 1233345688999877766542 33333322 11222335677899999763 45556 Q ss_pred cCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcc Q lcl|NC_015254. 81 LDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNK 160 (346) Q Consensus 81 ~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~ 160 (346) +.|++ .++..+.+-.+-....++.+..+.++++...-+.-|....+.+++++++++++++.+|. |.-........ T Consensus 68 v~Eg~-~~~~s~~~f~~v~l~~~k~~~~v~iS~ell~ds~~~l~~~i~~~l~~aia~~~d~a~l~---G~gt~~~~~~~- 142 (397) T protein:vir:23 68 IGEGD-MKPITKGNMTKRDVHPAKIATIFVASAETVRANPANYLGTMRTKVATAIAMAFDNAALH---GTNAPSAFQGY- 142 (397) T ss_pred ecCCc-cccccccceeEEEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhh---cccCCcccccc- Confidence 77875 67777888777777788888889999987777777899999999999999999997764 21110000000 Q ss_pred eeee-ccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhccccc--C-----ceeeEEeceEE Q lcl|NC_015254. 161 LDVS-TETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLDSD--N-----KKFPTYMGKRV 230 (346) Q Consensus 161 ~dis-~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~s~--~-----~~i~~~~G~~V 230 (346) .+.. 
..........++.+.++...+-+......+|+||++.+..|++..-- .++...+ + ...++++|+|| T Consensus 143 ~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~a~~vmn~~~~~~L~~lkd~~G~~i~~~~~~~~~~~~~~~~tl~G~Pv 222 (397) T protein:vir:23 143 LDQSNKTQSISPNAYQGLGVSGLTKLVTDGKKWTHTLLDDTVEPVLNGSVDANGRPLFVESTYESLTTPFREGRILGRPT 222 (397) T ss_pred cccccceeeecccchhHHHHHHHHhhhhcccCCCEEEEcHHHHHHHHHhhccCCceeecccccccccccccCceeeeeeE Confidence 0000 01112233566788888888877777778999999999999975311 1222111 1 12358999999 Q ss_pred EEeCCCccCCCceEEEEEc--CCeeEEeecCCccceeeeecCCc----------------ceeEEEEeeEEee---eeee Q lcl|NC_015254. 231 IVDDGLPAKDGVYTSYIFG--EGAFGLGNGEAPVPTETDREKLK----------------GNDILINRQHFLL---HPRG 289 (346) Q Consensus 231 VvdD~~p~~~g~ytt~l~~--~GAi~~~~~~~~~~vE~dRd~~~----------------g~~~l~~r~~~~~---~~~G 289 (346) ++++.||... ...+++ ..++ +.. ...+.+|.+|+... ++..+....++.+ ||.. T Consensus 223 ~~s~~~~~g~---~~~~~gDfs~~~-i~~-~~~i~i~~~~e~~~~~~~~~~~~~~~lf~~d~v~~ra~~r~d~~v~~~~a 297 (397) T protein:vir:23 223 ILSDHVAEGD---VVGYAGDFSQII-WGQ-VGGLSFDVTDQATLNLGSQESPNFVSLWQHNLVAVRVEAEYGLLINDVNA 297 (397) T ss_pred EEeCCCCCCc---eEEEEeecceEE-EEE-EeceEEEEeeeeeeeeccccccceeeeeeccceeEEEEeeeccceecccc Confidence 9999998654 122232 1222 333 22355666665421 1122222222222 2222 Q ss_pred eeeccccccCCCCChHHhc---CCcCceeeeccc---ccceEEEEEecccccccCCCCCC----CCC Q lcl|NC_015254. 290 IAWQEKSVAGHSPTNTEIE---KGNNWKAVYESK---NIRIVAFVHKNGVPGKKKETAPE----GIK 346 (346) Q Consensus 290 ~s~~~~~~~~~sPt~a~L~---~~~NW~~v~~~K---~i~iv~~~~k~~~~~~~~~~~~~----~~~ 346 (346) +........ ..+...+. ++.+..+.+..+ .|+ +. +-......|=. 
|+- T Consensus 298 ~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-----~~--a~~~~~~~~~~~~~~~~~ 355 (397) T protein:vir:23 298 FVKLTFDPV--LTTYALDLDGASAGNFTLSLDGKTSANIA-----YN--ASTATVKSAIVAIDDGVS 355 (397) T ss_pred eEEEeeccc--cceeeecccccCcceEEEEecCccccCcc-----cc--cchhhhHHHhhhcccccc Confidence 222111110 00000010 122222222111 111 00 00000000000 000 No 67 >protein:vir:97053 Length: 390 # NCBI annotation: putative head protein # Family: family:all:585 # MgeID: mge:1653 # MgeName: OP1 # Cross-refs: genbank:acc:YP_453565;genbank:gi:84662600;genbank:GeneID:5142468 Probab=99.23 E-value=1.9e-12 Score=84.90 Aligned_cols=272 Identities=11% Similarity=0.108 Sum_probs=154.6 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) +....+..+.......++.-..++.|++... +.+...+.+.+.+- + .....++..+++|.+..-++.+.. T Consensus 102 ~~~~~~~~~~~~~~~~~~~~g~lip~~~~~~-ii~~~~~~~~i~~~--~-------~~~~~~~~~~~~~~~~~~~~~a~~ 171 (390) T protein:vir:97 102 ATMNIKAALNTASTDAAGSAGALTTPNRLPG-FITPPDARLTVRDL--I-------GSGRTDSALIEYVQETGFVNNAAI 171 (390) T ss_pred hhhHHHHHHHhhhcccccccccccchhhhHH-HHHHHhhhhhhHhh--c-------ceeeccCCceEEEEEecCCcceee Confidence 1111222222111222233334566665555 44444444444321 1 112234667888988765566667 Q ss_pred cCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH------HHhhhhhh Q lcl|NC_015254. 81 LDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS------LNGITASG 154 (346) Q Consensus 81 ~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~------L~G~~~~~ 154 (346) +.||. .++..+.+..+-....++.+.-..++++...-+ .+....+.++++..+.+.++..+|.- .+|++... 
T Consensus 172 v~Eg~-~~~~~~~~~~~i~~~~~k~~~~~~is~ell~ds-~~l~~~i~~~la~a~~~~~d~a~l~G~g~~~~p~Gi~~~~ 249 (390) T protein:vir:97 172 VAEGA-LKPESSLKFAKKTDTTHVIAHTMKATRQILSDA-PQLASYMNNRLIRGLKVKEDAEILRGTGANDGLLGLIPQA 249 (390) T ss_pred ecCCc-cccccccceeEEEEeeeeEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHhhcCCCCccccceeecc Confidence 78885 577777777766667777777777777654433 36777799999999999999876641 11221111 Q ss_pred hhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhc-ccccCceeeEEeceEEE Q lcl|NC_015254. 155 ALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFM-LDSDNKKFPTYMGKRVI 231 (346) Q Consensus 155 ~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~-~~s~~~~i~~~~G~~VV 231 (346) ... +..........++.+.++...+.+......+|+|||.++..|++..-- .++ .+..++.-++++|+||+ T Consensus 250 ~~~------~~~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~lkd~~G~~l~~~~~~~~~~~l~G~pV~ 323 (390) T protein:vir:97 250 TTY------AAPTTIAGATRVDQLRLAMLQASLAEYPASGIVINPIDWAAIELAKDANNQYLIGNARGTLTPTLWGLPVV 323 (390) T ss_pred ccc------cccccccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHhhcCCCceeecCccCCCCceecceeeE Confidence 100 111122334677889999999888887888999999999999865311 122 11123344689999999 Q ss_pred EeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecC---CcceeEEEEeeEEeeeee---eeeecccccc Q lcl|NC_015254. 232 VDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK---LKGNDILINRQHFLLHPR---GIAWQEKSVA 298 (346) Q Consensus 232 vdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~---~~g~~~l~~r~~~~~~~~---G~s~~~~~~~ 298 (346) +++.||.+. ++++. .++.+.. +..+.++..++. 
..+...+....+|...|+ .|..- +.+ T Consensus 324 ~~~~~~~~~-----~~~gd~~~~~~~~~-~~~~~i~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~v~~--~~a 390 (390) T protein:vir:97 324 ATQAMAPGE-----FLVGAFDLAAQIFD-QWDARVEIGYVNDDFQRNMVTVLAEERLALVVYRPEALITG--SFA 390 (390) T ss_pred EcCCCCCCc-----EEEEeccceEEEEE-ecceEEEEeecccccccCcEEEEEEEeeccEEeccccEEEE--EeC Confidence 999999763 34432 3444333 223456666543 234445555555554443 33332 111 No 68 >protein:vir:81070 Length: 390 # NCBI annotation: p09 # Family: family:all:585 # MgeID: mge:1889 # MgeName: Xop411 # Cross-refs: genbank:acc:YP_001285679;genbank:gi:148727187;genbank:GeneID:5247115 Probab=99.23 E-value=2.1e-12 Score=84.56 Aligned_cols=272 Identities=11% Similarity=0.118 Sum_probs=155.1 Q ss_pred Cc-c--ceecceeeec----CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MI-K--KLRMNLQKFA----AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~-~--~~~~~~q~~~----a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) +- . +.++++..+. ...++.-.-++.||....++ +...+.+.+.+.. .....++..+++|.+.. T Consensus 95 ~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~ii-~~~~~~~~l~~~~---------~~~~~~~~~~~~~~~~~ 164 (390) T protein:vir:81 95 WNDRSARATMNIKAALNTASTDAAGSAGALTTPNRLPGFI-TPPDARLTVRDLI---------GSGRTDSALIEYVQETG 164 (390) T ss_pred HhhhhhhhhhHHHHHHHhhccccccCCcceechhhhHHHH-HHHhhhhhhhhhc---------ceeeccCCceEEEEEec Confidence 00 0 0111111111 11122222366777665544 4444444443311 11223566788998876 Q ss_pred CCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH------H Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS------L 147 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~------L 147 (346) -.+.+..+.|+. .++..+.+-.+.....++.+....+++....-+ .+....+.+++++.+.+..+..+|.- . 
T Consensus 165 ~~~~a~~v~Eg~-~~~~~~~~~~~i~~~~~k~~~~~~is~ell~d~-~~~~~~i~~~l~~~~~~~~d~a~l~G~g~~~~~ 242 (390) T protein:vir:81 165 FVNNAAIVAEGA-LKPESSLKFAKKTDTTHVIAHTMKATRQILSDA-PQLASYMNNRLIRGLKVKEDAEILRGTGANDGL 242 (390) T ss_pred CCcceeeecCCc-ccccccceeeEEEEeeeEEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHHhcCCCCCcc Confidence 555666678875 577777777777777777777778888755444 36777799999999999999876631 2 Q ss_pred HhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc-cccCceeeE Q lcl|NC_015254. 148 NGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML-DSDNKKFPT 224 (346) Q Consensus 148 ~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~-~s~~~~i~~ 224 (346) +|++........ .........++.|.++...+........+|+|||.++..|++..-- .++. ...++.-++ T Consensus 243 ~Gi~~~~~~~~~------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~lkd~~G~~l~~~~~~~~~~~ 316 (390) T protein:vir:81 243 LGLIPQATTYAA------PTTIAGATRVDQLRLAMLQASLAEYNPSGIVINPIDWAAIELAKDANNQYLIGNARGTLTPT 316 (390) T ss_pred cceeeccccccc------ccccccchhHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHhhcCCCceeecCcccccCce Confidence 222221111111 1112234567889999999888777778999999999999865311 1221 122334468 Q ss_pred EeceEEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecC---CcceeEEEEeeEEeeee---eeeeecccc Q lcl|NC_015254. 225 YMGKRVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK---LKGNDILINRQHFLLHP---RGIAWQEKS 296 (346) Q Consensus 225 ~~G~~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~---~~g~~~l~~r~~~~~~~---~G~s~~~~~ 296 (346) ++|+||++++.||.+. ++++. .++.+.. +..+.++.++.. ..+...+....++.+++ ..|... + T Consensus 317 l~G~pv~~~~~~p~~~-----~~~gd~~~~~~~~~-~~~~~v~~~~~~~~~~~~~v~~r~~~r~d~~v~~~~a~v~~--t 388 (390) T protein:vir:81 317 LWGLPVVATQAMAPGE-----FLVGAFDLAAQIFD-QWDARVEIGYVGEDFQRNMITVLAEERLALVVYRPEALISG--S 388 (390) T ss_pred ecceeeEEcCCCCCCc-----EEEEehhceEEEEE-ecceEEEEecccchhhcCcEEEEEEEeeccEEecccceEEE--E Confidence 9999999999999764 34443 2333332 224556666643 23445555555555443 333221 1 Q ss_pred cc Q lcl|NC_015254. 
297 VA 298 (346) Q Consensus 297 ~~ 298 (346) .+ T Consensus 389 ~a 390 (390) T protein:vir:81 389 FA 390 (390) T ss_pred eC Confidence 11 No 69 >protein:vir:104256 Length: 458 # NCBI annotation: major head protein precursor # Family: family:all:27070 # MgeID: mge:1504 # MgeName: T5 # Cross-refs: genbank:acc:YP_006977;genbank:gi:46401878;genbank:GeneID:2777673 Probab=99.22 E-value=3.6e-12 Score=83.30 Aligned_cols=285 Identities=14% Similarity=0.087 Sum_probs=146.9 Q ss_pred Cccce---ecceeeecC-Cc--eeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccC Q lcl|NC_015254. 1 MIKKL---RMNLQKFAA-GK--NTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDL 74 (346) Q Consensus 1 ~~~~~---~~~~q~~~a-~~--~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l 74 (346) +.+.. ...+.-..+ +. ++.-....+|+.+..-+.+...+.+.+.+-. .....+|....+|..... T Consensus 144 ~~~~~~~~~~~~~~~~a~~~~~~~~~g~~~ip~~~~~~ii~~~~~~~~l~~~~---------~~~~~~~~~~~~~~~~~~ 214 (458) T protein:vir:10 144 MEKGVFETEHGQRHLKAVNQSSSVEVSSESYETIFSQRIIRDLQKELVVGALF---------EELPMSSKILTMLVEPDA 214 (458) T ss_pred HhhccchhhhhhhhhhhhhhcccCccccceehhhHhHHHHHHHHhhhhHHhhc---------ceeecCCcceEEEEecCC Confidence 11100 000101111 11 1122344677777776666655555442211 111224555666655443 Q ss_pred CCcccccCCCccccc------hhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-- Q lcl|NC_015254. 75 TGEDEILDDGEGALT------PGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS-- 146 (346) Q Consensus 75 ~g~ae~~~dg~~~it------~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~-- 146 (346) +.+.-+.++. 
..+ ..+.+-.+-....++.+.-..+++....-+.-+....+.++++.++.+..+..+|.- T Consensus 215 -~~a~~v~e~~-~~~~~~~~~~~~~~~~~i~~~~~k~~~~v~is~ell~ds~~~~~~~i~~~l~~~i~~~~d~~~l~G~G 292 (458) T protein:vir:10 215 -GKATWVAAST-YGTDTTTGEEVKGALKEIHFSTYKLAAKSFITDETEEDAIFSLLPLLRKRLIEAHAVSIEEAFMTGDG 292 (458) T ss_pred -cceeeccccc-ccccccccccccccceeeEeeeeeEEeeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhcCCC Confidence 3333334432 111 112222222233344555567777654444557788899999999999999976631 Q ss_pred ---HHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh--hcc----cc Q lcl|NC_015254. 147 ---LNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE--FML----DS 217 (346) Q Consensus 147 ---L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~~----~s 217 (346) -+|++..........-.....+....++++.|.++...+......-..|+||+..+..|++..--+ ++. .. T Consensus 293 ~~~p~Gi~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~l~~lkd~~G~~i~~~~~~~ 372 (458) T protein:vir:10 293 SGKPKGLLTLASEDSAKVVTEAKADGSVLVTAKTISKLRRKLGRHGLKLSKLVLIVSMDAYYDLLEDEEWQDVAQVGNDS 372 (458) T ss_pred CCccceeeecccccccceeecccccccccccHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHhhcccCCceeecccccc Confidence 112222211111111111112233457899999999998877767789999999999988653211 111 11 Q ss_pred --cCceeeEEeceEEEEeCCCccCCCceEEE--EEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEee---eeeee Q lcl|NC_015254. 218 --DNKKFPTYMGKRVIVDDGLPAKDGVYTSY--IFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLL---HPRGI 290 (346) Q Consensus 218 --~~~~i~~~~G~~VVvdD~~p~~~g~ytt~--l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~---~~~G~ 290 (346) ..+..++++|+||++++.||...+....+ .|+.+.+ +.... 
.+.++.|+-...+...+.++.++.+ +|-|| T Consensus 373 ~~~~~~~~~l~G~pv~~~~~~p~~~~~~~~~~~~f~~~~~-~~~~~-~~~v~~d~~~~~~~~~~~~~~r~~~~v~~~~a~ 450 (458) T protein:vir:10 373 VKLQGQVGRIYGLPVVVSEYFPAKANSAEFAVIVYKDNFV-MPRQR-AVTVERERQAGKQRDAYYVTQRVNLQRYFANGV 450 (458) T ss_pred ccccCcCceecceeeEEccccccccCCcceEEEEecccEE-EEEee-ceEEEeecccCCCceEEEEEEEecceEecccce Confidence 12345689999999999999865543322 2444433 33322 3445444444566677777777764 34444 Q ss_pred eeccccccCCC Q lcl|NC_015254. 291 AWQEKSVAGHS 301 (346) Q Consensus 291 s~~~~~~~~~s 301 (346) -- ....+ | T Consensus 451 v~-~~~aa--~ 458 (458) T protein:vir:10 451 VS-GTYAA--S 458 (458) T ss_pred EE-Eeecc--C Confidence 22 11111 1 No 70 >protein:vir:485 Length: 407 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:11 # MgeName: P27 # Cross-refs: genbank:acc:NP_543092;swissprot:trembl:q8w627;genbank:gi:18249904;uniprot:Q8W627;genbank:GeneID:929693 Probab=99.21 E-value=3.9e-12 Score=83.13 Aligned_cols=291 Identities=11% Similarity=0.042 Sum_probs=148.3 Q ss_pred Cccceecceeeec--CCceeeee--eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRMNLQKFA--AGKNTRIA--DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~q~~~--a~~~T~l~--d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) |.+-.+..|..+. +-.++.-+ -.++|+.+..-+.+...+.+.+.+.. .....++..+.+|....- . T Consensus 90 l~~g~~~~~~~~e~~a~~~~t~~~gG~~iP~~~~~~I~~~~~~~~~l~~~~---------~~~~~~~~~~~~~~~~~~-~ 159 (407) T protein:vir:48 90 MRKGREDGLRELERKALQVGNDEDGGYAIPEELDRTILTLLKDEVVMRQEA---------TVITLGGSDYKKLVNLGG-T 159 (407) T ss_pred HhccchhhhhHHHHHhhhcccCCCCcccccHhHHHHHHHHHHhhhhhhhhc---------eeeecCCCceEEEEecCC-c Confidence 2211122222111 11111112 24679888887777766655443311 112223456666655432 2 Q ss_pred cccccCCCccccchhhcc-cceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-----HHhh Q lcl|NC_015254. 
77 EDEILDDGEGALTPGNIS-AAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS-----LNGI 150 (346) Q Consensus 77 ~ae~~~dg~~~it~~~lt-~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~-----L~G~ 150 (346) .+.-+.|+. .++..+.. ..+-....++.+.-..++++...-+..|..+.+.++++..+.+..+..+|.- -+|+ T Consensus 160 ~a~~v~E~~-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~~~a~l~G~G~~~p~Gi 238 (407) T protein:vir:48 160 TSGWVGETD-ARPETATSKLGLIEPFMGEIYGNPQATQKMLDDAFFNVEDWINSELALEFAEQEEIAFTSGDGSKKPKGF 238 (407) T ss_pred ceeeecccc-cccccccccceeEEeeeeeeEeehhhHHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhhccCCCCcccee Confidence 333356664 34433332 3333333444444456666665555668888999999999999998865531 0112 Q ss_pred hhhhhhhhc--c----eeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCc Q lcl|NC_015254. 151 TASGALDSN--K----LDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNK 220 (346) Q Consensus 151 ~~~~~~~~~--~----~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~ 220 (346) +........ + ......++....++++.|.+....+......-.+|+||+.++..|++..-- .++. ...++ T Consensus 239 l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~i~~l~~~l~~~~~~~a~~v~n~~~~~~L~~lkD~~Gr~l~~~~~~~g 318 (407) T protein:vir:48 239 LAYESTDEDDKTRAFGKLQHIASGAASGVTADAIIKLIYTLRKAHRSGAKFMMNNSSLFAIRLLKDNDGNYLWRPGIELG 318 (407) T ss_pred eecccccccccccccccccccccccccccChHHHHHHHHhhchhhhcCCEEEEcHHHHHHHHHhhccCCceeeccCcCCC Confidence 211111000 0 000112334456889999999988876666667899999999998875311 1222 22234 Q ss_pred eeeEEeceEEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecC--CcceeEEEEeeEEeeee---eeeeec Q lcl|NC_015254. 221 KFPTYMGKRVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHP---RGIAWQ 293 (346) Q Consensus 221 ~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~---~G~s~~ 293 (346) ..++++|+||+++|.||.....-.+++||. .++.+.... 
.+++.|++ ..+...+....+|-+.| ..|..- T Consensus 319 ~~~~l~G~PV~~~~~~p~~~~~~~~i~~Gd~~~~~~i~~~~---~~~i~~d~~~~~~~~~~~~~~r~d~~v~~~~a~~~l 395 (407) T protein:vir:48 319 QPSSLAGYGIVENEQMPDIAADAKAIAFGNFKRGYTIVDRI---GTRILRDPYTNKPFVGFYTTKRTGGMLVDSQAIKLM 395 (407) T ss_pred CCceecceeeEEecCcCCccCCccEEEEEeccccEEEEEee---ceEEEeeccccCCcEEEEEEEEeccEEecccceEEE Confidence 456899999999999996432223344442 233333221 23444444 34555555555554333 333332 Q ss_pred cccccCCCCChH Q lcl|NC_015254. 294 EKSVAGHSPTNT 305 (346) Q Consensus 294 ~~~~~~~sPt~a 305 (346) ..+.+..+-.-+ T Consensus 396 ~~~aa~~~~~~~ 407 (407) T protein:vir:48 396 KIGAATRQKAAA 407 (407) T ss_pred EeeccCCCCCCC Confidence 211111111111 No 71 >protein:vir:9759 Length: 303 # NCBI annotation: putative structural protein # Family: family:all:966 # MgeID: mge:175 # MgeName: 315.3 # Cross-refs: genbank:acc:NP_795521;genbank:gi:28876283;genbank:GeneID:1257824 Probab=99.20 E-value=6e-12 Score=82.12 Aligned_cols=271 Identities=14% Similarity=0.025 Sum_probs=151.6 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =+ ++.-...++|+.+..-+.+...+.+.+.+.+-. ...++..+++|.+..- +.+.-+.|++ .++..+ T Consensus 1 m~--t~t~gg~liP~~~~~~ii~~l~~~s~i~~l~~~---------~~~~~~~~~ip~~~~~-~~a~wv~E~~-~~~~s~ 67 (303) T protein:vir:97 1 MG--TETSKASLFDKHLVSDLINKVKGHSSLAKLSSQ---------KPIPFNGSKEFTFTLD-SDIDVVAENG-KKTHGG 67 (303) T ss_pred Cc--ccCCCCeEcchhHHHHHHHHHHhhchhhhhcce---------eecCCCceEEEEEecC-cceEEeecCc-cccccc Confidence 12 223345567777766667776666655443211 1235667899998764 5666778875 677777 Q ss_pred cccceeEEEEEeecCcceechHHHhhh---cchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhh---hhhcceee--e Q lcl|NC_015254. 
93 ISAAKDIARLHMRGKAWRTNDLAKALS---GDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGA---LDSNKLDV--S 164 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~---g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~---~~~~~~di--s 164 (346) .+-.+..-..++.+.-..++++-...+ .-+..+.+.++++++..+..+..+|.-....-+... ........ . T Consensus 68 ~~f~~v~l~~~kl~~~~~iS~ell~~~~d~~~~l~~~i~~~la~a~~~~ld~a~l~G~~~~~g~~~~~~~~~~~~~~~~~ 147 (303) T protein:vir:97 68 LSLEPVTIVPIKVEYGARLSDEFLYATEEEKIDILKAFNEGFAKKLARGIDLMAMHGINPRTKKASDVIGTNHFDSKVTQ 147 (303) T ss_pred cceeeEEeeeEEEEEeehhhHHHhhcCccchHHHHHHHHHHHHHHHHHHHHhhhhcccccCCcccccccccccccccccc Confidence 776666556666776677777754322 336778899999999999999976643210001100 00000000 0 Q ss_pred ccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcccc---cCceeeEEeceEEEEeCCCccC Q lcl|NC_015254. 165 TETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLDS---DNKKFPTYMGKRVIVDDGLPAK 239 (346) Q Consensus 165 ~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~s---~~~~i~~~~G~~VVvdD~~p~~ 239 (346) ..+.......++.|.++..++-+......+|+|||.++..|++..-- .++... .++..++++|+||++++.||.. T Consensus 148 ~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~L~~lkd~~g~~~~~~~~~~~~~~~~l~G~Pv~~s~~v~~~ 227 (303) T protein:vir:97 148 VVKFTESEDADANIEAAVNLIQGAEGVVTGLAMDTEFSTALAKVTNGEMGPKMYPELAWGANPDSINGLKSSVNTTVGAG 227 (303) T ss_pred ccccccccchHHHHHHHHHHHhhcCCCccEEEEcHHHHHHHHHhhccCCCeEEecCccCCCCCceecceeeEEecccCCc Confidence 11112234568999999999877777778899999999999864211 122211 1334568999999999999864 Q ss_pred CC---ceEEEEEcC--CeeEEeecCCccceeeeecC--C--------cceeEEEEeeEEe---eeeeeeeec-cccc Q lcl|NC_015254. 240 DG---VYTSYIFGE--GAFGLGNGEAPVPTETDREK--L--------KGNDILINRQHFL---LHPRGIAWQ-EKSV 297 (346) Q Consensus 240 ~g---~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~--~--------~g~~~l~~r~~~~---~~~~G~s~~-~~~~ 297 (346) .+ ....++++. .++.+...+ .+.+|..+.- . .....+....++. 
+||..|.-- +..+ T Consensus 228 ~~~~~~~~~~~~Gdf~~~~~~~~~~-~~~~~~~~~~~~d~~~~~~~~~n~~~~r~~~r~~~~v~~p~af~~l~~~~~ 303 (303) T protein:vir:97 228 ADEAESKDLVIIGDFESMFKWGYAK-QIPMEIIKYGDPDNSGKDLKGYNQIYLRAEAYIGWGILDAKSFARVTKGEV 303 (303) T ss_pred cccCCCccEEEEeeccccEEEEEec-CcEEEEeeccCCCCcchhhhhcCcEEEEEEEEeccEeecccceEEeeCCCC Confidence 32 122355553 444454422 3444443211 1 1112233333332 233333221 2112 No 72 >protein:vir:8102 Length: 543 # NCBI annotation: gp6 # Family: family:all:21 # MgeID: mge:152 # MgeName: Che9c # Cross-refs: genbank:acc:NP_817683;genbank:gi:29566114;genbank:GeneID:1259308 Probab=99.19 E-value=6.6e-12 Score=81.87 Aligned_cols=283 Identities=8% Similarity=0.001 Sum_probs=146.3 Q ss_pred CccceecceeeecCCceeee--eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRI--ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGED 78 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l--~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~a 78 (346) +...-+..+..+.+..+|.- ..+|.+++....+.....+.+.+.+. .......| .+.+|.... .+.+ T Consensus 237 l~~~e~~~~~~~~~~~~t~~~gg~lip~~~~~~ii~~~~~~~~~l~~~---------~~~~~~~g-~~~~~~~~~-~~~a 305 (543) T protein:vir:81 237 LTEEEKRAINEVRAMGLTKADGGYLVPFQLDPTVIITSNGSLNDIRRF---------ARQVVATG-DVWHGVSSA-AVQW 305 (543) T ss_pred hhhhhhhhhhhhhhcccccccCcccCchhhhhHHHHHHHhhhchhhhh---------cccccCCc-ceEEEEecC-Ccce Confidence 11111222333322222221 23444455555555544444433221 11112234 445666543 2455 Q ss_pred cccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH------HHHhhhh Q lcl|NC_015254. 79 EILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIA------SLNGITA 152 (346) Q Consensus 79 e~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla------~L~G~~~ 152 (346) ..+.||. .++..+++-++-....+..+.-+.++.....-+ -|....|.++++..+.+..+..+|. ...|++. 
T Consensus 306 ~~v~Eg~-~~~~~~~~~~~i~~~~~k~~~~~~is~ell~d~-~~~~~~i~~~l~~~~~~~~d~ail~G~Gt~~~p~Gi~~ 383 (543) T protein:vir:81 306 SWDAEFE-EVSDDSPEFGQPEIPVKKAQGFVPISIEALQDE-ANVTETVALLFAEGKDELEAVTLTTGTGQGNQPTGIVT 383 (543) T ss_pred eecccCc-cccccccccceeeeeeeeeEeeehhhHHHHhcc-HHHHHHHHHHHHHHHHHHHHHHHhccCCCCcccccchh Confidence 5677875 577777777666666777777777777655433 4788889999999999999997763 2334433 Q ss_pred hhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcccc-cCceeeEEeceE Q lcl|NC_015254. 153 SGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLDS-DNKKFPTYMGKR 229 (346) Q Consensus 153 ~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~s-~~~~i~~~~G~~ 229 (346) ...... .. ..++....++++.+.++...+......-.+|+||+.++..|++..-- .++... .++.-++++|+| T Consensus 384 ~~~~~~--~~--~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~l~~lkd~~G~~l~~~~~~g~~~~l~G~p 459 (543) T protein:vir:81 384 ALAGTA--AE--IAPVTAETFALADVYAVYEQLAARHRRQGAWLANNLIYNKIRQFDTQGGAGLWTTIGNGEPSQLLGRP 459 (543) T ss_pred hccccc--cc--ccccccccccHHHHHHHHHhhhccccCCcEEEEcHHHHHHHHHhhcCCCceeccCcCCCCCcccccee Confidence 222111 11 11234455789999999988876666667899999999999875311 122221 233456899999 Q ss_pred EEEeCCCccCC------CceEEEEEcCC-eeEEeecCCccceeeeecCCc------ceeEEEEeeEEeeeeeeeeecccc Q lcl|NC_015254. 230 VIVDDGLPAKD------GVYTSYIFGEG-AFGLGNGEAPVPTETDREKLK------GNDILINRQHFLLHPRGIAWQEKS 296 (346) Q Consensus 230 VVvdD~~p~~~------g~ytt~l~~~G-Ai~~~~~~~~~~vE~dRd~~~------g~~~l~~r~~~~~~~~G~s~~~~~ 296 (346) |++++.||... +.+ .++||.= -+.+...+ .+.++.+..... +...+....++.+. T Consensus 460 v~~~~~~~~~~~~~~~~~~~-~i~~gd~~~~~i~~~~-~~~i~~~~~~~~~~~~~~~~~~~~~~~r~d~~---------- 527 (543) T protein:vir:81 460 VGEAEAMDANWNTSASADNF-VLLYGNFQNYVIADRI-GMTVEFIPHLFGTNRRPNGSRGWFAYYRMGAD---------- 527 (543) T ss_pred eEEeccccccccccccCCcc-eEEEeeccceeEEeec-ccEEEEeccccccchhhcCceEEEEEEeeccE---------- Confidence 99999999753 233 2333321 12233222 233333222111 22222222222211 Q ss_pred ccCCCCChHHhcCCcCceeeecccccceEEEEEec Q lcl|NC_015254. 
297 VAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKN 331 (346) Q Consensus 297 ~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~ 331 (346) +.+++++.++.+++.. T Consensus 528 -------------------v~~~~A~~~l~~~~~a 543 (543) T protein:vir:81 528 -------------------VVNPNAFRLLNVETAS 543 (543) T ss_pred -------------------eecccceEEEEecccC Confidence 1122222211111111 No 73 >protein:vir:1638 Length: 298 # NCBI annotation: Structural protein # Family: family:all:966 # MgeID: mge:33 # MgeName: r1t # Cross-refs: genbank:acc:NP_695059;genbank:gi:23455750;genbank:GeneID:955469 Probab=99.18 E-value=1.3e-11 Score=80.20 Aligned_cols=268 Identities=15% Similarity=0.077 Sum_probs=147.7 Q ss_pred ceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhccc Q lcl|NC_015254. 16 KNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISA 95 (346) Q Consensus 16 ~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~ 95 (346) -.|.-..++.||+.++.+ +...+.+.+.+-. .....++..+++|.+..- +.+..+.|++ .++..+++- T Consensus 1 ma~~gG~lvp~~~~~~ii-~~~~~~s~i~~l~---------~~~~~~~~~~~ip~~~~~-~~a~~v~E~~-~~~~~~~~f 68 (298) T protein:vir:16 1 MVLNKGTLFDPTLVTDLI-SKVAGKSSIARLS---------AQKPIPFNGEKVFTFTMD-SEIDVVAESG-KKTHGGVTL 68 (298) T ss_pred CcccCcceechhHHHHHH-HHHHhhhhhhhhc---------ceeeccCCceEEEEEecC-cceEEecCCc-cccccccce Confidence 123334566777766655 4444444343311 111234556789998754 5666678874 677777766 Q ss_pred ceeEEEEEeecCcceechHHHhhhc---chHHHHHHHHHHHHHHHHHHHHHHHHH---Hh----hhhhhhhhhcceeeec Q lcl|NC_015254. 96 AKDIARLHMRGKAWRTNDLAKALSG---DDPMRAIGDLVVEYWNRRRQAVLIASL---NG----ITASGALDSNKLDVST 165 (346) Q Consensus 96 ~~~~a~~~~~~k~~~~tD~a~~~~g---~dp~~~i~~q~a~~~~~~~~~~lla~L---~G----~~~~~~~~~~~~dis~ 165 (346) .+.....++.+.-..++++....+. .+..+.+.+++++++.+.++..++.-. .| ..+.+.....+.. .. 
T Consensus 69 ~~v~l~~~k~a~~~~iS~ell~~s~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~~~~~g~~~~~~~~~~~~~~~~~-~~ 147 (298) T protein:vir:16 69 APQTMVPIKVEYGARISDEFMYASDEEKINILQEFNDGFAKKVARGIDLMAFHGVNPRLGTASAVIGTNHFDSKVTQ-KV 147 (298) T ss_pred eEEEEeeeeEEEeehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHHhhccccCCCCccccccccccccccccc-cc Confidence 6555566666666777777654333 367778999999999999999777431 00 0000010000000 11 Q ss_pred cccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh--hcc--cccCceeeEEeceEEEEeCCCccCCC Q lcl|NC_015254. 166 ETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE--FML--DSDNKKFPTYMGKRVIVDDGLPAKDG 241 (346) Q Consensus 166 ~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~~--~s~~~~i~~~~G~~VVvdD~~p~~~g 241 (346) ..+....-.+..+.++..++.....+..+|+|||+.+..|++..=.+ ++. ...++.-++++|+||++++.+|...+ T Consensus 148 ~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~i~~~~~~~~~~~~l~G~PV~~~~~v~~~~~ 227 (298) T protein:vir:16 148 EAPRGIADPNGAIENAVELLTGVDADVTGIAINPSFRSALAKQKDLQDNALFPELKWGATPDTINGLPVDVNKTVSDMSL 227 (298) T ss_pred ccccccccHHHHHHHHHHHhhhcCCCccEEEEcHHHHHHHHHhhccCCCeeecCcccCCCCceecceeeEEecccccccC Confidence 11111112356788888888777777789999999999998753111 221 11234457899999999999986433 Q ss_pred c-eEEEEEcC--CeeEEeecCCccceeeeecCC----------cceeEEEEeeEEe---eeeeeeeeccccc Q lcl|NC_015254. 242 V-YTSYIFGE--GAFGLGNGEAPVPTETDREKL----------KGNDILINRQHFL---LHPRGIAWQEKSV 297 (346) Q Consensus 242 ~-ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~----------~g~~~l~~r~~~~---~~~~G~s~~~~~~ 297 (346) . ...+++|. .++.+...+ .+.++..|+.. .+...+....++. ++|..|..-.... 
T Consensus 228 ~~~~~~~~GDfs~~~~~~~~~-~~~~~~~~~~~~~~~~~~~f~~~~v~~ra~~r~d~~v~~~~a~~~l~~at 298 (298) T protein:vir:16 228 TQRDRAIIGDFANGFKWGYAK-EVPLEVIQYGDPDNSGLDLKGYNQVYIRAELFLGWGILDATKFARVTEAN 298 (298) T ss_pred CCccEEEEeeccceEEEEEec-CceEEEeeccCCcCcchhhhhcCcEEEEEEEEEccEeecccceEEEeecC Confidence 2 12344442 444454333 34555554321 1222233333332 3334444432211 No 74 >protein:vir:78223 Length: 333 # NCBI annotation: Putative major head protein # Family: family:all:966 # MgeID: mge:1849 # MgeName: Bethlehem # Cross-refs: genbank:acc:YP_001491666;genbank:gi:157786490;genbank:GeneID:5625701 Probab=99.16 E-value=1.3e-11 Score=80.27 Aligned_cols=282 Identities=11% Similarity=0.050 Sum_probs=147.0 Q ss_pred Cccceec----ceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRM----NLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~----~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) ||.-..+ |.|. +....-+.+ +|+.+..-+.+...+.+.+.+.+ .....++..+.+|.+... . T Consensus 4 l~el~~~~~~~~~~g---~~~~~~~~l-iP~~~~~~ii~~l~~~s~l~~~~---------~~~~~~~~~~~~p~~~~~-~ 69 (333) T protein:vir:78 4 LNELLPNSAGSNHQG---RLAHVPSDL-LPKEIVGPIFDKAQESSLVLRMG---------EQIPISYGETIIPTTVKR-P 69 (333) T ss_pred hHHhhhhcccccccC---ceecCCccc-cchhHHHHHHHHHHhhchhhhhc---------ceeeccCCceEEEEEeCC-c Confidence 3332222 1111 111122334 56666555555555555443321 112235677789998764 3 Q ss_pred cccccCCCc-------cccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH---- Q lcl|NC_015254. 77 EDEILDDGE-------GALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIA---- 145 (346) Q Consensus 77 ~ae~~~dg~-------~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla---- 145 (346) .+.-+.|++ +.++..+.+-.+-....++.+.-..++++...-+..+..+.+.+++++.+.+..++.+|. 
T Consensus 70 ~a~~v~eg~~~~~~e~~~~~~~~~~f~~i~l~~~kl~~~~~is~ell~~s~~~~~~~i~~~la~ai~~~~d~~~l~G~g~ 149 (333) T protein:vir:78 70 EVGQVGVGTSNEQREGGLKPLSGTAWDTRSVSPIKLATIVTVSEEFARMNPSGLYTKLQGDLAYAIGRGIDLAVFHGKSP 149 (333) T ss_pred eeEeecCcccccccccccccccccceeEEEEeeEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHHhcccCC Confidence 332333321 234444555555555556666677888877666667888999999999999999998773 Q ss_pred ----HHHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCcc-ccCceEEEEchHHHHHHHhhhhh-h----hcc Q lcl|NC_015254. 146 ----SLNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDA-EGKLTGIAMHSQTEMNLRKQGLI-E----FML 215 (346) Q Consensus 146 ----~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~-~~~~~~ivmhS~~~~~L~~~~li-~----~~~ 215 (346) ..+|+...+.....+. .........++++.|.++..++... .....+++|||..+..|++.... + ++. T Consensus 150 ~~~~~~~g~~~~~~~~~~~~--~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~~~~~~d~~G~~i~ 227 (333) T protein:vir:78 150 LTGSALQGIDTDNVIANTTN--VDYLQETGDPLLDRLLDGYDLVSANTDVEFNGWAVDPRFRAHLLRAQAYRDANGNVDP 227 (333) T ss_pred CCCccccccccccccccccc--ccccccccchhHHHHHHHHHhhccccccCceEEEEcchHHHHHHHHhhhcCCCCceee Confidence 1222222111111111 1112233456788999998887544 34556899999999999875432 1 222 Q ss_pred c--ccCceeeEEeceEEEEeCCCccCCC----ceEEEEEcCCe-eEEeecCCccceeeeecCC-------------ccee Q lcl|NC_015254. 216 D--SDNKKFPTYMGKRVIVDDGLPAKDG----VYTSYIFGEGA-FGLGNGEAPVPTETDREKL-------------KGND 275 (346) Q Consensus 216 ~--s~~~~i~~~~G~~VVvdD~~p~~~g----~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~~-------------~g~~ 275 (346) . ..++.-++++|+||++++.||...+ ....++++.-. +.++.-+ .+.++.+|+.. .++. T Consensus 228 ~~~~~~~~~~~l~G~Pv~~~~~i~~~~~~~~~~~~~~~~gD~~~~~~g~~~-~~~i~~~~~~~~~~~~~~~~~~~~~~~v 306 (333) T protein:vir:78 228 SRINLAAQTGDVLGLPAQFGRAVGGDLGAAVDSKTRIIGGDFSQLKFGFAD-EIRIKMSDTATLTDSGSATVSMWQTNQI 306 (333) T ss_pred cCccccCCCceeeceeeEEccccCCCccccCCCccEEEEEecccEEEEEee-ccEEEEeccccccccccceeehhhcCcE Confidence 1 2234457899999999999986532 11123333221 2233222 34455555431 1122 Q ss_pred EEEEeeEEeeee---eeeeeccccccCCCC Q lcl|NC_015254. 
276 ILINRQHFLLHP---RGIAWQEKSVAGHSP 302 (346) Q Consensus 276 ~l~~r~~~~~~~---~G~s~~~~~~~~~sP 302 (346) .+....++.+++ ..|..-.. ...| T Consensus 307 ~~r~~~r~d~~v~~~~a~~~l~~---~~a~ 333 (333) T protein:vir:78 307 AILIEVTFGWLLGDKQAFVKFVD---DEQP 333 (333) T ss_pred EEEEEEEEccEEecccceEEEec---cCCC Confidence 233333333333 22322211 1234 No 75 >protein:vir:94771 Length: 298 # NCBI annotation: major head protein # Family: family:all:966 # MgeID: mge:1529 # MgeName: phi LC3 # Cross-refs: genbank:acc:NP_996706;genbank:gi:45597421;genbank:GeneID:2769044 Probab=99.16 E-value=1.7e-11 Score=79.65 Aligned_cols=265 Identities=14% Similarity=0.040 Sum_probs=146.1 Q ss_pred ceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhccc Q lcl|NC_015254. 16 KNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISA 95 (346) Q Consensus 16 ~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~ 95 (346) =+|.-..+ +|+.+..-+.+...+.+.+.+-. .....++..+++|.+..- +.+..+.|++ .++..+.+- T Consensus 1 ma~~gG~l-ip~~~~~~ii~~~~~~s~i~~~~---------~~~~~~~~~~~~p~~~~~-~~a~~v~Eg~-~~~~~~~~f 68 (298) T protein:vir:94 1 MVLNKGTL-FDPELVTDLISKVAGKSSIARLS---------AQKPIPFNGEKVFTFTMD-SEIDVVAESG-KKTHGGVTL 68 (298) T ss_pred Ceeccccc-cChhHHHHHHHHHHhhchhhhhc---------ceeeccCCceEEEEEecC-cceEEeeCCc-cccccccce Confidence 12333444 44444444445544444332211 112234556789988643 4555677874 677777776 Q ss_pred ceeEEEEEeecCcceechHHHhhhc---chHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhh----------hhccee Q lcl|NC_015254. 96 AKDIARLHMRGKAWRTNDLAKALSG---DDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGAL----------DSNKLD 162 (346) Q Consensus 96 ~~~~a~~~~~~k~~~~tD~a~~~~g---~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~----------~~~~~d 162 (346) .+.....++.+.-+.++++....+. .+..+.+.+++++.+.+.++..+|. |.-..++. ...+.. 
T Consensus 69 ~~v~l~~~k~~~~~~iS~ell~~~~~~~~~l~~~i~~~la~ai~~~~d~~~l~---G~~~~~g~~~~~~~~~~~~~~~~~ 145 (298) T protein:vir:94 69 APQTMVPIKVEYGARISDEFMYASDEEKINILQAFNDGFAKKVARGIDLMAFH---GVNPRLGTASAVIGTNHFDSKVTQ 145 (298) T ss_pred eEEEEeeeEEEEeeehhHHHhccCCccHHHHHHHHHHHHHHHHHHHHHHHhhc---ccccCCCccccccccccccccccc Confidence 6666667777777888888654333 3567789999999999999987664 32111110 000000 Q ss_pred eeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhc--ccccCceeeEEeceEEEEeCCCcc Q lcl|NC_015254. 163 VSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFM--LDSDNKKFPTYMGKRVIVDDGLPA 238 (346) Q Consensus 163 is~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~--~~s~~~~i~~~~G~~VVvdD~~p~ 238 (346) ....++.....++.|.++..++-+...+..+|+|||+.+..|++..-- .++ ....++.-++++|+||++++.+|. T Consensus 146 -~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~l~~~~~~~~~~~tl~G~PV~~~~~v~~ 224 (298) T protein:vir:94 146 -KVEAPRGIADPNGAIENAVELLTGVDADVTGIAINPSFRSALAKQKDLQGNALFPELKWGATPDTINGLPVDVNKTVSD 224 (298) T ss_pred -ccccccccccHHHHHHHHHHhhhhcCCCccEEEEcHHHHHHHHHhhccCCCeeecCcccCCCCceecceeeEEeccccc Confidence 011112222346789999999887777778999999999999875311 111 112234456899999999999986 Q ss_pred CCCce-EEEEEcC--CeeEEeecCCccceeeeecCC----------cceeEEEEeeEEeee---eeeeeeccccccCCCC Q lcl|NC_015254. 239 KDGVY-TSYIFGE--GAFGLGNGEAPVPTETDREKL----------KGNDILINRQHFLLH---PRGIAWQEKSVAGHSP 302 (346) Q Consensus 239 ~~g~y-tt~l~~~--GAi~~~~~~~~~~vE~dRd~~----------~g~~~l~~r~~~~~~---~~G~s~~~~~~~~~sP 302 (346) ..+.- ...+++. .++.|+..+ .+.+|..|... .+...+....++.+. |..|..-... T Consensus 225 ~~~~~~~~~~~Gdfs~~~~~~~~~-~~~~~~~~~~~~d~~~~~~f~~~~v~~r~~~r~~~~~~~~~a~~~l~~~------ 297 (298) T protein:vir:94 225 MSLTQRDRAIIGDFANGFKWGYAK-EVPLEVIQYGDPDNSGLDLKGYNQVYIRAELFLGWGILDATKFARVTEA------ 297 (298) T ss_pred ccCCCccEEEEeeccceEEEEEec-CceEEEeecCCCcCcchhhhhcCcEEEEEEEEeccEeecccceEEEEec------ Confidence 54321 2344453 334454333 35566655421 122223333443322 2233322111 Q ss_pred C Q lcl|NC_015254. 
303 T 303 (346) Q Consensus 303 t 303 (346) | T Consensus 298 t 298 (298) T protein:vir:94 298 N 298 (298) T ss_pred C Confidence 1 No 76 >protein:vir:3136 Length: 322 # NCBI annotation: hypothetical protein # Family: family:all:11728 # MgeID: mge:64 # MgeName: VpV262 # Cross-refs: genbank:acc:NP_640318;genbank:gi:21234405;genbank:GeneID:956058 Probab=99.16 E-value=5e-12 Score=82.52 Aligned_cols=278 Identities=11% Similarity=-0.004 Sum_probs=160.4 Q ss_pred eecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccch Q lcl|NC_015254. 11 KFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTP 90 (346) Q Consensus 11 ~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~ 90 (346) |-..|.+-.....|.||+|.+.+..-+.+++.+.. ++..... .-||+|+||-..... ..+ ++..++++. T Consensus 1 ~~~~n~ts~~qafi~~EiWsa~il~~l~~~Lv~~~---~~~~~d~-----g~GDtV~InsIg~~t--V~d-Y~~~~~i~~ 69 (322) T protein:vir:31 1 MSTGNNTSNTQALIVSEIWADEIEDILHEKLLDVN---IARVVDF-----PDGDKLTIPSVGTPV--VRS-RPEQGDFTF 69 (322) T ss_pred CCCCCCcccceEEeehhhhHHHHHHHhhhhhhhhh---hhccccc-----CCCCeEEeccccccc--ccc-ccCCCCccc Confidence 33344455556678899999999876666553311 1221111 249999998877662 223 333468899 Q ss_pred hhcccceeEEEEE-eecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHH-hhhhhhhhhh--ccee---e Q lcl|NC_015254. 91 GNISAAKDIARLH-MRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLN-GITASGALDS--NKLD---V 163 (346) Q Consensus 91 ~~lt~~~~~a~~~-~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~-G~~~~~~~~~--~~~d---i 163 (346) +.+++.+..-++= ..-.+|.+.| ...-..+|.+..+.++.+...++.+|..+...|+ |+-..+.... +..+ . 
T Consensus 70 d~ltt~~~~l~IDq~KYfaf~VdD-D~~Qa~~dl~~~~~~~aa~ala~~~D~fva~lL~~gA~~~~~~~~p~vin~~~~~ 148 (322) T protein:vir:31 70 DNLDTGEISIILRDEVYAGNAISK-KLRQDSRWISNVGAMLPAEQARAIMERYQTDLLALGNAQFAGQNDPNVINGVPHR 148 (322) T ss_pred ccCCCceEEEEEehhhhhccccch-hHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhccCCcceecCCccc Confidence 9998886654432 2345678999 5566788999999999999999999999887676 3321112111 1111 1 Q ss_pred eccccccccccHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhhh-------hcccccCce------eeEEece Q lcl|NC_015254. 164 STETGDDSYFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLIE-------FMLDSDNKK------FPTYMGK 228 (346) Q Consensus 164 s~~~~~~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li~-------~~~~s~~~~------i~~~~G~ 228 (346) -..+++...+.++.|.++..+|.+.. ..=+.+|+.|.+++.|+.-..+. +...-++|. ++..+|. T Consensus 149 iv~~gt~~~~ay~~lv~l~~kLdkanVP~~gR~vVV~P~~~~~L~~i~~~~~l~~D~rf~~i~~sG~a~g~~~Vg~~~GF 228 (322) T protein:vir:31 149 FVGTGTDQTMDVTDFSRVNYVMTQSKMPMGGMIGIIDPSVAHHLETITNISNISNNPRWEGIVESGIAPDMQFVRSVYGI 228 (322) T ss_pred eeccCCCchhhHHHHHHHHHHhccccCCCCCeEEEeCchhhhhhhhhhhhhhhhccccccccccccchhhHHHHHHHhce Confidence 13356677789999999999997755 23478999999998886543221 122222332 7899999 Q ss_pred EEEEeCCCccCCCceEEEEEcCCeeEEe-------------------ecCCccceeeeecCCcceeEEEEeeEEeeeeee Q lcl|NC_015254. 229 RVIVDDGLPAKDGVYTSYIFGEGAFGLG-------------------NGEAPVPTETDREKLKGNDILINRQHFLLHPRG 289 (346) Q Consensus 229 ~VVvdD~~p~~~g~ytt~l~~~GAi~~~-------------------~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G 289 (346) +|++|+.+|. +.|+++.=..|+.... -.+.--..|-.|+..+-.+....+..|+..+. T Consensus 229 ~V~~SN~l~~--~~~~i~aG~d~~~t~ag~~n~f~~~~~~~~~~~~~~~~~l~~~e~~r~~~~~~d~~~~~~~~g~g~~- 305 (322) T protein:vir:31 229 DLFVSNLLAD--ANETINAGGDARSTTAGKCNMFMNVSDMGLLPFVVAWKEMPTTKSFIDDYNDDLNTATTARWGNGLV- 305 (322) T ss_pred eeeeeccccc--cccccccCcccccccceeecccccccchhhhhhhhHhhhhhhhhcccCccccccceeeeeeecceee- Confidence 9999998864 3333332222222110 00111124556666555444444444443321 Q ss_pred eeecccc---ccCCCCChH Q lcl|NC_015254. 
290 IAWQEKS---VAGHSPTNT 305 (346) Q Consensus 290 ~s~~~~~---~~~~sPt~a 305 (346) .-++- .+...|+-= T Consensus 306 --r~e~l~~~~a~~~~~~~ 322 (322) T protein:vir:31 306 --RDENLVCVLANADKVTF 322 (322) T ss_pred --cccceEEEEeccccccC Confidence 11100 001111110 No 77 >protein:vir:100247 Length: 425 # NCBI annotation: gp76 # Family: family:all:21 # MgeID: mge:1619 # MgeName: Bcep176 # Cross-refs: genbank:acc:YP_355412;genbank:gi:77864702;genbank:GeneID:3725969 Probab=99.15 E-value=7e-12 Score=81.75 Aligned_cols=285 Identities=12% Similarity=0.013 Sum_probs=146.3 Q ss_pred Cccce-------ecce-------eeecCCc--eeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCc Q lcl|NC_015254. 1 MIKKL-------RMNL-------QKFAAGK--NTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGN 64 (346) Q Consensus 1 ~~~~~-------~~~~-------q~~~a~~--~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ 64 (346) .+.+. +-.| +..++-+ ++.-.-..+|+-+..-+.+...+.+.+.+..-+ ...++. T Consensus 102 ~~~~~~~~~~~~~~af~~~l~~~e~~~al~~~t~~~gG~lvP~~~~~~ii~~~~~~s~l~~l~~~---------~~~~~~ 172 (425) T protein:vir:10 102 ANGVKPLRDPEYTEAFKAHVKRGDVQAALNKGEDSEGGYLTPIEWDRTITNKLVLISPMRQLCRV---------QPVSKA 172 (425) T ss_pred cccccccccHHHHHHHHHHhhhhhhHHHhhcCcCCCCceeccHhHHHHHHHHHHhhhhhhhhcee---------eeccCC Confidence 00000 0000 0011111 111112356777766666666555555442111 122344 Q ss_pred EEEecccccCCCcccccCCCccccchhhc-ccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 65 MINMPFWQDLTGEDEILDDGEGALTPGNI-SAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVL 143 (346) Q Consensus 65 ti~~P~~~~l~g~ae~~~dg~~~it~~~l-t~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~l 143 (346) ..++|....- ..+.-+.|+. .++..+. 
+-.+..-..++.+.-+.++++...-+.-|..+.+.+++++.+.+..+..+ T Consensus 173 ~~~~~~~~~~-~~a~wv~E~~-~~~~~~~~~f~~v~~~~~k~~~~i~iS~ell~ds~~~l~~~i~~~la~ai~~~~d~~~ 250 (425) T protein:vir:10 173 GFSKLFNMGG-TTSGWVGEAS-QRPQTNAATFQPLSFASGEIYANPAATQQILDDAEIDLESWLATEVQTEFAKQEGKAF 250 (425) T ss_pred ceEEEEEcCC-cceeeecccc-ccccccccccceeeeeheeeEeehHhHHHHHhcchhHHHHHHHHHHHHHHHHHHHhhh Confidence 5677765432 3444456764 4443333 33333334444544556666655445557788999999999999999976 Q ss_pred HHH-----HHhhhhhhhhh----hcc--eeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh- Q lcl|NC_015254. 144 IAS-----LNGITASGALD----SNK--LDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI- 211 (346) Q Consensus 144 la~-----L~G~~~~~~~~----~~~--~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li- 211 (346) |.- -.|++...... .+. ......++....++++.|.+....+......-.+|+||+.++..|++..-- T Consensus 251 l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~l~~l~~~l~~~~~~~a~~vmn~~~~~~L~~lkD~~ 330 (425) T protein:vir:10 251 LAGDGTNKPNGLLTYIAGGANAAKHPFGAIEVVNSGAAADITSDGIIDLVYDLPSAFTGNARFAMNRNTQRQVRKLKDGQ 330 (425) T ss_pred hcccCCCCcceeeeccccccccccccccccccccccccccccHHHHHHHHhhhhhhhccCCEEEEchHHHHHHHHhhcCC Confidence 641 01111111110 000 011122334566889999998888877776777899999999998865311 Q ss_pred -hhcc--cccCceeeEEeceEEEEeCCCccCCCceEEEEEc--CCeeEEeecCCccceeeeecC--CcceeEEEEeeEEe Q lcl|NC_015254. 212 -EFML--DSDNKKFPTYMGKRVIVDDGLPAKDGVYTSYIFG--EGAFGLGNGEAPVPTETDREK--LKGNDILINRQHFL 284 (346) Q Consensus 212 -~~~~--~s~~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~--~GAi~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~ 284 (346) .++. ...++.-++++|+||+++|.||.....-..++|| ..++.+.... 
.++..+++ ..+...+....++- T Consensus 331 G~~l~~~~~~~g~~~~l~G~PV~~~~~~p~~~~~~~~i~~Gd~~~~~~i~~~~---~~~v~~d~~~~~~~~~~~~~~r~d 407 (425) T protein:vir:10 331 GNYLWQPSYVAGQPATLAGYPVTEVPDMPDVAANSTPILFGDFQQTYLIIDRI---GVRVLRDPYTAKPYVLFYTTKRVG 407 (425) T ss_pred CceeeccCccCCCCceecceeeEEecCcCCccCCccEEEEEehhccEEEEEec---ceEEEecccccCCcEEEEEEEEec Confidence 2222 2223444689999999999999643322345554 3344443322 23444444 44555566555554 Q ss_pred eee---eeeeeccccccCC Q lcl|NC_015254. 285 LHP---RGIAWQEKSVAGH 300 (346) Q Consensus 285 ~~~---~G~s~~~~~~~~~ 300 (346) ..+ ..|....... .+ T Consensus 408 ~~v~~~~A~~~l~~~a-s~ 425 (425) T protein:vir:10 408 GGLLNPEPMRAMKVAA-SE 425 (425) T ss_pred cEeecccceEEEEeec-cC Confidence 433 2333221111 11 No 78 >protein:vir:4953 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:108 # MgeName: Sfi19 # Cross-refs: genbank:acc:NP_049929;genbank:gi:9632900;genbank:GeneID:1262076 Probab=99.15 E-value=1.8e-11 Score=79.52 Aligned_cols=284 Identities=11% Similarity=0.080 Sum_probs=152.9 Q ss_pred Cccceecceee-ec--CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCc Q lcl|NC_015254. 1 MIKKLRMNLQK-FA--AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGE 77 (346) Q Consensus 1 ~~~~~~~~~q~-~~--a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ 77 (346) +.+.++-..+. +. +..++.-...++|+.+...+.+...+.+.+.+- .......+..| .+.+|.+..-++. T Consensus 94 ~~~~l~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~~------~~~~~~~~~~~-~~~~~~~~~~~~~ 166 (397) T protein:vir:49 94 FKNLVRGRYQNLLDSKTDASGSDAGLTIPQDIQTAIHTLVSQYDSLQEY------VNVENVTTLTG-SRVYEKWTDITGL 166 (397) T ss_pred HHHHHhcchhHHHHHhhccccccCcccccHhHHHHHHHHHHhhhhHHhh------hceeecccCcc-ceEEEeeccCCcc Confidence 11111111111 11 112223345678998888787776666655331 11111111123 3445666555556 Q ss_pred ccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhh Q lcl|NC_015254. 
78 DEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGAL 156 (346) Q Consensus 78 ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~ 156 (346) +..+.|+. .++. ...+-.+-....++.+.-+.++++...-+.-|....+.+++++.+.+..+..+|.-.. . T Consensus 167 a~~v~E~~-~~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~ai~~G~g----~--- 238 (397) T protein:vir:49 167 ANIDDEAG-KIADVDDPKLSLIKYTIKRYAGISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILEAIA----A--- 238 (397) T ss_pred eeeecCcc-ccccccccceeeEEeeeeeEEeeehhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHHhhcc----c--- Confidence 66778875 4443 3455555555666666667777776655556788889999999999999987665321 0 Q ss_pred hhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhccc--ccCceeeEEeceEEEE Q lcl|NC_015254. 157 DSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLD--SDNKKFPTYMGKRVIV 232 (346) Q Consensus 157 ~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~--s~~~~i~~~~G~~VVv 232 (346) . ......++++.+.++...+-.......+|+||+.++..|++..-- .++.. ..++.-++++|+||++ T Consensus 239 ~---------~~~~~~~~~d~i~~~~~~l~~~~~~~a~~vmn~~~~~~l~~lkd~~G~~l~~~~~~~~~~~~l~G~PV~~ 309 (397) T protein:vir:49 239 L---------PTKPTLTKWDDIIDLEAKVDPAIKQTSFFLTNTSGFTALKKVKNALGDYLMERDVKSPTGYSIDGFAVKE 309 (397) T ss_pred c---------ccccccccHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHhhcCCCceeeccCcCCCCCceecceeeEE Confidence 0 011223578999999998877777778999999999999875311 12221 2234456899999998 Q ss_pred eCC--CccCCCceEEEEEc--CCeeEEeecCCccceeeeecC----CcceeEEEEeeEEeeee---eeeeecc-ccccCC Q lcl|NC_015254. 233 DDG--LPAKDGVYTSYIFG--EGAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLHP---RGIAWQE-KSVAGH 300 (346) Q Consensus 233 dD~--~p~~~g~ytt~l~~--~GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~~---~G~s~~~-~~~~~~ 300 (346) .+. +|.....-..++|+ ..++.+... ..+.++.++.. ..+...+....++.+.+ .+|..-. ++++.. 
T Consensus 310 ~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~-~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~~~ 388 (397) T protein:vir:49 310 VADRWLANGTGGAMPLYFGDLKQAVTLFDR-QHMSLLSTNIGGGAFETDTTKVRVIDRFDVVATDTEAFVPASFKAIADQ 388 (397) T ss_pred ecccccccccCCceeEEEeeccceEEEEee-cceEEEEeccccchhhcCceeEEEEeeeCcEEecccceEEEEeecccCC Confidence 654 34332222345555 223444332 23445554422 23444455544444333 3333221 122222 Q ss_pred CCChHHhcC Q lcl|NC_015254. 301 SPTNTEIEK 309 (346) Q Consensus 301 sPt~a~L~~ 309 (346) .|+..-++- T Consensus 389 ~~~~~~~~~ 397 (397) T protein:vir:49 389 KGNLGSTAV 397 (397) T ss_pred CCCcccccC Confidence 222222222 No 79 >protein:vir:8885 Length: 347 # NCBI annotation: major capsid protein A # Family: family:all:975 # MgeID: mge:161 # MgeName: gh-1 # Cross-refs: genbank:acc:NP_813774;genbank:gi:29366729;genbank:GeneID:1258837 Probab=99.15 E-value=2.4e-12 Score=84.33 Aligned_cols=289 Identities=12% Similarity=0.040 Sum_probs=166.1 Q ss_pred Cc-cceecc--eeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCc Q lcl|NC_015254. 1 MI-KKLRMN--LQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGE 77 (346) Q Consensus 1 ~~-~~~~~~--~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ 77 (346) |- ..-.-| .+.=.+++..---++++ |+|...|...+.+.+.|...- + ... -.+|+++++|..+.... T Consensus 1 ~a~~~~~~~~~~~~g~~~~~~d~~al~i-e~~~geV~~~f~~~s~~~~~~---~---~r~--i~~G~sv~~~~iG~~~~- 70 (347) T protein:vir:88 1 MANATGGQQIGANQGKGQSAADKLALFL-KVFGGEVLTAFVRRSVTMDKH---M---VRT--IQNGKSASFPVMGRTKG- 70 (347) T ss_pred CCCcccchhhhccCCCCccccchHHHHH-HHHHHHHHHHHHHHhhhhhcc---c---ccc--ccCcceEEEeeecceee- Confidence 32 111111 11111222233235666 999999988888777663311 1 111 23699999998887643 Q ss_pred ccccCCCccccc--hhhcccceeEEEEEe-ecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhh Q lcl|NC_015254. 
78 DEILDDGEGALT--PGNISAAKDIARLHM-RGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASG 154 (346) Q Consensus 78 ae~~~dg~~~it--~~~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~ 154 (346) .....|. .+. ...+...+..-++=. .-..+.+.|+.......|++.+++++.+.++++..|+.++..|....... T Consensus 71 -~~~~~g~-~l~~~~~~~~~~~~~i~ID~~~y~~~~Vdd~D~~q~~~D~r~~~~~~~g~aLA~~~D~~i~~~l~~~a~~~ 148 (347) T protein:vir:88 71 -YYLAPGE-NLDDKRKDIKHSEKVIQIDGLLTSDVLIYDIEDAMNHYDVRAEYSAQLGEALAIAADGAVLAEMAKLCNLP 148 (347) T ss_pred -eeecccc-CCCCCCCCCccceEEEEEechhhhhhhhhhHHHHhhcCCchHHHHHHHHHHHHHHHHHHHHHHHHHhhccc Confidence 3333443 332 245666655544443 34568899999999999999999999999999999999987774333221 Q ss_pred hhh-----h----cceeeecc-cccc----ccccHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhhhhcccc- Q lcl|NC_015254. 155 ALD-----S----NKLDVSTE-TGDD----SYFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLIEFMLDS- 217 (346) Q Consensus 155 ~~~-----~----~~~dis~~-~~~~----~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li~~~~~s- 217 (346) ... + ....+... ...+ ...-++.|.+|..+|.+.. ..-..+++.|..|..|++...+....+. T Consensus 149 ~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~~~~~~ 228 (347) T protein:vir:88 149 AASNENIAGLGQAVVLNIGAAADLVDVEARGKAILKGLTLARARLTKNYVPAGDRRFYCAPEDYSAILSALMPNAANYAA 228 (347) T ss_pred cccccccCCccccccccccccccccchhhhHHHHHHHHHHHHHHHhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc Confidence 110 0 00000000 0000 1112577888888886654 2458999999999999875433322211 Q ss_pred ----cCceeeEEeceEEEEeCCCccCCCceE-------------------------------EEEEcCCeeEEeecCCcc Q lcl|NC_015254. 218 ----DNKKFPTYMGKRVIVDDGLPAKDGVYT-------------------------------SYIFGEGAFGLGNGEAPV 262 (346) Q Consensus 218 ----~~~~i~~~~G~~VVvdD~~p~~~g~yt-------------------------------t~l~~~GAi~~~~~~~~~ 262 (346) ..|.++.++|.+|+.+.++|++..... .+.+-+-|++.....+ . 
T Consensus 229 ~~~~~~G~vg~i~G~~V~~s~nlp~~~~~~~~~~~~~~~t~~~~~~~~~~~~~~~~d~~~~~~l~~~~~a~g~v~~~d-~ 307 (347) T protein:vir:88 229 LIDPETGNIRNVMGFEVIEVPHLTVGGAGDNNPADGVAPTNQKHIFPATATGDDRVAQNNVVGLFNHRSAVGTVKLKD-M 307 (347) T ss_pred ccchhcceeeeeccceEEEeecccccccccccccccccccccccccccccccccccccCcEEEEEechhhhhheeccc-c Confidence 246789999999999999996432111 1222334444444343 3 Q ss_pred ceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCCh Q lcl|NC_015254. 263 PTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTN 304 (346) Q Consensus 263 ~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~ 304 (346) .+|..|++....+.+...+.|+..+.= +.-..+--.+++- T Consensus 308 ~~e~~r~~~~~~d~i~~~~~~G~~~~r--Pe~a~~~~~~~a~ 347 (347) T protein:vir:88 308 ALERARRPEFQADQIIGKYAMGHGGLR--PEAAGALVFTPAA 347 (347) T ss_pred eeeeeechhhHHHHhhhhhhhcCceec--cceEEEEEeCCCC Confidence 588889998888877777766655531 1000000001111 No 80 >protein:vir:80684 Length: 315 # NCBI annotation: gp6 # Family: family:all:966 # MgeID: mge:1884 # MgeName: PA6 # Cross-refs: genbank:acc:YP_001285582;genbank:gi:148727088;genbank:GeneID:5247055 Probab=99.15 E-value=2e-11 Score=79.28 Aligned_cols=291 Identities=13% Similarity=0.069 Sum_probs=147.9 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 
13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|..++.-....+|+.+..-+.+...+.+.+.+-+-+ ...++..+++|.+..- +.+.-+.|++ .++..+ T Consensus 1 Ma~~~~~~gg~~vP~~~~~~ii~~l~~~s~i~~l~~~---------i~~~~~~~~ip~~~~~-~~a~wv~Eg~-~~~~s~ 69 (315) T protein:vir:80 1 MADDFLSAGKLELPGSMIGAVRDRAIDSGVLAKLSPE---------QPTIFGPVKGAVFSGV-PRAKIVGEGE-VKPSAS 69 (315) T ss_pred CCCCcCCcCceEcchHHHHHHHHHHHhhchhhhhcce---------eecCCCceEEEEEeCC-cceEEeeCCc-cccccc Confidence 1344566677889999988777777777666443211 2234567899998753 4555667874 667666 Q ss_pred cccceeEEEEEeecCcceechHHHhhhcchH---H-HHHHHHHHHHHHHHHHHHHHHHHHhhhhhh--hhhhcc--eeee Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSGDDP---M-RAIGDLVVEYWNRRRQAVLIASLNGITASG--ALDSNK--LDVS 164 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp---~-~~i~~q~a~~~~~~~~~~lla~L~G~~~~~--~~~~~~--~dis 164 (346) .+-++-....++.+.-..++++....+..|. + ..+.+++++++++++++.+|. |.-..+ ...... ...+ T Consensus 70 ~~f~~v~l~~~kl~~~~~iS~ell~~s~~~~~~~l~~~i~~~la~ai~~~~d~a~~~---G~~~~~~~~~~~~~~~~~~~ 146 (315) T protein:vir:80 70 VDVSAFTAQPIKVVTQQRVSDEFMWADADYRLGVLQDLISPALGASIGRAVDLIAFH---GIDPATGKAASAVHTSLNKT 146 (315) T ss_pred cceeeeEeeeeeEEeeehhhHHHhhcCchhHHHHHHHHHHHHHHHHHHHHHhhheee---ccCCCCCccccccccccccc Confidence 6655555555555555667666554444443 2 568888999999988886663 321100 111000 0001 Q ss_pred ccccccccccHHHHHHHHHHhC-ccccCceEEEEchHHHHHHHhhhhhh-------hcc-cccCceeeEEeceEEEEeCC Q lcl|NC_015254. 165 TETGDDSYFTGDTFLSATYKLG-DAEGKLTGIAMHSQTEMNLRKQGLIE-------FML-DSDNKKFPTYMGKRVIVDDG 235 (346) Q Consensus 165 ~~~~~~~~~~~~~l~~A~~~~G-D~~~~~~~ivmhS~~~~~L~~~~li~-------~~~-~s~~~~i~~~~G~~VVvdD~ 235 (346) ..........+..|.++..++- .......+|+|||+++..|++...-+ ++. ....+.-++++|+||++++. 
T Consensus 147 ~~~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~imn~~~~~~L~~l~~~~g~~~~g~~~~~~~~~g~~~tl~G~PV~~~~~ 226 (315) T protein:vir:80 147 KNIVDATDSATADLVKAVGLIAGAGLQVPNGVALDPAFSFALSTEVYPKGSPLAGQPMYPAAGFAGLDNWRGLNVGASST 226 (315) T ss_pred cceeeccccchHHHHHHHHHHhhccCccceEEEEcHHHHHHHHHHhhccCCcccccccccccccCCCceecceeeEecCc Confidence 1111112234677888887764 34445568999999999998774211 111 11123346899999999999 Q ss_pred CccCCC----ceEEEEEcCCe-eEEeecCCccceeeeecCCccee--EEEEeeEEeeeeeeeeeccccccCCCCChHHhc Q lcl|NC_015254. 236 LPAKDG----VYTSYIFGEGA-FGLGNGEAPVPTETDREKLKGND--ILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIE 308 (346) Q Consensus 236 ~p~~~g----~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~~~g~~--~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~ 308 (346) ||.... ...-.+||.=. +.|+. ...+.+|..++...... .++.+- .+.++....-+- T Consensus 227 ~~~~~~~~~~~~~~~~~GDfs~~~~g~-~~~~~i~i~~~~~~~~~~~~~~~~~--~v~~r~~~r~~~------------- 290 (315) T protein:vir:80 227 VSGAPEMSPASGVKAIVGDFSRVHWGF-QRNFPIELIEYGDPDQTGRDLKGHN--EVMVRAEAVLYV------------- 290 (315) T ss_pred CCcccccccccccEEEEeecccEEEEE-ecCeeEEEeccccccCcccchhhcC--cEEEEEEEEecc------------- Confidence 986432 11123333211 22333 22345665555321110 111111 111111111000 Q ss_pred CCcCceeeecccccceEEEEEecccccccCCCCCCCC Q lcl|NC_015254. 309 KGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEGI 345 (346) Q Consensus 309 ~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~~ 345 (346) .|.+++++. 
+|+-+ .+++++.|.|- T Consensus 291 ------~v~~~~a~~--~l~~~----~a~~~~~~~~~ 315 (315) T protein:vir:80 291 ------AIESLDSFA--VVKEK----AAPKPNPPAEN 315 (315) T ss_pred ------eeecccceE--EEeec----cCCCCCCCCCC Confidence 012223221 11111 11222233333 No 81 >protein:vir:108303 Length: 418 # NCBI annotation: hypothetical protein # Family: family:all:1412 # MgeID: mge:2007 # MgeName: BA3 # Cross-refs: genbank:acc:YP_001552282;genbank:gi:160700607;genbank:GeneID:5758819 Probab=99.15 E-value=5.4e-11 Score=76.88 Aligned_cols=303 Identities=12% Similarity=0.043 Sum_probs=161.8 Q ss_pred ceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhccc Q lcl|NC_015254. 16 KNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISA 95 (346) Q Consensus 16 ~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~ 95 (346) =.|+--.++-||+|++-+.+.+.+.+.|.+ .+-++ .-.+. ..-||+|++|....+ .+.|+. .++++.++. T Consensus 1 m~~~~N~~ltp~iia~~~l~~l~~~lV~~~--lv~r~-y~~e~-~~~GDTV~I~vp~~~-----~v~dg~-~~~~~~~te 70 (418) T protein:vir:10 1 MAVQDNNLLTDDVIAKEALRLLKNNLVMAK--CVYRN-YEKTF-GKVGDTIRLKLPYRV-----KSASGR-TLVKQPMVD 70 (418) T ss_pred CCccccccccHHHHHHHHHHHHHHhccchh--hhcCC-CchHH-hhCCCEEEEeeCCce-----eecccC-Ccccccccc Confidence 123334456799999999998888877632 33333 22233 345999999986543 223453 688888887 Q ss_pred ceeEEEE-EeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccccc Q lcl|NC_015254. 96 AKDIARL-HMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFT 174 (346) Q Consensus 96 ~~~~a~~-~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~ 174 (346) ++-..++ +....++.++|+...+...|.+.++.++.+..++++++.++++.++++....+ ..+. ..-. 
T Consensus 71 ~~v~l~id~~k~~~~~itD~e~a~~~~d~~~~~l~~A~~aLA~~vD~~ia~l~~~a~~~~g----------t~gt-~~~~ 139 (418) T protein:vir:10 71 QTIPFKIAYQEHVGLEYTVKDKTLDIMQFSERYLKSGMVQIANQIDRSLALTLKKAFHSSG----------TPGV-RPGA 139 (418) T ss_pred ceEEEEEecccccceeechHHHhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccc----------cCCc-Ccch Confidence 7666554 56777899999999999999999999999999999999999988775421111 0111 1124 Q ss_pred HHHHHHHHHHhCcccc--C-ceEEEEchHHHHHHHhhhhhhhccccc-----CceeeEEeceEEEEeCCCcc-CCCceEE Q lcl|NC_015254. 175 GDTFLSATYKLGDAEG--K-LTGIAMHSQTEMNLRKQGLIEFMLDSD-----NKKFPTYMGKRVIVDDGLPA-KDGVYTS 245 (346) Q Consensus 175 ~~~l~~A~~~~GD~~~--~-~~~ivmhS~~~~~L~~~~li~~~~~s~-----~~~i~~~~G~~VVvdD~~p~-~~g~ytt 245 (346) ++.+.+|..+|.+..- . .+.+|+.|..|..|.+.....+.+... .|.|+.+.|.+|++|+.+|. ..|.+.. T Consensus 140 ~~~i~~a~~~Ld~~~VP~~G~R~lVv~P~~~~~L~~~~~~~~~~~~~~~~lr~G~IG~i~GF~V~~S~nip~~tag~~~~ 219 (418) T protein:vir:10 140 FIDFANAGAKQTTYAVPQDGMRHAVLDPFTCASLSDEVTKLFKESMVEQAYKMGYRGNVAAYEVYESQNLPKHTVGDHGG 219 (418) T ss_pred HHHHHHHHHHHHhcCCCCCCceEEEeCHHHHHHHhhhccccccccccchhhheeeeeeeeceEEEEecCCCccccccccc Confidence 7899999999987642 2 478999999999998776543333221 46789999999999999995 4443322 Q ss_pred EEEcCCeeEEeecCCccceeeeecC------CcceeEEEEeeEEeeeeee-------eee--ccc---ccc-----CCCC Q lcl|NC_015254. 246 YIFGEGAFGLGNGEAPVPTETDREK------LKGNDILINRQHFLLHPRG-------IAW--QEK---SVA-----GHSP 302 (346) Q Consensus 246 ~l~~~GAi~~~~~~~~~~vE~dRd~------~~g~~~l~~r~~~~~~~~G-------~s~--~~~---~~~-----~~sP 302 (346) -.+..||-.- ...+-.+.+. +..-|.+.---.|.+|+.. -.| +.. 
..+ ..+| T Consensus 220 t~~v~ga~~~-----~~~~~~~~~t~s~~g~l~~Gd~~ti~gv~~v~~~t~~~~~~~~~f~V~~~~~~~~~~~~tv~i~p 294 (418) T protein:vir:10 220 TPLVNGTVVN-----GDTVGFDGGTASTTGFLKAGDVITFGGVFGVNPQNYETTGLLQEFVVLEDVDTDAGGAGSIKISP 294 (418) T ss_pred ceeeeccccc-----ceeEEEeecceeeccceeeccEEEECceeecccccccccccceEEEEEeeccccccCcceeEecc Confidence 1222222110 0001011111 1111222111111111110 011 000 000 0111 Q ss_pred C-----------hHHhcCCcCce-------------------------eeecccccceEEEEEecccc-cc---cC-CCC Q lcl|NC_015254. 303 T-----------NTEIEKGNNWK-------------------------AVYESKNIRIVAFVHKNGVP-GK---KK-ETA 341 (346) Q Consensus 303 t-----------~a~L~~~~NW~-------------------------~v~~~K~i~iv~~~~k~~~~-~~---~~-~~~ 341 (346) . ..+.-...|.. +++-+..|.++...= ..| +. .. ... T Consensus 295 ~~~~~~~~~~~~~~~~~~~~~~~~v~a~~a~~~~it~~~~a~~~~~~nl~f~~~a~~l~~~~l--~~p~g~~~~~~~~~~ 372 (418) T protein:vir:10 295 SLNDGTATINNENGDPVSLTAYQNVTALPADNAPITVLGAANTTYEQNYLFHRDAIALAMIDL--ELPQSAVIKSRAADP 372 (418) T ss_pred ccccccccccccccccccccCCCcccccccCcceeeeecccccceeeeeeeecceEEEEEeec--cCCCCCCcceEEEec Confidence 1 11111111111 122222222221111 100 00 00 011 Q ss_pred CCCCC Q lcl|NC_015254. 342 PEGIK 346 (346) Q Consensus 342 ~~~~~ 346 (346) ..|+- T Consensus 373 ~~G~s 377 (418) T protein:vir:10 373 ETGLS 377 (418) T ss_pred cCCeE Confidence 12222 No 82 >protein:vir:1328 Length: 392 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:28 # MgeName: phi-C31 # Cross-refs: genbank:acc:NP_047927;swissprot:trembl:q9zwv6;genbank:gi:9631145;uniprot:Q9ZWV6;genbank:GeneID:2715889 Probab=99.14 E-value=1.3e-11 Score=80.33 Aligned_cols=279 Identities=11% Similarity=0.074 Sum_probs=152.2 Q ss_pred Ccc---ceecceeeec---CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccC Q lcl|NC_015254. 1 MIK---KLRMNLQKFA---AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDL 74 (346) Q Consensus 1 ~~~---~~~~~~q~~~---a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l 74 (346) +.. .-...+.... 
+.+++--..++.|+++.+.+.+...+.+.+.+-.-+ .-..++..+.+|....- T Consensus 93 ~r~g~~~~~~~~~~~~~~~~~t~~~~g~~~~~~~~~~~i~~~~~~~~~l~~~~~~--------~~~~~~~~~~~~~~~~~ 164 (392) T protein:vir:13 93 LRAGNLGEARSFEFAPEKRDGTKAGNPNVLSRTLYGQLIAQAVERSAIMRGGAST--------FTTSDANPMDFTVITGR 164 (392) T ss_pred HhccchhhhHHHHhhhhhhcccccCCCccccccchHHHHHHHHhhhhhhhhccee--------eecCCCceeEEEEEcCC Confidence 100 0011111111 111222234788888888887755544333211100 11235778899988763 Q ss_pred CCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-----HHh Q lcl|NC_015254. 75 TGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS-----LNG 149 (346) Q Consensus 75 ~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~-----L~G 149 (346) +.+.-+.|+. .++..+.+.++..-..++.+.-..+++....-+.-|..+.+.+++++.+.+..+..+|.- -+| T Consensus 165 -~~a~~v~E~~-~~~~~~~~f~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~l~G~Gt~~p~G 242 (392) T protein:vir:13 165 -ATAGIVGETA-EIPESYPATTQRSMGGFKYGFASVVSYEFATDQVLDLVGFLVSDAGPAIGDAMGRHFLTGTGTGQPRG 242 (392) T ss_pred -cceeeecccc-cccccccceeeEEeeeeeEEeeehhHHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhcccCCccccc Confidence 4444567874 667667666655555666666667777765555557778899999999999999977731 112 Q ss_pred hhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEE Q lcl|NC_015254. 150 ITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTY 225 (346) Q Consensus 150 ~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~ 225 (346) ++...... .....+.....++++.|.++...+......-.+|+||+.++..|++..-- .++. 
...++.-.++ T Consensus 243 il~~~~~~----~~~~~~~~~~~~~~d~l~~~~~~l~~~~~~~a~~v~n~~~~~~l~~lkd~~G~~l~~~~~~~g~~~~l 318 (392) T protein:vir:13 243 ILTDATGA----NAAFGEADADSKVSDALIDLFHEVPSAYRKNAKFVVNDLRAAQMRKLKDANGQYLWQSALTVGAPDTF 318 (392) T ss_pred cccccccc----cccccccccccccHHHHHHHHHhhhhhhhcCCEEEEcHHHHHHHHHhhccCCceeecCCcCCCCCcee Confidence 22211111 11112233455789999998888766665667899999999998864211 1221 1123334689 Q ss_pred eceEEEEeCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecC--CcceeEEEEeeEEeeee---eeeeeccccccC Q lcl|NC_015254. 226 MGKRVIVDDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHP---RGIAWQEKSVAG 299 (346) Q Consensus 226 ~G~~VVvdD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~---~G~s~~~~~~~~ 299 (346) +|+||++++.||... ++||. ..+.+...+ .+.++.+++. ..+.+.+....++.+.+ ..|.....+.+. T Consensus 319 ~G~Pv~~~~~~~~~~-----i~~Gdf~~~~i~~~~-~~~i~~~~~~~~~~~~~~~r~~~r~d~~~~~~~A~~~~~~~~aa 392 (392) T protein:vir:13 319 NGKVVETDDGMPADK-----VLFADLSKYRVRFAG-SLRVDRSVDAKFSTDQIVYRFLQRADGLLVDARGAKVLTVTPAA 392 (392) T ss_pred cceeeEEcCCCCCCc-----EEEeeccceeEEeec-ceEEEeeccccccCCcEEEEEEEEeccEEecccceEEEEeeccC Confidence 999999999999753 44443 123333222 2334433443 33445555555544433 233322111111 No 83 >protein:vir:2430 Length: 318 # NCBI annotation: major head subunit # Family: family:all:507 # MgeID: mge:52 # MgeName: D29 # Cross-refs: genbank:acc:NP_046832;genbank:gi:9630400;genbank:GeneID:1261582 Probab=99.14 E-value=2.7e-11 Score=78.50 Aligned_cols=287 Identities=9% Similarity=0.036 Sum_probs=156.0 Q ss_pred CccceecceeeecCCce--eeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKN--TRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGED 78 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~--T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~a 78 (346) |-..-+|+.+...+-.+ |.-+-+ +|+.+...+.+...+.+.+.+- ......++..+.+|.+... 
+.+ T Consensus 1 ~~~~~~~~~e~~~~~~~~~~~~~~~-ip~~~~~~ii~~~~~~~~l~~~---------~~~~~~~~~~~~ip~~~~~-~~a 69 (318) T protein:vir:24 1 MAAGTAFAVDHAQIAQTGDTMFKGY-LEPEQAKDYFAEAEKTSIVQQF---------AQKVPMGTTGQKIPHWVGD-VSA 69 (318) T ss_pred CCCCCCCCHHHHHhhcccCccccee-echhHHHHHHHHHHhhchhhhh---------cceeeccCCceEEEEEeCC-cce Confidence 77777777777653222 222344 5666655555555555444331 1122335777999998764 566 Q ss_pred cccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhh--hh Q lcl|NC_015254. 79 EILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASG--AL 156 (346) Q Consensus 79 e~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~--~~ 156 (346) .-+.|+. .++..+.+-.+-....++.+..+.++++...-+..|..+.+.+++++.+.+++++.+|. |.-... .. T Consensus 70 ~~v~Eg~-~~~~~~~~f~~i~~~~~k~~~~~~iS~e~l~ds~~~~~~~i~~~l~~~~~~~~d~a~l~---G~g~~~~~~~ 145 (318) T protein:vir:24 70 QWIGEGD-MKPITKGNMTSQTIAPHKIATIFVASAETVRANPANYLGTMRTKVATAFAMAFDGAAMH---GTDSPFPTYI 145 (318) T ss_pred EEecCCc-cccccccceeEEEEeeEEEEEeehhhHHHhhcChHHHHHHHHHHHHHHHHHHHHHhhhc---ccCCCCCccc Confidence 6678875 67777777666666677788788888877666667889999999999999999997663 321100 00 Q ss_pred hhcceeeecc-ccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcccc--cCc---e--eeEEe Q lcl|NC_015254. 157 DSNKLDVSTE-TGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLDS--DNK---K--FPTYM 226 (346) Q Consensus 157 ~~~~~dis~~-~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~s--~~~---~--i~~~~ 226 (346) ......++.. .........+.+.++............+|+||+..+..|++..-- .++... .++ . -..+. 
T Consensus 146 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~lkd~~G~~l~~~~~~~~~~~~~~~~~i~ 225 (318) T protein:vir:24 146 GQTTKAISIADTTGATTVYDQVAVNGLSLLVNDGKKWTHTLLDDITEPILNGAKDQNGRPLFIESTYGEAASPFRSGRIV 225 (318) T ss_pred ccccccccccccccccchHHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHhhccCCceeecCccccCccccccCceEE Confidence 0001111111 111122333455666666655666667899999999999865311 111111 111 1 24788 Q ss_pred ceEEEEeCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecCCc----------------ceeEEEEeeEEeeeeee Q lcl|NC_015254. 227 GKRVIVDDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREKLK----------------GNDILINRQHFLLHPRG 289 (346) Q Consensus 227 G~~VVvdD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~~~----------------g~~~l~~r~~~~~~~~G 289 (346) |++|++++.+|.+.. ..+++. .-+.++..+ .+.+|..|+... ++..+....++.+.+ T Consensus 226 g~pv~~~~~~~~~~~---~~~~gdfs~~~~~~~~-~l~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v-- 299 (318) T protein:vir:24 226 ARPTILSDHVVEGTT---VGFMGDFSQLIWGQIG-GLSFDVTDQATLNLGTVESPNFVSLWQHNLVAVRVEAEYAFHC-- 299 (318) T ss_pred EEeeEEeCCCCCCcc---EEEEeecceEEEEEec-CeEEEEeeccceeccccccccchhhhhcCcEEEEEEEEEccEE-- Confidence 999999999987642 222332 113344423 456776666421 112222222222221 Q ss_pred eeeccccccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccC Q lcl|NC_015254. 290 IAWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKK 338 (346) Q Consensus 290 ~s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~ 338 (346) .+++++. 
.++++.+.+.++ T Consensus 300 ---------------------------~~~~a~~---~i~~~~a~~~~~ 318 (318) T protein:vir:24 300 ---------------------------NDAEAFV---ALTNVVSGGGEG 318 (318) T ss_pred ---------------------------ecccceE---EEEeeccCCCCC Confidence 1122211 122222222222 No 84 >protein:vir:10364 Length: 390 # NCBI annotation: head protein; major capsid subunit precursor # Family: family:all:585 # MgeID: mge:183 # MgeName: Xp10 # Cross-refs: genbank:acc:NP_858956;genbank:gi:32128421;genbank:GeneID:2648357 Probab=99.14 E-value=1.3e-11 Score=80.23 Aligned_cols=272 Identities=11% Similarity=0.125 Sum_probs=150.9 Q ss_pred Cccce--ecceeeec----CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccC Q lcl|NC_015254. 1 MIKKL--RMNLQKFA----AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDL 74 (346) Q Consensus 1 ~~~~~--~~~~q~~~----a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l 74 (346) +..+. .+.++.+. ...++.-.-++.|+++...+.. ..+.+.+.+ +-.....++..+++|.+... T Consensus 96 ~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~ii~~-~~~~~~l~~---------~~~~~~~~~~~~~~~~~~~~ 165 (390) T protein:vir:10 96 NDRSARATMNIKAALNTASTDAAGSAGALTTPNRLPGFITQ-PDARLTVRD---------LIGSGRTDSALIEYVQETGF 165 (390) T ss_pred hhhhhhhhhHHHHHHHhhhcccccccccccchhHHHHHHHH-HHhhchhhh---------hcceeeccCCceEEEEEecC Confidence 11111 11111111 1111112236778777655543 333333322 11122335667899999766 Q ss_pred CCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH------HH Q lcl|NC_015254. 75 TGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS------LN 148 (346) Q Consensus 75 ~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~------L~ 148 (346) ++.+..+.|+. .++..+.+..+.....+..+.-..+++....-+ .+....+.++++..+.+..++.+|.- .. 
T Consensus 166 ~~~a~~v~Eg~-~~~~~~~~~~~i~~~~~k~~~~~~is~ell~d~-~~l~~~i~~~l~~~~~~~~~~~il~G~G~~~~p~ 243 (390) T protein:vir:10 166 VNNAAIVAEGA-LKPESSLKFAKKTDTTHVIAHTMKATRQILSDA-PQLASYMNNRLIRGLKVKEDAEILRGTGANDGLL 243 (390) T ss_pred CcceeeecCCc-cccccccceeEEEEeeEEEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHhhcCCCCcccc Confidence 56666677875 567667776666666666666667777643323 36677899999999999999876641 22 Q ss_pred hhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhccc-ccCceeeEE Q lcl|NC_015254. 149 GITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLD-SDNKKFPTY 225 (346) Q Consensus 149 G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~-s~~~~i~~~ 225 (346) |++........ .........++.+.++...+.+......+|+||+.++..|++..-- .++.. ..++.-+++ T Consensus 244 Gi~~~~~~~~~------~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~L~~lkd~~g~~l~~~~~~~~~~~l 317 (390) T protein:vir:10 244 GLIPQATTYAA------PTTIAGATRVDQLRLAMLQASLAEYPASGIVINPIDWAAIELAKDANNQYLIGNARGTLTPTL 317 (390) T ss_pred ccccccccccc------cccccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHhhcCCCceeecCCcCcCCcee Confidence 22221111111 1112233467889999998888888888999999999999865311 12211 122334689 Q ss_pred eceEEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecC---CcceeEEEEeeEEeeeee---eeeeccccc Q lcl|NC_015254. 226 MGKRVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK---LKGNDILINRQHFLLHPR---GIAWQEKSV 297 (346) Q Consensus 226 ~G~~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~---~~g~~~l~~r~~~~~~~~---G~s~~~~~~ 297 (346) +|+||++++.||.+. ++++. .++.+.. +..+.++..++. ..+...+....++.+.++ .|..- +. T Consensus 318 ~G~pv~~~~~~p~~~-----~~~gdf~~~~~~~~-~~~~~i~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~~--~~ 389 (390) T protein:vir:10 318 WGLPVVATQAMAPGE-----FLVGAFDLAAQIFD-QWDARVEIGYVNDDFQRNMVTVLAEERLALVVYRPEALISG--SF 389 (390) T ss_pred cceeeEEcCCCCCCc-----EEEEeccceEEEEE-ecceEEEEeecccccccCcEEEEEEEeeccEEeccccEEEE--Ee Confidence 999999999999763 33332 2333322 223456665543 234455555555554443 33221 11 Q ss_pred c Q lcl|NC_015254. 
298 A 298 (346) Q Consensus 298 ~ 298 (346) + T Consensus 390 a 390 (390) T protein:vir:10 390 A 390 (390) T ss_pred C Confidence 1 No 85 >protein:vir:4830 Length: 397 # NCBI annotation: MPL-7201 # Family: family:all:21 # MgeID: mge:105 # MgeName: 7201 # Cross-refs: genbank:acc:NP_038327;genbank:gi:9634653;genbank:GeneID:1262632 Probab=99.14 E-value=2e-11 Score=79.20 Aligned_cols=281 Identities=9% Similarity=0.008 Sum_probs=155.8 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccC--CCcc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDL--TGED 78 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l--~g~a 78 (346) +......-++.. +..++.-.-.++|+.+..-+.+...+.+.+.+-.- ....++...++|++... .+.+ T Consensus 98 ~~~~~~~~~~~~-~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~---------~~~~~~~~~~~~~~~~~~~~~~a 167 (397) T protein:vir:48 98 VRGRYQNLLDSK-TDASGSDAGLTIPQDIQTAIHTLVRQYDSLQEYVN---------VENVTTLTGSRVYEKWADITGLA 167 (397) T ss_pred HhhhhhHHHHHh-hccCCccccccccHHHHHHHHHHHHHHHHHHhhhc---------eeeccCCcceEEEEeecCCCcce Confidence 222222112222 12223334567888888878777777766644211 11123455555555433 2334 Q ss_pred cccCCCccccc-hhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh Q lcl|NC_015254. 79 EILDDGEGALT-PGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD 157 (346) Q Consensus 79 e~~~dg~~~it-~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~ 157 (346) ..+.|+. .++ ..+.+-.+-....+..+.-..+++....-+.-|....+.+++++.+.+..++.++.... .. 
T Consensus 168 ~~v~E~~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~v~~~l~~~~~~~~d~~il~G~g-------~~ 239 (397) T protein:vir:48 168 KLDDEAG-SIGTNDDPKLYPIRYAIKRYAGISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILEAIA-------TL 239 (397) T ss_pred eeecccc-ccccccccceeeEEeeheeeeeehhhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc-------cc Confidence 4556664 343 23344444444555666666777776555566788899999999999999997764321 00 Q ss_pred hcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhccc--ccCceeeEEeceEEEEe Q lcl|NC_015254. 158 SNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLD--SDNKKFPTYMGKRVIVD 233 (346) Q Consensus 158 ~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~--s~~~~i~~~~G~~VVvd 233 (346) ......++++.|.++...+......-.+|+||+.++..|++..-- .++.. ..++.-++++|+||++. T Consensus 240 ---------~~~~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~L~~lkd~~G~~i~~~~~~~~~~~~l~G~PV~~~ 310 (397) T protein:vir:48 240 ---------PTKPTLTKWDDIIDLQAKVDPAIKQTSFFLTNTSGFTALKKVKNAFGDYLMERDVKSPTGYSIDGFAVKEV 310 (397) T ss_pred ---------ccccccccHHHHHHHHHHhhhhhcCCCEEEECHHHHHHHHHhhcCCCceeeccCcCCCCCceeccceeEEe Confidence 011233578999999888877777778999999999999875311 12221 22345578999999986 Q ss_pred CC--CccCCCceEEEEEcC--CeeEEeecCCccceeeeecC----CcceeEEEEeeEEe---eeeeeeeecc-ccccCCC Q lcl|NC_015254. 234 DG--LPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK----LKGNDILINRQHFL---LHPRGIAWQE-KSVAGHS 301 (346) Q Consensus 234 D~--~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~---~~~~G~s~~~-~~~~~~s 301 (346) +. +|.....-.+++|+. .++.+...+ .+.+|.++.. ..+...+....++. .||.+|.+.. ++.+... T Consensus 311 ~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~-~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~~~~ 389 (397) T protein:vir:48 311 ADRWLANASSGAMPLYFGDLKQAVTLFDRQ-QMSLLSTNIGGGAFETDTTKIRVIDRFDVVATDTESFVPASFKAIADQK 389 (397) T ss_pred cccccCCcCCCceEEEEEeccceEEEEeec-ceEEEEeccchhhhhcCceeEEEEeeeccEEecccceEEEEecccccCC Confidence 64 343332234556653 344443322 3556666643 34445555444443 3444554432 1222333 Q ss_pred CChHHhcC Q lcl|NC_015254. 
302 PTNTEIEK 309 (346) Q Consensus 302 Pt~a~L~~ 309 (346) |+...++- T Consensus 390 ~~~~~~~~ 397 (397) T protein:vir:48 390 GNLGSTAV 397 (397) T ss_pred CCccccCC Confidence 33333333 No 86 >protein:vir:102655 Length: 322 # NCBI annotation: Hypothetical protein # Family: family:all:6384 # MgeID: mge:1624 # MgeName: VP2 # Cross-refs: genbank:acc:YP_052979;genbank:gi:50282923;genbank:GeneID:2948122 Probab=99.14 E-value=1.7e-11 Score=79.66 Aligned_cols=289 Identities=13% Similarity=0.029 Sum_probs=162.2 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhh-HhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCC--C- Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTE-RTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLT--G- 76 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~-~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~--g- 76 (346) .|.-+.|=.- =.|++...|+ +.|.+-+.- ...+.++|... +. .+ -...++++++.|--.... + T Consensus 3 ~~~~~~~~~~-----Ms~~i~~~fv-~qy~~~v~~~~qq~~s~L~~t--V~-~~----~~~~~~~~~~~~~~~~~~~~~~ 69 (322) T protein:vir:10 3 LNAIMSMLPL-----IAGDIDQAFV-QTYETTLRILSQQKSAKLKQY--CQ-HK----NESSESHNWETLASMDPDAVKR 69 (322) T ss_pred ccceeeeeee-----eechhhhHHH-HHHHHHHHHHHHHhhhhhhcc--cc-cc----cccccccceeeccccccccccc Confidence 1111111000 0123455555 555543333 33344455332 11 10 122345565544432210 1 Q ss_pred -cc-cccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhh Q lcl|NC_015254. 77 -ED-EILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASG 154 (346) Q Consensus 77 -~a-e~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~ 154 (346) .. +...|+.-+.++..+..+...+.......++.+.|+..+....||.....++.+.+++|+.|+.+++.+.|..... 
T Consensus 70 ~~~~~~~~d~~~dtp~~~~~~~~r~~~~~d~~~~~~VDd~D~~k~~~D~~~~~~~~~a~AL~R~~D~~I~~a~~g~a~~~ 149 (322) T protein:vir:10 70 KRSRQQSADGTYPTPVNNKPFAKRRTNVDTYDTGHVVEQEDISQMLLDPNSALITSQAYAMARKTDDLIIAGAWKPASIK 149 (322) T ss_pred ccccccccCcccCCCccccccceEEEeecccccceecchHHHHHhhcCchHHHHHHHHHHhhhHHHHHHHhhhhcccccc Confidence 11 1223333244445556666666666667778888999999999999999999999999999999998877654322 Q ss_pred hhhhcceeeec--cccccccccHHHHHHHHHHhCccc--cC-ceEEEEchHHHHHHHhhhhhhhcccc------cCceee Q lcl|NC_015254. 155 ALDSNKLDVST--ETGDDSYFTGDTFLSATYKLGDAE--GK-LTGIAMHSQTEMNLRKQGLIEFMLDS------DNKKFP 223 (346) Q Consensus 155 ~~~~~~~dis~--~~~~~~~~~~~~l~~A~~~~GD~~--~~-~~~ivmhS~~~~~L~~~~li~~~~~s------~~~~i~ 223 (346) .....+..-+. .......++.+.|.+|.++|..++ ++ -..+++.|..+.+|.+..-+.-..+. .+|.++ T Consensus 150 ~~gt~v~~~ss~~i~~g~~g~t~~kl~~a~~~l~~~dvp~d~~R~~vv~p~~~~~LL~d~~~ts~D~~~~~~l~~~G~ig 229 (322) T protein:vir:10 150 GTGQPVEFLATQEIGDGTKPISFDYVTEITERFLENEIEPEVSKVIVIGPTQARKLLQITEATSADYTSAMDLQSKGIIT 229 (322) T ss_pred ccccccccCCCcccccCccchhHHHHHHHHHHHHhcCCCCCCCeEEEeCHHHHHHHhcchhhhhhhcccchhhhhcCeee Confidence 21111111111 112234688999999999998654 23 36899999999999976443221111 246799 Q ss_pred EEeceEEEEeCCCccCC--------------CceEEEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeee Q lcl|NC_015254. 224 TYMGKRVIVDDGLPAKD--------------GVYTSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRG 289 (346) Q Consensus 224 ~~~G~~VVvdD~~p~~~--------------g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G 289 (346) +++|.+++++.++|..+ ++..|+..-..|+++...++ +.+|...++.+. 
++.+.|.+|.+| T Consensus 230 ~~lGf~~i~s~~lp~~~~t~~~~~~~~~~~~~~~~~~a~~k~Av~~a~~~d-v~~~i~~~~~~~----~a~~I~~~~~~G 304 (322) T protein:vir:10 230 NWMGYTWIVSTRLDKFDPTQWGMAAEDGPQGDEIWCIAMTDMALGYHSCKD-IWTKVAEDPSAS----FAWRIYSAFTAD 304 (322) T ss_pred eeeeEEEEEeccCCccccccccccccCCCCccceeEEEEecCceeEEEeee-eeEEeeccCCcc----hhhhhhhhhhhC Confidence 99999999999998532 25678899999999987664 445554444332 223334444444 Q ss_pred eeeccccccCCCCChHHhcCCcCceeeecccccceEEEEEeccc Q lcl|NC_015254. 290 IAWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGV 333 (346) Q Consensus 290 ~s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~ 333 (346) -.=. ++| .+|.|.++..- T Consensus 305 a~ri------------------------~~~--gVv~i~~~e~~ 322 (322) T protein:vir:10 305 CVRV------------------------EDE--HIFKLRLKNSL 322 (322) T ss_pred ceEe------------------------ccC--cEEEEEEeccC Confidence 2221 111 11222222222 No 87 >protein:vir:2201 Length: 345 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:49 # MgeName: T7 # Cross-refs: genbank:acc:NP_041998;swissprot:sw:p19726;genbank:gi:9627469;goa:P19726;uniprot:P19726;genbank:GeneID:1261026 Probab=99.12 E-value=4.8e-12 Score=82.64 Aligned_cols=285 Identities=15% Similarity=0.091 Sum_probs=165.5 Q ss_pred Ccc---ceecce----eeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIK---KLRMNL----QKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~---~~~~~~----q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) |-. ..+.|+ +..+++ ---+++. |+|...|...+.+.+.|.. .-...+ -.+|+++.+|..+. 
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~---~~~al~l-e~f~geV~~~f~~~s~~~~------~~~~r~--i~~gks~~~~~iG~ 68 (345) T protein:vir:22 1 MASMTGGQQMGTNQGKGVVAAG---DKLALFL-KVFGGEVLTAFARTSVTTS------RHMVRS--ISSGKSAQFPVLGR 68 (345) T ss_pred CcccccchhcccccccccccCC---chhHHHH-HHHhHHHHHHHHHHhhhcc------cceeee--ccccceEEEeeecc Confidence 221 111111 111111 1124555 8898888888888876632 111111 12699999998776 Q ss_pred CCCcccccCCCccccchh--hcccceeEEEEE-eecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPG--NISAAKDIARLH-MRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGI 150 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~--~lt~~~~~a~~~-~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~ 150 (346) . .+.....|+ .+... .+...+.+-+|= ..-..+.+.|+....+..|.+.+++++.+.+.++.+|+.++..|... T Consensus 69 ~--~~~~~~~G~-~l~~~~~~~~~~e~~ltID~~~y~~~~VddiD~~q~~~D~r~~~s~~~G~aLA~~~D~~i~~~l~k~ 145 (345) T protein:vir:22 69 T--QAAYLAPGE-NLDDKRKDIKHTEKVITIDGLLTADVLIYDIEDAMNHYDVRSEYTSQLGESLAMAADGAVLAEIAGL 145 (345) T ss_pred e--EEEeeecCC-CCCCCCCCcccceEEEEecchhhhhhhHhhHHHHhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHHh Confidence 5 334444553 45332 344444333322 34445889999999999999999999999999999999998877533 Q ss_pred hhhhh-hh------hcc--eeee--ccccccccc----cHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhhhh Q lcl|NC_015254. 151 TASGA-LD------SNK--LDVS--TETGDDSYF----TGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLIEF 213 (346) Q Consensus 151 ~~~~~-~~------~~~--~dis--~~~~~~~~~----~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li~~ 213 (346) ..... .. .+. .++. +...+.... -++.|.+|.++|.+.. ..-..++|.|..|..|++...+.. 
T Consensus 146 a~~~~~~~~~~~~~~~~~~~~~~~~g~~~t~~~~~~~~~~~ai~~a~~~Lde~~VP~~~R~~vv~P~~y~~Ll~~~~~~~ 225 (345) T protein:vir:22 146 CNVESKYNENIEGLGTATVIETTQNKAALTDQVALGKEIIAALTKARAALTKNYVPAADRVFYCDPDSYSAILAALMPNA 225 (345) T ss_pred hcccccccccccccccccccccccccccccccccCHHHHHHHHHHHHHHhhhcCCCccCCEEEeChHHHHHHhccccccc Confidence 22111 10 000 0110 111111111 2577778888886544 234789999999999998765543 Q ss_pred ccccc-----CceeeEEeceEEEEeCCCccCC-------------------Cce---------EEEEEcCCeeEEeecCC Q lcl|NC_015254. 214 MLDSD-----NKKFPTYMGKRVIVDDGLPAKD-------------------GVY---------TSYIFGEGAFGLGNGEA 260 (346) Q Consensus 214 ~~~s~-----~~~i~~~~G~~VVvdD~~p~~~-------------------g~y---------tt~l~~~GAi~~~~~~~ 260 (346) ..+.. .|.|+.++|++|+.+..+|... |.+ ...+|-+.|++..... T Consensus 226 ~~~~~~~~~~~G~V~~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~l~~h~~A~~~v~~~- 304 (345) T protein:vir:22 226 ANYAALIDPEKGSIRNVMGFEVVEVPHLTAGGAGTAREGTTGQKHVFPANKGEGNVKVAKDNVIGLFMHRSAVGTVKLR- 304 (345) T ss_pred cccccccccccceEEEEeceEEEecccccccccCccccCcccccccccccccceeeeeccCceEEEEEehhheeeeeee- Confidence 32222 3678999999999999887421 111 1245667777766544 Q ss_pred ccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCceeeecccccceEEEEEecc Q lcl|NC_015254. 261 PVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNG 332 (346) Q Consensus 261 ~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~ 332 (346) ++.+|..|++....+.+.+.+.|+..+.= | . ..|.|+.|+. 
T Consensus 305 ~~~~e~~r~~~~~~d~I~~~~a~G~~vlR--P---e--------------------------aa~~i~~~~~ 345 (345) T protein:vir:22 305 DLALERARRANFQADQIIAKYAMGHGGLR--P---E--------------------------AAGAVVFKVE 345 (345) T ss_pred cceeeeeechhHHHHHHHHHHhcCCcccc--c---c--------------------------eeEEEEEeeC Confidence 34688888887666666555555443320 0 0 1122333333 No 88 >protein:vir:9574 Length: 300 # NCBI annotation: gp40 # Family: family:all:966 # MgeID: mge:171 # MgeName: SM1 # Cross-refs: genbank:acc:NP_862879;genbank:gi:32469471;genbank:GeneID:1461316 Probab=99.11 E-value=3.1e-11 Score=78.18 Aligned_cols=270 Identities=14% Similarity=0.049 Sum_probs=152.9 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|.++|.-..+|.||+..+ +.+...+.+.+.+-. .....++..+++|.+.. ++.+.-+.|++ .++..+ T Consensus 1 ma~~t~~~G~lip~~~~~~-ii~~l~~~s~i~~l~---------~~~~~~~~~~~~p~~~~-~~~a~wv~Eg~-~~~~s~ 68 (300) T protein:vir:95 1 MSEAQLSKGNLFNPELVTK-VINKVKGHSSIAKLS---------PQKPIPFNGQREFVFDF-DSDIDIVAENG-KKTHGG 68 (300) T ss_pred CcccccCCcceechhhHHH-HHHHHHhhhhhhhhc---------ceeeccCCceEEEEEec-CcceEEeeCCc-cccccc Confidence 3666666677777776555 445544444443311 11122455688998765 35566677874 677677 Q ss_pred cccceeEEEEEeecCcceechHHHhhh---cchHHHHHHHHHHHHHHHHHHHHHHHHHH---h----hhhhhhhhhccee Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALS---GDDPMRAIGDLVVEYWNRRRQAVLIASLN---G----ITASGALDSNKLD 162 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~---g~dp~~~i~~q~a~~~~~~~~~~lla~L~---G----~~~~~~~~~~~~d 162 (346) .+-.+.....++.+.-..++++-...+ .-+..+.+.+++++.++++.+..+|.-.. | ..+........ 
T Consensus 69 ~~f~~v~l~~~k~~~~~~iS~ell~~~~d~~~~l~~~i~~~l~~aia~~~d~~~l~G~~~~~g~~~~~~~~~~~~~~~-- 146 (300) T protein:vir:95 69 VSLDPVTIVPLKVEYGARVSDEFLHASEEAKVDMLTDFVEGFSKKLARGLDIMSIHGINPRTKQASTIIGDNCFDKKV-- 146 (300) T ss_pred ccceeeEeeeEEEEEeehhhHHHhccCCCCHHHHHHHHHHHHHHHHHHHHHHhhhhcccCCCCCCccccccccccccc-- Confidence 766666656666666677777754322 34677889999999999999998774310 0 00000000000 Q ss_pred eeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhccc--ccCceeeEEeceEEEEeCCCcc Q lcl|NC_015254. 163 VSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLD--SDNKKFPTYMGKRVIVDDGLPA 238 (346) Q Consensus 163 is~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~--s~~~~i~~~~G~~VVvdD~~p~ 238 (346) +..........++.|.++..++.+...+..+|+|||.++..|++..-- .++.. ..++.-++++|+||++++.+|. T Consensus 147 -~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~L~~lkd~~G~~i~~~~~~~~~~~~l~G~Pv~~s~~v~~ 225 (300) T protein:vir:95 147 -TQTVPFKDTNPDESMEDAVGMIDGSERDITGAILDPIFTTALSKMKNAEGGKLYPELAWGGVPDAINGLAVDKNRTVSY 225 (300) T ss_pred -ceeecccccchHHHHHHHHHHhhhcCCCccEEEECHHHHHHHHHhhccCCCeeccCccccCCCceecceeeEEecCCCC Confidence 001112234668899999999988777778999999999999875311 12221 1234567899999999999987 Q ss_pred CCCc-eEEEEEcC--CeeEEeecCCccceeeeec--CC--------cceeEEEEeeEEee---eeeeeeeccccccC Q lcl|NC_015254. 239 KDGV-YTSYIFGE--GAFGLGNGEAPVPTETDRE--KL--------KGNDILINRQHFLL---HPRGIAWQEKSVAG 299 (346) Q Consensus 239 ~~g~-ytt~l~~~--GAi~~~~~~~~~~vE~dRd--~~--------~g~~~l~~r~~~~~---~~~G~s~~~~~~~~ 299 (346) ..+. ....+++. .++.++. +..+.++..+. +. ..+..+....++.+ ||.-|.--.. 
.+| T Consensus 226 ~~~~~~~~~~~GDf~~~~~~~~-~~~~~~~v~~~~~~d~~~~~~f~~~~v~~r~~~r~d~~v~~~~a~~~l~~-~~g 300 (300) T protein:vir:95 226 SQTDPKNTAIVGDFETMFKWGY-AKEVPMEIIKYGDPDNSGRDLKGYNQIYIRCEAYIGWGIMDAASFARIVK-TGG 300 (300) T ss_pred CCCCCccEEEEeeccceEEEEE-ecccEEEEeeccCCCCcchhhhhcCcEEEEEEEeecceeecccceEEEec-CCC Confidence 5432 22233342 3343433 22333433321 11 11233334444433 3444444322 233 No 89 >protein:vir:4456 Length: 401 # NCBI annotation: Major capsid protein precursor # Family: family:all:21 # MgeID: mge:96 # MgeName: ST64B # Cross-refs: genbank:acc:NP_700379;genbank:gi:23505451;genbank:GeneID:955658 Probab=99.10 E-value=1.3e-11 Score=80.20 Aligned_cols=286 Identities=10% Similarity=0.027 Sum_probs=142.7 Q ss_pred Cccceecceeeec--CCceeeee--eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRMNLQKFA--AGKNTRIA--DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~q~~~--a~~~T~l~--d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) |.+-....+..+. +-.++.-+ -..+|+.+.+-+.+...+.+.+.+-. .....+|....+|....- . T Consensus 91 lr~~~~~~~~~~e~~a~~~~~~~~GG~~iP~~~~~~ii~~~~~~~~l~~~~---------~~~~~~~~~~~~~~~~~~-~ 160 (401) T protein:vir:44 91 LRKGREDGLRDLERKALQVGTDEDGGYAVPEELDRSILSLLKDEVVMRQEA---------TVITVGGSDYKKLVNLGG-T 160 (401) T ss_pred HhhhhhhhhHHHHHHHhhcCCCCCCceeccHhHHHHHHHHHHhhhhhhhhc---------eeeecCCCceEEEEecCC-c Confidence 1000011111110 00111111 24577777666666555544443311 111223555666665432 2 Q ss_pred cccccCCCccccchhhc-ccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-----HHhh Q lcl|NC_015254. 77 EDEILDDGEGALTPGNI-SAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS-----LNGI 150 (346) Q Consensus 77 ~ae~~~dg~~~it~~~l-t~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~-----L~G~ 150 (346) .+.-+.|+. ..+.... 
+-.+-....++.+.-..++++...-+.-|..+.+.++++..+.+..+..+|.- .+|+ T Consensus 161 ~a~wv~E~~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~ai~~~~~~~~l~G~G~~~p~Gi 239 (401) T protein:vir:44 161 ASGWVGETD-TRSQTATSRLGLIEPFMGEIYGNPQATQKMLDDAFFNVEAWINSELATEFAEQEEIAFTTGDGTKKPKGF 239 (401) T ss_pred cceeecccc-ccCccccccceeeeeehhheeeehhhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhhccCCCCcccee Confidence 233345654 3332222 22222333334444455666544444557788899999999999999877631 1222 Q ss_pred hhhhhhhhc------ceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCc Q lcl|NC_015254. 151 TASGALDSN------KLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNK 220 (346) Q Consensus 151 ~~~~~~~~~------~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~ 220 (346) +........ .......++....++++.+.++...+......-.+|+||+..+..|++..=- .++. ....+ T Consensus 240 l~~~~~~~~~~~~~~~~~~~~~t~~~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~L~~lkd~~G~~l~~~~~~~g 319 (401) T protein:vir:44 240 LAYESTEESDKARAFGKLQHIVSGEATAVTADAIIKLIYTLRKAHRTGAKFMMNNNSLFAIRLLKDTEGNYLWRPGLELG 319 (401) T ss_pred eccccccccccccccccccccccccccccCHHHHHHHHHhcchhhhcCCEEEEcHHHHHHHHHhhccCCceeecCCcCCC Confidence 221111110 0001112344566889999999998877666667899999999999865211 1222 12234 Q ss_pred eeeEEeceEEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEeeee---eeeeeccc Q lcl|NC_015254. 221 KFPTYMGKRVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHP---RGIAWQEK 295 (346) Q Consensus 221 ~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~---~G~s~~~~ 295 (346) ...+++|+||+++|.||.....-.+++||. -++.+.... 
.+.++.++....+...+.+..++.+.+ ..|..-.- T Consensus 320 ~~~~l~G~PVv~~~~~p~~~~~~~~i~~Gd~~~~~~i~~~~-~~~~~~~~~~~~~~v~~~a~~r~d~~~~~~~a~~~l~~ 398 (401) T protein:vir:44 320 QPSSLAGYGIAENEQMPDIAADAKAIAFGNFKRGYTIVDRI-GTRILRDPYTNKPFVGFYTTKRTGGMLVDSQAIKLLKI 398 (401) T ss_pred CCceecceeeEEecCcCCccCCccEEEEeehhccEEEEEec-ceEEeeeccccCCcEEEEEEEEeccEEecccceEEEEe Confidence 456899999999999996443333445542 344443222 233443333344555555554444333 33333221 Q ss_pred ccc Q lcl|NC_015254. 296 SVA 298 (346) Q Consensus 296 ~~~ 298 (346) ..+ T Consensus 399 ~aa 401 (401) T protein:vir:44 399 AAA 401 (401) T ss_pred ecC Confidence 111 No 90 >protein:vir:78523 Length: 338 # NCBI annotation: Putative head structural protein # Family: family:all:507 # MgeID: mge:1853 # MgeName: U2 # Cross-refs: genbank:acc:YP_001491585;genbank:gi:157786408;genbank:GeneID:5625675 Probab=99.10 E-value=5.5e-11 Score=76.84 Aligned_cols=288 Identities=10% Similarity=0.051 Sum_probs=150.9 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCC----- Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLT----- 75 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~----- 75 (346) ||.-..+..-.-..+..|.-.--++|+.|..-+.+...+.+.+.+-+ .....++..+++|.+..-. T Consensus 4 ~~e~~~~~~~~~~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~l~---------~~~~~~~~~~~ip~~~~~~~a~~v 74 (338) T protein:vir:78 4 LNELAPNTAGSNHQGRLAHVPSDLLPKEIVGPIFDKAQESSLVLRLG---------ENIPISYGETIIPTTVKRPEVGQV 74 (338) T ss_pred hHHhhhhhcccccccceecccccccchHHHHHHHHHHHhhchhhhhc---------ceeeccCCceEEEEEecCccceee Confidence 45444331111111112222223678877777777776666554322 1123357788899875321 Q ss_pred --CcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH------- Q lcl|NC_015254. 
76 --GEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS------- 146 (346) Q Consensus 76 --g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~------- 146 (346) +.+..+.|++ .++..+++-.+-....++.+.-..++++...-+..|..+.+.+++++.+.+.++..+|.- T Consensus 75 ~~~~~~~~~Eg~-~~~~~~~~f~~v~l~~~k~~~~~~is~ell~ds~~~~~~~i~~~la~a~~~~~d~~~l~G~g~~~~~ 153 (338) T protein:vir:78 75 GVGTSNEQREGG-TKPLSGTAWDTRSVAPIKLATIVTVSEEFARMNPSGLYTKLQADLAYAIGRGIDLAVFHGKSPLTGS 153 (338) T ss_pred cccccccccccc-cccccccceeEEEEEEEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhhcccCCCccc Confidence 2223345664 566666766666666667777788888766666678889999999999999999977742 Q ss_pred -HHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhC-ccccCceEEEEchHHHHHHHhhh-hhh----hccc--c Q lcl|NC_015254. 147 -LNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLG-DAEGKLTGIAMHSQTEMNLRKQG-LIE----FMLD--S 217 (346) Q Consensus 147 -L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~G-D~~~~~~~ivmhS~~~~~L~~~~-li~----~~~~--s 217 (346) ..|+.........+ .. ....+.....++.|.++..++. .......+|+||+..+..|++.. +.+ ++.. . T Consensus 154 ~~~gi~~~~~~~~~~-~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~m~~~~~~~L~~~~~l~d~~g~~l~~~~~ 231 (338) T protein:vir:78 154 ALQGIDTNNVIVNTT-NV-DYLQTGTTPLLDRFLDGYDLVSANTDVDFNGWAADPRYRARLLRSQAYRDANGNVDPTRIN 231 (338) T ss_pred ccccccccccccccc-cc-ccccccchhhHHHHHHHHHHhhhhccccceEEEEchHHHHHHHHHhhhccCCCceeecccc Confidence 11211111111100 00 0111222345778888887763 33445678999999999987643 211 2211 1 Q ss_pred cCceeeEEeceEEEEeCCCccCCC----ceEEEEEcCCe-eEEeecCCccceeeeecCCc----------------ceeE Q lcl|NC_015254. 218 DNKKFPTYMGKRVIVDDGLPAKDG----VYTSYIFGEGA-FGLGNGEAPVPTETDREKLK----------------GNDI 276 (346) Q Consensus 218 ~~~~i~~~~G~~VVvdD~~p~~~g----~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~~~----------------g~~~ 276 (346) .++.-++++|+||++++.||...+ ....++|+.-+ +.++. ...+.+|..|+... ++.. 
T Consensus 232 ~~~~~~~l~G~PV~~~~~ip~~~~~~~~~~~~~~~gdfs~~~~~~-~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 310 (338) T protein:vir:78 232 LAASAGDLLGLPVQFGKAVGGDLGAATDSKVRVVGGDFSQLKYGF-ADEIRVKMSDTATLTDNTSPTPQTVSMWQTNQIA 310 (338) T ss_pred cCCCCceeeeeeEEEccccCccccccCCcccEEEEEecceEEEEe-ecccEEEEeecccccccccccccchhhhhcCcEE Confidence 234457899999999999986422 22233444332 22332 22355666655421 1111 Q ss_pred EEEeeEEe---eeeeeeeeccccccCCCCCh Q lcl|NC_015254. 277 LINRQHFL---LHPRGIAWQEKSVAGHSPTN 304 (346) Q Consensus 277 l~~r~~~~---~~~~G~s~~~~~~~~~sPt~ 304 (346) +....++. +||..|.--.... -|.- T Consensus 311 ~r~~~r~d~~v~~~~a~~~l~~~~---~~~~ 338 (338) T protein:vir:78 311 ILIEVTFGWLLGDKQAFVKFVDDE---DPDA 338 (338) T ss_pred EEEEEEeccEeecccceEEEeccc---CCCC Confidence 22222222 2222222211111 1111 No 91 >protein:vir:5739 Length: 366 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:122 # MgeName: PY54 # Cross-refs: genbank:acc:NP_892050;genbank:gi:33770513;interpro:IPR006444;uniprot:Q7Y410;genbank:GeneID:1732928 Probab=99.09 E-value=2e-11 Score=79.20 Aligned_cols=284 Identities=12% Similarity=0.097 Sum_probs=142.2 Q ss_pred Cccc-e-ecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcc Q lcl|NC_015254. 1 MIKK-L-RMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGED 78 (346) Q Consensus 1 ~~~~-~-~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~a 78 (346) |-++ + .-.+++.-+ +++.-.-..+|+.+..-+.+...+.+.+.+.|+= ....+...+++|.+..- ..+ T Consensus 52 ~a~~~~~~~~~~~a~~-~~~~~Gg~lvP~~~~~~ii~~l~~~s~l~~lg~~--------~v~~~~g~~~~p~~t~~-~~a 121 (366) T protein:vir:57 52 FAATELGDTGLSMAIS-TAAGSGGALIPQNMQNEVIELLRDRTVVRILGAR--------SIPLPNGNLSMPRLSGG-ATA 121 (366) T ss_pred HHHHhhcchhhhhhcc-ccccCCccccchhHHHHHHHHHhhhcchhhhcee--------eeecCCCceEEEEEeCC-cce Confidence 1000 0 011122111 1122234457888877666665555544332210 11122335889988643 445 Q ss_pred cccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH------HHHhhhh Q lcl|NC_015254. 
79 EILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIA------SLNGITA 152 (346) Q Consensus 79 e~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla------~L~G~~~ 152 (346) .-+.|+. .++..+.+-++-....++.+.-+.++++...-+.-+....+.+++++.+.++.++.+|. .-+|+++ T Consensus 122 ~wv~E~~-~~~~s~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~~~~~i~~~l~~a~~~~~d~a~l~G~G~~~~p~Gi~~ 200 (366) T protein:vir:57 122 GYVGEGK-DVVATGATFDDVKLSAKTMIALVPVSNQLIGRAGFNVEQLLLGDILSAIATREDKAFLRDDGTGDTPKGMKA 200 (366) T ss_pred eeeccCc-cccccccceeEEEEeeEEEEEeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHhhccCCCCccccceee Confidence 5567874 67777776666555666666666777766555555777889999999999999987663 2233333 Q ss_pred hhhhhhcceeeecccccccc-ccHHHHHHHHHH-hCcc--ccCceEEEEchHHHHHHHhhhhh--hhcccccCceeeEEe Q lcl|NC_015254. 153 SGALDSNKLDVSTETGDDSY-FTGDTFLSATYK-LGDA--EGKLTGIAMHSQTEMNLRKQGLI--EFMLDSDNKKFPTYM 226 (346) Q Consensus 153 ~~~~~~~~~dis~~~~~~~~-~~~~~l~~A~~~-~GD~--~~~~~~ivmhS~~~~~L~~~~li--~~~~~s~~~~i~~~~ 226 (346) ........... +++... -..+.+.+.+.. +.+. ......|+||+..+..|++..-- .++.... .-++++ T Consensus 201 ~~~~~~~~~~~---~~t~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~vmn~~~~~~L~~lkd~~G~~l~~~~--~~g~l~ 275 (366) T protein:vir:57 201 VATAANRLVAW---TGTAINLTTIDEYLDSLILKHMDSNSNMIRCGWGLSNRTYMTLFGLRDGNGNKVYPEM--SQGILK 275 (366) T ss_pred ccccccceeec---cccccchhhHHHHHHHHHHhhhccccccccCEEEecHHHHHHHHhhhccCCceeccCC--CCCeec Confidence 22222211111 111111 123344444433 3332 23457899999999999875311 1121111 125799 Q ss_pred ceEEEEeCCCccCCC---ceEEEEEcCCe-eEEeecCCccceeeeecCC----cce---------eEEEEeeEEeeeeee Q lcl|NC_015254. 227 GKRVIVDDGLPAKDG---VYTSYIFGEGA-FGLGNGEAPVPTETDREKL----KGN---------DILINRQHFLLHPRG 289 (346) Q Consensus 227 G~~VVvdD~~p~~~g---~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~~----~g~---------~~l~~r~~~~~~~~G 289 (346) |+||++++.||...+ .-..++|+.=. +.+.. ...+.++..|++. .|. 
..+-...++-+.|+ T Consensus 276 G~Pvv~s~~ip~~~~~~~~~~~i~~gdfs~~~i~~-~~~i~i~~~~ea~~~~~~g~~~~~f~~~~~~iR~~~~~d~~v~- 353 (366) T protein:vir:57 276 GYPIQRTSAIPANLGDDGNESEIYFCDFNDVVIGE-DGMMKVDFSTEATYKDADGQLVSAFARNQSLIRVVTEHDIGFR- 353 (366) T ss_pred ceeeEEccccccccccCCCccEEEEEecceEEEEE-ecceEEEEeeccccccccccchhhhhcCceeEEeeeeeCcEee- Confidence 999999999997432 22234444322 22332 2234566666642 111 11212222221111 Q ss_pred eeeccccccCCCCChHHhcCCcCc Q lcl|NC_015254. 290 IAWQEKSVAGHSPTNTEIEKGNNW 313 (346) Q Consensus 290 ~s~~~~~~~~~sPt~a~L~~~~NW 313 (346) .|.--.+-++.+| T Consensus 354 -----------~~~a~~~lt~~~~ 366 (366) T protein:vir:57 354 -----------HPEGLVLGTGVIW 366 (366) T ss_pred -----------ccccEEEEecccC Confidence 2333334455556 No 92 >protein:vir:104085 Length: 320 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:1656 # MgeName: Che12 # Cross-refs: genbank:acc:YP_655596;genbank:gi:109392467;genbank:GeneID:4156953 Probab=99.09 E-value=2.8e-11 Score=78.40 Aligned_cols=292 Identities=12% Similarity=0.073 Sum_probs=155.2 Q ss_pred CccceecceeeecCCce--eeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKN--TRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGED 78 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~--T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~a 78 (346) |=....||.+..+...+ +..+.+|.|++..+. .+...+.+.+.+- ......++..+++|.+..- ..+ T Consensus 1 ~~~~~~~~~~~~~~~~t~~~~~~~~ip~~~~~~i-i~~~~~~s~l~~~---------~~~~~~~~~~~~~p~~~~~-~~a 69 (320) T protein:vir:10 1 MAAGTAFQVDHAQIAQTGDTMFKGYLEPEQAKDY-FAEAEKTSIVQQF---------AQKVPMGTTGQKIPHWIGD-VSA 69 (320) T ss_pred CCCCccCCHHHHHhhccccccccccccHHHHHHH-HHHHHhccchhhh---------cceeeccCCceEEEEEeCC-cce Confidence 88888888877752222 223466666655444 4544444444331 1222335777899998754 455 Q ss_pred cccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhh--hh Q lcl|NC_015254. 
79 EILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASG--AL 156 (346) Q Consensus 79 e~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~--~~ 156 (346) .-+.|++ .++..+.+-.+-....++.+..+.++++...-+.-|..+.+.+++++++++..++.+|. |.-... .. T Consensus 70 ~~v~E~~-~~~~~~~~f~~v~~~~~k~~~~~~is~ell~ds~~~l~~~i~~~l~~a~a~~~d~a~l~---G~g~~~~~~~ 145 (320) T protein:vir:10 70 QWIGEGD-MKPITKGNMTSQNIAPHKIATIFVASAETVRANPANYLGTMRTKVATAFAMAFDSAALN---GTDSPFPTYL 145 (320) T ss_pred EEecCCc-cccccccceeEEEEeeEEEEEeehhhHHHHhcChHHHHHHHHHHHHHHHHHHHHHHhhc---ccCCCCCccc Confidence 5667874 67878888777777788888889999887776667888999999999999999997653 211100 00 Q ss_pred hh--cceeeeccc-ccccccc-H-HHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh--hcccc--cC-----cee Q lcl|NC_015254. 157 DS--NKLDVSTET-GDDSYFT-G-DTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE--FMLDS--DN-----KKF 222 (346) Q Consensus 157 ~~--~~~dis~~~-~~~~~~~-~-~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~~~s--~~-----~~i 222 (346) .. +........ .....++ . ..+.++..........-.+++||+..+..|++..--+ ++... .+ ..- T Consensus 146 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~lkd~~G~~l~~~~~~~~~~~~~~~ 225 (320) T protein:vir:10 146 AQTTKSVSLADPGGATASDLTAYDAVAVNGLSLLVNAKKKWTHTLLDDIVEPILNGAKDKNGRPLFIESTYTDENSPFRA 225 (320) T ss_pred ccccccccceecccccccccccHHHHHHHHHhhhhcccCCCcEEEEcHHHHHHHHHhhccCCceeeccccccCccccccC Confidence 00 000111110 0111111 1 2456666666666667789999999999998753211 11111 01 112 Q ss_pred eEEeceEEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecCCc--ce------eEEEEeeEEeeeeeeeee Q lcl|NC_015254. 223 PTYMGKRVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKLK--GN------DILINRQHFLLHPRGIAW 292 (346) Q Consensus 223 ~~~~G~~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~--g~------~~l~~r~~~~~~~~G~s~ 292 (346) .+++|+||++++.+|.+.. ..+++. .++ ++.-+ .+.+|.+|+... +. 
-.++.+ --+.++.+-+ T Consensus 226 ~~i~g~pv~~~~~~~~~~~---~~~~gd~~~~~-~~~~~-~~~i~~~~~~~~~~~~~~~~~~~~~f~~--~~~~~r~~~~ 298 (320) T protein:vir:10 226 GRIVSRPTILSDHVADGTT---VGYMGDFRNVI-WGQVG-GLSFDVTDQATLNLGTPTEPNFVSLWQH--NLVAVRVEAE 298 (320) T ss_pred ceeeeeeeEecCCCCCCce---EEEEeecceEE-EEEec-CeEEEEeecceeeeccccccccchhhhc--CcEEEEEEEe Confidence 5789999999999987541 122221 222 33322 345555555421 00 001110 1112222222 Q ss_pred ccccccCCCCChHHhcCCcCceeeecccccceEEEEEecccccc Q lcl|NC_015254. 293 QEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGK 336 (346) Q Consensus 293 ~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~ 336 (346) -+-. +.+++++ + .++++.+|-+ T Consensus 299 ~d~~-------------------v~~~~a~--~-~l~~~~ap~~ 320 (320) T protein:vir:10 299 YAFH-------------------NNDKDAF--V-KLTNVVTPDA 320 (320) T ss_pred eccE-------------------Eecccce--E-EEEeccCCCC Confidence 1100 1122221 1 2223322222 No 93 >protein:vir:4226 Length: 326 # NCBI annotation: observed 35.2Kd protein # Family: family:all:507 # MgeID: mge:89 # MgeName: L5 # Cross-refs: genbank:acc:NP_039681;swissprot:sw:q05223;genbank:gi:9625447;uniprot:Q05223;genbank:GeneID:2942929 Probab=99.08 E-value=5.6e-11 Score=76.79 Aligned_cols=283 Identities=11% Similarity=0.037 Sum_probs=146.5 Q ss_pred Cccceecceeeec---CC--ceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCC Q lcl|NC_015254. 1 MIKKLRMNLQKFA---AG--KNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLT 75 (346) Q Consensus 1 ~~~~~~~~~q~~~---a~--~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~ 75 (346) ||....+++.... 
+- .++.-+.++.|++..+ +.+...+.+.+.+ +......++..+++|.+..- T Consensus 3 ~~~~r~~~~~~~~e~~a~~~~~~~~g~~ip~~~~~~-ii~~~~~~s~i~~---------~~~~~~~~~~~~~~p~~~~~- 71 (326) T protein:vir:42 3 VNPDRTTPFLGVNDPKVAQTGDSMFEGYLEPEQAQD-YFAEAEKISIVQQ---------FAQKIPMGTTGQKIPHWTGD- 71 (326) T ss_pred CCccchhhhcCcchhhheeccccCCcceechhhHHH-HHHHHHhcchhhh---------hcceeeccCCceEEEEEeCC- Confidence 6665554442111 11 1111234665555444 4555444444422 11122335778899998764 Q ss_pred CcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhh Q lcl|NC_015254. 76 GEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL-----NGI 150 (346) Q Consensus 76 g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L-----~G~ 150 (346) ..+..+.|++ .++..+.+..+-....++.+..+.++++...-+..|..+.+.+++++++.+.+++.+|.-- .|+ T Consensus 72 ~~a~~v~Eg~-~~~~~~~~f~~i~~~~~k~~~~v~iS~ell~~s~~~~~~~i~~~l~~a~~~~~d~a~l~G~gs~~p~gi 150 (326) T protein:vir:42 72 VSASWIGEGD-MKPITKGNMTSQTIAPHKIATIFVASAETVRANPANYLGTMRTKVATAFAMAFDNAAINGTDSPFPTFL 150 (326) T ss_pred cceEEecCCc-cccccccceeEEEEeeEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhhcccCCCccccc Confidence 4455667874 6787888877777778888888999988777677788999999999999999999766310 011 Q ss_pred hhhhhhhhcceeeeccccccccccH-HH-HHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcccc--cC----- Q lcl|NC_015254. 151 TASGALDSNKLDVSTETGDDSYFTG-DT-FLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLDS--DN----- 219 (346) Q Consensus 151 ~~~~~~~~~~~dis~~~~~~~~~~~-~~-l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~s--~~----- 219 (346) ......... .....++....+.. +. +.++.............|+||++++..|++..-- .++... 
.+ T Consensus 151 ~~~~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~n~~~~~~L~~lkd~~G~~l~~~~~~~~~~~~ 228 (326) T protein:vir:42 151 AQTTKEVSL--VDPDGTGSNADLTVYDAVAVNALSLLVNAGKKWTHTLLDDITEPILNGAKDKSGRPLFIESTYTEENSP 228 (326) T ss_pred cccccccce--eecccccccccchhHHHHHHHHHhhhhhhccCccEEEEeHHHHHHHHHhhccCCceeeccccccCcccc Confidence 111111111 11111112222222 22 3344544555555667899999999999875311 112111 11 Q ss_pred ceeeEEeceEEEEeCCCccCCCceEEEEEcCCe-eEEeecCCccceeeeecCC----------------cceeEEEEeeE Q lcl|NC_015254. 220 KKFPTYMGKRVIVDDGLPAKDGVYTSYIFGEGA-FGLGNGEAPVPTETDREKL----------------KGNDILINRQH 282 (346) Q Consensus 220 ~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~~----------------~g~~~l~~r~~ 282 (346) ...++++|+||++++.+|.+.. ..+++.-. +.++.- ..+.++..++.. ..+..+....+ T Consensus 229 ~~~~~l~G~pv~~~~~~~~~~~---~~~~Gd~s~~~~~~~-~~~~v~~~~e~~~~~~~~~~~~~~~~~~~d~~~~r~~~~ 304 (326) T protein:vir:42 229 FRLGRIVARPTILSDHVASGTV---VGYQGDFRQLVWGQV-GGLSFDVTDQATLNLGTPQAPNFVSLWQHNLVAVRVEAE 304 (326) T ss_pred ccCceeeeeeEEEcCCCCCCce---EEEEeecceEEEEEe-cceEEEEeecceeeecccccccchhhhhcCcEEEEEEEE Confidence 1235799999999999987542 11222211 112221 123444444431 11222333333 Q ss_pred Eeeee---eeeeeccccccCCC Q lcl|NC_015254. 283 FLLHP---RGIAWQEKSVAGHS 301 (346) Q Consensus 283 ~~~~~---~G~s~~~~~~~~~s 301 (346) +.+.| .-|.-.....++++ T Consensus 305 ~d~~v~~~~a~~~l~~~~~~~~ 326 (326) T protein:vir:42 305 YAFHCNDKDAFVKLTNVDATEA 326 (326) T ss_pred eccEEecccceEEEeeccccCC Confidence 32222 22211111112222 No 94 >protein:vir:2504 Length: 305 # NCBI annotation: major capsid subunit gp9 # Family: family:all:507 # MgeID: mge:53 # MgeName: TM4 # Cross-refs: genbank:acc:NP_569745;genbank:gi:18496895;genbank:GeneID:932268 Probab=99.08 E-value=6.7e-11 Score=76.36 Aligned_cols=274 Identities=9% Similarity=0.055 Sum_probs=141.2 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCcc----cc Q lcl|NC_015254. 
13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEG----AL 88 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~----~i 88 (346) -|..+|.-.-..+|+.+.+-+.+...+.+.+.+- ......++..+++|.+..- ..+.-+.|++. .+ T Consensus 1 ma~~t~~~gg~liP~~~~~~Ii~~~~~~s~l~~l---------~~~~~~~~~~~~~p~~~~~-~~a~wv~E~~~~~~~~~ 70 (305) T protein:vir:25 1 MADISRAEVASLIQEAYSDTLLAAAKQGSTVLSA---------FQNVNMGTKTTHLPVLATL-PEADWVGESATDPKGVK 70 (305) T ss_pred CCCccCCccceecCHHHHHHHHHHHHhhchhhhh---------cceeeccCCcEEEEEEeCC-cceEEeecccccccccc Confidence 2444444444556887777777766665555332 1222335678999998753 44544566532 12 Q ss_pred chhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH---Hhhhhhhhhhhcceeeec Q lcl|NC_015254. 89 TPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL---NGITASGALDSNKLDVST 165 (346) Q Consensus 89 t~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L---~G~~~~~~~~~~~~dis~ 165 (346) +..+.+-.+-....++.+..+.++++...-+..|....+.+++++.+++..++.+|.-- +|.+...........-.. T Consensus 71 ~~s~~~f~~i~~~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~~~a~~~d~a~~~G~g~~~~~~~~~~~~~~~~~~~~ 150 (305) T protein:vir:25 71 PTSKVTWANRTLVAEEIAVIIPVHENVIDDATVAVLTEVAELGGQAIGKKLDQAVIFGTDKPASWVSPALIPAAVTAGQA 150 (305) T ss_pred cccccceeeEEeeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHhhhheeccCCCCCcccccccccccccccc Confidence 33344445555556677777888888777677788999999999999999999777311 011111100000000000 Q ss_pred cccccccccH----HHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccCcee---eEEeceEEEEeCCCcc Q lcl|NC_015254. 166 ETGDDSYFTG----DTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNKKF---PTYMGKRVIVDDGLPA 238 (346) Q Consensus 166 ~~~~~~~~~~----~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~~i---~~~~G~~VVvdD~~p~ 238 (346) .......... +.+.++.....+.......++||+..+..|++. +++++..+ .+++|+||++++.+|. 
T Consensus 151 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~l------kd~~G~~i~~~~~l~G~Pv~~~~~~~~ 224 (305) T protein:vir:25 151 VEVVGGVANESDIVGATNRAAKAVASAGWAPDTLLSSLALRYEVANI------RDANGNPVFRDDSFAGFRTFFNRNGAW 224 (305) T ss_pred ccccccchhhhHHHHHHHHHHHhhhhcccccceeEecHHHHHHHHHh------hccCCceeecCCcccccceEEcCccCC Confidence 0011111222 233334444444444556799999999998754 23333332 4799999999999987 Q ss_pred CCCceEEEEEcC-CeeEEeecCCccceeeeecCCc--ce----------eEEEEeeEEee---eeeeeeecccc-ccCCC Q lcl|NC_015254. 239 KDGVYTSYIFGE-GAFGLGNGEAPVPTETDREKLK--GN----------DILINRQHFLL---HPRGIAWQEKS-VAGHS 301 (346) Q Consensus 239 ~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~~~--g~----------~~l~~r~~~~~---~~~G~s~~~~~-~~~~s 301 (346) ..+....+ |+. --+.++..+ .+.+|..++..- +. ..+....++.+ +|..+-+.... .+... T Consensus 225 ~~~~~~~~-~gd~s~~~i~~~~-~~~i~~~~~~~~~~~~~~~~~~~~~~~~~R~~~r~~~~v~~p~a~v~~~~~~~~~~~ 302 (305) T protein:vir:25 225 DADAAIEV-IADSSRVKIGVRQ-DITVKFLDQATLGTGENQINLAERDMVALRLKARFAYVLGVSATAQGANKTPVAVVA 302 (305) T ss_pred CCCccEEE-EEecceEEEEEec-CeEEEEeeeeeeecCCceeeeeecCcEEEEEEEeecceeeCcccEEEEccccccccC Confidence 66554433 332 223333322 445666655411 11 11212222221 22222221100 00111 Q ss_pred CCh Q lcl|NC_015254. 302 PTN 304 (346) Q Consensus 302 Pt~ 304 (346) |+- T Consensus 303 pa~ 305 (305) T protein:vir:25 303 PAA 305 (305) T ss_pred CCC Confidence 221 No 95 >protein:vir:6212 Length: 434 # NCBI annotation: prohead protease # Family: family:all:21 # MgeID: mge:128 # MgeName: phBC6A52 # Cross-refs: genbank:acc:NP_852592;genbank:gi:31415852;genbank:GeneID:1489210 Probab=99.08 E-value=7.2e-11 Score=76.19 Aligned_cols=293 Identities=13% Similarity=0.058 Sum_probs=145.6 Q ss_pred Ccc-----ceecceeeec-----------CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCc Q lcl|NC_015254. 1 MIK-----KLRMNLQKFA-----------AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGN 64 (346) Q Consensus 1 ~~~-----~~~~~~q~~~-----------a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ 64 (346) .+. ..|-.|..|. 
.+.+|--.-.++|+.|..-|.+...+.+.+.+-+-+. ..+| T Consensus 114 ~~~~~~~~e~r~a~~~~l~~~~~~~e~~a~~~~t~~GG~lvP~~~~~~Ii~~l~~~~~i~~~~~~~---------~~~~- 183 (434) T protein:vir:62 114 GHRTNKETEIRSVFANYIVGNIDEKEARALGLVTGNGSVTIPDFLSKEIITYAQEENFLRRLGTGV---------KTKE- 183 (434) T ss_pred cccchHHHHHHHHHHHHhccccchhhhhhhcccccccceecchhhHHHHHHhhhhhhhhhhhccee---------ccCC- Confidence 000 0010111110 1112211234678888777777666665553322111 1123 Q ss_pred EEEecccccCCCcccccCC-C-ccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 65 MINMPFWQDLTGEDEILDD-G-EGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAV 142 (346) Q Consensus 65 ti~~P~~~~l~g~ae~~~d-g-~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~ 142 (346) .+.+|.+..- +.+.-..+ + .+.++..+.+-.+-....++.+.-+.+++....-+.-|..+.+.++++..+.+..++. T Consensus 184 ~~~~p~~~~~-~~a~~~~~~~e~~~~~~~~~~f~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~d~~ 262 (434) T protein:vir:62 184 NIKYPVLVKK-AEAQGHKNERTNNEMPETDIEFDEIELSPTEFDALATVTKKLLARTGLPIEQIVMDELKKAYVRKETQY 262 (434) T ss_pred ceEEEEEecC-CcccceecccccccccccccceeeEEeeheeeEeehhhHHHHHhcchHHHHHHHHHHHHHHHHHHHHHH Confidence 4677776543 22222221 1 1244444454444455555555556777766555555778889999999999999987 Q ss_pred HHHHHHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhccc---- Q lcl|NC_015254. 143 LIASLNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLD---- 216 (346) Q Consensus 143 lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~---- 216 (346) +|. |-=...........-..........+++.|.+....+-.....-.+|+||+.++..|++..-- .++.. 
T Consensus 263 ~l~---G~G~~~~~~g~~~~~~~~~~~~~~~~~d~l~~l~~~l~~~~~~~a~~v~n~~~~~~L~~lkd~~G~~l~~~~~~ 339 (434) T protein:vir:62 263 MVN---GDEANNINDGALAKKAVEFKTDEKNLYDALVKMKNTPVKEVRKKARWVLNTAALTKIETMKTDDGFPLLRPFNQ 339 (434) T ss_pred Hhc---cCCCCccccceeecccccccccccchhhHHHHHHhhcchhhhcCCEEEEcHHHHHHHHHhhccCCCEeeccCCC Confidence 763 210000000000000011122334678999998888877666777899999999999875322 23221 Q ss_pred ccCceeeEEeceEEEEeCCCccCCC-ceEEEEEcCC-eeEEeecCCccceeeeecC--CcceeEEEEeeEEeeeeeeeee Q lcl|NC_015254. 217 SDNKKFPTYMGKRVIVDDGLPAKDG-VYTSYIFGEG-AFGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHPRGIAW 292 (346) Q Consensus 217 s~~~~i~~~~G~~VVvdD~~p~~~g-~ytt~l~~~G-Ai~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~~G~s~ 292 (346) ..++.-.+++|+||++++.||...+ ....++||.= .+.+..-.....++..++. ..++..+....++- T Consensus 340 ~~~g~~~tl~G~pV~~~~~~~~~~~~~~~~i~~Gdfs~~~i~~~~g~~~i~~~~~~~~~~~~v~~~~~~r~D-------- 411 (434) T protein:vir:62 340 AEGGIGYTLLGFPVEEEDAIDIPDSPDTPVFYFGDFSKFYIQDVIGSLEVQKLVELFSRTNRVGFRIWNLLD-------- 411 (434) T ss_pred ccCCCCceecceeeEEecCccCccCCCceEEEEeeccceEEEEeeceeEEEeehhhhcccCceEEEEEeeec-------- Confidence 1234456899999999999986543 2233444311 1112221111222222222 22333333333331 Q ss_pred ccccccCCCCChHHhcCCcCceeeecccccceEEEEEeccccc Q lcl|NC_015254. 
293 QEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPG 335 (346) Q Consensus 293 ~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~ 335 (346) +.++|.|.++++..+.=|..+.+ T Consensus 412 --------------------gk~i~~~~~~~~~~~~~~~~~~~ 434 (434) T protein:vir:62 412 --------------------AQLIHSPFEVPVYKYVLKAPTGA 434 (434) T ss_pred --------------------ceeecCcccceEEEEEeccCCCC Confidence 23344455555544444433333 No 96 >protein:vir:6242 Length: 390 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:131 # MgeName: phi-BT1 # Cross-refs: genbank:acc:NP_813696;swissprot:trembl:q859c1;genbank:gi:29366756;interpro:IPR006444;uniprot:Q859C1;genbank:GeneID:1258897 Probab=99.07 E-value=2.3e-11 Score=78.93 Aligned_cols=279 Identities=11% Similarity=0.071 Sum_probs=148.6 Q ss_pred Ccccee-cceeeec-CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcc Q lcl|NC_015254. 1 MIKKLR-MNLQKFA-AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGED 78 (346) Q Consensus 1 ~~~~~~-~~~q~~~-a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~a 78 (346) +-.+.+ +.+..-. ..+.+--..++.|+++...+.....+..-+.+ .... .-...|..+.+|.+..- ..+ T Consensus 97 ~~~~~r~~~~~~~~~~~t~~~~g~~~~~~~~~~~i~~~~~~~~~l~~------~~~~--~~~~~~~~~~~p~~~~~-~~a 167 (390) T protein:vir:62 97 NLGEARSFEFAPEKRDGTKAGNPNVLSRTLYGQLIAQAVERSAIMRG------GATT--FTTSDANPLDFTVITGR-SSA 167 (390) T ss_pred hhhhhHHHHhhhhhhcccccCCCccccccchHHHHHHHHhhhhhhhh------ccee--eecCCCceeEEEEEcCC-cce Confidence 111111 1111110 11222223577888887777655443332211 1110 01224667889988653 345 Q ss_pred cccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH---HHhhhhhhh Q lcl|NC_015254. 79 EILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS---LNGITASGA 155 (346) Q Consensus 79 e~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~---L~G~~~~~~ 155 (346) .-+.|+. .++..+.+.++-.-..++.+.-..++++...-+.-|..+.+.++++..+.+..+..+|.- =+|++.... 
T Consensus 168 ~wv~E~~-~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~l~G~G~p~Gi~~~~~ 246 (390) T protein:vir:62 168 SIVGETA-EIPESYPATAQRSMGGFKYGFASVVSYEFATDQVLDLVGFLVSDAGPAIGDAMGRHFITGTGQPRGILTDAS 246 (390) T ss_pred eeecccc-cccccccceeeeEeeeeeEEeehHHHHHHHhhhhHHHHHHHHHHHHHHHHHHHHhhhhccCCcccccccccc Confidence 4567764 566666666655555666666666676665555557778899999999999999976631 012222221 Q ss_pred hhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhh--hhhhccc--ccCceeeEEeceEEE Q lcl|NC_015254. 156 LDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQG--LIEFMLD--SDNKKFPTYMGKRVI 231 (346) Q Consensus 156 ~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~--li~~~~~--s~~~~i~~~~G~~VV 231 (346) ....+ ...+....++++.|++....+......-.+|+||+..+..|++.. .=.++.. ..++...+++|+||+ T Consensus 247 ~~~~~----~~~~~~~~~~~~~l~~~~~~l~~~~~~~a~~vmn~~~~~~L~~lkd~~g~~l~~~~~~~g~~~~l~G~Pv~ 322 (390) T protein:vir:62 247 PATAT----FLATDTDSKVSDALIDLFHEVPSAYRANAKYVVNDLRAAQMRKLKDANGQYLWQSGLTVGAPSLFNGKVVE 322 (390) T ss_pred ccccc----eecccccccchHHHHHHHHhhhhhhhcCCEEEEchHHHHHHHHhhccCCCeeecCCcCCCccceecccceE Confidence 11111 112233457889999988777665555668999999999987642 1123221 123444689999999 Q ss_pred EeCCCccCCCceEEEEEcCCe-eEEeecCCccceeeeecC--CcceeEEEEeeEEeeee---eeeeeccccccCCCCChH Q lcl|NC_015254. 232 VDDGLPAKDGVYTSYIFGEGA-FGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHP---RGIAWQEKSVAGHSPTNT 305 (346) Q Consensus 232 vdD~~p~~~g~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~---~G~s~~~~~~~~~sPt~a 305 (346) +++.+|... ++||.-. +.+...+ .+.++.+.+. 
..+...+....++.+.| ..+.....+.+ T Consensus 323 ~~~~~p~~~-----i~~gd~s~~~i~~~~-~~~v~~~~~~~~~~~~~~~~~~~r~d~~~~~~~A~~~l~~~~~------- 389 (390) T protein:vir:62 323 TDDGMPADK-----ILFADLSKYRVRFAG-SLRVDRSVDAKFSTDQIVYRFLQRADGLLVDARGAKVLTVTPG------- 389 (390) T ss_pred EecCCCCcc-----EEEeeccceeEEeec-ceEEEeeccccccCCcEEEEEEEEeCcEeechhheEEEEeecC------- Confidence 999999753 4444311 1222212 2334433333 34445555554444332 23322211111 Q ss_pred HhcCCcCceeeecccccceEEEEEec Q lcl|NC_015254. 306 EIEKGNNWKAVYESKNIRIVAFVHKN 331 (346) Q Consensus 306 ~L~~~~NW~~v~~~K~i~iv~~~~k~ 331 (346) . T Consensus 390 -------------------------a 390 (390) T protein:vir:62 390 -------------------------A 390 (390) T ss_pred -------------------------C Confidence 1 No 97 >protein:vir:8187 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:153 # MgeName: Che9d # Cross-refs: genbank:acc:NP_817980;genbank:gi:29566414;genbank:GeneID:2700968 Probab=99.06 E-value=1.1e-10 Score=75.24 Aligned_cols=273 Identities=12% Similarity=0.034 Sum_probs=143.7 Q ss_pred CceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhcc Q lcl|NC_015254. 15 GKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNIS 94 (346) Q Consensus 15 ~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt 94 (346) =+++.-...++|+.|.+-+.+...+.+.+.+-+-+ ...++..+++|.+..- ..+.-+.|++ .++..+.+ T Consensus 1 mat~~~gg~lvP~~~~~~ii~~~~~~s~i~~~~~~---------i~~~~~~~~~p~~~~~-~~a~wv~Eg~-~~~~~~~~ 69 (311) T protein:vir:81 1 MVALATGTFQLPKHLVPGVWQKAQGQSVLARLSMA---------EPQEFGEQQYMTLTAP-PRGEVVGEGA-QKSESTAT 69 (311) T ss_pred CceecCCceEcchhHHHHHHHHHHhcchhhhhcce---------eecCCCceEEEEEeCC-ceeEEeecCc-ccccccce Confidence 12444457889999988777776666555432221 1224556899999754 4555667875 56766666 Q ss_pred cceeEEEEEeecCcceechHHHhhhcc---hHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhh--hh-hh--cceeeecc Q lcl|NC_015254. 
95 AAKDIARLHMRGKAWRTNDLAKALSGD---DPMRAIGDLVVEYWNRRRQAVLIASLNGITASG--AL-DS--NKLDVSTE 166 (346) Q Consensus 95 ~~~~~a~~~~~~k~~~~tD~a~~~~g~---dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~--~~-~~--~~~dis~~ 166 (346) -.+.....++.+.-+.++++-...+.. +..+.+.+++++++.+..+..++.--..--+.. .. .. ++...... T Consensus 70 f~~v~l~~~kl~~~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~a~l~G~~~~~~~~~~gi~~~~~~~~~~~~~ 149 (311) T protein:vir:81 70 FAPVTAIPRKVQVTQRFSQEVKWADESRQLGVLQTMADLSGVALGRALDLIGIHGINPLTGAALSGSPAKILDTTNIVEL 149 (311) T ss_pred eeEEEEeeEEEEEeehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHhhhccccCCCCcccccccccccccceeeee Confidence 655555555666556676664433333 467789999999999999997764210000000 00 00 00000011 Q ss_pred ccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhc--ccccCceeeEEeceEEEEeCCCccCCC- Q lcl|NC_015254. 167 TGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFM--LDSDNKKFPTYMGKRVIVDDGLPAKDG- 241 (346) Q Consensus 167 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~--~~s~~~~i~~~~G~~VVvdD~~p~~~g- 241 (346) +..+.......+.++..++-+...+..+|+|||.++..|++..-- .++ ....++..++++|+||++++.||.... T Consensus 150 ~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~l~~~~~~~~~~~tl~G~Pv~~~~~i~~~~~~ 229 (311) T protein:vir:81 150 TTGTSATPDLAVEAAVGLVLGDNLSPDGVALDNTFSFMLATQRDSQGRKLYPELGFGTDVASFAGLNAAVSDTVRGGPEA 229 (311) T ss_pred cccccchHHHHHHHHHHHhhhcCCCceEEEEcHHHHHHHHhhhccCCCeeecCccccCCCceecceeEEecccccccccc Confidence 111122233456667777766555667899999999999875311 111 112234457899999999999985431 Q ss_pred ---ceEE---------EEEcCCe-eEEeecCCccceeeeecCC---------cceeEEEEeeEEeee---eeeeeecccc Q lcl|NC_015254. 242 ---VYTS---------YIFGEGA-FGLGNGEAPVPTETDREKL---------KGNDILINRQHFLLH---PRGIAWQEKS 296 (346) Q Consensus 242 ---~ytt---------~l~~~GA-i~~~~~~~~~~vE~dRd~~---------~g~~~l~~r~~~~~~---~~G~s~~~~~ 296 (346) .... ++|+.=. +.+.. ...+.+|..++.. .+...+....++.+. |..|..-... 
T Consensus 230 ~~~~~~~~~~~~~~~~~~~gDfs~~~i~~-~~~~~~~~~~~~~~~~~~~~~~~~~v~~r~~~r~d~~v~~~~a~~~l~~a 308 (311) T protein:vir:81 230 VTASTGVYRTTNPNVKAIAGDFSAFRWGV-QVSIPLELIEFGDPDGLGDLKRQNQIAIRAEVVYGIGIMSTDAFAVVRDA 308 (311) T ss_pred cccccchhcccCCccEEEEEecccEEEEE-eccceEEEeccCCCCcchhhhhcCcEEEEEEEEeccEeecccceEEEEee Confidence 1111 2222211 12222 1233455555432 122333333444333 3333332111 Q ss_pred ccCCC Q lcl|NC_015254. 297 VAGHS 301 (346) Q Consensus 297 ~~~~s 301 (346) + .+ T Consensus 309 ~--~~ 311 (311) T protein:vir:81 309 D--ES 311 (311) T ss_pred c--cC Confidence 1 11 No 98 >protein:vir:1383 Length: 421 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:314 # MgeName: phi3626 # Cross-refs: genbank:acc:NP_612835;genbank:gi:20065969;genbank:GeneID:935826 Probab=99.05 E-value=1.5e-10 Score=74.51 Aligned_cols=293 Identities=13% Similarity=0.036 Sum_probs=152.2 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCc-cc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGE-DE 79 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~-ae 79 (346) +...+.- ...+..++.-.-..+|+.+..-+.+...+.+.+.+- ......++..+.+|.+...... .. T Consensus 105 ~~~~~~~---~~ra~~t~~~gg~liP~~~~~~Ii~~~~~~~~l~~l---------~~~~~~~~~~~~~~~~~~~~~~~~~ 172 (421) T protein:vir:13 105 RGIQLSE---EERDIMSSTNNGAVIPQEFVNEFEKLKEGYPSLKEH---------CHVIPVNRNAGKMPVRAGASVDKLA 172 (421) T ss_pred hccchhH---HHhhccccCCcceecchhhHHHHHHHHHhhhhhhhh---------ceeeeccCCceEEEEeecCCcccee Confidence 1111110 111222222234456776655454444443333221 1111224556777777654322 22 Q ss_pred ccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhc Q lcl|NC_015254. 80 ILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSN 159 (346) Q Consensus 80 ~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~ 159 (346) .+.++. 
.++..+++-.+-....+..+.-+.++++...-+.-|....+.+++++.+.+..+..++..++|+...+. T Consensus 173 ~~~E~~-~~~~s~~~f~~i~~~~~k~~~~v~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~i~~~~~g~~~~~~---- 247 (421) T protein:vir:13 173 NLAKDT-ELVKAMLKTQPMAYDIDDYGLLAPIDNSLLEDSEINFLEFVNEEFAEFAVNTENAEIVKQAKAVLAEET---- 247 (421) T ss_pred eccccc-cccccccceeEEEeeeeeeEeehhhhHHHHhhhHHHHHHHHHHHHHHHHHHHhhhhHhhhhhhcccccc---- Confidence 355653 566666665555555666666677777665545556778899999999999999988888877643221 Q ss_pred ceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc-cccCceeeEEeceEEEEeCCC Q lcl|NC_015254. 160 KLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML-DSDNKKFPTYMGKRVIVDDGL 236 (346) Q Consensus 160 ~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~-~s~~~~i~~~~G~~VVvdD~~ 236 (346) ..+++.|.+++..+......-.+|+||+..+..|++..-- .++. +..++.-++++|+||++++.+ T Consensus 248 ------------~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~l~~lkd~~G~~i~~~~~~~~~~tl~G~pV~~~~~~ 315 (421) T protein:vir:13 248 ------------INDYAGLVKTINSLVPNARKRAIIVTNSDGRAYLDGLMDKQGRPLLKELSDGGDLVFKGRPVIELEES 315 (421) T ss_pred ------------ccchHHHHHHHHHhhhhhcCCCEEEEcHHHHHHHHHhhcCCCceeecCcCCCCCceecceeeEEeccc Confidence 1357889999998877777778999999999999865311 1222 122344568999999999999 Q ss_pred ccCCCceEEEEEcCC--eeEEeecCCccceeeeecCC--cceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcC Q lcl|NC_015254. 237 PAKDGVYTSYIFGEG--AFGLGNGEAPVPTETDREKL--KGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNN 312 (346) Q Consensus 237 p~~~g~ytt~l~~~G--Ai~~~~~~~~~~vE~dRd~~--~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~N 312 (346) |...+.-..++|+.- ++.+... ..+.++..++.. .+...+....+|.+.++- ..+ ...-. 
T Consensus 316 ~~~~~~~~~~~~gd~~~~~~~~~~-~~~~v~~~~~~~f~~~~~~~r~~~r~d~~~~~----~~a--------~~~~~--- 379 (421) T protein:vir:13 316 IFDVGDETKFIVSDFKTLIKFMDR-KQYLIDQSKEAGYTKNETIARIIERFDVNSPL----DKS--------SDAEK--- 379 (421) T ss_pred cccCCCceEEEEEeccccEEEEEe-cceEEEeecccccccCeeEEEEEeeecceeec----chh--------hheee--- Confidence 865543344555532 2333332 245566665543 333344444444222210 000 00000 Q ss_pred ceeeecccccceEEEEEecccccccCC----------------CCCCCCC Q lcl|NC_015254. 313 WKAVYESKNIRIVAFVHKNGVPGKKKE----------------TAPEGIK 346 (346) Q Consensus 313 W~~v~~~K~i~iv~~~~k~~~~~~~~~----------------~~~~~~~ 346 (346) -.+...+++-.++|.+..+ -+.+|-. T Consensus 380 --------~~~~~a~v~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 421 (421) T protein:vir:13 380 --------IRKFGVIVKLQEVLKSSPRSGKNKNESKEEIKEEGEATQQNE 421 (421) T ss_pred --------ecccceeeccccccCCCCcCCCCccccchheeeccccccCCC Confidence 0000111111121111111 1111111 No 99 >protein:vir:10450 Length: 344 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:184 # MgeName: phiA1122 # Cross-refs: genbank:acc:NP_848297;genbank:gi:30387487;genbank:GeneID:1733971 Probab=99.04 E-value=1.2e-11 Score=80.44 Aligned_cols=283 Identities=14% Similarity=0.082 Sum_probs=166.6 Q ss_pred ecceeeecC-C--------ceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 6 RMNLQKFAA-G--------KNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 6 ~~~~q~~~a-~--------~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) ==|+|.++. | ...---++++ |+|...|...+.+.+.|.. .-...+ + .+|+++.+|..+... 
T Consensus 1 ma~~~~~~~~n~~~~~~~~~~~~~~al~i-e~~~geV~~~f~~~s~~~~------~~~~r~-i-~~g~s~~~~~iG~~~- 70 (344) T protein:vir:10 1 MANMTGGQQLGTNQGKDVMAAGDKLALFL-KVFGGEVLTAFARTSVTTS------RHMVRS-I-SSGKSAQFPVLGRTQ- 70 (344) T ss_pred CccccccccCCcccCCccCCccchhHHHH-HHHHHHHHHHHHHHhhhcc------cceeee-e-cccceEEEEeeceeE- Confidence 113444432 1 1222234566 9999999988888877632 111111 1 269999999887652 Q ss_pred cccccCCCccccc--hhhcccceeEEEEEe-ecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_015254. 77 EDEILDDGEGALT--PGNISAAKDIARLHM-RGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITAS 153 (346) Q Consensus 77 ~ae~~~dg~~~it--~~~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~ 153 (346) +.....|+ ++. ++.+...+..-++=. .-..+.+.|+....+..|.+.+++++.+.++++.+|+.++..|..+.+. T Consensus 71 -~~~~~~G~-~l~~t~~~~~~~e~~l~ID~~~y~~~~VdDiD~~q~~~D~r~~~~~~~G~aLA~~~D~~i~~~la~~a~~ 148 (344) T protein:vir:10 71 -AAYLAPGE-NLDDIRKDIKHTEKVITIDGLLTADVLIYDIEDAMNHYDVRSEYTSQLGESLAMAADGAVLAEIAGLCNV 148 (344) T ss_pred -EEeeecCC-CCCCCCCCcccceEEEEEcchhhhhhhhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Confidence 33444554 443 245666655444433 4446889999999999999999999999999999999998877543322 Q ss_pred hhhh-------hc--ceeee--cccccccccc----HHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhhhhccc Q lcl|NC_015254. 154 GALD-------SN--KLDVS--TETGDDSYFT----GDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLIEFMLD 216 (346) Q Consensus 154 ~~~~-------~~--~~dis--~~~~~~~~~~----~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li~~~~~ 216 (346) .... .+ +.... ....++...+ ++.|.+|.++|.++. 
..-..+++.|..|..|++...+....+ T Consensus 149 ~~~~~~~~~g~~~~~~~~~~~~~~~~t~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~~~~ 228 (344) T protein:vir:10 149 ESQYNENITGLGTATVIETTQDKTTLTDQVALGKEIIAALTKARAALTKNYVPSSDRVFYCDPDSYSAILAALMPNAANY 228 (344) T ss_pred ccccccccccccccceeecccccccccchhhhHHHHHHHHHHHHHHHhhcCCCccCCEEEeChHHHHHHhhccccccccc Confidence 1100 00 11111 1111122222 355667777775543 233789999999999998866543333 Q ss_pred cc-----CceeeEEeceEEEEeCCCccCC---------Cc---------------e---EEEEEcCCeeEEeecCCccce Q lcl|NC_015254. 217 SD-----NKKFPTYMGKRVIVDDGLPAKD---------GV---------------Y---TSYIFGEGAFGLGNGEAPVPT 264 (346) Q Consensus 217 s~-----~~~i~~~~G~~VVvdD~~p~~~---------g~---------------y---tt~l~~~GAi~~~~~~~~~~v 264 (346) .. .|.|+.++|++|+.++.+|.+. |. + ...+|-+-|++..... ++.+ T Consensus 229 ~~~~~~~~G~V~~v~G~~V~~Sn~lp~~~~~~~~~~~tg~~~~~~~~~~~~~~~~~s~~~~l~~h~~A~~~v~~~-~~~~ 307 (344) T protein:vir:10 229 AALIDPEKGSIRNVMGFEVVEVPHLTAGGAGTSREGTTGQKHAFPATKSGNDKVAKDNVIGLFMHRSAVGTVKLR-DLAL 307 (344) T ss_pred ccccceeeeEEEEEeceEEEeccccccccCCcccccccCccccccCCcccceeeecceeEEEeechhhhhhhhhc-ccee Confidence 22 3678999999999999998531 10 0 0123444555554333 3457 Q ss_pred eeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCceeeecccccceEEEEEe Q lcl|NC_015254. 265 ETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHK 330 (346) Q Consensus 265 E~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k 330 (346) |..|++....+.+.+++.|+..+. 
.|+....|.|++| T Consensus 308 e~~r~~~~~~d~i~g~~~~G~~vl-----------------------------RPe~a~~v~~~~~ 344 (344) T protein:vir:10 308 ERARRANFQADQIIAKYAMGHGGL-----------------------------RPEAAGAVVFKTK 344 (344) T ss_pred ecccchhHHHHHHHHHhhccccee-----------------------------cccceEEEEeecC Confidence 777887666665555544443321 1233333444444 No 100 >protein:vir:4997 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:109 # MgeName: Sfi21 # Cross-refs: genbank:acc:NP_049971;genbank:gi:9632943;genbank:GeneID:1262106 Probab=99.02 E-value=2.4e-10 Score=73.30 Aligned_cols=284 Identities=11% Similarity=0.067 Sum_probs=148.9 Q ss_pred Cccceecc-eeeecCCceeee--eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCc Q lcl|NC_015254. 1 MIKKLRMN-LQKFAAGKNTRI--ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGE 77 (346) Q Consensus 1 ~~~~~~~~-~q~~~a~~~T~l--~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ 77 (346) ..+.++-+ .+.+.+.+.+.. .-..+|+.+..-+.+...+.+.+.+-.=+.+ ...+.-.+.+|.+....+. T Consensus 94 ~~~~l~~~~~~~~~~~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~-------~~~~~~~~~~~~~~~~~~~ 166 (397) T protein:vir:49 94 FKNLVRGRYQNLLDSKTDGSGSDAGLTIPQDIRTAINTLVRQFDSLQEYVNVEN-------VTTLTGSRVYEKWADITGL 166 (397) T ss_pred HHHHhhcchhhHHHhhhccCCccCcceecHHHHHHHHHHHHhhhhHhhhcceee-------ccCCcceEEEEeeccCCcc Confidence 11111111 111222222222 3356788887777776666665533111111 1112224556666555456 Q ss_pred ccccCCCccccchhhc-ccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhh Q lcl|NC_015254. 78 DEILDDGEGALTPGNI-SAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGAL 156 (346) Q Consensus 78 ae~~~dg~~~it~~~l-t~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~ 156 (346) +..+.|+. .++.... +-.+-....++.+.-+.+++....-+.-|....+.++++..+.+..+..+|.-. +. 
T Consensus 167 a~~v~E~~-~~~~~~~~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~ail~G~-------g~ 238 (397) T protein:vir:49 167 AKLDDEGG-QIGQNDDPKLSLIRYAIKRYAGISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILEAI-------GT 238 (397) T ss_pred eeeecccc-ccccccccceeeeEeeeeeeEeehhhHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHHhcc-------cc Confidence 66677764 4443332 333334445555555667766555455577888999999999999988665421 10 Q ss_pred hhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEEeceEEEE Q lcl|NC_015254. 157 DSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTYMGKRVIV 232 (346) Q Consensus 157 ~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~~G~~VVv 232 (346) . ......++++.|.++...+.........|+|||.++..|++..=- .++. ...++.-++++|+||++ T Consensus 239 ~---------~~~~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~l~~lkd~~g~~l~~~~~~~g~~~~l~G~pV~~ 309 (397) T protein:vir:49 239 L---------PNKPTLAKWDDIIDLQAKVDPAIKQTSLFLTNTSGFTALKKVKNAMGDYLMERDVKSPTGYSIDGFVVKE 309 (397) T ss_pred c---------cccccccCHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHhhccCCceeecccccCCCCceecceeeEE Confidence 0 012234688999999998877777788999999999999875311 1221 11234446899999988 Q ss_pred eCC--CccCCCceEEEEEcC--CeeEEeecCCccceeeeecC----CcceeEEEEeeEEeee---eeeeeeccc-cccCC Q lcl|NC_015254. 233 DDG--LPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLH---PRGIAWQEK-SVAGH 300 (346) Q Consensus 233 dD~--~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~---~~G~s~~~~-~~~~~ 300 (346) .+. +|...+.-.+++|+. .++.+...+ .+.++.++.. ..+...+....++.+. +..|.+-.- +.+.. T Consensus 310 ~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~-~~~i~~~~~~~~~~~~~~~~~~~~~r~d~~~~~~~a~~~~~~~~~~~~ 388 (397) T protein:vir:49 310 ISDRFLPNGTGGAMPLYFGDLKQAVTLFDRQ-HLSLLSTNIGGGAFETDTTKVRVIDRFDVVSTDTEAFVPASFKAIADQ 388 (397) T ss_pred ecccccccccCCceeEEEeeccceEEEEeec-ccEEEEeccccchhhcCeeeEEEEEeeccEEecccceEEEEecccccc Confidence 654 454433334566663 344444332 3455555433 2344444444444332 333333211 11112 Q ss_pred CCChHHhcC Q lcl|NC_015254. 
301 SPTNTEIEK 309 (346) Q Consensus 301 sPt~a~L~~ 309 (346) .|+..-.+. T Consensus 389 ~~~~~~~~~ 397 (397) T protein:vir:49 389 KAKLSTAGA 397 (397) T ss_pred cCcccccCC Confidence 222222222 No 101 >protein:vir:78935 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1860 # MgeName: LKD16 # Cross-refs: genbank:acc:YP_001522824;genbank:gi:158345059;genbank:GeneID:5687425 Probab=99.02 E-value=3.4e-11 Score=78.01 Aligned_cols=287 Identities=11% Similarity=0.052 Sum_probs=166.6 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) |-.- =||.+..-+...--.+++. |+|...|...+.+++.|.. .-...+ -.+|+++++|+-+... ++. T Consensus 1 ms~~--~~~t~~~~~~s~~d~al~l-e~f~geV~~af~~~s~~~~------~~~~rt--i~~g~s~~~~~iG~~~--~~~ 67 (335) T protein:vir:78 1 MSFL--NDLTRPNYAGKNADVDIHL-EEHLGIVDKHFAYTSKFAP------LMNIRD--LRGSNVVRLDRLGNVE--AKG 67 (335) T ss_pred CCcc--ccccccccccccchhhhhh-hhhhhHHHHHHHHhhhhcc------ccceee--eccceeEEEeeeeeee--ecc Confidence 4332 1333443333333346777 9999999998888877742 221222 2469999999887662 333 Q ss_pred cCCCccccchhhcccceeEEEEEe-ecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-Hhhhhhhhhh- Q lcl|NC_015254. 81 LDDGEGALTPGNISAAKDIARLHM-RGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL-NGITASGALD- 157 (346) Q Consensus 81 ~~dg~~~it~~~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L-~G~~~~~~~~- 157 (346) ..-|+ .+..+.+...+..-+|=. .--...+.|+....+.-|...+++++.+.++++.+|+.++..| ++.-...... 
T Consensus 68 ~~pG~-~l~~~~~~~~k~~itID~ll~a~~~VddlDe~~~~yDvR~e~s~~~G~aLA~~~Dq~~~~~l~~aa~~~a~~~~ 146 (335) T protein:vir:78 68 RRAGE-ELERSRVVNDKWNLTVDTLLYLRHQFDHQDEWTQSFDMRKEVAELDGQELARKFDQACLIQVIKAAAMDAPVDL 146 (335) T ss_pred cccCc-ccCCCCcccCCeEEEecceeechhhHhhHHHhhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccc Confidence 33443 455555555554333322 2233568999998888899999999999999999999777544 4432111110 Q ss_pred -----hcceeeeccccccccccHHHHHHHHH----HhCccc--c---CceEEEEchHHHHHHHhh-hhhhhcc-ccc--- Q lcl|NC_015254. 158 -----SNKLDVSTETGDDSYFTGDTFLSATY----KLGDAE--G---KLTGIAMHSQTEMNLRKQ-GLIEFML-DSD--- 218 (346) Q Consensus 158 -----~~~~dis~~~~~~~~~~~~~l~~A~~----~~GD~~--~---~~~~ivmhS~~~~~L~~~-~li~~~~-~s~--- 218 (346) .-....+..++.++.-.+..|.+|.. .|-+++ + .-.+++|.|+.|..|++. ++++... .++ T Consensus 147 ~~~~~~G~~~~~~~tg~~~~~~~~~l~~a~~~a~~~l~ekdvP~~~~~~rv~vv~P~~y~~Ll~~~~l~n~~~~~s~~~~ 226 (335) T protein:vir:78 147 EDAFSPGVLEKLDLTGLTAKEAAEKIVRMHRRVVETFIERDLGDAVYSEGLTPMSPRVFSLLLEHDKLMSVEYQATGATN 226 (335) T ss_pred CCCcCCCcceeeeeccccccccHHHHHHHHHHHHHHHHhccCCCCCCCccEEEeChHHHHHHhccccccccccccccccc Confidence 00011111122333334555555444 454221 1 237899999999999986 4555322 122 Q ss_pred ---CceeeEEeceEEEEeCCCccCCCc------------e-----EEEEEcCCeeEEeecCCccceeeeecCCcceeEEE Q lcl|NC_015254. 219 ---NKKFPTYMGKRVIVDDGLPAKDGV------------Y-----TSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILI 278 (346) Q Consensus 219 ---~~~i~~~~G~~VVvdD~~p~~~g~------------y-----tt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~ 278 (346) ++.+..++|++|+.+..+|...++ | ...++-+.|++..... ++..|..|++....+.+. T Consensus 227 ~~~~g~v~~v~Gv~V~~Sn~lP~~~~t~~~lg~a~n~~~~d~~~~~~~~~~~~Al~t~~~~-~~~~e~~~~~~~~~~~i~ 305 (335) T protein:vir:78 227 DYVKSRVAILNGVKVLETPRFATKAISAHPLGRHFNVSAEEAERQIALFLPSKTLITAQVA-PVQAKLWEDHDQFSWVLD 305 (335) T ss_pred ccccceeEEeeceEEEeeccCCCCCCccccccccCCcccccccceEEEEEecceEEEEEEE-ecccceeeccchhhHhhh Confidence 356899999999999999964321 1 3455778888876555 355788888877777776 Q ss_pred EeeEEeeeeee------eeeccccccCCCCChHH Q lcl|NC_015254. 
279 NRQHFLLHPRG------IAWQEKSVAGHSPTNTE 306 (346) Q Consensus 279 ~r~~~~~~~~G------~s~~~~~~~~~sPt~a~ 306 (346) +.+.|++.++= +..+. -.+-+... T Consensus 306 ~~~a~G~g~lRPe~a~~i~~tg----~~~~~~~~ 335 (335) T protein:vir:78 306 TFQMYNIGARRPDTAGAIELKG----IEAFDITA 335 (335) T ss_pred HHHHcCCcccCcceEEEEEecC----CCcccccC Confidence 66666544431 01100 00000000 No 102 >protein:vir:4511 Length: 409 # NCBI annotation: capsid # Family: family:all:21 # MgeID: mge:97 # MgeName: V # Cross-refs: genbank:acc:NP_599037;genbank:gi:19548995;genbank:GeneID:935211 Probab=99.00 E-value=1.1e-10 Score=75.20 Aligned_cols=281 Identities=14% Similarity=0.067 Sum_probs=143.4 Q ss_pred Cccceecce----eeecCCceeeeee--ccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccC Q lcl|NC_015254. 1 MIKKLRMNL----QKFAAGKNTRIAD--VIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDL 74 (346) Q Consensus 1 ~~~~~~~~~----q~~~a~~~T~l~d--~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l 74 (346) |...+.-.. .-+.+..++.-.+ .++|+.+..-+.+...+.+.+.+- ..+- -...+..+.+|..... T Consensus 99 ~~~~~~~~e~~~~~~~~a~~~~~~~~gg~liP~~~~~~ii~~~~~~~~l~~~------~~~~--~~~~~~~~~~~~~~~~ 170 (409) T protein:vir:45 99 GASELTSEERKALRELRAQGVAQDEKGGYTVPETFLAKVVEKMKSYGGIASV------AQIL--TTSDGRTMEWATADGT 170 (409) T ss_pred hhhhccHHHHHHHHHHhhccCccCcCCceeccHhHHHHHHHHHHhhhhhhhh------ceee--ecCCCceEEEEeeccC Confidence 222221111 1111222222222 456777766555554444444221 1110 1123556666665544 Q ss_pred CCcccccCCCccccchhhcccceeEEEEEeec-CcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH------- Q lcl|NC_015254. 75 TGEDEILDDGEGALTPGNISAAKDIARLHMRG-KAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS------- 146 (346) Q Consensus 75 ~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~-k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~------- 146 (346) ...+..+.|+. .++....+-.+-.-.-++.. 
.-+.+++....-+.-|..+.+.++++..+.+..+..+|.- T Consensus 171 ~~~~~~v~E~~-~~~~~~~~f~~~~l~~~k~~~~~i~is~ell~ds~~~l~~~i~~~la~a~~~~~~~a~l~G~G~~~~~ 249 (409) T protein:vir:45 171 SEVGVLLGENE-EAGEEDTDFGMGSLGALKMTSKIIRVSNELLQDSAIDMEAYLARRIAERIGRGEARYLIQGTGAGTPK 249 (409) T ss_pred ccccccccccc-cccccccccceeeeeeeeeeeeehhhhHHHHhccHHHHHHHHHHHHHHHHHHHHHHHhhccCCCCCcc Confidence 33344556664 44544444333322233333 3345666655444457788899999999999999876641 Q ss_pred -HHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCce--EEEEchHHHHHHHhhhh--hhhcc--cccC Q lcl|NC_015254. 147 -LNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLT--GIAMHSQTEMNLRKQGL--IEFML--DSDN 219 (346) Q Consensus 147 -L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~--~ivmhS~~~~~L~~~~l--i~~~~--~s~~ 219 (346) .+|++..... ...+.....++++.|.++...+........ +++||+.++..|++..- =.++. ...+ T Consensus 250 ~p~Gil~~~~~-------~~~~~~~~~~~~d~i~~l~~~l~~~~~~~a~~~~~~n~~~~~~l~~lkd~~G~~i~~~~~~~ 322 (409) T protein:vir:45 250 QPKGLAASVTG-------TTQTAAANAVKWQEILALKHSIDPAYRRGPKFRLAFNDNTLKLISEMEDGQGRPLWLPDIVG 322 (409) T ss_pred ccceeeecccc-------ccccccccccchHHHHHHHHhhhhhhccCCeEEEEECHHHHHHHHHhhcCCCceeeccCcCC Confidence 1222211110 112233445788999999888866554433 56889999999886531 12222 2223 Q ss_pred ceeeEEeceEEEEeCCCcc-CCCceEEEEEcC-CeeEEeecCCccceeeeecC--CcceeEEEEeeEEeeeee---eeee Q lcl|NC_015254. 220 KKFPTYMGKRVIVDDGLPA-KDGVYTSYIFGE-GAFGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHPR---GIAW 292 (346) Q Consensus 220 ~~i~~~~G~~VVvdD~~p~-~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~~---G~s~ 292 (346) +.-.+++|+||++++.||. +.|.+ +++||. .-+.+...+ ...++..+|. ..+...+....+|..+|. 
.|.- T Consensus 323 ~~~~~l~G~PV~~~~~~p~~~~~~~-~i~~Gd~~~~~i~~~~-~~~~~~~~d~~~~~~~~~~~~~~r~d~~~~~~~A~~~ 400 (409) T protein:vir:45 323 VAPASVLNVPYVIDQEIDDIGAGKK-FMFCGDFDRFIIRRVR-YMILKRLVERYAEYDQTGFLAFHRFDCILEDTSAIKA 400 (409) T ss_pred CCCceecceeeEEecCcCCccCCcc-EEEEeehhhhheeecc-ceEEEEeecccccCCcEEEEEEEEeccEeechhheEE Confidence 4456899999999999996 33444 344433 112222222 2335544554 345555666666665553 1221 Q ss_pred c--cccccC Q lcl|NC_015254. 293 Q--EKSVAG 299 (346) Q Consensus 293 ~--~~~~~~ 299 (346) . ..++++ T Consensus 401 l~~k~s~~~ 409 (409) T protein:vir:45 401 LVGKGSVGG 409 (409) T ss_pred EEeccCCCC Confidence 1 111112 No 103 >protein:vir:94673 Length: 419 # NCBI annotation: major capsid protein # Family: family:all:585 # MgeID: mge:1527 # MgeName: mu1/6 # Cross-refs: genbank:acc:YP_579208;genbank:gi:93007444;genbank:GeneID:5076792 Probab=98.99 E-value=1.4e-10 Score=74.55 Aligned_cols=281 Identities=11% Similarity=0.063 Sum_probs=150.6 Q ss_pred Cccceecceeeec----------CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecc Q lcl|NC_015254. 1 MIKKLRMNLQKFA----------AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPF 70 (346) Q Consensus 1 ~~~~~~~~~q~~~----------a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~ 70 (346) ....++..++.+. ++.++.-...++|+.+...+.........+.+ .+. .....+..+++|. T Consensus 102 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~p~~~~~~i~~~~~~~~~i~~--~~~-------~~~~~~~~~~~~~ 172 (419) T protein:vir:94 102 KRGQFQVEMRDIDPNRLLSRDAPAGTITNPNVPHLPQLVPGIVPTTPDLPLLVAD--LLD-------QQNADYNVLEYIR 172 (419) T ss_pred hhhhhhHHHHHHHHHHhhccccccccccCCcccccchhhhHHHHHHHhhhhhhhh--cce-------eeeccCCceeeee Confidence 1111111111111 22233444567888888877665433332211 111 1112344444443 Q ss_pred --------cccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 
71 --------WQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAV 142 (346) Q Consensus 71 --------~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~ 142 (346) +... +.+..+.||+ .++..+++-.+-....+..+.-..++++...-+ .+....+.++++..+.+..++. T Consensus 173 ~~~~~~~~~~~~-~~a~~v~Eg~-~~~~~~~~~~~i~~~~~k~~~~~~is~ell~d~-~~l~~~i~~~la~a~~~~~d~a 249 (419) T protein:vir:94 173 DTSGTAGAGSTW-NKAAVVPEGT-AKPQSTLSFDTITTTLKTVAHWLPITRQAADDN-SQLMGYIQGRLTYGLRFLRDRQ 249 (419) T ss_pred eccccccccccC-cccceecCCc-cccccccceeeEEeeeeeEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHH Confidence 3332 2344556774 566666666666666666666667776654433 4677779999999999999998 Q ss_pred HHH-----HHHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh---h- Q lcl|NC_015254. 143 LIA-----SLNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE---F- 213 (346) Q Consensus 143 lla-----~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~---~- 213 (346) +|. ..+|++.......... ............++.|.++...+-.......+|+||+.++..|++..--. + T Consensus 250 ii~G~G~~~p~Gi~~~~~~~~~~~-~~~~~~~t~~~~~~~l~~~~~~~~~~~~~~~~~v~n~~~~~~l~~~k~~~~~~~~ 328 (419) T protein:vir:94 250 LLNGNGSTEMQGILTTPGIGTYQQ-PKPTAPATDEPPLVDIRRAKTVAEIAGFPPDGVVVHPQDWESIELDQAPGSGVFR 328 (419) T ss_pred HHhccCcccccceecccccccccc-cccccccccchhHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHHhhcCCCcee Confidence 774 2334433322211110 01111223345688999999888776667779999999999988653221 1 Q ss_pred cc-cccCceeeEEeceEEEEeCCCccCCCceEEEEEc--CCeeEEeecCCccceeeeecC----CcceeEEEEeeEEeee Q lcl|NC_015254. 214 ML-DSDNKKFPTYMGKRVIVDDGLPAKDGVYTSYIFG--EGAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLH 286 (346) Q Consensus 214 ~~-~s~~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~--~GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~ 286 (346) ++ ...++..++++|+||++++.||.+. ++|+ ..++.+.... .+.++.++.. ..+...+....++.+. 
T Consensus 329 ~~~~~~~~~~~~l~G~pV~~~~~~~~~~-----~~~gd~~~~~~~~~~~-~~~v~~~~~~~~~~~~~~~~~r~~~r~d~~ 402 (419) T protein:vir:94 329 VIANVQGEATPRIWGLNVVSTVAIAQGT-----ALVGGFRQGATLWSRQ-GITVLMTDSHADFFTANTLVILAEFRANLA 402 (419) T ss_pred ecCCcccCCCccccceeeEEcCCCCCcc-----EEEeeccceEEEEEec-ceEEEEeccccchhhcCcEEEEEEEeeccE Confidence 11 1224456789999999999999654 2332 1222222222 3345544433 2455666666666554 Q ss_pred ee---eeeeccccccCCCCC Q lcl|NC_015254. 287 PR---GIAWQEKSVAGHSPT 303 (346) Q Consensus 287 ~~---G~s~~~~~~~~~sPt 303 (346) |+ +|..-.- ...|| T Consensus 403 v~~~~a~~~~~~---~aa~~ 419 (419) T protein:vir:94 403 VYQPKAFVRVTF---AAATT 419 (419) T ss_pred EeccccEEEEEe---ccCCC Confidence 43 3332111 12355 No 104 >protein:vir:6324 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:132 # MgeName: phiKMV # Cross-refs: genbank:acc:NP_877471;genbank:gi:33300843;uniprot:Q7Y2D3;genbank:GeneID:1482613 Probab=98.98 E-value=1e-10 Score=75.36 Aligned_cols=284 Identities=13% Similarity=0.092 Sum_probs=165.3 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) |-.- =||.+..-+...--.+++. |+|..-|...+.+.+.|.. .-...++ .+|+++++|+.+.. .++. T Consensus 1 ms~~--~~~tr~~~~~s~~d~al~l-e~f~geV~~af~~~s~~~~------~~~~rti--~~g~s~~~~~iG~~--~~~~ 67 (335) T protein:vir:63 1 MSFL--NDLTRPNYAGKNADVDIHL-EEHLGIVDKHFAYTSKFAP------LMNIRDL--RGSNVVRLDRLGNV--EAKG 67 (335) T ss_pred CCCc--ccchhhhcccccchhheeh-hhhhhhHHHHHHhhhhhcc------ccceeee--ccceeEEEeeeeee--eeec Confidence 4322 1333333222333335666 8888888777777766632 2222222 46999999999876 3444 Q ss_pred cCCCccccchhhcccceeEEEEEe-ecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-Hhhhhhhhh-- Q lcl|NC_015254. 
81 LDDGEGALTPGNISAAKDIARLHM-RGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL-NGITASGAL-- 156 (346) Q Consensus 81 ~~dg~~~it~~~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L-~G~~~~~~~-- 156 (346) ..-|+ .+........+.+-+|=. .--...+.|+....+.-|...+++++++.+.++.+|..++..| ++.-..... T Consensus 68 ~~pG~-~l~~~~~~~~k~~itVD~ll~a~~~I~dlDe~~~~yDvRse~s~e~G~aLA~~~D~~~~~~i~~aa~~~a~~~~ 146 (335) T protein:vir:63 68 RRAGE-ELERSRVVNDKWNLTVDTLLYLRHQFDHQDEWTQSFDMRKEVAELDGQELARKFDQACLIQVIKAAAMDAPVDL 146 (335) T ss_pred ccCCc-CcCCCCccccceEEEecceeechhhhhhHHHHhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHhhccccCcccc Confidence 44443 455455555543333322 1223568899888888899999999999999999999887544 332221111 Q ss_pred -----hh--cceeeeccccccccccHHHHH----HHHHHhCccc--c---CceEEEEchHHHHHHHhh-hhhhhcc-ccc Q lcl|NC_015254. 157 -----DS--NKLDVSTETGDDSYFTGDTFL----SATYKLGDAE--G---KLTGIAMHSQTEMNLRKQ-GLIEFML-DSD 218 (346) Q Consensus 157 -----~~--~~~dis~~~~~~~~~~~~~l~----~A~~~~GD~~--~---~~~~ivmhS~~~~~L~~~-~li~~~~-~s~ 218 (346) .+ ...++++. ++.=.++.|. +|.++|-+++ + .-.+++|.|++|..|++. ++++... .++ T Consensus 147 ~~~~~~G~~~~~~~tg~---~~~~~~~~l~~a~~~a~~~L~e~dVP~~~~~dr~~vv~P~~y~~Ll~~~~l~n~~~~~s~ 223 (335) T protein:vir:63 147 EDAFSPGVLEKLDLTGL---TAKQAADKIVRMHRRVVETFIDRDLGDAVYSEGLTPMSPRVFSLLLEHDKLMNVEYQATG 223 (335) T ss_pred CCCcCCCcceeeeeccC---cccccHHHHHHHHHHHHHHHHhccCCCcccCceEEEeChHHHHHHhcccccccccccccc Confidence 01 11122222 2111344444 6677776544 2 227899999999999986 4555321 122 Q ss_pred ------CceeeEEeceEEEEeCCCccCCC-----------------ceEEEEEcCCeeEEeecCCccceeeeecCCccee Q lcl|NC_015254. 219 ------NKKFPTYMGKRVIVDDGLPAKDG-----------------VYTSYIFGEGAFGLGNGEAPVPTETDREKLKGND 275 (346) Q Consensus 219 ------~~~i~~~~G~~VVvdD~~p~~~g-----------------~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~ 275 (346) ++.+..++|++|+.+..+|...+ +...+++-+.|++..... 
++..|..|++....+ T Consensus 224 ~~~~~~~g~v~~v~Gv~V~~sn~lP~~~~t~~~lg~a~n~~~~d~~~~~~~~~~~~Al~t~~~~-~vt~e~~~~~~~~~~ 302 (335) T protein:vir:63 224 ATNDYVKSRVAILNGVKVLETPRFATKAIAAHPLGRHFNVSAEESERQIALFLPSKTLITAQVA-PVQAKLWEDNEKFSW 302 (335) T ss_pred ccccccCceeEEeeceEEEeeccCCCCCcccccccccCCccccccceeEEEEEecceEEEEEEe-ecccceeeccchhhH Confidence 35689999999999999985432 124566778888876555 456788888877777 Q ss_pred EEEEeeEEeeeeee------eeeccccccCCCCCh Q lcl|NC_015254. 276 ILINRQHFLLHPRG------IAWQEKSVAGHSPTN 304 (346) Q Consensus 276 ~l~~r~~~~~~~~G------~s~~~~~~~~~sPt~ 304 (346) .+.+.+.|++.++= +..+ ..+.-+-|- T Consensus 303 ~i~~~~a~G~g~lRPe~a~~i~~t--g~~~~~~~~ 335 (335) T protein:vir:63 303 VLDTFQMYNIGARRPDTAGAIELK--GIGAFDITA 335 (335) T ss_pred HhHHHHHcCCcccccceEEEEEEc--CCCceeecC Confidence 77666666544431 0110 000001111 No 105 >protein:vir:99675 Length: 324 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1523 # MgeName: VP4 # Cross-refs: genbank:acc:YP_249589;genbank:gi:68299740;genbank:GeneID:3799990 Probab=98.97 E-value=5.4e-11 Score=76.88 Aligned_cols=262 Identities=11% Similarity=0.073 Sum_probs=145.4 Q ss_pred cccchhHHHHhhCCCcEEEecccccCCCcccccCCCcccc--chhhcccceeEEEEE-eecCcceechHHHhhhcchHHH Q lcl|NC_015254. 49 ISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGAL--TPGNISAAKDIARLH-MRGKAWRTNDLAKALSGDDPMR 125 (346) Q Consensus 49 ~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~i--t~~~lt~~~~~a~~~-~~~k~~~~tD~a~~~~g~dp~~ 125 (346) +++ .+ .+|+++.+|+.+.. ......-|+ .| +++.+...+..-+|= ..-..+.+.|+....+..|++. 
T Consensus 1 ~vr------~i-~~g~s~~~~~iG~~--~~~~~~~G~-~l~~~~~~~~~~e~~itID~~l~~~~~VdDiD~~qa~~Dlr~ 70 (324) T protein:vir:99 1 MTR------TI-TSGKSAQFPVMGRT--KARYLKQGQ-SLDDGREDIKHTEKVITIDGLLTTDVLIYDIEDAMNHYDVRS 70 (324) T ss_pred Cee------ee-ecCceEEEeeeeee--EeccccCCC-CcCCCcCCcCcccEEEEecchhhhhhhhhhHHHHhcCccchh Confidence 111 11 26999999988765 333444453 44 456677776654443 3444588999999999999999 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhh----------cceeeecccccccccc----HHHHHHHHHHhCccc-- Q lcl|NC_015254. 126 AIGDLVVEYWNRRRQAVLIASLNGITASGALDS----------NKLDVSTETGDDSYFT----GDTFLSATYKLGDAE-- 189 (346) Q Consensus 126 ~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~----------~~~dis~~~~~~~~~~----~~~l~~A~~~~GD~~-- 189 (346) +++++.+.++++.+|+.++..+.++........ ....+++.. ..+..+ ++.|.+|.++|-.+. T Consensus 71 e~s~~~G~aLA~~~Dq~i~~~~a~~~~~~a~~~~~~~~~~g~~~~~~~~~~~-~~~~~~~~~~~dai~~a~~~Lde~~VP 149 (324) T protein:vir:99 71 EYSTQMGEALAMAADVANYAEMAKLVNSRKETTNENIEGLGAASLVKITGKK-EDPAKYGTQVIQALTYARAAFAKKYIP 149 (324) T ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccCCcccCCccceecccccc-cccccCHHHHHHHHHHHHHHHhhcCCC Confidence 999999999999999999877654443221110 011122221 122233 455566667774433 Q ss_pred cCceEEEEchHHHHHHHhhhhhhhcccc-----cCceeeEEeceEEEEeCCCccCCCc---------------------- Q lcl|NC_015254. 190 GKLTGIAMHSQTEMNLRKQGLIEFMLDS-----DNKKFPTYMGKRVIVDDGLPAKDGV---------------------- 242 (346) Q Consensus 190 ~~~~~ivmhS~~~~~L~~~~li~~~~~s-----~~~~i~~~~G~~VVvdD~~p~~~g~---------------------- 242 (346) ..-..++|.|..|..|++...+....+. .+|.|+.++|++|+.|+.+|...+. 
T Consensus 150 ~~gR~~vv~P~~y~~Ll~~~~~~~~~~~~~~~~~~G~V~~i~Gf~V~~Sn~lp~~~~t~~~~a~~~~~~~~~~~~~~~~~ 229 (324) T protein:vir:99 150 AGDRTFYTDPDTYSAILAALMPNAANYAALIDPETGNIRNVMGFEVVETPHMTAQMVTNPTDAFDGTGHIFPATGDSTTT 229 (324) T ss_pred CCCCEEEeChHHHHHHhhcccccccccccccceecceEEEEeceEEEecCCccccccccccccccccccccccccccccc Confidence 3447899999999999877554432221 2467899999999999999975321 Q ss_pred --eE-------EEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeee---e---eeeccccccCCCCChHH- Q lcl|NC_015254. 243 --YT-------SYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPR---G---IAWQEKSVAGHSPTNTE- 306 (346) Q Consensus 243 --yt-------t~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~---G---~s~~~~~~~~~sPt~a~- 306 (346) |. ..+|-+-|++..... .+.+|..|++....+.+...+.|+..++ + +.+......+..|.-.+ T Consensus 230 ~ky~~d~~~~~gl~~~~~a~~tv~~~-~~~~e~~~~~~~~~d~i~~~~a~G~~~lRPe~a~~v~l~~~~~~~~~~~~~~~ 308 (324) T protein:vir:99 230 GKMTVGADNVVGLFVHRSAVATLKLK-DMALERARRPEYQADQIIAKYAMGHGGLRPEAVGAIIFEDGETPAVAPDVITG 308 (324) T ss_pred cccccccCceeEEEEehhheEEEeee-cceecceechhhHHHhhhhhhhhcCcccccceEEEEEEccCccccccchhhhh Confidence 21 123333444443322 3457778888776666666665554432 1 22222111122222110 Q ss_pred hcCCcCceeeecccccceEEEEEeccc Q lcl|NC_015254. 307 IEKGNNWKAVYESKNIRIVAFVHKNGV 333 (346) Q Consensus 307 L~~~~NW~~v~~~K~i~iv~~~~k~~~ 333 (346) ++...- + +.-+-|+.+ T Consensus 309 ~~~~~~------~-----~~~~~~~~~ 324 (324) T protein:vir:99 309 VASFAA------P-----ASTRAKSSA 324 (324) T ss_pred hccccC------c-----ccceeeecC Confidence 000000 0 000000000 No 106 >protein:vir:81160 Length: 371 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:1892 # MgeName: Geobacillus virus E2 # Cross-refs: genbank:acc:YP_001285811;genbank:gi:148747732;genbank:GeneID:5247203 Probab=98.94 E-value=2.8e-10 Score=72.97 Aligned_cols=269 Identities=10% Similarity=0.066 Sum_probs=140.9 Q ss_pred Cccceec-ceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCC-Ccc Q lcl|NC_015254. 
1 MIKKLRM-NLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLT-GED 78 (346) Q Consensus 1 ~~~~~~~-~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~-g~a 78 (346) .++.++- ...-++.+ ++.-...++|+.+...+.+...+.+.+.+-.=+ ...++...++|+...-+ +.+ T Consensus 79 ~~~~l~~~~~~a~~~~-t~~~gg~~vP~~~~~~ii~~~~~~s~i~~~~~~---------~~~~~~~~~~~~~~~~~~~~a 148 (371) T protein:vir:81 79 FVNHIRTRFRNAMSEG-SNQDGGYTVPQDIQTRINELRESKDALQNLITV---------EPVTTLSGSRVFKKRSQQTGF 148 (371) T ss_pred HHHHHHHHHHHhhccC-CCccCceeecHhHHHHHHHHHHhhhhhhhhcee---------eeccCCceeEEEEeecCCcce Confidence 1111110 11112222 223345678888877777776666655432111 11123334444333222 344 Q ss_pred cccCCCccccc-hhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh Q lcl|NC_015254. 79 EILDDGEGALT-PGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD 157 (346) Q Consensus 79 e~~~dg~~~it-~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~ 157 (346) ..+.||. .++ ....+-.+-....++.+.-..++++...-+.-|....+.+++++.+.+..+..++.... . . T Consensus 149 ~~v~Eg~-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~a~~~~~~~~i~~g~g-~------~ 220 (371) T protein:vir:81 149 VEVAEGA-AIGEKATPQFTLLQYQVKKYAGFFRVTNELLNDSTEAIVNTLVRWIGDESRVTRNGLIINVLN-T------K 220 (371) T ss_pred eeecccc-ccccccccceeeEEeeeeEEEEeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHHhhcc-c------c Confidence 4567774 333 34455555555555666666777776554445677889999999999998887665321 0 0 Q ss_pred hcceeeeccccccccccHHHHHHHHHH-hCccccCceEEEEchHHHHHHHhhhhh--hhccc--ccCceeeEEeceEEEE Q lcl|NC_015254. 158 SNKLDVSTETGDDSYFTGDTFLSATYK-LGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLD--SDNKKFPTYMGKRVIV 232 (346) Q Consensus 158 ~~~~dis~~~~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~--s~~~~i~~~~G~~VVv 232 (346) ......+++.+..+... +-.....-.+|+||+.++..|++..-- .++.. 
..++.-++++|+||++ T Consensus 221 ----------~~~~~~~~~~i~~~~~~~l~~~~~~~a~~vmn~~~~~~L~~lkd~~g~~l~~~~~~~~~~~~l~G~pV~~ 290 (371) T protein:vir:81 221 ----------AKTAIADLDGLKQIINVQLDPVFRSTSSVIVNQDAFNWLDTLKDQNGQYLLQPSISSPTGRQLLGLPVVI 290 (371) T ss_pred ----------cccccccHHHHHHHHHhhcchhhhcCCEEEEcHHHHHHHHHhhccCCCeeeecccCCCCCceecceeEEE Confidence 01123467788877754 434444557899999999999875321 12221 1234457899999999 Q ss_pred eCCCccCCC-------ceEEEEEcC--CeeEEeecCCccceeeeecCC----cceeEEEEeeEEee---eeeeeeecccc Q lcl|NC_015254. 233 DDGLPAKDG-------VYTSYIFGE--GAFGLGNGEAPVPTETDREKL----KGNDILINRQHFLL---HPRGIAWQEKS 296 (346) Q Consensus 233 dD~~p~~~g-------~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~----~g~~~l~~r~~~~~---~~~G~s~~~~~ 296 (346) +|.+|.... ....++||. -++.+.. +..+.++.++... .+...+....++.+ +|..|.+-.-+ T Consensus 291 ~~~~~~~~~~~~~~~~~~~~i~~Gd~~~~~~~~~-~~~~~i~~~~~~~~~f~~~~v~~~~~~r~d~~~~~~~a~~~~~~~ 369 (371) T protein:vir:81 291 VSNKVLANRVDGGTGAQFAPIIVGDLKEAVVMFD-RQRTEIMSSNVAMDAFETDATLWRAIERMDVKMRDDEAFVFGEVQ 369 (371) T ss_pred ecccccCccccccccCCcceEEEEehhceEEEEe-ecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEEe Confidence 999985421 122344442 1222222 2233444444332 34455555555543 33444443222 Q ss_pred cc Q lcl|NC_015254. 297 VA 298 (346) Q Consensus 297 ~~ 298 (346) .+ T Consensus 370 ~A 371 (371) T protein:vir:81 370 LA 371 (371) T ss_pred cC Confidence 22 No 107 >protein:vir:8420 Length: 477 # NCBI annotation: gp15 # Family: family:all:21 # MgeID: mge:155 # MgeName: Omega # Cross-refs: genbank:acc:NP_818316;genbank:gi:29566752;genbank:GeneID:1260033 Probab=98.93 E-value=8.7e-11 Score=75.73 Aligned_cols=293 Identities=12% Similarity=0.028 Sum_probs=136.7 Q ss_pred Cccceec---ceeeecCCceeee--eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHH-HhhCCCcEEEecccccC Q lcl|NC_015254. 1 MIKKLRM---NLQKFAAGKNTRI--ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDE-LAKSGGNMINMPFWQDL 74 (346) Q Consensus 1 ~~~~~~~---~~q~~~a~~~T~l--~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~-l~~~~G~ti~~P~~~~l 74 (346) .+...+. 
..+...+..+|.- ..++.||.+..-+.+.+.+.+.+.+- +.. .+..++..+.+|....- T Consensus 140 ~~~~~~~~~~~~~~~~~~~~~~~~gg~lv~~~~~~~~ii~~l~~~~~i~~~--------~~~~~~~~~~~~~~ip~~~~~ 211 (477) T protein:vir:84 140 SDKEIRKIAKVGEEYRDLDRNGGTGGYAVPPLWMMNRFIELARAGRTYANL--------CPTEPLPGGTSSINIPKILTG 211 (477) T ss_pred hhhhHHHHHHhhhhhccccccCCCcceeeccchhHHHHHHHhhhcchHHHh--------hceeeecCCcceeEEEEEecC Confidence 1111100 0111112222222 24677887665555544443333221 000 11234556888875421 Q ss_pred CCcccccCCCcccc-----chhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH---- Q lcl|NC_015254. 75 TGEDEILDDGEGAL-----TPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIA---- 145 (346) Q Consensus 75 ~g~ae~~~dg~~~i-----t~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla---- 145 (346) ...+.-+.|+. .+ +..+++-++-....++.+.-..+++..-.-+.-+..+.+.++++..+.+..+..+|. T Consensus 212 ~~~a~~~~Eg~-~~~~~~~~~s~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~~~l~G~Gt 290 (477) T protein:vir:84 212 TSTAIQAADNA-ALTAPSAHEVDLTDGFVQANVKTIAGQQGIAIQLLDQAAVSVDEFVFRDLAADYANKLNVQVISGTGS 290 (477) T ss_pred cceeeeeccCc-ccccccccccccceeeEEEeeeeEEeeeHHHHHHHhccchhHHHHHHHHHHHHHHHHHHHHHhccCCC Confidence 11122344542 22 223344444444455555555666665554555778889999999999999997763 Q ss_pred --HHHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCcccc-CceEEEEchHHHHHHHhhhhh--hhccccc-- Q lcl|NC_015254. 146 --SLNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEG-KLTGIAMHSQTEMNLRKQGLI--EFMLDSD-- 218 (346) Q Consensus 146 --~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~-~~~~ivmhS~~~~~L~~~~li--~~~~~s~-- 218 (346) ..+|++........+..-...+.......++.+.++......... 
...+|+|||.++..|++..-- .++...+ T Consensus 291 ~~~p~Gi~~~~~~~~~~~~~~~~t~~~~~~~~~~i~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~lkd~~G~~l~~~~~~ 370 (477) T protein:vir:84 291 NNQVVGVRATAGITQVTATSAGSALEKHQIIYQKIADAIQRVHTSRFLEPEVIVMHPRRWASFHAIFAGDDRPLIVPSGP 370 (477) T ss_pred CCccceeeeccccccccccccccchhhHHHHHHHHHHHHhhccccccCCccEEEEcHHHHHHHHHhhccCCCeeeecCcc Confidence 223443322221111100000001111234556666665544433 345899999999998875311 1221111 Q ss_pred -------------CceeeEEeceEEEEeCCCccCCCc---eEEEEEcCCe-eEEeecCCccceeeeecCCc--ceeEEEE Q lcl|NC_015254. 219 -------------NKKFPTYMGKRVIVDDGLPAKDGV---YTSYIFGEGA-FGLGNGEAPVPTETDREKLK--GNDILIN 279 (346) Q Consensus 219 -------------~~~i~~~~G~~VVvdD~~p~~~g~---ytt~l~~~GA-i~~~~~~~~~~vE~dRd~~~--g~~~l~~ 279 (346) .+..++++|+||++++.||.+.|. ...++|+.-+ +.+.. ..+.++.+++... +...+.. T Consensus 371 ~~~~~~~~~~~~~~~~~~~l~G~pVv~s~~~p~~~~~~~d~~~i~~gd~~~~~i~~--~~~~~~~~~~~~~~~~~~~~~v 448 (477) T protein:vir:84 371 GFNNLGVLTEVASQRVVGQMHGLPVVTDPTLPTTLGTGTDQDVIHVLRASDLALFE--SSVRMRALQETRAENLSVLLQV 448 (477) T ss_pred cccccccccccccccccchhcccceEecCcccccccccCCcceEEEEEeceEEEEe--eceeEEeccccccccceeeeee Confidence 123468999999999999975432 2345554322 33332 2233333333322 2222211 Q ss_pred e--eEEe--eeeeeeeeccccccCCCCChH Q lcl|NC_015254. 280 R--QHFL--LHPRGIAWQEKSVAGHSPTNT 305 (346) Q Consensus 280 r--~~~~--~~~~G~s~~~~~~~~~sPt~a 305 (346) . ..+. -||.-|.-. 
...+-..||.+ T Consensus 449 ~~~~~~~~~r~~~afv~~-t~~~~~~~~~~ 477 (477) T protein:vir:84 449 YGYLAFTAARFPQSVVEI-GGTALTAPTFA 477 (477) T ss_pred hhhhhhhhhccccceEEe-ecccccccccC Confidence 1 1111 155544421 11122368887 No 108 >protein:vir:3991 Length: 404 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:319 # MgeName: BK5-T # Cross-refs: genbank:acc:NP_116499;genbank:gi:14251132;genbank:GeneID:921252 Probab=98.92 E-value=7.1e-10 Score=70.73 Aligned_cols=285 Identities=11% Similarity=0.050 Sum_probs=147.1 Q ss_pred Cccc-eecc--eeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCC-- Q lcl|NC_015254. 1 MIKK-LRMN--LQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLT-- 75 (346) Q Consensus 1 ~~~~-~~~~--~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~-- 75 (346) +... ..++ -+.-....++.-....+|+.+.+.+.+...+.+.+.+-. .....++...++|+|..-+ T Consensus 101 ~~~~~~~~~~~e~~a~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~---------~~~~~~~~~~~~~~~~~~~~~ 171 (404) T protein:vir:39 101 VRNPMAFLNTVSSKTETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQYV---------RVESVSTSNGSRVYEKWTDVT 171 (404) T ss_pred HhcchhhhhhhhhhhhhcccccCCceeccHHHHHHHHHHHHhhhhHHhhc---------ceeeccCCcceEEEEeecCCc Confidence 1000 0000 000001111222345789988888877766666553321 1111234455666664332 Q ss_pred CcccccCCCccccc-hhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhh Q lcl|NC_015254. 76 GEDEILDDGEGALT-PGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASG 154 (346) Q Consensus 76 g~ae~~~dg~~~it-~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~ 154 (346) +.+..+.|+. .++ .++.+-.+-....++.+.-..++++...-+.-|....+.+++++.+.+..++.+|.... .. 
T Consensus 172 ~~a~~v~Eg~-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~~il~g~g----~~ 246 (404) T protein:vir:39 172 PLTVMDAEDG-KIPDLDNPRLTIIKYLIKRYAGIITATNTLLKDTAENILAWLSSWIAKKVVVTRNQAIIAAMG----TV 246 (404) T ss_pred cceeeecCcc-ccccccccceeeEEeeeeeEEeeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHHhccc----cc Confidence 3344567774 444 34556566666666777777788776665666788889999999999999997765321 00 Q ss_pred hhhhcceeeeccccccccccHHHHHHHHHH-hCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEEeceE Q lcl|NC_015254. 155 ALDSNKLDVSTETGDDSYFTGDTFLSATYK-LGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTYMGKR 229 (346) Q Consensus 155 ~~~~~~~dis~~~~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~~G~~ 229 (346) .......+++.+.+++.. +......-.+|+||+.++..|++..-- .++. ...++.-++++|+| T Consensus 247 ------------~~~~~~~~~~~i~~~~~~~~~~~~~~~a~~v~n~~~~~~L~~lkd~~G~~l~~~~~~~~~~~~l~G~p 314 (404) T protein:vir:39 247 ------------PKKPTIAKFDDVITMINTSVDPAIIATSSLLTNQSGLNKLALVKTAEGKYLLEPDPTKPNSYLIKGKK 314 (404) T ss_pred ------------ccccccccHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHhhccCCceeeccCcCCCCcceeccee Confidence 011223567888888765 444445567899999999999964211 1221 11234446899999 Q ss_pred EEEeCC--CccCCCceEEEEEcC--CeeEEeecCCccceeeeecC----CcceeEEEEeeEEeeeeeeeeeccccccCCC Q lcl|NC_015254. 230 VIVDDG--LPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLHPRGIAWQEKSVAGHS 301 (346) Q Consensus 230 VVvdD~--~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~s 301 (346) |++.+. +|.....-..++++. .++.+...+ .+.++.++.. ..+...+....+|.+.++ T Consensus 315 V~~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~~~-~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~------------- 380 (404) T protein:vir:39 315 VIVVADRWLPNSGSTVYPLYYGDMSQAITLFDRE-NMSLLPTNIGAGAFETDTTKIRVIDRFDVKTT------------- 380 (404) T ss_pred EEEecccccCccCCCccEEEEEeccccEEEEeec-ceEEEEeccchhhhhhceeeEEEEeeeccEEe------------- Confidence 999765 444322222344443 233333322 3445555543 133344444444443332 Q ss_pred CChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCCCCC Q lcl|NC_015254. 
302 PTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEGIK 346 (346) Q Consensus 302 Pt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~~~ 346 (346) +++.+..+.+++-.++.| ..++|| T Consensus 381 ----------------~~~a~~~~~~~~~a~~~~-----~~~~~~ 404 (404) T protein:vir:39 381 ----------------DSEALVAGSFTAIADQVG-----NFTAGK 404 (404) T ss_pred ----------------cccceEEEEeeccccCCC-----CCCCCC Confidence 222222222222111111 112222 No 109 >protein:vir:7409 Length: 408 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:146 # MgeName: P335 # Cross-refs: genbank:acc:NP_839926;genbank:gi:30089896;genbank:GeneID:1260683 Probab=98.92 E-value=1e-09 Score=69.90 Aligned_cols=291 Identities=12% Similarity=0.113 Sum_probs=145.9 Q ss_pred Cccceecc---eeeec--C--CceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIKKLRMN---LQKFA--A--GKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~~~~~~---~q~~~--a--~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) +...++.+ ++... + ..++.-.-.++|+.+...+.+...+.+.+.+-.=..+ .+.+...+.+|.+.. T Consensus 97 ~~~~~~~~~~~~~~~~~~a~~~~~~~~gg~~vP~~~~~~Ii~~~~~~~~l~~~~~~~~-------~~~~~~~~~~~~~~~ 169 (408) T protein:vir:74 97 FVNMVRNPMAFLNTVSSKTETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQYVRVES-------VSTSSGSRVYEKWTD 169 (408) T ss_pred HHHHHhcchhhhhhhhhhhhcccccCCCceeechhHhhHHHHHHhhhcchhhhcceee-------ccCCcceEEEEeecC Confidence 00000000 00011 1 1111112456788888888777666665533110000 111222445555554 Q ss_pred CCCcccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhh Q lcl|NC_015254. 74 LTGEDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITA 152 (346) Q Consensus 74 l~g~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~ 152 (346) -+..+..+.|+. .++. .+.+-.+-....++.+....++++...-+.-|....+.++|++.+.+..+..+|.-. 
+ T Consensus 170 ~~~~~~~v~E~~-~~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~~il~G~----G 244 (408) T protein:vir:74 170 VTPLKAMDEEDG-KIPDLDNPRLTIIKYLIKRYAGIITATNTLLKDTAENILAWLSSWIAKKVVVTRNQAIIAAM----G 244 (408) T ss_pred Cccccccccccc-ccccccccceeeEEeeeeeEEeeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhcc----c Confidence 433444566664 4442 445656666666677777778887666566678889999999999999998666421 1 Q ss_pred hhhhhhcceeeeccccccccccHHHHHHHHHH-hCccccCceEEEEchHHHHHHHhhhhh--hhcccc--cCceeeEEec Q lcl|NC_015254. 153 SGALDSNKLDVSTETGDDSYFTGDTFLSATYK-LGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLDS--DNKKFPTYMG 227 (346) Q Consensus 153 ~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~s--~~~~i~~~~G 227 (346) .. .......+++.+.+++.. +-.....-.+|+||+..+..|++..-- .++... .++.-++++| T Consensus 245 ~~------------~~~~~~~~~~~i~~~~~~~l~~~~~~~a~~v~n~~~~~~l~~lkd~~G~~l~~~~~~~~~~~~l~G 312 (408) T protein:vir:74 245 TV------------PKKPTIANFDDVITMINTSVDPAIIATSSLLTNQSGLNKLALVKTAEGKYLLEPDPTKPNSYLIKG 312 (408) T ss_pred cc------------ccccccccHHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHhhcCCCceEeccCcCCCCCceecc Confidence 00 011223577888888753 444444556899999999999875311 122211 1233468999 Q ss_pred eEEEEeC--CCccCCCceEEEEEcC--CeeEEeecCCccceeeeecC----CcceeEEEEeeEEeeeeeeeeeccccccC Q lcl|NC_015254. 228 KRVIVDD--GLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLHPRGIAWQEKSVAG 299 (346) Q Consensus 228 ~~VVvdD--~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~~~G~s~~~~~~~~ 299 (346) +||++++ .||.......+++++. .++.+...+ .+.++.++.. 
..+...+....++.+.++ T Consensus 313 ~pV~~~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~~-~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~~~----------- 380 (408) T protein:vir:74 313 KQVIVVADRWLPNSGSTVYPLYYGDMSQAITLFDRE-NMSLLPTNIGAGAFETDTTKIRVIDRFDVKAT----------- 380 (408) T ss_pred eeeEEecCcccccccCCcceEEEEehhccEEEEEec-ceEEEEeccccchhhcceeeEEEEEeeCcEEe----------- Confidence 9999865 4675443334455553 334443322 3455555432 233444444444443332 Q ss_pred CCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCCCC Q lcl|NC_015254. 300 HSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEGI 345 (346) Q Consensus 300 ~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~~ 345 (346) +++.+..+.+......++.+...+-+-. T Consensus 381 ------------------~~~a~~~~~~~~~~~~~~~~~~~~~~~~ 408 (408) T protein:vir:74 381 ------------------DSEALVAGSFTAIADQVGNFKTTTSTAV 408 (408) T ss_pred ------------------cccceEEEEeecccCCCCCCCCCccccC Confidence 1122222222222111111111111111 No 110 >protein:vir:3845 Length: 395 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:322 # MgeName: phi adh # Cross-refs: genbank:acc:NP_050151;swissprot:trembl:q9t1f6;genbank:gi:9633043;uniprot:Q9T1F6;genbank:GeneID:1262163 Probab=98.89 E-value=7.4e-10 Score=70.64 Aligned_cols=285 Identities=11% Similarity=0.054 Sum_probs=142.4 Q ss_pred Cccceecceeee-cCCceee-eeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccc--cCCC Q lcl|NC_015254. 1 MIKKLRMNLQKF-AAGKNTR-IADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQ--DLTG 76 (346) Q Consensus 1 ~~~~~~~~~q~~-~a~~~T~-l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~--~l~g 76 (346) |...+.=.+..+ +++.++. -.-.++|+.|...+.+...+.+.+.+-.-. ...++....+|+|. 
..++ T Consensus 93 ~~~~~~~~~~~~~~~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~---------~~~~~~~~~~~~~~~~~~~~ 163 (395) T protein:vir:38 93 MKNQFVKDFKNLVTSGTTGTGNAGLTIPEDIQLQIRTLTRSFTSLESLANV---------ENVTTSHGSRVYEKLADITP 163 (395) T ss_pred HHHHHHHHHHHHHhhccCccCCCceecchhHhhHHHHHHHhhcchhhhcce---------eeccCCcceEEEEeeccCCc Confidence 222221112222 2222222 234578888877777766665555332111 11234444555554 3333 Q ss_pred cccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhh Q lcl|NC_015254. 77 EDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGA 155 (346) Q Consensus 77 ~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~ 155 (346) .+..+.|+. .++. ...+-.+-.-..++.+.-..++++...-+.-|..+.+.+++++.+.+..+..++.... . T Consensus 164 ~a~~v~E~~-~~~~~~~~~f~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~il~g~g----~-- 236 (395) T protein:vir:38 164 LKDLDDESA-LIGDNDDPELTVVKYLIHRYAGITTVTNTLLKDTVDNIIQWLVNWAAKKDVVTRNAKILEVMG----K-- 236 (395) T ss_pred ccccccccc-ccccccccceeeEEeeeeeeEeehhhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc----c-- Confidence 344456664 3332 2334344344445555556666665544555778889999999999999887664321 0 Q ss_pred hhhcceeeeccccccccccHHHHHHHHHH-hCccccCceEEEEchHHHHHHHhhhhhh--hc--ccccCceeeEEeceEE Q lcl|NC_015254. 156 LDSNKLDVSTETGDDSYFTGDTFLSATYK-LGDAEGKLTGIAMHSQTEMNLRKQGLIE--FM--LDSDNKKFPTYMGKRV 230 (346) Q Consensus 156 ~~~~~~dis~~~~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~--~~s~~~~i~~~~G~~V 230 (346) .. ......+++.+.+++.. 
+......-.+|+||+.++..|++..--+ ++ ....++.-.+++|+|| T Consensus 237 -~~---------~~~~~~~~~~i~~~~~~~l~~~~~~~a~~v~n~~~~~~L~~lkd~~G~~l~~~~~~~~~~~~l~G~pV 306 (395) T protein:vir:38 237 -AP---------KKPTISQFDNIKDLENNTLDPAIESTSSFITNQSGYNILSKVKDADGRYLMQPDVTSPDKYLIDGKPV 306 (395) T ss_pred -cc---------cccccccHHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHhhccCCceeeccCcCCCCcceecccee Confidence 00 01122467788887753 4444556678999999999998753111 11 1112344468999999 Q ss_pred EEeCCCccC--CCceEEEEEcC--CeeEEeecCCccceeeeecCC----cceeEEEEeeEEeeeeeeeeeccccccCCCC Q lcl|NC_015254. 231 IVDDGLPAK--DGVYTSYIFGE--GAFGLGNGEAPVPTETDREKL----KGNDILINRQHFLLHPRGIAWQEKSVAGHSP 302 (346) Q Consensus 231 VvdD~~p~~--~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~----~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sP 302 (346) ++++.++.. .+.+ +++|+. .++.+...+ .+.++..+... .+...+....+|.+. T Consensus 307 ~~~~~~~~~~~~~~~-~i~~gd~~~~~~i~~~~-~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~---------------- 368 (395) T protein:vir:38 307 IRIADKWLPDVSGSH-PLYFGDLKQGITLFDRQ-QMQIDTTNVGAGSFEHDTTKLRFIDRFDVQ---------------- 368 (395) T ss_pred EEecccccCcCCCcc-eEEEEeccccEEEEEec-ceEEEEeccccchhhcCceEEEEEEeeccE---------------- Confidence 999876543 2333 345552 334333322 34555555432 122222222222111 Q ss_pred ChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCCCCC Q lcl|NC_015254. 
303 TNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEGIK 346 (346) Q Consensus 303 t~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~~~ 346 (346) +.+++.+..+ ++.+..++.+..-.+|| T Consensus 369 -------------~~~~~a~~~~----~~~~~~~~~~~~~~~~~ 395 (395) T protein:vir:38 369 -------------LIDDGAFAAA----SFKTVANQAQGTAGTGK 395 (395) T ss_pred -------------EecccceEEE----EeecccCCCCCccCCCC Confidence 1222333322 22333444444445556 No 111 >protein:vir:3870 Length: 400 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:82 # MgeName: A2 # Cross-refs: genbank:acc:NP_680487;swissprot:trembl:q8ltc0;genbank:gi:22296527;interpro:IPR006444;uniprot:Q8LTC0;genbank:GeneID:951713 Probab=98.87 E-value=5.6e-10 Score=71.30 Aligned_cols=269 Identities=11% Similarity=-0.008 Sum_probs=139.6 Q ss_pred Ccc---------ceecceeeec-CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecc Q lcl|NC_015254. 1 MIK---------KLRMNLQKFA-AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPF 70 (346) Q Consensus 1 ~~~---------~~~~~~q~~~-a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~ 70 (346) +.+ ......+.-. +..++.-....+|+.+..-+.+...+.+.+.+- ......++...++|. T Consensus 112 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~~---------~~~~~~~~~~~~~~~ 182 (400) T protein:vir:38 112 EKTDVGTFAVLRAVPTDASDAVNAGVKAADAASTIPETISNTPQRELQTVVDLKPF---------TNVFQASTQKGTYPT 182 (400) T ss_pred HHHHHHHHhhhhhhhHHHHHHHhhcccccCCcccccHHHHHHHHHHHHhhhhhhhc---------ceeEeccCcceEEEE Confidence 000 0011111100 111122234678888877777666555544321 011223466788898 Q ss_pred cccCCCcccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015254. 71 WQDLTGEDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNG 149 (346) Q Consensus 71 ~~~l~g~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G 149 (346) +..-++....+.|+. ..+. 
...+-.+-.-..++.+.-..++++...-+.-|..+.+.+.+++......+..++..+.+ T Consensus 183 ~~~~~~~~~~~~E~~-~~~~~~~~~f~~i~~~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~~~~~~~~~~i~~~~~~ 261 (400) T protein:vir:38 183 VANATTKMVTVAELE-KNPAMAKPEFKPVNWSVETYRQALPVSQESIDDSAIDLVGLIAQNGQQIKVNTTNGAVATLLKG 261 (400) T ss_pred EecCCCccccccccc-cccccccccceeeEeehhheeeehhhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhhhhcccc Confidence 876556565666664 3332 33344444444555666667777655445556777889999988888777755543221 Q ss_pred hhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEE Q lcl|NC_015254. 150 ITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTY 225 (346) Q Consensus 150 ~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~ 225 (346) . ......+++.+.++....-+... -.+|+|||.++..|++..-- .++. ...++.-+++ T Consensus 262 ~-----------------~~~~~~~~~~~~~~~~~~~~~~~-~a~~v~~~~~~~~l~~lkd~~G~~i~~~~~~~~~~~~l 323 (400) T protein:vir:38 262 F-----------------TAKTISSVDDLKHINNVDLDPAY-SRVIIASQSFYNFLDTVKDGNGRYLLQDSILTPSGKSV 323 (400) T ss_pred c-----------------cccccccHHHHHHHHHhhhhhhh-CcEEEEcHHHHHHHHHhhccCCCeeeecCcCCCCcccc Confidence 0 01122467778887776545443 37899999999999875311 1222 2223445789 Q ss_pred eceEEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEe---eeeeeeeeccccccCC Q lcl|NC_015254. 226 MGKRVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFL---LHPRGIAWQEKSVAGH 300 (346) Q Consensus 226 ~G~~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~---~~~~G~s~~~~~~~~~ 300 (346) +|+||++++.+|.....-..++|+. -++.+... ..+.++..++.... +.+....+|. 
.|+.+|.+..-+ T Consensus 324 ~G~pv~~~~~~~~~~~g~~~~~~gd~s~~~~~~~~-~~~~~~~~~~~~~~-~~~~~~~r~d~~~~~~~a~~~l~~~---- 397 (400) T protein:vir:38 324 LGMPIAVVSDDTLGAAGEAHAFLGDIKRAILFANR-ADFMVRWVDDQIYG-QFLQAGMRFGVSVADEKAGYFLTYT---- 397 (400) T ss_pred ccceeEEecccccCCCCceEEEEEeccccEEEEee-cceEEEEecccccc-eeEEEEEEeccEEecccceEEEEee---- Confidence 9999999999986543223455543 12333222 23445555443222 2222222222 233344443211 Q ss_pred CCCh Q lcl|NC_015254. 301 SPTN 304 (346) Q Consensus 301 sPt~ 304 (346) |.- T Consensus 398 -~~a 400 (400) T protein:vir:38 398 -PKA 400 (400) T ss_pred -cCC Confidence 111 No 112 >protein:vir:101607 Length: 379 # NCBI annotation: major capsid protein precursor # Family: family:all:585 # MgeID: mge:1646 # MgeName: 11b # Cross-refs: genbank:acc:YP_112497;genbank:gi:53793597;uniprot:Q5ZGF6;genbank:GeneID:3101715 Probab=98.86 E-value=8e-10 Score=70.45 Aligned_cols=271 Identities=11% Similarity=-0.007 Sum_probs=144.7 Q ss_pred Cccceecce--eeecCCceeeeeec--cchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRMNL--QKFAAGKNTRIADV--IVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~--q~~~a~~~T~l~d~--i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) ...+.+... +.-+++..|.-.+. .+|+.+..-+.+...+.+.+.+ ++ ......+..+++|.....++ T Consensus 91 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ip~~~~~~ii~~~~~~~~i~~--~~-------~~~~~~~~~~~~~~~~~~~~ 161 (379) T protein:vir:10 91 DIKEVRNGKSIQVKAVGDMTLPVNLTGAQPKDYNFDVVLNPSQMLNVSD--IV-------GAVSISGGTYTFVRENGAGE 161 (379) T ss_pred hHHHHHhhhhhhhhhhcccccCCCCccccchhhhhHHHHhHHhhhhHHh--hc-------eeeeccCCceEEEEeecCCC Confidence 011111111 11122222222222 4577776666665555444422 11 11223466788888765533 Q ss_pred cc-cccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhh Q lcl|NC_015254. 
77 ED-EILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGA 155 (346) Q Consensus 77 ~a-e~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~ 155 (346) .+ ..+.||. .++..+.+.++-....++.+.-..++++...-+ .+....+.++++....+..+..++..+. .++ T Consensus 162 ~~~~~v~Eg~-~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~D~-~~l~~~i~~~la~~~~~~~~~~~~~g~~----~~~ 235 (379) T protein:vir:10 162 GAIGAQVEGA-TKGQKDYDISMIDVNTDFIAGFTRYSKKMANNL-PFLTSFIPNALRRDYAKAENAAFNAVLA----ANA 235 (379) T ss_pred cccccccCCc-cccccccceeeeEeeeeeEEeeehhhHHHHhhH-HHHHHHHHHHHHHHHHHHHHHHHhcccc----ccc Confidence 33 2346664 556556665555555555555566776643322 2455668888888888888876665432 111 Q ss_pred hhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhccc----ccCceeeEEeceE Q lcl|NC_015254. 156 LDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLD----SDNKKFPTYMGKR 229 (346) Q Consensus 156 ~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~----s~~~~i~~~~G~~ 229 (346) ... .. ......+.+.+.++...+.+..-...+|+||+.++..|++..-- .++-. ..++.-.+++|+| T Consensus 236 ~~~----~~---~~~~~~~~d~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~l~~~~~~~~~~~~~~l~G~p 308 (379) T protein:vir:10 236 TAS----TE---IITNKNKVEMLINEIAKQENLDFPVTAIVLRPTDYYDILVTQKSVGAGYGLPGVVTQDNGVLRINGIP 308 (379) T ss_pred ccc----cc---cccCcccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHhhccCCceeccCCccCCCCCcceeccee Confidence 111 11 11223457889999988877777778999999999999875321 12211 1234446899999 Q ss_pred EEEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecC----CcceeEEEEeeEEeeeee---eeeeccccccCC Q lcl|NC_015254. 230 VIVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLHPR---GIAWQEKSVAGH 300 (346) Q Consensus 230 VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~~~---G~s~~~~~~~~~ 300 (346) |++++.||.+. ..---|...++.+. ..+.++..++. ..+...+....++.+.++ .|-.- +.+.. 
T Consensus 309 vv~s~~~~ag~--~~~gdf~~~~~~~~---~~~~i~~~~~~~~~f~~~~~~~r~~~R~~~~v~~p~a~v~~--~~~~~ 379 (379) T protein:vir:10 309 LFRATWLAANK--YYVGDWTRVTKVTT---EGLSLEFSEVEGTNFVKNNITARIEAQVALAVEQPAALIFG--DFTAV 379 (379) T ss_pred eEecCCCCCCc--eEEeecccEEEEEE---eceEEEEeecccccccCCcEEEEEEEEeccEEecCccEEEE--EecCC Confidence 99999998653 11111333444332 23456655544 345566667777766554 22221 11112 No 113 >protein:vir:1433 Length: 435 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:30 # MgeName: phiE125 # Cross-refs: genbank:acc:NP_536362;genbank:gi:17975167;genbank:GeneID:929171 Probab=98.83 E-value=6.6e-10 Score=70.92 Aligned_cols=287 Identities=13% Similarity=0.064 Sum_probs=134.2 Q ss_pred Ccc---ceecceee------------ecCCceeeee----eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhC Q lcl|NC_015254. 1 MIK---KLRMNLQK------------FAAGKNTRIA----DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKS 61 (346) Q Consensus 1 ~~~---~~~~~~q~------------~~a~~~T~l~----d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~ 61 (346) |.+ ..+-+++. ..+...+..+ -.++|+.+..-+.+...+.+-+.+-+. ..... T Consensus 101 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~l~~~~~i~~~~~--------~~~~~ 172 (435) T protein:vir:14 101 MVRALAAARGDAQLASKLAIERGFGEEVAMSLNTLSPGAGGVLVPENLSSEVIELLRPKSVVRKLGA--------RTLPL 172 (435) T ss_pred HHHHHHhhcchhhHHHHHHHhhhhhhhhhhhcccCCcCCCccccchhHHHHHHHHHhhhchhhhhcc--------eeeec Confidence 000 00111110 0011122221 235788776666555544443432211 01111 Q ss_pred CCcEEEecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcch--HHHHHHHHHHHHHHHHH Q lcl|NC_015254. 62 GGNMINMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDD--PMRAIGDLVVEYWNRRR 139 (346) Q Consensus 62 ~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~d--p~~~i~~q~a~~~~~~~ 139 (346) ....+++|.+..- +.+..+.|+. .++..+.+-.+-....++.+..+.++++...-++-+ ..+.+.++++.++.++. 
T Consensus 173 ~~~~~~~p~~~~~-~~a~~v~E~~-~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~~~l~~~i~~~l~~ai~~~~ 250 (435) T protein:vir:14 173 SNGNITIPRLKGG-AIVGYIGADT-DIPTTQQQFDDLKLTAKKMAALVPIANDLIKYAGVNPNVDQIVVGDLTAAIGARE 250 (435) T ss_pred CCCceEEEEEeCC-cceeeeccCc-cccccccceeEEEeeeEEEEEeehhhHHHHHhhccCHHHHHHHHHHHHHHHHHHH Confidence 2335888988643 4454567764 566666665555555666666677777654445444 44679999999999999 Q ss_pred HHHHHH------HHHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhh Q lcl|NC_015254. 140 QAVLIA------SLNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLI 211 (346) Q Consensus 140 ~~~lla------~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li 211 (346) ++.++. ..+|+......... .....+.........+.+....+-... -.-.+|+||+.++..|++..-- T Consensus 251 d~a~l~G~G~~~~p~Gi~~~~~~~~~---~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~v~n~~~~~~L~~lkd~ 327 (435) T protein:vir:14 251 DKAFIRDDGTANTPKGLRFWALPSNV---ITASDASTLQKIETDLGKVILALENADANLTQPGWIMAPRTFRFLEGLRDG 327 (435) T ss_pred HHHhhccCCCCccccceeecccccce---eccccccchhhHHHHHHHHHHHhhhccccccCCEEEEcHHHHHHHHHhhcc Confidence 997763 12222211111000 000111111112234455554443322 2235799999999998865311 Q ss_pred --hhcccccCceeeEEeceEEEEeCCCccCCC---ceEEEEEcCCe-eEEeecCCccceeeeecCC-------------c Q lcl|NC_015254. 212 --EFMLDSDNKKFPTYMGKRVIVDDGLPAKDG---VYTSYIFGEGA-FGLGNGEAPVPTETDREKL-------------K 272 (346) Q Consensus 212 --~~~~~s~~~~i~~~~G~~VVvdD~~p~~~g---~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~~-------------~ 272 (346) .++..+.. -++++|+||++++.||...+ ....++|+.-. +.+.. +..+.++.+++.. . 
T Consensus 328 ~G~~l~~~~~--~g~l~G~Pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~-~~~~~~~~~~~~~~~~~~~~~~~~f~~ 404 (435) T protein:vir:14 328 NGNKVYPELA--NGMLKGYPVGKTTQVPINLGETGKESEIYFTDFGDVFIGE-EETLEIDYSKEATYKDADGHMVSAFQR 404 (435) T ss_pred CCceeccCCC--CCeeecceeEeeccccccccCCCccceEEEeecccEEEEE-ecccEEEEeccccccccccchhhhhhc Confidence 12222211 24789999999999987432 22334444221 22322 2234555555432 1 Q ss_pred ceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCcee Q lcl|NC_015254. 273 GNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKA 315 (346) Q Consensus 273 g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~ 315 (346) +...+-...++.+.|+ .|.---.-++.+|-- T Consensus 405 ~~~~~r~~~r~d~~~~------------~~~a~~~l~~~~~~~ 435 (435) T protein:vir:14 405 DQTLIRVIAKNDFGPR------------HVESIAVLAGVAWGA 435 (435) T ss_pred ChhheeeeeeeCceee------------cccceEEEecCCCCC Confidence 1122222222222211 122222223333322 No 114 >protein:vir:1025 Length: 408 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:20 # MgeName: bIL286 # Cross-refs: genbank:acc:NP_076679;genbank:gi:13095788;genbank:GeneID:920362 Probab=98.82 E-value=4.5e-09 Score=66.35 Aligned_cols=280 Identities=11% Similarity=0.077 Sum_probs=143.1 Q ss_pred Cccce--ecceeeecCCceeeee--eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccc--ccC Q lcl|NC_015254. 1 MIKKL--RMNLQKFAAGKNTRIA--DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFW--QDL 74 (346) Q Consensus 1 ~~~~~--~~~~q~~~a~~~T~l~--d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~--~~l 74 (346) +.+.. .++.--..+-..+..+ ...+|+.+...+.+...+.+.+.+-. 
.....++...++|++ ..- T Consensus 100 ~~~~~~~~~~~~~~~a~~~~t~~~gg~~vP~~~~~~Ii~~~~~~~~l~~~~---------~~~~~~~~~~~~~~~~~~~~ 170 (408) T protein:vir:10 100 MVRNPMAFMNTVSSKTETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQYV---------RVESVSTSNGSRVYEKWTDV 170 (408) T ss_pred HhhcchhhhhhhhhhhhhcccccCCceeccHhHHHHHHHHHHhhchhhhhc---------ceeeccCCcceEEEeecccc Confidence 11110 0000000111122223 35679998888887776666553311 111223334444444 333 Q ss_pred CCcccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_015254. 75 TGEDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITAS 153 (346) Q Consensus 75 ~g~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~ 153 (346) .+.+..+.|+. .++- +..+-.+-....+..+....++++...-+.-|....+.+++++.+.+..+..++....+ T Consensus 171 ~~~a~~v~E~~-~~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~---- 245 (408) T protein:vir:10 171 TPLTVMDAEDG-KIPDLDNPQLTIIKYLIKRYAGIITATNTSLKDTAENILAWLSSWIAKKVVVTRNQAIIEVMKA---- 245 (408) T ss_pred ccceeeecCcc-ccccccCcceeeEEeeeeeEEeeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhcccc---- Confidence 33444556664 3432 23344444444555565566676655545557888899999999999999876654321 Q ss_pred hhhhhcceeeeccccccccccHHHHHHHHHH-hCccccCceEEEEchHHHHHHHhhhhh--hhccc--ccCceeeEEece Q lcl|NC_015254. 154 GALDSNKLDVSTETGDDSYFTGDTFLSATYK-LGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLD--SDNKKFPTYMGK 228 (346) Q Consensus 154 ~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~--s~~~~i~~~~G~ 228 (346) .. ......+++.+.+++.. +-.....-.+|+||+..+..|++..-- .++.. 
..++...+++|+ T Consensus 246 ---~~---------~~~~~~~~~~l~~~~~~~~~~~~~~~a~~v~n~~~~~~l~~lkd~~G~~i~~~~~~~~~~~~l~G~ 313 (408) T protein:vir:10 246 ---AP---------KKPTIAKFDDVITMINTAVDPAIIATSSLLTNQSGLNKLALVKTAEGKYLLEPDPTKPNSYLIKGK 313 (408) T ss_pred ---cc---------cccccccHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHhhccCCceEeccCcCCCCCceecce Confidence 00 01122467888888754 433444456899999999999976321 12221 223445789999 Q ss_pred EEEEeC--CCccCCCceEEEEEcC--CeeEEeecCCccceeeeecCC----cceeEEEEeeEEeeee---eeeeeccc-c Q lcl|NC_015254. 229 RVIVDD--GLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKL----KGNDILINRQHFLLHP---RGIAWQEK-S 296 (346) Q Consensus 229 ~VVvdD--~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~----~g~~~l~~r~~~~~~~---~G~s~~~~-~ 296 (346) ||++.+ .+|.....-..++|+. .++.+.... .+.++.++... .+...+....++.+.+ .+|..-.- + T Consensus 314 PV~~~~~~~~~~~~~~~~~i~~gd~~~~~~~~~~~-~~~v~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~~~~~~ 392 (408) T protein:vir:10 314 QVIVVADRWLPNTGSTVYPLYYGDMSQAITLFDRE-NMSLLPTNIGAGAFETDTTKIRVIDRFDVKATDSEALVAGSFSA 392 (408) T ss_pred eeEEecccccCccCCCceEEEEEehhccEEEEEec-ceEEEEcccccchhhcCceEEEEEEeeccEEeccccEEEEEeec Confidence 999965 4565332223355553 233343322 34566555432 3555666666665444 33332110 1 Q ss_pred c---cCCC--CChHHh Q lcl|NC_015254. 297 V---AGHS--PTNTEI 307 (346) Q Consensus 297 ~---~~~s--Pt~a~L 307 (346) . .+.+ |+-+.. T Consensus 393 ~~~~~~~~~~~~~~~~ 408 (408) T protein:vir:10 393 IADQVGNFKTTTSTAV 408 (408) T ss_pred cccCCCCCCCCCcccC Confidence 1 1122 222222 No 115 >protein:vir:99920 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:1611 # MgeName: Halo # Cross-refs: genbank:acc:YP_655524;genbank:gi:109392294;genbank:GeneID:4157089 Probab=98.82 E-value=2.5e-09 Score=67.79 Aligned_cols=271 Identities=12% Similarity=0.028 Sum_probs=137.6 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 
13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) =|..+|.-.. .+|+.+.+-+.+...+.+.+.+-+-.. ..++..+++|.+..- ..+.-+.|++ .++..+ T Consensus 1 Mat~tt~~g~-~vP~~~~~~ii~~~~~~s~l~~~~~~i---------~~~~~~~~~p~~~~~-~~a~wv~Eg~-~~~~~~ 68 (311) T protein:vir:99 1 MATFGTGNLK-NLPRNIADGMVKDVVQGSTVAVLSARK---------PQRFGNEDIITFNGR-PKAEFVGEGQ-QKSSTT 68 (311) T ss_pred CceecCCCce-eccHHHHHHHHHHHHhhchhhhhccee---------eccCCceEEEEEeCC-ceeEEeecCc-cccccc Confidence 1322333333 458877666666665555554322221 223456789998753 4555677874 677777 Q ss_pred cccceeEEEEEeecCcceechHHHhhh---cchHHHHHHHHHHHHHHHHHHHHHHHHHH-----hhhhhhhh-hhcceee Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALS---GDDPMRAIGDLVVEYWNRRRQAVLIASLN-----GITASGAL-DSNKLDV 163 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~---g~dp~~~i~~q~a~~~~~~~~~~lla~L~-----G~~~~~~~-~~~~~di 163 (346) .+-.+.....++.+--+.++++-...+ ..|..+.+.+++++++++++++.+|.-.. +..+.... ...+..+ T Consensus 69 ~~f~~v~l~~~k~~~~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~g~~~g~~~~g~~~~~~~~~~~~ 148 (311) T protein:vir:99 69 GEFDFVTSTPKKAQVTMRFNEEVQWADEDYQLGVLQTLSEAGAEALARALDLGLYHRINPLTGTVIPGWSNYLGAASKRV 148 (311) T ss_pred ceeeEEEEeeEEEEEeehhhHHHhhcccccHHHHHHHHHHHHHHHHHHHHHHHhhcccCcccCcccccccccccccccee Confidence 776666665666666677777754322 33568889999999999999997774321 00000000 0001111 Q ss_pred eccccccccccHHHHHHHHHHhCcccc--CceEEEEchHHHHHHHhhhhh--hhc--ccccCceeeEEeceEEEEeCCCc Q lcl|NC_015254. 164 STETGDDSYFTGDTFLSATYKLGDAEG--KLTGIAMHSQTEMNLRKQGLI--EFM--LDSDNKKFPTYMGKRVIVDDGLP 237 (346) Q Consensus 164 s~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~~~~~L~~~~li--~~~--~~s~~~~i~~~~G~~VVvdD~~p 237 (346) +.. ..........+.++..++..... 
...+|+|||.++..|++..-- .++ ....++..++++|+||++++.+| T Consensus 149 ~~~-~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~lkd~~G~~l~~~~~~~~~~~~l~G~Pv~~s~~i~ 227 (311) T protein:vir:99 149 ELT-ADTIANPDLAIEAAVGLLVANGHPTPVNGLALHPSIAWGLSTARYTDGRKKFPELGLGIGVSSFEGIDASVSDTVN 227 (311) T ss_pred ecc-ccccchhHHHHHHHHHHHhhhccCCCccEEEEcHHHHHHHHhhhccCCCeeecCcccCCCCceecceeeEeecccc Confidence 111 11111223455566666544322 234599999999999875311 122 22233445789999999999887 Q ss_pred cCCCc-----------eEEEEEcC--CeeEEeecCCccceeeeecCC---------cceeEEEEeeEEe---eeeeeeee Q lcl|NC_015254. 238 AKDGV-----------YTSYIFGE--GAFGLGNGEAPVPTETDREKL---------KGNDILINRQHFL---LHPRGIAW 292 (346) Q Consensus 238 ~~~g~-----------ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~---------~g~~~l~~r~~~~---~~~~G~s~ 292 (346) ...+. +.-+++|. ..+.+...+ ...+++.+... .....+....++. .|+.-+.. T Consensus 228 ~~~~~~~~~~~~~~~~~~~~~~Gdf~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~d~~~~r~~~r~d~~v~~~~~v~~ 306 (311) T protein:vir:99 228 GGDEADPDDEDLDAARAVRGIVGDFANGIHWGVQR-DIPVELIKYGDPDGQGDLKRHNQIALRLEIVYGWYVFTDRFVVI 306 (311) T ss_pred cccccccccchhhccCcceEEEeeccccEEEEEec-CceEEEeecCCCCcchhhhhcCcEEEEEEEeecceecChhHeee Confidence 53221 11122221 233333222 22333332211 1112232333333 23333444 Q ss_pred ccccc Q lcl|NC_015254. 293 QEKSV 297 (346) Q Consensus 293 ~~~~~ 297 (346) ++.+. T Consensus 307 ~~~~A 311 (311) T protein:vir:99 307 ENAVA 311 (311) T ss_pred ecccC Confidence 33322 No 116 >protein:vir:102119 Length: 404 # NCBI annotation: phage major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1641 # MgeName: phiSM101 # Cross-refs: genbank:acc:YP_699941;genbank:gi:110804052;genbank:GeneID:4206662 Probab=98.80 E-value=2.3e-09 Score=67.94 Aligned_cols=288 Identities=10% Similarity=-0.003 Sum_probs=140.4 Q ss_pred Cccceecceeee----cC--CceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccC Q lcl|NC_015254. 
1 MIKKLRMNLQKF----AA--GKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDL 74 (346) Q Consensus 1 ~~~~~~~~~q~~----~a--~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l 74 (346) +-.+-+.++... .+ ..++.-.-..+|+.+..-+.+...+.+.+.+- ..... ...+.-.+.+|..... T Consensus 92 ~~~~~~~~~~~~~~e~~a~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~l------~~~~~-~~~~~g~~~~~~~~~~ 164 (404) T protein:vir:10 92 LKQKNQRGLNLSEKEINAISENIDEDGGYAVPEDIQTKINTRLKDTTDLYNM------VDYEP-VFTRSGSRTYEKRSKQ 164 (404) T ss_pred HHHHHhhhhcchhhHHhhhccccCCCCceeechhHHHHHHHHHhhhhhHhhh------hceee-ccCCccceEEEEecCC Confidence 111111111111 01 11111123456777766666655554444221 00000 1112224445544433 Q ss_pred CCcccccCCCccccchh--hcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhh Q lcl|NC_015254. 75 TGEDEILDDGEGALTPG--NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITA 152 (346) Q Consensus 75 ~g~ae~~~dg~~~it~~--~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~ 152 (346) .....+.++. .++.. .++-.+-....++.+.-..+++....-+.-+....+.+++++.+.+..+..+|. |.=. T Consensus 165 -~~~~~v~e~~-~~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~il~---G~g~ 239 (404) T protein:vir:10 165 -KPMKPLSENQ-QIPTNGDNGKLERFNFKLKDLADFMSIPNDLLKFADKSLEDWIINWFVDKVRITRNAEILY---GAGG 239 (404) T ss_pred -cceeeccccc-cccccccccceeeeEeeheeeEeeehhhHHHHhhcHHHHHHHHHHHHHHHHHHHHHHHHhh---cCCC Confidence 2334456654 23322 233333334444555556676665444444677789999999999999996663 3211 Q ss_pred hhhhhh--cceeeeccccccccccHHHHHHHHHH-hCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEE Q lcl|NC_015254. 153 SGALDS--NKLDVSTETGDDSYFTGDTFLSATYK-LGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTY 225 (346) Q Consensus 153 ~~~~~~--~~~dis~~~~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~ 225 (346) .+...+ +...+... ......+++.+.+++.. +-.....-.+|+|||.++..|++..-. .++. 
...++..+++ T Consensus 240 ~~~~~gi~~~~~~~~~-~~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~L~~lkd~~G~~l~~~~~~~~~~~~l 318 (404) T protein:vir:10 240 DEHATGIMTANKFKKI-TLPKSPALKDFKKCKNVELLNVFKATSSWIVNQDGFNYLDSLEDKTGRPYLQPDPKDPTQYRF 318 (404) T ss_pred CCcccceeecccccee-eccccccHHHHHHHHHhhhhccccCCCEEEEcHHHHHHHHHhhccCCceeeccCcCCCCCccc Confidence 111000 00011111 11223467888887764 333334446799999999999975311 1222 2234455789 Q ss_pred eceEEEE-eCCCccCCCceEEEEEc--CCeeEEeecCCccceeeeecC----CcceeEEEEeeEEeee---eeeeeeccc Q lcl|NC_015254. 226 MGKRVIV-DDGLPAKDGVYTSYIFG--EGAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLH---PRGIAWQEK 295 (346) Q Consensus 226 ~G~~VVv-dD~~p~~~g~ytt~l~~--~GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~---~~G~s~~~~ 295 (346) +|+||++ ++.+|.....-.+++|+ ..++.+... ..+.++.+++. ..+...+....++.+. +..|..-.. T Consensus 319 ~G~PV~~~~~~~~~~~~~~~~~~~gd~s~~~~~~~~-~~~~i~~~~~~~~~~~~~~~~~~~~~r~d~~v~~~~a~~~~~~ 397 (404) T protein:vir:10 319 LGLPVIELPNDLLLSTESAIPVLLGDTKEAYKYVSD-GAYELATTNIGAGAFETNTTKARIIMRIDGNVKDSEALLIAEI 397 (404) T ss_pred cceeeEEecccccCCCCCccEEEEEeccccEEEEEe-cceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEe Confidence 9999985 45455443334456666 334444332 24556666554 2455666666666544 445554322 Q ss_pred cccCCCCC Q lcl|NC_015254. 296 SVAGHSPT 303 (346) Q Consensus 296 ~~~~~sPt 303 (346) + +..+|. T Consensus 398 ~-~aa~~~ 404 (404) T protein:vir:10 398 P-VESVQA 404 (404) T ss_pred e-cccCCC Confidence 2 223455 No 117 >protein:vir:96762 Length: 632 # NCBI annotation: putative phage-related protein # Family: family:all:21 # MgeID: mge:1628 # MgeName: VP882 # Cross-refs: genbank:acc:YP_001039818;genbank:gi:126010917;genbank:GeneID:5076272 Probab=98.79 E-value=6.8e-10 Score=70.84 Aligned_cols=274 Identities=10% Similarity=0.050 Sum_probs=145.8 Q ss_pred Ccc-----------ceeccee---ee--cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCc Q lcl|NC_015254. 
1 MIK-----------KLRMNLQ---KF--AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGN 64 (346) Q Consensus 1 ~~~-----------~~~~~~q---~~--~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ 64 (346) |+. -+.|..+ .- ..++.+.-..+|.|+++.+-+.+.+...+.+.+-|+ ..+..... T Consensus 330 ~a~~~a~~~G~~arg~~~~~~~l~~ra~~~~t~~~gg~lvp~~~~~~~iie~lr~~s~i~~l~~--------~~~~~~~g 401 (632) T protein:vir:96 330 VSLAIADASGKEARGFYMPHEVLVQRQLEKKTAGKGGELVATELLSEEFIDILRNKAIIGQMGA--------RMLPGLVG 401 (632) T ss_pred HHHHHHHhhhhhhhhhhhhHHHHHHhhhhcccccccccccccccchHHHHHHHhhcchhhhhcc--------eEeecCCc Confidence 000 0001100 11 122222334577778776655555444443323221 01122233 Q ss_pred EEEecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 65 MINMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLI 144 (346) Q Consensus 65 ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~ll 144 (346) .+++|.... ++.+--+.|+. .++..+++-++-....++.+.-..++.....-+.-+..+.+.+.++.++.+..+..+| T Consensus 402 ~~~ip~~~~-~~~a~wv~E~~-~~~~s~~~f~~i~l~~~k~~~~v~iS~ell~ds~~~~~~~i~~~l~~a~~~~~d~a~l 479 (632) T protein:vir:96 402 DVDIPKKTS-GANFYWIGEDE-DVQDSDFDFTTLSFSPKTIAGAVPVTRKLRKQSSIHVENLIREDLIEGIGVALDLAML 479 (632) T ss_pred ceEEEEEeC-CceeEeecCCc-cccccccceeeEEeeeeEEEEehhhHHHHHhccchHHHHHHHHHHHHHHHHHHHHHhh Confidence 578888753 23333356663 5666666555444445555555566665544455566778999999999999999776 Q ss_pred HHHHhhhhhhhhhh--cceeeeccccccccccHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhhhhcccccCc Q lcl|NC_015254. 145 ASLNGITASGALDS--NKLDVSTETGDDSYFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLIEFMLDSDNK 220 (346) Q Consensus 145 a~L~G~~~~~~~~~--~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~ 220 (346) . |.-..+...+ +...+...+.+...+++..+.++..++.... ..-.+++||+.++..|++..+. +.++. 
T Consensus 480 ~---G~G~~~~p~Gi~~~~~~~~~~~~~~~~~~~~i~~~~~~i~~~~~~~~~~~~~~~~~~~~~l~~~~l~----d~~G~ 552 (632) T protein:vir:96 480 T---GTGLANDPVGLLNMTGVPALTYPAGGVDWASVVDMETKISTFNADAGRLAYLTSVTQRGAAKKAQVF----DNTGE 552 (632) T ss_pred c---ccCCCCccceeeecccccceecccccCCHHHHHHHHHHHhhcccccCccEEEEchhHHHHHHHHhcc----CCCCc Confidence 3 2110010000 0001111122334578889999888765433 2345899999999998875542 33333 Q ss_pred ee---eEEeceEEEEeCCCccCCCceEEEEEcCCee-EEeecCCccceeeeecC--CcceeEEEEeeEE---eeeeeeee Q lcl|NC_015254. 221 KF---PTYMGKRVIVDDGLPAKDGVYTSYIFGEGAF-GLGNGEAPVPTETDREK--LKGNDILINRQHF---LLHPRGIA 291 (346) Q Consensus 221 ~i---~~~~G~~VVvdD~~p~~~g~ytt~l~~~GAi-~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~---~~~~~G~s 291 (346) .+ ++++|+||++++.+|.+. .+|+.-+- .++.-+ .+.++.++.. ..+...+....++ +.||..|. T Consensus 553 ~i~~~~~l~G~pv~~s~~ip~~~-----~~~gd~s~~~i~~~~-~~~i~~~~~~~~~~~~v~~~~~~~~d~~v~~~~af~ 626 (632) T protein:vir:96 553 RIWQNNEVNGYRAEASNQIPADT-----WIFGDWSQIVIAMWG-VLDLKVDPYTKAASDGLVLRVFQDVDAGVRRKEAFC 626 (632) T ss_pred eeecCCeecccceEeccccccCc-----EEEeecceEEEEEec-ceEEEEccccccccCceEEEEEeecCceeechhhhh Confidence 22 578999999999999764 34443322 222222 2334444322 3455556555544 46677888 Q ss_pred eccccc Q lcl|NC_015254. 292 WQEKSV 297 (346) Q Consensus 292 ~~~~~~ 297 (346) |..... T Consensus 627 ~~k~~A 632 (632) T protein:vir:96 627 IAKKGA 632 (632) T ss_pred heeecC Confidence 865543 No 118 >protein:vir:80376 Length: 435 # NCBI annotation: gp6, major capsid head protein # Family: family:all:21 # MgeID: mge:1881 # MgeName: phi644-2 # Cross-refs: genbank:acc:YP_001111085;genbank:gi:134288639;genbank:GeneID:4960624 Probab=98.78 E-value=1.6e-09 Score=68.85 Aligned_cols=277 Identities=13% Similarity=0.050 Sum_probs=136.4 Q ss_pred Cccceeccee--------e----ecCCc----eeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCc Q lcl|NC_015254. 
1 MIKKLRMNLQ--------K----FAAGK----NTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGN 64 (346) Q Consensus 1 ~~~~~~~~~q--------~----~~a~~----~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ 64 (346) |+... .+++ . ..+.. ++.-.-.++|+.+...+.+...+.+.+.+-+. ........ T Consensus 105 ~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~lvP~~~~~~ii~~l~~~~~i~~~~~--------~~v~~~~~ 175 (435) T protein:vir:80 105 LAAAR-GDAQLASKLAIERGFGEEVAMSLNTLSPGAGGVLVPENLSSEVIELLRPKSVVRKLGA--------RTLPLSNG 175 (435) T ss_pred HHhcc-chhHHHHHHHHhhhhhhhhhhhhcccCCCCCccccchhHHHHHHHHHhhhchhhhccc--------eeeecCCC Confidence 11000 0000 0 00111 11122346788777666666555444433111 01122233 Q ss_pred EEEecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcc--hHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 65 MINMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGD--DPMRAIGDLVVEYWNRRRQAV 142 (346) Q Consensus 65 ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~--dp~~~i~~q~a~~~~~~~~~~ 142 (346) .+.+|.+..- ..+.-+.|+. .++..+.+-.+-....++.+.-+.+++....-+.- +..+.+.++++..+.+..+.. T Consensus 176 ~~~~p~~~~~-~~a~~v~E~~-~~~~~~~~f~~i~~~~~k~~~~~~is~ell~ds~~~~~l~~~i~~~l~~a~~~~~d~a 253 (435) T protein:vir:80 176 NITIPRLKGG-AIVGYIGADT-DIPTTQQQFDDLKLTAKKMAALVPIANDLIKYAGVNPNVDQIVVGDLTAAIGAREDKA 253 (435) T ss_pred ceEEEEEeCC-cceeeeccCc-cccccccceeeEEEeeEEEEEeehhhHHHHHhhcccHHHHHHHHHHHHHHHHHHHHHH Confidence 5888988643 4444567764 56666666555555566666667777776544433 455779999999999999997 Q ss_pred HHHH------HHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCcc--ccCceEEEEchHHHHHHHhhhhh--h Q lcl|NC_015254. 143 LIAS------LNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDA--EGKLTGIAMHSQTEMNLRKQGLI--E 212 (346) Q Consensus 143 lla~------L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~--~~~~~~ivmhS~~~~~L~~~~li--~ 212 (346) +|.- .+|++....... . .....+.........+.++...+-.. ...-.+|+||+.++..|++..-- . 
T Consensus 254 ~l~G~G~~~~p~Gi~~~~~~~~-~--~~~~~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~lkd~~G~ 330 (435) T protein:vir:80 254 FIRDDGTANTPKGLRFWALPGN-V--ITASDGSTLQKIETDLGKAILALENADANLTQPGWIMAPRTFRFLEGLRDGNGN 330 (435) T ss_pred hhccCCCCCcccceeecccccc-e--eecccccchhhHHHHHHHHHHHhhccccccccCEEEEcHHHHHHHHhhhccCCc Confidence 7631 112211111110 0 01111111112234556665555332 22346799999999998764311 1 Q ss_pred hcccccCceeeEEeceEEEEeCCCccCCC---ceEEEEEcCCe-eEEeecCCccceeeeecCC-------------ccee Q lcl|NC_015254. 213 FMLDSDNKKFPTYMGKRVIVDDGLPAKDG---VYTSYIFGEGA-FGLGNGEAPVPTETDREKL-------------KGND 275 (346) Q Consensus 213 ~~~~s~~~~i~~~~G~~VVvdD~~p~~~g---~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~~-------------~g~~ 275 (346) ++..... =++++|+||++++.||...+ ....++|+.-+ +.++. +..+.++..++.. .+.. T Consensus 331 ~l~~~~~--~~~l~G~pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~-~~~~~i~~~~~~~~~~~~~~~~~~f~~n~~ 407 (435) T protein:vir:80 331 KVYPELA--NGMLKGYPVGKTTQVPINLGEAGKESEIYFTDFGDVFIGE-EETLEIDYSKEATYKDADGHMVSAFQRDQT 407 (435) T ss_pred eeccCCC--CCeEeeeeeEEeccccccccCCCCcceEEEEEcccEEEEe-ecceEEEEeccccccccccchhhhhhcCcc Confidence 2222211 14799999999999997432 22334444222 22332 2345566666542 1122 Q ss_pred EEEEeeEEee---------eeeeeeeccc Q lcl|NC_015254. 276 ILINRQHFLL---------HPRGIAWQEK 295 (346) Q Consensus 276 ~l~~r~~~~~---------~~~G~s~~~~ 295 (346) .+-...+|.+ .+.|+.|- + T Consensus 408 ~~r~~~r~d~~~~~~~a~~~l~~~~~~-~ 435 (435) T protein:vir:80 408 LIRVIAKNDFGPRHVESIAVLSGVAWG-A 435 (435) T ss_pred eeeeeeeeCcEeecccceEEEeccCCC-C Confidence 2333333322 22344442 1 No 119 >protein:vir:105038 Length: 428 # NCBI annotation: major capsid head protein precursor # Family: family:all:21 # MgeID: mge:1465 # MgeName: phiKO2 # Cross-refs: genbank:acc:YP_006586;genbank:gi:46402092;genbank:GeneID:2777903 Probab=98.77 E-value=1.6e-09 Score=68.80 Aligned_cols=291 Identities=12% Similarity=0.077 Sum_probs=138.1 Q ss_pred Cccce----ecceeeec--CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccC Q lcl|NC_015254. 
1 MIKKL----RMNLQKFA--AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDL 74 (346) Q Consensus 1 ~~~~~----~~~~q~~~--a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l 74 (346) ++... .++-+.-+ ....+.-.-..+|+-+.+-+.+...+.+.+.+-|. ..+......+++|.+..- T Consensus 108 ~~~~~~~~~~~~~~~~~~~~~~~~~~gg~liP~~~~~~ii~~l~~~~~l~~~~~--------~~~~~~~g~~~~p~~~~~ 179 (428) T protein:vir:10 108 QDAAKFASDELNDQSVSMAISTAAGSGGVLIPQNIHSEVIELLRDRTIVRKLGA--------RSIPLPNGNMSLPRLAGG 179 (428) T ss_pred HHHHHHhhhhhhhhhHhhhhcccccCCccccchhHHHHHHHHHhhhchhhhhcc--------eeeecCCcceEEEEEeCC Confidence 00000 00000000 01111112356787776655555555444433221 011112234788887643 Q ss_pred CCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH------HHH Q lcl|NC_015254. 75 TGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIA------SLN 148 (346) Q Consensus 75 ~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla------~L~ 148 (346) +.+.-+.|+. .++..+.+-.+-.-..+..+.-..++++...-+.-+..+.+.+++++.+.+..++.+|. .-+ T Consensus 180 -~~a~~v~Eg~-~~~~~~~~f~~i~~~~~k~~~~v~is~ell~ds~~~l~~~i~~~l~~ai~~~~d~~~l~G~G~~~~p~ 257 (428) T protein:vir:10 180 -ATASYTGENQ-DAKVSEARFDDVKLTAKTMIAMVPISNALIGRAGFNVEQLVLQDILTAISVREDKAFMRDDGTGDTPI 257 (428) T ss_pred -cceeeeccCc-cccccccceeeEEeeeEEEEEeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhccCCCCcccc Confidence 4455567774 56666665444444455555566777775544555778889999999999999997663 112 Q ss_pred hhhhhhhhhhcceeeeccccccccccHHHH---HHHHHHh---CccccCceEEEEchHHHHHHHhhhhh--hhcccccCc Q lcl|NC_015254. 149 GITASGALDSNKLDVSTETGDDSYFTGDTF---LSATYKL---GDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLDSDNK 220 (346) Q Consensus 149 G~~~~~~~~~~~~dis~~~~~~~~~~~~~l---~~A~~~~---GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~s~~~ 220 (346) |++.......... .+...+..+.+.+ .++...+ +.....-.+|+||+..+..|++..-- .++..... 
T Consensus 258 Gi~~~~~~~~~~~----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~lkd~~G~~i~~~~~- 332 (428) T protein:vir:10 258 GMKARATQWNRLL----PWAADAAVNLDTIDTYLDSIILMSMDGNSNMISSGWGMSNRTYMKLFGLRDGNGNKVYPEMA- 332 (428) T ss_pred ccccccccccccc----cccccccccHHHHHHHHHHHHHhhhccccccccCEEEEcHHHHHHHHHhhccCCceeccCCC- Confidence 2222111111111 1112223344443 3333322 22233446899999999998865311 12221111 Q ss_pred eeeEEeceEEEEeCCCccCCC---ceEEEEEcCCee-EEeecCCccceeeeecCCc----ce--eEEEEeeEEeeeeeee Q lcl|NC_015254. 221 KFPTYMGKRVIVDDGLPAKDG---VYTSYIFGEGAF-GLGNGEAPVPTETDREKLK----GN--DILINRQHFLLHPRGI 290 (346) Q Consensus 221 ~i~~~~G~~VVvdD~~p~~~g---~ytt~l~~~GAi-~~~~~~~~~~vE~dRd~~~----g~--~~l~~r~~~~~~~~G~ 290 (346) -++++|+||++++.||...+ .-..++|+.-+- .++. ...+.++++|+... +. ..+..+. +.++.+ T Consensus 333 -~g~l~G~pv~~~~~~p~~~~~~~~~~~i~~gd~s~~~i~~-~~~i~i~~~~~~~~~~~~~~~~~~f~~~~---~~~R~~ 407 (428) T protein:vir:10 333 -QGMLKGYPIQRTSAIPANLGEGGKESEIYFADFNDVVIGE-DGNMKVDFSKEASYIDTDGKLVSAFSRNQ---SLIRVV 407 (428) T ss_pred -CCeeeceeeEEeccccccccCCCccceEEEEecceEEEEE-ecceEEEeecccccccccccccchhhcch---hheeee Confidence 14799999999999987532 233455554332 2222 22344555554311 10 0011111 111111 Q ss_pred eeccccccCCCCChHHhcCCcCc Q lcl|NC_015254. 291 AWQEKSVAGHSPTNTEIEKGNNW 313 (346) Q Consensus 291 s~~~~~~~~~sPt~a~L~~~~NW 313 (346) -+-+-.+ ..|.--.+-++.+| T Consensus 408 ~r~d~~v--~~p~a~~~~t~~~~ 428 (428) T protein:vir:10 408 TEHDIGF--RHPEGLVLGTGVLF 428 (428) T ss_pred eeeCcee--eccceEEEEeccCC Confidence 1111110 13555556677777 No 120 >protein:vir:174 Length: 423 # NCBI annotation: capsid protein # Family: family:all:1412 # MgeID: mge:5 # MgeName: HK620 # Cross-refs: genbank:acc:NP_112079;genbank:gi:13559869;genbank:GeneID:920999 Probab=98.76 E-value=6.8e-09 Score=65.37 Aligned_cols=312 Identities=13% Similarity=0.059 Sum_probs=157.8 Q ss_pred eeec---cchHHHHHHHhhHhHHHHhHhhccccccchhHHHHh-hCCCcEEEecccccCCCcccccCCCccccchhhccc Q lcl|NC_015254. 
20 IADV---IVPEVFNKYVTERTAESSALLQSGIISNDKDLDELA-KSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISA 95 (346) Q Consensus 20 l~d~---i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~-~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~ 95 (346) ++|- ++||+++.-+.+.+.+.+-|.+ .+-++- ..+.. +.-||+|+||.=....-......++ ..++++.++. T Consensus 1 MaN~llT~ip~iia~~al~~l~~~lV~~~--lVnr~y-~~e~~~~k~GDTV~I~~p~~~~~~~~~~~~~-~~~~~~~l~e 76 (423) T protein:vir:17 1 MPNNLDSNVSQIVLKKFLPGFMSDLVLAK--TVDRQL-LAGEINSSTGDSVSFKRPHQFSSLRTPTGDI-SGQNKNNLIS 76 (423) T ss_pred CccchhhhhHHHHHHHHHHHHHhhcccch--hhcccC-CcchhhcccCCEEEEeeCCcceeecccCccc-CCcccCcccc Confidence 3333 3699998888888777776633 233322 22222 3469999998655542222111222 2456788877 Q ss_pred ceeEEEE-EeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccccc Q lcl|NC_015254. 96 AKDIARL-HMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFT 174 (346) Q Consensus 96 ~~~~a~~-~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~ 174 (346) .+...++ +....++.++|+...+.-.|. +++.++.....+++++.+|++.+.+.. .+.. +..++ ..-. T Consensus 77 ~~v~l~id~~k~va~~v~d~E~~~~i~~~-~~~l~~A~~aLA~~vd~~ia~~~~~~a-~~~~--------gt~~t-~~~a 145 (423) T protein:vir:17 77 GKATGRVGNYITVAVEYQQLEEAIKLNQL-EEILAPVRQRIVTDLETELAHFMMNNG-ALSL--------GSPNT-PITK 145 (423) T ss_pred ceeEEEeeceeeeeeeecHHHHhcChhHH-HHHHHHHHHHHHHHHHHHHHHHHhhcc-cccc--------ccCCc-cccc Confidence 6644443 345557899999887666675 677777778889999999887764321 1111 11111 1124 Q ss_pred HHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhh-hhhhcccc-----cCcee-eEEeceEEEEeCCCcc-CCCceE Q lcl|NC_015254. 175 GDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQG-LIEFMLDS-----DNKKF-PTYMGKRVIVDDGLPA-KDGVYT 244 (346) Q Consensus 175 ~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~-li~~~~~s-----~~~~i-~~~~G~~VVvdD~~p~-~~g~yt 244 (346) ++.+.++..+|.+.. ..-+.+|+.|..+..|++.. .+..-... ..+.| +.+.|.+|+.|+.+|. +.|.+. 
T Consensus 146 ~~~i~~a~~~Ld~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~~~~~~~~alr~g~i~G~i~GFdvy~Snnip~~T~gt~~ 225 (423) T protein:vir:17 146 WSDVAQTASFLKDLGVNEGENYAVMDPWSAQRLADAQTGLHASDQLVRTAWENAQIPTNFGGIRALMSNGLASRTQGAFG 225 (423) T ss_pred HHHHHHHHHHHHhccCCcCCCEEEeChHHHHHHhccccceecccccchHHHhhccceeeecceEEEEeCCCcccccccee Confidence 788999999997754 23478899999999998653 22211111 12445 7999999999999994 455543 Q ss_pred -EEEEc-----CCeeEEeecCCcc--ceeeee--cCCcceeEEEEeeEEeeeeee------------eeecccc-----c Q lcl|NC_015254. 245 -SYIFG-----EGAFGLGNGEAPV--PTETDR--EKLKGNDILINRQHFLLHPRG------------IAWQEKS-----V 297 (346) Q Consensus 245 -t~l~~-----~GAi~~~~~~~~~--~vE~dR--d~~~g~~~l~~r~~~~~~~~G------------~s~~~~~-----~ 297 (346) +.... +++...+..+..+ ...+.+ +.+...|.+.---.+.+|+.- .+|.-.. . T Consensus 226 ~t~~~~~~~~v~~~a~~~~~~~~~~~~~~~~~~~g~l~~GD~~t~aGv~~v~~~tk~v~~~~~t~~~~~~~v~~~~~~~a 305 (423) T protein:vir:17 226 GTLTVKTQPTVTYNAVKDSYQFTVTLTGATTSVTGFLKAGDQVKFTNTYWLQQQTKQALYNGATPISFTATVTADANSDS 305 (423) T ss_pred ceeeecccccccccccccccceeeeeeeeeeeccCceeecceEEecceeeecccccccccccccccceEEEEEecccccc Confidence 11111 1221111111111 111111 122223344333444444422 2222110 0 Q ss_pred cC-----CCCC-------------hHHhcCCcCceeeec-ccccceEEEEEecccccccCCCCCCCCC Q lcl|NC_015254. 298 AG-----HSPT-------------NTEIEKGNNWKAVYE-SKNIRIVAFVHKNGVPGKKKETAPEGIK 346 (346) Q Consensus 298 ~~-----~sPt-------------~a~L~~~~NW~~v~~-~K~i~iv~~~~k~~~~~~~~~~~~~~~~ 346 (346) ++ .+|. .+..++++.|+.+.. .-..+-=.+=||.+......|-++.|.. 
T Consensus 306 ~~~~tv~i~p~~i~~~~~~~~~~v~a~~a~~~~vT~~~~a~~t~~~nl~~~~~a~~l~~~pl~~~~~~ 373 (423) T protein:vir:17 306 SGDVTVTLSGVPIYDTTNPQYNSVSRQVAAGDAVSVVGTASQTMKPNLFYNKFFCGLGSIPLPKLHSI 373 (423) T ss_pred cCceEEEecCccccccCCcccccceecccCCceeeccccccCCeeEEEEecCcceEEEEEcccCCCcc Confidence 00 1121 133445555554311 0001111122333333333333333332 No 121 >protein:vir:100884 Length: 389 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1473 # MgeName: Lc-Nu # Cross-refs: genbank:acc:YP_358764;genbank:gi:78000028;genbank:GeneID:3726155 Probab=98.76 E-value=7.3e-09 Score=65.19 Aligned_cols=275 Identities=13% Similarity=0.026 Sum_probs=140.5 Q ss_pred Cc------------ccee-c-cee-eecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcE Q lcl|NC_015254. 1 MI------------KKLR-M-NLQ-KFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNM 65 (346) Q Consensus 1 ~~------------~~~~-~-~~q-~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~t 65 (346) |+ .-+| . ..+ ..++. ++.-.-.++|+.+...+.+...+.+.+.+.. .....++.. T Consensus 83 ~~~~~~~~~~~~~~~~lr~~~~~~~~~~~~-t~~~gg~~vP~~~~~~i~~~~~~~~~l~~~~---------~~~~~~~~~ 152 (389) T protein:vir:10 83 LSKKPIDAKKKAINDFIHSHGKVIDATSKV-TSTEAGVLIPEEIIYDPTAEVNSVVDLSTLV---------TKTPVTTPK 152 (389) T ss_pred cchhHHHHHHHHHHHHhhcchhhhhhhccc-ccCCcceeehHHHHHHHHHHHHhhhhHHhhc---------ceeeccCCe Confidence 11 0000 0 000 01111 1111235678887776666666555553321 111224566 Q ss_pred EEecccccCCCcccccCCCccccc-hhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 66 INMPFWQDLTGEDEILDDGEGALT-PGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLI 144 (346) Q Consensus 66 i~~P~~~~l~g~ae~~~dg~~~it-~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~ll 144 (346) .++|....-++....+.|+. 
..+ ..+.+-.+-....++.+.-+.++++...-+.-|....+.+.+++...+..+..++ T Consensus 153 ~~~~~~~~~~~~~~~~~E~~-~~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~i~ 231 (389) T protein:vir:10 153 GTYPILKRATDRFSSVAELA-ENPKLAEPEFNKVDWSVATYRGAIPLSEEAIADSAVDLTALVGQSIKEKSVNTYNAMIA 231 (389) T ss_pred eEEEEEecCCCccccccccc-cccccccccceeeeeeheeeEeeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHh Confidence 78888876555545566654 333 3445555555556666666777777655555567788999999999999888777 Q ss_pred HHHHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcccc----- Q lcl|NC_015254. 145 ASLNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLDS----- 217 (346) Q Consensus 145 a~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~s----- 217 (346) ..+.+.. ..+.....+++.|.++....=+... -.+|+||+.++..|++..-- .++... T Consensus 232 ~g~~~~~--------------~~~~~~~~~~d~l~~~~~~~~~~~~-~a~~~~n~~~~~~L~~lkd~~G~~i~~~~~~~~ 296 (389) T protein:vir:10 232 PVLQSFT--------------AKKTTTDTLVDSLKHILNVDLDPAY-SRALVVTQSLFNTLDTLKDKNGRYLLHDASDSI 296 (389) T ss_pred hhhcccc--------------cccccccccHHHHHHHHHhhhhhhh-CcEEEecHHHHHHHHHhhccCCCeeeecCcccc Confidence 6543210 0112233567778777664322222 26899999999999975321 122211 Q ss_pred -cCceeeEEeceEEEEeCCC-c-cCCCceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEe---eeeee Q lcl|NC_015254. 218 -DNKKFPTYMGKRVIVDDGL-P-AKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFL---LHPRG 289 (346) Q Consensus 218 -~~~~i~~~~G~~VVvdD~~-p-~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~---~~~~G 289 (346) .++...+++|+||++.+++ + ...|. .+++|+. -++.+...+ .+.+++.++..-. +.+..-.++. +|+.. 
T Consensus 297 ~~~~~~~~l~G~pV~~~~~~~~~~~~~~-~~~~~gd~~~~~~~~~~~-~~~i~~~~~~~~~-~~~~~~~r~d~~~~~~~a 373 (389) T protein:vir:10 297 TDGTAKGTILGVPVYVVGDTLLGSLAGD-QKAFVGDLKRGVLFTDRQ-QVTLAWEDSKIYG-KYLGAAFRFGVQKADSKA 373 (389) T ss_pred cccccccccccceeEEecccccCCCCCc-eEEEEeeccccEEEEeec-ceEEEeecccccc-ceEEEEEEeccEEecccc Confidence 1233468999999876553 3 33333 3456653 234343322 3456655544322 2222212221 33333 Q ss_pred eeeccc-cccCCCCCh Q lcl|NC_015254. 290 IAWQEK-SVAGHSPTN 304 (346) Q Consensus 290 ~s~~~~-~~~~~sPt~ 304 (346) +.+-.- .+.+.+|+- T Consensus 374 ~~~~~~~~~~~~~~~~ 389 (389) T protein:vir:10 374 GYFVTNTDVPGSALGK 389 (389) T ss_pred eEEEEeeccCCCCCCC Confidence 333211 112223333 No 122 >protein:vir:4092 Length: 390 # NCBI annotation: major capsid protein a # Family: family:all:635 # MgeID: mge:86 # MgeName: 2389 # Cross-refs: genbank:acc:NP_510986;swissprot:trembl:q8w604;genbank:gi:17488508;uniprot:Q8W604;genbank:GeneID:1260361 Probab=98.76 E-value=1.1e-09 Score=69.73 Aligned_cols=284 Identities=11% Similarity=-0.018 Sum_probs=139.4 Q ss_pred Cccceecc----eeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRMN----LQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~----~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) +...++-. ++.+.+..++.-...++|+.+..-+.+...+.+.+.+- ......++....+|.+... + T Consensus 68 ~~~~l~~~~r~~~~~~~~~~~~~~gg~lvP~~~~~~I~~~~~~~s~i~~~---------~~~~~~~~~~~~i~~~~~~-~ 137 (390) T protein:vir:40 68 GANALTSDESKYYNEVIAGNGFAGVTALLPPTVFERVFEDLTVEHPLLSK---------INFVNTTATTEWIISVGDV-A 137 (390) T ss_pred CchhccHHHHHHHHHHHhccCcccCcccccHHHHHHHHHHHHhhhhhhhh---------ceeeecCCceeEEEEEcCC-c Confidence 11111111 11122333444456678888877776666555554331 1112235667778877653 4 Q ss_pred cccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-----HHhh Q lcl|NC_015254. 
77 EDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS-----LNGI 150 (346) Q Consensus 77 ~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~-----L~G~ 150 (346) .+.-+.|+. .+.. .+.+-++-.-..++.+.-+.++++...-+.-|..+.+.+++++.+.+..++.+|.- -.|+ T Consensus 138 ~a~~~~E~~-~~~~~~~~~f~~i~l~~~k~~~~i~iS~ell~ds~~~l~~~i~~~la~~i~~~~~~a~l~G~G~~~P~Gi 216 (390) T protein:vir:40 138 TAWWGPLCA-EIKEVLDNGFDKIQTGMYKLSAYIPVCNAMLDLGPSWLDQYVRTILGEAMALGLEAGIVNGSGKDQPIGM 216 (390) T ss_pred ceeeecccc-ccCccccccceeeEeeeeeEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhhcccCCCcccee Confidence 554455643 3432 23333333333444444466776666666667788899999999999999966641 0011 Q ss_pred hhhhhhhhcceeeeccccccccccHHHHHH----HHHHhCcc---ccCceEEEEchHHHHHHHhhhhhhhcccccCcee- Q lcl|NC_015254. 151 TASGALDSNKLDVSTETGDDSYFTGDTFLS----ATYKLGDA---EGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNKKF- 222 (346) Q Consensus 151 ~~~~~~~~~~~dis~~~~~~~~~~~~~l~~----A~~~~GD~---~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~~i- 222 (346) +........... .. .....++.....+ -...+++. ...-.+|+||+.++..+.+. +..+++.++..+ T Consensus 217 l~~~~~~~~~~~--~~-~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~a~~i~n~~t~~~~l~~--~~~~~d~~G~~v~ 291 (390) T protein:vir:40 217 MRDLNNVTAGEH--PV-KTATPLTDLTPATLATKVMLPLTDNGKKSVSDAILVINPADYWSKIYA--ATSYMTPQGVWVT 291 (390) T ss_pred eecccccccccc--cc-ccccccchhhHHHHHHHHHHHhhcchhhhhcCceEEEcchhHHHHHHH--HhhccCCCCcccc Confidence 111110000000 00 0112233222222 22223332 23457899999987654432 223444444322 Q ss_pred -eEEeceEEEEeCCCccCCCceEEEEEcCCe-eEEeecCCccceeeeecC--CcceeEEEEeeEEeeeee---ee---ee Q lcl|NC_015254. 223 -PTYMGKRVIVDDGLPAKDGVYTSYIFGEGA-FGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHPR---GI---AW 292 (346) Q Consensus 223 -~~~~G~~VVvdD~~p~~~g~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~~---G~---s~ 292 (346) ....|++||+++.||.+. ++|+.-. +.+.. +..+.++++++. ..+++.+....++-+.|+ .| +. 
T Consensus 292 ~~~~~g~pvv~~~~~p~~~-----i~~Gd~s~~~i~~-~~~~~v~~~~~~~f~~~~~~~r~~~r~dg~v~~~~A~~~l~~ 365 (390) T protein:vir:40 292 GILPVPLEIVQSVAVPVGK-----AVAGRAKDYFMGI-GSEQVIRTSTEYRLLDDETLYYAKQYANGRPKDNSSFLVFDI 365 (390) T ss_pred ccCCCceeEEEcCCCCCCc-----EEEEeeceEEEEe-ecceEEEecchhhhhcCcEEEEEEEEeCCEEecccceEEEEe Confidence 234799999999999753 4443322 12222 223345544433 456677777777766553 22 22 Q ss_pred ccc-----------cccCCCCChHH Q lcl|NC_015254. 293 QEK-----------SVAGHSPTNTE 306 (346) Q Consensus 293 ~~~-----------~~~~~sPt~a~ 306 (346) +.. ++.+.+|+.+| T Consensus 366 ~~~~~~~~~~~~~~~~~~~~~~~~~ 390 (390) T protein:vir:40 366 TGLEGSPAIDVNVVNNATPSETPAE 390 (390) T ss_pred eccCCCCCCCcceeeCCCCCCCCCC Confidence 211 01122334444 No 123 >protein:vir:95376 Length: 425 # NCBI annotation: phage major capsid protein # Family: family:all:635 # MgeID: mge:1567 # MgeName: GBSV1 # Cross-refs: genbank:acc:YP_764476;genbank:gi:115334630;genbank:GeneID:5179263 Probab=98.74 E-value=2.3e-09 Score=67.97 Aligned_cols=278 Identities=13% Similarity=0.099 Sum_probs=143.7 Q ss_pred Cc------------cceecceee----ecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCc Q lcl|NC_015254. 1 MI------------KKLRMNLQK----FAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGN 64 (346) Q Consensus 1 ~~------------~~~~~~~q~----~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ 64 (346) ++ ..++-...- +.+..+|.-...++|+.+.+.+.+...+.+.+.+..- .....|+ T Consensus 110 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~vP~~~~~~Ii~~l~~~~~i~~~~~---------~~~~~g~ 180 (425) T protein:vir:95 110 MNRLQVREMLKTGEYYKRSEVVEFYEKFRNLRAVAGGELTIPEVVVNRIMDIMGDYTTLYPLVD---------KIRVKGT 180 (425) T ss_pred HHHHHHHHHHhhhhhhhhhHHHHHHHHHHhhcccccCceeccHHHHHHHHHHHHhhhhHHHhhc---------eeecCce Confidence 00 000100000 1122223334568899888888887777766654311 1122354 Q ss_pred EEEecccccCCCcccccCCCccccchhhcccceeE-EEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 
65 MINMPFWQDLTGEDEILDDGEGALTPGNISAAKDI-ARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVL 143 (346) Q Consensus 65 ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~-a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~l 143 (346) ..+|..... +.+.-+.|+. .++.....+..++ -..++.+.-+.+++....-+..+....+.++++..+++..++.+ T Consensus 181 -~~ip~~~~~-~~a~~v~E~~-~~~~~~~~~f~~i~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~i 257 (425) T protein:vir:95 181 -TRILVDTDT-SPATWIEQSG-ALPTGDVGTIASIDFDGFKVGKVTFVDNYLLQDSIINLDDYVTKKIARAIAKALDLAI 257 (425) T ss_pred -eEEEEecCC-cccccccccc-ccccccccccceeeeeheeeeeeehhhHHHHhccHHHHHHHHHHHHHHHHHHHHHHHh Confidence 478876554 4454566764 4554444333333 33444555567777766556667788899999999999999976 Q ss_pred HHH-------HHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccc--cCceEEEEchHHHHH-HHhhh---- Q lcl|NC_015254. 144 IAS-------LNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMN-LRKQG---- 209 (346) Q Consensus 144 la~-------L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~-L~~~~---- 209 (346) |.- -.|++....... ..+......+++.+.++..++.-.. ..-.+++||+.++.. |.... T Consensus 258 l~G~G~~~~~p~Gil~~~~~~~------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~~l~~l~~~kd 331 (425) T protein:vir:95 258 VKGTGAANKQPLGIIPSLPPEN------QVTVEADNNLLKNLVKQIGLIDTGDDSVGEIVAVMKRSTYYNRLVEFSIQVD 331 (425) T ss_pred hccCCCCccccceeeccccccc------ccccccccchHHHHHHHHHhhhhhccccCceEEEEeChHHHHHHHHHHhhcC Confidence 641 012222111111 0112233467888998887765433 334578999988643 32211 Q ss_pred -hhhhcccccCceeeEEeceEEEEeCCCccCCCceEEEEEcCCe-eEEeecCCccceeeeecC--CcceeEEEEeeEEe- Q lcl|NC_015254. 210 -LIEFMLDSDNKKFPTYMGKRVIVDDGLPAKDGVYTSYIFGEGA-FGLGNGEAPVPTETDREK--LKGNDILINRQHFL- 284 (346) Q Consensus 210 -li~~~~~s~~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~- 284 (346) -=.++....++..++++|+||+++|.||... ++||.-. +.+.. +..+.++..++. 
..+...+....++- T Consensus 332 ~~g~~i~~~~~~~~~~l~G~pvv~~~~~~~~~-----i~~Gd~~~~~~~~-~~~~~i~~~~~~~f~~~~~~~~~~~r~d~ 405 (425) T protein:vir:95 332 SNGNVVGKLPNLRTPDLLGLRVVFNNFLDDDT-----VLFGEFEQYTLVE-RENITIDSSTHVKFTEDQTAFRGKGRFDG 405 (425) T ss_pred CCCceeeccCCCCCccccceeeEEcCcCCCcc-----EEEEecccEEEEe-ecceEEEeecccccccCceEEEEEEeeCc Confidence 1123333334556789999999999999763 4443322 12222 222344444443 23444454444443 Q ss_pred --eeeeeeeeccccccCCCCC Q lcl|NC_015254. 285 --LHPRGIAWQEKSVAGHSPT 303 (346) Q Consensus 285 --~~~~G~s~~~~~~~~~sPt 303 (346) ++|..|..-+-+.. ..+. T Consensus 406 ~~~~~~a~~~~~i~~~-~~g~ 425 (425) T protein:vir:95 406 KPVKPEAFVLVTITDP-VQGA 425 (425) T ss_pred EeecccceEEEEecCc-CCCC Confidence 33444443321110 1111 No 124 >protein:vir:100057 Length: 375 # NCBI annotation: T7-like capsid protein # Family: family:all:975 # MgeID: mge:1604 # MgeName: P-SSP7 # Cross-refs: genbank:acc:YP_214206;genbank:gi:61806429;genbank:GeneID:3294737 Probab=98.71 E-value=8.7e-09 Score=64.79 Aligned_cols=289 Identities=12% Similarity=0.134 Sum_probs=156.9 Q ss_pred Cccce-ecceee---ecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKL-RMNLQK---FAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~-~~~~q~---~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) |-..+ ..|+-. +. ..+---+|+. |+|..-|...+.+.+.|.. .-...+ -.+|+++.+|..+.. T Consensus 5 ~~~~~~~~n~~t~~~~~--~~~~~~al~l-e~f~geV~~~f~~~si~~~------~~~~rt--i~~Gksv~f~~iG~~-- 71 (375) T protein:vir:10 5 NQVALGRSNLSTGTGYG--GATDKYALYL-KLFSGEMFKGFQHETIARD------LVTKRT--LKNGKSLQFIYTGRM-- 71 (375) T ss_pred cccccCccccCCccccc--cccchHHHHH-HHHhHHHHHHHHHHHhhhc------cccccc--cccCceEEEEeeeee-- Confidence 11111 111100 10 0111124555 8888777777777766532 111111 226999999988765 Q ss_pred cccccCCCccccch---hhcccceeEEEE-EeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHH-hhh Q lcl|NC_015254. 
77 EDEILDDGEGALTP---GNISAAKDIARL-HMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLN-GIT 151 (346) Q Consensus 77 ~ae~~~dg~~~it~---~~lt~~~~~a~~-~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~-G~~ 151 (346) .......|+ .+.. ..+.+.+..-++ ...-..+.+.|+....+..|.+.+++++.+.+.++.+|+.++..|. ++. T Consensus 72 t~~~~t~G~-~i~~~~~~d~~~te~~l~ID~~~y~~~~VdDiD~aqa~~Dlr~e~s~~~G~aLA~~~D~~i~~~l~kaa~ 150 (375) T protein:vir:10 72 TSSFHTPGT-PILGNADKAPPVAEKTIVMDDLLISSAFVYDLDETLAHYELRGEISKKIGYALAEKYDRLIFRSITRGAR 150 (375) T ss_pred EEeeecCCc-CcCCccccCCCCCceEEEecchhhhhhhHhhHHHHhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Confidence 233333343 3321 133333333222 2334568899999999999999999999999999999999988774 332 Q ss_pred hhhhhhh------cceeeecc--cccccccc----HHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhh----hhhhh Q lcl|NC_015254. 152 ASGALDS------NKLDVSTE--TGDDSYFT----GDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQ----GLIEF 213 (346) Q Consensus 152 ~~~~~~~------~~~dis~~--~~~~~~~~----~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~----~li~~ 213 (346) ....... ...-+... ..+...++ ++.|.+|..+|.++. ..-..++|.|..|..|.+. .+++. T Consensus 151 ~~~p~~~~~~~~~Gg~~i~~~sg~~~~~~~ta~~~~~ai~~a~~~Lde~~VP~~~R~~vv~P~~y~~Ll~~~d~~~~~n~ 230 (375) T protein:vir:10 151 SASPVSATNFVEPGGTQIRVGSGTNESDAFTASALVNAFYDAAAAMDEKGVSSQGRCAVLNPRQYYALIQDIGSNGLVNR 230 (375) T ss_pred hccccccccccccCcceeeeccccccccccCHHHHHHHHHHHHHHHhhcCCCCCCCEEEeChHHHHHHHhcCCccceeee Confidence 2211000 00111111 11112233 466667777776544 3347899999999999875 23332 Q ss_pred ccccc----CceeeEEeceEEEEeCCCccCCCc------------------------------------e---------- Q lcl|NC_015254. 214 MLDSD----NKKFPTYMGKRVIVDDGLPAKDGV------------------------------------Y---------- 243 (346) Q Consensus 214 ~~~s~----~~~i~~~~G~~VVvdD~~p~~~g~------------------------------------y---------- 243 (346) ....+ ++.++.+.|++|+.+..+|...+. 
| T Consensus 231 d~~~~~~~~~g~v~~i~Gv~V~~Sn~lP~~~~~~~~~g~~~~~~a~~~~~~~~~~~~~~~~~~~g~~~~y~~d~~~~~~~ 310 (375) T protein:vir:10 231 DVQGSALQSGNGVIEIAGIHIYKSMNIPFLGKYGVKYGGTTGETSPGNLGSHIGPTPENANATGGVNNDYGTNAELGAKS 310 (375) T ss_pred cccccceeccceEEEEeceEEEEeccccccccccccccccccccchhhhhccccccCCcceeeccccccccccccccCce Confidence 22222 356789999999999999954321 1 Q ss_pred EEEEEcCCeeEEeecCCccceee---eecCCcceeEEEEeeEEeeeeeeeeeccccccCC--CCChHHh Q lcl|NC_015254. 244 TSYIFGEGAFGLGNGEAPVPTET---DREKLKGNDILINRQHFLLHPRGIAWQEKSVAGH--SPTNTEI 307 (346) Q Consensus 244 tt~l~~~GAi~~~~~~~~~~vE~---dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~--sPt~a~L 307 (346) ...++-+-|++.....+ ..+|+ ||++....+.+.+++.|+..++ .|. ..+.=. .|-++.. T Consensus 311 ~~~~~~~~A~g~v~~~~-~~~~~~~~~~~~~~q~~~i~~~~a~G~~~l--rp~-~av~l~~~~~~~~~~ 375 (375) T protein:vir:10 311 CGLIFQKEAAGVVEAIG-PQVQVTNGDVSVIYQGDVILGRMAMGADYL--NPA-AAVELYIGATAPSAF 375 (375) T ss_pred EEEEEchhheeeeeeec-cccccccchhhheeeeeeeeeeeeeccCcc--Cce-eEEEEecCcCccccC Confidence 11344555555544443 23454 3567777777777777665543 221 111000 1222222 No 125 >protein:vir:103323 Length: 364 # NCBI annotation: major capsid-like protein # Family: family:all:2806 # MgeID: mge:1609 # MgeName: Era103 # Cross-refs: genbank:acc:YP_001039668;genbank:gi:125999997;genbank:GeneID:4818399 Probab=98.71 E-value=2.6e-08 Score=62.16 Aligned_cols=297 Identities=12% Similarity=0.072 Sum_probs=167.3 Q ss_pred CccceecceeeecCCceeee------eec-cchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRI------ADV-IVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l------~d~-i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) |- -+|..|+- ++. +-=|+|..-|...+.+.+.|..- -...+ -.+|+++.+|+.+. 
T Consensus 1 ms----------~~n~~t~~~~~~~~~~~al~le~f~geV~taf~~~s~~~~~------~~~rt--i~~gkS~q~~~iG~ 62 (364) T protein:vir:10 1 MS----------NPNVLTQPAVSASGEVDSLLIEKFNNRVHEQYLKGENLLQW------FDVQE--VVGTNSVSNKYIGE 62 (364) T ss_pred CC----------CcccccccccccccchhhhhhhhhhhhHHHHHHHHHhhcCc------ceeee--ecccceEEeeeeee Confidence 21 12222221 222 22277777777777666666321 11111 23799999999876 Q ss_pred CCCcccccCCCccccchhhcccceeEEEEEe-ecCcceechHHHhhhcch-HHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARLHM-RGKAWRTNDLAKALSGDD-PMRAIGDLVVEYWNRRRQAVLIASLNGIT 151 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~d-p~~~i~~q~a~~~~~~~~~~lla~L~G~~ 151 (346) .. +....-|+ .+.++.+...+..-+|=. .--...+.|+.....-=| +-.+++++.+.++++.+|..++..++... T Consensus 63 ~~--~~~~~~G~-~ld~~~~~~~k~~itID~ll~a~~~V~diDe~q~~~D~vR~e~s~e~G~ALA~~~Dq~i~~~v~~aa 139 (364) T protein:vir:10 63 TE--LQVLSPGK-SPDASPTEFDKNRLVVDTTVIARNTVAHFHDVQNDIDGLKSKLSVNQAKKLKKMEDSMVIQQLVLGG 139 (364) T ss_pred eE--EeeeccCc-ccCCCCcccCcEEEEecceeeechhhhhHHHHhcCccchhHHHHHHHHHHHHHHHHHHHHHHHHhhh Confidence 52 33333443 466666666654444322 122356888888877667 67799999999999999999887664332 Q ss_pred -hhhhhh-------hcc--eeeeccccccccccH----HHHHHHHHHhCccc--cCceEEEEchHHHHHHHhh-hhhh-- Q lcl|NC_015254. 152 -ASGALD-------SNK--LDVSTETGDDSYFTG----DTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQ-GLIE-- 212 (346) Q Consensus 152 -~~~~~~-------~~~--~dis~~~~~~~~~~~----~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~-~li~-- 212 (346) +..... ... .++.. +.+....++ ++|.+|.+.|.++. ..-.+++|.|..|..|++. 
++++ T Consensus 140 ~a~~~~~~~~~~~~~~g~~i~~~~-~a~~~~~~~~~l~~ai~~a~~~LdEkdVP~~~R~~vv~P~~y~~Ll~~~~lvn~d 218 (364) T protein:vir:10 140 ISNTEAIRKNPRVAGHGFSIHIVG-LASSFLTSPQYMMAAIEMAMEQQTEQEVDTSELCGLMPWTAFNCLRDADRIVDKS 218 (364) T ss_pred hhcccccccCCcccCCcceeeecc-cCcchhhhHHHHHHHHHHHHHHHhhcCCCccccEEEeChHHHHHHhcCCcccccc Confidence 111111 111 11111 112223333 44556777776644 2448999999999999976 4554 Q ss_pred hccccc----CceeeEEeceEEEEeCCCccCCC-----------------------------ceEEEEEcCCeeEEeecC Q lcl|NC_015254. 213 FMLDSD----NKKFPTYMGKRVIVDDGLPAKDG-----------------------------VYTSYIFGEGAFGLGNGE 259 (346) Q Consensus 213 ~~~~s~----~~~i~~~~G~~VVvdD~~p~~~g-----------------------------~ytt~l~~~GAi~~~~~~ 259 (346) |...+. .+.++.+.|++|+.|+.+|...+ +....+|-+-|++..... T Consensus 219 ~~~~~~~~~~~G~v~~v~Gv~Vv~Sn~lP~~~~~~~~t~~~t~h~ls~~~~g~~y~v~~d~~~~~~~~f~~~Al~tv~~~ 298 (364) T protein:vir:10 219 YTIAASDNTVDGFVLKSWNTPIVPSNRFPKLSDNTEGTGNTKHHKLSNAGNGNRYDVTAGQTSAQAVLFTQDALLVGRTI 298 (364) T ss_pred ccccCCCccccceeEEEeceEEEeccccccccccccccccccccccccccCCcccccccccceeEEEEEecceEEEEEEe Confidence 322222 35688999999999999984211 223456777888876655 Q ss_pred CccceeeeecCCcceeEEEEeeEEeeeeee---e-eeccccccCCCCC--hHHhcCCcCceeeecccccc Q lcl|NC_015254. 260 APVPTETDREKLKGNDILINRQHFLLHPRG---I-AWQEKSVAGHSPT--NTEIEKGNNWKAVYESKNIR 323 (346) Q Consensus 260 ~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G---~-s~~~~~~~~~sPt--~a~L~~~~NW~~v~~~K~i~ 323 (346) ++..|..|++....+.+.+++.|++.++= . ..+-.. ++...+ ++-|+- +|-..+|. 
|.+- T Consensus 299 -~~t~e~~~~~~~~~~~ida~~a~G~g~lRPeaa~~i~~~~-~~~~~~~~~~~~~~-~~~~~~~~-~~~~ 364 (364) T protein:vir:10 299 -SITGDIFYEKKEKTWYIDTFLAEGAIPDRWEAVAVVTAAD-TAELATDHNAILAR-ANRKVTLT-KSVN 364 (364) T ss_pred -cceeeeeeccceeeeeeeeehcccCcccCccceEEEEecC-CCCCccchhhhhhh-ccccEEEE-EecC Confidence 45688889998888888887777766542 1 111111 111111 222332 22222221 1111 No 126 >protein:vir:100172 Length: 394 # NCBI annotation: putative major head protein # Family: family:all:21 # MgeID: mge:1524 # MgeName: phi AT3 # Cross-refs: genbank:acc:YP_025031;genbank:gi:48697264;genbank:GeneID:2948270 Probab=98.68 E-value=1.9e-08 Score=62.92 Aligned_cols=284 Identities=15% Similarity=0.031 Sum_probs=136.5 Q ss_pred Cccce--ecceeeecCCceee-eeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCc Q lcl|NC_015254. 1 MIKKL--RMNLQKFAAGKNTR-IADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGE 77 (346) Q Consensus 1 ~~~~~--~~~~q~~~a~~~T~-l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ 77 (346) +++-+ .+.....+++..|. -.-..+|+.+..-+.+...+.+.+.+- ......++...++|....-++. T Consensus 96 ~~~~l~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~~---------~~~~~~~~~~~~~~~~~~~~~~ 166 (394) T protein:vir:10 96 INDFIHSHGKVIDNAAGHVTSTEAGVLIPEEIIYDPTAEVNSVVDLSTL---------VTKTPVTTPKGTYPILKRATDR 166 (394) T ss_pred HHHHHhccchhhhhhhcccccccCceeccHHHHHHHHHHHHhhhhhhhh---------ceeeeccCCceEEEEEecCCCc Confidence 00000 01111122221121 123566777655555554444433221 1112235667888887765555 Q ss_pred ccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh Q lcl|NC_015254. 78 DEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD 157 (346) Q Consensus 78 ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~ 157 (346) ...+.|+.........+-.+-.-..++.+.-+.++++...-+.-|....+.+++++...+..+..++..+. . 
T Consensus 167 ~~~~~E~~~~~~~~~~~~~~v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~il~g~g----~---- 238 (394) T protein:vir:10 167 FSSVAELAENPALAEPEFEQVDWSVSTYRGAIPLSEEAIADSAVDLTSLVGQSINEKSVNTYNAMIAPVLQ----S---- 238 (394) T ss_pred cccccccccccccccccceeEEeeeeeeEeeehhHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhhccc----c---- Confidence 55566664322233444444444555555556666665544445677789999999999998887665432 0 Q ss_pred hcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhh--hcccc------cCceeeEEeceE Q lcl|NC_015254. 158 SNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIE--FMLDS------DNKKFPTYMGKR 229 (346) Q Consensus 158 ~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~--~~~~s------~~~~i~~~~G~~ 229 (346) ... .+.....+++.|.++....=+.... .+|+||+.++..|++..--+ ++... .++.-++++|+| T Consensus 239 ~~~------~~~~~~~~~d~l~~~~~~~~~~~~~-a~~vmn~~~~~~l~~lkd~~G~~i~~~~~~~~~~~~~~~~L~G~P 311 (394) T protein:vir:10 239 FTA------KATTTDTLVDSLKHILNVDLDPAYS-RALVVTQSLFNTLDTLKDKNGRYLLHDASDSITDGTAKGTVLGVP 311 (394) T ss_pred ccc------ccccccccHHHHHHHHHhhhhhhcc-CEEEecHHHHHHHHHhhccCCCeeeeccccccccCCcccccccce Confidence 000 0112234667788877654333332 68999999999999753111 21111 122335899999 Q ss_pred EEEeCCC--ccCCCceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccCCCCChH Q lcl|NC_015254. 230 VIVDDGL--PAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNT 305 (346) Q Consensus 230 VVvdD~~--p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a 305 (346) |++.+.+ |...|.. .++|+. -++.+.. +..+.++..++.... +.+....++-+ T Consensus 312 V~~~~~~~~~~~~~~~-~i~~gd~s~~~~~~~-~~~~~v~~~~~~~~~-~~~~~~~r~d~-------------------- 368 (394) T protein:vir:10 312 VYVVGDALLGSAAGDQ-KAFVGDLKRGVLFAD-RQQVTLAWEDSKIYG-RYLGAAFRFGV-------------------- 368 (394) T ss_pred eEEecccccCCCCCce-EEEEeeccccEEEEe-ecceEEEEecccccc-eeEEEEEEecc-------------------- Confidence 9887654 3333433 344442 2233332 223445555543322 22221111111 Q ss_pred HhcCCcCceeeecccccceEEEEEecccccccCCCCCCCC Q lcl|NC_015254. 
306 EIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEGI 345 (346) Q Consensus 306 ~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~~ 345 (346) .+.++++|.++.+ .+...+|.+.+|- T Consensus 369 ---------~~~~~~ai~~~~~-----~~~~~~~~~~~~~ 394 (394) T protein:vir:10 369 ---------KQADSNAGYFVTN-----TDAASGSTSGTGK 394 (394) T ss_pred ---------EEeccccEEEEEe-----ecccCCCCCCCCC Confidence 1223333322211 1112233334443 No 127 >protein:vir:9704 Length: 394 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:174 # MgeName: 315.2 # Cross-refs: genbank:acc:NP_795466;genbank:gi:28876225;genbank:GeneID:1257769 Probab=98.68 E-value=1.1e-08 Score=64.28 Aligned_cols=267 Identities=10% Similarity=0.002 Sum_probs=133.1 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) .......+.+ +.+.++.-.-..+|+.+..-+.+...+.+.+.+- ......++...++|.+..-++.+.- T Consensus 118 ~~~~~~~~~~--~~~~t~~~gg~liP~~~~~~ii~~~~~~~~l~~~---------~~~~~~~~~~~~~~~~~~~~~~~~~ 186 (394) T protein:vir:97 118 INETTPVEPQ--KDGIKKENAKPVSSEEILYTPAREVKTVVDLKPF---------TTVYQAKKASGKYPVLQRATTKMVT 186 (394) T ss_pred HHhhhhhhhh--ccccccccccccChHHHHHHHHHHhhhhhhhhhh---------ceeeeccCcceEEEEEecCCCccce Confidence 0000011110 0111222234578887766666655444444221 1112234556788888654444445 Q ss_pred cCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhc Q lcl|NC_015254. 81 LDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSN 159 (346) Q Consensus 81 ~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~ 159 (346) +.|+. ..+. ...+-.+-.-..++.+.-..+++....-+.-|....+.+++++...+..+..+|..+.+. 
T Consensus 187 v~E~~-~~~~~~~~~~~~v~l~~~k~~~~i~is~ell~ds~~~~~~~i~~~la~~~~~~~~~~i~~g~~~~--------- 256 (394) T protein:vir:97 187 VAELE-KNPALAKPDFKDVAWNIDTYRGAIPLSQESIDDADVDLVGIVSESISQIKVNTTNDAIAKVLKSF--------- 256 (394) T ss_pred ecccc-cccccccccceeEEeehhheeeehhhHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhhccccc--------- Confidence 66764 3332 333444444444455555566665444444567778999999999998887665433210 Q ss_pred ceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEEeceEEEEeCC Q lcl|NC_015254. 160 KLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTYMGKRVIVDDG 235 (346) Q Consensus 160 ~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~~G~~VVvdD~ 235 (346) ......+++.+.++....-+... -..|+||+.++..|++..-- .++. ...++.-++++|+||++++. T Consensus 257 --------~~~~~~~~~~~~~~~~~~~~~~~-~a~~v~n~~~~~~l~~lkd~~G~~i~~~~~~~~~~~~l~G~pv~~~~~ 327 (394) T protein:vir:97 257 --------TTKTVKNLDEIKALLNGGFDPAY-NVSLIVSQSFYQTLDTLKDGNGRYLLQDDITAVSGKVLLGKPVFVLSD 327 (394) T ss_pred --------cccccccHHHHHHHHHhhhhhhh-CCEEEEcHHHHHHHHHhhccCCCeeeecCcCCCCCceeccceeEEecc Confidence 01123467888888776444332 36799999999998865311 1222 22234456899999999877 Q ss_pred CccCCCceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEe---eeeeeeeeccccccCCCCChHHh Q lcl|NC_015254. 236 LPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFL---LHPRGIAWQEKSVAGHSPTNTEI 307 (346) Q Consensus 236 ~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~---~~~~G~s~~~~~~~~~sPt~a~L 307 (346) +....+. ++||. -++.+.. +....++..++..... .+....+|. .||..|..-. 
.+|+-+-| T Consensus 328 ~~~~~~~---~~~gd~~~~~~~~~-~~~~~~~~~~~~~~~~-~~~~~~r~d~~v~~~~a~~~~~-----~~~~~~p~ 394 (394) T protein:vir:97 328 EVLGANK---AFIGDFKRGVLFAD-RKDLGLRWADNEIYGQ-YLQAVLRFGVSKVDDKAGYYVT-----FTPEPLPL 394 (394) T ss_pred cccCCcc---EEEeeccccEEEEE-ecceEEEEecccccce-eEEEEEEEccEEecccceEEEE-----ecccccCC Confidence 6655542 33432 2222332 2233455544443222 222222221 2223333211 12222223 No 128 >protein:vir:105374 Length: 423 # NCBI annotation: gene 5 protein # Family: family:all:1412 # MgeID: mge:1556 # MgeName: Sf6 # Cross-refs: genbank:acc:NP_958181;genbank:gi:41057283;genbank:GeneID:2716621 Probab=98.66 E-value=1.6e-08 Score=63.39 Aligned_cols=312 Identities=13% Similarity=0.052 Sum_probs=155.5 Q ss_pred eeecc---chHHHHHHHhhHhHHHHhHhhccccccchhHHHHh-hCCCcEEEecccccCCCcccccCCCccccchhhccc Q lcl|NC_015254. 20 IADVI---VPEVFNKYVTERTAESSALLQSGIISNDKDLDELA-KSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISA 95 (346) Q Consensus 20 l~d~i---~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~-~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~ 95 (346) ++|-| +||+++.-+.+.+.+.+-|.+ .+-++- ..+.. +.-||+|++|.=....-......++ +.++++.++. T Consensus 1 MaN~llT~~p~iia~~aL~~l~~~lV~~~--lVnr~y-~~ef~~~k~GDTV~I~~p~~~~~~d~~~~~~-~~~~~~dl~e 76 (423) T protein:vir:10 1 MPNNLDSNVSQIVLKKFLPGFMSDLVLAK--TVDRQL-LAGEINSSTGDSVSFKRPHQFSSLRTPTGDI-SGQNKNNLIS 76 (423) T ss_pred CccchhhhhHHHHHHHHHHHHHhhcccch--hhcccC-CCcccccccCCEEEEeeCCceeeeccCCccc-cccccCcccc Confidence 44433 699998888887777776533 333322 12221 3469999988665542221111222 2467778887 Q ss_pred ceeEEEEE-eecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecccccccccc Q lcl|NC_015254. 96 AKDIARLH-MRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFT 174 (346) Q Consensus 96 ~~~~a~~~-~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~ 174 (346) ++...++- ....++.++|+...+.-.+. +++.++.....+++++.+|++.+.+... +. ....+. ..-. 
T Consensus 77 ~~v~l~id~~k~va~~v~d~E~~~~i~~~-~~~l~~A~~aLA~~vd~~ia~~~~~~~~-~~--------~gt~~t-~~~a 145 (423) T protein:vir:10 77 GKATGRVGNYITVAVEYQQLEEAIKLNQL-EEILAPVRQRIVTDLETELAHFMMNNGA-LS--------LGSPNT-PITK 145 (423) T ss_pred ceeEEEeeceeeeeeeechHHHhcChhhH-HHHHHHHHHHHHHHHHHHHHHHHhhccc-cc--------cccCCc-ccch Confidence 76554443 45557888888877666665 6677777788999999998876543211 11 111111 1124 Q ss_pred HHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhh-hhhhcccc--c---Ccee-eEEeceEEEEeCCCcc-CCCceE Q lcl|NC_015254. 175 GDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQG-LIEFMLDS--D---NKKF-PTYMGKRVIVDDGLPA-KDGVYT 244 (346) Q Consensus 175 ~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~-li~~~~~s--~---~~~i-~~~~G~~VVvdD~~p~-~~g~yt 244 (346) ++.+.++..+|.+.. ..-+.+|+.|..+..|.+.. .+..-... + .+.| +.+.|.+|+.|+.+|. +.|.+. T Consensus 146 ~~~i~~a~~~Ld~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~~~~~~~~alr~g~i~G~i~GFdv~~Snnip~~T~gt~~ 225 (423) T protein:vir:10 146 WSDVAQTASFLKDLGVNEGENYAVMDPWSAQRLADAQTGLHASDQLVRTAWENAQIPTNFGGIRALMSNGLASRTQGAFG 225 (423) T ss_pred HHHHHHHHHHHHhccCCcCCCEEEeChHHHHHHhccccceecccccchhhhhhccceeeecceEEEEeCCCccccccccc Confidence 688999999997754 23478899999999998653 22211111 1 2345 7999999999999996 445443 Q ss_pred -EEEEcCC-----eeEEeecCCccce--eeee--cCCcceeEEEEeeEEeeeeee------------eeecccc-----c Q lcl|NC_015254. 245 -SYIFGEG-----AFGLGNGEAPVPT--ETDR--EKLKGNDILINRQHFLLHPRG------------IAWQEKS-----V 297 (346) Q Consensus 245 -t~l~~~G-----Ai~~~~~~~~~~v--E~dR--d~~~g~~~l~~r~~~~~~~~G------------~s~~~~~-----~ 297 (346) +.....+ +...+..+-.+.+ .+-+ +.+...|.+.---.|.+|+.- ..|.-.. . 
T Consensus 226 ~t~~~~~~~~v~~~a~~~a~~~~~~~~~~~~~~~~~l~~GD~~t~aGv~~v~~~tk~~~~~~~t~~~~~~~v~a~~~~~~ 305 (423) T protein:vir:10 226 GTLTVKTQPTVTYNAVKDSYQFTVTLTGATASVTGFLKAGDQVKFTNTYWLQQQTKQALYNGATPISFTATVTADANSDS 305 (423) T ss_pred cceeeeecceeccccccccceeeeeeeeccccccCceeecceEEecceeeecccccccccccccCcceEEEEEeeeeecc Confidence 1111111 1111111100100 0011 112222333333334444322 1221100 0 Q ss_pred c-----CCCCC-------------hHHhcCCcCceeeec-ccccceEEEEEecccccccCCCCCCCCC Q lcl|NC_015254. 298 A-----GHSPT-------------NTEIEKGNNWKAVYE-SKNIRIVAFVHKNGVPGKKKETAPEGIK 346 (346) Q Consensus 298 ~-----~~sPt-------------~a~L~~~~NW~~v~~-~K~i~iv~~~~k~~~~~~~~~~~~~~~~ 346 (346) + ..+|. .+..++++.|+.+.. .-..+-=.+=||.+......|-++.|.. T Consensus 306 ~g~~tv~i~p~~i~~~~~~~~~~v~a~~a~~~~vT~~~~a~~t~~~nl~~~~~a~~l~~~pl~~~~~~ 373 (423) T protein:vir:10 306 GGDVTVTLSGVPIYDTTNPQYNSVSRQVEAGDAVSVVGTASQTMKPNLFYNKFFCGLGSIPLPKLHSI 373 (423) T ss_pred CCceeeeccCccccccCCcccccccccccCCceeeccccccCCeeEEEEecCcceEEEEEcccCCCcc Confidence 0 01121 134455555554321 0001111122333333333333333332 No 129 >protein:vir:81227 Length: 413 # NCBI annotation: gp6, major capsid protein # Family: family:all:585 # MgeID: mge:1893 # MgeName: BFK20 # Cross-refs: genbank:acc:YP_001456736;genbank:gi:157168379;hssp:P49861;interpro:IPR006444;uniprot:Q9MBJ9;genbank:GeneID:5580350 Probab=98.59 E-value=2.9e-08 Score=61.88 Aligned_cols=277 Identities=11% Similarity=0.039 Sum_probs=135.4 Q ss_pred Ccc-ceecceeee----cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCC Q lcl|NC_015254. 1 MIK-KLRMNLQKF----AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLT 75 (346) Q Consensus 1 ~~~-~~~~~~q~~----~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~ 75 (346) +.. .....+..+ .+..++.-..-.+|+.+.+-+.+...+.+.+.+- ......+|..+.+|...... 
T Consensus 101 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~vp~~~~~~ii~~~~~~~~l~~~---------~~~~~~~~~~~~~~~~~~~~ 171 (413) T protein:vir:81 101 SVGEYVAPRVKAASDPASTATLTDEFQGGYGTTWNRNIIYRRREKLVVADL---------MDNLTMTNTTIKYLMEKANR 171 (413) T ss_pred hhhhhhhhHHHhhhhhhhhcccccccccccchhhHHHHHHHHhhhhhHHhh---------cceeeccCCceeEEEecccc Confidence 000 000000000 0222333444556777777777766655554321 11112245667777665432 Q ss_pred ---CcccccCCCccccchhhcccce-eEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH----- Q lcl|NC_015254. 76 ---GEDEILDDGEGALTPGNISAAK-DIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS----- 146 (346) Q Consensus 76 ---g~ae~~~dg~~~it~~~lt~~~-~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~----- 146 (346) +.+..+.||. .++-..+.... .....+..+.-..+++....-+ ......+.+.+++.+.+..++.+|.- T Consensus 172 ~~~~~a~~v~Eg~-~~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds-~~l~~~i~~~la~~~~~~~d~~~l~G~G~~~ 249 (413) T protein:vir:81 172 VVEGGFKTVAEGG-KKPYMRFADFDIVTESLSKIAGLTKITDEMIEDY-DFLVSYINARLLEELAIEEERQLLLGDGTGN 249 (413) T ss_pred ccccccceecCcc-cccccCcccceeeEeeeeeEEEeehhhHHHHHHH-HHHHHHHHHHHHHHHHHHHHHHHhccCCCCC Confidence 2344566764 44433443333 3333444555566777644333 24556688999999999999876642 Q ss_pred -HHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCcc-ccCceEEEEchHHHHHHHhhhhhh--hccc-----c Q lcl|NC_015254. 147 -LNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDA-EGKLTGIAMHSQTEMNLRKQGLIE--FMLD-----S 217 (346) Q Consensus 147 -L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~-~~~~~~ivmhS~~~~~L~~~~li~--~~~~-----s 217 (346) ++|++....... + +.......++.+.++...+-.. ...-.+|+||+.++..|++..--+ ++.. . 
T Consensus 250 ~~~Gi~~~~~~~~----~---~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~l~~~~~~~~ 322 (413) T protein:vir:81 250 NLTGLLKRDGIQT----L---AVSNKDELADSIYKAMTNISLATPFQADALVINPLDYQELRLAKDANGQYYGGGVFQGQ 322 (413) T ss_pred ccccccccccccc----c---cccccchhHHHHHHHHHHhhhhccCCCcEEEEcHHHHHHHHHhhccCCceecccccccc Confidence 122222111111 0 0111112356666776654322 223346999999999988653111 1111 0 Q ss_pred --cC--ceeeEEeceEEEEeCCCccCCCceEEEEEcC--CeeEEeecCCccceeeeecC----CcceeEEEEeeEEeeee Q lcl|NC_015254. 218 --DN--KKFPTYMGKRVIVDDGLPAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLLHP 287 (346) Q Consensus 218 --~~--~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~~~ 287 (346) .+ ..-++++|+||++++.||.+. ++|+. .++.+.. +..+.++.++.. ..+...+....+|.+.| T Consensus 323 ~~~~~~~~~~~l~G~pv~~s~~~~~~~-----~~~gd~~~~~~~~~-~~~~~v~~~~~~~~~~~~~~~~~r~~~r~d~~~ 396 (413) T protein:vir:81 323 YGSGGIMLDPAPWGLRTVQSQVVPVGK-----PVVGAFRSAASVLR-KGGVRIDSTNTNVDDFENNLITVRAEERVGLMV 396 (413) T ss_pred ccccccccCceecceeeEEcCCCCccc-----EEEEecccEEEEEE-ecceEEEEeccccchhhcCcEEEEEEEeeccEE Confidence 01 123579999999999999653 34432 2232222 223446665543 34555555555555444 Q ss_pred ---eeeeeccccccCCCC Q lcl|NC_015254. 288 ---RGIAWQEKSVAGHSP 302 (346) Q Consensus 288 ---~G~s~~~~~~~~~sP 302 (346) ..|..-. -.+..+| T Consensus 397 ~~~~a~~~l~-~~~~~~p 413 (413) T protein:vir:81 397 TFPEAIVQLD-VAEVVTP 413 (413) T ss_pred ecccceEEEE-ecCCCCC Confidence 3333211 1122456 No 130 >protein:vir:1268 Length: 397 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:329 # MgeName: phi-105 # Cross-refs: genbank:acc:NP_690760;genbank:gi:22855000;genbank:GeneID:955203 Probab=98.59 E-value=3.2e-08 Score=61.64 Aligned_cols=271 Identities=12% Similarity=0.031 Sum_probs=138.3 Q ss_pred Cccce-----ec-ceeeecC--CceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccc Q lcl|NC_015254. 
1 MIKKL-----RM-NLQKFAA--GKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQ 72 (346) Q Consensus 1 ~~~~~-----~~-~~q~~~a--~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~ 72 (346) +++.+ .+ +.-.+.+ ..++.-.-.++|+.+.+.+.+...+.+.+.+-.=..+ ...+...+.+|... T Consensus 103 ~~~~~~~~~~~~~~~~~~~a~~~~~~~~gg~lvP~~~~~~ii~~~~~~~~l~~~~~~~~-------~~~~~~~~~~~~~~ 175 (397) T protein:vir:12 103 RGKRLTDEERDLLDSPEFRAMSGINDEDGGILIPEDIGRQIHEFKRQFEPLEQYVTVEP-------VTTRSGTRLLEKNA 175 (397) T ss_pred hccCCcHHHHHHHhhhhhhhccccccccCcccCchhHHHHHHHhhhhhhhHHhhcceee-------ccCCceeEEEEEec Confidence 11110 00 0001111 1222233467899998888777666655533110000 11112244455444 Q ss_pred cCCCcccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015254. 73 DLTGEDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGIT 151 (346) Q Consensus 73 ~l~g~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~ 151 (346) .. +.+.-+.|+. .++. ...+-.+-....++.+....++++...-+.-|....+.++++..+.+..+..++.... T Consensus 176 ~~-~~a~~v~Eg~-~~~~~~~~~~~~v~~~~~k~~~~~~is~e~l~ds~~~l~~~i~~~l~~~~~~~~d~~il~G~g--- 250 (397) T protein:vir:12 176 DM-VPFSPVEELG-NLPEIDQPRFTKVSYSIIDYGGIMTLSNSMLNDSDQAIMTYVAKWFAKKSVVTRNNLILAAIA--- 250 (397) T ss_pred CC-cceeeecccc-cccccccccceeEEeeheeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHHhccc--- Confidence 43 3344566764 3332 3344444444455556566677766555555777889999999999999887664321 Q ss_pred hhhhhhhcceeeeccccccccccHHHHHHHHHH-hCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEEe Q lcl|NC_015254. 152 ASGALDSNKLDVSTETGDDSYFTGDTFLSATYK-LGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTYM 226 (346) Q Consensus 152 ~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~~ 226 (346) . .. ....++++.+.+++.. +-.....-.+|+||+.++..|++..-- .++. 
...++.-++++ T Consensus 251 ---~--~~---------~~g~~~~~~i~~~~~~~l~~~~~~~a~~~~n~~~~~~L~~lkd~~G~~l~~~~~~~g~~~~l~ 316 (397) T protein:vir:12 251 ---S--LK---------KVDIDGLDGIKKALNVTLDPMVAPGSIVLTNQDGYDWLDTLKDGTGRYLLQPDPTNPTKKLLD 316 (397) T ss_pred ---c--cc---------ccccccHHHHHHHHhhccchhhhCCCEEEEcHHHHHHHHHhhccCCceeecccccCCCCcccc Confidence 0 00 1123567888888763 433444557899999999999865211 1222 11234446899 Q ss_pred ceEEEEeCCC-ccCCCceEEEEEcC--CeeEEeecCCccceeeeecCC----cceeEEEEeeEEeeee---eeeeecccc Q lcl|NC_015254. 227 GKRVIVDDGL-PAKDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKL----KGNDILINRQHFLLHP---RGIAWQEKS 296 (346) Q Consensus 227 G~~VVvdD~~-p~~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~----~g~~~l~~r~~~~~~~---~G~s~~~~~ 296 (346) |+||++++.+ |.....-..++|+. .++.+.. +..+.++.++... .+...+....++...+ ..|....-+ T Consensus 317 G~pv~~~~~~~~~~~~~~~~~~~gd~~~~~~~~~-~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~t 395 (397) T protein:vir:12 317 GRPVVPFTNRVLKTQKGKAPLIIGNLKEAIVLFD-REQQSIASTDTGAGAFETNSTKVRGIEREDVRKWDEDAVVFGQIT 395 (397) T ss_pred ceeeEEecccccccCCCccEEEEEehhceEEEEe-ecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEEe Confidence 9999887654 43322223455653 2333332 2234566555432 3444455544444333 233332111 Q ss_pred ccCC Q lcl|NC_015254. 297 VAGH 300 (346) Q Consensus 297 ~~~~ 300 (346) -+ T Consensus 396 --~~ 397 (397) T protein:vir:12 396 --VE 397 (397) T ss_pred --eC Confidence 11 No 131 >protein:vir:93616 Length: 645 # NCBI annotation: putative major head protein/prohead protease # Family: family:all:21 # MgeID: mge:157 # MgeName: phi 4795 # Cross-refs: genbank:acc:YP_001449293;genbank:gi:157166041;goa:Q6H9U8;interpro:IPR006433;uniprot:Q6H9U8;genbank:GeneID:5580438 Probab=98.59 E-value=1.9e-08 Score=62.97 Aligned_cols=287 Identities=11% Similarity=-0.003 Sum_probs=136.1 Q ss_pred Cccceecceeeec---CCceee---eeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccC Q lcl|NC_015254. 
1 MIKKLRMNLQKFA---AGKNTR---IADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDL 74 (346) Q Consensus 1 ~~~~~~~~~q~~~---a~~~T~---l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l 74 (346) .+...+.+.+..+ ++.+|. -..+++|+.+..-+.+.+.+.+-+.+.+.... -.+...++ .+++|.... T Consensus 320 ~~~~~~~~~~~~~a~~~~~~~~~~~~Gg~~vp~~~~~~ii~~l~~~svv~~l~~~~~----~~~~~~~~-~~~ip~~t~- 393 (645) T protein:vir:93 320 YPDDSRLHHVLKSAVGAGTTTDPQWAGSLSEYQEYAQDFIDYLRPQTIIGRFGQGGI----PALRQVPF-NIRVHAQVS- 393 (645) T ss_pred cccchhhhhhhhhhhhccccccccccCCccCchhhHHHHHHhhhhhhhHHhhccccc----cccccccC-ceeeeeeec- Confidence 1111122221111 222111 14678999888777776666665554432210 01111122 456776543 Q ss_pred CCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHH-hhhhh Q lcl|NC_015254. 75 TGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLN-GITAS 153 (346) Q Consensus 75 ~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~-G~~~~ 153 (346) ++.+.-+.|+. .++..+.+-++-....++.+--..++++--.-+.-|..+.+.++++..+.+..++.+|.--. |..+. T Consensus 394 ~~~a~wv~Eg~-~~~~s~~~f~~v~l~~~kla~~~~iS~ell~ds~~~~~~~i~~~l~~aia~~~d~a~l~g~g~~~~~~ 472 (645) T protein:vir:93 394 GGAAGWVGEGK-TKPLTKFDFESITFSHAKVSAIAVLTEELIRFSSPAADALVRNALAEAVVARLDTDFVDPKKAAVADV 472 (645) T ss_pred CcceEEeccCc-cccccccceeEEEEeeEEEEEeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHhhcCCCcccCCc Confidence 23444466764 56666655444444444444444555544333444566679999999999999987764221 11111 Q ss_pred hhhhhcceeeeccccccccccHHHHHHHHHHhCcccc--CceEEEEchHHHHHHHhhhhhh--hcccccCceeeEEeceE Q lcl|NC_015254. 154 GALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEG--KLTGIAMHSQTEMNLRKQGLIE--FMLDSDNKKFPTYMGKR 229 (346) Q Consensus 154 ~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~~~~~L~~~~li~--~~~~s~~~~i~~~~G~~ 229 (346) .. .+........ .........+..+...+-+... 
.-.+|+|||.++..|++..--+ ++...-...=++++|+| T Consensus 473 ~p-~gi~~~~~~~--~~~~~~~~d~~~~~~~~~~a~~~~~~a~~vmn~~~~~~L~~lkd~~G~~~~~~~~~~~~tL~G~P 549 (645) T protein:vir:93 473 SP-ASITHDVKGT--ASSGNPDADAEAAFGQFVAANLQPTGAVWLMSSTNALALSMRKNALGQKEYPDMTLLGGSFQGLP 549 (645) T ss_pred cc-cceecccccc--ccccchHHHHHHHHHHHHhcCCCccccEEEEcHHHHHHHHhccccCCceeecCCCCCCceeecee Confidence 11 1111111111 1111233456666666544332 3358999999999998763211 11111111125899999 Q ss_pred EEEeCCCccCC--CceEEEEE-cCCeeEEeecCCccceeeeecCCc----------------ceeEEEEeeEEe------ Q lcl|NC_015254. 230 VIVDDGLPAKD--GVYTSYIF-GEGAFGLGNGEAPVPTETDREKLK----------------GNDILINRQHFL------ 284 (346) Q Consensus 230 VVvdD~~p~~~--g~ytt~l~-~~GAi~~~~~~~~~~vE~dRd~~~----------------g~~~l~~r~~~~------ 284 (346) |++++.||..- |....+++ -.|.+.+.... ...++..-.+.. ....+-...++. T Consensus 550 V~~s~~vp~~~~~gd~s~~~ig~~~~v~i~~s~-~a~~~~~~~~~~~~~~~~~~~~v~lf~~d~vaira~~r~d~~~~~p 628 (645) T protein:vir:93 550 VIVSQYVGDQLVLVNAPDIYLADDGGVAVDMSR-EASLEMQSEPTGDSTTPSPVELVSMFQTGSVAIRAERWINWRRRRT 628 (645) T ss_pred eEEeccCCcceeEeccccEEEEEecceEEEeec-ceeEEEeecccccccccccccchhHhhcCceEEEEEEEEcceeeCc Confidence 99999998631 11111222 22344443322 222333221110 111122222222 Q ss_pred ---eeeeeeeeccccccC Q lcl|NC_015254. 285 ---LHPRGIAWQEKSVAG 299 (346) Q Consensus 285 ---~~~~G~s~~~~~~~~ 299 (346) +.+-|..| .++.+| T Consensus 629 ~a~~~lt~~~~-g~~~~~ 645 (645) T protein:vir:93 629 AAVAVITGVNY-GSASGG 645 (645) T ss_pred cceEEEecccC-CcccCC Confidence 34458888 555555 No 132 >protein:vir:1084 Length: 437 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:21 # MgeName: bIL309 # Cross-refs: genbank:acc:NP_076738;genbank:gi:13095848;genbank:GeneID:920418 Probab=98.59 E-value=1.3e-08 Score=63.81 Aligned_cols=277 Identities=13% Similarity=0.018 Sum_probs=123.8 Q ss_pred Cccce-ec---ceeee--------cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEe Q lcl|NC_015254. 
1 MIKKL-RM---NLQKF--------AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINM 68 (346) Q Consensus 1 ~~~~~-~~---~~q~~--------~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~ 68 (346) +..+. .. .++.+ .+..++.-.-..+|+.+...+... .+...+.+ . ......+...+++ T Consensus 132 ~~~~~~~~~~~~~~~~~~~~e~~~~~~~~~~~~g~lvp~~~~~~i~~~-~~~~~l~~------~---~~~~~~~~~~~~~ 201 (437) T protein:vir:10 132 VGGEIADKKVTAFADYLKTGEVRDVTGIALKDGKVIIPETILTPEKEV-HQFPRLGS------L---VRTESVTTTTGKL 201 (437) T ss_pred HHHHHHHhhhhhhHHHHHhhhhhhhhhcccccccccchHHHHHHHHHh-hhhhhhhh------c---ceeEeeccCceee Confidence 00000 00 00000 011112222245777766554432 12222211 1 0111223445778 Q ss_pred cccccCCCcccccCCCccccc-hhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 69 PFWQDLTGEDEILDDGEGALT-PGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL 147 (346) Q Consensus 69 P~~~~l~g~ae~~~dg~~~it-~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L 147 (346) |.+....+....+.++. .++ ....+-++-.-..++.+.-..+++....-+.-|..+.+.+.++..+.+..+..+|..+ T Consensus 202 ~~~~~~~~~~~~~~e~~-~~~e~~~~~~~~v~~~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~~~~~~~~~~i~~g~ 280 (437) T protein:vir:10 202 PIFNNSTDLLTAHTEYG-QTTKNATPVITPILWDLKTYTGGYVFSQELISDSSYDWQAELQSRLIELRDNTDDSLIITAL 280 (437) T ss_pred EEeeccccccccccccc-cccccccccceeeeeehhheeeehhhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhhh Confidence 88776555555556653 332 2333333333334445544566665444444467778999999999999888766543 Q ss_pred HhhhhhhhhhhcceeeeccccccccccHHHHHHHHHH-hCccccCceEEEEchHHHHHHHhhhhh--hhccc--ccCcee Q lcl|NC_015254. 148 NGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYK-LGDAEGKLTGIAMHSQTEMNLRKQGLI--EFMLD--SDNKKF 222 (346) Q Consensus 148 ~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~~--s~~~~i 222 (346) .. . ..+.....+.+.|.+++.. +-.....-.+|+||+.++..|++..-- .++.. 
..++.- T Consensus 281 g~----~-----------~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~l~~lkd~~g~~~~~~~~~~~~~ 345 (437) T protein:vir:10 281 TD----G-----------IKKTTSTYLLGDLKKVLNVTLKPQDSAAASIVMSQSAYNLFDMATDAMGRPLLQPNVTAATG 345 (437) T ss_pred cc----c-----------ccccccccchhhHHHHHHhhhhhhhhcCCEEEEcHHHHHHHHHhhccCCCeeeccCccCCCC Confidence 21 0 0111122345666666553 323333456899999999999875311 22321 123445 Q ss_pred eEEeceEEEEeCCC--cc-CCCceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEE---eeeeeeeeec- Q lcl|NC_015254. 223 PTYMGKRVIVDDGL--PA-KDGVYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHF---LLHPRGIAWQ- 293 (346) Q Consensus 223 ~~~~G~~VVvdD~~--p~-~~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~---~~~~~G~s~~- 293 (346) ++++|+||++++.+ |. +.|.+ +++||. -++.+.. +..+.++...+-....+.+..-.+| ++||..|... T Consensus 346 ~~l~G~pv~~~~~~~~~~~~~~~~-~~~~gd~~~~~~~~~-r~~~~~~~~~~~~~~~~~~~~~~r~d~~~~~~~a~~~l~ 423 (437) T protein:vir:10 346 YTLLGKTVVIVDDKLFPSASAGDV-NIVVAPLKKAVINFK-LTEITGQFQDTYDIWYKQLGIFLRQNVVQASKDLIVNLT 423 (437) T ss_pred cccccceeEEecccccCCcCCCce-EEEEeeccccEEEEe-eeceEEEEecccccccceeeEEEEEccEEecccceEEEE Confidence 68999999998775 43 23333 334442 2333322 2233444332211111111111111 2333333321 Q ss_pred --cccccCCCCChH Q lcl|NC_015254. 294 --EKSVAGHSPTNT 305 (346) Q Consensus 294 --~~~~~~~sPt~a 305 (346) -.+++...|+-+ T Consensus 424 ~~~~~~~~~~~~~~ 437 (437) T protein:vir:10 424 GKLKAVTVVQSTAV 437 (437) T ss_pred eeccccccCCCCCC Confidence 111111122222 No 133 >protein:vir:4197 Length: 314 # NCBI annotation: putative structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:88 # MgeName: psiM100 # Cross-refs: genbank:acc:NP_071822;genbank:gi:11863105;genbank:GeneID:1257607 Probab=98.54 E-value=8.1e-08 Score=59.47 Aligned_cols=279 Identities=12% Similarity=0.028 Sum_probs=149.4 Q ss_pred CccceecceeeecCCceeee-eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCccc Q lcl|NC_015254. 
1 MIKKLRMNLQKFAAGKNTRI-ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDE 79 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l-~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae 79 (346) |||-+...=. + +++.+ .=..+||.+..++ +.+.+.+.|.+ .+.+.. ..+....++|.++. +.... T Consensus 4 ~~~~~~~~k~-i---t~~d~~gG~L~P~~~~~~i-~~l~e~s~i~~------~a~vi~--t~~s~~~~i~~i~~-g~~~~ 69 (314) T protein:vir:41 4 LNKPFQITPK-I---DVPDLGKGILAVQRFGEFV-REVRENSAIIK------DARVLN--ALKSYEVDISRISL-GVELE 69 (314) T ss_pred hhhHHHhhcc-c---ccccCCCceeChHHHHHHH-HHHHhccchhh------heeeec--ccCccceeeccccc-Ccccc Confidence 5554321100 0 01111 1236899987655 55555554533 222111 11244567777652 11111 Q ss_pred c-cCC-C-ccccchhhcccceeEEEEEeecCcceechHHHhhh--cchHHHHHHHHHHHHHHHHHHHHHHHH-------- Q lcl|NC_015254. 80 I-LDD-G-EGALTPGNISAAKDIARLHMRGKAWRTNDLAKALS--GDDPMRAIGDLVVEYWNRRRQAVLIAS-------- 146 (346) Q Consensus 80 ~-~~d-g-~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~--g~dp~~~i~~q~a~~~~~~~~~~lla~-------- 146 (346) . ..+ + .+..+..+.+-++..-..++..--|.+++..-.-+ +.|.-+.+.+++|+.+.++.+..++.- T Consensus 70 ~~~~~~~~~~~~~~~~~tf~~~~l~~~kl~~~v~is~e~L~D~a~~~~le~~i~~~~Ae~~g~~~~~~~~nGdg~~~s~~ 149 (314) T protein:vir:41 70 PGRNTSGTKVAPTADEVTVSTNTLEMKELVTKVVLEDEALEDNIEQSAFEQTITSLLASGVTYDLECFFLHADSSLTTGR 149 (314) T ss_pred cccccccCCccCCcccccccceeeeeEEEEEeecccHHHHHhhhchhhHHHHHHHHHHHHHHHHHHHHhhccccCCcCcc Confidence 1 111 1 11223233333333333344444577777665533 458889999999999999998876541 Q ss_pred -----HHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCcccc---CceEEEEchHHHHHHHhhh--hhhhc-- Q lcl|NC_015254. 147 -----LNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEG---KLTGIAMHSQTEMNLRKQG--LIEFM-- 214 (346) Q Consensus 147 -----L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~---~~~~ivmhS~~~~~L~~~~--li~~~-- 214 (346) -+|++.. 
+..+ ++..++.+..++.+.|.++...+-++.- .-.+|+||+.++..+++.- .-..+ T Consensus 150 ~~~~~p~G~l~~--a~~~---~~~~~~~~~~~~~~~~~~l~~sl~~~yr~~~~~~~~~m~~~t~~~~r~~l~~~~~~l~~ 224 (314) T protein:vir:41 150 ELYRINDGWMKL--AGNQ---YTDAEPEDENWPLNLFDGMMDELDTRYLQLKPRMKFYVSNEIYNGYRKQLLVRETGLGD 224 (314) T ss_pred cchhcchhhhhh--cccc---eeecCccccccHHHHHHHHHHhcCchhhcCCCceEEEecHHHHHHHHHHHhccCCcccc Confidence 1222211 1111 2222334455778889999999988552 3568999999998887642 11111 Q ss_pred ccccCceeeEEeceEEEEeCCCccCCCceEEEEEcCCee-EEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeee- Q lcl|NC_015254. 215 LDSDNKKFPTYMGKRVIVDDGLPAKDGVYTSYIFGEGAF-GLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAW- 292 (346) Q Consensus 215 ~~s~~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~GAi-~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~- 292 (346) ..-.++.-.+++|+||+....||.-..--..++|+.-.- .+.. ...+.+|.+|++..++..++.+.++-++..-..+ T Consensus 225 ~~~~~~~~~~l~G~PV~~~~~~~~~~~~~~~i~fgd~~nlv~~~-~~~ir~~~~~~a~~~~~~~~~~~r~d~~~~~~~aa 303 (314) T protein:vir:41 225 SALIGATGLQYDGIPIQYVPALDALGDDKARALLTVPTNLVYGF-WRNIRIEPKRDAAMRRTEYIASLRADCNYEDENAA 303 (314) T ss_pred hhhhCCCCceecceeeEecccccccCCCCceEEEechhheEEEe-eceeEEeecccCcCCeEEEEEEEEeceEEEEcCcE Confidence 111234556799999999988875322223455554432 2332 3456788889888888888887777666542222 Q ss_pred ----ccccccC Q lcl|NC_015254. 293 ----QEKSVAG 299 (346) Q Consensus 293 ----~~~~~~~ 299 (346) .+++.+| T Consensus 304 ~~~~~~~~~~~ 314 (314) T protein:vir:41 304 VAAVIDMSSGG 314 (314) T ss_pred EEEEeeccCCC Confidence 1222233 No 134 >protein:vir:105522 Length: 423 # NCBI annotation: phage major head protein # Family: family:all:1412 # MgeID: mge:1463 # MgeName: phiSG1 # Cross-refs: genbank:acc:YP_516191;genbank:gi:89885994;genbank:GeneID:3964382 Probab=98.54 E-value=6.7e-08 Score=59.90 Aligned_cols=299 Identities=12% Similarity=0.040 Sum_probs=146.3 Q ss_pred eee-c--cchHHHHHHHhhHhHHHHhHhhccccccchhHHHHh-hCCCcEEEecccccCCCcccccCCC-ccccchhhcc Q lcl|NC_015254. 
20 IAD-V--IVPEVFNKYVTERTAESSALLQSGIISNDKDLDELA-KSGGNMINMPFWQDLTGEDEILDDG-EGALTPGNIS 94 (346) Q Consensus 20 l~d-~--i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~-~~~G~ti~~P~~~~l~g~ae~~~dg-~~~it~~~lt 94 (346) ++| + ++||+|++...+.+.+.+-|.+ .+-++ ...+.. +.-||||++|.=... ...+.... .+..+++.+. T Consensus 1 MANsl~~l~p~iia~~al~~l~~~lV~~~--lV~r~-y~~ef~~ak~GDTV~I~~P~~~--~~~d~~~~~~t~~~~~~l~ 75 (423) T protein:vir:10 1 MANNLDANVSQIVLKKFLPGFMSDLVLCK--TVDRQ-LLAGEINSSTGDSVSFKRPHQF--KSERTMDGDITGKSKNSLI 75 (423) T ss_pred CccccccccHHHHHHHHHHHHHhhcccch--hhccC-CCccccccccCCEEEEeeCCce--eeecccCcccCcccccccc Confidence 433 3 5899999988888877776633 33333 222222 335999999876644 22221111 1122345565 Q ss_pred cceeEEE-EEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccccccc Q lcl|NC_015254. 95 AAKDIAR-LHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYF 173 (346) Q Consensus 95 ~~~~~a~-~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~ 173 (346) ..+-..+ -+....++.++|....+...|. +++.++....++++++.+|...+... +.+. .+..+.. .- T Consensus 76 e~~v~l~id~~k~~a~~v~d~E~~l~i~~~-~~~l~~A~~aLA~~vd~~ia~~~~~~-~~~~--------vgt~~t~-~~ 144 (423) T protein:vir:10 76 SAKATGEVGNYITVAVEYRQIEEALKLNQL-DQILVPINERMVTDLETELALFMMKH-GALS--------LGSPNTP-IK 144 (423) T ss_pred cceEEEEecceeeeeeeeChHHHhcChhHH-HHHHHHHHHHHHHHHHHHHHHHhhhc-cccc--------ccccccc-cc Confidence 4443333 3345557899998877666776 67777778889999999886544221 1111 1111111 11 Q ss_pred cHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhh-hhhhhcccc-----cCcee-eEEeceEEEEeCCCcc-CCCce Q lcl|NC_015254. 174 TGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQ-GLIEFMLDS-----DNKKF-PTYMGKRVIVDDGLPA-KDGVY 243 (346) Q Consensus 174 ~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~-~li~~~~~s-----~~~~i-~~~~G~~VVvdD~~p~-~~g~y 243 (346) .++.+.+|..+|.+.. ..-+.+||.|..+..|.+. .....-... ..+.| +.+.|.+|+.|+.+|. 
+.|.+ T Consensus 145 a~~~~a~a~~~L~~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~~~~~~~~alr~~~i~G~~~GFdi~~Sn~vp~~T~g~~ 224 (423) T protein:vir:10 145 KWSDVAQTASFLKDLGINSGENYAVMDPWAAQRLADAQSGLHVSEQLVRTAWENAQISGNFGGIRALMSNGLASRTQGAF 224 (423) T ss_pred cHHHHHHHHHHHhhccCCcCCCEEEeCHHHHHHHhhhhhhhccccccchHHHHhcccceeecceEEEEecCCcccccccc Confidence 3688999999997754 2347889999999999753 322221111 12445 8999999999999993 44543 Q ss_pred EEEEEcCCeeEEeecCCccceeeee-----------cCCcceeEEEEeeEEeeeee-e-----------eeeccc--c-- Q lcl|NC_015254. 244 TSYIFGEGAFGLGNGEAPVPTETDR-----------EKLKGNDILINRQHFLLHPR-G-----------IAWQEK--S-- 296 (346) Q Consensus 244 tt~l~~~GAi~~~~~~~~~~vE~dR-----------d~~~g~~~l~~r~~~~~~~~-G-----------~s~~~~--~-- 296 (346) .-.....|+..+. +.+....+..+ .-+..-|++.---.|.+|+. + .+|.-. . T Consensus 225 ~ga~~~~~~~~vt-~a~~~~~~~~~~~~~~~T~s~~g~l~~GD~~t~aGv~~v~~~tk~~l~~~~~~~~~~~~V~~~~~~ 303 (423) T protein:vir:10 225 GGKLTVKGTPEVN-YDSVKDSYAFTATLTGATASKKGFLKVGDQLQFDDTHWLNQQSKQTLYNGASALSFTATVMEDANA 303 (423) T ss_pred cceeeeeeeeEEE-ecccccccccccceeeccceeceeEEecceEeecceeeecccccceeecccCCcceEEEEEecccc Confidence 2222222322221 11000000000 00111233333333333332 1 112110 0 Q ss_pred -cc-----CCCCCh-------------HHhcCCcC------------ceeeecccccceEEEEEecccccccCCCCCCCC Q lcl|NC_015254. 297 -VA-----GHSPTN-------------TEIEKGNN------------WKAVYESKNIRIVAFVHKNGVPGKKKETAPEGI 345 (346) Q Consensus 297 -~~-----~~sPt~-------------a~L~~~~N------------W~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~~ 345 (346) .+ ..+|.. +.+++++. =+++|.+..++++ ..|-++.|. T Consensus 304 ~a~~~~tv~i~p~~~~~~~~~~~~~V~a~~a~~~~vT~~~~~~~t~~~nl~~~~~a~~l~-----------~~pl~~~~~ 372 (423) T protein:vir:10 304 HSSGDVTVKISGVPIFDAGYPQYNAVDRLLAEGDTVSVIGTSKQAMKPNLFYNKLFCGLG-----------TIPLPKLHS 372 (423) T ss_pred cccCceEEEeccccccccCcccccceeccccCCceeEEeeccCCceeEEEEecCcceEEE-----------EEcccCCCc Confidence 00 011211 11111111 1123444444433 223333333 Q ss_pred C Q lcl|NC_015254. 346 K 346 (346) Q Consensus 346 ~ 346 (346) . 
T Consensus 373 ~ 373 (423) T protein:vir:10 373 I 373 (423) T ss_pred c Confidence 2 No 135 >protein:vir:93881 Length: 387 # NCBI annotation: ORF011 # Family: family:all:658 # MgeID: mge:1485 # MgeName: 3A # Cross-refs: genbank:acc:YP_239938;genbank:gi:66395599;genbank:GeneID:5130947 Probab=98.51 E-value=1.6e-08 Score=63.34 Aligned_cols=274 Identities=14% Similarity=0.088 Sum_probs=133.3 Q ss_pred Cccceecceeeec-CCc--eeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCc Q lcl|NC_015254. 1 MIKKLRMNLQKFA-AGK--NTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGE 77 (346) Q Consensus 1 ~~~~~~~~~q~~~-a~~--~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ 77 (346) ...+..++.+... +-. ++.-.-.++|+-|..-+.+...+.+.+.+-.-+ ...++ .++|....-.+. T Consensus 103 ~~~~~~~~~~~~~~al~~~t~s~gG~~IP~~~~~~Ii~~~~~~~~l~~~~~v---------~~~~~--~~~p~~~~~~~~ 171 (387) T protein:vir:93 103 EFEKPSMEAQRLLHALPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREKARL---------TNIKG--LEIPRVSYTLDD 171 (387) T ss_pred hhhhhhhhhHHHHHhhccCcCCCCceeechhHHHHHHHHHHhhchhhhheee---------eecCC--ceEEEEeecCCc Confidence 0000011111111 100 111124578888777676666555544331111 11223 235554322234 Q ss_pred ccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh Q lcl|NC_015254. 78 DEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD 157 (346) Q Consensus 78 ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~ 157 (346) +.-+.|+. 
..+..+.+-++-.-..++.+.-..++++...-+..|..+.+.+++++.+.+.....++....|.-...+.- T Consensus 172 a~~v~E~~-~~~~~~~~f~~v~~~~~k~~~~~~iS~ell~Ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~g~p~g~l 250 (387) T protein:vir:93 172 DDFITDVE-TAKELKLKGDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPKSGLDHMSF 250 (387) T ss_pred cccccCcc-cccccccccceeeeeheeeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCccccceee Confidence 44556664 44444444333333333333335566554443455777889999999998877666654332211000000 Q ss_pred hcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC----ceeeEEeceEEEEe Q lcl|NC_015254. 158 SNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN----KKFPTYMGKRVIVD 233 (346) Q Consensus 158 ~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~----~~i~~~~G~~VVvd 233 (346) .+ ..+. .......++.|.++...+......-..|+||+.++..|++. +.+.++ +.-.+++|+||+++ T Consensus 251 ~~-~~~~---~v~~~~~~d~i~~~~~~l~~~~~~~a~~~mn~~t~~~~~~~-----~~d~~~~~~~~~~~~llG~PV~~~ 321 (387) T protein:vir:93 251 YN-GSVK---EVEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIISV-----LSNGTTNFFDTPAEKVFGKPVVFT 321 (387) T ss_pred ec-cccc---cccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHH-----HhcCCCcccccCCccccccceEEe Confidence 00 0111 11122357889999888877776777899999998776543 111111 11247899999999 Q ss_pred CCCccCCCceEEEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeee---eeeec-cccccCCCCC Q lcl|NC_015254. 234 DGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPR---GIAWQ-EKSVAGHSPT 303 (346) Q Consensus 234 D~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~---G~s~~-~~~~~~~sPt 303 (346) |.++. .+||.=.-.|.. .....++.+++...|...+..+.+|-+.+. 
-|..- -++.++..|+ T Consensus 322 ~~~~~-------~~~GDf~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~r~d~~v~~~eA~~~l~~k~~~~~~~~ 387 (387) T protein:vir:93 322 DAAVK-------PIVGDFNYFGIN-YDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGSLPS 387 (387) T ss_pred cCCCc-------eeeeehhhhhee-hhhheeeecccccCCceeEEEEeeeCceeechhheEEEEeecCCCCCCC Confidence 98753 222211111111 122345555666666666666666555543 22221 1223344455 No 136 >protein:vir:3525 Length: 423 # NCBI annotation: major head protein # Family: family:all:1412 # MgeID: mge:72 # MgeName: APSE-1 # Cross-refs: genbank:acc:NP_050985;genbank:gi:9633571;genbank:GeneID:1262318 Probab=98.46 E-value=2.9e-07 Score=56.40 Aligned_cols=314 Identities=15% Similarity=0.032 Sum_probs=148.9 Q ss_pred cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHh-hCCCcEEEecccccCCCcccccCCC-ccccch Q lcl|NC_015254. 13 AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELA-KSGGNMINMPFWQDLTGEDEILDDG-EGALTP 90 (346) Q Consensus 13 ~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~-~~~G~ti~~P~~~~l~g~ae~~~dg-~~~it~ 90 (346) =|| +-+. ++||+++.-..+.+.+.+-|.+ .+-++-+ .+.. +.-||+|+||.=... ...+..-+ .+.+++ T Consensus 1 MAN--~llT--~iP~iia~~al~~l~~~lV~~~--lV~r~y~-ge~~~a~~GDTV~I~~p~~~--~v~d~~~~~~~~~~~ 71 (423) T protein:vir:35 1 MAN--NLES--NISQIVLKKFLPGFMSDIVLCK--TVDRQLL-SGEINSNTGDSVSFKRPHQF--KSERTETGDITGKDK 71 (423) T ss_pred Ccc--chhh--hhHHHHHHHHHHHHHhhcccch--hcccCCC-cccccccCCCEEEEeeCCcc--eeecccCcCCCCccc Confidence 122 1111 3699998888887777766633 3333322 2222 345999999976654 22332111 245778 Q ss_pred hhcccceeEEE-EEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccc Q lcl|NC_015254. 91 GNISAAKDIAR-LHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGD 169 (346) Q Consensus 91 ~~lt~~~~~a~-~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~ 169 (346) +.++..+-..+ -+....++.++|+...+...|.. .+.++.+.+.+++++.+++..+..-... .. 
+..++ T Consensus 72 ~~~~e~~v~l~id~~k~~a~~v~d~e~~l~i~~~~-~~l~~a~~ala~~vd~~l~~~l~~~a~~-~v--------gt~~t 141 (423) T protein:vir:35 72 NGLFSAKATGKVGKYITVAVEWTQIEEALKLNQLD-QILSPIHERMVTDLETELAHFMMNNGAL-SL--------GSPNT 141 (423) T ss_pred cccccceeeEEeccceeccceeCHHHHHhhHHHHH-HHHHHHHHHHHHHHHHHHHHHHhhcccc-cc--------ccccC Confidence 88886654333 33455578999999887777774 5555667889999999998866421110 10 00111 Q ss_pred cccccHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhh-hhhcccc--c---Cce-eeEEeceEEEEeCCCcc-C Q lcl|NC_015254. 170 DSYFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGL-IEFMLDS--D---NKK-FPTYMGKRVIVDDGLPA-K 239 (346) Q Consensus 170 ~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~l-i~~~~~s--~---~~~-i~~~~G~~VVvdD~~p~-~ 239 (346) ..-.++.+.+|..+|.+.. ..-+.+|+.|..+..|++... +..-... + .+. ++.+.|..|+.|+.+|. + T Consensus 142 -~~~~~~~i~~a~~~Ld~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~~~~~~~~alr~g~i~G~i~GFdv~~Snnvp~~T 220 (423) T protein:vir:35 142 -AIKKWADVAQTASFIKDIGIKTGENYAIMDPWSAQRLADAQSGLHAADQLVRTAWENAQISGNFGGIRALMSNGLASRK 220 (423) T ss_pred -CcchHHHHHHHHHHHHHhcCCcCCCEEEeCHHHHHHHhccccceeccccchhHHHhhccceeeecceEEEEcCCCcccc Confidence 1124688999999997654 234789999999999986432 2111111 1 233 48999999999999995 5 Q ss_pred CCceEEEEEcCCeeEEeecCC---cccee-----eee--cCCcceeEEEEeeEEeeeeee------------eee--ccc Q lcl|NC_015254. 240 DGVYTSYIFGEGAFGLGNGEA---PVPTE-----TDR--EKLKGNDILINRQHFLLHPRG------------IAW--QEK 295 (346) Q Consensus 240 ~g~ytt~l~~~GAi~~~~~~~---~~~vE-----~dR--d~~~g~~~l~~r~~~~~~~~G------------~s~--~~~ 295 (346) .|.+.......++........ ...+. +-+ +.....|.+.-.-.+.+|+.- +.| +.. 
T Consensus 221 ~gt~~~~~~v~~a~~v~~~a~~~~~~~~~~~~~~~~~~~g~l~~GD~~t~aGv~~v~~~t~~~~~~~~t~~~~~~~V~~~ 300 (423) T protein:vir:35 221 QGDFDGAITVKTAPNVDYLSVKDSYQFTVALTGATPSKTGFLKAGDQLKFTSTHWLNQQSKQTLYNGSTAMSFTATVLEE 300 (423) T ss_pred ccccccceeeccccccccccccccccceeeeeeeeeccCCcEEecceEEeeeeeeccccccceeecccCCceeEEEEecc Confidence 555433222222211100000 00000 000 111122233222222222211 011 100 Q ss_pred cc---cC-----CCCChHHhcCCcCceee-ecccccceE-------------EEEEecccccccCCCCCCCCC Q lcl|NC_015254. 296 SV---AG-----HSPTNTEIEKGNNWKAV-YESKNIRIV-------------AFVHKNGVPGKKKETAPEGIK 346 (346) Q Consensus 296 ~~---~~-----~sPt~a~L~~~~NW~~v-~~~K~i~iv-------------~~~~k~~~~~~~~~~~~~~~~ 346 (346) .+ +| .+|..--.+......-| ..+++-.-+ .+=||.+......|-++.|.. T Consensus 301 ~~~~a~g~~~v~i~p~~~~~~~~~~~~~v~a~~a~~~~vt~~~~a~~~~~~nl~~~~~a~~l~~~~l~~~~~~ 373 (423) T protein:vir:35 301 TNSTASGDVTVKLSGVPIYDEKNSQYNAVDAKVKAGDAVSIIGTAKQQMKPNLFYNKFFCGLGTIPLPKLHSL 373 (423) T ss_pred ccccccCceeEEccccccccCCCcccccccccccCCceeeeeecCCCceeEEEeecCceeEEEEEccccCCcc Confidence 00 00 11211000111111111 111111100 122333323333333333332 No 137 >protein:vir:97031 Length: 402 # NCBI annotation: 31 # Family: family:all:2806 # MgeID: mge:1644 # MgeName: K1-5 # Cross-refs: genbank:acc:YP_654132;genbank:gi:108862016;genbank:GeneID:5075980 Probab=98.45 E-value=6e-08 Score=60.17 Aligned_cols=317 Identities=15% Similarity=0.107 Sum_probs=175.8 Q ss_pred CccceecceeeecCCceeee------eec-cchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRI------ADV-IVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l------~d~-i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) |-- +|..|+- ++. +-=|+|..-|...+.+.+.|.. .-...+ -.+|+++++|+-+. 
T Consensus 1 Ms~----------~n~~t~~~~~~s~~~~al~le~f~geV~taF~~~si~~~------~~~vrt--i~~GkS~qf~~iG~ 62 (402) T protein:vir:97 1 MST----------PNTLTNVAVSASGEVDSLLIEKFNGKVNEQYLKGENILS------YFDVQT--VTGTNTVSNKYLGE 62 (402) T ss_pred CCC----------cccccccccccccchhhhhhhhhhhhHHHHHHHHHhhcC------cceeee--ecccceEEEEEEee Confidence 221 2222221 222 2226777777777666666632 111111 23799999999876 Q ss_pred CCCcccccCCCccccchhhcccceeEEEEEe-ecCcceechHHHhhhcch-HHHHHHHHHHHHHHHHHHHHHHHHHHh-h Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARLHM-RGKAWRTNDLAKALSGDD-PMRAIGDLVVEYWNRRRQAVLIASLNG-I 150 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~d-p~~~i~~q~a~~~~~~~~~~lla~L~G-~ 150 (346) .. +....-|+ .+.++.+...+..-+|=. .--...+.|+......=| +-.+++++.+.++++.+|+.+|..++. . T Consensus 63 ~~--a~y~~~G~-~ldg~~~~~~k~~ItID~lL~a~~~V~diDeaq~~yD~vRse~s~e~G~ALA~~~Dq~ii~~i~~aa 139 (402) T protein:vir:97 63 TE--LQVLAPGQ-SPNATPTQADKNQLVIDTTVIARNTVAHIHDVQGDIDSLKPKLAMNQAKQLKRLEDQMAIQQMLLGG 139 (402) T ss_pred eE--Eeeecccc-ccCCCCcccccEEEEeCceeechhhhhhHHHHHhcccchhHHHHHHHHHHHHHHHHHHHHHHHHHhh Confidence 52 33333343 455556665544333322 112355888888777667 678999999999999999988876542 2 Q ss_pred hhhh-hhh------hcce-eeeccccccccccHHHHH----HHHHHhCccc--cCceEEEEchHHHHHHHhh-hhhh--h Q lcl|NC_015254. 151 TASG-ALD------SNKL-DVSTETGDDSYFTGDTFL----SATYKLGDAE--GKLTGIAMHSQTEMNLRKQ-GLIE--F 213 (346) Q Consensus 151 ~~~~-~~~------~~~~-dis~~~~~~~~~~~~~l~----~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~-~li~--~ 213 (346) ..+. ... .+.. .-...+.+.+..+...|. +|.+.|-+++ ..-.+++|.|..|..|++. 
+|++ | T Consensus 140 ~a~t~~~~~~~~~~~~g~s~~~~~t~~~a~~~~~~l~~ai~~a~~~LdEkdVP~~dRv~vv~P~~y~~Ll~~~rl~n~d~ 219 (402) T protein:vir:97 140 IANTKAERNKPRVKGHGFSINVNVTESEALANPQYVMAAVEYALEQQLEQEVDISDVAIMMPWKFFNALRDADRIVDKTY 219 (402) T ss_pred ccccccccccCcccccccccccccccchhhcCHHHHHHHHHHHHHHHHhcCCCccccEEEeChHHHHHHhhcccccchhh Confidence 2111 110 0000 011222333344555444 5666665432 2337999999999999976 4553 3 Q ss_pred ccccc----CceeeEEeceEEEEeCCCccCC-------------C----------ceEEEEEcCCeeEEeecCCccceee Q lcl|NC_015254. 214 MLDSD----NKKFPTYMGKRVIVDDGLPAKD-------------G----------VYTSYIFGEGAFGLGNGEAPVPTET 266 (346) Q Consensus 214 ~~~s~----~~~i~~~~G~~VVvdD~~p~~~-------------g----------~ytt~l~~~GAi~~~~~~~~~~vE~ 266 (346) ..... ++.+..++|++|+.|..+|... | +-..++|-+-|++..... ++..|+ T Consensus 220 ~~~~~g~~~~G~v~~v~Gv~Vv~SnnlP~~a~~it~~~ls~a~~G~~y~~t~d~t~~~~~~f~~~Av~tvk~~-~vT~~~ 298 (402) T protein:vir:97 220 TISQSGATINGFVLSSYNCPVIPSNRFPTFAQDQAHHLLSNEDNGYRYDPIAEMNGAVAVLFTSDALLVGRTI-EVTGDI 298 (402) T ss_pred ccccCCccccceeEEEeceEEEecCccccccccccccccccCCCCccCCcCcccceeEEEEEecceEEEEEee-ccccch Confidence 22222 3678999999999999998532 1 113466778888876655 356788 Q ss_pred eecCCcceeEEEEeeEEeeeeee------eeeccccccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCC Q lcl|NC_015254. 267 DREKLKGNDILINRQHFLLHPRG------IAWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKET 340 (346) Q Consensus 267 dRd~~~g~~~l~~r~~~~~~~~G------~s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~ 340 (346) .|++....+.|.+.+.|+..|+= +-.+.-..++..| +|.+.-+=-+..-++. +-+....++... -.. T Consensus 299 ~~d~r~~~~~id~~~a~G~g~~RPeaa~vv~~~~~~t~~~~~---~~~~~~~~~~~~~~~~---~~~~~~~~~~~~-~~~ 371 (402) T protein:vir:97 299 FYEKKEKTYYIDTFMAEGAIPDRWEAVSVVTTKRDATTGDAG---GPGDDHATVLARAQRK---AVYVKTEGAAAA-FSA 371 (402) T ss_pred hhchhHHHHHHHHHHHhCCcccCccceEEEEEecccccccCC---ccccchhhhhcccccc---eEEEeccccchh-ccc Confidence 88888777777777777766631 1122111112222 3333322111111111 223333444433 356 Q ss_pred CCCCCC Q lcl|NC_015254. 
341 APEGIK 346 (346) Q Consensus 341 ~~~~~~ 346 (346) +|.|+. T Consensus 372 ~~~~~~ 377 (402) T protein:vir:97 372 APAGIQ 377 (402) T ss_pred cccccc Confidence 888988 No 138 >protein:vir:962 Length: 397 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:19 # MgeName: bIL285 # Cross-refs: genbank:acc:NP_076616;genbank:gi:13095724;genbank:GeneID:920264 Probab=98.44 E-value=6.2e-08 Score=60.09 Aligned_cols=266 Identities=11% Similarity=0.002 Sum_probs=123.8 Q ss_pred Cccce-ecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCccc Q lcl|NC_015254. 1 MIKKL-RMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDE 79 (346) Q Consensus 1 ~~~~~-~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae 79 (346) +...+ +...+.- +...+.-....+|+-+...+.+... ...+.+ . ......++....+|....=++.+. T Consensus 120 ~~~~~~~~~~~~~-~~~~~~~~~~~vp~~~~~~i~~~~~-~~~l~~------~---~~~~~~~~~~~~~~~~~~~~~~~~ 188 (397) T protein:vir:96 120 INAFVKSKGAEKR-DGFTSVEGGALIPQELLQPQLEPKD-IVDLSK------Y---VRSVPVNSASGKFPVISKSGSKMA 188 (397) T ss_pred HHHHHHhhhhhhh-hcccccccccchhHHHHHHHHHhhh-hhhHHH------h---hhhccccccceeEEEEeccCCccc Confidence 00000 1111111 2222333445666666555544321 111111 0 111122344566666543333444 Q ss_pred ccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhc Q lcl|NC_015254. 80 ILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSN 159 (346) Q Consensus 80 ~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~ 159 (346) .+.|+.........+..+-....+..+.-..+++....-+.-|....+.+++++...+..+..++..... 
T Consensus 189 ~~~E~~~~~~~~~~~~~~i~~~~~~~~~~~~~s~ell~ds~~~l~~~i~~~l~~~~~~~~~~~i~~g~g~---------- 258 (397) T protein:vir:96 189 TVQQLEKNPQLANPKMVEIDYSVATRRGYIPISQEMIDDASYDVTGLIADEIQDQSLNTKNADIAAVLKT---------- 258 (397) T ss_pred cccccccccccccccccceeecHhHhhcchhhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhcccc---------- Confidence 4566543222233333333333344444444444433333445666788888888888877766643210 Q ss_pred ceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceeeEEeceEEEEeCC Q lcl|NC_015254. 160 KLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFPTYMGKRVIVDDG 235 (346) Q Consensus 160 ~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~~~~G~~VVvdD~ 235 (346) ......++++.|.+++...-+... -.+|+|||.++..|++..-- .++. ...++.-++++|+||++++. T Consensus 259 -------~~~~~~~~~d~~~~~~~~~~~~~~-~a~~v~n~~~~~~l~~lkd~~G~~~~~~~~~~~~~~~l~G~pv~~~~~ 330 (397) T protein:vir:96 259 -------ATAKSVVGVDGLKDLINKEIKKVY-DVKLFISASMYSELDKLKDKNGRYLLQDSITAASGKQLLGKEVVVLDD 330 (397) T ss_pred -------cccccccchHHHHHHHHHhhhhhc-CcEEEEcHHHHHHHHHhhccCCCeEeccCccCCCcccccccceEEecc Confidence 011223578888888876544333 36899999999999875211 2222 22234456899999998766 Q ss_pred CccC--CCceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEE---eeeeeeeeecccccc Q lcl|NC_015254. 236 LPAK--DGVYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHF---LLHPRGIAWQEKSVA 298 (346) Q Consensus 236 ~p~~--~g~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~---~~~~~G~s~~~~~~~ 298 (346) ++.. .|. .+++||. .++.+.. +..+.++..++... 
.+.+.+-.++ +.||..|..-.-+++ T Consensus 331 ~~~~~~~~~-~~~~~gd~~~~~~~~~-~~~~~~~~~~~~~~-~~~~~~~~r~d~~~~~~~a~~~~~~~~a 397 (397) T protein:vir:96 331 DVIGKSVGN-VVGFIGDAKAFASFFD-RKQVSVSWVDNNIY-GQLLAGIIRYDVKATDKKAGFYVTFTIG 397 (397) T ss_pred cccCCCCCc-eEEEEeehhcceEeEe-ecceEEEEeccccc-ceeEEEEEEEccEEecccceEEEEeecC Confidence 4332 233 3455553 2233333 22344554443322 2223222222 234444444322222 No 139 >protein:vir:94424 Length: 387 # NCBI annotation: ORF010 # Family: family:all:658 # MgeID: mge:1506 # MgeName: 47 # Cross-refs: genbank:acc:YP_240005;genbank:gi:66395666;genbank:GeneID:5133084 Probab=98.43 E-value=2e-08 Score=62.79 Aligned_cols=274 Identities=14% Similarity=0.076 Sum_probs=137.3 Q ss_pred Ccc----ceecceeeec-C--CceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIK----KLRMNLQKFA-A--GKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~----~~~~~~q~~~-a--~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) |.. +..++-.... + ..++.-...++|+-|..-+.+...+.+.+.+-.-+ ...++ ..+|.... T Consensus 99 ~~~~~~~~~~~~~~~~~~a~~~~~~~~gG~lIP~~~~~~Ii~~~~~~~~l~~~~~~---------~~~~~--~~~p~~~~ 167 (387) T protein:vir:94 99 ILPNEFEKPSMEAQRLLHALPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREKARL---------TNIKG--LEIPRVSY 167 (387) T ss_pred HhhhhHHHHHHHHHHHHhhhccCCCCCCceeechhHHHHHHHHHHhhchhhhhcee---------eecCC--ceeeeeec Confidence 110 0011111110 0 00111124577887776666655555544321111 11223 34565544 Q ss_pred CCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITAS 153 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~ 153 (346) -.+++.-+.|+. .++..+.+-++-.-..+..+--..++++...-+.-|..+.+.+++++.+.+...+.++....|.=.. 
T Consensus 168 ~~~~a~~v~Eg~-~~~~~~~~f~~v~l~~~k~~~~i~iS~ell~ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~g~~ 246 (387) T protein:vir:94 168 TLDDDDFITDVE-TAKELKAKGDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPKSGLE 246 (387) T ss_pred cCCccccccccc-cccccccccceeeechheeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCcccc Confidence 334555566764 4454444433333333334334566665444455577888999999999887666665433221000 Q ss_pred hhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC----ceeeEEeceE Q lcl|NC_015254. 154 GALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN----KKFPTYMGKR 229 (346) Q Consensus 154 ~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~----~~i~~~~G~~ 229 (346) .+.-. ...+. ..+....++.|.++...+-.....-..|+||+.++..|++.- ...++ +.-.+++|+| T Consensus 247 ~g~~~-~~~~~---~~~~~~~~d~i~~~~~~l~~~y~~na~~imn~~t~~~~~~~~-----~~~~~~~~~~~~~~llG~P 317 (387) T protein:vir:94 247 HMSFY-NGSVK---EVEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIISVL-----SNGTTNFFDTPAEKVFGKP 317 (387) T ss_pred ceeee-ccccc---cccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHHH-----hcCCCcccccCCccccccc Confidence 00000 00111 111223578899988887776666678999999988876431 11111 1224789999 Q ss_pred EEEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeee---eeeeccc-cccCCCCC Q lcl|NC_015254. 230 VIVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPR---GIAWQEK-SVAGHSPT 303 (346) Q Consensus 230 VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~---G~s~~~~-~~~~~sPt 303 (346) |+++|.++. ++||.=.-.|.. .....++..|+...+...+....+|-..|. .|..-.. 
+.++..|| T Consensus 318 V~~~~~~~~-------~~~GDf~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka~~~~~~~ 387 (387) T protein:vir:94 318 VVFTDAAVK-------PIVGDFNYFGIN-YDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGPLPS 387 (387) T ss_pred eEEecCCCc-------eeeechhhhhhh-hhhhhheecccccCCceEEEEEEEeCcEeechhheEEEEeecCCCCCCC Confidence 999998753 223211111111 122345566777677777777666655543 3333322 23344555 No 140 >protein:vir:2685 Length: 387 # NCBI annotation: hypothetical protein # Family: family:all:658 # MgeID: mge:57 # MgeName: phiSLT # Cross-refs: genbank:acc:NP_075504;genbank:gi:12719433;genbank:GeneID:920169 Probab=98.43 E-value=2e-08 Score=62.79 Aligned_cols=274 Identities=14% Similarity=0.076 Sum_probs=137.3 Q ss_pred Ccc----ceecceeeec-C--CceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIK----KLRMNLQKFA-A--GKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~----~~~~~~q~~~-a--~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) |.. +..++-.... + ..++.-...++|+-|..-+.+...+.+.+.+-.-+ ...++ ..+|.... T Consensus 99 ~~~~~~~~~~~~~~~~~~a~~~~~~~~gG~lIP~~~~~~Ii~~~~~~~~l~~~~~~---------~~~~~--~~~p~~~~ 167 (387) T protein:vir:26 99 ILPNEFEKPSMEAQRLLHALPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREKARL---------TNIKG--LEIPRVSY 167 (387) T ss_pred HhhhhHHHHHHHHHHHHhhhccCCCCCCceeechhHHHHHHHHHHhhchhhhhcee---------eecCC--ceeeeeec Confidence 110 0011111110 0 00111124577887776666655555544321111 11223 34565544 Q ss_pred CCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITAS 153 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~ 153 (346) -.+++.-+.|+. .++..+.+-++-.-..+..+--..++++...-+.-|..+.+.+++++.+.+...+.++....|.=.. 
T Consensus 168 ~~~~a~~v~Eg~-~~~~~~~~f~~v~l~~~k~~~~i~iS~ell~ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~g~~ 246 (387) T protein:vir:26 168 TLDDDDFITDVE-TAKELKAKGDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPKSGLE 246 (387) T ss_pred cCCccccccccc-cccccccccceeeechheeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCcccc Confidence 334555566764 4454444433333333334334566665444455577888999999999887666665433221000 Q ss_pred hhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC----ceeeEEeceE Q lcl|NC_015254. 154 GALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN----KKFPTYMGKR 229 (346) Q Consensus 154 ~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~----~~i~~~~G~~ 229 (346) .+.-. ...+. ..+....++.|.++...+-.....-..|+||+.++..|++.- ...++ +.-.+++|+| T Consensus 247 ~g~~~-~~~~~---~~~~~~~~d~i~~~~~~l~~~y~~na~~imn~~t~~~~~~~~-----~~~~~~~~~~~~~~llG~P 317 (387) T protein:vir:26 247 HMSFY-NGSVK---EVEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIISVL-----SNGTTNFFDTPAEKVFGKP 317 (387) T ss_pred ceeee-ccccc---cccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHHH-----hcCCCcccccCCccccccc Confidence 00000 00111 111223578899988887776666678999999988876431 11111 1224789999 Q ss_pred EEEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeee---eeeeccc-cccCCCCC Q lcl|NC_015254. 230 VIVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPR---GIAWQEK-SVAGHSPT 303 (346) Q Consensus 230 VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~---G~s~~~~-~~~~~sPt 303 (346) |+++|.++. ++||.=.-.|.. .....++..|+...+...+....+|-..|. .|..-.. 
+.++..|| T Consensus 318 V~~~~~~~~-------~~~GDf~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka~~~~~~~ 387 (387) T protein:vir:26 318 VVFTDAAVK-------PIVGDFNYFGIN-YDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGPLPS 387 (387) T ss_pred eEEecCCCc-------eeeechhhhhhh-hhhhhheecccccCCceEEEEEEEeCcEeechhheEEEEeecCCCCCCC Confidence 999998753 223211111111 122345566777677777777666655543 3333322 23344555 No 141 >protein:vir:96978 Length: 387 # NCBI annotation: ORF009 # Family: family:all:658 # MgeID: mge:1643 # MgeName: 42e # Cross-refs: genbank:acc:YP_239859;genbank:gi:66395517;genbank:GeneID:5133011 Probab=98.43 E-value=2e-08 Score=62.79 Aligned_cols=274 Identities=14% Similarity=0.076 Sum_probs=137.3 Q ss_pred Ccc----ceecceeeec-C--CceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIK----KLRMNLQKFA-A--GKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~----~~~~~~q~~~-a--~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) |.. +..++-.... + ..++.-...++|+-|..-+.+...+.+.+.+-.-+ ...++ ..+|.... T Consensus 99 ~~~~~~~~~~~~~~~~~~a~~~~~~~~gG~lIP~~~~~~Ii~~~~~~~~l~~~~~~---------~~~~~--~~~p~~~~ 167 (387) T protein:vir:96 99 ILPNEFEKPSMEAQRLLHALPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREKARL---------TNIKG--LEIPRVSY 167 (387) T ss_pred HhhhhHHHHHHHHHHHHhhhccCCCCCCceeechhHHHHHHHHHHhhchhhhhcee---------eecCC--ceeeeeec Confidence 110 0011111110 0 00111124577887776666655555544321111 11223 34565544 Q ss_pred CCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITAS 153 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~ 153 (346) -.+++.-+.|+. .++..+.+-++-.-..+..+--..++++...-+.-|..+.+.+++++.+.+...+.++....|.=.. 
T Consensus 168 ~~~~a~~v~Eg~-~~~~~~~~f~~v~l~~~k~~~~i~iS~ell~ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~g~~ 246 (387) T protein:vir:96 168 TLDDDDFITDVE-TAKELKAKGDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPKSGLE 246 (387) T ss_pred cCCccccccccc-cccccccccceeeechheeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCcccc Confidence 334555566764 4454444433333333334334566665444455577888999999999887666665433221000 Q ss_pred hhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC----ceeeEEeceE Q lcl|NC_015254. 154 GALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN----KKFPTYMGKR 229 (346) Q Consensus 154 ~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~----~~i~~~~G~~ 229 (346) .+.-. ...+. ..+....++.|.++...+-.....-..|+||+.++..|++.- ...++ +.-.+++|+| T Consensus 247 ~g~~~-~~~~~---~~~~~~~~d~i~~~~~~l~~~y~~na~~imn~~t~~~~~~~~-----~~~~~~~~~~~~~~llG~P 317 (387) T protein:vir:96 247 HMSFY-NGSVK---EVEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIISVL-----SNGTTNFFDTPAEKVFGKP 317 (387) T ss_pred ceeee-ccccc---cccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHHH-----hcCCCcccccCCccccccc Confidence 00000 00111 111223578899988887776666678999999988876431 11111 1224789999 Q ss_pred EEEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeee---eeeeccc-cccCCCCC Q lcl|NC_015254. 230 VIVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPR---GIAWQEK-SVAGHSPT 303 (346) Q Consensus 230 VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~---G~s~~~~-~~~~~sPt 303 (346) |+++|.++. ++||.=.-.|.. .....++..|+...+...+....+|-..|. .|..-.. 
+.++..|| T Consensus 318 V~~~~~~~~-------~~~GDf~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka~~~~~~~ 387 (387) T protein:vir:96 318 VVFTDAAVK-------PIVGDFNYFGIN-YDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGPLPS 387 (387) T ss_pred eEEecCCCc-------eeeechhhhhhh-hhhhhheecccccCCceEEEEEEEeCcEeechhheEEEEeecCCCCCCC Confidence 999998753 223211111111 122345566777677777777666655543 3333322 23344555 No 142 >protein:vir:9361 Length: 402 # NCBI annotation: SLT orf 37-like protein # Family: family:all:658 # MgeID: mge:166 # MgeName: phi 12 # Cross-refs: genbank:acc:NP_803339;genbank:gi:29028650;genbank:GeneID:1258088 Probab=98.40 E-value=5.4e-08 Score=60.43 Aligned_cols=274 Identities=13% Similarity=0.075 Sum_probs=135.9 Q ss_pred Cccc-e---ecceeeecC--C-ceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIKK-L---RMNLQKFAA--G-KNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~~-~---~~~~q~~~a--~-~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) |..+ . .++-+...+ . .++.-.-.++|+-|..-+.+...+.+.+.+-.-+ ...++ ..+|.... T Consensus 114 ~~~~~~~~~~~~~~~~~~a~~~~t~~~GG~lIP~~~~~~Ii~~~~~~~~l~~~~~v---------~~~~~--~~~p~~~~ 182 (402) T protein:vir:93 114 ILPNEFEKPSMEAQRLLHALPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREKARL---------TNIKG--LEIPRVSY 182 (402) T ss_pred HhhhhHHHHHHhHHHHHhhhccCCCcCCccccchhHHHHHHHhHHhhhhhhhhcee---------eecCC--ceeeeeec Confidence 1110 0 111111110 0 0111123567887777666666555555331111 11223 34565544 Q ss_pred CCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITAS 153 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~ 153 (346) -.+++.-+.|+. .++..+.+-.+-.-..+..+.-..++++...-+.-|..+.+.++++..+.+...+.++....|.-.. 
T Consensus 183 ~~~~a~~v~Eg~-~~~~~~~~f~~i~~~~~k~~~~i~iS~ell~Ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~g~p 261 (402) T protein:vir:93 183 TLDDDDFITDVE-TAKELKAKGDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPKSGLE 261 (402) T ss_pred cCCccccccccc-cccccccccceeeecceeeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCcccc Confidence 334455566764 4444444433333333333333456655444445577888999999999887666655443321100 Q ss_pred hhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccC----ceeeEEeceE Q lcl|NC_015254. 154 GALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDN----KKFPTYMGKR 229 (346) Q Consensus 154 ~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~----~~i~~~~G~~ 229 (346) .+... ...+.. .+....++.|.++...+......-..|+||+.++..|++.- .+.++ +.-.+++|+| T Consensus 262 ~g~~~-~~~~~~---~~~~~~~d~l~~~~~~l~~~y~~na~~imn~~t~~~~~~~~-----~d~~~~~~~~~~~~llG~P 332 (402) T protein:vir:93 262 HMSFY-NGSVKE---VEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIISVL-----SNGTTNFFDTPAEKVFGKP 332 (402) T ss_pred ceeee-cccccc---ccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHHH-----hcCCCcccccCCccccccc Confidence 00000 001111 11223478899988877776666678999999988876531 11111 1224789999 Q ss_pred EEEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeee---eeeecc-ccccCCCCC Q lcl|NC_015254. 230 VIVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPR---GIAWQE-KSVAGHSPT 303 (346) Q Consensus 230 VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~---G~s~~~-~~~~~~sPt 303 (346) |+++|+++. ++||.=+-.|. ......++..|++..+...+....++-..|. .|..-. 
+..++..|| T Consensus 333 V~~t~~~~~-------i~~GDf~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~ik~~~~~~~~ 402 (402) T protein:vir:93 333 VVFTDAAVK-------PIVGDFNYFGI-NYDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGPLPS 402 (402) T ss_pred eEEecCCCc-------eeeechhhhhh-hhhhhhhhhhhcccCCceEEEEEEEeCcEEechhheEEEEeecCCCCCCC Confidence 999998753 23321110111 1122345566777667776666666654442 333222 123344455 No 143 >protein:vir:102873 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1492 # MgeName: Cherry # Cross-refs: genbank:acc:YP_338137;genbank:gi:77020198;genbank:GeneID:3703782 Probab=98.34 E-value=2.7e-07 Score=56.57 Aligned_cols=284 Identities=13% Similarity=0.080 Sum_probs=133.7 Q ss_pred Cccceecceeeec---------CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcE--EEec Q lcl|NC_015254. 1 MIKKLRMNLQKFA---------AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNM--INMP 69 (346) Q Consensus 1 ~~~~~~~~~q~~~---------a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~t--i~~P 69 (346) .+..++=..+.|. +..++.-.-.++|+.+...+.+...+.+.+.+..- ....++.. ..+| T Consensus 85 ~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~---------~~~~~~~~~~~~~~ 155 (392) T protein:vir:10 85 RNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVT---------VEPVRTRSGSRVLE 155 (392) T ss_pred hcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhce---------eeeccCCceeEEEE Confidence 1111111111110 11122224456798888878777766666543211 11112333 3445 Q ss_pred ccccCCCcccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 70 FWQDLTGEDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLN 148 (346) Q Consensus 70 ~~~~l~g~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~ 148 (346) ..... ..+.-+.|+. .++. +..+-.+-.-..++.+.-..++++...-+.-|..+.+.+++++.+.+..+..++.... 
T Consensus 156 ~~~~~-~~a~~v~E~~-~~~~~~~~~~~~v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g 233 (392) T protein:vir:10 156 KNSDM-IPFAEITEMG-EIPETDNPKFSNVQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIE 233 (392) T ss_pred eecCC-ccceeecccc-cccccccccceeEEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc Confidence 44433 2344566764 3332 2233333334445555556666654433444677889999999999988887664321 Q ss_pred hhhhhhhhhhcceeeeccccccccccHHHHHHHHH-HhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceee Q lcl|NC_015254. 149 GITASGALDSNKLDVSTETGDDSYFTGDTFLSATY-KLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFP 223 (346) Q Consensus 149 G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~ 223 (346) . ......++++.+.+++. .+......-..|+||+.++..|++..-- .++. ...++.-+ T Consensus 234 ----~-------------~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~lkd~~G~~l~~~~~~~~~~~ 296 (392) T protein:vir:10 234 ----K-------------LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKLKDKDGKYILQSDPTQKNKK 296 (392) T ss_pred ----c-------------ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHhhccCCCeEeecCccCCccc Confidence 0 01123357888999885 4555555567899999999999875211 1222 11234456 Q ss_pred EEeceEEEEe-CCC-ccCCC---ceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeecccc Q lcl|NC_015254. 224 TYMGKRVIVD-DGL-PAKDG---VYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKS 296 (346) Q Consensus 224 ~~~G~~VVvd-D~~-p~~~g---~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~ 296 (346) +++|+|+|+. +.+ |...+ .-..++|+. -++.+.. +..+.+++++..... 
+ .+..+.++++.+-+- T Consensus 297 tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-~~~~~~~~~~~~~~~---f---~~~~~~~r~~~r~d~- 368 (392) T protein:vir:10 297 LFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-REDMELASTDVGGKA---F---TRNTLDLRAIQRDDV- 368 (392) T ss_pred cccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEe-ecceEEEEeccccch---h---hcCceEEEEEEeecc- Confidence 8999876664 333 22211 112344443 2233332 223345544422100 0 011222333333211 Q ss_pred ccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCCC Q lcl|NC_015254. 297 VAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEG 344 (346) Q Consensus 297 ~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~ 344 (346) .+.+++.|-.+.+.+ +.....||| T Consensus 369 ------------------~v~~~~a~~~l~~~~------~a~~~~~~~ 392 (392) T protein:vir:10 369 ------------------QMWDNEAAVYGEIDL------SAPVEQPQG 392 (392) T ss_pred ------------------EEecccceEEEEecc------cccccCCCC Confidence 122233332222221 222233555 No 144 >protein:vir:102082 Length: 392 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1503 # MgeName: Fah # Cross-refs: genbank:acc:YP_512315;genbank:gi:89152484;genbank:GeneID:3953075 Probab=98.34 E-value=2.7e-07 Score=56.57 Aligned_cols=284 Identities=13% Similarity=0.080 Sum_probs=133.7 Q ss_pred Cccceecceeeec---------CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcE--EEec Q lcl|NC_015254. 1 MIKKLRMNLQKFA---------AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNM--INMP 69 (346) Q Consensus 1 ~~~~~~~~~q~~~---------a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~t--i~~P 69 (346) .+..++=..+.|. +..++.-.-.++|+.+...+.+...+.+.+.+..- ....++.. 
..+| T Consensus 85 ~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~---------~~~~~~~~~~~~~~ 155 (392) T protein:vir:10 85 RNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVT---------VEPVRTRSGSRVLE 155 (392) T ss_pred hcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhce---------eeeccCCceeEEEE Confidence 1111111111110 11122224456798888878777766666543211 11112333 3445 Q ss_pred ccccCCCcccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 70 FWQDLTGEDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLN 148 (346) Q Consensus 70 ~~~~l~g~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~ 148 (346) ..... ..+.-+.|+. .++. +..+-.+-.-..++.+.-..++++...-+.-|..+.+.+++++.+.+..+..++.... T Consensus 156 ~~~~~-~~a~~v~E~~-~~~~~~~~~~~~v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g 233 (392) T protein:vir:10 156 KNSDM-IPFAEITEMG-EIPETDNPKFSNVQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIE 233 (392) T ss_pred eecCC-ccceeecccc-cccccccccceeEEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc Confidence 44433 2344566764 3332 2233333334445555556666654433444677889999999999988887664321 Q ss_pred hhhhhhhhhhcceeeeccccccccccHHHHHHHHH-HhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceee Q lcl|NC_015254. 149 GITASGALDSNKLDVSTETGDDSYFTGDTFLSATY-KLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFP 223 (346) Q Consensus 149 G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~ 223 (346) . ......++++.+.+++. .+......-..|+||+.++..|++..-- .++. 
...++.-+ T Consensus 234 ----~-------------~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~lkd~~G~~l~~~~~~~~~~~ 296 (392) T protein:vir:10 234 ----K-------------LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKLKDKDGKYILQSDPTQKNKK 296 (392) T ss_pred ----c-------------ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHhhccCCCeEeecCccCCccc Confidence 0 01123357888999885 4555555567899999999999875211 1222 11234456 Q ss_pred EEeceEEEEe-CCC-ccCCC---ceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeecccc Q lcl|NC_015254. 224 TYMGKRVIVD-DGL-PAKDG---VYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKS 296 (346) Q Consensus 224 ~~~G~~VVvd-D~~-p~~~g---~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~ 296 (346) +++|+|+|+. +.+ |...+ .-..++|+. -++.+.. +..+.+++++..... + .+..+.++++.+-+- T Consensus 297 tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-~~~~~~~~~~~~~~~---f---~~~~~~~r~~~r~d~- 368 (392) T protein:vir:10 297 LFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-REDMELASTDVGGKA---F---TRNTLDLRAIQRDDV- 368 (392) T ss_pred cccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEe-ecceEEEEeccccch---h---hcCceEEEEEEeecc- Confidence 8999876664 333 22211 112344443 2233332 223345544422100 0 011222333333211 Q ss_pred ccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCCC Q lcl|NC_015254. 
297 VAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEG 344 (346) Q Consensus 297 ~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~ 344 (346) .+.+++.|-.+.+.+ +.....||| T Consensus 369 ------------------~v~~~~a~~~l~~~~------~a~~~~~~~ 392 (392) T protein:vir:10 369 ------------------QMWDNEAAVYGEIDL------SAPVEQPQG 392 (392) T ss_pred ------------------EEecccceEEEEecc------cccccCCCC Confidence 122233332222221 222233555 No 145 >protein:vir:107593 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1491 # MgeName: Gamma # Cross-refs: genbank:acc:YP_338188;genbank:gi:77020144;genbank:GeneID:3703724 Probab=98.34 E-value=2.7e-07 Score=56.57 Aligned_cols=284 Identities=13% Similarity=0.080 Sum_probs=133.7 Q ss_pred Cccceecceeeec---------CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcE--EEec Q lcl|NC_015254. 1 MIKKLRMNLQKFA---------AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNM--INMP 69 (346) Q Consensus 1 ~~~~~~~~~q~~~---------a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~t--i~~P 69 (346) .+..++=..+.|. +..++.-.-.++|+.+...+.+...+.+.+.+..- ....++.. ..+| T Consensus 85 ~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~---------~~~~~~~~~~~~~~ 155 (392) T protein:vir:10 85 RNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVT---------VEPVRTRSGSRVLE 155 (392) T ss_pred hcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhce---------eeeccCCceeEEEE Confidence 1111111111110 11122224456798888878777766666543211 11112333 3445 Q ss_pred ccccCCCcccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 70 FWQDLTGEDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLN 148 (346) Q Consensus 70 ~~~~l~g~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~ 148 (346) ..... ..+.-+.|+. .++. +..+-.+-.-..++.+.-..++++...-+.-|..+.+.+++++.+.+..+..++.... 
T Consensus 156 ~~~~~-~~a~~v~E~~-~~~~~~~~~~~~v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g 233 (392) T protein:vir:10 156 KNSDM-IPFAEITEMG-EIPETDNPKFSNVQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIE 233 (392) T ss_pred eecCC-ccceeecccc-cccccccccceeEEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc Confidence 44433 2344566764 3332 2233333334445555556666654433444677889999999999988887664321 Q ss_pred hhhhhhhhhhcceeeeccccccccccHHHHHHHHH-HhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceee Q lcl|NC_015254. 149 GITASGALDSNKLDVSTETGDDSYFTGDTFLSATY-KLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFP 223 (346) Q Consensus 149 G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~ 223 (346) . ......++++.+.+++. .+......-..|+||+.++..|++..-- .++. ...++.-+ T Consensus 234 ----~-------------~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~lkd~~G~~l~~~~~~~~~~~ 296 (392) T protein:vir:10 234 ----K-------------LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKLKDKDGKYILQSDPTQKNKK 296 (392) T ss_pred ----c-------------ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHhhccCCCeEeecCccCCccc Confidence 0 01123357888999885 4555555567899999999999875211 1222 11234456 Q ss_pred EEeceEEEEe-CCC-ccCCC---ceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeecccc Q lcl|NC_015254. 224 TYMGKRVIVD-DGL-PAKDG---VYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKS 296 (346) Q Consensus 224 ~~~G~~VVvd-D~~-p~~~g---~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~ 296 (346) +++|+|+|+. +.+ |...+ .-..++|+. -++.+.. +..+.+++++..... 
+ .+..+.++++.+-+- T Consensus 297 tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-~~~~~~~~~~~~~~~---f---~~~~~~~r~~~r~d~- 368 (392) T protein:vir:10 297 LFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-REDMELASTDVGGKA---F---TRNTLDLRAIQRDDV- 368 (392) T ss_pred cccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEe-ecceEEEEeccccch---h---hcCceEEEEEEeecc- Confidence 8999876664 333 22211 112344443 2233332 223345544422100 0 011222333333211 Q ss_pred ccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCCC Q lcl|NC_015254. 297 VAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEG 344 (346) Q Consensus 297 ~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~ 344 (346) .+.+++.|-.+.+.+ +.....||| T Consensus 369 ------------------~v~~~~a~~~l~~~~------~a~~~~~~~ 392 (392) T protein:vir:10 369 ------------------QMWDNEAAVYGEIDL------SAPVEQPQG 392 (392) T ss_pred ------------------EEecccceEEEEecc------cccccCCCC Confidence 122233332222221 222233555 No 146 >protein:vir:105004 Length: 392 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:1490 # MgeName: W Beta # Cross-refs: genbank:acc:YP_459969;genbank:gi:85701384;genbank:GeneID:3882145 Probab=98.34 E-value=2.7e-07 Score=56.57 Aligned_cols=284 Identities=13% Similarity=0.080 Sum_probs=133.7 Q ss_pred Cccceecceeeec---------CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcE--EEec Q lcl|NC_015254. 1 MIKKLRMNLQKFA---------AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNM--INMP 69 (346) Q Consensus 1 ~~~~~~~~~q~~~---------a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~t--i~~P 69 (346) .+..++=..+.|. +..++.-.-.++|+.+...+.+...+.+.+.+..- ....++.. 
..+| T Consensus 85 ~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~---------~~~~~~~~~~~~~~ 155 (392) T protein:vir:10 85 RNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVT---------VEPVRTRSGSRVLE 155 (392) T ss_pred hcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhce---------eeeccCCceeEEEE Confidence 1111111111110 11122224456798888878777766666543211 11112333 3445 Q ss_pred ccccCCCcccccCCCccccch-hhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 70 FWQDLTGEDEILDDGEGALTP-GNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLN 148 (346) Q Consensus 70 ~~~~l~g~ae~~~dg~~~it~-~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~ 148 (346) ..... ..+.-+.|+. .++. +..+-.+-.-..++.+.-..++++...-+.-|..+.+.+++++.+.+..+..++.... T Consensus 156 ~~~~~-~~a~~v~E~~-~~~~~~~~~~~~v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g 233 (392) T protein:vir:10 156 KNSDM-IPFAEITEMG-EIPETDNPKFSNVQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIE 233 (392) T ss_pred eecCC-ccceeecccc-cccccccccceeEEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc Confidence 44433 2344566764 3332 2233333334445555556666654433444677889999999999988887664321 Q ss_pred hhhhhhhhhhcceeeeccccccccccHHHHHHHHH-HhCccccCceEEEEchHHHHHHHhhhhh--hhcc--cccCceee Q lcl|NC_015254. 149 GITASGALDSNKLDVSTETGDDSYFTGDTFLSATY-KLGDAEGKLTGIAMHSQTEMNLRKQGLI--EFML--DSDNKKFP 223 (346) Q Consensus 149 G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~~~~~L~~~~li--~~~~--~s~~~~i~ 223 (346) . ......++++.+.+++. .+......-..|+||+.++..|++..-- .++. 
...++.-+ T Consensus 234 ----~-------------~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~lkd~~G~~l~~~~~~~~~~~ 296 (392) T protein:vir:10 234 ----K-------------LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKLKDKDGKYILQSDPTQKNKK 296 (392) T ss_pred ----c-------------ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHhhccCCCeEeecCccCCccc Confidence 0 01123357888999885 4555555567899999999999875211 1222 11234456 Q ss_pred EEeceEEEEe-CCC-ccCCC---ceEEEEEcC--CeeEEeecCCccceeeeecCCcceeEEEEeeEEeeeeeeeeecccc Q lcl|NC_015254. 224 TYMGKRVIVD-DGL-PAKDG---VYTSYIFGE--GAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKS 296 (346) Q Consensus 224 ~~~G~~VVvd-D~~-p~~~g---~ytt~l~~~--GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~ 296 (346) +++|+|+|+. +.+ |...+ .-..++|+. -++.+.. +..+.+++++..... + .+..+.++++.+-+- T Consensus 297 tllG~~~v~~~~~~~~~~~~~~~~~~~~~~gdfs~~~~i~~-~~~~~~~~~~~~~~~---f---~~~~~~~r~~~r~d~- 368 (392) T protein:vir:10 297 LFAGTNPVVVVSNRFLKSKGTTAKKAPLIIGDLKEAIVLFK-REDMELASTDVGGKA---F---TRNTLDLRAIQRDDV- 368 (392) T ss_pred cccCcccEEEecccccCCCcccCCceEEEEEehhceEEEEe-ecceEEEEeccccch---h---hcCceEEEEEEeecc- Confidence 8999876664 333 22211 112344443 2233332 223345544422100 0 011222333333211 Q ss_pred ccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCCC Q lcl|NC_015254. 
297 VAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEG 344 (346) Q Consensus 297 ~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~ 344 (346) .+.+++.|-.+.+.+ +.....||| T Consensus 369 ------------------~v~~~~a~~~l~~~~------~a~~~~~~~ 392 (392) T protein:vir:10 369 ------------------QMWDNEAAVYGEIDL------SAPVEQPQG 392 (392) T ss_pred ------------------EEecccceEEEEecc------cccccCCCC Confidence 122233332222221 222233555 No 147 >protein:vir:4159 Length: 315 # NCBI annotation: structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:87 # MgeName: psiM2 # Cross-refs: genbank:acc:NP_046968;genbank:gi:9630538;genbank:GeneID:1261712 Probab=98.24 E-value=6.4e-07 Score=54.54 Aligned_cols=283 Identities=13% Similarity=-0.002 Sum_probs=135.7 Q ss_pred CccceecceeeecC-Cceeeeee----ccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccc--- Q lcl|NC_015254. 1 MIKKLRMNLQKFAA-GKNTRIAD----VIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQ--- 72 (346) Q Consensus 1 ~~~~~~~~~q~~~a-~~~T~l~d----~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~--- 72 (346) |-+==.|-+|.... -..+..+| ..+||.+..++ +...+.+.|.+-.-+.+ ...+.+.+++.-+ T Consensus 1 ~~~~~~~~~~~~~~~~k~~t~~d~~Gg~l~P~~~~~~i-~~~~e~s~~l~~~~vi~--------~~~~~~~~i~~~g~~~ 71 (315) T protein:vir:41 1 MLTIEDIRGGKPFEIVPKIDVPDLGRGVLSVDRFGEFV-KAVRDSAVIIPEARIDN--------ALKSYEKDISRLSLVL 71 (315) T ss_pred CcccchhhcCChhhhhhhcCCcCCCCceechHHHHHHH-HHHHhhhhhhhhceeee--------ccccccccccccccCc Confidence 22222222333221 01112344 37899988766 56666666655322110 0112223332211 Q ss_pred -cCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhh--cchHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015254. 73 -DLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALS--GDDPMRAIGDLVVEYWNRRRQAVLIASLNG 149 (346) Q Consensus 73 -~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~--g~dp~~~i~~q~a~~~~~~~~~~lla~L~G 149 (346) -..|..+. .+. +..+..+.+-++..-..++..--+.+++..-.-+ +-|..+.+..++++.+.++.+..++. 
| T Consensus 72 ~~~~g~~~~-~~~-~~~~~~~~~f~~~~l~~~~l~~~~~it~elL~D~~~~~~~e~~l~~~~a~~~a~~~~~~~~n---G 146 (315) T protein:vir:41 72 DVGPGRDET-GQK-LAPPESTAEVKTNTLYMREMVTKVVIHEDAIEDNIEGKAFEQKIVTLLGEGISYVLEKYYLH---G 146 (315) T ss_pred ccccccccc-cCc-CCCCCCccccceeeeceeeeeeeccccHHHHHhhhccccHHHHHHHHHHHHHHHHHHHHhhc---c Confidence 11111110 011 1111112222222222222322356766655434 34888899999999999998886664 3 Q ss_pred hhhh--h------h----hhhcceeeeccccccccccHHHHHHHHHHhCcccc---CceEEEEchHHHHHHHhhhhh--h Q lcl|NC_015254. 150 ITAS--G------A----LDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEG---KLTGIAMHSQTEMNLRKQGLI--E 212 (346) Q Consensus 150 ~~~~--~------~----~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~---~~~~ivmhS~~~~~L~~~~li--~ 212 (346) --+. . + +..+.. .....++...++.+.|.+....+-.+.. .-.+|+||+.++..+|+..-- . T Consensus 147 dg~s~~p~~~~~~G~l~~a~~~~~-~~~~~~~a~~~~~d~l~~l~~sl~~~yr~~~~~~~~imn~~t~~~~rklk~~~g~ 225 (315) T protein:vir:41 147 DTSSSDPLLRMSDGWLKLASEKLT-ESDVDPEAEDWPMNLFDTMIESLPTPYRNNLPNMKFYVTWDIYRAYRDALKGRET 225 (315) T ss_pred CCcCcCccccccccceeccccccc-ccccccccccccHHHHHHHHHhcChHHhhcCCceEEEEcHHHHHHHHHHhccCCC Confidence 1110 0 0 000100 0111223344677888888887766442 356899999999998875421 2 Q ss_pred hcc--cccCceeeEEeceEEEEeCCCccCCCceEEEEEcCCe-eEEeecCCccceeeeecCCcceeEEEEeeEEeeeeee Q lcl|NC_015254. 213 FML--DSDNKKFPTYMGKRVIVDDGLPAKDGVYTSYIFGEGA-FGLGNGEAPVPTETDREKLKGNDILINRQHFLLHPRG 289 (346) Q Consensus 213 ~~~--~s~~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G 289 (346) ++- .-.++.-.+++|+||+..+.||....--..++|+.-. +.+.. ...+.+|.+|+...+..-++.+.+..++ T Consensus 226 ~lw~~~~~~g~~~tl~G~PV~~~~~m~~~~~~~~~ilf~d~~nl~~~~-~~~i~i~~~~~a~~~~~~~~~~~r~d~~--- 301 (315) T protein:vir:41 226 GLGDQALTGANSILYDGRPVQYVPALEALNDGKSRALFVVPTQLVYGF-WRNIKVVPDYDAEMRLTKYVASLRTDNH--- 301 (315) T ss_pred ccccchhhcCCCceecccceEecccccccCCCCccEEEecccceEEEe-ccccEEEeeecCCCCceEEEEEEEecee--- Confidence 221 1124555689999999999998643211235555432 33433 3346677788876665555554443222 Q ss_pred eeeccccc-cCCCC Q lcl|NC_015254. 
290 IAWQEKSV-AGHSP 302 (346) Q Consensus 290 ~s~~~~~~-~~~sP 302 (346) |-|.+..+ .-..- T Consensus 302 ~~~~~~~a~~~~~v 315 (315) T protein:vir:41 302 YEDEEGAVSATITV 315 (315) T ss_pred EEeccceeEeeeeC Confidence 12211100 00000 No 148 >protein:vir:78640 Length: 352 # NCBI annotation: phage capsid # Family: family:all:658 # MgeID: mge:1855 # MgeName: tp310-2 # Cross-refs: genbank:acc:YP_001429943;genbank:gi:156603997;genbank:GeneID:5525386 Probab=98.20 E-value=3.8e-07 Score=55.76 Aligned_cols=275 Identities=14% Similarity=0.101 Sum_probs=133.2 Q ss_pred Cccc-ee---cceeee-cC-Cc-eeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc Q lcl|NC_015254. 1 MIKK-LR---MNLQKF-AA-GK-NTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD 73 (346) Q Consensus 1 ~~~~-~~---~~~q~~-~a-~~-~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~ 73 (346) |.++ ++ +..+.. .+ +. ++.-...++|+-+..-+.+...+.+.+.+- .. ....+|. ++|.... T Consensus 64 ~~~~~~~~~~~~~~~~~~al~~~~~~~gG~lIP~~~~~~Ii~~l~~~s~l~~~------~~---v~~~~~~--~~p~~~~ 132 (352) T protein:vir:78 64 ILPNEFEKPSMEAQRLLHALPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLREK------AR---LTNIKGL--EIPRVSY 132 (352) T ss_pred hhhhHHHHHHhhHHHHHHHhccCCCCCCceeccHhHHHHHHHHHHhhcchhhh------ee---eEecCCc--eEEEEec Confidence 1100 00 000000 01 00 112224567877766666555555444221 11 1112232 4555443 Q ss_pred CCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITAS 153 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~ 153 (346) -.+++.-+.|+. .++..+.+-++-.-..++.+--..+++..-.-+.-|..+.+.+++++.+.+.....++..-.|.-.. 
T Consensus 133 ~~~~a~~v~E~~-~~~~~~~~f~~v~~~~~k~~~~i~is~ell~Ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~~~~ 211 (352) T protein:vir:78 133 TLDDDDFITDVE-TAKELKLKGDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPKSGLE 211 (352) T ss_pred CCCccccccccc-ccccccccceeeeecceeEEeechhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHhhhhcCCCCccc Confidence 334555566764 4555555433333333344444566665444455677888999999999876555555322111000 Q ss_pred hhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhh--hhhcccccCceeeEEeceEEE Q lcl|NC_015254. 154 GALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGL--IEFMLDSDNKKFPTYMGKRVI 231 (346) Q Consensus 154 ~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~l--i~~~~~s~~~~i~~~~G~~VV 231 (346) ...- ....+... +....++.|.++...+-.....-.+|+||+.++..|++..- -.++. .+.-.+++|+||+ T Consensus 212 ~g~l-~~~~~~~~---t~~~~~d~i~~~~~~l~~~~~~~a~~~mn~~t~~~l~~~~~~~~~~~~---~~~~~~llG~PV~ 284 (352) T protein:vir:78 212 HMSF-YNGSVKEV---EGANMYDAIINALADLHEDYRDNATIYMRYADYVKIISVLSNGTTNFF---DTPAEKVFGKPVV 284 (352) T ss_pred ccce-eccccccc---cccchHHHHHHHHhccChhhhcCCEEEEehHHHHHHHHHHhccCCccc---ccCCccccccceE Confidence 0000 00011111 11124788888888776666666889999999988765310 00111 1112478999999 Q ss_pred EeCCCccCCCceEEEEEcCCeeEEee-cCCccceeeeecCCcceeEEEEeeEEeeeee---eeeec-cccccCCCCC Q lcl|NC_015254. 232 VDDGLPAKDGVYTSYIFGEGAFGLGN-GEAPVPTETDREKLKGNDILINRQHFLLHPR---GIAWQ-EKSVAGHSPT 303 (346) Q Consensus 232 vdD~~p~~~g~ytt~l~~~GAi~~~~-~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~---G~s~~-~~~~~~~sPt 303 (346) ++|.++. .+|| -|.+.. ......++..++...|...+..+.++-..|. 
-|..- -++.++.-|+ T Consensus 285 ~~~~~~~-------~~~G--df~~~~~~~~~~~~~~~~~~~~g~~~f~~~~r~Dg~~~~~eA~~~l~~~a~~~~~~~ 352 (352) T protein:vir:78 285 FTDAAVK-------PIVG--DFNYFGINYDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKESTGSLPS 352 (352) T ss_pred EecCCCc-------eeEe--ehhhhhhhhhhheeeeeccccCCeeEEEEEeeeCceeechhheEEEEeecccCCCCC Confidence 9997753 2222 121111 1223456666776777777776666655543 22221 1122334455 No 149 >protein:vir:105645 Length: 400 # NCBI annotation: putative major capsid protein # Family: family:all:2806 # MgeID: mge:1674 # MgeName: K1E # Cross-refs: genbank:acc:YP_425009;genbank:gi:83571757;uniprot:Q2WC43;genbank:GeneID:3837286 Probab=98.16 E-value=3.1e-07 Score=56.28 Aligned_cols=317 Identities=15% Similarity=0.119 Sum_probs=165.2 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) |-.-= ||++= +...+--.+-+-=|+|.--|...+.++..|.. .-.+.+ -.+|+++.+|+-+.. .++. T Consensus 1 Ms~~n--~~t~p-~~~gsg~~~aL~Le~f~GeV~taF~~~si~~~------~~~vRt--I~~gkS~qf~~lG~s--~a~y 67 (400) T protein:vir:10 1 MSTPN--NLTNV-AVSASGEVDSLLIEKFNGKVNEQYLKGENIMS------YFDVQT--VTGTNTVSNKYLGET--ELQV 67 (400) T ss_pred CCCCc--ccccc-ccccccchhhhHHhHhcchHHHHHHHHhhhcc------cceeee--ecccceEEEEEeeee--EEee Confidence 32110 11111 00001111112235565556555555554421 111111 237999999998765 3444 Q ss_pred cCCCccccchhhcccceeEEEEEe-ecCcceechHHHhhhcch-HHHHHHHHHHHHHHHHHHHHHHHHH-Hhhhhhhh-- Q lcl|NC_015254. 81 LDDGEGALTPGNISAAKDIARLHM-RGKAWRTNDLAKALSGDD-PMRAIGDLVVEYWNRRRQAVLIASL-NGITASGA-- 155 (346) Q Consensus 81 ~~dg~~~it~~~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~d-p~~~i~~q~a~~~~~~~~~~lla~L-~G~~~~~~-- 155 (346) ..-|+ .+....+...+.+-+|=. .---..+.|+......=| +-.+++++++.+.++.+|..+|..+ .+.++.+. 
T Consensus 68 ~~pG~-~ldg~~~~~dk~~ItIDtLL~a~~~V~dlDd~q~~yD~vRse~s~e~G~ALA~~~Dq~iiq~i~~a~~a~t~~~ 146 (400) T protein:vir:10 68 LAPGQ-SPAATSTQADKNQLVIDATVIARNTVAHLHDVQGDIDSLKPKLATNQAKQLKKMEDEMLIQQMLLGGIANTQAK 146 (400) T ss_pred ecCCC-CcCCCCcccCcEEEEeCceeeecchhhhHHHHhhccccccHHHHHHHHHHHHHHHHHHHHHHHHHhcccccccc Confidence 45554 456666666655433322 222356788887777777 7899999999999999999888755 33333211 Q ss_pred -----hhhc--ceeeeccccccccccHHHHH----HHHHHhCccc--cCceEEEEchHHHHHHHhh-hhhhhccc-c-c- Q lcl|NC_015254. 156 -----LDSN--KLDVSTETGDDSYFTGDTFL----SATYKLGDAE--GKLTGIAMHSQTEMNLRKQ-GLIEFMLD-S-D- 218 (346) Q Consensus 156 -----~~~~--~~dis~~~~~~~~~~~~~l~----~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~-~li~~~~~-s-~- 218 (346) ...+ ...+++.. +....+...|. +|.+.|-++. ..-.++++.|..|.-|+.. +|++.... + + T Consensus 147 ~~~~~g~~~g~s~~v~~~~-~~~~~~~~~l~~A~~~A~~~LdEkdVP~~d~vvl~pp~~Ys~Ll~~dkLvnrdf~~s~~g 225 (400) T protein:vir:10 147 RTNPRVKGHGFSVNVEVNE-GEALVNPQYVMAAVEFALEQQLEQEVDISDVAILMPWRYFNVLRDADRIVDKSYTISQSG 225 (400) T ss_pred cccCCccccccceeecccc-cccccCHHHHHHHHHHHHHHHHhcCCCccceEEEcCHHHHHHHHhCCcccchhccccCCC Confidence 1112 22232222 22223544444 4555443221 2224666666777666653 47754432 2 2 Q ss_pred ---CceeeEEeceEEEEeCCCccCC-------------C----------ceEEEEEcCCeeEEeecCCccceeeeecCCc Q lcl|NC_015254. 219 ---NKKFPTYMGKRVIVDDGLPAKD-------------G----------VYTSYIFGEGAFGLGNGEAPVPTETDREKLK 272 (346) Q Consensus 219 ---~~~i~~~~G~~VVvdD~~p~~~-------------g----------~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~ 272 (346) .+.+..++|++|+.+..+|... | +-...+|-+-|++..... ++.-|+.||+.. 
T Consensus 226 ~~~~g~v~~v~Gv~Iv~Sn~lP~~a~~~~~~~lS~a~~G~~y~~t~d~s~~~av~F~~sAv~tvk~~-~lt~~~~~d~r~ 304 (400) T protein:vir:10 226 ATIQGFVLSSYNCPVIPSNRFPKYSQGQKHHLLSNEDNGYRYDPIAEMNGAIAVLFTADALLVGRSI-DVIGDIFYEKKE 304 (400) T ss_pred ccccceEEEEeceEEEeeCcCCcccCcccccccccCCCCccCCccccccceeEEEEehhheEEEEee-ccccccccchhh Confidence 2567899999999999998532 1 112456778888876555 456788899988 Q ss_pred ceeEEEEeeEEeeeeee------eeeccccccC--CCCC--hHHhcCCcCceeeecccccceEEEEEecccccccCCCCC Q lcl|NC_015254. 273 GNDILINRQHFLLHPRG------IAWQEKSVAG--HSPT--NTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAP 342 (346) Q Consensus 273 g~~~l~~r~~~~~~~~G------~s~~~~~~~~--~sPt--~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~ 342 (346) ..+.|.+.+.|++.|+= +...+....+ ..|. -..+-+-+|-.. +.+|+ . +.....+| T Consensus 305 ~~~~id~~~a~G~g~~RPeaa~vv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~---------~~~~~-~---~~~~~~~~ 371 (400) T protein:vir:10 305 KTYYIDTFMSEGAIPDRWEAVSVVTTKRQSTGAVDSGNAAQHTQVLNRAQRKA---------VYVKN-A---APAGAFAA 371 (400) T ss_pred HHHHHHHHHHhCCcccchhheEEEEecCCcccccccCcchhHHHHHhhcccce---------EEEec-c---cccccccc Confidence 88888888888877752 1121111100 0111 111112222211 22222 1 12233455 Q ss_pred CCCC Q lcl|NC_015254. 343 EGIK 346 (346) Q Consensus 343 ~~~~ 346 (346) .|+. T Consensus 372 ~~~~ 375 (400) T protein:vir:10 372 ASLS 375 (400) T ss_pred cccc Confidence 5555 No 150 >protein:vir:3158 Length: 321 # NCBI annotation: capsid protein gpE # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:316 # MgeName: PhiCh1 # Cross-refs: genbank:acc:NP_665929;genbank:gi:22091115;genbank:GeneID:951342 Probab=98.04 E-value=2.7e-06 Score=51.11 Aligned_cols=283 Identities=11% Similarity=0.140 Sum_probs=128.3 Q ss_pred Cccce-ecceeeecCCceeeeee-----ccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccC Q lcl|NC_015254. 
1 MIKKL-RMNLQKFAAGKNTRIAD-----VIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDL 74 (346) Q Consensus 1 ~~~~~-~~~~q~~~a~~~T~l~d-----~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l 74 (346) |=.|+ .=+||..+.-+....++ .+.|++...++ ++..+.+.|++--=+ .........+|.|.. T Consensus 1 ~~~k~~~~~l~~~~~~~~~~~~~~~~g~~v~~~~~~~l~-~~i~e~s~~l~~i~v---------~~v~~~~~~i~~~~~- 69 (321) T protein:vir:31 1 MASRTINNDLSRITEKNALTVDDLDAGGTLPDPLWDEFW-TDMIEETPLLDAIRT---------ETVGAKKTRIPTLNI- 69 (321) T ss_pred CchHHHHHHHHHHHHhccccccccCCcceeCHHHHHHHH-HHHHHhhhhhhhcee---------eeccCcceeeeeecc- Confidence 54444 33456654211122222 46666655544 445666666542111 112233456677653 Q ss_pred CCcccc-cCCCccccchhhcccceeEEEEEeecCcceechHHHhhh--cchHHHHHHHHHHHHHHHHHHHHHHHHH---- Q lcl|NC_015254. 75 TGEDEI-LDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALS--GDDPMRAIGDLVVEYWNRRRQAVLIASL---- 147 (346) Q Consensus 75 ~g~ae~-~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~--g~dp~~~i~~q~a~~~~~~~~~~lla~L---- 147 (346) ++.... ..++.+.....+.+.++..-..++..--+.+++.--.-+ +.|..+.+.+++++.++.+.+...+.-= T Consensus 70 ~~~~~~~~~e~~~~~~~~~~~~~~~~~~~~k~~~~~~it~e~L~d~a~~~d~e~~i~~~ia~~~a~~~~~~~~nGd~~~~ 149 (321) T protein:vir:31 70 GERHRRPQDEGEWNENESDVSTGTIDISTEKATVAWDLPREVVQENPEGEALADRILNLMTDAWSADVEDLAANGDEDAE 149 (321) T ss_pred CCcccccccccccccccccceeeeeeeeeEEEEeehhccHHHHHhhhcchhHHHHHHHHHHHHHHHHHHhheeeccccCC Confidence 232222 223332333333333333333333333444555433322 4588888999999999998887655310 Q ss_pred -------HhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCcccc--CceEEEEchHHHHHHHhhhhhh---hc- Q lcl|NC_015254. 148 -------NGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEG--KLTGIAMHSQTEMNLRKQGLIE---FM- 214 (346) Q Consensus 148 -------~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~~~~~L~~~~li~---~~- 214 (346) +|.+.. +..+. .....+...++.+.|.+....+-.... .-.+|+||+.++..+++. 
+.+ .+ T Consensus 150 ~~~~~~n~G~l~~--a~~~~---~~~~~~~~~~~~d~l~~l~~~l~~~yr~~~~~v~im~~~~~~~~~~~-l~~~~~~~~ 223 (321) T protein:vir:31 150 DSFENQNDGFITV--AEGDV---ETIDAADDILDNDLVIRTIAGLDSKYRARMNPALIVSEDQLLSYHYT-LTDRDTPLG 223 (321) T ss_pred Ccccccchhhhhh--hcccc---ccccccccccCHHHHHHHHHhccHhHhcCCCeEEEechHHHHHHHHH-HhcCCCccc Confidence 111110 00011 111123345788899999888865432 334799999998776542 111 10 Q ss_pred -ccccCceeeEEeceEEEEeCCCccCCCceEEEEEcC---CeeEEeecCCccceeeeecCC--c-c----eeEEEEeeEE Q lcl|NC_015254. 215 -LDSDNKKFPTYMGKRVIVDDGLPAKDGVYTSYIFGE---GAFGLGNGEAPVPTETDREKL--K-G----NDILINRQHF 283 (346) Q Consensus 215 -~~s~~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~---GAi~~~~~~~~~~vE~dRd~~--~-g----~~~l~~r~~~ 283 (346) ..-.++...+++|+||+.++.||... +++++ =++++.. .+.++..|+.. . + ...+..+..| T Consensus 224 ~~~l~~~~~~tl~G~pvv~~~~mP~~~-----il~t~~~nl~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 295 (321) T protein:vir:31 224 DNVIMGEADVNPFSFPIIGSGLWPDDK-----AMFTDPQNLIYALYR---DLEIDVLTESDKVSERDLHARYFMRGDDDF 295 (321) T ss_pred cchhhccccccccceeEEEcCCCCCCc-----EEEeccccEEEEEee---ccEEEEeecCccccccceeeEeeeeeecce Confidence 11113344579999999999999763 33332 1222221 22333333321 1 1 1122233333 Q ss_pred eeeeee-eeeccccccCCCCChHHhcCCcC Q lcl|NC_015254. 284 LLHPRG-IAWQEKSVAGHSPTNTEIEKGNN 312 (346) Q Consensus 284 ~~~~~G-~s~~~~~~~~~sPt~a~L~~~~N 312 (346) ++--++ +.+ +.+.-=..+-|...+. T Consensus 296 ~ve~~~a~a~----~~~i~~~~~~~~~~~~ 321 (321) T protein:vir:31 296 AIENTEAVVL----AEGLGDPLEHLEEETS 321 (321) T ss_pred eEeccccEEE----EecCCcchhcccCCCC Confidence 333222 111 1111000111111111 No 151 >protein:vir:93696 Length: 364 # NCBI annotation: Bcep22gp55 # Family: family:all:974 # MgeID: mge:1470 # MgeName: Bcep22 # Cross-refs: genbank:acc:NP_944284;genbank:gi:38640361;genbank:GeneID:2658350 Probab=98.04 E-value=4.3e-07 Score=55.48 Aligned_cols=288 Identities=13% Similarity=0.106 Sum_probs=153.9 Q ss_pred cceeeecCCceeeeeeccchH---HHHHHHhhHhHHHH----hHhhccccccchhHHHHhhCCCcEEEecccccCCCccc Q lcl|NC_015254. 
7 MNLQKFAAGKNTRIADVIVPE---VFNKYVTERTAESS----ALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDE 79 (346) Q Consensus 7 ~~~q~~~a~~~T~l~d~i~Pe---v~~~yv~~~~~~~~----~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae 79 (346) |=...|.++ +|+ +|+.-+.....+++ +|...|==.+-..+..|-.++|++|+++.-..|.|+- T Consensus 1 Ma~T~~~~~---------~p~a~~~ws~~l~~~~~~~s~f~~~l~G~~~~~~I~~~~dL~k~~Gd~v~f~L~~~L~g~g- 70 (364) T protein:vir:93 1 MSQTVIPFG---------DPKAVKRWSADLAVDVRKKSYFEQRFIGTSENAVIQRKTELESDAGDRITFDLSVHLRGKP- 70 (364) T ss_pred CceeccCcC---------CHHHHHHHHHHHHHHHHhhCccccccccCCCCCcEEEeeecCCCCCceEEeeeeeecccCC- Confidence 433444332 344 33333333333333 3443322112222233556789999999999997652 Q ss_pred ccCCCccccc--hhhcccceeEEEEEeecCcceec-hHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhh Q lcl|NC_015254. 80 ILDDGEGALT--PGNISAAKDIARLHMRGKAWRTN-DLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGAL 156 (346) Q Consensus 80 ~~~dg~~~it--~~~lt~~~~~a~~~~~~k~~~~t-D~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~ 156 (346) +.. + ..++ -+.|+..++..++=....++... .+++..+.=|...++.+.++.||++.+|..++-.|.|..+.+.. T Consensus 71 v~G-d-~~leGnee~L~~~~~~i~idq~r~~V~~~g~ms~qRt~~dlr~~ar~~L~~w~~~~~d~~~f~~laGarg~~~~ 148 (364) T protein:vir:93 71 TYG-D-ARVEGKEESLRFYQDEVRIDQVRHSVSAGGRMSRKRTVHNIRRIARDRLGDYFYKFTDELLFIYLSGARGINLD 148 (364) T ss_pred ccc-C-ceeeccccceeEEeeEEEEeeccccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccccc Confidence 321 1 1222 26788999998888888887543 35556667788899999999999999999999999885443210 Q ss_pred ------------h-------hcce-----eeeccccccccccHHHHHHHHHHh---Ccc-------------ccCceEEE Q lcl|NC_015254. 157 ------------D-------SNKL-----DVSTETGDDSYFTGDTFLSATYKL---GDA-------------EGKLTGIA 196 (346) Q Consensus 157 ------------~-------~~~~-----dis~~~~~~~~~~~~~l~~A~~~~---GD~-------------~~~~~~iv 196 (346) + ++.+ .......++-.++.+.+.+|.... |-. 
.++..+++ T Consensus 149 ~~~~~~~~~~~~N~v~aPt~~r~~~~~~at~~~~l~stD~~sl~~id~a~~~a~~~~~~~~~~~~~~Pv~~~g~~~yV~~ 228 (364) T protein:vir:93 149 FIETPDFTGYAGNPLDAPDVDHLLYGGVATSKASLAATDIMAPLVIEKAVEKAAMMQAENPDVANMVPVSIDGDDHYVCV 228 (364) T ss_pred cccccCcccccccccCCCCCCcEEeccccCchhhccccccccHHHHHHHHHHHHHhCCCCCCCcccceeEecCcceeEEE Confidence 0 0000 001111233468889998887753 211 12567999 Q ss_pred EchHHHHHHHhh---hhhhhccccc----------CceeeEEeceEEEEeCCCc------cCCCc--eEEEEEcCCeeEE Q lcl|NC_015254. 197 MHSQTEMNLRKQ---GLIEFMLDSD----------NKKFPTYMGKRVIVDDGLP------AKDGV--YTSYIFGEGAFGL 255 (346) Q Consensus 197 mhS~~~~~L~~~---~li~~~~~s~----------~~~i~~~~G~~VVvdD~~p------~~~g~--ytt~l~~~GAi~~ 255 (346) |||-.+.+|+.. +++++.++.. .|.++.|+|+.|.---+++ ....+ -.++++|.=|+++ T Consensus 229 l~p~q~~~Lr~~t~~~w~d~qk~A~~~~g~~nPlF~G~~gm~ngvii~~~~~vi~~~~~~~~~~v~~~ralllGaQA~~~ 308 (364) T protein:vir:93 229 MSEYQATDMRTAAGGTWIDFQKAAAAAEGRNNPIFKGGLGMINNVVLHKHRNVIRFNDYGAGANVEAARALFMGRQAGVI 308 (364) T ss_pred EcchhhhhhhhcCCHHHHHHHHHhhhcccccCCceecCeeeEcCeEEeccCCcccccccccCccccchhhheecceeeEE Confidence 999999999954 3556665432 2567999999664433332 12222 2358888777665 Q ss_pred eecCC----ccceeeeecCCcceeEEEEeeEEeeeeeeeeeccccccC-CCCChHHhcC Q lcl|NC_015254. 256 GNGEA----PVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKSVAG-HSPTNTEIEK 309 (346) Q Consensus 256 ~~~~~----~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~~~~-~sPt~a~L~~ 309 (346) ..++. +.-.|-..|-.....+.+ ..++.+.-..|.+...+- .=||-+.+-. 
T Consensus 309 a~g~~~g~~~~w~Ee~~D~gn~~~i~~---~~i~G~kK~rF~~~DfGvi~idtaa~~~~ 364 (364) T protein:vir:93 309 AYGTANGLRFDWEETVKDYGNEPAIAA---GFIAGMKKARFNNKDFGVISIDTAAKKHS 364 (364) T ss_pred EeecCCCCCceeeecccCCCCchhhhh---hhHhhhhhcccCCccceEEEecccccccC Confidence 54442 222343333322222111 011111111221111110 0122222222 No 152 >protein:vir:80128 Length: 466 # NCBI annotation: Phage capsid protein # Family: family:all:635 # MgeID: mge:1877 # MgeName: bacteriophage bv1 # Cross-refs: genbank:acc:YP_001425603;genbank:gi:155042936;genbank:GeneID:5469556 Probab=97.99 E-value=1e-06 Score=53.44 Aligned_cols=296 Identities=13% Similarity=0.086 Sum_probs=130.4 Q ss_pred Cccce------ecceeeec------CCceeee--eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEE Q lcl|NC_015254. 1 MIKKL------RMNLQKFA------AGKNTRI--ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMI 66 (346) Q Consensus 1 ~~~~~------~~~~q~~~------a~~~T~l--~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti 66 (346) |..+. +-.++.|. +...+.. ...++|+-+.+.+.+.+.+.+.+.+..-+.+ ..| +. T Consensus 123 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~vP~~~~~~i~~~l~~~~~l~~~~~v~~---------~~g-~~ 192 (466) T protein:vir:80 123 MPYEQRAALIARSEVKEFLAQVRTLAQQKRAVSGAELTIPDVMLELLRDNMHRYSKLISKVRLRP---------LKG-TA 192 (466) T ss_pred hhhhhHHHHHHHHHHHHHHHHHHHHhhhhhhhccccccccHHHHHHHHHhhhhhhhhhhheeeee---------cCc-ee Confidence 10000 00011110 0111112 2368899888877777666665544221111 122 34 Q ss_pred EecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 67 NMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS 146 (346) Q Consensus 67 ~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~ 146 (346) .+|.-.... .+.-+.|+. 
.++..+.+-++-.-..+..+.-+.+++..-.-+..|..+.+.+++++.+....+..+|.- T Consensus 193 ~~~~~~~~~-~a~wv~E~~-~~~~~~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~ail~G 270 (466) T protein:vir:80 193 RQNIAGAIP-EGVWTEAVA-NLNELSLSFSQIEVDGYKVGGFIPIPNSTLEDSDLNLADEILDAIGQAIGFALDKAILYG 270 (466) T ss_pred EeeeecCCc-ceeeccccc-ccccccccccceeecceeeeeehhhhHHHHhcchHHHHHHHHHHHHHHHHHHHhhheeec Confidence 666555542 232345553 455444443333333444444456666665556667788899999999999999876641 Q ss_pred H-----HhhhhhhhhhhcceeeeccccccccccHH--------------HHHHHHHH----hCccccCceEEEEchHHHH Q lcl|NC_015254. 147 L-----NGITASGALDSNKLDVSTETGDDSYFTGD--------------TFLSATYK----LGDAEGKLTGIAMHSQTEM 203 (346) Q Consensus 147 L-----~G~~~~~~~~~~~~dis~~~~~~~~~~~~--------------~l~~A~~~----~GD~~~~~~~ivmhS~~~~ 203 (346) - .|++...........-...+.....++.. .+.+.... ..-......+|+||+.++. T Consensus 271 ~G~~~P~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~w~~~~~~~~ 350 (466) T protein:vir:80 271 TGTKMPVGIVTRLAQTTQPPNWGTKAPAWTNLSTTNLLKIDPTGKSAEEFFSELVLKLSKARANYSNGMKFWAMSSNTHA 350 (466) T ss_pred cCCCCcceeeecccccccccccccccccccccchhhhhhhhhhccchhhHHHHHHHHHHhhhccccCCceeEEecchhHH Confidence 0 12222111100000000000000011111 12222111 1222344567999999999 Q ss_pred HHHhhhhhh-----hcccccCceeeEEeceEEEEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecC--CcceeE Q lcl|NC_015254. 204 NLRKQGLIE-----FMLDSDNKKFPTYMGKRVIVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREK--LKGNDI 276 (346) Q Consensus 204 ~L~~~~li~-----~~~~s~~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~--~~g~~~ 276 (346) .|+...+.. ++....+ -..++|++|++++.||.+. .-+.+.... .+...+ .+.++.+++. ..+++. 
T Consensus 351 ~l~~~~~~~~~~g~~~~~~~~--~~~i~G~pvv~s~~~~~~~---~~~g~~~~y-~i~~r~-~~~i~~~~~~~f~~d~~~ 423 (466) T protein:vir:80 351 VLMSKAITFNSAGALVASLNN--TMPIVGGDIVILDFIPDND---IIGGYGSLY-LLAERA-DIKLAQSEHVRFIEDQTV 423 (466) T ss_pred HhhcccccccCCccccccCCC--cccccccceeecCccCccc---eeeeccccE-EEEeec-ceEEEechhhhhhcCcEE Confidence 988765321 1111111 1247899999999999864 111122222 232222 3334444433 245666 Q ss_pred EEEeeEEeeeeee-eeeccccccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCCCCC Q lcl|NC_015254. 277 LINRQHFLLHPRG-IAWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEGIK 346 (346) Q Consensus 277 l~~r~~~~~~~~G-~s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~~~ 346 (346) +.+..++...|+- -+|.--..++.+| .....-.+..|++ T Consensus 424 ~r~~~r~dg~~~~~~afv~~~~~~~~~-------------------------------~~~~~~~~~~~~~ 463 (466) T protein:vir:80 424 FKGTARYDGKPVFGEGFVAVNIANANP-------------------------------TTSITFAPDEANV 463 (466) T ss_pred EEEEEEEccEEeccCceEEEEecCCCc-------------------------------ccceeeecCcCcC Confidence 6666666554431 0111001111111 1111111111221 No 153 >protein:vir:79008 Length: 299 # NCBI annotation: putative main capsid protein # Family: family:all:701 # MgeID: mge:1861 # MgeName: phiC2 # Cross-refs: genbank:acc:YP_001110725;genbank:gi:134287342;genbank:GeneID:4955182 Probab=97.92 E-value=1.1e-05 Score=47.83 Aligned_cols=267 Identities=10% Similarity=-0.007 Sum_probs=130.5 Q ss_pred eeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhcccce-e Q lcl|NC_015254. 20 IADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISAAK-D 98 (346) Q Consensus 20 l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~-~ 98 (346) ++-+=-.|.|.+.+.+++.+.+- +|.+...+.....--.||++|.+|....- |-. +-+.+..--..+.++... . 
T Consensus 1 MA~~n~a~~~~~~Ld~~~~~~l~---~~~L~~~~~~~~v~~~gg~tVkI~~i~~~-gl~-DY~R~~~g~~~g~~~~~~~t 75 (299) T protein:vir:79 1 MAALNYAKEYSNVLAQAYPYTLN---FGDLYATPNNGRYRWTGSKTIEIPTISTT-GRV-DSNRDTIAVAQRNYDNAWEP 75 (299) T ss_pred CccchhHHHHHHHHHHHHHhhce---eeeeccCcccceeeecCCCEEEEeccccc-ccc-ccccCCCcccccccCcceeE Confidence 33111248888888888776653 35444333322212247999999988652 332 222111112333455333 3 Q ss_pred EEEEEeecCcceechHHHhhhcc-hHHHHHH-HHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccccccccHH Q lcl|NC_015254. 99 IARLHMRGKAWRTNDLAKALSGD-DPMRAIG-DLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFTGD 176 (346) Q Consensus 99 ~a~~~~~~k~~~~tD~a~~~~g~-dp~~~i~-~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~ 176 (346) ...-+.|+.+|.+.++...-+.. -.++.+. ++........+|+..++.|-... +.....+.. ...+.+ . -++ T Consensus 76 ~~ldqdr~~~f~vD~~Dvdet~~~~~~a~v~~~~~~~~v~pEiDay~~skl~~~a--~~~g~~~~~-~~~T~~--n-~y~ 149 (299) T protein:vir:79 76 KVLTNQRKWSTLVHPADINQTNYVASIGNITKVYNEEQKFPEMDAYCISKIYADW--TALGNTADT-TVLTTT--N-VLE 149 (299) T ss_pred EEeeccccceeccchhhHHHHhhhhHHHHHHHHHHHHHhhhHhhHHHHHHHHHhh--hhcCCcccc-cccCHH--H-HHH Confidence 44456688888888665443322 2334333 33334445556777777653110 111111000 000111 1 257 Q ss_pred HHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhhhhccccc------CceeeEEeceEEEE--eCCCcc-------- Q lcl|NC_015254. 177 TFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLIEFMLDSD------NKKFPTYMGKRVIV--DDGLPA-------- 238 (346) Q Consensus 177 ~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li~~~~~s~------~~~i~~~~G~~VVv--dD~~p~-------- 238 (346) .|.++..+|-+.. ..-..++|.|.++.-|.+...+....... ++.|+.+.|++|+. ++.|+. 
T Consensus 150 ~i~~~~~~lde~~vP~~~rvl~vtp~~~~~L~~~~~f~k~~~~~~~~~~~~g~Vg~idG~~Ii~Vps~r~~t~~~~~~G~ 229 (299) T protein:vir:79 150 VFDKLMEKMTEARVPENGRILYVTPVVNTLIKNAKEIQRTVNIKDAGTSLNRQTTDIDTVKIIKVPSNLMKTAYDFTTGW 229 (299) T ss_pred HHHHHHHHHHhcCCCCCCeEEEeCHHHHHHHhhchhhhcccccccccceeeeeeeeecceEEEEechhhcCccceeccCc Confidence 7888888887654 34589999999999999876554333222 24578999999986 555543 Q ss_pred ---CCCceEEEEEcC-CeeEEeecCCccceeee-ecCCcceeEEE-EeeEEeeeee-----eeeeccccccC Q lcl|NC_015254. 239 ---KDGVYTSYIFGE-GAFGLGNGEAPVPTETD-REKLKGNDILI-NRQHFLLHPR-----GIAWQEKSVAG 299 (346) Q Consensus 239 ---~~g~ytt~l~~~-GAi~~~~~~~~~~vE~d-Rd~~~g~~~l~-~r~~~~~~~~-----G~s~~~~~~~~ 299 (346) ..++...|++.+ +|+.-..+-.. +... -+.....+.++ .|..+-+-++ |+-..-++..+ T Consensus 230 ~~~~~ak~in~ii~~~~a~~~~~K~~~--~~~~~P~~~~~~~~~~~~r~y~d~~v~~nk~~~i~~~~~~a~~ 299 (299) T protein:vir:79 230 KVGAGAKQIFMSLVHPSAIITPVSYQF--SKLDEPTAVTEGKYFYFEESFEDVFILNKKADAIQFVVEGAGA 299 (299) T ss_pred cccCcccccceEEEcCCeeeeeEeeee--EEeecCCCCCccceeeeeeeeeeeeeeccccCeEEEEeeecCC Confidence 123334455443 44332221111 1111 11122223443 3443333332 33222222222 No 154 >protein:vir:7019 Length: 401 # NCBI annotation: major capsid protein # Family: family:all:2806 # MgeID: mge:141 # MgeName: SP6 # Cross-refs: genbank:acc:NP_853592;genbank:gi:31711674;genbank:GeneID:1481800 Probab=97.87 E-value=1.8e-06 Score=52.05 Aligned_cols=322 Identities=15% Similarity=0.115 Sum_probs=163.2 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEI 80 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~ 80 (346) |-.- =||+.=. ...+--.+-+-=|+|.--|...+.++..|.. .-.+.+ -.+|+++.+|+-+.. .++. 
T Consensus 1 Ms~~--n~~t~~~-~~~sg~~~al~Le~f~GeV~taF~~~si~~~------~~~vRt--i~~gkS~qf~~~G~s--~~~~ 67 (401) T protein:vir:70 1 MSTP--NNLTNVA-VSASGEVDSLLIEKFNGKVNEQYLKGENIMS------YFDVQT--VTGTNTVSNKYLGET--ELQV 67 (401) T ss_pred CCCC--ccccccc-cccccchhHhHHhHhcchHHHHHHHHhhhcc------cceeee--ecccceEEEEEeeee--Eeee Confidence 3211 0111111 0001111112235555555555555554421 111111 237999999998765 3444 Q ss_pred cCCCccccchhhcccceeEEEEEe-ecCcceechHHHhhhcch-HHHHHHHHHHHHHHHHHHHHHHHHHH-hhhhhhhh- Q lcl|NC_015254. 81 LDDGEGALTPGNISAAKDIARLHM-RGKAWRTNDLAKALSGDD-PMRAIGDLVVEYWNRRRQAVLIASLN-GITASGAL- 156 (346) Q Consensus 81 ~~dg~~~it~~~lt~~~~~a~~~~-~~k~~~~tD~a~~~~g~d-p~~~i~~q~a~~~~~~~~~~lla~L~-G~~~~~~~- 156 (346) ..-|+ .+..+.+...+.+-+|=. .---..+.|+....+.=| +-.+++++++.+.++.+|..++..++ +.++.... T Consensus 68 ~~pG~-~ld~~~~~~dK~~ItID~lL~a~~~V~dlDe~q~~yD~vRse~s~e~G~ALA~~~Dq~iiq~i~~aa~ana~~~ 146 (401) T protein:vir:70 68 LAPGQ-SPAATSTQADKNQLVIDATVIARNTVAHLHDVQGDIDSLKPKLATNQAKQLKRMEDEMLIQQMMLGGIANTQAK 146 (401) T ss_pred ecCCC-CcCCCCcccccEEEEeCceeehhhhhhhHHHHHhcccccchHHHHHHHHHHHHHHHHHHHHHHHHhcccccccc Confidence 45554 466667766665433322 222367888888877767 66799999999999999998877664 22221111 Q ss_pred ------hhc--ceeeeccccccccccHHHHH----HHHHHhCcc--ccCceEEEEchHHHHHHHh-hhhhhhcc--ccc- Q lcl|NC_015254. 157 ------DSN--KLDVSTETGDDSYFTGDTFL----SATYKLGDA--EGKLTGIAMHSQTEMNLRK-QGLIEFML--DSD- 218 (346) Q Consensus 157 ------~~~--~~dis~~~~~~~~~~~~~l~----~A~~~~GD~--~~~~~~ivmhS~~~~~L~~-~~li~~~~--~s~- 218 (346) ..+ ..++.. .......+...|. +|.+.|-++ ...-.++++.|..|.-|++ .+|++... .+. 
T Consensus 147 ~~~p~~~~~G~~i~v~~-~~~~~~~~~~~l~~ai~dA~~~LdEkdVP~~r~vvl~pp~~Ys~Ll~~d~L~nrd~~~s~~g 225 (401) T protein:vir:70 147 RTNPRVKGHGFSINVEV-AEGEALVNPQYVMAAVEFALEQQLEQEVDISDVAILMPWRYFNVLRDADRIVDKTYTISQSG 225 (401) T ss_pred ccCCCcCCCceEEeccc-cccccccCHHHHHHHHHHHHHHHHhcCCCccceEEEcCHHHHHHHHhcCcccchhhccccCC Confidence 111 122222 1222334444444 555554322 2233455566666666655 35776432 222 Q ss_pred ---CceeeEEeceEEEEeCCCccCC-------------C-c---------eEEEEEcCCeeEEeecCCccceeeeecCCc Q lcl|NC_015254. 219 ---NKKFPTYMGKRVIVDDGLPAKD-------------G-V---------YTSYIFGEGAFGLGNGEAPVPTETDREKLK 272 (346) Q Consensus 219 ---~~~i~~~~G~~VVvdD~~p~~~-------------g-~---------ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~ 272 (346) ++.+..+.|++|+.+..+|... | . -...+|-+-|++..... ++.-|+.||+.. T Consensus 226 ~~~~G~v~~vaGv~Vv~SnnlP~~a~~it~~~ls~a~~G~~y~~~~d~s~~~~v~f~~~Av~tvk~~-~lt~~~~~d~r~ 304 (401) T protein:vir:70 226 ATIQGFTLSSYNCPVIPSNRFPKYSQGQTHHLLSNEDNGYRYDPLPAMNGAIAVLFTADALLVGRSI-DVTGDIFYEKKE 304 (401) T ss_pred ccccceEEEEeceEEEeeccccccccccccccccccCCCccCCCCccccceeEEEEehhheEEEEee-ccccchhhhhhh Confidence 3567899999999999998632 1 1 12456777888876554 456788888888 Q ss_pred ceeEEEEeeEEeeeeeeeeeccccc---cCCCCChHHhcCCcCceeeecccccceEEEEEecc--cccccCCCCCCCCC Q lcl|NC_015254. 273 GNDILINRQHFLLHPRGIAWQEKSV---AGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNG--VPGKKKETAPEGIK 346 (346) Q Consensus 273 g~~~l~~r~~~~~~~~G~s~~~~~~---~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~--~~~~~~~~~~~~~~ 346 (346) ..+.|.+.+.|++.|+= |.-..+ ....-|-+.+++.- +.|.+-+++=-.|.- -+...-..++.++. 
T Consensus 305 ~~~~id~~~a~g~g~~R--Peaa~vv~~k~~~~~~~~~~~~~------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 375 (401) T protein:vir:70 305 KTYYIDTFMAEGAIPDR--WEAVSVVTTKRNTTTGAVEGTDG------AQHTIVKNRAQRKAVYVKNAAPVAAAAASLS 375 (401) T ss_pred hHHHHHHHHHhCCcccc--hhheEEEeecCcccccccccCCc------chhhhhhhhccceeEEeccccchhhhccccc Confidence 88888777777766541 210000 00001111222110 011222221111111 01113345566666 No 155 >protein:vir:7855 Length: 497 # NCBI annotation: gp12 # Family: family:all:585 # MgeID: mge:150 # MgeName: CJW1 # Cross-refs: genbank:acc:NP_817462;genbank:gi:29565891;genbank:GeneID:1259081 Probab=97.72 E-value=2.5e-05 Score=45.84 Aligned_cols=286 Identities=10% Similarity=0.044 Sum_probs=125.8 Q ss_pred Cccce----------------e------ccee----eecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchh Q lcl|NC_015254. 1 MIKKL----------------R------MNLQ----KFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKD 54 (346) Q Consensus 1 ~~~~~----------------~------~~~q----~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~ 54 (346) +.... + ...+ .-..+.++...-+ +|+.+..-+.+...+.+.+.+- T Consensus 114 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~-vp~~~~~~ii~~~~~~~~i~~l-------- 184 (497) T protein:vir:78 114 FDVSFNVSAKAADPGTAAAELMGAFADGETAPAAIGQNPFGSTGTFAPG-ILPTFLPGIVEQLFYELSLADL-------- 184 (497) T ss_pred hhhhhhhhhhhhhhHHHHHHHHHHHhhhhhhHHHHHhhhcccCcccccc-cchhhhHHHHHHHHhhhhHHhh-------- Confidence 00000 0 0000 0001222333344 4444444455544444444221 Q ss_pred HHHHhhCCCcEEEecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHH Q lcl|NC_015254. 55 LDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEY 134 (346) Q Consensus 55 ~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~ 134 (346) ......++..+++|....-++.+.-+.|+. .++..+.+-++-....++.+--..++++...-+ .+-...|.++++.. 
T Consensus 185 -~~~~~~~~~~~~~~~~~~~~~~a~wv~E~~-~~~~s~~~f~~i~~~~~k~a~~~~iS~ell~d~-~~l~~~i~~~l~~~ 261 (497) T protein:vir:78 185 -ISSRPVTSPNLSYLTESAAHNNAAAVAEAG-TYPFSSEEFARVYEQVGKVANALTITDEGLRDA-PELFNFVQGRLLEG 261 (497) T ss_pred -ccccccCCCceEEEEEcCCCCcceeeccCc-ccccccccceeeEeeeeeeEeecHhHHHHHHhH-HHHHHHHHHHHHHH Confidence 111122455688888754333455567774 566656555544444444444445555432222 24456699999999 Q ss_pred HHHHHHHHHHHH-----HHhhhhhhhhhhcce------ee---------eccccccccccH------------------- Q lcl|NC_015254. 135 WNRRRQAVLIAS-----LNGITASGALDSNKL------DV---------STETGDDSYFTG------------------- 175 (346) Q Consensus 135 ~~~~~~~~lla~-----L~G~~~~~~~~~~~~------di---------s~~~~~~~~~~~------------------- 175 (346) +.+..+..+|.- ..|++.......... +. ...-.....++. T Consensus 262 i~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 341 (497) T protein:vir:78 262 IQRKEEVQLLAGGGYPGVNGLLQRSTGFTASSASSLFGATSATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSG 341 (497) T ss_pred HHHHHHHHhhcCCCcccccccccccccccccccccchhhhhhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhc Confidence 999999876641 112222111100000 00 000000000111 Q ss_pred --------------HHHHHHHHHhCcc-ccCceEEEEchHHHHHHHhhhhh--hhccccc-C-------ceeeEEeceEE Q lcl|NC_015254. 176 --------------DTFLSATYKLGDA-EGKLTGIAMHSQTEMNLRKQGLI--EFMLDSD-N-------KKFPTYMGKRV 230 (346) Q Consensus 176 --------------~~l~~A~~~~GD~-~~~~~~ivmhS~~~~~L~~~~li--~~~~~s~-~-------~~i~~~~G~~V 230 (346) ..+..+....-.. ...-.+|+||+.++..|++..=- .|+.... + ..-++++|+|| T Consensus 342 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~i~~~~~~~~~~~~~~~~~~l~G~pV 421 (497) T protein:vir:78 342 SGVAGSYPTAAEIAENVFDAFVDIQLTLFQTPNAVVMNPRDWELLRLTKDANGQYMGGNFFGNAYGNPVNGGKNIWGVPV 421 (497) T ss_pred cchhccccchhhhhhHHHHHHhhhhhhcccCCCeEEEchHHHHHHHHhhcCCCceeccCcccccccccccCCceeeceee Confidence 1111222211111 11223799999999998865321 1222111 1 12248999999 Q ss_pred EEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecC----CcceeEEEEeeEEee---eeeeeeeccc-cccCCC Q lcl|NC_015254. 
231 IVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLL---HPRGIAWQEK-SVAGHS 301 (346) Q Consensus 231 VvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~---~~~G~s~~~~-~~~~~s 301 (346) ++++.||.+. +..=-|..+++.+.... .+.++..+.. ..+...+....++.+ ||..|..-.- +....| T Consensus 422 ~~t~~~~~~~--~~~Gd~~~~~~~i~~r~-~~~v~~~~~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~~~~~~ 497 (497) T protein:vir:78 422 VTTPLIPLGT--ILVGHFAPSVIQTARRE-GVTMQMTNSNGTDFVDGKVTVRAEERLGLLVYRPSAFQLIQLKKGATGS 497 (497) T ss_pred EecCCCCCCc--eEEeecccceEEEEEec-ccEEEeecccchhhhcCcEEEEEEEeecceeeccccEEEEEecCCccCC Confidence 9999999754 11111345666554322 2334433221 234555555555544 4545443211 111122 No 156 >protein:vir:101650 Length: 497 # NCBI annotation: gp13 # Family: family:all:585 # MgeID: mge:1515 # MgeName: 244 # Cross-refs: genbank:acc:YP_654768;genbank:gi:109302766;genbank:GeneID:4156084 Probab=97.72 E-value=2.5e-05 Score=45.84 Aligned_cols=286 Identities=10% Similarity=0.044 Sum_probs=125.8 Q ss_pred Cccce----------------e------ccee----eecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchh Q lcl|NC_015254. 1 MIKKL----------------R------MNLQ----KFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKD 54 (346) Q Consensus 1 ~~~~~----------------~------~~~q----~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~ 54 (346) +.... + ...+ .-..+.++...-+ +|+.+..-+.+...+.+.+.+- T Consensus 114 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~-vp~~~~~~ii~~~~~~~~i~~l-------- 184 (497) T protein:vir:10 114 FDVSFNVSAKAADPGTAAAELMGAFADGETAPAAIGQNPFGSTGTFAPG-ILPTFLPGIVEQLFYELSLADL-------- 184 (497) T ss_pred hhhhhhhhhhhhhhHHHHHHHHHHHhhhhhhHHHHHhhhcccCcccccc-cchhhhHHHHHHHHhhhhHHhh-------- Confidence 00000 0 0000 0001222333344 4444444455544444444221 Q ss_pred HHHHhhCCCcEEEecccccCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHH Q lcl|NC_015254. 
55 LDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEY 134 (346) Q Consensus 55 ~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~ 134 (346) ......++..+++|....-++.+.-+.|+. .++..+.+-++-....++.+--..++++...-+ .+-...|.++++.. T Consensus 185 -~~~~~~~~~~~~~~~~~~~~~~a~wv~E~~-~~~~s~~~f~~i~~~~~k~a~~~~iS~ell~d~-~~l~~~i~~~l~~~ 261 (497) T protein:vir:10 185 -ISSRPVTSPNLSYLTESAAHNNAAAVAEAG-TYPFSSEEFARVYEQVGKVANALTITDEGLRDA-PELFNFVQGRLLEG 261 (497) T ss_pred -ccccccCCCceEEEEEcCCCCcceeeccCc-ccccccccceeeEeeeeeeEeecHhHHHHHHhH-HHHHHHHHHHHHHH Confidence 111122455688888754333455567774 566656555544444444444445555432222 24456699999999 Q ss_pred HHHHHHHHHHHH-----HHhhhhhhhhhhcce------ee---------eccccccccccH------------------- Q lcl|NC_015254. 135 WNRRRQAVLIAS-----LNGITASGALDSNKL------DV---------STETGDDSYFTG------------------- 175 (346) Q Consensus 135 ~~~~~~~~lla~-----L~G~~~~~~~~~~~~------di---------s~~~~~~~~~~~------------------- 175 (346) +.+..+..+|.- ..|++.......... +. ...-.....++. T Consensus 262 i~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 341 (497) T protein:vir:10 262 IQRKEEVQLLAGGGYPGVNGLLQRSTGFTASSASSLFGATSATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSG 341 (497) T ss_pred HHHHHHHHhhcCCCcccccccccccccccccccccchhhhhhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhc Confidence 999999876641 112222111100000 00 000000000111 Q ss_pred --------------HHHHHHHHHhCcc-ccCceEEEEchHHHHHHHhhhhh--hhccccc-C-------ceeeEEeceEE Q lcl|NC_015254. 176 --------------DTFLSATYKLGDA-EGKLTGIAMHSQTEMNLRKQGLI--EFMLDSD-N-------KKFPTYMGKRV 230 (346) Q Consensus 176 --------------~~l~~A~~~~GD~-~~~~~~ivmhS~~~~~L~~~~li--~~~~~s~-~-------~~i~~~~G~~V 230 (346) ..+..+....-.. ...-.+|+||+.++..|++..=- .|+.... 
+ ..-++++|+|| T Consensus 342 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~vmn~~~~~~l~~lkd~~G~~i~~~~~~~~~~~~~~~~~~l~G~pV 421 (497) T protein:vir:10 342 SGVAGSYPTAAEIAENVFDAFVDIQLTLFQTPNAVVMNPRDWELLRLTKDANGQYMGGNFFGNAYGNPVNGGKNIWGVPV 421 (497) T ss_pred cchhccccchhhhhhHHHHHHhhhhhhcccCCCeEEEchHHHHHHHHhhcCCCceeccCcccccccccccCCceeeceee Confidence 1111222211111 11223799999999998865321 1222111 1 12248999999 Q ss_pred EEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecC----CcceeEEEEeeEEee---eeeeeeeccc-cccCCC Q lcl|NC_015254. 231 IVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREK----LKGNDILINRQHFLL---HPRGIAWQEK-SVAGHS 301 (346) Q Consensus 231 VvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~----~~g~~~l~~r~~~~~---~~~G~s~~~~-~~~~~s 301 (346) ++++.||.+. +..=-|..+++.+.... .+.++..+.. ..+...+....++.+ ||..|..-.- +....| T Consensus 422 ~~t~~~~~~~--~~~Gd~~~~~~~i~~r~-~~~v~~~~~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~~~~~~ 497 (497) T protein:vir:10 422 VTTPLIPLGT--ILVGHFAPSVIQTARRE-GVTMQMTNSNGTDFVDGKVTVRAEERLGLLVYRPSAFQLIQLKKGATGS 497 (497) T ss_pred EecCCCCCCc--eEEeecccceEEEEEec-ccEEEeecccchhhhcCcEEEEEEEeecceeeccccEEEEEecCCccCC Confidence 9999999754 11111345666554322 2334433221 234555555555544 4545443211 111122 No 157 >protein:vir:1781 Length: 221 # NCBI annotation: minor capsid protein # Family: family:all:975 # MgeID: mge:38 # MgeName: P60 # Cross-refs: genbank:acc:NP_570347;genbank:gi:18640506;genbank:GeneID:932719 Probab=97.72 E-value=5.5e-06 Score=49.43 Aligned_cols=192 Identities=13% Similarity=0.117 Sum_probs=96.1 Q ss_pred EEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhh-h--hcceeee---cccccccccc Q lcl|NC_015254. 101 RLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGAL-D--SNKLDVS---TETGDDSYFT 174 (346) Q Consensus 101 ~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~-~--~~~~dis---~~~~~~~~~~ 174 (346) +---.---+.+.|+....+..|.+.++.+|.+.+.++.+|+.++..+......... . ....+.. 
+.+.+...+ T Consensus 1 iD~lL~a~~~VdDiD~aqa~~dvr~e~t~e~G~ALA~~~D~~i~~~~~~aA~~~~p~~~~~~g~~~~~~a~~t~~~~~l- 79 (221) T protein:vir:17 1 MDDLLVASQFVYDLDEILAQWNTRSEISKQIGEALAIHYDERIARVLASASIAAAPVTGQDGGFSVNIGAGNTNNAQAI- 79 (221) T ss_pred CCcchhHHHHHHhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcCcccccccCcceeccccccCCHHHH- Confidence 00011223678899999999999999999999999999999998877533322110 0 0001111 111111222 Q ss_pred HHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhh---hhhhhccc-cc----Cc-eeeEEeceEEEEeCCCccCCCce Q lcl|NC_015254. 175 GDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQ---GLIEFMLD-SD----NK-KFPTYMGKRVIVDDGLPAKDGVY 243 (346) Q Consensus 175 ~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~---~li~~~~~-s~----~~-~i~~~~G~~VVvdD~~p~~~g~y 243 (346) ++.|.+|.++|-++. ..-..+++.|..|+.|.+. .+.+.... ++ ++ .++.++|.+|+.|+.+|...|+- T Consensus 80 ~dai~~a~~~LdekdVP~~gR~~vv~P~~y~~LL~~~d~~~~n~d~~~s~g~~~~g~~i~~v~G~~V~~SnnlP~~~gt~ 159 (221) T protein:vir:17 80 VDGFFEAAAVLDERSAPMDGRVAVLSPRQYYSLISSVDTNILNREIGNTQGDMNTGKGLYVNAGIRIYKSNVLASLYGTN 159 (221) T ss_pred HHHHHHHHHHHhhcCCCCCCCEEEeCcHHHHHHHHhcCcceeeeecccccccccccceeeeecCcEEEEeccCCcccccc Confidence 567777888875543 3456788999999998862 34443222 22 23 58889999999999999755532 Q ss_pred EEEEEcCCeeEEeecCCc----------cceeeeecCCcceeEEE--EeeEEeeeeeeeeeccccccCCCCChH Q lcl|NC_015254. 244 TSYIFGEGAFGLGNGEAP----------VPTETDREKLKGNDILI--NRQHFLLHPRGIAWQEKSVAGHSPTNT 305 (346) Q Consensus 244 tt~l~~~GAi~~~~~~~~----------~~vE~dRd~~~g~~~l~--~r~~~~~~~~G~s~~~~~~~~~sPt~a 305 (346) +...+|.+... .... +.+-..|...+-..-+- +|-..++..+ |.. 
-|..- T Consensus 160 --~~~~ag~~~~~-~~~~~~yr~~fs~~~glv~~~~Avgtvkl~~~~~~~~~~~~~~--~~~-------~~~~~ 221 (221) T protein:vir:17 160 --LVTDPGDATTS-GENNGSYRPAITDRAGLVFHKEAADTVEVLLPPSRPPLVISMF--SIR-------RPDRR 221 (221) T ss_pred --cccCCcccccc-ccccccccccccceEEEEEcchheeeeeeecCCCCCceeeeee--ecc-------CCCCC Confidence 11222221100 0000 00001111110000000 0111111111 110 01111 No 158 >protein:vir:79928 Length: 393 # NCBI annotation: major head protein # Family: family:all:30335 # MgeID: mge:1874 # MgeName: 0305phi8-36 # Cross-refs: genbank:acc:YP_001429616;genbank:gi:156564106;genbank:GeneID:5525693 Probab=97.60 E-value=4e-05 Score=44.71 Aligned_cols=289 Identities=15% Similarity=0.162 Sum_probs=169.1 Q ss_pred Cccce---ecceeeecCCceeeeeeccchHHHHHHHhh---HhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccC Q lcl|NC_015254. 1 MIKKL---RMNLQKFAAGKNTRIADVIVPEVFNKYVTE---RTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDL 74 (346) Q Consensus 1 ~~~~~---~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~---~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l 74 (346) |--+. .+|+.-|- +|--+++.+|.|+..-+.+ ++.=-++++|- ..-.-|....+|.++-| T Consensus 59 m~G~~p~~eV~~~e~m---tt~~a~IliP~vis~v~~Eaaepl~~~~kl~qk-----------~~L~~Grsm~F~~~g~~ 124 (393) T protein:vir:79 59 MEGETPTNEVNLREFM---ATPSAQILIPRVIVGTMREAAEPLYIGTKMLQK-----------IRLKSGQSMIFPSIGIM 124 (393) T ss_pred hcCCCchhheehhhhh---cCCCcceechhhhhhhhhhcccchhHHHHHHHH-----------HhhhcCcceeccchhee Confidence 33222 25555552 3566789999999776666 22222223221 11123777778888766 Q ss_pred CCcccccCCCccccchhhcc-cceeEEEEE--eecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015254. 75 TGEDEILDDGEGALTPGNIS-AAKDIARLH--MRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGIT 151 (346) Q Consensus 75 ~g~ae~~~dg~~~it~~~lt-~~~~~a~~~--~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~ 151 (346) -+-++.||. .++...|. +..+...+. 
+.|-....+|+..--++=|-+.-.-++.....+|+.+..++..++.-- T Consensus 125 --Ra~~IgEGg-E~~~~sld~~T~dsv~~~~gK~G~~Ia~SqEmIsDSg~Dvin~~l~aA~RaMaRkKee~a~n~fk~~g 201 (393) T protein:vir:79 125 --RAYDVAEGQ-EIPEDSIDWQTHESPEIRVGKSGIRLRFTDEMISDSQWDLMSMMIKQAGRAMGRHKEQKAYHQFRSHG 201 (393) T ss_pred --eeccccccc-cccccchhhhcCCceeEEechhhhhhhhHHHHhhcchHHHHHHHHHHHHHHHHhhhHHHHHhhhhccc Confidence 344667774 56666666 333332322 233447889998888888999998888999999999999888876210 Q ss_pred h--hhhhhhc-ceeeec---cccccccccHHHHHH-HHHHhCccccCceEEEEchHHHHHHHhhhhhhhcccccCce--- Q lcl|NC_015254. 152 A--SGALDSN-KLDVST---ETGDDSYFTGDTFLS-ATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSDNKK--- 221 (346) Q Consensus 152 ~--~~~~~~~-~~dis~---~~~~~~~~~~~~l~~-A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~~~~--- 221 (346) - -.+...+ ....++ ..-...+|+++.|.| +.+.+.+... ..+++|||-....+.|.++.+.+....-+. T Consensus 202 htvfDa~st~t~ahptGr~~~~~qNGTlSleDllDm~~av~~~hyt-~svi~MHPLAWnv~AKna~me~~~~na~gN~~~ 280 (393) T protein:vir:79 202 HTVFDNYSTNKLAHTTGLDKNGVQNDTFSAEDFLDLIIAVMANEYT-PSDLMMHPLAWTVFAKNELMGSLQANPYGNYPA 280 (393) T ss_pred ceeeeccccCccceeecCCccccccccccHHHHHHHHHHHhcccCC-cceEEEcCchhhhhhhhhhhcceeeccccccCc Confidence 0 0000000 011111 111235799999999 4445666554 479999999999999998887665432111 Q ss_pred --ee--------EEec-----eEEEEeCCCccC--CCceEEEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEe Q lcl|NC_015254. 222 --FP--------TYMG-----KRVIVDDGLPAK--DGVYTSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFL 284 (346) Q Consensus 222 --i~--------~~~G-----~~VVvdD~~p~~--~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~ 284 (346) +. -++| ..|++|-=+|.. 
+..|.-|.+-..-+++-.-++.+.+|.--|+..+.+.+--+.+|+ T Consensus 281 ~~~~ts~algp~~i~~~~~~nlnv~~sPfvp~d~k~~rFd~~~Vd~NnvgvlLV~D~i~tdq~ddk~rdiq~iKl~ERYG 360 (393) T protein:vir:79 281 KGAPSSMALGPDSIQGRLPFNFNVNLSPFIPLDKKSRRFDVYAVDRNNVGVLLVRDDLKTDQWDEKARGLQNIKMIERYG 360 (393) T ss_pred cccchhhhhchhhhccccccceeEEEecccccccccceeeEEEeecCCceEEEEecCcceeccccccccceeeeeeeeec Confidence 11 1333 478888766654 335555666666555555566666665556777888888888888 Q ss_pred eeeee----------eeeccccccCCCCChHHhcCCcC Q lcl|NC_015254. 285 LHPRG----------IAWQEKSVAGHSPTNTEIEKGNN 312 (346) Q Consensus 285 ~~~~G----------~s~~~~~~~~~sPt~a~L~~~~N 312 (346) +++.- ++++++ -|..--+.+-.| T Consensus 361 ~gvLn~gkaiavakNI~~~k~-----y~~P~~~~~~~~ 393 (393) T protein:vir:79 361 IGILNEGKAIAVAKNISMDKS-----YAEPMLIKNVGN 393 (393) T ss_pred eeeeeCCceEEEEecceeecc-----cccchhhhccCC Confidence 86642 222211 121112222233 No 159 >protein:vir:95963 Length: 395 # NCBI annotation: ORF009 # Family: family:all:635 # MgeID: mge:1594 # MgeName: 2638A # Cross-refs: genbank:acc:YP_239802;genbank:gi:66395459;genbank:GeneID:5132880 Probab=97.45 E-value=4.1e-05 Score=44.65 Aligned_cols=294 Identities=11% Similarity=0.007 Sum_probs=122.2 Q ss_pred Cccceecceeeec----CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRMNLQKFA----AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~q~~~----a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) +...+.-+-+.|. 
.+ ++--.-.++|+-+.+.+.+.+.+.+-+.+-.-+ ...+| ...+|....- + T Consensus 71 ~~~~l~~ee~~~~~~~~~~-t~~~gG~liP~~~~~~Ii~~l~~~s~i~~~~~v---------~~~~~-~~~i~~~~~~-~ 138 (395) T protein:vir:95 71 SQDPLTSEERKFFNDINYD-VGYTDEKILPETVVERVFDDLQKDHPLLSKINF---------QNAGI-KTRVIKADPA-G 138 (395) T ss_pred CccccchHHHHHHHHHhhc-cCCCCceeccHHHHHHHHHHHHhhhhhhhhcee---------EecCC-ceEEEEecCC-c Confidence 1111211111111 11 111123467888888888877777666543211 12234 3578876543 3 Q ss_pred cccccCCCccccchhhcccceeE-EEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-------HH Q lcl|NC_015254. 77 EDEILDDGEGALTPGNISAAKDI-ARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIAS-------LN 148 (346) Q Consensus 77 ~ae~~~dg~~~it~~~lt~~~~~-a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~-------L~ 148 (346) .+.-..+. +.+..+.-.+..+. -..++...-..++..--.-+.-|-...+.+++++.+++..++.+|.- =. T Consensus 139 ~a~w~~e~-~~~~~~~~~~f~~i~l~~~kl~~~~~iS~ell~ds~~~ie~~i~~~la~~ia~~~~~a~i~G~G~~~~qP~ 217 (395) T protein:vir:95 139 QAVWGKVF-GEIKGQLDAAFREENFTQYKLTCFVVLPDDLSTFGPAWIERFVRTQIQEAISVALESAIINGGGAAKTQPV 217 (395) T ss_pred ceEEeecc-cccCccccccceeeeeceeeEEEeecccHHHHhcchhHHHHHHHHHHHHHHHHHHhhheeeccCCCCcCce Confidence 33222222 22322211121222 12222332234444443344556677899999999999999865531 01 Q ss_pred hhhhhhhhhhccee--eec--cccccccccHHHHHHHHHHhC-------ccccCceEEEEchHHHHHHHhhhhhhhcccc Q lcl|NC_015254. 149 GITASGALDSNKLD--VST--ETGDDSYFTGDTFLSATYKLG-------DAEGKLTGIAMHSQTEMNLRKQGLIEFMLDS 217 (346) Q Consensus 149 G~~~~~~~~~~~~d--is~--~~~~~~~~~~~~l~~A~~~~G-------D~~~~~~~ivmhS~~~~~L~~~~li~~~~~s 217 (346) |++........... ... .+.....+.+..+.++...+. .....-..++||+.++.+++.+-+ + +. 
T Consensus 218 Gil~~~~~~~~~~~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~mn~~t~~~~~g~~~--~-~~- 293 (395) T protein:vir:95 218 GLMKDVNTNSGAVTDKASSGTLTFADADTTILELNDVLKNLSVDEKGKELKIDGKVALVVNPRDSWDVQARYT--Y-LT- 293 (395) T ss_pred eeeecccccccccccccccchhhhhhhHhhHHHHHHHHHhhccccccchhhhcCceEEEEcchhhhhcCCcce--e-cc- Confidence 22221110000000 000 001111223333433333221 122344579999999887654332 1 12 Q ss_pred cCceeeEEe--ceEEEEeCCCccCCCceEEEEEcCCe-eEEeecCCccceeeeecC--CcceeEEEEeeEEeeeeeeeee Q lcl|NC_015254. 218 DNKKFPTYM--GKRVIVDDGLPAKDGVYTSYIFGEGA-FGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHPRGIAW 292 (346) Q Consensus 218 ~~~~i~~~~--G~~VVvdD~~p~~~g~ytt~l~~~GA-i~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~~G~s~ 292 (346) .+|...+.+ |++|+.++.||... ++|+.=. +.+.. +..+.++..++. ..+++.+....++... T Consensus 294 ~~G~~~~~lg~g~~v~~~~~~p~~~-----i~fgdfs~y~i~~-r~~~~i~~~~~~~~~~d~~~f~~~~r~dg~------ 361 (395) T protein:vir:95 294 ANGGFVTVLPYNVTIITSEFVPEGK-----LVAFVTDRYNAVR-GGGLTVKKFDQTLALEDAVLFTAKTFAYGQ------ 361 (395) T ss_pred CCCcceeccCCcceEEEcCCCCCCc-----EEEEecccEEEEE-ecceEEEeccchhhhCCcEEEEEEEEECCE------ Confidence 244555665 66789999999653 3333221 11221 112223332222 2233334433333222 Q ss_pred ccccccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCCCCC Q lcl|NC_015254. 
293 QEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEGIK 346 (346) Q Consensus 293 ~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~~~ 346 (346) +.+++++.+..|..-..++.+...+++.-+- T Consensus 362 -----------------------~~~~~A~~~l~i~~~~~~~~~~~~~~~~~~~ 392 (395) T protein:vir:95 362 -----------------------PDDNKASAVYDLKVASAPRRQTSAGGTTDGI 392 (395) T ss_pred -----------------------EeccccEEEEEeeccCCCCCCCCCCCCCCcc Confidence 2333444433331111122222222221111 No 160 >protein:vir:105610 Length: 430 # NCBI annotation: virion structural protein # Family: family:all:974 # MgeID: mge:1540 # MgeName: F116 # Cross-refs: genbank:acc:YP_164307;genbank:gi:56692923;genbank:GeneID:3197221 Probab=97.42 E-value=7e-05 Score=43.38 Aligned_cols=296 Identities=13% Similarity=0.074 Sum_probs=150.0 Q ss_pred eecCCceeeeeeccchHHHHHHHhhHhHHH----HhHhhc----------------cccccchhHHHHhhCCCcEEEecc Q lcl|NC_015254. 11 KFAAGKNTRIADVIVPEVFNKYVTERTAES----SALLQS----------------GIISNDKDLDELAKSGGNMINMPF 70 (346) Q Consensus 11 ~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~----~~~~qS----------------gi~~~~~~~~~l~~~~G~ti~~P~ 70 (346) |-++-+.....+-.-..+|+.-+-....+. ++|... +.=.|-..+..|-...|+.|+++. T Consensus 1 ~~~a~T~~~~~~p~a~~~ws~~l~~~~~k~~~~~~kl~G~~~~~~~~~~~~~~~~ts~~~pI~r~~dL~K~~GD~Vtf~L 80 (430) T protein:vir:10 1 MTASKTTMRYGDPNAMIQQAAGLFALCQGRNSTLNRLTGKMPSGTSDAEKKTKGQSSLELPIVQAQDLGRNKGDEVRFHF 80 (430) T ss_pred CcceeeecccCChhHHHHHHHHHHHHHhhhhhhHHHhhccccccccchhhhccCCCCCCccEEEeccCCCCCccEEEEeE Confidence 444433333333333445555444433332 344432 110112223345567899999999 Q ss_pred cccCCCcccccCCCccccc--hhhcccceeEEEEEeecCcceechH-HHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 
71 WQDLTGEDEILDDGEGALT--PGNISAAKDIARLHMRGKAWRTNDL-AKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL 147 (346) Q Consensus 71 ~~~l~g~ae~~~dg~~~it--~~~lt~~~~~a~~~~~~k~~~~tD~-a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L 147 (346) -..|.|+- +.-+ ..++ -+.|+..++..+|=...+++....- ++..+.-|...++...++.||++..|..++-.| T Consensus 81 ~~~L~g~g-v~Gd--~~lEGnee~L~~~~d~l~IDq~R~~V~~gg~msqQRt~~dlR~~ar~~L~~w~~~~~Dq~~~v~l 157 (430) T protein:vir:10 81 VQPANAFP-IMGS--EYAEGKGTGLKIGSDQLRVNQARFPVDLGDVMSQIRNPYDLRRLGRPKAKWFMDAYLDQSMLVHL 157 (430) T ss_pred eeccccCc-eecC--ceeeccccceEEEeeEEEEeeeccccccCCchhhhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 99997753 3211 1222 2678899998888888888776643 455666788889999999999999999999999 Q ss_pred Hhhhhh---------------------hhhhh-cc--eee-ec-------------cccccccccHHHHHHHHHHhCc-- Q lcl|NC_015254. 148 NGITAS---------------------GALDS-NK--LDV-ST-------------ETGDDSYFTGDTFLSATYKLGD-- 187 (346) Q Consensus 148 ~G~~~~---------------------~~~~~-~~--~di-s~-------------~~~~~~~~~~~~l~~A~~~~GD-- 187 (346) .|..+. +.... +. +-. .+ ...++-.|+.+.+.+|...... T Consensus 158 aGarg~~~~~~~~~~~~~~~~~~~~~~N~v~aPt~nrh~~~~G~at~~~~~~~~~~sl~stD~~s~~~id~a~~~a~~~~ 237 (430) T protein:vir:10 158 AGARGNHYNKEWCLPLETHPKLADMLVNRVKAPTKNRHFVASADAITGVAPNAGEYNITTADVLDVDVVDSIATYMDQIE 237 (430) T ss_pred hhhhcccccccccccccCCcchhhhhccccCCCCCceeEeecccccccccccccccchhhhcccCHHHHHHHHHHHHhhC Confidence 886431 10000 00 000 00 1122335788888777665422 Q ss_pred -----------c---ccCceEEEEchHHHHHHHhhhhh-hhcc-----ccc-------CceeeEEeceEEEEe------- Q lcl|NC_015254. 188 -----------A---EGKLTGIAMHSQTEMNLRKQGLI-EFML-----DSD-------NKKFPTYMGKRVIVD------- 233 (346) Q Consensus 188 -----------~---~~~~~~ivmhS~~~~~L~~~~li-~~~~-----~s~-------~~~i~~~~G~~VVvd------- 233 (346) + .....+++|||..+++|+.+.-. ++.. ... 
.|.++.|+|+.|.-- T Consensus 238 ~~i~Pv~v~gd~~~g~~~~yV~~~~p~q~~~Lr~dt~~~~wq~~~~a~a~~g~~nPlF~G~~gm~ngvii~~~~~virf~ 317 (430) T protein:vir:10 238 LPPPPVKFEGDEAAEDSPIRVLLCSPAQYNSFAKQEKFRSWQAAALARASNAKQHPIFRVDAGLWSNTLIIKMPKPIRFY 317 (430) T ss_pred CCCcceEeecccccCCccEEEEEechHHHHHHhhCcchHHHHHHHHHhhcccccCCceecceeeecCeEEecCCceeeec Confidence 1 12358999999999999987432 2211 111 256789999866531 Q ss_pred -----CCCc-cCC-------------C---ceEEEEEcCCeeEEeecCCc------cceeeeecCCcceeEEEEeeEEee Q lcl|NC_015254. 234 -----DGLP-AKD-------------G---VYTSYIFGEGAFGLGNGEAP------VPTETDREKLKGNDILINRQHFLL 285 (346) Q Consensus 234 -----D~~p-~~~-------------g---~ytt~l~~~GAi~~~~~~~~------~~vE~dRd~~~g~~~l~~r~~~~~ 285 (346) +-+. ... + +-..+++|.-|+.+..++.+ .=.|...|-.....+... .++ T Consensus 318 ~g~~~~~~a~~~~~~~~~~~~~a~~~~~~~v~RalllGaQA~~~A~g~~~~~g~~f~w~Ee~~D~g~~~~i~~~---~i~ 394 (430) T protein:vir:10 318 AGDTIKYCAAYNSEAESSAVVSDSFGNQYAVDRALLLGGQALAQAWAASEHSGMPFFWSEKDMDHGDKLELLIG---AIL 394 (430) T ss_pred CCCccccccCCcccccccccccccccccccchhhhhccchhheeeeeccCCCCcceeeeeeccccCchhhhhhh---HHh Confidence 1100 000 0 12345677666666554421 113333333222221110 111 Q ss_pred eeeeeeeccccccCCC---------CChHHhcCCcCceeeeccc Q lcl|NC_015254. 
286 HPRGIAWQEKSVAGHS---------PTNTEIEKGNNWKAVYESK 320 (346) Q Consensus 286 ~~~G~s~~~~~~~~~s---------Pt~a~L~~~~NW~~v~~~K 320 (346) .+.=..|..+...+.+ ||-+.+--+- | T Consensus 395 G~kK~rF~~~~~~~~~~~DfGvi~idtaa~~~~~~--------~ 430 (430) T protein:vir:10 395 GCSKIRFAVEATNGLEYTDHGVMAIDTAVKIIGPR--------K 430 (430) T ss_pred ccceeeecCCCCCCceeeeeEEEEhhhhhhhhcCC--------C Confidence 1111122211111110 1111111111 0 No 161 >protein:vir:102335 Length: 312 # NCBI annotation: putative capsid protein # Family: family:all:701 # MgeID: mge:1566 # MgeName: phi CD119 # Cross-refs: genbank:acc:YP_529560;genbank:gi:90592716;genbank:GeneID:3974467 Probab=97.18 E-value=0.00014 Score=41.71 Aligned_cols=265 Identities=11% Similarity=0.130 Sum_probs=117.8 Q ss_pred eeecc-chHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCcc-ccchhhcccce Q lcl|NC_015254. 20 IADVI-VPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEG-ALTPGNISAAK 97 (346) Q Consensus 20 l~d~i-~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~-~it~~~lt~~~ 97 (346) +++.| -.+.|.+.+.+.+.+.+ -|+.+.+++..-. -.||++|.+|...- +|-. +.+...+ .-....++... T Consensus 1 Mantl~ya~~~~~~LD~~~~~~~---~s~~l~~~~~~v~--~~ggktVkIp~i~~-~gl~-DY~R~~g~~~~~g~v~~~~ 73 (312) T protein:vir:10 1 MANTLAYGQVLQQGLDKQATQEL---LTGWMDSNAKQIK--YEGGKEVKIGKLST-DGLG-DYSRGSANAYVGGDVKFEY 73 (312) T ss_pred CCcchhHHHHHHHHHHHHHHhhh---ccccccCCCceEE--EecCcEEEEEeeec-cccc-ccccccCCccccccccccc Confidence 44322 23667777766665544 2566665543222 14799999999763 4433 2121111 11333455444 Q ss_pred eEEE-EEeecCcceec--h--HHHh-hhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccccc Q lcl|NC_015254. 98 DIAR-LHMRGKAWRTN--D--LAKA-LSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDS 171 (346) Q Consensus 98 ~~a~-~~~~~k~~~~t--D--~a~~-~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~ 171 (346) +.-+ -+.|+..|.+. | ++.. ++-++.|+ ++..+.-.-.+|+.-++.|-.... 
........+....-++ T Consensus 74 et~tl~qDR~~~F~vD~mDvDETn~~~s~anv~~---ef~r~~vvPEiDayrfskla~~a~---~~~~~~~~~~~~~~T~ 147 (312) T protein:vir:10 74 ETKTMTQDRGRKFTLDAMDVDETNFLVTATTVMG---EFQRLKVIPEIDAYRLSRLATIAI---GIKGDTNVEYSYSVNS 147 (312) T ss_pred eeEEeeecccceeeccccchhhHhhHHHHHHHHH---HHHHhhhcchhhHHHHHHHHhhhh---ccccccccccccccCH Confidence 4333 44578888777 3 3322 22333333 334444455666776766531111 0111111111111011 Q ss_pred cccHHHHHHHHHHhCccc-cCceEEEEchHHHHHHHhhhhhhhccccc------CceeeEEeceEEEE--eCCCcc---- Q lcl|NC_015254. 172 YFTGDTFLSATYKLGDAE-GKLTGIAMHSQTEMNLRKQGLIEFMLDSD------NKKFPTYMGKRVIV--DDGLPA---- 238 (346) Q Consensus 172 ~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~~~~~L~~~~li~~~~~s~------~~~i~~~~G~~VVv--dD~~p~---- 238 (346) .=-++.|.+++.+|-|.. ..-.+++|.|.++.-|.+. ........+ .+.++.+.|++||. ++.|.. T Consensus 148 ~ni~~~i~~~~~~lde~~vp~~rvl~vTp~~~~lLk~~-~~~~~~~~~~~~~~i~~~V~~iDgv~Ii~VPs~r~~t~~~f 226 (312) T protein:vir:10 148 STIINKIKTGIKIIRENGYNGPLVCHLTYDSMFAIEEK-VLEKLTAVTFAQGGIQTQVPSIDGCALIKTPQNRMYSSILL 226 (312) T ss_pred HHHHHHHHHHHHHHHHccCCCceEEEeChHHHHHHhhh-hhceecccccccceeeeeeeeecccEEEEchhhhccceeee Confidence 112456667777776643 2356899999999666653 333222222 24578899999975 233311 Q ss_pred ----------------CCCceEEEEEcCCeeEEeecC-CccceeeeecCCcce--eEEEEeeEEeeeee-----ee--ee Q lcl|NC_015254. 239 ----------------KDGVYTSYIFGEGAFGLGNGE-APVPTETDREKLKGN--DILINRQHFLLHPR-----GI--AW 292 (346) Q Consensus 239 ----------------~~g~ytt~l~~~GAi~~~~~~-~~~~vE~dRd~~~g~--~~l~~r~~~~~~~~-----G~--s~ 292 (346) +.++...|++.+....+.-.+ ..+-+ .+-+..... -.+..|..+-+-++ |+ +. T Consensus 227 ~dG~t~~~~~gg~~~~~~ak~INfiiv~~~a~i~~~K~~~~~i-f~P~~~~~~d~~~~~~R~Y~D~fv~~nk~~~Iyv~~ 305 (312) T protein:vir:10 227 NDGTTSNQTAGGYLKGTKALDTNFIIAPVDVPLAITKQDKMRI-FDPETNQTANAWSMDYRRYHDLWVTDNKANSVYANF 305 (312) T ss_pred ccCcccccccCceeecCcccccceEEeCCceeeceeeeeeeee-eCCCCCCCcceeeeeeeeeeeeeeeccccCeEEEEe Confidence 111223343333222221111 11100 001111111 23445554444443 22 33 Q ss_pred ccccccC Q lcl|NC_015254. 
293 QEKSVAG 299 (346) Q Consensus 293 ~~~~~~~ 299 (346) +++...| T Consensus 306 k~a~~~~ 312 (312) T protein:vir:10 306 KDAKPVG 312 (312) T ss_pred ecccCCC Confidence 3222222 No 162 >protein:vir:95875 Length: 401 # NCBI annotation: major coat protein # Family: family:all:10944 # MgeID: mge:1586 # MgeName: N4 # Cross-refs: genbank:acc:YP_950534;genbank:gi:119952248;genbank:GeneID:5075702 Probab=97.12 E-value=4.7e-05 Score=44.32 Aligned_cols=296 Identities=14% Similarity=0.121 Sum_probs=142.7 Q ss_pred eeeecCCceeeeeec---cchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCc Q lcl|NC_015254. 9 LQKFAAGKNTRIADV---IVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGE 85 (346) Q Consensus 9 ~q~~~a~~~T~l~d~---i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~ 85 (346) +-.|-|.+.-+++.. .-||+=.-|-.++....-+.. -+.-+.++...+-...|.||.+-++-.|..+...+++|. T Consensus 1 ~~~~~a~~~~~~~s~~g~~~~~~~t~y~~~k~L~~Aa~~--lv~~~fA~~~piPkn~GkTIk~r~y~pl~~~~~pl~eGv 78 (401) T protein:vir:95 1 MLNYNAPTDGQKSSIDGANSDQMQTFFWLKKAIITARKE--QYFMPLASVTNMPKHYGKTIKVYEYVPLLDDRNINDQGI 78 (401) T ss_pred CCccCCCcccccccccccccceeeehhhHHHHHhhhhhh--hhhhhcccccccccccCCeEEEEecccccccccchhcCC Confidence 111223222222222 245544444434333322221 112222333334456799999999999966555566664 Q ss_pred cc-------------------cchh--------------hcccceeEEEEEeecCcceechHHHhhhcchHHHH-HHHH- Q lcl|NC_015254. 86 GA-------------------LTPG--------------NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRA-IGDL- 130 (346) Q Consensus 86 ~~-------------------it~~--------------~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~-i~~q- 130 (346) ++ ||.. +++--...+.+++.|.-.+.||+..+..-++-+.+ +.+. 
T Consensus 79 ~a~G~~~~~g~~y~~~rdv~~it~~m~~~t~~~~rvn~v~~~~~d~~g~l~qyG~~~e~Td~~~dt~~D~~l~~h~s~el 158 (401) T protein:vir:95 79 DASGATIVNGNLYGSSKDIGNITSKLPLLTENGGRVNRVGFTRIAREGSIHKFGFFYEFTQESIDFDSDDGLMEHLSREL 158 (401) T ss_pred CcccccccCccccccccccceeecccccccccccccccccceeeeeeeeeeeccCccchhhhhhhhhcchHHHHHHHHHH Confidence 21 1110 22223344556777777899999998877766664 3222 Q ss_pred ---HHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCc-------------------c Q lcl|NC_015254. 131 ---VVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGD-------------------A 188 (346) Q Consensus 131 ---~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD-------------------~ 188 (346) .+....+-.+++||+.-.-++- .+.+..-..++..++....++++.+-++...+-+ - T Consensus 159 l~g~~~~t~d~i~~dll~ag~~viy-Ag~ats~At~~~~~~~~t~vt~~~l~rl~~~L~~nRapk~t~~i~~s~~~dTk~ 237 (401) T protein:vir:95 159 MNGATQITEAVLQKDLLAAAGTVLY-AGAATSDATITGEGSTPSVVSYKNLMRLDQILTENRTPTQTTIITGSRMIDTKV 237 (401) T ss_pred hhhhhhhHHHHHHHHHHhhcCeeec-CCccceeeeccccccccceechhHHHHHHHHHHhcccccchhhhhhhhccCccc Confidence 2233345556666643210100 0111111223344455566788888887766543 1 Q ss_pred ccCceEEEEchHHHHHHHhh-------hhhhhccccc-----CceeeEEeceEEEEeCC--------CccCC-------- Q lcl|NC_015254. 189 EGKLTGIAMHSQTEMNLRKQ-------GLIEFMLDSD-----NKKFPTYMGKRVIVDDG--------LPAKD-------- 240 (346) Q Consensus 189 ~~~~~~ivmhS~~~~~L~~~-------~li~~~~~s~-----~~~i~~~~G~~VVvdD~--------~p~~~-------- 240 (346) -..-.+.+|||....+|+.. ++++..++.+ .++||.+-+.|+|++.- ||.+. T Consensus 238 i~~s~va~~h~~L~~di~a~~D~~~~~~fi~v~kYa~~~~i~~gEiG~i~~vR~i~~p~~~~w~~ag~~a~~~~~~y~~~ 317 (401) T protein:vir:95 238 IGATRVMYVGSELVPELKAMKDLFGNKAFIETQHYADAGTIMNGEVGSIDKFRIIQVPEMLHWAGAGAQATGANPGYRTS 317 (401) T ss_pred cccceEEEEecCchhHHHHHHHhcCCCCceehhhcCCccccccccccccCceeEEecccceeecCCcccccccccccccc Confidence 23456789999887777644 4777777776 46789999999999865 43322 Q ss_pred -----C---ceEEEEEcCCeeEEeecCCc-----cceeeeecCCcc----eeEEEEeeEEeeeeeeeeeccccccCCCCC Q lcl|NC_015254. 
241 -----G---VYTSYIFGEGAFGLGNGEAP-----VPTETDREKLKG----NDILINRQHFLLHPRGIAWQEKSVAGHSPT 303 (346) Q Consensus 241 -----g---~ytt~l~~~GAi~~~~~~~~-----~~vE~dRd~~~g----~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt 303 (346) | +|-.++++..|++....+-. ..+= -+.+..+ .|-|-.| ..-|++|--.. .-.+| T Consensus 318 ~~~~gg~~dVyp~lV~G~dAf~~~~l~g~g~~~~~~~i-vk~pG~~~ad~~DPlgQ~-----g~vgwK~~~a~-~vL~~- 389 (401) T protein:vir:95 318 MVSGQEHYDVYPMLVVGDDSFTSIGFQTDGKSLKFTVM-TKMPGKETADRNDPYGET-----GFSSIKWYYGI-LVKRP- 389 (401) T ss_pred cccCCCcceeeeeeEEccccceecccccCCccccceeE-eecCCcCCCCCCCcccce-----ehhhhhhhhhh-heecc- Confidence 1 57777888888876432211 0110 1111110 0101000 00122221000 00000 Q ss_pred hHHhcCCcCceeeecccccceEEEEEeccc Q lcl|NC_015254. 304 NTEIEKGNNWKAVYESKNIRIVAFVHKNGV 333 (346) Q Consensus 304 ~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~ 333 (346) =.+++|++-.-+ T Consensus 390 ------------------e~m~~ies~a~~ 401 (401) T protein:vir:95 390 ------------------ERLALIKTVAPL 401 (401) T ss_pred ------------------ceeEEEEeecCC Confidence 012222221111 No 163 >protein:vir:2770 Length: 318 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:59 # MgeName: Stx2 converting bacteriophage I # Cross-refs: genbank:acc:NP_612887;genbank:gi:20065804;genbank:GeneID:935710 Probab=97.07 E-value=0.00015 Score=41.48 Aligned_cols=241 Identities=12% Similarity=0.158 Sum_probs=125.4 Q ss_pred eeeecCCceeee--------eeccch--HHHHHHHhhHhH---HHHhHhhccccccchhHHHHhhCCCcEEEecccccCC Q lcl|NC_015254. 9 LQKFAAGKNTRI--------ADVIVP--EVFNKYVTERTA---ESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLT 75 (346) Q Consensus 9 ~q~~~a~~~T~l--------~d~i~P--ev~~~yv~~~~~---~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~ 75 (346) ...+..+..-.+ .-.=+| .+|+.-+..... 
..-.|.+.|-=.+-..+..|-.+.||.|+++.-..|+ T Consensus 1 mt~~~~~~~~~~~~~~~ft~~~~~~~~vk~ws~~l~~~~~~~~~~~~~~g~~~~~~I~r~~dL~K~~GD~Vtf~L~~~L~ 80 (318) T protein:vir:27 1 MTTVTSAQANKLFQVALFTAANRNRSMVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNKQAGDEVTFSIMHKLS 80 (318) T ss_pred CCccCCCChHHHHHHHHHHHHhcCChHHHHHHHhhhhHHHhhhhhhcccCCCCCceEEEeccCCCCCccEEEEeEeeccc Confidence 111111100000 000011 123332222111 1224445442222223344556789999999999997 Q ss_pred CcccccCCCccccc--hhhcccceeEEEEEeecCcceech-HHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhh Q lcl|NC_015254. 76 GEDEILDDGEGALT--PGNISAAKDIARLHMRGKAWRTND-LAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITA 152 (346) Q Consensus 76 g~ae~~~dg~~~it--~~~lt~~~~~a~~~~~~k~~~~tD-~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~ 152 (346) |+- +..+ ..++ -+.|+..++..+|=....++.... +++..+.-|...++...++.||++..|..++-.|.|..+ T Consensus 81 g~g-v~Gd--~~lEGnee~L~~~~d~l~IDq~r~~V~~gg~msqqRt~~dlR~~ar~~L~~w~~~~~Dq~~~v~laGarg 157 (318) T protein:vir:27 81 KRP-TMGD--ERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGARG 157 (318) T ss_pred cCc-cccC--ceeeccccceEEEeeEEEEeeeccccccccchhhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhccc Confidence 753 3211 1222 256888888888877777775443 334455668888999999999999999999999988775 Q ss_pred h--h-----hhhh-------cceeeec----------------cccccccccHHHHHHHHHHh-------------Ccc- Q lcl|NC_015254. 153 S--G-----ALDS-------NKLDVST----------------ETGDDSYFTGDTFLSATYKL-------------GDA- 188 (346) Q Consensus 153 ~--~-----~~~~-------~~~dis~----------------~~~~~~~~~~~~l~~A~~~~-------------GD~- 188 (346) . + .... ...++++ ...++-.|+.+.+.++.... 
||+ T Consensus 158 ~~~n~~~~~p~~~~~~~~~~~~N~v~aPt~~r~~~~g~at~~~~l~stD~~s~~lid~~~~~~~~~a~pi~PV~v~g~~~ 237 (318) T protein:vir:27 158 DFVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLSGDEL 237 (318) T ss_pred ccccccceEecccCccchhhhhcccCCCCCCcEEeccCccchhhhhhcccccHHHHHHHHHHHHHhCCCCcceeeccccc Confidence 2 0 0000 0011111 11223457777776665443 332 Q ss_pred --ccCceEEEEchHHHHHHHhhh----hhhhccccc-----------CceeeEEeceEEEEeCCCccCCCceEEEEEcCC Q lcl|NC_015254. 189 --EGKLTGIAMHSQTEMNLRKQG----LIEFMLDSD-----------NKKFPTYMGKRVIVDDGLPAKDGVYTSYIFGEG 251 (346) Q Consensus 189 --~~~~~~ivmhS~~~~~L~~~~----li~~~~~s~-----------~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~G 251 (346) .....+++|||..+++|+.+. +.+..+.+. .|.++.|+|+-|.---.+|.- |-+| T Consensus 238 ~~~~~~yV~~~~p~q~~~Lrtdt~~~~w~d~q~~A~~r~~g~knPLF~G~~gm~ngvil~~~~~vpIr--------f~~G 309 (318) T protein:vir:27 238 HGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFKGECAMWRNILVRKYAGMPIR--------FYQG 309 (318) T ss_pred cCCcceEEEEechHHHHHHhhcCCCHHHHHHHHHHHhcccccCCCceecceeeecCEEEeecCCccEE--------EcCC Confidence 123589999999999999873 444333211 144677777655444333321 1111 Q ss_pred eeEEeecCCccceeeeecCCcceeEEEEeeE Q lcl|NC_015254. 252 AFGLGNGEAPVPTETDREKLKGNDILINRQH 282 (346) Q Consensus 252 Ai~~~~~~~~~~vE~dRd~~~g~~~l~~r~~ 282 (346) .++-++|.. T Consensus 310 ----------------------~~v~~~~~~ 318 (318) T protein:vir:27 310 ----------------------QRFWYQRIT 318 (318) T ss_pred ----------------------CeeeeeecC Confidence 111111111 No 164 >protein:vir:101291 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:1591 # MgeName: phiNM3 # Cross-refs: genbank:acc:YP_908831;genbank:gi:118725095;genbank:GeneID:4555862 Probab=97.01 E-value=0.00021 Score=40.73 Aligned_cols=291 Identities=11% Similarity=0.016 Sum_probs=121.8 Q ss_pred Cccceecceee-ec---CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 
1 MIKKLRMNLQK-FA---AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~q~-~~---a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) ....+.-+-+. |. .+. +.-...++|+-+..-+.+.+.+.+.+.+- .. ....+|. ..+|.-... + T Consensus 61 ~~~~lt~~e~~~~~~~~~~~-~~~gg~lvP~~~~~~I~~~l~~~s~i~~~------~~---v~~~~~~-~~i~~~~~~-~ 128 (381) T protein:vir:10 61 SAQSLSANQRSFFMDINKNV-NYKEEKLLPEETIDRIFEDLTTNHPLLAD------LG---IKNAGLR-LKFLKSETS-G 128 (381) T ss_pred CcccccHHHHHHHHHHhccc-CCCCceecCHHHHHHHHHHHHhhccceeh------ee---eEecCcc-eEEEEecCC-c Confidence 11111111111 21 111 11234578888877777777766655331 11 1122343 467765432 3 Q ss_pred cccccCCCccccchh-hcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhh Q lcl|NC_015254. 77 EDEILDDGEGALTPG-NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL-----NGI 150 (346) Q Consensus 77 ~ae~~~dg~~~it~~-~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L-----~G~ 150 (346) .+.-+.++. .++.+ +.+-++-.-..++.+.-..++..--.-+.-|..+.+.+++++.+++..+..+|.-- .|+ T Consensus 129 ~a~w~~e~~-~~~~~~~~~f~~i~l~~~kl~~~~~is~elL~Ds~~~ie~~i~~~la~~~a~~~~~a~i~G~G~~qP~Gi 207 (381) T protein:vir:10 129 VAVWGKIYG-EIKGQLDAAFSEETAIQNKLTAFVVLPKDLNDFGPAWIERFVRVQIEEAFAVALETAFLKGTGKDQPIGL 207 (381) T ss_pred ceeeecccc-cccccccccceeeeecceeEEeechhhHHHhhcCHHHHHHHHHHHHHHHHHHHhhheeEeccCCCCceee Confidence 333333332 23221 11212222222333333344444333344577778999999999998887554310 011 Q ss_pred hhhhh----hhhcce-e-eec--cccccccccHHHHHHHHHHhC---c----cccCceEEEEchHHHHHHHhhhhhhhcc Q lcl|NC_015254. 151 TASGA----LDSNKL-D-VST--ETGDDSYFTGDTFLSATYKLG---D----AEGKLTGIAMHSQTEMNLRKQGLIEFML 215 (346) Q Consensus 151 ~~~~~----~~~~~~-d-is~--~~~~~~~~~~~~l~~A~~~~G---D----~~~~~~~ivmhS~~~~~L~~~~li~~~~ 215 (346) +.... ....+. + .+. .+..+....++.|.+....+. . ....-..|+||+.++.+|++.... 
+ T Consensus 208 l~~~~~~~~~~~g~~~~~~~~~t~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~l~~~~~~---~ 284 (381) T protein:vir:10 208 NRQVQKGVSVTEGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYTH---L 284 (381) T ss_pred eeccCcccccccccccccccccccccccchhhHHHHHHHHHhhccccccccccccCceEEEEccccHHhhcccccc---C Confidence 11000 000000 0 000 001111112333443333332 1 123345789999999998765421 1 Q ss_pred cccCceeeEEe--ceEEEEeCCCccCCCceEEEEEcCC-eeEEeecCCccceeeeecC--CcceeEEEEeeEEeeeeeee Q lcl|NC_015254. 216 DSDNKKFPTYM--GKRVIVDDGLPAKDGVYTSYIFGEG-AFGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHPRGI 290 (346) Q Consensus 216 ~s~~~~i~~~~--G~~VVvdD~~p~~~g~ytt~l~~~G-Ai~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~~G~ 290 (346) .+ +|.+.+.+ |.+|+.++.||.+. ++|+.- -..+.+.. .+.++..++. ..+++.+.+..++-..| T Consensus 285 ~~-~G~~v~~l~~g~~vv~s~~~p~~~-----iifgDfs~Y~i~~r~-~~~i~~~~~~~~~~d~~~f~a~~r~dg~~--- 354 (381) T protein:vir:10 285 NA-NGVYVTALPFNLNVIESTVQEAGK-----VLTYVKGLYDGYLAG-GINVQKFKETLALDDMDLYTAKQFAYGKA--- 354 (381) T ss_pred CC-CCceeecCCCCceEEecCCCCcCc-----EEEEecccEEEEEec-ccEEEeechhHhhcCCeEEEEEEEEcCEE--- Confidence 12 34333333 67799999999653 333321 12232222 2333333322 34555555555543322 Q ss_pred eeccccccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCC Q lcl|NC_015254. 
291 AWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPE 343 (346) Q Consensus 291 s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~ 343 (346) .+++++.+..++.+-+.+.++..+-.- T Consensus 355 --------------------------~~~~A~~v~~l~~~~~~~~~~~~~~~~ 381 (381) T protein:vir:10 355 --------------------------KDNKVAAVWKLDLKGHKPALEGTEETL 381 (381) T ss_pred --------------------------ecCceEEEEEEEecCCCcCcccccccC Confidence 233333333333332222221111111 No 165 >protein:vir:9509 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:170 # MgeName: phiN315 # Cross-refs: genbank:acc:NP_835556;genbank:gi:30043951;genbank:GeneID:1260537 Probab=97.01 E-value=0.00021 Score=40.73 Aligned_cols=291 Identities=11% Similarity=0.016 Sum_probs=121.8 Q ss_pred Cccceecceee-ec---CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRMNLQK-FA---AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~q~-~~---a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) ....+.-+-+. |. .+. +.-...++|+-+..-+.+.+.+.+.+.+- .. ....+|. ..+|.-... + T Consensus 61 ~~~~lt~~e~~~~~~~~~~~-~~~gg~lvP~~~~~~I~~~l~~~s~i~~~------~~---v~~~~~~-~~i~~~~~~-~ 128 (381) T protein:vir:95 61 SAQSLSANQRSFFMDINKNV-NYKEEKLLPEETIDRIFEDLTTNHPLLAD------LG---IKNAGLR-LKFLKSETS-G 128 (381) T ss_pred CcccccHHHHHHHHHHhccc-CCCCceecCHHHHHHHHHHHHhhccceeh------ee---eEecCcc-eEEEEecCC-c Confidence 11111111111 21 111 11234578888877777777766655331 11 1122343 467765432 3 Q ss_pred cccccCCCccccchh-hcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhh Q lcl|NC_015254. 77 EDEILDDGEGALTPG-NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL-----NGI 150 (346) Q Consensus 77 ~ae~~~dg~~~it~~-~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L-----~G~ 150 (346) .+.-+.++. 
.++.+ +.+-++-.-..++.+.-..++..--.-+.-|..+.+.+++++.+++..+..+|.-- .|+ T Consensus 129 ~a~w~~e~~-~~~~~~~~~f~~i~l~~~kl~~~~~is~elL~Ds~~~ie~~i~~~la~~~a~~~~~a~i~G~G~~qP~Gi 207 (381) T protein:vir:95 129 VAVWGKIYG-EIKGQLDAAFSEETAIQNKLTAFVVLPKDLNDFGPAWIERFVRVQIEEAFAVALETAFLKGTGKDQPIGL 207 (381) T ss_pred ceeeecccc-cccccccccceeeeecceeEEeechhhHHHhhcCHHHHHHHHHHHHHHHHHHHhhheeEeccCCCCceee Confidence 333333332 23221 11212222222333333344444333344577778999999999998887554310 011 Q ss_pred hhhhh----hhhcce-e-eec--cccccccccHHHHHHHHHHhC---c----cccCceEEEEchHHHHHHHhhhhhhhcc Q lcl|NC_015254. 151 TASGA----LDSNKL-D-VST--ETGDDSYFTGDTFLSATYKLG---D----AEGKLTGIAMHSQTEMNLRKQGLIEFML 215 (346) Q Consensus 151 ~~~~~----~~~~~~-d-is~--~~~~~~~~~~~~l~~A~~~~G---D----~~~~~~~ivmhS~~~~~L~~~~li~~~~ 215 (346) +.... ....+. + .+. .+..+....++.|.+....+. . ....-..|+||+.++.+|++.... + T Consensus 208 l~~~~~~~~~~~g~~~~~~~~~t~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~l~~~~~~---~ 284 (381) T protein:vir:95 208 NRQVQKGVSVTEGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYTH---L 284 (381) T ss_pred eeccCcccccccccccccccccccccccchhhHHHHHHHHHhhccccccccccccCceEEEEccccHHhhcccccc---C Confidence 11000 000000 0 000 001111112333443333332 1 123345789999999998765421 1 Q ss_pred cccCceeeEEe--ceEEEEeCCCccCCCceEEEEEcCC-eeEEeecCCccceeeeecC--CcceeEEEEeeEEeeeeeee Q lcl|NC_015254. 216 DSDNKKFPTYM--GKRVIVDDGLPAKDGVYTSYIFGEG-AFGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHPRGI 290 (346) Q Consensus 216 ~s~~~~i~~~~--G~~VVvdD~~p~~~g~ytt~l~~~G-Ai~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~~G~ 290 (346) .+ +|.+.+.+ |.+|+.++.||.+. ++|+.- -..+.+.. .+.++..++. 
..+++.+.+..++-..| T Consensus 285 ~~-~G~~v~~l~~g~~vv~s~~~p~~~-----iifgDfs~Y~i~~r~-~~~i~~~~~~~~~~d~~~f~a~~r~dg~~--- 354 (381) T protein:vir:95 285 NA-NGVYVTALPFNLNVIESTVQEAGK-----VLTYVKGLYDGYLAG-GINVQKFKETLALDDMDLYTAKQFAYGKA--- 354 (381) T ss_pred CC-CCceeecCCCCceEEecCCCCcCc-----EEEEecccEEEEEec-ccEEEeechhHhhcCCeEEEEEEEEcCEE--- Confidence 12 34333333 67799999999653 333321 12232222 2333333322 34555555555543322 Q ss_pred eeccccccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCC Q lcl|NC_015254. 291 AWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPE 343 (346) Q Consensus 291 s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~ 343 (346) .+++++.+..++.+-+.+.++..+-.- T Consensus 355 --------------------------~~~~A~~v~~l~~~~~~~~~~~~~~~~ 381 (381) T protein:vir:95 355 --------------------------KDNKVAAVWKLDLKGHKPALEGTEETL 381 (381) T ss_pred --------------------------ecCceEEEEEEEecCCCcCcccccccC Confidence 233333333333332222221111111 No 166 >protein:vir:104439 Length: 404 # NCBI annotation: putative virion structural protein # Family: family:all:974 # MgeID: mge:1471 # MgeName: 86 # Cross-refs: genbank:acc:YP_794063;genbank:gi:116222008;genbank:GeneID:4397504 Probab=96.87 E-value=9.7e-05 Score=42.59 Aligned_cols=294 Identities=12% Similarity=0.116 Sum_probs=142.5 Q ss_pred Ccc------ceecc---eeeecCCceeeeeeccchHHHHHHHhhHhHH---HHhHhhccccccchhHHHHhhCCCcEEEe Q lcl|NC_015254. 1 MIK------KLRMN---LQKFAAGKNTRIADVIVPEVFNKYVTERTAE---SSALLQSGIISNDKDLDELAKSGGNMINM 68 (346) Q Consensus 1 ~~~------~~~~~---~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~---~~~~~qSgi~~~~~~~~~l~~~~G~ti~~ 68 (346) |-+ +++.- |..|..+.. ..+ .|..-+...... 
.-.+.+.+-=.+-..+..|-.+.|+.|++ T Consensus 1 ~~~~~~~~a~~~~~~~lft~~~~~~~--~~~-----~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K~aGd~vtf 73 (404) T protein:vir:10 1 MTTVTSAQANKLYQVALFTAANRNRS--MVN-----ILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNKQAGDEVTF 73 (404) T ss_pred CCCcCCcchhhhHHHHHHHHHhcCCh--hHh-----hhhhhhhhhhhhccchhhccCCCCCccEEEeecCCCCCCcEEEE Confidence 211 11000 011111111 011 111111111000 11112332212222333455678999999 Q ss_pred cccccCCCcccccCCCccccc--hhhcccceeEEEEEeecCcceec-hHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 69 PFWQDLTGEDEILDDGEGALT--PGNISAAKDIARLHMRGKAWRTN-DLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIA 145 (346) Q Consensus 69 P~~~~l~g~ae~~~dg~~~it--~~~lt~~~~~a~~~~~~k~~~~t-D~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla 145 (346) +.-..|.|+- +.. + +.++ -+.|+..++..++=....++... .+++..+.=|...++...++.||++..+..++- T Consensus 74 ~L~~~L~g~g-v~G-d-~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~ 150 (404) T protein:vir:10 74 SIMHKLSKRP-TMG-D-ERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIV 150 (404) T ss_pred eEeeecccCC-ccc-C-ceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 9999997652 321 1 1222 26788888988888888876444 344456666888899999999999999999999 Q ss_pred HHHhhhhhhh-------hhhc-------ceeeec----------------cccccccccHHHHHHHHHHh---------- Q lcl|NC_015254. 146 SLNGITASGA-------LDSN-------KLDVST----------------ETGDDSYFTGDTFLSATYKL---------- 185 (346) Q Consensus 146 ~L~G~~~~~~-------~~~~-------~~dis~----------------~~~~~~~~~~~~l~~A~~~~---------- 185 (346) .|.|..+.-. ...+ ..++.. ...++-.|+.+.+.++.... 
T Consensus 151 ~laG~rg~~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv 230 (404) T protein:vir:10 151 HLAGARGDFVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPV 230 (404) T ss_pred HHhccccccccccceeeccccccccceeecccCCCCCCcEEeccCccchhhhhhcccccHHHHHHHHHHHHHhCCCCcce Confidence 9988765200 0000 001111 11122357777777765554 Q ss_pred ---Cccc---cCceEEEEchHHHHHHHhhh----hhhhcccc-------c----CceeeEEeceEEEEeCCCcc------ Q lcl|NC_015254. 186 ---GDAE---GKLTGIAMHSQTEMNLRKQG----LIEFMLDS-------D----NKKFPTYMGKRVIVDDGLPA------ 238 (346) Q Consensus 186 ---GD~~---~~~~~ivmhS~~~~~L~~~~----li~~~~~s-------~----~~~i~~~~G~~VVvdD~~p~------ 238 (346) ||+. ....+++|||..+++|+.+- +.+..+.. . .|.++.|+|+.|.---.+|. T Consensus 231 ~~~g~~~~~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~G~~gm~ngvii~~~~~~~Irf~~g~ 310 (404) T protein:vir:10 231 RLSGDELHGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFKGECAMWRNILVRKYAGMPIRFYQGS 310 (404) T ss_pred EeccccccCccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCceecCeeEEcCEEEEecCCceeeecccc Confidence 4331 23589999999999999982 44443321 1 14568899977753211210 Q ss_pred -----------CCCc-------eEEEEEcCCeeEEeecCC----ccceeeeecCCcceeEEEEeeEEeeeeeeeeecccc Q lcl|NC_015254. 239 -----------KDGV-------YTSYIFGEGAFGLGNGEA----PVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKS 296 (346) Q Consensus 239 -----------~~g~-------ytt~l~~~GAi~~~~~~~----~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~ 296 (346) ..+. -..+++|.=|+++..++. +--+|-..|-.....+..... +.+.=..|.++. T Consensus 311 ~~~~~~n~~~a~~~~~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~~~i~~~~i---~G~kK~rF~~~~ 387 (404) T protein:vir:10 311 KVLVSENNLTATTKEVAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNRTEIAISWI---NGLKKIRFPEKS 387 (404) T ss_pred eeeecCCccccccccccccccchhheeecceeEEEEeeccCCCCceeEeeccccCchhhhhhHHH---hhhhhccccCCC Confidence 0111 134777776665554442 222443333322222211111 111111121110 Q ss_pred ccC------CCCChHHh Q lcl|NC_015254. 
297 VAG------HSPTNTEI 307 (346) Q Consensus 297 ~~~------~sPt~a~L 307 (346) -.+ .=||-+-| T Consensus 388 g~~~DfGvi~idta~~~ 404 (404) T protein:vir:10 388 GKMQDHGVIAVDTAVKL 404 (404) T ss_pred CceeeEEEEEecccccC Confidence 000 01222222 No 167 >protein:vir:819 Length: 404 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:16 # MgeName: VT2-Sa # Cross-refs: genbank:acc:NP_050552;genbank:gi:9633449;genbank:GeneID:1262254 Probab=96.87 E-value=9.7e-05 Score=42.59 Aligned_cols=294 Identities=12% Similarity=0.116 Sum_probs=142.5 Q ss_pred Ccc------ceecc---eeeecCCceeeeeeccchHHHHHHHhhHhHH---HHhHhhccccccchhHHHHhhCCCcEEEe Q lcl|NC_015254. 1 MIK------KLRMN---LQKFAAGKNTRIADVIVPEVFNKYVTERTAE---SSALLQSGIISNDKDLDELAKSGGNMINM 68 (346) Q Consensus 1 ~~~------~~~~~---~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~---~~~~~qSgi~~~~~~~~~l~~~~G~ti~~ 68 (346) |-+ +++.- |..|..+.. ..+ .|..-+...... .-.+.+.+-=.+-..+..|-.+.|+.|++ T Consensus 1 ~~~~~~~~a~~~~~~~lft~~~~~~~--~~~-----~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K~aGd~vtf 73 (404) T protein:vir:81 1 MTTVTSAQANKLYQVALFTAANRNRS--MVN-----ILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNKQAGDEVTF 73 (404) T ss_pred CCCcCCcchhhhHHHHHHHHHhcCCh--hHh-----hhhhhhhhhhhhccchhhccCCCCCccEEEeecCCCCCCcEEEE Confidence 211 11000 011111111 011 111111111000 11112332212222333455678999999 Q ss_pred cccccCCCcccccCCCccccc--hhhcccceeEEEEEeecCcceec-hHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 69 PFWQDLTGEDEILDDGEGALT--PGNISAAKDIARLHMRGKAWRTN-DLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIA 145 (346) Q Consensus 69 P~~~~l~g~ae~~~dg~~~it--~~~lt~~~~~a~~~~~~k~~~~t-D~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla 145 (346) +.-..|.|+- +.. + +.++ -+.|+..++..++=....++... 
.+++..+.=|...++...++.||++..+..++- T Consensus 74 ~L~~~L~g~g-v~G-d-~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~ 150 (404) T protein:vir:81 74 SIMHKLSKRP-TMG-D-ERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIV 150 (404) T ss_pred eEeeecccCC-ccc-C-ceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 9999997652 321 1 1222 26788888988888888876444 344456666888899999999999999999999 Q ss_pred HHHhhhhhhh-------hhhc-------ceeeec----------------cccccccccHHHHHHHHHHh---------- Q lcl|NC_015254. 146 SLNGITASGA-------LDSN-------KLDVST----------------ETGDDSYFTGDTFLSATYKL---------- 185 (346) Q Consensus 146 ~L~G~~~~~~-------~~~~-------~~dis~----------------~~~~~~~~~~~~l~~A~~~~---------- 185 (346) .|.|..+.-. ...+ ..++.. ...++-.|+.+.+.++.... T Consensus 151 ~laG~rg~~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv 230 (404) T protein:vir:81 151 HLAGARGDFVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPV 230 (404) T ss_pred HHhccccccccccceeeccccccccceeecccCCCCCCcEEeccCccchhhhhhcccccHHHHHHHHHHHHHhCCCCcce Confidence 9988765200 0000 001111 11122357777777765554 Q ss_pred ---Cccc---cCceEEEEchHHHHHHHhhh----hhhhcccc-------c----CceeeEEeceEEEEeCCCcc------ Q lcl|NC_015254. 186 ---GDAE---GKLTGIAMHSQTEMNLRKQG----LIEFMLDS-------D----NKKFPTYMGKRVIVDDGLPA------ 238 (346) Q Consensus 186 ---GD~~---~~~~~ivmhS~~~~~L~~~~----li~~~~~s-------~----~~~i~~~~G~~VVvdD~~p~------ 238 (346) ||+. ....+++|||..+++|+.+- +.+..+.. . .|.++.|+|+.|.---.+|. 
T Consensus 231 ~~~g~~~~~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~G~~gm~ngvii~~~~~~~Irf~~g~ 310 (404) T protein:vir:81 231 RLSGDELHGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFKGECAMWRNILVRKYAGMPIRFYQGS 310 (404) T ss_pred EeccccccCccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCceecCeeEEcCEEEEecCCceeeecccc Confidence 4331 23589999999999999982 44443321 1 14568899977753211210 Q ss_pred -----------CCCc-------eEEEEEcCCeeEEeecCC----ccceeeeecCCcceeEEEEeeEEeeeeeeeeecccc Q lcl|NC_015254. 239 -----------KDGV-------YTSYIFGEGAFGLGNGEA----PVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKS 296 (346) Q Consensus 239 -----------~~g~-------ytt~l~~~GAi~~~~~~~----~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~ 296 (346) ..+. -..+++|.=|+++..++. +--+|-..|-.....+..... +.+.=..|.++. T Consensus 311 ~~~~~~n~~~a~~~~~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~~~i~~~~i---~G~kK~rF~~~~ 387 (404) T protein:vir:81 311 KVLVSENNLTATTKEVAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNRTEIAISWI---NGLKKIRFPEKS 387 (404) T ss_pred eeeecCCccccccccccccccchhheeecceeEEEEeeccCCCCceeEeeccccCchhhhhhHHH---hhhhhccccCCC Confidence 0111 134777776665554442 222443333322222211111 111111121110 Q ss_pred ccC------CCCChHHh Q lcl|NC_015254. 297 VAG------HSPTNTEI 307 (346) Q Consensus 297 ~~~------~sPt~a~L 307 (346) -.+ .=||-+-| T Consensus 388 g~~~DfGvi~idta~~~ 404 (404) T protein:vir:81 388 GKMQDHGVIAVDTAVKL 404 (404) T ss_pred CceeeEEEEEecccccC Confidence 000 01222222 No 168 >protein:vir:3298 Length: 404 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:66 # MgeName: 933W # Cross-refs: genbank:acc:NP_049514;genbank:gi:9632520;genbank:GeneID:1262006 Probab=96.87 E-value=9.7e-05 Score=42.59 Aligned_cols=294 Identities=12% Similarity=0.116 Sum_probs=142.5 Q ss_pred Ccc------ceecc---eeeecCCceeeeeeccchHHHHHHHhhHhHH---HHhHhhccccccchhHHHHhhCCCcEEEe Q lcl|NC_015254. 
1 MIK------KLRMN---LQKFAAGKNTRIADVIVPEVFNKYVTERTAE---SSALLQSGIISNDKDLDELAKSGGNMINM 68 (346) Q Consensus 1 ~~~------~~~~~---~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~---~~~~~qSgi~~~~~~~~~l~~~~G~ti~~ 68 (346) |-+ +++.- |..|..+.. ..+ .|..-+...... .-.+.+.+-=.+-..+..|-.+.|+.|++ T Consensus 1 ~~~~~~~~a~~~~~~~lft~~~~~~~--~~~-----~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K~aGd~vtf 73 (404) T protein:vir:32 1 MTTVTSAQANKLYQVALFTAANRNRS--MVN-----ILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNKQAGDEVTF 73 (404) T ss_pred CCCcCCcchhhhHHHHHHHHHhcCCh--hHh-----hhhhhhhhhhhhccchhhccCCCCCccEEEeecCCCCCCcEEEE Confidence 211 11000 011111111 011 111111111000 11112332212222333455678999999 Q ss_pred cccccCCCcccccCCCccccc--hhhcccceeEEEEEeecCcceec-hHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 69 PFWQDLTGEDEILDDGEGALT--PGNISAAKDIARLHMRGKAWRTN-DLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIA 145 (346) Q Consensus 69 P~~~~l~g~ae~~~dg~~~it--~~~lt~~~~~a~~~~~~k~~~~t-D~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla 145 (346) +.-..|.|+- +.. + +.++ -+.|+..++..++=....++... .+++..+.=|...++...++.||++..+..++- T Consensus 74 ~L~~~L~g~g-v~G-d-~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~ 150 (404) T protein:vir:32 74 SIMHKLSKRP-TMG-D-ERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIV 150 (404) T ss_pred eEeeecccCC-ccc-C-ceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 9999997652 321 1 1222 26788888988888888876444 344456666888899999999999999999999 Q ss_pred HHHhhhhhhh-------hhhc-------ceeeec----------------cccccccccHHHHHHHHHHh---------- Q lcl|NC_015254. 146 SLNGITASGA-------LDSN-------KLDVST----------------ETGDDSYFTGDTFLSATYKL---------- 185 (346) Q Consensus 146 ~L~G~~~~~~-------~~~~-------~~dis~----------------~~~~~~~~~~~~l~~A~~~~---------- 185 (346) .|.|..+.-. ...+ ..++.. ...++-.|+.+.+.++.... 
T Consensus 151 ~laG~rg~~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv 230 (404) T protein:vir:32 151 HLAGARGDFVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPV 230 (404) T ss_pred HHhccccccccccceeeccccccccceeecccCCCCCCcEEeccCccchhhhhhcccccHHHHHHHHHHHHHhCCCCcce Confidence 9988765200 0000 001111 11122357777777765554 Q ss_pred ---Cccc---cCceEEEEchHHHHHHHhhh----hhhhcccc-------c----CceeeEEeceEEEEeCCCcc------ Q lcl|NC_015254. 186 ---GDAE---GKLTGIAMHSQTEMNLRKQG----LIEFMLDS-------D----NKKFPTYMGKRVIVDDGLPA------ 238 (346) Q Consensus 186 ---GD~~---~~~~~ivmhS~~~~~L~~~~----li~~~~~s-------~----~~~i~~~~G~~VVvdD~~p~------ 238 (346) ||+. ....+++|||..+++|+.+- +.+..+.. . .|.++.|+|+.|.---.+|. T Consensus 231 ~~~g~~~~~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~G~~gm~ngvii~~~~~~~Irf~~g~ 310 (404) T protein:vir:32 231 RLSGDELHGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFKGECAMWRNILVRKYAGMPIRFYQGS 310 (404) T ss_pred EeccccccCccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCceecCeeEEcCEEEEecCCceeeecccc Confidence 4331 23589999999999999982 44443321 1 14568899977753211210 Q ss_pred -----------CCCc-------eEEEEEcCCeeEEeecCC----ccceeeeecCCcceeEEEEeeEEeeeeeeeeecccc Q lcl|NC_015254. 239 -----------KDGV-------YTSYIFGEGAFGLGNGEA----PVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKS 296 (346) Q Consensus 239 -----------~~g~-------ytt~l~~~GAi~~~~~~~----~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~ 296 (346) ..+. -..+++|.=|+++..++. +--+|-..|-.....+..... +.+.=..|.++. T Consensus 311 ~~~~~~n~~~a~~~~~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~~~i~~~~i---~G~kK~rF~~~~ 387 (404) T protein:vir:32 311 KVLVSENNLTATTKEVAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNRTEIAISWI---NGLKKIRFPEKS 387 (404) T ss_pred eeeecCCccccccccccccccchhheeecceeEEEEeeccCCCCceeEeeccccCchhhhhhHHH---hhhhhccccCCC Confidence 0111 134777776665554442 222443333322222211111 111111121110 Q ss_pred ccC------CCCChHHh Q lcl|NC_015254. 
297 VAG------HSPTNTEI 307 (346) Q Consensus 297 ~~~------~sPt~a~L 307 (346) -.+ .=||-+-| T Consensus 388 g~~~DfGvi~idta~~~ 404 (404) T protein:vir:32 388 GKMQDHGVIAVDTAVKL 404 (404) T ss_pred CceeeEEEEEecccccC Confidence 000 01222222 No 169 >protein:vir:10123 Length: 404 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:180 # MgeName: Stx2 converting bacteriophage II # Cross-refs: genbank:acc:NP_859253;genbank:gi:32171009;genbank:GeneID:2653345 Probab=96.87 E-value=9.7e-05 Score=42.59 Aligned_cols=294 Identities=12% Similarity=0.116 Sum_probs=142.5 Q ss_pred Ccc------ceecc---eeeecCCceeeeeeccchHHHHHHHhhHhHH---HHhHhhccccccchhHHHHhhCCCcEEEe Q lcl|NC_015254. 1 MIK------KLRMN---LQKFAAGKNTRIADVIVPEVFNKYVTERTAE---SSALLQSGIISNDKDLDELAKSGGNMINM 68 (346) Q Consensus 1 ~~~------~~~~~---~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~---~~~~~qSgi~~~~~~~~~l~~~~G~ti~~ 68 (346) |-+ +++.- |..|..+.. ..+ .|..-+...... .-.+.+.+-=.+-..+..|-.+.|+.|++ T Consensus 1 ~~~~~~~~a~~~~~~~lft~~~~~~~--~~~-----~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K~aGd~vtf 73 (404) T protein:vir:10 1 MTTVTSAQANKLYQVALFTAANRNRS--MVN-----ILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNKQAGDEVTF 73 (404) T ss_pred CCCcCCcchhhhHHHHHHHHHhcCCh--hHh-----hhhhhhhhhhhhccchhhccCCCCCccEEEeecCCCCCCcEEEE Confidence 211 11000 011111111 011 111111111000 11112332212222333455678999999 Q ss_pred cccccCCCcccccCCCccccc--hhhcccceeEEEEEeecCcceec-hHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 69 PFWQDLTGEDEILDDGEGALT--PGNISAAKDIARLHMRGKAWRTN-DLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIA 145 (346) Q Consensus 69 P~~~~l~g~ae~~~dg~~~it--~~~lt~~~~~a~~~~~~k~~~~t-D~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla 145 (346) +.-..|.|+- +.. + +.++ -+.|+..++..++=....++... 
.+++..+.=|...++...++.||++..+..++- T Consensus 74 ~L~~~L~g~g-v~G-d-~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~ 150 (404) T protein:vir:10 74 SIMHKLSKRP-TMG-D-ERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIV 150 (404) T ss_pred eEeeecccCC-ccc-C-ceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 9999997652 321 1 1222 26788888988888888876444 344456666888899999999999999999999 Q ss_pred HHHhhhhhhh-------hhhc-------ceeeec----------------cccccccccHHHHHHHHHHh---------- Q lcl|NC_015254. 146 SLNGITASGA-------LDSN-------KLDVST----------------ETGDDSYFTGDTFLSATYKL---------- 185 (346) Q Consensus 146 ~L~G~~~~~~-------~~~~-------~~dis~----------------~~~~~~~~~~~~l~~A~~~~---------- 185 (346) .|.|..+.-. ...+ ..++.. ...++-.|+.+.+.++.... T Consensus 151 ~laG~rg~~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv 230 (404) T protein:vir:10 151 HLAGARGDFVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPV 230 (404) T ss_pred HHhccccccccccceeeccccccccceeecccCCCCCCcEEeccCccchhhhhhcccccHHHHHHHHHHHHHhCCCCcce Confidence 9988765200 0000 001111 11122357777777765554 Q ss_pred ---Cccc---cCceEEEEchHHHHHHHhhh----hhhhcccc-------c----CceeeEEeceEEEEeCCCcc------ Q lcl|NC_015254. 186 ---GDAE---GKLTGIAMHSQTEMNLRKQG----LIEFMLDS-------D----NKKFPTYMGKRVIVDDGLPA------ 238 (346) Q Consensus 186 ---GD~~---~~~~~ivmhS~~~~~L~~~~----li~~~~~s-------~----~~~i~~~~G~~VVvdD~~p~------ 238 (346) ||+. ....+++|||..+++|+.+- +.+..+.. . .|.++.|+|+.|.---.+|. 
T Consensus 231 ~~~g~~~~~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~G~~gm~ngvii~~~~~~~Irf~~g~ 310 (404) T protein:vir:10 231 RLSGDELHGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFKGECAMWRNILVRKYAGMPIRFYQGS 310 (404) T ss_pred EeccccccCccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCceecCeeEEcCEEEEecCCceeeecccc Confidence 4331 23589999999999999982 44443321 1 14568899977753211210 Q ss_pred -----------CCCc-------eEEEEEcCCeeEEeecCC----ccceeeeecCCcceeEEEEeeEEeeeeeeeeecccc Q lcl|NC_015254. 239 -----------KDGV-------YTSYIFGEGAFGLGNGEA----PVPTETDREKLKGNDILINRQHFLLHPRGIAWQEKS 296 (346) Q Consensus 239 -----------~~g~-------ytt~l~~~GAi~~~~~~~----~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~~~ 296 (346) ..+. -..+++|.=|+++..++. +--+|-..|-.....+..... +.+.=..|.++. T Consensus 311 ~~~~~~n~~~a~~~~~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~~~i~~~~i---~G~kK~rF~~~~ 387 (404) T protein:vir:10 311 KVLVSENNLTATTKEVAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNRTEIAISWI---NGLKKIRFPEKS 387 (404) T ss_pred eeeecCCccccccccccccccchhheeecceeEEEEeeccCCCCceeEeeccccCchhhhhhHHH---hhhhhccccCCC Confidence 0111 134777776665554442 222443333322222211111 111111121110 Q ss_pred ccC------CCCChHHh Q lcl|NC_015254. 297 VAG------HSPTNTEI 307 (346) Q Consensus 297 ~~~------~sPt~a~L 307 (346) -.+ .=||-+-| T Consensus 388 g~~~DfGvi~idta~~~ 404 (404) T protein:vir:10 388 GKMQDHGVIAVDTAVKL 404 (404) T ss_pred CceeeEEEEEecccccC Confidence 000 01222222 No 170 >protein:vir:78920 Length: 290 # NCBI annotation: Cps # Family: family:all:701 # MgeID: mge:1859 # MgeName: A006 # Cross-refs: genbank:acc:YP_001468846;genbank:gi:157325479;genbank:GeneID:5601917 Probab=96.49 E-value=0.00058 Score=38.34 Aligned_cols=260 Identities=10% Similarity=0.037 Sum_probs=128.5 Q ss_pred cceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCcc Q lcl|NC_015254. 
7 MNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEG 86 (346) Q Consensus 7 ~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~ 86 (346) |=+.. -+.|.+-+.+.+.+.+- ||.+... . .-..+|++|.+|.... +|-. +...+ + T Consensus 1 Main~--------------a~~~~~~Ld~~~~~~~~---t~~l~~~-~---~~~~ggktVkI~~i~~-~gl~-DY~R~-~ 56 (290) T protein:vir:78 1 MAINY--------------VDKYGKELDQKLVFGTY---TNELETP-N---LLWLDAKTFKIQTITT-TGLK-AHTRN-K 56 (290) T ss_pred CchhH--------------HHHHHHHHHHHHHhhhe---eeecccc-c---eeeccCCEEEEeeecc-Cccc-ccccC-C Confidence 22211 15566666666655442 3444322 1 1234799999998874 3432 22222 2 Q ss_pred ccchhhcccceeE-EEEEeecCcceec--hHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceee Q lcl|NC_015254. 87 ALTPGNISAAKDI-ARLHMRGKAWRTN--DLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDV 163 (346) Q Consensus 87 ~it~~~lt~~~~~-a~~~~~~k~~~~t--D~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~di 163 (346) ...-..++...+. ..-+.|+.+|.+. |...--.-......++++........+|...++.|-+..+.. +.+... T Consensus 57 g~~~g~v~~~~et~tl~qdR~~~F~vD~~DvDEt~~~~~~~nv~~ef~~~~v~PEiDayr~skla~~a~~~---~~~~~~ 133 (290) T protein:vir:78 57 GYNEGSASNTNKSYTIDFDRDVEFFVDVMDVDETGQALSAANVTKEFNSRHAGPEMDAYRFSKLATAAKTN---SNSVAE 133 (290) T ss_pred CcccCccccceeeEEeeccccceeeccccchhHHhhhhhHHHHHHHHHHHHhhhhhhHHHHHHHHhhhhcc---Cccccc Confidence 2333444433333 2344577777776 433211112344445556666667778888888774332211 111110 Q ss_pred eccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhh-cccc---c---CceeeEEeceEEEE--e- Q lcl|NC_015254. 164 STETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEF-MLDS---D---NKKFPTYMGKRVIV--D- 233 (346) Q Consensus 164 s~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~-~~~s---~---~~~i~~~~G~~VVv--d- 233 (346) .. ++.=-++.|.++..+|-+-...-..++|.|.++.-|.+...+.. +... . ++.|+.+.|++|+. 
+ T Consensus 134 -t~---t~~n~~~~i~~~~~~ldevp~~~rvl~vtp~~~~lL~~~~~f~r~~~~~~~~~~~i~~~V~~idG~~ii~vps~ 209 (290) T protein:vir:78 134 -EI---TKDNVFTKLKAAIRKVKKYGTQNLVMYVSPDVMAALELSDDFVRAINVQNIGPSSIETRITAIDGTRIVEVEAE 209 (290) T ss_pred -cc---CHHHHHHHHHHHHHHHHhcCCCCeEEEECHHHHHHHhhChhhhccccccccccccccceeeeecCcEEEEeccc Confidence 00 11113566777777774444455899999999999887755543 2111 1 35689999999875 3 Q ss_pred CCC-----------ccCCCceEEEEEcCCeeEEeecC-Cccc-eeeeecCCcceeEEEEeeEEeeeeeeee----ecccc Q lcl|NC_015254. 234 DGL-----------PAKDGVYTSYIFGEGAFGLGNGE-APVP-TETDREKLKGNDILINRQHFLLHPRGIA----WQEKS 296 (346) Q Consensus 234 D~~-----------p~~~g~ytt~l~~~GAi~~~~~~-~~~~-vE~dRd~~~g~~~l~~r~~~~~~~~G~s----~~~~~ 296 (346) +.+ +.+.++...|++.+....+.-.+ ..+- .+.+.+-.+..-.+..|..+-+-++--+ |.... T Consensus 210 ~r~~t~~~f~~G~~~~~~ak~in~ii~~~~a~i~~~K~~~~~~~~P~~~~~~d~~~~~~r~y~d~~v~~nk~~~i~~~~~ 289 (290) T protein:vir:78 210 DRFYDTFDFTDGYKPAAGAKKLNFLLVNKGSVVGGAKHASIYLHAPGSVGQGDGWLYQYRVYHDIFVLDQQKDGVIASTE 289 (290) T ss_pred chhhhhhhhcccccccCCccceeEEEEcCCceeeeeeeeEEEeeCCCCCcCcceeeeeeeeeeeeeeeccccCeeEEEee Confidence 222 33445666676655433332222 1110 1111111111234556666655554322 11122 Q ss_pred c Q lcl|NC_015254. 297 V 297 (346) Q Consensus 297 ~ 297 (346) + T Consensus 290 ~ 290 (290) T protein:vir:78 290 V 290 (290) T ss_pred C Confidence 2 No 171 >protein:vir:9643 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:173 # MgeName: 315.1 # Cross-refs: genbank:acc:NP_795405;genbank:gi:28876178;genbank:GeneID:1257724 Probab=96.32 E-value=0.00054 Score=38.48 Aligned_cols=276 Identities=9% Similarity=-0.024 Sum_probs=120.1 Q ss_pred Cccceecc----eeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRMN----LQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~----~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) .+.++.-. 
|+-+....++.-...++|+-+..-+.+.+.+.+-+.+- ..+ ...+| ...+|.-.. ++ T Consensus 63 ~~~~lt~ee~~~~~~~~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~i~~~------~~v---~~~~~-~~~i~~~~~-~~ 131 (377) T protein:vir:96 63 KNRELTAEEIKFFNDIDKNVGGKDKFKLLPEETMVQVFDDLVAEHPLLKV------INF---KNTSL-RLKALTAET-SG 131 (377) T ss_pred CCcccCHHHHHHHHHHHhcCCCCCCceecCHHHHHHHHHHHHhhhhhhhh------cee---EecCC-ceEEEEecC-Cc Confidence 22111111 11111222222234578887777777766665555432 111 11223 356775432 23 Q ss_pred cccccCCCccccchhhcccceeE-EEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhh Q lcl|NC_015254. 77 EDEILDDGEGALTPGNISAAKDI-ARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL-----NGI 150 (346) Q Consensus 77 ~ae~~~dg~~~it~~~lt~~~~~-a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L-----~G~ 150 (346) .+.-+.++. .++.+.-.+..+. -..++...-..++..--.-+.-|..+.+.+++++.+.+..+..++.-= .|+ T Consensus 132 ~a~wv~e~~-~~~~~~~~~f~~i~l~~~kl~~~~~is~~ll~ds~~~le~~i~~~l~~~~~~~~~~a~i~G~G~~~P~Gi 210 (377) T protein:vir:96 132 TAVWGDIFG-EIKGQLKQAFKEQDFSQFKLTAFVVIPKDALKFGPKWLKQFITEQLKEAIAVALELAIVKGNGLLQPVGL 210 (377) T ss_pred ceeEeeccc-ccccccCccceeEeeeeeeEEeechhhHHHhhcchhhHHHHHHHHHHHHHHHHHhhceEeccCCCcceee Confidence 443344432 3332211122222 222222222344444333455577788999999999999888665310 012 Q ss_pred hhhhhhh----h------cceeeeccccccccccHHHHHHHHHHh-------Cc----cccCceEEEEchHHHHHHHhhh Q lcl|NC_015254. 151 TASGALD----S------NKLDVSTETGDDSYFTGDTFLSATYKL-------GD----AEGKLTGIAMHSQTEMNLRKQG 209 (346) Q Consensus 151 ~~~~~~~----~------~~~dis~~~~~~~~~~~~~l~~A~~~~-------GD----~~~~~~~ivmhS~~~~~L~~~~ 209 (346) +...... . ..++.....+....++.+.+.+-...+ |. 
....-.+|+|||.+|.+++.+- T Consensus 211 l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~~~~~~ 290 (377) T protein:vir:96 211 LKDLSQPTVDQSTGRDITTYKTDKEAIADLSDLDPDTAVELLVPVMKHLSVNDKKHPLKIAGQVKLLLNPEDRWTLEAKF 290 (377) T ss_pred eeccccccccccccccccceeeccccccccccCChhHHHHHHHHHHHhhccccccccccccCceEEEEchhhHHhccccc Confidence 1111000 0 001111111222234555555433221 21 1123357999999998875322 Q ss_pred hhhhcccccCceeeEEece--EEEEeCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecC--CcceeEEEEeeEEe Q lcl|NC_015254. 210 LIEFMLDSDNKKFPTYMGK--RVIVDDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREK--LKGNDILINRQHFL 284 (346) Q Consensus 210 li~~~~~s~~~~i~~~~G~--~VVvdD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~ 284 (346) . + + ..+|...+.+|+ +|+.++.||.+. .+|+. +-..+...+ .+.++..++. ..+++.+.+..++- T Consensus 291 ~--~-~-~~~G~~~~~l~~p~~v~~s~~~p~~~-----i~fgdf~~Y~i~~r~-~~~i~~~~~~~~~~d~~~f~~~~r~d 360 (377) T protein:vir:96 291 T--S-R-NQFGEYVTVLPHGITILESLAVETGK-----AIAFVANRYDAFMAT-ASTIEEYDQTFAMEDLQLYLTKNYFY 360 (377) T ss_pred c--c-c-CCCCCceeccCCCceEEecCCCCccc-----EEEEEcCcEEEEEec-ccEEEeehhhhhhcCCeEEEEEEEEc Confidence 1 1 1 124555566665 577788898653 22221 223333322 2333333322 34556666655554 Q ss_pred eee---eeeeecccccc Q lcl|NC_015254. 285 LHP---RGIAWQEKSVA 298 (346) Q Consensus 285 ~~~---~G~s~~~~~~~ 298 (346) -.| ..+..-.-+.+ T Consensus 361 G~~~d~~a~~vl~l~~~ 377 (377) T protein:vir:96 361 GKAKDNHTAALLTLAGG 377 (377) T ss_pred CEEecCCcEEEEEEecC Confidence 333 23222221211 No 172 >protein:vir:100632 Length: 381 # NCBI annotation: 77ORF006 # Family: family:all:635 # MgeID: mge:1476 # MgeName: 77 # Cross-refs: genbank:acc:NP_958606;genbank:gi:41189521;genbank:GeneID:2743778 Probab=95.74 E-value=0.0015 Score=36.01 Aligned_cols=288 Identities=11% Similarity=0.035 Sum_probs=114.9 Q ss_pred Cccceec-----cee-eec---CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccc Q lcl|NC_015254. 
1 MIKKLRM-----NLQ-KFA---AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFW 71 (346) Q Consensus 1 ~~~~~~~-----~~q-~~~---a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~ 71 (346) ...+.+. .-+ +|. .+ ++.-....+|+-|..-+.+.+.+.+.+.+ ...+ ...+| ...+|.- T Consensus 56 ~~~~~~~~~l~~~e~~~~~~~~~~-t~~~Gg~lvP~~~~~~I~~~l~~~spir~------~a~v---~~~~~-~~~i~~~ 124 (381) T protein:vir:10 56 SSLPKSAQTLSANQRNFFMDINKS-VGYKEEKLLPEETIDRIFEDLTTNHPLLA------DLGI---KNAGL-RLKFLKS 124 (381) T ss_pred HHhcccccccCHHHHHHHHHHhhc-CCCCCceecCHHHHHHHHHHHHhhcceee------eeee---EecCc-ceEEEee Confidence 0000000 000 111 11 11112356888887777777666554433 1111 12234 3466755 Q ss_pred ccCCCcccccCCCccccchh-hcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH--- Q lcl|NC_015254. 72 QDLTGEDEILDDGEGALTPG-NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL--- 147 (346) Q Consensus 72 ~~l~g~ae~~~dg~~~it~~-~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L--- 147 (346) ... +.+.-..+. +.++.+ +.+-++-.-..++.+.-..++..--.-+.-|-.+.+.+++++.+++..+..++.-= T Consensus 125 ~~~-~~a~W~~e~-~~~~~~~~~~f~~i~l~~~kl~a~i~is~elL~Ds~~~le~~i~~~la~~~a~~~~~afi~GdG~~ 202 (381) T protein:vir:10 125 ETS-GVAVWGKIY-GEIKGQLDAAFSEETAIQNKLTAFVVLPKDLNDFGPAWIERFVRVQIEEAFAVALETAFLKGTGKD 202 (381) T ss_pred cCC-cceEEeecc-cccccccCccceeEeecceeEEeeccccHHHHhccHHHHHHHHHHHHHHHHHHHhhceeEecccCC Confidence 432 333222322 122221 11111112222233322344444333344466677999999999988887554210 Q ss_pred --Hhhhhhhhhh----hcceeeeccccccccccHHH----HH---HHHHHhCc----cccCceEEEEchHHHHHHHhhhh Q lcl|NC_015254. 148 --NGITASGALD----SNKLDVSTETGDDSYFTGDT----FL---SATYKLGD----AEGKLTGIAMHSQTEMNLRKQGL 210 (346) Q Consensus 148 --~G~~~~~~~~----~~~~dis~~~~~~~~~~~~~----l~---~A~~~~GD----~~~~~~~ivmhS~~~~~L~~~~l 210 (346) .|++...... ..+.......+.....+... +. ......+. ....-..|+||+.++.+|++... 
T Consensus 203 qP~Gil~~~~~~~~~~~g~~~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~vmn~~t~~~l~~~~~ 282 (381) T protein:vir:10 203 QPIGLNRQVQKGVSVTDGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYT 282 (381) T ss_pred CceeeeecCCccccccccccccccccccccccchhhHHHHHHHHHHhhhhhhccccccccCceEEEEchhhHHhhccccc Confidence 0111100000 00000000000000111111 11 11111111 11223568999999999876442 Q ss_pred hhhcccccCceeeEE--eceEEEEeCCCccCCCceEEEEEcCC-eeEEeecCCccceeeeecC--CcceeEEEEeeEEe- Q lcl|NC_015254. 211 IEFMLDSDNKKFPTY--MGKRVIVDDGLPAKDGVYTSYIFGEG-AFGLGNGEAPVPTETDREK--LKGNDILINRQHFL- 284 (346) Q Consensus 211 i~~~~~s~~~~i~~~--~G~~VVvdD~~p~~~g~ytt~l~~~G-Ai~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~- 284 (346) .+.++| .+-+. .|.+|+.++.||.+. ++|+.- -..+.+.. .+.++..++. ..+++.+.+..++- T Consensus 283 ---~~~~~G-~~v~~lp~g~~vv~~~~~p~~~-----i~fGDfs~Y~i~~r~-~~~i~~~~~~~~~~d~~~f~a~~r~dG 352 (381) T protein:vir:10 283 ---HLNANG-VYVTALPFNLNVIESTVQEAGK-----VLTYVKGLYDGYLAG-GINVQKFKETLALDDMDLYTAKQFAYG 352 (381) T ss_pred ---cCCCCC-ceeecCCCCceeEEcCCCCcCc-----EEEEEcccEEEEEec-ccEEEeechhhhhcCceEEEEEEEEcC Confidence 122333 22222 488899999999753 333321 12232222 2334433332 34556665555543 Q ss_pred --eeeeeeeeccccccCCCCChHHhcCCc Q lcl|NC_015254. 285 --LHPRGIAWQEKSVAGHSPTNTEIEKGN 311 (346) Q Consensus 285 --~~~~G~s~~~~~~~~~sPt~a~L~~~~ 311 (346) +|+..+..-.-+..+.-|.-++-+..- T Consensus 353 ~~~~~~A~~v~~l~~~~~~~~~~~~~~~~ 381 (381) T protein:vir:10 353 KAKDNKVAAVWKLDLKGHKPALEDTEETL 381 (381) T ss_pred EEecCCcEEEEEEeecCCccccccccccC Confidence 333333322222222233322221111 No 173 >protein:vir:78350 Length: 383 # NCBI annotation: Cps # Family: family:all:635 # MgeID: mge:1850 # MgeName: B025 # Cross-refs: genbank:acc:YP_001468644;genbank:gi:157325222;genbank:GeneID:5601696 Probab=93.95 E-value=0.0059 Score=32.82 Aligned_cols=286 Identities=13% Similarity=0.069 Sum_probs=113.5 Q ss_pred Cccceecce----eeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 
1 MIKKLRMNL----QKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~----q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) +-..+.=.- +-++.+ ++.-....+|+-+.+-|.+.+.+.+.+++. .. ....+|+ ..+|.-... + T Consensus 68 g~~~lt~~e~~~~~~~~~~-~~~~gg~lvP~~~~~~I~~~l~~~s~l~~~------~~---v~~~~~~-~~i~~~~~~-~ 135 (383) T protein:vir:78 68 TDKNITNEEIKFFNDINKE-VGYKEETLLPQTVVDEIFEDLTTEHPFLAS------IG---MRTTGLR-TKFLKSETS-G 135 (383) T ss_pred ChhhhhHHHHHHHHHHhcc-CCCCCccccCHHHHHHHHHHHHhhccceee------ee---eEecCCc-eEEEEEcCC-c Confidence 111110000 111122 222234567888877777777666555332 11 1122454 578876654 3 Q ss_pred cccccCCCccccchh-hcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhh Q lcl|NC_015254. 77 EDEILDDGEGALTPG-NISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL-----NGI 150 (346) Q Consensus 77 ~ae~~~dg~~~it~~-~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L-----~G~ 150 (346) .+.-+.++. .++.+ +.+-++-.-..++.+.-+.++..--.-+.-|-.+.+.+++++.+++..+..+|.-= .|+ T Consensus 136 ~a~w~~e~~-~~~~~~~~~f~~i~l~~~kl~~~i~is~ell~Ds~~~ie~~i~~~l~~~~a~~~~~a~i~G~G~~qP~Gi 214 (383) T protein:vir:78 136 VAVWGKIFG-EIKGQLDATFSDEESIQNKLTAFVVVPKDLEKFGPAWVKRFVVTQIEEAFAVALESAYIVGDGNDKPIGL 214 (383) T ss_pred ceEEeeccc-ccccccCcceeeEeecceeeEeeccchHHHhhccHHHHHHHHHHHHHHHHHHHHhhheEeccCCCCceee Confidence 332233321 22211 11111111122222222444444333344466778999999999999888655310 011 Q ss_pred hhhhhhhhccee-eeccccccccccHH------HHHHHHHHh-----Ccc---ccCceEEEEchHHHHHHHhhhhhhhcc Q lcl|NC_015254. 151 TASGALDSNKLD-VSTETGDDSYFTGD------TFLSATYKL-----GDA---EGKLTGIAMHSQTEMNLRKQGLIEFML 215 (346) Q Consensus 151 ~~~~~~~~~~~d-is~~~~~~~~~~~~------~l~~A~~~~-----GD~---~~~~~~ivmhS~~~~~L~~~~li~~~~ 215 (346) +........... +.....+...++.. .+..+..+. .+. ...-..|+||+..|.+++..- .. . 
T Consensus 215 l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~n~~~~~~~~~~~--~~-~ 291 (383) T protein:vir:78 215 NRKVGKGSTVVDGVYAEKAATGTLTFANPKTTVNELTDVYKYHSVKENGHPLNVAGKVTLLVNPTDAWDVKKQY--TS-L 291 (383) T ss_pred eeccCCcccccccccccccccchhhhhhhHHHHHHHHHHHhccchhcccchhhhcCceEEEEcCcchhhhccch--hc-c Confidence 110000000000 00000000111111 112222211 111 112235888988776554221 11 1 Q ss_pred cccCceeeEEec--eEEEEeCCCccCCCceEEEEEcC-CeeEEeecCCccceeeeecC--CcceeEEEEeeEEeeeeeee Q lcl|NC_015254. 216 DSDNKKFPTYMG--KRVIVDDGLPAKDGVYTSYIFGE-GAFGLGNGEAPVPTETDREK--LKGNDILINRQHFLLHPRGI 290 (346) Q Consensus 216 ~s~~~~i~~~~G--~~VVvdD~~p~~~g~ytt~l~~~-GAi~~~~~~~~~~vE~dRd~--~~g~~~l~~r~~~~~~~~G~ 290 (346) ..+|...+.+| ++|+.++.||... .+|+. ....+...+ .+.++..++. ..+++.+....++-- T Consensus 292 -~~~G~~~t~l~~~~~iv~s~~~p~~~-----iifgdfs~Y~i~~r~-~~~i~~~~~~~f~~d~~~f~~~~r~dG----- 359 (383) T protein:vir:78 292 -NANGVYVTALPFNLNIIESLFVPEKK-----AISYVAERYDALIGG-PLDIGTYDQTLAIEDLNLYAAKQFAYG----- 359 (383) T ss_pred -CCCCceeeecCCCceEEecCCCCccc-----EEEeeccceEEEecc-cceEEecchhhhhcCceEEEEEEEEcC----- Confidence 12344556655 4577888898653 23322 112232222 2223322221 223333333332221 Q ss_pred eeccccccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCCCC Q lcl|NC_015254. 291 AWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAPEG 344 (346) Q Consensus 291 s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~~~ 344 (346) ++.+++++.+..++. 
.+++.-|.| T Consensus 360 ------------------------~~~~~~A~~vl~~~~------~~~~~~~~~ 383 (383) T protein:vir:78 360 ------------------------KAKDDKAAAVWTLNI------NPAEQTPEG 383 (383) T ss_pred ------------------------EEecCCeEEEEEEEe------cCCCCCCCC Confidence 344555555444432 222333333 No 174 >protein:vir:98635 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:1601 # MgeName: phi3396 # Cross-refs: genbank:acc:YP_001039923;genbank:gi:126011098;genbank:GeneID:4818471 Probab=93.62 E-value=0.0057 Score=32.90 Aligned_cols=273 Identities=10% Similarity=0.005 Sum_probs=112.2 Q ss_pred Cccceecceeee----cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRMNLQKF----AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~q~~----~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) .+.++.-.-..| ....++.-....+|+-+..-+.+.+.+.+.+++- .. ....+|+ ..+|.-.. ++ T Consensus 63 ~~~~lt~ee~~~~~~~~~~~~~~~gg~~vP~~~~~~I~~~l~~~s~i~~~------~~---v~~~~~~-~~~~~~~~-~~ 131 (377) T protein:vir:98 63 KNRELTAEEIKFFNDIDKNVGGKDKFKLLPEETMVQVFDDLVAEHPLLKV------IN---FKNTSLR-LKALTAET-SG 131 (377) T ss_pred CCcccCHHHHHHHHHHHhccCCCCCccccCHHHHHHHHHHHHHhhhhhhh------ee---eEecCcc-eEEEEecC-Cc Confidence 222211111111 1112222235678888888787777666555331 11 1122354 57886543 23 Q ss_pred cccccCCCccccchhhcccceeE-EEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-----Hhh Q lcl|NC_015254. 77 EDEILDDGEGALTPGNISAAKDI-ARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL-----NGI 150 (346) Q Consensus 77 ~ae~~~dg~~~it~~~lt~~~~~-a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L-----~G~ 150 (346) .+.-+.++. .++.+.-.+..+. 
-..++...-..++..--.-+.-|..+.+.+++++.+++..+..++.-= +|+ T Consensus 132 ~a~w~~e~~-~~~~~~~~~f~~i~l~~~kl~a~~~is~elL~ds~~~ie~~i~~~la~~~a~~~~~a~i~G~G~~qP~Gi 210 (377) T protein:vir:98 132 TAVWGDIFG-EIKGQLKQAFKEQDFSQFKLTAFVVIPKDALKFGPKWIKQFITEQLKEAIAVALELAIVKGDGLLQPVGL 210 (377) T ss_pred ceeEeeccc-ccCcccCccceeEeecceeEEeeecccHHhhhccHhHHHHHHHHHHHHHHHHHHhhceEeccCCCcceee Confidence 333334432 2332111111111 112222222344444333455577788999999999998887555310 011 Q ss_pred hhhhhhhhcceeeecccccc-ccccHHHHH-----------------------HHHHHhCccccCceEEEEchHHHHHHH Q lcl|NC_015254. 151 TASGALDSNKLDVSTETGDD-SYFTGDTFL-----------------------SATYKLGDAEGKLTGIAMHSQTEMNLR 206 (346) Q Consensus 151 ~~~~~~~~~~~dis~~~~~~-~~~~~~~l~-----------------------~A~~~~GD~~~~~~~ivmhS~~~~~L~ 206 (346) +.... ..+........+. .....+.+. .+.+++-|..++ .+|+|||..|..+. T Consensus 211 l~~~~--~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~a~~~m~~~t~~~~~klkd~~G~-~i~~~n~~~~~~~~ 287 (377) T protein:vir:98 211 LKDLS--QPTVDQSTGRDITTYKTDKEAIADLSDLTPDNAPKKLVPVMKHLSVNDKKRPLKIAGQ-VKLILNPEDRWALE 287 (377) T ss_pred eeccc--ccccccccccccccccchhhhHhhhhhhchhHHHHHHHHHHHHHHHHHHhhhhccCCc-eEEEecccchhhcc Confidence 11000 0000000000000 000011111 122333333333 34666666665542 Q ss_pred hhhhhhhcccccCceeeEEeceE--EEEeCCCccCCCceEEEEEcCC-eeEEeecCCccceeeeecC--CcceeEEEEee Q lcl|NC_015254. 207 KQGLIEFMLDSDNKKFPTYMGKR--VIVDDGLPAKDGVYTSYIFGEG-AFGLGNGEAPVPTETDREK--LKGNDILINRQ 281 (346) Q Consensus 207 ~~~li~~~~~s~~~~i~~~~G~~--VVvdD~~p~~~g~ytt~l~~~G-Ai~~~~~~~~~~vE~dRd~--~~g~~~l~~r~ 281 (346) -.. .....+|.+.+.+|++ |+.++.||... .+|+.- ...+...+ .+.++..++. ..+++.+.++. 
T Consensus 288 p~~----~~~~~~G~~~t~lg~p~~vv~s~~~p~~~-----i~fgdf~~Y~i~~r~-~~~i~~~~~~~~~~d~~~f~~~~ 357 (377) T protein:vir:98 288 AQF----TSRNQFGEYVTVLPHGITILESLAVETGK-----AIAFVANRYDAFMAT-ASTIEEYDQTFAMEDLQLYLTKN 357 (377) T ss_pred ccc----cccCCCCccccccCCCceEEecCCCCccc-----EEEEEecceeEEeec-ceEEEeechhhhhcCceEEEEEE Confidence 111 1111245566778765 67788888653 222221 12222222 2333333322 34666666665 Q ss_pred EEeeee---eeeeeccccccC Q lcl|NC_015254. 282 HFLLHP---RGIAWQEKSVAG 299 (346) Q Consensus 282 ~~~~~~---~G~s~~~~~~~~ 299 (346) ++--.| ..+.--+-+ +| T Consensus 358 r~dg~~~~~~a~~vl~i~-~~ 377 (377) T protein:vir:98 358 YFYGKAKDNHTAALLTLA-GG 377 (377) T ss_pred EEcCEEeccCcEEEEEEe-cC Confidence 554333 333322211 12 No 175 >protein:vir:105464 Length: 346 # NCBI annotation: putative phage major capsid protein # Family: family:all:701 # MgeID: mge:1502 # MgeName: KC5a # Cross-refs: genbank:acc:YP_529874;genbank:gi:90592614;genbank:GeneID:3974528 Probab=93.15 E-value=0.0086 Score=31.91 Aligned_cols=309 Identities=11% Similarity=0.035 Sum_probs=117.8 Q ss_pred cceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCcc Q lcl|NC_015254. 7 MNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEG 86 (346) Q Consensus 7 ~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~ 86 (346) |=++. -+.|.+.+.+++....--..++...++. ...--.||++|.+|...--+|-. +.+...+ T Consensus 1 Mainy--------------a~~~~~~Ld~~~~~~~lts~~l~~~~~~--~~v~~~ggktVkIp~is~tsGl~-DY~R~~g 63 (346) T protein:vir:10 1 MTINY--------------AEKYQAAVQQAFYDGHLYSAELWNSPSN--SIIKFDGAKHIKVPRLEITSGRK-DRQRRTI 63 (346) T ss_pred Ccchh--------------HHHHHHHHHHHHHhhhccchhhcccccc--cceEecCCCEEEEEEeeeecccc-cccccCC Confidence 22221 1445555555543321111112222221 11112479999999985222322 2111111 Q ss_pred ccchhhcccceeEEE-EEeecCcceechHHH-----hhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhh-hhhhhhhhc Q lcl|NC_015254. 
87 ALTPGNISAAKDIAR-LHMRGKAWRTNDLAK-----ALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGI-TASGALDSN 159 (346) Q Consensus 87 ~it~~~lt~~~~~a~-~~~~~k~~~~tD~a~-----~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~-~~~~~~~~~ 159 (346) .-..+.++...+.-+ -+.|+.+|.+..+.. .++.++.|++ +..+.-.-.+|..-++.|-.. .+.+..... T Consensus 64 ~~~~g~v~~~~et~tl~qDR~~~F~vD~mDvDETn~~~~~anv~~e---f~r~~vvPEiDayrfskLa~~a~~~~~~~~~ 140 (346) T protein:vir:10 64 TTPVANYSNDWDSYELKNERYWSTLVDPSDIDETNMVVSLANITKQ---FNLDSKMPEKDRYMFSHLYSGKEAAHDGGIT 140 (346) T ss_pred cccccccccceeEEEeeccccceecccccchHHHHHHhHHHHHHHH---HHHHhhcchhhHHHHHHHHHhhhhhcccccc Confidence 112345554433333 445777777773322 2223333333 223333335577777765311 111111101 Q ss_pred ceeeeccccccccccHHHHHHHHHHhCccc--cCceEEEEchHHHHHHHhhhhhhhccc---cc--CceeeEEeceEEEE Q lcl|NC_015254. 160 KLDVSTETGDDSYFTGDTFLSATYKLGDAE--GKLTGIAMHSQTEMNLRKQGLIEFMLD---SD--NKKFPTYMGKRVIV 232 (346) Q Consensus 160 ~~dis~~~~~~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~~~~~L~~~~li~~~~~---s~--~~~i~~~~G~~VVv 232 (346) ...++ +.=-++.|.++..+|-|.. ..-.+++|.|.+|.-|.+...+..... .. .+.++.+.|++|+. T Consensus 141 ~~a~T------~~ni~~~i~~~~~~lde~~vp~~~rvl~vTp~~~~lLk~s~~f~k~~~v~~~~~i~~~V~siDGv~Ii~ 214 (346) T protein:vir:10 141 TNTLD------EKNILPAFDNMMLDFDEARIPSTNRILYVTPKTNAILKRAEAMNRALTLKDPNNIQRTVYSLDDVTIRV 214 (346) T ss_pred ccccC------HHHHHHHHHHHHHHHHHccCCCCCeEEEECHHHHHHHhhchhheeccccccccccceeeeeecCeEEEE Confidence 10010 1112467777888886543 244899999999998887765542221 11 24578999999975 Q ss_pred --eCCCcc-----------CCCceEEEEEcCCeeEEeecC-Cccc-eeeeecCCcceeEEEEeeEEeeeee-----eeee Q lcl|NC_015254. 233 --DDGLPA-----------KDGVYTSYIFGEGAFGLGNGE-APVP-TETDREKLKGNDILINRQHFLLHPR-----GIAW 292 (346) Q Consensus 233 --dD~~p~-----------~~g~ytt~l~~~GAi~~~~~~-~~~~-vE~dRd~~~g~~~l~~r~~~~~~~~-----G~s~ 292 (346) ++.|+. +.++...|++.+....+.-.+ ..+- .+.. .-..|.-.+..|..+-+-++ |+-. 
T Consensus 215 VPs~r~~t~~~f~~G~~~~t~ak~INfiiv~~~A~ia~~K~~~~~if~P~-~~~~g~~l~~~R~Y~D~fv~~nk~~~Iyv 293 (346) T protein:vir:10 215 VPSDLMQTAYDFSDGSKIIDTAKQIEMFLIYNGVQIAPEKYSFVGFDQPS-AATSGNYLYYEQSYDDVLLLNTKTKGIQF 293 (346) T ss_pred cchhhcccchhhccCccccCCccceeEEEECCceeeeeeeeeeeEeeCCC-CCcccceeeeeeeeeeeeeeccccceEEE Confidence 555542 233444555444332222111 1110 0110 00112212333443333332 1111 Q ss_pred --ccccccCCCCChHHhcCCcCceeeecccccceEEEEEecccccccCCCCC-CCCC Q lcl|NC_015254. 293 --QEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPGKKKETAP-EGIK 346 (346) Q Consensus 293 --~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~~~~~~~~-~~~~ 346 (346) +.+..++.+ +-+.=+.++|=..|-+-|+= .=+|-|+--|......- +=-| T Consensus 294 ~~~~a~~~~~~-~~~~~~kpt~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~ 346 (346) T protein:vir:10 294 VVSDKPKKDQE-QSGQDAKPTAESTLEEIKAY---LDKNHIDYTGKTKKDELLALVK 346 (346) T ss_pred eeecccccCcc-CcccccCcccccchHHHHHH---hcccccccccccchhhHHhhcC Confidence 000000000 00011112221111111110 01111221111110000 0000 No 176 >protein:vir:97255 Length: 310 # NCBI annotation: hypothetical protein ORF017 # Family: family:all:1120 # MgeID: mge:1657 # MgeName: M6 # Cross-refs: genbank:acc:YP_001294525;genbank:gi:149408246;genbank:GeneID:5237120 Probab=92.61 E-value=0.011 Score=31.39 Aligned_cols=275 Identities=15% Similarity=0.080 Sum_probs=111.6 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcc-- Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGED-- 78 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~a-- 78 (346) |.. |..=-| ....+.-...-|.+.+.+.+.+++- .+..+ +. |...+..--+-+.+.. 
T Consensus 1 mpa-----ltLaea-------~k~~~d~l~~~ViE~~~~~s~lL~~-----LpF~~--ve--g~~~~ynR~~~~~~~~~~ 59 (310) T protein:vir:97 1 MAS-----VTLAES-------AKLAQDELVAGVIENIITVNRMFDV-----LPFDS--IE--GNSLAYNRENVLGDVIMA 59 (310) T ss_pred Ccc-----cchHHH-------hhcCcchHHHHHHHHHhccchHHHh-----CCccc--cc--CCcceeeEeeccCCcccc Confidence 221 111000 0112222333444444444433221 01000 01 2111111111111111 Q ss_pred cccCC-CccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHH---HHHHHHHHHHHHHHH---HHhhh Q lcl|NC_015254. 79 EILDD-GEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLV---VEYWNRRRQAVLIAS---LNGIT 151 (346) Q Consensus 79 e~~~d-g~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~---a~~~~~~~~~~lla~---L~G~~ 151 (346) ++..+ +.....+..-+..+......-.+....+.-.-..+-.++|+.++..|+ .++..++++..+|.- -+.-+ T Consensus 60 ~v~~~~~~~g~~~~~~t~~~~~~~L~i~~g~~~Vd~~i~dl~~~~~~dq~~~Ql~~~iea~~~~~e~~lINGD~a~n~F~ 139 (310) T protein:vir:97 60 GVGTTFSGAGAGKAAATFTKVNSNLTTIMGDAEVNGLIQATRSGDGNDQTAVQIASKAKSAGRKYQDQLINGNGAGNEFA 139 (310) T ss_pred cccccccCCCccccccccceeeeeeeeeeehhhhhhHHHhhhcCChHHHHHHHHHHHHHHHHHHHHHHhhccccCCCccc Confidence 01100 000111111222222222222333333332223333477888876655 567888888888761 01100 Q ss_pred hhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhh-------hhhhhcccccCceeeE Q lcl|NC_015254. 
152 ASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQ-------GLIEFMLDSDNKKFPT 224 (346) Q Consensus 152 ~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~-------~li~~~~~s~~~~i~~ 224 (346) +..........|...+ ..+.++.+.|...+.+.=+.......++|||+.+.+++.- +.........|..+.+ T Consensus 140 GL~~~~~~~q~i~~~~-~gg~~t~d~LDeLl~~v~~~~g~p~~~l~~~~~~r~i~A~~R~~~~~g~~~~~~~~~G~~v~~ 218 (310) T protein:vir:97 140 GLIQLCASGQKATTGA-TGSAISFAILDELMDLVVDKDGQVDYLTMHARTLRSYKALLRALGGASINEVVELPSGAEVPA 218 (310) T ss_pred chhhcCCccceeecCC-CCCCCCHHHHHHHHHHHhcCCCCCCEEEecHHHHHHHHHHHHHhcCCCCCCccccCCCCEEee Confidence 1111111122333222 2345788888888877645555667899999865444432 2222223333567899 Q ss_pred EeceEEEEeCCCccC------CCceEEEEE--cC-----CeeEEee-cCCccceeeeecCCcceeE--EEE--eeEEeee Q lcl|NC_015254. 225 YMGKRVIVDDGLPAK------DGVYTSYIF--GE-----GAFGLGN-GEAPVPTETDREKLKGNDI--LIN--RQHFLLH 286 (346) Q Consensus 225 ~~G~~VVvdD~~p~~------~g~ytt~l~--~~-----GAi~~~~-~~~~~~vE~dRd~~~g~~~--l~~--r~~~~~~ 286 (346) |.|+||+..|.+|.. .|+-..|.+ +. |-+++.. +++.+.|+ ....-++. ... .+.+.+. T Consensus 219 ~~GiPi~~~d~ip~~~~~~~~~gtTsIya~r~Ge~~~~~Gv~Gl~~~~~~glsVr---~~G~~~~~~v~~~~V~~Y~~~a 295 (310) T protein:vir:97 219 YSGTPIFRNDYIPTNQTKGGTTGCTTIFAGTLDDGSRTHGIAGLTATQAAGIQVV---DVGESEDSDEHIWRVKWYCGLA 295 (310) T ss_pred eCCeEEEEeCccCCCccccccCCceeEEEEeeCccccccceeccccCCccceeEE---eCCcccCCcceeEEEEEeeeEE Confidence 999999999999975 233334543 43 3344322 22334443 32211111 111 1222211 Q ss_pred eeeeeeccccccCCCCC-hHHhcCCcC Q lcl|NC_015254. 287 PRGIAWQEKSVAGHSPT-NTEIEKGNN 312 (346) Q Consensus 287 ~~G~s~~~~~~~~~sPt-~a~L~~~~N 312 (346) + .+|. 
.+-|.+-.| T Consensus 296 v------------~~~~A~a~L~~V~~ 310 (310) T protein:vir:97 296 L------------FSEKGLACADGITN 310 (310) T ss_pred E------------ecccceeeeccccC Confidence 1 1221 223444444 No 177 >protein:vir:78090 Length: 302 # NCBI annotation: Cps # Family: family:all:701 # MgeID: mge:1844 # MgeName: P35 # Cross-refs: genbank:acc:YP_001468790;genbank:gi:157325371;genbank:GeneID:5601852 Probab=91.58 E-value=0.015 Score=30.56 Aligned_cols=269 Identities=12% Similarity=0.050 Sum_probs=117.2 Q ss_pred eeeccc-hHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEeccccc----CCC--cccccCCCccccchhh Q lcl|NC_015254. 20 IADVIV-PEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQD----LTG--EDEILDDGEGALTPGN 92 (346) Q Consensus 20 l~d~i~-Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~----l~g--~ae~~~dg~~~it~~~ 92 (346) +++.|+ .+.|.+.+.+.+...+ -|+.+..++..- --.||++|.+|...- .+| |=.+. .| -.... T Consensus 1 Mantl~ya~~~~~~Ld~~~~~~~---~t~~l~~~~~~v--~~~Gak~vkIp~is~~~~~TsGl~dy~R~-~g---~~~g~ 71 (302) T protein:vir:78 1 MANSLALAQIYQDNIDKAIAVNS---KSAFLEANPNNV--QYNGGNTIKIADISFGSGTTGDLKAYNRS-TG---FTQGS 71 (302) T ss_pred CCchhHHHHHHHHHHHHHHHhhh---ceeecccCCceE--EEecCcEEEEEEEEeeccccccccccccc-cC---ccccc Confidence 443332 2556666666655433 255554443321 134799999999961 223 22221 12 12233 Q ss_pred cccceeE-EEEEeecCcceechHHHhhhc-chHHHHH-HHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccc Q lcl|NC_015254. 93 ISAAKDI-ARLHMRGKAWRTNDLAKALSG-DDPMRAI-GDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGD 169 (346) Q Consensus 93 lt~~~~~-a~~~~~~k~~~~tD~a~~~~g-~dp~~~i-~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~ 169 (346) ++-..+. ..-+.|+..|.+.-+...-+. .=.+++| +++..+.-.-.+|..-++.|-...... 
.+....+...- T Consensus 72 v~~~~et~tlt~DR~~~f~vD~mDvdETn~~~~~ani~~ef~r~~vvPEiDayrfskla~~a~~~---~~~~~~~~~~~- 147 (302) T protein:vir:78 72 VTLAWSDYTLDYDLAQSFQIDAMDVDETKNLATVGNVLSEYQRTKIVPAIDKYRFTKLANDGTGV---GGVIDLSKPDA- 147 (302) T ss_pred eeeeeeeEEeeeccceeeeccccchhhhhhhhHHHHHHHHHHHhhhcchhhHHHHHHHHHhhhcc---Cccccccccch- Confidence 4433333 234457777666522221111 1112222 222333334455666666653211111 11111111100 Q ss_pred cccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhhccccc-------CceeeEEeceEEEE--eCCCcc-- Q lcl|NC_015254. 170 DSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEFMLDSD-------NKKFPTYMGKRVIV--DDGLPA-- 238 (346) Q Consensus 170 ~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~~~~s~-------~~~i~~~~G~~VVv--dD~~p~-- 238 (346) ++.=-++.+..++..|.|. .-.+++|.|.++.-|.+...+....... ...++.+.|++||. ++.|.. T Consensus 148 t~~nvl~~i~~~~~~~~e~--~~~vl~vtp~~~~~Lk~a~~~~~~~~~~~~~~~~i~~~V~~lDgv~Ii~VPs~r~~t~~ 225 (302) T protein:vir:78 148 SAQALMGDIATAMELVDDS--NQLILVTSPTTLAGLLNTALIRESKNTQVLRRGEVDTKITFIQDVEVLQVPSEYLYDKV 225 (302) T ss_pred hHHHHHHHHHHHHHHhhcc--CCeEEEEChHHHHHHhcchhhccceeccccccccccceeeeecccEEEEchhhhcccce Confidence 0111134566777778775 3579999999999888766554322211 23578899998875 333321 Q ss_pred ---------CCCceEEEEEcCCeeEEeecC-CccceeeeecCCccee--EEEEeeEEeeeeeeeeeccccccCCCCChHH Q lcl|NC_015254. 239 ---------KDGVYTSYIFGEGAFGLGNGE-APVPTETDREKLKGND--ILINRQHFLLHPRGIAWQEKSVAGHSPTNTE 306 (346) Q Consensus 239 ---------~~g~ytt~l~~~GAi~~~~~~-~~~~vE~dRd~~~g~~--~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~ 306 (346) +.++...|++.+....+.-.+ ..+-+ .+-+.+...+ .+..|..+-+-++--+-..-- .|++ +. T Consensus 226 ~f~~G~~~~~~ak~INfiiv~~~a~ia~~K~~~~~i-f~P~~~~~gd~~l~~~R~Y~D~fV~~nk~~gI~---~~~~-~~ 300 (302) T protein:vir:78 226 APKVGVPDYTGAKKIPYMIFKRDAPTGIVKTDKVRV-FEPDTNQSADAYKVDLRLYHDLIVPKNQRPGII---KASF-GT 300 (302) T ss_pred eccCCccccCCccceeEEEECCCeeeeeeeeeeeEe-eCCCCCCCcceeeeeeeeEeeeeeeccccCeEE---Eeec-cc Confidence 233445566554433332222 11111 1111222222 344555555544322210000 0000 00 Q ss_pred hc Q lcl|NC_015254. 
307 IE 308 (346) Q Consensus 307 L~ 308 (346) ++ T Consensus 301 ~~ 302 (302) T protein:vir:78 301 IA 302 (302) T ss_pred cC Confidence 00 No 178 >protein:vir:79712 Length: 285 # NCBI annotation: major capsid protein gp34 # Family: family:all:701 # MgeID: mge:1873 # MgeName: LL-H # Cross-refs: genbank:acc:YP_001285883;genbank:gi:148750840;genbank:GeneID:5220414 Probab=89.93 E-value=0.024 Score=29.51 Aligned_cols=267 Identities=12% Similarity=0.057 Sum_probs=122.1 Q ss_pred eeeccc-hHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccccchhhccccee Q lcl|NC_015254. 20 IADVIV-PEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGNISAAKD 98 (346) Q Consensus 20 l~d~i~-Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~ 98 (346) ++ ++ -+.|.+.+.+++...+.. +.+..+.... ..--.||++|.+|.....+|-. +.+.+. ..+.+.++...+ T Consensus 1 Ma--in~~~k~~~~ld~~~~~~~~~--~~l~~~~n~~-~~~~~gak~VkIp~ist~~gl~-dY~R~~-g~~~g~v~~~~e 73 (285) T protein:vir:79 1 MT--VVLDSKDLARIDEEYKADSQV--WSYLTGGNGV-TQRFRGHNEVRINKLSGFVDAT-AYKRGQ-DNARKTISVGKE 73 (285) T ss_pred Cc--chhhHHHHHHHHHHHHHhhhh--hhhcccCCcc-eeEecCCCEEEEeeeccccccc-cccccc-Cccccccceeee Confidence 11 11 144556666665554332 2233222100 0112379999999985433322 222222 345566665544 Q ss_pred EEE-EEeecCcceechHHHhhhcchHHHHHHHHH-HHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccccccccHH Q lcl|NC_015254. 99 IAR-LHMRGKAWRTNDLAKALSGDDPMRAIGDLV-VEYWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFTGD 176 (346) Q Consensus 99 ~a~-~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~-a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~ 176 (346) .-+ -+-|+..|.+..+...-.+.=.++.|.+++ .+.-.-.+|..-++.|-+..+ ...+..+ +... 
-++ T Consensus 74 t~tl~~DR~~~f~iD~mDvdEn~~~~~~ni~~ef~~~~vvPEiDayrfskla~~a~----~~~~~~~-----T~~n-v~~ 143 (285) T protein:vir:79 74 TVKLTHEDWFGYDLDQFDMDENGAYTVENVVREHNKMITIPHRDKVAVQKLFDSAA----KKATDSI-----TKDN-ALD 143 (285) T ss_pred EEEeeccccceecccccchhhhhhhhHHHHHHHHHhhhhcchhhHHHHHHHHhhcc----ccccccc-----CHHH-HHH Confidence 433 345777776663333222222333333332 222234556666776642211 1111111 1111 257 Q ss_pred HHHHHHHHhCccc-cCceEEEEchHHHHHHHhhhhhhhccccc-----C---ceeeEEec-eEEEE--eCCCcc-CCCce Q lcl|NC_015254. 177 TFLSATYKLGDAE-GKLTGIAMHSQTEMNLRKQGLIEFMLDSD-----N---KKFPTYMG-KRVIV--DDGLPA-KDGVY 243 (346) Q Consensus 177 ~l~~A~~~~GD~~-~~~~~ivmhS~~~~~L~~~~li~~~~~s~-----~---~~i~~~~G-~~VVv--dD~~p~-~~g~y 243 (346) .|.+|+.+|-|.. ..-.+++|.|.+|.-|++...+......+ + +.++.+.| ++|+. ++.|.. +.++- T Consensus 144 ~i~~~~~~lde~~vp~~rvl~vTp~~~~~Lk~s~~~~r~~~~~~~~~~~~i~~~V~~lDg~v~ii~Vps~r~kt~~~~k~ 223 (285) T protein:vir:79 144 AYDTAEAYMFDNEVPGGFVMFVSSAYYTALKQSAAVTRTFSTDGTMVINGIDRRVAQLDGGVPIVRVSSDRLKGLGITNH 223 (285) T ss_pred HHHHHHHHHHHcCCCCceEEEEChHHHHHHHhhhhhheecccccceeccceeeeeccccceeEEEEcchhhccCcCcchh Confidence 7777888875543 24478999999999998876655433221 1 24678888 78876 566653 33344 Q ss_pred EEEEEcCCeeEEeecCCccceeeeecCC-cce-eEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCceeeecccc Q lcl|NC_015254. 244 TSYIFGEGAFGLGNGEAPVPTETDREKL-KGN-DILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAVYESKN 321 (346) Q Consensus 244 tt~l~~~GAi~~~~~~~~~~vE~dRd~~-~g~-~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~ 321 (346) ..|++.+....+...+.....=.+-+.+ .|. -.+..|..+-+-++--+-. T Consensus 224 Infiiv~~~a~i~~~K~~~~~~f~P~~~~~~d~~~~~~R~Y~d~fv~~nk~~---------------------------- 275 (285) T protein:vir:79 224 VNFILTPLSAIAPIVKYDSVSVIDPSTDRSGNRWTIKGLSYYDAIVLDNAKK---------------------------- 275 (285) T ss_pred ccEEEecCceeccceeeeeeEeECCCCCCCcceeeeeeeeeeeeeehhhccc---------------------------- Confidence 5677665543333222111111111112 222 3345565555544421110 Q ss_pred cceEEEEEeccc Q lcl|NC_015254. 
322 IRIVAFVHKNGV 333 (346) Q Consensus 322 i~iv~~~~k~~~ 333 (346) . +.+-++.++ T Consensus 276 -~-Iy~~~~a~~ 285 (285) T protein:vir:79 276 -G-IYVAATAGV 285 (285) T ss_pred -e-eeeeecccC Confidence 0 000011111 No 179 >protein:vir:95451 Length: 313 # NCBI annotation: hypothetical protein ORF044 # Family: family:all:11728 # MgeID: mge:1570 # MgeName: PA11 # Cross-refs: genbank:acc:YP_001294637;genbank:gi:149408203;genbank:GeneID:5237018 Probab=85.47 E-value=0.053 Score=27.60 Aligned_cols=272 Identities=13% Similarity=0.145 Sum_probs=146.6 Q ss_pred eeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCcccc Q lcl|NC_015254. 9 LQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGAL 88 (346) Q Consensus 9 ~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~i 88 (346) .|.-|+ -.-+|..|.|++.+.--+-+++ +.--+...-.+ + .-|+++++|..+...- +.-.| ++++ T Consensus 1 ~~~TSN-----T~A~I~SE~~s~~I~~~LH~~L--L~~~~~R~V~D----F-~~G~~L~I~tiGs~~~--~~~~E-~~~~ 65 (313) T protein:vir:95 1 MQLTSN-----TRAFIESEQYSKFILLNLHDGL--LPETFYRNVSD----F-GSGETLHIKTIGSVTL--QEAEE-DTPL 65 (313) T ss_pred Cccccc-----chheehhhhHHHHHHHHhhccc--cchhhhhhhcc----C-CCCCEEEecccCceee--ecccc-CCCe Confidence 343322 2346889999998876554432 11111110101 1 2499999998765421 12123 4688 Q ss_pred chhhcccceeEEEEE-eecCcceechHHHh--hhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhh----hhhcce Q lcl|NC_015254. 89 TPGNISAAKDIARLH-MRGKAWRTNDLAKA--LSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGA----LDSNKL 161 (346) Q Consensus 89 t~~~lt~~~~~a~~~-~~~k~~~~tD~a~~--~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~----~~~~~~ 161 (346) +.+.|.+++-.-.+- +.|.+|-++|--.. ..-++-|++.....+.++...+|.++|+.=..-|+.+. 
.++..+ T Consensus 66 ~~~~i~TGEIt~~i~~Y~G~A~~vt~~LR~D~~~I~~~~A~~~AE~~RAI~E~~~TD~L~~G~~~FA~~~~P~~vNG~PH 145 (313) T protein:vir:95 66 IYNPIETGEITFQITEYKGDAWYVTDDLREDGTDIDRLMAERAAESTRAIQETFETDFLKTGAEYFAANPGPHNVNGFPH 145 (313) T ss_pred eecccccceEEEEEEeecCChhhhhhhhhhcchhHHHHhhhcchhhHHHHHHHHhhHHHhhchhhhccCCCCcccccccc Confidence 888999998765544 57778988775543 22445677777778899999999999987666665443 222222 Q ss_pred eeeccccccccccHHHHHHHHHHhCcccc--CceEEEEchHHHHHHHhhhhhhhcccccCc------------e-eeEEe Q lcl|NC_015254. 162 DVSTETGDDSYFTGDTFLSATYKLGDAEG--KLTGIAMHSQTEMNLRKQGLIEFMLDSDNK------------K-FPTYM 226 (346) Q Consensus 162 dis~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~~~~~L~~~~li~~~~~s~~~------------~-i~~~~ 226 (346) - -..++++..+..+.|..-...|--..- .=.+.++.|.+.+.|.-.--|..- -++.+ . |-.++ T Consensus 146 ~-~V~~~T~~~~~~~~~~~~~~~~~~a~~P~~G~v~IvDP~~~~~L~~l~~It~~-vt~~~k~I~ESG~A~~~~Fi~~~Y 223 (313) T protein:vir:95 146 V-IVSAETNGVFALKHLIAMRLAFDKANVPAEGRVFIVDPVAEATLNGLVTITHD-VTDFGKMILESGMARGQRFIMNLY 223 (313) T ss_pred e-EEeccCCceehhhHHHHhhhhhhhccCCccceEEEEcchhhhhhhhhheeecc-cccccceeeeccCCchhHHHHHHh Confidence 2 234567788888888877776744332 345788999999998754222210 01211 1 23477 Q ss_pred ceEEEEeCCCccC---C------Cce-EEEE--E--cCCeeEEeecCCccceeeeecCC--cceeEEEEeeEEeeee--- Q lcl|NC_015254. 227 GKRVIVDDGLPAK---D------GVY-TSYI--F--GEGAFGLGNGEAPVPTETDREKL--KGNDILINRQHFLLHP--- 287 (346) Q Consensus 227 G~~VVvdD~~p~~---~------g~y-tt~l--~--~~GAi~~~~~~~~~~vE~dRd~~--~g~~~l~~r~~~~~~~--- 287 (346) |+.+++|+.+-+. + |.. ..|+ + +.--|...-.+-| ..|-.|+.- .-++....||-|++.- T Consensus 224 G~Di~~SN~L~~AN~~D~~tT~~G~~~NlFM~i~D~~~~P~~~AWr~MP-~s~~~~~~~~~~~~~~~~~R~G~Gi~R~~~ 302 (313) T protein:vir:95 224 GWDILTSNRLHVANYNDGTTTGNGYVGNLFMCILDDQTKPIMGAWRRMP-KSEGERNKDRARDEHVVRCRYGFGIQRLDT 302 (313) T ss_pred hhhhhhhhhhhhccccccccccCceeeeeeeeeecccccceeeeecccc-ccccccccccccccceeeeeecccceeecc Confidence 8888888765432 2 221 1111 0 1111111111111 134444443 3445555565554432 Q ss_pred eeeeeccccccCCCCChHHhcCCcCc Q lcl|NC_015254. 
288 RGIAWQEKSVAGHSPTNTEIEKGNNW 313 (346) Q Consensus 288 ~G~s~~~~~~~~~sPt~a~L~~~~NW 313 (346) .|.--+.++ -+ T Consensus 303 L~~~~~~A~---------------~~ 313 (313) T protein:vir:95 303 LGLLATSAT---------------AY 313 (313) T ss_pred eeEEEeccc---------------cC Confidence 122111111 11 No 180 >protein:vir:106590 Length: 349 # NCBI annotation: putative major head protein # Family: family:all:1083 # MgeID: mge:1598 # MgeName: Lj965 # Cross-refs: genbank:acc:NP_958585;genbank:gi:41179245;genbank:GeneID:2717126 Probab=72.09 E-value=0.19 Score=24.57 Aligned_cols=297 Identities=15% Similarity=0.093 Sum_probs=123.0 Q ss_pred CccceecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHH----HhhCCCcEEEecccccCCC Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDE----LAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~----l~~~~G~ti~~P~~~~l~g 76 (346) =|.++.||||.| .+++.|+|.++.++.|+.+.-.. . |+-.-.+......+. +-...+..+--|+ T Consensus 2 ~~~~~~~~~~~~----~~~~~d~~~~~~l~~~~~~~~~~-~-~l~~~~Fp~~~~~~~~~~~~~~~~~~~~~a~~------ 69 (349) T protein:vir:10 2 KNQKLQLDLQRF----ATPILDMFSQNTVLDYTRNRQYP-E-MLGDTLFPAVKVPTLEVDILKAGSRVPTIASV------ 69 (349) T ss_pred CcchhhHHHHHH----HHHhhcccCHHHHHHHHHhcCcc-h-hhHhhcCCccccccceeEEEeeccCcceeeee------ Confidence 478889999999 47889999999999999874332 2 211112211111100 0011111111122 Q ss_pred cccccCCCccccchhhcccceeEEE--EEeecCcceechHHHhhhcchH--HHHHHHHH-------HHHHHHHHHHHHHH Q lcl|NC_015254. 77 EDEILDDGEGALTPGNISAAKDIAR--LHMRGKAWRTNDLAKALSGDDP--MRAIGDLV-------VEYWNRRRQAVLIA 145 (346) Q Consensus 77 ~ae~~~dg~~~it~~~lt~~~~~a~--~~~~~k~~~~tD~a~~~~g~dp--~~~i~~q~-------a~~~~~~~~~~lla 145 (346) +..+. .....+-+..+.... .++...-..+.|+-.+...+++ ...+.+++ .....+..+-.+.. 
T Consensus 70 ----v~~~~-~~~~~~r~~~~~~~~~p~ik~~~~i~e~dl~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~q 144 (349) T protein:vir:10 70 ----SAFDA-EAEIGTREASKMTAELAYVKRKMQITEEMLIKLQSPRNTAEENYLKQYVFDDIDAMVQAVKARGEKMTME 144 (349) T ss_pred ----ecCCC-CcceecccceeEEeeccccccccccCHHHHHHHhhccCcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 11111 111111111111111 2223344666777666655433 22233333 33344444444455 Q ss_pred HHH-hhhhhhh---------hhhcceeeecccc-cccccc-HHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh-h Q lcl|NC_015254. 146 SLN-GITASGA---------LDSNKLDVSTETG-DDSYFT-GDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI-E 212 (346) Q Consensus 146 ~L~-G~~~~~~---------~~~~~~dis~~~~-~~~~~~-~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li-~ 212 (346) +|. |-+.... ...|+..+++... +++.-+ .+.|.+...++|-. ...++|+++++..|+++.-+ + T Consensus 145 ~l~~Gki~~~~~g~~vD~g~~~~~~~~lt~~~~Ws~~~adpi~Di~~~~~~~g~~---p~~~vm~~~~~~~l~~~~~i~~ 221 (349) T protein:vir:10 145 MFATGKITDKKNGIAIDYGVPKKHQETLSGTKTWDKSDASIIDNLQDWSDSLDVT---PTRALTSKKVLRILMRSTEIKE 221 (349) T ss_pred HHhCCeeEEcCCcEEEecccCccceeEecCcccCCCCCCCHHHHHHHHHHHhCCC---ccEEEeCHHHHHHHhcCHHHHH Confidence 443 3322211 0122222222110 001101 24555666667643 46789999999999886443 4 Q ss_pred hcccccCce----------eeEEeceEEEEeCC-Cc--cCCCceEE---------EEEcCCeeEEeecCCccceeeeecC Q lcl|NC_015254. 213 FMLDSDNKK----------FPTYMGKRVIVDDG-LP--AKDGVYTS---------YIFGEGAFGLGNGEAPVPTETDREK 270 (346) Q Consensus 213 ~~~~s~~~~----------i~~~~G~~VVvdD~-~p--~~~g~ytt---------~l~~~GAi~~~~~~~~~~vE~dRd~ 270 (346) .+..++.+. ++.+.|.+|++=|. .. .+.+++++ .++..|..+...-.+ + .|.+... 
T Consensus 222 ~~~~~~~~~~~~~~~~~~~l~~~~~~~i~~yd~~y~d~~~~~~~t~~~~~p~~~v~l~~~~~~G~~~yG~-~-~e~~~~~ 299 (349) T protein:vir:10 222 AIFGKDTGRVVGQADLDQWMTAQGLPIIRAYDGKYRDEDSRGNLTTNSYFPEDRIVLFNDEVPGQKIYGP-T-PEENRLI 299 (349) T ss_pred HhcccccccccCHHHHHHHHHhcCCceEEEEeeEEEeecCCCceeecccccCCeEEEecCCCceeEEeec-c-chhhhhc Confidence 443333221 24455655655332 11 11222222 122222222211110 1 1111111 Q ss_pred CcceeEEEEeeEEeeeeeeeeeccccccCCCCChHHhcCCcCceee-ecccccceEEEE Q lcl|NC_015254. 271 LKGNDILINRQHFLLHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAV-YESKNIRIVAFV 328 (346) Q Consensus 271 ~~g~~~l~~r~~~~~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v-~~~K~i~iv~~~ 328 (346) .+....... -+.+-.+|... ..|....+...++.=-| .+++.+-++.++ T Consensus 300 ~g~~~~~~~-----~~~~~~~~~~~----~dP~~~~~~~~s~~lPv~~~~~~~~~a~Vl 349 (349) T protein:vir:10 300 SSNAQVSNV-----GNIMAKIYETS----EDPIGTWILASATMLPSFASADDVFQAKVL 349 (349) T ss_pred ccccceeec-----cceEEEeeeec----CCCceEEEEEeeeeeeeecCCCcEEEEEeC Confidence 111111100 01112233211 24666666555554433 345566666665 No 181 >protein:vir:80068 Length: 301 # NCBI annotation: gp8 # Family: family:all:463 # MgeID: mge:1876 # MgeName: B054 # Cross-refs: genbank:acc:YP_001468712;genbank:gi:157325292;genbank:GeneID:5601759 Probab=69.14 E-value=0.23 Score=24.11 Aligned_cols=261 Identities=14% Similarity=0.118 Sum_probs=109.5 Q ss_pred eeeecCCcee-eeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccc Q lcl|NC_015254. 9 LQKFAAGKNT-RIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGA 87 (346) Q Consensus 9 ~q~~~a~~~T-~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~ 87 (346) +|.++++..+ ..-+-|+|+|...-..+-.. -+|+ + +.+-...+..++.++..... 
|.++...++.++ T Consensus 1 ~~~~~~g~f~~~~l~~id~~v~e~~~~~l~~--r~l~--~-------v~~~~~~~~~~~~~~~~~~~-G~~~~~~~~~~d 68 (301) T protein:vir:80 1 MQGKITATIEARDLQAIDNVIYEPKQEELTA--RSVF--P-------QKFDVNEGAESYSFDVMTRS-GAAKIIANGADD 68 (301) T ss_pred CCccccchhhHHHHHHHHHHHHHhhhhhhhh--hhhc--c-------cccCCCCceEEEEEeeeccc-eeEEEecCcccc Confidence 7888665311 11122333333221111100 0110 0 01112223456677766654 777777777778 Q ss_pred cchhhcccceeEEEEEeecCc--ceechHHHhhhcc-hHHHHHHHHHHHHHHHHHHHHHHHH-----HHhhhhhhhhhhc Q lcl|NC_015254. 88 LTPGNISAAKDIARLHMRGKA--WRTNDLAKALSGD-DPMRAIGDLVVEYWNRRRQAVLIAS-----LNGITASGALDSN 159 (346) Q Consensus 88 it~~~lt~~~~~a~~~~~~k~--~~~tD~a~~~~g~-dp~~~i~~q~a~~~~~~~~~~lla~-----L~G~~~~~~~~~~ 159 (346) ++..+.........++..+.+ |+.-|+......+ +.-..-+...+..++++.++.++-- +.|++........ T Consensus 69 ip~~~~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aa~~~~~~~~n~~~f~G~~~~g~~GLlN~p~~~~~ 148 (301) T protein:vir:80 69 LPLVDVDMVRKSVPIYSIGIGLSYTIQDLRAARMQGTTVDAAKATTVRRAIAEKENSIAFRGEKKYAIKGAFEATGIQID 148 (301) T ss_pred cccccccceeEEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEeeecccccceeeecCCCcccc Confidence 887777777777777776665 6777787775555 4444456666677777777755532 1222222221110 Q ss_pred ceeeecccccc-----cccc----HHHHHHHHHHhCcc---ccCceEEEEchHHHHHHHhh--------hhhhhcccccC Q lcl|NC_015254. 160 KLDVSTETGDD-----SYFT----GDTFLSATYKLGDA---EGKLTGIAMHSQTEMNLRKQ--------GLIEFMLDSDN 219 (346) Q Consensus 160 ~~dis~~~~~~-----~~~~----~~~l~~A~~~~GD~---~~~~~~ivmhS~~~~~L~~~--------~li~~~~~s~~ 219 (346) .+..++.. +.-+ ++.|+++..++=.. 
...-..++|+|..|..|-.- -++++++...- T Consensus 149 ---~~~~~~~~~~~~w~~~t~~ei~~di~~~~~~l~~~s~g~~~p~~L~L~p~~~~~L~~~~~~~~~~~tvl~~l~~~~~ 225 (301) T protein:vir:80 149 ---VSPTTGVGNVSKWEKKTAEQIIDEIGEAHTKITVLPGYGTASLKLCLPPKQFELINKKRYSNEDSRSVLKVLQDNAW 225 (301) T ss_pred ---cccCcccccccccccCCHHHHHHHHHHHHHHHHHhcCceecccEEEecHHHHHhhhhccccCCCCeeHHHHHHHHcC Confidence 01011100 0011 35666676664111 11235799999999998521 12344442211 Q ss_pred ceeeEEeceEEEEeCCCccCCCceEEEEEcCC--eeEEeecCCc--cceeeeecCCc------ce-eEEEEeeEEeeeee Q lcl|NC_015254. 220 KKFPTYMGKRVIVDDGLPAKDGVYTSYIFGEG--AFGLGNGEAP--VPTETDREKLK------GN-DILINRQHFLLHPR 288 (346) Q Consensus 220 ~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~G--Ai~~~~~~~~--~~vE~dRd~~~------g~-~~l~~r~~~~~~~~ 288 (346) ..++...+-+.+.+ ..|+-..+++..+ -+.+....+. .++|. |+..- +. -.++-|-.-++.+- T Consensus 226 --~~~I~~~p~L~~~g---~~g~~~~v~~~~~~d~~~~~v~~~~~~~~~e~-~~~~~~~~~~~r~~Gv~i~~P~ai~~~~ 299 (301) T protein:vir:80 226 --FSAIVRVPDLAGMG---TAGSDSFAVIHDSNETAELIIPMDITRHPEEY-SFPRTKVPFEERTAGVVVRFPAAIVRVD 299 (301) T ss_pred --cceEEEcceeccCC---CCcccEEEEEecCCcEEEEEecCceeeeccee-cCceeEeeeeeeeEEEEEEccceEEEEe Confidence 11111111111111 1111111111111 1111111111 11211 11100 00 11222223334445 Q ss_pred ee Q lcl|NC_015254. 289 GI 290 (346) Q Consensus 289 G~ 290 (346) |+ T Consensus 300 GI 301 (301) T protein:vir:80 300 GI 301 (301) T ss_pred cC Confidence 55 No 182 >protein:vir:8324 Length: 410 # NCBI annotation: gp41 # Family: family:all:30827 # MgeID: mge:154 # MgeName: Corndog # Cross-refs: genbank:acc:NP_817892;genbank:gi:29566325;genbank:GeneID:1259520 Probab=68.24 E-value=0.24 Score=23.98 Aligned_cols=271 Identities=11% Similarity=0.030 Sum_probs=131.7 Q ss_pred Ccc--ceecc-eeeec----CCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccc- Q lcl|NC_015254. 1 MIK--KLRMN-LQKFA----AGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQ- 72 (346) Q Consensus 1 ~~~--~~~~~-~q~~~----a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~- 72 (346) =++ .--.+ |+.|. 
.+.+--.+..|-|+..++- ++|++++- +...+-..+--+|.|++-|.-. T Consensus 113 ~~~Gd~~A~~~~e~~r~a~~~~~Tgd~~~~i~~~~v~d~--------i~li~q~r--~i~slf~tLP~~g~T~eY~v~t~ 182 (410) T protein:vir:83 113 SAQGNASAADRLEVYARAADHQKTGDLQGVIPDPIVGPV--------IDFIDSAR--PLVSTLGTLPLNNATFYRPIVSQ 182 (410) T ss_pred cCCchHHHHHHHHHHHHhhccCcccccccccchhHhhhH--------HHHHhhcc--chhhhhhhCCCCCCeeEEeeecc Confidence 000 00000 11111 1111112333444433321 23443321 1111112233468888865542 Q ss_pred cCCCccc-----ccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 73 DLTGEDE-----ILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASL 147 (346) Q Consensus 73 ~l~g~ae-----~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L 147 (346) ++.-..+ .-.|| +.++..|++.+...+.++..|-.-.++--+-+.+.-.-+...-+-++.+.+..-.+..-+.| T Consensus 183 ~~tV~~q~~~~kqa~EG-d~L~~gKl~~~t~tA~ikTyGGyt~LSRQ~IERs~v~~L~~~lraL~~AYA~atea~vra~L 261 (410) T protein:vir:83 183 RPAVGLQGVAGGASDEK-TELDSQKMVIDRLTVNAKTLGGYVNVSRQAIDFSSPSALDLVVNGLGQQYAIETEALVGAAL 261 (410) T ss_pred ccccccccccccccccc-ccccccceeeeeccceeehhcCcccccceeeecCChhhHHHHHHHHHHHHHHHHHHHHHHHH Confidence 2211000 01366 47999999999999999988876666666666666666666666666666666666666666 Q ss_pred HhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCcccc--CceEEEEchHHHHHHHhhh--hhhhcccccC---- Q lcl|NC_015254. 148 NGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEG--KLTGIAMHSQTEMNLRKQG--LIEFMLDSDN---- 219 (346) Q Consensus 148 ~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~~~~~L~~~~--li~~~~~s~~---- 219 (346) .+.+....+..+ ++ .+. -...+.||..++-|... 
.+..|.+++.|..++-+.= +--...++.| T Consensus 262 ~~t~t~~~a~~~---~T----ad~--~~~~i~da~~~v~da~~~~~~~~i~vS~DVl~~~~~~f~~~~~~~~dt~Gfg~~ 332 (410) T protein:vir:83 262 ASTSTGAVGYGN---AT----ADN--VASAIWQAAGAVYTAVKGMGRLVIAIAPDVLGDFGPLFAPVNPTNAHSTGFEAG 332 (410) T ss_pred HHhhhhhhhhhh---cc----HHH--HHHHHHHHHHHHhhhhccceeeeEEechhhhhhccceeeccCCCCccccccccc Confidence 443321111110 01 111 22466788888888743 5557899999965544321 0000111111 Q ss_pred ---ce-eeEEeceEEEEeCCCccCCCceEEEEEcCCeeEEeecC-CccceeeeecCCcceeEEEEeeEEeeeeeeeeecc Q lcl|NC_015254. 220 ---KK-FPTYMGKRVIVDDGLPAKDGVYTSYIFGEGAFGLGNGE-APVPTETDREKLKGNDILINRQHFLLHPRGIAWQE 294 (346) Q Consensus 220 ---~~-i~~~~G~~VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~-~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s~~~ 294 (346) .. -+.++|.+|++..+.+.+ +.+++-+-||.+.+++ .|+.+ +|-++ +.|..++. ||-... T Consensus 333 ~lg~gi~G~~~~ipVvm~~~a~Ag----TA~f~~~~Ai~~~eS~~gp~qL-~d~~i----~nLt~~yS------gY~a~a 397 (410) T protein:vir:83 333 RFGQGVMGSISGIPVVMSAALGSG----DAYLFSTAAIECFEQRVGTLQV-VEPSV----FGLQVAYA------GYFSTL 397 (410) T ss_pred ccccchhhhhcccceEEecCCCcC----eeeEeccceeeeeecCCceeEe-eCCch----hhhhhhhe------eeeeec Confidence 11 256889999999998887 4677889999988766 33322 12222 22333333 221111 Q ss_pred cc-ccCCCCChHH Q lcl|NC_015254. 295 KS-VAGHSPTNTE 306 (346) Q Consensus 295 ~~-~~~~sPt~a~ 306 (346) .. ..|.-|---. T Consensus 398 ~~~~~gliPv~g~ 410 (410) T protein:vir:83 398 VVNEDAIVPLVGS 410 (410) T ss_pred cccccceeeeccC Confidence 11 1111111000 No 183 >protein:vir:4074 Length: 480 # NCBI annotation: major capsid (head) protein # Family: family:all:11745 # MgeID: mge:85 # MgeName: c2 # Cross-refs: genbank:acc:NP_043553;genbank:gi:9628687;genbank:GeneID:1261180 Probab=67.01 E-value=0.26 Score=23.80 Aligned_cols=275 Identities=11% Similarity=0.008 Sum_probs=74.2 Q ss_pred Cccce-ec------ceeeecCCc-eeeeeeccchHHHHHHHhhHhHHHHhHhhc----cccccchhHHHH-hh-CCCcEE Q lcl|NC_015254. 
1 MIKKL-RM------NLQKFAAGK-NTRIADVIVPEVFNKYVTERTAESSALLQS----GIISNDKDLDEL-AK-SGGNMI 66 (346) Q Consensus 1 ~~~~~-~~------~~q~~~a~~-~T~l~d~i~Pev~~~yv~~~~~~~~~~~qS----gi~~~~~~~~~l-~~-~~G~ti 66 (346) +++++ .+ ..+.+.++. .+...+....| + .++..+.++ +.+.... +.+ +. ...... T Consensus 153 l~akl~el~k~~ee~k~~~~~~~~~~~~~~~~~~e-----~----r~~~~~~~~~~e~~~~~~~~--~~~~~~~~~~~~~ 221 (480) T protein:vir:40 153 LEAKVEELNKEREELKKEREASIPSEKPEDAERKF-----M----RELGSKMAEMPEQGFLREFA--NGADLNVVNSLGS 221 (480) T ss_pred HHHHHHHHHhHHHHHhhhhhhhccccchhhhhhHH-----H----HHHHHHhccchhhhhhhhhh--hhccccccccccc Confidence 11110 00 000000000 00001110001 0 011111111 0000000 000 00 001111 Q ss_pred EecccccCCCcccccCCCccccchhhcccce---eEEEEEeecC-------------cceechHHHh-----hhcch--- Q lcl|NC_015254. 67 NMPFWQDLTGEDEILDDGEGALTPGNISAAK---DIARLHMRGK-------------AWRTNDLAKA-----LSGDD--- 122 (346) Q Consensus 67 ~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~---~~a~~~~~~k-------------~~~~tD~a~~-----~~g~d--- 122 (346) -+|.+.............-....+-...... -.+....++. .+.+..+..+ ...+| T Consensus 222 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~e~~~~~~~~~~~~~~~~~~~~~~v~~l~~~~k~t~~lLDDa~~ 301 (480) T protein:vir:40 222 ITSKYARKSGIYDGAMKARFQGLTLAEDGVDDTFISGTFKAGTDKNKSQTATKRSLRPQMAEAYLQMDKATVRGVNDSGA 301 (480) T ss_pred cccchhhheeechhhhhhhhhcceeeeccccceeeeeeeecccccccccccccchhhHHHHHHHHHhHHHHHHHhhhhHH Confidence 2222222111100000000000000000000 0000000000 0000111110 00111 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh-hcceeeeccccccccc-cHHHHHHHHHHhCccccCce-EEEEch Q lcl|NC_015254. 123 PMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD-SNKLDVSTETGDDSYF-TGDTFLSATYKLGDAEGKLT-GIAMHS 199 (346) Q Consensus 123 p~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~-~~~~dis~~~~~~~~~-~~~~l~~A~~~~GD~~~~~~-~ivmhS 199 (346) -...|.++++..+.++..+.+| .|- + ++.+ ........ .+.+..+ +.+.|.+-...+-.....-. 
.++||+ T Consensus 302 l~~~i~~~l~~~~~~~ee~a~l---~G~-g-~g~~~~~g~~~~~-~~~~~~~~~~d~id~L~~al~~~y~~~a~~~vmn~ 375 (480) T protein:vir:40 302 LSEYVMSEMVNRVIQKVEYNMI---LGS-V-DGSNGFYGLKTAT-DGWTKQIEYTDLFEGITDAVAECSISDAITIVMSP 375 (480) T ss_pred HHHHHHHHHHHHHHHHHHHHhh---ccC-C-CCccccccceeec-ccccccchhHHHHHHHHHhhhHHhhCCCCEEEECH Confidence 2223566666666666655443 231 0 1111 01111111 1111222 23333333333333332334 589999 Q ss_pred HHHHHHHhhhhh--hhccc--ccCceeeEEeceEEEEeCC-CccCCCceEEEEEcCC--eeEEeecCCccceeeeecC-- Q lcl|NC_015254. 200 QTEMNLRKQGLI--EFMLD--SDNKKFPTYMGKRVIVDDG-LPAKDGVYTSYIFGEG--AFGLGNGEAPVPTETDREK-- 270 (346) Q Consensus 200 ~~~~~L~~~~li--~~~~~--s~~~~i~~~~G~~VVvdD~-~p~~~g~ytt~l~~~G--Ai~~~~~~~~~~vE~dRd~-- 270 (346) .+...|++.+=- .|+.+ ..++...+.+|+||++++. +|.+. ++++.+ ++.+++. .+|..++- T Consensus 376 ~t~~~I~klKD~~G~Yi~q~~~~~~~~~~llG~pvv~~~~~~~~~~-----~~~~~~~~~~~~~d~----~~~~~~~~~~ 446 (480) T protein:vir:40 376 QTFAELRKAKGTDGHSRFNELATKEQIAQSFGAVNLETRVWMPKDE-----VAVYNHDEYVLIGDL----NVENYNDFDL 446 (480) T ss_pred HHHHHHHHhhcCCCCeeccCcccccCcceecccceeeeeccccCCc-----ceeeeCCccEEEEec----ccceeccccc Confidence 999998876422 24432 2356778999999988753 33321 222222 2233321 12322211 Q ss_pred CcceeEEEEeeEEeeeeee---eeeccccccCCCC Q lcl|NC_015254. 271 LKGNDILINRQHFLLHPRG---IAWQEKSVAGHSP 302 (346) Q Consensus 271 ~~g~~~l~~r~~~~~~~~G---~s~~~~~~~~~sP 302 (346) ..-...+..+..-..++.. +.+.+.. ++... 
T Consensus 447 ~~~~~~~~~e~~v~g~~~~~~~~~~~~~~-~~~~~ 480 (480) T protein:vir:40 447 RYNVEQWLSETLVGGSIRGKNRSAYLKKK-GSLGV 480 (480) T ss_pred ccchhhhhhhhhhceeeEccccEEEEEec-cCcCC Confidence 1111111122221122211 1111000 01111 No 184 >protein:vir:98871 Length: 314 # NCBI annotation: major capsid protein # Family: family:all:3269 # MgeID: mge:1568 # MgeName: BCJA1c # Cross-refs: genbank:acc:YP_164418;genbank:gi:56694908;genbank:GeneID:3197261 Probab=64.22 E-value=0.3 Score=23.42 Aligned_cols=278 Identities=13% Similarity=0.111 Sum_probs=117.1 Q ss_pred Ccccee-----cceeeecCCceeeeeec--cchHHHHHHHhhHhHHHHhHhhcccccc-chhHHHHhhCCCcEEEecccc Q lcl|NC_015254. 1 MIKKLR-----MNLQKFAAGKNTRIADV--IVPEVFNKYVTERTAESSALLQSGIISN-DKDLDELAKSGGNMINMPFWQ 72 (346) Q Consensus 1 ~~~~~~-----~~~q~~~a~~~T~l~d~--i~Pev~~~yv~~~~~~~~~~~qSgi~~~-~~~~~~l~~~~G~ti~~P~~~ 72 (346) |-|++| -|+|+|+..++-+-.-+ +- .-|...++.-+.++..|.. .+.+ ...+|+... -++.=.=+ T Consensus 1 ~~~~~~~~~~~~~~~~~~~~t~N~n~avr~Y~-Kqf~glL~~vf~~qa~F~~--~FGg~lQalDGV~~--N~tafsvK-- 73 (314) T protein:vir:98 1 MKKQFKPFLPLNNIQFFASGTANQNKAARSYQ-KEFRQLLQAVFRSQAYFRD--FFGGGIEALDGVQH--NDTAFYVK-- 73 (314) T ss_pred CcccccccccccceeeeeeccccCccceeeec-HHHHHHHHHHHhhHhhhhh--hcccceeeccCCCc--cceEEEEe-- Confidence 999886 38999986532222111 22 2355555555555555522 1111 111121110 11111001 Q ss_pred cCCCcccccC--CCcccc-------chhhcccceeEEEEEe------ecCcceech-HHHhhhcchHHHHHHHHH---HH Q lcl|NC_015254. 73 DLTGEDEILD--DGEGAL-------TPGNISAAKDIARLHM------RGKAWRTND-LAKALSGDDPMRAIGDLV---VE 133 (346) Q Consensus 73 ~l~g~ae~~~--dg~~~i-------t~~~lt~~~~~a~~~~------~~k~~~~tD-~a~~~~g~dp~~~i~~q~---a~ 133 (346) ..|.+++- +..++- |...---+.+. -+++ .-.+|.+.+ +.++.-..|+-++++++| |. 
T Consensus 74 --tsD~pVVig~~Y~TdeNvaFGtGTg~SsRFGprk-Ei~y~dtdVpY~~~~~iHEGiD~~TVNnd~~aaVAdRL~LQA~ 150 (314) T protein:vir:98 74 --TSDIPVVVGNEYNKDENVGFGEGTSRSTRFGPRR-EIIYQDTPVPYTWEWVYHEGIDKHTVNNDFQAAVADRLDLQAN 150 (314) T ss_pred --ecccceeecCcccCCCCcccccCCccccccCcee-EEEeecccccccccchhhhccccccccCChhHHHHHHHHHHHH Confidence 11222211 111110 10011111111 1222 223455443 223444667888888776 67 Q ss_pred HHHHHHHHHHHHHHHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccc-cCceEEEEchHHHHHHHhhhhhh Q lcl|NC_015254. 134 YWNRRRQAVLIASLNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAE-GKLTGIAMHSQTEMNLRKQGLIE 212 (346) Q Consensus 134 ~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~~~~~L~~~~li~ 212 (346) +|.|.++..+=..|.-+.+.+- ...|++.+ . --..|++|..+|=+.. ..-....+||.+|..|.+..|.+ T Consensus 151 Akt~~~n~~~Gk~lS~~As~te---~ltd~~~d----~--V~~LF~~as~~yvn~ev~~~~~AyV~~evYnaiiD~~l~T 221 (314) T protein:vir:98 151 AKIKQFNAQHSKFISSIAEKTE---TLTDYSAD----N--VLRLFNELSKYYVNIEAIGTKAAKVSPELYNAIVDHPLTT 221 (314) T ss_pred HHHHHHHHHHHHHHHhhhhhhh---hhhhcchh----h--HHHHHHHHHhhhhcceeeEEEEEEEchhHHhHhhcccccc Confidence 8888888766555532111100 00122111 1 1145677777776643 33357889999999999999988 Q ss_pred hccccc----CceeeEEeceEEEEeCCCccCCCceEEEEEcCCeeEEee----cCCccceeeeecCCcceeEEEEeeEEe Q lcl|NC_015254. 213 FMLDSD----NKKFPTYMGKRVIVDDGLPAKDGVYTSYIFGEGAFGLGN----GEAPVPTETDREKLKGNDILINRQHFL 284 (346) Q Consensus 213 ~~~~s~----~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~GAi~~~~----~~~~~~vE~dRd~~~g~~~l~~r~~~~ 284 (346) .-+.|. -.+|..|.|..+... |.. -|.+|.+.+.. +++++-+-+.|-+ .++--=+ T Consensus 222 saK~SsaNIDengi~~FkGf~i~e~---P~~-------~~q~g~ia~~s~dnig~aftGIn~aR~I-------esEdF~G 284 (314) T protein:vir:98 222 SAKSSSANIDQNGIVNFKGFAIQEI---PES-------MLQSGDVAYTYITNIGKAFTGINTSRII-------ESEDFDG 284 (314) T ss_pred ccccceeeeccCCcceecceEEEec---chh-------hcCCCcEEEEccccceeecccceeeeee-------ecccccc Confidence 777653 246777888776652 332 13444444432 1222333333332 1111112 Q ss_pred eeeeeeeeccccccCCCCChHHhcCCcCceeeecccccceEEEEEeccccc Q lcl|NC_015254. 
285 LHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKNGVPG 335 (346) Q Consensus 285 ~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~~~~~ 335 (346) +.+.|--- .|+ | .-|.....|+.+.+. +.+ T Consensus 285 ValQgAGK-----~G~------------~--I~edNk~Ai~k~t~t--p~~ 314 (314) T protein:vir:98 285 VALQGAGK-----AGE------------F--ILDDNKKAVAKVTST--PEG 314 (314) T ss_pred eeeecccc-----ccc------------c--cccccceeeEEEecC--CCC Confidence 22222000 000 0 000001111111110 000 No 185 >protein:vir:97397 Length: 517 # NCBI annotation: major capsid protein # Family: family:all:11745 # MgeID: mge:1675 # MgeName: Q54 # Cross-refs: genbank:acc:YP_762590;genbank:gi:115304291;genbank:GeneID:5130600 Probab=63.72 E-value=0.31 Score=23.35 Aligned_cols=274 Identities=11% Similarity=0.045 Sum_probs=96.5 Q ss_pred CccceecceeeecCCc--eeee-----eeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhh-CCCcEEEecccc Q lcl|NC_015254. 1 MIKKLRMNLQKFAAGK--NTRI-----ADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAK-SGGNMINMPFWQ 72 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~--~T~l-----~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~-~~G~ti~~P~~~ 72 (346) |........+.=+++. .+.+ ..+..|.-+..-+...... -|+++ . ... .+.....+|... T Consensus 220 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~p~~~~~~i~~~~~~-----~~~i~-~------~~~~~~i~~~~~~~~~ 287 (517) T protein:vir:97 220 AEVAYMSASLTKDPKAAWTAELKERGISGMPAPAGILKRIQDAVND-----EGSLL-P------FIRHENLPTLVVGGDN 287 (517) T ss_pred HHHHHHHhcccccccceeeeecccccccccccchHHHHHHHHhhhh-----hccce-e------eeeeccccceeeeccc Confidence 0000000000000000 0001 0112222221111111100 01111 1 111 122344455443 Q ss_pred cCCCcccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcch----HHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 73 DLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDD----PMRAIGDLVVEYWNRRRQAVLIASLN 148 (346) Q Consensus 73 ~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~d----p~~~i~~q~a~~~~~~~~~~lla~L~ 148 (346) .- ..+..+.+|. ..+...++-.+.....+..+.-+.++..-..-+.-| -..-|.++++..+.++.+..+| . 
T Consensus 288 ~~-~~a~~~~eG~-~kp~s~~tf~~~~~~~~~ia~~~~~S~qll~Ds~~dd~~~l~s~i~~~l~~~l~~~ee~a~l---~ 362 (517) T protein:vir:97 288 AL-TQGTGHTTGT-DKTESNITLQTRVLTPQYVYKYIKLPKIVMNSNATDIAGAILTYVMNRLPDMVIMAVNRAII---M 362 (517) T ss_pred cc-ceeeeeecCC-cccccccceeeEEeeHhhhhhhhhhhHHHHHHhhhccHHHHHHHHHHHHHHHHHHHHHHHHh---c Confidence 21 2233445553 344455665555555555555555554322212222 2335888899999988888665 3 Q ss_pred hhhhhhhhhh-cceeeeccccccccccHHHHHHHHHHhCcccc--CceEEEEchHHHHHHHhhhh--hhhcccc--cCce Q lcl|NC_015254. 149 GITASGALDS-NKLDVSTETGDDSYFTGDTFLSATYKLGDAEG--KLTGIAMHSQTEMNLRKQGL--IEFMLDS--DNKK 221 (346) Q Consensus 149 G~~~~~~~~~-~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~~~~~L~~~~l--i~~~~~s--~~~~ 221 (346) |- +. +.+. ....+..........+.+.+.+.+..+-+... .-..|+||+.++..|++.+= =.|+.+. .++. T Consensus 363 Gd-Gt-g~~~~gi~~~a~~~~~~~~~~~~~~~d~i~~l~~a~~~a~~a~~vmn~~t~~~I~klKD~~G~Yl~~~~~~~~~ 440 (517) T protein:vir:97 363 GG-VT-GVSETQIYPVVGDAWATNVTGTTNIQELLEKLSVATPKAADSTLVIHRNDLAAIRFLKDKNGNYVFPVGVSNQT 440 (517) T ss_pred cc-CC-CcccccccccccccccccccccchHHHHHHHHHHHhhhccCCEEEECHHHHHHHHHhhcCCCCeeccCcCCccc Confidence 31 00 0000 01111100001111122333444433333222 23579999999999987642 2344332 2344 Q ss_pred eeEEeceEEEEeCCCccCCCceEEEEEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEE---eeeeeeeeec--ccc Q lcl|NC_015254. 222 FPTYMGKRVIVDDGLPAKDGVYTSYIFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHF---LLHPRGIAWQ--EKS 296 (346) Q Consensus 222 i~~~~G~~VVvdD~~p~~~g~ytt~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~---~~~~~G~s~~--~~~ 296 (346) ..+.+|..-++.. ++++. . +..+..|=+..........--+||+.+ ++.+..+..- +.++.-+.+. -+. T Consensus 441 ~~~l~G~~~~~~~-~~~~~--~-~~~~~~~y~i~~~~g~~~~~~fd~~~n--~~~f~~~~~~~g~i~~~~r~a~~~~~p~ 514 (517) T protein:vir:97 441 IATHFGFNRLVQS-VAVDE--K-TAVSLSGYVTNGSRGMEFEQGTILVEN--NKEYLFEMPISGSLEYKGTTAYGTYTPP 514 (517) T ss_pred ccccCCccccccc-cccCc--e-eEeeccccEEEeecceeeeeeeecccC--ceeEeeeeeeccccccccceEEEEEcCC Confidence 5566674333321 22221 1 122222211111111001111334322 2223222221 2223222221 122 Q ss_pred ccC Q lcl|NC_015254. 
297 VAG 299 (346) Q Consensus 297 ~~~ 299 (346) ++| T Consensus 515 ~~~ 517 (517) T protein:vir:97 515 VAG 517 (517) T ss_pred CCC Confidence 333 No 186 >protein:vir:94933 Length: 330 # NCBI annotation: putative phage structural protein # Family: family:all:1120 # MgeID: mge:1538 # MgeName: Xp15 # Cross-refs: genbank:acc:YP_239278;genbank:gi:66392060;genbank:GeneID:5076578 Probab=63.00 E-value=0.32 Score=23.26 Aligned_cols=280 Identities=15% Similarity=0.100 Sum_probs=114.4 Q ss_pred Cccc----eecceeeec------CCceeeeeec--cchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEe Q lcl|NC_015254. 1 MIKK----LRMNLQKFA------AGKNTRIADV--IVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINM 68 (346) Q Consensus 1 ~~~~----~~~~~q~~~------a~~~T~l~d~--i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~ 68 (346) |-.- |++-+...+ +..+-+|++. ..|.-+..-|.+.+.+.+.+++- .+ +. ..-|...+. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~p~l~m~alTLaea~~l~~d~~~~~VIE~l~~~s~iL~~-----lp-f~---~ve~~~~~~ 71 (330) T protein:vir:94 1 MVRICTPPLRGRWRTLTHQFPELKMPTVTLAESAKLSQDHLVSGLIETIVEVNPLYEM-----MP-FT---EIEGNALAY 71 (330) T ss_pred CceecCCccccceeehhccccccchhhhhhhHHhhcCchhhHHHHHHhhhccchHHhh-----cc-cc---cccCCccee Confidence 3221 122111111 1111222221 22333344444444444433220 00 00 111333333 Q ss_pred cccccCCCcc-cccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHH---HHHHHHHHHHHHH Q lcl|NC_015254. 
69 PFWQDLTGED-EILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLV---VEYWNRRRQAVLI 144 (346) Q Consensus 69 P~~~~l~g~a-e~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~---a~~~~~~~~~~ll 144 (346) +--..+.+.+ .++.++ +++++=.+..+.-.-...-.++..=|-...-.+++|+.....|+ .++..++++..+| T Consensus 72 ~r~~~lp~a~~r~~n~~---~~~~~~~Tf~q~t~~l~~l~~~~~Vd~~iadl~g~~~d~~~~q~~~~ieal~~~~e~~li 148 (330) T protein:vir:94 72 NRENVLGDVQFLAVGGT---ITAKNPATFTKVTSELTTLIGDAEVNGLIQATRSDFMDQTSVQVASKAKSIGRQYQASMI 148 (330) T ss_pred eeeecCCcceeeecccc---ccccCcceeeeeeechhhhhhhHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHhh Confidence 3333343222 122221 22222112222222222333332222222224778877765555 4556677777666 Q ss_pred HH------HHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhh-------h Q lcl|NC_015254. 145 AS------LNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGL-------I 211 (346) Q Consensus 145 a~------L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~l-------i 211 (346) .- ..|+...- .. ..-+... +..+.++.+.|...+.+.=.....-.+++||.+...+++.-.- . T Consensus 149 nGDs~~~~F~GL~~~~-~~--~q~i~tg-~~gg~~T~d~LDeLl~~v~~~~g~~~~~l~n~a~~r~I~a~~R~~~~~~v~ 224 (330) T protein:vir:94 149 TGDGTGNSFQGMMGLV-AA--SQTISAG-ANGGTLTFELLDQLLDLVKDKDGQVDYLMSSFAMRRKYFSLLRALGGAAIG 224 (330) T ss_pred ccCCCCccccchhhcC-Cc--ccEEecC-CCCCCCCHHHHHHHHHHhcCCCCCCcEEEechhHHHHHHHHHHhccCCCCC Confidence 41 00111100 01 1112211 1345688888888887764444556788887776655554321 1 Q ss_pred hhcccccCceeeEEeceEEEEeCCCccCCC------ceEEEEEc--C-----CeeEEeecC-CccceeeeecCCccee-- Q lcl|NC_015254. 212 EFMLDSDNKKFPTYMGKRVIVDDGLPAKDG------VYTSYIFG--E-----GAFGLGNGE-APVPTETDREKLKGND-- 275 (346) Q Consensus 212 ~~~~~s~~~~i~~~~G~~VVvdD~~p~~~g------~ytt~l~~--~-----GAi~~~~~~-~~~~vE~dRd~~~g~~-- 275 (346) .......|..+.+|.|+||+..|.+|.+.+ +-..|.+. . |-.++.... 
+.+.| |+...-++ T Consensus 225 ~~~~~~~G~~v~~~~GvPi~~~d~ip~~~~~~~~~~ttsIyav~~G~~~~~qgV~Gl~~~g~~glsV---r~~G~~~~k~ 301 (330) T protein:vir:94 225 EVMTLPSGRQIPTYRGVPWFVNDFIPSNMTQGTATNATAIFAGTFDDGSNKYGIAGLTARGSAGLRV---QNVGAKENAD 301 (330) T ss_pred CcccccCCCEEeeeCCeEEEecccccCCCCcccCCCceeEEEEeecccccccceEeecCCCCCccee---eeCCCccccc Confidence 111222356789999999999999998542 23334433 2 445554332 33444 44432111 Q ss_pred EEEE--eeEEeee---eeeeeeccccccC Q lcl|NC_015254. 276 ILIN--RQHFLLH---PRGIAWQEKSVAG 299 (346) Q Consensus 276 ~l~~--r~~~~~~---~~G~s~~~~~~~~ 299 (346) +... ++.+.+. +....--+....| T Consensus 302 v~~~~v~~y~~~av~~~~a~~~L~~V~~g 330 (330) T protein:vir:94 302 ETITRVKMYCGFANFSQLGLAAIKGLIPG 330 (330) T ss_pred eeeEEEEEeeeeEEechhheeeeccccCC Confidence 1111 2222222 2222221111111 No 187 >protein:vir:99523 Length: 311 # NCBI annotation: putative protein # Family: family:all:701 # MgeID: mge:1559 # MgeName: Lj928 # Cross-refs: genbank:acc:NP_958538;genbank:gi:41179320;genbank:GeneID:2717161 Probab=61.68 E-value=0.35 Score=23.09 Aligned_cols=271 Identities=11% Similarity=0.054 Sum_probs=119.1 Q ss_pred eeeecCCceeeeeeccc-hHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcccccCCCccc Q lcl|NC_015254. 9 LQKFAAGKNTRIADVIV-PEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGEDEILDDGEGA 87 (346) Q Consensus 9 ~q~~~a~~~T~l~d~i~-Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~ 87 (346) +-.. || |. + ++ -+.|.+.+.+.+.+.+ -||.+..+. . ..-.||++|.+|... .+|-. +.+... . T Consensus 1 ~~~~-an--~m-A--lnya~~~~~~Ld~~~~~~~---~t~~l~~~~-~--~~~~Gak~VkIp~i~-~~gl~-dY~R~~-g 65 (311) T protein:vir:99 1 MPTD-AE--TR-G--FNYVTKDGNLLDQKITAGL---FTAALGTPE-V--DLVNGGRSFTLKTIS-TSGLK-DHTRGK-G 65 (311) T ss_pred CCCc-ch--hh-H--HHHHHHHHHHHHHHHHhhh---cccceecCc-h--heeecCCEEEEEeee-ecccc-cccccc-C Confidence 1111 21 11 0 10 2455666666555432 356665443 1 123579999999997 44543 222222 2 Q ss_pred cchhhcccceeEEE-EEeecCcceechHHHh-----hhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhh-hhhcc Q lcl|NC_015254. 
88 LTPGNISAAKDIAR-LHMRGKAWRTNDLAKA-----LSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGA-LDSNK 160 (346) Q Consensus 88 it~~~lt~~~~~a~-~~~~~k~~~~tD~a~~-----~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~-~~~~~ 160 (346) -..+.++...+.-+ -+.|+.+|.+.-+... ++-++-+ +++..+.-.-.+|..-++.|-..+.... ..... T Consensus 66 ~~~g~v~~~~et~tl~~DR~~~f~vD~mDvdETn~~~~~ani~---~~f~r~~vvPEiDayrfskla~~a~~~~~~~~~~ 142 (311) T protein:vir:99 66 FNSGTISDEKTIYTMGQDRDVEFYLDRQDVDETDNELAMANIS---NVFITEHVQPELDSYRFSKIATSFDNLDGTDTEG 142 (311) T ss_pred ccccceeeeeeEEEeeeccceeeecchhchhhhhhhhHHHHHH---HHHHHhhhcchhhHHHHHHHHhhhhcccccccch Confidence 34566665544443 3457777777622221 2222222 3333333344456666665532111100 00000 Q ss_pred eeeeccccccccccH----HHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhhhh-ccccc------CceeeEEeceE Q lcl|NC_015254. 161 LDVSTETGDDSYFTG----DTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLIEF-MLDSD------NKKFPTYMGKR 229 (346) Q Consensus 161 ~dis~~~~~~~~~~~----~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li~~-~~~s~------~~~i~~~~G~~ 229 (346) ............++. +.|-.++..|=|....-.+++|.|.++.-|.+...+.. +...+ ...++.+.|++ T Consensus 143 ~~~~~~~~~~~~lt~~nvl~~l~~~~~~~~~v~~~~rvl~vTp~~~~lLk~~~~~~r~~~~~~~~~~~i~~~V~~lDgv~ 222 (311) T protein:vir:99 143 TLLAKTHKTEETLDETNAYSQLKTGIGKVRKYGTQNLVGYVSSEVMDALERSKEFTRNITNQNVGTTALESRITSIDGVQ 222 (311) T ss_pred hhhccccccccccCHHHHHHHHHHHHHHHHhcCCCCeEEEEChHHHHHHhhchhhheeeecccccccccccccceecCeE Confidence 000000111122333 44555566664444455899999999998776554432 22111 23578999998 Q ss_pred EEEe---CCCc-----------cCCCceEEEEEcCCeeEEeecCC-ccc-eeeeecCCcceeEEEEeeEEeeeee----- Q lcl|NC_015254. 230 VIVD---DGLP-----------AKDGVYTSYIFGEGAFGLGNGEA-PVP-TETDREKLKGNDILINRQHFLLHPR----- 288 (346) Q Consensus 230 VVvd---D~~p-----------~~~g~ytt~l~~~GAi~~~~~~~-~~~-vE~dRd~~~g~~~l~~r~~~~~~~~----- 288 (346) ||.- +.+. ...++...|++.+....+.-.+. 
.+- .+.+-+-.+..-.+..|..+-+-++ T Consensus 223 Ii~V~ps~r~~t~~~ft~G~~~~~~ak~INfiiv~~~a~i~~~K~~~v~~f~P~~~~~gd~~l~~~R~Y~D~fv~~nk~~ 302 (311) T protein:vir:99 223 LIEVYESNRFMTKYDFTDGAKPTEDAKAINFLVVAKPAVISIVKENAVFLFAPGQHTDGDGYLYQNRLYHDLFIKKHKRD 302 (311) T ss_pred EEEecCchhhcchhhhcCCccccCcccccceEEeCCCeeeeeeeeeeeeeeCCCCCCCcceeeeeeeeeeeeeeeccccC Confidence 7643 3332 22334455665544333322221 110 1111111111233445555555544 Q ss_pred eeeeccccc Q lcl|NC_015254. 289 GIAWQEKSV 297 (346) Q Consensus 289 G~s~~~~~~ 297 (346) |+-..-+.. T Consensus 303 ~Iyv~~k~A 311 (311) T protein:vir:99 303 GIFVSVKKA 311 (311) T ss_pred eEEEeeecC Confidence 322211111 No 188 >protein:vir:96490 Length: 348 # NCBI annotation: head protein # Family: family:all:1083 # MgeID: mge:1620 # MgeName: 2972 # Cross-refs: genbank:acc:YP_238492;genbank:gi:66391768;genbank:GeneID:5176912 Probab=58.70 E-value=0.41 Score=22.72 Aligned_cols=298 Identities=11% Similarity=0.075 Sum_probs=114.6 Q ss_pred eeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHH----HhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 17 NTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDE----LAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 17 ~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~----l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) .-.+.|+|++..++.|+.+.......|+....+......+. +....+..+..||-..- ...... .-+. T Consensus 1 M~~i~d~f~~~~l~~~i~~~~~~~~~~l~~~~Fp~~~~~~~~~~~~~~~~~~~~~a~~v~~~-~~~~~~-------~r~~ 72 (348) T protein:vir:96 1 MGLIYDKVTASNIAGYFNTLQENVDSTLGESIFPARKQLGTKLSYIKGASGQSVALKAAAFD-TNVTIR-------DRVS 72 (348) T ss_pred CcchhhccCHHHHHHHHHhcccchhhhhhhhcCCCccccceeEEEEeecCCceeEeeeecCC-CCccee-------cccc Confidence 44567999999999999765444444444333332221111 11122223333333221 111111 1111 Q ss_pred cccceeEEEEEeecCcceechHHHhhh----cchH-HHHHHHHHH-------HHHHHHHHHHHHHHHH-hhhhhhh---- Q lcl|NC_015254. 
93 ISAAKDIARLHMRGKAWRTNDLAKALS----GDDP-MRAIGDLVV-------EYWNRRRQAVLIASLN-GITASGA---- 155 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~----g~dp-~~~i~~q~a-------~~~~~~~~~~lla~L~-G~~~~~~---- 155 (346) .++..-....++......+.|.-.+.. +.++ .+.+.++++ +.+.+..+..+..+|. |-+.... T Consensus 73 ~~~~~~~~p~i~~~~~i~~~d~~~l~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~~~~~~~~ 152 (348) T protein:vir:96 73 AEIHDEQMPFFKEALLVKENDRQQLNLVKDTGNEALINTIVAGIFNDDVTLINGARARLEAMRMQVLATGKIAFTSDGVN 152 (348) T ss_pred eeeeeeecCccccccccCHHHHHHHHhhhccCCchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCeeEeecCCee Confidence 122211112222333455666544311 2222 344555544 3455555556666553 4332211 Q ss_pred -------hhhcceeeeccccc-cccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh-hhccccc--Cc---- Q lcl|NC_015254. 156 -------LDSNKLDVSTETGD-DSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI-EFMLDSD--NK---- 220 (346) Q Consensus 156 -------~~~~~~dis~~~~~-~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li-~~~~~s~--~~---- 220 (346) .+.|+...+..=++ .+. -...|.+.....-+.......++|+++++..|+++..+ +.+.... .+ T Consensus 153 ~~vdfg~~~~~~~t~~~~W~~~~ad-p~~di~~~~~~~~~~G~~~~~~i~~~~~~~~l~~~~~v~~~~~~~~~~~~~~~~ 231 (348) T protein:vir:96 153 KDIDYGVKADHKKQVSKSWAEPGAT-PLADLEDAIETARELGLNPERAIMNAKTFGLIRKAASTVKAIKPLAGDGSSVTK 231 (348) T ss_pred EEEeccCCcccceeeccccCCCCCC-HHHHHHHHHHHHHhcCCcccEEEeCHHHHHHHhcCHHHHHHHhccCCccccccH Confidence 01222222211000 111 11333334333323334567899999999999987544 3333221 11 Q ss_pred -----eeeEEeceEEEEe-CCCccCCCceEEE-------EEcCCeeEEeecCCccceeeeecCCcceeEEEEeeEEeeee Q lcl|NC_015254. 221 -----KFPTYMGKRVIVD-DGLPAKDGVYTSY-------IFGEGAFGLGNGEAPVPTETDREKLKGNDILINRQHFLLHP 287 (346) Q Consensus 221 -----~i~~~~G~~VVvd-D~~p~~~g~ytt~-------l~~~GAi~~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~ 287 (346) .++++.|.+|++= +.....+|+.+.| ++..|.++...=.+ ..|- .+...+.+.-..-....-+. 
T Consensus 232 ~~~~~~~~~~~g~~i~~y~~~y~d~~G~~~~~~p~~~v~l~~~~~~G~~~yg~--~~e~-~~~~~~~~~~~~~~~~~~~~ 308 (348) T protein:vir:96 232 AELQNYVADNYGVEIVLENGTYRNEKGEVSKFFPDGHLTLIPNGPLGNTVFGT--TPEE-SDLFADNTVNADVEIVDSGI 308 (348) T ss_pred HHHHHHHhhhcCceEEEEccEEEecCCcEeccccCCeEEEEcCCCceeEEecc--Chhh-hhhhhcccccccceecCCee Confidence 1345667777663 3332234433222 22222222111010 1111 11111111000000001112 Q ss_pred eeeeeccccccCCCCChHHhcCCcCceeee-cccccceEEEEEec Q lcl|NC_015254. 288 RGIAWQEKSVAGHSPTNTEIEKGNNWKAVY-ESKNIRIVAFVHKN 331 (346) Q Consensus 288 ~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~-~~K~i~iv~~~~k~ 331 (346) +-.+|.+ ..|....+...++.=-|. ++..+-++.+..-+ T Consensus 309 ~~~~~~~-----~dP~~~~~~~~s~plPv~~~~~~~~~a~Vl~~~ 348 (348) T protein:vir:96 309 AVTTTKT-----TDPVNVQTKVSMVALPSFERLGDVYMLTVIPGV 348 (348) T ss_pred EEEeeec-----CCCceEEEEEeeeeeccccCCCcEEEEEEecCC Confidence 2234532 236555555444432222 22333333333222 No 189 >protein:vir:103886 Length: 302 # NCBI annotation: putative major head subunit protein # Family: family:all:776 # MgeID: mge:1522 # MgeName: D3112 # Cross-refs: genbank:acc:NP_938242;genbank:gi:38229147;genbank:GeneID:2648201 Probab=58.69 E-value=0.41 Score=22.72 Aligned_cols=271 Identities=10% Similarity=0.048 Sum_probs=116.9 Q ss_pred eccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhh-CCCcEEEecccccCCCcccccCCCccccchhhcccceeEE Q lcl|NC_015254. 22 DVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAK-SGGNMINMPFWQDLTGEDEILDDGEGALTPGNISAAKDIA 100 (346) Q Consensus 22 d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~-~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a 100 (346) =+|+|+.+...-. ..++.++-|.-..-+.....+. 
.+.++ ..=.+..| |+.+.+.+-.+......|+.....- T Consensus 1 m~it~~~l~~l~~----~~~~~~~~~y~~a~~~~~~~a~~~~sdf-~~~~~~~l-g~~p~l~e~~Ge~~~~~l~~~~~~i 74 (302) T protein:vir:10 1 MLINKQSLNAAFV----AIKTIFNNAFAAAPTTWQKIAMEVPSNT-SSNDYKWL-STFPKMRRWIGAKVVKNLKAYKYVV 74 (302) T ss_pred CcccHHHHHHHHH----HHHHHHHHHHHhhhhhhhceeeecCCCc-ceeeceec-CCCCCccccccceeeccccccceeE Confidence 1245554432111 1111111122222233333322 12221 22223345 4555554423456677888888877 Q ss_pred EEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhhhcceeeecc-------------- Q lcl|NC_015254. 101 RLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALDSNKLDVSTE-------------- 166 (346) Q Consensus 101 ~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~~~~~dis~~-------------- 166 (346) ..+..++-++++-.+-.==-=.-...+.++++.+.++..++.+.+.|.+-++...-.++.+--+.+ T Consensus 75 ~~~~~g~~v~i~R~~i~nDdlg~~~~~~~~~G~aaa~~~~~lv~~~L~~g~~~~~~DG~~fF~~dH~~g~~~~~N~g~~~ 154 (302) T protein:vir:10 75 ENEDFEATVEVDRNDIEDDQIGIYSPQAKMAGYSAAQLPDELVYEAVNGAFTKPCFDGQYFIDTDHPVGDASVSNKGTAP 154 (302) T ss_pred EeecccceecccHHhhcccccchhHHHHHHHHHHHHhhHHHHHHHHHhccCCCcccCCcceecccccccccccccccchh Confidence 788888888877665431001234445666677888888888888887655432222211110000 Q ss_pred -ccccccccHHHHHHHHHHh----Ccccc----CceEEEEchHHHHHHHhhhhhhhcccccCceeeEEec-eEEEEeCCC Q lcl|NC_015254. 167 -TGDDSYFTGDTFLSATYKL----GDAEG----KLTGIAMHSQTEMNLRKQGLIEFMLDSDNKKFPTYMG-KRVIVDDGL 236 (346) Q Consensus 167 -~~~~~~~~~~~l~~A~~~~----GD~~~----~~~~ivmhS~~~~~L~~~~li~~~~~s~~~~i~~~~G-~~VVvdD~~ 236 (346) ......++.+.|..|...| ++... .-..|++.|......++. 
+...+..++..-+ +.| .++|++-.+ T Consensus 155 ~~~~~~~l~~~~~~aa~~am~~~k~~~G~~L~i~P~~LiVp~~le~~A~~l--l~~~~~~~g~~Np-~~g~~~~vv~p~L 231 (302) T protein:vir:10 155 LSNASQAAAKAGYGAARTAMKKFKDEEGRSLNVSPNVLLVGPALEDVAKML--LTNPKLADNTPNP-YVGTAELVVDGRI 231 (302) T ss_pred hhhcccccchHHHHHHHHHHHHHhhhcccccccCCCEEEecchhHHHHHHH--hhccccCCCCcce-eccceEEEEeecc Confidence 0011235666676665554 22222 224688887776655432 1111112222222 334 577777655 Q ss_pred ccCCCceEEEEEcCCe-eE--EeecCCccceeeeecCCcceeEEEEeeEEeeeee---eeeeccccccCCCCChH Q lcl|NC_015254. 237 PAKDGVYTSYIFGEGA-FG--LGNGEAPVPTETDREKLKGNDILINRQHFLLHPR---GIAWQEKSVAGHSPTNT 305 (346) Q Consensus 237 p~~~g~ytt~l~~~GA-i~--~~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~---G~s~~~~~~~~~sPt~a 305 (346) .... --||+...+ +. |-.++..-.+|..-+.....--+-.+..|.+.-| ||.+=...-+.+ ++-+ T Consensus 232 ~s~~---aWyL~a~~~~i~~~~l~g~~~P~~~~~~~~~~dgv~~k~~~d~Gvd~R~~~G~~~wq~a~~s~-g~~~ 302 (302) T protein:vir:10 232 ESDT---AWFLLDTTKPVKPFIFQPRKQPEFVSQVNLDSDDVFNLRKLKFGAEARAAAGYGFWQLAYGST-GTGA 302 (302) T ss_pred CCCC---ceEEEecCCccceEEEcCccccEEEeccCCCCCceEEEEEEEEeeeeeeecchhhhhhhhccC-ccCC Confidence 4332 235553322 21 2233432224433333222222333444444333 332111111111 1111 No 190 >protein:vir:107687 Length: 319 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1518 # MgeName: T1 # Cross-refs: genbank:acc:YP_003898;genbank:gi:45686314;genbank:GeneID:2773027 Probab=48.76 E-value=0.66 Score=21.57 Aligned_cols=272 Identities=10% Similarity=0.084 Sum_probs=110.7 Q ss_pred Cccceecce------eeecCCceeeeeec-cchHHHH----HHHhhHhHHHHhHhhccccccchhHH--HHhhCCCcEEE Q lcl|NC_015254. 1 MIKKLRMNL------QKFAAGKNTRIADV-IVPEVFN----KYVTERTAESSALLQSGIISNDKDLD--ELAKSGGNMIN 67 (346) Q Consensus 1 ~~~~~~~~~------q~~~a~~~T~l~d~-i~Pev~~----~yv~~~~~~~~~~~qSgi~~~~~~~~--~l~~~~G~ti~ 67 (346) |- .+++|. |++..+ .+...|- ...-+|. .++.+...+... .. ++-...+. 
+-...+-.+++ T Consensus 1 ~~-~~~~~~~~~~~~~~~~~~-~~~~~da~~~~g~~~~~ql~~id~~v~e~~~--~~--l~~~~~i~v~~~~~~~~~~~~ 74 (319) T protein:vir:10 1 MT-TKKFDEADKSNVEMYLIQ-AGVKQDAAATMGIWTAQELHRIKSQSYEEDY--PV--GSALRVFPVTTELSPTDKTFE 74 (319) T ss_pred CC-CcchhHHhhHHHHHHHhh-ccchhhhhhhhhhHHHHHHHHHHHHHHhhhh--cc--eechhhcccccCCCCceEEEE Confidence 53 355553 333222 2222221 1111232 344443333210 00 00011111 11122334677 Q ss_pred ecccccCCCcccccCCCccccchhhcccceeEEEEEeecCc--ceechHHHhhhcc-hHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015254. 68 MPFWQDLTGEDEILDDGEGALTPGNISAAKDIARLHMRGKA--WRTNDLAKALSGD-DPMRAIGDLVVEYWNRRRQAVLI 144 (346) Q Consensus 68 ~P~~~~l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~--~~~tD~a~~~~g~-dp~~~i~~q~a~~~~~~~~~~ll 144 (346) .+.+... |.++...++.++++...+........++..+.+ |+.-|+......+ +--..-+...+..++++.++.++ T Consensus 75 ~~~~~~~-G~a~~~~d~~~dip~v~~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~i~f 153 (319) T protein:vir:10 75 YMTFDKV-GTAQIIADYTDDLPLVDALGTSEFGKVFRLGNAYLISIDEIKAGQATGRPLSTRKASACQLAHDQLVNRLVF 153 (319) T ss_pred eeeeccc-cceeeecCccccccceeccceeeEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEE Confidence 7777765 777777777777877777777777777776665 6667777765555 33334555566677776666544 Q ss_pred HH-----HHhhhhhhhhhhcceeeecccccccccc----HHHHHHHHHHh-----CccccCceEEEEchHHHHHHHh-hh Q lcl|NC_015254. 145 AS-----LNGITASGALDSNKLDVSTETGDDSYFT----GDTFLSATYKL-----GDAEGKLTGIAMHSQTEMNLRK-QG 209 (346) Q Consensus 145 a~-----L~G~~~~~~~~~~~~dis~~~~~~~~~~----~~~l~~A~~~~-----GD~~~~~~~ivmhS~~~~~L~~-~~ 209 (346) -- +.|++....... ...+... ..++-+ .+.++.+...+ |- ..-..++++|..|..|.. 
.+ T Consensus 154 ~G~~~~g~~GLlN~p~~~~--~~~~~~~-~~~t~t~~~i~~di~~~~~~l~~~s~g~--~~p~~L~L~p~~~~~L~~~~~ 228 (319) T protein:vir:10 154 KGSAPHKIVSVFNHPNITK--ITSGKWI-DVSTMKPETAEAELTQAIETIETITRGQ--HRATNILIPPSMRKVLAIRMP 228 (319) T ss_pred eecccccceeEEeCCCcee--eecCCCC-CccccCHHHHHHHHHHHHHHHHHhcCce--eeceEEEecHHHHHhhhcccC Confidence 21 112222222111 0000000 001111 23455555543 32 234589999999999852 21 Q ss_pred -----hhhhcccccCceeeEEeceEEEEeCCCccCCCc--eEEEEEcCCeeEEeecCCc--cceeeeecCCcce------ Q lcl|NC_015254. 210 -----LIEFMLDSDNKKFPTYMGKRVIVDDGLPAKDGV--YTSYIFGEGAFGLGNGEAP--VPTETDREKLKGN------ 274 (346) Q Consensus 210 -----li~~~~~s~~~~i~~~~G~~VVvdD~~p~~~g~--ytt~l~~~GAi~~~~~~~~--~~vE~dRd~~~g~------ 274 (346) .+++++... ...++.+.+.+.+.+ +.|+ ...|.-.+.-+.+....+. .++|. |+..-.. T Consensus 229 ~~~~t~l~~lk~~~--~~l~I~~~pel~~ag---~~g~~~~v~y~~~~~~~~~~v~~~~~~~~~e~-~~l~~~~~~~~r~ 302 (319) T protein:vir:10 229 ETTMSYLDYFKSQN--SGIEIDSIAELEDID---GAGTKGVLVYEKNPMNMSIEIPEAFNMLPAQP-KDLHFKVPCTSKC 302 (319) T ss_pred CCCeeHHHHHHHhc--CCceEEEeeeecccC---CCcceEEEEEecCCceEEEecCcceeeeeeee-cCceEEEeeeeee Confidence 223333221 111222222222211 1111 1112222222222211211 12221 1110000 Q ss_pred -eEEEEeeEEeeeeeee Q lcl|NC_015254. 275 -DILINRQHFLLHPRGI 290 (346) Q Consensus 275 -~~l~~r~~~~~~~~G~ 290 (346) -.++-|-.-++.+-|+ T Consensus 303 ~Gv~i~~P~ai~~~dGI 319 (319) T protein:vir:10 303 TGLTIYRPMTIVLITGV 319 (319) T ss_pred EEEEEEccceeEeeecC Confidence 0112222333444455 No 191 >protein:vir:79548 Length: 652 # NCBI annotation: putative protease/scaffold protein # Family: family:all:62 # ACLAME annotation(s): go:0008236 - serine-type peptidase activity; phi:0000017 - phage prohead/capsid assembly # MgeID: mge:1871 # MgeName: cdtI # Cross-refs: genbank:acc:YP_001272518;genbank:gi:148609387;genbank:GeneID:5204384 Probab=44.52 E-value=0.8 Score=21.10 Aligned_cols=279 Identities=9% Similarity=-0.016 Sum_probs=125.0 Q ss_pred Cccceecceeee--cCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCCcc Q lcl|NC_015254. 
1 MIKKLRMNLQKF--AAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTGED 78 (346) Q Consensus 1 ~~~~~~~~~q~~--~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g~a 78 (346) |+..--+...+- .++=.-=|.|+++-.+...|-..+.+ --+|..-| ...++. +...+.+--|. +- T Consensus 351 ~~~~~~v~~A~~hsTsDFp~IL~~~~nk~l~~~y~~a~~t-~~~~~~~~---~~~DFk-----~~~~~~lg~~~----~L 417 (652) T protein:vir:79 351 YNPMQMVGAAFTHSTSDFGNILLDVANKAILQGWEDAPET-YEQWTRKG---QLSDFK-----IAHRVGMGGFS----AL 417 (652) T ss_pred CCHHHHHHHHhhcCcchHHHHHHHHHHHHHHHHHhhhHHH-HHHHhccC---CCcccc-----ccceeecCCCC----Cc Confidence 432110111100 00000012222333333333222111 01111111 111222 22333332233 33 Q ss_pred cccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh- Q lcl|NC_015254. 79 EILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGALD- 157 (346) Q Consensus 79 e~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~~- 157 (346) +.|.|+ +.+....++-....-.+...|+-+++|-.+-.=--=+.+..+-+.++.+-.+...+.+.+.|.+ ...+. T Consensus 418 ~~V~E~-gEyk~~t~~e~~e~~~l~tyG~~~~iTRqaiINDDL~a~~~ip~~~g~aA~~~~~~~vy~~l~~---Np~~~~ 493 (652) T protein:vir:79 418 RQVREG-AEYKYVTTGDKQATIALATYGELFSITRQAIINDDLNMLTDVPMKLGRAAKSTIADLVYAILTS---NPKIST 493 (652) T ss_pred cccCCC-CccceeeecCccceeeeecccCeeeeehheeeccchhHHHHHHHHHHHHHHHHHHHHHHHHHhc---Cccccc Confidence 445565 3566666666666667778888888887654311114566777888888888888888887742 22222 Q ss_pred hcceee-ecccc---ccccccHHHHHHHHHHhCcccc-------CceEEEEchHHHHHHHhhhhhhhccccc--CceeeE Q lcl|NC_015254. 158 SNKLDV-STETG---DDSYFTGDTFLSATYKLGDAEG-------KLTGIAMHSQTEMNLRKQGLIEFMLDSD--NKKFPT 224 (346) Q Consensus 158 ~~~~di-s~~~~---~~~~~~~~~l~~A~~~~GD~~~-------~~~~ivmhS~~~~~L~~~~li~~~~~s~--~~~i~~ 224 (346) +.+.-. 
.++.+ ..+.++.+.|..|...|..+.+ .-..|+++|......++.-.-..++.++ ++.+.- T Consensus 494 DGk~LF~hA~H~Nl~~~aa~~~~~l~~ar~aM~~Qk~g~~~l~i~P~~llvp~~le~~a~~ll~s~~v~~a~~~~~~~Np 573 (652) T protein:vir:79 494 DNVSLFDKAKHANVLESAAMDVASLDKARQLMRVQKEGERHLNIRPAFVLVPTAMESVANQVIRSSSVKGADINAGIINP 573 (652) T ss_pred CCceeecccccccccccccCCHHHHHHHHHHHHHhccCCccccccccEEEecchhHHHHHHHhccCCCcccccccccccc Confidence 222222 22222 2356899999998876643221 2346888888776654432113444443 344555 Q ss_pred Eece-EEEEeCCCccCCCceEEEEEcCC---eeEE--eecCCccceeeeecCCcceeEEEEeeEEeeeeeeee-eccccc Q lcl|NC_015254. 225 YMGK-RVIVDDGLPAKDGVYTSYIFGEG---AFGL--GNGEAPVPTETDREKLKGNDILINRQHFLLHPRGIA-WQEKSV 297 (346) Q Consensus 225 ~~G~-~VVvdD~~p~~~g~ytt~l~~~G---Ai~~--~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~~G~s-~~~~~~ 297 (346) +.|. +||++-.+...+. ..-|++.+. .|.+ -++...-.+|+.-+-..---.+-.|.-|++.+..|- |.+++. T Consensus 574 ~~~~~~~i~eprL~~~s~-~~wylaa~~~~dtiev~yL~G~~~P~ie~~~gf~~dG~~~kvrlD~G~~~iD~RG~~k~t~ 652 (652) T protein:vir:79 574 VKDFATVIAEPRLDDNSQ-TTFYLAASKGSDTIEVAYLNGVDTPYIDQMEGFSVDGVTTKVRIDAGVAPVDHRGLVKCTA 652 (652) T ss_pred cccccccccccccCCCCc-ccEEEecCCCCCeEEEEEecCCCCCeeeecCCCCcceEEEEEEEeccCceeeccceeeecC Confidence 6665 6666655544332 223454332 2433 334432224443221111122334555555554332 212221 No 192 >protein:vir:104342 Length: 314 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1593 # MgeName: RTP # Cross-refs: genbank:acc:YP_398971;genbank:gi:81343955;genbank:GeneID:3778874 Probab=34.29 E-value=1.3 Score=19.96 Aligned_cols=275 Identities=8% Similarity=0.087 Sum_probs=115.5 Q ss_pred eecceeeecCCceeeeeec--cchHHHHHHHhhHhHHHHh-HhhccccccchhHH--------HHhhCCCcEEEeccccc Q lcl|NC_015254. 5 LRMNLQKFAAGKNTRIADV--IVPEVFNKYVTERTAESSA-LLQSGIISNDKDLD--------ELAKSGGNMINMPFWQD 73 (346) Q Consensus 5 ~~~~~q~~~a~~~T~l~d~--i~Pev~~~yv~~~~~~~~~-~~qSgi~~~~~~~~--------~l~~~~G~ti~~P~~~~ 73 (346) .-||+-.+.|...+++..+ ..-+.-..|+.+++..... +.+ .+.+.+. +.....-.+++.+.+.. 
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~d~~~~fl~~ql~~id~~v~e----~~~~~~~~~~~i~v~~~~~~~~et~~~~~~e~ 76 (314) T protein:vir:10 1 MAIKFDAEQAKITTHLEQMGVEKADAAGIWAVSQLTAALNRAYE----KEYAENSVVNIFPVTNEIPGHAKYFEYPEFDG 76 (314) T ss_pred CccchHHHHHHHHHHHHhhcccchhhhHHHHHHHHHHHHHHHhh----hhccccccceeeccccCCCCceeEEEeeeecc Confidence 3444445555555555443 2222333455444332221 111 1111111 11111233778888876 Q ss_pred CCCcccccCCCccccchhhcccceeEEEEEeecCc--ceechHHHhhhcchHH-HHHHHHHHHHHHHHHHHHHHHH---- Q lcl|NC_015254. 74 LTGEDEILDDGEGALTPGNISAAKDIARLHMRGKA--WRTNDLAKALSGDDPM-RAIGDLVVEYWNRRRQAVLIAS---- 146 (346) Q Consensus 74 l~g~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~--~~~tD~a~~~~g~dp~-~~i~~q~a~~~~~~~~~~lla~---- 146 (346) . |.+....++.++++.......+....++..+.+ |+.-|+......+=|+ ..-+...+..+++..++.++-- T Consensus 77 ~-G~a~~~~d~~~dip~vd~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~i~f~G~~~~ 155 (314) T protein:vir:10 77 V-GIAQIIADYSDDLPLVDAFMTEKQGKVFRFGNAFLISTDEIKAGAATGQSLSARKQALAFEAHDNLLDKLVWSGSAPH 155 (314) T ss_pred c-cceeeeCCcccccceeecccceeEEEEEEEEeeEEecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEEeecccc Confidence 5 777777777777877777777778878877776 5556777665544343 3344445556666666544321 Q ss_pred -HHhhhhhhhhhhcceeeeccccccccccHHHHHHHHHHhCcc---ccCceEEEEchHHHHHHHhh----h--hhhhccc Q lcl|NC_015254. 147 -LNGITASGALDSNKLDVSTETGDDSYFTGDTFLSATYKLGDA---EGKLTGIAMHSQTEMNLRKQ----G--LIEFMLD 216 (346) Q Consensus 147 -L~G~~~~~~~~~~~~dis~~~~~~~~~~~~~l~~A~~~~GD~---~~~~~~ivmhS~~~~~L~~~----~--li~~~~~ 216 (346) +.|++......... .+.+- +++.--.+.++.+...+=.. ...-..++++|..|..|..- + +.++++. 
T Consensus 156 g~~GLlN~p~v~~~~--~~~~W-aT~~ei~~Di~~~~~~l~~~s~g~~~p~~l~Lpp~~~~~L~~~~~~~~~tvl~~l~~ 232 (314) T protein:vir:10 156 GIVSVFDQPNINNVV--ATPNW-SVPQNAIDDVTAMIDAVESSTQGLHHVTDILLPASARRVMQGLVPQTNLSYGELFTR 232 (314) T ss_pred cceeEeecCCCcccc--CCCCc-ccHHHHHHHHHHHHHHHHHhcCccccceeEEecHHHHHhhcccccCCCccHHHHHHH Confidence 11222211111000 00000 00000145556665553111 11234789999988877531 1 1223322 Q ss_pred ccCceeeEEeceEEEEeCCCccCCCc--eEEEEEcCCeeEEeecCC--ccceeeeecCC-----cce--eEEEEeeEEee Q lcl|NC_015254. 217 SDNKKFPTYMGKRVIVDDGLPAKDGV--YTSYIFGEGAFGLGNGEA--PVPTETDREKL-----KGN--DILINRQHFLL 285 (346) Q Consensus 217 s~~~~i~~~~G~~VVvdD~~p~~~g~--ytt~l~~~GAi~~~~~~~--~~~vE~dRd~~-----~g~--~~l~~r~~~~~ 285 (346) .. ..-++.+.+-..+.+ +.|+ +..|.-.+.-+.+....+ ..++|. |+.. ... -..+-|-.-++ T Consensus 233 n~--~~l~I~~~~el~~ag---~~g~~~~v~y~~~~~~~~~~vp~~~~~l~~e~-~~~~~~~~~~~r~~Gv~i~~P~ai~ 306 (314) T protein:vir:10 233 NN--PGLTIRFLQFLDNYD---GAGGKAALAFEKSPLNMSIEIPEVTNVLPAQP-KDLHFRYPVTSKATGLIVYRPLTMA 306 (314) T ss_pred hC--CCcEEEEcccccccC---CCcceEEEEEecCCcEEEEecCccceeeccee-cCceEEEcceeeeEEEEEECcceeE Confidence 11 111122222222211 1111 111111111111111111 112221 1110 000 12233444456 Q ss_pred eeeeeeec Q lcl|NC_015254. 286 HPRGIAWQ 293 (346) Q Consensus 286 ~~~G~s~~ 293 (346) .+-|++|. T Consensus 307 ~~dGI~~~ 314 (314) T protein:vir:10 307 VIKGITFA 314 (314) T ss_pred eeeeeecC Confidence 67799985 No 193 >protein:vir:95512 Length: 693 # NCBI annotation: Putative Clp protease # Family: family:all:62 # ACLAME annotation(s): go:0008236 - serine-type peptidase activity; phi:0000017 - phage prohead/capsid assembly # MgeID: mge:1574 # MgeName: F10 # Cross-refs: genbank:acc:YP_001293349;genbank:gi:148912770;genbank:GeneID:5228164 Probab=27.58 E-value=1.8 Score=19.14 Aligned_cols=277 Identities=12% Similarity=0.041 Sum_probs=127.8 Q ss_pred CccceecceeeecCCcee----eeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC Q lcl|NC_015254. 
1 MIKKLRMNLQKFAAGKNT----RIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG 76 (346) Q Consensus 1 ~~~~~~~~~q~~~a~~~T----~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g 76 (346) ||..--+... |+ +++- =|.|+++-.++..|-..+.+= -+|..-| ...++. +...+.+=-|. T Consensus 386 ~~~~~~~~~a-~~-htTSDFp~IL~~~~nk~l~~~y~~a~~t~-~~~~~~~---~~~DFk-----~~~~~~lg~~~---- 450 (693) T protein:vir:95 386 LNAPQMVGLA-FT-HTSSDFGLILLDVANKSVLAGWEEAEETF-PLWTKSG---ILTDFK-----PARRVGLGEFS---- 450 (693) T ss_pred CCHHHHHHHH-Hh-cCcchhHHHHHHHHHHHHHHHHHhhhhHH-HHHhccC---CCCccc-----ccceeecCCCC---- Confidence 4432212222 21 1000 122333333333333222110 0111111 111122 22333322222 Q ss_pred cccccCCCccccchhhcccceeEEEEEeecCcceechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhh Q lcl|NC_015254. 77 EDEILDDGEGALTPGNISAAKDIARLHMRGKAWRTNDLAKALSGDDPMRAIGDLVVEYWNRRRQAVLIASLNGITASGAL 156 (346) Q Consensus 77 ~ae~~~dg~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~~~~~~~~~lla~L~G~~~~~~~ 156 (346) +-+.|.|+ +.+....++-....-.+...|+-+++|-.+-.=--=+.+..+-++++.+-.+...+.+.+.|.+ ...+ T Consensus 451 ~L~~V~E~-gEyk~~t~~e~~e~~~l~tyG~~~~iTRqaiINDDLga~~~ip~~~g~aA~~~~~~~vy~~L~~---Np~m 526 (693) T protein:vir:95 451 SLRQVREG-AEYKYVTLGERGEQIILATYGELFSITRQAIINDDLQMLSDIPFKLGQAAKATIGDLVYAVLTG---NPAM 526 (693) T ss_pred ChhhcCCC-CceeeeecCCccceeehhhcCCeeeecHHhhhccchHHHHHHHHHHHHHHHHHHHHHHHHHHhc---Cccc Confidence 22445565 3455566666666666777888898887765421114566688888888888888888888753 2222 Q ss_pred hhcceeeecccc-----ccccccHHHHHHHHHHhCcccc------------CceEEEEchHHHHHHHhhhhhhhcccc-- Q lcl|NC_015254. 
157 DSNKLDVSTETG-----DDSYFTGDTFLSATYKLGDAEG------------KLTGIAMHSQTEMNLRKQGLIEFMLDS-- 217 (346) Q Consensus 157 ~~~~~dis~~~~-----~~~~~~~~~l~~A~~~~GD~~~------------~~~~ivmhS~~~~~L~~~~li~~~~~s-- 217 (346) .+.+.-..++.+ +.+.++.++|..|...|.-+.+ .-..|++++......+..---..++.+ T Consensus 527 ~DGk~LFhadH~Nl~tga~sals~~sl~~a~~am~~qk~~~~~~~g~~L~i~P~~llvP~~le~~a~~l~~s~~~~~a~~ 606 (693) T protein:vir:95 527 SDGKTLFHADHSNLLTGAASALSIDSLSKAKTQMATQKAQVEKGKGRTLNIRPGFVLTPVALEDKANQIINSESVPGADV 606 (693) T ss_pred cCCcceeeccccccccccccccChHHHHHHHHHHHHhhcchhccCCceeecccceEEecchHHHHHHHHhcccccccccc Confidence 222222222222 3346899999998777644321 224678887777665543222344433 Q ss_pred cCceeeEEece-EEEEeCCCccCCCceEEEEEcCC---eeEEe--ecCCccceeeeecCCcceeE--EEEeeEEeeeeee Q lcl|NC_015254. 218 DNKKFPTYMGK-RVIVDDGLPAKDGVYTSYIFGEG---AFGLG--NGEAPVPTETDREKLKGNDI--LINRQHFLLHPRG 289 (346) Q Consensus 218 ~~~~i~~~~G~-~VVvdD~~p~~~g~ytt~l~~~G---Ai~~~--~~~~~~~vE~dRd~~~g~~~--l~~r~~~~~~~~G 289 (346) +.+.+.-+.|+ +||++-.+...+++ .=|++... .|.+. ++...-.+|+.. .-..|- +-.|.-|++.+.. T Consensus 607 ~~~~~NP~~~~~~vi~~prL~~~s~~-~Wyl~a~~~~dtie~~yL~G~~~P~ie~~~--gf~~dG~~~kvr~D~G~~~iD 683 (693) T protein:vir:95 607 NSGIVNPIRAFAQVIGEPRLDDASAT-AWYMAAKKGSDTIEVAYLDGVDTPYLEQQE--GFTVDGVASKVRIDAGVAPLD 683 (693) T ss_pred ccccccchhccccccccceecCCCCC-ceEEecCCCCCeEEEEEecCCCCCeEeecC--CCCcceEEEEEEEeccCceee Confidence 23334446664 67777666543432 12454432 24433 333322244332 222222 3344445544443 Q ss_pred eeeccccccCCCCCh Q lcl|NC_015254. 
290 IAWQEKSVAGHSPTN 304 (346) Q Consensus 290 ~s~~~~~~~~~sPt~ 304 (346) |.=- -+||-- T Consensus 684 ~Rg~-----~kn~GA 693 (693) T protein:vir:95 684 FRGL-----QKSNGA 693 (693) T ss_pred cccc-----ccCCCC Confidence 3311 122221 No 194 >protein:vir:2736 Length: 348 # NCBI annotation: putative structural protein # Family: family:all:1083 # MgeID: mge:58 # MgeName: O1205 # Cross-refs: genbank:acc:NP_695109;genbank:gi:23455878;genbank:GeneID:955608 Probab=25.25 E-value=2.1 Score=18.84 Aligned_cols=297 Identities=12% Similarity=0.076 Sum_probs=113.2 Q ss_pred eeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHH----HhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 17 NTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDE----LAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 17 ~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~----l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) .-++.|+|+|..+..|+.+.......|+....+.+....+. +-+..+..+-.|+-..- ...+... -+. T Consensus 1 M~~i~d~f~~~~l~~~v~~~~~~~~~~l~~~~Fp~~~~~~~~~~~~~~~~~~~~~a~~v~~~-~~~~~~~-------r~~ 72 (348) T protein:vir:27 1 MGLIYDKVTASNIAGYFNALQENVSSTLGESIFPARKQLGTKLSYIKGASGQSVALKAAAFD-TNVTIRD-------RVS 72 (348) T ss_pred CcchhhhcCHHHHHHHHHhccchhhhhhHhhcCCCccccceeEEEEeeccCceeEeeeecCC-CCcceec-------ccc Confidence 23467999999999999875433333433333322211110 01112222222332221 1111110 111 Q ss_pred cccceeEEEEEeecCcceechHHHh---hhcchH--HHHHHHHH-------HHHHHHHHHHHHHHHHH-hhhhhhhh--- Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKA---LSGDDP--MRAIGDLV-------VEYWNRRRQAVLIASLN-GITASGAL--- 156 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~---~~g~dp--~~~i~~q~-------a~~~~~~~~~~lla~L~-G~~~~~~~--- 156 (346) +++.+-.-..++...-..+.|+..+ ...+++ +..+.+++ ...+.+..+-.+..+|. 
|-+....- T Consensus 73 ~~~~~~~~p~i~~~~~i~~~d~~~~~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~~al~~Gki~i~~~~~~ 152 (348) T protein:vir:27 73 AEMHDEQMPFFKEAMLVKENDRQQLNLVKDSGNAVLVNTIVAGIFNDNLTLVNGARARLEAMRMQVLATGKIAFTSDGVN 152 (348) T ss_pred eeeeeeecCccccccccCHHHHHHHHHhhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCeeEEecCCee Confidence 1211111122223334556665443 222222 22343443 34455555666666663 54432210 Q ss_pred --------hhcceeeeccccc-cccccHHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh-hhccccc--Cc---- Q lcl|NC_015254. 157 --------DSNKLDVSTETGD-DSYFTGDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI-EFMLDSD--NK---- 220 (346) Q Consensus 157 --------~~~~~dis~~~~~-~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li-~~~~~s~--~~---- 220 (346) +.|+...+..=++ ++. -...|.+....+-+.......++|.++++..|+++.-+ +.+.... ++ T Consensus 153 ~~vdfg~~~~~~~t~~~~W~~~~ad-p~~di~~~~~~~~~~G~~~~~ii~~~~~~~~l~~~~~v~~~~~~~~~~~~~i~~ 231 (348) T protein:vir:27 153 KDIDYGVKPDHKKQVSKSWAEPGAT-PLADLEDAIETARELGLNPERAVMNAKTFGLIRKAASTVKVIKPLAGDGSAVTK 231 (348) T ss_pred EEEeecCCcccceeeeeccCCCCCC-HHHHHHHHHHHHHhcCCcccEEEECHHHHHHHhcCHHHHHHhcccCccccccCH Confidence 1122211111000 111 11344444444433344667899999999999987544 3332211 11 Q ss_pred -----eeeEEeceEEEEeCC-CccCCCceE-------EEEEcCCeeEEeecCCccceeeeecCCccee-EEEEeeEEeee Q lcl|NC_015254. 221 -----KFPTYMGKRVIVDDG-LPAKDGVYT-------SYIFGEGAFGLGNGEAPVPTETDREKLKGND-ILINRQHFLLH 286 (346) Q Consensus 221 -----~i~~~~G~~VVvdD~-~p~~~g~yt-------t~l~~~GAi~~~~~~~~~~vE~dRd~~~g~~-~l~~r~~~~~~ 286 (346) -++++.|.+|++=|. ....+|+.+ ..++..|.++...=.+ ..|-. +...+.+ ...... 
..-+ T Consensus 232 ~~~~~~~~~~~g~~i~~yd~~y~d~~G~~~~~~p~~~vvl~~~~~~G~~~yG~--~~e~~-~~~~~~~~~~~~~~-~~~~ 307 (348) T protein:vir:27 232 AELENYIADNFGVSIVLENGTYRNDKGEVSKFYPDGHLTLIPNGPLGNTVFGT--TPEES-DLFADNTVNAEVEI-VDNG 307 (348) T ss_pred HHHHHHHHhhcCceEEEEeeEEEcCCCcCcccccCCeEEEEcCCcceeEEecc--Ccchh-hhhhccccccceee-eCCe Confidence 124566777766432 222334322 2233333333211111 12211 1111110 000000 0011 Q ss_pred eeeeeeccccccCCCCChHHhcCCcCceeee-cccccceEEEEEec Q lcl|NC_015254. 287 PRGIAWQEKSVAGHSPTNTEIEKGNNWKAVY-ESKNIRIVAFVHKN 331 (346) Q Consensus 287 ~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~-~~K~i~iv~~~~k~ 331 (346) .+--+|.+ ..|....+...+..=-|. ++..+-++.+.+-+ T Consensus 308 ~~~~~~~~-----~dP~~~~~~~~s~~lPv~~~~~~~~~a~Vl~~~ 348 (348) T protein:vir:27 308 IAVTTTKT-----TDPVNVQTKVSMVALPSFERLDDVYMLTVIPAV 348 (348) T ss_pred eEEEeeec-----CCCceEEEEEeeeeeccccCCCcEEEEEEecCC Confidence 12233532 245554444443333222 23333333333322 No 195 >protein:vir:4902 Length: 348 # NCBI annotation: gp348 # Family: family:all:1083 # MgeID: mge:107 # MgeName: Sfi11 # Cross-refs: genbank:acc:NP_056680;genbank:gi:9635015;genbank:GeneID:1262657 Probab=25.17 E-value=2.1 Score=18.83 Aligned_cols=295 Identities=13% Similarity=0.097 Sum_probs=114.0 Q ss_pred eeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHH----HhhCCCcEEEecccccCCCcccccCCCccccchhh Q lcl|NC_015254. 17 NTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDE----LAKSGGNMINMPFWQDLTGEDEILDDGEGALTPGN 92 (346) Q Consensus 17 ~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~----l~~~~G~ti~~P~~~~l~g~ae~~~dg~~~it~~~ 92 (346) .-+|.|+|+|.-+..|+.+.......|+....+......+. +-...|..+--||-..-.+ .+... -+. 
T Consensus 1 M~~l~d~f~~~~l~~~v~~~~~~~~~~l~~~~Fp~~~~~~~~~~~~~~~~~~~~~a~~v~~~~~-~~~~~-------r~~ 72 (348) T protein:vir:49 1 MGLIYDKVTASNIAGYFNALQENVDSTLGESIFPARKQLGTKLSYITGASGQSVALKAAAFDTN-VTVRD-------RVS 72 (348) T ss_pred CcchhhhcCHHHHHHHHHhccccchhhhHhhcCCCccccCceeEEEEeecCceeeeeeecCCCC-cceec-------ccc Confidence 33467999999999999875433333443333322221111 1122333333343332211 11111 111 Q ss_pred cccceeEEEEEeecCcceechHHHhhhc---chH--HHHHHHHHH-------HHHHHHHHHHHHHHHH-hhhhhhh---- Q lcl|NC_015254. 93 ISAAKDIARLHMRGKAWRTNDLAKALSG---DDP--MRAIGDLVV-------EYWNRRRQAVLIASLN-GITASGA---- 155 (346) Q Consensus 93 lt~~~~~a~~~~~~k~~~~tD~a~~~~g---~dp--~~~i~~q~a-------~~~~~~~~~~lla~L~-G~~~~~~---- 155 (346) .++..-....++....+.+.|.-.+... ..+ ...+.++++ +...+..+-.+..+|. |-+.... T Consensus 73 ~~~~~~~~p~i~~~~~i~~~d~~~l~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~i~~~g~~ 152 (348) T protein:vir:49 73 AEMHDEQMPFFKEAMLVKENDRQQLNLVKDSGNAALVNTIVAGIFNDNLTLVNGARARLEAMRMQVLATGKIAFTSDGVN 152 (348) T ss_pred eeeeeeecCccccccccCHHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeEEEecCCce Confidence 2222222222233444666675444222 221 223334333 3455555666666653 4332211 Q ss_pred -------hhhcceeeecccccccccc-HHHHHHHHHHhCccccCceEEEEchHHHHHHHhhhhh-hhccccc--Cc---- Q lcl|NC_015254. 156 -------LDSNKLDVSTETGDDSYFT-GDTFLSATYKLGDAEGKLTGIAMHSQTEMNLRKQGLI-EFMLDSD--NK---- 220 (346) Q Consensus 156 -------~~~~~~dis~~~~~~~~~~-~~~l~~A~~~~GD~~~~~~~ivmhS~~~~~L~~~~li-~~~~~s~--~~---- 220 (346) .+.|+...+..= +++.-+ ...|.+.....-+.......++|.++++..|++..-+ +.+.... 
.+ T Consensus 153 ~~vdyg~~~~~~~t~~~~W-~~~~adp~~di~~~~~~~~~~G~~~~~ii~~~~~~~~l~~~~~v~~~~~~~~~~~~~i~~ 231 (348) T protein:vir:49 153 KDIDYGVKPDHKKQVSKSW-AEPGATPLADLEDAIETARELGLNPERAVMNAKTFGLIRKAASTVKVIKPLAGDGSSVTK 231 (348) T ss_pred EEEeecCCcccceeeeecc-CCCCCCHHHHHHHHHHHHHhcCCcccEEEeCHHHHHHHhcCHHHHHHhhccCcccccccH Confidence 011221111110 011111 1233333333323334567899999999999987544 3332221 11 Q ss_pred -----eeeEEeceEEEE-eCCCccCCCceEEEEEcCCeeEEeecCC------ccceeeeecCCcce----eEEEEeeEEe Q lcl|NC_015254. 221 -----KFPTYMGKRVIV-DDGLPAKDGVYTSYIFGEGAFGLGNGEA------PVPTETDREKLKGN----DILINRQHFL 284 (346) Q Consensus 221 -----~i~~~~G~~VVv-dD~~p~~~g~ytt~l~~~GAi~~~~~~~------~~~vE~dRd~~~g~----~~l~~r~~~~ 284 (346) .+.++.|.+|++ |......+|+-+.| +-.+.+.+..... ....|.+ +...+. +.-..+.++. T Consensus 232 ~~~~~~~~~~~g~~i~~y~~~y~d~dG~~~~~-~p~~~v~l~~~~~~G~~~yg~~~e~~-~~~~~~~~~~~~~~~~~~~~ 309 (348) T protein:vir:49 232 AELDNYIADNFGVTVVLENGTYRNEKGEVSKF-FPDGHLTLIPNGPLGNTVFGTTPEES-DLFADNTVNADVEIVDNGIA 309 (348) T ss_pred HHHHHHHHhhcCceEEEEeeEEEecCCcEeee-ecCCeEEEecCCCcceeEEecChhhh-hhccccccccceeecCCeEE Confidence 124566776665 33333334443222 2222222211000 0011211 111111 1111122222 Q ss_pred eeeeeeeeccccccCCCCChHHhcCCcCceee-ecccccceEEEEEec Q lcl|NC_015254. 
285 LHPRGIAWQEKSVAGHSPTNTEIEKGNNWKAV-YESKNIRIVAFVHKN 331 (346) Q Consensus 285 ~~~~G~s~~~~~~~~~sPt~a~L~~~~NW~~v-~~~K~i~iv~~~~k~ 331 (346) + -+|.+ ..|....+...+..=-| .++..+-++.+.+-+ T Consensus 310 ~----~~~~~-----~dP~~~~~~~~s~~lPv~~~~~~~~~a~Vl~~~ 348 (348) T protein:vir:49 310 V----TTTKT-----TDPVNVQTKVSMVALPSFERLDDVYMLTVIPAV 348 (348) T ss_pred E----eeeec-----CCCceEEEEEeeeccccccCCCcEEEEEEecCC Confidence 2 23532 23554444433333222 234444444444433 No 196 >protein:vir:99424 Length: 360 # NCBI annotation: hypothetical protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:1595 # MgeName: BJ1 # Cross-refs: genbank:acc:YP_919080;genbank:gi:119757038;genbank:GeneID:4606077 Probab=21.70 E-value=2.6 Score=18.35 Aligned_cols=281 Identities=11% Similarity=0.079 Sum_probs=109.9 Q ss_pred Cccce-ecceeeecCCceeeeeeccchHHHHHHHhhHhHHHHhHhhccccccchhHHHHhhCCCcEEEecccccCCC--- Q lcl|NC_015254. 1 MIKKL-RMNLQKFAAGKNTRIADVIVPEVFNKYVTERTAESSALLQSGIISNDKDLDELAKSGGNMINMPFWQDLTG--- 76 (346) Q Consensus 1 ~~~~~-~~~~q~~~a~~~T~l~d~i~Pev~~~yv~~~~~~~~~~~qSgi~~~~~~~~~l~~~~G~ti~~P~~~~l~g--- 76 (346) ||+++ ++.--........ .=+..||++..+++.-. +.+.|++ .... ....-.+.++|..+. +. T Consensus 11 ~n~~~~~i~k~~it~~~l~--~g~L~p~~a~~Fl~~v~-~~t~iL~------~~r~---~~~~s~~~ei~kig~-G~r~~ 77 (360) T protein:vir:99 11 RNQNMNSLSQKDIGLAELD--GFQLPVDVTEEFLERMQ-KGVQILG------MADT---MTLARLEMEVPQFGV-PRLSG 77 (360) T ss_pred hhhHHHHHHhhhccccccC--ceeecHHHHHHHHHHHh-hccchhh------hcce---eeccccccccccccc-ceeec Confidence 77765 2211112111111 34578999888776532 2233322 1100 111122233333221 11 Q ss_pred --cccccCC-CccccchhhcccceeEEEEEeecCcceechHHHh----hhcchHHHHHHHHHHHHHHHHHHHH------- Q lcl|NC_015254. 77 --EDEILDD-GEGALTPGNISAAKDIARLHMRGKAWRTNDLAKA----LSGDDPMRAIGDLVVEYWNRRRQAV------- 142 (346) Q Consensus 77 --~ae~~~d-g~~~it~~~lt~~~~~a~~~~~~k~~~~tD~a~~----~~g~dp~~~i~~q~a~~~~~~~~~~------- 142 (346) ..|.-.. +...++.......+.. +.+. -|.++..... ..+..+.+.+.+++++...++++.. 
T Consensus 78 r~~~e~~~~~~~~~~~~~~v~~~~~~---~~~~-~~~i~~~~~~~n~~~~~~~f~~~i~~~~ae~~~~Dle~l~~~g~~d 153 (360) T protein:vir:99 78 HTRDEEGSRTENSEAESGSVKFNATD---KSYY-ILVEPKRDALKNTHYGPDQFGDYIVDQFIERYGNDLGLMGIRAGAS 153 (360) T ss_pred cccccCCCCCcCCcCccccCcccccc---ceee-EeechHHHHHhhhhcccchhHHHHHHHHHHHHHHHHHHHHhhccch Confidence 0010000 0001111111111111 1111 2344333322 2243556667777776666654332 Q ss_pred ------------HHHHHHhhhhhhhhh---------hcceeeec----------------cccccccccHHHHHHHHHHh Q lcl|NC_015254. 143 ------------LIASLNGITASGALD---------SNKLDVST----------------ETGDDSYFTGDTFLSATYKL 185 (346) Q Consensus 143 ------------lla~L~G~~~~~~~~---------~~~~dis~----------------~~~~~~~~~~~~l~~A~~~~ 185 (346) ++...+|.+...... .+..+.+. ..+....++...|.++...| T Consensus 154 s~d~~~~~~~d~fl~~~dGwlKka~~~~~~id~a~d~t~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~lf~~~~~~L 233 (360) T protein:vir:99 154 SGNLQSIGGAAELDNTFKGWIARAEGDAQSVDDAGDSTRIGLEDTATADADSMPSIANTDGSGNPQPVDTSLFNETIQTL 233 (360) T ss_pred hcccccCcccchhhhhhHHHHHHhhcccchhhccccccccccccccccccccchhhhccccccccccchHHHHHHHHHhc Confidence 444455554332100 00000000 00111223445577777777 Q ss_pred CccccC----ceEEEEchHHHHHHHhhhhhhhccccc-----CceeeEEeceEEEEeCCCccCCCceEEEEEcCC-eeEE Q lcl|NC_015254. 186 GDAEGK----LTGIAMHSQTEMNLRKQGLIEFMLDSD-----NKKFPTYMGKRVIVDDGLPAKDGVYTSYIFGEG-AFGL 255 (346) Q Consensus 186 GD~~~~----~~~ivmhS~~~~~L~~~~li~~~~~s~-----~~~i~~~~G~~VVvdD~~p~~~g~ytt~l~~~G-Ai~~ 255 (346) =++.-. ...++||+..+..-++ .|.+.-..-+ ++..-.|.|++|+.-..+|.+. ++|.+. =+.| T Consensus 234 p~kyr~~~~~~~~~~~s~~~~~~yr~-~L~~R~t~LGd~~l~g~~~~~~~Gipi~~v~~~pd~~-----~mlT~p~NLi~ 307 (360) T protein:vir:99 234 DSRYRESDAYSPVLMTSPNQVQSYTM-SLTEREDPLGSAVIFGDSDITPFSYDLVGVNGFPDEY-----MMFTDPNNLAF 307 (360) T ss_pred chhhhcCcccceEEEccCchHHHHHH-HHhccCcccchhheecccccccceeeeEEcCCCCCCc-----eEEeccCceeE Confidence 776532 3489999998766553 2322222111 2223468999999988888653 444322 2233 Q ss_pred eecCCccceeeeecCCcceeEEEEeeEEeeee----eeeeeccccccCCCCChHHhcCCcCceeeecccccceEEEEEec Q lcl|NC_015254. 
256 GNGEAPVPTETDREKLKGNDILINRQHFLLHP----RGIAWQEKSVAGHSPTNTEIEKGNNWKAVYESKNIRIVAFVHKN 331 (346) Q Consensus 256 ~~~~~~~~vE~dRd~~~g~~~l~~r~~~~~~~----~G~s~~~~~~~~~sPt~a~L~~~~NW~~v~~~K~i~iv~~~~k~ 331 (346) +..+ |+.++.....+.+..++.|+.|. .-|-|.+... |++++.+ T Consensus 308 g~~~-----~iri~~~~e~~~~~~~~~~~~~~~~~~~D~~iee~~A---------------------------v~~vt~~ 355 (360) T protein:vir:99 308 GLYE-----EMELDQSTDTDKVHEQRLHSRNWLEGQFDFQIKEQQA---------------------------GVLVTDL 355 (360) T ss_pred Eeee-----eeEEeecccchhhhhhceeeeEEEEEEeeEEEEeccc---------------------------EEEEecC Confidence 2222 22222222223333334444453 2233322110 1122222 Q ss_pred ccccc Q lcl|NC_015254. 332 GVPGK 336 (346) Q Consensus 332 ~~~~~ 336 (346) ..|.+ T Consensus 356 ~~~~~ 360 (360) T protein:vir:99 356 ETPTA 360 (360) T ss_pred CCCCC Confidence 21111 Done!