Query lcl|NC_015249.1_cdsid_YP_004300568.1 [gene=10a] [protein=gp10a] [protein_id=YP_004300568.1] [location=21464..22507]
Match_columns 347
No_of_seqs 132 out of 148
Neff 7.5
Searched_HMMs 1612
Date Thu Nov 7 12:49:08 2013
Command /home/guerois/workspace/virfam/python/lib/hhsearch//hhsearch2 -i .//seq/seq_34 -d /home/guerois/workspace/virfam/python/profile_database/capsid_neck_tail.hhm -glob -cpu 7 -o .//seq/HHR/seq_34_vs_rec_db.hhr

No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 protein:vir:94576 Length: 347 100.0 5E-115 3E-118 647.1 28.6 347 1-347 1-347 (347)
2 protein:vir:8885 Length: 347 # 100.0 3E-111 2E-114 626.4 27.4 346 1-347 1-346 (347)
3 protein:vir:10450 Length: 344 100.0 2E-108 1E-111 610.8 27.3 342 1-347 1-344 (344)
4 protein:vir:94711 Length: 347 100.0 1E-107 8E-111 606.5 27.1 345 1-347 1-346 (347)
5 protein:vir:2201 Length: 345 # 100.0 1E-106 7E-110 601.3 26.3 342 1-347 1-345 (345)
6 protein:vir:3364 Length: 347 # 100.0 6E-106 3E-109 597.6 26.5 343 1-347 1-345 (347)
7 protein:vir:100057 Length: 375 100.0 9E-105 5E-108 591.1 27.5 344 1-347 1-370 (375)
8 protein:vir:1541 Length: 347 # 100.0 1E-104 7E-108 590.5 26.2 344 1-347 1-345 (347)
9 protein:vir:103323 Length: 364 100.0 2.2E-99 1E-102 561.4 27.3 333 1-347 1-339 (364)
10 protein:vir:80213 Length: 334 100.0 3.6E-99 2E-102 560.3 24.8 326 1-347 1-332 (334)
11 protein:vir:6324 Length: 335 # 100.0 1E-98 6E-102 557.9 23.9 322 1-347 1-328 (335)
12 protein:vir:78935 Length: 335 100.0 1E-97 6E-101 552.3 24.1 322 1-347 1-328 (335)
13 protein:vir:97031 Length: 402 100.0 3.6E-97 2E-100 549.3 23.6 328 1-347 1-333 (402)
14 protein:vir:7019 Length: 401 # 100.0 1.2E-94 7.5E-98 535.5 23.5 328 1-347 1-333 (401)
15 protein:vir:105645 Length: 400 100.0 1.4E-93 8.5E-97 529.7 24.5 328 1-347 1-333 (400)
16 protein:vir:78739 Length: 332 100.0 1.1E-93 6.8E-97 530.2 23.5 321 1-345 4-332 (332)
17 protein:vir:99675 Length: 324 100.0 2.6E-89 1.6E-92 506.2 22.3 294 50-347 1-296 (324)
18 protein:vir:94622 Length: 341 100.0 1.8E-72 1.1E-75 413.9 23.9 321 1-347 3-339 (341)
19 protein:vir:80180 Length: 381 100.0 4.3E-69 2.7E-72 395.4 21.2 335 1-347 1-381 (381)
20 protein:vir:3136 Length: 322 # 100.0 2.6E-57 1.6E-60 330.8 15.8 304 4-347 1-318 (322)
21 protein:vir:102605 Length: 273 100.0 9E-56 5.6E-59 322.4 23.2 266 1-347 1-273 (273)
22 protein:vir:105822 Length: 273 100.0 9E-56 5.6E-59 322.4 23.2 266 1-347 1-273 (273)
23 protein:vir:102655 Length: 322 100.0 1.5E-55 9.4E-59 321.1 20.3 307 1-347 1-321 (322)
24 protein:vir:7990 Length: 273 # 100.0 6.1E-54 3.8E-57 312.3 23.0 266 1-347 1-273 (273)
25 protein:vir:1781 Length: 221 # 100.0 2.2E-54 1.4E-57 314.7 14.1 217 95-339 1-221 (221)
26 protein:vir:80930 Length: 278 100.0 2E-42 1.3E-45 249.2 21.7 270 1-347 1-277 (278)
27 protein:vir:107120 Length: 329 100.0 1.5E-41 9E-45 244.5 21.3 284 1-347 16-305 (329)
28 protein:vir:97331 Length: 319 100.0 1.8E-41 1.1E-44 244.0 21.7 284 1-347 5-294 (319)
29 protein:vir:94800 Length: 319 100.0 1.8E-41 1.1E-44 244.0 21.7 284 1-347 5-294 (319)
30 protein:vir:96123 Length: 274 100.0 2.6E-41 1.6E-44 243.1 20.5 263 1-347 1-270 (274)
31 protein:vir:108303 Length: 418 100.0 6.9E-40 4.3E-43 235.3 21.8 299 1-347 1-417 (418)
32 protein:vir:93742 Length: 274 100.0 1E-39 6.3E-43 234.4 20.7 264 1-347 1-270 (274)
33 protein:vir:3613 Length: 272 # 100.0 3E-39 1.8E-42 231.8 20.4 267 1-347 1-272 (272)
34 protein:vir:1239 Length: 274 # 100.0 4.4E-39 2.7E-42 230.9 21.2 264 1-347 1-270 (274)
35 protein:vir:96262 Length: 274 100.0 3.9E-39 2.4E-42 231.1 20.6 263 1-347 1-270 (274)
36 protein:vir:95898 Length: 274 100.0 3.9E-39 2.4E-42 231.1 20.6 263 1-347 1-270 (274)
37 protein:vir:94494 Length: 274 100.0 4.8E-39 3E-42 230.7 20.9 264 1-347 1-270 (274)
38 protein:vir:97433 Length: 274 100.0 4.8E-39 3E-42 230.7 20.9 264 1-347 1-270 (274)
39 protein:vir:99075 Length: 392 100.0 8.6E-39 5.3E-42 229.3 21.6 286 1-347 1-316 (392)
40 protein:vir:96833 Length: 275 100.0 5.8E-39 3.6E-42 230.2 20.6 265 1-347 1-271 (275)
41 protein:vir:174 Length: 423 # 100.0 1.8E-36 1.1E-39 216.6 22.8 301 1-346 1-423 (423)
42 protein:vir:105374 Length: 423 100.0 3.2E-36 2E-39 215.2 23.1 301 1-346 1-423 (423)
43 protein:vir:3525 Length: 423 # 100.0 4E-36 2.5E-39 214.7 23.3 301 1-346 1-423 (423)
44 protein:vir:9820 Length: 272 # 100.0 2.6E-35 1.6E-38 210.2 20.2 263 1-347 1-269 (272)
45 protein:vir:3033 Length: 272 # 100.0 2.6E-35 1.6E-38 210.2 20.2 263 1-347 1-269 (272)
46 protein:vir:105334 Length: 276 100.0 3.1E-35 1.9E-38 209.8 20.5 264 1-347 1-270 (276)
47 protein:vir:79008 Length: 299 100.0 1.5E-34 9E-38 206.1 22.7 282 1-347 1-299 (299)
48 protein:vir:105522 Length: 423 100.0 8.4E-34 5.2E-37 201.9 22.4 301 1-346 1-423 (423)
49 protein:vir:78920 Length: 290 100.0 1.1E-30 6.7E-34 184.9 20.0 279 1-347 1-289 (290)
50 protein:vir:739 Length: 231 # 99.9 1.2E-29 7.7E-33 179.0 16.6 230 51-347 1-231 (231)
51 protein:vir:105464 Length: 346 99.9 1.3E-28 8E-32 173.5 21.0 284 1-347 1-300 (346)
52 protein:vir:95107 Length: 270 99.9 3.8E-28 2.4E-31 170.9 20.2 261 1-347 1-265 (270)
53 protein:vir:102335 Length: 312 99.9 7E-28 4.4E-31 169.4 21.1 298 1-347 1-308 (312)
54 protein:vir:79712 Length: 285 99.9 4.1E-24 2.5E-27 148.8 18.9 264 24-347 1-283 (285)
55 protein:vir:99523 Length: 311 99.9 1E-23 6.3E-27 146.6 20.7 295 17-347 1-311 (311)
56 protein:vir:95451 Length: 313 99.8 3.5E-24 2.1E-27 149.2 12.8 300 17-347 1-311 (313)
57 protein:vir:78090 Length: 302 99.8 1.5E-20 9.5E-24 129.2 20.1 283 1-347 1-299 (302)
58 protein:vir:2106 Length: 430 # 99.7 1.3E-17 8.4E-21 113.1 20.0 302 1-347 1-429 (430)
59 protein:vir:41 Length: 299 # N 99.7 3.2E-17 2E-20 111.0 20.8 282 10-347 1-298 (299)
60 protein:vir:9265 Length: 430 # 99.7 1.7E-17 1.1E-20 112.5 19.2 302 1-347 1-429 (430)
61 protein:vir:100939 Length: 430 99.7 1.7E-17 1.1E-20 112.5 19.2 302 1-347 1-429 (430)
62 protein:vir:78523 Length: 338 99.6 2.3E-16 1.4E-19 106.3 21.0 306 1-347 1-335 (338)
63 protein:vir:78223 Length: 333 99.6 4.6E-16 2.9E-19 104.6 21.5 307 1-347 1-332 (333)
64 protein:vir:7771 Length: 330 # 99.6 1.3E-15 7.8E-19 102.3 21.6 297 1-347 1-323 (330)
65 protein:vir:6242 Length: 390 # 99.5 4.1E-15 2.5E-18 99.5 18.4 289 1-347 97-389 (390)
66 protein:vir:1328 Length: 392 # 99.5 6.4E-15 4E-18 98.4 19.2 292 1-347 97-391 (392)
67 protein:vir:4600 Length: 415 # 99.5 1.9E-14 1.2E-17 95.8 21.1 293 1-347 110-404 (415)
68 protein:vir:4700 Length: 415 # 99.5 1.9E-14 1.2E-17 95.8 21.1 293 1-347 110-404 (415)
69 protein:vir:79987 Length: 415 99.5 3.1E-14 1.9E-17 94.7 21.6 293 1-347 109-404 (415)
70 protein:vir:98339 Length: 415 99.5 3.1E-14 1.9E-17 94.7 21.6 293 1-347 109-404 (415)
71 protein:vir:81100 Length: 415 99.5 3.1E-14 1.9E-17 94.7 21.6 293 1-347 109-404 (415)
72 protein:vir:4511 Length: 409 # 99.5 4.3E-14 2.7E-17 93.8 21.9 297 1-347 99-406 (409)
73 protein:vir:9574 Length: 300 # 99.4 1.1E-13 6.9E-17 91.6 22.9 284 1-347 1-300 (300)
74 protein:vir:9410 Length: 415 # 99.4 2.7E-14 1.7E-17 94.9 19.3 293 1-347 109-404 (415)
75 protein:vir:94142 Length: 304 99.4 1E-13 6.3E-17 91.8 21.5 285 1-346 1-304 (304)
76 protein:vir:105905 Length: 304 99.4 1E-13 6.3E-17 91.8 21.5 285 1-346 1-304 (304)
77 protein:vir:9309 Length: 324 # 99.4 6.5E-14 4.1E-17 92.9 20.3 278 1-347 21-315 (324)
78 protein:vir:99920 Length: 311 99.4 1.8E-13 1.1E-16 90.5 22.2 297 1-347 1-311 (311)
79 protein:vir:1638 Length: 298 # 99.4 1.8E-13 1.1E-16 90.5 21.8 280 1-347 1-297 (298)
80 protein:vir:4339 Length: 395 # 99.4 1.7E-13 1E-16 90.6 21.6 291 1-347 98-395 (395)
81 protein:vir:97053 Length: 390 99.4 1.4E-13 8.8E-17 91.0 20.7 285 1-345 101-390 (390)
82 protein:vir:96392 Length: 324 99.4 1E-13 6.2E-17 91.9 19.4 284 1-347 1-315 (324)
83 protein:vir:78830 Length: 324 99.4 1E-13 6.2E-17 91.9 19.4 284 1-347 1-315 (324)
84 protein:vir:96223 Length: 324 99.4 8.4E-14 5.2E-17 92.3 19.0 284 1-347 4-315 (324)
85 protein:vir:9759 Length: 303 # 99.4 2.9E-13 1.8E-16 89.3 21.9 284 1-347 1-301 (303)
86 protein:vir:1886 Length: 385 # 99.4 1.5E-13 9.3E-17 90.9 20.3 286 1-347 93-384 (385)
87 protein:vir:191 Length: 385 # 99.4 1.5E-13 9.3E-17 90.9 20.3 286 1-347 93-384 (385)
88 protein:vir:104085 Length: 320 99.4 3.7E-13 2.3E-16 88.8 21.5 295 1-347 1-317 (320)
89 protein:vir:10364 Length: 390 99.4 3.7E-13 2.3E-16 88.8 20.8 279 1-345 107-390 (390)
90 protein:vir:94771 Length: 298 99.4 6.8E-13 4.2E-16 87.3 22.0 281 1-347 1-297 (298)
91 protein:vir:103955 Length: 324 99.4 2.4E-13 1.5E-16 89.7 19.5 279 1-347 18-315 (324)
92 protein:vir:8187 Length: 311 # 99.4 5.4E-13 3.3E-16 87.9 21.3 295 1-347 1-310 (311)
93 protein:vir:485 Length: 407 # 99.4 4.3E-13 2.7E-16 88.4 20.5 293 1-347 90-400 (407)
94 protein:vir:100247 Length: 425 99.4 4E-13 2.5E-16 88.6 20.4 290 1-347 121-424 (425)
95 protein:vir:94673 Length: 419 99.4 4.1E-13 2.6E-16 88.5 20.3 292 1-347 110-417 (419)
96 protein:vir:99749 Length: 324 99.3 5.2E-13 3.2E-16 87.9 20.4 280 1-347 18-315 (324)
97 protein:vir:97148 Length: 324 99.3 3.6E-13 2.2E-16 88.8 19.5 286 1-347 1-315 (324)
98 protein:vir:8102 Length: 543 # 99.3 3.8E-13 2.3E-16 88.7 19.6 298 1-347 237-542 (543)
99 protein:vir:4456 Length: 401 # 99.3 2.1E-13 1.3E-16 90.1 17.9 292 1-347 99-401 (401)
100 protein:vir:80684 Length: 315 99.3 9E-13 5.6E-16 86.6 20.3 286 1-347 1-306 (315)
101 protein:vir:4856 Length: 293 # 99.3 1.6E-12 9.7E-16 85.3 21.2 274 1-347 1-281 (293)
102 protein:vir:4830 Length: 397 # 99.3 1.1E-12 6.8E-16 86.2 20.3 284 1-347 94-385 (397)
103 protein:vir:3870 Length: 400 # 99.3 4.7E-13 2.9E-16 88.2 18.1 277 1-347 120-399 (400)
104 protein:vir:81070 Length: 390 99.3 1.8E-12 1.1E-15 85.0 20.9 279 1-345 107-390 (390)
105 protein:vir:104256 Length: 458 99.3 1.3E-12 7.9E-16 85.8 20.1 294 1-347 155-458 (458)
106 protein:vir:4997 Length: 397 # 99.3 1.9E-12 1.2E-15 84.9 20.5 284 1-347 95-385 (397)
107 protein:vir:81160 Length: 371 99.3 1.7E-12 1.1E-15 85.1 19.5 281 1-347 80-371 (371)
108 protein:vir:102119 Length: 404 99.3 2.3E-12 1.4E-15 84.4 20.1 297 1-347 92-400 (404)
109 protein:vir:2430 Length: 318 # 99.3 3.2E-12 2E-15 83.6 20.7 289 1-347 1-313 (318)
110 protein:vir:5974 Length: 324 # 99.3 2.2E-12 1.3E-15 84.5 19.5 278 1-347 1-295 (324)
111 protein:vir:102944 Length: 330 99.3 1.3E-12 8.2E-16 85.7 18.1 281 1-347 1-296 (330)
112 protein:vir:100135 Length: 418 99.3 3.6E-12 2.2E-15 83.3 20.4 287 1-347 121-415 (418)
113 protein:vir:4953 Length: 397 # 99.2 4.1E-12 2.5E-15 83.0 20.2 284 1-347 95-385 (397)
114 protein:vir:95763 Length: 297 99.2 1.1E-11 6.7E-15 80.7 21.7 277 1-347 1-294 (297)
115 protein:vir:3991 Length: 404 # 99.2 1.3E-11 7.9E-15 80.3 21.9 284 1-347 101-393 (404)
116 protein:vir:2344 Length: 397 # 99.2 4.8E-12 3E-15 82.6 19.3 284 1-347 1-306 (397)
117 protein:vir:1433 Length: 435 # 99.2 1.4E-11 8.4E-15 80.2 20.9 295 1-347 105-432 (435)
118 protein:vir:101607 Length: 379 99.2 1.8E-11 1.1E-14 79.5 21.4 271 1-347 100-379 (379)
119 protein:vir:80376 Length: 435 99.2 2.7E-11 1.7E-14 78.5 21.6 295 1-347 105-432 (435)
120 protein:vir:4226 Length: 326 # 99.2 2.2E-11 1.4E-14 79.0 21.0 285 1-347 1-323 (326)
121 protein:vir:100172 Length: 394 99.2 1.6E-11 9.7E-15 79.8 20.1 277 1-347 100-384 (394)
122 protein:vir:95376 Length: 425 99.2 1.2E-11 7.7E-15 80.4 18.9 291 1-347 119-421 (425)
123 protein:vir:1383 Length: 421 # 99.2 1.1E-11 6.9E-15 80.6 18.4 277 1-347 101-383 (421)
124 protein:vir:2504 Length: 305 # 99.1 2.9E-11 1.8E-14 78.4 20.4 280 1-347 1-305 (305)
125 protein:vir:5739 Length: 366 # 99.1 3.2E-11 2E-14 78.1 20.7 293 1-347 52-365 (366)
126 protein:vir:81227 Length: 413 99.1 4.6E-11 2.8E-14 77.3 21.0 290 1-347 105-410 (413)
127 protein:vir:1583 Length: 351 # 99.1 1.2E-11 7.7E-15 80.4 17.6 280 1-347 1-299 (351)
128 protein:vir:962 Length: 397 # 99.1 9.2E-12 5.7E-15 81.1 16.0 276 1-347 121-397 (397)
129 protein:vir:1268 Length: 397 # 99.1 3.5E-11 2.2E-14 77.9 19.0 282 1-347 103-397 (397)
130 protein:vir:1025 Length: 408 # 99.1 1.1E-10 6.6E-14 75.2 21.5 284 1-347 101-393 (408)
131 protein:vir:7409 Length: 408 # 99.1 6.1E-11 3.8E-14 76.6 20.0 284 1-347 97-393 (408)
132 protein:vir:105610 Length: 430 99.1 2.6E-11 1.6E-14 78.6 16.9 328 1-347 1-422 (430)
133 protein:vir:9704 Length: 394 # 99.1 6.9E-11 4.3E-14 76.3 19.1 274 1-347 115-390 (394)
134 protein:vir:100884 Length: 389 99.0 1.1E-10 7E-14 75.1 19.5 280 1-347 95-382 (389)
135 protein:vir:3845 Length: 395 # 99.0 1.1E-10 7.1E-14 75.1 19.4 274 1-347 102-383 (395)
136 protein:vir:105004 Length: 392 99.0 1.2E-10 7.2E-14 75.0 19.4 284 1-347 84-384 (392)
137 protein:vir:102082 Length: 392 99.0 1.2E-10 7.2E-14 75.0 19.4 284 1-347 84-384 (392)
138 protein:vir:102873 Length: 392 99.0 1.2E-10 7.2E-14 75.0 19.4 284 1-347 84-384 (392)
139 protein:vir:107593 Length: 392 99.0 1.2E-10 7.2E-14 75.0 19.4 284 1-347 84-384 (392)
140 protein:vir:6212 Length: 434 # 99.0 1.1E-10 6.8E-14 75.2 19.2 288 1-347 131-431 (434)
141 protein:vir:1084 Length: 437 # 99.0 5.7E-11 3.6E-14 76.7 17.4 280 1-347 141-427 (437)
142 protein:vir:4092 Length: 390 # 99.0 4.2E-10 2.6E-13 72.0 21.4 290 1-347 68-368 (390)
143 protein:vir:105038 Length: 428 99.0 5.5E-10 3.4E-13 71.3 21.5 299 1-347 113-427 (428)
144 protein:vir:93616 Length: 645 98.9 1.1E-09 7E-13 69.6 21.3 287 1-347 315-637 (645)
145 protein:vir:9361 Length: 402 # 98.9 2.4E-10 1.5E-13 73.3 16.1 271 1-347 114-396 (402)
146 protein:vir:96762 Length: 632 98.9 2.9E-10 1.8E-13 72.9 16.2 285 1-346 334-632 (632)
147 protein:vir:8420 Length: 477 # 98.9 1.9E-09 1.2E-12 68.4 20.3 297 1-347 148-471 (477)
148 protein:vir:2770 Length: 318 # 98.9 2.3E-10 1.4E-13 73.4 15.1 259 1-269 1-318 (318)
149 protein:vir:93881 Length: 387 98.9 4.7E-10 2.9E-13 71.7 16.8 272 1-347 99-381 (387)
150 protein:vir:93696 Length: 364 98.9 5E-10 3.1E-13 71.6 16.7 308 1-347 1-359 (364)
151 protein:vir:101650 Length: 497 98.9 1.1E-09 7E-13 69.6 18.1 297 1-347 138-493 (497)
152 protein:vir:7855 Length: 497 # 98.9 1.1E-09 7E-13 69.6 18.1 297 1-347 138-493 (497)
153 protein:vir:78640 Length: 352 98.8 6.2E-10 3.9E-13 71.1 16.2 269 1-347 64-346 (352)
154 protein:vir:9927 Length: 295 # 98.8 2E-09 1.2E-12 68.3 17.0 272 1-347 1-288 (295)
155 protein:vir:96978 Length: 387 98.7 6.9E-10 4.3E-13 70.8 13.8 273 1-347 99-381 (387)
156 protein:vir:2685 Length: 387 # 98.7 6.9E-10 4.3E-13 70.8 13.8 273 1-347 99-381 (387)
157 protein:vir:94424 Length: 387 98.7 6.9E-10 4.3E-13 70.8 13.8 273 1-347 99-381 (387)
158 protein:vir:9875 Length: 296 # 98.7 2.5E-09 1.5E-12 67.8 16.5 274 1-347 1-295 (296)
159 protein:vir:3298 Length: 404 # 98.7 3.5E-09 2.2E-12 66.9 16.7 333 1-347 1-401 (404)
160 protein:vir:104439 Length: 404 98.7 3.5E-09 2.2E-12 66.9 16.7 333 1-347 1-401 (404)
161 protein:vir:819 Length: 404 # 98.7 3.5E-09 2.2E-12 66.9 16.7 333 1-347 1-401 (404)
162 protein:vir:10123 Length: 404 98.7 3.5E-09 2.2E-12 66.9 16.7 333 1-347 1-401 (404)
163 protein:vir:4159 Length: 315 # 98.6 2.1E-08 1.3E-11 62.7 18.2 305 1-346 1-315 (315)
164 protein:vir:4197 Length: 314 # 98.6 3.1E-08 1.9E-11 61.8 19.1 299 1-347 1-311 (314)
165 protein:vir:9643 Length: 377 # 98.5 1.5E-07 9.5E-11 57.9 20.4 285 1-347 59-377 (377)
166 protein:vir:3158 Length: 321 # 98.5 4E-08 2.5E-11 61.2 15.8 298 1-347 1-312 (321)
167 protein:vir:100632 Length: 381 98.4 1.2E-07 7.4E-11 58.5 18.4 284 1-347 57-368 (381)
168 protein:vir:95875 Length: 401 98.4 8.2E-08 5.1E-11 59.4 17.3 322 1-347 1-400 (401)
169 protein:vir:9509 Length: 381 # 98.4 2E-07 1.2E-10 57.3 18.9 285 1-347 57-368 (381)
170 protein:vir:101291 Length: 381 98.4 2E-07 1.2E-10 57.3 18.9 285 1-347 57-368 (381)
171 protein:vir:108211 Length: 318 98.4 9.1E-08 5.6E-11 59.2 17.0 290 1-345 1-318 (318)
172 protein:vir:95963 Length: 395 98.4 2.4E-07 1.5E-10 56.9 18.4 285 1-347 61-376 (395)
173 protein:vir:78350 Length: 383 98.3 2.7E-07 1.7E-10 56.6 17.9 285 1-347 64-375 (383)
174 protein:vir:106647 Length: 303 98.3 3.6E-07 2.2E-10 55.9 17.0 272 1-347 1-296 (303)
175 protein:vir:98635 Length: 377 98.1 1.1E-06 6.8E-10 53.2 17.3 285 1-347 59-377 (377)
176 protein:vir:80128 Length: 466 98.1 5.3E-07 3.3E-10 55.0 15.5 289 1-347 123-448 (466)
177 protein:vir:79928 Length: 393 98.1 4.7E-07 2.9E-10 55.3 14.7 301 1-347 59-381 (393)
178 protein:vir:78387 Length: 349 97.6 3.8E-05 2.3E-08 44.8 20.2 289 1-347 1-317 (349)
179 protein:vir:80446 Length: 367 97.6 3.9E-05 2.4E-08 44.8 19.1 294 1-347 1-335 (367)
180 protein:vir:107687 Length: 319 97.1 0.00017 1.1E-07 41.2 16.2 294 1-346 1-319 (319)
181 protein:vir:94989 Length: 349 97.1 0.00018 1.1E-07 41.1 20.7 289 1-347 1-317 (349)
182 protein:vir:3969 Length: 287 # 96.9 0.00023 1.4E-07 40.6 14.9 250 22-298 1-287 (287)
183 protein:vir:97397 Length: 517 96.8 0.00029 1.8E-07 40.0 14.8 281 1-347 226-514 (517)
184 protein:vir:94528 Length: 286 96.6 0.00048 3E-07 38.8 16.1 249 1-299 1-286 (286)
185 protein:vir:103285 Length: 296 96.6 0.00048 3E-07 38.8 17.1 274 16-345 1-296 (296)
186 protein:vir:4786 Length: 295 # 95.9 0.00021 1.3E-07 40.7 9.3 275 4-335 1-295 (295)
187 protein:vir:80068 Length: 301 95.8 0.0014 8.5E-07 36.3 18.4 279 19-345 1-301 (301)
188 protein:vir:4074 Length: 480 # 95.5 0.0019 1.2E-06 35.5 13.3 280 1-347 171-477 (480)
189 protein:vir:95512 Length: 693 95.0 0.003 1.9E-06 34.4 16.2 298 1-347 371-693 (693)
190 protein:vir:98871 Length: 314 94.6 0.0038 2.4E-06 33.8 14.2 266 1-299 11-314 (314)
191 protein:vir:79548 Length: 652 94.6 0.004 2.5E-06 33.7 17.3 294 1-346 336-652 (652)
192 protein:vir:96079 Length: 382 93.1 0.0089 5.5E-06 31.8 16.7 278 1-306 51-382 (382)
193 protein:vir:5942 Length: 523 # 89.1 0.028 1.7E-05 29.1 15.8 312 1-347 162-521 (523)
194 protein:vir:10324 Length: 320 89.1 0.028 1.8E-05 29.1 13.7 288 10-347 1-317 (320)
195 protein:vir:95131 Length: 325 80.9 0.091 5.6E-05 26.3 16.7 276 18-347 1-298 (325)
196 protein:vir:79642 Length: 329 79.8 0.1 6.3E-05 26.0 16.7 292 1-345 14-329 (329)
197 protein:vir:8324 Length: 410 # 72.2 0.19 0.00012 24.6 10.9 275 1-345 89-410 (410)
198 protein:vir:104342 Length: 314 72.0 0.19 0.00012 24.6 15.6 290 1-345 1-314 (314)
199 protein:vir:94070 Length: 339 71.5 0.19 0.00012 24.5 13.3 284 1-346 35-339 (339)
200 protein:vir:97255 Length: 310 70.9 0.2 0.00013 24.4 17.6 284 1-345 1-310 (310)
201 protein:vir:95258 Length: 368 70.1 0.21 0.00013 24.3 17.8 315 1-347 1-366 (368)
202 protein:vir:3643 Length: 336 # 69.8 0.22 0.00014 24.2 13.0 285 1-346 34-336 (336)
203 protein:vir:94933 Length: 330 65.4 0.28 0.00018 23.6 16.4 285 1-347 25-330 (330)
204 protein:vir:99424 Length: 360 63.6 0.31 0.00019 23.3 17.3 304 1-347 1-356 (360)
205 protein:vir:101557 Length: 336 59.6 0.39 0.00024 22.8 13.7 286 1-346 34-336 (336)
206 protein:vir:78558 Length: 336 58.9 0.4 0.00025 22.7 13.5 286 1-346 34-336 (336)
207 protein:vir:103886 Length: 302 52.4 0.56 0.00034 22.0 16.9 278 1-347 1-293 (302)
208 protein:vir:107732 Length: 379 51.6 0.58 0.00036 21.9 16.1 299 1-346 56-379 (379)
209 protein:vir:103181 Length: 457 47.5 0.7 0.00043 21.4 15.9 306 1-347 97-439 (457)
210 protein:vir:99576 Length: 388 42.6 0.88 0.00054 20.9 12.4 298 1-346 30-388 (388)
211 protein:vir:106734 Length: 336 41.4 0.93 0.00058 20.8 12.4 285 1-346 34-336 (336)
212 protein:vir:78148 Length: 123 36.5 1.1 0.00067 20.4 5.6 118 207-347 1-122 (123)
213 protein:vir:5670 Length: 514 # 35.8 1.2 0.00075 20.1 15.5 301 1-347 144-493 (514)
214 protein:vir:105778 Length: 358 20.3 2.8 0.0017 18.1 8.4 301 1-334 36-358 (358)

No 1
>protein:vir:94576 Length: 347 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1516 # MgeName: Berlin # Cross-refs: genbank:acc:YP_919012;genbank:gi:119637776;genbank:GeneID:5179336
Probab=100.00 E-value=5.4e-115 Score=647.06 Aligned_cols=347 Identities=94% Similarity=1.346 Sum_probs=337.1

Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC
Q lcl|NC_015249.
1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |||+++|++++|||||+++++|+++||||+|+|||+++|+|+|++++++++|+|++|||++||++|++++.+|+||++++ T Consensus 1 ma~~~~~~~~~t~~g~~~~~~d~~al~ie~~~geV~~~f~~~s~~~~~~~~rti~~G~sv~~~~iG~~~~~~~~~G~~l~ 80 (347) T protein:vir:94 1 MANMNGGQQMGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) T ss_pred CCccccccccccccccCCcccchHHHHHHHHhHHHHHHHHHHHhhhhhhhheeccccceEEeeeccceeEeeeecCcCCC Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccC Q lcl|NC_015249. 81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGK 160 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~ 160 (347) ++.++++++|++|+||+++|++|+|||+|++|+++|+|+++++|+||+||+++|++|+++++++++++.+....+.++++ T Consensus 81 ~~~~~~~~~e~~ltID~~~y~~~~VddiD~~q~~~D~rs~~~~~~g~ALA~~~D~~i~~~l~~~a~~~~~~~~~~~g~~~ 160 (347) T protein:vir:94 81 DKRKDMKHTEKTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPTANNENIAGLGK 160 (347) T ss_pred CCcCCccccceEEEEcchhhhhhhhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccccccCCc Confidence 98889999999999999999999999999999999999999999999999999999999999999999888888999999 Q ss_pred cceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEEE Q lcl|NC_015249. 
161 AHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIRN 240 (347) Q Consensus 161 g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~ 240 (347) ++.+.++..+...++++..+.++|++|++|+++|+|++||++|||+||+|++|+.||+..++...++.+...+.+|.|++ T Consensus 161 ~~~v~i~~~~~~~~~~~~~~~~~~d~i~~a~~~Lde~dVP~~~R~~vv~P~~y~~LLk~~~~~~~~~~~~~~~~~G~V~~ 240 (347) T protein:vir:94 161 AHVLEVGDQATLQGDQVKLGQAIIAQLTLARAKLTGNYVPSSDRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIRN 240 (347) T ss_pred ceeEeeeccccccccccccHHHHHHHHHHHHHHhhhcCCCCCCCEEEeChHHHHHHHHhhcccccccccccccccceeEE Confidence 99999998888888889999999999999999999999999999999999999999998888888888888899999999 Q ss_pred EeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhhc Q lcl|NC_015249. 241 VMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQA 320 (347) Q Consensus 241 i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~ 320 (347) ++||+||+|||+|..+++.+..+.+...+++.+.++.+.+++|+++|+++++|+|||+|+++||++++++|.+||++||+ T Consensus 241 v~G~~V~~Sn~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~d~~~~~~l~~~~~A~~tv~~~~~~~e~~~~~~~~~ 320 (347) T protein:vir:94 241 VMGFEVIEVPHLTAGGAGDNRAEEGVAPTNQKHAFPDTASGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQA 320 (347) T ss_pred eeceEEEEcCccccccCcccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcccceeeeechhhhh Confidence 99999999999999999888888888888999999999999999999999999999999999999999999999999999 Q ss_pred ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
321 DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 321 d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+|+++|+|||+++||||||+|++++| T Consensus 321 ~~i~~~~a~G~g~~rPe~a~~i~~~~a 347 (347) T protein:vir:94 321 DQIIAKYAMGHGGLRPEACGALVFKKA 347 (347) T ss_pred hhhhhhhhhcCcccccceeEEEEecCC Confidence 999999999999999999999999999 No 2 >protein:vir:8885 Length: 347 # NCBI annotation: major capsid protein A # Family: family:all:975 # MgeID: mge:161 # MgeName: gh-1 # Cross-refs: genbank:acc:NP_813774;genbank:gi:29366729;genbank:GeneID:1258837 Probab=100.00 E-value=3.1e-111 Score=626.44 Aligned_cols=346 Identities=76% Similarity=1.168 Sum_probs=333.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |||++||+|++|||||+++++|+++||||+|+|||+++|+++|++++++++|++++|||+|||++|++++.+|++|++++ T Consensus 1 ~a~~~~~~~~~~~~g~~~~~~d~~al~ie~~~geV~~~f~~~s~~~~~~~~r~i~~G~sv~~~~iG~~~~~~~~~g~~l~ 80 (347) T protein:vir:88 1 MANATGGQQIGANQGKGQSAADKLALFLKVFGGEVLTAFVRRSVTMDKHMVRTIQNGKSASFPVMGRTKGYYLAPGENLD 80 (347) T ss_pred CCCcccchhhhccCCCCccccchHHHHHHHHHHHHHHHHHHHhhhhhccccccccCcceEEEeeecceeeeeeccccCCC Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccC Q lcl|NC_015249. 
81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGK 160 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~ 160 (347) ++++++++++++|+||+++|++|+|||+|++|+++|+|+++++++|++||+++|++|+++++++++.+.+.+..++++++ T Consensus 81 ~~~~~~~~~~~~i~ID~~~y~~~~Vdd~D~~q~~~D~r~~~~~~~g~aLA~~~D~~i~~~l~~~a~~~~~~~~~~~g~~~ 160 (347) T protein:vir:88 81 DKRKDIKHSEKVIQIDGLLTSDVLIYDIEDAMNHYDVRAEYSAQLGEALAIAADGAVLAEMAKLCNLPAASNENIAGLGQ 160 (347) T ss_pred CCCCCCccceEEEEEechhhhhhhhhhHHHHhhcCCchHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccccCCccc Confidence 98899999999999999999999999999999999999999999999999999999999999999998888888999888 Q ss_pred cceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEEE Q lcl|NC_015249. 161 AHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIRN 240 (347) Q Consensus 161 g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~ 240 (347) +..+.++++.+. .++..+++++|+.|++|+++|+|++||++|||+||+|++|++||+++++++.+|.+...+++|.|++ T Consensus 161 ~~~~~~~~~~~~-~~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~~~~~~~~~~~~G~vg~ 239 (347) T protein:vir:88 161 AVVLNIGAAADL-VDVEARGKAILKGLTLARARLTKNYVPAGDRRFYCAPEDYSAILSALMPNAANYAALIDPETGNIRN 239 (347) T ss_pred cccccccccccc-cchhhhHHHHHHHHHHHHHHHhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccchhcceeee Confidence 888877766554 3667788999999999999999999999999999999999999999999999999999999999999 Q ss_pred EeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhhc Q lcl|NC_015249. 
241 VMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQA 320 (347) Q Consensus 241 i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~ 320 (347) ++||+||+|||+|+.+++....+...+.++..+.+..+.+++|++++++.++|+||++|+++||+|++++|.+|+++||+ T Consensus 240 i~G~~V~~s~nlp~~~~~~~~~~~~~~~t~~~~~~~~~~~~~~~~d~~~~~~l~~~~~a~g~v~~~d~~~e~~r~~~~~~ 319 (347) T protein:vir:88 240 VMGFEVIEVPHLTVGGAGDNNPADGVAPTNQKHIFPATATGDDRVAQNNVVGLFNHRSAVGTVKLKDMALERARRPEFQA 319 (347) T ss_pred eccceEEEeecccccccccccccccccccccccccccccccccccccCcEEEEEechhhhhheecccceeeeeechhhHH Confidence 99999999999999999888888888999999999999999999999999999999999999999999999999999999 Q ss_pred ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 321 DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 321 d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+|+++|+|||+++||||||+|++.+| T Consensus 320 d~i~~~~~~G~~~~rPe~a~~~~~~~a 346 (347) T protein:vir:88 320 DQIIGKYAMGHGGLRPEAAGALVFTPA 346 (347) T ss_pred HHhhhhhhhcCceeccceEEEEEeCCC Confidence 999999999999999999999999999 No 3 >protein:vir:10450 Length: 344 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:184 # MgeName: phiA1122 # Cross-refs: genbank:acc:NP_848297;genbank:gi:30387487;genbank:GeneID:1733971 Probab=100.00 E-value=2.2e-108 Score=610.80 Aligned_cols=342 Identities=77% Similarity=1.134 Sum_probs=319.9 Q ss_pred CCccccccccc--ccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCC Q lcl|NC_015249. 
1 MAKMNGGQQIG--KDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~--t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~ 78 (347) |||++++++.+ ++|+| ++++|+++||||+|+|||+++|+++|+++++|++|+|++|||++||++|++++++|+||++ T Consensus 1 ma~~~~~~~~n~~~~~~~-~~~~~~~al~ie~~~geV~~~f~~~s~~~~~~~~r~i~~g~s~~~~~iG~~~~~~~~~G~~ 79 (344) T protein:vir:10 1 MANMTGGQQLGTNQGKDV-MAAGDKLALFLKVFGGEVLTAFARTSVTTSRHMVRSISSGKSAQFPVLGRTQAAYLAPGEN 79 (344) T ss_pred CccccccccCCcccCCcc-CCccchhHHHHHHHHHHHHHHHHHHhhhcccceeeeecccceEEEEeeceeEEEeeecCCC Confidence 99999887666 44443 5777899999999999999999999999999999999999999999999999999999999 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++++.++++++|++|+||+++|++|+|||+|++|++||+|+++++|+|++||+++|++|+++++++++++++.+..++++ T Consensus 80 l~~t~~~~~~~e~~l~ID~~~y~~~~VdDiD~~q~~~D~r~~~~~~~G~aLA~~~D~~i~~~la~~a~~~~~~~~~~~g~ 159 (344) T protein:vir:10 80 LDDIRKDIKHTEKVITIDGLLTADVLIYDIEDAMNHYDVRSEYTSQLGESLAMAADGAVLAEIAGLCNVESQYNENITGL 159 (344) T ss_pred CCCCCCCcccceEEEEEcchhhhhhhhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccc Confidence 99988899999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 
159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) ++++++....++...++++..++++|+.|++|+++|+|++||++|||+||+|++|++||+++++++.+|++++.+++|+| T Consensus 160 ~~~~~~~~~~~~~~~t~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~~~~~~~~~~~~G~V 239 (344) T protein:vir:10 160 GTATVIETTQDKTTLTDQVALGKEIIAALTKARAALTKNYVPSSDRVFYCDPDSYSAILAALMPNAANYAALIDPEKGSI 239 (344) T ss_pred cccceeecccccccccchhhhHHHHHHHHHHHHHHHhhcCCCccCCEEEeChHHHHHHhhcccccccccccccceeeeEE Confidence 99999888877777788899999999999999999999999999999999999999999999999999999999999999 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhh Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANF 318 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~ 318 (347) ++++||+||+|||+|..+.++. ....+++.+.++...+.+++.+++++|||+|||+|+++++++++++|.+|+++| T Consensus 240 ~~v~G~~V~~Sn~lp~~~~~~~----~~~~tg~~~~~~~~~~~~~~~~~s~~~~l~~h~~A~~~v~~~~~~~e~~r~~~~ 315 (344) T protein:vir:10 240 RNVMGFEVVEVPHLTAGGAGTS----REGTTGQKHAFPATKSGNDKVAKDNVIGLFMHRSAVGTVKLRDLALERARRANF 315 (344) T ss_pred EEEeceEEEeccccccccCCcc----cccccCccccccCCcccceeeecceeEEEeechhhhhhhhhccceeecccchhH Confidence 9999999999999998766543 233566677778888889999999999999999999999999999999999999 Q ss_pred hcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
319 QADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 319 ~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+|+|+++|+|||+++|||||++|+++.= T Consensus 316 ~~d~i~g~~~~G~~vlRPe~a~~v~~~~~ 344 (344) T protein:vir:10 316 QADQIIAKYAMGHGGLRPEAAGAVVFKTK 344 (344) T ss_pred HHHHHHHHhhcccceecccceEEEEeecC Confidence 99999999999999999999977777666 No 4 >protein:vir:94711 Length: 347 # NCBI annotation: capsid # Family: family:all:975 # MgeID: mge:1528 # MgeName: K1F # Cross-refs: genbank:acc:YP_338120;genbank:gi:77118198;genbank:GeneID:3707734 Probab=100.00 E-value=1.3e-107 Score=606.50 Aligned_cols=345 Identities=73% Similarity=1.122 Sum_probs=326.2 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) ||++++ +.++|||||+++++|+++||||+|.+||+++|+++|++++++++|+|++|||+|||++|++++++|+||++++ T Consensus 1 m~~~~~-~~~~t~~g~~~~~~d~~al~ik~f~~eV~~~f~~~s~~~~~~~~r~i~~G~sv~i~~iG~~tv~~~t~G~~l~ 79 (347) T protein:vir:94 1 MANVPG-QKIGTDQGKGKSSSDALALFLKVFAGEVLTAFTRRSVTADKHIVRTIQNGKSAQFPVMGRTSGVYLAPGERLS 79 (347) T ss_pred CCCCCc-cccccccccCCccccHHHHHHHHHhHHHHHHHHHHHhhhcccccccccccceEEEecccceeeeeecCCCCcC Confidence 999985 6678999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccC Q lcl|NC_015249. 81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGK 160 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~ 160 (347) ++++++++++++|+||+++|++|+|||+|++|+++|+|+++++|+|++||+++|++|+++++++++++.+++..+++++. 
T Consensus 80 ~~~~~~~~~e~~itID~~~~~~~~VddiD~~q~~~D~~~~~~~~~g~aLa~~~D~~i~~~~~~~aa~~~~~~~~~~g~~~ 159 (347) T protein:vir:94 80 DKRKGIKHTEKVITIDGLLTADVMIFDIEDAMNHYDVAGEYSNQLGEALAIAADGAVLAEMAILCNLPAASNENIAGLGT 159 (347) T ss_pred CCCCCCCcceEEEEecchhhhhHHhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHHhccccccccccCCCcc Confidence 88888999999999999999999999999999999999999999999999999999999999999999988888888888 Q ss_pred cceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEEE Q lcl|NC_015249. 161 AHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIRN 240 (347) Q Consensus 161 g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~ 240 (347) ++++..+...+.. ++.+.+++++++|++|+++|+|++||++|||+||+|++|++||+++++++.++.++..+.+|+|++ T Consensus 160 ~s~~~~~~~~~~~-~~~~~~~~~~~~i~~a~~~Lde~~VP~~~R~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~ 238 (347) T protein:vir:94 160 ASVLEVGKKADLD-TPAKLGEAIIGQLTIARAKLTSNYVPAGDRYFYTTPDNYSAILAALMPNAANYAALIDPETGNIRN 238 (347) T ss_pred cceeecccccccc-chhhhHHHHHHHHHHHHHHHhhcCCCCCCcEEEeCHHHHHHHhccchhhhhhccccccccccceEE Confidence 8888877666543 557788999999999999999999999999999999999999999999999999998999999999 Q ss_pred EeceEEEEecceeccccccccccccc-ccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhh Q lcl|NC_015249. 241 VMGFEVIEVPHLTAGGAGEDRPEEGA-NPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQ 319 (347) Q Consensus 241 i~G~~V~~sn~lp~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~ 319 (347) ++||+||+|||||+.+.+++..+.+. 
...|+.+.++.+.+.+|+++|++.++|+|||+|+++||+|++++|.+|++++| T Consensus 239 i~G~~V~~Sn~lp~~~~t~~~~~~~~~~~aG~~~~~~~~~~~~~~~~~~~~~~l~~h~~A~~~v~~~~~~~e~~r~~~~~ 318 (347) T protein:vir:94 239 VMGFVVVEVPHLVQGGAGETRGDDGITIASGQKHAFPATASSDVKVTMDNVVGLFSHRSAVGTVKLRDLALERDRDVDAQ 318 (347) T ss_pred EeceEEEecCcccccccccccccCcceecCcccccccccchhhhcccccceeEEEeehhhhhhhhcccccccchhchhhH Confidence 99999999999999888877776654 45777888888999999999999999999999999999999999999999999 Q ss_pred cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 320 ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 320 ~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +|+|+++|+|||+++||||||+|+++.| T Consensus 319 ~d~i~~~~~~G~~~~rP~~a~~~~~~~A 346 (347) T protein:vir:94 319 GDLIVGKYAMGHGGLRPEAAGALVFSPA 346 (347) T ss_pred HHHhhhhhhhcCcccccceeEEEEecCC Confidence 9999999999999999999999999999 No 5 >protein:vir:2201 Length: 345 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:49 # MgeName: T7 # Cross-refs: genbank:acc:NP_041998;swissprot:sw:p19726;genbank:gi:9627469;goa:P19726;uniprot:P19726;genbank:GeneID:1261026 Probab=100.00 E-value=1.2e-106 Score=601.34 Aligned_cols=342 Identities=77% Similarity=1.132 Sum_probs=315.2 Q ss_pred CCccccccccc--ccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCC Q lcl|NC_015249. 
1 MAKMNGGQQIG--KDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~--t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~ 78 (347) ||+++++++.+ |||||+ +++|+++||||+|+|||+++|+++|+++++|++|+|++|||++||++|++++++|+||++ T Consensus 1 ~~~~~~~~~~~~~~~~~~~-~~~~~~al~le~f~geV~~~f~~~s~~~~~~~~r~i~~gks~~~~~iG~~~~~~~~~G~~ 79 (345) T protein:vir:22 1 MASMTGGQQMGTNQGKGVV-AAGDKLALFLKVFGGEVLTAFARTSVTTSRHMVRSISSGKSAQFPVLGRTQAAYLAPGEN 79 (345) T ss_pred Ccccccchhcccccccccc-cCCchhHHHHHHHhHHHHHHHHHHhhhcccceeeeccccceEEEeeecceEEEeeecCCC Confidence 99999987766 788886 577999999999999999999999999999999999999999999999999999999999 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) |++++++++++|++|+||+++|++|+|||+|++|++||+|+++++|+||+||+++|++|+++++++++++++....++++ T Consensus 80 l~~~~~~~~~~e~~ltID~~~y~~~~VddiD~~q~~~D~r~~~s~~~G~aLA~~~D~~i~~~l~k~a~~~~~~~~~~~~~ 159 (345) T protein:vir:22 80 LDDKRKDIKHTEKVITIDGLLTADVLIYDIEDAMNHYDVRSEYTSQLGESLAMAADGAVLAEIAGLCNVESKYNENIEGL 159 (345) T ss_pred CCCCCCCcccceEEEEecchhhhhhhHhhHHHHhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccccccc Confidence 99988889999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 
159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) +.+.++.+..+....+++.+.+.++|++|++|+++|+|++||.+|||+||+|++|++||++++|++.+|.+++.+.+|+| T Consensus 160 ~~~~~~~~~~~g~~~t~~~~~~~~~~~ai~~a~~~Lde~~VP~~~R~~vv~P~~y~~Ll~~~~~~~~~~~~~~~~~~G~V 239 (345) T protein:vir:22 160 GTATVIETTQNKAALTDQVALGKEIIAALTKARAALTKNYVPAADRVFYCDPDSYSAILAALMPNAANYAALIDPEKGSI 239 (345) T ss_pred ccccccccccccccccccccCHHHHHHHHHHHHHHhhhcCCCccCCEEEeChHHHHHHhccccccccccccccccccceE Confidence 99988887776666667788899999999999999999999999999999999999999999999999999999999999 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccc-cccccccccceEEEEechhhhhhhhhcceeeeeeechh Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETS-SGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN 317 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~ 317 (347) ++++||+||+|||+|...+++.... ..++.+.++.+. +.++..+.+++|||+|||+|+++||+|++++|.+|+++ T Consensus 240 ~~i~G~~V~~sn~lp~~~~~~~~~~----~~~~~~~~~~~~g~~~~~~~~~~~~~l~~h~~A~~~v~~~~~~~e~~r~~~ 315 (345) T protein:vir:22 240 RNVMGFEVVEVPHLTAGGAGTAREG----TTGQKHVFPANKGEGNVKVAKDNVIGLFMHRSAVGTVKLRDLALERARRAN 315 (345) T ss_pred EEEeceEEEecccccccccCccccC----cccccccccccccceeeeeccCceEEEEEehhheeeeeeecceeeeeechh Confidence 9999999999999998766554432 233444444443 44566778999999999999999999999999999999 Q ss_pred hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 ~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ||+|+|+++|+|||+++||||||+|+++-- T Consensus 316 ~~~d~I~~~~a~G~~vlRPeaa~~i~~~~~ 345 (345) T protein:vir:22 316 FQADQIIAKYAMGHGGLRPEAAGAVVFKVE 345 (345) T ss_pred HHHHHHHHHHhcCCcccccceeEEEEEeeC Confidence 999999999999999999999999999988 No 6 >protein:vir:3364 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:67 # MgeName: T3 # Cross-refs: genbank:acc:NP_523335;genbank:gi:17570826;genbank:GeneID:927448 Probab=100.00 E-value=5.6e-106 Score=597.60 Aligned_cols=343 Identities=79% Similarity=1.158 Sum_probs=309.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |||+++|++++|||||+++++|++|||||+|+|||+++|+++|++++++++|++++|||+|||++|++++++|++|++++ T Consensus 1 ~~~~~~~~~~~t~~g~~~~~~~~~al~ie~~~g~V~~~f~~~s~~~~~v~~r~~~~G~sv~i~~iG~~t~~~~~~g~~l~ 80 (347) T protein:vir:33 1 MANIQGGQQIGTNQGKGQSAADKLALFLKVFGGEVLTAFARTSVTMPRHMLRSIASGKSAQFPVIGRTKAAYLKPGENLD 80 (347) T ss_pred CCCCccCcccccccccCCcccchHHHHHHHHHHHHHHHHHHHHhhhhhhccccccccceeEeeeccceeeeeecCCCCCC Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccC Q lcl|NC_015249. 81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGK 160 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~ 160 (347) ++++++++++++|+||+++||+|+|||+|++|+++|+|+++++++|++||+++|++|+++++++.+.+.......++++. 
T Consensus 81 ~~~~~~~~~e~~ltiD~~~y~~~~VddiD~~q~~~D~~~~~~~~~g~aLA~~~D~~i~~~l~~~~~~~~~~~~~~~~~~~ 160 (347) T protein:vir:33 81 DKRKDIKHTEKVIHIDGLLTADVLIYDIEDAMNHYDVRAEYTAQLGESLAMAADGAVLAELAGLVNLPDGSNENIEGLGK 160 (347) T ss_pred CCCCCCccceEEEEechhhhhhHHHhhHHHHhcCCchhHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcccccccccccc Confidence 88888999999999999999999999999999999999999999999999999999999999887765544433333333 Q ss_pred c--ceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 161 A--HVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 161 g--~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) + +.+..+++. ...++...+.++|++|++|+++|+|++||++|||+||+|++|++||++++|++++|+++..+.+|+| T Consensus 161 ~~~~~~~~~~tg-~~~d~~~~a~~i~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~~d~~~~~~~~~G~V 239 (347) T protein:vir:33 161 PTVLTLVKPTTG-SLTDPVELGKAIIAQLTIARASLTKNYVPAADRTFYTTPDNYSAILAALMPNAANYQALLDPERGTI 239 (347) T ss_pred cccccccccccc-cccchhhhHHHHHHHHHHHHHHHhhcCCCccCcEEEeCHHHHHHHhcccccccccccccccccccee Confidence 3 222222222 3445667889999999999999999999999999999999999999999999999999889999999 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhh Q lcl|NC_015249. 
239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANF 318 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~ 318 (347) ++++||+||+|||||..+++.+..+ +..+..+.+....+.+++++|++.+||+|||+|+++++++++++|..||++| T Consensus 240 ~~i~G~~V~~Sn~lp~~~~~~~~~~---~~ag~~~~~~~~~~~~~~~a~~~~~gl~~h~~A~g~v~~~~~~~e~~r~~~~ 316 (347) T protein:vir:33 240 RNVMGFEVVEVPHLTAGGAGDTRED---APADQKHAFPATSSTTVKVALDNVVGLFQHRSAVGTVKLKDLALERARRANY 316 (347) T ss_pred EEEeceeEEEecccccCcccccccc---ccccccccccCCcccceeccccceeeeeecchhheeeeeeceeeeeccchhh Confidence 9999999999999999876654333 2345556666777788999999999999999999999999999999999999 Q ss_pred hcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 319 QADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 319 ~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+|+|+++|+|||+++||||||+|++|+= T Consensus 317 ~~d~i~~~~~~G~~vlrP~~av~i~~~~~ 345 (347) T protein:vir:33 317 QADQIIAKYAMGHGGLRPEAAGAIVLPKV 345 (347) T ss_pred hhHhhhhhhhcCCceecccceEEEecCCC Confidence 99999999999999999999999999998 No 7 >protein:vir:100057 Length: 375 # NCBI annotation: T7-like capsid protein # Family: family:all:975 # MgeID: mge:1604 # MgeName: P-SSP7 # Cross-refs: genbank:acc:YP_214206;genbank:gi:61806429;genbank:GeneID:3294737 Probab=100.00 E-value=8.6e-105 Score=591.11 Aligned_cols=344 Identities=22% Similarity=0.318 Sum_probs=310.8 Q ss_pred CCccc----ccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecC Q lcl|NC_015249. 
1 MAKMN----GGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPG 76 (347) Q Consensus 1 ma~~~----~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g 76 (347) |++++ |++|++|||||++ ++|+++||||+|+|||+++|+++|++++++++|+|++|||++|+++|++++++|+|| T Consensus 1 ~~~~~~~~~~~~n~~t~~~~~~-~~~~~al~le~f~geV~~~f~~~si~~~~~~~rti~~Gksv~f~~iG~~t~~~~t~G 79 (375) T protein:vir:10 1 MANANQVALGRSNLSTGTGYGG-ATDKYALYLKLFSGEMFKGFQHETIARDLVTKRTLKNGKSLQFIYTGRMTSSFHTPG 79 (375) T ss_pred CccccccccCccccCCcccccc-ccchHHHHHHHHhHHHHHHHHHHHhhhccccccccccCceEEEEeeeeeEEeeecCC Confidence 88877 8899999999994 468899999999999999999999999999999999999999999999999999999 Q ss_pred CCCCCc-cCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 77 ENLDDK-RKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 77 ~~~~~~-~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) ++|+++ ..++++++++|+||+++||+|+|||+|++|+++|+|+++++|+||+||+++|++|+++++++++...+....+ T Consensus 80 ~~i~~~~~~d~~~te~~l~ID~~~y~~~~VdDiD~aqa~~Dlr~e~s~~~G~aLA~~~D~~i~~~l~kaa~~~~p~~~~~ 159 (375) T protein:vir:10 80 TPILGNADKAPPVAEKTIVMDDLLISSAFVYDLDETLAHYELRGEISKKIGYALAEKYDRLIFRSITRGARSASPVSATN 159 (375) T ss_pred cCcCCccccCCCCCceEEEecchhhhhhhHhhHHHHhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccccc Confidence 998875 4578899999999999999999999999999999999999999999999999999999999999999988888 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcc---hhhhhhhhccccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAA---LMPNAANYQALID 232 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~---~~~~~~~~~~~~~ 232 (347) ...++++.+..++... ...+.++.++|++|++++++|+|++||++|||+||+|++|++||++ +++++.+|+++.. 
T Consensus 160 ~~~~Gg~~i~~~sg~~--~~~~~ta~~~~~ai~~a~~~Lde~~VP~~~R~~vv~P~~y~~Ll~~~d~~~~~n~d~~~~~~ 237 (375) T protein:vir:10 160 FVEPGGTQIRVGSGTN--ESDAFTASALVNAFYDAAAAMDEKGVSSQGRCAVLNPRQYYALIQDIGSNGLVNRDVQGSAL 237 (375) T ss_pred ccccCcceeeeccccc--cccccCHHHHHHHHHHHHHHHhhcCCCCCCCEEEeChHHHHHHHhcCCccceeeecccccce Confidence 8899998887765443 3455679999999999999999999999999999999999999987 6789999999988 Q ss_pred cccceEEEEeceEEEEecceecccccccccccccccccc------ccc------ccccccccccccc---cceEEEEech Q lcl|NC_015249. 233 PSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQ------KHA------FPETSSGDTRVAL---DNVVGLFNHR 297 (347) Q Consensus 233 ~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~------~~~------~~~~~~~~y~~~~---~~~~~l~~~~ 297 (347) ..+|.|++++||+||+|||+|..+++.+..+.+.+.++. .+. .....+.+|++++ +|+|||+||| T Consensus 238 ~~~g~v~~i~Gv~V~~Sn~lP~~~~~~~~~g~~~~~~a~~~~~~~~~~~~~~~~~~~g~~~~y~~d~~~~~~~~~~~~~~ 317 (375) T protein:vir:10 238 QSGNGVIEIAGIHIYKSMNIPFLGKYGVKYGGTTGETSPGNLGSHIGPTPENANATGGVNNDYGTNAELGAKSCGLIFQK 317 (375) T ss_pred eccceEEEEeceEEEEeccccccccccccccccccccchhhhhccccccCCcceeeccccccccccccccCceEEEEEch Confidence 999999999999999999999998877766665554321 111 1122345899998 9999999999 Q ss_pred hhhhhhhhcceeeeee---echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 298 SAVGTVKLKDMALERA---RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 298 ~Av~~v~~~~~~~e~~---~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +|+++||++++++|++ |+++||+|+|+++|+|||+++||||||+|.+... 
T Consensus 318 ~A~g~v~~~~~~~~~~~~~~~~~~q~~~i~~~~a~G~~~lrp~~av~l~~~~~ 370 (375) T protein:vir:10 318 EAAGVVEAIGPQVQVTNGDVSVIYQGDVILGRMAMGADYLNPAAAVELYIGAT 370 (375) T ss_pred hheeeeeeeccccccccchhhheeeeeeeeeeeeeccCccCceeEEEEecCcC Confidence 9999999999999988 7999999999999999999999999999999955 No 8 >protein:vir:1541 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:31 # MgeName: phiYeO3-12 # Cross-refs: genbank:acc:NP_052109;swissprot:trembl:q9t107;genbank:gi:9634035;uniprot:Q9T107;genbank:GeneID:1262383 Probab=100.00 E-value=1.1e-104 Score=590.53 Aligned_cols=344 Identities=80% Similarity=1.156 Sum_probs=310.8 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |||++||++++|||||+++++|++|||||+|+|+|+++|+++|++++++++|++++|||++||++|++++++|++|++++ T Consensus 1 ma~~~~~~~~~t~~~~~~~~~~~~a~~ie~f~g~V~~~f~~~s~~~~~~~~~~~~~G~sv~i~~ig~~t~~~~~~g~~l~ 80 (347) T protein:vir:15 1 MANIQGGQQIGTNQGKGQSAADKLALFLKVFGGEVLTAFARTSVTMPRHMLRSIASGKSAQFPVIGRTKAAYLKPGENLD 80 (347) T ss_pred CCccccCCccccccccCCCcchHHHHHHHHHHHHHHHHHHHhhhhhhccccccccccceeEeeeccceeeeeeccCCCCC Confidence 99999999999999999999999999999999999999999999999999999999999999999999999999999999 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccC Q lcl|NC_015249. 
81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGK 160 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~ 160 (347) ++++++++++++|+||+++||+|+|||+|++|+++|+|+++++++|++||+++|++|++++++++.+.........+.++ T Consensus 81 ~~~~~~~~~e~~ltID~~~~~~~~VddlD~~q~~~D~~~~~~~~~g~aLA~~~D~~i~~~l~~~~~~~~~~~~~~~~~g~ 160 (347) T protein:vir:15 81 DKRKDIKHTEKVIHIDGLLTADVLIYDIEDAMNHYDVRAEYTAQLGESLAMAADGAVLAELAGLVNLPDASNENIEGLGK 160 (347) T ss_pred CCCCCCccceEEEEechhhhhhHHhhhHHHHhcCCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccccccCc Confidence 88888999999999999999999999999999999999999999999999999999999999887765544444333222 Q ss_pred cceee-cccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEE Q lcl|NC_015249. 161 AHVLE-VGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIR 239 (347) Q Consensus 161 g~~i~-~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg 239 (347) ..+.. ....+....++...+.++++.|++|+++|+|++||++|||+||+|++|+.||+++++++.+|.++..+++|.|+ T Consensus 161 ~~~~~~~~~~~~~~~~~~~~~~~i~d~~~~a~~~Lde~~VP~~gR~~vv~P~~y~~LL~~~~~~~~d~~~~~~~~~G~Vg 240 (347) T protein:vir:15 161 PTVLTLVKPTTGDLTDPVELGKAIIAQLTIARASLTKNYVPAADRTFYTTPDNYSAILAALMPNAANYQALIDHERGTIR 240 (347) T ss_pred cccccccccccccchhhhhHHHHHHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHhcccccccccccccccccceEEE Confidence 22221 11222234466677899999999999999999999999999999999999999999999999999999999999 Q ss_pred EEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhh Q lcl|NC_015249. 240 NVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQ 319 (347) Q Consensus 240 ~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~ 319 (347) +++||+||+|||||..+++.+.. 
.+.++..+.+....+.+.+.+|++.++|+||++|++++|+|++++|..||++|| T Consensus 241 ~i~G~~V~~Sn~lp~~~~t~~~~---~~~~g~~~~~~~~~~~~~~~~f~~~~~l~~h~~A~g~v~~~~~~~e~~~~~~~~ 317 (347) T protein:vir:15 241 NVMGFEVVEVPHLTAGGAGDTRE---DAPADQKHAFPATSSTTVKVALDNVVGLFQHRSAVGTVKLKDLALERARRANYQ 317 (347) T ss_pred EEeceEEEecccccccccccccc---cccccccccccccccceeeeccccceeeeeccceeeeeEeeceeeeecccchhh Confidence 99999999999999887665442 344566777777778889999999999999999999999999999999999999 Q ss_pred cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 320 ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 320 ~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +|+|+++|+|||+++||||||+|++|+= T Consensus 318 ~d~i~~~~~~G~~vlrP~~av~~~~~~~ 345 (347) T protein:vir:15 318 ADQIIAKYAMGHGGLRPEAAGAIVLPKV 345 (347) T ss_pred hhhhehhhhcCCceeccccEEEEecCCC Confidence 9999999999999999999999999998 No 9 >protein:vir:103323 Length: 364 # NCBI annotation: major capsid-like protein # Family: family:all:2806 # MgeID: mge:1609 # MgeName: Era103 # Cross-refs: genbank:acc:YP_001039668;genbank:gi:125999997;genbank:GeneID:4818399 Probab=100.00 E-value=2.2e-99 Score=561.44 Aligned_cols=333 Identities=14% Similarity=0.117 Sum_probs=292.0 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |...+ .+|||||+++ +|.++||||+|+|||+++|+++|++++++++|+|++|||++||++|+++++||+||++++ T Consensus 1 ms~~n----~~t~~~~~~~-~~~~al~le~f~geV~taf~~~s~~~~~~~~rti~~gkS~q~~~iG~~~~~~~~~G~~ld 75 (364) T protein:vir:10 1 MSNPN----VLTQPAVSAS-GEVDSLLIEKFNNRVHEQYLKGENLLQWFDVQEVVGTNSVSNKYIGETELQVLSPGKSPD 75 (364) T ss_pred CCCcc----cccccccccc-cchhhhhhhhhhhhHHHHHHHHHhhcCcceeeeecccceEEeeeeeeeEEeeeccCcccC Confidence 66554 6799999954 477999999999999999999999999999999999999999999999999999999998 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChh-hHHHHHHHHHHHHHHHHHHHHHHHHHHHh-hhccccccccccc Q lcl|NC_015249. 81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYD-VRSEYTAQLGESLAMAADGAVLAEMAKLC-NLPSASDENIAGL 158 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D-~r~~~~~~~g~aLa~~~D~~i~~~~~~~a-~~~~~~~~~~~~~ 158 (347) + +++.++|++|+||+++|++++|||+|++|+||| +|+||++|+||+||+++||+|++++..++ .-+.+....+.+. T Consensus 76 ~--~~~~~~k~~itID~ll~a~~~V~diDe~q~~~D~vR~e~s~e~G~ALA~~~Dq~i~~~v~~aa~a~~~~~~~~~~~~ 153 (364) T protein:vir:10 76 A--SPTEFDKNRLVVDTTVIARNTVAHFHDVQNDIDGLKSKLSVNQAKKLKKMEDSMVIQQLVLGGISNTEAIRKNPRVA 153 (364) T ss_pred C--CCcccCcEEEEecceeeechhhhhHHHHhcCccchhHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcccccccCCccc Confidence 6 578899999999999999999999999999999 89999999999999999999988775543 2233344455566 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhc--cccccccc Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQ--ALIDPSTG 236 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~--~~~~~~~G 236 (347) ++|+.+.++. .+.+....+.+++++|++|.++|+|++||.+|||+||+|++|++||++++|+|++|. 
+++.+.+| T Consensus 154 ~~g~~i~~~~---~a~~~~~~~~~l~~ai~~a~~~LdEkdVP~~~R~~vv~P~~y~~Ll~~~~lvn~d~~~~~~~~~~~G 230 (364) T protein:vir:10 154 GHGFSIHIVG---LASSFLTSPQYMMAAIEMAMEQQTEQEVDTSELCGLMPWTAFNCLRDADRIVDKSYTIAASDNTVDG 230 (364) T ss_pred CCcceeeecc---cCcchhhhHHHHHHHHHHHHHHHhhcCCCccccEEEeChHHHHHHhcCCccccccccccCCCccccc Confidence 6666665532 234567788999999999999999999999999999999999999999999999986 66779999 Q ss_pred eEEEEeceEEEEecceeccccccccccccccccccccccccccccccc--ccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 237 SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTR--VALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 237 ~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~--~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|++++||+||+|||||..+..... ++..++|+++...+.+ +|. .+++++++++|||+||+++|++++++|.|| T Consensus 231 ~v~~v~Gv~Vv~Sn~lP~~~~~~~~---t~~~t~h~ls~~~~g~-~y~v~~d~~~~~~~~f~~~Al~tv~~~~~t~e~~~ 306 (364) T protein:vir:10 231 FVLKSWNTPIVPSNRFPKLSDNTEG---TGNTKHHKLSNAGNGN-RYDVTAGQTSAQAVLFTQDALLVGRTISITGDIFY 306 (364) T ss_pred eeEEEeceEEEeccccccccccccc---cccccccccccccCCc-ccccccccceeEEEEEecceEEEEEEecceeeeee Confidence 9999999999999999988665433 4456667666554433 454 788999999999999999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++++|+|+|+++|+|||+++||||||+|.+..+ T Consensus 307 ~~~~~~~~ida~~a~G~g~lRPeaa~~i~~~~~ 339 (364) T protein:vir:10 307 EKKEKTWYIDTFLAEGAIPDRWEAVAVVTAADT 339 (364) T ss_pred ccceeeeeeeeehcccCcccCccceEEEEecCC Confidence 999999999999999999999999999999988 No 10 >protein:vir:80213 Length: 334 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1879 # MgeName: LKA1 # Cross-refs: genbank:acc:YP_001522884;genbank:gi:158345177;genbank:GeneID:5687476 Probab=100.00 E-value=3.6e-99 Score=560.29 Aligned_cols=326 Identities=16% Similarity=0.115 Sum_probs=293.3 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |+|+++ |..|||+|+++++| ++||||+|+|||+++|+++|+|++++++|+|++|||+|||++|+++++||+||++++ T Consensus 1 m~~~~~--~~~t~~~~~~~~~~-~~l~le~~~geV~~af~~~s~~~~~~~~r~i~~G~s~~~~~iG~~~~~~~~~g~~l~ 77 (334) T protein:vir:80 1 MTYPAA--NTHTRPGWGGANSD-VSLHIEEHLGLVDASFMYSSKFASWMNVRSLRGTNQLRVDRVGASTIAGRKAGEELV 77 (334) T ss_pred CCCCcC--CCccccccccccch-heehhhhhhhHHHHHHHHhhhhhccceeeeccccceEEEeeecceeeeeecCCCCCC Confidence 999997 44589999988777 779999999999999999999999999999999999999999999999999999998 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccC Q lcl|NC_015249. 
81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGK 160 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~ 160 (347) + +++++++++|+||+++|++++|||+|++|++||+|+|+++|+|++||+++||+|+++++++++.+.+....++..++ T Consensus 78 ~--~~~~~~~~~l~ID~~l~~~~~VddiD~~q~~~D~rse~~~~~G~aLA~~~D~~~~~~l~kaa~~~~~~~~~~~~~~G 155 (334) T protein:vir:80 78 V--QKNVSDKLNLTVDTVLYARHFFDKFDEWTSNLDVRKETAREDGIALARQYDQACIIQLQKCGDFLAPAHLKPAFHDG 155 (334) T ss_pred C--CCcccCceEEEEeeeeehhhhHhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccccccccCC Confidence 7 46899999999999999999999999999999999999999999999999999999999999999888877777666 Q ss_pred cceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCC---CCCEEEeCHHHHHHHhcchhhhhhhhcc---ccccc Q lcl|NC_015249. 161 AHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPS---ADRVFYTTPDNYSAILAALMPNAANYQA---LIDPS 234 (347) Q Consensus 161 g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~---~gR~~vv~P~~~~~Ll~~~~~~~~~~~~---~~~~~ 234 (347) +......... +.+....++.+++++++|++.|+|+|||+ +|||+||+|++|++||++++|++++|.+ ...+. T Consensus 156 ~~~~~~~~g~--~~~~~~~~~~l~~a~~~a~~~L~e~dvp~~~~~~R~~vv~P~~y~~Ll~~~r~~n~d~~~s~~~~~~~ 233 (334) T protein:vir:80 156 ILLPSTISGL--AADAAADADVLVAAHRQGVEAMVFRDLGDQLMSEGVTLLDPVIFSFLLEHDRLMNVEFGAKEGGNSFV 233 (334) T ss_pred cceeeccccc--ccchhhhHHHHHHHHHHHHHHHHhcCCCCCcCCceEEEeChHHHHHHhcccccccceecccccccccc Confidence 5554433222 23456678899999999999999999994 6799999999999999999999999854 45589 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 
235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|+|++++||+||+|||+|..+.+.+..+ ....+|++||++.++++||++|++++|++++++|.+| T Consensus 234 ~g~i~~v~G~~V~~Sn~~P~~~~t~~~~g--------------~~~~~~agd~t~~~~~~~~~~Al~t~~~~~~~~e~~~ 299 (334) T protein:vir:80 234 GGRIAMLNGVRVVETPRFPQSAITANALG--------------ADFNVTDAEVRRKMITFIPSMALISAQVHPVSAQFWE 299 (334) T ss_pred ceeEEEEeceEEEeecCCCCccccccccc--------------cccccccccccceEEEEEeCceEEEEEEeecceeeee Confidence 99999999999999999997654432222 1234799999999999999999999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++++|+|+|+++++|||+++|||||+++++..- T Consensus 300 ~~~~~~d~i~~~~a~G~g~lRPeaa~vv~~~~~ 332 (334) T protein:vir:80 300 EKKDFGHYLDTFQSYNIGQRRPDAVAVHDITVT 332 (334) T ss_pred chhhHHHHHHHHHHcCCceeccceEEEEEEeee Confidence 999999999999999999999999999999888 No 11 >protein:vir:6324 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:132 # MgeName: phiKMV # Cross-refs: genbank:acc:NP_877471;genbank:gi:33300843;uniprot:Q7Y2D3;genbank:GeneID:1482613 Probab=100.00 E-value=1e-98 Score=557.85 Aligned_cols=322 Identities=16% Similarity=0.128 Sum_probs=286.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |.+.+ ..|||||+++++|. 
+||||+|+|||+++|+|+|++++++++|+|++|||+|||++|+.+++||+||++++ T Consensus 1 ms~~~----~~tr~~~~~s~~d~-al~le~f~geV~~af~~~s~~~~~~~~rti~~g~s~~~~~iG~~~~~~~~pG~~l~ 75 (335) T protein:vir:63 1 MSFLN----DLTRPNYAGKNADV-DIHLEEHLGIVDKHFAYTSKFAPLMNIRDLRGSNVVRLDRLGNVEAKGRRAGEELE 75 (335) T ss_pred CCCcc----cchhhhcccccchh-heehhhhhhhHHHHHHhhhhhccccceeeeccceeEEEeeeeeeeeecccCCcCcC Confidence 76664 57999999999996 79999999999999999999999999999999999999999999999999999999 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccC Q lcl|NC_015249. 81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGK 160 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~ 160 (347) ++ .+.++|++|+||+++|++++|||+|++|++||+|+||++|+|++||+++||+|++++++++++.++....++.+++ T Consensus 76 ~~--~~~~~k~~itVD~ll~a~~~I~dlDe~~~~yDvRse~s~e~G~aLA~~~D~~~~~~i~~aa~~~a~~~~~~~~~~G 153 (335) T protein:vir:63 76 RS--RVVNDKWNLTVDTLLYLRHQFDHQDEWTQSFDMRKEVAELDGQELARKFDQACLIQVIKAAAMDAPVDLEDAFSPG 153 (335) T ss_pred CC--CccccceEEEecceeechhhhhhHHHHhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHhhccccCccccCCCcCCC Confidence 86 4788999999999999999999999999999999999999999999999999999999999999888877765555 Q ss_pred cceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCC---CCEEEeCHHHHHHHhcchhhhhhhhc---cccccc Q lcl|NC_015249. 161 AHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSA---DRVFYTTPDNYSAILAALMPNAANYQ---ALIDPS 234 (347) Q Consensus 161 g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~---gR~~vv~P~~~~~Ll~~~~~~~~~~~---~~~~~~ 234 (347) ++.....++.+ ....+.+++++|++|.++|+|++||++ +||++|+|++|++||++++|+|++|+ +.+.+. 
T Consensus 154 ~~~~~~~tg~~----~~~~~~~l~~a~~~a~~~L~e~dVP~~~~~dr~~vv~P~~y~~Ll~~~~l~n~~~~~s~~~~~~~ 229 (335) T protein:vir:63 154 VLEKLDLTGLT----AKQAADKIVRMHRRVVETFIDRDLGDAVYSEGLTPMSPRVFSLLLEHDKLMNVEYQATGATNDYV 229 (335) T ss_pred cceeeeeccCc----ccccHHHHHHHHHHHHHHHHhccCCCcccCceEEEeChHHHHHHhcccccccccccccccccccc Confidence 44433322222 223478899999999999999999975 49999999999999999999999986 346689 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|+|++++||+|++|||+|..+++.+..+.. ...|++|+++.++++||++|++++|++++++|.|| T Consensus 230 ~g~v~~v~Gv~V~~sn~lP~~~~t~~~lg~a--------------~n~~~~d~~~~~~~~~~~~Al~t~~~~~vt~e~~~ 295 (335) T protein:vir:63 230 KSRVAILNGVKVLETPRFATKAIAAHPLGRH--------------FNVSAEESERQIALFLPSKTLITAQVAPVQAKLWE 295 (335) T ss_pred CceeEEeeceEEEeeccCCCCCccccccccc--------------CCccccccceeEEEEEecceEEEEEEeecccceee Confidence 9999999999999999999766544332222 12588899999999999999999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++++|+|+|+++++|||+++||||||+|+++.. 
T Consensus 296 ~~~~~~~~i~~~~a~G~g~lRPe~a~~i~~tg~ 328 (335) T protein:vir:63 296 DNEKFSWVLDTFQMYNIGARRPDTAGAIELKGI 328 (335) T ss_pred ccchhhHHhHHHHHcCCcccccceEEEEEEcCC Confidence 999999999999999999999999999999766 No 12 >protein:vir:78935 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1860 # MgeName: LKD16 # Cross-refs: genbank:acc:YP_001522824;genbank:gi:158345059;genbank:GeneID:5687425 Probab=100.00 E-value=1e-97 Score=552.33 Aligned_cols=322 Identities=16% Similarity=0.131 Sum_probs=289.0 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |.+.+ ..|||||+++++|. +||||+|+|||+++|+++|++++++++|+|++|||+|||++|+.+++||+||++++ T Consensus 1 ms~~~----~~t~~~~~~s~~d~-al~le~f~geV~~af~~~s~~~~~~~~rti~~g~s~~~~~iG~~~~~~~~pG~~l~ 75 (335) T protein:vir:78 1 MSFLN----DLTRPNYAGKNADV-DIHLEEHLGIVDKHFAYTSKFAPLMNIRDLRGSNVVRLDRLGNVEAKGRRAGEELE 75 (335) T ss_pred CCccc----cccccccccccchh-hhhhhhhhhHHHHHHHHhhhhccccceeeeccceeEEEeeeeeeeecccccCcccC Confidence 76663 57999999999985 79999999999999999999999999999999999999999999999999999998 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccC Q lcl|NC_015249. 
81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGK 160 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~ 160 (347) ++ .+.++|++|+||+++|++++|||+|++|++||+|+||++|+|++||+++||++++++++++++.++++..++.+++ T Consensus 76 ~~--~~~~~k~~itID~ll~a~~~VddlDe~~~~yDvR~e~s~~~G~aLA~~~Dq~~~~~l~~aa~~~a~~~~~~~~~~G 153 (335) T protein:vir:78 76 RS--RVVNDKWNLTVDTLLYLRHQFDHQDEWTQSFDMRKEVAELDGQELARKFDQACLIQVIKAAAMDAPVDLEDAFSPG 153 (335) T ss_pred CC--CcccCCeEEEecceeechhhHhhHHHhhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccccccCCCcCCC Confidence 85 4789999999999999999999999999999999999999999999999999999999999999988888776666 Q ss_pred cceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCC---CCEEEeCHHHHHHHhcchhhhhhhhc---cccccc Q lcl|NC_015249. 161 AHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSA---DRVFYTTPDNYSAILAALMPNAANYQ---ALIDPS 234 (347) Q Consensus 161 g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~---gR~~vv~P~~~~~Ll~~~~~~~~~~~---~~~~~~ 234 (347) ++.....++. +....+.++++++.++.+.|+|+|||+. +||++|+|++|++||++++|+|++|. +.+.+. T Consensus 154 ~~~~~~~tg~----~~~~~~~~l~~a~~~a~~~l~ekdvP~~~~~~rv~vv~P~~y~~Ll~~~~l~n~~~~~s~~~~~~~ 229 (335) T protein:vir:78 154 VLEKLDLTGL----TAKEAAEKIVRMHRRVVETFIERDLGDAVYSEGLTPMSPRVFSLLLEHDKLMSVEYQATGATNDYV 229 (335) T ss_pred cceeeeeccc----cccccHHHHHHHHHHHHHHHHhccCCCCCCCccEEEeChHHHHHHhcccccccccccccccccccc Confidence 6554433322 2334588899999999999999999965 69999999999999999999999986 346689 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|+|++++||+|++|||||..+++.+..+.. 
...|++|+++.++++||++||+++++++++.|.+| T Consensus 230 ~g~v~~v~Gv~V~~Sn~lP~~~~t~~~lg~a--------------~n~~~~d~~~~~~~~~~~~Al~t~~~~~~~~e~~~ 295 (335) T protein:vir:78 230 KSRVAILNGVKVLETPRFATKAISAHPLGRH--------------FNVSAEEAERQIALFLPSKTLITAQVAPVQAKLWE 295 (335) T ss_pred cceeEEeeceEEEeeccCCCCCCcccccccc--------------CCcccccccceEEEEEecceEEEEEEEecccceee Confidence 9999999999999999999765443322211 13477899999999999999999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++++|+|+|+++++|||+++||||||+|.++.. T Consensus 296 ~~~~~~~~i~~~~a~G~g~lRPe~a~~i~~tg~ 328 (335) T protein:vir:78 296 DHDQFSWVLDTFQMYNIGARRPDTAGAIELKGI 328 (335) T ss_pred ccchhhHhhhHHHHcCCcccCcceEEEEEecCC Confidence 999999999999999999999999999999988 No 13 >protein:vir:97031 Length: 402 # NCBI annotation: 31 # Family: family:all:2806 # MgeID: mge:1644 # MgeName: K1-5 # Cross-refs: genbank:acc:YP_654132;genbank:gi:108862016;genbank:GeneID:5075980 Probab=100.00 E-value=3.6e-97 Score=549.34 Aligned_cols=328 Identities=14% Similarity=0.118 Sum_probs=282.8 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |.+.| .+|||||+++ +|.++||||+|+|||+++|+++|++++++++|+|++|||++||++|+++++||+||++++ T Consensus 1 Ms~~n----~~t~~~~~~s-~~~~al~le~f~geV~taF~~~si~~~~~~vrti~~GkS~qf~~iG~~~a~y~~~G~~ld 75 (402) T protein:vir:97 1 MSTPN----TLTNVAVSAS-GEVDSLLIEKFNGKVNEQYLKGENILSYFDVQTVTGTNTVSNKYLGETELQVLAPGQSPN 75 (402) T ss_pred CCCcc----cccccccccc-cchhhhhhhhhhhhHHHHHHHHHhhcCcceeeeecccceEEEEEEeeeEEeeeccccccC Confidence 66553 6899999954 477999999999999999999999999999999999999999999999999999999998 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChh-hHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh-ccccccccccc Q lcl|NC_015249. 81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYD-VRSEYTAQLGESLAMAADGAVLAEMAKLCNL-PSASDENIAGL 158 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D-~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~-~~~~~~~~~~~ 158 (347) + +++.++|++|+||+++|++++|||+|++|+||| +|++|++|+|++||+++||+|++++..+++. +.+....+.+. T Consensus 76 g--~~~~~~k~~ItID~lL~a~~~V~diDeaq~~yD~vRse~s~e~G~ALA~~~Dq~ii~~i~~aa~a~t~~~~~~~~~~ 153 (402) T protein:vir:97 76 A--TPTQADKNQLVIDTTVIARNTVAHIHDVQGDIDSLKPKLAMNQAKQLKRLEDQMAIQQMLLGGIANTKAERNKPRVK 153 (402) T ss_pred C--CCcccccEEEEeCceeechhhhhhHHHHHhcccchhHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccccCccc Confidence 6 568899999999999999999999999999999 8999999999999999999998777554432 33444455555 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhc--cccccccc Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQ--ALIDPSTG 236 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~--~~~~~~~G 236 (347) ..++.+.+... ..+....+.+++++|++|.++|+|++||.+|||++|+|++|++||++++|++++|. 
+.+.+.+| T Consensus 154 ~~g~s~~~~~t---~~~a~~~~~~l~~ai~~a~~~LdEkdVP~~dRv~vv~P~~y~~Ll~~~rl~n~d~~~~~~g~~~~G 230 (402) T protein:vir:97 154 GHGFSINVNVT---ESEALANPQYVMAAVEYALEQQLEQEVDISDVAIMMPWKFFNALRDADRIVDKTYTISQSGATING 230 (402) T ss_pred ccccccccccc---cchhhcCHHHHHHHHHHHHHHHHhcCCCccccEEEeChHHHHHHhhcccccchhhccccCCccccc Confidence 54444443221 12335678899999999999999999999999999999999999999999999994 66779999 Q ss_pred eEEEEeceEEEEecceecccccccccccccccccccccccccc-cccccccccceEEEEechhhhhhhhhcceeeeeeec Q lcl|NC_015249. 237 SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETS-SGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARR 315 (347) Q Consensus 237 ~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d 315 (347) +|++++||+||+|||||+.+. +.++|..+...+. .-.|++|++++++++|||+||+++|+++++.|.||| T Consensus 231 ~v~~v~Gv~Vv~SnnlP~~a~---------~it~~~ls~a~~G~~y~~t~d~t~~~~~~f~~~Av~tvk~~~vT~~~~~d 301 (402) T protein:vir:97 231 FVLSSYNCPVIPSNRFPTFAQ---------DQAHHLLSNEDNGYRYDPIAEMNGAVAVLFTSDALLVGRTIEVTGDIFYE 301 (402) T ss_pred eeEEEeceEEEecCccccccc---------cccccccccCCCCccCCcCcccceeEEEEEecceEEEEEeeccccchhhc Confidence 999999999999999997532 2333433333322 235779999999999999999999999999999999 Q ss_pred hhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 316 ANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 316 ~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++|+|+|+++++|||+++|||||+++.+... 
T Consensus 302 ~r~~~~~id~~~a~G~g~~RPeaa~vv~~~~~ 333 (402) T protein:vir:97 302 KKEKTYYIDTFMAEGAIPDRWEAVSVVTTKRD 333 (402) T ss_pred hhHHHHHHHHHHHhCCcccCccceEEEEEecc Confidence 99999999999999999999999999999884 No 14 >protein:vir:7019 Length: 401 # NCBI annotation: major capsid protein # Family: family:all:2806 # MgeID: mge:141 # MgeName: SP6 # Cross-refs: genbank:acc:NP_853592;genbank:gi:31711674;genbank:GeneID:1481800 Probab=100.00 E-value=1.2e-94 Score=535.46 Aligned_cols=328 Identities=15% Similarity=0.123 Sum_probs=283.8 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |.+.+ .+|||||++++ |.++||||+|+|||+++|+++|++++++++|+|++|||++||++|+++++||+||++++ T Consensus 1 Ms~~n----~~t~~~~~~sg-~~~al~Le~f~GeV~taF~~~si~~~~~~vRti~~gkS~qf~~~G~s~~~~~~pG~~ld 75 (401) T protein:vir:70 1 MSTPN----NLTNVAVSASG-EVDSLLIEKFNGKVNEQYLKGENIMSYFDVQTVTGTNTVSNKYLGETELQVLAPGQSPA 75 (401) T ss_pred CCCCc----ccccccccccc-chhHhHHhHhcchHHHHHHHHhhhcccceeeeecccceEEEEEeeeeEeeeecCCCCcC Confidence 77775 57999999544 78899999999999999999999999999999999999999999999999999999998 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChh-hHHHHHHHHHHHHHHHHHHHHHHHHHHHhh-hccccccccccc Q lcl|NC_015249. 81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYD-VRSEYTAQLGESLAMAADGAVLAEMAKLCN-LPSASDENIAGL 158 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D-~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~-~~~~~~~~~~~~ 158 (347) + +++.++|++|+||+++|++++|+|+|++|+||| +|+||++|+|++||+++||+|++.+..++. .+++....+.+. 
T Consensus 76 ~--~~~~~dK~~ItID~lL~a~~~V~dlDe~q~~yD~vRse~s~e~G~ALA~~~Dq~iiq~i~~aa~ana~~~~~~p~~~ 153 (401) T protein:vir:70 76 A--TSTQADKNQLVIDATVIARNTVAHLHDVQGDIDSLKPKLATNQAKQLKRMEDEMLIQQMMLGGIANTQAKRTNPRVK 153 (401) T ss_pred C--CCcccccEEEEeCceeehhhhhhhHHHHHhcccccchHHHHHHHHHHHHHHHHHHHHHHHHhccccccccccCCCcC Confidence 6 568899999999999999999999999999999 999999999999999999988777643332 255667778888 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhc--cccccccc Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQ--ALIDPSTG 236 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~--~~~~~~~G 236 (347) ++|+.+.++... .+....+.+++++|++|...|+|++||.++++++.||.+|++|++++++++++|. +.+.+.+| T Consensus 154 ~~G~~i~v~~~~---~~~~~~~~~l~~ai~dA~~~LdEkdVP~~r~vvl~pp~~Ys~Ll~~d~L~nrd~~~s~~g~~~~G 230 (401) T protein:vir:70 154 GHGFSINVEVAE---GEALVNPQYVMAAVEFALEQQLEQEVDISDVAILMPWRYFNVLRDADRIVDKTYTISQSGATIQG 230 (401) T ss_pred CCceEEeccccc---cccccCHHHHHHHHHHHHHHHHhcCCCccceEEEcCHHHHHHHHhcCcccchhhccccCCccccc Confidence 888888876433 2445678889999999999999999996644444477788899999999999985 56779999 Q ss_pred eEEEEeceEEEEecceecccccccccccccccccccccccccc-cccccccccceEEEEechhhhhhhhhcceeeeeeec Q lcl|NC_015249. 237 SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETS-SGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARR 315 (347) Q Consensus 237 ~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d 315 (347) +|.+++||+||+|||+|+.+.+ .++|..+...+. 
...|++|+++++|++|||+|++++|+++++.|.||| T Consensus 231 ~v~~vaGv~Vv~SnnlP~~a~~---------it~~~ls~a~~G~~y~~~~d~s~~~~v~f~~~Av~tvk~~~lt~~~~~d 301 (401) T protein:vir:70 231 FTLSSYNCPVIPSNRFPKYSQG---------QTHHLLSNEDNGYRYDPLPAMNGAIAVLFTADALLVGRSIDVTGDIFYE 301 (401) T ss_pred eEEEEeceEEEeeccccccccc---------cccccccccCCCccCCCCccccceeEEEEehhheEEEEeeccccchhhh Confidence 9999999999999999985432 233443333222 225779999999999999999999999999999999 Q ss_pred hhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 316 ANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 316 ~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++|+|+|+++++|||+++|||||++++++.- T Consensus 302 ~r~~~~~id~~~a~g~g~~RPeaa~vv~~k~~ 333 (401) T protein:vir:70 302 KKEKTYYIDTFMAEGAIPDRWEAVSVVTTKRN 333 (401) T ss_pred hhhhHHHHHHHHHhCCcccchhheEEEeecCc Confidence 99999999999999999999999999887776 No 15 >protein:vir:105645 Length: 400 # NCBI annotation: putative major capsid protein # Family: family:all:2806 # MgeID: mge:1674 # MgeName: K1E # Cross-refs: genbank:acc:YP_425009;genbank:gi:83571757;uniprot:Q2WC43;genbank:GeneID:3837286 Probab=100.00 E-value=1.4e-93 Score=529.70 Aligned_cols=328 Identities=14% Similarity=0.117 Sum_probs=276.3 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |.+.+ .+|||||+++ +|.++||||+|+|||+++|+++|++++++++|+|++|||++||++|+++++||+||++|+ T Consensus 1 Ms~~n----~~t~p~~~gs-g~~~aL~Le~f~GeV~taF~~~si~~~~~~vRtI~~gkS~qf~~lG~s~a~y~~pG~~ld 75 (400) T protein:vir:10 1 MSTPN----NLTNVAVSAS-GEVDSLLIEKFNGKVNEQYLKGENIMSYFDVQTVTGTNTVSNKYLGETELQVLAPGQSPA 75 (400) T ss_pred CCCCc----cccccccccc-cchhhhHHhHhcchHHHHHHHHhhhcccceeeeecccceEEEEEeeeeEEeeecCCCCcC Confidence 77775 5799999954 488889999999999999999999999999999999999999999999999999999998 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHHHHhChh-hHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh-ccccccccccc Q lcl|NC_015249. 81 DKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYD-VRSEYTAQLGESLAMAADGAVLAEMAKLCNL-PSASDENIAGL 158 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D-~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~-~~~~~~~~~~~ 158 (347) ++ ++.++|++|+||+++|++++|||+|++|+||| +|+||++|+|++||+++||+|++++..+... +......+.+. T Consensus 76 g~--~~~~dk~~ItIDtLL~a~~~V~dlDd~q~~yD~vRse~s~e~G~ALA~~~Dq~iiq~i~~a~~a~t~~~~~~~~g~ 153 (400) T protein:vir:10 76 AT--STQADKNQLVIDATVIARNTVAHLHDVQGDIDSLKPKLATNQAKQLKKMEDEMLIQQMLLGGIANTQAKRTNPRVK 153 (400) T ss_pred CC--CcccCcEEEEeCceeeecchhhhHHHHhhccccccHHHHHHHHHHHHHHHHHHHHHHHHHhcccccccccccCCcc Confidence 74 68899999999999999999999999999999 9999999999999999999998777544311 22222333333 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhc--cccccccc Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQ--ALIDPSTG 236 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~--~~~~~~~G 236 (347) ..+..+.+.+ ...+...++.++..+|++|.+.|+|++||.++++++++|++|++|+.+++++|++|. 
+++++.+| T Consensus 154 ~~g~s~~v~~---~~~~~~~~~~~l~~A~~~A~~~LdEkdVP~~d~vvl~pp~~Ys~Ll~~dkLvnrdf~~s~~g~~~~g 230 (400) T protein:vir:10 154 GHGFSVNVEV---NEGEALVNPQYVMAAVEFALEQQLEQEVDISDVAILMPWRYFNVLRDADRIVDKSYTISQSGATIQG 230 (400) T ss_pred ccccceeecc---cccccccCHHHHHHHHHHHHHHHHhcCCCccceEEEcCHHHHHHHHhCCcccchhccccCCCccccc Confidence 3333333321 122333467889999999999999999998777777788899999999999999986 55779999 Q ss_pred eEEEEeceEEEEecceeccccccccccccccccccccccccccc-ccccccccceEEEEechhhhhhhhhcceeeeeeec Q lcl|NC_015249. 237 SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSS-GDTRVALDNVVGLFNHRSAVGTVKLKDMALERARR 315 (347) Q Consensus 237 ~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d 315 (347) +|.+++|++||+|||+|+.+. +.++|..+...+++ -.|++|+++++|++|||+|++++|+++++.|.||| T Consensus 231 ~v~~v~Gv~Iv~Sn~lP~~a~---------~~~~~~lS~a~~G~~y~~t~d~s~~~av~F~~sAv~tvk~~~lt~~~~~d 301 (400) T protein:vir:10 231 FVLSSYNCPVIPSNRFPKYSQ---------GQKHHLLSNEDNGYRYDPIAEMNGAIAVLFTADALLVGRSIDVIGDIFYE 301 (400) T ss_pred eEEEEeceEEEeeCcCCcccC---------cccccccccCCCCccCCccccccceeEEEEehhheEEEEeeccccccccc Confidence 999999999999999997432 22334333333322 25679999999999999999999999999999999 Q ss_pred hhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 316 ANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 316 ~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++|+|+|+++++|||+++|||||+++++... 
T Consensus 302 ~r~~~~~id~~~a~G~g~~RPeaa~vv~~~~~ 333 (400) T protein:vir:10 302 KKEKTYYIDTFMSEGAIPDRWEAVSVVTTKRQ 333 (400) T ss_pred hhhHHHHHHHHHHhCCcccchhheEEEEecCC Confidence 99999999999999999999999999999887 No 16 >protein:vir:78739 Length: 332 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:1856 # MgeName: Syn5 # Cross-refs: genbank:acc:YP_001285448;genbank:gi:148724482;genbank:GeneID:5220210 Probab=100.00 E-value=1.1e-93 Score=530.24 Aligned_cols=321 Identities=25% Similarity=0.367 Sum_probs=279.6 Q ss_pred CCcccccccccccccccccccchh-hhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKL-ALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~-al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~ 79 (347) ++||+..++ .|+||+++++|++ |||||+|+|||+++|+++|+++++++.|++++|||++||++|++++++|++|+++ T Consensus 4 ~~~~~~~~~--~~~~~~~~~~d~~~al~le~~~geV~~~f~~~s~~~~~~~~r~i~~G~tv~i~~ig~~~~~~~~~g~~l 81 (332) T protein:vir:78 4 LSNFSLPNQ--ANGGARNADYDVRYATALKLFSGEVFTAFNNASIFKGLVRSYDLRGGKSKQFMFTGKLSAGYHTPGTPI 81 (332) T ss_pred cccccCCcc--ccCCccccccccchhhhhhhhhhhHHHHHHHHhhhhhccccccccccceEEEEeccceeEeeecCCCCC Confidence 566665554 5889999999965 9999999999999999999999999999999999999999999999999999998 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG 159 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~ 159 (347) ++. +++++++++|+||+.+|++|+|||+|++|+++|+|+++++|+||+||+++|++|+++++++++...+. 
.+.+ T Consensus 82 ~~~-~~~~~~~~~l~ID~~ky~~~~VddiD~~q~~~dl~~~~~~~~g~aLA~~~D~~i~~~l~~aa~~~~~~----~~~~ 156 (332) T protein:vir:78 82 VGD-AGIKANEKTLVMDDLLVSSQFVYSLDEIFSQYSTRAEVSKQIGEALATHYDERIARVLAKASAEASPV----TGEP 156 (332) T ss_pred CCC-CCCCCceEEEEEehhhhhHHHHHhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccCcc----cccc Confidence 763 56899999999999999999999999999999999999999999999999999999999877554433 3344 Q ss_pred CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhc--chhhhhhhhcc-ccccccc Q lcl|NC_015249. 160 KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILA--ALMPNAANYQA-LIDPSTG 236 (347) Q Consensus 160 ~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~--~~~~~~~~~~~-~~~~~~G 236 (347) +++.+.++... +..+.++|++|++|+++|+|++||.+|||+||+|++|+.||+ +++|++.++.+ ++.+++| T Consensus 157 g~~~~~~~~~~------~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~d~~~~n~~~~~~~~~~~~g 230 (332) T protein:vir:78 157 GGFHVNIGAGN------TNDAQAIVDGFFEAAAVLDERSAPQEGRVAVLSPRQYYSLISSVDTNILNREIGNSQGDMNSG 230 (332) T ss_pred cccccccCCcc------ccCHHHHHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHHhhcCceeeeeeccccccceecc Confidence 45544443322 345788999999999999999999999999999999999998 78999999876 4567887 Q ss_pred e-EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeee--- Q lcl|NC_015249. 237 S-IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALER--- 312 (347) Q Consensus 237 ~-Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~--- 312 (347) . |++++||+||+|||||..+++.+.... .......|+++|++.++|+|||+|+++++++++++|. 
T Consensus 231 ~~i~~i~G~~V~~Sn~lp~~~g~~~~~~~-----------~~~~~n~~~~~~~~~~~~~~h~~a~~~v~~~~~~~~~t~~ 299 (332) T protein:vir:78 231 KGLYSIAGIRILKSNNLAGLYGQDLSSAA-----------VTGENNDYQVDASALAGLIFHREAAGCIQSVAPTIQTTSG 299 (332) T ss_pred eeeeEEeeeEEEecCccccCccccccccc-----------ccccccccccccccceEEeecccceeeeeeeccchhhhhc Confidence 6 899999999999999977655433221 1223457999999999999999999999999997764 Q ss_pred eechhhhcceeeeeeeecccccccceEEEEEEc Q lcl|NC_015249. 313 ARRANFQADQIIAKYAMGHGGLRPEACGALVFN 345 (347) Q Consensus 313 ~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~ 345 (347) +|++++|+|+|+++|+||++++||||+|+|... T Consensus 300 ~~~~~~~~d~i~~~~~~G~~v~rPe~~v~l~~a 332 (332) T protein:vir:78 300 DFNVQYQGDLIVGKLAMGCGSLRTSVAGSFQAA 332 (332) T ss_pred ccchhhhHhhhhhhhhhcCceecccceEEEeeC Confidence 689999999999999999999999999988887 No 17 >protein:vir:99675 Length: 324 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1523 # MgeName: VP4 # Cross-refs: genbank:acc:YP_249589;genbank:gi:68299740;genbank:GeneID:3799990 Probab=100.00 E-value=2.6e-89 Score=506.20 Aligned_cols=294 Identities=62% Similarity=0.904 Sum_probs=266.0 Q ss_pred ccccccccceEEEeecCcceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHH Q lcl|NC_015249. 
50 LVRSIQSGKSAQFPVLGRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESL 129 (347) Q Consensus 50 ~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aL 129 (347) ++|+|++|||++||++|+++++||+||++|++++++++++|++|+||+++|++|+|||+|++|++||+|+++++|+||+| T Consensus 1 ~vr~i~~g~s~~~~~iG~~~~~~~~~G~~l~~~~~~~~~~e~~itID~~l~~~~~VdDiD~~qa~~Dlr~e~s~~~G~aL 80 (324) T protein:vir:99 1 MTRTITSGKSAQFPVMGRTKARYLKQGQSLDDGREDIKHTEKVITIDGLLTTDVLIYDIEDAMNHYDVRSEYSTQMGEAL 80 (324) T ss_pred CeeeeecCceEEEeeeeeeEeccccCCCCcCCCcCCcCcccEEEEecchhhhhhhhhhHHHHhcCccchhHHHHHHHHHH Confidence 99999999999999999999999999999998889999999999999999999999999999999999999999999999 Q ss_pred HHHHHHHHHHHHHHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeC Q lcl|NC_015249. 130 AMAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTT 209 (347) Q Consensus 130 a~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~ 209 (347) |+.+|++|++++++++++..+....+....++..+...++. ..+.+..++++|+.|++|+++|+|++||.+|||+||+ T Consensus 81 A~~~Dq~i~~~~a~~~~~~a~~~~~~~~~~g~~~~~~~~~~--~~~~~~~~~~~~dai~~a~~~Lde~~VP~~gR~~vv~ 158 (324) T protein:vir:99 81 AMAADVANYAEMAKLVNSRKETTNENIEGLGAASLVKITGK--KEDPAKYGTQVIQALTYARAAFAKKYIPAGDRTFYTD 158 (324) T ss_pred HHHHHHHHHHHHHHhhhcccccccCCcccCCccceeccccc--ccccccCHHHHHHHHHHHHHHHhhcCCCCCCCEEEeC Confidence 99999999999999998887777766665555444333222 2355678899999999999999999999999999999 Q ss_pred HHHHHHHhcchhhhhhhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccc--cccccccc Q lcl|NC_015249. 210 PDNYSAILAALMPNAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETS--SGDTRVAL 287 (347) Q Consensus 210 P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~y~~~~ 287 (347) |++|++||+++++++.+|++++.+++|+|++++||+||+|||+|..+++... .+.+.++|..++.... 
.++|++++ T Consensus 159 P~~y~~Ll~~~~~~~~~~~~~~~~~~G~V~~i~Gf~V~~Sn~lp~~~~t~~~--~a~~~~~~~~~~~~~~~~~~ky~~d~ 236 (324) T protein:vir:99 159 PDTYSAILAALMPNAANYAALIDPETGNIRNVMGFEVVETPHMTAQMVTNPT--DAFDGTGHIFPATGDSTTTGKMTVGA 236 (324) T ss_pred hHHHHHHhhcccccccccccccceecceEEEEeceEEEecCCcccccccccc--cccccccccccccccccccccccccc Confidence 9999999999999999999999999999999999999999999998776543 3556666766665554 45899999 Q ss_pred cceEEEEechhhhhhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 288 DNVVGLFNHRSAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 288 ~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++++||+||++|++++|++++++|.+||++||+|+|+++|+|||+++|||||++++++++ T Consensus 237 ~~~~gl~~~~~a~~tv~~~~~~~e~~~~~~~~~d~i~~~~a~G~~~lRPe~a~~v~l~~~ 296 (324) T protein:vir:99 237 DNVVGLFVHRSAVATLKLKDMALERARRPEYQADQIIAKYAMGHGGLRPEAVGAIIFEDG 296 (324) T ss_pred CceeEEEEehhheEEEeeecceecceechhhHHHhhhhhhhhcCcccccceEEEEEEccC Confidence 999999999999999999999999999999999999999999999999999999999988 No 18 >protein:vir:94622 Length: 341 # NCBI annotation: PfWMP4_37 # Family: family:all:2203 # MgeID: mge:1525 # MgeName: Pf-WMP4 # Cross-refs: genbank:acc:YP_762667;genbank:gi:115304375;genbank:GeneID:5142322 Probab=100.00 E-value=1.8e-72 Score=413.93 Aligned_cols=321 Identities=16% Similarity=0.147 Sum_probs=260.1 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhccccccc--ccccceEEEeecCcceeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLVRS--IQSGKSAQFPVLGRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~r~--i~~G~tv~i~~iG~~~~~~~~~g~ 77 (347) |+|+-+|+++ .++.+.. || |+|+++|++.|++.+++.++++.++ +++|+|++||++|+.++++|++|. 
T Consensus 3 ~~~~~~~~~~--------~t~~v~~-fipei~s~~i~~~l~~~~v~~~~~~d~~~~~~~Gdtv~ip~~g~~~~~d~~~~~ 73 (341) T protein:vir:94 3 LGNTITGPSI--------NTQRGQQ-FIPEQWLSEVQMFRKAKMLDTSVVKTWGAQVKKGDTFHVPRISELGVEDKATDV 73 (341) T ss_pred chhhhccccc--------cchhHHH-HHHHHHHHHHHHHHHhhcchhhccccccccccCCceEEEeccCcceeeeecCCC Confidence 5555555554 3444454 55 9999999999999999999998775 466999999999999999999999 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) +++. +++++++++|+||+++|+++.|+|+|+.|+++|+|++++++++++||+++|+.|+..++..+.... + T Consensus 74 ~i~~--~~~~~~~~~itiD~~~~~~~~i~d~d~~~~~~d~~~~~~~~~~~aLA~~~D~~i~~~~a~~~~~~~-------~ 144 (341) T protein:vir:94 74 PVGV--QPVNDTDFVITVDTDRTTAVALDDLLEIQASYDLRAPYLEAMGYALAKDMTGSILGLRAAVQNTAS-------Q 144 (341) T ss_pred cccc--ccccCceEEEEEeeeeecceeechHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc-------C Confidence 8754 689999999999999999999999999999999999999999999999999999877654321111 0 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) ..... .....+++ .....++.|++|+++|+|++||.+|||+||+|++|+.|+++++|++.++.++..+++|. 
T Consensus 145 ~~~~~-----~~~~~t~~---~~~~~~~~i~~a~~~Lde~~VP~~gR~lvv~P~~~~~Ll~~~~~~~~~~~g~~~l~~G~ 216 (341) T protein:vir:94 145 NVFSS-----SNGAITGN---GQAFSFAVFLAARRLLLEADVPEEKIVLLISPGQESALFTIPQFISKDFINNAPIAQGQ 216 (341) T ss_pred ccccC-----ccccccCc---hhhhhHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHhhchhhhhhhccccchhheee Confidence 00000 00001111 12345788999999999999999999999999999999999999999999988899999 Q ss_pred EEEEeceEEEEecceeccccccccccccccc-cccccccc-ccccccccccccceEEEEechhhhhhhhhcc-------- Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANP-TGQKHAFP-ETSSGDTRVALDNVVGLFNHRSAVGTVKLKD-------- 307 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~-~~~~~~~~-~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~-------- 307 (347) |++++||+||+||++|..+.+.+....+... .+...++. ....++|+++++..+||+||++|++++|+++ T Consensus 217 ig~i~G~~V~~Sn~lp~~~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~gl~~~~~av~~~k~~~~~~~~~~~ 296 (341) T protein:vir:94 217 IGSLMGVRVIRTSLIGNNSATGWRNGAPTIAPAEATPGFTGSRYLPKQDSFTSLPATFTGNSRPVHTAVMCHMDWAAAVV 296 (341) T ss_pred eeeEeceEEEEeccccccccccccccccceecccccccccccccccccccccccEEEEEEecccccceeeecchhhhccc Confidence 9999999999999999888776555443321 11111121 2234478899999999999999999998666 Q ss_pred ---eeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
308 ---MALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 308 ---~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.+|..|+++||+|+|+++|+|||+++||||||+|.+..+ T Consensus 297 ~~~~~~~~~~~~~~~~~~i~~~~~~G~~~lrp~~~v~~~~~~~ 339 (341) T protein:vir:94 297 SKAPRVTQSFENREQVWLMVGRQAYGARLYRPLHAVNIHTTGD 339 (341) T ss_pred cccccccccchhhhhhhhhhhhhhhcccccCcceeEEEecCcC Confidence 6677789999999999999999999999999999999988 No 19 >protein:vir:80180 Length: 381 # NCBI annotation: capsid protein # Family: family:all:2203 # MgeID: mge:1878 # MgeName: Pf-WMP3 # Cross-refs: genbank:acc:YP_001285797;genbank:gi:148747831;genbank:GeneID:5220456 Probab=100.00 E-value=4.3e-69 Score=395.37 Aligned_cols=335 Identities=19% Similarity=0.229 Sum_probs=267.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccccc--ccccceEEEeecCcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRS--IQSGKSAQFPVLGRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~--i~~G~tv~i~~iG~~~~~~~~~g~~ 78 (347) |||+||. ++..|++.+..+..++..|+|+++|++.|++.+++..++..++ .+.|+|++||++|..++.++++|++ T Consensus 1 ~~~~~~~---~~~~~~~~~~t~~~~fiPev~s~~v~~~l~~~lv~~~l~~~~~~~~~~GdTV~ip~~g~~~a~d~~~g~~ 77 (381) T protein:vir:80 1 MATIQGT---GGYKGSAVDLSNVQVFIPEVWSSEVRMFRDQKFAALEATKKIPFEGKKGDLIHIPNISRAAVYDKQPQTP 77 (381) T ss_pred Cceeccc---ccccCcccchhhHHhhhhHHHHHHHHHHHHHhhhhhhccccccceeecCceEEeeccCcceeeeecCCCc Confidence 9999963 5788999999999876669999999999999999988877654 4679999999999999999999998 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) +.. +++++++++++||+++|+++.|+|+|+.|+++|+|++++++++++||+++|++|+..+.+......+.. . 
T Consensus 78 i~~--~~~~~~~~~itID~~~~~~~~Idd~D~~~~~~D~~~~~~~~~~~aLA~~~D~~i~~~~~~~~~~~~~~~-----~ 150 (381) T protein:vir:80 78 VNL--QARTDSEFTFTVTKYKESSFMIEDIVNTQASYTLRQYYTKEAGYALARDMDNFALAHRAVINAFPSQRI-----Y 150 (381) T ss_pred ccc--cccCCceEEEEEeeeeecceeechHHHHhhccChHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccc-----c Confidence 754 678999999999999999999999999999999999999999999999999999988765443222111 0 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) .....+..++ ............+++.|++|+++|+|++||.+|||+||+|++|+.||++++|++.+|.++..+++|.| T Consensus 151 t~~~~i~~~~--~~~~~t~~~~~~t~~~i~~a~~~Lde~~VP~egR~lvv~P~~~~~Ll~~~~~~~ad~~~~~~l~~G~I 228 (381) T protein:vir:80 151 SYDTTLGDGT--VNAHLTGTPAPLTYAALLLAKQKLDEADVPQEGRIVMVSPAQYIDLLSINQFISVDFSQVKPVTSGVV 228 (381) T ss_pred cccccccccc--cccccccchhhHHHHHHHHHHHHHhhcCCCcCCcEEEeCHHHHHHHhhchhhhhhhhccchhhhceee Confidence 1111111111 11111223345678999999999999999999999999999999999999999999999889999999 Q ss_pred EEEeceEEEEecceeccccccccccccccccc------cc--c-------c----------------------------- Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTG------QK--H-------A----------------------------- 274 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~------~~--~-------~----------------------------- 274 (347) ++++||+||+||+||....+.+....+..... .+ . + T Consensus 229 g~i~G~~Vv~Sn~lp~~~~t~~~~~agap~~~~~~~~~~~~~g~~s~~a~av~~~k~yd~~~~~~~~~~~~~~g~~~~~~ 308 (381) T protein:vir:80 229 GTILGMEVIVTTQIGINSLTGYVNGQGAPTQPTPGVLGSPYLPDQAGTANVVNTGSASDLAVSLSYFGLPVFSGAGATAA 308 (381) T ss_pred eEEcceEEEeecccccccccceeeeccccccccccccccccccccccceeeeeeeeeeceeeeeeeccceeeecceeeec Confidence 99999999999999987665544333211000 00 0 0 Q ss_pred ccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
275 FPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 275 ~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ......+.|+....+..+++||+++.+.+..+.++++..+...||+|.|+++|+||++.+||++||+|.+..- T Consensus 309 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 381 (381) T protein:vir:80 309 DGGQTLGSFGGANRWATAVVCHPDWLAVGVQQNVKSESSRETMYLADAFVTSCVYGAKVFRPDHCVLLHTSGI 381 (381) T ss_pred CCCceeeeehhhhhhhhhcccccccccccceeEeecccchhheeehhhhhhhhhhccccccchhhhhhhhcCC Confidence 0011122344444566788999999988888888888899999999999999999999999999999998777 No 20 >protein:vir:3136 Length: 322 # NCBI annotation: hypothetical protein # Family: family:all:11728 # MgeID: mge:64 # MgeName: VpV262 # Cross-refs: genbank:acc:NP_640318;genbank:gi:21234405;genbank:GeneID:956058 Probab=100.00 E-value=2.6e-57 Score=330.83 Aligned_cols=304 Identities=12% Similarity=0.041 Sum_probs=223.8 Q ss_pred ccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCCCc Q lcl|NC_015249. 4 MNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLDDK 82 (347) Q Consensus 4 ~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~~~ 82 (347) |.. ||.+++.+++|+ |+|+.+++..++++-+...+.+......|+|||||+||++++++|++++++. T Consensus 1 ~~~----------~n~ts~~qafi~~EiWsa~il~~l~~~Lv~~~~~~~~d~g~GDtV~InsIg~~tV~dY~~~~~i~-- 68 (322) T protein:vir:31 1 MST----------GNNTSNTQALIVSEIWADEIEDILHEKLLDVNIARVVDFPDGDKLTIPSVGTPVVRSRPEQGDFT-- 68 (322) T ss_pred CCC----------CCCcccceEEeehhhhHHHHHHHhhhhhhhhhhhcccccCCCCeEEeccccccccccccCCCCcc-- Confidence 222 234455566774 9999999999988889888888777778999999999999999999999874 Q ss_pred cCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc--ccccccC Q lcl|NC_015249. 
83 RKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE--NIAGLGK 160 (347) Q Consensus 83 ~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~--~~~~~~~ 160 (347) .+++++++.+|+|||.|||+|.||| |++|..+|++++++++++|+||+.+|+++...+..++.-...... .+.+.+ T Consensus 69 ~d~ltt~~~~l~IDq~KYfaf~VdD-D~~Qa~~dl~~~~~~~aa~ala~~~D~fva~lL~~gA~~~~~~~~p~vin~~~- 146 (322) T protein:vir:31 69 FDNLDTGEISIILRDEVYAGNAISK-KLRQDSRWISNVGAMLPAEQARAIMERYQTDLLALGNAQFAGQNDPNVINGVP- 146 (322) T ss_pred cccCCCceEEEEEehhhhhccccch-hHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhccCCcceecCCc- Confidence 4779999999999999999999999 999999999999999999999999999998766554422211111 111111 Q ss_pred cceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHH---------HHhcchhhhhhhhcccc Q lcl|NC_015249. 161 AHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYS---------AILAALMPNAANYQALI 231 (347) Q Consensus 161 g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~---------~Ll~~~~~~~~~~~~~~ 231 (347) ...+++ ...+...|+.|++++.+|+|+|||.+|||+||+|+++. .|++++||+..+-.|. T Consensus 147 ~~iv~~----------gt~~~~ay~~lv~l~~kLdkanVP~~gR~vVV~P~~~~~L~~i~~~~~l~~D~rf~~i~~sG~- 215 (322) T protein:vir:31 147 HRFVGT----------GTDQTMDVTDFSRVNYVMTQSKMPMGGMIGIIDPSVAHHLETITNISNISNNPRWEGIVESGI- 215 (322) T ss_pred cceecc----------CCCchhhHHHHHHHHHHhccccCCCCCeEEEeCchhhhhhhhhhhhhhhhccccccccccccc- Confidence 111221 22345579999999999999999999999999999865 4578888876554443 Q ss_pred ccccc--eEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhccee Q lcl|NC_015249. 232 DPSTG--SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMA 309 (347) Q Consensus 232 ~~~~G--~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~ 309 (347) ..| .||+++||+||+||++|..+.+-. .+.....++. .. -+...|.-+.. 
+...++..+.+ ++ T Consensus 216 --a~g~~~Vg~~~GF~V~~SN~l~~~~~~i~-aG~d~~~t~a-g~--~n~f~~~~~~~--------~~~~~~~~~~l-~~ 280 (322) T protein:vir:31 216 --APDMQFVRSVYGIDLFVSNLLADANETIN-AGGDARSTTA-GK--CNMFMNVSDMG--------LLPFVVAWKEM-PT 280 (322) T ss_pred --hhhHHHHHHHhceeeeeeccccccccccc-cCcccccccc-ee--ecccccccchh--------hhhhhhHhhhh-hh Confidence 223 499999999999999974332110 0000111110 00 01111222211 23334444444 48 Q ss_pred eeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 310 LERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 310 ~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .|.+|++.+++|.++++++||+|++|||..+.+....+ T Consensus 281 ~e~~r~~~~~~d~~~~~~~~g~g~~r~e~l~~~~a~~~ 318 (322) T protein:vir:31 281 TKSFIDDYNDDLNTATTARWGNGLVRDENLVCVLANAD 318 (322) T ss_pred hhcccCccccccceeeeeeecceeecccceEEEEeccc Confidence 89999999999999999999999999999999999888 No 21 >protein:vir:102605 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1661 # MgeName: Llij # Cross-refs: genbank:acc:YP_655002;genbank:gi:109392192;genbank:GeneID:4157227 Probab=100.00 E-value=9e-56 Score=322.38 Aligned_cols=266 Identities=18% Similarity=0.175 Sum_probs=221.5 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcccccc---cccccceEEEeecCcceeeeeec- Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLVR---SIQSGKSAQFPVLGRTKAAYLQP- 75 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~r---~i~~G~tv~i~~iG~~~~~~~~~- 75 (347) ||+. .|+ |+|+++|++.|++.+++.+++... 
+++.|+|++||++|..++.+|++ T Consensus 1 MA~~---------------------~~~pe~~~~~v~~~~~~~lv~~~l~~~~~~~~~~~Gdtv~ip~~~~~~~~d~~~~ 59 (273) T protein:vir:10 1 MAFN---------------------NFIPELWSDMLLEEWTAQTVFANLVNREYEGTASKGNVVHIAGVVAPTVKDYKAA 59 (273) T ss_pred Ccch---------------------hhhHHHHHHHHHHHHHhhhccchhhccccccccccCceEEEeecccccccccccC Confidence 4441 355 899999999999999999887552 57789999999999999987765 Q ss_pred CCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 76 GENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 76 g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) |..+ ..+++.+++++++||+.+|+++.|+|+|+.|+++|+++ ++++++++||+.+|+.++..++.++ ..+ T Consensus 60 ~~~~--~~~~~~~~~~~~tid~~~~~~~~i~d~d~~~~~~~~~~-~~~~~~~alA~~vD~~i~~~~~~a~-------~~~ 129 (273) T protein:vir:10 60 GRQT--SADAISDTGVDLLIDQEKSIDFLVDDIDRVQVAGSLEA-YTRAGATALATDTDKFIADMLVDNG-------TAL 129 (273) T ss_pred CCcc--CccccccceEEEEEeeeeecceEeecHHHhhhhccHHH-HHHHHHHHHHHHHHHHHHHHHhccc-------ccc Confidence 4443 34678999999999999999999999999999999865 9999999999999999987664321 000 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhh-hhhhcc-cccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPN-AANYQA-LIDP 233 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~-~~~~~~-~~~~ 233 (347) . 
......++++++.|++|+++|++++||.++||+||+|++|+.|++++.++ +.++.+ ...+ T Consensus 130 ~-----------------~~~~~~~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~L~~~~~~~~~~~~~~~~~~l 192 (273) T protein:vir:10 130 T-----------------GSAPTDADDAFDLIAKALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGL 192 (273) T ss_pred c-----------------cccccchhHHHHHHHHHHHHhhhcCCCcCCCEEEECHHHHHHHhcchhhhhhhhccccccce Confidence 0 01122356789999999999999999999999999999999999988655 556654 4568 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) ++|.||+++||+||+||+||..+. ...++||++|++.++++. ++|.. T Consensus 193 ~~G~ig~i~G~~v~~s~~lp~~~~--------------------------------~~~~~~~~~A~~~a~q~~-~~e~~ 239 (273) T protein:vir:10 193 RAGTIGNLLGARIVESNNLRDTDD--------------------------------EQFVAFHPSAAAYVSQID-TVEAL 239 (273) T ss_pred eeeeeeEEeceEEEEecccccCCc--------------------------------cEEEEEeccceeeeeeee-hhhcc Confidence 999999999999999999995321 123689999999998776 89999 Q ss_pred echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
314 RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |++.+++|.|.++++||++++|||+++.|....+ T Consensus 240 r~~~~~~~~v~~~~~yg~~v~~~~~~~~l~~~g~ 273 (273) T protein:vir:10 240 RDQDSFSDRIRALHVYGGKVVRPTGVVVFNKTGS 273 (273) T ss_pred cCCCcceeeeeeeeeeeeeEeccceEEEEeccCC Confidence 9999999999999999999999999998887777 No 22 >protein:vir:105822 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1636 # MgeName: PMC # Cross-refs: genbank:acc:YP_655767;genbank:gi:109522090;genbank:GeneID:4157630 Probab=100.00 E-value=9e-56 Score=322.38 Aligned_cols=266 Identities=18% Similarity=0.175 Sum_probs=221.5 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcccccc---cccccceEEEeecCcceeeeeec- Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLVR---SIQSGKSAQFPVLGRTKAAYLQP- 75 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~r---~i~~G~tv~i~~iG~~~~~~~~~- 75 (347) ||+. .|+ |+|+++|++.|++.+++.+++... +++.|+|++||++|..++.+|++ T Consensus 1 MA~~---------------------~~~pe~~~~~v~~~~~~~lv~~~l~~~~~~~~~~~Gdtv~ip~~~~~~~~d~~~~ 59 (273) T protein:vir:10 1 MAFN---------------------NFIPELWSDMLLEEWTAQTVFANLVNREYEGTASKGNVVHIAGVVAPTVKDYKAA 59 (273) T ss_pred Ccch---------------------hhhHHHHHHHHHHHHHhhhccchhhccccccccccCceEEEeecccccccccccC Confidence 4441 355 899999999999999999887552 57789999999999999987765 Q ss_pred CCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 
76 GENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 76 g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) |..+ ..+++.+++++++||+.+|+++.|+|+|+.|+++|+++ ++++++++||+.+|+.++..++.++ ..+ T Consensus 60 ~~~~--~~~~~~~~~~~~tid~~~~~~~~i~d~d~~~~~~~~~~-~~~~~~~alA~~vD~~i~~~~~~a~-------~~~ 129 (273) T protein:vir:10 60 GRQT--SADAISDTGVDLLIDQEKSIDFLVDDIDRVQVAGSLEA-YTRAGATALATDTDKFIADMLVDNG-------TAL 129 (273) T ss_pred CCcc--CccccccceEEEEEeeeeecceEeecHHHhhhhccHHH-HHHHHHHHHHHHHHHHHHHHHhccc-------ccc Confidence 4443 34678999999999999999999999999999999865 9999999999999999987664321 000 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhh-hhhhcc-cccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPN-AANYQA-LIDP 233 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~-~~~~~~-~~~~ 233 (347) . ......++++++.|++|+++|++++||.++||+||+|++|+.|++++.++ +.++.+ ...+ T Consensus 130 ~-----------------~~~~~~~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~L~~~~~~~~~~~~~~~~~~l 192 (273) T protein:vir:10 130 T-----------------GSAPTDADDAFDLIAKALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGL 192 (273) T ss_pred c-----------------cccccchhHHHHHHHHHHHHhhhcCCCcCCCEEEECHHHHHHHhcchhhhhhhhccccccce Confidence 0 01122356789999999999999999999999999999999999988655 556654 4568 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) ++|.||+++||+||+||+||..+. ...++||++|++.++++. ++|.. 
T Consensus 193 ~~G~ig~i~G~~v~~s~~lp~~~~--------------------------------~~~~~~~~~A~~~a~q~~-~~e~~ 239 (273) T protein:vir:10 193 RAGTIGNLLGARIVESNNLRDTDD--------------------------------EQFVAFHPSAAAYVSQID-TVEAL 239 (273) T ss_pred eeeeeeEEeceEEEEecccccCCc--------------------------------cEEEEEeccceeeeeeee-hhhcc Confidence 999999999999999999995321 123689999999998776 89999 Q ss_pred echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 314 RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |++.+++|.|.++++||++++|||+++.|....+ T Consensus 240 r~~~~~~~~v~~~~~yg~~v~~~~~~~~l~~~g~ 273 (273) T protein:vir:10 240 RDQDSFSDRIRALHVYGGKVVRPTGVVVFNKTGS 273 (273) T ss_pred cCCCcceeeeeeeeeeeeeEeccceEEEEeccCC Confidence 9999999999999999999999999998887777 No 23 >protein:vir:102655 Length: 322 # NCBI annotation: Hypothetical protein # Family: family:all:6384 # MgeID: mge:1624 # MgeName: VP2 # Cross-refs: genbank:acc:YP_052979;genbank:gi:50282923;genbank:GeneID:2948122 Probab=100.00 E-value=1.5e-55 Score=321.12 Aligned_cols=307 Identities=14% Similarity=0.079 Sum_probs=232.1 Q ss_pred CC--ccccc-ccccccccccccccchhhhhhhhhhhHHHHHHHHH-Hhhhcccccccccccc-------eEEEeecCcce Q lcl|NC_015249. 1 MA--KMNGG-QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRT-SVTMNKHLVRSIQSGK-------SAQFPVLGRTK 69 (347) Q Consensus 1 ma--~~~~~-~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~-s~~~~~~~~r~i~~G~-------tv~i~~iG~~~ 69 (347) |+ ++-++ ...++ +..+.|+|+|..+|+..||.+ |+|+++|+.++-.+|. ++.++.+|+.. T Consensus 1 ~~~~~~~~~~~~Ms~---------~i~~~fv~qy~~~v~~~~qq~~s~L~~tV~~~~~~~~~~~~~~~~~~~~~~~~~~~ 71 (322) T protein:vir:10 1 MKLNAIMSMLPLIAG---------DIDQAFVQTYETTLRILSQQKSAKLKQYCQHKNESSESHNWETLASMDPDAVKRKR 71 (322) T ss_pred Ccccceeeeeeeeec---------hhhhHHHHHHHHHHHHHHHHhhhhhhcccccccccccccceeeccccccccccccc Confidence 33 11111 22222 456679999999999999966 8999999998855442 45556677777 Q ss_pred eeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 
70 AAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 70 ~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) +..+.+.+.++.++++.+++.+.+.++++ |+.++|||+|+.|+++|++++|++++++||+|++|+.|+..+.+.+. T Consensus 72 ~~~~~~d~~~dtp~~~~~~~~r~~~~~d~-~~~~~VDd~D~~k~~~D~~~~~~~~~a~AL~R~~D~~I~~a~~g~a~--- 147 (322) T protein:vir:10 72 SRQQSADGTYPTPVNNKPFAKRRTNVDTY-DTGHVVEQEDISQMLLDPNSALITSQAYAMARKTDDLIIAGAWKPAS--- 147 (322) T ss_pred ccccccCcccCCCccccccceEEEeeccc-ccceecchHHHHHhhcCchHHHHHHHHHHhhhHHHHHHHhhhhcccc--- Confidence 77666666666677788889988877776 88899999999999999999999999999999999988754433221 Q ss_pred ccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCC-CEEEeCHHHHHHHhcchhhhhhhhc Q lcl|NC_015249. 150 ASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSAD-RVFYTTPDNYSAILAALMPNAANYQ 228 (347) Q Consensus 150 ~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~g-R~~vv~P~~~~~Ll~~~~~~~~~~~ 228 (347) .+.+++.+....+.... + .+....+++|++|+++|+|++||+++ ||+||+|++|++||++++|++.||. T Consensus 148 ------~~~~gt~v~~~ss~~i~--~--g~~g~t~~kl~~a~~~l~~~dvp~d~~R~~vv~p~~~~~LL~d~~~ts~D~~ 217 (322) T protein:vir:10 148 ------IKGTGQPVEFLATQEIG--D--GTKPISFDYVTEITERFLENEIEPEVSKVIVIGPTQARKLLQITEATSADYT 217 (322) T ss_pred ------ccccccccccCCCcccc--c--CccchhHHHHHHHHHHHHhcCCCCCCCeEEEeCHHHHHHHhcchhhhhhhcc Confidence 11111111111111000 0 01122467899999999999999875 9999999999999999999999999 Q ss_pred ccccc-ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcc Q lcl|NC_015249. 229 ALIDP-STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKD 307 (347) Q Consensus 229 ~~~~~-~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~ 307 (347) +...+ ++|.|++++||+|++||+||..+.+.+..+... ..+... 
..+++||++|+++++.++ T Consensus 218 ~~~~l~~~G~ig~~lGf~~i~s~~lp~~~~t~~~~~~~~----------------~~~~~~-~~~~a~~k~Av~~a~~~d 280 (322) T protein:vir:10 218 SAMDLQSKGIITNWMGYTWIVSTRLDKFDPTQWGMAAED----------------GPQGDE-IWCIAMTDMALGYHSCKD 280 (322) T ss_pred cchhhhhcCeeeeeeeEEEEEeccCCccccccccccccC----------------CCCccc-eeEEEEecCceeEEEeee Confidence 88777 679999999999999999997665433222111 111222 235799999999999999 Q ss_pred eeeeeeechhh-hcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 308 MALERARRANF-QADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 308 ~~~e~~~d~~~-~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++++.+++|.. +.|.|.++++||++.++|+.+++|...-+ T Consensus 281 v~~~i~~~~~~~~a~~I~~~~~~Ga~ri~~~gVv~i~~~e~ 321 (322) T protein:vir:10 281 IWTKVAEDPSASFAWRIYSAFTADCVRVEDEHIFKLRLKNS 321 (322) T ss_pred eeEEeeccCCcchhhhhhhhhhhCceEeccCcEEEEEEecc Confidence 99999988885 48999999999999999999999999988 No 24 >protein:vir:7990 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:151 # MgeName: Che8 # Cross-refs: genbank:acc:NP_817344;genbank:gi:29565772;genbank:GeneID:1258978 Probab=100.00 E-value=6.1e-54 Score=312.33 Aligned_cols=266 Identities=18% Similarity=0.160 Sum_probs=219.2 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcccccc---cccccceEEEeecCcceeeeee-c Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLVR---SIQSGKSAQFPVLGRTKAAYLQ-P 75 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~r---~i~~G~tv~i~~iG~~~~~~~~-~ 75 (347) ||+. +|+ |+|+++|++.|++.+++.++++.. 
....|+|++||++|..++.+++ + T Consensus 1 MA~~---------------------~~~pei~~~~v~~~~~~~lv~~~l~~~~~~~~~~~GdTv~ip~~~~~~~~d~~~~ 59 (273) T protein:vir:79 1 MAFN---------------------NFIPELWSDMLLEEWTAQTVFANLVNREYEGIASKGNVVHIAGVVAPTVKDYKAA 59 (273) T ss_pred Ccch---------------------hhhHHHHHHHHHHHHHhhccchhhhhccccccccCCcEEEEeecCcccccccccC Confidence 5552 255 999999999999999988876443 3346999999999999987665 4 Q ss_pred CCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 76 GENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 76 g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) |..++ .+++++++++++||+.+++++.|+|+|+.|+++|++ +++++++++||+.+|+.++..++.+. ... T Consensus 60 ~~~~~--~~~~~~~~~~~tid~~~~~~~~i~d~d~~~~~~~~~-~~~~~~~~ala~~vD~~i~~~~~~a~-------~~~ 129 (273) T protein:vir:79 60 GRQTS--ADAISDTGVDLLIDQEKSIDFLVDDIDRVQVAGSLE-AYTRAGATALATDTDKFIADMLVDNG-------TAL 129 (273) T ss_pred CCccC--ccccccceEEEEEeeecccceeeccHHHHhhcccHH-HHHHHHHHHHHHHHHHHHHHHHhhcc-------ccc Confidence 55543 467899999999999999999999999999999997 59999999999999999887664321 000 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchh-hhhhhhcc-cccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALM-PNAANYQA-LIDP 233 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~-~~~~~~~~-~~~~ 233 (347) + . .....+.++++.|++|+.+|++++||.+|||+||+|++|+.||+++. 
+.+.++.+ ...+ T Consensus 130 ~---------~--------~~~~~~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~Ll~~~~~~~~~~~~~~~~~l 192 (273) T protein:vir:79 130 T---------G--------SAPSDADDAFDLIASALKELTKANVPNVGRVVVVNAEMAFWLRSSGSKLTSADTSGDAAGL 192 (273) T ss_pred c---------c--------ccccchhhHHHHHHHHHHHhhhccCCccCcEEEECHHHHHHHhhchhhhhhhhhcccccce Confidence 0 0 01122456789999999999999999999999999999999999875 56677765 4568 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) ++|.||+++||+||+||++|.... ...+++|++|++.++.+. ++|.. T Consensus 193 ~~G~ig~~~G~~i~~s~~lp~~~~--------------------------------~~~~a~~~~A~~~a~~~~-~~e~~ 239 (273) T protein:vir:79 193 RAGTIGNLLGARIVESNNLRDTDD--------------------------------EQFVAFHPSAAAYVSQID-TVEAL 239 (273) T ss_pred eeeEeeEEeceEEEecccccccCc--------------------------------eEEEEEeccceeeeeehh-hhhcc Confidence 999999999999999999995321 123688999999998765 89999 Q ss_pred echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 314 RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |++.+++|+|.++++||++++|||++++|....+ T Consensus 240 r~~~~~~~~v~~~~~yg~~v~~p~~vv~~~~~g~ 273 (273) T protein:vir:79 240 RDQDSFSDRIRALHVYGGKVVRPTGVVVFNKTGS 273 (273) T ss_pred cCcccceeeeeeeeeeeeEEecCceEEEEeccCC Confidence 9999999999999999999999999998777766 No 25 >protein:vir:1781 Length: 221 # NCBI annotation: minor capsid protein # Family: family:all:975 # MgeID: mge:38 # MgeName: P60 # Cross-refs: genbank:acc:NP_570347;genbank:gi:18640506;genbank:GeneID:932719 Probab=100.00 E-value=2.2e-54 Score=314.71 Aligned_cols=217 Identities=21% Similarity=0.285 Sum_probs=170.7 Q ss_pred EEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccCcceeeccccccccc Q lcl|NC_015249. 
95 IDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSELRG 174 (347) Q Consensus 95 ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~ 174 (347) ||++++++|+|||+|++|++||+|+++++|+||+||+++|++|+++++++++...+....+ ++....+.. T Consensus 1 iD~lL~a~~~VdDiD~aqa~~dvr~e~t~e~G~ALA~~~D~~i~~~~~~aA~~~~p~~~~~----~g~~~~~~a------ 70 (221) T protein:vir:17 1 MDDLLVASQFVYDLDEILAQWNTRSEISKQIGEALAIHYDERIARVLASASIAAAPVTGQD----GGFSVNIGA------ 70 (221) T ss_pred CCcchhHHHHHHhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcCcccccc----cCcceeccc------ Confidence 9999999999999999999999999999999999999999999999998887655544332 222222221 Q ss_pred chhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhc--chhhhhhhhcc-ccccccc-eEEEEeceEEEEec Q lcl|NC_015249. 175 DQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILA--ALMPNAANYQA-LIDPSTG-SIRNVMGFEVIEVP 250 (347) Q Consensus 175 ~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~--~~~~~~~~~~~-~~~~~~G-~Vg~i~G~~V~~sn 250 (347) ..+..++++|+.|++|+++|+|+|||++|||+||+|++|+.||+ ++++.++++.+ .+.+++| .|++++||+||+|| T Consensus 71 ~~t~~~~~l~dai~~a~~~LdekdVP~~gR~~vv~P~~y~~LL~~~d~~~~n~d~~~s~g~~~~g~~i~~v~G~~V~~Sn 150 (221) T protein:vir:17 71 GNTNNAQAIVDGFFEAAAVLDERSAPMDGRVAVLSPRQYYSLISSVDTNILNREIGNTQGDMNTGKGLYVNAGIRIYKSN 150 (221) T ss_pred cccCCHHHHHHHHHHHHHHHhhcCCCCCCCEEEeCcHHHHHHHHhcCcceeeeecccccccccccceeeeecCcEEEEec Confidence 12345788999999999999999999999999999999999987 47788998875 4557788 49999999999999 Q ss_pred ceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhhcceeeeeeeec Q lcl|NC_015249. 251 HLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQADQIIAKYAMG 330 (347) Q Consensus 251 ~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G 330 (347) |+|..+++++...++. .+...+..++|+++|+|++||+|||+||||||++.+-+. 
-| ++...+ T Consensus 151 nlP~~~gt~~~~~ag~------~~~~~~~~~~yr~~fs~~~glv~~~~Avgtvkl~~~~~~---~~-----~~~~~~--- 213 (221) T protein:vir:17 151 VLASLYGTNLVTDPGD------ATTSGENNGSYRPAITDRAGLVFHKEAADTVEVLLPPSR---PP-----LVISMF--- 213 (221) T ss_pred cCCcccccccccCCcc------ccccccccccccccccceEEEEEcchheeeeeeecCCCC---Cc-----eeeeee--- Confidence 9999877765433322 234455677999999999999999999999999987432 22 222222 Q ss_pred ccccccceE Q lcl|NC_015249. 331 HGGLRPEAC 339 (347) Q Consensus 331 ~~~~Rpe~a 339 (347) .++|||-- T Consensus 214 -~~~~~~~~ 221 (221) T protein:vir:17 214 -SIRRPDRR 221 (221) T ss_pred -eccCCCCC Confidence 23455443 No 26 >protein:vir:80930 Length: 278 # NCBI annotation: Cps # Family: family:all:522 # MgeID: mge:1886 # MgeName: A500 # Cross-refs: genbank:acc:YP_001468392;genbank:gi:157324966;genbank:GeneID:5601363 Probab=100.00 E-value=2e-42 Score=249.17 Aligned_cols=270 Identities=15% Similarity=0.127 Sum_probs=221.2 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhccccc-ccc--cccceEEEeecCcc-eeeeeec Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLV-RSI--QSGKSAQFPVLGRT-KAAYLQP 75 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~-r~i--~~G~tv~i~~iG~~-~~~~~~~ 75 (347) |||+++ +... +|+ |+|+.+|.+.|.+..++.++... +++ +.|++++||+++.. .++++.. T Consensus 1 Ma~~~T------~~~~---------~iiPev~s~~v~~~~~~~~v~~~~~~~~~~l~g~~G~tv~ip~~~~~g~a~~~~~ 65 (278) T protein:vir:80 1 MADLTT------KLAN---------LIDPEVMGPMISAKLPKAIKFGKIAPIDNSLEGQPGSEITVPKYKYIGDAQDVAE 65 (278) T ss_pred CCCcce------ehhh---------eecHHHHHHHHHHHHHHhhhhcccceecccccCCCCCEEEEeeeccCCcceeecC Confidence 999653 3311 355 99999999999988888877643 444 35999999997754 4677888 Q ss_pred CCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 
76 GENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 76 g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) |++++ +++++.++.+++|++.. ..|.|+|++..++..|++++++++++++|+++.|+.++..+.++. ... T Consensus 66 g~~i~--~~~lt~~~~~~~i~~~~-~a~~v~D~~~~~~~~d~~~~~~~~~a~~~a~~~d~~l~~~l~~a~-------~~~ 135 (278) T protein:vir:80 66 GAAID--YSALETESVKHGIKKAG-KGVKLTDESVLSGYGDPVEEAQKQIRMAIASKVDNDILEEALTTT-------LEV 135 (278) T ss_pred CCcCc--ccccccceeeEeeehhh-ccccccHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHHhccc-------ccc Confidence 98875 36899999999999975 589999999999999999999999999999999999987764311 000 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhcccccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDP 233 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~ 233 (347) . .+ .........++.|.++..+|++.++|. .|+++|+|++|+.|+++. +|+.....+++.+ T Consensus 136 ~-----------~~-----~t~~~~~~~~~~~~da~~~l~~~~~~~-~~~ivv~p~~~~~L~k~~~~~~~~~~~~g~~~~ 198 (278) T protein:vir:80 136 K-----------GA-----INIGLIDKIENTFTDAPDAIEDESITT-TGVLFLNYKDTAKLREEAAGSWTKASQLGDDLL 198 (278) T ss_pred c-----------cc-----cccchhhhHHHHHHHHHHhhcccCCCc-ccEEEECHHHHHHHHhhhhhhccccccccccce Confidence 0 00 011123456888999999999999996 678999999999999875 6776667777789 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) ++|.|++++||+||+||++|.. .+.+|++.|+++++.+++++|.. 
T Consensus 199 ~~G~ig~~~G~~Vi~s~~~p~~-----------------------------------t~~l~~~gAi~~~~~~~~~vE~~ 243 (278) T protein:vir:80 199 VKGAFGELLGWEIVRTKKLADG-----------------------------------NALAVKAGALKTFLKRNLLAESG 243 (278) T ss_pred eeccceeecceeEEEcCCCCcc-----------------------------------eEEEEeccceeeeecCCcccccc Confidence 9999999999999999999831 23588999999999999999999 Q ss_pred echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 314 RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ||+.++.|.|.+++.||++++||++++.|...++ T Consensus 244 Rd~~~~~d~i~~~~~yg~~v~~~~~~v~it~~a~ 277 (278) T protein:vir:80 244 RDMDHKLTKFNADQHYAVALVDETKAVKVVPVAG 277 (278) T ss_pred cchhhccceeeeeeEEEEEEEcCcceEEEeeccC Confidence 9999999999999999999999999998888777 No 27 >protein:vir:107120 Length: 329 # NCBI annotation: conserved phage protein # Family: family:all:701 # MgeID: mge:1571 # MgeName: CNPH82 # Cross-refs: genbank:acc:YP_950606;genbank:gi:119953686;genbank:GeneID:4643129 Probab=100.00 E-value=1.5e-41 Score=244.46 Aligned_cols=284 Identities=11% Similarity=-0.001 Sum_probs=221.4 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcc--cccccccccceEEEeecCcceeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNK--HLVRSIQSGKSAQFPVLGRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~--~~~r~i~~G~tv~i~~iG~~~~~~~~~g~ 77 (347) .-|.+|-... .-+|..+.+-.+-.+-+ |+|.+.+++.|...++.... 
.+.....+|++|+||+++.+.+.+|++++ T Consensus 16 ~~~~~~~~~~-~~~~~~~~~~~~nt~~l~~k~~~~LD~~~~~~~~s~~~~~N~~~e~~~g~tVkIp~i~~~gl~DY~R~~ 94 (329) T protein:vir:10 16 IKNATGKLKL-NLQHFANKSVEPGDTLLKNKHVGILEKVTAANSYSAPAVISNDAIFMQGRSFTVIKGDVTELKDYKRNA 94 (329) T ss_pred hhcccceeEE-ehhhhcCCccCCchhHHHHHHHHHHHHHHHhhceeeeeecccceeeccCcEEEEeeecccccccccCCC Confidence 3344443332 23445555555444444 99999999999988765543 33345678999999999999999999887 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhh--HHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDV--RSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~--r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) .. ...+++.+..+++||+.+||.|.||++|..|++.++ ...+.+.+.+.++..+|.+.+..++..+. T Consensus 95 g~--~~g~vt~~~~t~tidqdR~~~F~VD~~D~dEtn~~l~a~~i~~~~~~~~v~pEiDay~~skla~~a~--------- 163 (329) T protein:vir:10 95 TN--EFDHPQIQETTYFLDQEKYWGRFVDALDRRDTEGNIDINYVVAKQASEVVAPYLDNLRFATLARNKA--------- 163 (329) T ss_pred Cc--cccccccceeEEEeecccceeeecchhhHhhhhhhhhHHHHHHHHHHHHhhhHHHHHHHHHHHhhcc--------- Confidence 65 346789999999999999999999999999999876 34456778999999999988876653210 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPST 235 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~ 235 (347) . 
.......++++|+.|++++++|+|++|| ++||++|+|++|.+|+++++|+.....+...+.+ T Consensus 164 ------~----------~~~~~~t~~nay~~i~~a~~~Lde~~vp-~~Rvl~VtP~~~~~Lk~~~~f~~~~~~~~~~~~~ 226 (329) T protein:vir:10 164 ------K----------HLTVGSGADAQYDAVLDVSVELDEIGAG-ASRILFVTPKFYKGIKKFVIELPQGDNRQQVLGK 226 (329) T ss_pred ------c----------ccccccCHHHHHHHHHHHHHHHHhcCCC-CCcEEEeCHHHHHHHHhhhhhhccccccccceee Confidence 0 0011234678999999999999999999 5999999999999999999998765556667889 Q ss_pred ceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee- Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR- 314 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~- 314 (347) |+|++++||+|+++|+.... +.-.++.|++|+......+ .+|.++ T Consensus 227 g~Vg~idG~~Ii~vps~~~k---------------------------------~in~ii~~~~A~~~~~K~~-~~~~~~p 272 (329) T protein:vir:10 227 GVQGELDGFTIVKVPSKMLQ---------------------------------GVEAMAVIGEVMASPIQAN-EAKLNSN 272 (329) T ss_pred eeeeeecCeEEEEecCCccc---------------------------------ceeEEEEcCCceeeeeeee-eeeeeCC Confidence 99999999999998654320 1224789999999887777 577775 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+.+++|.|..++.||+.++||+..++++..+. 
T Consensus 273 ~~~~~a~~v~gr~yyd~~V~~~k~~~I~~~~~~ 305 (329) T protein:vir:10 273 VPGMFGTLAEQMLYTGAFVPEHLQKYIFTIGGK 305 (329) T ss_pred CCccchheeeeeeeeeeEEEccccCEEEEeccc Confidence 577899999999999999999998887775544 No 28 >protein:vir:97331 Length: 319 # NCBI annotation: ORF011 # Family: family:all:701 # MgeID: mge:1666 # MgeName: 52A # Cross-refs: genbank:acc:YP_240611;genbank:gi:66396278;genbank:GeneID:5133687 Probab=100.00 E-value=1.8e-41 Score=243.98 Aligned_cols=284 Identities=11% Similarity=0.004 Sum_probs=218.4 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcc--cccccccccceEEEeecCcceeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNK--HLVRSIQSGKSAQFPVLGRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~--~~~r~i~~G~tv~i~~iG~~~~~~~~~g~ 77 (347) .-|.+|-...+ -+|..+.+=++-.+.+ |.|++.+++.+...+++..+ .+.....+|++|+||+++.+.+++|++++ T Consensus 5 ~~~~~~~~~~~-~~~~~~~~~~~nt~~l~~k~~~~LD~~~~~~~~s~~~~~N~~~e~~gg~tVkIp~i~~~gl~DY~R~~ 83 (319) T protein:vir:97 5 IKNATGMLKLN-LQHFANKSVEPGQTLLKNKHVGILERVTAVNAYSTPALISNDAIFMEGRSFTVMKGDTTELKDYKRNA 83 (319) T ss_pred cccccceeEee-hhhhhccCCCcchHHHHHHHHHHHHHHHHHhhhhhhcccCcceEeccCcEEEEeeecccccccccCCC Confidence 33444433222 3334443333333344 99999999988877776543 33345678999999999999999999887 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhH--HHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVR--SEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r--~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) ... ..+++.+..+++||+.+||.|.||++|..|++.++. ....+++.+.++..+|.+.+..++..+. 
T Consensus 84 g~~--~g~vt~~~~t~tidqdR~~~F~VD~~D~~Etn~~l~a~~i~~~~~~~~v~PEiDay~~skla~~a~--------- 152 (319) T protein:vir:97 84 TNE--FDHPKIEETTYFLDQEKYWGRFVDALDRKDTEGNIDINYVVARQGAEVVAPYLDNLRFATLARNKA--------- 152 (319) T ss_pred Ccc--cCCcccceeEEEeecccccccccchhhHhhhhchhhHHHHHHHHHHHHhhhhhhHHHHHHHHhhcc--------- Confidence 653 467999999999999999999999999999998873 4456778888888999887766643210 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPST 235 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~ 235 (347) .. .+...+++++|+.|+++.++|+|++|| ++||++|+|++|.+|+++++|+.....++..+.+ T Consensus 153 ------~~----------~~~~~t~~n~y~~i~~a~~~Lde~~VP-~~Rvl~Vtp~~~~~L~~~~~f~~~~~~~~~~~~~ 215 (319) T protein:vir:97 153 ------KH----------LTVGTGSDAQYDAVLDVSVELDEIKAP-ENRVLFVSPTFYKGIKKFVIALPQGDTRQQVLGK 215 (319) T ss_pred ------cc----------cccccCHHHHHHHHHHHHHHHHhcCCC-CCcEEEeCHHHHHHHHhhhhhhccccccccceee Confidence 00 011234678999999999999999999 6999999999999999999999766566667899 Q ss_pred ceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee- Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR- 314 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~- 314 (347) |.|++++||+|+++|+... .+.-.++.|++|+......+ .+|.++ T Consensus 216 g~Vg~idG~~Vi~vps~~~---------------------------------k~in~i~~h~~A~~~~~k~~-~~~~~~p 261 (319) T protein:vir:97 216 GVQGELDGFVIVKVPTKLL---------------------------------QGLQAIAVVGEVLASPIQAD-LAKTNSN 261 (319) T ss_pred eeceeecCeEEEEeccccc---------------------------------ccceEEEEcCCeeeeeeeee-eeeccCC Confidence 9999999999999764321 01224789999998887766 467665 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+.+++|.+..++.||+.++||...++++...+ T Consensus 262 ~~~~~a~~v~gr~y~d~~V~~~k~~~Iy~~~~~ 294 (319) T protein:vir:97 262 IPGMFGTLAEQLLYTGAFVPEHLQKYIFTIGGT 294 (319) T ss_pred CccccceeeeeeeeeeeEEeccccceEEEeecC Confidence 577999999999999999999998888876555 No 29 >protein:vir:94800 Length: 319 # NCBI annotation: ORF012 # Family: family:all:701 # MgeID: mge:1531 # MgeName: 29 # Cross-refs: genbank:acc:YP_240536;genbank:gi:66396203;genbank:GeneID:5133580 Probab=100.00 E-value=1.8e-41 Score=243.98 Aligned_cols=284 Identities=11% Similarity=0.004 Sum_probs=218.4 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcc--cccccccccceEEEeecCcceeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNK--HLVRSIQSGKSAQFPVLGRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~--~~~r~i~~G~tv~i~~iG~~~~~~~~~g~ 77 (347) .-|.+|-...+ -+|..+.+=++-.+.+ |.|++.+++.+...+++..+ .+.....+|++|+||+++.+.+++|++++ T Consensus 5 ~~~~~~~~~~~-~~~~~~~~~~~nt~~l~~k~~~~LD~~~~~~~~s~~~~~N~~~e~~gg~tVkIp~i~~~gl~DY~R~~ 83 (319) T protein:vir:94 5 IKNATGMLKLN-LQHFANKSVEPGQTLLKNKHVGILERVTAVNAYSTPALISNDAIFMEGRSFTVMKGDTTELKDYKRNA 83 (319) T ss_pred cccccceeEee-hhhhhccCCCcchHHHHHHHHHHHHHHHHHhhhhhhcccCcceEeccCcEEEEeeecccccccccCCC Confidence 33444433222 3334443333333344 99999999988877776543 33345678999999999999999999887 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhH--HHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVR--SEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r--~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) ... ..+++.+..+++||+.+||.|.||++|..|++.++. ....+++.+.++..+|.+.+..++..+. 
T Consensus 84 g~~--~g~vt~~~~t~tidqdR~~~F~VD~~D~~Etn~~l~a~~i~~~~~~~~v~PEiDay~~skla~~a~--------- 152 (319) T protein:vir:94 84 TNE--FDHPKIEETTYFLDQEKYWGRFVDALDRKDTEGNIDINYVVARQGAEVVAPYLDNLRFATLARNKA--------- 152 (319) T ss_pred Ccc--cCCcccceeEEEeecccccccccchhhHhhhhchhhHHHHHHHHHHHHhhhhhhHHHHHHHHhhcc--------- Confidence 653 467999999999999999999999999999998873 4456778888888999887766643210 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPST 235 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~ 235 (347) .. .+...+++++|+.|+++.++|+|++|| ++||++|+|++|.+|+++++|+.....++..+.+ T Consensus 153 ------~~----------~~~~~t~~n~y~~i~~a~~~Lde~~VP-~~Rvl~Vtp~~~~~L~~~~~f~~~~~~~~~~~~~ 215 (319) T protein:vir:94 153 ------KH----------LTVGTGSDAQYDAVLDVSVELDEIKAP-ENRVLFVSPTFYKGIKKFVIALPQGDTRQQVLGK 215 (319) T ss_pred ------cc----------cccccCHHHHHHHHHHHHHHHHhcCCC-CCcEEEeCHHHHHHHHhhhhhhccccccccceee Confidence 00 011234678999999999999999999 6999999999999999999999766566667899 Q ss_pred ceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee- Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR- 314 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~- 314 (347) |.|++++||+|+++|+... .+.-.++.|++|+......+ .+|.++ T Consensus 216 g~Vg~idG~~Vi~vps~~~---------------------------------k~in~i~~h~~A~~~~~k~~-~~~~~~p 261 (319) T protein:vir:94 216 GVQGELDGFVIVKVPTKLL---------------------------------QGLQAIAVVGEVLASPIQAD-LAKTNSN 261 (319) T ss_pred eeceeecCeEEEEeccccc---------------------------------ccceEEEEcCCeeeeeeeee-eeeccCC Confidence 9999999999999764321 01224789999998887766 467665 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+.+++|.+..++.||+.++||...++++...+ T Consensus 262 ~~~~~a~~v~gr~y~d~~V~~~k~~~Iy~~~~~ 294 (319) T protein:vir:94 262 IPGMFGTLAEQLLYTGAFVPEHLQKYIFTIGGT 294 (319) T ss_pred CccccceeeeeeeeeeeEEeccccceEEEeecC Confidence 577999999999999999999998888876555 No 30 >protein:vir:96123 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1602 # MgeName: 37 # Cross-refs: genbank:acc:YP_240078;genbank:gi:66395742;genbank:GeneID:5133103 Probab=100.00 E-value=2.6e-41 Score=243.12 Aligned_cols=263 Identities=15% Similarity=0.172 Sum_probs=219.2 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcccccc-cc--cccceEEEeecCc-ceeeeeec Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLVR-SI--QSGKSAQFPVLGR-TKAAYLQP 75 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~r-~i--~~G~tv~i~~iG~-~~~~~~~~ 75 (347) |||.++ +.. + +++ |+|+..|.+.|.+..++.++.... ++ +.|++++||+++. ..++.+.. T Consensus 1 ma~~~T------~~~--------d-~i~Pev~s~~v~~~~~~~~~~~~~~~~~~~l~g~~G~tv~ip~~~~~g~~~~~~~ 65 (274) T protein:vir:96 1 MAQGTT------KVS--------N-LIVPEVLAPMMQAELDKKLRFAQFADIDSTLVGQPGDTLTFPAFTYSGDAQVIAE 65 (274) T ss_pred CCcccc------chh--------h-hhhhHHHHHHHHHHHHhhhhhcccccccccccCCCCCEEEEEeeccCCCccccCC Confidence 998773 321 1 344 999999999998888888877653 23 3599999999874 36778888 Q ss_pred CCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 76 GENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 76 g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) |++++ .+++..++.+++|++. +..|.|+|++..++..|++++++++++++|++++|+.++..+.++. 
T Consensus 66 g~~i~--~~~it~~~~~~~i~~~-~~~~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~~d~~i~~~l~~a~---------- 132 (274) T protein:vir:96 66 GEKIP--VDQIGTSKREAKVRKI-GKGTELTDEAVLSGFGDPQGEAVRQHGLAIANKVDNDVLEALKGAT---------- 132 (274) T ss_pred CCcCc--hhhcccceeEEEEEee-eceeeecHHHHHhhcchHHHHHHHHHHHHHHHHHHHHHHHHHhcCC---------- Confidence 99885 4679999999999885 7899999999999999999999999999999999999876553210 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhcccccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDP 233 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~ 233 (347) ..+ .+ ....|+.|++|..+|+++++ ++||++|+|++|+.|+++. +|+.....+++.+ T Consensus 133 --------~~~--~~---------~~~~~d~i~dA~~~l~d~~~--~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~~~ 191 (274) T protein:vir:96 133 --------LTV--EA---------DITKLDGLQTAIDKFNDEDL--EPMVLFVNPLDAGGLRTSASDNFTRPTQLGDNII 191 (274) T ss_pred --------CCc--Cc---------ccccHHHHHHHHHHhcccCC--CceEEEeCHHHHHHHHhcccccccccccccccce Confidence 000 00 01137889999999999886 6899999999999999874 6776666677889 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) ++|.|++++||+|++||++|.. .+.+|++.|+++++.+++++|.. T Consensus 192 ~~g~ig~~~G~~Vi~s~~~p~~-----------------------------------t~~l~~~gA~~~~~~~~~~vE~~ 236 (274) T protein:vir:96 192 VKGAFGEALGAVIVRSNKLNKG-----------------------------------EALLAKKGAVKLITKRDFFLEKD 236 (274) T ss_pred eecccceecCeeEEEcCCCCcc-----------------------------------eEEEEeCcceeeeecCCcccccc Confidence 9999999999999999999831 13588999999999999999999 Q ss_pred echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
314 RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ||+.++.|.|.+++.||++++||++++++....| T Consensus 237 Rd~~~~~d~i~~~~~yg~~~~~~~~vv~~t~~~~ 270 (274) T protein:vir:96 237 RDASRKSTALYSDKHYVAYLYDESKVVKITKGAG 270 (274) T ss_pred cchhhcccEEEEeeEEEEEEEcCccEEEEEcCcc Confidence 9999999999999999999999999999988888 No 31 >protein:vir:108303 Length: 418 # NCBI annotation: hypothetical protein # Family: family:all:1412 # MgeID: mge:2007 # MgeName: BA3 # Cross-refs: genbank:acc:YP_001552282;genbank:gi:160700607;genbank:GeneID:5758819 Probab=100.00 E-value=6.9e-40 Score=235.28 Aligned_cols=299 Identities=16% Similarity=0.108 Sum_probs=208.8 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc---cc-cccceEEEeecCcceeeeeecC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR---SI-QSGKSAQFPVLGRTKAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r---~i-~~G~tv~i~~iG~~~~~~~~~g 76 (347) ||-.+- ++. =.|+|+.++++.|+++.++.+++... .+ +.|++|+||+.+..+++++ T Consensus 1 m~~~~N--~~l---------------tp~iia~~~l~~l~~~lV~~~lv~r~y~~e~~~~GDTV~I~vp~~~~v~dg--- 60 (418) T protein:vir:10 1 MAVQDN--NLL---------------TDDVIAKEALRLLKNNLVMAKCVYRNYEKTFGKVGDTIRLKLPYRVKSASG--- 60 (418) T ss_pred CCcccc--ccc---------------cHHHHHHHHHHHHHHhccchhhhcCCCchHHhhCCCEEEEeeCCceeeccc--- Confidence 665441 111 13699999999999999988777652 22 3499999999999988764 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) .++ +++++..++++|+||+.+|++|.|+|.|++|...|++++++++++++||+.+|+.++..+....+ . 
T Consensus 61 ~~~--~~~~~te~~v~l~id~~k~~~~~itD~e~a~~~~d~~~~~l~~A~~aLA~~vD~~ia~l~~~a~~---------~ 129 (418) T protein:vir:10 61 RTL--VKQPMVDQTIPFKIAYQEHVGLEYTVKDKTLDIMQFSERYLKSGMVQIANQIDRSLALTLKKAFH---------S 129 (418) T ss_pred CCc--cccccccceEEEEEecccccceeechHHHhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHhhccc---------c Confidence 344 34678899999999999999999999999999999999999999999999999998765432110 0 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCC-CEEEeCHHHHHHHhcchhhhhhhhcccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSAD-RVFYTTPDNYSAILAALMPNAANYQALIDPST 235 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~g-R~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~ 235 (347) +++... ....|+.|++++.+|++++||.+| ||+||+|++|+.|++++++..........+++ T Consensus 130 ---------~gt~gt--------~~~~~~~i~~a~~~Ld~~~VP~~G~R~lVv~P~~~~~L~~~~~~~~~~~~~~~~lr~ 192 (418) T protein:vir:10 130 ---------SGTPGV--------RPGAFIDFANAGAKQTTYAVPQDGMRHAVLDPFTCASLSDEVTKLFKESMVEQAYKM 192 (418) T ss_pred ---------cccCCc--------CcchHHHHHHHHHHHHhcCCCCCCceEEEeCHHHHHHHhhhccccccccccchhhhe Confidence 010111 012378899999999999999985 99999999999999988776444455667999 Q ss_pred ceEEEEeceEEEEecceecccccccc---cccccccccc-------------------cccccc---------------- Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHLTAGGAGEDR---PEEGANPTGQ-------------------KHAFPE---------------- 277 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~lp~~~~~~~~---~~~~~~~~~~-------------------~~~~~~---------------- 277 (347) |.||+++||+||+|||+|....++.. ...+...++. .+.+.. T Consensus 193 G~IG~i~GF~V~~S~nip~~tag~~~~t~~v~ga~~~~~~~~~~~~t~s~~g~l~~Gd~~ti~gv~~v~~~t~~~~~~~~ 272 (418) T protein:vir:10 193 GYRGNVAAYEVYESQNLPKHTVGDHGGTPLVNGTVVNGDTVGFDGGTASTTGFLKAGDVITFGGVFGVNPQNYETTGLLQ 272 (418) T ss_pred eeeeeeeceEEEEecCCCcccccccccceeeecccccceeEEEeecceeeccceeeccEEEECceeecccccccccccce Confidence 99999999999999999964433211 1111111111 111110 Q ss_pred ccc------------cccc-------------------------------cc------------ccceEEEEechhhhhh Q lcl|NC_015249. 
278 TSS------------GDTR-------------------------------VA------------LDNVVGLFNHRSAVGT 302 (347) Q Consensus 278 ~~~------------~~y~-------------------------------~~------------~~~~~~l~~~~~Av~~ 302 (347) +.. ...+ ++ .+-..-|+||++|+.. T Consensus 273 ~f~V~~~~~~~~~~~~tv~i~p~~~~~~~~~~~~~~~~~~~~~~~~v~a~~a~~~~it~~~~a~~~~~~nl~f~~~a~~l 352 (418) T protein:vir:10 273 EFVVLEDVDTDAGGAGSIKISPSLNDGTATINNENGDPVSLTAYQNVTALPADNAPITVLGAANTTYEQNYLFHRDAIAL 352 (418) T ss_pred EEEEEeeccccccCcceeEeccccccccccccccccccccccCCCcccccccCcceeeeecccccceeeeeeeecceEEE Confidence 000 0000 00 0011238999997754 Q ss_pred hhh--------------------cceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 303 VKL--------------------KDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 303 v~~--------------------~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.. +.+.+-..||...+-+.++.-.-||...+|||.++.|.=++| T Consensus 353 ~~~~l~~p~g~~~~~~~~~~~~G~s~r~~~~~d~~~~~~~~r~d~l~g~~~~~p~~~~~~~g~~~ 417 (418) T protein:vir:10 353 AMIDLELPQSAVIKSRAADPETGLSLTLTGAYDINEQSEIHRIDAVWGADMIYGELALRLWGAAS 417 (418) T ss_pred EEeeccCCCCCCcceEEEeccCCeEEEEEEcccccccceEEEEEeecCceeecccceEEEEeecC Confidence 322 112222337777777888888899999999999877666666 No 32 >protein:vir:93742 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1475 # MgeName: 55 # Cross-refs: genbank:acc:YP_240459;genbank:gi:66396126;genbank:GeneID:5133511 Probab=100.00 E-value=1e-39 Score=234.35 Aligned_cols=264 Identities=16% Similarity=0.144 Sum_probs=218.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccc-cccc--ccceEEEeecCc-ceeeeeecC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLV-RSIQ--SGKSAQFPVLGR-TKAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~-r~i~--~G~tv~i~~iG~-~~~~~~~~g 76 (347) |||.+ |+.... +.-|+|+.+|.+.+.+..++.++... .++. .|++++||+++. 
..++.+..| T Consensus 1 ma~~~------T~~~~~--------iiPev~~~~v~~~~~~~~~~~~~~~~~~~l~g~~G~tv~ip~~~~~g~~~~~~eg 66 (274) T protein:vir:93 1 MPQGI------TKTSNQ--------IIPEVLAPMMQAQLEKKLRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQVVAEG 66 (274) T ss_pred CCccc------eehhhe--------echHHHHHHHHHHHHhhhhhcccccccccccCCCCCEEEEEeeccCCCcccccCC Confidence 99976 344222 34499999999999988888888765 3333 499999999764 367788889 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) ++++. ++++.++.+++|++. ++.|.|+|++..++..|++++++++++++|++++|+.++..+.+.. . T Consensus 67 ~~i~~--~~it~~~~~~~i~~~-~~~~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~~d~~~~~~~~~a~---------~- 133 (274) T protein:vir:93 67 EKIPT--DILETKKREAKIRKI-AKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAK---------L- 133 (274) T ss_pred Ccccc--cccccceeEEEeeee-cccccccHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhccc---------c- Confidence 98854 689999999999885 6899999999999999999999999999999999999987653210 0 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDPS 234 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~~ 234 (347) .+ .++ . ..++.|++|..+|+++++ ++||++|+|++|+.|+++. 
+|+.....++..++ T Consensus 134 --------~~--~~~-----~----~~~d~i~dA~~~l~d~~~--~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~ 192 (274) T protein:vir:93 134 --------TV--NAD-----I----TKLNGLQSAIDKFNDEDL--EPMVLFINPLDAGKLRGDASTNFTRATELGDDIIV 192 (274) T ss_pred --------cc--ccc-----c----cCHHHHHHHHHHhhhccC--CccEEEeCHHHHHHHHhhhhhccccccccccccee Confidence 00 000 0 126788999999999876 6899999999999999985 66766666777889 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|.|++++||+|++||++|. ..+.+|++.|+++++.+++.+|..| T Consensus 193 ~G~ig~~~G~~Vi~s~~~p~-----------------------------------~t~~l~~~gai~~~~~~~~~vE~~R 237 (274) T protein:vir:93 193 KGAFGEALGAIIVRTNKLEA-----------------------------------GTAILAKKGAVKLILKRDFFLEVAR 237 (274) T ss_pred ecccceecCeeEEEcCCCCc-----------------------------------ceEEEEeCCeEEEEecCCccccccc Confidence 99999999999999999982 1235889999999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+.++.|.|.+++.||++++||+.++.+....| T Consensus 238 d~~~~~d~i~~~~~y~~~~~~~~~~v~~t~~~~ 270 (274) T protein:vir:93 238 DASTKTTALYSDKHYVAYLYDESKAVKITKGSG 270 (274) T ss_pred chhhcccEEEEEEEEEEEEEcCCceEEEeeCcc Confidence 999999999999999999999999999998888 No 33 >protein:vir:3613 Length: 272 # NCBI annotation: MHP # Family: family:all:522 # MgeID: mge:74 # MgeName: TP901-1 # Cross-refs: genbank:acc:NP_112699;genbank:gi:13786567;genbank:GeneID:921035 Probab=100.00 E-value=3e-39 Score=231.81 Aligned_cols=267 Identities=17% Similarity=0.155 Sum_probs=217.4 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc-ccc--ccceEEEeecCcc-eeeeeecC Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR-SIQ--SGKSAQFPVLGRT-KAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r-~i~--~G~tv~i~~iG~~-~~~~~~~g 76 (347) |||.++ +.. +-+.-|+|+.+|.+.|.+..++.++..+. ++. .|++++||+.+.. ....+..| T Consensus 1 ma~~~T------~~~--------d~iiPev~~~~v~~~~~~~~~~~~~~~~~~~l~g~~G~ti~iP~~~~~gda~~~~eg 66 (272) T protein:vir:36 1 MSKQKT------TLA--------DLVNPEVLAPIVSYELNKALRFAPLAQVDTTLQGQPGNTLKFPAFTYIGDAADVAEG 66 (272) T ss_pred CCCcce------ehh--------hhhchHHHHHHHHHHHHhhhhhccccccccccccCCCCEEEEeeeccCccccccCCC Confidence 998663 321 21345999999999999988888877663 344 4999999997654 34567888 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) ++++ .++++.++.+++|.+. ...|.|+|++..++..|++++++++++++||+++|+.++..+.+.. T Consensus 67 ~~i~--~~~lt~~~~~~~i~~~-~k~~~vtD~~~~~~~~d~~~~~~~~~a~~~a~~~d~~i~~~l~~~~----------- 132 (272) T protein:vir:36 67 GEIS--LDKIGTTTKSVTIKKA-AKGTEITDEAALSGYGDPIGESNKQLGLSLANKVDDDLLSAAKTTS----------- 132 (272) T ss_pred CccC--hhhcCCcceeEeeehh-hccccccHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhcccc----------- Confidence 8885 4679999999999886 5789999999999999999999999999999999998876542110 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhh-hhcccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAA-NYQALIDPST 235 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~-~~~~~~~~~~ 235 (347) ..+ .....++.|.+|..+|.+.++| .||++|+|+.|+.|+++.++... 
++.+...+++ T Consensus 133 -------~~~------------~~~~~~d~i~~A~~~lgd~~~~--~~~ivv~p~~~~~L~k~~~~~~~~~~~~~~~~~~ 191 (272) T protein:vir:36 133 -------QTV------------STKANVDGVQAALDIFNDEDAQ--AYVLIVNPKDAAKIRKDANAKNIGSEVGANALIN 191 (272) T ss_pred -------ccc------------cccccHHHHHHHHHHhhhcCCC--ceEEEEcHHHHHHHhcccccccccccccccceee Confidence 000 0122467899999999999986 68999999999999999988865 4567778999 Q ss_pred ceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeec Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARR 315 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d 315 (347) |.|++++|++|++||++|..++ ....++|.+.|+++...+++++|..|| T Consensus 192 G~ig~~~G~~Vv~s~~~p~~~~-------------------------------~~~~~~~~~gA~~~~~~~~~~vE~~R~ 240 (272) T protein:vir:36 192 GTYADVLGAQIVRSKKLAEGSA-------------------------------LMFKIVSNSPALKLVLKRGVQVETDRD 240 (272) T ss_pred eccceecCeeEEEeCCCCCCce-------------------------------eEEEEEecccceeeeecCCcccccccc Confidence 9999999999999999994321 022357889999999999999999999 Q ss_pred hhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 316 ANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 316 ~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.++.|.|.+++.||++++||++++.+.++-= T Consensus 241 ~~~~~d~i~~~~~y~~~v~~~~~vv~~t~~g~ 272 (272) T protein:vir:36 241 IVTKTTVITADEHYAAYLYDLTKVVNITFTGV 272 (272) T ss_pred hhhcCcEEEEEEEEEEEEEcCccEEEEeecCC Confidence 99999999999999999999999888766655 No 34 >protein:vir:1239 Length: 274 # NCBI annotation: similar to phage B1 major head protein # Family: family:all:522 # MgeID: mge:25 # MgeName: phi ETA # Cross-refs: genbank:acc:NP_510938;genbank:gi:17426272;genbank:GeneID:927376 Probab=100.00 E-value=4.4e-39 Score=230.86 Aligned_cols=264 Identities=16% Similarity=0.148 Sum_probs=218.0 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc-cc--cccceEEEeecCcc-eeeeeecC Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR-SI--QSGKSAQFPVLGRT-KAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r-~i--~~G~tv~i~~iG~~-~~~~~~~g 76 (347) |||.++ +.. +-+.-|+|+.+|.+.+.+..++.+++..- ++ +.|++++||+.+.. .+..+..| T Consensus 1 ma~~~T------~l~--------d~iiPev~~~~v~~~~~~~l~~~~~~~~d~~l~g~~G~tv~iP~~~~ig~a~~~~~g 66 (274) T protein:vir:12 1 MAQGLT------KTS--------NQIIPEVLAPMMQAQLEKKLRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQVVAEG 66 (274) T ss_pred CCccee------ehh--------hhhchHHHHHHHHHHHHhhhhhcccceecccccCCCCCEEEEeeecCCCccccccCC Confidence 998763 331 21344999999999998888888877763 33 35999999986543 56778888 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) ++++ .++++.++.+++|++ .++.|.|+|++..++..|++++++++++++||+++|+.++..+.++.. T Consensus 67 ~~i~--~~~lt~~~~~~~i~~-~~~~~~i~D~~~~~~~~d~~~~~~~q~~~~~a~~vd~~~l~~~~~a~~---------- 133 (274) T protein:vir:12 67 EKIP--TDILETKKREAKIRK-IAKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAKL---------- 133 (274) T ss_pred Cccc--hhhcccceeeEEeee-ecceeeecHHHHHhcccchHHHHHHHHHHHHHHHHHHHHHHHHhcccc---------- Confidence 8885 468999999999988 589999999999999999999999999999999999998866532100 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDPS 234 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~~ 234 (347) .+. .. . ..++.|++|..+|++++. .+||++|+|++|+.|+++. 
+|+...-.+...++ T Consensus 134 --------~~~--~~-----a----~~~d~i~dA~~~lgd~~~--~~~~ivv~p~~~~~L~k~~~~~fv~~s~~g~~~~~ 192 (274) T protein:vir:12 134 --------TVN--AD-----I----TKLNGLQSAIDKFNDEDL--EPMVLFINPLDAGKLRGDASTNFTRATELGDDIIV 192 (274) T ss_pred --------ccc--cc-----c----cCHHHHHHHHHHhccccc--cccEEEeCHHHHHHHHhhhhhhcccccccccccee Confidence 000 00 0 127788999999998874 7899999999999999985 77876666777899 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|.||+++||+||+||++|.. .+.+|++.|++++..+++++|..| T Consensus 193 ~G~ig~~~G~~Vi~s~~~p~~-----------------------------------t~~l~~~gA~~~~~~~~~~vE~~R 237 (274) T protein:vir:12 193 KGAFGEALGAIIVRSNKLEAG-----------------------------------TAILAKKGAVKLILKRDFFLEVAR 237 (274) T ss_pred cccceeecCeeEEEeCCCCcc-----------------------------------eEEEEeccceeeeecCCceecccc Confidence 999999999999999999831 235888999999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+.++.|.|.+++.||+++.||+.++.+....| T Consensus 238 d~~~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~ 270 (274) T protein:vir:12 238 DASTKTTALYSDKHYVAYLYDESKAVKITKGSG 270 (274) T ss_pred chhhcccEEEeeeEEEEEEEcCCceEEEEcCCc Confidence 999999999999999999999999999998888 No 35 >protein:vir:96262 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1612 # MgeName: ROSA # Cross-refs: genbank:acc:YP_240311;genbank:gi:66395978;genbank:GeneID:5133339 Probab=100.00 E-value=3.9e-39 Score=231.14 Aligned_cols=263 Identities=15% Similarity=0.147 Sum_probs=215.9 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhccccc-cccc--ccceEEEeecCcc-eeeeeec Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLV-RSIQ--SGKSAQFPVLGRT-KAAYLQP 75 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~-r~i~--~G~tv~i~~iG~~-~~~~~~~ 75 (347) |||.++ +... +++ |+|+.+|.+.+.+..++.++..+ +++. .|++++||+.... .+..+.. T Consensus 1 m~~~~T------~l~d---------~i~Pev~~~~v~~~~~~~l~~~~~~~~~~~l~g~~G~tv~iP~~~~ig~a~~~~~ 65 (274) T protein:vir:96 1 MAQGMT------KLTN---------QIVPEVLAPMMQAELEKKLRFASFAEIDNTLVGQPGDTLTFPAFIYSGDAKVVAE 65 (274) T ss_pred CCccee------ehhh---------eechHHHHHHHHHHHHhhhhccccceecccccCCCCCEEEeeeecCCCccccccC Confidence 998663 3311 344 99999999999888888888654 3444 4999999986543 5667888 Q ss_pred CCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 76 GENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 76 g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) |++++ .++++.++.+++|++. +..|.|+|+|..++..|++++++++++++||+++|+.++..+.++. T Consensus 66 g~~i~--~~~lt~~~~~~~i~~~-~~a~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~vd~~i~~~l~~a~---------- 132 (274) T protein:vir:96 66 GEKIP--TDILETKKREAKIRKI-AKGTSISDEALLSGYGDPQGEQVRQHGLAHANKVDDDVLEALKSAK---------- 132 (274) T ss_pred CCccc--hhhcccceeEEEeeee-ecceeehHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccc---------- Confidence 88875 4689999999999884 8999999999999999999999999999999999999876653211 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhcccccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDP 233 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~ 233 (347) ..+. .+ ...|+.|++|..+|++.+. 
.+||++|+|++|+.|++++ +|+.....+...+ T Consensus 133 ------~~~~----~~---------~~~~d~i~~A~~~lgd~~~--~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~ 191 (274) T protein:vir:96 133 ------LTVE----AD---------ITKLTGLQTAIDKFNDEDL--EPMVLFISPLDAGKLRGDATTNFTRATELGDDVI 191 (274) T ss_pred ------cccc----cc---------ccCHHHHHHHHHHhccccc--cccEEEeCHHHHHHHHhhccccccccccccccce Confidence 0000 00 0126788999999998874 6899999999999999985 6777666677889 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) ++|.||+++||+||+||++|.. .+.+|++.|+++...+++.+|.. T Consensus 192 ~~G~ig~~~G~~Vi~s~~~~~~-----------------------------------t~~l~~~gA~~~~~~~~~~vE~~ 236 (274) T protein:vir:96 192 VKGAFGEALGAVIVRSNKLEAG-----------------------------------TAILAKKGAVKLITKRDFFLETD 236 (274) T ss_pred eccccceecCeEEEEeCCCCCc-----------------------------------eEEEEeccceeeeecCCcccccc Confidence 9999999999999999998721 23588899999999999999999 Q ss_pred echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 314 RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ||+.++.|.|.+++.||++++||++++.+....+ T Consensus 237 Rd~~~~~d~i~~~~~y~~~~~~~~~~v~~tk~~~ 270 (274) T protein:vir:96 237 RDPSTKTTALYSDKHYVAYLYDESKAVKITKGSG 270 (274) T ss_pred cccccccCEEEEeEEEEEEEEcCCcEEEEEcCCc Confidence 9999999999999999999999999998886666 No 36 >protein:vir:95898 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1588 # MgeName: 71 # Cross-refs: genbank:acc:YP_240385;genbank:gi:66396054;genbank:GeneID:5133409 Probab=100.00 E-value=3.9e-39 Score=231.14 Aligned_cols=263 Identities=15% Similarity=0.147 Sum_probs=215.9 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhccccc-cccc--ccceEEEeecCcc-eeeeeec Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLV-RSIQ--SGKSAQFPVLGRT-KAAYLQP 75 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~-r~i~--~G~tv~i~~iG~~-~~~~~~~ 75 (347) |||.++ +... +++ |+|+.+|.+.+.+..++.++..+ +++. .|++++||+.... .+..+.. T Consensus 1 m~~~~T------~l~d---------~i~Pev~~~~v~~~~~~~l~~~~~~~~~~~l~g~~G~tv~iP~~~~ig~a~~~~~ 65 (274) T protein:vir:95 1 MAQGMT------KLTN---------QIVPEVLAPMMQAELEKKLRFASFAEIDNTLVGQPGDTLTFPAFIYSGDAKVVAE 65 (274) T ss_pred CCccee------ehhh---------eechHHHHHHHHHHHHhhhhccccceecccccCCCCCEEEeeeecCCCccccccC Confidence 998663 3311 344 99999999999888888888654 3444 4999999986543 5667888 Q ss_pred CCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 76 GENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 76 g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) |++++ .++++.++.+++|++. +..|.|+|+|..++..|++++++++++++||+++|+.++..+.++. T Consensus 66 g~~i~--~~~lt~~~~~~~i~~~-~~a~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~vd~~i~~~l~~a~---------- 132 (274) T protein:vir:95 66 GEKIP--TDILETKKREAKIRKI-AKGTSISDEALLSGYGDPQGEQVRQHGLAHANKVDDDVLEALKSAK---------- 132 (274) T ss_pred CCccc--hhhcccceeEEEeeee-ecceeehHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccc---------- Confidence 88875 4689999999999884 8999999999999999999999999999999999999876653211 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhcccccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDP 233 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~ 233 (347) ..+. .+ ...|+.|++|..+|++.+. 
.+||++|+|++|+.|++++ +|+.....+...+ T Consensus 133 ------~~~~----~~---------~~~~d~i~~A~~~lgd~~~--~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~ 191 (274) T protein:vir:95 133 ------LTVE----AD---------ITKLTGLQTAIDKFNDEDL--EPMVLFISPLDAGKLRGDATTNFTRATELGDDVI 191 (274) T ss_pred ------cccc----cc---------ccCHHHHHHHHHHhccccc--cccEEEeCHHHHHHHHhhccccccccccccccce Confidence 0000 00 0126788999999998874 6899999999999999985 6777666677889 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) ++|.||+++||+||+||++|.. .+.+|++.|+++...+++.+|.. T Consensus 192 ~~G~ig~~~G~~Vi~s~~~~~~-----------------------------------t~~l~~~gA~~~~~~~~~~vE~~ 236 (274) T protein:vir:95 192 VKGAFGEALGAVIVRSNKLEAG-----------------------------------TAILAKKGAVKLITKRDFFLETD 236 (274) T ss_pred eccccceecCeEEEEeCCCCCc-----------------------------------eEEEEeccceeeeecCCcccccc Confidence 9999999999999999998721 23588899999999999999999 Q ss_pred echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 314 RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ||+.++.|.|.+++.||++++||++++.+....+ T Consensus 237 Rd~~~~~d~i~~~~~y~~~~~~~~~~v~~tk~~~ 270 (274) T protein:vir:95 237 RDPSTKTTALYSDKHYVAYLYDESKAVKITKGSG 270 (274) T ss_pred cccccccCEEEEeEEEEEEEEcCCcEEEEEcCCc Confidence 9999999999999999999999999998886666 No 37 >protein:vir:94494 Length: 274 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1508 # MgeName: 88 # Cross-refs: genbank:acc:YP_240676;genbank:gi:66396348;genbank:GeneID:5133758 Probab=100.00 E-value=4.8e-39 Score=230.66 Aligned_cols=264 Identities=16% Similarity=0.141 Sum_probs=218.2 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc-ccc--ccceEEEeecCcc-eeeeeecC Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR-SIQ--SGKSAQFPVLGRT-KAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r-~i~--~G~tv~i~~iG~~-~~~~~~~g 76 (347) |||.+ |+.... +.-|+|+.+|.+.+.+..++.++.... ++. .|++++||+++.. .++.+..| T Consensus 1 ma~~~------T~~~d~--------iiPev~~~~v~~~~~~~l~~~~~~~~d~~l~g~~G~tv~iP~~~~~g~a~~~~~g 66 (274) T protein:vir:94 1 MPQGL------TKTSDQ--------IIPEVLAPMMQAQLEKKLRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQVVAEG 66 (274) T ss_pred CCccc------eehhhe--------echHHHHHHHHHhhhhhhhhcccceecccccCCCCCEEEEeeecCCCccccccCC Confidence 99865 333221 344999999999998888888887663 333 4999999996643 56778889 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) ++++ .++++.++.+++|++. ++.|.|+|++..++..|++++++++++++|++.+|+.++..+.+.. T Consensus 67 ~~i~--~~~lt~~~~~~~i~~~-~~~~~i~D~~~~~~~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a~----------- 132 (274) T protein:vir:94 67 EKIP--TDILETKKREAKIRKI-AKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAK----------- 132 (274) T ss_pred Cccc--ccccccceeEEEeeee-cceecccHHHHHhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccC----------- Confidence 8885 4689999999999885 6899999999999999999999999999999999999887653210 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDPS 234 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~~ 234 (347) . .+ .+. . ..++.|++|..+|++++. .+||++|+|++|+.|+++. 
+|+...-.+...++ T Consensus 133 -----~--~~--~~~-----~----~~~d~i~dA~~~l~d~~~--~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~ 192 (274) T protein:vir:94 133 -----L--TV--NAD-----I----TKLNGLQSAIDKFNDEDL--EPMVLFVNPLDAGKLRGDASTNFTRATELGDDIIV 192 (274) T ss_pred -----c--cc--ccc-----c----cCHHHHHHHHHHhhccCC--CceEEEeCHHHHHHHHhhhhhhccccCccccccee Confidence 0 00 000 0 126788999999999876 5799999999999999985 77777666777889 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|.||+++||+|++||++|. ..+.+|++.|++.++.+++.+|..| T Consensus 193 ~G~ig~~~G~~Vi~s~~~p~-----------------------------------~t~~l~~~gA~~~~~~~~~~vE~~R 237 (274) T protein:vir:94 193 KGAFGEALGAIIVRTNKLEA-----------------------------------GTAILAKKGAVKLILKRDFFLEVAR 237 (274) T ss_pred ccccceecCeeEEEcCCCCc-----------------------------------ceEEEEeCcceEeeecCCceecccc Confidence 99999999999999999982 1235888999999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+..+.|.|.+++.||++++||+.++.+....| T Consensus 238 d~~~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~ 270 (274) T protein:vir:94 238 DASTKTTALYSDKHYVAYLYDESKAVKITKGSG 270 (274) T ss_pred chhhcccEEEEEEEEEEEEEcCCceEEEecCcc Confidence 999999999999999999999999999998888 No 38 >protein:vir:97433 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1676 # MgeName: 92 # Cross-refs: genbank:acc:YP_240749;genbank:gi:66396420;genbank:GeneID:5133789 Probab=100.00 E-value=4.8e-39 Score=230.66 Aligned_cols=264 Identities=16% Similarity=0.141 Sum_probs=218.2 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc-ccc--ccceEEEeecCcc-eeeeeecC Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR-SIQ--SGKSAQFPVLGRT-KAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r-~i~--~G~tv~i~~iG~~-~~~~~~~g 76 (347) |||.+ |+.... +.-|+|+.+|.+.+.+..++.++.... ++. .|++++||+++.. .++.+..| T Consensus 1 ma~~~------T~~~d~--------iiPev~~~~v~~~~~~~l~~~~~~~~d~~l~g~~G~tv~iP~~~~~g~a~~~~~g 66 (274) T protein:vir:97 1 MPQGL------TKTSDQ--------IIPEVLAPMMQAQLEKKLRFASFAEVDSTLQGQPGDTLTFPAFVYSGDAQVVAEG 66 (274) T ss_pred CCccc------eehhhe--------echHHHHHHHHHhhhhhhhhcccceecccccCCCCCEEEEeeecCCCccccccCC Confidence 99865 333221 344999999999998888888887663 333 4999999996643 56778889 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) ++++ .++++.++.+++|++. ++.|.|+|++..++..|++++++++++++|++.+|+.++..+.+.. T Consensus 67 ~~i~--~~~lt~~~~~~~i~~~-~~~~~i~D~~~~~~~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a~----------- 132 (274) T protein:vir:97 67 EKIP--TDILETKKREAKIRKI-AKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAK----------- 132 (274) T ss_pred Cccc--ccccccceeEEEeeee-cceecccHHHHHhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccC----------- Confidence 8885 4689999999999885 6899999999999999999999999999999999999887653210 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDPS 234 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~~ 234 (347) . .+ .+. . ..++.|++|..+|++++. .+||++|+|++|+.|+++. 
+|+...-.+...++ T Consensus 133 -----~--~~--~~~-----~----~~~d~i~dA~~~l~d~~~--~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~ 192 (274) T protein:vir:97 133 -----L--TV--NAD-----I----TKLNGLQSAIDKFNDEDL--EPMVLFVNPLDAGKLRGDASTNFTRATELGDDIIV 192 (274) T ss_pred -----c--cc--ccc-----c----cCHHHHHHHHHHhhccCC--CceEEEeCHHHHHHHHhhhhhhccccCccccccee Confidence 0 00 000 0 126788999999999876 5799999999999999985 77777666777889 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|.||+++||+|++||++|. ..+.+|++.|++.++.+++.+|..| T Consensus 193 ~G~ig~~~G~~Vi~s~~~p~-----------------------------------~t~~l~~~gA~~~~~~~~~~vE~~R 237 (274) T protein:vir:97 193 KGAFGEALGAIIVRTNKLEA-----------------------------------GTAILAKKGAVKLILKRDFFLEVAR 237 (274) T ss_pred ccccceecCeeEEEcCCCCc-----------------------------------ceEEEEeCcceEeeecCCceecccc Confidence 99999999999999999982 1235888999999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+..+.|.|.+++.||++++||+.++.+....| T Consensus 238 d~~~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~ 270 (274) T protein:vir:97 238 DASTKTTALYSDKHYVAYLYDESKAVKITKGSG 270 (274) T ss_pred chhhcccEEEEEEEEEEEEEcCCceEEEecCcc Confidence 999999999999999999999999999998888 No 39 >protein:vir:99075 Length: 392 # NCBI annotation: gp30 # Family: family:all:10837 # MgeID: mge:1671 # MgeName: Wildcat # Cross-refs: genbank:acc:YP_655895;genbank:gi:109521467;genbank:GeneID:4158040 Probab=100.00 E-value=8.6e-39 Score=229.27 Aligned_cols=286 Identities=12% Similarity=0.056 Sum_probs=184.7 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcccccc---cc--cccceEEEeecCcceeeeee Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLVR---SI--QSGKSAQFPVLGRTKAAYLQ 74 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~r---~i--~~G~tv~i~~iG~~~~~~~~ 74 (347) |||. +|+ |+|+.+++..|++..++..++... .+ +.|++|+|++.+..++.+++ T Consensus 1 Ma~~---------------------~~~p~~~a~~~l~~l~~~lv~~~lv~~~~~~~~~~~~GdtV~i~~~~~~~~~~~~ 59 (392) T protein:vir:99 1 MANA---------------------FSKPTAVVDTAIQMLQNELILTNLVWLNGIGDFAHKFNDTITVRVPAPSRGHTRK 59 (392) T ss_pred Cccc---------------------cccHHHHHHHHHHHHHhhccchhhhccccccccccCCCCeEEEeecccccceeee Confidence 5542 344 899999999999998888877542 45 35999999999999998876 Q ss_pred cCCC---CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccc Q lcl|NC_015249. 75 PGEN---LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSAS 151 (347) Q Consensus 75 ~g~~---~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~ 151 (347) +... -+...+++.+++++++||+.+|++|.|+|.|+.|...|++.++.++++++||+.+|+.++..+..... T Consensus 60 ~~~~~~~~~~~~~~~~~~~~~~~id~~k~~~~~i~d~e~~~~~~~~~~~~~~~a~~ala~~vd~~i~~~~~~a~~----- 134 (392) T protein:vir:99 60 LRGAGAERNLTVSDFTEDSFPVTLTDVAYHLGVLTDEELTFDLESFATQILPRQVRGVADILEEGVRDMIVGAPY----- 134 (392) T ss_pred ccccccCCcccccccccceEEEEEeeeeecceeechHHHhhhhhhhHHHHHHHHHHHHHHHHHHHHHHHHhcccc----- Confidence 4221 12244678899999999999999999999999999999999999999999999999998876542110 Q ss_pred ccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccc- Q lcl|NC_015249. 152 DENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQAL- 230 (347) Q Consensus 152 ~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~- 230 (347) . .. .......+..+|+.|++|+++|+|++||. |||++|+|++|+.|+++++|.+.++.+. 
T Consensus 135 -~-~~----------------~~~~~~~~~~~~~~i~~a~~~L~~~~vP~-~R~~vv~p~~~~~l~~~~~~~~~~~~g~~ 195 (392) T protein:vir:99 135 -E-AA----------------GAVHEVAPDEFFKGVNGARRALNELYIPQ-GRVLVVGTAVTEQILNDDRFIKYESQGQS 195 (392) T ss_pred -c-cc----------------ccccccChhhhHHHHHHHHHHHhhcCCCC-CCEEEEcHHHHHHHhcccceeecccccch Confidence 0 00 00111234567899999999999999996 8999999999999999999998877654 Q ss_pred --cccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcce Q lcl|NC_015249. 231 --IDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDM 308 (347) Q Consensus 231 --~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~ 308 (347) ..+++|.||+++||+||+|+++|...+...+..+.......+ ..+......+ .+.... .+ -. T Consensus 196 ~~~~l~~G~vg~i~G~~v~~s~~~~~~t~~a~~~~a~~~at~a~-v~~~~~~~~~----------s~s~~~--~v---~~ 259 (392) T protein:vir:99 196 AVSALQEARLGRIYGYEIVESTLIPHGDAYLYHPTAFIMATRAP-APPMGAVRST----------AISGDQ--RI---AM 259 (392) T ss_pred hhhhhhcceeeeeeeeEEEeecccccccceeeeccccccccccc-ccccccccee----------EEeccc--ce---ec Confidence 458899999999999999999997654433222111111100 0000000000 000000 00 00 Q ss_pred eeeeeechhhhcceeeeeeeecccccccceEEEEEEc------C------------C Q lcl|NC_015249. 309 ALERARRANFQADQIIAKYAMGHGGLRPEACGALVFN------K------------A 347 (347) Q Consensus 309 ~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~------~------------a 347 (347) ..-..++.....+........|.+...-.+...+... . . 
T Consensus 260 ~~~~~~~~t~~s~~~~v~~~~g~~~v~~~~~~~~~~~~~~~~~~~~v~v~~v~~~~~ 316 (392) T protein:vir:99 260 RWLVDYDSTITSNRSLIDTYFGLKVVEDPNGVGFVRARKIHLIPGSIEVAPEAGANA 316 (392) T ss_pred ceeecccceeeccccccceeEEEEEEeeccccceeeeeeeeeecceeeeeeeecccc Confidence 0011233333333333222223222211111100000 0 0 No 40 >protein:vir:96833 Length: 275 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1642 # MgeName: EW # Cross-refs: genbank:acc:YP_240157;genbank:gi:66395822;genbank:GeneID:5133174 Probab=100.00 E-value=5.8e-39 Score=230.21 Aligned_cols=265 Identities=16% Similarity=0.167 Sum_probs=216.8 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc-ccc--ccceEEEeecCcc-eeeeeecC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR-SIQ--SGKSAQFPVLGRT-KAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r-~i~--~G~tv~i~~iG~~-~~~~~~~g 76 (347) ||..+. |+.. +-+.-|+|+.+|.+.+.+..++.++..+- ++. .|++++||+.... .+..+..| T Consensus 1 ~~~~~~-----T~l~--------d~i~PEv~~~~v~~~~~~~~~~~~~~~~~~~l~g~~G~tv~iP~~~~ig~a~~~~~g 67 (275) T protein:vir:96 1 MALENM-----TKLA--------NMVNPEVLAPMMQAELDKKLKFAQFADIDNTLVGQPGNTITFPAFVYSGDAKVVPEG 67 (275) T ss_pred CCCccc-----chhh--------hhhchHHHHHHHHHHHHHhhhhcccceecccccCCCCCEEEeeeeccCCccccccCC Confidence 655541 3331 11345999999999999999999887653 344 4999999986543 56678888 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) ++++. ++++.++.+.+|.+ .++.|.|+|++..++..|++.+++++++++||+++|+.++..+.++. 
T Consensus 68 ~~i~~--~~lt~~~~~~~i~~-~~~~~~i~D~~~~~~~~d~~~~~~~~~a~~~a~~~d~~ll~~l~~a~----------- 133 (275) T protein:vir:96 68 EEIPI--DLIETKKRQATIRK-IGKGTVLTDEALLSGYGDPKGEAVRQHGLAIANKVDNDVLEALQGAT----------- 133 (275) T ss_pred CCcch--hhcccceeeEEeeh-hcccccccHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhccc----------- Confidence 88854 67999999999977 59999999999999999999999999999999999999886653210 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDPS 234 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~~ 234 (347) ..+ ... ...++.|++|..+|.+.+. .+||++|+|++|+.|+++. +|+..+..+...++ T Consensus 134 -------~~~--~~~---------~~~~d~i~dA~~~lgd~~~--~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g~~~~~ 193 (275) T protein:vir:96 134 -------LKV--EAD---------ITKLAGLQTAIDKFNDEDL--EPMVLFVNPLDAGKLRASATDNFTRATLLGDNVIV 193 (275) T ss_pred -------ccc--ccc---------ccCHHHHHHHHHHhccccC--CccEEEeCHHHHHHHHhccccccccccccccccee Confidence 000 000 0137888999999988764 6899999999999998874 78877777888899 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|.|++++||+||+||++|.. .+.+|++.|++++..+++++|..| T Consensus 194 ~G~ig~~~G~~Vi~s~~~p~~-----------------------------------t~~i~~~gA~~~~~~~~~~vE~~R 238 (275) T protein:vir:96 194 KGAFGEALGAIIVRSNKIKEG-----------------------------------EAILAKRGAVKLITKRDFFLETER 238 (275) T ss_pred ccccceecCeeEEEeCCCCcc-----------------------------------eEEEEeccceeeeecCCccccccc Confidence 999999999999999999731 236788999999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+.++.|.|.+++.||++++||+.++.+.++.| T Consensus 239 d~~~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~ 271 (275) T protein:vir:96 239 HASHKSTALFSDKHYVAYLYDESKVVKITKSAS 271 (275) T ss_pred chhhcCcEEEEeEEEEEEEEcCccEEEEEeccc Confidence 999999999999999999999999999988888 No 41 >protein:vir:174 Length: 423 # NCBI annotation: capsid protein # Family: family:all:1412 # MgeID: mge:5 # MgeName: HK620 # Cross-refs: genbank:acc:NP_112079;genbank:gi:13559869;genbank:GeneID:920999 Probab=100.00 E-value=1.8e-36 Score=216.60 Aligned_cols=301 Identities=13% Similarity=0.085 Sum_probs=205.0 Q ss_pred CCcccccccccccccccccccchhhhh-hhhhhhHHHHHHHHHHhhhcccccc---cc---cccceEEEeecCcceeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALF-LKVFGGEVLTAFTRTSVTMNKHLVR---SI---QSGKSAQFPVLGRTKAAYL 73 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~-ie~f~g~V~~~f~~~s~~~~~~~~r---~i---~~G~tv~i~~iG~~~~~~~ 73 (347) |||.- ..| .++|..++++.|+++.++..++... .+ +.|+||+|++.+..+++.+ T Consensus 1 MaN~l-------------------lT~ip~iia~~al~~l~~~lV~~~lVnr~y~~e~~~~k~GDTV~I~~p~~~~~~~~ 61 (423) T protein:vir:17 1 MPNNL-------------------DSNVSQIVLKKFLPGFMSDLVLAKTVDRQLLAGEINSSTGDSVSFKRPHQFSSLRT 61 (423) T ss_pred Cccch-------------------hhhhHHHHHHHHHHHHHhhcccchhhcccCCcchhhcccCCEEEEeeCCcceeecc Confidence 55432 134 4899999999999999988877653 22 3599999999999999887 Q ss_pred ecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 74 QPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 74 ~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) +.........+++...++.|+||+.+|++|.++|.|+.+..-|+ +++.+.++++||+.+|+.++..+.+.+. 
T Consensus 62 ~~~~~~~~~~~~l~e~~v~l~id~~k~va~~v~d~E~~~~i~~~-~~~l~~A~~aLA~~vd~~ia~~~~~~a~------- 133 (423) T protein:vir:17 62 PTGDISGQNKNNLISGKATGRVGNYITVAVEYQQLEEAIKLNQL-EEILAPVRQRIVTDLETELAHFMMNNGA------- 133 (423) T ss_pred cCcccCCcccCccccceeEEEeeceeeeeeeecHHHHhcChhHH-HHHHHHHHHHHHHHHHHHHHHHHhhccc------- Confidence 65432212346788888999999999999999999999766666 8899999999999999988766543211 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchh-hhhhhhccccc Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALM-PNAANYQALID 232 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~-~~~~~~~~~~~ 232 (347) .. .+...... ..|+.+++++.+|++++||..|||+||+|++|..|+++++ +.+.+..+... T Consensus 134 -~~---------~gt~~t~~--------~a~~~i~~a~~~Ld~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~~~~~~~~a 195 (423) T protein:vir:17 134 -LS---------LGSPNTPI--------TKWSDVAQTASFLKDLGVNEGENYAVMDPWSAQRLADAQTGLHASDQLVRTA 195 (423) T ss_pred -cc---------cccCCccc--------ccHHHHHHHHHHHHhccCCcCCCEEEeChHHHHHHhccccceecccccchHH Confidence 00 01111110 1378899999999999999999999999999999998765 44445556666 Q ss_pred cccceE-EEEeceEEEEecceecccccccccc-----c----ccc----------------------cccccc------- Q lcl|NC_015249. 233 PSTGSI-RNVMGFEVIEVPHLTAGGAGEDRPE-----E----GAN----------------------PTGQKH------- 273 (347) Q Consensus 233 ~~~G~V-g~i~G~~V~~sn~lp~~~~~~~~~~-----~----~~~----------------------~~~~~~------- 273 (347) +++|.| |+++||+||+||++|....++.... . ... ..|... T Consensus 196 lr~g~i~G~i~GFdvy~Snnip~~T~gt~~~t~~~~~~~~v~~~a~~~~~~~~~~~~~~~~~~~g~l~~GD~~t~aGv~~ 275 (423) T protein:vir:17 196 WENAQIPTNFGGIRALMSNGLASRTQGAFGGTLTVKTQPTVTYNAVKDSYQFTVTLTGATTSVTGFLKAGDQVKFTNTYW 275 (423) T ss_pred HhhccceeeecceEEEEeCCCccccccceeceeeecccccccccccccccceeeeeeeeeeeccCceeecceEEecceee Confidence 999987 8999999999999996433332100 0 000 000000 Q ss_pred ------------------cccccc------cccc----cc------------------------------cccceEEEEe Q lcl|NC_015249. 
274 ------------------AFPETS------SGDT----RV------------------------------ALDNVVGLFN 295 (347) Q Consensus 274 ------------------~~~~~~------~~~y----~~------------------------------~~~~~~~l~~ 295 (347) ++.... .+.. .+ ..+-..-|+| T Consensus 276 v~~~tk~v~~~~~t~~~~~~~v~~~~~~~a~~~~tv~i~p~~i~~~~~~~~~~v~a~~a~~~~vT~~~~a~~t~~~nl~~ 355 (423) T protein:vir:17 276 LQQQTKQALYNGATPISFTATVTADANSDSSGDVTVTLSGVPIYDTTNPQYNSVSRQVAAGDAVSVVGTASQTMKPNLFY 355 (423) T ss_pred ecccccccccccccccceEEEEEecccccccCceEEEecCccccccCCcccccceecccCCceeeccccccCCeeEEEEe Confidence 110000 0000 00 0011234799 Q ss_pred chhhhhhh-----------------hhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcC Q lcl|NC_015249. 296 HRSAVGTV-----------------KLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNK 346 (347) Q Consensus 296 ~~~Av~~v-----------------~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~ 346 (347) ||+|+.++ +.+.+.+-..||.+..-..++.-.-||...+|||.++.+.=.- T Consensus 356 ~~~a~~l~~~pl~~~~~~~~~~~~~~g~s~r~~~~~d~~~~~~~~r~d~l~g~~~~~p~~~~~~~g~~ 423 (423) T protein:vir:17 356 NKFFCGLGSIPLPKLHSIDSAVATYEGFSIRVHKYADGDANVQKMRFDLLPAYVCFNPHMGGQFFGNP 423 (423) T ss_pred cCcceEEEEEcccCCCccceeecccCCcEEEEEEecccccceeEEEEEeecceeeeccceEEEEEecC Confidence 99987654 2333334444666555555666677999999999997766555 No 42 >protein:vir:105374 Length: 423 # NCBI annotation: gene 5 protein # Family: family:all:1412 # MgeID: mge:1556 # MgeName: Sf6 # Cross-refs: genbank:acc:NP_958181;genbank:gi:41057283;genbank:GeneID:2716621 Probab=100.00 E-value=3.2e-36 Score=215.17 Aligned_cols=301 Identities=13% Similarity=0.086 Sum_probs=206.6 Q ss_pred CCcccccccccccccccccccchhhhh-hhhhhhHHHHHHHHHHhhhcccccc---cc---cccceEEEeecCcceeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALF-LKVFGGEVLTAFTRTSVTMNKHLVR---SI---QSGKSAQFPVLGRTKAAYL 73 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~-ie~f~g~V~~~f~~~s~~~~~~~~r---~i---~~G~tv~i~~iG~~~~~~~ 73 (347) |||.- ..| .++|..++++.|++..++..++... 
.+ +.|+||+|++.+..++..+ T Consensus 1 MaN~l-------------------lT~~p~iia~~aL~~l~~~lV~~~lVnr~y~~ef~~~k~GDTV~I~~p~~~~~~d~ 61 (423) T protein:vir:10 1 MPNNL-------------------DSNVSQIVLKKFLPGFMSDLVLAKTVDRQLLAGEINSSTGDSVSFKRPHQFSSLRT 61 (423) T ss_pred Cccch-------------------hhhhHHHHHHHHHHHHHhhcccchhhcccCCCcccccccCCEEEEeeCCceeeecc Confidence 55432 134 3899999999999999988777652 23 3599999999999999988 Q ss_pred ecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 74 QPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 74 ~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) +.+..-....+++...++.|+||+.||++|.++|.|+++..-|+ +.+.+++.++||+.+|+.++..+..... . T Consensus 62 ~~~~~~~~~~~dl~e~~v~l~id~~k~va~~v~d~E~~~~i~~~-~~~l~~A~~aLA~~vd~~ia~~~~~~~~-----~- 134 (423) T protein:vir:10 62 PTGDISGQNKNNLISGKATGRVGNYITVAVEYQQLEEAIKLNQL-EEILAPVRQRIVTDLETELAHFMMNNGA-----L- 134 (423) T ss_pred CCccccccccCccccceeEEEeeceeeeeeeechHHHhcChhhH-HHHHHHHHHHHHHHHHHHHHHHHhhccc-----c- Confidence 86421111346788899999999999999999999998766555 8899999999999999998765432210 0 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchh-hhhhhhccccc Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALM-PNAANYQALID 232 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~-~~~~~~~~~~~ 232 (347) . .+.+.... ..|+.+.+++.+|++++||..|||+||+|++|..|+++++ +.+.+..+... 
T Consensus 135 ~-----------~gt~~t~~--------~a~~~i~~a~~~Ld~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~~~~~~~~a 195 (423) T protein:vir:10 135 S-----------LGSPNTPI--------TKWSDVAQTASFLKDLGVNEGENYAVMDPWSAQRLADAQTGLHASDQLVRTA 195 (423) T ss_pred c-----------cccCCccc--------chHHHHHHHHHHHHhccCCcCCCEEEeChHHHHHHhccccceecccccchhh Confidence 0 01111110 1378899999999999999999999999999999997765 44555566677 Q ss_pred cccceE-EEEeceEEEEecceecccccccccc-----cc--------------------------cccccccccccc--- Q lcl|NC_015249. 233 PSTGSI-RNVMGFEVIEVPHLTAGGAGEDRPE-----EG--------------------------ANPTGQKHAFPE--- 277 (347) Q Consensus 233 ~~~G~V-g~i~G~~V~~sn~lp~~~~~~~~~~-----~~--------------------------~~~~~~~~~~~~--- 277 (347) +++|.| |+++||+||+||++|....++.... .+ +...|....++. T Consensus 196 lr~g~i~G~i~GFdv~~Snnip~~T~gt~~~t~~~~~~~~v~~~a~~~a~~~~~~~~~~~~~~~~~l~~GD~~t~aGv~~ 275 (423) T protein:vir:10 196 WENAQIPTNFGGIRALMSNGLASRTQGAFGGTLTVKTQPTVTYNAVKDSYQFTVTLTGATASVTGFLKAGDQVKFTNTYW 275 (423) T ss_pred hhhccceeeecceEEEEeCCCccccccccccceeeeecceeccccccccceeeeeeeeccccccCceeecceEEecceee Confidence 999987 8999999999999996433321100 00 000011111111 Q ss_pred ----------------------c------ccccc----cc------------------------------cccceEEEEe Q lcl|NC_015249. 278 ----------------------T------SSGDT----RV------------------------------ALDNVVGLFN 295 (347) Q Consensus 278 ----------------------~------~~~~y----~~------------------------------~~~~~~~l~~ 295 (347) . ..+.. .+ ..+-..-|+| T Consensus 276 v~~~tk~~~~~~~t~~~~~~~v~a~~~~~~~g~~tv~i~p~~i~~~~~~~~~~v~a~~a~~~~vT~~~~a~~t~~~nl~~ 355 (423) T protein:vir:10 276 LQQQTKQALYNGATPISFTATVTADANSDSGGDVTVTLSGVPIYDTTNPQYNSVSRQVEAGDAVSVVGTASQTMKPNLFY 355 (423) T ss_pred ecccccccccccccCcceEEEEEeeeeeccCCceeeeccCccccccCCcccccccccccCCceeeccccccCCeeEEEEe Confidence 0 00000 00 0012234799 Q ss_pred chhhhhhh-----------------hhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcC Q lcl|NC_015249. 
296 HRSAVGTV-----------------KLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNK 346 (347) Q Consensus 296 ~~~Av~~v-----------------~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~ 346 (347) ||+|+.++ +...+.+-..||.+..-..++.-.-||...+|||.++.+.=.- T Consensus 356 ~~~a~~l~~~pl~~~~~~~~~~~~~~g~s~r~~~~~d~~~~~~~~r~d~l~g~~~~~p~~~~~~~g~~ 423 (423) T protein:vir:10 356 NKFFCGLGSIPLPKLHSIDSAVATYEGFSIRVHKYADGDANVQKMRFDLLPAYVCFNPHMGGQFFGNP 423 (423) T ss_pred cCcceEEEEEcccCCCccceeeccccCceEEEEEeeeccccceEEEEEeecceeeeccceEEEEEecC Confidence 99987654 3334444455777766666777777999999999997766555 No 43 >protein:vir:3525 Length: 423 # NCBI annotation: major head protein # Family: family:all:1412 # MgeID: mge:72 # MgeName: APSE-1 # Cross-refs: genbank:acc:NP_050985;genbank:gi:9633571;genbank:GeneID:1262318 Probab=100.00 E-value=4e-36 Score=214.65 Aligned_cols=301 Identities=14% Similarity=0.101 Sum_probs=207.4 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcccccc---cc---cccceEEEeecCcceeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLVR---SI---QSGKSAQFPVLGRTKAAYL 73 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~r---~i---~~G~tv~i~~iG~~~~~~~ 73 (347) |||. ...|| ++|..+.++.|++..++..++... .+ +.|+||+|++.+..+++++ T Consensus 1 MAN~-------------------llT~iP~iia~~al~~l~~~lV~~~lV~r~y~ge~~~a~~GDTV~I~~p~~~~v~d~ 61 (423) T protein:vir:35 1 MANN-------------------LESNISQIVLKKFLPGFMSDIVLCKTVDRQLLSGEINSNTGDSVSFKRPHQFKSERT 61 (423) T ss_pred Cccc-------------------hhhhhHHHHHHHHHHHHHhhcccchhcccCCCcccccccCCCEEEEeeCCcceeecc Confidence 6533 22354 899999999999999988887653 23 3499999999999999988 Q ss_pred ecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 74 QPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 74 ~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) .++......++++...++.|+||+.+|++|.++|.|++|..-|+. 
.+.+.++++|++.+|+.++..+...+ .+ T Consensus 62 ~~~~~~~~~~~~~~e~~v~l~id~~k~~a~~v~d~e~~l~i~~~~-~~l~~a~~ala~~vd~~l~~~l~~~a-----~~- 134 (423) T protein:vir:35 62 ETGDITGKDKNGLFSAKATGKVGKYITVAVEWTQIEEALKLNQLD-QILSPIHERMVTDLETELAHFMMNNG-----AL- 134 (423) T ss_pred cCcCCCCccccccccceeeEEeccceeccceeCHHHHHhhHHHHH-HHHHHHHHHHHHHHHHHHHHHHhhcc-----cc- Confidence 764322223467778889999999999999999999999888884 67778899999999999876554321 00 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchh-hhhhhhccccc Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALM-PNAANYQALID 232 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~-~~~~~~~~~~~ 232 (347) .++.... + ...|+.|++++.+|++++||..|||+||+|++|..|+++++ |.+.+..+... T Consensus 135 -----------~vgt~~t---~-----~~~~~~i~~a~~~Ld~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~~~~~~~~a 195 (423) T protein:vir:35 135 -----------SLGSPNT---A-----IKKWADVAQTASFIKDIGIKTGENYAIMDPWSAQRLADAQSGLHAADQLVRTA 195 (423) T ss_pred -----------ccccccC---C-----cchHHHHHHHHHHHHHhcCCcCCCEEEeCHHHHHHHhccccceeccccchhHH Confidence 0111111 0 12378899999999999999999999999999999997654 55555556667 Q ss_pred cccceE-EEEeceEEEEecceecccccccccccc-------------------------------cccccccccccc--- Q lcl|NC_015249. 233 PSTGSI-RNVMGFEVIEVPHLTAGGAGEDRPEEG-------------------------------ANPTGQKHAFPE--- 277 (347) Q Consensus 233 ~~~G~V-g~i~G~~V~~sn~lp~~~~~~~~~~~~-------------------------------~~~~~~~~~~~~--- 277 (347) +++|.| |+++||+||+||++|....++...... +...|..+.+.. 
T Consensus 196 lr~g~i~G~i~GFdv~~Snnvp~~T~gt~~~~~~v~~a~~v~~~a~~~~~~~~~~~~~~~~~~~g~l~~GD~~t~aGv~~ 275 (423) T protein:vir:35 196 WENAQISGNFGGIRALMSNGLASRKQGDFDGAITVKTAPNVDYLSVKDSYQFTVALTGATPSKTGFLKAGDQLKFTSTHW 275 (423) T ss_pred HhhccceeeecceEEEEcCCCccccccccccceeeccccccccccccccccceeeeeeeeeccCCcEEecceEEeeeeee Confidence 888876 999999999999999643332211000 000000000100 Q ss_pred ----------------------c--c----ccccc----c------------------------------cccceEEEEe Q lcl|NC_015249. 278 ----------------------T--S----SGDTR----V------------------------------ALDNVVGLFN 295 (347) Q Consensus 278 ----------------------~--~----~~~y~----~------------------------------~~~~~~~l~~ 295 (347) . . .+.+. + ..+-..-|+| T Consensus 276 v~~~t~~~~~~~~t~~~~~~~V~~~~~~~a~g~~~v~i~p~~~~~~~~~~~~~v~a~~a~~~~vt~~~~a~~~~~~nl~~ 355 (423) T protein:vir:35 276 LNQQSKQTLYNGSTAMSFTATVLEETNSTASGDVTVKLSGVPIYDEKNSQYNAVDAKVKAGDAVSIIGTAKQQMKPNLFY 355 (423) T ss_pred ccccccceeecccCCceeEEEEeccccccccCceeEEccccccccCCCcccccccccccCCceeeeeecCCCceeEEEee Confidence 0 0 00000 0 0011245799 Q ss_pred chhhhhhhh-----------------hcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcC Q lcl|NC_015249. 296 HRSAVGTVK-----------------LKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNK 346 (347) Q Consensus 296 ~~~Av~~v~-----------------~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~ 346 (347) ||+|++++. 
...+.+-..||.+..-..++.-.-||...+|||.++.+.=.- T Consensus 356 ~~~a~~l~~~~l~~~~~~~~~~~~~~g~s~r~~~~~d~~~~~~~~r~d~l~g~~~~~p~~~~~~~g~~ 423 (423) T protein:vir:35 356 NKFFCGLGTIPLPKLHSLDSAVATYEGFSIRVHKYADGDANKQMMRFDLLPAYVCFNPHMGGQFFGNP 423 (423) T ss_pred cCceeEEEEEccccCCccceeeccccCceEEEEEeeccccCceEEEEEeecceeeecccceEEEEecC Confidence 999876542 233444455777766666777778999999999997766555 No 44 >protein:vir:9820 Length: 272 # NCBI annotation: putative major capsid/head protein # Family: family:all:522 # MgeID: mge:176 # MgeName: 315.4 # Cross-refs: genbank:acc:NP_795582;genbank:gi:28876339;genbank:GeneID:1257858 Probab=100.00 E-value=2.6e-35 Score=210.17 Aligned_cols=263 Identities=16% Similarity=0.145 Sum_probs=213.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc-ccc--ccceEEEeecCc-ceeeeeecC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR-SIQ--SGKSAQFPVLGR-TKAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r-~i~--~G~tv~i~~iG~-~~~~~~~~g 76 (347) ||+.++ +.+. .+.-|+|+.+|.+.+++.+++.++..+. ++. .|++++||+.+. ..+..+..| T Consensus 1 MA~~~T------~~~~--------~~iPev~s~~v~~~~~~~~~~~~~~~~~~~~~g~~G~tv~iP~~~~~~~a~~v~eg 66 (272) T protein:vir:98 1 MAVGTT------KMAQ--------MLDPEVLADMIDAEVGKAIRFAPLAEVDTTLEGQPGTTLTVPKWDYIGDAEDVAEG 66 (272) T ss_pred CCCccc------cchh--------eechHHHHHHHHHHHHHHhhhhccccccccccCCCCCEEEEEEecCCCCcccccCC Confidence 998763 2211 1344999999999999999988877763 333 599999999864 467778888 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) +.++. .+++.++.++++++. ...+.|+|.+..++..|+++.+.+++++++++++|+.++..+.+.. 
T Consensus 67 ~~i~~--~~~~~~~~~~~~~~~-~~~~~itd~~~~~s~~d~~~~~~~~~~~~~a~~~d~~i~~~~~~a~----------- 132 (272) T protein:vir:98 67 EAIPM--TQLGFKKTTMTIKKA-GKGVEITDEAILSGYGDPVGQAAKQIVEAIDHKVDADVLDALSKST----------- 132 (272) T ss_pred Ccccc--cccccceEEEEeeee-eeeeeecHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHhcccc----------- Confidence 88754 578899999999885 5779999999999999999999999999999999999876542210 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDPS 234 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~~ 234 (347) ..+. ....++.|++|..+|++.+. ..|+++++|++|..|+++. +++.....+...+. T Consensus 133 -------~~~~------------~~~t~d~i~da~~~l~~~~~--~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~~~~~~ 191 (272) T protein:vir:98 133 -------QTVE------------ATATVDGVSKALDIFNDEDD--AETVIVMNPADASTLRLDAAKEWLGATEVGANRVV 191 (272) T ss_pred -------cccc------------cccCHHHHHHHHHHHhccCC--CccEEEEcHHHHHHHHHhccccccccccccccccc Confidence 0000 01126788999999988764 4799999999999999874 55555555667789 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|.+++++|++|++|+++|.. ..++|++.+++.+..+++.+|.+| T Consensus 192 ~g~ig~i~G~~Vi~s~~~p~~-----------------------------------t~~~~~~~a~~~~~~~~~~ve~~r 236 (272) T protein:vir:98 192 SGVYGEVLGVQIVRSRKCPKG-----------------------------------TAYMVRKGALRIMLKRNTMVETDR 236 (272) T ss_pred cccchhhcCeeEEEcCCCCcc-----------------------------------eEEEEcCCeEEEEecCCceeeecc Confidence 999999999999999999821 135788899999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+.++.|.|..++.||.+++||++++.+.+..| T Consensus 237 ~~~~~~~~i~~~~~~~~~v~~~~~vv~~t~~~a 269 (272) T protein:vir:98 237 DITKAINQIVANKHYGVYLYKAEKAVKITLKDA 269 (272) T ss_pred ccccceeEEEEEEEEEEEEEcCCceEEEEeccc Confidence 999999999999999999999999999999999 No 45 >protein:vir:3033 Length: 272 # NCBI annotation: major capsid protein # Family: family:all:522 # MgeID: mge:61 # MgeName: PhiNIH1.1 # Cross-refs: genbank:acc:NP_438146;genbank:gi:16271809;genbank:GeneID:929235 Probab=100.00 E-value=2.6e-35 Score=210.17 Aligned_cols=263 Identities=16% Similarity=0.145 Sum_probs=213.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc-ccc--ccceEEEeecCc-ceeeeeecC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR-SIQ--SGKSAQFPVLGR-TKAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r-~i~--~G~tv~i~~iG~-~~~~~~~~g 76 (347) ||+.++ +.+. .+.-|+|+.+|.+.+++.+++.++..+. ++. .|++++||+.+. ..+..+..| T Consensus 1 MA~~~T------~~~~--------~~iPev~s~~v~~~~~~~~~~~~~~~~~~~~~g~~G~tv~iP~~~~~~~a~~v~eg 66 (272) T protein:vir:30 1 MAVGTT------KMAQ--------MLDPEVLADMIDAEVGKAIRFAPLAEVDTTLEGQPGTTLTVPKWDYIGDAEDVAEG 66 (272) T ss_pred CCCccc------cchh--------eechHHHHHHHHHHHHHHhhhhccccccccccCCCCCEEEEEEecCCCCcccccCC Confidence 998763 2211 1344999999999999999988877763 333 599999999864 467778888 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) +.++. .+++.++.++++++. ...+.|+|.+..++..|+++.+.+++++++++++|+.++..+.+.. 
T Consensus 67 ~~i~~--~~~~~~~~~~~~~~~-~~~~~itd~~~~~s~~d~~~~~~~~~~~~~a~~~d~~i~~~~~~a~----------- 132 (272) T protein:vir:30 67 EAIPM--TQLGFKKTTMTIKKA-GKGVEITDEAILSGYGDPVGQAAKQIVEAIDHKVDADVLDALSKST----------- 132 (272) T ss_pred Ccccc--cccccceEEEEeeee-eeeeeecHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHhcccc----------- Confidence 88754 578899999999885 5779999999999999999999999999999999999876542210 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch--hhhhhhhccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL--MPNAANYQALIDPS 234 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~--~~~~~~~~~~~~~~ 234 (347) ..+. ....++.|++|..+|++.+. ..|+++++|++|..|+++. +++.....+...+. T Consensus 133 -------~~~~------------~~~t~d~i~da~~~l~~~~~--~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~~~~~~ 191 (272) T protein:vir:30 133 -------QTVE------------ATATVDGVSKALDIFNDEDD--AETVIVMNPADASTLRLDAAKEWLGATEVGANRVV 191 (272) T ss_pred -------cccc------------cccCHHHHHHHHHHHhccCC--CccEEEEcHHHHHHHHHhccccccccccccccccc Confidence 0000 01126788999999988764 4799999999999999874 55555555667789 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|.+++++|++|++|+++|.. ..++|++.+++.+..+++.+|.+| T Consensus 192 ~g~ig~i~G~~Vi~s~~~p~~-----------------------------------t~~~~~~~a~~~~~~~~~~ve~~r 236 (272) T protein:vir:30 192 SGVYGEVLGVQIVRSRKCPKG-----------------------------------TAYMVRKGALRIMLKRNTMVETDR 236 (272) T ss_pred cccchhhcCeeEEEcCCCCcc-----------------------------------eEEEEcCCeEEEEecCCceeeecc Confidence 999999999999999999821 135788899999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+.++.|.|..++.||.+++||++++.+.+..| T Consensus 237 ~~~~~~~~i~~~~~~~~~v~~~~~vv~~t~~~a 269 (272) T protein:vir:30 237 DITKAINQIVANKHYGVYLYKAEKAVKITLKDA 269 (272) T ss_pred ccccceeEEEEEEEEEEEEEcCCceEEEEeccc Confidence 999999999999999999999999999999999 No 46 >protein:vir:105334 Length: 276 # NCBI annotation: putative phage major capsid protein # Family: family:all:522 # MgeID: mge:1679 # MgeName: PH15 # Cross-refs: genbank:acc:YP_950669;genbank:gi:119967839;genbank:GeneID:4643213 Probab=100.00 E-value=3.1e-35 Score=209.80 Aligned_cols=264 Identities=15% Similarity=0.162 Sum_probs=215.0 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccccc-c--cccceEEEeecCcc-eeeeeecC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRS-I--QSGKSAQFPVLGRT-KAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~-i--~~G~tv~i~~iG~~-~~~~~~~g 76 (347) |||.++ +.. +-+.-|+|+.+|.+.+.+..++.++..+-+ + ..|++++||+.+.. .+..+..| T Consensus 1 Ma~~~T------~l~--------d~i~Pev~~~~v~~~~~~~~~~~~~~~~~~~l~g~~G~ti~iP~~~~igda~~~~eg 66 (276) T protein:vir:10 1 MAQGTT------TKS--------TQIVPEVLAPMMQAELDKKLRFAQFADIDSTLVGQPGDTLTFPAFVYSGDATVVPEG 66 (276) T ss_pred CCccee------ehh--------hhhchHHHHHHHHHHHHhhhhhcccceecccccCCCCCEEEeeeecCCCccccccCC Confidence 998663 221 214569999999999999999988887643 3 35999999987554 45567788 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) ++++ +++++.++.+.+|.+ .+..|.++|++..++..|++.+++++++++||+++|+.++..+.... 
T Consensus 67 ~~i~--~~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~dp~~~~~~~~~~~~a~~~d~~~~~~l~~~~----------- 132 (276) T protein:vir:10 67 QKIP--VDKIETNRREAKIHK-IGKGTDITDEALLSGYGDPQGEAVRQHGLAIANKVDNDVLEALRGTK----------- 132 (276) T ss_pred CccC--ccccccceeeEEeeh-ccccccccHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhccc----------- Confidence 8875 367999999999966 59999999999999999999999999999999999998876553210 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcc--hhhhhhhhccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAA--LMPNAANYQALIDPS 234 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~--~~~~~~~~~~~~~~~ 234 (347) . ... .. . ..++.|.+|..+|.+++. +.++++|.|++|..|+++ .+|+..+..+.+.++ T Consensus 133 ----~---~~~--~~-----~----~t~d~i~~A~~~lgd~~~--~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g~~~~~ 192 (276) T protein:vir:10 133 ----L---TVS--AD-----I----GTLAGLEAAIDTFDDEDL--EPMVLFINPKDAGKLRSSASDNFTRATELGDNIIV 192 (276) T ss_pred ----c---ccc--cc-----c----cCHHHHHHHHHHhccccC--cccEEEEcHHHHHHHHHhcccccccccccccccee Confidence 0 000 00 0 126788899999998875 689999999999999875 678877767777889 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) +|.|++++|++|++|+++|.. ...+|++.|++++..+++++|..| T Consensus 193 ~G~ig~~~G~~Vi~s~~~p~~-----------------------------------t~~l~~~gAi~~~~~~~~~vE~dR 237 (276) T protein:vir:10 193 KGAFGEALGAVIVRSKKLDEG-----------------------------------EAILAKRGAVKLITKRDFFLETDR 237 (276) T ss_pred ccccceecceeEEEcCCCCcc-----------------------------------eEEEEeccceeeeecCCceeeccc Confidence 999999999999999999721 235888999999999999999999 Q ss_pred chhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 RANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+.++.|.|.+.+.||++..||+.++.+....+ T Consensus 238 d~~~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~ 270 (276) T protein:vir:10 238 DPSTKTTALYSDKHYVAYLYDESKAVKVTKGAG 270 (276) T ss_pred chhhcccEEEEeeEEEEEEEcCcceEEEecCCc Confidence 999999999999999999999999888875555 No 47 >protein:vir:79008 Length: 299 # NCBI annotation: putative main capsid protein # Family: family:all:701 # MgeID: mge:1861 # MgeName: phiC2 # Cross-refs: genbank:acc:YP_001110725;genbank:gi:134287342;genbank:GeneID:4955182 Probab=100.00 E-value=1.5e-34 Score=206.09 Aligned_cols=282 Identities=11% Similarity=0.014 Sum_probs=189.0 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc-----cccccceEEEeecCcceeeeeec Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR-----SIQSGKSAQFPVLGRTKAAYLQP 75 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r-----~i~~G~tv~i~~iG~~~~~~~~~ 75 (347) ||.++ |.|+|+.++++.|...+++..+.+.. ...+|++|+||+++.+.+++|++ T Consensus 1 MA~~n---------------------~a~~~~~~Ld~~~~~~l~~~~L~~~~~~~~v~~~gg~tVkI~~i~~~gl~DY~R 59 (299) T protein:vir:79 1 MAALN---------------------YAKEYSNVLAQAYPYTLNFGDLYATPNNGRYRWTGSKTIEIPTISTTGRVDSNR 59 (299) T ss_pred Cccch---------------------hHHHHHHHHHHHHHhhceeeeeccCcccceeeecCCCEEEEecccccccccccc Confidence 54322 56899999999999999887655432 23579999999999999999998 Q ss_pred CCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHH--HHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 76 GENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRS--EYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 76 g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~--~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) ++..-. ..+++.+..+++|||.+||.|.||++|..|++..+.. ...+.+.+.++.++|.+.+..|+..+.. 
T Consensus 60 ~~~g~~-~g~~~~~~~t~~ldqdr~~~f~vD~~Dvdet~~~~~~a~v~~~~~~~~v~pEiDay~~skl~~~a~~------ 132 (299) T protein:vir:79 60 DTIAVA-QRNYDNAWEPKVLTNQRKWSTLVHPADINQTNYVASIGNITKVYNEEQKFPEMDAYCISKIYADWTA------ 132 (299) T ss_pred CCCccc-ccccCcceeEEEeeccccceeccchhhHHHHhhhhHHHHHHHHHHHHHhhhHhhHHHHHHHHHhhhh------ Confidence 764322 2457889999999999999999998887777766532 3445566777778888777665432210 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhh-hccccc Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAAN-YQALID 232 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~-~~~~~~ 232 (347) .|. ...+.+.+++++|+.|+++.++|+|++||.+|||++|+|++|.+|+++++|+... ...... T Consensus 133 ------~g~---------~~~~~~~T~~n~y~~i~~~~~~lde~~vP~~~rvl~vtp~~~~~L~~~~~f~k~~~~~~~~~ 197 (299) T protein:vir:79 133 ------LGN---------TADTTVLTTTNVLEVFDKLMEKMTEARVPENGRILYVTPVVNTLIKNAKEIQRTVNIKDAGT 197 (299) T ss_pred ------cCC---------cccccccCHHHHHHHHHHHHHHHHhcCCCCCCeEEEeCHHHHHHHhhchhhhcccccccccc Confidence 000 0112234577899999999999999999999999999999999999999998543 344445 Q ss_pred cccceEEEEeceEEEE--ecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceee Q lcl|NC_015249. 233 PSTGSIRNVMGFEVIE--VPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMAL 310 (347) Q Consensus 233 ~~~G~Vg~i~G~~V~~--sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~ 310 (347) ..+|+|++++||+|++ |++++..-.-+. + ...+ .+.-+.=.++.|++|+......+ .+ T Consensus 198 ~~~g~Vg~idG~~Ii~Vps~r~~t~~~~~~--G---~~~~--------------~~ak~in~ii~~~~a~~~~~K~~-~~ 257 (299) T protein:vir:79 198 SLNRQTTDIDTVKIIKVPSNLMKTAYDFTT--G---WKVG--------------AGAKQIFMSLVHPSAIITPVSYQ-FS 257 (299) T ss_pred eeeeeeeeecceEEEEechhhcCccceecc--C---cccc--------------CcccccceEEEcCCeeeeeEeee-eE Confidence 7899999999999998 666763111000 0 0000 00112335788999887665555 34 Q ss_pred eeeechh--hhcceeeeeeeeccccc--cc---ceEEEEEEcCC Q lcl|NC_015249. 
311 ERARRAN--FQADQIIAKYAMGHGGL--RP---EACGALVFNKA 347 (347) Q Consensus 311 e~~~d~~--~~~d~i~~~~a~G~~~~--Rp---e~a~~i~~~~a 347 (347) +. ++|. ..+|..- .++..|+.. .. -+-+.+...+| T Consensus 258 ~~-~~P~~~~~~~~~~-~~r~y~d~~v~~nk~~~i~~~~~~a~~ 299 (299) T protein:vir:79 258 KL-DEPTAVTEGKYFY-FEESFEDVFILNKKADAIQFVVEGAGA 299 (299) T ss_pred Ee-ecCCCCCccceee-eeeeeeeeeeeccccCeEEEEeeecCC Confidence 43 4554 4444332 345555433 22 22233333333 No 48 >protein:vir:105522 Length: 423 # NCBI annotation: phage major head protein # Family: family:all:1412 # MgeID: mge:1463 # MgeName: phiSG1 # Cross-refs: genbank:acc:YP_516191;genbank:gi:89885994;genbank:GeneID:3964382 Probab=100.00 E-value=8.4e-34 Score=201.92 Aligned_cols=301 Identities=13% Similarity=0.070 Sum_probs=202.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc---cc---cccceEEEeecCcceeeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR---SI---QSGKSAQFPVLGRTKAAYLQ 74 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r---~i---~~G~tv~i~~iG~~~~~~~~ 74 (347) |||.-. +|-.++|+.+.++.|++..++..++... .+ +.|+||+|++.+..++.... T Consensus 1 MANsl~------------------~l~p~iia~~al~~l~~~lV~~~lV~r~y~~ef~~ak~GDTV~I~~P~~~~~~d~~ 62 (423) T protein:vir:10 1 MANNLD------------------ANVSQIVLKKFLPGFMSDLVLCKTVDRQLLAGEINSSTGDSVSFKRPHQFKSERTM 62 (423) T ss_pred Cccccc------------------cccHHHHHHHHHHHHHhhcccchhhccCCCccccccccCCEEEEeeCCceeeeccc Confidence 664331 1345899999999999999988777652 22 35999999999988887633 Q ss_pred cCCCCCC-ccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 75 PGENLDD-KRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 75 ~g~~~~~-~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) . ..+.. ..+++...++.|+||+.+|++|.++|.|+.+.--|+ +.+.+.+.++||+.+|+.++..+.+.+. 
+ T Consensus 63 ~-~~~t~~~~~~l~e~~v~l~id~~k~~a~~v~d~E~~l~i~~~-~~~l~~A~~aLA~~vd~~ia~~~~~~~~-----~- 134 (423) T protein:vir:10 63 D-GDITGKSKNSLISAKATGEVGNYITVAVEYRQIEEALKLNQL-DQILVPINERMVTDLETELALFMMKHGA-----L- 134 (423) T ss_pred C-cccCcccccccccceEEEEecceeeeeeeeChHHHhcChhHH-HHHHHHHHHHHHHHHHHHHHHHhhhccc-----c- Confidence 3 33222 335676778999999999999999999999655556 7899999999999999988765543211 0 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhh-hhhhhccccc Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMP-NAANYQALID 232 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~-~~~~~~~~~~ 232 (347) .++.+.... ..|+.+.+++.+|++.+||..+||+||+|++|..|++++.+ ...+..++.. T Consensus 135 -----------~vgt~~t~~--------~a~~~~a~a~~~L~~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~~~~~~~~a 195 (423) T protein:vir:10 135 -----------SLGSPNTPI--------KKWSDVAQTASFLKDLGINSGENYAVMDPWAAQRLADAQSGLHVSEQLVRTA 195 (423) T ss_pred -----------ccccccccc--------ccHHHHHHHHHHHhhccCCcCCCEEEeCHHHHHHHhhhhhhhccccccchHH Confidence 011111110 12688999999999999999999999999999999976654 4455566777 Q ss_pred cccceE-EEEeceEEEEecceecccccccc-----ccccc--------------------------cccc---------- Q lcl|NC_015249. 233 PSTGSI-RNVMGFEVIEVPHLTAGGAGEDR-----PEEGA--------------------------NPTG---------- 270 (347) Q Consensus 233 ~~~G~V-g~i~G~~V~~sn~lp~~~~~~~~-----~~~~~--------------------------~~~~---------- 270 (347) +++|.| |+++||+||+||++|....++.. .+... ...| T Consensus 196 lr~~~i~G~~~GFdi~~Sn~vp~~T~g~~~ga~~~~~~~~vt~a~~~~~~~~~~~~~~~T~s~~g~l~~GD~~t~aGv~~ 275 (423) T protein:vir:10 196 WENAQISGNFGGIRALMSNGLASRTQGAFGGKLTVKGTPEVNYDSVKDSYAFTATLTGATASKKGFLKVGDQLQFDDTHW 275 (423) T ss_pred HHhcccceeecceEEEEecCCcccccccccceeeeeeeeEEEecccccccccccceeeccceeceeEEecceEeecceee Confidence 999977 99999999999999953222110 00000 0000 Q ss_pred ---------------ccccccccc------ccccc----c------------------------------cccceEEEEe Q lcl|NC_015249. 
271 ---------------QKHAFPETS------SGDTR----V------------------------------ALDNVVGLFN 295 (347) Q Consensus 271 ---------------~~~~~~~~~------~~~y~----~------------------------------~~~~~~~l~~ 295 (347) ...++.... .+... + ..+-..-|+| T Consensus 276 v~~~tk~~l~~~~~~~~~~~~V~~~~~~~a~~~~tv~i~p~~~~~~~~~~~~~V~a~~a~~~~vT~~~~~~~t~~~nl~~ 355 (423) T protein:vir:10 276 LNQQSKQTLYNGASALSFTATVMEDANAHSSGDVTVKISGVPIFDAGYPQYNAVDRLLAEGDTVSVIGTSKQAMKPNLFY 355 (423) T ss_pred ecccccceeecccCCcceEEEEEecccccccCceEEEeccccccccCcccccceeccccCCceeEEeeccCCceeEEEEe Confidence 001110000 00000 0 0001234799 Q ss_pred chhhhhhh-----------------hhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcC Q lcl|NC_015249. 296 HRSAVGTV-----------------KLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNK 346 (347) Q Consensus 296 ~~~Av~~v-----------------~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~ 346 (347) ||+|+.++ +...+.+-..||.+..-..++.-.-||...+|||.++.+.=.- T Consensus 356 ~~~a~~l~~~pl~~~~~~~~~~~~~~g~s~r~~~~~d~~~~~~~~r~d~l~g~~~~~p~~~~~~~g~~ 423 (423) T protein:vir:10 356 NKLFCGLGTIPLPKLHSIDSAVATYEGFSIRVHKYADGDANKQMMRFDLLPAYVCYNPHMGGQFFGNP 423 (423) T ss_pred cCcceEEEEEcccCCCccceeecccccceEEEEEeeeccccceEEEEEeecceeeeccceEEEEEecC Confidence 99987644 3344444555777766666777777999999999997766555 No 49 >protein:vir:78920 Length: 290 # NCBI annotation: Cps # Family: family:all:701 # MgeID: mge:1859 # MgeName: A006 # Cross-refs: genbank:acc:YP_001468846;genbank:gi:157325479;genbank:GeneID:5601917 Probab=99.95 E-value=1.1e-30 Score=184.86 Aligned_cols=279 Identities=12% Similarity=0.062 Sum_probs=190.0 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccc-ccccccceEEEeecCcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLV-RSIQSGKSAQFPVLGRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~-r~i~~G~tv~i~~iG~~~~~~~~~g~~~ 79 (347) || .+ +.|+|+..+++.|...+++..+... ....+|++|+||+++.+.+++|++++.. 
T Consensus 1 Ma---------------------in-~a~~~~~~Ld~~~~~~~~t~~l~~~~~~~~ggktVkI~~i~~~gl~DY~R~~g~ 58 (290) T protein:vir:78 1 MA---------------------IN-YVDKYGKELDQKLVFGTYTNELETPNLLWLDAKTFKIQTITTTGLKAHTRNKGY 58 (290) T ss_pred Cc---------------------hh-HHHHHHHHHHHHHHhhheeeeccccceeeccCCEEEEeeeccCcccccccCCCc Confidence 22 11 2389999999999999887665433 3456899999999999999999998765 Q ss_pred CCccCCCCCceEEEEEEeeeecccccc--cHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIY--DIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Id--d~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .. .+++.+..+++||+.++|.|.|| |+||.+....+.....+.+.+.++..+|.+.+..|+..+... T Consensus 59 ~~--g~v~~~~et~tl~qdR~~~F~vD~~DvDEt~~~~~~~nv~~ef~~~~v~PEiDayr~skla~~a~~~--------- 127 (290) T protein:vir:78 59 NE--GSASNTNKSYTIDFDRDVEFFVDVMDVDETGQALSAANVTKEFNSRHAGPEMDAYRFSKLATAAKTN--------- 127 (290) T ss_pred cc--CccccceeeEEeeccccceeeccccchhHHhhhhhHHHHHHHHHHHHhhhhhhHHHHHHHHhhhhcc--------- Confidence 43 46788899999999999999999 999999889999999999999999999999887665433100 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhh-hcc-cccccc Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAAN-YQA-LIDPST 235 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~-~~~-~~~~~~ 235 (347) +.. ...+.+++++|+.|+++.++|+| ||.+|||++|+|++|.+|+++++|+..- -+. 
.....+ T Consensus 128 ---~~~----------~~~t~t~~n~~~~i~~~~~~lde--vp~~~rvl~vtp~~~~lL~~~~~f~r~~~~~~~~~~~i~ 192 (290) T protein:vir:78 128 ---SNS----------VAEEITKDNVFTKLKAAIRKVKK--YGTQNLVMYVSPDVMAALELSDDFVRAINVQNIGPSSIE 192 (290) T ss_pred ---Ccc----------cccccCHHHHHHHHHHHHHHHHh--cCCCCeEEEECHHHHHHHhhChhhhcccccccccccccc Confidence 000 01123567899999999999987 8999999999999999999999998532 222 223459 Q ss_pred ceEEEEeceEEEEecce-ecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHL-TAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~l-p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) |+|++++||+|++++.- -... ......+.-.. ....+.=.++.|++|+......+ .+..+ T Consensus 193 ~~V~~idG~~ii~vps~~r~~t--~~~f~~G~~~~---------------~~ak~in~ii~~~~a~i~~~K~~-~~~~~- 253 (290) T protein:vir:78 193 TRITAIDGTRIVEVEAEDRFYD--TFDFTDGYKPA---------------AGAKKLNFLLVNKGSVVGGAKHA-SIYLH- 253 (290) T ss_pred ceeeeecCcEEEEecccchhhh--hhhhccccccc---------------CCccceeEEEEcCCceeeeeeee-EEEee- Confidence 99999999999996521 1100 00000000000 01122336889999876655444 33433 Q ss_pred chh--hh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 315 RAN--FQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~--~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +|. .. 
+|.+..+.-+..-++.....+ |....+ T Consensus 254 ~P~~~~~~d~~~~~~r~y~d~~v~~nk~~~-i~~~~~ 289 (290) T protein:vir:78 254 APGSVGQGDGWLYQYRVYHDIFVLDQQKDG-VIASTE 289 (290) T ss_pred CCCCCcCcceeeeeeeeeeeeeeeccccCe-eEEEee Confidence 455 33 445555444444444333222 222222 No 50 >protein:vir:739 Length: 231 # NCBI annotation: major structural protein 4 # Family: family:all:522 # MgeID: mge:14 # MgeName: Tuc2009 # Cross-refs: genbank:acc:NP_108716;genbank:gi:13487838;genbank:GeneID:920884 Probab=99.94 E-value=1.2e-29 Score=179.05 Aligned_cols=230 Identities=17% Similarity=0.146 Sum_probs=187.5 Q ss_pred cccccccceEEEeecCcceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHH Q lcl|NC_015249. 51 VRSIQSGKSAQFPVLGRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLA 130 (347) Q Consensus 51 ~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa 130 (347) ..-+..|++++||.. -..+..+..|++++. +.++.++.+.+|.+. ...|.|.|.+..+...|++.+.++|++.+|| T Consensus 1 ~~~~~~Gdtit~P~~-iGda~~v~eG~~i~~--~~l~~t~~~atIk~~-gk~~~itD~a~l~~~gDp~~ea~~Q~~~~iA 76 (231) T protein:vir:73 1 ENGINLANLCEYPND-IGDAADVAEGGEISL--DKIGTTTKSVTIKKA-AKGTEITDEAALSGYGDPIGESNKQLGLSLA 76 (231) T ss_pred CccccCCceEEeccc-ccchhhhcCCCcCCh--hhccccceeeeEeee-ccceeeeHHHHhhccCchHHHHHHHHHHHHH Confidence 334567999999843 234577788999864 679999999999775 8899999999999999999999999999999 Q ss_pred HHHHHHHHHHHHHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCH Q lcl|NC_015249. 131 MAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTP 210 (347) Q Consensus 131 ~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P 210 (347) +++|..++..+.+.. ... +.. 
-.++.|.+|..+|.+.+ ..++|++|.| T Consensus 77 ~kvD~di~~~~~~a~------------------l~~--------~~~----~t~d~i~~A~~~fgde~--~~~~vivv~p 124 (231) T protein:vir:73 77 NKVDDDLLKAAKTTS------------------QTV--------STK----ANVDGVQAALDIFNDED--AQAYVLIVNP 124 (231) T ss_pred HhhhHHHHHhhcccc------------------ccc--------ccc----ccHHHHHHHHHHhcccc--ccceEEEEcc Confidence 999998876543210 000 001 12678899999999887 3578999999 Q ss_pred HHHHHHhcchhhhhh-hhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccc Q lcl|NC_015249. 211 DNYSAILAALMPNAA-NYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDN 289 (347) Q Consensus 211 ~~~~~Ll~~~~~~~~-~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~ 289 (347) +.|+.|.++.++... +..+.+.+++|.||.+.|++|+.|+++|..++- T Consensus 125 ~~~~~Lrk~~~~~~~~~~~g~~i~~~G~iG~i~G~~Vi~S~~~~~~~~~------------------------------- 173 (231) T protein:vir:73 125 KDAAKIRKDANAKNIGSEVGANALINGTYADVLGAQIVRSKKLAEGSAL------------------------------- 173 (231) T ss_pred hHHHhhhhccchhhhhhhhccceeeecccceEcceEEEEcCCCCCCcee------------------------------- Confidence 999999998887764 456778899999999999999999999853211 Q ss_pred eEEEEechhhhhhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
290 VVGLFNHRSAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 290 ~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+-.++.+.|++....+++++|..||+..+.+.|.+.+.|+.+..+|+.++.+.++.- T Consensus 174 ~~~~i~~~gAl~~~~k~~~~vEtdRd~~~k~~~i~~~~~y~v~l~~~~~vv~~t~~g~ 231 (231) T protein:vir:73 174 MFKIVSNSPALKLVLKRGVQVETDRDIVTKTTVITADEHYAAYLYDLTKVVNITFTGV 231 (231) T ss_pred eeeEEeeccceeeeecccceeeccccccccccEEEEeEEEEEEEEcCccEEEEEeecC Confidence 0113456889999999999999999999999999999999999999999998887766 No 51 >protein:vir:105464 Length: 346 # NCBI annotation: putative phage major capsid protein # Family: family:all:701 # MgeID: mge:1502 # MgeName: KC5a # Cross-refs: genbank:acc:YP_529874;genbank:gi:90592614;genbank:GeneID:3974528 Probab=99.94 E-value=1.3e-28 Score=173.48 Aligned_cols=284 Identities=10% Similarity=0.005 Sum_probs=177.4 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhc-c-----cccccccccceEEEeecC-cceeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMN-K-----HLVRSIQSGKSAQFPVLG-RTKAAYL 73 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~-~-----~~~r~i~~G~tv~i~~iG-~~~~~~~ 73 (347) || .+ +.++|+.++++.|...++... + .......+|++|+||++. .+.+++| T Consensus 1 Ma---------------------in-ya~~~~~~Ld~~~~~~~lts~~l~~~~~~~~v~~~ggktVkIp~is~tsGl~DY 58 (346) T protein:vir:10 1 MT---------------------IN-YAEKYQAAVQQAFYDGHLYSAELWNSPSNSIIKFDGAKHIKVPRLEITSGRKDR 58 (346) T ss_pred Cc---------------------ch-hHHHHHHHHHHHHHhhhccchhhcccccccceEecCCCEEEEEEeeeecccccc Confidence 11 00 338999999999987766422 2 222344689999999995 4567888 Q ss_pred ecCCCCCCccCCCCCceEEEEEEeeeecccccc--cHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccc Q lcl|NC_015249. 74 QPGENLDDKRKDMKHTERTINIDGLLTADVLIY--DIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSAS 151 (347) Q Consensus 74 ~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Id--d~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~ 151 (347) +++.-. 
....+++.+..+++|++.++|.|.|| |+||.+....+..-..+.+....+-.+|.+.|..|+..+...+ T Consensus 59 ~R~~g~-~~~g~v~~~~et~tl~qDR~~~F~vD~mDvDETn~~~~~anv~~ef~r~~vvPEiDayrfskLa~~a~~~~-- 135 (346) T protein:vir:10 59 QRRTIT-TPVANYSNDWDSYELKNERYWSTLVDPSDIDETNMVVSLANITKQFNLDSKMPEKDRYMFSHLYSGKEAAH-- 135 (346) T ss_pred cccCCc-ccccccccceeEEEeeccccceecccccchHHHHHHhHHHHHHHHHHHHhhcchhhHHHHHHHHHhhhhhc-- Confidence 765443 22246888999999999999999999 6666655555544444556666677888877665543321100 Q ss_pred ccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccc Q lcl|NC_015249. 152 DENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALI 231 (347) Q Consensus 152 ~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~ 231 (347) ++. ..+.+.+++++|+.|+++.++|+|+.||.++||++|+|++|.+|+++++|+.....++. T Consensus 136 --------~~~----------~~~~a~T~~ni~~~i~~~~~~lde~~vp~~~rvl~vTp~~~~lLk~s~~f~k~~~v~~~ 197 (346) T protein:vir:10 136 --------DGG----------ITTNTLDEKNILPAFDNMMLDFDEARIPSTNRILYVTPKTNAILKRAEAMNRALTLKDP 197 (346) T ss_pred --------ccc----------ccccccCHHHHHHHHHHHHHHHHHccCCCCCeEEEECHHHHHHHhhchhheeccccccc Confidence 000 00112357789999999999999999999999999999999999999988854333333 Q ss_pred ccccceEEEEeceEEEE--ecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhccee Q lcl|NC_015249. 232 DPSTGSIRNVMGFEVIE--VPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMA 309 (347) Q Consensus 232 ~~~~G~Vg~i~G~~V~~--sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~ 309 (347) ...+|+|++++||+|++ |++++..=.-+ .|.... ...-..=.++.|+.|+....-.+ . 
T Consensus 198 ~~i~~~V~siDGv~Ii~VPs~r~~t~~~f~----~G~~~~---------------t~ak~INfiiv~~~A~ia~~K~~-~ 257 (346) T protein:vir:10 198 NNIQRTVYSLDDVTIRVVPSDLMQTAYDFS----DGSKII---------------DTAKQIEMFLIYNGVQIAPEKYS-F 257 (346) T ss_pred cccceeeeeecCeEEEEcchhhcccchhhc----cCcccc---------------CCccceeEEEECCceeeeeeeee-e Confidence 34699999999999998 56665210000 000000 01112335788999876554444 2 Q ss_pred eeee-echhhhc-ceeeeeeeecccccccceEEE---EEEcCC Q lcl|NC_015249. 310 LERA-RRANFQA-DQIIAKYAMGHGGLRPEACGA---LVFNKA 347 (347) Q Consensus 310 ~e~~-~d~~~~~-d~i~~~~a~G~~~~Rpe~a~~---i~~~~a 347 (347) +..+ ..+...+ |.+..+.-+..-++.....++ +....+ T Consensus 258 ~~if~P~~~~~g~~l~~~R~Y~D~fv~~nk~~~Iyv~~~~a~~ 300 (346) T protein:vir:10 258 VGFDQPSAATSGNYLYYEQSYDDVLLLNTKTKGIQFVVSDKPK 300 (346) T ss_pred eEeeCCCCCcccceeeeeeeeeeeeeeccccceEEEeeecccc Confidence 3433 2223444 344444444444443332222 232222 No 52 >protein:vir:95107 Length: 270 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1549 # MgeName: X2 # Cross-refs: genbank:acc:YP_240822;genbank:gi:66394683;genbank:GeneID:5133901 Probab=99.93 E-value=3.8e-28 Score=170.88 Aligned_cols=261 Identities=15% Similarity=0.135 Sum_probs=205.1 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccccc-c--cccceEEEeecCcc-eeeeeecC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRS-I--QSGKSAQFPVLGRT-KAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~-i--~~G~tv~i~~iG~~-~~~~~~~g 76 (347) ||.+.-+ +-+.-|+|+.+|.+.+.+..++.++..+.+ + +.|++++||...-. 
.++.+..| T Consensus 1 Ma~T~~~----------------d~I~Pev~~~~V~e~~~~~~~~~~~~~~d~~L~g~~G~ti~~P~~~~igdae~~~eg 64 (270) T protein:vir:95 1 MTQTKKA----------------NLINPEVLANVVSAQMQNAIRFTPYAVTDDTLVGQPGDTITRPKYAYIGAAEDLQEG 64 (270) T ss_pred CCceehh----------------hhcchHHHHHHHHHHHHhHHhhccccccccccCCCCCCEEEeeeecCCCccccccCC Confidence 7665421 214559999999999999999888887643 3 46999999976532 45667788 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) ++++ +++++.++.+.+|-+. ...|.|.|++...+..|++.+.+++++..+|+++|+.++..+.+. .+. T Consensus 65 ~~i~--~~~lt~~~~~a~i~~~-gk~~~itD~a~~~~~~dp~~~~~~q~a~~~a~~~d~~li~~l~~a-~~~-------- 132 (270) T protein:vir:95 65 VAMD--TTQMSMTTTKVTVKET-GKAVEVTQTAIITNVNGTLQEASRQLAMSLADKVEIDYIAELNKS-KQT-------- 132 (270) T ss_pred Cccc--hhhcccchheeeeehh-hCcceecHHHHhhhccchHHHHHHHHHHHHHHHHHHHHHHHhccc-ccc-------- Confidence 8875 4689999999999665 789999999999998999999999999999999999887655321 000 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTG 236 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G 236 (347) . +... .++.|.+|..+|.+.. ....+++|.|..|+.|.++..+.. 
.-.+.+.+++| T Consensus 133 ---------~--------~~~~----t~~~~~dA~~~lgd~~--~~~~~i~vhs~~~~~Lrk~~~~~~-~~~~~~~~~~G 188 (270) T protein:vir:95 133 ---------A--------TVSA----DATGILDAIEVFNSEN--DEDYVLYVNPKDYNKLVKSLFKVG-GNVQDRAISKG 188 (270) T ss_pred ---------c--------cccc----CHHHHHHHHHHhcccc--CCCcEEEEcHHHHHHHHhhhcccc-cccccchhccc Confidence 0 0001 1456778888886543 335699999999999998874432 22355668899 Q ss_pred eEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech Q lcl|NC_015249. 237 SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA 316 (347) Q Consensus 237 ~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~ 316 (347) .|+.+.|++|+.+.+.|. +....+|++.|++.+..+++.+|..||+ T Consensus 189 ~ig~~~G~~Viv~s~~~~----------------------------------~~~~~l~~~gAi~~~~~~~~~vEtdRd~ 234 (270) T protein:vir:95 189 DLVEIVGVSDIVKSKRVS----------------------------------ENTAFLQRYGAMEIVNKKKPEAYTDFDI 234 (270) T ss_pred ccceecceeEEEeCCCCC----------------------------------ceeEEEEeccceeeeecCCceeeeccch Confidence 999999999988766542 1134688999999999999999999999 Q ss_pred hhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 317 NFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 317 ~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++.|.|.+.+.||.++.+|+.++.+.++.| T Consensus 235 ~~~~d~i~~~~~y~v~~~~~skvv~~t~~~a 265 (270) T protein:vir:95 235 LKRTHLLSTNYHYSVNLKDETGVVKVTFKPS 265 (270) T ss_pred hhcccEEEeeeEEEEEEEccceEEEEEecCC Confidence 9999999999999999999999999999999 No 53 >protein:vir:102335 Length: 312 # NCBI annotation: putative capsid protein # Family: family:all:701 # MgeID: mge:1566 # MgeName: phi CD119 # Cross-refs: genbank:acc:YP_529560;genbank:gi:90592716;genbank:GeneID:3974467 Probab=99.93 E-value=7e-28 Score=169.44 Aligned_cols=298 Identities=13% Similarity=0.008 Sum_probs=190.2 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccc---ccccccceEEEeecCcceeeeeecCC Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLV---RSIQSGKSAQFPVLGRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~---r~i~~G~tv~i~~iG~~~~~~~~~g~ 77 (347) |||. . =|.++|..++++.|+..+++-.+... -.+.+||+|+||++....+++|++++ T Consensus 1 Mant-------------------l-~ya~~~~~~LD~~~~~~~~s~~l~~~~~~v~~~ggktVkIp~i~~~gl~DY~R~~ 60 (312) T protein:vir:10 1 MANT-------------------L-AYGQVLQQGLDKQATQELLTGWMDSNAKQIKYEGGKEVKIGKLSTDGLGDYSRGS 60 (312) T ss_pred CCcc-------------------h-hHHHHHHHHHHHHHHhhhccccccCCCceEEEecCcEEEEEeeeccccccccccc Confidence 6642 1 16689999999999999877655322 23578999999999999999999865 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccc--cHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIY--DIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Id--d~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) ..--...+++.+..++++++.++|.|.|| |+||.+....+..-..+.+.+..+-.+|.+.+..|+..+.... T Consensus 61 g~~~~~g~v~~~~et~tl~qDR~~~F~vD~mDvDETn~~~s~anv~~ef~r~~vvPEiDayrfskla~~a~~~~------ 134 (312) T protein:vir:10 61 ANAYVGGDVKFEYETKTMTQDRGRKFTLDAMDVDETNFLVTATTVMGEFQRLKVIPEIDAYRLSRLATIAIGIK------ 134 (312) T ss_pred CCccccccccccceeEEeeecccceeeccccchhhHhhHHHHHHHHHHHHHhhhcchhhHHHHHHHHhhhhccc------ Confidence 52112235888999999999999999999 8888887777777777778888889999988776654331110 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccccc Q lcl|NC_015249. 
156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPST 235 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~ 235 (347) ..+ ..+.+.+.+++++|+.|.++.++|+|+.|| .+|+++|+|+++.+|.++..+.-..........+ T Consensus 135 ---~~~---------~~~~~~~~T~~ni~~~i~~~~~~lde~~vp-~~rvl~vTp~~~~lLk~~~~~~~~~~~~~~~~i~ 201 (312) T protein:vir:10 135 ---GDT---------NVEYSYSVNSSTIINKIKTGIKIIRENGYN-GPLVCHLTYDSMFAIEEKVLEKLTAVTFAQGGIQ 201 (312) T ss_pred ---ccc---------ccccccccCHHHHHHHHHHHHHHHHHccCC-CceEEEeChHHHHHHhhhhhceecccccccceee Confidence 001 111233456788999999999999999999 5999999999998877654333222222333459 Q ss_pred ceEEEEeceEEEEecceecccccccccccccccccccccccccccccc-cccccceEEEEechhhhhhhhhcceeeeee- Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDT-RVALDNVVGLFNHRSAVGTVKLKDMALERA- 313 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y-~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~- 313 (347) |+|++++|++|++.+.--....-.-..+.+ ++. ...+.. ..+.-+.=.++.|++|+....-.+ .+..+ T Consensus 202 ~~V~~iDgv~Ii~VPs~r~~t~~~f~dG~t---~~~------~~gg~~~~~~ak~INfiiv~~~a~i~~~K~~-~~~if~ 271 (312) T protein:vir:10 202 TQVPSIDGCALIKTPQNRMYSSILLNDGTT---SNQ------TAGGYLKGTKALDTNFIIAPVDVPLAITKQD-KMRIFD 271 (312) T ss_pred eeeeeecccEEEEchhhhccceeeeccCcc---ccc------ccCceeecCcccccceEEeCCceeeceeeee-eeeeeC Confidence 999999999999854322211111000000 000 000100 111123336899999775554444 23333 Q ss_pred echhhh--cceeeeeeeecccccccceEEE-EEEcCC Q lcl|NC_015249. 314 RRANFQ--ADQIIAKYAMGHGGLRPEACGA-LVFNKA 347 (347) Q Consensus 314 ~d~~~~--~d~i~~~~a~G~~~~Rpe~a~~-i~~~~a 347 (347) .+.... 
+|.+.-+.-+..-++.....++ +-.+.| T Consensus 272 P~~~~~~d~~~~~~R~Y~D~fv~~nk~~~Iyv~~k~a 308 (312) T protein:vir:10 272 PETNQTANAWSMDYRRYHDLWVTDNKANSVYANFKDA 308 (312) T ss_pred CCCCCCcceeeeeeeeeeeeeeeccccCeEEEEeecc Confidence 233344 4565555555554554433333 444444 No 54 >protein:vir:79712 Length: 285 # NCBI annotation: major capsid protein gp34 # Family: family:all:701 # MgeID: mge:1873 # MgeName: LL-H # Cross-refs: genbank:acc:YP_001285883;genbank:gi:148750840;genbank:GeneID:5220414 Probab=99.87 E-value=4.1e-24 Score=148.80 Aligned_cols=264 Identities=11% Similarity=0.035 Sum_probs=175.1 Q ss_pred hhh-hhhhhhhHHHHHHHHHHhhhccccc-----ccccccceEEEeecC-cceeeeeecCCCCCCccCCCCCceEEEEEE Q lcl|NC_015249. 24 LAL-FLKVFGGEVLTAFTRTSVTMNKHLV-----RSIQSGKSAQFPVLG-RTKAAYLQPGENLDDKRKDMKHTERTINID 96 (347) Q Consensus 24 ~al-~ie~f~g~V~~~f~~~s~~~~~~~~-----r~i~~G~tv~i~~iG-~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID 96 (347) -++ +.++|...+++.|...+++..+... ....||++|+||++. ...+.+|+++... ...+++.+..+++++ T Consensus 1 Main~~~k~~~~ld~~~~~~~~~~~l~~~~n~~~~~~~gak~VkIp~ist~~gl~dY~R~~g~--~~g~v~~~~et~tl~ 78 (285) T protein:vir:79 1 MTVVLDSKDLARIDEEYKADSQVWSYLTGGNGVTQRFRGHNEVRINKLSGFVDATAYKRGQDN--ARKTISVGKETVKLT 78 (285) T ss_pred CcchhhHHHHHHHHHHHHHhhhhhhhcccCCcceeEecCCCEEEEeeecccccccccccccCc--cccccceeeeEEEee Confidence 112 3478999999999888776655432 345789999999996 4678888887654 346788899999999 Q ss_pred eeeecccccccHHHHHhChhhHHHHHHH-HHHHHHHHHHHHHHHHHHHHhhhccccccccccccCcceeecccccccccc Q lcl|NC_015249. 97 GLLTADVLIYDIEDAMNHYDVRSEYTAQ-LGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGD 175 (347) Q Consensus 97 ~~~~~~~~Idd~D~~q~~~D~r~~~~~~-~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~ 175 (347) +.+++.|.||.+|..++..=....++.+ .-....-.+|.+.+..++..+ +.. .. 
T Consensus 79 ~DR~~~f~iD~mDvdEn~~~~~~ni~~ef~~~~vvPEiDayrfskla~~a---------------~~~----------~~ 133 (285) T protein:vir:79 79 HEDWFGYDLDQFDMDENGAYTVENVVREHNKMITIPHRDKVAVQKLFDSA---------------AKK----------AT 133 (285) T ss_pred ccccceecccccchhhhhhhhHHHHHHHHHhhhhcchhhHHHHHHHHhhc---------------ccc----------cc Confidence 9999999999666655321123333333 233344566666555443211 000 01 Q ss_pred hhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccc---cccceEEEEec-eEEEEe-- Q lcl|NC_015249. 176 QVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALID---PSTGSIRNVMG-FEVIEV-- 249 (347) Q Consensus 176 ~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~---~~~G~Vg~i~G-~~V~~s-- 249 (347) .+.+++++|+.|.++.++|+|..|| .+||++++|++|.+|++++.|...-..+... -.+++|+.++| ++|++. T Consensus 134 ~~~T~~nv~~~i~~~~~~lde~~vp-~~rvl~vTp~~~~~Lk~s~~~~r~~~~~~~~~~~~i~~~V~~lDg~v~ii~Vps 212 (285) T protein:vir:79 134 DSITKDNALDAYDTAEAYMFDNEVP-GGFVMFVSSAYYTALKQSAAVTRTFSTDGTMVINGIDRRVAQLDGGVPIVRVSS 212 (285) T ss_pred cccCHHHHHHHHHHHHHHHHHcCCC-CceEEEEChHHHHHHHhhhhhheecccccceeccceeeeeccccceeEEEEcch Confidence 1234778999999999999999999 6999999999999999999887543222221 24678999999 899984 Q ss_pred cceecccccccccccccccccccccccccccccccccc-cceEEEEechhhhhhhhhcceeeeeeechh--hhc--ceee Q lcl|NC_015249. 250 PHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVAL-DNVVGLFNHRSAVGTVKLKDMALERARRAN--FQA--DQII 324 (347) Q Consensus 250 n~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~-~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~--~~~--d~i~ 324 (347) +++... ++ .+.=.++.||+|+......+ .+.. ++|. ..+ |.+. T Consensus 213 ~r~kt~------------------------------~~~k~Infiiv~~~a~i~~~K~~-~~~~-f~P~~~~~~d~~~~~ 260 (285) T protein:vir:79 213 DRLKGL------------------------------GITNHVNFILTPLSAIAPIVKYD-SVSV-IDPSTDRSGNRWTIK 260 (285) T ss_pred hhccCc------------------------------CcchhccEEEecCceeccceeee-eeEe-ECCCCCCCcceeeee Confidence 444210 01 12335889999876665544 2332 3443 444 5666 Q ss_pred eeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
325 AKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 325 ~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+.-+..-++.....++.+-.+| T Consensus 261 ~R~Y~d~fv~~nk~~~Iy~~~~a 283 (285) T protein:vir:79 261 GLSYYDAIVLDNAKKGIYVAATA 283 (285) T ss_pred eeeeeeeeehhhccceeeeeecc Confidence 65555666666666666666666 No 55 >protein:vir:99523 Length: 311 # NCBI annotation: putative protein # Family: family:all:701 # MgeID: mge:1559 # MgeName: Lj928 # Cross-refs: genbank:acc:NP_958538;genbank:gi:41179320;genbank:GeneID:2717161 Probab=99.87 E-value=1e-23 Score=146.65 Aligned_cols=295 Identities=10% Similarity=0.046 Sum_probs=181.7 Q ss_pred cccccchhhh-hhhhhhhHHHHHHHHHHhhhcccccc-cc-cccceEEEeecCcceeeeeecCCCCCCccCCCCCceEEE Q lcl|NC_015249. 17 GMSAGDKLAL-FLKVFGGEVLTAFTRTSVTMNKHLVR-SI-QSGKSAQFPVLGRTKAAYLQPGENLDDKRKDMKHTERTI 93 (347) Q Consensus 17 ~~~~~d~~al-~ie~f~g~V~~~f~~~s~~~~~~~~r-~i-~~G~tv~i~~iG~~~~~~~~~g~~~~~~~~~~~~~~~~l 93 (347) -.-.+|..|| |.++|..++++.|...+++-.+.... .+ .|||+|+||++....+.+|++++.. ...+++.+..++ T Consensus 1 ~~~~an~mAlnya~~~~~~Ld~~~~~~~~t~~l~~~~~~~~~Gak~VkIp~i~~~gl~dY~R~~g~--~~g~v~~~~et~ 78 (311) T protein:vir:99 1 MPTDAETRGFNYVTKDGNLLDQKITAGLFTAALGTPEVDLVNGGRSFTLKTISTSGLKDHTRGKGF--NSGTISDEKTIY 78 (311) T ss_pred CCCcchhhHHHHHHHHHHHHHHHHHhhhcccceecCchheeecCCEEEEEeeeeccccccccccCc--cccceeeeeeEE Confidence 2233455555 78999999999999887765554332 23 5899999999999999999998754 346788899999 Q ss_pred EEEeeeecccccc--cHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccCcceeecccccc Q lcl|NC_015249. 94 NIDGLLTADVLIY--DIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSE 171 (347) Q Consensus 94 ~ID~~~~~~~~Id--d~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~ 171 (347) ++++.+++.|.|| |+||.......-.-..+.......-.+|.+.+..|+..+...... ..++.. ..+. 
T Consensus 79 tl~~DR~~~f~vD~mDvdETn~~~~~ani~~~f~r~~vvPEiDayrfskla~~a~~~~~~------~~~~~~----~~~~ 148 (311) T protein:vir:99 79 TMGQDRDVEFYLDRQDVDETDNELAMANISNVFITEHVQPELDSYRFSKIATSFDNLDGT------DTEGTL----LAKT 148 (311) T ss_pred EeeeccceeeecchhchhhhhhhhHHHHHHHHHHHhhhcchhhHHHHHHHHhhhhccccc------ccchhh----hccc Confidence 9999999999999 555544333322223333444455677877766554332111110 001100 0111 Q ss_pred cccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhh-hh--ccccccccceEEEEeceEEEE Q lcl|NC_015249. 172 LRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAA-NY--QALIDPSTGSIRNVMGFEVIE 248 (347) Q Consensus 172 ~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~-~~--~~~~~~~~G~Vg~i~G~~V~~ 248 (347) .........+++++.|..+..++++ ||.++|+++++|++|.+|.+++.|... +- .+.. ..+++|++++|++|++ T Consensus 149 ~~~~~~lt~~nvl~~l~~~~~~~~~--v~~~~rvl~vTp~~~~lLk~~~~~~r~~~~~~~~~~-~i~~~V~~lDgv~Ii~ 225 (311) T protein:vir:99 149 HKTEETLDETNAYSQLKTGIGKVRK--YGTQNLVGYVSSEVMDALERSKEFTRNITNQNVGTT-ALESRITSIDGVQLIE 225 (311) T ss_pred cccccccCHHHHHHHHHHHHHHHHh--cCCCCeEEEEChHHHHHHhhchhhheeeeccccccc-ccccccceecCeEEEE Confidence 2223456678899999999999987 789999999999999988888777632 11 2223 3488899999999996 Q ss_pred e---cceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh--hh--cc Q lcl|NC_015249. 249 V---PHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN--FQ--AD 321 (347) Q Consensus 249 s---n~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~--~~--~d 321 (347) + +++...-.-+ .+..... ..-+.=.++.||+|+....-.+ .+.. ++|. .. +| T Consensus 226 V~ps~r~~t~~~ft--~G~~~~~-----------------~ak~INfiiv~~~a~i~~~K~~-~v~~-f~P~~~~~gd~~ 284 (311) T protein:vir:99 226 VYESNRFMTKYDFT--DGAKPTE-----------------DAKAINFLVVAKPAVISIVKEN-AVFL-FAPGQHTDGDGY 284 (311) T ss_pred ecCchhhcchhhhc--CCccccC-----------------cccccceEEeCCCeeeeeeeee-eeee-eCCCCCCCccee Confidence 5 4454211000 0000000 0112336789999875554443 2332 3444 33 55 Q ss_pred eeeeeeeecccccccceEEE-EEEcCC Q lcl|NC_015249. 
322 QIIAKYAMGHGGLRPEACGA-LVFNKA 347 (347) Q Consensus 322 ~i~~~~a~G~~~~Rpe~a~~-i~~~~a 347 (347) .+..+.-+..-++.....++ +-.+.| T Consensus 285 l~~~R~Y~D~fv~~nk~~~Iyv~~k~A 311 (311) T protein:vir:99 285 LYQNRLYHDLFIKKHKRDGIFVSVKKA 311 (311) T ss_pred eeeeeeeeeeeeeccccCeEEEeeecC Confidence 65555555555554433333 445666 No 56 >protein:vir:95451 Length: 313 # NCBI annotation: hypothetical protein ORF044 # Family: family:all:11728 # MgeID: mge:1570 # MgeName: PA11 # Cross-refs: genbank:acc:YP_001294637;genbank:gi:149408203;genbank:GeneID:5237018 Probab=99.85 E-value=3.5e-24 Score=149.21 Aligned_cols=300 Identities=14% Similarity=0.123 Sum_probs=216.2 Q ss_pred cccccchhhhhh-hhhhhHHHHHHHHHHhhhcccc-cccccccceEEEeecCcceeeeeecCCCCCCccCCCCCceEEEE Q lcl|NC_015249. 17 GMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHL-VRSIQSGKSAQFPVLGRTKAAYLQPGENLDDKRKDMKHTERTIN 94 (347) Q Consensus 17 ~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~-~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~~~~~~~~~~~~~l~ 94 (347) -+..++.+|+.. |+|+.+++-..+.+-+-..+-+ +-+.-+|++.+|+.+|.++++.....+ |-..+++++.|.++. T Consensus 1 ~~~TSNT~A~I~SE~~s~~I~~~LH~~LL~~~~~R~V~DF~~G~~L~I~tiGs~~~~~~~E~~--~~~~~~i~TGEIt~~ 78 (313) T protein:vir:95 1 MQLTSNTRAFIESEQYSKFILLNLHDGLLPETFYRNVSDFGSGETLHIKTIGSVTLQEAEEDT--PLIYNPIETGEITFQ 78 (313) T ss_pred CcccccchheehhhhHHHHHHHHhhccccchhhhhhhccCCCCCEEEecccCceeeeccccCC--CeeecccccceEEEE Confidence 224455566555 9999999877765543333333 445667999999999999998655444 346789999999999 Q ss_pred EEeeeeccccc-ccHHHHHhChh-hHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccCcceeeccccccc Q lcl|NC_015249. 
95 IDGLLTADVLI-YDIEDAMNHYD-VRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSEL 172 (347) Q Consensus 95 ID~~~~~~~~I-dd~D~~q~~~D-~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~ 172 (347) |.+++--+++| +|+-+.-..+| ++.+...|.++|+.+.+...+|..-+ .+....+....+.|++.-.+ ++.++ T Consensus 79 i~~Y~G~A~~vt~~LR~D~~~I~~~~A~~~AE~~RAI~E~~~TD~L~~G~-~~FA~~~~P~~vNG~PH~~V---~~~T~- 153 (313) T protein:vir:95 79 ITEYKGDAWYVTDDLREDGTDIDRLMAERAAESTRAIQETFETDFLKTGA-EYFAANPGPHNVNGFPHVIV---SAETN- 153 (313) T ss_pred EEeecCChhhhhhhhhhcchhHHHHhhhcchhhHHHHHHHHhhHHHhhch-hhhccCCCCcccccccceEE---eccCC- Confidence 99988888778 78888888888 99999999999999999877764321 12223333444555555432 22221 Q ss_pred ccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhh-hhhcc------ccccccceEEEEeceE Q lcl|NC_015249. 173 RGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNA-ANYQA------LIDPSTGSIRNVMGFE 245 (347) Q Consensus 173 ~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~-~~~~~------~~~~~~G~Vg~i~G~~ 245 (347) +...+..|..++-.+++.++|.+||+.+++|....-|-.-....+ ....+ ....-..+|.+++|++ T Consensus 154 -------~~~~~~~~~~~~~~~~~a~~P~~G~v~IvDP~~~~~L~~l~~It~~vt~~~k~I~ESG~A~~~~Fi~~~YG~D 226 (313) T protein:vir:95 154 -------GVFALKHLIAMRLAFDKANVPAEGRVFIVDPVAEATLNGLVTITHDVTDFGKMILESGMARGQRFIMNLYGWD 226 (313) T ss_pred -------ceehhhHHHHhhhhhhhccCCccceEEEEcchhhhhhhhhheeecccccccceeeeccCCchhHHHHHHhhhh Confidence 222355677888999999999999999999998887754333332 11111 1223345678999999 Q ss_pred EEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhhcceeee Q lcl|NC_015249. 246 VIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQADQIIA 325 (347) Q Consensus 246 V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~ 325 (347) ++.||.|...+.+.. +......-.+...|.-.+.++..+++|.+. +++|.++|+.+-.+.-.. 
T Consensus 227 i~~SN~L~~AN~~D~--------~tT~~G~~~NlFM~i~D~~~~P~~~AWr~M---------P~s~~~~~~~~~~~~~~~ 289 (313) T protein:vir:95 227 ILTSNRLHVANYNDG--------TTTGNGYVGNLFMCILDDQTKPIMGAWRRM---------PKSEGERNKDRARDEHVV 289 (313) T ss_pred hhhhhhhhhcccccc--------ccccCceeeeeeeeeecccccceeeeeccc---------ccccccccccccccccee Confidence 999999986554421 111112234566788889999999999888 799999999999999999 Q ss_pred eeeecccccccceEEEEEEcCC Q lcl|NC_015249. 326 KYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 326 ~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++||.|+.|-|..|.+.+.+- T Consensus 290 ~~R~G~Gi~R~~~L~~~~~~A~ 311 (313) T protein:vir:95 290 RCRYGFGIQRLDTLGLLATSAT 311 (313) T ss_pred eeeecccceeecceeEEEeccc Confidence 9999999999999998887766 No 57 >protein:vir:78090 Length: 302 # NCBI annotation: Cps # Family: family:all:701 # MgeID: mge:1844 # MgeName: P35 # Cross-refs: genbank:acc:YP_001468790;genbank:gi:157325371;genbank:GeneID:5601852 Probab=99.79 E-value=1.5e-20 Score=129.22 Aligned_cols=283 Identities=12% Similarity=0.076 Sum_probs=175.3 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc---cccccceEEEeecC-----cceeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR---SIQSGKSAQFPVLG-----RTKAAY 72 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r---~i~~G~tv~i~~iG-----~~~~~~ 72 (347) |||. . =|.++|.+++++.|...+++..+.... .+.|||+|+||++- .+-+++ T Consensus 1 Mant-------------------l-~ya~~~~~~Ld~~~~~~~~t~~l~~~~~~v~~~Gak~vkIp~is~~~~~TsGl~d 60 (302) T protein:vir:78 1 MANS-------------------L-ALAQIYQDNIDKAIAVNSKSAFLEANPNNVQYNGGNTIKIADISFGSGTTGDLKA 60 (302) T ss_pred CCch-------------------h-HHHHHHHHHHHHHHHhhhceeecccCCceEEEecCcEEEEEEEEeeccccccccc Confidence 5532 2 166899999999999998877663321 36789999999995 456778 Q ss_pred eecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhh--HHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 
73 LQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDV--RSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 73 ~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~--r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) |++++... ..+++.+..++++++.+++.|.||-+|..+++.-+ -.-..+.......-++|.+.+..|+..+... T Consensus 61 y~R~~g~~--~g~v~~~~et~tlt~DR~~~f~vD~mDvdETn~~~~~ani~~ef~r~~vvPEiDayrfskla~~a~~~-- 136 (302) T protein:vir:78 61 YNRSTGFT--QGSVTLAWSDYTLDYDLAQSFQIDAMDVDETKNLATVGNVLSEYQRTKIVPAIDKYRFTKLANDGTGV-- 136 (302) T ss_pred cccccCcc--ccceeeeeeeEEeeeccceeeeccccchhhhhhhhHHHHHHHHHHHhhhcchhhHHHHHHHHHhhhcc-- Confidence 88876532 34577888999999999999999955555544332 2333333455556777777665554321100 Q ss_pred cccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhh---hh Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAA---NY 227 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~---~~ 227 (347) +.. ... .....+++++++.|..+.++++|+ ++|+++|+|+.+.+|.+++.+... .. T Consensus 137 ----------~~~------~~~-~~~~~t~~nvl~~i~~~~~~~~e~----~~~vl~vtp~~~~~Lk~a~~~~~~~~~~~ 195 (302) T protein:vir:78 137 ----------GGV------IDL-SKPDASAQALMGDIATAMELVDDS----NQLILVTSPTTLAGLLNTALIRESKNTQV 195 (302) T ss_pred ----------Ccc------ccc-cccchhHHHHHHHHHHHHHHhhcc----CCeEEEEChHHHHHHhcchhhccceeccc Confidence 000 000 112345788999999999999996 599999999999999887766532 22 Q ss_pred ccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcc Q lcl|NC_015249. 228 QALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKD 307 (347) Q Consensus 228 ~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~ 307 (347) .+.+ -.+++|++++|++|++.+.=-..+.- +.++. + .. 
..+.-+.=.++.|+.|+....-.+ T Consensus 196 ~~~~-~i~~~V~~lDgv~Ii~VPs~r~~t~~--------~f~~G---~-----~~-~~~ak~INfiiv~~~a~ia~~K~~ 257 (302) T protein:vir:78 196 LRRG-EVDTKITFIQDVEVLQVPSEYLYDKV--------APKVG---V-----PD-YTGAKKIPYMIFKRDAPTGIVKTD 257 (302) T ss_pred cccc-cccceeeeecccEEEEchhhhcccce--------eccCC---c-----cc-cCCccceeEEEECCCeeeeeeeee Confidence 2222 34889999999999985432222111 11000 0 00 011123346899999775554444 Q ss_pred eeeeee-echhhhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 308 MALERA-RRANFQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 308 ~~~e~~-~d~~~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+..+ .++...+| .+..+.-...-++.....++++-.++ T Consensus 258 -~~~if~P~~~~~gd~~l~~~R~Y~D~fV~~nk~~gI~~~~~~ 299 (302) T protein:vir:78 258 -KVRVFEPDTNQSADAYKVDLRLYHDLIVPKNQRPGIIKASFG 299 (302) T ss_pred -eeEeeCCCCCCCcceeeeeeeeEeeeeeeccccCeEEEeecc Confidence 23333 44456765 55555555555555555555554444 No 58 >protein:vir:2106 Length: 430 # NCBI annotation: coat protein # Family: family:all:1412 # MgeID: mge:46 # MgeName: P22 # Cross-refs: genbank:acc:NP_059630;genbank:gi:9635538;genbank:GeneID:1262831 Probab=99.67 E-value=1.3e-17 Score=113.07 Aligned_cols=302 Identities=16% Similarity=0.134 Sum_probs=183.3 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcc---cccccc---cccceEEEeecCcceeeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNK---HLVRSI---QSGKSAQFPVLGRTKAAYLQ 74 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~---~~~r~i---~~G~tv~i~~iG~~~~~~~~ 74 (347) ||+.-. + +++.=-.|++..|+...++... ++.... +.|+++.+|.--..... 
T Consensus 1 Ma~~~~--~-----------------~lti~~~eal~~~~n~lV~a~~~~~~r~~d~~~~r~Gdti~ip~p~~~~~~--- 58 (430) T protein:vir:21 1 MALNEG--Q-----------------IVTLAVDEIIETISAITPMAQKAKKYTPPAASMQRSSNTIWMPVEQESPTQ--- 58 (430) T ss_pred Cccccc--h-----------------hhHHHHHHHHHHhhhhhhhhhhhhccCCchhhhhcccceEEeecccccccc--- Confidence 666521 1 2221117888889888777653 333333 56999988854333322 Q ss_pred cCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 75 PGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 75 ~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) .|.++.+...++....+.++||+.+-..|.+. .+| +...|....+.+.+.++||..+|..++..++.-..+.... T Consensus 59 ~G~~~t~~~~~~~e~~v~~~~~~~~~V~~~~~-~kE-l~~~~~~er~l~pAm~~LA~~Vd~dl~~~~~~~~~~v~~~--- 133 (430) T protein:vir:21 59 EGWDLTDKATGLLELNVAVNMGEPDNDFFQLR-ADD-LRDETAYRRRIQSAARKLANNVELKVANMAAEMGSLVITS--- 133 (430) T ss_pred ccccccCCCccceeeeEeEEEeeeccceEEee-hhH-hcChhhHHHHHHHHHHHHHHHHHHHHHHHhhhhhhccccc--- Confidence 24444444445666788999999998888886 333 5677788899999999999999999887654322111000 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCC-CCEEEeCHHHHHHHhcc-hhhhhhhhccccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSA-DRVFYTTPDNYSAILAA-LMPNAANYQALID 232 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~-gR~~vv~P~~~~~Ll~~-~~~~~~~~~~~~~ 232 (347) ..+ ..... ...++.+.++++.|++..||.+ +|.++++|+.+..|... .++...+-..... 
T Consensus 134 ----~~~------t~~~~--------~~~~~~~A~a~~~L~~~~vP~~~~R~~~~~p~~~~~l~~~l~~~~~~~~~~~~A 195 (430) T protein:vir:21 134 ----PDA------IGTNT--------ADAWNFVADAEEIMFSRELNRDMGTSYFFNPQDYKKAGYDLTKRDIFGRIPEEA 195 (430) T ss_pred ----cCC------CCCCC--------CcchhhHHHHHHHHHHhcCCCCCCcEEEeChHHHHHHhhhhccccccccchhHH Confidence 000 00000 1135678889999999999995 79999999999988653 3444444445667 Q ss_pred cccceEEE-EeceE-EEEecceecccccccccc-----------------------------------cccccccccccc Q lcl|NC_015249. 233 PSTGSIRN-VMGFE-VIEVPHLTAGGAGEDRPE-----------------------------------EGANPTGQKHAF 275 (347) Q Consensus 233 ~~~G~Vg~-i~G~~-V~~sn~lp~~~~~~~~~~-----------------------------------~~~~~~~~~~~~ 275 (347) +++|.|++ +.||+ +|+++++|....++.... .++...|..+.+ T Consensus 196 ~r~g~i~r~~~Gfd~~~~s~~~~~~t~gt~t~~tv~gA~~~~~~~~tv~~~g~~~~~d~~~~~it~s~tg~l~~GD~fti 275 (430) T protein:vir:21 196 YRDGTIQRQVAGFDDVLRSPKLPVLTKSTATGITVSGAQSFKPVAWQLDNDGNKVNVDNRFATVTLSATTGMKRGDKISF 275 (430) T ss_pred HhhcccccccchhhhhhhcCCcccccCccCcCceeccccccccccceeccccccccccccceeeeeecccceecccEEEe Confidence 89999996 99996 789999997433221100 000111111111 Q ss_pred ccc----------------------ccc----------------------cc-----c---ccccc-------eEEEEec Q lcl|NC_015249. 276 PET----------------------SSG----------------------DT-----R---VALDN-------VVGLFNH 296 (347) Q Consensus 276 ~~~----------------------~~~----------------------~y-----~---~~~~~-------~~~l~~~ 296 (347) +.. .++ .| . +..-. ..-|+|| T Consensus 276 aGV~~v~~itk~~~~~l~qf~V~a~~~~ttv~I~Pai~~~~~~~~~~~~~~y~nVsaspa~~aavT~v~~a~~~~Nl~fh 355 (430) T protein:vir:21 276 AGVKFLGQMAKNVLAQDATFSVVRVVDGTHVEITPKPVALDDVSLSPEQRAYANVNTSLADAMAVNILNVKDARTNVFWA 355 (430) T ss_pred cceeeeccccccccCCcceEEEEEecCCceeEEeecccccccccccccccccceeccccccCceeEEeccCCcccceeEc Confidence 110 000 00 0 00000 1238999 Q ss_pred hhhhhhhhhc---------------------ceeeeee--echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
297 RSAVGTVKLK---------------------DMALERA--RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 297 ~~Av~~v~~~---------------------~~~~e~~--~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+|+..+... .+.+... ||....-..++.-.-||...+|||.++++...++ T Consensus 356 ~~A~~La~~pl~~p~~~~~~~~~~~~~~~~~Glsirv~~~yd~~~~~~~~r~DilyG~~~l~Pe~a~v~l~g~~ 429 (430) T protein:vir:21 356 DDAIRIVSQPIPANHELFAGMKTTSFSIPDVGLNGIFATQGDISTLSGLCRIALWYGVNATRPEAIGVGLPGQT 429 (430) T ss_pred cceeEEEEecccCCCChhHhhheeeeeccccceEEEEEEccccccCceEEEEEeecCccccCcceEEEEcCCCC Confidence 9977544221 1222222 5555555667777889999999999988887777 No 59 >protein:vir:41 Length: 299 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:2 # MgeName: A118 # Cross-refs: genbank:acc:NP_463467;swissprot:trembl:q9t1b7;genbank:gi:16798789;uniprot:Q9T1B7;genbank:GeneID:922353 Probab=99.65 E-value=3.2e-17 Score=111.00 Aligned_cols=282 Identities=13% Similarity=0.095 Sum_probs=179.0 Q ss_pred ccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCCCccCCCCCc Q lcl|NC_015249. 10 IGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLDDKRKDMKHT 89 (347) Q Consensus 10 ~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~~~~~~~~~~ 89 (347) ++.++-....+++.-.+.-+++..++.+..++.|+++.+.++..+. +.+.++++.....+..+..|+.++.+ +++-+ T Consensus 1 ~g~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~~~~~~~~-~~~~~~~~~~~~~a~~v~E~~~~~~~--~~~f~ 77 (299) T protein:vir:41 1 MGFNPDTTTMQSAKTGSIPINISEQIITGVKNGSAAMKLAKAVPMT-KPEEEFTFMSGVGAFWVDEAERIQTS--KPTFT 77 (299) T ss_pred CCcCCCcccccCCCceecchhHHHHHHHHHHhcchhhhhceeeecC-CCcEEEEEEcCCceeeeecCcccccc--cccee Confidence 3333333333333334667999999999999999999998887764 56778888887888888888887653 46677 Q ss_pred eEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccCcceeecccc Q lcl|NC_015249. 
90 ERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQ 169 (347) Q Consensus 90 ~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~ 169 (347) ++++...+. +....|.+-=-.++..|+.+.+.++.++++++..|+.++.-- ....+. |........ T Consensus 78 ~v~l~~~k~-~~~~~is~ell~ds~~~~~~~i~~~l~~a~~~~~d~a~l~G~---------g~~~~~----gil~~~~~~ 143 (299) T protein:vir:41 78 KAKMRSKKM-GVIIPTTKENLNYSVTNFFSLMQAEIVEAFYKKFDQAVFTGV---------ESPYNW----NILKSATDA 143 (299) T ss_pred EEEEeeEEE-EEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHHhhcc---------cCcccc----ccccccccc Confidence 777777554 344455432222356889999999999999999999886321 000111 111001101 Q ss_pred cccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEEEEeceEEEEe Q lcl|NC_015249. 170 SELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIRNVMGFEVIEV 249 (347) Q Consensus 170 ~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~i~G~~V~~s 249 (347) .... ......++.|+++...|...+.+. -.++++|..|..|.+-. -.++.+........| .+++.|.+|+.+ T Consensus 144 ~~~~----~~~~~~~~~l~~~~~~l~~~~~~~--~~~v~n~~~~~~L~~lk-d~~G~~l~~~~~~~~-~~~l~G~PV~~~ 215 (299) T protein:vir:41 144 SNLV----EETANKYDDLNEAIGLIEAEDLEP--NGIATIRKQRVKYRSTK-DGNGMPIFNTATSNG-VDDVLGLPIAYT 215 (299) T ss_pred ceee----ccccccHHHHHHHHHhhhcccCCc--CEEEEcHHHHHHHHHhh-ccCCceeecCCcCCC-CceecceeeEEe Confidence 1110 111223677888888888888753 35799999999998533 233444433334443 368999999999 Q ss_pred cceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh------------ Q lcl|NC_015249. 250 PHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN------------ 317 (347) Q Consensus 250 n~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~------------ 317 (347) +++|..... ..-+-+||++.. 
+ +..+++++|..++.- T Consensus 216 ~~~~~~~~~---------------------~~~~~gdfs~~~-i---------~~~~~~~i~~~~~~~~~~~~~~~~~~~ 264 (299) T protein:vir:41 216 PKYTFGDKD---------------------ISELVGDWNQAY-Y---------GILRGVEYEILTEATLTTVADETGKPL 264 (299) T ss_pred cccCCCCCc---------------------eEEEEEecccEE-E---------EEecCcEEEEeecccccccccccccch Confidence 999842210 112334554432 2 222335556554432 Q ss_pred --hhcce--eeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 318 --FQADQ--IIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 --~~~d~--i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++.+. ++...++|..+++|++.+.|..++| T Consensus 265 ~~~~~~~~~~r~~~~~d~~v~~~~A~~~l~~~aa 298 (299) T protein:vir:41 265 NLAERDMAAIKATFEVGFMVVKDEAFSAVQPKAG 298 (299) T ss_pred hhhhcCcEEEEEEEEeccEEecccceEEEEeccC Confidence 34444 4666788999999999999988888 No 60 >protein:vir:9265 Length: 430 # NCBI annotation: 5 # Family: family:all:1412 # MgeID: mge:164 # MgeName: ST64T # Cross-refs: genbank:acc:NP_720329;genbank:gi:24371587;genbank:GeneID:955820 Probab=99.65 E-value=1.7e-17 Score=112.49 Aligned_cols=302 Identities=14% Similarity=0.114 Sum_probs=183.7 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccc---ccccc---cccceEEEeecCcceeeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKH---LVRSI---QSGKSAQFPVLGRTKAAYLQ 74 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~---~~r~i---~~G~tv~i~~iG~~~~~~~~ 74 (347) |||.-+ -.++.=..|.+..|+...++...+ +..+. +.|+++.+|.--..... T Consensus 1 MAn~l~-------------------~~~~ii~~eal~~l~n~~v~a~~~~~~r~~d~~~~r~Gdti~~p~~~~~~~~--- 58 (430) T protein:vir:92 1 MALNEG-------------------QIVTLAVDEIIETISAITPMAQKAKKYTPPAASMQRSSNTIWMPVEQESPTQ--- 58 (430) T ss_pred Cccchh-------------------hHHHHHHHHHHHHHhhhhhhhhhhcccCCchhhhhcccceEEeccccccccc--- Confidence 665421 233455567778887777666432 32222 46999988865444433 Q ss_pred cCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 
75 PGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 75 ~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) .|.++.+...++....+.++||+.+-..|.+.+-+ +...+....+.+.+.++||..+|..++..++.-..+. T Consensus 59 ~G~~~t~~~~~i~e~~v~~~v~~~k~V~~~~~~ke--l~~~~~~~~~i~~Am~~LA~~Vd~dl~~~~~~~~~~v------ 130 (430) T protein:vir:92 59 EGWDLTDKATGLLELNVAVNMGEPDNDFFQLRADD--LRDETAYRHRIQSAARKLANNVELKVANMAAEMGSLV------ 130 (430) T ss_pred cCcccCCCCCccccceEEEEEeeeccceEEechhH--hcChhHHHHHhHHHHHHHHHHHHHHHHHHhhhccccc------ Confidence 25554444445656788999999999999998654 5777778888899999999999998876643221111 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCC-CCEEEeCHHHHHHHhcc-hhhhhhhhccccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSA-DRVFYTTPDNYSAILAA-LMPNAANYQALID 232 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~-gR~~vv~P~~~~~Ll~~-~~~~~~~~~~~~~ 232 (347) +....+.. .. +...++.+..+.+.|++..||.+ +|.++++|+.+..|... .++-..+-..... T Consensus 131 ---------~~~~~~t~---~~---~~~~~~~~A~a~~~L~~~~vP~~~~R~~vldp~~~~~l~~~l~~l~~~~~~~~~A 195 (430) T protein:vir:92 131 ---------ITSPDAIG---TN---TADAWNFVADAEELMFSRELNRDMGTSYFFNPQDYKKAGYDLTKRDIFGRIPEEA 195 (430) T ss_pred ---------ccccccCC---Cc---CCcchhhHHHHHHHHHHhcCCCCCCcEEEeChHHHHHHHhhhccccccccchhHH Confidence 00000000 00 11135667889999999999995 89999999999998643 2333333345566 Q ss_pred cccceEEE-EeceE-EEEecceecccccccccc-----------------------------------cccccccccccc Q lcl|NC_015249. 233 PSTGSIRN-VMGFE-VIEVPHLTAGGAGEDRPE-----------------------------------EGANPTGQKHAF 275 (347) Q Consensus 233 ~~~G~Vg~-i~G~~-V~~sn~lp~~~~~~~~~~-----------------------------------~~~~~~~~~~~~ 275 (347) +++|.|++ +.||+ +|+++++|....++.... 
.++...|..+.+ T Consensus 196 ~r~g~i~~~~~Gfd~~~~~~~~~~~t~g~~t~~tv~gA~~~~~~~~~v~~~g~~~~~d~~~~tit~s~tg~l~~GD~fti 275 (430) T protein:vir:92 196 YRDGTIQRQVAGFDDVLRSPKLPVLTKSTATGITVSGAQSFKPVAWQLDNDGNKVNVDNRFATVTLSATTGLKRGDKISF 275 (430) T ss_pred HhhccccccchhhhhhhhcCCcccccCccCcCceeccccccccccceecccccccccccccceeeeecccceecccEEEe Confidence 89999996 99995 789999997443221100 000111111111 Q ss_pred ccc----------------c----------------------------cccc-----cccc----------cceEEEEec Q lcl|NC_015249. 276 PET----------------S----------------------------SGDT-----RVAL----------DNVVGLFNH 296 (347) Q Consensus 276 ~~~----------------~----------------------------~~~y-----~~~~----------~~~~~l~~~ 296 (347) ... . +..| .+.. .-..-|+|| T Consensus 276 aGV~~v~~~tkq~~~~l~~F~Vt~~~~atsv~I~paii~~~~~~~~~~~~~y~nVsaspa~~aavTvv~~a~~~~Nl~fh 355 (430) T protein:vir:92 276 TGVKFLGQMAKNVLAQDATFSVVRVVDGTHVEITPKPVALDDVSLSPEQRAYANVNTSLADAMAVNILNVKDARTNVFWA 355 (430) T ss_pred cceeeeccccccccCCccEEEEEEecCCceeEEeccccccccccccccccccceeccccccCceeEEeccCCcccceeEc Confidence 110 0 0000 0000 002348999 Q ss_pred hhhhhhhhhc---------------------ceeee--eeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 297 RSAVGTVKLK---------------------DMALE--RARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 297 ~~Av~~v~~~---------------------~~~~e--~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+|++.+... .+.+. 
.+||....-...+.-.-||.+.+|||.++++...++ T Consensus 356 r~A~aLa~~pL~~~~~~~~~~~~~~~~~~~~Glsirv~~~yd~~~~~~~~r~DvLyG~~~v~Pe~a~v~l~g~~ 429 (430) T protein:vir:92 356 DDAIRIVSQPIPANHELFAGMKTTSFSIPDVGLNGIFATQGDISTLSGLCRIALWYGVNATRPEAIGVGLPGQT 429 (430) T ss_pred ccceEEEEecccCCCCHHHhhhhheeccccceEEEEEEEecccccCceEEEEeeeccceecCcceEEEEcCCCC Confidence 9977543221 11222 236666556666777789999999999988887777 No 61 >protein:vir:100939 Length: 430 # NCBI annotation: Gp5 # Family: family:all:1412 # MgeID: mge:1509 # MgeName: ST104 # Cross-refs: genbank:acc:YP_006408;genbank:gi:46358700;genbank:GeneID:2777089 Probab=99.65 E-value=1.7e-17 Score=112.49 Aligned_cols=302 Identities=14% Similarity=0.114 Sum_probs=183.7 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccc---ccccc---cccceEEEeecCcceeeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKH---LVRSI---QSGKSAQFPVLGRTKAAYLQ 74 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~---~~r~i---~~G~tv~i~~iG~~~~~~~~ 74 (347) |||.-+ -.++.=..|.+..|+...++...+ +..+. +.|+++.+|.--..... T Consensus 1 MAn~l~-------------------~~~~ii~~eal~~l~n~~v~a~~~~~~r~~d~~~~r~Gdti~~p~~~~~~~~--- 58 (430) T protein:vir:10 1 MALNEG-------------------QIVTLAVDEIIETISAITPMAQKAKKYTPPAASMQRSSNTIWMPVEQESPTQ--- 58 (430) T ss_pred Cccchh-------------------hHHHHHHHHHHHHHhhhhhhhhhhcccCCchhhhhcccceEEeccccccccc--- Confidence 665421 233455567778887777666432 32222 46999988865444433 Q ss_pred cCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 75 PGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 75 ~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) .|.++.+...++....+.++||+.+-..|.+.+-+ +...+....+.+.+.++||..+|..++..++.-..+. 
T Consensus 59 ~G~~~t~~~~~i~e~~v~~~v~~~k~V~~~~~~ke--l~~~~~~~~~i~~Am~~LA~~Vd~dl~~~~~~~~~~v------ 130 (430) T protein:vir:10 59 EGWDLTDKATGLLELNVAVNMGEPDNDFFQLRADD--LRDETAYRHRIQSAARKLANNVELKVANMAAEMGSLV------ 130 (430) T ss_pred cCcccCCCCCccccceEEEEEeeeccceEEechhH--hcChhHHHHHhHHHHHHHHHHHHHHHHHHhhhccccc------ Confidence 25554444445656788999999999999998654 5777778888899999999999998876643221111 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCC-CCEEEeCHHHHHHHhcc-hhhhhhhhccccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSA-DRVFYTTPDNYSAILAA-LMPNAANYQALID 232 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~-gR~~vv~P~~~~~Ll~~-~~~~~~~~~~~~~ 232 (347) +....+.. .. +...++.+..+.+.|++..||.+ +|.++++|+.+..|... .++-..+-..... T Consensus 131 ---------~~~~~~t~---~~---~~~~~~~~A~a~~~L~~~~vP~~~~R~~vldp~~~~~l~~~l~~l~~~~~~~~~A 195 (430) T protein:vir:10 131 ---------ITSPDAIG---TN---TADAWNFVADAEELMFSRELNRDMGTSYFFNPQDYKKAGYDLTKRDIFGRIPEEA 195 (430) T ss_pred ---------ccccccCC---Cc---CCcchhhHHHHHHHHHHhcCCCCCCcEEEeChHHHHHHHhhhccccccccchhHH Confidence 00000000 00 11135667889999999999995 89999999999998643 2333333345566 Q ss_pred cccceEEE-EeceE-EEEecceecccccccccc-----------------------------------cccccccccccc Q lcl|NC_015249. 233 PSTGSIRN-VMGFE-VIEVPHLTAGGAGEDRPE-----------------------------------EGANPTGQKHAF 275 (347) Q Consensus 233 ~~~G~Vg~-i~G~~-V~~sn~lp~~~~~~~~~~-----------------------------------~~~~~~~~~~~~ 275 (347) +++|.|++ +.||+ +|+++++|....++.... .++...|..+.+ T Consensus 196 ~r~g~i~~~~~Gfd~~~~~~~~~~~t~g~~t~~tv~gA~~~~~~~~~v~~~g~~~~~d~~~~tit~s~tg~l~~GD~fti 275 (430) T protein:vir:10 196 YRDGTIQRQVAGFDDVLRSPKLPVLTKSTATGITVSGAQSFKPVAWQLDNDGNKVNVDNRFATVTLSATTGLKRGDKISF 275 (430) T ss_pred HhhccccccchhhhhhhhcCCcccccCccCcCceeccccccccccceecccccccccccccceeeeecccceecccEEEe Confidence 89999996 99995 789999997443221100 000111111111 Q ss_pred ccc----------------c----------------------------cccc-----cccc----------cceEEEEec Q lcl|NC_015249. 
276 PET----------------S----------------------------SGDT-----RVAL----------DNVVGLFNH 296 (347) Q Consensus 276 ~~~----------------~----------------------------~~~y-----~~~~----------~~~~~l~~~ 296 (347) ... . +..| .+.. .-..-|+|| T Consensus 276 aGV~~v~~~tkq~~~~l~~F~Vt~~~~atsv~I~paii~~~~~~~~~~~~~y~nVsaspa~~aavTvv~~a~~~~Nl~fh 355 (430) T protein:vir:10 276 TGVKFLGQMAKNVLAQDATFSVVRVVDGTHVEITPKPVALDDVSLSPEQRAYANVNTSLADAMAVNILNVKDARTNVFWA 355 (430) T ss_pred cceeeeccccccccCCccEEEEEEecCCceeEEeccccccccccccccccccceeccccccCceeEEeccCCcccceeEc Confidence 110 0 0000 0000 002348999 Q ss_pred hhhhhhhhhc---------------------ceeee--eeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 297 RSAVGTVKLK---------------------DMALE--RARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 297 ~~Av~~v~~~---------------------~~~~e--~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+|++.+... .+.+. .+||....-...+.-.-||.+.+|||.++++...++ T Consensus 356 r~A~aLa~~pL~~~~~~~~~~~~~~~~~~~~Glsirv~~~yd~~~~~~~~r~DvLyG~~~v~Pe~a~v~l~g~~ 429 (430) T protein:vir:10 356 DDAIRIVSQPIPANHELFAGMKTTSFSIPDVGLNGIFATQGDISTLSGLCRIALWYGVNATRPEAIGVGLPGQT 429 (430) T ss_pred ccceEEEEecccCCCCHHHhhhhheeccccceEEEEEEEecccccCceEEEEeeeccceecCcceEEEEcCCCC Confidence 9977543221 11222 236666556666777789999999999988887777 No 62 >protein:vir:78523 Length: 338 # NCBI annotation: Putative head structural protein # Family: family:all:507 # MgeID: mge:1853 # MgeName: U2 # Cross-refs: genbank:acc:YP_001491585;genbank:gi:157786408;genbank:GeneID:5625675 Probab=99.61 E-value=2.3e-16 Score=106.31 Aligned_cols=306 Identities=12% Similarity=0.069 Sum_probs=174.6 Q ss_pred CCccccc--ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec---------Ccce Q lcl|NC_015249. 1 MAKMNGG--QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL---------GRTK 69 (347) Q Consensus 1 ma~~~~~--~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i---------G~~~ 69 (347) ||+++-- +..+ ...+++..+..-+|+-++|..++.+.-++.|.++.+.++..+. +..+.+|+. 
|..+ T Consensus 1 ~~~~~e~~~~~~~-~~~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~l~~~~~~~-~~~~~ip~~~~~~~a~~v~~~~ 78 (338) T protein:vir:78 1 MATLNELAPNTAG-SNHQGRLAHVPSDLLPKEIVGPIFDKAQESSLVLRLGENIPIS-YGETIIPTTVKRPEVGQVGVGT 78 (338) T ss_pred CcchHHhhhhhcc-cccccceecccccccchHHHHHHHHHHHhhchhhhhcceeecc-CCceEEEEEecCccceeecccc Confidence 8877721 2222 2224444444555788999999999999999999998887765 567777764 3334 Q ss_pred eeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 70 AAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 70 ~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) +.....|+.++. .++.-.++++..-+. +....|.+-=-.++.+|+.+.+.++.++++++..|+.++..-.. T Consensus 79 ~~~~~Eg~~~~~--~~~~f~~v~l~~~k~-~~~~~is~ell~ds~~~~~~~i~~~la~a~~~~~d~~~l~G~g~------ 149 (338) T protein:vir:78 79 SNEQREGGTKPL--SGTAWDTRSVAPIKL-ATIVTVSEEFARMNPSGLYTKLQADLAYAIGRGIDLAVFHGKSP------ 149 (338) T ss_pred cccccccccccc--cccceeEEEEEEEEE-EEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhhcccCC------ Confidence 444445555543 234455555555433 23333433222346689999999999999999999988632100 Q ss_pred ccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhh--hhhh Q lcl|NC_015249. 150 ASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPN--AANY 227 (347) Q Consensus 150 ~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~--~~~~ 227 (347) .....+.+........... ............|+.|.++...+.. +.......++++|..|..|++...+. 
+..+ T Consensus 150 ~~~~~~~gi~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~m~~~~~~~L~~~~~l~d~~g~~ 225 (338) T protein:vir:78 150 LTGSALQGIDTNNVIVNTT---NVDYLQTGTTPLLDRFLDGYDLVSA-NTDVDFNGWAADPRYRARLLRSQAYRDANGNV 225 (338) T ss_pred Ccccccccccccccccccc---ccccccccchhhHHHHHHHHHHhhh-hccccceEEEEchHHHHHHHHHhhhccCCCce Confidence 0000011111100000000 1111122234567888877666643 33444557899999999997655443 3334 Q ss_pred ccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcc Q lcl|NC_015249. 228 QALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKD 307 (347) Q Consensus 228 ~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~ 307 (347) .-......|..+.++|++|+.++++|....... . .....|-+||+... + . ...+ T Consensus 226 l~~~~~~~~~~~~l~G~PV~~~~~ip~~~~~~~------~----------~~~~~~~gdfs~~~-~-~--------~~~~ 279 (338) T protein:vir:78 226 DPTRINLAASAGDLLGLPVQFGKAVGGDLGAAT------D----------SKVRVVGGDFSQLK-Y-G--------FADE 279 (338) T ss_pred eecccccCCCCceeeeeeEEEccccCccccccC------C----------cccEEEEEecceEE-E-E--------eecc Confidence 434445677778999999999999995432110 0 00112334554432 1 1 1223 Q ss_pred eeeeeeechh--------------hhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 308 MALERARRAN--------------FQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 308 ~~~e~~~d~~--------------~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++++.++.. ++. 
..++...++|..++||++.+.|+-..+ T Consensus 280 ~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 335 (338) T protein:vir:78 280 IRVKMSDTATLTDNTSPTPQTVSMWQTNQIAILIEVTFGWLLGDKQAFVKFVDDED 335 (338) T ss_pred cEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEeecccceEEEecccC Confidence 4555554321 222 345677889999999998776554444 No 63 >protein:vir:78223 Length: 333 # NCBI annotation: Putative major head protein # Family: family:all:966 # MgeID: mge:1849 # MgeName: Bethlehem # Cross-refs: genbank:acc:YP_001491666;genbank:gi:157786490;genbank:GeneID:5625701 Probab=99.60 E-value=4.6e-16 Score=104.64 Aligned_cols=307 Identities=12% Similarity=0.084 Sum_probs=172.8 Q ss_pred CCccccc--ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCC Q lcl|NC_015249. 1 MAKMNGG--QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~--~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~ 77 (347) ||.++-- +..+ ...++...+..-++..+++.+++.+.-++.|.++.+.++..+.+ ...++|+. +..++.....|+ T Consensus 1 ~a~l~el~~~~~~-~~~~g~~~~~~~~liP~~~~~~ii~~l~~~s~l~~~~~~~~~~~-~~~~~p~~~~~~~a~~v~eg~ 78 (333) T protein:vir:78 1 MATLNELLPNSAG-SNHQGRLAHVPSDLLPKEIVGPIFDKAQESSLVLRMGEQIPISY-GETIIPTTVKRPEVGQVGVGT 78 (333) T ss_pred CchhHHhhhhccc-ccccCceecCCccccchhHHHHHHHHHHhhchhhhhcceeeccC-CceEEEEEeCCceeEeecCcc Confidence 7766621 2222 22233333333447889999999999999999999988887664 55567665 444555544443 Q ss_pred CCCC------ccCCCCCceEEEEEEeeeeccc-ccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 78 NLDD------KRKDMKHTERTINIDGLLTADV-LIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 78 ~~~~------~~~~~~~~~~~l~ID~~~~~~~-~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) .... +...+.-.+ +++...++..+ .|.+-=-.++.+|+.+.+.+++++++++..|+.++..-. .. 
T Consensus 79 ~~~~~e~~~~~~~~~~f~~--i~l~~~kl~~~~~is~ell~~s~~~~~~~i~~~la~ai~~~~d~~~l~G~g----~~-- 150 (333) T protein:vir:78 79 SNEQREGGLKPLSGTAWDT--RSVSPIKLATIVTVSEEFARMNPSGLYTKLQGDLAYAIGRGIDLAVFHGKS----PL-- 150 (333) T ss_pred cccccccccccccccceeE--EEEeeEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHHhcccC----CC-- Confidence 2211 011222233 34444554443 333211124678899999999999999999998863210 00 Q ss_pred cccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhh--hhhhc Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPN--AANYQ 228 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~--~~~~~ 228 (347) ....+.+......+...+ ........+...++.|+++...+..+. ....-.++++|..|..|++..... +..+. T Consensus 151 ~~~~~~g~~~~~~~~~~~---~~~~~~~~~~~~~~~i~~~~~~~~~~~-~~~~~~~vmn~~~~~~L~~~~~~~d~~G~~i 226 (333) T protein:vir:78 151 TGSALQGIDTDNVIANTT---NVDYLQETGDPLLDRLLDGYDLVSANT-DVEFNGWAVDPRFRAHLLRAQAYRDANGNVD 226 (333) T ss_pred CCcccccccccccccccc---cccccccccchhHHHHHHHHHhhcccc-ccCceEEEEcchHHHHHHHHhhhcCCCCcee Confidence 001111111111111100 011111223345778888877766543 223346788999999998765443 34454 Q ss_pred cccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcce Q lcl|NC_015249. 229 ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDM 308 (347) Q Consensus 229 ~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~ 308 (347) -......|..++++|++|+.|+++|....... .....-|-+||++... 
+...++ T Consensus 227 ~~~~~~~~~~~~l~G~Pv~~~~~i~~~~~~~~----------------~~~~~~~~gD~~~~~~----------g~~~~~ 280 (333) T protein:vir:78 227 PSRINLAAQTGDVLGLPAQFGRAVGGDLGAAV----------------DSKTRIIGGDFSQLKF----------GFADEI 280 (333) T ss_pred ecCccccCCCceeeceeeEEccccCCCccccC----------------CCccEEEEEecccEEE----------EEeecc Confidence 44556667778999999999999995432210 0011123445554331 112334 Q ss_pred eeeeeech-----------hhhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 309 ALERARRA-----------NFQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 309 ~~e~~~d~-----------~~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++..++. .++.| .+++.++++.++++|++.+.|+-..| T Consensus 281 ~i~~~~~~~~~~~~~~~~~~~~~~~v~~r~~~r~d~~v~~~~a~~~l~~~~a 332 (333) T protein:vir:78 281 RIKMSDTATLTDSGSATVSMWQTNQIAILIEVTFGWLLGDKQAFVKFVDDEQ 332 (333) T ss_pred EEEEeccccccccccceeehhhcCcEEEEEEEEEccEEecccceEEEeccCC Confidence 55554331 12333 36777889999999998887766666 No 64 >protein:vir:7771 Length: 330 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:149 # MgeName: Bxz2 # Cross-refs: genbank:acc:NP_817605;genbank:gi:29566035;genbank:GeneID:1259229 Probab=99.57 E-value=1.3e-15 Score=102.27 Aligned_cols=297 Identities=10% Similarity=0.022 Sum_probs=171.4 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) ||.-.--.. .....++.=.+..+++..++.+..+..|+++.+.++....+ ..+.+|+. 
+...+..+..|+.+ T Consensus 1 m~~~~~~a~------~~~~t~~~g~~i~~~~~~~ii~~~~~~s~l~~~~~~~~~~~-~~~~~p~~~~~~~a~~v~Eg~~~ 73 (330) T protein:vir:77 1 MAGSTVPST------QVALTGDFSAFLTPEQSQDYFAEIEKTSIVQRIARKVPMGP-TGISIPHWTGAVSASWTGEAERK 73 (330) T ss_pred Ccccccchh------hccccCCCcceechhHHHHHHHHHHhccchhhhcceeeccC-CceEEEEEcCCcceeEecCCCcc Confidence 776652221 11111111124456777888888888999999988877554 45778876 66677888888887 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG 159 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~ 159 (347) +.+ +++-.++++..-+. +.-..|.+-=-.++.+|+.+.+.++.++++++..|+.++.-- . ....+.+.. T Consensus 74 ~~~--~~~f~~i~~~~~k~-~~~~~is~ell~ds~~~~~~~i~~~l~~ai~~~~~~~~l~G~-------g-~~~~~~g~~ 142 (330) T protein:vir:77 74 PIT--KGSFGKQELEPVKI-TTIFAESAEVVRLNPLNYLNTMRTKIAEAIALKFDAAAIHGI-------D-KPSAFKGYL 142 (330) T ss_pred ccc--cceeeEEEEeEEEE-EEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhccc-------C-CCCcccccc Confidence 653 46566666666443 333344332223356889999999999999999999886311 0 000011100 Q ss_pred CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccc-----cc Q lcl|NC_015249. 160 KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALID-----PS 234 (347) Q Consensus 160 ~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~-----~~ 234 (347) ...............+........|+.|.++...+...+.+. ..++++|..|..|.+-. -.+..+.-... .. 
T Consensus 143 ~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~--~~~vmn~~~~~~l~~lk-d~~G~~l~~~~~~~~~~~ 219 (330) T protein:vir:77 143 AETTKVVSLADTNLTTASGPQGNAYLAVNNALSLLVNSGKKW--TGTLLDNVTEPILNTAV-DGNGRPLFVESTYTEQVG 219 (330) T ss_pred ccccccceeecccccccccccchhHHHHHHHHHhhhhcCCCc--cEEEEcHHHHHHHHHHh-ccCCceeecCcccccccc Confidence 000000000000111122233456788888888888887653 35789999999887532 22233322211 22 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) ...-+++.|++|+.++++|....++ ...-+-+||++.+ .+..++++++++. T Consensus 220 ~~~~~~l~G~PV~~~~~~p~~~~~~-------------------~~~~~~gd~s~~~----------i~~~~~~~i~~~~ 270 (330) T protein:vir:77 220 AIREGRILGRPTYVADNVVNGTVGN-------------------RVVGVMGDFSQVI----------WGQIGGLSFDVTD 270 (330) T ss_pred ccCCceecceeeEEeccccCCCCCC-------------------ccEEEEEecceEE----------EEEecCcEEEEee Confidence 2233579999999999998532111 0112334444432 1222344555443 Q ss_pred chh------------------hh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 RAN------------------FQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~------------------~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.- ++ ...+++..+++..++||++.+.|....| T Consensus 271 e~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~i~~~~~ 323 (330) T protein:vir:77 271 QATLDFGEEQGGVWVPKLISLWQHNMVAVRCEAEFAFMVNDKDAFVKLTDQVA 323 (330) T ss_pred cceeeecccccccccccccchhhcCcEEEEEEEEeccEEecccceEEEEeccC Confidence 321 22 2446788899999999999888877777 No 65 >protein:vir:6242 Length: 390 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:131 # MgeName: phi-BT1 # Cross-refs: genbank:acc:NP_813696;swissprot:trembl:q859c1;genbank:gi:29366756;interpro:IPR006444;uniprot:Q859C1;genbank:GeneID:1258897 Probab=99.49 E-value=4.1e-15 Score=99.48 Aligned_cols=289 Identities=10% Similarity=0.074 Sum_probs=168.5 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~ 78 (347) +....-+.........+..+++.. +.+ +++...+.......++++.+.++....++..+.||+. |...+.....|+. T Consensus 97 ~~~~~r~~~~~~~~~~~t~~~~g~-~~~~~~~~~~i~~~~~~~~~l~~~~~~~~~~~~~~~~~p~~~~~~~a~wv~E~~~ 175 (390) T protein:vir:62 97 NLGEARSFEFAPEKRDGTKAGNPN-VLSRTLYGQLIAQAVERSAIMRGGATTFTTSDANPLDFTVITGRSSASIVGETAE 175 (390) T ss_pred hhhhhHHHHhhhhhhcccccCCCc-cccccchHHHHHHHHhhhhhhhhcceeeecCCCceeEEEEEcCCcceeeeccccc Confidence 100000000000000111112222 334 4555556555656678888888877777788899866 5567777777887 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++.+ ++.-.++++.+.++ +.-..|.+-=-.++.+|+.+.+.++.++++++..|+.++.- . |. 
T Consensus 176 ~~~~--~~~f~~i~~~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~l~G----~-----------G~ 237 (390) T protein:vir:62 176 IPES--YPATAQRSMGGFKY-GFASVVSYEFATDQVLDLVGFLVSDAGPAIGDAMGRHFITG----T-----------GQ 237 (390) T ss_pred cccc--ccceeeeEeeeeeE-EeehHHHHHHHhhhhHHHHHHHHHHHHHHHHHHHHhhhhcc----C-----------Cc Confidence 7653 46667777777655 33334533222356789999999999999999999987621 0 01 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) |.|-....+....... ........++.|+++...|+..... +-..|++|..|..|.+- +-.+..|.-...+..|.- T Consensus 238 p~Gi~~~~~~~~~~~~-~~~~~~~~~~~l~~~~~~l~~~~~~--~a~~vmn~~~~~~L~~l-kd~~g~~l~~~~~~~g~~ 313 (390) T protein:vir:62 238 PRGILTDASPATATFL-ATDTDSKVSDALIDLFHEVPSAYRA--NAKYVVNDLRAAQMRKL-KDANGQYLWQSGLTVGAP 313 (390) T ss_pred ccccccccccccccee-cccccccchHHHHHHHHhhhhhhhc--CCEEEEchHHHHHHHHh-hccCCCeeecCCcCCCcc Confidence 1111000000000000 0001112367777887788766543 33568899999988542 223344543444566777 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhh Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANF 318 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~ 318 (347) ..+.|++|+.++++|... -+-+||+... ++. ..++.++...++.+ T Consensus 314 ~~l~G~Pv~~~~~~p~~~-------------------------i~~gd~s~~~--i~~--------~~~~~v~~~~~~~~ 358 (390) T protein:vir:62 314 SLFNGKVVETDDGMPADK-------------------------ILFADLSKYR--VRF--------AGSLRVDRSVDAKF 358 (390) T ss_pred ceecccceEEecCCCCcc-------------------------EEEeecccee--EEe--------ecceEEEeeccccc Confidence 789999999999998421 0123454432 222 22456666666655 Q ss_pred hcce--eeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
319 QADQ--IIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 319 ~~d~--i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .-|. +++.+++|..+++|++...|.++.| T Consensus 359 ~~~~~~~~~~~r~d~~~~~~~A~~~l~~~~~ 389 (390) T protein:vir:62 359 STDQIVYRFLQRADGLLVDARGAKVLTVTPG 389 (390) T ss_pred cCCcEEEEEEEEeCcEeechhheEEEEeecC Confidence 4444 5788999999999999999999888 No 66 >protein:vir:1328 Length: 392 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:28 # MgeName: phi-C31 # Cross-refs: genbank:acc:NP_047927;swissprot:trembl:q9zwv6;genbank:gi:9631145;uniprot:Q9ZWV6;genbank:GeneID:2715889 Probab=99.49 E-value=6.4e-15 Score=98.40 Aligned_cols=292 Identities=11% Similarity=0.054 Sum_probs=169.2 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) +.....+.........+-.+++.-.+--+++...+.....+.++++...++....++..+.+|+. |..++..+..|+.+ T Consensus 97 ~~~~~~~~~~~~~~~~~t~~~~g~~~~~~~~~~~i~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~~~ 176 (392) T protein:vir:13 97 NLGEARSFEFAPEKRDGTKAGNPNVLSRTLYGQLIAQAVERSAIMRGGASTFTTSDANPMDFTVITGRATAGIVGETAEI 176 (392) T ss_pred chhhhHHHHhhhhhhcccccCCCccccccchHHHHHHHHhhhhhhhhcceeeecCCCceeEEEEEcCCcceeeecccccc Confidence 10000000000000011111222212235677778777777889998888877777888888766 55677777788777 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG 159 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~ 159 (347) +.+ ++.-+++++..-+. +.-..|.+-=-.++.+|+.+.+.++.++++++..|+.++.- . 
....+ T Consensus 177 ~~~--~~~f~~v~~~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~l~G----~-----Gt~~p---- 240 (392) T protein:vir:13 177 PES--YPATTQRSMGGFKY-GFASVVSYEFATDQVLDLVGFLVSDAGPAIGDAMGRHFLTG----T-----GTGQP---- 240 (392) T ss_pred ccc--ccceeeEEeeeeeE-EeeehhHHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhcc----c-----CCccc---- Confidence 653 45566666666544 33334443333346778999999999999999999987631 0 00111 Q ss_pred CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEE Q lcl|NC_015249. 160 KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIR 239 (347) Q Consensus 160 ~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg 239 (347) .|.....+.. .............|+.|+++...|...... .. ..|++|..+..|.+- +-.++.|.-...+..|... T Consensus 241 ~Gil~~~~~~-~~~~~~~~~~~~~~d~l~~~~~~l~~~~~~-~a-~~v~n~~~~~~l~~l-kd~~G~~l~~~~~~~g~~~ 316 (392) T protein:vir:13 241 RGILTDATGA-NAAFGEADADSKVSDALIDLFHEVPSAYRK-NA-KFVVNDLRAAQMRKL-KDANGQYLWQSALTVGAPD 316 (392) T ss_pred cccccccccc-cccccccccccccHHHHHHHHHhhhhhhhc-CC-EEEEcHHHHHHHHHh-hccCCceeecCCcCCCCCc Confidence 1111100000 000000111223367777777777665432 23 457799999988642 2233344333445567667 Q ss_pred EEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhh Q lcl|NC_015249. 240 NVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQ 319 (347) Q Consensus 240 ~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~ 319 (347) .++|.+|+.++++|... -+-+||+... ++ ...+++++.+.++-+. T Consensus 317 ~l~G~Pv~~~~~~~~~~-------------------------i~~Gdf~~~~--i~--------~~~~~~i~~~~~~~~~ 361 (392) T protein:vir:13 317 TFNGKVVETDDGMPADK-------------------------VLFADLSKYR--VR--------FAGSLRVDRSVDAKFS 361 (392) T ss_pred eecceeeEEcCCCCCCc-------------------------EEEeecccee--EE--------eecceEEEeecccccc Confidence 89999999999998421 0123444422 22 2234566666666544 Q ss_pred c--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
320 A--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 320 ~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) - ..+++..++|..+.+|++.+.+.++.| T Consensus 362 ~~~~~~r~~~r~d~~~~~~~A~~~~~~~~a 391 (392) T protein:vir:13 362 TDQIVYRFLQRADGLLVDARGAKVLTVTPA 391 (392) T ss_pred CCcEEEEEEEEeccEEecccceEEEEeecc Confidence 4 456788999999999999999999888 No 67 >protein:vir:4600 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:101 # MgeName: PVL # Cross-refs: genbank:acc:NP_058445;genbank:gi:9635171;genbank:GeneID:1262708 Probab=99.48 E-value=1.9e-14 Score=95.76 Aligned_cols=293 Identities=10% Similarity=0.051 Sum_probs=167.1 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEe-ecCcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFP-VLGRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~-~iG~~~~~~~~~g~~ 78 (347) +........ ........++--.+.-+.|.+++.+.....+.+++++++..+.++. ++.++ ..+...+.....|.. T Consensus 110 ~~~~~~~~~---~~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~Eg~~ 186 (415) T protein:vir:46 110 TEYLETRND---IQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELEE 186 (415) T ss_pred HHHHhhhhh---hhhccccccCCcccccHHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEEecCCcceeecccccc Confidence 000000000 0000001111112455899999999988889999999888876543 23333 334455666666666 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++.. ..+.-.++++..-+.- .-+.|.+-=-.++.+|+.+.+.+++++++++..|+.|+.-... ..+... 
T Consensus 187 ~~~~-~~~~~~~v~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~il~g~g~---------g~~~~~ 255 (415) T protein:vir:46 187 NPEL-AVKPFFQLAYDINTHR-GYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVITK---------GSTGST 255 (415) T ss_pred cccc-cccceeeEEeeeeeeE-eeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhcccc---------CCcccc Confidence 5432 1233455555554432 2234433222346688999999999999999999988643211 001111 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) ....... ......+ ....|+.|+++...+...... .=.+|++|..|..|.+- +-.++.|.-...+.+|.. T Consensus 256 ~~~~~~~---~~~~~~~----~~~~~~~i~~~~~~~~~~~~~--~~~~v~n~~~~~~L~~l-kd~~G~~i~~~~~~~~~~ 325 (415) T protein:vir:46 256 SSGFEKE---GKKLEVK----KAKSLDDIKDAINLNVKPNYE--HNVAIVSQTMFAKLDKM-KDKLGNYLIQPDVKEKTQ 325 (415) T ss_pred ccccccc---cceeccc----cccchHHHHHHHHhhhhhccC--CCEEEEcHHHHHHHHHh-hccCCCeeeccCcCCCCC Confidence 1110000 0000111 112366777777777776654 23578999999988642 234455554445677888 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhh Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANF 318 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~ 318 (347) ++++|++|+.++++|..+.+... -+-+||+..+.+ +...+++++......+ T Consensus 326 ~~l~G~pV~~~~~~~~~~~~~~~--------------------~~~gd~~~~~~~---------~~~~~~~v~~~~~~~~ 376 (415) T protein:vir:46 326 QRLLGAKIEILPDEVLGQKGNNT--------------------LIIGNLKDAIVL---------FDRSQYQASWTDYMHF 376 (415) T ss_pred ccccceeeEEeccccccCCCccE--------------------EEEEehhccEEE---------EeecceEEEeeccccC Confidence 89999999999999854322110 133345443322 2223455555433333 Q ss_pred hcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
319 QADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 319 ~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ...+++.++++..+.+|++.+.+.+..+ T Consensus 377 -~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 404 (415) T protein:vir:46 377 -GECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred -ceEEEEEEEeccEEeccccEEEEEeecc Confidence 4567889999999999999999988877 No 68 >protein:vir:4700 Length: 415 # NCBI annotation: phi PVL ORF 7 homologue # Family: family:all:21 # MgeID: mge:102 # MgeName: phiPV83 # Cross-refs: genbank:acc:NP_061632;genbank:gi:9635719;genbank:GeneID:1262976 Probab=99.48 E-value=1.9e-14 Score=95.76 Aligned_cols=293 Identities=10% Similarity=0.051 Sum_probs=167.1 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEe-ecCcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFP-VLGRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~-~iG~~~~~~~~~g~~ 78 (347) +........ ........++--.+.-+.|.+++.+.....+.+++++++..+.++. ++.++ ..+...+.....|.. T Consensus 110 ~~~~~~~~~---~~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~Eg~~ 186 (415) T protein:vir:47 110 TEYLETRND---IQGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELEE 186 (415) T ss_pred HHHHhhhhh---hhhccccccCCcccccHHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEEecCCcceeecccccc Confidence 000000000 0000001111112455899999999988889999999888876543 23333 334455666666666 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++.. ..+.-.++++..-+.- .-+.|.+-=-.++.+|+.+.+.+++++++++..|+.|+.-... ..+... 
T Consensus 187 ~~~~-~~~~~~~v~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~il~g~g~---------g~~~~~ 255 (415) T protein:vir:47 187 NPEL-AVKPFFQLAYDINTHR-GYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVITK---------GSTGST 255 (415) T ss_pred cccc-cccceeeEEeeeeeeE-eeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhcccc---------CCcccc Confidence 5432 1233455555554432 2234433222346688999999999999999999988643211 001111 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) ....... ......+ ....|+.|+++...+...... .=.+|++|..|..|.+- +-.++.|.-...+.+|.. T Consensus 256 ~~~~~~~---~~~~~~~----~~~~~~~i~~~~~~~~~~~~~--~~~~v~n~~~~~~L~~l-kd~~G~~i~~~~~~~~~~ 325 (415) T protein:vir:47 256 SSGFEKE---GKKLEVK----KAKSLDDIKDAINLNVKPNYE--HNVAIVSQTMFAKLDKM-KDKLGNYLIQPDVKEKTQ 325 (415) T ss_pred ccccccc---cceeccc----cccchHHHHHHHHhhhhhccC--CCEEEEcHHHHHHHHHh-hccCCCeeeccCcCCCCC Confidence 1110000 0000111 112366777777777776654 23578999999988642 234455554445677888 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhh Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANF 318 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~ 318 (347) ++++|++|+.++++|..+.+... -+-+||+..+.+ +...+++++......+ T Consensus 326 ~~l~G~pV~~~~~~~~~~~~~~~--------------------~~~gd~~~~~~~---------~~~~~~~v~~~~~~~~ 376 (415) T protein:vir:47 326 QRLLGAKIEILPDEVLGQKGNNT--------------------LIIGNLKDAIVL---------FDRSQYQASWTDYMHF 376 (415) T ss_pred ccccceeeEEeccccccCCCccE--------------------EEEEehhccEEE---------EeecceEEEeeccccC Confidence 89999999999999854322110 133345443322 2223455555433333 Q ss_pred hcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
319 QADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 319 ~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ...+++.++++..+.+|++.+.+.+..+ T Consensus 377 -~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 404 (415) T protein:vir:47 377 -GECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred -ceEEEEEEEeccEEeccccEEEEEeecc Confidence 4567889999999999999999988877 No 69 >protein:vir:79987 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:1875 # MgeName: tp310-3 # Cross-refs: genbank:acc:YP_001430002;genbank:gi:156604057;genbank:GeneID:5525447 Probab=99.47 E-value=3.1e-14 Score=94.67 Aligned_cols=293 Identities=9% Similarity=0.036 Sum_probs=170.8 Q ss_pred CCccc-ccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEe-ecCcceeeeeecCC Q lcl|NC_015249. 1 MAKMN-GGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFP-VLGRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~-~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~-~iG~~~~~~~~~g~ 77 (347) +.+.. .+... +.+ ....++--.+.-+.|..++.+.....+.+++++++..+.++. ++.++ ..+...+.....|. T Consensus 109 ~~~~~~~~~~~--~~~-~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~E~~ 185 (415) T protein:vir:79 109 FTEYLETRNDI--QGG-SLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELE 185 (415) T ss_pred HHHHHhhhhhh--hhc-cccccccccccchHHHHHHHHHHHhhhhhhhheeeeeccCCceeEEEEeecCCccceeecccc Confidence 00000 00000 000 000001112455899999999988889999999888876542 33444 44556666766676 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++.. ..+.-+++++.+.+.- .-+.|.+-=-.++.+|+.+.+.++.++++++..|+.++.....+ . +.. 
T Consensus 186 ~~~~~-~~~~~~~v~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~g--------~-~~~ 254 (415) T protein:vir:79 186 ENPEL-AVKPFFQLAYDINTHR-GYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVITKG--------S-TGS 254 (415) T ss_pred ccCcc-cccceeeEEeeeeeeE-eeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccccC--------c-ccc Confidence 66432 1234466666665543 22344332223467889999999999999999999886432110 0 000 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) ...... ......+.+ ....|+.|+++...+...+... -.+|++|..|..|.+- +-.+..|.-...+.+|. T Consensus 255 ~~~~~~---~~~~~~~~~----~~~~~~~i~~~~~~~~~~~~~~--~~~v~n~~~~~~l~~l-kd~~G~~l~~~~~~~~~ 324 (415) T protein:vir:79 255 TSSGFE---KEGKKLEVK----KAKSLDDIKDAINLNVKPNYEH--NVAIVSQTMFAKLDKM-KDKLGNYLIQPDVKEKT 324 (415) T ss_pred cccccc---ccccccccc----cccchhHHHHHHHhhhhhccCC--CEEEEcHHHHHHHHHh-hccCCceeeccCcCCCC Confidence 000000 000011111 1223677888888888777653 2468899999998753 33344555444567787 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN 317 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~ 317 (347) .++++|++|+.++++|....+.. ..+-+||+..+.+ ....+++++..++. T Consensus 325 ~~~l~G~pV~~~~~~~~~~~~~~--------------------~~~~Gd~~~~~~~---------~~~~~~~v~~~~~~- 374 (415) T protein:vir:79 325 QQRLLGAKIEILPDEVLGQKGNN--------------------TLIIGNLKDAIVL---------FDRSQYQASWTDYM- 374 (415) T ss_pred CceecceeeEEecccccCCCCcc--------------------EEEEEehhccEEE---------EeecceEEEEeccc- Confidence 88999999999999985432211 0233344443322 22334556655443 Q ss_pred hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 ~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .....+++.++++..+.+|++.+.+.+..+ T Consensus 375 ~~~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 404 (415) T protein:vir:79 375 HFGECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred cCceEEEEEEEeccEEeccccEEEEEEecc Confidence 334567889999999999999999999888 No 70 >protein:vir:98339 Length: 415 # NCBI annotation: putative capsid protein # Family: family:all:21 # MgeID: mge:1581 # MgeName: phiPVL(108) # Cross-refs: genbank:acc:YP_918931;genbank:gi:119443693;genbank:GeneID:4594501 Probab=99.47 E-value=3.1e-14 Score=94.67 Aligned_cols=293 Identities=9% Similarity=0.036 Sum_probs=170.8 Q ss_pred CCccc-ccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEe-ecCcceeeeeecCC Q lcl|NC_015249. 1 MAKMN-GGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFP-VLGRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~-~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~-~iG~~~~~~~~~g~ 77 (347) +.+.. .+... +.+ ....++--.+.-+.|..++.+.....+.+++++++..+.++. ++.++ ..+...+.....|. T Consensus 109 ~~~~~~~~~~~--~~~-~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~E~~ 185 (415) T protein:vir:98 109 FTEYLETRNDI--QGG-SLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELE 185 (415) T ss_pred HHHHHhhhhhh--hhc-cccccccccccchHHHHHHHHHHHhhhhhhhheeeeeccCCceeEEEEeecCCccceeecccc Confidence 00000 00000 000 000001112455899999999988889999999888876542 33444 44556666766676 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++.. ..+.-+++++.+.+.- .-+.|.+-=-.++.+|+.+.+.++.++++++..|+.++.....+ . +.. 
T Consensus 186 ~~~~~-~~~~~~~v~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~g--------~-~~~ 254 (415) T protein:vir:98 186 ENPEL-AVKPFFQLAYDINTHR-GYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVITKG--------S-TGS 254 (415) T ss_pred ccCcc-cccceeeEEeeeeeeE-eeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccccC--------c-ccc Confidence 66432 1234466666665543 22344332223467889999999999999999999886432110 0 000 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) ...... ......+.+ ....|+.|+++...+...+... -.+|++|..|..|.+- +-.+..|.-...+.+|. T Consensus 255 ~~~~~~---~~~~~~~~~----~~~~~~~i~~~~~~~~~~~~~~--~~~v~n~~~~~~l~~l-kd~~G~~l~~~~~~~~~ 324 (415) T protein:vir:98 255 TSSGFE---KEGKKLEVK----KAKSLDDIKDAINLNVKPNYEH--NVAIVSQTMFAKLDKM-KDKLGNYLIQPDVKEKT 324 (415) T ss_pred cccccc---ccccccccc----cccchhHHHHHHHhhhhhccCC--CEEEEcHHHHHHHHHh-hccCCceeeccCcCCCC Confidence 000000 000011111 1223677888888888777653 2468899999998753 33344555444567787 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN 317 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~ 317 (347) .++++|++|+.++++|....+.. ..+-+||+..+.+ ....+++++..++. T Consensus 325 ~~~l~G~pV~~~~~~~~~~~~~~--------------------~~~~Gd~~~~~~~---------~~~~~~~v~~~~~~- 374 (415) T protein:vir:98 325 QQRLLGAKIEILPDEVLGQKGNN--------------------TLIIGNLKDAIVL---------FDRSQYQASWTDYM- 374 (415) T ss_pred CceecceeeEEecccccCCCCcc--------------------EEEEEehhccEEE---------EeecceEEEEeccc- Confidence 88999999999999985432211 0233344443322 22334556655443 Q ss_pred hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 ~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .....+++.++++..+.+|++.+.+.+..+ T Consensus 375 ~~~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 404 (415) T protein:vir:98 375 HFGECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred cCceEEEEEEEeccEEeccccEEEEEEecc Confidence 334567889999999999999999999888 No 71 >protein:vir:81100 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:1891 # MgeName: tp310-1 # Cross-refs: genbank:acc:YP_001429874;genbank:gi:156603927;genbank:GeneID:5525320 Probab=99.47 E-value=3.1e-14 Score=94.67 Aligned_cols=293 Identities=9% Similarity=0.036 Sum_probs=170.8 Q ss_pred CCccc-ccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEe-ecCcceeeeeecCC Q lcl|NC_015249. 1 MAKMN-GGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFP-VLGRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~-~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~-~iG~~~~~~~~~g~ 77 (347) +.+.. .+... +.+ ....++--.+.-+.|..++.+.....+.+++++++..+.++. ++.++ ..+...+.....|. T Consensus 109 ~~~~~~~~~~~--~~~-~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~E~~ 185 (415) T protein:vir:81 109 FTEYLETRNDI--QGG-SLKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELE 185 (415) T ss_pred HHHHHhhhhhh--hhc-cccccccccccchHHHHHHHHHHHhhhhhhhheeeeeccCCceeEEEEeecCCccceeecccc Confidence 00000 00000 000 000001112455899999999988889999999888876542 33444 44556666766676 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++.. ..+.-+++++.+.+.- .-+.|.+-=-.++.+|+.+.+.++.++++++..|+.++.....+ . +.. 
T Consensus 186 ~~~~~-~~~~~~~v~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~g--------~-~~~ 254 (415) T protein:vir:81 186 ENPEL-AVKPFFQLAYDINTHR-GYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVITKG--------S-TGS 254 (415) T ss_pred ccCcc-cccceeeEEeeeeeeE-eeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccccC--------c-ccc Confidence 66432 1234466666665543 22344332223467889999999999999999999886432110 0 000 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) ...... ......+.+ ....|+.|+++...+...+... -.+|++|..|..|.+- +-.+..|.-...+.+|. T Consensus 255 ~~~~~~---~~~~~~~~~----~~~~~~~i~~~~~~~~~~~~~~--~~~v~n~~~~~~l~~l-kd~~G~~l~~~~~~~~~ 324 (415) T protein:vir:81 255 TSSGFE---KEGKKLEVK----KAKSLDDIKDAINLNVKPNYEH--NVAIVSQTMFAKLDKM-KDKLGNYLIQPDVKEKT 324 (415) T ss_pred cccccc---ccccccccc----cccchhHHHHHHHhhhhhccCC--CEEEEcHHHHHHHHHh-hccCCceeeccCcCCCC Confidence 000000 000011111 1223677888888888777653 2468899999998753 33344555444567787 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN 317 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~ 317 (347) .++++|++|+.++++|....+.. ..+-+||+..+.+ ....+++++..++. T Consensus 325 ~~~l~G~pV~~~~~~~~~~~~~~--------------------~~~~Gd~~~~~~~---------~~~~~~~v~~~~~~- 374 (415) T protein:vir:81 325 QQRLLGAKIEILPDEVLGQKGNN--------------------TLIIGNLKDAIVL---------FDRSQYQASWTDYM- 374 (415) T ss_pred CceecceeeEEecccccCCCCcc--------------------EEEEEehhccEEE---------EeecceEEEEeccc- Confidence 88999999999999985432211 0233344443322 22334556655443 Q ss_pred hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 ~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .....+++.++++..+.+|++.+.+.+..+ T Consensus 375 ~~~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 404 (415) T protein:vir:81 375 HFGECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred cCceEEEEEEEeccEEeccccEEEEEEecc Confidence 334567889999999999999999999888 No 72 >protein:vir:4511 Length: 409 # NCBI annotation: capsid # Family: family:all:21 # MgeID: mge:97 # MgeName: V # Cross-refs: genbank:acc:NP_599037;genbank:gi:19548995;genbank:GeneID:935211 Probab=99.46 E-value=4.3e-14 Score=93.85 Aligned_cols=297 Identities=11% Similarity=0.073 Sum_probs=163.7 Q ss_pred CCcccccc-ccccc--cccc-ccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcce--eeeee Q lcl|NC_015249. 1 MAKMNGGQ-QIGKD--QGKG-MSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTK--AAYLQ 74 (347) Q Consensus 1 ma~~~~~~-~~~t~--~~~~-~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~--~~~~~ 74 (347) |...-... .-..+ ...+ ...++--.+..++|.+++.+..+..+.++.+.++.++.++..+.++..+... ..... T Consensus 99 ~~~~~~~~e~~~~~~~~a~~~~~~~~gg~liP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~ 178 (409) T protein:vir:45 99 GASELTSEERKALRELRAQGVAQDEKGGYTVPETFLAKVVEKMKSYGGIASVAQILTTSDGRTMEWATADGTSEVGVLLG 178 (409) T ss_pred hhhhccHHHHHHHHHHhhccCccCcCCceeccHhHHHHHHHHHHhhhhhhhhceeeecCCCceEEEEeeccCcccccccc Confidence 11100000 00000 0000 0001111244589999999999888999999998888888888888775432 23444 Q ss_pred cCCCCCCccCCCCCceEEEEEEeeeecc--cccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc Q lcl|NC_015249. 75 PGENLDDKRKDMKHTERTINIDGLLTAD--VLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASD 152 (347) Q Consensus 75 ~g~~~~~~~~~~~~~~~~l~ID~~~~~~--~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~ 152 (347) .|+..+. .+++-..+ ++...++.. +.|.+-=-.++.+|+.+.+.++.++++++..|+.|+.-- +. ... 
T Consensus 179 E~~~~~~--~~~~f~~~--~l~~~k~~~~~i~is~ell~ds~~~l~~~i~~~la~a~~~~~~~a~l~G~--G~----~~~ 248 (409) T protein:vir:45 179 ENEEAGE--EDTDFGMG--SLGALKMTSKIIRVSNELLQDSAIDMEAYLARRIAERIGRGEARYLIQGT--GA----GTP 248 (409) T ss_pred ccccccc--ccccccee--eeeeeeeeeeehhhhHHHHhccHHHHHHHHHHHHHHHHHHHHHHHhhccC--CC----CCc Confidence 4555433 23433443 444444432 235333333356899999999999999999999876311 00 000 Q ss_pred cccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCE-EEeCHHHHHHHhcchhhhhhhhcccc Q lcl|NC_015249. 153 ENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRV-FYTTPDNYSAILAALMPNAANYQALI 231 (347) Q Consensus 153 ~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~-~vv~P~~~~~Ll~~~~~~~~~~~~~~ 231 (347) ..+.|. ....+........ ....++.|+++...|..... ....| ++++|..|..|.+- +-.+..|.-.. T Consensus 249 ~~p~Gi----l~~~~~~~~~~~~----~~~~~d~i~~l~~~l~~~~~-~~a~~~~~~n~~~~~~l~~l-kd~~G~~i~~~ 318 (409) T protein:vir:45 249 KQPKGL----AASVTGTTQTAAA----NAVKWQEILALKHSIDPAYR-RGPKFRLAFNDNTLKLISEM-EDGQGRPLWLP 318 (409) T ss_pred ccccee----eeccccccccccc----cccchHHHHHHHHhhhhhhc-cCCeEEEEECHHHHHHHHHh-hcCCCceeecc Confidence 111111 1000000001111 11125667777777766553 33456 46799998887532 22334454444 Q ss_pred ccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeee Q lcl|NC_015249. 232 DPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALE 311 (347) Q Consensus 232 ~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e 311 (347) .+.+|...+++|.+|+.++++|....+... -+-+||++.+ ++ ....+.++ T Consensus 319 ~~~~~~~~~l~G~PV~~~~~~p~~~~~~~~--------------------i~~Gd~~~~~--i~--------~~~~~~~~ 368 (409) T protein:vir:45 319 DIVGVAPASVLNVPYVIDQEIDDIGAGKKF--------------------MFCGDFDRFI--IR--------RVRYMILK 368 (409) T ss_pred CcCCCCCceecceeeEEecCcCCccCCccE--------------------EEEeehhhhh--ee--------eccceEEE Confidence 556677788999999999999853211100 0112333321 11 22344556 Q ss_pred eeechhhhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
312 RARRANFQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 312 ~~~d~~~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ...|+-.+-+ .|++..+++..+.+|++.+.++.+.+ T Consensus 369 ~~~d~~~~~~~~~~~~~~r~d~~~~~~~A~~~l~~k~s 406 (409) T protein:vir:45 369 RLVERYAEYDQTGFLAFHRFDCILEDTSAIKALVGKGS 406 (409) T ss_pred EeecccccCCcEEEEEEEEeccEeechhheEEEEeccC Confidence 5655543334 38889999999999999888888777 No 73 >protein:vir:9574 Length: 300 # NCBI annotation: gp40 # Family: family:all:966 # MgeID: mge:171 # MgeName: SM1 # Cross-refs: genbank:acc:NP_862879;genbank:gi:32469471;genbank:GeneID:1461316 Probab=99.45 E-value=1.1e-13 Score=91.59 Aligned_cols=284 Identities=10% Similarity=-0.007 Sum_probs=166.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEee-cCcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPV-LGRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~-iG~~~~~~~~~g~~~ 79 (347) ||..+... |. |..+++..++.+..++.|.++.+.+++.+.+| .+.+|+ .+...+.....|+.+ T Consensus 1 ma~~t~~~------------G~---lip~~~~~~ii~~l~~~s~i~~l~~~~~~~~~-~~~~p~~~~~~~a~wv~Eg~~~ 64 (300) T protein:vir:95 1 MSEAQLSK------------GN---LFNPELVTKVINKVKGHSSIAKLSPQKPIPFN-GQREFVFDFDSDIDIVAENGKK 64 (300) T ss_pred CcccccCC------------cc---eechhhHHHHHHHHHhhhhhhhhcceeeccCC-ceEEEEEecCcceEEeeCCccc Confidence 88877421 11 45578999999999999988888777776554 456776 466677887888777 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHH-----hChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAM-----NHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q-----~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) +.+ +++-+++++..-+. +.-..|. +|.. ...|+.+.+.++.++++++..|+.++.-.. ........ 
T Consensus 65 ~~s--~~~f~~v~l~~~k~-~~~~~iS--~ell~~~~d~~~~l~~~i~~~l~~aia~~~d~~~l~G~~----~~~g~~~~ 135 (300) T protein:vir:95 65 THG--GVSLDPVTIVPLKV-EYGARVS--DEFLHASEEAKVDMLTDFVEGFSKKLARGLDIMSIHGIN----PRTKQAST 135 (300) T ss_pred ccc--cccceeeEeeeEEE-EEeehhh--HHHhccCCCCHHHHHHHHHHHHHHHHHHHHHHhhhhccc----CCCCCCcc Confidence 653 45556666655443 2233332 2222 347889999999999999999998873210 00000000 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPS 234 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~ 234 (347) ..+.....+. ............++.|.++...+...+... ...+++|..+..|.+-. -.+..+.-..... T Consensus 136 ----~~~~~~~~~~---~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~--~~~vmn~~~~~~L~~lk-d~~G~~i~~~~~~ 205 (300) T protein:vir:95 136 ----IIGDNCFDKK---VTQTVPFKDTNPDESMEDAVGMIDGSERDI--TGAILDPIFTTALSKMK-NAEGGKLYPELAW 205 (300) T ss_pred ----cccccccccc---cceeecccccchHHHHHHHHHHhhhcCCCc--cEEEECHHHHHHHHHhh-ccCCCeeccCccc Confidence 0010000000 000111112334677888888888776543 25789999999886533 2233333334455 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee- Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA- 313 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~- 313 (347) .|..++++|++|+.|+.+|...... ....+-+||++.+.+..+.. +++++. T Consensus 206 ~~~~~~l~G~Pv~~s~~v~~~~~~~-------------------~~~~~~GDf~~~~~~~~~~~---------~~~~v~~ 257 (300) T protein:vir:95 206 GGVPDAINGLAVDKNRTVSYSQTDP-------------------KNTAIVGDFETMFKWGYAKE---------VPMEIIK 257 (300) T ss_pred cCCCceecceeeEEecCCCCCCCCC-------------------ccEEEEeeccceEEEEEecc---------cEEEEee Confidence 6777899999999999998533211 01123356655443322222 233332 Q ss_pred -echh------hhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
314 -RRAN------FQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 -~d~~------~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) -+++ ++.| .+++.+++|..++||++.+.|+=.++ T Consensus 258 ~~~~d~~~~~~f~~~~v~~r~~~r~d~~v~~~~a~~~l~~~~g 300 (300) T protein:vir:95 258 YGDPDNSGRDLKGYNQIYIRCEAYIGWGIMDAASFARIVKTGG 300 (300) T ss_pred ccCCCCcchhhhhcCcEEEEEEEeecceeecccceEEEecCCC Confidence 1221 3333 45777789999999997777654444 No 74 >protein:vir:9410 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:167 # MgeName: phi 13 # Cross-refs: genbank:acc:NP_803388;genbank:gi:29028700;genbank:GeneID:1258136 Probab=99.44 E-value=2.7e-14 Score=94.93 Aligned_cols=293 Identities=9% Similarity=0.041 Sum_probs=169.4 Q ss_pred CCcc-cccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEee-cCcceeeeeecCC Q lcl|NC_015249. 1 MAKM-NGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFPV-LGRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~-~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~~-iG~~~~~~~~~g~ 77 (347) +.+. ..... .+.+. ...++--.+.-+.+.+++.......+.++.++++..+.++. ++.+++ .+...+.....|. T Consensus 109 ~~~~~~~~~~--~~~~~-~~~~~g~~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~Eg~ 185 (415) T protein:vir:94 109 FTEYLETRND--IQGGS-LKTDSGFVVIPEEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAALEKVEELE 185 (415) T ss_pred HHHHhhhhhh--hhhhc-cccccccccCcHHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEeecCCccceeccccc Confidence 0000 00000 00000 00111112344789999999998999999999988876543 444444 3555666666676 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++.. ..+.-.++++.+.+. +..+.|.+-=-.++.+|+.+.+.++.++++++..|+.++.....+ .+.. 
T Consensus 186 ~~~~~-~~~~~~~i~~~~~k~-~~~~~is~ell~ds~~~~~~~i~~~l~~~~~~~~~~~il~g~g~g---------~~~~ 254 (415) T protein:vir:94 186 ENPEL-AVKPFFQLAYDINTH-RGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVITKG---------STGS 254 (415) T ss_pred ccccc-ccccceeeEeeheee-eeechhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccccC---------cccc Confidence 66432 123345555555444 222344322222467889999999999999999999886432110 0010 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) ...+... .. .....+ ....|+.|+++...+...+.. .-.+|++|..|..|.+- +-.++.|.-...+.+|. T Consensus 255 ~~~~~~~-~~--~~~~~~----~~~~~~~i~~~~~~~~~~~~~--~~~~vmn~~~~~~l~~l-kd~~G~~l~~~~~~~~~ 324 (415) T protein:vir:94 255 TSSGFEK-EG--KKLEVK----KAKSLDDIKDAINLNVKPNYE--HNVAIVSQTMFAKLDKM-KDKLGNYLIQPDVKEKT 324 (415) T ss_pred ccccccc-cc--cccccc----cccchHHHHHHHHhhhhhccC--CCEEEEcHHHHHHHHHh-hccCCCeeeccCcCCCC Confidence 0000000 00 000011 112367788888888777764 23568899999999753 33344455444567788 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN 317 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~ 317 (347) .++++|++|+.++++|....+.. ..+-+||++.+.+ +...+++++...+. T Consensus 325 ~~~l~G~pV~~~~~~~~~~~~~~--------------------~i~~gd~~~~~~~---------~~~~~~~v~~~~~~- 374 (415) T protein:vir:94 325 QQRLLGAKIEILPDEVLGQKGNN--------------------TLIIGNLKDAIVL---------FDRSQYQASWTDYM- 374 (415) T ss_pred CceecceeeEEecccccCCCCcc--------------------EEEEEehhccEEE---------EeecceEEEEeccc- Confidence 88999999999999985432211 0123344443322 22234555555433 Q ss_pred hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 ~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .....+++.++++..+.+|++.+.+.+..+ T Consensus 375 ~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 404 (415) T protein:vir:94 375 HFGECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred cCceEEEEEEEeccEEeccccEEEEEEecc Confidence 344678899999999999999999998888 No 75 >protein:vir:94142 Length: 304 # NCBI annotation: ORF013 # Family: family:all:507 # MgeID: mge:1494 # MgeName: 96 # Cross-refs: genbank:acc:YP_240234;genbank:gi:66395898;genbank:GeneID:5133311 Probab=99.43 E-value=1e-13 Score=91.80 Aligned_cols=285 Identities=15% Similarity=0.060 Sum_probs=167.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) ||-........+-. ++.-.+.-+++..++.+.-.+.+.++.+.++..+. +.+.+||+. +...+..+..++.+ T Consensus 1 ma~~~~~~~~~~~t------~~gg~lip~~~~~~ii~~~~~~~~l~~~~~~~~~~-~~~~~ip~~~~~~~a~~v~E~~~~ 73 (304) T protein:vir:94 1 MATPTYTPGNVILS------DFKNGVIPAEQGTLIMKDIMANSAIMKLAKNEPMT-AQKKKFTYLAKGVGAYWVSETERI 73 (304) T ss_pred Cccccccccccccc------CCCceecchhHHHHHHHHHHhccchhhhcceeecc-CCceEEEEEeCCcceEEeecCccc Confidence 87766322211112 12123677899999998998889888888887755 456788877 56677777778777 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG 159 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~ 159 (347) +. .+++-+++++...+.- .-..|.+-=..++.+|+.+.+.++.++++++..|+.++.-- . ......... 
T Consensus 74 ~~--~~~~~~~i~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~~ia~~~d~~~l~G~----g----~~~~~~~~~ 142 (304) T protein:vir:94 74 QT--SKPEYAQAEMEAKKIG-VIIPLSKEFLKWTAKDFFNEVKPLIAEAFYKAFDQAVIFGT----K----SPYNTSTSG 142 (304) T ss_pred cc--ccceeeEEEEEEEEEE-EeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhheecc----C----CCccccccc Confidence 65 3466677777665543 33345432233456899999999999999999999875321 0 000011111 Q ss_pred CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEE Q lcl|NC_015249. 160 KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIR 239 (347) Q Consensus 160 ~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg 239 (347) .+........ ..........|+.|+++...+...+.... .++++|..|..|.+-. -.+..+ +-.+..+ T Consensus 143 ~~~~~~~~~~----~~~~~~~~~~~~~i~~~~~~l~~~~~~~~--~~v~~~~~~~~L~~lk-d~~G~~-----l~~~~~~ 210 (304) T protein:vir:94 143 KPLVEGAEEK----GNVVTDTNNLYVDLSALMATIEDEELDPN--GVLTTRSFRSKMRNAL-DANDRP-----LFDANGN 210 (304) T ss_pred cccccccccc----ccccccccchHHHHHHHHHHhhhccCCcC--EEEEcHHHHHHHHHhh-ccCCcE-----eecCCCc Confidence 1111111111 11111233458889999888888876543 5789999999987532 112222 1123346 Q ss_pred EEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech--- Q lcl|NC_015249. 240 NVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA--- 316 (347) Q Consensus 240 ~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~--- 316 (347) +++|.+|+.++++|...... .-+-+||++.. +.. ..+++++..++. T Consensus 211 ~l~G~PV~~~~~~~~~~~~~---------------------~~~~gd~~~~~-~~~---------~~~~~i~~~~e~~~~ 259 (304) T protein:vir:94 211 EIMGLPLSYTGADVYDKKKS---------------------LALMGDWDYAR-YGI---------LQGIEYAISEDATLT 259 (304) T ss_pred cccceeeEEecccccCCCCc---------------------EEEEEehhhEE-EEE---------ecceEEEEeecceee Confidence 89999999999998532111 01223454432 111 122333333322 Q ss_pred -------h------hhc--ceeeeeeeecccccccceEEEEEEcC Q lcl|NC_015249. 
317 -------N------FQA--DQIIAKYAMGHGGLRPEACGALVFNK 346 (347) Q Consensus 317 -------~------~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~ 346 (347) . ++. ..++..+++|..++||++.+.|...- T Consensus 260 ~~~~~~~~g~~~~~f~~~~~~~r~~~r~~~~v~~~~a~~~l~~a~ 304 (304) T protein:vir:94 260 TLQASDASGQPVSLFERDMFALRATMHIAYMNVKPEAFATLKPTE 304 (304) T ss_pred eecccccCccchhhhhcCcEEEEEEEEeccEeecccceEEEEecC Confidence 1 333 34566788999999999887554444 No 76 >protein:vir:105905 Length: 304 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:1514 # MgeName: phiETA3 # Cross-refs: genbank:acc:YP_001004375;genbank:gi:122891830;genbank:GeneID:4712376 Probab=99.43 E-value=1e-13 Score=91.80 Aligned_cols=285 Identities=15% Similarity=0.060 Sum_probs=167.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) ||-........+-. ++.-.+.-+++..++.+.-.+.+.++.+.++..+. +.+.+||+. +...+..+..++.+ T Consensus 1 ma~~~~~~~~~~~t------~~gg~lip~~~~~~ii~~~~~~~~l~~~~~~~~~~-~~~~~ip~~~~~~~a~~v~E~~~~ 73 (304) T protein:vir:10 1 MATPTYTPGNVILS------DFKNGVIPAEQGTLIMKDIMANSAIMKLAKNEPMT-AQKKKFTYLAKGVGAYWVSETERI 73 (304) T ss_pred Cccccccccccccc------CCCceecchhHHHHHHHHHHhccchhhhcceeecc-CCceEEEEEeCCcceEEeecCccc Confidence 87766322211112 12123677899999998998889888888887755 456788877 56677777778777 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG 159 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~ 159 (347) +. .+++-+++++...+.- .-..|.+-=..++.+|+.+.+.++.++++++..|+.++.-- . ......... 
T Consensus 74 ~~--~~~~~~~i~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~~ia~~~d~~~l~G~----g----~~~~~~~~~ 142 (304) T protein:vir:10 74 QT--SKPEYAQAEMEAKKIG-VIIPLSKEFLKWTAKDFFNEVKPLIAEAFYKAFDQAVIFGT----K----SPYNTSTSG 142 (304) T ss_pred cc--ccceeeEEEEEEEEEE-EeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhheecc----C----CCccccccc Confidence 65 3466677777665543 33345432233456899999999999999999999875321 0 000011111 Q ss_pred CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEE Q lcl|NC_015249. 160 KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIR 239 (347) Q Consensus 160 ~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg 239 (347) .+........ ..........|+.|+++...+...+.... .++++|..|..|.+-. -.+..+ +-.+..+ T Consensus 143 ~~~~~~~~~~----~~~~~~~~~~~~~i~~~~~~l~~~~~~~~--~~v~~~~~~~~L~~lk-d~~G~~-----l~~~~~~ 210 (304) T protein:vir:10 143 KPLVEGAEEK----GNVVTDTNNLYVDLSALMATIEDEELDPN--GVLTTRSFRSKMRNAL-DANDRP-----LFDANGN 210 (304) T ss_pred cccccccccc----ccccccccchHHHHHHHHHHhhhccCCcC--EEEEcHHHHHHHHHhh-ccCCcE-----eecCCCc Confidence 1111111111 11111233458889999888888876543 5789999999987532 112222 1123346 Q ss_pred EEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech--- Q lcl|NC_015249. 240 NVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA--- 316 (347) Q Consensus 240 ~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~--- 316 (347) +++|.+|+.++++|...... .-+-+||++.. +.. ..+++++..++. T Consensus 211 ~l~G~PV~~~~~~~~~~~~~---------------------~~~~gd~~~~~-~~~---------~~~~~i~~~~e~~~~ 259 (304) T protein:vir:10 211 EIMGLPLSYTGADVYDKKKS---------------------LALMGDWDYAR-YGI---------LQGIEYAISEDATLT 259 (304) T ss_pred cccceeeEEecccccCCCCc---------------------EEEEEehhhEE-EEE---------ecceEEEEeecceee Confidence 89999999999998532111 01223454432 111 122333333322 Q ss_pred -------h------hhc--ceeeeeeeecccccccceEEEEEEcC Q lcl|NC_015249. 
317 -------N------FQA--DQIIAKYAMGHGGLRPEACGALVFNK 346 (347) Q Consensus 317 -------~------~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~ 346 (347) . ++. ..++..+++|..++||++.+.|...- T Consensus 260 ~~~~~~~~g~~~~~f~~~~~~~r~~~r~~~~v~~~~a~~~l~~a~ 304 (304) T protein:vir:10 260 TLQASDASGQPVSLFERDMFALRATMHIAYMNVKPEAFATLKPTE 304 (304) T ss_pred eecccccCccchhhhhcCcEEEEEEEEeccEeecccceEEEEecC Confidence 1 333 34566788999999999887554444 No 77 >protein:vir:9309 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:165 # MgeName: phi 11 # Cross-refs: genbank:acc:NP_803287;genbank:gi:29028597;genbank:GeneID:1258044 Probab=99.42 E-value=6.5e-14 Score=92.87 Aligned_cols=278 Identities=12% Similarity=0.098 Sum_probs=167.1 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) +... .+.++ + ...+...+.-+++..++.+.....|+++.+.++..+.+ ..++||+. +...+.-+..|+.+ T Consensus 21 ~~~~-~a~~~-~------~~~~~~~liP~~~~~~ii~~~~~~s~l~~l~~~~~~~~-~~~~ip~~~~~~~a~~v~Eg~~~ 91 (324) T protein:vir:93 21 PQVF-NPDNV-M------MHEKKDGTLLNDFTTPILQEVMENSKIMQLGKYEPMEG-TEKKFTFWADKPGAYWVGEGQKI 91 (324) T ss_pred hhhc-ccccc-c------ccCCCcceechhHHHHHHHHHHhhchhhhhcceeeccC-CceEEEEEecCcceeeecCCccc Confidence 1111 11111 1 11111235678999999999999999999888877554 45778775 66777778888887 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG 159 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~ 159 (347) +.. +++-.++++..-+. +.-+.|.+-=-.++.+|+.+.+.++.++++++..|+.+|.--. ... 
.+ T Consensus 92 ~~~--~~~f~~i~~~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~aia~~~d~a~l~G~g--------~~~----~~ 156 (324) T protein:vir:93 92 ETS--KATWVNATMRAFKL-GVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGILNQG--------NNP----FG 156 (324) T ss_pred ccc--ccceeEEEEEeEEE-EEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhcCCC--------CCC----cC Confidence 653 46667777766554 3344554432334668999999999999999999998763210 000 01 Q ss_pred CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEE Q lcl|NC_015249. 160 KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIR 239 (347) Q Consensus 160 ~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg 239 (347) .+.......... .......++.|+++...|...+... ..++++|..|..|.+-. +-.|...+..+..+ T Consensus 157 ~~~~~~~~~~~~-----~~~~~~~~~~i~~~~~~l~~~~~~~--~~~v~n~~~~~~L~~l~-----d~~G~~~~~~~~~~ 224 (324) T protein:vir:93 157 KSIAQSIEKTNK-----VIKGDFTQDNIIDLEALLEDDELEA--NAFISKTQNRSLLRKIV-----DPETKERIYDRNSD 224 (324) T ss_pred ccccccccccce-----eccccccHHHHHHHHHhhhhccCCC--CEEEEcHHHHHHHHHhh-----CCCCCeeecCCCCC Confidence 111111100000 0111224677888888888877643 36889999999987532 22333345566677 Q ss_pred EEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh-- Q lcl|NC_015249. 240 NVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN-- 317 (347) Q Consensus 240 ~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~-- 317 (347) +++|.+|+.+++.+...+ .-+-+||+... + +...+++++..++.- T Consensus 225 ~l~G~PVv~~~~~~~~~~-----------------------~i~~gdfs~~~-~---------~~~~~~~i~~~~~~~~~ 271 (324) T protein:vir:93 225 SLDGLPVVNLKSSNLKRG-----------------------ELITGDFDKLI-Y---------GIPQLIEYKIDETAQLS 271 (324) T ss_pred cccceeeEeecCCCCCcc-----------------------eEEEEecceEE-E---------EEecCcEEEEeeccccc Confidence 899999998766542111 01233444332 1 122345555554431 Q ss_pred ------------hh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 ------------FQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 ------------~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++ .-.+++..++|..++||++.+.|+...+ T Consensus 272 ~~~~~~~~~~~~f~~n~~~~r~~~r~d~~v~~~~a~~~l~~a~~ 315 (324) T protein:vir:93 272 TVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred ccccccccchhhhhcCcEEEEEEEEeccEEecccceEEEecccc Confidence 22 3456788889999999999887765554 No 78 >protein:vir:99920 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:1611 # MgeName: Halo # Cross-refs: genbank:acc:YP_655524;genbank:gi:109392294;genbank:GeneID:4157089 Probab=99.42 E-value=1.8e-13 Score=90.52 Aligned_cols=297 Identities=12% Similarity=0.030 Sum_probs=160.2 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) ||..+.... . +.-++|..++.+.....|+++.+.++..+.+ ...+||+. +..++..+..|+.+ T Consensus 1 Mat~tt~~g-------~--------~vP~~~~~~ii~~~~~~s~l~~~~~~i~~~~-~~~~~p~~~~~~~a~wv~Eg~~~ 64 (311) T protein:vir:99 1 MATFGTGNL-------K--------NLPRNIADGMVKDVVQGSTVAVLSARKPQRF-GNEDIITFNGRPKAEFVGEGQQK 64 (311) T ss_pred CceecCCCc-------e--------eccHHHHHHHHHHHHhhchhhhhcceeeccC-CceEEEEEeCCceeEEeecCccc Confidence 887663221 0 3347889999999988899998887776554 44688876 77788888888887 Q ss_pred CCccCCCCCceEEEEEEeeeecc-cccccHHH---HHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTAD-VLIYDIED---AMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~-~~Idd~D~---~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) +.. +++-.++++.. .++.. +.|.+-=- .++..|+.+.+.+++++++++.+|+.+|.... 
......+ T Consensus 65 ~~~--~~~f~~v~l~~--~k~~~~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~g------~~~g~~~ 134 (311) T protein:vir:99 65 SST--TGEFDFVTSTP--KKAQVTMRFNEEVQWADEDYQLGVLQTLSEAGAEALARALDLGLYHRIN------PLTGTVI 134 (311) T ss_pred ccc--cceeeEEEEee--EEEEEeehhhHHHhhcccccHHHHHHHHHHHHHHHHHHHHHHHhhcccC------cccCccc Confidence 653 45556666654 33333 23322101 13467899999999999999999998863210 0000011 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccccc Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPST 235 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~ 235 (347) .+.. ..+..+.... +.........+..|..+...+.........-.++++|..+..|.+-. -.+..|.-...... T Consensus 135 ~g~~--~~~~~~~~~~--~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~lk-d~~G~~l~~~~~~~ 209 (311) T protein:vir:99 135 PGWS--NYLGAASKRV--ELTADTIANPDLAIEAAVGLLVANGHPTPVNGLALHPSIAWGLSTAR-YTDGRKKFPELGLG 209 (311) T ss_pred cccc--ccccccccee--eccccccchhHHHHHHHHHHHhhhccCCCccEEEEcHHHHHHHHhhh-ccCCCeeecCcccC Confidence 1100 0000000000 00000111122334444444443332211112788999999986532 22334443444556 Q ss_pred ceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeec Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARR 315 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d 315 (347) +..+++.|++|+.|+++|........ .......+....+-+||...+-+-..+ +++++.... 
T Consensus 210 ~~~~~l~G~Pv~~s~~i~~~~~~~~~---------~~~~~~~~~~~~~~Gdf~~~~~~~~~~---------~~~~~~~~~ 271 (311) T protein:vir:99 210 IGVSSFEGIDASVSDTVNGGDEADPD---------DEDLDAARAVRGIVGDFANGIHWGVQR---------DIPVELIKY 271 (311) T ss_pred CCCceecceeeEeecccccccccccc---------cchhhccCcceEEEeeccccEEEEEec---------CceEEEeec Confidence 66789999999999999854332111 001111122223444554433222222 233333321 Q ss_pred --h-----hhhccee--eeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 316 --A-----NFQADQI--IAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 316 --~-----~~~~d~i--~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) + .++.|++ ++..++|..+++|+ ++.+....| T Consensus 272 ~~~~~~~~~~~~d~~~~r~~~r~d~~v~~~~-~v~~~~~~A 311 (311) T protein:vir:99 272 GDPDGQGDLKRHNQIALRLEIVYGWYVFTDR-FVVIENAVA 311 (311) T ss_pred CCCCcchhhhhcCcEEEEEEEeecceecChh-HeeeecccC Confidence 1 1445555 66788899988875 556666666 No 79 >protein:vir:1638 Length: 298 # NCBI annotation: Structural protein # Family: family:all:966 # MgeID: mge:33 # MgeName: r1t # Cross-refs: genbank:acc:NP_695059;genbank:gi:23455750;genbank:GeneID:955469 Probab=99.41 E-value=1.8e-13 Score=90.51 Aligned_cols=280 Identities=11% Similarity=0.005 Sum_probs=165.7 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEee-cCcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPV-LGRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~-iG~~~~~~~~~g~~~ 79 (347) ||...| . 
|..+++..++.+..+..|+++.+.++..+.+|+ +.||+ .|..++..+..|+.+ T Consensus 1 ma~~gG-~-----------------lvp~~~~~~ii~~~~~~s~i~~l~~~~~~~~~~-~~ip~~~~~~~a~~v~E~~~~ 61 (298) T protein:vir:16 1 MVLNKG-T-----------------LFDPTLVTDLISKVAGKSSIARLSAQKPIPFNG-EKVFTFTMDSEIDVVAESGKK 61 (298) T ss_pred CcccCc-c-----------------eechhHHHHHHHHHHhhhhhhhhcceeeccCCc-eEEEEEecCcceEEecCCccc Confidence 663322 1 445678888888888889999998888766544 56776 466788888888777 Q ss_pred CCccCCCCCceEEEEEEeeeeccc-ccccHHHH-----HhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADV-LIYDIEDA-----MNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~-~Idd~D~~-----q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) +.+ ++.-.++++.. .++... .|. +|. .+..++.+.+.++.++++++..|+.++.... +... T Consensus 62 ~~~--~~~f~~v~l~~--~k~a~~~~iS--~ell~~s~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~~-------~~~g 128 (298) T protein:vir:16 62 THG--GVTLAPQTMVP--IKVEYGARIS--DEFMYASDEEKINILQEFNDGFAKKVARGIDLMAFHGVN-------PRLG 128 (298) T ss_pred ccc--ccceeEEEEee--eeEEEeehhh--HHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHHhhcccc-------CCCC Confidence 653 34445555544 443332 232 222 2346788999999999999999998864210 0011 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccc Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDP 233 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~ 233 (347) .+.... +.... .................++.|+++...+...+.+.. .++++|..+..|.+-. -.+..|.-.... 
T Consensus 129 ~~~~~~-~~~~~-~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~--~~vmn~~~~~~l~~lk-d~~G~~i~~~~~ 203 (298) T protein:vir:16 129 TASAVI-GTNHF-DSKVTQKVEAPRGIADPNGAIENAVELLTGVDADVT--GIAINPSFRSALAKQK-DLQDNALFPELK 203 (298) T ss_pred cccccc-ccccc-ccccccccccccccccHHHHHHHHHHHhhhcCCCcc--EEEEcHHHHHHHHHhh-ccCCCeeecCcc Confidence 111100 00000 000000111111223456778888888888887543 4778999999887633 233444434455 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) ..|..++++|.+|+.++++|...... ....+-+||+..+.+..+. .++++.. T Consensus 204 ~~~~~~~l~G~PV~~~~~v~~~~~~~-------------------~~~~~~GDfs~~~~~~~~~---------~~~~~~~ 255 (298) T protein:vir:16 204 WGATPDTINGLPVDVNKTVSDMSLTQ-------------------RDRAIIGDFANGFKWGYAK---------EVPLEVI 255 (298) T ss_pred cCCCCceecceeeEEecccccccCCC-------------------ccEEEEeeccceEEEEEec---------CceEEEe Confidence 67777899999999999998532211 1113445666554332222 2333333 Q ss_pred e--chh------hhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 314 R--RAN------FQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~--d~~------~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) + ++. ++. 
-.+++.+++|..++||++.+.| +.| T Consensus 256 ~~~~~~~~~~~~f~~~~v~~ra~~r~d~~v~~~~a~~~l--~~a 297 (298) T protein:vir:16 256 QYGDPDNSGLDLKGYNQVYIRAELFLGWGILDATKFARV--TEA 297 (298) T ss_pred eccCCcCcchhhhhcCcEEEEEEEEEccEeecccceEEE--eec Confidence 2 221 233 3467788999999999987755 555 No 80 >protein:vir:4339 Length: 395 # NCBI annotation: major head protein # Family: family:all:585 # MgeID: mge:93 # MgeName: D3 # Cross-refs: genbank:acc:NP_061502;genbank:gi:9635591;genbank:GeneID:1262860 Probab=99.41 E-value=1.7e-13 Score=90.65 Aligned_cols=291 Identities=14% Similarity=0.108 Sum_probs=170.2 Q ss_pred CCcccccccccc-cccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-C-cceeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGK-DQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-G-RTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t-~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G-~~~~~~~~~g~ 77 (347) +....++....- +.......++.-.+..++|..++.......+.++++++...+. |.++.+++. + ..++..+..|+ T Consensus 98 ~~~~~~~~~~~~~~~~~~~~~~~~g~~vp~~~~~~ii~~~~~~~~l~~l~~~~~~~-~~~~~~~~~~~~~~~a~~v~E~~ 176 (395) T protein:vir:43 98 TSSLRGSHRVSMPRSAITSIDGSGGALVAPDRRPGVVAAPQRRLTIRDLVAPGTTE-SNSVEYVRETGFVNNAAPVSEGT 176 (395) T ss_pred HHHhhhhhhhhhhhhhhcccCCCCccccchhhHHHHHHHHHhhhhHHhhccceecC-CCceEEEEEecCCCceeeecCCc Confidence 111111111000 0000011111112567889999999998889999998888865 456778774 3 34666677777 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++. 
.+++-.++++.+.+...+ ..|.+ +-.+...++.+.+.++.+.++++..|..++..- .++..+.| T Consensus 177 ~~~~--~~~~~~~i~~~~~k~~~~-~~is~-ell~d~~~l~~~v~~~la~a~~~~~d~~~l~G~--------g~~~~~~G 244 (395) T protein:vir:43 177 QKPY--SDLTFELENAPVRTIAHL-FKASR-QILDDASALQSYIDARARYGLMLVEECQLLYGN--------GTGANLHG 244 (395) T ss_pred cccc--cccceeEEEEeeeeEEEe-ehhhH-HHHHhHHHHHHHHHHHHHHHHHHHHHHHHHhcc--------CCCCcccc Confidence 7654 346667777777666433 34532 233444568888899999999999999886321 01111111 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) .-.... ...............++.|.++...+...+.+. -.+|++|..|..|.+-. -.+..|... ...+|. T Consensus 245 i~~~~~-----~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~--~~~vmn~~~~~~l~~lk-d~~G~~i~~-~~~~~~ 315 (395) T protein:vir:43 245 IIPQAQ-----AYAPPSGVVVTAEQRIDRIRLAILQAQLAEFPA--SGIVLNPIDWALIELNK-DAENRYIIG-SPQNGT 315 (395) T ss_pred cccccc-----ccccccccccccchhHHHHHHHHHhhccccCCC--cEEEEcHHHHHHHHHhh-ccCCceecc-ccccCC Confidence 111100 000011111223346788888888888877653 36789999999886433 223334332 244666 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech- Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA- 316 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~- 316 (347) .+.++|.+|+.++.+|... -+-+||+...-+ +.+ .+++++..+.. 
T Consensus 316 ~~~l~G~pVv~~~~~~~~~-------------------------~~~gd~~~~~~~-~~~--------~~~~i~~~~~~~ 361 (395) T protein:vir:43 316 TPTLWRLPVVETQAITQDE-------------------------FLTGAFSLGAQI-FDR--------MDIEVLVSTEND 361 (395) T ss_pred CceecceeeEEcCCCCCCc-------------------------EEEEeccceEEE-EEe--------cceEEEEecccc Confidence 7789999999999998421 012344443222 212 22445544432 Q ss_pred -hhh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 317 -NFQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 317 -~~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++ ...+++.++++.++++|++.+.+.++.| T Consensus 362 ~~f~~~~~~~r~~~r~d~~v~~~~a~~~~~~taa 395 (395) T protein:vir:43 362 KDFENNMVTIRAEERLAFAVYRPEAFVTGSLTAS 395 (395) T ss_pred chhhcCcEEEEEEEeeccEEecccceEEEEeccC Confidence 233 3356777899999999999999999999 No 81 >protein:vir:97053 Length: 390 # NCBI annotation: putative head protein # Family: family:all:585 # MgeID: mge:1653 # MgeName: OP1 # Cross-refs: genbank:acc:YP_453565;genbank:gi:84662600;genbank:GeneID:5142468 Probab=99.40 E-value=1.4e-13 Score=91.03 Aligned_cols=285 Identities=16% Similarity=0.095 Sum_probs=171.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecC--cceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLG--RTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG--~~~~~~~~~g~~ 78 (347) ........ ..+.+.....++.-.+..+++...+.......+.++++++...+.+ .+..+++.. ...+..+..|+. 
T Consensus 101 ~~~~~~~~--~~~~~~~~~~~~~g~lip~~~~~~ii~~~~~~~~i~~~~~~~~~~~-~~~~~~~~~~~~~~a~~v~Eg~~ 177 (390) T protein:vir:97 101 RATMNIKA--ALNTASTDAAGSAGALTTPNRLPGFITPPDARLTVRDLIGSGRTDS-ALIEYVQETGFVNNAAIVAEGAL 177 (390) T ss_pred hhhhHHHH--HHHhhhcccccccccccchhhhHHHHHHHhhhhhhHhhcceeeccC-CceEEEEEecCCcceeeecCCcc Confidence 00000000 0011111122222336678899999999999999998888877654 467777753 346677777887 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++.. +++-.++++.+.+.. .-..|.+ +-.+...++.+.+.++.++++++..|+.+|.. +.++..+. T Consensus 178 ~~~~--~~~~~~i~~~~~k~~-~~~~is~-ell~ds~~l~~~i~~~la~a~~~~~d~a~l~G--------~g~~~~p~-- 243 (390) T protein:vir:97 178 KPES--SLKFAKKTDTTHVIA-HTMKATR-QILSDAPQLASYMNNRLIRGLKVKEDAEILRG--------TGANDGLL-- 243 (390) T ss_pred cccc--ccceeEEEEeeeeEE-EeehhhH-HHHHhHHHHHHHHHHHHHHHHHHHHHHHHhhc--------CCCCcccc-- Confidence 7653 466677778777653 3334543 22333457889999999999999999987631 11111111 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) |-....+... ......+...++.|.++...+.....+.. .+|++|..|..|.+-. -.++.|.-.. ...+.. 
T Consensus 244 --Gi~~~~~~~~---~~~~~~~~~~~d~~~~~~~~~~~~~~~~~--~~v~n~~~~~~L~~lk-d~~G~~l~~~-~~~~~~ 314 (390) T protein:vir:97 244 --GLIPQATTYA---APTTIAGATRVDQLRLAMLQASLAEYPAS--GIVINPIDWAAIELAK-DANNQYLIGN-ARGTLT 314 (390) T ss_pred --ceeecccccc---ccccccccchHHHHHHHHHhhccccCCCC--EEEEcHHHHHHHHHhh-cCCCceeecC-ccCCCC Confidence 1111111000 11112233457788888899998888654 4678999999987533 2333343222 234555 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech-h Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA-N 317 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~-~ 317 (347) ++++|.+|+.|+.+|... -+-+||+..+. ++ ..++++++..++. . T Consensus 315 ~~l~G~pV~~~~~~~~~~-------------------------~~~gd~~~~~~-~~--------~~~~~~i~~~~~~~~ 360 (390) T protein:vir:97 315 PTLWGLPVVATQAMAPGE-------------------------FLVGAFDLAAQ-IF--------DQWDARVEIGYVNDD 360 (390) T ss_pred ceecceeeEEcCCCCCCc-------------------------EEEEeccceEE-EE--------EecceEEEEeecccc Confidence 789999999999988421 02234443322 22 2345677776653 4 Q ss_pred hhcce--eeeeeeecccccccceEEEEEEc Q lcl|NC_015249. 318 FQADQ--IIAKYAMGHGGLRPEACGALVFN 345 (347) Q Consensus 318 ~~~d~--i~~~~a~G~~~~Rpe~a~~i~~~ 345 (347) ++.++ ++...+|+..+++|++.+.+.+. T Consensus 361 f~~~~~~~r~~~r~d~~v~~~~a~v~~~~a 390 (390) T protein:vir:97 361 FQRNMVTVLAEERLALVVYRPEALITGSFA 390 (390) T ss_pred cccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 45554 56777899999999999999888 No 82 >protein:vir:96392 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1613 # MgeName: 53 # Cross-refs: genbank:acc:YP_239648;genbank:gi:66395381;genbank:GeneID:5132868 Probab=99.39 E-value=1e-13 Score=91.86 Aligned_cols=284 Identities=13% Similarity=0.086 Sum_probs=166.5 Q ss_pred CC--------------cccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec- Q lcl|NC_015249. 
1 MA--------------KMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL- 65 (347) Q Consensus 1 ma--------------~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i- 65 (347) |= +...... .+.......++.-.+.-+.|..++.+.....|.++.+.++.++. |.+++||+. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~--~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l~~~~~~~-~~~~~~p~~~ 77 (324) T protein:vir:96 1 MEQTQKLKLNLQHFASNNVKPQV--FNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLGKYEPME-GTEKKFTFWA 77 (324) T ss_pred CCcchhhhHHHHHHHHHhhhhhh--hccccccccCcCccccchhHHHHHHHHHHhhchhhhhcceeecc-CCceEEEEEe Confidence 11 1111110 01101111222223556889999999988889999998887755 556888876 Q ss_pred CcceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015249. 66 GRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLC 145 (347) Q Consensus 66 G~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a 145 (347) +...+..+..|+.++. .+++-.++++...+. ..-..|.+-=-.++.+|+.+.+.++.++++++..|+.+|.--. T Consensus 78 ~~~~a~~v~Eg~~~~~--~~~~~~~v~~~~~k~-~~~~~is~ell~ds~~~l~~~i~~~la~ai~~~~d~a~l~G~g--- 151 (324) T protein:vir:96 78 DKPGAYWVGEGQKIET--SKATWVNATMRAFKL-GVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGILNQG--- 151 (324) T ss_pred cCcceeEecCCccccc--cccceeEEEEeeEEE-EEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhccCC--- Confidence 6667777888888765 356667777776554 3334454322234568999999999999999999998863210 Q ss_pred hhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhh Q lcl|NC_015249. 146 NLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAA 225 (347) Q Consensus 146 ~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~ 225 (347) ... .+.+....... ..... .+...++.|+++...|...+.... .++++|..|..|.+-.. 
T Consensus 152 -----~~~----~~~gi~~~~~~-~~~~~----~~~~t~~~i~~~~~~l~~~~~~~~--~~vmn~~~~~~L~~l~d---- 211 (324) T protein:vir:96 152 -----NNP----FGKSIAQSIEK-TNKVI----KGDFTQDNIIDLEALLEDDELEAN--AFISKTQNRSLLRKIVD---- 211 (324) T ss_pred -----CCC----cCccccccccc-cceec----cccccHHHHHHHHHhhhhccCCCC--EEEEcHHHHHHHHHhhc---- Confidence 000 11111111110 01001 112247778888888888776433 57899999998865322 Q ss_pred hhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhh Q lcl|NC_015249. 226 NYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKL 305 (347) Q Consensus 226 ~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~ 305 (347) -.|...+..+..+.++|.+|+.++..+...+ .-+-+||+... + +.. T Consensus 212 -~~G~~~~~~~~~~~l~G~PV~~~~~~~~~~~-----------------------~~~~gd~~~~~-~---------g~~ 257 (324) T protein:vir:96 212 -PETKERIYDRNSDSLDGLPVVNLKSSNLKRG-----------------------ELITGDFDKLI-Y---------GIP 257 (324) T ss_pred -cCCCeeecCCCCCcccceeeEeeCCCCCCcc-----------------------eEEEEecceEE-E---------EEe Confidence 2233335566777899999998765542211 01223444432 1 222 Q ss_pred cceeeeeeechh--------------hh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 306 KDMALERARRAN--------------FQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 306 ~~~~~e~~~d~~--------------~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++++|..++.. 
++ ...+++.++++..++||++.+.|....+ T Consensus 258 ~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~~~~A~~~l~~a~~ 315 (324) T protein:vir:96 258 QLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred cCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEecccceEEEecccc Confidence 345555554431 22 3445677889999999998876655444 No 83 >protein:vir:78830 Length: 324 # NCBI annotation: major head protein # Family: family:all:507 # MgeID: mge:1858 # MgeName: 80alpha # Cross-refs: genbank:acc:YP_001285361;genbank:gi:148717889;genbank:GeneID:5246961 Probab=99.39 E-value=1e-13 Score=91.86 Aligned_cols=284 Identities=13% Similarity=0.086 Sum_probs=166.5 Q ss_pred CC--------------cccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec- Q lcl|NC_015249. 1 MA--------------KMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL- 65 (347) Q Consensus 1 ma--------------~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i- 65 (347) |= +...... .+.......++.-.+.-+.|..++.+.....|.++.+.++.++. |.+++||+. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~--~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~l~~~~~~~-~~~~~~p~~~ 77 (324) T protein:vir:78 1 MEQTQKLKLNLQHFASNNVKPQV--FNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLGKYEPME-GTEKKFTFWA 77 (324) T ss_pred CCcchhhhHHHHHHHHHhhhhhh--hccccccccCcCccccchhHHHHHHHHHHhhchhhhhcceeecc-CCceEEEEEe Confidence 11 1111110 01101111222223556889999999988889999998887755 556888876 Q ss_pred CcceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015249. 66 GRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLC 145 (347) Q Consensus 66 G~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a 145 (347) +...+..+..|+.++. .+++-.++++...+. ..-..|.+-=-.++.+|+.+.+.++.++++++..|+.+|.--. 
T Consensus 78 ~~~~a~~v~Eg~~~~~--~~~~~~~v~~~~~k~-~~~~~is~ell~ds~~~l~~~i~~~la~ai~~~~d~a~l~G~g--- 151 (324) T protein:vir:78 78 DKPGAYWVGEGQKIET--SKATWVNATMRAFKL-GVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGILNQG--- 151 (324) T ss_pred cCcceeEecCCccccc--cccceeEEEEeeEEE-EEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhccCC--- Confidence 6667777888888765 356667777776554 3334454322234568999999999999999999998863210 Q ss_pred hhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhh Q lcl|NC_015249. 146 NLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAA 225 (347) Q Consensus 146 ~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~ 225 (347) ... .+.+....... ..... .+...++.|+++...|...+.... .++++|..|..|.+-.. T Consensus 152 -----~~~----~~~gi~~~~~~-~~~~~----~~~~t~~~i~~~~~~l~~~~~~~~--~~vmn~~~~~~L~~l~d---- 211 (324) T protein:vir:78 152 -----NNP----FGKSIAQSIEK-TNKVI----KGDFTQDNIIDLEALLEDDELEAN--AFISKTQNRSLLRKIVD---- 211 (324) T ss_pred -----CCC----cCccccccccc-cceec----cccccHHHHHHHHHhhhhccCCCC--EEEEcHHHHHHHHHhhc---- Confidence 000 11111111110 01001 112247778888888888776433 57899999998865322 Q ss_pred hhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhh Q lcl|NC_015249. 226 NYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKL 305 (347) Q Consensus 226 ~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~ 305 (347) -.|...+..+..+.++|.+|+.++..+...+ .-+-+||+... + +.. T Consensus 212 -~~G~~~~~~~~~~~l~G~PV~~~~~~~~~~~-----------------------~~~~gd~~~~~-~---------g~~ 257 (324) T protein:vir:78 212 -PETKERIYDRNSDSLDGLPVVNLKSSNLKRG-----------------------ELITGDFDKLI-Y---------GIP 257 (324) T ss_pred -cCCCeeecCCCCCcccceeeEeeCCCCCCcc-----------------------eEEEEecceEE-E---------EEe Confidence 2233335566777899999998765542211 01223444432 1 222 Q ss_pred cceeeeeeechh--------------hh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
306 KDMALERARRAN--------------FQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 306 ~~~~~e~~~d~~--------------~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++++|..++.. ++ ...+++.++++..++||++.+.|....+ T Consensus 258 ~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~~~~A~~~l~~a~~ 315 (324) T protein:vir:78 258 QLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred cCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEecccceEEEecccc Confidence 345555554431 22 3445677889999999998876655444 No 84 >protein:vir:96223 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1607 # MgeName: 69 # Cross-refs: genbank:acc:YP_239571;genbank:gi:66395304;genbank:GeneID:5132771 Probab=99.39 E-value=8.4e-14 Score=92.27 Aligned_cols=284 Identities=13% Similarity=0.061 Sum_probs=164.9 Q ss_pred CC-----------cccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-Ccc Q lcl|NC_015249. 1 MA-----------KMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRT 68 (347) Q Consensus 1 ma-----------~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~ 68 (347) |. +.--++. .++.-.....+.-.+.-+++..++.+.....|.++++.++..+. |.+++||+. +.. T Consensus 4 ~~~~~~~~~~f~~~~~~~~~--~~a~~~~~~~~~~~lip~~~~~~ii~~~~~~s~l~~l~~~~~~~-~~~~~~p~~~~~~ 80 (324) T protein:vir:96 4 TQKLKLNLQHFASNNVKPQV--FNPDNVMMHEKKDGTLLNDFTTPILQEVMENSKIMQLGKYEPME-GTEKKFTFWADKP 80 (324) T ss_pred chhhhHHHHHHHHhhhhhhh--cccccccccCCCcceechhHHHHHHHHHHhhchhhhhcceeecc-CCceEEEEEecCc Confidence 10 0000000 01111111122223566899999999998999999998887755 456888876 556 Q ss_pred eeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc Q lcl|NC_015249. 69 KAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLP 148 (347) Q Consensus 69 ~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~ 148 (347) .+..+..|+.++. .+++-.++++..-+.. .-..|.+-=-.++.+|+.+.+.++.++++++..|+.+|.--. 
T Consensus 81 ~a~~v~Eg~~~~~--~~~~f~~v~~~~~k~~-~~~~is~ell~ds~~~l~~~i~~~l~~aia~~~d~~~l~G~g------ 151 (324) T protein:vir:96 81 GAYWVGEGQKIET--SKATWVNATMRAFKLG-VILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGILNQG------ 151 (324) T ss_pred ceeeecCCccccc--cccceeEEEEEeEEEE-EeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhhcCC------ Confidence 7777788888765 3466677777665543 334554322224568899999999999999999998863210 Q ss_pred cccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhc Q lcl|NC_015249. 149 SASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQ 228 (347) Q Consensus 149 ~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~ 228 (347) ....+ .+.. ........ ...+...++.|+++...+...+.... .++++|..+..|.+-.. -. T Consensus 152 --~~~~~----~~~~-~~~~~~~~----~~~~~~~~~~i~~~~~~i~~~~~~~~--~~i~n~~~~~~L~~lkd-----~~ 213 (324) T protein:vir:96 152 --NNPFG----KSIA-QSIKKTNK----VIKGDFTQDNIIDLEALLEDDELEAN--AFISKTQNRSLLRKIVD-----PE 213 (324) T ss_pred --CCCcC----cccc-ccccccce----ecccccchHHHHHHHHhhhhccCCCC--EEEEcHHHHHHHHHhhC-----CC Confidence 00011 1100 00000000 01112236778888888887766433 57899999999875322 22 Q ss_pred cccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcce Q lcl|NC_015249. 229 ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDM 308 (347) Q Consensus 229 ~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~ 308 (347) |.-.+..+..+.++|++|+.++..+...+. -+-+||+... .+...++ T Consensus 214 G~~~~~~~~~~~l~G~PV~~~~~~~~~~~~-----------------------~~~gd~s~~~----------~~~~~~~ 260 (324) T protein:vir:96 214 TKERIYDRNSDSLDGLPVVNLKSSNLKRGE-----------------------LITGDFDKLI----------YGIPQLI 260 (324) T ss_pred CCeeecCCCCCcccceeeEeecCCCCCcce-----------------------EEEEecceEE----------EEEecCc Confidence 333345666778999999987665432110 1222333322 1222344 Q ss_pred eeeeeechh--------------hhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
309 ALERARRAN--------------FQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 309 ~~e~~~d~~--------------~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++..++.. ++. -.+++..++|.+++||++.+.|.-..+ T Consensus 261 ~i~~~~~~~~~~~~~~~~~~~~~~~~n~v~~r~~~r~d~~v~~~~a~~~l~~a~~ 315 (324) T protein:vir:96 261 EYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred EEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccceEEEecccc Confidence 555554421 222 345778889999999998886665444 No 85 >protein:vir:9759 Length: 303 # NCBI annotation: putative structural protein # Family: family:all:966 # MgeID: mge:175 # MgeName: 315.3 # Cross-refs: genbank:acc:NP_795521;genbank:gi:28876283;genbank:GeneID:1257824 Probab=99.39 E-value=2.9e-13 Score=89.32 Aligned_cols=284 Identities=10% Similarity=0.013 Sum_probs=162.8 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) ||..+.+.. +.-+++..++.+.-+..|.++.+.++..+.+ .+++||+. +...+..+..|+.+ T Consensus 1 m~t~t~gg~----------------liP~~~~~~ii~~l~~~s~i~~l~~~~~~~~-~~~~ip~~~~~~~a~wv~E~~~~ 63 (303) T protein:vir:97 1 MGTETSKAS----------------LFDKHLVSDLINKVKGHSSLAKLSSQKPIPF-NGSKEFTFTLDSDIDVVAENGKK 63 (303) T ss_pred CcccCCCCe----------------EcchhHHHHHHHHHHhhchhhhhcceeecCC-CceEEEEEecCcceEEeecCccc Confidence 765443221 3447899999999888999999988877654 45677774 66678888888877 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHH-----HhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDA-----MNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~-----q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) +.+ +++-+++++..-+. .....|. +|. ....++.+.+.++.+++|++..|+.++...- ...... 
T Consensus 64 ~~s--~~~f~~v~l~~~kl-~~~~~iS--~ell~~~~d~~~~l~~~i~~~la~a~~~~ld~a~l~G~~----~~~g~~-- 132 (303) T protein:vir:97 64 THG--GLSLEPVTIVPIKV-EYGARLS--DEFLYATEEEKIDILKAFNEGFAKKLARGIDLMAMHGIN----PRTKKA-- 132 (303) T ss_pred ccc--ccceeeEEeeeEEE-EEeehhh--HHHhhcCccchHHHHHHHHHHHHHHHHHHHHhhhhcccc----cCCccc-- Confidence 653 45555666654333 2222332 222 2456788999999999999999998864320 000000 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccc-ccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQAL-IDP 233 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~-~~~ 233 (347) +.+.+.....+..+ ...........++.|.++...+...+... ..++++|..+..|.+-..- +..+.-. ... T Consensus 133 --~~~~~~~~~~~~~~--~~~~~~~~~~~~~~i~~~~~~~~~~~~~~--~~~vmn~~~~~~L~~lkd~-~g~~~~~~~~~ 205 (303) T protein:vir:97 133 --SDVIGTNHFDSKVT--QVVKFTESEDADANIEAAVNLIQGAEGVV--TGLAMDTEFSTALAKVTNG-EMGPKMYPELA 205 (303) T ss_pred --cccccccccccccc--cccccccccchHHHHHHHHHHHhhcCCCc--cEEEEcHHHHHHHHHhhcc-CCCeEEecCcc Confidence 00011000000000 00011112235778888888887766543 3488899999988753322 2222211 112 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) ..+..++++|.+|+.|+++|...... .....-+-+||.+.+.+..+.. +++|.. T Consensus 206 ~~~~~~~l~G~Pv~~s~~v~~~~~~~-----------------~~~~~~~~Gdf~~~~~~~~~~~---------~~~~~~ 259 (303) T protein:vir:97 206 WGANPDSINGLKSSVNTTVGAGADEA-----------------ESKDLVIIGDFESMFKWGYAKQ---------IPMEII 259 (303) T ss_pred CCCCCceecceeeEEecccCCccccC-----------------CCccEEEEeeccccEEEEEecC---------cEEEEe Confidence 34455789999999999998533211 0011234456655544433322 333333 Q ss_pred e--chh------hhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
314 R--RAN------FQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~--d~~------~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) . |+. ++.| .+++..+++.+++||++-+. ++.| T Consensus 260 ~~~~~d~~~~~~~~~n~~~~r~~~r~~~~v~~p~af~~--l~~~ 301 (303) T protein:vir:97 260 KYGDPDNSGKDLKGYNQIYLRAEAYIGWGILDAKSFAR--VTKG 301 (303) T ss_pred eccCCCCcchhhhhcCcEEEEEEEEeccEeecccceEE--eeCC Confidence 1 211 3333 46678899999999997774 4455 No 86 >protein:vir:1886 Length: 385 # NCBI annotation: major capsid subunit precursor # Family: family:all:585 # MgeID: mge:41 # MgeName: HK022 # Cross-refs: genbank:acc:NP_037666;genbank:gi:9634124;genbank:GeneID:1262513 Probab=99.39 E-value=1.5e-13 Score=90.90 Aligned_cols=286 Identities=15% Similarity=0.119 Sum_probs=167.8 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-C-cceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-G-RTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G-~~~~~~~~~g~~ 78 (347) +.+..... .+..-....++.-.+..+++..++.+.....+.++.++++..+. +.++.+|+. + ...+.....|+. T Consensus 93 ~~~~~~~~---~~~~~~~~~~~~g~~i~~~~~~~ii~~~~~~~~l~~~~~~~~~~-~~~~~~~~~~~~~~~a~~v~E~~~ 168 (385) T protein:vir:18 93 QGTFGAKT---FNKSLGSDADSAGSLIQPMQIPGIIMPGLRRLTIRDLLAQGRTS-SNALEYVREEVFTNNADVVAEKAL 168 (385) T ss_pred hccchhhH---HHhhhccccccCCceecchhhhHHHHHhhhccchhhhcceeccc-CcceEEEEEecCCcceeeeccCcc Confidence 11110000 00000001111111455788888988888888888888887754 457888876 3 345666677777 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++. .+++-.++++.+.+.- ..+.|.+ +-.+...++.+.+.++.++++++..|+.++.-- .++..+.+. 
T Consensus 169 ~~~--~~~~~~~~~~~~~k~~-~~~~is~-ell~d~~~l~~~i~~~la~a~~~~~d~~~l~G~--------g~~~~~~Gi 236 (385) T protein:vir:18 169 KPE--SDITFSKQTANVKTIA-HWVQASR-QVMDDAPMLQSYINNRLMYGLALKEEGQLLNGD--------GTGDNLEGL 236 (385) T ss_pred ccc--cccceeEEEEeeeeEE-EeehhhH-HHHhhHHHHHHHHHHHHHHHHHHHHHHHHHhcc--------CCCCccccc Confidence 654 3466677777776653 3344532 223334568888899999999999999876321 011111111 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) . ........ .........++.|.++...|...+.+. -.++++|..|..|.+-. -.+..|.... ...|.. T Consensus 237 ~-----~~~~~~~~--~~~~~~~~~~d~i~~~~~~l~~~~~~~--~~~~~~~~~~~~l~~lk-d~~G~~l~~~-~~~~~~ 305 (385) T protein:vir:18 237 N-----KVATAYDT--SLNATGDTRADIIAHAIYQVTESEFSA--SGIVLNPRDWHNIALLK-DNEGRYIFGG-PQAFTS 305 (385) T ss_pred c-----cccccccc--cccccccchHHHHHHHHHhhccccCCC--CEEEEcHHHHHHHHHhh-cCCCceeccC-cccCCC Confidence 1 11000000 011112335788888888887776543 36889999999987543 2333443322 346667 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh- Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN- 317 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~- 317 (347) +.++|.+|+.|+.+|.+. -+-+||..... ++.+ ++++++..++.. T Consensus 306 ~~l~G~pV~~~~~~p~~~-------------------------~~~gd~~~~~~-~~~~--------~~~~v~~~~~~~~ 351 (385) T protein:vir:18 306 NIMWGLPVVPTKAQAAGT-------------------------FTVGGFDMASQ-VWDR--------MDATVEVSREDRD 351 (385) T ss_pred ceecceeeEEcCcCCCCc-------------------------EEEeecccEEE-EEEe--------cceEEEEeccccc Confidence 889999999999998421 11234443332 2222 345555554332 Q ss_pred -hhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 -FQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 -~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.- ..+++.+++|..+.+|++.+.+.+..| T Consensus 352 ~~~~~~~~~~~~~r~~~~v~~~~a~~~~~~~aa 384 (385) T protein:vir:18 352 NFVKNMLTILCEERLALAHYRPTAIIKGTFSSG 384 (385) T ss_pred hhhcCcEEEEEEEeeccEEecccceEEEEeccC Confidence 333 356788899999999999999999999 No 87 >protein:vir:191 Length: 385 # NCBI annotation: major head subunit precursor # Family: family:all:585 # MgeID: mge:6 # MgeName: HK97 # Cross-refs: genbank:acc:NP_037701;genbank:gi:9634158;genbank:GeneID:1262530 Probab=99.39 E-value=1.5e-13 Score=90.90 Aligned_cols=286 Identities=15% Similarity=0.119 Sum_probs=167.8 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-C-cceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-G-RTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G-~~~~~~~~~g~~ 78 (347) +.+..... .+..-....++.-.+..+++..++.+.....+.++.++++..+. +.++.+|+. + ...+.....|+. T Consensus 93 ~~~~~~~~---~~~~~~~~~~~~g~~i~~~~~~~ii~~~~~~~~l~~~~~~~~~~-~~~~~~~~~~~~~~~a~~v~E~~~ 168 (385) T protein:vir:19 93 QGTFGAKT---FNKSLGSDADSAGSLIQPMQIPGIIMPGLRRLTIRDLLAQGRTS-SNALEYVREEVFTNNADVVAEKAL 168 (385) T ss_pred hccchhhH---HHhhhccccccCCceecchhhhHHHHHhhhccchhhhcceeccc-CcceEEEEEecCCcceeeeccCcc Confidence 11110000 00000001111111455788888988888888888888887754 457888876 3 345666677777 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++. .+++-.++++.+.+.- ..+.|.+ +-.+...++.+.+.++.++++++..|+.++.-- .++..+.+. 
T Consensus 169 ~~~--~~~~~~~~~~~~~k~~-~~~~is~-ell~d~~~l~~~i~~~la~a~~~~~d~~~l~G~--------g~~~~~~Gi 236 (385) T protein:vir:19 169 KPE--SDITFSKQTANVKTIA-HWVQASR-QVMDDAPMLQSYINNRLMYGLALKEEGQLLNGD--------GTGDNLEGL 236 (385) T ss_pred ccc--cccceeEEEEeeeeEE-EeehhhH-HHHhhHHHHHHHHHHHHHHHHHHHHHHHHHhcc--------CCCCccccc Confidence 654 3466677777776653 3344532 223334568888899999999999999876321 011111111 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) . ........ .........++.|.++...|...+.+. -.++++|..|..|.+-. -.+..|.... ...|.. T Consensus 237 ~-----~~~~~~~~--~~~~~~~~~~d~i~~~~~~l~~~~~~~--~~~~~~~~~~~~l~~lk-d~~G~~l~~~-~~~~~~ 305 (385) T protein:vir:19 237 N-----KVATAYDT--SLNATGDTRADIIAHAIYQVTESEFSA--SGIVLNPRDWHNIALLK-DNEGRYIFGG-PQAFTS 305 (385) T ss_pred c-----cccccccc--cccccccchHHHHHHHHHhhccccCCC--CEEEEcHHHHHHHHHhh-cCCCceeccC-cccCCC Confidence 1 11000000 011112335788888888887776543 36889999999987543 2333443322 346667 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh- Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN- 317 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~- 317 (347) +.++|.+|+.|+.+|.+. -+-+||..... ++.+ ++++++..++.. T Consensus 306 ~~l~G~pV~~~~~~p~~~-------------------------~~~gd~~~~~~-~~~~--------~~~~v~~~~~~~~ 351 (385) T protein:vir:19 306 NIMWGLPVVPTKAQAAGT-------------------------FTVGGFDMASQ-VWDR--------MDATVEVSREDRD 351 (385) T ss_pred ceecceeeEEcCcCCCCc-------------------------EEEeecccEEE-EEEe--------cceEEEEeccccc Confidence 889999999999998421 11234443332 2222 345555554332 Q ss_pred -hhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 -FQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 -~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.- ..+++.+++|..+.+|++.+.+.+..| T Consensus 352 ~~~~~~~~~~~~~r~~~~v~~~~a~~~~~~~aa 384 (385) T protein:vir:19 352 NFVKNMLTILCEERLALAHYRPTAIIKGTFSSG 384 (385) T ss_pred hhhcCcEEEEEEEeeccEEecccceEEEEeccC Confidence 333 356788899999999999999999999 No 88 >protein:vir:104085 Length: 320 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:1656 # MgeName: Che12 # Cross-refs: genbank:acc:YP_655596;genbank:gi:109392467;genbank:GeneID:4156953 Probab=99.38 E-value=3.7e-13 Score=88.75 Aligned_cols=295 Identities=12% Similarity=0.018 Sum_probs=161.2 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) ||--+... ...+.-..-.+++.-.+..+++..++.+...+.|.++.+.++..+. +.+.+||+. +...+.....|+.+ T Consensus 1 ~~~~~~~~-~~~~~~~~t~~~~~~~~ip~~~~~~ii~~~~~~s~l~~~~~~~~~~-~~~~~~p~~~~~~~a~~v~E~~~~ 78 (320) T protein:vir:10 1 MAAGTAFQ-VDHAQIAQTGDTMFKGYLEPEQAKDYFAEAEKTSIVQQFAQKVPMG-TTGQKIPHWIGDVSAQWIGEGDMK 78 (320) T ss_pred CCCCccCC-HHHHHhhccccccccccccHHHHHHHHHHHHhccchhhhcceeecc-CCceEEEEEeCCcceEEecCCccc Confidence 55544321 1122222212222122556889999999999999999888877755 456788876 55677777888887 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG 159 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~ 159 (347) +.. +++-.++++.+-+. ..-+.|.+-=-.++..|+.+.+.++.++++++.+|+.+|.--. +.....+.+.. 
T Consensus 79 ~~~--~~~f~~v~~~~~k~-~~~~~is~ell~ds~~~l~~~i~~~l~~a~a~~~d~a~l~G~g------~~~~~~~~~~~ 149 (320) T protein:vir:10 79 PIT--KGNMTSQNIAPHKI-ATIFVASAETVRANPANYLGTMRTKVATAFAMAFDSAALNGTD------SPFPTYLAQTT 149 (320) T ss_pred ccc--ccceeEEEEeeEEE-EEeehhhHHHHhcChHHHHHHHHHHHHHHHHHHHHHHhhcccC------CCCCccccccc Confidence 653 46666666666554 2333443322234678999999999999999999998863110 00000000000 Q ss_pred CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccc-----ccc Q lcl|NC_015249. 160 KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALI-----DPS 234 (347) Q Consensus 160 ~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~-----~~~ 234 (347) .+ .........+.+. ....-+.+.++...+...+.+ .-+++++|..|..|.+-.. .+..+.... ... T Consensus 150 ~~--~~~~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~--~~~~v~n~~~~~~L~~lkd-~~G~~l~~~~~~~~~~~ 221 (320) T protein:vir:10 150 KS--VSLADPGGATASD---LTAYDAVAVNGLSLLVNAKKK--WTHTLLDDIVEPILNGAKD-KNGRPLFIESTYTDENS 221 (320) T ss_pred cc--ccceecccccccc---cccHHHHHHHHHhhhhcccCC--CcEEEEcHHHHHHHHHhhc-cCCceeeccccccCccc Confidence 00 0000010111110 111123355666666665543 4478899999999975322 222222111 111 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) ...-+++.|++|+.++++|..... -+-+||++.. + +...+++++..+ T Consensus 222 ~~~~~~i~g~pv~~~~~~~~~~~~-----------------------~~~gd~~~~~-~---------~~~~~~~i~~~~ 268 (320) T protein:vir:10 222 PFRAGRIVSRPTILSDHVADGTTV-----------------------GYMGDFRNVI-W---------GQVGGLSFDVTD 268 (320) T ss_pred cccCceeeeeeeEecCCCCCCceE-----------------------EEEeecceEE-E---------EEecCeEEEEee Confidence 111246899999999988742100 0123444432 1 222234555444 Q ss_pred chh--------------hhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 RAN--------------FQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~--------------~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.. ++. -.+++.+.++..++||++.+.|.-..| T Consensus 269 ~~~~~~~~~~~~~~~~~f~~~~~~~r~~~~~d~~v~~~~a~~~l~~~~a 317 (320) T protein:vir:10 269 QATLNLGTPTEPNFVSLWQHNLVAVRVEAEYAFHNNDKDAFVKLTNVVT 317 (320) T ss_pred cceeeeccccccccchhhhcCcEEEEEEEeeccEEecccceEEEEeccC Confidence 322 222 345677899999999998877654444 No 89 >protein:vir:10364 Length: 390 # NCBI annotation: head protein; major capsid subunit precursor # Family: family:all:585 # MgeID: mge:183 # MgeName: Xp10 # Cross-refs: genbank:acc:NP_858956;genbank:gi:32128421;genbank:GeneID:2648357 Probab=99.37 E-value=3.7e-13 Score=88.77 Aligned_cols=279 Identities=17% Similarity=0.113 Sum_probs=164.3 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--CcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~~~~~~g~~ 78 (347) ++....+ ....+++.-.+....+...+.......+.++.++++.++.+ .++.+++. +..++.....|+. T Consensus 107 ~~~~~~~--------~~~~~~~~g~~~~~~~~~~ii~~~~~~~~l~~~~~~~~~~~-~~~~~~~~~~~~~~a~~v~Eg~~ 177 (390) T protein:vir:10 107 KAALNTA--------STDAAGSAGALTTPNRLPGFITQPDARLTVRDLIGSGRTDS-ALIEYVQETGFVNNAAIVAEGAL 177 (390) T ss_pred HHHHHhh--------hcccccccccccchhHHHHHHHHHHhhchhhhhcceeeccC-CceEEEEEecCCcceeeecCCcc Confidence 1111111 11111111225566677777777777778888888877554 46777765 3346666777777 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++. .+++-.++++.+.+.. .-+.|.+ +-.+...++.+.+.++.+.++++..|+.++.-- .++..+.|. 
T Consensus 178 ~~~--~~~~~~~i~~~~~k~~-~~~~is~-ell~d~~~l~~~i~~~l~~~~~~~~~~~il~G~--------G~~~~p~Gi 245 (390) T protein:vir:10 178 KPE--SSLKFAKKTDTTHVIA-HTMKATR-QILSDAPQLASYMNNRLIRGLKVKEDAEILRGT--------GANDGLLGL 245 (390) T ss_pred ccc--cccceeEEEEeeEEEE-EeehhhH-HHHHhHHHHHHHHHHHHHHHHHHHHHHHHhhcC--------CCCcccccc Confidence 654 3466677777776653 3334433 123334678889999999999999999876310 011111111 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) -.. .+... ......+...++.|.++...|...+.+.. .+|++|..|..|.+-. -.+..|.-.... .+.. T Consensus 246 ~~~----~~~~~---~~~~~~~~~~~~~~~~~~~~l~~~~~~~~--~~v~n~~~~~~L~~lk-d~~g~~l~~~~~-~~~~ 314 (390) T protein:vir:10 246 IPQ----ATTYA---APTTIAGATRVDQLRLAMLQASLAEYPAS--GIVINPIDWAAIELAK-DANNQYLIGNAR-GTLT 314 (390) T ss_pred ccc----ccccc---ccccccccchHHHHHHHHHhhccccCCCC--EEEEcHHHHHHHHHhh-cCCCceeecCCc-CcCC Confidence 111 00000 01111122356778888889988887654 4679999999887533 233334322222 3335 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech-h Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA-N 317 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~-~ 317 (347) +.++|.+|+.++.+|.+. -+-+||+..+.++- ..+++++..++. . T Consensus 315 ~~l~G~pv~~~~~~p~~~-------------------------~~~gdf~~~~~~~~---------~~~~~i~~~~~~~~ 360 (390) T protein:vir:10 315 PTLWGLPVVATQAMAPGE-------------------------FLVGAFDLAAQIFD---------QWDARVEIGYVNDD 360 (390) T ss_pred ceecceeeEEcCCCCCCc-------------------------EEEEeccceEEEEE---------ecceEEEEeecccc Confidence 689999999999998321 12245554433322 234566666543 3 Q ss_pred hhcc--eeeeeeeecccccccceEEEEEEc Q lcl|NC_015249. 
318 FQAD--QIIAKYAMGHGGLRPEACGALVFN 345 (347) Q Consensus 318 ~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~ 345 (347) +..+ .+++..+++..+++|++.+.+.+. T Consensus 361 ~~~~~~~~r~~~r~d~~v~~~~a~~~~~~a 390 (390) T protein:vir:10 361 FQRNMVTVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred cccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 4445 455668999999999999988888 No 90 >protein:vir:94771 Length: 298 # NCBI annotation: major head protein # Family: family:all:966 # MgeID: mge:1529 # MgeName: phi LC3 # Cross-refs: genbank:acc:NP_996706;genbank:gi:45597421;genbank:GeneID:2769044 Probab=99.36 E-value=6.8e-13 Score=87.31 Aligned_cols=281 Identities=12% Similarity=0.032 Sum_probs=167.1 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) ||-.. |. |.-++|..++.+..++.|+++.+.++..+.+| .+.||++ +..++..+..|+.+ T Consensus 1 ma~~g---------------G~---lip~~~~~~ii~~~~~~s~i~~~~~~~~~~~~-~~~~p~~~~~~~a~~v~Eg~~~ 61 (298) T protein:vir:94 1 MVLNK---------------GT---LFDPELVTDLISKVAGKSSIARLSAQKPIPFN-GEKVFTFTMDSEIDVVAESGKK 61 (298) T ss_pred Ceecc---------------cc---ccChhHHHHHHHHHHhhchhhhhcceeeccCC-ceEEEEEecCcceEEeeCCccc Confidence 55422 11 34478899999999899999988887776654 5688876 66788888888877 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHH-----hChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAM-----NHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q-----~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) +.+ ++.-+++++..-+.- ....|. ++.. ...++.+.+.++.+++|++.+|+.++.... ...... 
T Consensus 62 ~~~--~~~f~~v~l~~~k~~-~~~~iS--~ell~~~~~~~~~l~~~i~~~la~ai~~~~d~~~l~G~~------~~~g~~ 130 (298) T protein:vir:94 62 THG--GVTLAPQTMVPIKVE-YGARIS--DEFMYASDEEKINILQAFNDGFAKKVARGIDLMAFHGVN------PRLGTA 130 (298) T ss_pred ccc--ccceeEEEEeeeEEE-Eeeehh--HHHhccCCccHHHHHHHHHHHHHHHHHHHHHHHhhcccc------cCCCcc Confidence 653 455566666654442 223332 2221 345688899999999999999998864210 001111 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPS 234 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~ 234 (347) ..+.+.......... ..........+++.|.++..+|...+.... ..+++|..+..|.+-. -.+..+.-..... T Consensus 131 ~~~~~~~~~~~~~~~---~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~--~~vmn~~~~~~l~~lk-d~~G~~l~~~~~~ 204 (298) T protein:vir:94 131 SAVIGTNHFDSKVTQ---KVEAPRGIADPNGAIENAVELLTGVDADVT--GIAINPSFRSALAKQK-DLQGNALFPELKW 204 (298) T ss_pred ccccccccccccccc---ccccccccccHHHHHHHHHHhhhhcCCCcc--EEEEcHHHHHHHHHhh-ccCCCeeecCccc Confidence 111111000000000 001111223457788899999988887543 6899999999986532 2334444444556 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) .|..+++.|++|+.++++|....+. ....+-+||++.+.+... .+++++..+ T Consensus 205 ~~~~~tl~G~PV~~~~~v~~~~~~~-------------------~~~~~~Gdfs~~~~~~~~---------~~~~~~~~~ 256 (298) T protein:vir:94 205 GATPDTINGLPVDVNKTVSDMSLTQ-------------------RDRAIIGDFANGFKWGYA---------KEVPLEVIQ 256 (298) T ss_pred CCCCceecceeeEEecccccccCCC-------------------ccEEEEeeccceEEEEEe---------cCceEEEee Confidence 7777899999999999998533211 011233455544322221 223333332 Q ss_pred --ch------hhhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 --RA------NFQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 --d~------~~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++ .++.+ .+++.+++|..++||++.+.| +.| T Consensus 257 ~~~~d~~~~~~f~~~~v~~r~~~r~~~~~~~~~a~~~l--~~~ 297 (298) T protein:vir:94 257 YGDPDNSGLDLKGYNQVYIRAELFLGWGILDATKFARV--TEA 297 (298) T ss_pred cCCCcCcchhhhhcCcEEEEEEEEeccEeecccceEEE--Eec Confidence 22 13333 367788999999999987644 555 No 91 >protein:vir:103955 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1662 # MgeName: phiNM # Cross-refs: genbank:acc:YP_873992;genbank:gi:118430767;genbank:GeneID:4525449 Probab=99.36 E-value=2.4e-13 Score=89.73 Aligned_cols=279 Identities=13% Similarity=0.094 Sum_probs=163.1 Q ss_pred CCccc--ccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCC Q lcl|NC_015249. 1 MAKMN--GGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~--~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~ 77 (347) |...+ .+.++ ....+...+.-+++..++.+.....|.++.+.++..+.+ .+++||+. +...+.....|+ T Consensus 18 ~~~~~~~~a~~~-------~~~~~~~~liP~~~~~~ii~~~~~~s~l~~~~~~~~~~~-~~~~~p~~~~~~~a~~v~Eg~ 89 (324) T protein:vir:10 18 NVKPQVFNPDNV-------MMHEKKDGTLLNDFTTPILQEVMENSKIMQLGKYEPMEG-TEKKFTFWADKPGAYWVGEGQ 89 (324) T ss_pred hhccceecccce-------eccCCCcceechhHHHHHHHHHHhhchhhhhcceeeccC-CceEEEEEeCCcceeEeccCc Confidence 21111 11111 111222235668999999999999999999888877654 46888876 556777788888 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++. .++.-.++++..-+. ..-..|.+-=-.++.+|+.+.+.++.++++++..|+.++.--. ... 
T Consensus 90 ~~~~--~~~~~~~v~~~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~ai~~~~d~a~l~G~g--------~~~---- 154 (324) T protein:vir:10 90 KIET--SKATWVNATMRAFKL-GVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGILNQG--------NNP---- 154 (324) T ss_pred cccc--cccceeEEEEeeEEE-EEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhhcCC--------CCc---- Confidence 8765 345666666665443 2333443322224568899999999999999999998863210 000 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) .+.+....+.. ..... .+...++.|+++...|..++.... .++++|..|..|.+-. +..|...+..+. T Consensus 155 ~~~~i~~~~~~-~~~~~----~~~~t~~~i~~~~~~l~~~~~~~~--~~v~n~~~~~~L~~l~-----d~~g~~~~~~~~ 222 (324) T protein:vir:10 155 FGKSIAQSIEK-TNKVI----KGDFTQDNIIDLEALLEDDELEAN--AFISKTQNRSLLRKIV-----DPETKERIYDRN 222 (324) T ss_pred cCccccccccc-cceec----cccCCHHHHHHHHHhhhhccCCCC--EEEEcHHHHHHHHHhh-----ccCCceeecCCC Confidence 11111100000 00001 112236778888888888775433 5689999999887532 222333344555 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN 317 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~ 317 (347) -++++|.+|+.++..+...+ .-+-+||++.+ + +...+++++...+.. T Consensus 223 ~~~l~G~PV~~~~~~~~~~~-----------------------~~~~gd~~~~~-~---------~~~~~~~i~~~~~~~ 269 (324) T protein:vir:10 223 SDTLDGLPVVNLKSSNLKRG-----------------------ELITGDFDKLI-Y---------GIPQLIEYKIDETAQ 269 (324) T ss_pred CccccceeEEeecCCCCCcc-----------------------eEEEEecccEE-E---------EEecCcEEEEeeccc Confidence 56799999998765542111 01223444332 1 122345555554421 Q ss_pred --------------hh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 --------------FQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 --------------~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++ .-.++...++|..+++|++.+.|....+ T Consensus 270 ~~~~~~~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~A~~~l~~a~~ 315 (324) T protein:vir:10 270 LSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred ccccccccccchhhhhcCcEEEEEEEEEccEEecccceEEEEeccC Confidence 22 2445667889999999998887766555 No 92 >protein:vir:8187 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:153 # MgeName: Che9d # Cross-refs: genbank:acc:NP_817980;genbank:gi:29566414;genbank:GeneID:2700968 Probab=99.36 E-value=5.4e-13 Score=87.85 Aligned_cols=295 Identities=10% Similarity=0.018 Sum_probs=164.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) ||.+..|+- +--++|..++.+.-+..|+++.+.++..+.+| .+++|+. +...+..+..|+.+ T Consensus 1 mat~~~gg~----------------lvP~~~~~~ii~~~~~~s~i~~~~~~i~~~~~-~~~~p~~~~~~~a~wv~Eg~~~ 63 (311) T protein:vir:81 1 MVALATGTF----------------QLPKHLVPGVWQKAQGQSVLARLSMAEPQEFG-EQQYMTLTAPPRGEVVGEGAQK 63 (311) T ss_pred CceecCCce----------------EcchhHHHHHHHHHHhcchhhhhcceeecCCC-ceEEEEEeCCceeEEeecCccc Confidence 887765331 23378899999999899999999888776554 5788876 67788888888887 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHH-----HhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDA-----MNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~-----q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) +.+ +++-.+++|...+. +.-..|. +|. ....++.+.+.++.+++|++.+|+.++.... +..... 
T Consensus 64 ~~~--~~~f~~v~l~~~kl-~~~~~iS--~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~a~l~G~~------~~~~~~ 132 (311) T protein:vir:81 64 SES--TATFAPVTAIPRKV-QVTQRFS--QEVKWADESRQLGVLQTMADLSGVALGRALDLIGIHGIN------PLTGAA 132 (311) T ss_pred ccc--cceeeEEEEeeEEE-EEeehhh--HHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHhhhcccc------CCCCcc Confidence 653 45556666665444 2222332 222 2345688999999999999999998864210 011111 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPS 234 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~ 234 (347) +.+...+. ........ .........+..|.++...+...+... ...+++|..+..|.+- +-.+..+.-..... T Consensus 133 ~~gi~~~~-~~~~~~~~---~~~~~~~~~~~~i~~~~~~~~~~~~~~--~~~vmn~~~~~~l~~l-kd~~G~~l~~~~~~ 205 (311) T protein:vir:81 133 LSGSPAKI-LDTTNIVE---LTTGTSATPDLAVEAAVGLVLGDNLSP--DGVALDNTFSFMLATQ-RDSQGRKLYPELGF 205 (311) T ss_pred cccccccc-cccceeee---ecccccchHHHHHHHHHHHhhhcCCCc--eEEEEcHHHHHHHHhh-hccCCCeeecCccc Confidence 11111110 00000000 001111123344555555665555532 3579999999998653 22334443333445 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) .+..+.++|.+|+.++++|.............. .......-+-+||++..-. 
...+++++..+ T Consensus 206 ~~~~~tl~G~Pv~~~~~i~~~~~~~~~~~~~~~-------~~~~~~~~~~gDfs~~~i~----------~~~~~~~~~~~ 268 (311) T protein:vir:81 206 GTDVASFAGLNAAVSDTVRGGPEAVTASTGVYR-------TTNPNVKAIAGDFSAFRWG----------VQVSIPLELIE 268 (311) T ss_pred cCCCceecceeEEecccccccccccccccchhc-------ccCCccEEEEEecccEEEE----------EeccceEEEec Confidence 667789999999999999864422111110000 0000111234555553321 12234555554 Q ss_pred chh-------hhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 315 RAN-------FQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~-------~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.. ++.+ .+++..++|..++||++.+.|+-..= T Consensus 269 ~~~~~~~~~~~~~~~v~~r~~~r~d~~v~~~~a~~~l~~a~~ 310 (311) T protein:vir:81 269 FGDPDGLGDLKRQNQIAIRAEVVYGIGIMSTDAFAVVRDADE 310 (311) T ss_pred cCCCCcchhhhhcCcEEEEEEEEeccEeecccceEEEEeecc Confidence 421 3334 35566889999999997665422111 No 93 >protein:vir:485 Length: 407 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:11 # MgeName: P27 # Cross-refs: genbank:acc:NP_543092;swissprot:trembl:q8w627;genbank:gi:18249904;uniprot:Q8W627;genbank:GeneID:929693 Probab=99.35 E-value=4.3e-13 Score=88.39 Aligned_cols=293 Identities=12% Similarity=0.028 Sum_probs=162.0 Q ss_pred CCcccc---------cccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEe-ecCccee Q lcl|NC_015249. 1 MAKMNG---------GQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFP-VLGRTKA 70 (347) Q Consensus 1 ma~~~~---------~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~-~iG~~~~ 70 (347) |.+... 
+.+.+| ...+| .+.-++|..++.+.....+.++.++++.+..++ +..++ ..+.+++ T Consensus 90 l~~g~~~~~~~~e~~a~~~~t----~~~gG---~~iP~~~~~~I~~~~~~~~~l~~~~~~~~~~~~-~~~~~~~~~~~~a 161 (407) T protein:vir:48 90 MRKGREDGLRELERKALQVGN----DEDGG---YAIPEELDRTILTLLKDEVVMRQEATVITLGGS-DYKKLVNLGGTTS 161 (407) T ss_pred HhccchhhhhHHHHHhhhccc----CCCCc---ccccHhHHHHHHHHHHhhhhhhhhceeeecCCC-ceEEEEecCCcce Confidence 211110 001111 00111 134589999999999888999998888776655 45554 4566677 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeecc-cccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTAD-VLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~-~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) .....|+..+.+ ....-.++++.+-+ +.. ..|.+-=-.++.+|+.+.+.++.++++++..|+.++.- T Consensus 162 ~~v~E~~~~~~~-~~~~f~~i~~~~~k--~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~~~a~l~G--------- 229 (407) T protein:vir:48 162 GWVGETDARPET-ATSKLGLIEPFMGE--IYGNPQATQKMLDDAFFNVEDWINSELALEFAEQEEIAFTSG--------- 229 (407) T ss_pred eeeccccccccc-ccccceeEEeeeee--eEeehhhHHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhhcc--------- Confidence 666666665432 12234455555543 333 23333222346679999999999999999999987531 Q ss_pred ccccccccccCcceeecccc-----cccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhh Q lcl|NC_015249. 150 ASDENIAGLGKAHVLEVGKQ-----SELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNA 224 (347) Q Consensus 150 ~~~~~~~~~~~g~~i~~~~~-----~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~ 224 (347) .....+.|.-.......... ..............++.|+++...|.....+. . ..|++|..|..|.+-. 
-.+ T Consensus 230 ~G~~~p~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~i~~l~~~l~~~~~~~-a-~~v~n~~~~~~L~~lk-D~~ 306 (407) T protein:vir:48 230 DGSKKPKGFLAYESTDEDDKTRAFGKLQHIASGAASGVTADAIIKLIYTLRKAHRSG-A-KFMMNNSSLFAIRLLK-DND 306 (407) T ss_pred CCCCccceeeecccccccccccccccccccccccccccChHHHHHHHHhhchhhhcC-C-EEEEcHHHHHHHHHhh-ccC Confidence 00011111100000000000 00000001111123677888888888776653 2 4579999998886422 223 Q ss_pred hhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhh Q lcl|NC_015249. 225 ANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVK 304 (347) Q Consensus 225 ~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~ 304 (347) +.|.-...+..|..++++|.+|+.++++|..+.+.... +-+||+.... ++.+ T Consensus 307 Gr~l~~~~~~~g~~~~l~G~PV~~~~~~p~~~~~~~~i--------------------~~Gd~~~~~~-i~~~------- 358 (407) T protein:vir:48 307 GNYLWRPGIELGQPSSLAGYGIVENEQMPDIAADAKAI--------------------AFGNFKRGYT-IVDR------- 358 (407) T ss_pred CceeeccCcCCCCCceecceeeEEecCcCCccCCccEE--------------------EEEeccccEE-EEEe------- Confidence 34443334567777899999999999998533221100 1134433222 2222 Q ss_pred hcceeeeeeechh--hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
305 LKDMALERARRAN--FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 305 ~~~~~~e~~~d~~--~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.+++ .+|+- .-...+.+..+++.++++|++.+.+.+.+| T Consensus 359 -~~~~i--~~d~~~~~~~~~~~~~~r~d~~v~~~~a~~~l~~~aa 400 (407) T protein:vir:48 359 -IGTRI--LRDPYTNKPFVGFYTTKRTGGMLVDSQAIKLMKIGAA 400 (407) T ss_pred -eceEE--EeeccccCCcEEEEEEEEeccEEecccceEEEEeecc Confidence 22232 33332 233447788899999999999999988888 No 94 >protein:vir:100247 Length: 425 # NCBI annotation: gp76 # Family: family:all:21 # MgeID: mge:1619 # MgeName: Bcep176 # Cross-refs: genbank:acc:YP_355412;genbank:gi:77864702;genbank:GeneID:3725969 Probab=99.35 E-value=4e-13 Score=88.55 Aligned_cols=290 Identities=11% Similarity=0.063 Sum_probs=161.7 Q ss_pred CCcc--cccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEee-cCcceeeeeecCC Q lcl|NC_015249. 1 MAKM--NGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPV-LGRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~--~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~-iG~~~~~~~~~g~ 77 (347) +.+. .-+-+.+| ...+| .+.-++|..++.+..+..+.++.+.++.++.+++ .++|+ .+..++.....|+ T Consensus 121 l~~~e~~~al~~~t----~~~gG---~lvP~~~~~~ii~~~~~~s~l~~l~~~~~~~~~~-~~~~~~~~~~~a~wv~E~~ 192 (425) T protein:vir:10 121 VKRGDVQAALNKGE----DSEGG---YLTPIEWDRTITNKLVLISPMRQLCRVQPVSKAG-FSKLFNMGGTTSGWVGEAS 192 (425) T ss_pred hhhhhhHHHhhcCc----CCCCc---eeccHhHHHHHHHHHHhhhhhhhhceeeeccCCc-eEEEEEcCCcceeeecccc Confidence 0000 00000111 11111 1455899999999998999999998888766554 45544 4666666666666 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++.+ ..+.-.++++..-+. +.-..|.+-=-.++.+|+.+.+.++.++++++..|+.++.- . . 
...+ T Consensus 193 ~~~~~-~~~~f~~v~~~~~k~-~~~i~iS~ell~ds~~~l~~~i~~~la~ai~~~~d~~~l~G----~---G--~~~p-- 259 (425) T protein:vir:10 193 QRPQT-NAATFQPLSFASGEI-YANPAATQQILDDAEIDLESWLATEVQTEFAKQEGKAFLAG----D---G--TNKP-- 259 (425) T ss_pred ccccc-cccccceeeeeheee-EeehHhHHHHHhcchhHHHHHHHHHHHHHHHHHHHhhhhcc----c---C--CCCc-- Confidence 55432 122335555544333 22233433222346789999999999999999999987631 0 0 0011 Q ss_pred ccCcceeecccccccc---------cchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhc Q lcl|NC_015249. 158 LGKAHVLEVGKQSELR---------GDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQ 228 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~---------~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~ 228 (347) .|.+...+...... ..........++.|+++...|...... .-..|++|..|..|.+-. -.++.|. T Consensus 260 --~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~l~~l~~~l~~~~~~--~a~~vmn~~~~~~L~~lk-D~~G~~l 334 (425) T protein:vir:10 260 --NGLLTYIAGGANAAKHPFGAIEVVNSGAAADITSDGIIDLVYDLPSAFTG--NARFAMNRNTQRQVRKLK-DGQGNYL 334 (425) T ss_pred --ceeeeccccccccccccccccccccccccccccHHHHHHHHhhhhhhhcc--CCEEEEchHHHHHHHHhh-cCCCcee Confidence 11110000000000 000011222467777877777765542 335689999999886532 2233443 Q ss_pred cccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcce Q lcl|NC_015249. 229 ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDM 308 (347) Q Consensus 229 ~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~ 308 (347) =...+.+|.-++++|.+|+.++++|....+.... +-+||+..+. ++.+. .+ T Consensus 335 ~~~~~~~g~~~~l~G~PV~~~~~~p~~~~~~~~i--------------------~~Gd~~~~~~-i~~~~--------~~ 385 (425) T protein:vir:10 335 WQPSYVAGQPATLAGYPVTEVPDMPDVAANSTPI--------------------LFGDFQQTYL-IIDRI--------GV 385 (425) T ss_pred eccCccCCCCceecceeeEEecCcCCccCCccEE--------------------EEEehhccEE-EEEec--------ce Confidence 3334567777899999999999998543221110 1134444332 23232 12 Q ss_pred eeeeeechh--hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
309 ALERARRAN--FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 309 ~~e~~~d~~--~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) + ..+|+- +-...+++..+++.++++|++...+.++++ T Consensus 386 ~--v~~d~~~~~~~~~~~~~~r~d~~v~~~~A~~~l~~~as 424 (425) T protein:vir:10 386 R--VLRDPYTAKPYVLFYTTKRVGGGLLNPEPMRAMKVAAS 424 (425) T ss_pred E--EEecccccCCcEEEEEEEEeccEeecccceEEEEeecc Confidence 2 223332 233457788899999999999988888888 No 95 >protein:vir:94673 Length: 419 # NCBI annotation: major capsid protein # Family: family:all:585 # MgeID: mge:1527 # MgeName: mu1/6 # Cross-refs: genbank:acc:YP_579208;genbank:gi:93007444;genbank:GeneID:5076792 Probab=99.35 E-value=4.1e-13 Score=88.47 Aligned_cols=292 Identities=11% Similarity=0.097 Sum_probs=163.4 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcce---------e Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTK---------A 70 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~---------~ 70 (347) +....-....+ .....+.....-...+ +.+.+.+.......+.+++++++.... +++++|++....+ + T Consensus 110 ~~~~~~~~~~~-~~~~~~~~~~~~~~~~p~~~~~~i~~~~~~~~~i~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~a 187 (419) T protein:vir:94 110 MRDIDPNRLLS-RDAPAGTITNPNVPHLPQLVPGIVPTTPDLPLLVADLLDQQNAD-YNVLEYIRDTSGTAGAGSTWNKA 187 (419) T ss_pred HHHHHHHHhhc-cccccccccCCcccccchhhhHHHHHHHhhhhhhhhcceeeecc-CCceeeeeeccccccccccCccc Confidence 00000000000 1111111111111233 667777777766667778877776643 5667777643322 2 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) ..+..|+.++. .+++-.++++.+.+.- .-+.|.+ +-.+...++.+.+.++.++++++..|+.|+.- . 
T Consensus 188 ~~v~Eg~~~~~--~~~~~~~i~~~~~k~~-~~~~is~-ell~d~~~l~~~i~~~la~a~~~~~d~aii~G----~----- 254 (419) T protein:vir:94 188 AVVPEGTAKPQ--STLSFDTITTTLKTVA-HWLPITR-QAADDNSQLMGYIQGRLTYGLRFLRDRQLLNG----N----- 254 (419) T ss_pred ceecCCccccc--cccceeeEEeeeeeEE-EeehhhH-HHHHhHHHHHHHHHHHHHHHHHHHHHHHHHhc----c----- Confidence 33344555443 3455566666665553 3334432 22333456888889999999999999988631 0 Q ss_pred cccccccccCcceeecccc--cccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhc Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQ--SELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQ 228 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~--~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~ 228 (347) .. +.+.|.....+.. ..............++.|+++...+...+.+.. .++++|..|..|++-..-..+.+. T Consensus 255 G~----~~p~Gi~~~~~~~~~~~~~~~~~~t~~~~~~~l~~~~~~~~~~~~~~~--~~v~n~~~~~~l~~~k~~~~~~~~ 328 (419) T protein:vir:94 255 GS----TEMQGILTTPGIGTYQQPKPTAPATDEPPLVDIRRAKTVAEIAGFPPD--GVVVHPQDWESIELDQAPGSGVFR 328 (419) T ss_pred Cc----ccccceecccccccccccccccccccchhHHHHHHHHHhhhhccCCCC--EEEEcHHHHHHHHHHhhcCCCcee Confidence 00 1111111100000 000011112233468889999999888776533 679999999998765444344443 Q ss_pred cccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcce Q lcl|NC_015249. 229 ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDM 308 (347) Q Consensus 229 ~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~ 308 (347) -.....+|..++++|++|+.++.+|... -+-+||+.... ++.+ +++ T Consensus 329 ~~~~~~~~~~~~l~G~pV~~~~~~~~~~-------------------------~~~gd~~~~~~-~~~~--------~~~ 374 (419) T protein:vir:94 329 VIANVQGEATPRIWGLNVVSTVAIAQGT-------------------------ALVGGFRQGAT-LWSR--------QGI 374 (419) T ss_pred ecCCcccCCCccccceeeEEcCCCCCcc-------------------------EEEeeccceEE-EEEe--------cce Confidence 3344567777899999999999998321 12234444332 2222 235 Q ss_pred eeeeeechh----hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
309 ALERARRAN----FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 309 ~~e~~~d~~----~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++...+.. +-...++...+++..+++|++.+.+.+..| T Consensus 375 ~v~~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~~~~~aa 417 (419) T protein:vir:94 375 TVLMTDSHADFFTANTLVILAEFRANLAVYQPKAFVRVTFAAA 417 (419) T ss_pred EEEEeccccchhhcCcEEEEEEEeeccEEeccccEEEEEeccC Confidence 555554332 233456788899999999999999999999 No 96 >protein:vir:99749 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1497 # MgeName: phiETA2 # Cross-refs: genbank:acc:YP_001004307;genbank:gi:122891761;genbank:GeneID:4712304 Probab=99.34 E-value=5.2e-13 Score=87.92 Aligned_cols=280 Identities=13% Similarity=0.092 Sum_probs=165.2 Q ss_pred CCccc-ccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCC Q lcl|NC_015249. 1 MAKMN-GGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~-~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~ 78 (347) |...+ -..+ -.....+...+.-+.|..++.+.....|.++.+.++..+. +.+++||+. +...+.....|+. T Consensus 18 ~~~~~~~~a~------~~~~~~~~~~lip~~~~~~ii~~~~~~s~l~~~~~~~~~~-~~~~~~p~~~~~~~a~~v~Eg~~ 90 (324) T protein:vir:99 18 NVKPQVFNPD------NVMMHEKKDGTLLNDFTTPILQEVMENSKIMRLGKYEPME-GTEKKFTFWADKPGAYWVGEGQK 90 (324) T ss_pred hhhhhhcccc------ceeccCCCcceechhHHHHHHHHHHhhchhhhhcceeecc-CCceEEEEEecCcceeEeccCcc Confidence 11111 1111 0111112223566899999999999999999988887755 456888876 5567777788888 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++. .++.-.++++..-+. ..-..|.+-=-.++.+|+.+.+.++.++++++..|+.++.--. .+. . 
T Consensus 91 ~~~--~~~~~~~v~~~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~ai~~~~d~~~l~G~g--------~~~----~ 155 (324) T protein:vir:99 91 IET--SKATWVNATMRAFKL-GVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGILNQG--------NNP----F 155 (324) T ss_pred ccc--cccceeEEEEeeEEE-EEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhhcCC--------CCc----c Confidence 765 346667777766554 2333444322223568899999999999999999998863110 000 1 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) +.+....... ...... ....++.|+++...|..++.... .++++|..|..|.+-. |..|...+..+.- T Consensus 156 ~~~~~~~~~~-~~~~~~----~~~~~~~i~~~~~~l~~~~~~~~--~~v~n~~~~~~L~~l~-----d~~g~~~~~~~~~ 223 (324) T protein:vir:99 156 GKSIAQSIEK-TNKVIK----GDFTQDNIIDLEALLEDDELEAN--AFISKTQNRSLLRKIV-----DPETKERIYDRNS 223 (324) T ss_pred Cccccccccc-cceecc----ccCCHHHHHHHHHhhhhccCCCC--EEEEcHHHHHHHHHhh-----cCCCceeecCCCC Confidence 1111101110 001111 12236778888888988776433 5789999999887432 2223333444555 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh- Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN- 317 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~- 317 (347) ++++|.+|+.++..+...+. -+-+||...+ + +...++++|..++.. T Consensus 224 ~~l~G~PVv~~~~~~~~~~~-----------------------~i~gd~~~~~-~---------~~~~~~~i~~~~~~~~ 270 (324) T protein:vir:99 224 DTLDGLPVVNLKSSNLKRGE-----------------------LITGDFDKLI-Y---------GIPQLIEYKIDETAQL 270 (324) T ss_pred ccccceeEEeecCCCCCcce-----------------------EEEEecccEE-E---------EEecCcEEEEeecccc Confidence 68999999988766532110 1223443332 1 222345555554431 Q ss_pred -------------hh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 -------------FQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 -------------~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++ .-.+++.+++|..+.||++.+.|+...+ T Consensus 271 ~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~lt~a~~ 315 (324) T protein:vir:99 271 STVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred cccccccccchhhhhcCcEEEEEEEEEccEEecccceEEEEeccC Confidence 22 2445667889999999999888776655 No 97 >protein:vir:97148 Length: 324 # NCBI annotation: ORF010 # Family: family:all:507 # MgeID: mge:1654 # MgeName: 85 # Cross-refs: genbank:acc:YP_239726;genbank:gi:66394880;genbank:GeneID:5130881 Probab=99.34 E-value=3.6e-13 Score=88.81 Aligned_cols=286 Identities=12% Similarity=0.076 Sum_probs=166.4 Q ss_pred CCccccccc-------c-----cccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-Cc Q lcl|NC_015249. 1 MAKMNGGQQ-------I-----GKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GR 67 (347) Q Consensus 1 ma~~~~~~~-------~-----~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~ 67 (347) |=+.+.... . ..+..-...+.+...+.-+.|..++.+.-...|.++.+.++..+. +.+++||+. +. T Consensus 1 ~~~~~~~~~~~~~f~~~~~~~~~~~a~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~~~~~~~~~-~~~~~ip~~~~~ 79 (324) T protein:vir:97 1 MEQTQKLKLNLQHFASNNVKPQVFNPDNVMMHEKKDGTLMNEFTTPILQEVMENSKIMQLGKYEPME-GTEKKFTFWADK 79 (324) T ss_pred CccchhHHHHHHHHHHhhhhhhhhccccccccCCCcceechhHHHHHHHHHHhhcchhhhcceeecc-CCceEEEEEecC Confidence 221111000 0 001100111122223556899999999988889999988777754 556888876 56 Q ss_pred ceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 68 TKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 68 ~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) ..+.-...|+.++. .+++-.++++..-+. ..-..|.+---.++.+++.+.+.++.++++++..|+.++.--. 
T Consensus 80 ~~a~~v~Eg~~~~~--~~~~f~~v~~~~~k~-~~~~~is~ell~ds~~~l~~~i~~~l~~aia~~~d~a~l~G~g----- 151 (324) T protein:vir:97 80 PGAYWVGEGQKIET--SKATWVNATMRAFKL-GVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGILNQG----- 151 (324) T ss_pred cceeEeccCccccc--cccceeEEEEeeEEE-EEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhccCC----- Confidence 67777778887765 356667777766554 3334454422224568899999999999999999998863210 Q ss_pred ccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhh Q lcl|NC_015249. 148 PSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANY 227 (347) Q Consensus 148 ~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~ 227 (347) ... .+.+.. ........... ....++.|+++...|...+.... .++++|..|..|.+-.. . T Consensus 152 ---~~~----~~~gi~-~~~~~~~~~~~----~~~~~~~i~~~~~~l~~~~~~~~--~~v~n~~~~~~L~~lkd-----~ 212 (324) T protein:vir:97 152 ---NNP----FGKSIA-QSIEKTNKVIK----GDFTQDNIIDLEALLEDDELEAN--AFISKTQNRSLLRKIVD-----P 212 (324) T ss_pred ---CCc----cCcccc-ccccccceecc----ccCCHHHHHHHHHhhhhccCCCC--EEEEcHHHHHHHHHhhc-----C Confidence 000 111110 00000111111 11236778888888888776432 56899999998864322 2 Q ss_pred ccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcc Q lcl|NC_015249. 228 QALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKD 307 (347) Q Consensus 228 ~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~ 307 (347) .|...+..+.-+.+.|.+|+.++..+...+. -+-+||.+.+ + +...+ T Consensus 213 ~g~~~~~~~~~~tl~G~PV~~~~~~~~~~~~-----------------------~~~gd~~~~~-i---------~~~~~ 259 (324) T protein:vir:97 213 ETKERIYDRNSDTLDGLPVVNLKSSNLKRGE-----------------------LITGDFDKLI-Y---------GIPQL 259 (324) T ss_pred CCceeecCCCCccccceeeEeecCCCCCcce-----------------------EEEEecccEE-E---------EEecC Confidence 2333344555578999999998766532211 1223444332 1 22334 Q ss_pred eeeeeeechh--------------hhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
308 MALERARRAN--------------FQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 308 ~~~e~~~d~~--------------~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++|..++.. ++. -.+++.++++..+.||++.+.|....+ T Consensus 260 ~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 315 (324) T protein:vir:97 260 IEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred cEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccceEEEEeccC Confidence 5666655432 222 345566888999999999888777666 No 98 >protein:vir:8102 Length: 543 # NCBI annotation: gp6 # Family: family:all:21 # MgeID: mge:152 # MgeName: Che9c # Cross-refs: genbank:acc:NP_817683;genbank:gi:29566114;genbank:GeneID:1259308 Probab=99.34 E-value=3.8e-13 Score=88.69 Aligned_cols=298 Identities=13% Similarity=0.037 Sum_probs=161.0 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHH-HHHHHHHhhhcccccccccccceEEEee-cCcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVL-TAFTRTSVTMNKHLVRSIQSGKSAQFPV-LGRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~-~~f~~~s~~~~~~~~r~i~~G~tv~i~~-iG~~~~~~~~~g~~ 78 (347) +...........+ ..+...++--.|..+.|..++. ..+...+.++.+.++... +|+ +.+|+ .+...+.....|.. T Consensus 237 l~~~e~~~~~~~~-~~~~t~~~gg~lip~~~~~~ii~~~~~~~~~l~~~~~~~~~-~g~-~~~~~~~~~~~a~~v~Eg~~ 313 (543) T protein:vir:81 237 LTEEEKRAINEVR-AMGLTKADGGYLVPFQLDPTVIITSNGSLNDIRRFARQVVA-TGD-VWHGVSSAAVQWSWDAEFEE 313 (543) T ss_pred hhhhhhhhhhhhh-hcccccccCcccCchhhhhHHHHHHHhhhchhhhhcccccC-Ccc-eEEEEecCCcceeecccCcc Confidence 0000000000000 0000111111245578877764 556666788877776543 344 44544 46667777777877 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++. .+++-.++++...+.-. -+.|.+ +-.+.+.|+.+.+.+++++++++..|+.|+.- . .+...+.|. 
T Consensus 314 ~~~--~~~~~~~i~~~~~k~~~-~~~is~-ell~d~~~~~~~i~~~l~~~~~~~~d~ail~G----~----Gt~~~p~Gi 381 (543) T protein:vir:81 314 VSD--DSPEFGQPEIPVKKAQG-FVPISI-EALQDEANVTETVALLFAEGKDELEAVTLTTG----T----GQGNQPTGI 381 (543) T ss_pred ccc--cccccceeeeeeeeeEe-eehhhH-HHHhccHHHHHHHHHHHHHHHHHHHHHHHhcc----C----CCCcccccc Confidence 754 35666777777665533 334533 33445679999999999999999999988621 0 001111111 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) -.. .+ ...............++.++++...|...+-+. -.+|++|..|..|.+-. -.++.|.-. .+..|.- T Consensus 382 ~~~----~~-~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~--~~~v~n~~~~~~l~~lk-d~~G~~l~~-~~~~g~~ 452 (543) T protein:vir:81 382 VTA----LA-GTAAEIAPVTAETFALADVYAVYEQLAARHRRQ--GAWLANNLIYNKIRQFD-TQGGAGLWT-TIGNGEP 452 (543) T ss_pred hhh----cc-cccccccccccccccHHHHHHHHHhhhccccCC--cEEEEcHHHHHHHHHhh-cCCCceecc-CcCCCCC Confidence 000 00 000000011111224677777877777665432 35789999999997532 223334322 2445666 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee----- Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA----- 313 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~----- 313 (347) ++++|.+|+.++++|......... + ...-|-+||+... ++. ..++++++. T Consensus 453 ~~l~G~pv~~~~~~~~~~~~~~~~-------~--------~~~i~~gd~~~~~--i~~--------~~~~~i~~~~~~~~ 507 (543) T protein:vir:81 453 SQLLGRPVGEAEAMDANWNTSASA-------D--------NFVLLYGNFQNYV--IAD--------RIGMTVEFIPHLFG 507 (543) T ss_pred ccccceeeEEeccccccccccccC-------C--------cceEEEeecccee--EEe--------ecccEEEEeccccc Confidence 789999999999999654321110 0 0112334554332 111 122333322 Q ss_pred -echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
314 -RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 -~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++.....-.+.+...+|..+++|++.+.+.++.+ T Consensus 508 ~~~~~~~~~~~~~~~r~d~~v~~~~A~~~l~~~~~ 542 (543) T protein:vir:81 508 TNRRPNGSRGWFAYYRMGADVVNPNAFRLLNVETA 542 (543) T ss_pred cchhhcCceEEEEEEeeccEeecccceEEEEeccc Confidence 1222223456788889999999999999999888 No 99 >protein:vir:4456 Length: 401 # NCBI annotation: Major capsid protein precursor # Family: family:all:21 # MgeID: mge:96 # MgeName: ST64B # Cross-refs: genbank:acc:NP_700379;genbank:gi:23505451;genbank:GeneID:955658 Probab=99.34 E-value=2.1e-13 Score=90.14 Aligned_cols=292 Identities=11% Similarity=0.052 Sum_probs=158.5 Q ss_pred CCcccc-cccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEe-ecCcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNG-GQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFP-VLGRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~-~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~-~iG~~~~~~~~~g~~ 78 (347) +....- +.+.++ ...+| .+.-++|..++.+..+..+.++.+.++..+.++ +..++ ..+...+.....|.. T Consensus 99 ~~~~e~~a~~~~~----~~~GG---~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~-~~~~~~~~~~~~a~wv~E~~~ 170 (401) T protein:vir:44 99 LRDLERKALQVGT----DEDGG---YAVPEELDRSILSLLKDEVVMRQEATVITVGGS-DYKKLVNLGGTASGWVGETDT 170 (401) T ss_pred hHHHHHHHhhcCC----CCCCc---eeccHhHHHHHHHHHHhhhhhhhhceeeecCCC-ceEEEEecCCccceeeccccc Confidence 000000 000000 01111 134489999999999888999988888776544 44454 345555555555554 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) .+.+ ..+.-+++++.+-+. ..-..|.+-=-.++.+|+.+.+.++.++++++..|+.++.-= ....+.|. 
T Consensus 171 ~~~~-~~~~~~~v~~~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~la~ai~~~~~~~~l~G~---------G~~~p~Gi 239 (401) T protein:vir:44 171 RSQT-ATSRLGLIEPFMGEI-YGNPQATQKMLDDAFFNVEAWINSELATEFAEQEEIAFTTGD---------GTKKPKGF 239 (401) T ss_pred cCcc-ccccceeeeeehhhe-eeehhhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhhccC---------CCCcccee Confidence 4322 123345555555433 222234332223456789999999999999999999876310 00011110 Q ss_pred cC-cceee------cccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccc Q lcl|NC_015249. 159 GK-AHVLE------VGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALI 231 (347) Q Consensus 159 ~~-g~~i~------~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~ 231 (347) -. ..... .+..... .......-.|+.|+++...|...... +-..+++|..|..|.+-. -.++.+.-.. T Consensus 240 l~~~~~~~~~~~~~~~~~~~~--~t~~~~~~~~d~i~~~~~~l~~~~~~--~a~~v~n~~~~~~L~~lk-d~~G~~l~~~ 314 (401) T protein:vir:44 240 LAYESTEESDKARAFGKLQHI--VSGEATAVTADAIIKLIYTLRKAHRT--GAKFMMNNNSLFAIRLLK-DTEGNYLWRP 314 (401) T ss_pred ecccccccccccccccccccc--ccccccccCHHHHHHHHHhcchhhhc--CCEEEEcHHHHHHHHHhh-ccCCceeecC Confidence 00 00000 0000000 00011112367788888888765433 335679999998886432 2233344334 Q ss_pred ccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeee Q lcl|NC_015249. 232 DPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALE 311 (347) Q Consensus 232 ~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e 311 (347) .+.+|..++++|.+|+.++++|....+... -+-+||+.... ++.+ +.+++. T Consensus 315 ~~~~g~~~~l~G~PVv~~~~~p~~~~~~~~--------------------i~~Gd~~~~~~-i~~~--------~~~~~~ 365 (401) T protein:vir:44 315 GLELGQPSSLAGYGIAENEQMPDIAADAKA--------------------IAFGNFKRGYT-IVDR--------IGTRIL 365 (401) T ss_pred CcCCCCCceecceeeEEecCcCCccCCccE--------------------EEEeehhccEE-EEEe--------cceEEe Confidence 456777789999999999999853322110 01134433222 2322 233333 Q ss_pred eeechhhhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
312 RARRANFQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 312 ~~~d~~~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +|+-... ..+.+..++|..+.+|++.+.|.+++| T Consensus 366 --~~~~~~~~~v~~~a~~r~d~~~~~~~a~~~l~~~aa 401 (401) T protein:vir:44 366 --RDPYTNKPFVGFYTTKRTGGMLVDSQAIKLLKIAAA 401 (401) T ss_pred --eeccccCCcEEEEEEEEeccEEecccceEEEEeecC Confidence 3333323 346788899999999999999999999 No 100 >protein:vir:80684 Length: 315 # NCBI annotation: gp6 # Family: family:all:966 # MgeID: mge:1884 # MgeName: PA6 # Cross-refs: genbank:acc:YP_001285582;genbank:gi:148727088;genbank:GeneID:5247055 Probab=99.32 E-value=9e-13 Score=86.61 Aligned_cols=286 Identities=13% Similarity=0.045 Sum_probs=161.5 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) ||...+. .|.. +.-+++.+++.+..++.|+++.+.++.... +..++||+. |...+..+..|+.+ T Consensus 1 Ma~~~~~------~gg~--------~vP~~~~~~ii~~l~~~s~i~~l~~~i~~~-~~~~~ip~~~~~~~a~wv~Eg~~~ 65 (315) T protein:vir:80 1 MADDFLS------AGKL--------ELPGSMIGAVRDRAIDSGVLAKLSPEQPTI-FGPVKGAVFSGVPRAKIVGEGEVK 65 (315) T ss_pred CCCCcCC------cCce--------EcchHHHHHHHHHHHhhchhhhhcceeecC-CCceEEEEEeCCcceEEeeCCccc Confidence 8755421 1111 345899999999999999999888776654 456788875 66788888888877 Q ss_pred CCccCCCCCceEEEEEEeeeeccc-ccccHHHHHhChh----hHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc-ccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADV-LIYDIEDAMNHYD----VRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA-SDE 153 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~-~Idd~D~~q~~~D----~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~-~~~ 153 (347) +.+ +++-+++++.. .++..+ .|.+-=-.++..| +++.+.++.+++|++.+|+.++.-. .+ ... 
T Consensus 66 ~~s--~~~f~~v~l~~--~kl~~~~~iS~ell~~s~~~~~~~l~~~i~~~la~ai~~~~d~a~~~G~-------~~~~~~ 134 (315) T protein:vir:80 66 PSA--SVDVSAFTAQP--IKVVTQQRVSDEFMWADADYRLGVLQDLISPALGASIGRAVDLIAFHGI-------DPATGK 134 (315) T ss_pred ccc--ccceeeeEeee--eeEEeeehhhHHHhhcCchhHHHHHHHHHHHHHHHHHHHHHhhheeecc-------CCCCCc Confidence 653 45555555544 333332 3321111122333 6788899999999999998775210 10 011 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccc--- Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQAL--- 230 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~--- 230 (347) .+.+.... +... +.........++.|+++...+...+.-... ..+++|..+..|.+-.....++..+. T Consensus 135 ~~~~~~~~--~~~~------~~~~~~~~~~~~d~~~~~~~~~~~~~~~~~-~~imn~~~~~~L~~l~~~~g~~~~g~~~~ 205 (315) T protein:vir:80 135 AASAVHTS--LNKT------KNIVDATDSATADLVKAVGLIAGAGLQVPN-GVALDPAFSFALSTEVYPKGSPLAGQPMY 205 (315) T ss_pred cccccccc--cccc------cceeeccccchHHHHHHHHHHhhccCccce-EEEEcHHHHHHHHHHhhccCCcccccccc Confidence 11111110 0000 001111122355677777666655443333 46789999999975543222221111 Q ss_pred cccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceee Q lcl|NC_015249. 231 IDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMAL 310 (347) Q Consensus 231 ~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~ 310 (347) .....|..++++|.+|+.++++|....... .....-+-+||++..-.++ . .+++ T Consensus 206 ~~~~~g~~~tl~G~PV~~~~~~~~~~~~~~----------------~~~~~~~~GDfs~~~~g~~-~---------~~~i 259 (315) T protein:vir:80 206 PAAGFAGLDNWRGLNVGASSTVSGAPEMSP----------------ASGVKAIVGDFSRVHWGFQ-R---------NFPI 259 (315) T ss_pred cccccCCCceecceeeEecCcCCccccccc----------------ccccEEEEeecccEEEEEe-c---------CeeE Confidence 123455567899999999999985432210 0011124456666432222 1 2333 Q ss_pred eeeech--------hhhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
311 ERARRA--------NFQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 311 e~~~d~--------~~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++.++. .++.| .+++..++|.++.||++.+.|+-..| T Consensus 260 ~i~~~~~~~~~~~~~~~~~~v~~r~~~r~~~~v~~~~a~~~l~~~~a 306 (315) T protein:vir:80 260 ELIEYGDPDQTGRDLKGHNEVMVRAEAVLYVAIESLDSFAVVKEKAA 306 (315) T ss_pred EEeccccccCcccchhhcCcEEEEEEEEecceeecccceEEEeeccC Confidence 333221 13444 45566888999999999998887777 No 101 >protein:vir:4856 Length: 293 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:106 # MgeName: DT1 # Cross-refs: genbank:acc:NP_049396;genbank:gi:9632424;genbank:GeneID:1258532 Probab=99.31 E-value=1.6e-12 Score=85.30 Aligned_cols=274 Identities=14% Similarity=0.097 Sum_probs=166.4 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccc-cceEEEeecC--cceeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQS-GKSAQFPVLG--RTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~-G~tv~i~~iG--~~~~~~~~~g~ 77 (347) |.+..... |-. ++| .+.-++|..++.+..+..+.++++.++..+.+ ..+..|+... ...+.....|+ T Consensus 1 ~l~~~~~~---t~~----~gg---~liP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~g~~~~~~~~~~~~~a~~v~Eg~ 70 (293) T protein:vir:48 1 MLDSKTDH---SGS----DAG---LTIPQDIRTAINTLVRQYDSLQEYVNVENVTTLTGSRVYEKWTDITGLANIDDEAG 70 (293) T ss_pred Cceeeccc---ccC----cCc---eEechhHHHHHHHHHHhhhhhhhhceeeeccCCcceEEEEeecCCCcceeeecCCc Confidence 33333211 111 111 24558999999999999999999888776654 3456666543 34556667777 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++.+ ..++-.++++...+.- ....|.+-=-.++.+|+.+.+.++.++++++..|+.|+..+. 
T Consensus 71 ~~~~~-~~~~~~~i~l~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~i~~g~~--------------- 133 (293) T protein:vir:48 71 KIADI-DDPKLSLIKYTIKRYA-GISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILGVVD--------------- 133 (293) T ss_pred ccccc-cccceeEEEEeeeEEE-EeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHhHHhhccc--------------- Confidence 76432 2345566677665553 334554332345678999999999999999999998864321 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) +++ +.. .. .-|+.|+++..+|....-+. -..+++|..|..|.+-.. .+..+.-...+.+|. T Consensus 134 --~~~-----~~~-----~~----~~~d~i~~~~~~l~~~~~~~--a~~vmn~~~~~~L~~lkd-~~g~~l~~~~~~~~~ 194 (293) T protein:vir:48 134 --KLP-----TKP-----TL----TKWDDIIDLEAKVDPAIKQT--SFFLTNTSGFTALKKVKN-ALGDYLMERDVKSPT 194 (293) T ss_pred --ccc-----ccc-----cc----cCHHHHHHHHHhhhhhhcCC--CEEEEcHHHHHHHHHhhc-cCCceEeecCcCCCC Confidence 000 000 01 12677888888887665543 356789999998865332 234444444566777 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech- Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA- 316 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~- 316 (347) .++++|.+|+.+.+.+....+. + ...-+-+||++.+.++. ..+++++..+.. T Consensus 195 ~~~l~G~Pv~~~~~~~~~~~~~----------~--------~~~~~~gd~~~~~~~~~---------~~~~~i~~~~~~~ 247 (293) T protein:vir:48 195 GYSIAGFAVKEISDRWLPNASS----------G--------VMPLYFGDLKQAVTLFD---------RQQMSLLSTNIGG 247 (293) T ss_pred CceecceeeEEecccccCCccC----------C--------ceEEEEEeccceEEEEE---------ecceEEEEecccc Confidence 7899999999876655432110 0 00012234444433322 223555554322 Q ss_pred -hhhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
317 -NFQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 317 -~~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++. ..++...+++..+.+|++.+.+.+..+ T Consensus 248 ~~~~~~~~~~r~~~r~d~~~~~~~a~~~l~~~~~ 281 (293) T protein:vir:48 248 GAFETDTTKVRVIDRFDVVATDTEAFVPASFKAI 281 (293) T ss_pred hhhhcCeEEEEEEEeeCcEEecccceEEEEeecc Confidence 2233 457788889999999999999998877 No 102 >protein:vir:4830 Length: 397 # NCBI annotation: MPL-7201 # Family: family:all:21 # MgeID: mge:105 # MgeName: 7201 # Cross-refs: genbank:acc:NP_038327;genbank:gi:9634653;genbank:GeneID:1262632 Probab=99.31 E-value=1.1e-12 Score=86.15 Aligned_cols=284 Identities=13% Similarity=0.048 Sum_probs=162.4 Q ss_pred CCc-ccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccc--cceEEEeec-CcceeeeeecC Q lcl|NC_015249. 1 MAK-MNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQS--GKSAQFPVL-GRTKAAYLQPG 76 (347) Q Consensus 1 ma~-~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~--G~tv~i~~i-G~~~~~~~~~g 76 (347) +.. +.+...........+.+++--.+.-+.|..++.+.....+.+++++++..+.+ |+....+.. +...+.....| T Consensus 94 ~~~~~~~~~~~~~~~~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~ 173 (397) T protein:vir:48 94 FKNLVRGRYQNLLDSKTDASGSDAGLTIPQDIQTAIHTLVRQYDSLQEYVNVENVTTLTGSRVYEKWADITGLAKLDDEA 173 (397) T ss_pred HHHHHhhhhhHHHHHhhccCCccccccccHHHHHHHHHHHHHHHHHHhhhceeeccCCcceEEEEeecCCCcceeeeccc Confidence 000 00000000000000111111224568999999999999999999888877664 333333333 22334555566 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) +.++.+ ..++-.++++.+.+. .....|.+-=-.++.+|+.+.+.++.++++++..|+.|+... 
T Consensus 174 ~~~~~~-~~~~~~~v~~~~~k~-~~~~~iS~ell~ds~~~l~~~v~~~l~~~~~~~~d~~il~G~--------------- 236 (397) T protein:vir:48 174 GSIGTN-DDPKLYPIRYAIKRY-AGISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILEAI--------------- 236 (397) T ss_pred cccccc-cccceeeEEeeheee-eeehhhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhcc--------------- Confidence 665432 234556777777555 333455433233467899999999999999999999886321 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTG 236 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G 236 (347) +.+. ..+. ..-++.|+++...|.....+. =.++++|..|..|.+-. -.+..+.-...+.+| T Consensus 237 --g~~~--~~~~------------~~~~d~i~~~~~~l~~~~~~~--a~~v~n~~~~~~L~~lk-d~~G~~i~~~~~~~~ 297 (397) T protein:vir:48 237 --ATLP--TKPT------------LTKWDDIIDLQAKVDPAIKQT--SFFLTNTSGFTALKKVK-NAFGDYLMERDVKSP 297 (397) T ss_pred --cccc--cccc------------cccHHHHHHHHHHhhhhhcCC--CEEEECHHHHHHHHHhh-cCCCceeeccCcCCC Confidence 1110 0000 012567888888888877653 36679999999987533 223445444456677 Q ss_pred eEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech Q lcl|NC_015249. 237 SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA 316 (347) Q Consensus 237 ~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~ 316 (347) .-+.++|++|+.+.+.+....+.. ...-+-++|+..+. .+....++++..+.. T Consensus 298 ~~~~l~G~PV~~~~~~~~~~~~~~------------------~~~~~~gd~~~~~~---------~~~~~~~~i~~~~~~ 350 (397) T protein:vir:48 298 TGYSIDGFAVKEVADRWLANASSG------------------AMPLYFGDLKQAVT---------LFDRQQMSLLSTNIG 350 (397) T ss_pred CCceeccceeEEecccccCCcCCC------------------ceEEEEEeccceEE---------EEeecceEEEEeccc Confidence 778999999998765443221100 00011123333222 222233555555432 Q ss_pred --hhh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
317 --NFQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 317 --~~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+. ...+++.++++..+++|++.+.+.++.+ T Consensus 351 ~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 385 (397) T protein:vir:48 351 GGAFETDTTKIRVIDRFDVVATDTESFVPASFKAI 385 (397) T ss_pred hhhhhcCceeEEEEeeeccEEecccceEEEEeccc Confidence 222 2466788889999999999999988888 No 103 >protein:vir:3870 Length: 400 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:82 # MgeName: A2 # Cross-refs: genbank:acc:NP_680487;swissprot:trembl:q8ltc0;genbank:gi:22296527;interpro:IPR006444;uniprot:Q8LTC0;genbank:GeneID:951713 Probab=99.30 E-value=4.7e-13 Score=88.17 Aligned_cols=277 Identities=12% Similarity=0.072 Sum_probs=156.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--CcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~~~~~~g~~ 78 (347) +...............+...++--.+.-+.|..++.+.....+.+++++++.++.++ +..+|+. +...+..+..|.. T Consensus 120 ~~~~~~~~~~~~~~~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~E~~~ 198 (400) T protein:vir:38 120 AVLRAVPTDASDAVNAGVKAADAASTIPETISNTPQRELQTVVDLKPFTNVFQASTQ-KGTYPTVANATTKMVTVAELEK 198 (400) T ss_pred hhhhhhhHHHHHHHhhcccccCCcccccHHHHHHHHHHHHhhhhhhhcceeEeccCc-ceEEEEEecCCCcccccccccc Confidence 000000000000000111111111245589999999998888999999988876544 4455543 4444555555555 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) .+.. ..+.-.++++.+.+. +.-..|.+-=-.++.+|+.+.+.++.+++|+...|+.|+.-. 
T Consensus 199 ~~~~-~~~~f~~i~~~~~k~-~~~~~is~ell~ds~~~~~~~i~~~l~~~~~~~~~~~i~~~~----------------- 259 (400) T protein:vir:38 199 NPAM-AKPEFKPVNWSVETY-RQALPVSQESIDDSAIDLVGLIAQNGQQIKVNTTNGAVATLL----------------- 259 (400) T ss_pred cccc-ccccceeeEeehhhe-eeehhhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhhhhcc----------------- Confidence 4321 234445555554333 223333332223456789999999999999999998875321 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHH-HhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARA-KLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~-~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) +++.. .. ... ++.|.++.. .++. ...-..|++|..|..|.+- +-.++.|.-...+.+|. T Consensus 260 ~~~~~------~~-----~~~----~~~~~~~~~~~~~~----~~~a~~v~~~~~~~~l~~l-kd~~G~~i~~~~~~~~~ 319 (400) T protein:vir:38 260 KGFTA------KT-----ISS----VDDLKHINNVDLDP----AYSRVIIASQSFYNFLDTV-KDGNGRYLLQDSILTPS 319 (400) T ss_pred ccccc------cc-----ccc----HHHHHHHHHhhhhh----hhCcEEEEcHHHHHHHHHh-hccCCCeeeecCcCCCC Confidence 01100 00 011 333443322 2222 2245678899999998653 22344454334566777 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN 317 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~ 317 (347) -++++|++|+.+++.|....+... -+-+||++.+.+ + ...+++++..++ . T Consensus 320 ~~~l~G~pv~~~~~~~~~~~g~~~--------------------~~~gd~s~~~~~-~--------~~~~~~~~~~~~-~ 369 (400) T protein:vir:38 320 GKSVLGMPIAVVSDDTLGAAGEAH--------------------AFLGDIKRAILF-A--------NRADFMVRWVDD-Q 369 (400) T ss_pred ccccccceeEEecccccCCCCceE--------------------EEEEeccccEEE-E--------eecceEEEEecc-c Confidence 789999999999999864322110 122344443322 2 223455665544 4 Q ss_pred hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 ~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++...+++.+++|.++.+|++.+.|.+..+ T Consensus 370 ~~~~~~~~~~r~d~~~~~~~a~~~l~~~~~ 399 (400) T protein:vir:38 370 IYGQFLQAGMRFGVSVADEKAGYFLTYTPK 399 (400) T ss_pred ccceeEEEEEEeccEEecccceEEEEeecC Confidence 556789999999999999999999998887 No 104 >protein:vir:81070 Length: 390 # NCBI annotation: p09 # Family: family:all:585 # MgeID: mge:1889 # MgeName: Xop411 # Cross-refs: genbank:acc:YP_001285679;genbank:gi:148727187;genbank:GeneID:5247115 Probab=99.30 E-value=1.8e-12 Score=85.03 Aligned_cols=279 Identities=16% Similarity=0.097 Sum_probs=166.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecC--cceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLG--RTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG--~~~~~~~~~g~~ 78 (347) .+..+. ......++.-.+..+++...+.......+.+++++++..+. +.++++++.. ..++..+..|+. T Consensus 107 ~~~~~~--------~~~~~~~~~g~~~~~~~~~~ii~~~~~~~~l~~~~~~~~~~-~~~~~~~~~~~~~~~a~~v~Eg~~ 177 (390) T protein:vir:81 107 KAALNT--------ASTDAAGSAGALTTPNRLPGFITPPDARLTVRDLIGSGRTD-SALIEYVQETGFVNNAAIVAEGAL 177 (390) T ss_pred HHHHHh--------hccccccCCcceechhhhHHHHHHHhhhhhhhhhcceeecc-CCceEEEEEecCCcceeeecCCcc Confidence 111110 00111122223566788888888888889888888877654 4567777753 245667777877 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++.. +++-+++++.+.+.-. -..|.+ +-.+...++.+.+.++.+.++++..|++++..- ..+..+.|. 
T Consensus 178 ~~~~--~~~~~~i~~~~~k~~~-~~~is~-ell~d~~~~~~~i~~~l~~~~~~~~d~a~l~G~--------g~~~~~~Gi 245 (390) T protein:vir:81 178 KPES--SLKFAKKTDTTHVIAH-TMKATR-QILSDAPQLASYMNNRLIRGLKVKEDAEILRGT--------GANDGLLGL 245 (390) T ss_pred cccc--cceeeEEEEeeeEEEE-eehhhH-HHHHhHHHHHHHHHHHHHHHHHHHHHHHHHhcC--------CCCCcccce Confidence 7643 4556777777765533 334432 223334678888999999999999999876310 111112111 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) . ...... ...........++.|.++...+...+.+.. .+|++|..|..|.+-.. .++.|.-.. ...+.. T Consensus 246 ----~-~~~~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~v~~~~~~~~l~~lkd-~~G~~l~~~-~~~~~~ 314 (390) T protein:vir:81 246 ----I-PQATTY--AAPTTIAGATRVDQLRLAMLQASLAEYNPS--GIVINPIDWAAIELAKD-ANNQYLIGN-ARGTLT 314 (390) T ss_pred ----e-eccccc--ccccccccchhHHHHHHHHHhhccccCCCC--EEEEcHHHHHHHHHhhc-CCCceeecC-cccccC Confidence 1 100000 001111223356788888889988887654 46789999998875332 233333222 234445 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh- Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN- 317 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~- 317 (347) ..++|.+|+.++.+|.+. -+-+||+....+ +. ..+++++..+... T Consensus 315 ~~l~G~pv~~~~~~p~~~-------------------------~~~gd~~~~~~~-~~--------~~~~~v~~~~~~~~ 360 (390) T protein:vir:81 315 PTLWGLPVVATQAMAPGE-------------------------FLVGAFDLAAQI-FD--------QWDARVEIGYVGED 360 (390) T ss_pred ceecceeeEEcCCCCCCc-------------------------EEEEehhceEEE-EE--------ecceEEEEecccch Confidence 689999999999998421 122344443322 22 2356667665433 Q ss_pred hhcc--eeeeeeeecccccccceEEEEEEc Q lcl|NC_015249. 
318 FQAD--QIIAKYAMGHGGLRPEACGALVFN 345 (347) Q Consensus 318 ~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~ 345 (347) ++.+ .+++.++++..+++|++.+.+.+. T Consensus 361 ~~~~~v~~r~~~r~d~~v~~~~a~v~~t~a 390 (390) T protein:vir:81 361 FQRNMITVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred hhcCcEEEEEEEeeccEEecccceEEEEeC Confidence 3444 467889999999999999988888 No 105 >protein:vir:104256 Length: 458 # NCBI annotation: major head protein precursor # Family: family:all:27070 # MgeID: mge:1504 # MgeName: T5 # Cross-refs: genbank:acc:YP_006977;genbank:gi:46401878;genbank:GeneID:2777673 Probab=99.30 E-value=1.3e-12 Score=85.79 Aligned_cols=294 Identities=11% Similarity=0.043 Sum_probs=156.5 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEee-cCcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPV-LGRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~-iG~~~~~~~~~g~~~ 79 (347) +..... .+ ......+.-.+..+.+..++.+.-...+.++.+.++..+.++ ...+++ .+...+.....+... T Consensus 155 ~~~~~a-~~------~~~~~~~g~~~ip~~~~~~ii~~~~~~~~l~~~~~~~~~~~~-~~~~~~~~~~~~a~~v~e~~~~ 226 (458) T protein:vir:10 155 QRHLKA-VN------QSSSVEVSSESYETIFSQRIIRDLQKELVVGALFEELPMSSK-ILTMLVEPDAGKATWVAASTYG 226 (458) T ss_pred hhhhhh-hh------hcccCccccceehhhHhHHHHHHHHhhhhHHhhcceeecCCc-ceEEEEecCCcceeeccccccc Confidence 111100 00 001111222356689999999998888888888887776554 455553 445555555555444 Q ss_pred CCcc----CCCCCceEEEEEEeeeeccc-ccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 80 DDKR----KDMKHTERTINIDGLLTADV-LIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 80 ~~~~----~~~~~~~~~l~ID~~~~~~~-~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) +... .+++-.+++ +...++..+ .|.+-=-.++.+++.+.+.++++++|++..|+.++.-- .... 
T Consensus 227 ~~~~~~~~~~~~~~~i~--~~~~k~~~~v~is~ell~ds~~~~~~~i~~~l~~~i~~~~d~~~l~G~---------G~~~ 295 (458) T protein:vir:10 227 TDTTTGEEVKGALKEIH--FSTYKLAAKSFITDETEEDAIFSLLPLLRKRLIEAHAVSIEEAFMTGD---------GSGK 295 (458) T ss_pred ccccccccccccceeeE--eeeeeEEeeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhcCC---------CCCc Confidence 3221 122334444 444444443 33322223355889999999999999999999886310 0011 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc----c Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQA----L 230 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~----~ 230 (347) +.|.-.......+.... ...........|+.|+++...|...+.. .-..|++|..|..|.+-. -.+..+.. . T Consensus 296 p~Gi~~~~~~~~~~~~~-~~~~~~~~~~~~~~i~~~~~~l~~~~~~--~~~~v~~~~~~~~l~~lk-d~~G~~i~~~~~~ 371 (458) T protein:vir:10 296 PKGLLTLASEDSAKVVT-EAKADGSVLVTAKTISKLRRKLGRHGLK--LSKLVLIVSMDAYYDLLE-DEEWQDVAQVGND 371 (458) T ss_pred cceeeecccccccceee-cccccccccccHHHHHHHHHhhhhhhcC--CCEEEEcHHHHHHHHhhc-ccCCceeeccccc Confidence 11110000000000000 0000011122367788888888877654 334688999998875422 22333322 2 Q ss_pred cccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceee Q lcl|NC_015249. 231 IDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMAL 310 (347) Q Consensus 231 ~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~ 310 (347) .....|...+++|.+|+.++.+|..+..... +-++|.... +++ ...++++ T Consensus 372 ~~~~~~~~~~l~G~pv~~~~~~p~~~~~~~~---------------------~~~~f~~~~-~~~--------~~~~~~v 421 (458) T protein:vir:10 372 SVKLQGQVGRIYGLPVVVSEYFPAKANSAEF---------------------AVIVYKDNF-VMP--------RQRAVTV 421 (458) T ss_pred cccccCcCceecceeeEEccccccccCCcce---------------------EEEEecccE-EEE--------EeeceEE Confidence 2344566678999999999999864322110 112222211 122 2233343 Q ss_pred eeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
311 ERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 311 e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +...-.......+++..++|..+.||++.|...+.++ T Consensus 422 ~~d~~~~~~~~~~~~~~r~~~~v~~~~a~v~~~~aa~ 458 (458) T protein:vir:10 422 ERERQAGKQRDAYYVTQRVNLQRYFANGVVSGTYAAS 458 (458) T ss_pred EeecccCCCceEEEEEEEecceEecccceEEEeeccC Confidence 3221122333457888999999999999999888888 No 106 >protein:vir:4997 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:109 # MgeName: Sfi21 # Cross-refs: genbank:acc:NP_049971;genbank:gi:9632943;genbank:GeneID:1262106 Probab=99.29 E-value=1.9e-12 Score=84.89 Aligned_cols=284 Identities=12% Similarity=0.053 Sum_probs=161.0 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEeecCc--ceeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFPVLGR--TKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~~iG~--~~~~~~~~g~ 77 (347) ..-+.++....-+.......++--.+.-+.|..++.+.....+.++.++++..+.++. ++.+++... ..+.....|. T Consensus 95 ~~~l~~~~~~~~~~~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~ 174 (397) T protein:vir:49 95 KNLVRGRYQNLLDSKTDGSGSDAGLTIPQDIRTAINTLVRQFDSLQEYVNVENVTTLTGSRVYEKWADITGLAKLDDEGG 174 (397) T ss_pred HHHhhcchhhHHHhhhccCCccCcceecHHHHHHHHHHHHhhhhHhhhcceeeccCCcceEEEEeeccCCcceeeecccc Confidence 0000010000000000011111112455899999999998889998888888776532 344554432 2445555566 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++.+ ..++-.++++...+.- .-..|.+-=-.++.+|+.+.+.+++++++++..|+.|+.-. 
T Consensus 175 ~~~~~-~~~~~~~v~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~ail~G~---------------- 236 (397) T protein:vir:49 175 QIGQN-DDPKLSLIRYAIKRYA-GISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILEAI---------------- 236 (397) T ss_pred ccccc-cccceeeeEeeeeeeE-eehhhHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHHhcc---------------- Confidence 55432 1234466666666553 33345432223467899999999999999999999886321 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) +.++. . .... -|+.|+++...|+....+. -.+|++|..|..|.+-. -.+..|.-...+.+|. T Consensus 237 -g~~~~--~--------~~~~----~~d~i~~~~~~l~~~~~~~--a~~v~n~~~~~~l~~lk-d~~g~~l~~~~~~~g~ 298 (397) T protein:vir:49 237 -GTLPN--K--------PTLA----KWDDIIDLQAKVDPAIKQT--SLFLTNTSGFTALKKVK-NAMGDYLMERDVKSPT 298 (397) T ss_pred -ccccc--c--------cccc----CHHHHHHHHHhhhhhhcCC--CEEEEcHHHHHHHHHhh-ccCCceeecccccCCC Confidence 01100 0 0001 2567788888888877664 36789999999886532 2233443333456677 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech- Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA- 316 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~- 316 (347) -++++|++|+.+.+.+....+.. ...-+-+||.+.+- .+....++++..... T Consensus 299 ~~~l~G~pV~~~~~~~~~~~~~~------------------~~~~~~gd~~~~~~---------~~~~~~~~i~~~~~~~ 351 (397) T protein:vir:49 299 GYSIDGFVVKEISDRFLPNGTGG------------------AMPLYFGDLKQAVT---------LFDRQHLSLLSTNIGG 351 (397) T ss_pred CceecceeeEEecccccccccCC------------------ceeEEEeeccceEE---------EEeecccEEEEecccc Confidence 78999999998665443221110 00012233333332 233334555554322 Q ss_pred -hhh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
317 -NFQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 317 -~~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+. ...+++..+++.++++|++.+.+.+.+. T Consensus 352 ~~~~~~~~~~~~~~r~d~~~~~~~a~~~~~~~~~ 385 (397) T protein:vir:49 352 GAFETDTTKVRVIDRFDVVSTDTEAFVPASFKAI 385 (397) T ss_pred chhhcCeeeEEEEEeeccEEecccceEEEEeccc Confidence 223 3357888999999999999998887776 No 107 >protein:vir:81160 Length: 371 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:1892 # MgeName: Geobacillus virus E2 # Cross-refs: genbank:acc:YP_001285811;genbank:gi:148747732;genbank:GeneID:5247203 Probab=99.27 E-value=1.7e-12 Score=85.10 Aligned_cols=281 Identities=12% Similarity=0.044 Sum_probs=163.0 Q ss_pred CCcc----cccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccccccccc-ceEEEeec-Ccceeeeee Q lcl|NC_015249. 1 MAKM----NGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSG-KSAQFPVL-GRTKAAYLQ 74 (347) Q Consensus 1 ma~~----~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G-~tv~i~~i-G~~~~~~~~ 74 (347) +..+ .-+.+.+|- ++--.+.-+.|.+++.+.....|.++.++++..+.++ -+..+++. +...+.... T Consensus 80 ~~~l~~~~~~a~~~~t~-------~~gg~~vP~~~~~~ii~~~~~~s~i~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~ 152 (371) T protein:vir:81 80 VNHIRTRFRNAMSEGSN-------QDGGYTVPQDIQTRINELRESKDALQNLITVEPVTTLSGSRVFKKRSQQTGFVEVA 152 (371) T ss_pred HHHHHHHHHHhhccCCC-------ccCceeecHhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEeecCCcceeeec Confidence 0000 000011111 1111145588999999999999999999888877643 24445544 345777778 Q ss_pred cCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 75 PGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 75 ~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) .|+.++.. ..+.-.++++...+.- ....|.+-=-.++.+|+.+.+.++.++++++..|+.++.-.. 
T Consensus 153 Eg~~~~~~-~~~~f~~i~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~a~~~~~~~~i~~g~g------------ 218 (371) T protein:vir:81 153 EGAAIGEK-ATPQFTLLQYQVKKYA-GFFRVTNELLNDSTEAIVNTLVRWIGDESRVTRNGLIINVLN------------ 218 (371) T ss_pred cccccccc-cccceeeEEeeeeEEE-EeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHHhhcc------------ Confidence 78776432 2344566666665542 223443322233567899999999999999999988753210 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHH-HHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLAR-AKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDP 233 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~-~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~ 233 (347) .+.+.+ .. -++.|..+. ..|+...-+ .=..|++|..|..|.+-. -.+..|.-...+ T Consensus 219 -~~~~~~---------------~~----~~~~i~~~~~~~l~~~~~~--~a~~vmn~~~~~~L~~lk-d~~g~~l~~~~~ 275 (371) T protein:vir:81 219 -TKAKTA---------------IA----DLDGLKQIINVQLDPVFRS--TSSVIVNQDAFNWLDTLK-DQNGQYLLQPSI 275 (371) T ss_pred -cccccc---------------cc----cHHHHHHHHHhhcchhhhc--CCEEEEcHHHHHHHHHhh-ccCCCeeeeccc Confidence 000000 00 133344333 234443322 336789999999887532 334445444455 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) ..|..++++|.+|+.++++|.+........ .+...-+-+||+..+.+.. ...+++++. T Consensus 276 ~~~~~~~l~G~pV~~~~~~~~~~~~~~~~~-------------~~~~~i~~Gd~~~~~~~~~---------~~~~~i~~~ 333 (371) T protein:vir:81 276 SSPTGRQLLGLPVVIVSNKVLANRVDGGTG-------------AQFAPIIVGDLKEAVVMFD---------RQRTEIMSS 333 (371) T ss_pred CCCCCceecceeEEEecccccCcccccccc-------------CCcceEEEEehhceEEEEe---------ecceEEEEe Confidence 677778999999999999996543211100 0111123344544333222 233455554 Q ss_pred echh--h--hcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
314 RRAN--F--QADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~d~~--~--~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +... + -...+++.++++.++++|++.+.+.+..| T Consensus 334 ~~~~~~f~~~~v~~~~~~r~d~~~~~~~a~~~~~~~~A 371 (371) T protein:vir:81 334 NVAMDAFETDATLWRAIERMDVKMRDDEAFVFGEVQLA 371 (371) T ss_pred ccccchhhcCceEEEEEEeeccEEecccceEEEEEecC Confidence 4332 2 23577888899999999999999999999 No 108 >protein:vir:102119 Length: 404 # NCBI annotation: phage major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1641 # MgeName: phiSM101 # Cross-refs: genbank:acc:YP_699941;genbank:gi:110804052;genbank:GeneID:4206662 Probab=99.27 E-value=2.3e-12 Score=84.43 Aligned_cols=297 Identities=10% Similarity=0.046 Sum_probs=161.3 Q ss_pred CCcccccc-ccc---ccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccccccccc-ceEEEee-cCcceeeeee Q lcl|NC_015249. 1 MAKMNGGQ-QIG---KDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSG-KSAQFPV-LGRTKAAYLQ 74 (347) Q Consensus 1 ma~~~~~~-~~~---t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G-~tv~i~~-iG~~~~~~~~ 74 (347) +......+ ... .+--..+..++--.+.-+.+.+++...-+..+.++.+.++..+.++ -++.+++ .+...+.... T Consensus 92 ~~~~~~~~~~~~~~e~~a~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~l~~~~~~~~~~g~~~~~~~~~~~~~~~v~ 171 (404) T protein:vir:10 92 LKQKNQRGLNLSEKEINAISENIDEDGGYAVPEDIQTKINTRLKDTTDLYNMVDYEPVFTRSGSRTYEKRSKQKPMKPLS 171 (404) T ss_pred HHHHHhhhhcchhhHHhhhccccCCCCceeechhHHHHHHHHHhhhhhHhhhhceeeccCCccceEEEEecCCcceeecc Confidence 11100000 000 0000000111111234588899998888888989998888887642 3455554 5667777777 Q ss_pred cCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 75 PGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 75 ~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) .|+..+.+...+.-.++++...+.- .-..|.+-=-.++.+++.+.+.++.++++++..|+.|+.-. ..... 
T Consensus 172 e~~~~~~~~~~~~f~~i~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~il~G~--------g~~~~ 242 (404) T protein:vir:10 172 ENQQIPTNGDNGKLERFNFKLKDLA-DFMSIPNDLLKFADKSLEDWIINWFVDKVRITRNAEILYGA--------GGDEH 242 (404) T ss_pred ccccccccccccceeeeEeeheeeE-eeehhhHHHHhhcHHHHHHHHHHHHHHHHHHHHHHHHhhcC--------CCCCc Confidence 7777654322344555566554442 23344332223456789999999999999999999886311 01111 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHHH-HhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARA-KLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDP 233 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~-~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~ 233 (347) +.+.- ........+.... ..++.+..+.. .|....-+ +-.+|++|..|..|.+- +-.+..|.-...+ T Consensus 243 ~~gi~-----~~~~~~~~~~~~~----~~~~~~~~~~~~~l~~~~~~--~~~~v~n~~~~~~L~~l-kd~~G~~l~~~~~ 310 (404) T protein:vir:10 243 ATGIM-----TANKFKKITLPKS----PALKDFKKCKNVELLNVFKA--TSSWIVNQDGFNYLDSL-EDKTGRPYLQPDP 310 (404) T ss_pred cccee-----eccccceeecccc----ccHHHHHHHHHhhhhccccC--CCEEEEcHHHHHHHHHh-hccCCceeeccCc Confidence 11111 1111111111111 12455555443 34444333 23578999999988753 2334455444456 Q ss_pred ccceEEEEeceEEEEecc-eecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPH-LTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALER 312 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~-lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~ 312 (347) .+|...+++|.+|+.++. +|....+. ...+-++|+..+.+.. ...++++. T Consensus 311 ~~~~~~~l~G~PV~~~~~~~~~~~~~~--------------------~~~~~gd~s~~~~~~~---------~~~~~i~~ 361 (404) T protein:vir:10 311 KDPTQYRFLGLPVIELPNDLLLSTESA--------------------IPVLLGDTKEAYKYVS---------DGAYELAT 361 (404) T ss_pred CCCCCccccceeeEEecccccCCCCCc--------------------cEEEEEeccccEEEEE---------ecceEEEE Confidence 677778999999986433 33211110 0112234444332222 23455555 Q ss_pred eech--hh--hcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
313 ARRA--NF--QADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 313 ~~d~--~~--~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ..++ .+ -...+++.+++|..++||++.+.+.++.| T Consensus 362 ~~~~~~~~~~~~~~~~~~~r~d~~v~~~~a~~~~~~~~a 400 (404) T protein:vir:10 362 TNIGAGAFETNTTKARIIMRIDGNVKDSEALLIAEIPVE 400 (404) T ss_pred eccccchhhcCceEEEEEEeeccEEecccceEEEEeecc Confidence 5443 22 23458899999999999999999999988 No 109 >protein:vir:2430 Length: 318 # NCBI annotation: major head subunit # Family: family:all:507 # MgeID: mge:52 # MgeName: D29 # Cross-refs: genbank:acc:NP_046832;genbank:gi:9630400;genbank:GeneID:1261582 Probab=99.27 E-value=3.2e-12 Score=83.59 Aligned_cols=289 Identities=11% Similarity=0.038 Sum_probs=156.7 Q ss_pred CCccc-ccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCC Q lcl|NC_015249. 1 MAKMN-GGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~-~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~ 78 (347) |+-=. ..+. .+.-..-...+.-.+..+++..++.+..++.++++.+.++..+. +.+.+||+. +...+.....|+. T Consensus 1 ~~~~~~~~~e--~~~~~~~~~~~~~~~ip~~~~~~ii~~~~~~~~l~~~~~~~~~~-~~~~~ip~~~~~~~a~~v~Eg~~ 77 (318) T protein:vir:24 1 MAAGTAFAVD--HAQIAQTGDTMFKGYLEPEQAKDYFAEAEKTSIVQQFAQKVPMG-TTGQKIPHWVGDVSAQWIGEGDM 77 (318) T ss_pred CCCCCCCCHH--HHHhhcccCcccceeechhHHHHHHHHHHhhchhhhhcceeecc-CCceEEEEEeCCcceEEecCCcc Confidence 11100 0000 00000001111112456889999999998999999888877755 456778765 5667777788888 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++. .+++-+++++..-+. ..-..|.+-=-.++.+|+.+.+.+++++++++.+|+.++.-. . ...+.+. 
T Consensus 78 ~~~--~~~~f~~i~~~~~k~-~~~~~iS~e~l~ds~~~~~~~i~~~l~~~~~~~~d~a~l~G~----g-----~~~~~~~ 145 (318) T protein:vir:24 78 KPI--TKGNMTSQTIAPHKI-ATIFVASAETVRANPANYLGTMRTKVATAFAMAFDGAAMHGT----D-----SPFPTYI 145 (318) T ss_pred ccc--cccceeEEEEeeEEE-EEeehhhHHHhhcChHHHHHHHHHHHHHHHHHHHHHhhhccc----C-----CCCCccc Confidence 765 345556666555443 222334331112366889999999999999999999886321 0 0011111 Q ss_pred cCcc-eeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccc- Q lcl|NC_015249. 159 GKAH-VLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTG- 236 (347) Q Consensus 159 ~~g~-~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G- 236 (347) .... .+..+. ..+. .....+.++++...+...+. ..-.++++|..|..|.+-. -.+..|........+ T Consensus 146 ~~~~~~~~~~~---~~~~----~~~~~~~~~~~~~~~~~~~~--~~~~~v~n~~~~~~L~~lk-d~~G~~l~~~~~~~~~ 215 (318) T protein:vir:24 146 GQTTKAISIAD---TTGA----TTVYDQVAVNGLSLLVNDGK--KWTHTLLDDITEPILNGAK-DQNGRPLFIESTYGEA 215 (318) T ss_pred ccccccccccc---cccc----cchHHHHHHHHHHhhccccC--CCCEEEEcHHHHHHHHHhh-ccCCceeecCccccCc Confidence 1100 011110 0001 11122334555555555443 3346799999999987532 223333222222222 Q ss_pred ----eEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeee Q lcl|NC_015249. 237 ----SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALER 312 (347) Q Consensus 237 ----~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~ 312 (347) .-+.+.|++++.++++|.+... -+-+||+..+ +. ...++.++. T Consensus 216 ~~~~~~~~i~g~pv~~~~~~~~~~~~-----------------------~~~gdfs~~~--~~--------~~~~l~i~~ 262 (318) T protein:vir:24 216 ASPFRSGRIVARPTILSDHVVEGTTV-----------------------GFMGDFSQLI--WG--------QIGGLSFDV 262 (318) T ss_pred cccccCceEEEEeeEEeCCCCCCccE-----------------------EEEeecceEE--EE--------EecCeEEEE Confidence 1247889999999888742110 1223444432 11 123345555 Q ss_pred eechh--------------hhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
313 ARRAN--------------FQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 313 ~~d~~--------------~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++.. ++. -.+++..+++..++||++.+.|....| T Consensus 263 ~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~i~~~~a 313 (318) T protein:vir:24 263 TDQATLNLGTVESPNFVSLWQHNLVAVRVEAEYAFHCNDAEAFVALTNVVS 313 (318) T ss_pred eeccceeccccccccchhhhhcCcEEEEEEEEEccEEecccceEEEEeecc Confidence 44422 222 445788899999999999888877777 No 110 >protein:vir:5974 Length: 324 # NCBI annotation: hypothetical protein # Family: family:all:1522 # MgeID: mge:125 # MgeName: SPP1 # Cross-refs: genbank:acc:NP_690674;genbank:geneid:6329212;genbank:gi:22855068;goa:Q38582;uniprot:Q38582;genbank:GeneID:955303 Probab=99.26 E-value=2.2e-12 Score=84.53 Aligned_cols=278 Identities=14% Similarity=0.070 Sum_probs=180.1 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhh--cc-cc---ccc----ccccceEEEeecCcc- Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTM--NK-HL---VRS----IQSGKSAQFPVLGRT- 68 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~--~~-~~---~r~----i~~G~tv~i~~iG~~- 68 (347) ||.+ +. + + +|+ |+|..+|.+...+.+.|. +. .+ ... -.+|+++.+|..+.. T Consensus 1 MA~T--------~l------s--d-~i~peVf~~yv~~~~~~~~~l~qSg~i~~~a~i~~~l~~~~~G~~i~~P~~~~l~ 63 (324) T protein:vir:59 1 MAYT--------KI------S--D-VIVPELFNPYVINTTTQLSAFFQSGIAATDDELNALAKKAGGGSTLNMPYWNDLD 63 (324) T ss_pred CCce--------ee------e--c-eechhHHHHHHHhhhHHHHHHhhcccccccHHHHHHhhccCCCCEEEecccccCC Confidence 8732 22 1 1 455 999999988888887652 11 11 111 136999999988764 Q ss_pred -eeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 69 -KAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 69 -~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) ..+.+..++++. ...+..++..-+|= .....+.+.|+-...+-.|++.+++++.+..+++..|..++..+.+.... 
T Consensus 64 Gd~~~v~~~~~i~--~~~l~t~~~~a~i~-~~~k~~~~tD~a~~~sg~dp~~~i~~q~a~~~~~~~~~~lia~l~g~~~~ 140 (324) T protein:vir:59 64 GDSQVLNDTDDLV--PQKINAGQDKAVLI-LRGNAWSSHDLAATLSGSDPMQAIGSRVAAYWAREMQKIVFAELAGVFSN 140 (324) T ss_pred CcccccCCCcccc--hhhcccceeeEEEE-eecCceeehhhhhhhccchHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc Confidence 456777777764 36787777776664 56788899999988888899999999999999999998887766433322 Q ss_pred ccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhh Q lcl|NC_015249. 148 PSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANY 227 (347) Q Consensus 148 ~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~ 227 (347) ..... ....+.++. +.... ++.|.+|..+|.++. ..-..+++.|..|..|.+.. +++.-. T Consensus 141 ~~~~~---------~~~dvsa~~----~~~~s----~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~-li~~~~ 200 (324) T protein:vir:59 141 DDMKD---------NKLDISGTA----DGIYS----AETFVDASYKLGDHE--SLLTAIGMHSATMASAVKQD-LIEFVK 200 (324) T ss_pred ccccc---------ceeeeeccc----cceec----HHHHHHHHHHhCCcc--cCcEEEEEchHHHHHHHHhh-hhhhcc Confidence 21111 111111111 11112 356778888887764 34469999999999998764 332111 Q ss_pred ccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhc- Q lcl|NC_015249. 228 QALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLK- 306 (347) Q Consensus 228 ~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~- 306 (347) .. -.++.|+.++|.+|+.+..+|....+ +...+...+++-+-|++....+ T Consensus 201 ~s---~~~~~i~~~~G~~VivdD~~p~~~~~--------------------------~~~~~y~s~l~~~GAi~~~~~~~ 251 (324) T protein:vir:59 201 DS---QSGIRFPTYMNKRVIVDDSMPVETLE--------------------------DGTKVFTSYLFGAGALGYAEGQP 251 (324) T ss_pred cc---ccCceeeeecccEEEEeCCCCccccC--------------------------CCCceEEEEEEecCeEEEeecCC Confidence 11 12567899999999999999853321 1112344567778888887655 Q ss_pred ceeeeeeechhhhcceeeeeeeeccccc--cc-ceEEEEEEcCC Q lcl|NC_015249. 
307 DMALERARRANFQADQIIAKYAMGHGGL--RP-EACGALVFNKA 347 (347) Q Consensus 307 ~~~~e~~~d~~~~~d~i~~~~a~G~~~~--Rp-e~a~~i~~~~a 347 (347) ++.+|..||+....|.+...+.|...|+ .+ +..+.-..|.- T Consensus 252 ~v~vE~dRd~~~g~~~l~~r~~~~~~p~G~s~~~~~~~~~sPt~ 295 (324) T protein:vir:59 252 EVPTETARNALGSQDILINRKHFVLHPRGVKFTENAMAGTTPTD 295 (324) T ss_pred CcceecccCccccceEEEEeeEEEeEeeeEEecccccCCCCCCh Confidence 4678999999988898888888875543 23 21111111111 No 111 >protein:vir:102944 Length: 330 # NCBI annotation: major head protein # Family: family:all:1522 # MgeID: mge:1461 # MgeName: EJ-1 # Cross-refs: genbank:acc:NP_945286;genbank:gi:39653721;uniprot:Q708M6;genbank:GeneID:2672858 Probab=99.26 E-value=1.3e-12 Score=85.72 Aligned_cols=281 Identities=12% Similarity=0.054 Sum_probs=172.7 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhc---ccccccc-----cccceEEEeecCcc--e Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMN---KHLVRSI-----QSGKSAQFPVLGRT--K 69 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~---~~~~r~i-----~~G~tv~i~~iG~~--~ 69 (347) ||+.+ |+.. + +|+ |+|..+|.+...+.+.|.. +++...+ .+|+++.+|..+.. . T Consensus 1 Ma~~~------T~l~--------d-~i~pevf~~yv~~~~~~~~~l~qSG~i~~~~~i~~~~~~~G~~i~~P~~~~l~G~ 65 (330) T protein:vir:10 1 MANEL------TKIL--------D-TITPQQYNAYMQQYTAAKSAFVQSGIAVSDERVSKNITSGGLLVNMPFWNDLTGD 65 (330) T ss_pred CCCCc------eEee--------e-eechhHHHHHHHHHhHHhhhhhhcccccccHHHHHHhhcCCCEEEecccccCCCc Confidence 99854 3331 1 455 9999999988887775521 2222112 35999999988755 3 Q ss_pred eeeeecCC-CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc Q lcl|NC_015249. 70 AAYLQPGE-NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLP 148 (347) Q Consensus 70 ~~~~~~g~-~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~ 148 (347) ...+..|. ++. +..+..++..-+|=. .--.+.+.|+-...+-.|++.++.+|.+...++..+..++..+.+..+.. 
T Consensus 66 ~~~~~dg~~~i~--~~ki~t~~~~a~i~~-~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~w~~~~q~~lla~l~gvf~~~ 142 (330) T protein:vir:10 66 SEVLGNGDKALE--TGKITAGADIACVLY-RGRGWAANELTGVVAGSDPVRAILNRIGAYWLREDQKALIATLNGIFATG 142 (330) T ss_pred ccccCCCccccc--hhhcccceeEEEEEe-ecceeeehhhhhhhcchhHHHHHHHHHHHHhhhhHHHHHHHHHHhhhhhh Confidence 44554453 453 356777776666533 35568889999888888999999999999999998887776665544322 Q ss_pred cccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhc Q lcl|NC_015249. 149 SASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQ 228 (347) Q Consensus 149 ~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~ 228 (347) ....... .. .+.....+ ....... ++.|.+|..+|.++. ..-..+++.|..|..|.+.. +++..-. T Consensus 143 ~~~~~~~--~~-~~~~~~~~----~~~a~~s----~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~-li~~~~~ 208 (330) T protein:vir:10 143 TAGEKGA--LE-ETHVSDQS----KASTGID----AGMVLDAKQLLGDSA--DQVTAIAMHSAVYTKLQKDN-LIQYIQP 208 (330) T ss_pred hcccchh--hh-hhheeccc----ccccccC----HHHHHHHHHHhcccc--ccceEEEEcHHHHHHHHHhh-hhhhhcc Confidence 2111110 00 00000000 0111111 356778888887765 34579999999999998743 4322111 Q ss_pred cccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcc- Q lcl|NC_015249. 229 ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKD- 307 (347) Q Consensus 229 ~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~- 307 (347) . ..++.|+.++|.+|+.+..+|.... +....+|-+-|++..+..+ T Consensus 209 s---~~~~~i~~~~G~~VivdD~~p~~~~-------------------------------~yt~yl~~~GAi~~~~~~~~ 254 (330) T protein:vir:10 209 T---TATINIPTYLGYRVIIDDGIAPTGD-------------------------------IYTSYLFRTGSIGLNTGNPS 254 (330) T ss_pred c---ccCcccccccceEEEEeCCCCCCCC-------------------------------ceeEEEEecCceeeecccCC Confidence 1 1256789999999999999984211 1122455566666655332 Q ss_pred --eeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
308 --MALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 308 --~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.+|..||+....+.+..++.|...|+=--.....+.... T Consensus 255 ~~v~~EtdRd~~~g~~~l~~r~~~~~hp~G~s~~~~~~~~~~ 296 (330) T protein:vir:10 255 GLTTFETSREAAKGNDMIYTRRALVMHPYGVKWTGAEVDAGN 296 (330) T ss_pred ccccccccCCccccceEEEEeeEEEeeeeeeeecccccccCc Confidence 5678889999888999988888765432222111111111 No 112 >protein:vir:100135 Length: 418 # NCBI annotation: gp5 # Family: family:all:585 # MgeID: mge:1639 # MgeName: phi1026b # Cross-refs: genbank:acc:NP_945035;genbank:gi:38707895;genbank:GeneID:2744182 Probab=99.26 E-value=3.6e-12 Score=83.32 Aligned_cols=287 Identities=14% Similarity=0.131 Sum_probs=163.7 Q ss_pred CCcccc--cccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-C-cceeeeeecC Q lcl|NC_015249. 1 MAKMNG--GQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-G-RTKAAYLQPG 76 (347) Q Consensus 1 ma~~~~--~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G-~~~~~~~~~g 76 (347) |..... ..+.....+.+ .++.-.+..+.|..++.......+.++++++...+. +.++.+++. + ..++.....| T Consensus 121 ~~~~~~~~~~~~~~~~~~~--~~~~g~lvp~~~~~~ii~~~~~~~~l~~~~~~~~~~-~~~~~~~~~~~~~~~a~~v~E~ 197 (418) T protein:vir:10 121 RVRVDRKSIMNVPATVGSG--VSGSNSLVVADRQAGIIAPPQRKMTIRDLLMPGQTS-SSSIEYTVETGFTNNAAAVAEG 197 (418) T ss_pred hhhhHHHHHHHhhhhccCC--CCCCccccchhHHHHHHHHHhhhhhHHhhcceeecc-CCceeEEEEecCCCceeeeccC Confidence 111110 00000011111 111122567899999999998999999998887765 556777764 3 2455666677 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) +.++. .+++-.++++...+... -..|.+ +-.+...++.+.+.++.++++++..|++++.-- .++..+. 
T Consensus 198 ~~~~~--~~~~f~~v~~~~~k~~~-~~~is~-ell~ds~~l~~~i~~~l~~a~~~~~d~a~l~G~--------g~~~~p~ 265 (418) T protein:vir:10 198 AQKPT--SDLKFNLKNQPVRTIAH-LFKASR-QILDDAPALQSYIDGRARYGLQLTEEGQILKGD--------GTGANIL 265 (418) T ss_pred ccccc--cccceeeEEEeeeeEEE-eehhhH-HHHHhHHHHHHHHHHHHHHHHHHHHHHHHhccC--------CCCcccc Confidence 77654 34555666666655432 234432 233445688899999999999999999886310 0111122 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTG 236 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G 236 (347) |.-........ .........++.|+++...+...+.+.. .+|++|..|..|.+-. -.++.|.... ..+| T Consensus 266 Gi~~~~~~~~~-------~~~~~~~~~~~~i~~~~~~~~~~~~~~~--~~v~n~~~~~~L~~lk-d~~G~~i~~~-~~~~ 334 (418) T protein:vir:10 266 GILPQASAFMP-------SITLANATPIDKIRLALLQAVLAEFPAT--GIVLNPIDWASIELTK-DSQGRYIVGN-PVNG 334 (418) T ss_pred ccccccccccc-------cccccccccHHHHHHHHHhhccccCCCC--EEEEcHHHHHHHHHhh-cCCCceeccc-cccC Confidence 21111000000 0001111236677777777776665433 4778999998886433 2233444322 3456 Q ss_pred eEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech Q lcl|NC_015249. 237 SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA 316 (347) Q Consensus 237 ~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~ 316 (347) ..+.++|++|+.|+++|.+. -+-+||+..+- ++.+ .+++++..++. T Consensus 335 ~~~~l~G~pV~~~~~~p~~~-------------------------~~~gd~s~~~~-~~~~--------~~~~i~~~~~~ 380 (418) T protein:vir:10 335 TTPRLWNLPVVETQAMTANE-------------------------FLVGAFSMAAQ-IFDR--------MEIEVLLSTEN 380 (418) T ss_pred CCceecceeeEEcCCCCCCc-------------------------EEEeeccceEE-EEEe--------cceEEEEeccc Confidence 67889999999999998421 11234443322 2222 23455544332 Q ss_pred h--hhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
317 N--FQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 317 ~--~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) . +.- ..+++.+.++..+++|++.+.+.+..+ T Consensus 381 ~~~f~~~~~~~r~~~~~d~~~~~~~a~~~~~~~~~ 415 (418) T protein:vir:10 381 VDDFEKNMVSIRAEERLALAVYRPESFVTGALVEQ 415 (418) T ss_pred chhhhcCceEEEEEEeeccEEecccceEEEEeccC Confidence 2 223 356677889999999999999888888 No 113 >protein:vir:4953 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:108 # MgeName: Sfi19 # Cross-refs: genbank:acc:NP_049929;genbank:gi:9632900;genbank:GeneID:1262076 Probab=99.25 E-value=4.1e-12 Score=83.02 Aligned_cols=284 Identities=13% Similarity=0.056 Sum_probs=162.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccc-cceEEEeec--CcceeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQS-GKSAQFPVL--GRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~-G~tv~i~~i--G~~~~~~~~~g~ 77 (347) ...+.++.......-.....++--.+.-+.|..++...-...+.+++++++..+.+ .-+..+++. +...+..+..|. T Consensus 95 ~~~l~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~ 174 (397) T protein:vir:49 95 KNLVRGRYQNLLDSKTDASGSDAGLTIPQDIQTAIHTLVSQYDSLQEYVNVENVTTLTGSRVYEKWTDITGLANIDDEAG 174 (397) T ss_pred HHHHhcchhHHHHHhhccccccCcccccHhHHHHHHHHHHhhhhHHhhhceeecccCccceEEEeeccCCcceeeecCcc Confidence 00001100000000000111111224558999999988888899998888887754 223445543 334566667677 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++.. ..++-.++++.+.+. +....|.+-=-.++.+|+.+.+.++.+++|++..|+.|+.-.. 
T Consensus 175 ~~~~~-~~~~~~~i~~~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~ai~~G~g--------------- 237 (397) T protein:vir:49 175 KIADV-DDPKLSLIKYTIKRY-AGISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILEAIA--------------- 237 (397) T ss_pred ccccc-cccceeeEEeeeeeE-EeeehhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHHhhcc--------------- Confidence 66532 235556677766554 3344554332334678999999999999999999998863210 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) .+. ..+ .. .-++.|+++...|.....+. -.+|++|..|..|.+-. -.++.|.-...+..|. T Consensus 238 --~~~--~~~--------~~----~~~d~i~~~~~~l~~~~~~~--a~~vmn~~~~~~l~~lk-d~~G~~l~~~~~~~~~ 298 (397) T protein:vir:49 238 --ALP--TKP--------TL----TKWDDIIDLEAKVDPAIKQT--SFFLTNTSGFTALKKVK-NALGDYLMERDVKSPT 298 (397) T ss_pred --ccc--ccc--------cc----ccHHHHHHHHHhhhhhhcCC--CEEEEcHHHHHHHHHhh-cCCCceeeccCcCCCC Confidence 000 000 00 11567788888888877653 46789999999997532 2334454444466777 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech- Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA- 316 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~- 316 (347) -+.++|++|+.+.+-+..+++... ..-+-+||+..+.+ +..++++++..... T Consensus 299 ~~~l~G~PV~~~~~~~~~~~~~~~------------------~~i~~gd~~~~~~~---------~~~~~~~i~~~~~~~ 351 (397) T protein:vir:49 299 GYSIDGFAVKEVADRWLANGTGGA------------------MPLYFGDLKQAVTL---------FDRQHMSLLSTNIGG 351 (397) T ss_pred CceecceeeEEecccccccccCCc------------------eeEEEeeccceEEE---------EeecceEEEEecccc Confidence 789999999986543322211000 00111233333222 22233455544321 Q ss_pred -hhh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
317 -NFQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 317 -~~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+. ...+++..+++..+++|++.+.+.+..+ T Consensus 352 ~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 385 (397) T protein:vir:49 352 GAFETDTTKVRVIDRFDVVATDTEAFVPASFKAI 385 (397) T ss_pred chhhcCceeEEEEeeeCcEEecccceEEEEeecc Confidence 222 3457889999999999999999998887 No 114 >protein:vir:95763 Length: 297 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1578 # MgeName: SMP # Cross-refs: genbank:acc:YP_950590;genbank:gi:119953785;genbank:GeneID:5076833 Probab=99.23 E-value=1.1e-11 Score=80.71 Aligned_cols=277 Identities=12% Similarity=0.082 Sum_probs=160.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) |.-..- ++.-...+++.-.|.-++|..++.+.....|.++.+.++..+.++....+++. +...+..+..|+.+ T Consensus 1 m~~~~~------~~~~~~~t~~~~~lvP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~ 74 (297) T protein:vir:95 1 MTVQTF------NPENVLVSQKKDGTLHKEFTDIIMKEVAQNSLVMQLGQYQEMEGEQEKTVYVQTDGISAYWVNETEKI 74 (297) T ss_pred CCcccc------ccccccccCCCcceechhHHHHHHHHHHhhchhhhhcceeecCCCccEEEEEEcCCceeEEeecCccc Confidence 332211 11111112222236679999999999989999998888877665555666644 56677888888887 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG 159 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~ 159 (347) +.. ++.-.++++...+. .....|.+-=-.++..|+.+.+.++.++++++..|+.++.- .... .+. 
T Consensus 75 ~~~--~~~f~~v~l~~~k~-~~~~~is~ell~ds~~~l~~~i~~~la~ai~~~~d~a~l~G----~g~~-----~~~--- 139 (297) T protein:vir:95 75 KTD--KPEVVPVTLKAHKL-GIILVTSREALNYTWKKFFEDMKPQIVEAFYKKIDEAGLLG----HDTP-----FAN--- 139 (297) T ss_pred ccc--ccceeEEEEeeEEE-EEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHHhcc----cCCc-----ccc--- Confidence 653 46667777766554 33344543222235688999999999999999999988631 1000 011 Q ss_pred CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEE Q lcl|NC_015249. 160 KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIR 239 (347) Q Consensus 160 ~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg 239 (347) +-....+...... . ....|+.|+++..+|..++.+.. .++++|..|..|.+-. -.+..+ +-++..+ T Consensus 140 -gi~~~~~~~~~~~-~----~~~t~~~i~~~~~~l~~~~~~~~--~~v~~~~~~~~L~~l~-d~~G~~-----i~~~~~~ 205 (297) T protein:vir:95 140 -SVAKAAKDANKVI-G----GPINYDNILKLQDALYDADVEPN--AFVSKIQNRSALREAR-DGNKVS-----IYDKAAN 205 (297) T ss_pred -cccccccccceec-c----cccCHHHHHHHHHHhhhccCCcC--EEEEcHHHHHHHHHhh-ccCCce-----eecCCCC Confidence 1110001011110 1 11136778888888988877543 5788999999987422 112222 2234456 Q ss_pred EEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh-- Q lcl|NC_015249. 240 NVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN-- 317 (347) Q Consensus 240 ~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~-- 317 (347) .+.|.+|+.+++.+...+. -+-+||+...- +...+++++..++.. T Consensus 206 ~l~G~Pv~~~~~~~~~~~~-----------------------~~~gd~s~~~~----------~~~~~~~i~~~~~~~~~ 252 (297) T protein:vir:95 206 TIDGITTVDLKSARFEKGD-----------------------LLAGDFDNLIY----------GVPYNITYKISEEGQIS 252 (297) T ss_pred cccceeeEeecCCCCCCce-----------------------EEEEecccEEE----------EEecCeEEEEeeccccc Confidence 7999999987665432111 12234444321 112234455443321 Q ss_pred ------------hhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 ------------FQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 ------------~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++.+ .++...++|.++++|++.+.|+ .| T Consensus 253 ~~~~~~~~~~~~~~~~~~~~r~~~~~d~~v~~~~a~~~l~--~a 294 (297) T protein:vir:95 253 TITNADGTPINLFEQEMIAIRATMDIAVMITKTDAFAKLT--PA 294 (297) T ss_pred cccccCccchhhhhcCcEEEEEEEEeccEeecccceEEEe--ec Confidence 3333 4566688999999999777543 34 No 115 >protein:vir:3991 Length: 404 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:319 # MgeName: BK5-T # Cross-refs: genbank:acc:NP_116499;genbank:gi:14251132;genbank:GeneID:921252 Probab=99.22 E-value=1.3e-11 Score=80.31 Aligned_cols=284 Identities=11% Similarity=0.034 Sum_probs=158.9 Q ss_pred CCcccccc-cccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccccccccc-ceEEEeecC--cceeeeeecC Q lcl|NC_015249. 1 MAKMNGGQ-QIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSG-KSAQFPVLG--RTKAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~-~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G-~tv~i~~iG--~~~~~~~~~g 76 (347) +-+..... ....|.-..+..++--.+.-++|..++.+.....+.+++++++..+.++ -+..+++.. ...+.....| T Consensus 101 ~~~~~~~~~~~e~~a~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg 180 (404) T protein:vir:39 101 VRNPMAFLNTVSSKTETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQYVRVESVSTSNGSRVYEKWTDVTPLTVMDAED 180 (404) T ss_pred HhcchhhhhhhhhhhhhcccccCCceeccHHHHHHHHHHHHhhhhHHhhcceeeccCCcceEEEEeecCCccceeeecCc Confidence 10000000 0000100011111111245689999999999999999999988877653 344444432 2344555666 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) +.++.+ ..+.-.++++.+.+.- ..+.|.+-=-.++.+|+.+.+.++.++++++..|+.|+.-. 
T Consensus 181 ~~~~~~-~~~~f~~i~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~~il~g~--------------- 243 (404) T protein:vir:39 181 GKIPDL-DNPRLTIIKYLIKRYA-GIITATNTLLKDTAENILAWLSSWIAKKVVVTRNQAIIAAM--------------- 243 (404) T ss_pred cccccc-cccceeeEEeeeeeEE-eeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHHhcc--------------- Confidence 665432 2345567777776653 33455443223467889999999999999999999876311 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHH-HhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARA-KLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPST 235 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~-~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~ 235 (347) +.+. ..+. .. -++.|.++.. .++...-+ .=.+|++|..|..|.+-. -.+..|.-...+.. T Consensus 244 --g~~~--~~~~--------~~----~~~~i~~~~~~~~~~~~~~--~a~~v~n~~~~~~L~~lk-d~~G~~l~~~~~~~ 304 (404) T protein:vir:39 244 --GTVP--KKPT--------IA----KFDDVITMINTSVDPAIIA--TSSLLTNQSGLNKLALVK-TAEGKYLLEPDPTK 304 (404) T ss_pred --cccc--cccc--------cc----cHHHHHHHHHHhhhhhhcc--CCEEEEcHHHHHHHHHhh-ccCCceeeccCcCC Confidence 0110 0000 01 1344544433 33333322 236799999999998532 23344544445566 Q ss_pred ceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeec Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARR 315 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d 315 (347) |...+++|++|+.+.+.+.+..+.. ...-|-+||...+.+ +..++++++..+. T Consensus 305 ~~~~~l~G~pV~~~~~~~~~~~~~~------------------~~~~~~gd~~~~~~~---------~~~~~~~i~~~~~ 357 (404) T protein:vir:39 305 PNSYLIKGKKVIVVADRWLPNSGST------------------VYPLYYGDMSQAITL---------FDRENMSLLPTNI 357 (404) T ss_pred CCcceecceeEEEecccccCccCCC------------------ccEEEEEeccccEEE---------EeecceEEEEecc Confidence 7778999999999776543322110 000122344433322 2223455555544 Q ss_pred h--hh--hcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
316 A--NF--QADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 316 ~--~~--~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) . .+ -...+++..++|..+++|++.+.+.+..+ T Consensus 358 ~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 393 (404) T protein:vir:39 358 GAGAFETDTTKIRVIDRFDVKTTDSEALVAGSFTAI 393 (404) T ss_pred chhhhhhceeeEEEEeeeccEEecccceEEEEeecc Confidence 3 22 23457788999999999999999997777 No 116 >protein:vir:2344 Length: 397 # NCBI annotation: gp14 # Family: family:all:507 # MgeID: mge:51 # MgeName: Bxb1 # Cross-refs: genbank:acc:NP_075281;genbank:gi:12657868;genbank:GeneID:920118 Probab=99.22 E-value=4.8e-12 Score=82.64 Aligned_cols=284 Identities=12% Similarity=0.048 Sum_probs=160.5 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) |.--. . .+..+....++.-.+..+++..++.+..++.|.++.+.++..+. +.+.+||+. +...+..+..|+.+ T Consensus 1 ~g~~~---e--~~~~~~~~t~~~~g~l~~~~~~~ii~~l~~~s~i~~l~~~~~~~-~~~~~ip~~~~~~~a~wv~Eg~~~ 74 (397) T protein:vir:23 1 MGFSA---D--HSQIAQTKDTMFTGYLDPVQAKDYFAEAEKTSIVQRVAQKIPMG-ATGIVIPHWTGDVSAQWIGEGDMK 74 (397) T ss_pred CCcCH---H--HHHHhhccCCCCccccchhHHHHHHHHHHhccchhhhcceeecc-CCceEEEEEcCCcceEEecCCccc Confidence 32111 0 11111111111112445667778888888888888888877755 456788876 55566777777777 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG 159 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~ 159 (347) +.. +++-.++++.+-+. ..-..|.+-=-.++.+|+.+.+.++.++++++..|+.++.-.. 
....+.+ T Consensus 75 ~~s--~~~f~~v~l~~~k~-~~~v~iS~ell~ds~~~l~~~i~~~l~~aia~~~d~a~l~G~g--------t~~~~~~-- 141 (397) T protein:vir:23 75 PIT--KGNMTKRDVHPAKI-ATIFVASAETVRANPANYLGTMRTKVATAIAMAFDNAALHGTN--------APSAFQG-- 141 (397) T ss_pred ccc--ccceeEEEEeeEEE-EEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhhccc--------CCccccc-- Confidence 653 45566666666443 2334443322234668999999999999999999998863210 0000110 Q ss_pred CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc-----ccccc Q lcl|NC_015249. 160 KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQA-----LIDPS 234 (347) Q Consensus 160 ~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~-----~~~~~ 234 (347) ........ ........++.++++...|.+..-+ .-..+++|..|..|.+-. -.+..+.- ..... T Consensus 142 ---~~~~~~~~-----~~~~~~~~~~~~~~~~~~l~~~~~~--~a~~vmn~~~~~~L~~lk-d~~G~~i~~~~~~~~~~~ 210 (397) T protein:vir:23 142 ---YLDQSNKT-----QSISPNAYQGLGVSGLTKLVTDGKK--WTHTLLDDTVEPVLNGSV-DANGRPLFVESTYESLTT 210 (397) T ss_pred ---ccccccce-----eeecccchhHHHHHHHHhhhhcccC--CCEEEEcHHHHHHHHHhh-ccCCceeecccccccccc Confidence 00000000 0011122345566777777777654 345799999999998633 22333321 11222 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) .+..+++.|++|+.++++|.+.. --+-+||++... .. ..++.++..+ T Consensus 211 ~~~~~tl~G~Pv~~s~~~~~g~~-----------------------~~~~gDfs~~~i-~~---------~~~i~i~~~~ 257 (397) T protein:vir:23 211 PFREGRILGRPTILSDHVAEGDV-----------------------VGYAGDFSQIIW-GQ---------VGGLSFDVTD 257 (397) T ss_pred cccCceeeeeeEEEeCCCCCCce-----------------------EEEEeecceEEE-EE---------EeceEEEEee Confidence 33446899999999999984221 012345554331 11 1223344433 Q ss_pred chh--------------hhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
315 RAN--------------FQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 315 d~~--------------~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.. ++.| .+++.++++..++||++.+.+..... T Consensus 258 e~~~~~~~~~~~~~~~lf~~d~v~~ra~~r~d~~v~~~~a~~~~~~~~~ 306 (397) T protein:vir:23 258 QATLNLGSQESPNFVSLWQHNLVAVRVEAEYGLLINDVNAFVKLTFDPV 306 (397) T ss_pred eeeeeeccccccceeeeeeccceeEEEEeeeccceecccceEEEeeccc Confidence 321 2333 45677889999999999988887666 No 117 >protein:vir:1433 Length: 435 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:30 # MgeName: phiE125 # Cross-refs: genbank:acc:NP_536362;genbank:gi:17975167;genbank:GeneID:929171 Probab=99.20 E-value=1.4e-11 Score=80.15 Aligned_cols=295 Identities=14% Similarity=0.160 Sum_probs=150.7 Q ss_pred CCccccc--------------ccc--cccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcc-cccccccccceEEEe Q lcl|NC_015249. 1 MAKMNGG--------------QQI--GKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNK-HLVRSIQSGKSAQFP 63 (347) Q Consensus 1 ma~~~~~--------------~~~--~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~-~~~r~i~~G~tv~i~ 63 (347) ++...+. ... ....+....+| .+.-+.+..++.+..+..++++.+ ++..+..+| .+.+| T Consensus 105 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg---~~vP~~~~~~ii~~l~~~~~i~~~~~~~~~~~~~-~~~~p 180 (435) T protein:vir:14 105 LAAARGDAQLASKLAIERGFGEEVAMSLNTLSPGAGG---VLVPENLSSEVIELLRPKSVVRKLGARTLPLSNG-NITIP 180 (435) T ss_pred HHhhcchhhHHHHHHHhhhhhhhhhhhcccCCcCCCc---cccchhHHHHHHHHHhhhchhhhhcceeeecCCC-ceEEE Confidence 0001000 000 00000111111 133478888888877777776665 444444444 57888 Q ss_pred ec-CcceeeeeecCCCCCCccCCCCCceEEEEEEeeeeccccccc--HHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 64 VL-GRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYD--IEDAMNHYDVRSEYTAQLGESLAMAADGAVLAE 140 (347) Q Consensus 64 ~i-G~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd--~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~ 140 (347) +. +...+.....|..++. 
.++.-.++++...+.- .-+.|.+ +++..-..++.+.+.++.++++++..|+.|+.- T Consensus 181 ~~~~~~~a~~v~E~~~~~~--~~~~f~~i~~~~~k~~-~~~~iS~ell~ds~~~~~l~~~i~~~l~~ai~~~~d~a~l~G 257 (435) T protein:vir:14 181 RLKGGAIVGYIGADTDIPT--TQQQFDDLKLTAKKMA-ALVPIANDLIKYAGVNPNVDQIVVGDLTAAIGAREDKAFIRD 257 (435) T ss_pred EEeCCcceeeeccCccccc--cccceeEEEeeeEEEE-EeehhhHHHHHhhccCHHHHHHHHHHHHHHHHHHHHHHhhcc Confidence 76 6566666666666654 3455556666554442 2334432 222222345888899999999999999988621 Q ss_pred HHHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch Q lcl|NC_015249. 141 MAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL 220 (347) Q Consensus 141 ~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~ 220 (347) - .....+.|. ....................++..+.++...+...+.-......|++|..|..|.+-. T Consensus 258 ~--------G~~~~p~Gi----~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~v~n~~~~~~L~~lk 325 (435) T protein:vir:14 258 D--------GTANTPKGL----RFWALPSNVITASDASTLQKIETDLGKVILALENADANLTQPGWIMAPRTFRFLEGLR 325 (435) T ss_pred C--------CCCccccce----eecccccceeccccccchhhHHHHHHHHHHHhhhccccccCCEEEEcHHHHHHHHHhh Confidence 0 001112211 1111111111111222344445556666666666544333456799999999886433 Q ss_pred hhhhhhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhh Q lcl|NC_015249. 221 MPNAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAV 300 (347) Q Consensus 221 ~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av 300 (347) -.++.|.-. ....| .++|++|+.++.+|...+... 
+ ...-+-+||+..+ ++.+ T Consensus 326 -d~~G~~l~~-~~~~g---~l~G~Pv~~~~~~p~~~~~~~------~-----------~~~i~~gd~s~~~--i~~~--- 378 (435) T protein:vir:14 326 -DGNGNKVYP-ELANG---MLKGYPVGKTTQVPINLGETG------K-----------ESEIYFTDFGDVF--IGEE--- 378 (435) T ss_pred -ccCCceecc-CCCCC---eeecceeEeeccccccccCCC------c-----------cceEEEeecccEE--EEEe--- Confidence 233333211 12233 689999999999996532210 0 0011234555432 2222 Q ss_pred hhhhhcceeeeeeechh-----------hh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 301 GTVKLKDMALERARRAN-----------FQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 301 ~~v~~~~~~~e~~~d~~-----------~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+++++..++.. ++ .-.|+..++++..+.||++.+ .....+ T Consensus 379 -----~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~-~l~~~~ 432 (435) T protein:vir:14 379 -----ETLEIDYSKEATYKDADGHMVSAFQRDQTLIRVIAKNDFGPRHVESIA-VLAGVA 432 (435) T ss_pred -----cccEEEEeccccccccccchhhhhhcChhheeeeeeeCceeecccceE-EEecCC Confidence 234444443321 22 356788899999999999654 333333 No 118 >protein:vir:101607 Length: 379 # NCBI annotation: major capsid protein precursor # Family: family:all:585 # MgeID: mge:1646 # MgeName: 11b # Cross-refs: genbank:acc:YP_112497;genbank:gi:53793597;uniprot:Q5ZGF6;genbank:GeneID:3101715 Probab=99.20 E-value=1.8e-11 Score=79.51 Aligned_cols=271 Identities=15% Similarity=0.048 Sum_probs=156.3 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-C--cceeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-G--RTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G--~~~~~~~~~g~ 77 (347) ........ .....++.-.+..+.|..++...-.+.+.++++.++.++. +.++.|++. 
| .........|+ T Consensus 100 ~~~~~~~~-------~~~~~~~~~~~ip~~~~~~ii~~~~~~~~i~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~v~Eg~ 171 (379) T protein:vir:10 100 SIQVKAVG-------DMTLPVNLTGAQPKDYNFDVVLNPSQMLNVSDIVGAVSIS-GGTYTFVRENGAGEGAIGAQVEGA 171 (379) T ss_pred hhhhhhhc-------ccccCCCCccccchhhhhHHHHhHHhhhhHHhhceeeecc-CCceEEEEeecCCCcccccccCCc Confidence 11111111 1111122222456889999998888888888888887765 456777764 2 22334445566 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) ..+. .+++-.++++.+.++-- -..|.+ +-.+...++.+.+.++.+++|++..|+.++.-+. T Consensus 172 ~~~~--~~~~f~~i~~~~~k~~~-~~~iS~-ell~D~~~l~~~i~~~la~~~~~~~~~~~~~g~~--------------- 232 (379) T protein:vir:10 172 TKGQ--KDYDISMIDVNTDFIAG-FTRYSK-KMANNLPFLTSFIPNALRRDYAKAENAAFNAVLA--------------- 232 (379) T ss_pred cccc--cccceeeeEeeeeeEEe-eehhhH-HHHhhHHHHHHHHHHHHHHHHHHHHHHHHhcccc--------------- Confidence 6543 34556666666655432 223422 2233334577778888999999999987753210 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccc--ccccc Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQAL--IDPST 235 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~--~~~~~ 235 (347) .+......+.+ ....++.|.++...+.....+.. .+|++|..|..|.+-. -.++.|... ..... 
T Consensus 233 --~~~~~~~~~~~---------~~~~~d~i~~~~~~~~~~~~~~~--~~vmn~~~~~~l~~lk-d~~G~~l~~~~~~~~~ 298 (379) T protein:vir:10 233 --ANATASTEIIT---------NKNKVEMLINEIAKQENLDFPVT--AIVLRPTDYYDILVTQ-KSVGAGYGLPGVVTQD 298 (379) T ss_pred --ccccccccccc---------CcccHHHHHHHHHhhhhccCCCC--EEEEcHHHHHHHHHhh-ccCCceeccCCccCCC Confidence 00000000000 11124667777777777766543 4668999999886533 233444332 22345 Q ss_pred ceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeec Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARR 315 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d 315 (347) |...+++|++|+.|+.+|.+. -+-+||+... +++.+ +++++.+++ T Consensus 299 ~~~~~l~G~pvv~s~~~~ag~-------------------------~~~gdf~~~~-~~~~~---------~~~i~~~~~ 343 (379) T protein:vir:10 299 NGVLRINGIPLFRATWLAANK-------------------------YYVGDWTRVT-KVTTE---------GLSLEFSEV 343 (379) T ss_pred CCcceecceeeEecCCCCCCc-------------------------eEEeecccEE-EEEEe---------ceEEEEeec Confidence 666789999999999887321 1234555543 33332 345566655 Q ss_pred hh--hhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 316 AN--FQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 316 ~~--~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +. 
+..+ .+++..++|..++||++.+.+.+..= T Consensus 344 ~~~~f~~~~~~~r~~~R~~~~v~~p~a~v~~~~~~~ 379 (379) T protein:vir:10 344 EGTNFVKNNITARIEAQVALAVEQPAALIFGDFTAV 379 (379) T ss_pred ccccccCCcEEEEEEEEeccEEecCccEEEEEecCC Confidence 43 3333 55667899999999999888777766 No 119 >protein:vir:80376 Length: 435 # NCBI annotation: gp6, major capsid head protein # Family: family:all:21 # MgeID: mge:1881 # MgeName: phi644-2 # Cross-refs: genbank:acc:YP_001111085;genbank:gi:134288639;genbank:GeneID:4960624 Probab=99.18 E-value=2.7e-11 Score=78.49 Aligned_cols=295 Identities=15% Similarity=0.149 Sum_probs=155.2 Q ss_pred CCccccc--------------cc--ccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcc-cccccccccceEEEe Q lcl|NC_015249. 1 MAKMNGG--------------QQ--IGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNK-HLVRSIQSGKSAQFP 63 (347) Q Consensus 1 ma~~~~~--------------~~--~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~-~~~r~i~~G~tv~i~ 63 (347) |+...+- .. .....+-...+| .+.-+.+..++.+..+..+.++.+ .++.+...| .+.+| T Consensus 105 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~gg---~lvP~~~~~~ii~~l~~~~~i~~~~~~~v~~~~~-~~~~p 180 (435) T protein:vir:80 105 LAAARGDAQLASKLAIERGFGEEVAMSLNTLSPGAGG---VLVPENLSSEVIELLRPKSVVRKLGARTLPLSNG-NITIP 180 (435) T ss_pred HHhccchhHHHHHHHHhhhhhhhhhhhhcccCCCCCc---cccchhHHHHHHHHHhhhchhhhccceeeecCCC-ceEEE Confidence 1111100 00 000001111111 134478888888888777777765 344343344 47777 Q ss_pred ec-CcceeeeeecCCCCCCccCCCCCceEEEEEEeeeeccccccc--HHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 64 VL-GRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYD--IEDAMNHYDVRSEYTAQLGESLAMAADGAVLAE 140 (347) Q Consensus 64 ~i-G~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd--~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~ 140 (347) +. |.+.+.....|+.++. 
.++.-.++++...+.- ..+.|.+ +++...++++.+.+.++.++++++..|+.++.- T Consensus 181 ~~~~~~~a~~v~E~~~~~~--~~~~f~~i~~~~~k~~-~~~~is~ell~ds~~~~~l~~~i~~~l~~a~~~~~d~a~l~G 257 (435) T protein:vir:80 181 RLKGGAIVGYIGADTDIPT--TQQQFDDLKLTAKKMA-ALVPIANDLIKYAGVNPNVDQIVVGDLTAAIGAREDKAFIRD 257 (435) T ss_pred EEeCCcceeeeccCccccc--cccceeeEEEeeEEEE-EeehhhHHHHHhhcccHHHHHHHHHHHHHHHHHHHHHHhhcc Confidence 66 5566666667776654 3455566666665542 3334422 222223567889999999999999999988631 Q ss_pred HHHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch Q lcl|NC_015249. 141 MAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL 220 (347) Q Consensus 141 ~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~ 220 (347) + .+...+.|.-... +...............++..+.++...|...+.....-..|++|..|..|.+-. T Consensus 258 -------~-G~~~~p~Gi~~~~----~~~~~~~~~~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~lk 325 (435) T protein:vir:80 258 -------D-GTANTPKGLRFWA----LPGNVITASDGSTLQKIETDLGKAILALENADANLTQPGWIMAPRTFRFLEGLR 325 (435) T ss_pred -------C-CCCCcccceeecc----cccceeecccccchhhHHHHHHHHHHHhhccccccccCEEEEcHHHHHHHHhhh Confidence 0 0111122211110 001111111122233445566777777777666444445689999998885422 Q ss_pred hhhhhhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhh Q lcl|NC_015249. 221 MPNAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAV 300 (347) Q Consensus 221 ~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av 300 (347) -.++.|.-. ....| +++|.+|+.++++|...+.. 
.+ ...-|-+||+..+ ++ T Consensus 326 -d~~G~~l~~-~~~~~---~l~G~pv~~~~~~p~~~~~~------~~-----------~~~i~~gd~s~~~--i~----- 376 (435) T protein:vir:80 326 -DGNGNKVYP-ELANG---MLKGYPVGKTTQVPINLGEA------GK-----------ESEIYFTDFGDVF--IG----- 376 (435) T ss_pred -ccCCceecc-CCCCC---eEeeeeeEEeccccccccCC------CC-----------cceEEEEEcccEE--EE----- Confidence 233333211 12233 69999999999999643221 00 0112334555433 11 Q ss_pred hhhhhcceeeeeeechh-----------hh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 301 GTVKLKDMALERARRAN-----------FQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 301 ~~v~~~~~~~e~~~d~~-----------~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ...+++++..++.. ++ ...|+...+|+..+.||++.+. ....+ T Consensus 377 ---~~~~~~i~~~~~~~~~~~~~~~~~~f~~n~~~~r~~~r~d~~~~~~~a~~~-l~~~~ 432 (435) T protein:vir:80 377 ---EEETLEIDYSKEATYKDADGHMVSAFQRDQTLIRVIAKNDFGPRHVESIAV-LSGVA 432 (435) T ss_pred ---eecceEEEEeccccccccccchhhhhhcCcceeeeeeeeCcEeecccceEE-EeccC Confidence 12345555554432 22 3567888899999999996553 34444 No 120 >protein:vir:4226 Length: 326 # NCBI annotation: observed 35.2Kd protein # Family: family:all:507 # MgeID: mge:89 # MgeName: L5 # Cross-refs: genbank:acc:NP_039681;swissprot:sw:q05223;genbank:gi:9625447;uniprot:Q05223;genbank:GeneID:2942929 Probab=99.18 E-value=2.2e-11 Score=79.04 Aligned_cols=285 Identities=12% Similarity=0.038 Sum_probs=154.5 Q ss_pred CCc------------ccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-Cc Q lcl|NC_015249. 1 MAK------------MNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GR 67 (347) Q Consensus 1 ma~------------~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~ 67 (347) |+- ..-+.+++|- .+| .+..+++..++.+..++.+.++.+.++..+. ++..+||+. +. 
T Consensus 1 ~~~~~~r~~~~~~~~e~~a~~~~~~-----~~g---~~ip~~~~~~ii~~~~~~s~i~~~~~~~~~~-~~~~~~p~~~~~ 71 (326) T protein:vir:42 1 MAVNPDRTTPFLGVNDPKVAQTGDS-----MFE---GYLEPEQAQDYFAEAEKISIVQQFAQKIPMG-TTGQKIPHWTGD 71 (326) T ss_pred CCCCccchhhhcCcchhhheecccc-----CCc---ceechhhHHHHHHHHHhcchhhhhcceeecc-CCceEEEEEeCC Confidence 111 1111111111 111 1456888999999998889888877776654 556778765 55 Q ss_pred ceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 68 TKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 68 ~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) ..+..+..|+.++. .+++-.++++...+. ..-+.|.+-=-.++.+|+.+.+.++.++++++..|+.++.-- . T Consensus 72 ~~a~~v~Eg~~~~~--~~~~f~~i~~~~~k~-~~~v~iS~ell~~s~~~~~~~i~~~l~~a~~~~~d~a~l~G~----g- 143 (326) T protein:vir:42 72 VSASWIGEGDMKPI--TKGNMTSQTIAPHKI-ATIFVASAETVRANPANYLGTMRTKVATAFAMAFDNAAINGT----D- 143 (326) T ss_pred cceEEecCCccccc--cccceeEEEEeeEEE-EEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhhccc----C- Confidence 67777778888765 356667777776554 344455443334567899999999999999999999886311 0 Q ss_pred ccccccccccccCcce--eecccccccccchhhhHHHHHHH--HHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhh Q lcl|NC_015249. 148 PSASDENIAGLGKAHV--LEVGKQSELRGDQVKLGQAIIAQ--LTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPN 223 (347) Q Consensus 148 ~~~~~~~~~~~~~g~~--i~~~~~~~~~~~~~~~~~~~~~~--l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~ 223 (347) ...+.+...... ..+..... ...+...+.. +..+...+. +.....-..+++|..+..|.+-. -. 
T Consensus 144 ----s~~p~gi~~~~~~~~~~~~~~~-----~~~~~~~~~~~~~~~~~~~~~--~~~~~~a~~v~n~~~~~~L~~lk-d~ 211 (326) T protein:vir:42 144 ----SPFPTFLAQTTKEVSLVDPDGT-----GSNADLTVYDAVAVNALSLLV--NAGKKWTHTLLDDITEPILNGAK-DK 211 (326) T ss_pred ----CCccccccccccccceeecccc-----cccccchhHHHHHHHHHhhhh--hhccCccEEEEeHHHHHHHHHhh-cc Confidence 001111000000 00000000 0001111111 122222222 22233445679999999997522 22 Q ss_pred hhhhcccccc-----ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechh Q lcl|NC_015249. 224 AANYQALIDP-----STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRS 298 (347) Q Consensus 224 ~~~~~~~~~~-----~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~ 298 (347) ++.+.-.... .....+.+.|++|+.++.+|..... -+-+||++.. +..+. T Consensus 212 ~G~~l~~~~~~~~~~~~~~~~~l~G~pv~~~~~~~~~~~~-----------------------~~~Gd~s~~~--~~~~~ 266 (326) T protein:vir:42 212 SGRPLFIESTYTEENSPFRLGRIVARPTILSDHVASGTVV-----------------------GYQGDFRQLV--WGQVG 266 (326) T ss_pred CCceeeccccccCccccccCceeeeeeEEEcCCCCCCceE-----------------------EEEeecceEE--EEEec Confidence 2333221112 2223357999999999999842100 1233555443 22222 Q ss_pred hhhhhhhcceeeeeeechh--------------hhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 299 AVGTVKLKDMALERARRAN--------------FQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 299 Av~~v~~~~~~~e~~~d~~--------------~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++++.+.+.. ++. 
..+++.+.++.++.||++.+.|+-..+ T Consensus 267 --------~~~v~~~~e~~~~~~~~~~~~~~~~~~~d~~~~r~~~~~d~~v~~~~a~~~l~~~~~ 323 (326) T protein:vir:42 267 --------GLSFDVTDQATLNLGTPQAPNFVSLWQHNLVAVRVEAEYAFHCNDKDAFVKLTNVDA 323 (326) T ss_pred --------ceEEEEeecceeeecccccccchhhhhcCcEEEEEEEEeccEEecccceEEEeeccc Confidence 23333333221 333 345788899999999998877776666 No 121 >protein:vir:100172 Length: 394 # NCBI annotation: putative major head protein # Family: family:all:21 # MgeID: mge:1524 # MgeName: phi AT3 # Cross-refs: genbank:acc:YP_025031;genbank:gi:48697264;genbank:GeneID:2948270 Probab=99.18 E-value=1.6e-11 Score=79.82 Aligned_cols=277 Identities=10% Similarity=0.047 Sum_probs=154.0 Q ss_pred CCc-ccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--CcceeeeeecCC Q lcl|NC_015249. 1 MAK-MNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKAAYLQPGE 77 (347) Q Consensus 1 ma~-~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~~~~~~g~ 77 (347) |-. .....+ .......++--.+.-++|..++.......+.+++++++..+.++ +.++++. +...+.....+. T Consensus 100 l~~~~~~~~~----~~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~E~~ 174 (394) T protein:vir:10 100 IHSHGKVIDN----AAGHVTSTEAGVLIPEEIIYDPTAEVNSVVDLSTLVTKTPVTTP-KGTYPILKRATDRFSSVAELA 174 (394) T ss_pred Hhccchhhhh----hhcccccccCceeccHHHHHHHHHHHHhhhhhhhhceeeeccCC-ceEEEEEecCCCccccccccc Confidence 100 000000 00001111111234589999999999888999999888876543 4555543 445555555555 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) ..+.. ..+.-.++++.+-+.- .-..|.+-=-.++.+|+.+.+.+++++++++..|+.|+.... 
T Consensus 175 ~~~~~-~~~~~~~v~l~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~il~g~g--------------- 237 (394) T protein:vir:10 175 ENPAL-AEPEFEQVDWSVSTYR-GAIPLSEEAIADSAVDLTSLVGQSINEKSVNTYNAMIAPVLQ--------------- 237 (394) T ss_pred ccccc-ccccceeEEeeeeeeE-eeehhHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhhccc--------------- Confidence 54421 2344566666664442 223444333334678999999999999999999998763221 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHH-HhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc----ccc Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARA-KLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQA----LID 232 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~-~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~----~~~ 232 (347) .+....+ .+ ...++.|.++.. .++... .-.+|++|..|..|.+-. -.++.|.- ... T Consensus 238 --~~~~~~~--~~----------~~~~d~l~~~~~~~~~~~~----~a~~vmn~~~~~~l~~lk-d~~G~~i~~~~~~~~ 298 (394) T protein:vir:10 238 --SFTAKAT--TT----------DTLVDSLKHILNVDLDPAY----SRALVVTQSLFNTLDTLK-DKNGRYLLHDASDSI 298 (394) T ss_pred --ccccccc--cc----------cccHHHHHHHHHhhhhhhc----cCEEEecHHHHHHHHHhh-ccCCCeeeecccccc Confidence 1111000 00 112445555433 333332 236789999999987532 22333321 112 Q ss_pred cccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeee Q lcl|NC_015249. 233 PSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALER 312 (347) Q Consensus 233 ~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~ 312 (347) ...|.-++++|.+|+.+++......++. . .-+-+||++.+.+ + ...+++++. T Consensus 299 ~~~~~~~~L~G~PV~~~~~~~~~~~~~~-----------~--------~i~~gd~s~~~~~-~--------~~~~~~v~~ 350 (394) T protein:vir:10 299 TDGTAKGTVLGVPVYVVGDALLGSAAGD-----------Q--------KAFVGDLKRGVLF-A--------DRQQVTLAW 350 (394) T ss_pred ccCCcccccccceeEEecccccCCCCCc-----------e--------EEEEeeccccEEE-E--------eecceEEEE Confidence 2345556899999998766432221100 0 0122345543322 2 223455665 Q ss_pred eechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
313 ARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 313 ~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++. .+...+++.++++..+.+|++.+.+....+ T Consensus 351 ~~~~-~~~~~~~~~~r~d~~~~~~~ai~~~~~~~~ 384 (394) T protein:vir:10 351 EDSK-IYGRYLGAAFRFGVKQADSNAGYFVTNTDA 384 (394) T ss_pred eccc-ccceeEEEEEEeccEEeccccEEEEEeecc Confidence 5443 344678999999999999999988888777 No 122 >protein:vir:95376 Length: 425 # NCBI annotation: phage major capsid protein # Family: family:all:635 # MgeID: mge:1567 # MgeName: GBSV1 # Cross-refs: genbank:acc:YP_764476;genbank:gi:115334630;genbank:GeneID:5179263 Probab=99.16 E-value=1.2e-11 Score=80.38 Aligned_cols=291 Identities=9% Similarity=0.061 Sum_probs=154.1 Q ss_pred CCcccccccc-----cccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcc-eeeeee Q lcl|NC_015249. 1 MAKMNGGQQI-----GKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRT-KAAYLQ 74 (347) Q Consensus 1 ma~~~~~~~~-----~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~-~~~~~~ 74 (347) |......... ..........++--.+.-+++..++.+..+..+.+++++++..+. |+ ..+|+.+.. .+.... T Consensus 119 ~~~~~~~~~~~~~~~~~~~~~~~~~~~gg~~vP~~~~~~Ii~~l~~~~~i~~~~~~~~~~-g~-~~ip~~~~~~~a~~v~ 196 (425) T protein:vir:95 119 LKTGEYYKRSEVVEFYEKFRNLRAVAGGELTIPEVVVNRIMDIMGDYTTLYPLVDKIRVK-GT-TRILVDTDTSPATWIE 196 (425) T ss_pred HhhhhhhhhhHHHHHHHHHHhhcccccCceeccHHHHHHHHHHHHhhhhHHHhhceeecC-ce-eEEEEecCCccccccc Confidence 1000000000 000001111111122455889999999999999999998887754 44 467766444 555566 Q ss_pred cCCCCCCccCCCCCceEEEEEEeeeecc-cccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 75 PGENLDDKRKDMKHTERTINIDGLLTAD-VLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 75 ~g~~~~~~~~~~~~~~~~l~ID~~~~~~-~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) .|..++.+. .+.-+++++.. .++.. +.|.+-=-.++..++.+.+.++.++++++..|+.|+.-- ..... 
T Consensus 197 E~~~~~~~~-~~~f~~i~l~~--~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~il~G~-------G~~~~ 266 (425) T protein:vir:95 197 QSGALPTGD-VGTIASIDFDG--FKVGKVTFVDNYLLQDSIINLDDYVTKKIARAIAKALDLAIVKGT-------GAANK 266 (425) T ss_pred ccccccccc-ccccceeeeeh--eeeeeeehhhHHHHhccHHHHHHHHHHHHHHHHHHHHHHHhhccC-------CCCcc Confidence 676654321 12234444444 44443 344332223455689999999999999999999886310 00000 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHH-HHHHHhcch--hhhhhhhccc Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPD-NYSAILAAL--MPNAANYQAL 230 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~-~~~~Ll~~~--~~~~~~~~~~ 230 (347) .|.|-.-.+....... .......++.|+++...+.....+...-+.+++|. +|..|.+-. +-.+..|... T Consensus 267 ----~p~Gil~~~~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~~l~~l~~~kd~~g~~i~~ 339 (425) T protein:vir:95 267 ----QPLGIIPSLPPENQVT---VEADNNLLKNLVKQIGLIDTGDDSVGEIVAVMKRSTYYNRLVEFSIQVDSNGNVVGK 339 (425) T ss_pred ----ccceeecccccccccc---cccccchHHHHHHHHHhhhhhccccCceEEEEeChHHHHHHHHHHhhcCCCCceeec Confidence 1111111111111000 00112245667777777766655444434455555 555453222 1223334322 Q ss_pred cccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceee Q lcl|NC_015249. 231 IDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMAL 310 (347) Q Consensus 231 ~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~ 310 (347) ...+....+.|.+|+.++++|... -+-+||+.. .+.. ..++++ T Consensus 340 --~~~~~~~~l~G~pvv~~~~~~~~~-------------------------i~~Gd~~~~-~~~~---------~~~~~i 382 (425) T protein:vir:95 340 --LPNLRTPDLLGLRVVFNNFLDDDT-------------------------VLFGEFEQY-TLVE---------RENITI 382 (425) T ss_pred --cCCCCCccccceeeEEcCcCCCcc-------------------------EEEEecccE-EEEe---------ecceEE Confidence 234556689999999999998421 122345442 2221 223555 Q ss_pred eeeechhhh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
311 ERARRANFQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 311 e~~~d~~~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +...+..+. ...+++..+++.++.+|++.+.+.+..- T Consensus 383 ~~~~~~~f~~~~~~~~~~~r~d~~~~~~~a~~~~~i~~~ 421 (425) T protein:vir:95 383 DSSTHVKFTEDQTAFRGKGRFDGKPVKPEAFVLVTITDP 421 (425) T ss_pred EeecccccccCceEEEEEEeeCcEeecccceEEEEecCc Confidence 555444322 2457777899999999999998887763 No 123 >protein:vir:1383 Length: 421 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:314 # MgeName: phi3626 # Cross-refs: genbank:acc:NP_612835;genbank:gi:20065969;genbank:GeneID:935826 Probab=99.15 E-value=1.1e-11 Score=80.63 Aligned_cols=277 Identities=13% Similarity=0.050 Sum_probs=161.6 Q ss_pred CCccccccc-ccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcc---eeeeeecC Q lcl|NC_015249. 1 MAKMNGGQQ-IGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRT---KAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~-~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~---~~~~~~~g 76 (347) +..+.+... ...|.+....+| -.+.-+++..++....+..+.++++++...+.++ +.++++.... .+.....| T Consensus 101 ~~~~~~~~~~~~~ra~~t~~~g--g~liP~~~~~~Ii~~~~~~~~l~~l~~~~~~~~~-~~~~~~~~~~~~~~~~~~~E~ 177 (421) T protein:vir:13 101 SKTIRGIQLSEEERDIMSSTNN--GAVIPQEFVNEFEKLKEGYPSLKEHCHVIPVNRN-AGKMPVRAGASVDKLANLAKD 177 (421) T ss_pred HHhhhccchhHHHhhccccCCc--ceecchhhHHHHHHHHHhhhhhhhhceeeeccCC-ceEEEEeecCCccceeecccc Confidence 000000000 001221111112 1234588999998888888888988888776544 4566544222 23445555 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) ..++. .++.-.++++.+.+.. .-+.|.+-=-.++.+|+.+.+.++++++++...|..++..+.. 
T Consensus 178 ~~~~~--s~~~f~~i~~~~~k~~-~~v~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~i~~~~~g------------- 241 (421) T protein:vir:13 178 TELVK--AMLKTQPMAYDIDDYG-LLAPIDNSLLEDSEINFLEFVNEEFAEFAVNTENAEIVKQAKA------------- 241 (421) T ss_pred ccccc--cccceeEEEeeeeeeE-eehhhhHHHHhhhHHHHHHHHHHHHHHHHHHHhhhhHhhhhhh------------- Confidence 55543 2455566666665542 3334543322446688999999999999999999887643311 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTG 236 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G 236 (347) .. . .. ... -|+.|+++...|..+..+. -.+|++|..|..|.+- +-.++.|.-. ....| T Consensus 242 -----~~-~---~~-----~~~----~~d~i~~~~~~l~~~~~~~--a~~v~n~~~~~~l~~l-kd~~G~~i~~-~~~~~ 299 (421) T protein:vir:13 242 -----VL-A---EE-----TIN----DYAGLVKTINSLVPNARKR--AIIVTNSDGRAYLDGL-MDKQGRPLLK-ELSDG 299 (421) T ss_pred -----cc-c---cc-----ccc----chHHHHHHHHHhhhhhcCC--CEEEEcHHHHHHHHHh-hcCCCceeec-CcCCC Confidence 00 0 00 001 1566777778887766553 2567899999998643 2233444322 24566 Q ss_pred eEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech Q lcl|NC_015249. 237 SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA 316 (347) Q Consensus 237 ~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~ 316 (347) ....++|.+|+.++++|..+++. ..-+-+||++.+.+ ....+++++..++. T Consensus 300 ~~~tl~G~pV~~~~~~~~~~~~~--------------------~~~~~gd~~~~~~~---------~~~~~~~v~~~~~~ 350 (421) T protein:vir:13 300 GDLVFKGRPVIELEESIFDVGDE--------------------TKFIVSDFKTLIKF---------MDRKQYLIDQSKEA 350 (421) T ss_pred CCceecceeeEEeccccccCCCc--------------------eEEEEEeccccEEE---------EEecceEEEeeccc Confidence 67789999999999988533221 01233444443322 22335777777776 Q ss_pred hhhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
317 NFQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 317 ~~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+.- ..|++..+++..+.+|+++..++.... T Consensus 351 ~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 383 (421) T protein:vir:13 351 GYTKNETIARIIERFDVNSPLDKSSDAEKIRKF 383 (421) T ss_pred ccccCeeEEEEEeeecceeecchhhheeeeccc Confidence 5554 357888999999999998765544432 No 124 >protein:vir:2504 Length: 305 # NCBI annotation: major capsid subunit gp9 # Family: family:all:507 # MgeID: mge:53 # MgeName: TM4 # Cross-refs: genbank:acc:NP_569745;genbank:gi:18496895;genbank:GeneID:932268 Probab=99.15 E-value=2.9e-11 Score=78.36 Aligned_cols=280 Identities=12% Similarity=0.010 Sum_probs=150.1 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~~ 79 (347) ||.++.... . .|.-+++..++.+..++.|.++.+.++.++. +.+..||+. +...+.-+..|+.. T Consensus 1 ma~~t~~~g------g--------~liP~~~~~~Ii~~~~~~s~l~~l~~~~~~~-~~~~~~p~~~~~~~a~wv~E~~~~ 65 (305) T protein:vir:25 1 MADISRAEV------A--------SLIQEAYSDTLLAAAKQGSTVLSAFQNVNMG-TKTTHLPVLATLPEADWVGESATD 65 (305) T ss_pred CCCccCCcc------c--------eecCHHHHHHHHHHHHhhchhhhhcceeecc-CCcEEEEEEeCCcceEEeeccccc Confidence 887764221 1 1455889999999999999999999888865 446778765 44566666666654 Q ss_pred CCcc---CCCCCceEEEEEEeeeecc-cccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 80 DDKR---KDMKHTERTINIDGLLTAD-VLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 80 ~~~~---~~~~~~~~~l~ID~~~~~~-~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) +... .+++-.++++. ..++.. 
..|.+-=-.++.+|+.+.+.++.++++++..|+.++.--- .+....+ T Consensus 66 ~~~~~~~s~~~f~~i~~~--~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~~~a~~~d~a~~~G~g------~~~~~~~ 137 (305) T protein:vir:25 66 PKGVKPTSKVTWANRTLV--AEEIAVIIPVHENVIDDATVAVLTEVAELGGQAIGKKLDQAVIFGTD------KPASWVS 137 (305) T ss_pred ccccccccccceeeEEee--eEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHhhhheeccC------CCCCccc Confidence 3211 12223334444 344333 3343322224568899999999999999999998863210 0000000 Q ss_pred c-cccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccc Q lcl|NC_015249. 156 A-GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPS 234 (347) Q Consensus 156 ~-~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~ 234 (347) . ..+.. .........+........+++.+..+...+....-... -++++|..|..|.+-. -.+..+ .+. T Consensus 138 ~~~~~~~---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~v~~~~~~~~l~~lk-d~~G~~----i~~ 207 (305) T protein:vir:25 138 PALIPAA---VTAGQAVEVVGGVANESDIVGATNRAAKAVASAGWAPD--TLLSSLALRYEVANIR-DANGNP----VFR 207 (305) T ss_pred ccccccc---ccccccccccccchhhhHHHHHHHHHHHhhhhcccccc--eeEecHHHHHHHHHhh-ccCCce----eec Confidence 0 00000 00001111112222233345555555555444322211 2677999999886432 122222 233 Q ss_pred cceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeee Q lcl|NC_015249. 235 TGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERAR 314 (347) Q Consensus 235 ~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~ 314 (347) .+ .++|++|+.++++|...... .-+-+||++... .. 
..+++++..+ T Consensus 208 ~~---~l~G~Pv~~~~~~~~~~~~~---------------------~~~~gd~s~~~i-~~---------~~~~~i~~~~ 253 (305) T protein:vir:25 208 DD---SFAGFRTFFNRNGAWDADAA---------------------IEVIADSSRVKI-GV---------RQDITVKFLD 253 (305) T ss_pred CC---cccccceEEcCccCCCCCcc---------------------EEEEEecceEEE-EE---------ecCeEEEEee Confidence 33 69999999999987432110 112345544321 11 1223333332 Q ss_pred ch----------hhhc--ceeeeeeeecccccccceEEEEEEc-------CC Q lcl|NC_015249. 315 RA----------NFQA--DQIIAKYAMGHGGLRPEACGALVFN-------KA 347 (347) Q Consensus 315 d~----------~~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~-------~a 347 (347) +. .++. -.+++..++|..++||++++.+... +| T Consensus 254 ~~~~~~~~~~~~~~~~~~~~~R~~~r~~~~v~~p~a~v~~~~~~~~~~~pa~ 305 (305) T protein:vir:25 254 QATLGTGENQINLAERDMVALRLKARFAYVLGVSATAQGANKTPVAVVAPAA 305 (305) T ss_pred eeeeecCCceeeeeecCcEEEEEEEeecceeeCcccEEEEccccccccCCCC Confidence 21 1222 3456778899999999988876543 23 No 125 >protein:vir:5739 Length: 366 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:122 # MgeName: PY54 # Cross-refs: genbank:acc:NP_892050;genbank:gi:33770513;interpro:IPR006444;uniprot:Q7Y410;genbank:GeneID:1732928 Probab=99.15 E-value=3.2e-11 Score=78.09 Aligned_cols=293 Identities=15% Similarity=0.132 Sum_probs=146.3 Q ss_pred CCcccccc-----cccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcc-cccccccccceEEEeec-Ccceeeee Q lcl|NC_015249. 1 MAKMNGGQ-----QIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNK-HLVRSIQSGKSAQFPVL-GRTKAAYL 73 (347) Q Consensus 1 ma~~~~~~-----~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~-~~~r~i~~G~tv~i~~i-G~~~~~~~ 73 (347) ||...++. .+++-.+. +| .|.-+++.+++.+.....++++.+ .++-...+| .+.+|+. +...+... 
T Consensus 52 ~a~~~~~~~~~~~a~~~~~~~---Gg---~lvP~~~~~~ii~~l~~~s~l~~lg~~~v~~~~g-~~~~p~~t~~~~a~wv 124 (366) T protein:vir:57 52 FAATELGDTGLSMAISTAAGS---GG---ALIPQNMQNEVIELLRDRTVVRILGARSIPLPNG-NLSMPRLSGGATAGYV 124 (366) T ss_pred HHHHhhcchhhhhhccccccC---Cc---cccchhHHHHHHHHHhhhcchhhhceeeeecCCC-ceEEEEEeCCcceeee Confidence 11111110 01111111 11 134478899998887777877765 444343445 4777765 56677777 Q ss_pred ecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 74 QPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 74 ~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) ..|+.++.+ +++-+++++..-+. +.-..|.+-=-.++.+++.+.+.+++++++++..|+.++.-- .++. T Consensus 125 ~E~~~~~~s--~~~f~~i~~~~~k~-~~~~~iS~ell~ds~~~~~~~i~~~l~~a~~~~~d~a~l~G~--------G~~~ 193 (366) T protein:vir:57 125 GEGKDVVAT--GATFDDVKLSAKTM-IALVPVSNQLIGRAGFNVEQLLLGDILSAIATREDKAFLRDD--------GTGD 193 (366) T ss_pred ccCcccccc--ccceeEEEEeeEEE-EEeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHhhccC--------CCCc Confidence 778777653 45556666655443 233344322223567899999999999999999999886310 0111 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHH-HHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccc Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTL-ARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALID 232 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~-a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~ 232 (347) .+.|.-...... ......+........ ++.+++ +...+...+.-...-..+++|..|..|.+-. -.++.+.-. . 
T Consensus 194 ~p~Gi~~~~~~~--~~~~~~~~t~~~~~~-~~~~~~~~~~~~~~~~~~~~~a~~vmn~~~~~~L~~lk-d~~G~~l~~-~ 268 (366) T protein:vir:57 194 TPKGMKAVATAA--NRLVAWTGTAINLTT-IDEYLDSLILKHMDSNSNMIRCGWGLSNRTYMTLFGLR-DGNGNKVYP-E 268 (366) T ss_pred cccceeeccccc--cceeeccccccchhh-HHHHHHHHHHhhhccccccccCEEEecHHHHHHHHhhh-ccCCceecc-C Confidence 122211000000 000000000111111 121221 1222222222223334579999999886532 222223211 1 Q ss_pred cccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeee Q lcl|NC_015249. 233 PSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALER 312 (347) Q Consensus 233 ~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~ 312 (347) ... +.++|++|+.|+++|...+.. .+ ...-|-+||+... +.. ..+++++. T Consensus 269 ~~~---g~l~G~Pvv~s~~ip~~~~~~------~~-----------~~~i~~gdfs~~~--i~~--------~~~i~i~~ 318 (366) T protein:vir:57 269 MSQ---GILKGYPIQRTSAIPANLGDD------GN-----------ESEIYFCDFNDVV--IGE--------DGMMKVDF 318 (366) T ss_pred CCC---CeecceeeEEccccccccccC------CC-----------ccEEEEEecceEE--EEE--------ecceEEEE Confidence 223 469999999999999643221 00 0112335555543 111 22344555 Q ss_pred eechh-----------hhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 313 ARRAN-----------FQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 313 ~~d~~-----------~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++.. ++.| .|+....++..++||++.+ +.+... 
T Consensus 319 ~~ea~~~~~~g~~~~~f~~~~~~iR~~~~~d~~v~~~~a~~-~lt~~~ 365 (366) T protein:vir:57 319 STEATYKDADGQLVSAFARNQSLIRVVTEHDIGFRHPEGLV-LGTGVI 365 (366) T ss_pred eeccccccccccchhhhhcCceeEEeeeeeCcEeeccccEE-EEeccc Confidence 44432 2333 5677788899999998655 333333 No 126 >protein:vir:81227 Length: 413 # NCBI annotation: gp6, major capsid protein # Family: family:all:585 # MgeID: mge:1893 # MgeName: BFK20 # Cross-refs: genbank:acc:YP_001456736;genbank:gi:157168379;hssp:P49861;interpro:IPR006444;uniprot:Q9MBJ9;genbank:GeneID:5580350 Probab=99.14 E-value=4.6e-11 Score=77.26 Aligned_cols=290 Identities=12% Similarity=0.055 Sum_probs=155.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCc-----ceeeeeec Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGR-----TKAAYLQP 75 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~-----~~~~~~~~ 75 (347) +....... ...+-..++..++.-.+.-++|..++.+.....+.+++++++..+. +.++.+++... ..+..... T Consensus 105 ~~~~~~~~-~~~~~~~~~~~~~~~~~vp~~~~~~ii~~~~~~~~l~~~~~~~~~~-~~~~~~~~~~~~~~~~~~a~~v~E 182 (413) T protein:vir:81 105 YVAPRVKA-ASDPASTATLTDEFQGGYGTTWNRNIIYRRREKLVVADLMDNLTMT-NTTIKYLMEKANRVVEGGFKTVAE 182 (413) T ss_pred hhhhHHHh-hhhhhhhcccccccccccchhhHHHHHHHHhhhhhHHhhcceeecc-CCceeEEEeccccccccccceecC Confidence 00000000 0011112222334444567899999999998899999998888865 44566665432 23445555 Q ss_pred CCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccc Q lcl|NC_015249. 76 GENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENI 155 (347) Q Consensus 76 g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~ 155 (347) |+.++.+ ....-.++++.+.+.. 
..+.|.+- -.+...++.+.+.++.++++++..|+.++..- .++..+ T Consensus 183 g~~~~~~-~~~~f~~i~~~~~k~~-~~~~iS~e-ll~ds~~l~~~i~~~la~~~~~~~d~~~l~G~--------G~~~~~ 251 (413) T protein:vir:81 183 GGKKPYM-RFADFDIVTESLSKIA-GLTKITDE-MIEDYDFLVSYINARLLEELAIEEERQLLLGD--------GTGNNL 251 (413) T ss_pred ccccccc-CcccceeeEeeeeeEE-EeehhhHH-HHHHHHHHHHHHHHHHHHHHHHHHHHHHhccC--------CCCCcc Confidence 6665432 1122355566555542 22344331 12222347777888899999999999876310 011111 Q ss_pred ccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc------ Q lcl|NC_015249. 156 AGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQA------ 229 (347) Q Consensus 156 ~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~------ 229 (347) .|. ....... +........+++.|.++...+..+..-.... +|++|..|..|.+-. -.++.|.- T Consensus 252 ~Gi-----~~~~~~~---~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~-~vmn~~~~~~l~~lk-d~~G~~l~~~~~~~ 321 (413) T protein:vir:81 252 TGL-----LKRDGIQ---TLAVSNKDELADSIYKAMTNISLATPFQADA-LVINPLDYQELRLAK-DANGQYYGGGVFQG 321 (413) T ss_pred ccc-----ccccccc---cccccccchhHHHHHHHHHHhhhhccCCCcE-EEEcHHHHHHHHHhh-ccCCceeccccccc Confidence 111 1110000 0011112335666777766665544322333 578999999875432 12222221 Q ss_pred -ccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcce Q lcl|NC_015249. 230 -LIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDM 308 (347) Q Consensus 230 -~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~ 308 (347) ......+..++++|.+|+.|+.+|.+. -+-+||+.... 
++.+ .++ T Consensus 322 ~~~~~~~~~~~~l~G~pv~~s~~~~~~~-------------------------~~~gd~~~~~~-~~~~--------~~~ 367 (413) T protein:vir:81 322 QYGSGGIMLDPAPWGLRTVQSQVVPVGK-------------------------PVVGAFRSAAS-VLRK--------GGV 367 (413) T ss_pred cccccccccCceecceeeEEcCCCCccc-------------------------EEEEecccEEE-EEEe--------cce Confidence 111122233579999999999998321 12245555433 2322 235 Q ss_pred eeeeeech--hhhcc--eeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 309 ALERARRA--NFQAD--QIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 309 ~~e~~~d~--~~~~d--~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++..++. .+.-+ .+++..+++..+.+|++.+.+.+..+ T Consensus 368 ~v~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~l~~~~~ 410 (413) T protein:vir:81 368 RIDSTNTNVDDFENNLITVRAEERVGLMVTFPEAIVQLDVAEV 410 (413) T ss_pred EEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEecCC Confidence 55655433 23334 56677889999999999999998888 No 127 >protein:vir:1583 Length: 351 # NCBI annotation: minor capsid protein # Family: family:all:1522 # MgeID: mge:32 # MgeName: phig1e # Cross-refs: genbank:acc:NP_695165;swissprot:trembl:o03966;genbank:gi:23455804;uniprot:O03966;genbank:GeneID:955561 Probab=99.13 E-value=1.2e-11 Score=80.37 Aligned_cols=280 Identities=13% Similarity=0.054 Sum_probs=170.0 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhhhc---ccccccc-----cccceEEEeecCcc--e Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMN---KHLVRSI-----QSGKSAQFPVLGRT--K 69 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~~~---~~~~r~i-----~~G~tv~i~~iG~~--~ 69 (347) ||.+ +. + + +++ |+|..+|.+.+.+.+.|.. +++...+ .+|+++.||..+.. . 
T Consensus 1 MA~T--------~l------s--d-~i~PEvf~~yv~~~~~~~~~l~qSG~i~~~~~l~~~~~~~G~~it~P~~~~l~Gd 63 (351) T protein:vir:15 1 MAET--------HL------S--D-LIVPEVFGNYVVNQIIKTNRFVQSGILTPDPDLGPHLLEAGTRITVPFLNDLTGD 63 (351) T ss_pred CCce--------ee------e--e-eechhHHHHHHhhhhHHhhhHhhcccccccHHHHHHhhcCCCEEEecccccCCCc Confidence 8842 22 1 1 455 9999999888887775532 2222122 35999999988753 5 Q ss_pred eeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 70 AAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 70 ~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) ...+..++++. ++.+...+..-+|=. .--.+.+.|+...-+-.|++.++.++.+...++..+..+|..+........ T Consensus 64 ~~~~~~~~~i~--~~kitt~~~~a~i~~-~~kg~~~tD~a~~~sg~dp~~~i~~q~a~~w~~~~q~~lla~l~gv~~~~~ 140 (351) T protein:vir:15 64 PDNWTDSDDID--VNNLTSGKQQGIKFY-QTKAYGYTDLGTMISGAPVQETIGNRFAAFWQRADQKTLLSVLKGVMGVTK 140 (351) T ss_pred ccccCCCcccc--hheecccceeEEEEe-eccceehhhhhHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhchh Confidence 67777777775 367888887777744 455688999999988889999999999999999998888776643322221 Q ss_pred ccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhh-hhc Q lcl|NC_015249. 150 ASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAA-NYQ 228 (347) Q Consensus 150 ~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~-~~~ 228 (347) .... .+..+...+ ..+.... ++.|.+|..+|-+..= ..-..+++.|..|..|.+.. +++. .+. 
T Consensus 141 ~~~~--------~~~d~t~~~--~~~~~is----~~~l~~A~~~~GD~~~-~~~~~ivmhS~v~~~L~~~~-li~~~~~s 204 (351) T protein:vir:15 141 IANS--------KVYDQTKVS--PSEPMFG----AKGFTGAIGLMGDLQD-TAFGAIAVNSATYSLMKVQG-LIETIQPQ 204 (351) T ss_pred hccc--------ceecccccc--ccccccC----HHHHHHHHHHhccccc-cceEEEEEChHHHHHHHhhh-hhhhcccc Confidence 1111 111111111 1111122 3567788888855321 11368888999999998764 3221 111 Q ss_pred cccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcce Q lcl|NC_015249. 229 ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDM 308 (347) Q Consensus 229 ~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~ 308 (347) . .++.|+.+.|.+|+.+..+|....+. +..+....+|-+-|++..+..+ T Consensus 205 -~---~~~~i~t~~G~~VivdD~~p~~~~~~--------------------------~~~~ytsyl~~~GAi~~~~~~~- 253 (351) T protein:vir:15 205 -N---GATPFEAYNGLRIVLDDDIEIDLTDK--------------------------TKPVSTSYIFAPGAVRYSTNMR- 253 (351) T ss_pred -c---cCcccceecceEEEEcCCCccccCCC--------------------------CCceeEEEEEecceeeeecCCc- Confidence 1 24678999999999999998653221 1112344566677777665544 Q ss_pred eeeeeechhhh--cceeeeeeeecccc--cccceEE---EEEEcCC Q lcl|NC_015249. 309 ALERARRANFQ--ADQIIAKYAMGHGG--LRPEACG---ALVFNKA 347 (347) Q Consensus 309 ~~e~~~d~~~~--~d~i~~~~a~G~~~--~Rpe~a~---~i~~~~a 347 (347) .+|..||+... -|.+..++.|...| ..++.+. 
....|.- T Consensus 254 ~ve~~rd~~~~~g~d~l~~r~~~~~hp~G~s~~~~~~~~~~~sPt~ 299 (351) T protein:vir:15 254 STETKYDPLINGGQDVIVQKRVGTIHVAGTSIKASFSPSKASFPTI 299 (351) T ss_pred CcceeecccCCCCceEEEEeeeeeeeeeeeeecccccccCcCCcCh Confidence 56777877643 36666666665322 2333211 0111110 No 128 >protein:vir:962 Length: 397 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:19 # MgeName: bIL285 # Cross-refs: genbank:acc:NP_076616;genbank:gi:13095724;genbank:GeneID:920264 Probab=99.11 E-value=9.2e-12 Score=81.08 Aligned_cols=276 Identities=13% Similarity=0.044 Sum_probs=147.7 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccccccccc-ceEEEeecCcceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSG-KSAQFPVLGRTKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G-~tv~i~~iG~~~~~~~~~g~~~ 79 (347) +...... ..+.-.+....+...+..+.+..++...- ..+.++...++..+.++ -.+.++..+...+.....+... T Consensus 121 ~~~~~~~---~~~~~~~~~~~~~~~~vp~~~~~~i~~~~-~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~E~~~~ 196 (397) T protein:vir:96 121 NAFVKSK---GAEKRDGFTSVEGGALIPQELLQPQLEPK-DIVDLSKYVRSVPVNSASGKFPVISKSGSKMATVQQLEKN 196 (397) T ss_pred HHHHHhh---hhhhhhcccccccccchhHHHHHHHHHhh-hhhhHHHhhhhccccccceeEEEEeccCCccccccccccc Confidence 0000000 00111111222222345578888877643 33344555665554432 2334444444455545555444 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG 159 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~ 159 (347) +.. .++.-.++++.+... +.-..|.+---.++.+|+.+.+.++.++++++..|..|+.-. 
+ T Consensus 197 ~~~-~~~~~~~i~~~~~~~-~~~~~~s~ell~ds~~~l~~~i~~~l~~~~~~~~~~~i~~g~-----------------g 257 (397) T protein:vir:96 197 PQL-ANPKMVEIDYSVATR-RGYIPISQEMIDDASYDVTGLIADEIQDQSLNTKNADIAAVL-----------------K 257 (397) T ss_pred ccc-ccccccceeecHhHh-hcchhhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhcc-----------------c Confidence 321 234456666666443 223333322223456788999999999999999998775321 0 Q ss_pred CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEE Q lcl|NC_015249. 160 KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIR 239 (347) Q Consensus 160 ~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg 239 (347) .+. +.. .. .|+.|.++....... .. +-..|++|..|..|.+- +-.++.|.-...+.+|.-+ T Consensus 258 ~~~------~~~-----~~----~~d~~~~~~~~~~~~-~~--~a~~v~n~~~~~~l~~l-kd~~G~~~~~~~~~~~~~~ 318 (397) T protein:vir:96 258 TAT------AKS-----VV----GVDGLKDLINKEIKK-VY--DVKLFISASMYSELDKL-KDKNGRYLLQDSITAASGK 318 (397) T ss_pred ccc------ccc-----cc----chHHHHHHHHHhhhh-hc--CcEEEEcHHHHHHHHHh-hccCCCeEeccCccCCCcc Confidence 000 000 11 134454443332221 22 34679999999998753 2334455444456677778 Q ss_pred EEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhh Q lcl|NC_015249. 240 NVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQ 319 (347) Q Consensus 240 ~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~ 319 (347) +++|.+|+.+++.+........ .-+-+||+..+. ++. ..+++++...+ .++ T Consensus 319 ~l~G~pv~~~~~~~~~~~~~~~-------------------~~~~gd~~~~~~-~~~--------~~~~~~~~~~~-~~~ 369 (397) T protein:vir:96 319 QLLGKEVVVLDDDVIGKSVGNV-------------------VGFIGDAKAFAS-FFD--------RKQVSVSWVDN-NIY 369 (397) T ss_pred cccccceEEecccccCCCCCce-------------------EEEEeehhcceE-eEe--------ecceEEEEecc-ccc Confidence 9999999998875432211000 012245555432 222 22345554433 344 Q ss_pred cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
320 ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 320 ~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ...+++.+++|.++.+|++.+.+.++.| T Consensus 370 ~~~~~~~~r~d~~~~~~~a~~~~~~~~a 397 (397) T protein:vir:96 370 GQLLAGIIRYDVKATDKKAGFYVTFTIG 397 (397) T ss_pred ceeEEEEEEEccEEecccceEEEEeecC Confidence 5678999999999999999999999999 No 129 >protein:vir:1268 Length: 397 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:329 # MgeName: phi-105 # Cross-refs: genbank:acc:NP_690760;genbank:gi:22855000;genbank:GeneID:955203 Probab=99.10 E-value=3.5e-11 Score=77.92 Aligned_cols=282 Identities=9% Similarity=0.006 Sum_probs=159.3 Q ss_pred CCcccc-----ccc-ccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccccccccc-ceEEEee-cCcceeee Q lcl|NC_015249. 1 MAKMNG-----GQQ-IGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSG-KSAQFPV-LGRTKAAY 72 (347) Q Consensus 1 ma~~~~-----~~~-~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G-~tv~i~~-iG~~~~~~ 72 (347) +..... ..+ ...|...+...++--.+.-++|..++.......+.++.+.++..+.++ ..+.+++ .+...+.. T Consensus 103 ~~~~~~~~~~~~~~~~~~~a~~~~~~~~gg~lvP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~a~~ 182 (397) T protein:vir:12 103 RGKRLTDEERDLLDSPEFRAMSGINDEDGGILIPEDIGRQIHEFKRQFEPLEQYVTVEPVTTRSGTRLLEKNADMVPFSP 182 (397) T ss_pred hccCCcHHHHHHHhhhhhhhccccccccCcccCchhHHHHHHHhhhhhhhHHhhcceeeccCCceeEEEEEecCCcceee Confidence 000000 000 000111111111212245589999999888888888888888777642 2444544 46667778 Q ss_pred eecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc Q lcl|NC_015249. 73 LQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASD 152 (347) Q Consensus 73 ~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~ 152 (347) +..|...+.. ..+..+++++...+.- .-..|.+-=-..+.+|+.+.+.++.+++|++..|..|+.-. 
T Consensus 183 v~Eg~~~~~~-~~~~~~~v~~~~~k~~-~~~~is~e~l~ds~~~l~~~i~~~l~~~~~~~~d~~il~G~----------- 249 (397) T protein:vir:12 183 VEELGNLPEI-DQPRFTKVSYSIIDYG-GIMTLSNSMLNDSDQAIMTYVAKWFAKKSVVTRNNLILAAI----------- 249 (397) T ss_pred eccccccccc-ccccceeEEeeheeeE-eeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHHhcc----------- Confidence 8878776432 2345566666665543 22344333223466789999999999999999999876321 Q ss_pred cccccccCcceeecccccccccchhhhHHHHHHHHHHHHH-HhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccc Q lcl|NC_015249. 153 ENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARA-KLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALI 231 (347) Q Consensus 153 ~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~-~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~ 231 (347) +.+. +... ..++.|.++.. .|+...- .+-.++++|..|..|.+-. -.++.|.-.. T Consensus 250 ------g~~~------~~g~---------~~~~~i~~~~~~~l~~~~~--~~a~~~~n~~~~~~L~~lk-d~~G~~l~~~ 305 (397) T protein:vir:12 250 ------ASLK------KVDI---------DGLDGIKKALNVTLDPMVA--PGSIVLTNQDGYDWLDTLK-DGTGRYLLQP 305 (397) T ss_pred ------cccc------cccc---------ccHHHHHHHHhhccchhhh--CCCEEEEcHHHHHHHHHhh-ccCCceeecc Confidence 0000 0000 11455555443 4544332 2345789999999986532 2234454444 Q ss_pred ccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeee Q lcl|NC_015249. 232 DPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALE 311 (347) Q Consensus 232 ~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e 311 (347) .+.+|.-+.++|.+|+.+++........ ...-+-+||++.+.+ +...+++++ T Consensus 306 ~~~~g~~~~l~G~pv~~~~~~~~~~~~~-------------------~~~~~~gd~~~~~~~---------~~~~~~~i~ 357 (397) T protein:vir:12 306 DPTNPTKKLLDGRPVVPFTNRVLKTQKG-------------------KAPLIIGNLKEAIVL---------FDREQQSIA 357 (397) T ss_pred cccCCCCccccceeeEEecccccccCCC-------------------ccEEEEEehhceEEE---------EeecceEEE Confidence 5667777899999999987643221110 000122344433222 222334555 Q ss_pred eeechh----hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
312 RARRAN----FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 312 ~~~d~~----~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ..+.+. +-...+++.++++..+++|++.+.+.+..= T Consensus 358 ~~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~t~~ 397 (397) T protein:vir:12 358 STDTGAGAFETNSTKVRGIEREDVRKWDEDAVVFGQITVE 397 (397) T ss_pred EeccccchhhcCceEEEEEEeeccEEecccceEEEEEeeC Confidence 544332 224568888999999999998887777666 No 130 >protein:vir:1025 Length: 408 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:20 # MgeName: bIL286 # Cross-refs: genbank:acc:NP_076679;genbank:gi:13095788;genbank:GeneID:920362 Probab=99.10 E-value=1.1e-10 Score=75.24 Aligned_cols=284 Identities=10% Similarity=0.019 Sum_probs=156.2 Q ss_pred CCccccc-ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccc-cceEEEeecCc--ceeeeeecC Q lcl|NC_015249. 1 MAKMNGG-QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQS-GKSAQFPVLGR--TKAAYLQPG 76 (347) Q Consensus 1 ma~~~~~-~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~-G~tv~i~~iG~--~~~~~~~~g 76 (347) +-+.... .....+.-..+..++--.+.-+.+..++.+.....+.+++++++..+.+ ..++.+++... ..+.....| T Consensus 101 ~~~~~~~~~~~~~~a~~~~t~~~gg~~vP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~ 180 (408) T protein:vir:10 101 VRNPMAFMNTVSSKTETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQYVRVESVSTSNGSRVYEKWTDVTPLTVMDAED 180 (408) T ss_pred hhcchhhhhhhhhhhhhcccccCCceeccHhHHHHHHHHHHhhchhhhhcceeeccCCcceEEEeeccccccceeeecCc Confidence 1000000 0000111111111121124458999999999988899999988877654 23345554433 344455556 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIA 156 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~ 156 (347) +.++.+ ..+.-.++++...+.. .-..|.+-=-.++.+|+.+.+.++.++++++..|+.|+.-.. 
T Consensus 181 ~~~~~~-~~~~~~~i~~~~~k~~-~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g-------------- 244 (408) T protein:vir:10 181 GKIPDL-DNPQLTIIKYLIKRYA-GIITATNTSLKDTAENILAWLSSWIAKKVVVTRNQAIIEVMK-------------- 244 (408) T ss_pred cccccc-cCcceeeEEeeeeeEE-eeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc-------------- Confidence 655432 1233455566554442 223443322234678999999999999999999998763211 Q ss_pred cccCcceeecccccccccchhhhHHHHHHHHHHHH-HHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccccc Q lcl|NC_015249. 157 GLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLAR-AKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPST 235 (347) Q Consensus 157 ~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~-~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~ 235 (347) ++. ..+ ... .++.|+++. ..++...-+ +-.++++|..|..|.+-. -.++.|.-...+.+ T Consensus 245 ---~~~--~~~--------~~~----~~~~l~~~~~~~~~~~~~~--~a~~v~n~~~~~~l~~lk-d~~G~~i~~~~~~~ 304 (408) T protein:vir:10 245 ---AAP--KKP--------TIA----KFDDVITMINTAVDPAIIA--TSSLLTNQSGLNKLALVK-TAEGKYLLEPDPTK 304 (408) T ss_pred ---ccc--ccc--------ccc----cHHHHHHHHHHhhhhhhcc--CCEEEEcHHHHHHHHHhh-ccCCceEeccCcCC Confidence 110 000 001 145555544 334443322 235789999999987643 33444544444667 Q ss_pred ceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeec Q lcl|NC_015249. 236 GSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARR 315 (347) Q Consensus 236 G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d 315 (347) |...+++|++|+.+++.+.+..+.. ...-+-+||+..+.+ +...+++++..++ T Consensus 305 ~~~~~l~G~PV~~~~~~~~~~~~~~------------------~~~i~~gd~~~~~~~---------~~~~~~~v~~~~~ 357 (408) T protein:vir:10 305 PNSYLIKGKQVIVVADRWLPNTGST------------------VYPLYYGDMSQAITL---------FDRENMSLLPTNI 357 (408) T ss_pred CCCceecceeeEEecccccCccCCC------------------ceEEEEEehhccEEE---------EEecceEEEEccc Confidence 7778999999999765433221110 000122344433222 2223455555433 Q ss_pred hh----hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
316 AN----FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 316 ~~----~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .- +-...+++.++++.++.+|++.+.+.+..+ T Consensus 358 ~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~~~~~~~ 393 (408) T protein:vir:10 358 GAGAFETDTTKIRVIDRFDVKATDSEALVAGSFSAI 393 (408) T ss_pred ccchhhcCceEEEEEEeeccEEeccccEEEEEeecc Confidence 21 223467788999999999999999998776 No 131 >protein:vir:7409 Length: 408 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:146 # MgeName: P335 # Cross-refs: genbank:acc:NP_839926;genbank:gi:30089896;genbank:GeneID:1260683 Probab=99.10 E-value=6.1e-11 Score=76.57 Aligned_cols=284 Identities=10% Similarity=0.034 Sum_probs=155.9 Q ss_pred CCc----cccc-ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccccccccc-ceEEEeecCcc-e-eee Q lcl|NC_015249. 1 MAK----MNGG-QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSG-KSAQFPVLGRT-K-AAY 72 (347) Q Consensus 1 ma~----~~~~-~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G-~tv~i~~iG~~-~-~~~ 72 (347) +.+ .... .....+.-.....++--.+.-+.|..++.......+.+++++++..+.++ .++.+++.... . ... T Consensus 97 ~~~~~~~~~~~~~~~~~~a~~~~~~~~gg~~vP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 176 (408) T protein:vir:74 97 FVNMVRNPMAFLNTVSSKTETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQYVRVESVSTSSGSRVYEKWTDVTPLKAM 176 (408) T ss_pred HHHHHhcchhhhhhhhhhhhcccccCCCceeechhHhhHHHHHHhhhcchhhhcceeeccCCcceEEEEeecCCcccccc Confidence 000 0000 00001110011111111235589999999999888989999988887653 35556654332 2 223 Q ss_pred eecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc Q lcl|NC_015249. 73 LQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASD 152 (347) Q Consensus 73 ~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~ 152 (347) ...|+.++.. ..++-.++++...+. +....|.+-=-.++.+|+.+.+.++.+++|++..|+.|+.-. 
T Consensus 177 v~E~~~~~~~-~~~~~~~i~~~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~~il~G~----------- 243 (408) T protein:vir:74 177 DEEDGKIPDL-DNPRLTIIKYLIKRY-AGIITATNTLLKDTAENILAWLSSWIAKKVVVTRNQAIIAAM----------- 243 (408) T ss_pred cccccccccc-cccceeeEEeeeeeE-EeeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhcc----------- Confidence 3334444321 234556666666554 233345433333467899999999999999999999875310 Q ss_pred cccccccCcceeecccccccccchhhhHHHHHHHHHHHH-HHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccc Q lcl|NC_015249. 153 ENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLAR-AKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALI 231 (347) Q Consensus 153 ~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~-~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~ 231 (347) +.+. +.+ ... -++.|+++. ..|+....+. -.+|++|..|..|.+- +-.+..|.-.. T Consensus 244 ------G~~~--~~~--------~~~----~~~~i~~~~~~~l~~~~~~~--a~~v~n~~~~~~l~~l-kd~~G~~l~~~ 300 (408) T protein:vir:74 244 ------GTVP--KKP--------TIA----NFDDVITMINTSVDPAIIAT--SSLLTNQSGLNKLALV-KTAEGKYLLEP 300 (408) T ss_pred ------cccc--ccc--------ccc----cHHHHHHHHHHhhhhhhcCC--CEEEEcHHHHHHHHHh-hcCCCceEecc Confidence 0110 000 011 134555543 4566655442 3567899999999753 23344454444 Q ss_pred ccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeee Q lcl|NC_015249. 232 DPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALE 311 (347) Q Consensus 232 ~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e 311 (347) .+..|.-+.++|++|+.+++.+.+..+.. ...-+-+||+..+. ++. .++++++ T Consensus 301 ~~~~~~~~~l~G~pV~~~~~~~~~~~~~~------------------~~~i~~gd~~~~~~-~~~--------~~~~~i~ 353 (408) T protein:vir:74 301 DPTKPNSYLIKGKQVIVVADRWLPNSGST------------------VYPLYYGDMSQAIT-LFD--------RENMSLL 353 (408) T ss_pred CcCCCCCceecceeeEEecCcccccccCC------------------cceEEEEehhccEE-EEE--------ecceEEE Confidence 45667668999999998876433221110 00012234443332 222 2334555 Q ss_pred eeech----hhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
312 RARRA----NFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 312 ~~~d~----~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ..+.. .+....+++.++++.++++|++.+.+.+... T Consensus 354 ~~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 393 (408) T protein:vir:74 354 PTNIGAGAFETDTTKIRVIDRFDVKATDSEALVAGSFTAI 393 (408) T ss_pred EeccccchhhcceeeEEEEEeeCcEEecccceEEEEeecc Confidence 44322 2334557888999999999999999888555 No 132 >protein:vir:105610 Length: 430 # NCBI annotation: virion structural protein # Family: family:all:974 # MgeID: mge:1540 # MgeName: F116 # Cross-refs: genbank:acc:YP_164307;genbank:gi:56692923;genbank:GeneID:3197221 Probab=99.07 E-value=2.6e-11 Score=78.64 Aligned_cols=328 Identities=12% Similarity=0.056 Sum_probs=170.5 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHH----hhh----------------------cccccccc Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTS----VTM----------------------NKHLVRSI 54 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s----~~~----------------------~~~~~r~i 54 (347) |..+. |..+.+ |+. -+++|+.-+.+.-.+++ +|. +.++..++ T Consensus 1 ~~~a~------T~~~~~----~p~--a~~~ws~~l~~~~~k~~~~~~kl~G~~~~~~~~~~~~~~~~ts~~~pI~r~~dL 68 (430) T protein:vir:10 1 MTASK------TTMRYG----DPN--AMIQQAAGLFALCQGRNSTLNRLTGKMPSGTSDAEKKTKGQSSLELPIVQAQDL 68 (430) T ss_pred Cccee------eecccC----Chh--HHHHHHHHHHHHHhhhhhhHHHhhccccccccchhhhccCCCCCCccEEEeccC Confidence 43322 344333 333 46677766655554432 222 36666666 Q ss_pred c--ccceEEEeecCcceeeeeecCCCCCCccCCCCCceEEEEEEeeeeccccc-ccHHHHHhChhhHHHHHHHHHHHHHH Q lcl|NC_015249. 55 Q--SGKSAQFPVLGRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLI-YDIEDAMNHYDVRSEYTAQLGESLAM 131 (347) Q Consensus 55 ~--~G~tv~i~~iG~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~I-dd~D~~q~~~D~r~~~~~~~g~aLa~ 131 (347) . 
.|++|.|+-+...+-.....++.+.+.-+.+.-....|+|||..- .+.+ ..+++..+-+|+|++.-..++.=+++ T Consensus 69 ~K~~GD~Vtf~L~~~L~g~gv~Gd~~lEGnee~L~~~~d~l~IDq~R~-~V~~gg~msqQRt~~dlR~~ar~~L~~w~~~ 147 (430) T protein:vir:10 69 GRNKGDEVRFHFVQPANAFPIMGSEYAEGKGTGLKIGSDQLRVNQARF-PVDLGDVMSQIRNPYDLRRLGRPKAKWFMDA 147 (430) T ss_pred CCCCccEEEEeEeeccccCceecCceeeccccceEEEeeEEEEeeecc-ccccCCchhhhhhhhHHHHHHHHHHHHHHHH Confidence 4 399999998877766666667777776677888888999999753 3333 36677788999999999999999999 Q ss_pred HHHHHHHHHHHHHhh----------------hccccccccccccCc--ceeecccccccc------cchhhhHHHHHHHH Q lcl|NC_015249. 132 AADGAVLAEMAKLCN----------------LPSASDENIAGLGKA--HVLEVGKQSELR------GDQVKLGQAIIAQL 187 (347) Q Consensus 132 ~~D~~i~~~~~~~a~----------------~~~~~~~~~~~~~~g--~~i~~~~~~~~~------~~~~~~~~~~~~~l 187 (347) ..||.+|.+++.+.. ++.... +.--.|.. .....|.+++.. ......-..-++.| T Consensus 148 ~~Dq~~~v~laGarg~~~~~~~~~~~~~~~~~~~~~~-N~v~aPt~nrh~~~~G~at~~~~~~~~~~sl~stD~~s~~~i 226 (430) T protein:vir:10 148 YLDQSMLVHLAGARGNHYNKEWCLPLETHPKLADMLV-NRVKAPTKNRHFVASADAITGVAPNAGEYNITTADVLDVDVV 226 (430) T ss_pred HHHHHHHHHHhhhhcccccccccccccCCcchhhhhc-cccCCCCCceeEeecccccccccccccccchhhhcccCHHHH Confidence 999999999976411 010000 00000111 111111111100 00000001124556 Q ss_pred HHHHHHhhhcCCCC-------CC-------CEEEeCHHHHHHHhcchhhhh----h-hh--cc-ccccccceEEEEeceE Q lcl|NC_015249. 188 TLARAKLTGNYVPS-------AD-------RVFYTTPDNYSAILAALMPNA----A-NY--QA-LIDPSTGSIRNVMGFE 245 (347) Q Consensus 188 ~~a~~~Lde~~VP~-------~g-------R~~vv~P~~~~~Ll~~~~~~~----~-~~--~~-~~~~~~G~Vg~i~G~~ 245 (347) .+|...++..+.|. +. +++++.|.+|..|..++.+-. + .. 
.+ .+.+-.|.++.++|+- T Consensus 227 d~a~~~a~~~~~~i~Pv~v~gd~~~g~~~~yV~~~~p~q~~~Lr~dt~~~~wq~~~~a~a~~g~~nPlF~G~~gm~ngvi 306 (430) T protein:vir:10 227 DSIATYMDQIELPPPPVKFEGDEAAEDSPIRVLLCSPAQYNSFAKQEKFRSWQAAALARASNAKQHPIFRVDAGLWSNTL 306 (430) T ss_pred HHHHHHHHhhCCCCcceEeecccccCCccEEEEEechHHHHHHhhCcchHHHHHHHHHhhcccccCCceecceeeecCeE Confidence 67777777765432 22 789999999999999987631 1 11 12 4557799999999999 Q ss_pred EEEecce-ecccccccccccccccccc-cccccccccccccccccceEEEEechhhhhhhhhcc----ee---eeeeech Q lcl|NC_015249. 246 VIEVPHL-TAGGAGEDRPEEGANPTGQ-KHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKD----MA---LERARRA 316 (347) Q Consensus 246 V~~sn~l-p~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~----~~---~e~~~d~ 316 (347) |++.+++ .+..+.....+...+..+. ....+. .+...+.-..+|++-..|++-+-.+. .. .|..+|- T Consensus 307 i~~~~~virf~~g~~~~~~a~~~~~~~~~~~~~a----~~~~~~~v~RalllGaQA~~~A~g~~~~~g~~f~w~Ee~~D~ 382 (430) T protein:vir:10 307 IIKMPKPIRFYAGDTIKYCAAYNSEAESSAVVSD----SFGNQYAVDRALLLGGQALAQAWAASEHSGMPFFWSEKDMDH 382 (430) T ss_pred EecCCceeeecCCCccccccCCcccccccccccc----cccccccchhhhhccchhheeeeeccCCCCcceeeeeecccc Confidence 9998644 2332221111111111110 011111 11112222233444344332221111 00 1222222 Q ss_pred hhhcceeeeeeeecccccc----------cceEEEEEEcCC Q lcl|NC_015249. 
317 NFQADQIIAKYAMGHGGLR----------PEACGALVFNKA 347 (347) Q Consensus 317 ~~~~d~i~~~~a~G~~~~R----------pe~a~~i~~~~a 347 (347) .++ -.|.....+|.+=.| ..==|+|+++.| T Consensus 383 g~~-~~i~~~~i~G~kK~rF~~~~~~~~~~~DfGvi~idta 422 (430) T protein:vir:10 383 GDK-LELLIGAILGCSKIRFAVEATNGLEYTDHGVMAIDTA 422 (430) T ss_pred Cch-hhhhhhHHhccceeeecCCCCCCceeeeeEEEEhhhh Confidence 222 122222223322222 122455666665 No 133 >protein:vir:9704 Length: 394 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:174 # MgeName: 315.2 # Cross-refs: genbank:acc:NP_795466;genbank:gi:28876225;genbank:GeneID:1257769 Probab=99.07 E-value=6.9e-11 Score=76.29 Aligned_cols=274 Identities=12% Similarity=0.079 Sum_probs=154.4 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--CcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~~~~~~g~~ 78 (347) .+........ .....+...++--.+.-+.|..++.+.....+.+++++++..+.+|+ .++|+. +..++..+..|.. T Consensus 115 ~~~~~~~~~~-~~~~~~~t~~~gg~liP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~-~~~~~~~~~~~~~~~v~E~~~ 192 (394) T protein:vir:97 115 LMPINETTPV-EPQKDGIKKENAKPVSSEEILYTPAREVKTVVDLKPFTTVYQAKKAS-GKYPVLQRATTKMVTVAELEK 192 (394) T ss_pred HHHHHhhhhh-hhhccccccccccccChHHHHHHHHHHhhhhhhhhhhceeeeccCcc-eEEEEEecCCCccceeccccc Confidence 0000000000 00000111111112455889999988888888999998887766553 556654 4445566666665 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) .+.. ..+.-.++++...+. +.-..|.+-=-.++.+|+.+.+.++.+++|++..|+.|+..+. 
T Consensus 193 ~~~~-~~~~~~~v~l~~~k~-~~~i~is~ell~ds~~~~~~~i~~~la~~~~~~~~~~i~~g~~---------------- 254 (394) T protein:vir:97 193 NPAL-AKPDFKDVAWNIDTY-RGAIPLSQESIDDADVDLVGIVSESISQIKVNTTNDAIAKVLK---------------- 254 (394) T ss_pred cccc-ccccceeEEeehhhe-eeehhhHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhhccc---------------- Confidence 5421 234456666666443 2333443322234567899999999999999999988763210 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceE Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSI 238 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~V 238 (347) .++ +.. ... ++.|.++...+-. |...-.+|++|..|..|.+- +-.++.|.-...+.+|.- T Consensus 255 -~~~------~~~-----~~~----~~~~~~~~~~~~~---~~~~a~~v~n~~~~~~l~~l-kd~~G~~i~~~~~~~~~~ 314 (394) T protein:vir:97 255 -SFT------TKT-----VKN----LDEIKALLNGGFD---PAYNVSLIVSQSFYQTLDTL-KDGNGRYLLQDDITAVSG 314 (394) T ss_pred -ccc------ccc-----ccc----HHHHHHHHHhhhh---hhhCCEEEEcHHHHHHHHHh-hccCCCeeeecCcCCCCC Confidence 000 000 001 3334433322211 11233468999999998653 223344443334566766 Q ss_pred EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhh Q lcl|NC_015249. 239 RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANF 318 (347) Q Consensus 239 g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~ 318 (347) +.++|++|+.+++.+.+... -+-+||+..+.++- ..+++++...+. + T Consensus 315 ~~l~G~pv~~~~~~~~~~~~-----------------------~~~gd~~~~~~~~~---------~~~~~~~~~~~~-~ 361 (394) T protein:vir:97 315 KVLLGKPVFVLSDEVLGANK-----------------------AFIGDFKRGVLFAD---------RKDLGLRWADNE-I 361 (394) T ss_pred ceeccceeEEecccccCCcc-----------------------EEEeeccccEEEEE---------ecceEEEEeccc-c Confidence 79999999997765432211 12345554332222 223555554443 4 Q ss_pred hcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
319 QADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 319 ~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ....+++..++|..+.+|++.+.|.+..+ T Consensus 362 ~~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 390 (394) T protein:vir:97 362 YGQYLQAVLRFGVSKVDDKAGYYVTFTPE 390 (394) T ss_pred cceeEEEEEEEccEEecccceEEEEeccc Confidence 45678999999999999999999999888 No 134 >protein:vir:100884 Length: 389 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1473 # MgeName: Lc-Nu # Cross-refs: genbank:acc:YP_358764;genbank:gi:78000028;genbank:GeneID:3726155 Probab=99.05 E-value=1.1e-10 Score=75.12 Aligned_cols=280 Identities=8% Similarity=0.041 Sum_probs=153.5 Q ss_pred CCc-ccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--CcceeeeeecCC Q lcl|NC_015249. 1 MAK-MNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKAAYLQPGE 77 (347) Q Consensus 1 ma~-~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~~~~~~g~ 77 (347) +.. +-++... -+.-.+..+++-=.+.-++|..++.+.....+.++.+.++..+.++ +.++++. +.........|. T Consensus 95 ~~~~lr~~~~~-~~~~~~~t~~~gg~~vP~~~~~~i~~~~~~~~~l~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~E~~ 172 (389) T protein:vir:10 95 INDFIHSHGKV-IDATSKVTSTEAGVLIPEEIIYDPTAEVNSVVDLSTLVTKTPVTTP-KGTYPILKRATDRFSSVAELA 172 (389) T ss_pred HHHHhhcchhh-hhhhcccccCCcceeehHHHHHHHHHHHHhhhhHHhhcceeeccCC-eeEEEEEecCCCccccccccc Confidence 000 0000000 0000011111111133488999998888888988888888776643 3455544 334444555555 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) ..+.. ..+.-.++++.+.+. +.-+.|.+-=-.++.+|+.+.+.++.+++|++..|..|+..+.. 
T Consensus 173 ~~~~~-~~~~~~~i~~~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~i~~g~~~-------------- 236 (389) T protein:vir:10 173 ENPKL-AEPEFNKVDWSVATY-RGAIPLSEEAIADSAVDLTALVGQSIKEKSVNTYNAMIAPVLQS-------------- 236 (389) T ss_pred ccccc-ccccceeeeeeheee-EeeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhhhhcc-------------- Confidence 54321 234556666666544 23334433222346788999999999999999999987643210 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHH-HhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccc----cc Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARA-KLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQAL----ID 232 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~-~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~----~~ 232 (347) +. ..+.. ....++.|.++.. .++.. .+-.++++|..|..|.+-.. .++.|.-. .. T Consensus 237 ---~~--~~~~~----------~~~~~d~l~~~~~~~~~~~----~~a~~~~n~~~~~~L~~lkd-~~G~~i~~~~~~~~ 296 (389) T protein:vir:10 237 ---FT--AKKTT----------TDTLVDSLKHILNVDLDPA----YSRALVVTQSLFNTLDTLKD-KNGRYLLHDASDSI 296 (389) T ss_pred ---cc--ccccc----------ccccHHHHHHHHHhhhhhh----hCcEEEecHHHHHHHHHhhc-cCCCeeeecCcccc Confidence 00 00000 0112455555443 33332 23467899999999875332 33344321 22 Q ss_pred cccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeee Q lcl|NC_015249. 233 PSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALER 312 (347) Q Consensus 233 ~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~ 312 (347) ...|..++++|.+|+.+++......++ . ..-+-+||++.+.++- .++++++. T Consensus 297 ~~~~~~~~l~G~pV~~~~~~~~~~~~~-----------~--------~~~~~gd~~~~~~~~~---------~~~~~i~~ 348 (389) T protein:vir:10 297 TDGTAKGTILGVPVYVVGDTLLGSLAG-----------D--------QKAFVGDLKRGVLFTD---------RQQVTLAW 348 (389) T ss_pred cccccccccccceeEEecccccCCCCC-----------c--------eEEEEeeccccEEEEe---------ecceEEEe Confidence 334556789999999876543221110 0 0013345555433322 23466666 Q ss_pred eechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
313 ARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 313 ~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++ .++...+++.+++|..+.+|++.+.+.+..+ T Consensus 349 ~~~-~~~~~~~~~~~r~d~~~~~~~a~~~~~~~~~ 382 (389) T protein:vir:10 349 EDS-KIYGKYLGAAFRFGVQKADSKAGYFVTNTDV 382 (389) T ss_pred ecc-ccccceEEEEEEeccEEecccceEEEEeecc Confidence 654 4445678899999999999999998887766 No 135 >protein:vir:3845 Length: 395 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:322 # MgeName: phi adh # Cross-refs: genbank:acc:NP_050151;swissprot:trembl:q9t1f6;genbank:gi:9633043;uniprot:Q9T1F6;genbank:GeneID:1262163 Probab=99.04 E-value=1.1e-10 Score=75.09 Aligned_cols=274 Identities=11% Similarity=0.029 Sum_probs=154.0 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccc-cceEEEeecCcc--eeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQS-GKSAQFPVLGRT--KAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~-G~tv~i~~iG~~--~~~~~~~g~ 77 (347) +.....+ .+-.+. +| .+.-+.|..++.......+.++.+.++..+++ ..+..++..... .+.....|+ T Consensus 102 ~~~~~~~---~~~~~~---gg---~~vP~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~E~~ 172 (395) T protein:vir:38 102 KNLVTSG---TTGTGN---AG---LTIPEDIQLQIRTLTRSFTSLESLANVENVTTSHGSRVYEKLADITPLKDLDDESA 172 (395) T ss_pred HHHHhhc---cCccCC---Cc---eecchhHhhHHHHHHHhhcchhhhcceeeccCCcceEEEEeeccCCcccccccccc Confidence 1100000 011111 11 23458899999999988899999888877653 234445444332 233344555 Q ss_pred CCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG 157 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~ 157 (347) .++.. ..++-.++++...+.-. -..|.+-=-..+.+|+.+.+.++.++++++..|+.|+.-.. 
T Consensus 173 ~~~~~-~~~~f~~v~~~~~k~~~-~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~il~g~g--------------- 235 (395) T protein:vir:38 173 LIGDN-DDPELTVVKYLIHRYAG-ITTVTNTLLKDTVDNIIQWLVNWAAKKDVVTRNAKILEVMG--------------- 235 (395) T ss_pred ccccc-cccceeeEEeeeeeeEe-ehhhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc--------------- Confidence 55422 12333555555544422 22343322234678899999999999999999998763210 Q ss_pred ccCcceeecccccccccchhhhHHHHHHHHHHHHH-HhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccc Q lcl|NC_015249. 158 LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARA-KLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTG 236 (347) Q Consensus 158 ~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~-~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G 236 (347) .+.- ... .. -++.|.++.. .|+...-+ +-.++++|..|..|.+- +-.++.|.-...+.+| T Consensus 236 --~~~~--~~~--------~~----~~~~i~~~~~~~l~~~~~~--~a~~v~n~~~~~~L~~l-kd~~G~~l~~~~~~~~ 296 (395) T protein:vir:38 236 --KAPK--KPT--------IS----QFDNIKDLENNTLDPAIES--TSSFITNQSGYNILSKV-KDADGRYLMQPDVTSP 296 (395) T ss_pred --cccc--ccc--------cc----cHHHHHHHHHHhhhhhhcC--CCEEEEcHHHHHHHHHh-hccCCceeeccCcCCC Confidence 1100 000 00 1334444432 34433322 34678999999998753 2334445444456778 Q ss_pred eEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeech Q lcl|NC_015249. 237 SIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRA 316 (347) Q Consensus 237 ~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~ 316 (347) ....++|++|+.+++.+.+..+.. ..-|-+||...+ ..+...+++++..++. T Consensus 297 ~~~~l~G~pV~~~~~~~~~~~~~~-------------------~~i~~gd~~~~~---------~i~~~~~~~i~~~~~~ 348 (395) T protein:vir:38 297 DKYLIDGKPVIRIADKWLPDVSGS-------------------HPLYFGDLKQGI---------TLFDRQQMQIDTTNVG 348 (395) T ss_pred CcceeccceeEEecccccCcCCCc-------------------ceEEEEeccccE---------EEEEecceEEEEeccc Confidence 788999999999987664432110 001223333322 2222334566665433 Q ss_pred h----hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
317 N----FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 317 ~----~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) . +-...+++..+++..++||++.+.+.+..+ T Consensus 349 ~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 383 (395) T protein:vir:38 349 AGSFEHDTTKLRFIDRFDVQLIDDGAFAAASFKTV 383 (395) T ss_pred cchhhcCceEEEEEEeeccEEecccceEEEEeecc Confidence 2 234567888889999999999999998877 No 136 >protein:vir:105004 Length: 392 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:1490 # MgeName: W Beta # Cross-refs: genbank:acc:YP_459969;genbank:gi:85701384;genbank:GeneID:3882145 Probab=99.04 E-value=1.2e-10 Score=75.04 Aligned_cols=284 Identities=11% Similarity=0.074 Sum_probs=155.6 Q ss_pred CCccccc--------ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEee-cCccee Q lcl|NC_015249. 1 MAKMNGG--------QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFPV-LGRTKA 70 (347) Q Consensus 1 ma~~~~~--------~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~~-iG~~~~ 70 (347) |.+..-. ...-.+.......++--.+.-++|.+++...-+..|.+++++++..+.++. ...+++ .+...+ T Consensus 84 l~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a 163 (392) T protein:vir:10 84 LRNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRVLEKNSDMIPF 163 (392) T ss_pred HhcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEeecCCccc Confidence 1100000 000000000001111111344889999998888889999999998887532 334444 444567 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) .....|..++.+ ..++-+++++...+. +.-..|.+-=-.++.+|+.+.+.++.++++++..|..++.-.. 
T Consensus 164 ~~v~E~~~~~~~-~~~~~~~v~l~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g-------- 233 (392) T protein:vir:10 164 AEITEMGEIPET-DNPKFSNVQYAVKDR-AGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIE-------- 233 (392) T ss_pred eeeccccccccc-ccccceeEEeeeeeE-EEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc-------- Confidence 777777665432 224556667766554 3444554422234678999999999999999999988763210 Q ss_pred cccccccccCcceeecccccccccchhhhHHHHHHHHHHHH-HHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLAR-AKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQA 229 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~-~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~ 229 (347) .+. .. ... .++.|.++. ..|+....+ +-..|++|..|..|.+- +-.++.|.- T Consensus 234 ---------~~~------~~-----~~~----~~d~i~~~~~~~l~~~~~~--~a~~vm~~~~~~~L~~l-kd~~G~~l~ 286 (392) T protein:vir:10 234 ---------KLT------KQ-----AIK----SLDDIKDVLNVKLDPAISP--NAILLTNQDGFNYLDKL-KDKDGKYIL 286 (392) T ss_pred ---------ccc------cc-----Ccc----CHHHHHHHHHHhhhhhhcc--CCEEEEcHHHHHHHHHh-hccCCCeEe Confidence 000 00 001 134555544 355555443 34578999999999653 233444543 Q ss_pred ccccccceEEEEeceEEEE-e-cceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcc Q lcl|NC_015249. 230 LIDPSTGSIRNVMGFEVIE-V-PHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKD 307 (347) Q Consensus 230 ~~~~~~G~Vg~i~G~~V~~-s-n~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~ 307 (347) ...+.+|.-++++|.+++. + +++|...+.. .+ ...-+-+||+..+.+ +.... T Consensus 287 ~~~~~~~~~~tllG~~~v~~~~~~~~~~~~~~---------~~--------~~~~~~gdfs~~~~i---------~~~~~ 340 (392) T protein:vir:10 287 QSDPTQKNKKLFAGTNPVVVVSNRFLKSKGTT---------AK--------KAPLIIGDLKEAIVL---------FKRED 340 (392) T ss_pred ecCccCCccccccCcccEEEecccccCCCccc---------CC--------ceEEEEEehhceEEE---------Eeecc Confidence 3445667778899986554 3 3333211110 00 000122344443322 22233 Q ss_pred eeeeeee--chhhhcce--eeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
308 MALERAR--RANFQADQ--IIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 308 ~~~e~~~--d~~~~~d~--i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++++... +..+.-+. +++..++|..+++|++.+.+.+..+ T Consensus 341 ~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 384 (392) T protein:vir:10 341 MELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAVYGEIDLS 384 (392) T ss_pred eEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEeccc Confidence 4444432 33344443 7788889999999999999888666 No 137 >protein:vir:102082 Length: 392 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1503 # MgeName: Fah # Cross-refs: genbank:acc:YP_512315;genbank:gi:89152484;genbank:GeneID:3953075 Probab=99.04 E-value=1.2e-10 Score=75.04 Aligned_cols=284 Identities=11% Similarity=0.074 Sum_probs=155.6 Q ss_pred CCccccc--------ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEee-cCccee Q lcl|NC_015249. 1 MAKMNGG--------QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFPV-LGRTKA 70 (347) Q Consensus 1 ma~~~~~--------~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~~-iG~~~~ 70 (347) |.+..-. ...-.+.......++--.+.-++|.+++...-+..|.+++++++..+.++. ...+++ .+...+ T Consensus 84 l~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a 163 (392) T protein:vir:10 84 LRNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRVLEKNSDMIPF 163 (392) T ss_pred HhcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEeecCCccc Confidence 1100000 000000000001111111344889999998888889999999998887532 334444 444567 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) .....|..++.+ ..++-+++++...+. +.-..|.+-=-.++.+|+.+.+.++.++++++..|..++.-.. 
T Consensus 164 ~~v~E~~~~~~~-~~~~~~~v~l~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g-------- 233 (392) T protein:vir:10 164 AEITEMGEIPET-DNPKFSNVQYAVKDR-AGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIE-------- 233 (392) T ss_pred eeeccccccccc-ccccceeEEeeeeeE-EEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc-------- Confidence 777777665432 224556667766554 3444554422234678999999999999999999988763210 Q ss_pred cccccccccCcceeecccccccccchhhhHHHHHHHHHHHH-HHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLAR-AKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQA 229 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~-~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~ 229 (347) .+. .. ... .++.|.++. ..|+....+ +-..|++|..|..|.+- +-.++.|.- T Consensus 234 ---------~~~------~~-----~~~----~~d~i~~~~~~~l~~~~~~--~a~~vm~~~~~~~L~~l-kd~~G~~l~ 286 (392) T protein:vir:10 234 ---------KLT------KQ-----AIK----SLDDIKDVLNVKLDPAISP--NAILLTNQDGFNYLDKL-KDKDGKYIL 286 (392) T ss_pred ---------ccc------cc-----Ccc----CHHHHHHHHHHhhhhhhcc--CCEEEEcHHHHHHHHHh-hccCCCeEe Confidence 000 00 001 134555544 355555443 34578999999999653 233444543 Q ss_pred ccccccceEEEEeceEEEE-e-cceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcc Q lcl|NC_015249. 230 LIDPSTGSIRNVMGFEVIE-V-PHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKD 307 (347) Q Consensus 230 ~~~~~~G~Vg~i~G~~V~~-s-n~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~ 307 (347) ...+.+|.-++++|.+++. + +++|...+.. .+ ...-+-+||+..+.+ +.... T Consensus 287 ~~~~~~~~~~tllG~~~v~~~~~~~~~~~~~~---------~~--------~~~~~~gdfs~~~~i---------~~~~~ 340 (392) T protein:vir:10 287 QSDPTQKNKKLFAGTNPVVVVSNRFLKSKGTT---------AK--------KAPLIIGDLKEAIVL---------FKRED 340 (392) T ss_pred ecCccCCccccccCcccEEEecccccCCCccc---------CC--------ceEEEEEehhceEEE---------Eeecc Confidence 3445667778899986554 3 3333211110 00 000122344443322 22233 Q ss_pred eeeeeee--chhhhcce--eeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
308 MALERAR--RANFQADQ--IIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 308 ~~~e~~~--d~~~~~d~--i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++++... +..+.-+. +++..++|..+++|++.+.+.+..+ T Consensus 341 ~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 384 (392) T protein:vir:10 341 MELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAVYGEIDLS 384 (392) T ss_pred eEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEeccc Confidence 4444432 33344443 7788889999999999999888666 No 138 >protein:vir:102873 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1492 # MgeName: Cherry # Cross-refs: genbank:acc:YP_338137;genbank:gi:77020198;genbank:GeneID:3703782 Probab=99.04 E-value=1.2e-10 Score=75.04 Aligned_cols=284 Identities=11% Similarity=0.074 Sum_probs=155.6 Q ss_pred CCccccc--------ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEee-cCccee Q lcl|NC_015249. 1 MAKMNGG--------QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFPV-LGRTKA 70 (347) Q Consensus 1 ma~~~~~--------~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~~-iG~~~~ 70 (347) |.+..-. ...-.+.......++--.+.-++|.+++...-+..|.+++++++..+.++. ...+++ .+...+ T Consensus 84 l~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a 163 (392) T protein:vir:10 84 LRNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRVLEKNSDMIPF 163 (392) T ss_pred HhcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEeecCCccc Confidence 1100000 000000000001111111344889999998888889999999998887532 334444 444567 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) .....|..++.+ ..++-+++++...+. +.-..|.+-=-.++.+|+.+.+.++.++++++..|..++.-.. 
T Consensus 164 ~~v~E~~~~~~~-~~~~~~~v~l~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g-------- 233 (392) T protein:vir:10 164 AEITEMGEIPET-DNPKFSNVQYAVKDR-AGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIE-------- 233 (392) T ss_pred eeeccccccccc-ccccceeEEeeeeeE-EEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc-------- Confidence 777777665432 224556667766554 3444554422234678999999999999999999988763210 Q ss_pred cccccccccCcceeecccccccccchhhhHHHHHHHHHHHH-HHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLAR-AKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQA 229 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~-~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~ 229 (347) .+. .. ... .++.|.++. ..|+....+ +-..|++|..|..|.+- +-.++.|.- T Consensus 234 ---------~~~------~~-----~~~----~~d~i~~~~~~~l~~~~~~--~a~~vm~~~~~~~L~~l-kd~~G~~l~ 286 (392) T protein:vir:10 234 ---------KLT------KQ-----AIK----SLDDIKDVLNVKLDPAISP--NAILLTNQDGFNYLDKL-KDKDGKYIL 286 (392) T ss_pred ---------ccc------cc-----Ccc----CHHHHHHHHHHhhhhhhcc--CCEEEEcHHHHHHHHHh-hccCCCeEe Confidence 000 00 001 134555544 355555443 34578999999999653 233444543 Q ss_pred ccccccceEEEEeceEEEE-e-cceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcc Q lcl|NC_015249. 230 LIDPSTGSIRNVMGFEVIE-V-PHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKD 307 (347) Q Consensus 230 ~~~~~~G~Vg~i~G~~V~~-s-n~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~ 307 (347) ...+.+|.-++++|.+++. + +++|...+.. .+ ...-+-+||+..+.+ +.... T Consensus 287 ~~~~~~~~~~tllG~~~v~~~~~~~~~~~~~~---------~~--------~~~~~~gdfs~~~~i---------~~~~~ 340 (392) T protein:vir:10 287 QSDPTQKNKKLFAGTNPVVVVSNRFLKSKGTT---------AK--------KAPLIIGDLKEAIVL---------FKRED 340 (392) T ss_pred ecCccCCccccccCcccEEEecccccCCCccc---------CC--------ceEEEEEehhceEEE---------Eeecc Confidence 3445667778899986554 3 3333211110 00 000122344443322 22233 Q ss_pred eeeeeee--chhhhcce--eeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
308 MALERAR--RANFQADQ--IIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 308 ~~~e~~~--d~~~~~d~--i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++++... +..+.-+. +++..++|..+++|++.+.+.+..+ T Consensus 341 ~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 384 (392) T protein:vir:10 341 MELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAVYGEIDLS 384 (392) T ss_pred eEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEeccc Confidence 4444432 33344443 7788889999999999999888666 No 139 >protein:vir:107593 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1491 # MgeName: Gamma # Cross-refs: genbank:acc:YP_338188;genbank:gi:77020144;genbank:GeneID:3703724 Probab=99.04 E-value=1.2e-10 Score=75.04 Aligned_cols=284 Identities=11% Similarity=0.074 Sum_probs=155.6 Q ss_pred CCccccc--------ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccc-eEEEee-cCccee Q lcl|NC_015249. 1 MAKMNGG--------QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGK-SAQFPV-LGRTKA 70 (347) Q Consensus 1 ma~~~~~--------~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~-tv~i~~-iG~~~~ 70 (347) |.+..-. ...-.+.......++--.+.-++|.+++...-+..|.+++++++..+.++. ...+++ .+...+ T Consensus 84 l~~~~~~~~~~~~~~~~~~~~~~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a 163 (392) T protein:vir:10 84 LRNKPLNAEEREFLEDDLEQRAMSGLTGEDGGLVIPQDIQTQINELARSFDALEQYVTVEPVRTRSGSRVLEKNSDMIPF 163 (392) T ss_pred HhcccccHHHHHHHhhhhhhhhccccccCCCceecchhHHHHHHHHHHhhhhhhhhceeeeccCCceeEEEEeecCCccc Confidence 1100000 000000000001111111344889999998888889999999998887532 334444 444567 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) .....|..++.+ ..++-+++++...+. +.-..|.+-=-.++.+|+.+.+.++.++++++..|..++.-.. 
T Consensus 164 ~~v~E~~~~~~~-~~~~~~~v~l~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g-------- 233 (392) T protein:vir:10 164 AEITEMGEIPET-DNPKFSNVQYAVKDR-AGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIE-------- 233 (392) T ss_pred eeeccccccccc-ccccceeEEeeeeeE-EEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc-------- Confidence 777777665432 224556667766554 3444554422234678999999999999999999988763210 Q ss_pred cccccccccCcceeecccccccccchhhhHHHHHHHHHHHH-HHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLAR-AKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQA 229 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~-~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~ 229 (347) .+. .. ... .++.|.++. ..|+....+ +-..|++|..|..|.+- +-.++.|.- T Consensus 234 ---------~~~------~~-----~~~----~~d~i~~~~~~~l~~~~~~--~a~~vm~~~~~~~L~~l-kd~~G~~l~ 286 (392) T protein:vir:10 234 ---------KLT------KQ-----AIK----SLDDIKDVLNVKLDPAISP--NAILLTNQDGFNYLDKL-KDKDGKYIL 286 (392) T ss_pred ---------ccc------cc-----Ccc----CHHHHHHHHHHhhhhhhcc--CCEEEEcHHHHHHHHHh-hccCCCeEe Confidence 000 00 001 134555544 355555443 34578999999999653 233444543 Q ss_pred ccccccceEEEEeceEEEE-e-cceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcc Q lcl|NC_015249. 230 LIDPSTGSIRNVMGFEVIE-V-PHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKD 307 (347) Q Consensus 230 ~~~~~~G~Vg~i~G~~V~~-s-n~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~ 307 (347) ...+.+|.-++++|.+++. + +++|...+.. .+ ...-+-+||+..+.+ +.... T Consensus 287 ~~~~~~~~~~tllG~~~v~~~~~~~~~~~~~~---------~~--------~~~~~~gdfs~~~~i---------~~~~~ 340 (392) T protein:vir:10 287 QSDPTQKNKKLFAGTNPVVVVSNRFLKSKGTT---------AK--------KAPLIIGDLKEAIVL---------FKRED 340 (392) T ss_pred ecCccCCccccccCcccEEEecccccCCCccc---------CC--------ceEEEEEehhceEEE---------Eeecc Confidence 3445667778899986554 3 3333211110 00 000122344443322 22233 Q ss_pred eeeeeee--chhhhcce--eeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
308 MALERAR--RANFQADQ--IIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 308 ~~~e~~~--d~~~~~d~--i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++++... +..+.-+. +++..++|..+++|++.+.+.+..+ T Consensus 341 ~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 384 (392) T protein:vir:10 341 MELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAVYGEIDLS 384 (392) T ss_pred eEEEEeccccchhhcCceEEEEEEeeccEEecccceEEEEeccc Confidence 4444432 33344443 7788889999999999999888666 No 140 >protein:vir:6212 Length: 434 # NCBI annotation: prohead protease # Family: family:all:21 # MgeID: mge:128 # MgeName: phBC6A52 # Cross-refs: genbank:acc:NP_852592;genbank:gi:31415852;genbank:GeneID:1489210 Probab=99.04 E-value=1.1e-10 Score=75.19 Aligned_cols=288 Identities=8% Similarity=0.035 Sum_probs=150.5 Q ss_pred CCcccccccccccccc-cccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-Ccceeeee---ec Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGK-GMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYL---QP 75 (347) Q Consensus 1 ma~~~~~~~~~t~~~~-~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~---~~ 75 (347) |.... ...+.-+ +..+++-=.|.-+.|..++.+.....+.++.+.++.... | .+.+|+. +...+... .. T Consensus 131 l~~~~----~~~e~~a~~~~t~~GG~lvP~~~~~~Ii~~l~~~~~i~~~~~~~~~~-~-~~~~p~~~~~~~a~~~~~~~e 204 (434) T protein:vir:62 131 IVGNI----DEKEARALGLVTGNGSVTIPDFLSKEIITYAQEENFLRRLGTGVKTK-E-NIKYPVLVKKAEAQGHKNERT 204 (434) T ss_pred hcccc----chhhhhhhcccccccceecchhhHHHHHHhhhhhhhhhhhcceeccC-C-ceEEEEEecCCcccceecccc Confidence 11000 0000000 011111112344899999999998889988888775543 3 3667764 23333222 22 Q ss_pred CCCCCCccCCCCCceEEEEEEeeeecc-cccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 76 GENLDDKRKDMKHTERTINIDGLLTAD-VLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 76 g~~~~~~~~~~~~~~~~l~ID~~~~~~-~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) |...+. .+++-.++++.+-+ +.. +.|.+-=-.++.+|+.+.+.++.+++|++..|+.++.-- .++.. 
T Consensus 205 ~~~~~~--~~~~f~~v~~~~~k--~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~d~~~l~G~--------G~~~~ 272 (434) T protein:vir:62 205 NNEMPE--TDIEFDEIELSPTE--FDALATVTKKLLARTGLPIEQIVMDELKKAYVRKETQYMVNGD--------EANNI 272 (434) T ss_pred cccccc--cccceeeEEeehee--eEeehhhHHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhccC--------CCCcc Confidence 333222 23334445554433 333 233222222456899999999999999999999886310 01111 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc--ccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQA--LID 232 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~--~~~ 232 (347) + ++.....+.. .. ......++.|+++...|+....+. ..| |++|..|..|.+- +-.++.|.- ... T Consensus 273 ~----~g~~~~~~~~--~~----~~~~~~~d~l~~l~~~l~~~~~~~-a~~-v~n~~~~~~L~~l-kd~~G~~l~~~~~~ 339 (434) T protein:vir:62 273 N----DGALAKKAVE--FK----TDEKNLYDALVKMKNTPVKEVRKK-ARW-VLNTAALTKIETM-KTDDGFPLLRPFNQ 339 (434) T ss_pred c----cceeeccccc--cc----ccccchhhHHHHHHhhcchhhhcC-CEE-EEcHHHHHHHHHh-hccCCCEeeccCCC Confidence 1 1111111111 11 112235788888888887765542 344 7899999988542 223444432 223 Q ss_pred cccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeee Q lcl|NC_015249. 233 PSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALER 312 (347) Q Consensus 233 ~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~ 312 (347) ...|.-..++|.+|+.++.+|....++.. .-|-+||+... ++-+.. .+.+++ T Consensus 340 ~~~g~~~tl~G~pV~~~~~~~~~~~~~~~-------------------~i~~Gdfs~~~--i~~~~g-------~~~i~~ 391 (434) T protein:vir:62 340 AEGGIGYTLLGFPVEEEDAIDIPDSPDTP-------------------VFYFGDFSKFY--IQDVIG-------SLEVQK 391 (434) T ss_pred ccCCCCceecceeeEEecCccCccCCCce-------------------EEEEeeccceE--EEEeec-------eeEEEe Confidence 44566678999999999999854332111 01234555442 222211 123444 Q ss_pred eechhhhcc--eeeeeeeeccccc-ccceEEEEEEc--CC Q lcl|NC_015249. 
313 ARRANFQAD--QIIAKYAMGHGGL-RPEACGALVFN--KA 347 (347) Q Consensus 313 ~~d~~~~~d--~i~~~~a~G~~~~-Rpe~a~~i~~~--~a 347 (347) ..+.-+.-+ .+++..++.+.++ +|+...++.+. .| T Consensus 392 ~~~~~~~~~~v~~~~~~r~Dgk~i~~~~~~~~~~~~~~~~ 431 (434) T protein:vir:62 392 LVELFSRTNRVGFRIWNLLDAQLIHSPFEVPVYKYVLKAP 431 (434) T ss_pred ehhhhcccCceEEEEEeeecceeecCcccceEEEEEeccC Confidence 433322223 3678888888765 58877766333 33 No 141 >protein:vir:1084 Length: 437 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:21 # MgeName: bIL309 # Cross-refs: genbank:acc:NP_076738;genbank:gi:13095848;genbank:GeneID:920418 Probab=99.03 E-value=5.7e-11 Score=76.73 Aligned_cols=280 Identities=9% Similarity=0.012 Sum_probs=145.0 Q ss_pred CCc----ccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--Ccceeeeee Q lcl|NC_015249. 1 MAK----MNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKAAYLQ 74 (347) Q Consensus 1 ma~----~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~~~~~ 74 (347) +.. ...+. .+..+....++.-.+.-+.+...+... ...+.++...++....++ +..+|.. +........ T Consensus 141 ~~~~~~~~~~~e---~~~~~~~~~~~~g~lvp~~~~~~i~~~-~~~~~l~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~ 215 (437) T protein:vir:10 141 VTAFADYLKTGE---VRDVTGIALKDGKVIIPETILTPEKEV-HQFPRLGSLVRTESVTTT-TGKLPIFNNSTDLLTAHT 215 (437) T ss_pred hhhhHHHHHhhh---hhhhhhcccccccccchHHHHHHHHHh-hhhhhhhhcceeEeeccC-ceeeEEeecccccccccc Confidence 000 00000 011111112222223447777777654 344566667776665543 3445543 333445555 Q ss_pred cCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 75 PGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 75 ~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) .+...+.. ..+.-.++++.+.+. +.-+.|.+-=-..+.+|+.+.+.++.+++|++..|..|+.-.. 
T Consensus 216 e~~~~~e~-~~~~~~~v~~~~~k~-~~~~~is~ell~ds~~~~~~~i~~~l~~~~~~~~~~~i~~g~g------------ 281 (437) T protein:vir:10 216 EYGQTTKN-ATPVITPILWDLKTY-TGGYVFSQELISDSSYDWQAELQSRLIELRDNTDDSLIITALT------------ 281 (437) T ss_pred cccccccc-ccccceeeeeehhhe-eeehhhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhhhc------------ Confidence 55544321 223334555544433 2223333221224567899999999999999999988764221 Q ss_pred cccccCcceeecccccccccchhhhHHHHHHHHHHHH-HHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccccc Q lcl|NC_015249. 155 IAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLAR-AKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDP 233 (347) Q Consensus 155 ~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~-~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~ 233 (347) ++.. .+.++. .++.|.++. ..|+....+ +-..|++|..|..|.+- +-.++.|.-...+ T Consensus 282 -----~~~~--~~~~~~-----------~~~~~~~~~~~~l~~~~~~--~~~~~~~~~~~~~l~~l-kd~~g~~~~~~~~ 340 (437) T protein:vir:10 282 -----DGIK--KTTSTY-----------LLGDLKKVLNVTLKPQDSA--AASIVMSQSAYNLFDMA-TDAMGRPLLQPNV 340 (437) T ss_pred -----cccc--cccccc-----------chhhHHHHHHhhhhhhhhc--CCEEEEcHHHHHHHHHh-hccCCCeeeccCc Confidence 1110 000010 122333332 245554433 33569999999988653 2334455444456 Q ss_pred ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee Q lcl|NC_015249. 234 STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA 313 (347) Q Consensus 234 ~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~ 313 (347) .+|.-++++|.+|+.+++.+..+.+. +. ..-+-+||++.+.+ |.+. +++++.. T Consensus 341 ~~~~~~~l~G~pv~~~~~~~~~~~~~----------~~--------~~~~~gd~~~~~~~-~~r~--------~~~~~~~ 393 (437) T protein:vir:10 341 TAATGYTLLGKTVVIVDDKLFPSASA----------GD--------VNIVVAPLKKAVIN-FKLT--------EITGQFQ 393 (437) T ss_pred cCCCCcccccceeEEecccccCCcCC----------Cc--------eEEEEeeccccEEE-Eeee--------ceEEEEe Confidence 67777899999999987653322211 00 00133466655433 3222 3455544 Q ss_pred echhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
314 RRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 314 ~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+-..+...+...++|+.++++|++.+.|..+.. T Consensus 394 ~~~~~~~~~~~~~~r~d~~~~~~~a~~~l~~~~~ 427 (437) T protein:vir:10 394 DTYDIWYKQLGIFLRQNVVQASKDLIVNLTGKLK 427 (437) T ss_pred cccccccceeeEEEEEccEEecccceEEEEeecc Confidence 3344455677777899999999999887764432 No 142 >protein:vir:4092 Length: 390 # NCBI annotation: major capsid protein a # Family: family:all:635 # MgeID: mge:86 # MgeName: 2389 # Cross-refs: genbank:acc:NP_510986;swissprot:trembl:q8w604;genbank:gi:17488508;uniprot:Q8W604;genbank:GeneID:1260361 Probab=99.01 E-value=4.2e-10 Score=72.00 Aligned_cols=290 Identities=12% Similarity=0.040 Sum_probs=151.6 Q ss_pred CCc-cc-----ccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEee-cCcceeeee Q lcl|NC_015249. 1 MAK-MN-----GGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPV-LGRTKAAYL 73 (347) Q Consensus 1 ma~-~~-----~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~-iG~~~~~~~ 73 (347) +.. +. ...+. +. .++.++--.|.-+.|..++.+...+.|.++.++++..+.+| ...||+ .+...+... T Consensus 68 ~~~~l~~~~r~~~~~~--~~--~~~~~~gg~lvP~~~~~~I~~~~~~~s~i~~~~~~~~~~~~-~~~i~~~~~~~~a~~~ 142 (390) T protein:vir:40 68 GANALTSDESKYYNEV--IA--GNGFAGVTALLPPTVFERVFEDLTVEHPLLSKINFVNTTAT-TEWIISVGDVATAWWG 142 (390) T ss_pred CchhccHHHHHHHHHH--Hh--ccCcccCcccccHHHHHHHHHHHHhhhhhhhhceeeecCCc-eeEEEEEcCCcceeee Confidence 000 00 00000 00 01111112245599999999999999999999988886554 455665 455566666 Q ss_pred ecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 74 QPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 74 ~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) ..+..++.. .+++-++++|..-++ +.-+.|.+-=-.++.+|+.+.+.++.++++++..|+.++.-- + .. 
T Consensus 143 ~E~~~~~~~-~~~~f~~i~l~~~k~-~~~i~iS~ell~ds~~~l~~~i~~~la~~i~~~~~~a~l~G~--G-------~~ 211 (390) T protein:vir:40 143 PLCAEIKEV-LDNGFDKIQTGMYKL-SAYIPVCNAMLDLGPSWLDQYVRTILGEAMALGLEAGIVNGS--G-------KD 211 (390) T ss_pred ccccccCcc-ccccceeeEeeeeeE-EEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhhccc--C-------CC Confidence 655554322 235556666666554 333455443334577889999999999999999999886311 0 00 Q ss_pred ccccccCc-ceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCC-CCCEEEeCHHHHHHHhcchhhhhhhhcccc Q lcl|NC_015249. 154 NIAGLGKA-HVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPS-ADRVFYTTPDNYSAILAALMPNAANYQALI 231 (347) Q Consensus 154 ~~~~~~~g-~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~-~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~ 231 (347) .+.|.-.. .....+.... .+..........+.+..+...+.....+. ..-+.+++|..+..+++..+.. .+-.| T Consensus 212 ~P~Gil~~~~~~~~~~~~~-~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~a~~i~n~~t~~~~l~~~~~~-~d~~G-- 287 (390) T protein:vir:40 212 QPIGMMRDLNNVTAGEHPV-KTATPLTDLTPATLATKVMLPLTDNGKKSVSDAILVINPADYWSKIYAATSY-MTPQG-- 287 (390) T ss_pred ccceeeecccccccccccc-ccccccchhhHHHHHHHHHHHhhcchhhhhcCceEEEcchhHHHHHHHHhhc-cCCCC-- Confidence 11111000 0000000001 11111122223333333344443322221 2344678887655444432211 11111 Q ss_pred ccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeee Q lcl|NC_015249. 232 DPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALE 311 (347) Q Consensus 232 ~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e 311 (347) .+..+. ...|.+|+.|+++|... -+-+||+... ++.+. +++++ T Consensus 288 ~~v~~~--~~~g~pvv~~~~~p~~~-------------------------i~~Gd~s~~~--i~~~~--------~~~v~ 330 (390) T protein:vir:40 288 VWVTGI--LPVPLEIVQSVAVPVGK-------------------------AVAGRAKDYF--MGIGS--------EQVIR 330 (390) T ss_pred cccccc--CCCceeEEEcCCCCCCc-------------------------EEEEeeceEE--EEeec--------ceEEE Confidence 111111 24699999999998421 1123555432 22222 45555 Q ss_pred eeechh--hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
312 RARRAN--FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 312 ~~~d~~--~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ...+.. +-...+++.++++..+.+|++.+.+.++.+ T Consensus 331 ~~~~~~f~~~~~~~r~~~r~dg~v~~~~A~~~l~~~~~ 368 (390) T protein:vir:40 331 TSTEYRLLDDETLYYAKQYANGRPKDNSSFLVFDITGL 368 (390) T ss_pred ecchhhhhcCcEEEEEEEEeCCEEecccceEEEEeecc Confidence 554332 233568899999999999999999998887 No 143 >protein:vir:105038 Length: 428 # NCBI annotation: major capsid head protein precursor # Family: family:all:21 # MgeID: mge:1465 # MgeName: phiKO2 # Cross-refs: genbank:acc:YP_006586;genbank:gi:46402092;genbank:GeneID:2777903 Probab=99.00 E-value=5.5e-10 Score=71.33 Aligned_cols=299 Identities=13% Similarity=0.103 Sum_probs=143.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcc-cccccccccceEEEeec-CcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNK-HLVRSIQSGKSAQFPVL-GRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~-~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~~ 78 (347) |+.-........+.-.. .++.--.+.-+.+..++.+.....++++.+ +++-+..+| .+.||+. +..++.....|+. T Consensus 113 ~~~~~~~~~~~~~~~~~-~~~~gg~liP~~~~~~ii~~l~~~~~l~~~~~~~~~~~~g-~~~~p~~~~~~~a~~v~Eg~~ 190 (428) T protein:vir:10 113 FASDELNDQSVSMAIST-AAGSGGVLIPQNIHSEVIELLRDRTIVRKLGARSIPLPNG-NMSLPRLAGGATASYTGENQD 190 (428) T ss_pred HhhhhhhhhhHhhhhcc-cccCCccccchhHHHHHHHHHhhhchhhhhcceeeecCCc-ceEEEEEeCCcceeeeccCcc Confidence 11111110000010000 000001123377788887777777887776 333222333 3778875 4556777777777 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++. .++.-+++++...++ +.-+.|.+-=-.++.+++.+.+.++.+++|++..|+.++.- +.++..+.|. 
T Consensus 191 ~~~--~~~~f~~i~~~~~k~-~~~v~is~ell~ds~~~l~~~i~~~l~~ai~~~~d~~~l~G--------~G~~~~p~Gi 259 (428) T protein:vir:10 191 AKV--SEARFDDVKLTAKTM-IAMVPISNALIGRAGFNVEQLVLQDILTAISVREDKAFMRD--------DGTGDTPIGM 259 (428) T ss_pred ccc--cccceeeEEeeeEEE-EEeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhcc--------CCCCcccccc Confidence 654 345556666666444 23344543323356789999999999999999999988631 0111122221 Q ss_pred cCcceeecccccccccchhhhHHHHHHHHHHHHHHh-hhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccce Q lcl|NC_015249. 159 GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKL-TGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGS 237 (347) Q Consensus 159 ~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~L-de~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~ 237 (347) -........ .............. ++..+++...+ ...+.....-..+++|..|..|.+-. -.++.|.-. ....| T Consensus 260 ~~~~~~~~~-~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~lk-d~~G~~i~~-~~~~g- 334 (428) T protein:vir:10 260 KARATQWNR-LLPWAADAAVNLDT-IDTYLDSIILMSMDGNSNMISSGWGMSNRTYMKLFGLR-DGNGNKVYP-EMAQG- 334 (428) T ss_pred ccccccccc-cccccccccccHHH-HHHHHHHHHHhhhccccccccCEEEEcHHHHHHHHHhh-ccCCceecc-CCCCC- Confidence 110000000 00000011111111 11122211111 11111112234566999998886532 233333211 12233 Q ss_pred EEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechh Q lcl|NC_015249. 238 IRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRAN 317 (347) Q Consensus 238 Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~ 317 (347) .++|.+|+.++++|...+.. .+ ...-|-+||+..+ + ....+++++..++.. T Consensus 335 --~l~G~pv~~~~~~p~~~~~~------~~-----------~~~i~~gd~s~~~-i---------~~~~~i~i~~~~~~~ 385 (428) T protein:vir:10 335 --MLKGYPIQRTSAIPANLGEG------GK-----------ESEIYFADFNDVV-I---------GEDGNMKVDFSKEAS 385 (428) T ss_pred --eeeceeeEEeccccccccCC------Cc-----------cceEEEEecceEE-E---------EEecceEEEeecccc Confidence 69999999999999643221 00 0011234444332 1 112234444444422 Q ss_pred -----------hh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
318 -----------FQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 318 -----------~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++ .-.|++..+++..+.||++. ++.+... T Consensus 386 ~~~~~~~~~~~f~~~~~~~R~~~r~d~~v~~p~a~-~~~t~~~ 427 (428) T protein:vir:10 386 YIDTDGKLVSAFSRNQSLIRVVTEHDIGFRHPEGL-VLGTGVL 427 (428) T ss_pred cccccccccchhhcchhheeeeeeeCceeeccceE-EEEeccC Confidence 22 23567888899999999954 4555555 No 144 >protein:vir:93616 Length: 645 # NCBI annotation: putative major head protein/prohead protease # Family: family:all:21 # MgeID: mge:157 # MgeName: phi 4795 # Cross-refs: genbank:acc:YP_001449293;genbank:gi:157166041;goa:Q6H9U8;interpro:IPR006433;uniprot:Q6H9U8;genbank:GeneID:5580438 Probab=98.94 E-value=1.1e-09 Score=69.65 Aligned_cols=287 Identities=11% Similarity=0.037 Sum_probs=144.0 Q ss_pred CCcccccc--------------cccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccc--ccccc-cceEEEe Q lcl|NC_015249. 1 MAKMNGGQ--------------QIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLV--RSIQS-GKSAQFP 63 (347) Q Consensus 1 ma~~~~~~--------------~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~--r~i~~-G~tv~i~ 63 (347) +|...++. ...+.+++ +|. + +.-++|.+++.+.-...|+++.+-.. ....+ -..+.|| T Consensus 315 ~a~~~~~~~~~~~~~~~~a~~~~~~~~~~~---~Gg-~-~vp~~~~~~ii~~l~~~svv~~l~~~~~~~~~~~~~~~~ip 389 (645) T protein:vir:93 315 VARRQYPDDSRLHHVLKSAVGAGTTTDPQW---AGS-L-SEYQEYAQDFIDYLRPQTIIGRFGQGGIPALRQVPFNIRVH 389 (645) T ss_pred HHHhhcccchhhhhhhhhhhhccccccccc---cCC-c-cCchhhHHHHHHhhhhhhhHHhhccccccccccccCceeee Confidence 10000000 00011111 111 1 23478888888887777887655332 22221 1246777 Q ss_pred e-cCcceeeeeecCCCCCCccCCCCCceEEEEEEeeeecc-cccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 64 V-LGRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTAD-VLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEM 141 (347) Q Consensus 64 ~-iG~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~-~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~ 141 (347) + .+..++.....|+.++.+ +++-+++++.. .++.. 
..|.+-=-.++.+|+.+.+.++.+++|++..|+.+|..- T Consensus 390 ~~t~~~~a~wv~Eg~~~~~s--~~~f~~v~l~~--~kla~~~~iS~ell~ds~~~~~~~i~~~l~~aia~~~d~a~l~g~ 465 (645) T protein:vir:93 390 AQVSGGAAGWVGEGKTKPLT--KFDFESITFSH--AKVSAIAVLTEELIRFSSPAADALVRNALAEAVVARLDTDFVDPK 465 (645) T ss_pred eeecCcceEEeccCcccccc--ccceeEEEEee--EEEEEeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHhhcCC Confidence 6 466777777778777654 34555655554 33333 333221123567889999999999999999999886311 Q ss_pred HHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchh Q lcl|NC_015249. 142 AKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALM 221 (347) Q Consensus 142 ~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~ 221 (347) . +...... +.+.. .+... ..+ ....+..+..+...|..+++...+-+.|++|..+..|.+-.. T Consensus 466 g-----~~~~~~~----p~gi~--~~~~~-~~~-----~~~~~~d~~~~~~~~~~a~~~~~~a~~vmn~~~~~~L~~lkd 528 (645) T protein:vir:93 466 K-----AAVADVS----PASIT--HDVKG-TAS-----SGNPDADAEAAFGQFVAANLQPTGAVWLMSSTNALALSMRKN 528 (645) T ss_pred C-----cccCCcc----cccee--ccccc-ccc-----ccchHHHHHHHHHHHHhcCCCccccEEEEcHHHHHHHHhccc Confidence 0 0001111 11111 00000 000 011234566677778888876666677889999999876432 Q ss_pred hhhhhhcc-ccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhh Q lcl|NC_015249. 222 PNAANYQA-LIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAV 300 (347) Q Consensus 222 ~~~~~~~~-~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av 300 (347) - +..+.- ...... ++++|.+|+.|+++|..- . + .+.+..|-+++. 
.+-+-+++++ T Consensus 529 ~-~G~~~~~~~~~~~---~tL~G~PV~~s~~vp~~~------~-----------~-gd~s~~~ig~~~-~v~i~~s~~a- 584 (645) T protein:vir:93 529 A-LGQKEYPDMTLLG---GSFQGLPVIVSQYVGDQL------V-----------L-VNAPDIYLADDG-GVAVDMSREA- 584 (645) T ss_pred c-CCceeecCCCCCC---ceeeceeeEEeccCCcce------e-----------E-eccccEEEEEec-ceEEEeecce- Confidence 2 222221 111222 479999999999998310 0 0 001111111111 1111222221 Q ss_pred hhhhhcceeeeeeec--------------hhhhcce--eeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 301 GTVKLKDMALERARR--------------ANFQADQ--IIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 301 ~~v~~~~~~~e~~~d--------------~~~~~d~--i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++..-+ ..++.|+ |++.++++..++||++.+.|. .+ T Consensus 585 --------~~~~~~~~~~~~~~~~~~~~v~lf~~d~vaira~~r~d~~~~~p~a~~~lt--~~ 637 (645) T protein:vir:93 585 --------SLEMQSEPTGDSTTPSPVELVSMFQTGSVAIRAERWINWRRRRTAAVAVIT--GV 637 (645) T ss_pred --------eEEEeecccccccccccccchhHhhcCceEEEEEEEEcceeeCccceEEEe--cc Confidence 1111111 1144444 566677899999999866433 33 No 145 >protein:vir:9361 Length: 402 # NCBI annotation: SLT orf 37-like protein # Family: family:all:658 # MgeID: mge:166 # MgeName: phi 12 # Cross-refs: genbank:acc:NP_803339;genbank:gi:29028650;genbank:GeneID:1258088 Probab=98.90 E-value=2.4e-10 Score=73.32 Aligned_cols=271 Identities=12% Similarity=0.066 Sum_probs=148.2 Q ss_pred CCccccc----------ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--Ccc Q lcl|NC_015249. 1 MAKMNGG----------QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRT 68 (347) Q Consensus 1 ma~~~~~----------~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~ 68 (347) |...... ...++ |-..++| .+.-+.|..++.+.....+.++.++++.++.+ .++|++ +.. 
T Consensus 114 ~~~~~~~~~~~~~~~~~~a~~~--~t~~~GG---~lIP~~~~~~Ii~~~~~~~~l~~~~~v~~~~~---~~~p~~~~~~~ 185 (402) T protein:vir:93 114 ILPNEFEKPSMEAQRLLHALPT--GNDSGGD---KLLPKTLSKEIVSEPFAKNQLREKARLTNIKG---LEIPRVSYTLD 185 (402) T ss_pred HhhhhHHHHHHhHHHHHhhhcc--CCCcCCc---cccchhHHHHHHHhHHhhhhhhhhceeeecCC---ceeeeeeccCC Confidence 1000000 00000 0011111 24458899999999988899999988877643 334543 334 Q ss_pred eeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc Q lcl|NC_015249. 69 KAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLP 148 (347) Q Consensus 69 ~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~ 148 (347) ++.....|+..+.+ +++-.++++.+.++ +.-+.|.+-=-..+.+|+.+.+.++.++++++..++.++..- T Consensus 186 ~a~~v~Eg~~~~~~--~~~f~~i~~~~~k~-~~~i~iS~ell~Ds~~~l~~~i~~~la~~~~~~e~~~~~~~g------- 255 (402) T protein:vir:93 186 DDDFITDVETAKEL--KAKGDTVKFTTNKF-KVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVS------- 255 (402) T ss_pred cccccccccccccc--ccccceeeecceee-eeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcC------- Confidence 55666666665543 45556666655444 233344322223357889999999999999987666554211 Q ss_pred cccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhc Q lcl|NC_015249. 149 SASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQ 228 (347) Q Consensus 149 ~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~ 228 (347) ...+.+.|.....+.. ...+...++.|+++...|+..... ...|+ +++..|..|++-.+-.++. 
T Consensus 256 -----~g~g~p~g~~~~~~~~-------~~~~~~~~d~l~~~~~~l~~~y~~-na~~i-mn~~t~~~~~~~~~d~~~~-- 319 (402) T protein:vir:93 256 -----PKSGLEHMSFYNGSVK-------EVEGADMYDAIINALADLHEDYRD-NATIY-MRYADYVKIISVLSNGTTN-- 319 (402) T ss_pred -----CCccccceeeeccccc-------cccccchHHHHHHHHhccChhhhc-CCEEE-EechHHHHHHHHHhcCCCc-- Confidence 1122222222111100 111233578888888888877654 45664 5555444443322211222 Q ss_pred cccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcce Q lcl|NC_015249. 229 ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDM 308 (347) Q Consensus 229 ~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~ 308 (347) +..|.-.++.|.+|+.++..|.. +-+||..... .+.+ + T Consensus 320 ----~~~~~~~~llG~PV~~t~~~~~i---------------------------~~GDf~~~~~-~~~~----------~ 357 (402) T protein:vir:93 320 ----FFDTPAEKVFGKPVVFTDAAVKP---------------------------IVGDFNYFGI-NYDG----------T 357 (402) T ss_pred ----ccccCCccccccceEEecCCCce---------------------------eeechhhhhh-hhhh----------h Confidence 22333457999999998754310 1123333221 1111 2 Q ss_pred eeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
309 ALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 309 ~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++.++++..-.-.+++..+++.++.+|++...+.++.+ T Consensus 358 ~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~ik~~ 396 (402) T protein:vir:93 358 TYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKEN 396 (402) T ss_pred hhhhhhcccCCceEEEEEEEeCcEEechhheEEEEeecC Confidence 233444444444567888899999999999999999888 No 146 >protein:vir:96762 Length: 632 # NCBI annotation: putative phage-related protein # Family: family:all:21 # MgeID: mge:1628 # MgeName: VP882 # Cross-refs: genbank:acc:YP_001039818;genbank:gi:126010917;genbank:GeneID:5076272 Probab=98.89 E-value=2.9e-10 Score=72.91 Aligned_cols=285 Identities=14% Similarity=0.075 Sum_probs=149.1 Q ss_pred CCcccccc---------cccccccccccccchhhhhhhh-hhhHHHHHHHHHHhhhcc-cccccccccceEEEeec-Ccc Q lcl|NC_015249. 1 MAKMNGGQ---------QIGKDQGKGMSAGDKLALFLKV-FGGEVLTAFTRTSVTMNK-HLVRSIQSGKSAQFPVL-GRT 68 (347) Q Consensus 1 ma~~~~~~---------~~~t~~~~~~~~~d~~al~ie~-f~g~V~~~f~~~s~~~~~-~~~r~i~~G~tv~i~~i-G~~ 68 (347) ++...|-. ....|....+..++--.|...+ +..++.+.....++++.+ ++..+...| .+.||+. +.. T Consensus 334 ~a~~~G~~arg~~~~~~~l~~ra~~~~t~~~gg~lvp~~~~~~~iie~lr~~s~i~~l~~~~~~~~~g-~~~ip~~~~~~ 412 (632) T protein:vir:96 334 IADASGKEARGFYMPHEVLVQRQLEKKTAGKGGELVATELLSEEFIDILRNKAIIGQMGARMLPGLVG-DVDIPKKTSGA 412 (632) T ss_pred HHHhhhhhhhhhhhhHHHHHHhhhhcccccccccccccccchHHHHHHHhhcchhhhhcceEeecCCc-ceEEEEEeCCc Confidence 00000000 0000110111111111133444 456666666666776665 333333334 5778875 666 Q ss_pred eeeeeecCCCCCCccCCCCCceEEEEEEeeeecc-cccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 69 KAAYLQPGENLDDKRKDMKHTERTINIDGLLTAD-VLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 69 ~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~-~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) ++.....|+.++.+ ++.-+++++.. .++.. 
+.|.+-=-.++.+|+.+.+.+++++++++..|+.+|.-- T Consensus 413 ~a~wv~E~~~~~~s--~~~f~~i~l~~--~k~~~~v~iS~ell~ds~~~~~~~i~~~l~~a~~~~~d~a~l~G~------ 482 (632) T protein:vir:96 413 NFYWIGEDEDVQDS--DFDFTTLSFSP--KTIAGAVPVTRKLRKQSSIHVENLIREDLIEGIGVALDLAMLTGT------ 482 (632) T ss_pred eeEeecCCcccccc--ccceeeEEeee--eEEEEehhhHHHHHhccchHHHHHHHHHHHHHHHHHHHHHhhccc------ Confidence 77777777776543 45555555555 33343 333221123567899999999999999999999876310 Q ss_pred ccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcch-hhhhhh Q lcl|NC_015249. 148 PSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAAL-MPNAAN 226 (347) Q Consensus 148 ~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~-~~~~~~ 226 (347) . ....+ .|-....+..+..... ....++.|+++...+...++....-..+++|..+..|.... +-.++. T Consensus 483 -G-~~~~p----~Gi~~~~~~~~~~~~~----~~~~~~~i~~~~~~i~~~~~~~~~~~~~~~~~~~~~l~~~~l~d~~G~ 552 (632) T protein:vir:96 483 -G-LANDP----VGLLNMTGVPALTYPA----GGVDWASVVDMETKISTFNADAGRLAYLTSVTQRGAAKKAQVFDNTGE 552 (632) T ss_pred -C-CCCcc----ceeeecccccceeccc----ccCCHHHHHHHHHHHhhcccccCccEEEEchhHHHHHHHHhccCCCCc Confidence 0 01111 1211111111000000 11125667888888888887666666788998887776432 112222 Q ss_pred hccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhc Q lcl|NC_015249. 227 YQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLK 306 (347) Q Consensus 227 ~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~ 306 (347) | .+..| .+.|.+|+.||++|... -+-+||+......|. 
.+ T Consensus 553 ~----i~~~~---~l~G~pv~~s~~ip~~~-------------------------~~~gd~s~~~i~~~~--------~~ 592 (632) T protein:vir:96 553 R----IWQNN---EVNGYRAEASNQIPADT-------------------------WIFGDWSQIVIAMWG--------VL 592 (632) T ss_pred e----eecCC---eecccceEeccccccCc-------------------------EEEeecceEEEEEec--------ce Confidence 3 23344 68999999999998432 012344443211121 11 Q ss_pred ceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcC Q lcl|NC_015249. 307 DMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNK 346 (347) Q Consensus 307 ~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~ 346 (347) .+.+..+.....-.-.++..+.++.+++||++-+.+...+ T Consensus 593 ~i~~~~~~~~~~~~v~~~~~~~~d~~v~~~~af~~~k~~A 632 (632) T protein:vir:96 593 DLKVDPYTKAASDGLVLRVFQDVDAGVRRKEAFCIAKKGA 632 (632) T ss_pred EEEEccccccccCceEEEEEeecCceeechhhhhheeecC Confidence 1222222222334457888999999999999777665555 No 147 >protein:vir:8420 Length: 477 # NCBI annotation: gp15 # Family: family:all:21 # MgeID: mge:155 # MgeName: Omega # Cross-refs: genbank:acc:NP_818316;genbank:gi:29566752;genbank:GeneID:1260033 Probab=98.88 E-value=1.9e-09 Score=68.43 Aligned_cols=297 Identities=12% Similarity=0.093 Sum_probs=145.3 Q ss_pred CCcccccccccccccccccccchhhhhhhhh-hhHHHHHHHHHHhhhcccccccccc-cceEEEeecCcc--eeeeeecC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVF-GGEVLTAFTRTSVTMNKHLVRSIQS-GKSAQFPVLGRT--KAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f-~g~V~~~f~~~s~~~~~~~~r~i~~-G~tv~i~~iG~~--~~~~~~~g 76 (347) .....-.....+-.+. +| .+.+.+| .+++.+..+..++++.+++...+.+ +.++.||++-.. 
.......| T Consensus 148 ~~~~~~~~~~~~~~~~---gg---~lv~~~~~~~~ii~~l~~~~~i~~~~~~~~~~~~~~~~~ip~~~~~~~~a~~~~Eg 221 (477) T protein:vir:84 148 AKVGEEYRDLDRNGGT---GG---YAVPPLWMMNRFIELARAGRTYANLCPTEPLPGGTSSINIPKILTGTSTAIQAADN 221 (477) T ss_pred HHhhhhhccccccCCC---cc---eeeccchhHHHHHHHhhhcchHHHhhceeeecCCcceeEEEEEecCcceeeeeccC Confidence 0000000000111111 11 1344444 6778888877788888888888765 567999986332 23333334 Q ss_pred CCCCCcc---CCCCCceEEEEEEeeeeccc-ccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc Q lcl|NC_015249. 77 ENLDDKR---KDMKHTERTINIDGLLTADV-LIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASD 152 (347) Q Consensus 77 ~~~~~~~---~~~~~~~~~l~ID~~~~~~~-~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~ 152 (347) ..+.... .++.-.. ++++-.++..+ .|.+-=-.++.+|+.+.+.++++++|++..|+.+|.- . . +. T Consensus 222 ~~~~~~~~~~s~~~f~~--i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~~~l~G----~---G-t~ 291 (477) T protein:vir:84 222 AALTAPSAHEVDLTDGF--VQANVKTIAGQQGIAIQLLDQAAVSVDEFVFRDLAADYANKLNVQVISG----T---G-SN 291 (477) T ss_pred cccccccccccccceee--EEEeeeeEEeeeHHHHHHHhccchhHHHHHHHHHHHHHHHHHHHHHhcc----C---C-CC Confidence 3332111 1122233 34444444443 3332222345789999999999999999999987621 0 0 01 Q ss_pred cccccccCc-ceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhc--- Q lcl|NC_015249. 153 ENIAGLGKA-HVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQ--- 228 (347) Q Consensus 153 ~~~~~~~~g-~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~--- 228 (347) ..+.|.-.. .+..+.... ..........+++.|+++...++....- .....++.|..|..|.+-.. .+..|. 
T Consensus 292 ~~p~Gi~~~~~~~~~~~~~--~~~t~~~~~~~~~~i~~~~~~~~~~~~~-~~~~~v~~~~~~~~l~~lkd-~~G~~l~~~ 367 (477) T protein:vir:84 292 NQVVGVRATAGITQVTATS--AGSALEKHQIIYQKIADAIQRVHTSRFL-EPEVIVMHPRRWASFHAIFA-GDDRPLIVP 367 (477) T ss_pred Cccceeeeccccccccccc--cccchhhHHHHHHHHHHHHhhccccccC-CccEEEEcHHHHHHHHHhhc-cCCCeeeec Confidence 112221110 000000000 0011112234567777777666654432 23466889998888865321 122221 Q ss_pred ----------cccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechh Q lcl|NC_015249. 229 ----------ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRS 298 (347) Q Consensus 229 ----------~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~ 298 (347) ....+.+|..+++.|.+|+.|+.+|...+. +++ .+..+-++|... +++. . T Consensus 368 ~~~~~~~~~~~~~~~~~~~~~~l~G~pVv~s~~~p~~~~~------~~d-----------~~~i~~gd~~~~--~i~~-~ 427 (477) T protein:vir:84 368 SGPGFNNLGVLTEVASQRVVGQMHGLPVVTDPTLPTTLGT------GTD-----------QDVIHVLRASDL--ALFE-S 427 (477) T ss_pred CcccccccccccccccccccchhcccceEecCcccccccc------cCC-----------cceEEEEEeceE--EEEe-e Confidence 122355667789999999999999953211 111 112233455443 2221 1 Q ss_pred hhhhhhhcceeeeeeechhhhcceeeeee-eec---ccccc-cceEEEEEEcCC Q lcl|NC_015249. 299 AVGTVKLKDMALERARRANFQADQIIAKY-AMG---HGGLR-PEACGALVFNKA 347 (347) Q Consensus 299 Av~~v~~~~~~~e~~~d~~~~~d~i~~~~-a~G---~~~~R-pe~a~~i~~~~a 347 (347) .+.+ ..++..+.+.....+ .+| ....| |++-+.|..... 
T Consensus 428 --------~~~~--~~~~~~~~~~~~~~~~v~~~~~~~~~r~~~afv~~t~~~~ 471 (477) T protein:vir:84 428 --------SVRM--RALQETRAENLSVLLQVYGYLAFTAARFPQSVVEIGGTAL 471 (477) T ss_pred --------ceeE--EeccccccccceeeeeehhhhhhhhhccccceEEeecccc Confidence 1222 334444344333322 222 24556 997776665555 No 148 >protein:vir:2770 Length: 318 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:59 # MgeName: Stx2 converting bacteriophage I # Cross-refs: genbank:acc:NP_612887;genbank:gi:20065804;genbank:GeneID:935710 Probab=98.87 E-value=2.3e-10 Score=73.44 Aligned_cols=259 Identities=10% Similarity=0.022 Sum_probs=149.9 Q ss_pred CCcccccccccccccccc----cccchhhhhhhhhhhHHHHHHHHHHhhh---------ccccccccc--ccceEEEeec Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGM----SAGDKLALFLKVFGGEVLTAFTRTSVTM---------NKHLVRSIQ--SGKSAQFPVL 65 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~----~~~d~~al~ie~f~g~V~~~f~~~s~~~---------~~~~~r~i~--~G~tv~i~~i 65 (347) |.++.+|+-.-... ++. ...|+ .++.|++.+...-++.+-+. +.++..++. .|++|.|.-+ T Consensus 1 mt~~~~~~~~~~~~-~~~ft~~~~~~~---~vk~ws~~l~~~~~~~~~~~~~~g~~~~~~I~r~~dL~K~~GD~Vtf~L~ 76 (318) T protein:vir:27 1 MTTVTSAQANKLFQ-VALFTAANRNRS---MVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNKQAGDEVTFSIM 76 (318) T ss_pred CCccCCCChHHHHH-HHHHHHHhcCCh---HHHHHHHhhhhHHHhhhhhhcccCCCCCceEEEeccCCCCCccEEEEeEe Confidence 99988765321100 000 11122 46789988766555443222 233444443 4999999988 Q ss_pred CcceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015249. 66 GRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLC 145 (347) Q Consensus 66 G~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a 145 (347) -..+-.....++.+.+..+.+.-....|.|||..-.-..=..+++-.+-+|+|++.-..++.-+++..||.+|.+++... 
T Consensus 77 ~~L~g~gv~Gd~~lEGnee~L~~~~d~l~IDq~r~~V~~gg~msqqRt~~dlR~~ar~~L~~w~~~~~Dq~~~v~laGar 156 (318) T protein:vir:27 77 HKLSKRPTMGDERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGAR 156 (318) T ss_pred eccccCccccCceeeccccceEEEeeEEEEeeeccccccccchhhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc Confidence 77766666667777777677888888999999753321225677778889999999999999999999999999997654 Q ss_pred hh-cc-----c--ccc------cccc-ccC-cceeecccccccccchhhhHH--HHHHHHHHHHHHhhhcCCC------- Q lcl|NC_015249. 146 NL-PS-----A--SDE------NIAG-LGK-AHVLEVGKQSELRGDQVKLGQ--AIIAQLTLARAKLTGNYVP------- 200 (347) Q Consensus 146 ~~-~~-----~--~~~------~~~~-~~~-g~~i~~~~~~~~~~~~~~~~~--~~~~~l~~a~~~Lde~~VP------- 200 (347) .. .. + .+. .+.. .|. .-.+..+.+ +.+..+... .-++.|-.+...+++..-| T Consensus 157 g~~~n~~~~~p~~~~~~~~~~~~N~v~aPt~~r~~~~g~a---t~~~~l~stD~~s~~lid~~~~~~~~~a~pi~PV~v~ 233 (318) T protein:vir:27 157 GDFVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDA---TSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLS 233 (318) T ss_pred cccccccceEecccCccchhhhhcccCCCCCCcEEeccCc---cchhhhhhcccccHHHHHHHHHHHHHhCCCCcceeec Confidence 21 00 0 000 0000 000 111111111 111111111 1233344566666663222 Q ss_pred CCC-------CEEEeCHHHHHHHhcchh------hh-hhhhc---cccccccceEEEEeceEEEEecceec--ccccccc Q lcl|NC_015249. 201 SAD-------RVFYTTPDNYSAILAALM------PN-AANYQ---ALIDPSTGSIRNVMGFEVIEVPHLTA--GGAGEDR 261 (347) Q Consensus 201 ~~g-------R~~vv~P~~~~~Ll~~~~------~~-~~~~~---~~~~~~~G~Vg~i~G~~V~~sn~lp~--~~~~~~~ 261 (347) .+. ++++++|.+|..|..+.. +. ++... ..+.+-.|.+|.++|+-|.+.+++|+ ..+.+.. T Consensus 234 g~~~~~~~~~yV~~~~p~q~~~Lrtdt~~~~w~d~q~~A~~r~~g~knPLF~G~~gm~ngvil~~~~~vpIrf~~G~~v~ 313 (318) T protein:vir:27 234 GDELHGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFKGECAMWRNILVRKYAGMPIRFYQGQRFW 313 (318) T ss_pred cccccCCcceEEEEechHHHHHHhhcCCCHHHHHHHHHHHhcccccCCCceecceeeecCEEEeecCCccEEEcCCCeee Confidence 112 788999999999998752 22 12222 24457889999999999999998864 2211100 Q ss_pred cccccccc Q lcl|NC_015249. 
262 PEEGANPT 269 (347) Q Consensus 262 ~~~~~~~~ 269 (347) ..+.+ T Consensus 314 ---~~~~~ 318 (318) T protein:vir:27 314 ---YQRIT 318 (318) T ss_pred ---eeecC Confidence 00111 No 149 >protein:vir:93881 Length: 387 # NCBI annotation: ORF011 # Family: family:all:658 # MgeID: mge:1485 # MgeName: 3A # Cross-refs: genbank:acc:YP_239938;genbank:gi:66395599;genbank:GeneID:5130947 Probab=98.87 E-value=4.7e-10 Score=71.70 Aligned_cols=272 Identities=12% Similarity=0.071 Sum_probs=147.8 Q ss_pred CCcccccc-cc-------cccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEee--cCccee Q lcl|NC_015249. 1 MAKMNGGQ-QI-------GKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPV--LGRTKA 70 (347) Q Consensus 1 ma~~~~~~-~~-------~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~--iG~~~~ 70 (347) +....... .. .-..+...++| .+.-+.|..++.+.-...+.++++.++.++.+. .+|+ .+..++ T Consensus 99 ~~~~~~~~~~~~~~~~~~al~~~t~s~gG---~~IP~~~~~~Ii~~~~~~~~l~~~~~v~~~~~~---~~p~~~~~~~~a 172 (387) T protein:vir:93 99 ILPNEFEKPSMEAQRLLHALPTGNDSGGD---KLLPKTLSKEIVSEPFAKNQLREKARLTNIKGL---EIPRVSYTLDDD 172 (387) T ss_pred hhhhhhhhhhhhhHHHHHhhccCcCCCCc---eeechhHHHHHHHHHHhhchhhhheeeeecCCc---eEEEEeecCCcc Confidence 00000000 00 00001111111 134588999999888888888998888776532 3443 244556 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeecc-cccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTAD-VLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~-~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) .....|+..+. .+++-+++++.. .++.. +.|.+-=-..+.+|+.+.+.++.++++++..++.++.. 
T Consensus 173 ~~v~E~~~~~~--~~~~f~~v~~~~--~k~~~~~~iS~ell~Ds~~~l~~~i~~~la~~~~~~e~~~~~~~--------- 239 (387) T protein:vir:93 173 DFITDVETAKE--LKLKGDTVKFTT--NKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAV--------- 239 (387) T ss_pred ccccCcccccc--cccccceeeeeh--eeeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhc--------- Confidence 66666666544 345556655554 44444 34442223346788999999999999998866655421 Q ss_pred ccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcc Q lcl|NC_015249. 150 ASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQA 229 (347) Q Consensus 150 ~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~ 229 (347) ....+.+.|.....+.. ...+...|+.|+++...|+..... ...| ++++..|..|++-.+-.++.+ T Consensus 240 ---g~g~g~p~g~l~~~~~~-------~v~~~~~~d~i~~~~~~l~~~~~~-~a~~-~mn~~t~~~~~~~~~d~~~~~-- 305 (387) T protein:vir:93 240 ---SPKSGLDHMSFYNGSVK-------EVEGADMYDAIINALADLHEDYRD-NATI-YMRYADYVKIISVLSNGTTNF-- 305 (387) T ss_pred ---CCCccccceeeeccccc-------cccccchHHHHHHHHhccChhhhc-CCEE-EEechHHHHHHHHHhcCCCcc-- Confidence 11112222222111100 111233578888888888877654 3456 556665555443222222222 Q ss_pred ccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhccee Q lcl|NC_015249. 230 LIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMA 309 (347) Q Consensus 230 ~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~ 309 (347) ..|.-.++.|.+|+.++..|.. +-+||+..... +. .+. T Consensus 306 ----~~~~~~~llG~PV~~~~~~~~~---------------------------~~GDf~~~~~~-~~----------~~~ 343 (387) T protein:vir:93 306 ----FDTPAEKVFGKPVVFTDAAVKP---------------------------IVGDFNYFGIN-YD----------GTT 343 (387) T ss_pred ----cccCCccccccceEEecCCCce---------------------------eeeehhhhhee-hh----------hhe Confidence 2233357999999998754310 11344333211 11 122 Q ss_pred eeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
310 LERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 310 ~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++..+...-...+++..+|+.++++|++.+.+.++.| T Consensus 344 ~~~~~~~~~~~~~~~~~~r~d~~v~~~eA~~~l~~k~~ 381 (387) T protein:vir:93 344 YDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKEN 381 (387) T ss_pred eeecccccCCceeEEEEeeeCceeechhheEEEEeecC Confidence 33444444445667788899999999999999888887 No 150 >protein:vir:93696 Length: 364 # NCBI annotation: Bcep22gp55 # Family: family:all:974 # MgeID: mge:1470 # MgeName: Bcep22 # Cross-refs: genbank:acc:NP_944284;genbank:gi:38640361;genbank:GeneID:2658350 Probab=98.87 E-value=5e-10 Score=71.57 Aligned_cols=308 Identities=12% Similarity=0.071 Sum_probs=165.7 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhc----------cccccccc--ccceEEEeecCcc Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMN----------KHLVRSIQ--SGKSAQFPVLGRT 68 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~----------~~~~r~i~--~G~tv~i~~iG~~ 68 (347) ||.++.+. +|+. -.++|+..+...-.+.|-+.+ .++...+. .|++|.|.-+... T Consensus 1 Ma~T~~~~------------~~p~--a~~~ws~~l~~~~~~~s~f~~~l~G~~~~~~I~~~~dL~k~~Gd~v~f~L~~~L 66 (364) T protein:vir:93 1 MSQTVIPF------------GDPK--AVKRWSADLAVDVRKKSYFEQRFIGTSENAVIQRKTELESDAGDRITFDLSVHL 66 (364) T ss_pred CceeccCc------------CCHH--HHHHHHHHHHHHHHhhCccccccccCCCCCcEEEeeecCCCCCceEEeeeeeec Confidence 88666433 3444 568999999888877764443 22222333 3999999988777 Q ss_pred eeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc Q lcl|NC_015249. 69 KAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLP 148 (347) Q Consensus 69 ~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~ 148 (347) +-.....++.+.+..+++.-...+|+|||..-.-..=..+++..+-+|+|.+.-..++.-+++..|+.++.+++.+.... 
T Consensus 67 ~g~gv~Gd~~leGnee~L~~~~~~i~idq~r~~V~~~g~ms~qRt~~dlr~~ar~~L~~w~~~~~d~~~f~~laGarg~~ 146 (364) T protein:vir:93 67 RGKPTYGDARVEGKEESLRFYQDEVRIDQVRHSVSAGGRMSRKRTVHNIRRIARDRLGDYFYKFTDELLFIYLSGARGIN 146 (364) T ss_pred ccCCcccCceeeccccceeEEeeEEEEeeccccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccc Confidence 76666667777777677888888999999753221114688888999999999999999999999999999987532111 Q ss_pred cc--cccccc-------cccC-cceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCC--C------------CCCC Q lcl|NC_015249. 149 SA--SDENIA-------GLGK-AHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYV--P------------SADR 204 (347) Q Consensus 149 ~~--~~~~~~-------~~~~-g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~V--P------------~~gR 204 (347) .+ .+.... -.|. ...+..+. ..........-..-++.|..|...++.... | ++-- T Consensus 147 ~~~~~~~~~~~~~~N~v~aPt~~r~~~~~~-at~~~~l~stD~~sl~~id~a~~~a~~~~~~~~~~~~~~Pv~~~g~~~y 225 (364) T protein:vir:93 147 LDFIETPDFTGYAGNPLDAPDVDHLLYGGV-ATSKASLAATDIMAPLVIEKAVEKAAMMQAENPDVANMVPVSIDGDDHY 225 (364) T ss_pred cccccccCcccccccccCCCCCCcEEeccc-cCchhhccccccccHHHHHHHHHHHHHhCCCCCCCcccceeEecCccee Confidence 00 000000 0001 11111111 110000000111124556666666655432 1 1224 Q ss_pred EEEeCHHHHHHHhcch--hhh---hhhh--c-cccccccceEEEEeceEEEEecceeccccccccccccccccccccccc Q lcl|NC_015249. 205 VFYTTPDNYSAILAAL--MPN---AANY--Q-ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFP 276 (347) Q Consensus 205 ~~vv~P~~~~~Ll~~~--~~~---~~~~--~-~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~ 276 (347) ++++.|.++..|..+. .+. ..-. . ..+.+-.|.+|.++|+-|++.++++..... +.+.+.. 
T Consensus 226 V~~l~p~q~~~Lr~~t~~~w~d~qk~A~~~~g~~nPlF~G~~gm~ngvii~~~~~vi~~~~~----~~~~~v~------- 294 (364) T protein:vir:93 226 VCVMSEYQATDMRTAAGGTWIDFQKAAAAAEGRNNPIFKGGLGMINNVVLHKHRNVIRFNDY----GAGANVE------- 294 (364) T ss_pred EEEEcchhhhhhhhcCCHHHHHHHHHhhhcccccCCceecCeeeEcCeEEeccCCccccccc----ccCcccc------- Confidence 7899999999998543 322 2111 1 234477899999999999999988643211 1111110 Q ss_pred ccccccccccccceEEEEechhhh--hhhhhcce---eeeeeechhhhcceeeeeeeeccccc--ccceEEEEEEcCC Q lcl|NC_015249. 277 ETSSGDTRVALDNVVGLFNHRSAV--GTVKLKDM---ALERARRANFQADQIIAKYAMGHGGL--RPEACGALVFNKA 347 (347) Q Consensus 277 ~~~~~~y~~~~~~~~~l~~~~~Av--~~v~~~~~---~~e~~~d~~~~~d~i~~~~a~G~~~~--Rpe~a~~i~~~~a 347 (347) -.++|++-..|+ +.++.... -.|..+|..++- .|.....+|-+=. ...==|+|+++.| T Consensus 295 ------------~~ralllGaQA~~~a~g~~~g~~~~w~Ee~~D~gn~~-~i~~~~i~G~kK~rF~~~DfGvi~idta 359 (364) T protein:vir:93 295 ------------AARALFMGRQAGVIAYGTANGLRFDWEETVKDYGNEP-AIAAGFIAGMKKARFNNKDFGVISIDTA 359 (364) T ss_pred ------------chhhheecceeeEEEeecCCCCCceeeecccCCCCch-hhhhhhHhhhhhcccCCccceEEEeccc Confidence 011222222222 11121111 122223333221 1222222332111 1222455555555 No 151 >protein:vir:101650 Length: 497 # NCBI annotation: gp13 # Family: family:all:585 # MgeID: mge:1515 # MgeName: 244 # Cross-refs: genbank:acc:YP_654768;genbank:gi:109302766;genbank:GeneID:4156084 Probab=98.85 E-value=1.1e-09 Score=69.63 Aligned_cols=297 Identities=12% Similarity=0.072 Sum_probs=150.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--CcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~~~~~~g~~ 78 (347) +.+....... -+....+..++--.+..++|..++.+...+.+.++++.++..+.++ ++.||+. +..++.....|+. 
T Consensus 138 ~~~~~~~~~~-~~~~~~~~~~~gg~~vp~~~~~~ii~~~~~~~~i~~l~~~~~~~~~-~~~~~~~~~~~~~a~wv~E~~~ 215 (497) T protein:vir:10 138 FADGETAPAA-IGQNPFGSTGTFAPGILPTFLPGIVEQLFYELSLADLISSRPVTSP-NLSYLTESAAHNNAAAVAEAGT 215 (497) T ss_pred HhhhhhhHHH-HHhhhcccCcccccccchhhhHHHHHHHHhhhhHHhhccccccCCC-ceEEEEEcCCCCcceeeccCcc Confidence 0000000000 0000001111212256689999999988888999999988887655 5788864 3456777777877 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++.+ +++-+++++...+.-. -..|.+ +-.+...++.+.+.++.++++++..|+.+|.-- ....+.|. T Consensus 216 ~~~s--~~~f~~i~~~~~k~a~-~~~iS~-ell~d~~~l~~~i~~~l~~~i~~~~d~~~l~G~---------G~~~p~Gi 282 (497) T protein:vir:10 216 YPFS--SEEFARVYEQVGKVAN-ALTITD-EGLRDAPELFNFVQGRLLEGIQRKEEVQLLAGG---------GYPGVNGL 282 (497) T ss_pred cccc--cccceeeEeeeeeeEe-ecHhHH-HHHHhHHHHHHHHHHHHHHHHHHHHHHHhhcCC---------Cccccccc Confidence 7653 4555666665554422 223321 222233568888899999999999999876310 00001110 Q ss_pred cC---cceeecccccccc-----------c---chhh------------------------------hHHHHHHHHHHHH Q lcl|NC_015249. 159 GK---AHVLEVGKQSELR-----------G---DQVK------------------------------LGQAIIAQLTLAR 191 (347) Q Consensus 159 ~~---g~~i~~~~~~~~~-----------~---~~~~------------------------------~~~~~~~~l~~a~ 191 (347) -. +..+..+...... . .... ........++.+. T Consensus 283 l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 362 (497) T protein:vir:10 283 LQRSTGFTASSASSLFGATSATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAF 362 (497) T ss_pred ccccccccccccccchhhhhhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHH Confidence 00 0000000000000 0 0000 0001111222222 Q ss_pred HHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccc------ccccceEEEEeceEEEEecceecccccccccccc Q lcl|NC_015249. 
192 AKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALI------DPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEG 265 (347) Q Consensus 192 ~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~------~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~ 265 (347) ..+.....= ..-..|++|..|..|.+-.. .++.|.... ....+....+.|.+|+.++.+|.+. T Consensus 363 ~~~~~~~~~-~~~~~vmn~~~~~~l~~lkd-~~G~~i~~~~~~~~~~~~~~~~~~l~G~pV~~t~~~~~~~--------- 431 (497) T protein:vir:10 363 VDIQLTLFQ-TPNAVVMNPRDWELLRLTKD-ANGQYMGGNFFGNAYGNPVNGGKNIWGVPVVTTPLIPLGT--------- 431 (497) T ss_pred hhhhhhccc-CCCeEEEchHHHHHHHHhhc-CCCceeccCcccccccccccCCceeeceeeEecCCCCCCc--------- Confidence 222221110 01147799998888754332 222232111 1112223479999999999998421 Q ss_pred cccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeec--hhhhcc--eeeeeeeecccccccceEEE Q lcl|NC_015249. 266 ANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARR--ANFQAD--QIIAKYAMGHGGLRPEACGA 341 (347) Q Consensus 266 ~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d--~~~~~d--~i~~~~a~G~~~~Rpe~a~~ 341 (347) -+-+||+...-+++.+. +++++.... +.++.+ .|++..+++..+++|++.+. T Consensus 432 ----------------~~~Gd~~~~~~~i~~r~--------~~~v~~~~~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~ 487 (497) T protein:vir:10 432 ----------------ILVGHFAPSVIQTARRE--------GVTMQMTNSNGTDFVDGKVTVRAEERLGLLVYRPSAFQL 487 (497) T ss_pred ----------------eEEeecccceEEEEEec--------ccEEEeecccchhhhcCcEEEEEEEeecceeeccccEEE Confidence 12235544433444443 344444321 223334 47778899999999999999 Q ss_pred EEEcCC Q lcl|NC_015249. 
342 LVFNKA 347 (347) Q Consensus 342 i~~~~a 347 (347) +.+..+ T Consensus 488 l~~~~~ 493 (497) T protein:vir:10 488 IQLKKG 493 (497) T ss_pred EEecCC Confidence 999988 No 152 >protein:vir:7855 Length: 497 # NCBI annotation: gp12 # Family: family:all:585 # MgeID: mge:150 # MgeName: CJW1 # Cross-refs: genbank:acc:NP_817462;genbank:gi:29565891;genbank:GeneID:1259081 Probab=98.85 E-value=1.1e-09 Score=69.63 Aligned_cols=297 Identities=12% Similarity=0.072 Sum_probs=150.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--CcceeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~~~~~~g~~ 78 (347) +.+....... -+....+..++--.+..++|..++.+...+.+.++++.++..+.++ ++.||+. +..++.....|+. T Consensus 138 ~~~~~~~~~~-~~~~~~~~~~~gg~~vp~~~~~~ii~~~~~~~~i~~l~~~~~~~~~-~~~~~~~~~~~~~a~wv~E~~~ 215 (497) T protein:vir:78 138 FADGETAPAA-IGQNPFGSTGTFAPGILPTFLPGIVEQLFYELSLADLISSRPVTSP-NLSYLTESAAHNNAAAVAEAGT 215 (497) T ss_pred HhhhhhhHHH-HHhhhcccCcccccccchhhhHHHHHHHHhhhhHHhhccccccCCC-ceEEEEEcCCCCcceeeccCcc Confidence 0000000000 0000001111212256689999999988888999999988887655 5788864 3456777777877 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ++.+ +++-+++++...+.-. -..|.+ +-.+...++.+.+.++.++++++..|+.+|.-- ....+.|. 
T Consensus 216 ~~~s--~~~f~~i~~~~~k~a~-~~~iS~-ell~d~~~l~~~i~~~l~~~i~~~~d~~~l~G~---------G~~~p~Gi 282 (497) T protein:vir:78 216 YPFS--SEEFARVYEQVGKVAN-ALTITD-EGLRDAPELFNFVQGRLLEGIQRKEEVQLLAGG---------GYPGVNGL 282 (497) T ss_pred cccc--cccceeeEeeeeeeEe-ecHhHH-HHHHhHHHHHHHHHHHHHHHHHHHHHHHhhcCC---------Cccccccc Confidence 7653 4555666665554422 223321 222233568888899999999999999876310 00001110 Q ss_pred cC---cceeecccccccc-----------c---chhh------------------------------hHHHHHHHHHHHH Q lcl|NC_015249. 159 GK---AHVLEVGKQSELR-----------G---DQVK------------------------------LGQAIIAQLTLAR 191 (347) Q Consensus 159 ~~---g~~i~~~~~~~~~-----------~---~~~~------------------------------~~~~~~~~l~~a~ 191 (347) -. +..+..+...... . .... ........++.+. T Consensus 283 l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 362 (497) T protein:vir:78 283 LQRSTGFTASSASSLFGATSATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAF 362 (497) T ss_pred ccccccccccccccchhhhhhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHH Confidence 00 0000000000000 0 0000 0001111222222 Q ss_pred HHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhcccc------ccccceEEEEeceEEEEecceecccccccccccc Q lcl|NC_015249. 192 AKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALI------DPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEG 265 (347) Q Consensus 192 ~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~------~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~ 265 (347) ..+.....= ..-..|++|..|..|.+-.. .++.|.... ....+....+.|.+|+.++.+|.+. T Consensus 363 ~~~~~~~~~-~~~~~vmn~~~~~~l~~lkd-~~G~~i~~~~~~~~~~~~~~~~~~l~G~pV~~t~~~~~~~--------- 431 (497) T protein:vir:78 363 VDIQLTLFQ-TPNAVVMNPRDWELLRLTKD-ANGQYMGGNFFGNAYGNPVNGGKNIWGVPVVTTPLIPLGT--------- 431 (497) T ss_pred hhhhhhccc-CCCeEEEchHHHHHHHHhhc-CCCceeccCcccccccccccCCceeeceeeEecCCCCCCc--------- Confidence 222221110 01147799998888754332 222232111 1112223479999999999998421 Q ss_pred cccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeeeec--hhhhcc--eeeeeeeecccccccceEEE Q lcl|NC_015249. 
266 ANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARR--ANFQAD--QIIAKYAMGHGGLRPEACGA 341 (347) Q Consensus 266 ~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d--~~~~~d--~i~~~~a~G~~~~Rpe~a~~ 341 (347) -+-+||+...-+++.+. +++++.... +.++.+ .|++..+++..+++|++.+. T Consensus 432 ----------------~~~Gd~~~~~~~i~~r~--------~~~v~~~~~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~ 487 (497) T protein:vir:78 432 ----------------ILVGHFAPSVIQTARRE--------GVTMQMTNSNGTDFVDGKVTVRAEERLGLLVYRPSAFQL 487 (497) T ss_pred ----------------eEEeecccceEEEEEec--------ccEEEeecccchhhhcCcEEEEEEEeecceeeccccEEE Confidence 12235544433444443 344444321 223334 47778899999999999999 Q ss_pred EEEcCC Q lcl|NC_015249. 342 LVFNKA 347 (347) Q Consensus 342 i~~~~a 347 (347) +.+..+ T Consensus 488 l~~~~~ 493 (497) T protein:vir:78 488 IQLKKG 493 (497) T ss_pred EEecCC Confidence 999988 No 153 >protein:vir:78640 Length: 352 # NCBI annotation: phage capsid # Family: family:all:658 # MgeID: mge:1855 # MgeName: tp310-2 # Cross-refs: genbank:acc:YP_001429943;genbank:gi:156603997;genbank:GeneID:5525386 Probab=98.83 E-value=6.2e-10 Score=71.05 Aligned_cols=269 Identities=13% Similarity=0.092 Sum_probs=146.2 Q ss_pred CCc---------cc---ccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--C Q lcl|NC_015249. 1 MAK---------MN---GGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--G 66 (347) Q Consensus 1 ma~---------~~---~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G 66 (347) |.. .. -+.+.++. ..+| .|.-+.+..++.+.-+..+.++.+.++.++.+ . 
++|++ + T Consensus 64 ~~~~~~~~~~~~~~~~~~al~~~~~----~~gG---~lIP~~~~~~Ii~~l~~~s~l~~~~~v~~~~~-~--~~p~~~~~ 133 (352) T protein:vir:78 64 ILPNEFEKPSMEAQRLLHALPTGND----SGGD---KLLPKTLSKEIVSEPFAKNQLREKARLTNIKG-L--EIPRVSYT 133 (352) T ss_pred hhhhHHHHHHhhHHHHHHHhccCCC----CCCc---eeccHhHHHHHHHHHHhhcchhhheeeEecCC-c--eEEEEecC Confidence 100 00 00011111 1111 13448999999999888899999988877543 2 34432 3 Q ss_pred cceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_015249. 67 RTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCN 146 (347) Q Consensus 67 ~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~ 146 (347) ..++.....|..++.. +++-+++++.+.++-. -+.|.+-=-.++.+|+.+.+.++.++++++..++.++.. T Consensus 134 ~~~a~~v~E~~~~~~~--~~~f~~v~~~~~k~~~-~i~is~ell~Ds~~~l~~~i~~~la~~~~~~e~~~~~~~------ 204 (352) T protein:vir:78 134 LDDDDFITDVETAKEL--KLKGDTVKFTTNKFKV-FAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAV------ 204 (352) T ss_pred CCcccccccccccccc--cccceeeeecceeEEe-echhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHhhhhc------ Confidence 3455555666665543 4555666666655422 244543323345788999999999999987645444321 Q ss_pred hccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhh Q lcl|NC_015249. 147 LPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAAN 226 (347) Q Consensus 147 ~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~ 226 (347) ....+.+.+.....+.. ...+...|+.|+++...|+..... ... ++++|..|..|++-.+-.++. 
T Consensus 205 ------g~g~~~~~g~l~~~~~~-------~~t~~~~~d~i~~~~~~l~~~~~~-~a~-~~mn~~t~~~l~~~~~~~~~~ 269 (352) T protein:vir:78 205 ------SPKSGLEHMSFYNGSVK-------EVEGANMYDAIINALADLHEDYRD-NAT-IYMRYADYVKIISVLSNGTTN 269 (352) T ss_pred ------CCCCcccccceeccccc-------cccccchHHHHHHHHhccChhhhc-CCE-EEEehHHHHHHHHHHhccCCc Confidence 00111122211111100 111223478888888877776543 234 466777777766433222222 Q ss_pred hccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhc Q lcl|NC_015249. 227 YQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLK 306 (347) Q Consensus 227 ~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~ 306 (347) + ..|.-.+++|.+|+.++..+.. +-+||...... + + T Consensus 270 ~------~~~~~~~llG~PV~~~~~~~~~---------------------------~~Gdf~~~~~~-~--~-------- 305 (352) T protein:vir:78 270 F------FDTPAEKVFGKPVVFTDAAVKP---------------------------IVGDFNYFGIN-Y--D-------- 305 (352) T ss_pred c------cccCCccccccceEEecCCCce---------------------------eEeehhhhhhh-h--h-------- Confidence 2 2333347899999988754310 11233322111 1 0 Q ss_pred ceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
307 DMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 307 ~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+..++..+.......+++.++|+.+++||++.+.+.++.+ T Consensus 306 ~~~~~~~~~~~~g~~~f~~~~r~Dg~~~~~eA~~~l~~~a~ 346 (352) T protein:vir:78 306 GTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKES 346 (352) T ss_pred hheeeeeccccCCeeEEEEEeeeCceeechhheEEEEeecc Confidence 12344444444334566778899999999999888888777 No 154 >protein:vir:9927 Length: 295 # NCBI annotation: hypothetical protein # Family: family:all:1178 # MgeID: mge:178 # MgeName: 315.6 # Cross-refs: genbank:acc:NP_795689;genbank:gi:28876459;genbank:GeneID:1258000 Probab=98.77 E-value=2e-09 Score=68.30 Aligned_cols=272 Identities=13% Similarity=0.051 Sum_probs=151.4 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCc-ceeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGR-TKAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~-~~~~~~~~g~~~ 79 (347) ||..+- ..++..+ ..-..+ |++.|+.-+.+-+ .+++..|...+..|+++++|+..- ..++++..|+.| T Consensus 1 mAe~nl--t~~~dL~---~~~sid--fv~~f~~~i~~L~----~~Lgi~r~~p~a~G~tIt~pK~~~tgda~dVaEGe~I 69 (295) T protein:vir:99 1 MAEKNL--NTMADLG---DIKSID--FVNKFSKNINDLL----KLLGVTRRETLTNDLKIQTYKWEVTLDQTDPGEGETI 69 (295) T ss_pred CCCccc--ccHhhcc---Cceeeh--hhHHhhhhHHHHH----HHhccccccccccCCeEEeeeeeeecccccccCCccc Confidence 998652 1112221 112233 9999997775444 455667777888899999997542 244778889998 Q ss_pred CCccCCCCCc---eEEEEEEeeeecccccccHHHH-H-hCh-hhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 80 DDKRKDMKHT---ERTINIDGLLTADVLIYDIEDA-M-NHY-DVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 80 ~~~~~~~~~~---~~~l~ID~~~~~~~~Idd~D~~-q-~~~-D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) |-+ .+..+ ..++.|.++. . 
.+ =||+ | +-| |...|..+++..+|++++|..++..+..+ T Consensus 70 pls--kvt~~~~~t~t~kikK~r--K-~t--TdEAIqlsGygdpvgead~qL~~~ia~kId~D~~~~lkta--------- 133 (295) T protein:vir:99 70 PLS--KVTRTKDKDYTVKWFKKR--R-AT--TAEAIARHGAARAITEADKRIMRELQNGIKDAFFTFLKTK--------- 133 (295) T ss_pred chh--hheeeeeeeeEEEeeeec--c-cc--cHHHHHhcCCCchhHHHHHHHHHHHHHhhhHHHHHHhccC--------- Confidence 764 34433 4566665543 3 34 2555 4 444 49999999999999999999998665210 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhh--hhhhcccc Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPN--AANYQALI 231 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~--~~~~~~~~ 231 (347) +. . .. ...-+..++.+..+-..+.|.+= ...+++|+|..++.||++.... .++..|.. T Consensus 134 --------t~-t------~t---g~~lq~a~a~~~~al~~f~Ee~~--~~~V~FVnP~D~a~yl~~A~~~~~~a~~fG~~ 193 (295) T protein:vir:99 134 --------PT-K------VK---GVGLQKALSASWAKLATFNEFEG--SPLVSFVSPLDVANYLGDTKVGADASNVFGMT 193 (295) T ss_pred --------ce-e------ee---hhhHHHHHHHhhhhhhhcccccC--CceEEEEehHHHHHHHhccccccchhhhhhhh Confidence 00 0 00 00112234555554555544331 2469999999999999886554 22223444 Q ss_pred ccccceEEEEeceE-EEEecceeccccccccc----ccccccccccccccccccccccccccceEEEEechhhhhhhhhc Q lcl|NC_015249. 232 DPSTGSIRNVMGFE-VIEVPHLTAGGAGEDRP----EEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLK 306 (347) Q Consensus 232 ~~~~G~Vg~i~G~~-V~~sn~lp~~~~~~~~~----~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~ 306 (347) .+. ++.|++ |+.|+.+|.+....+.. .+-.+..+ ..++... 
....|.+..+|+ T Consensus 194 ~L~-----nfLG~q~II~S~kv~~G~~~aT~~~Ni~~ay~~~~~--g~l~~~f--~~~~D~tglIg~------------- 251 (295) T protein:vir:99 194 LLK-----NFLGMQNVIVMPSVPEGKIYSTAVENLVFASLNVKG--GDLGGLF--ADFTDETGLIAA------------- 251 (295) T ss_pred hhh-----hhhccceEEEcccCCCceEEEeeccceEEEEecCCc--hhhhhhh--hhccCcccceEE------------- Confidence 444 499997 99999999765332111 00000110 0000000 011123333332 Q ss_pred ceeeeeeechhhhcceeeeeeeecc--cccccceEEEEEEcCC Q lcl|NC_015249. 307 DMALERARRANFQADQIIAKYAMGH--GGLRPEACGALVFNKA 347 (347) Q Consensus 307 ~~~~e~~~d~~~~~d~i~~~~a~G~--~~~Rpe~a~~i~~~~a 347 (347) ..++.++.=-+...+.+|. =+-|+|+.+...+..+ T Consensus 252 ------~h~~~~~~~t~et~~~~~~~lfpE~~dgiv~~tI~~~ 288 (295) T protein:vir:99 252 ------ARNRQLSNLTYESVFFGANVLFAEIPEGVVEATIEAA 288 (295) T ss_pred ------EeccccceeeehhhhHhHHHhcccccceEEEEEEecC Confidence 1222222222223333343 3458999999999777 No 155 >protein:vir:96978 Length: 387 # NCBI annotation: ORF009 # Family: family:all:658 # MgeID: mge:1643 # MgeName: 42e # Cross-refs: genbank:acc:YP_239859;genbank:gi:66395517;genbank:GeneID:5133011 Probab=98.75 E-value=6.9e-10 Score=70.79 Aligned_cols=273 Identities=12% Similarity=0.055 Sum_probs=147.4 Q ss_pred CCcccccccc-c-------ccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--Cccee Q lcl|NC_015249. 1 MAKMNGGQQI-G-------KDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKA 70 (347) Q Consensus 1 ma~~~~~~~~-~-------t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~ 70 (347) |......... . -..|-..++| .+.-+.|..++.+.....+.+++++++.++.+. 
.+|++ +..++ T Consensus 99 ~~~~~~~~~~~~~~~~~~a~~~~~~~~gG---~lIP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~---~~p~~~~~~~~a 172 (387) T protein:vir:96 99 ILPNEFEKPSMEAQRLLHALPTGNDSGGD---KLLPKTLSKEIVSEPFAKNQLREKARLTNIKGL---EIPRVSYTLDDD 172 (387) T ss_pred HhhhhHHHHHHHHHHHHhhhccCCCCCCc---eeechhHHHHHHHHHHhhchhhhhceeeecCCc---eeeeeeccCCcc Confidence 1000000000 0 0001111112 244588999999999888988998888776543 34432 33455 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) .....|+..+.+ +++-+++++...++. .-+.|.+-=-.++.+|+.+.+.++.++++++..++.++..- T Consensus 173 ~~v~Eg~~~~~~--~~~f~~v~l~~~k~~-~~i~iS~ell~ds~~~l~~~i~~~la~~~~~~e~~~~~~~g--------- 240 (387) T protein:vir:96 173 DFITDVETAKEL--KAKGDTVKFTTNKFK-VFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVS--------- 240 (387) T ss_pred cccccccccccc--ccccceeeechheee-eechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcC--------- Confidence 566666665543 455566666554442 22344322222356889999999999999987666554211 Q ss_pred cccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccc Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQAL 230 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~ 230 (347) ...+.+.+.....+... ..+...++.|+++...|+....+ ...|+ +++..|..|++-.+-.++. 
T Consensus 241 ---~g~g~~~g~~~~~~~~~-------~~~~~~~d~i~~~~~~l~~~y~~-na~~i-mn~~t~~~~~~~~~~~~~~---- 304 (387) T protein:vir:96 241 ---PKSGLEHMSFYNGSVKE-------VEGADMYDAIINALADLHEDYRD-NATIY-MRYADYVKIISVLSNGTTN---- 304 (387) T ss_pred ---CCccccceeeecccccc-------ccccchHHHHHHHHhccChhhhc-CCEEE-EechHHHHHHHHHhcCCCc---- Confidence 11122222221111100 11233478888888888876654 34565 5665555554322222222 Q ss_pred cccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceee Q lcl|NC_015249. 231 IDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMAL 310 (347) Q Consensus 231 ~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~ 310 (347) +..|.-.++.|.+|+.++..|.. +-+||..... .+. .+.. T Consensus 305 --~~~~~~~~llG~PV~~~~~~~~~---------------------------~~GDf~~~~~-~~~----------~~~~ 344 (387) T protein:vir:96 305 --FFDTPAEKVFGKPVVFTDAAVKP---------------------------IVGDFNYFGI-NYD----------GTTY 344 (387) T ss_pred --ccccCCccccccceEEecCCCce---------------------------eeechhhhhh-hhh----------hhhh Confidence 22344457999999998754310 1123332211 111 1223 Q ss_pred eeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 311 ERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 311 e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++++...-...+++..+|+.++++|++.+.+.+++| T Consensus 345 ~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka~ 381 (387) T protein:vir:96 345 DTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKEN 381 (387) T ss_pred eecccccCCceEEEEEEEeCcEeechhheEEEEeecC Confidence 3444444334567778899999999999999999888 No 156 >protein:vir:2685 Length: 387 # NCBI annotation: hypothetical protein # Family: family:all:658 # MgeID: mge:57 # MgeName: phiSLT # Cross-refs: genbank:acc:NP_075504;genbank:gi:12719433;genbank:GeneID:920169 Probab=98.75 E-value=6.9e-10 Score=70.79 Aligned_cols=273 Identities=12% Similarity=0.055 Sum_probs=147.4 Q ss_pred CCcccccccc-c-------ccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--Cccee Q lcl|NC_015249. 
1 MAKMNGGQQI-G-------KDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKA 70 (347) Q Consensus 1 ma~~~~~~~~-~-------t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~ 70 (347) |......... . -..|-..++| .+.-+.|..++.+.....+.+++++++.++.+. .+|++ +..++ T Consensus 99 ~~~~~~~~~~~~~~~~~~a~~~~~~~~gG---~lIP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~---~~p~~~~~~~~a 172 (387) T protein:vir:26 99 ILPNEFEKPSMEAQRLLHALPTGNDSGGD---KLLPKTLSKEIVSEPFAKNQLREKARLTNIKGL---EIPRVSYTLDDD 172 (387) T ss_pred HhhhhHHHHHHHHHHHHhhhccCCCCCCc---eeechhHHHHHHHHHHhhchhhhhceeeecCCc---eeeeeeccCCcc Confidence 1000000000 0 0001111112 244588999999999888988998888776543 34432 33455 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) .....|+..+.+ +++-+++++...++. .-+.|.+-=-.++.+|+.+.+.++.++++++..++.++..- T Consensus 173 ~~v~Eg~~~~~~--~~~f~~v~l~~~k~~-~~i~iS~ell~ds~~~l~~~i~~~la~~~~~~e~~~~~~~g--------- 240 (387) T protein:vir:26 173 DFITDVETAKEL--KAKGDTVKFTTNKFK-VFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVS--------- 240 (387) T ss_pred cccccccccccc--ccccceeeechheee-eechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcC--------- Confidence 566666665543 455566666554442 22344322222356889999999999999987666554211 Q ss_pred cccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccc Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQAL 230 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~ 230 (347) ...+.+.+.....+... ..+...++.|+++...|+....+ ...|+ +++..|..|++-.+-.++. 
T Consensus 241 ---~g~g~~~g~~~~~~~~~-------~~~~~~~d~i~~~~~~l~~~y~~-na~~i-mn~~t~~~~~~~~~~~~~~---- 304 (387) T protein:vir:26 241 ---PKSGLEHMSFYNGSVKE-------VEGADMYDAIINALADLHEDYRD-NATIY-MRYADYVKIISVLSNGTTN---- 304 (387) T ss_pred ---CCccccceeeecccccc-------ccccchHHHHHHHHhccChhhhc-CCEEE-EechHHHHHHHHHhcCCCc---- Confidence 11122222221111100 11233478888888888876654 34565 5665555554322222222 Q ss_pred cccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceee Q lcl|NC_015249. 231 IDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMAL 310 (347) Q Consensus 231 ~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~ 310 (347) +..|.-.++.|.+|+.++..|.. +-+||..... .+. .+.. T Consensus 305 --~~~~~~~~llG~PV~~~~~~~~~---------------------------~~GDf~~~~~-~~~----------~~~~ 344 (387) T protein:vir:26 305 --FFDTPAEKVFGKPVVFTDAAVKP---------------------------IVGDFNYFGI-NYD----------GTTY 344 (387) T ss_pred --ccccCCccccccceEEecCCCce---------------------------eeechhhhhh-hhh----------hhhh Confidence 22344457999999998754310 1123332211 111 1223 Q ss_pred eeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 311 ERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 311 e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++++...-...+++..+|+.++++|++.+.+.+++| T Consensus 345 ~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka~ 381 (387) T protein:vir:26 345 DTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKEN 381 (387) T ss_pred eecccccCCceEEEEEEEeCcEeechhheEEEEeecC Confidence 3444444334567778899999999999999999888 No 157 >protein:vir:94424 Length: 387 # NCBI annotation: ORF010 # Family: family:all:658 # MgeID: mge:1506 # MgeName: 47 # Cross-refs: genbank:acc:YP_240005;genbank:gi:66395666;genbank:GeneID:5133084 Probab=98.75 E-value=6.9e-10 Score=70.79 Aligned_cols=273 Identities=12% Similarity=0.055 Sum_probs=147.4 Q ss_pred CCcccccccc-c-------ccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec--Cccee Q lcl|NC_015249. 
1 MAKMNGGQQI-G-------KDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL--GRTKA 70 (347) Q Consensus 1 ma~~~~~~~~-~-------t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i--G~~~~ 70 (347) |......... . -..|-..++| .+.-+.|..++.+.....+.+++++++.++.+. .+|++ +..++ T Consensus 99 ~~~~~~~~~~~~~~~~~~a~~~~~~~~gG---~lIP~~~~~~Ii~~~~~~~~l~~~~~~~~~~~~---~~p~~~~~~~~a 172 (387) T protein:vir:94 99 ILPNEFEKPSMEAQRLLHALPTGNDSGGD---KLLPKTLSKEIVSEPFAKNQLREKARLTNIKGL---EIPRVSYTLDDD 172 (387) T ss_pred HhhhhHHHHHHHHHHHHhhhccCCCCCCc---eeechhHHHHHHHHHHhhchhhhhceeeecCCc---eeeeeeccCCcc Confidence 1000000000 0 0001111112 244588999999999888988998888776543 34432 33455 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) .....|+..+.+ +++-+++++...++. .-+.|.+-=-.++.+|+.+.+.++.++++++..++.++..- T Consensus 173 ~~v~Eg~~~~~~--~~~f~~v~l~~~k~~-~~i~iS~ell~ds~~~l~~~i~~~la~~~~~~e~~~~~~~g--------- 240 (387) T protein:vir:94 173 DFITDVETAKEL--KAKGDTVKFTTNKFK-VFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVS--------- 240 (387) T ss_pred cccccccccccc--ccccceeeechheee-eechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcC--------- Confidence 566666665543 455566666554442 22344322222356889999999999999987666554211 Q ss_pred cccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccc Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQAL 230 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~ 230 (347) ...+.+.+.....+... ..+...++.|+++...|+....+ ...|+ +++..|..|++-.+-.++. 
T Consensus 241 ---~g~g~~~g~~~~~~~~~-------~~~~~~~d~i~~~~~~l~~~y~~-na~~i-mn~~t~~~~~~~~~~~~~~---- 304 (387) T protein:vir:94 241 ---PKSGLEHMSFYNGSVKE-------VEGADMYDAIINALADLHEDYRD-NATIY-MRYADYVKIISVLSNGTTN---- 304 (387) T ss_pred ---CCccccceeeecccccc-------ccccchHHHHHHHHhccChhhhc-CCEEE-EechHHHHHHHHHhcCCCc---- Confidence 11122222221111100 11233478888888888876654 34565 5665555554322222222 Q ss_pred cccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceee Q lcl|NC_015249. 231 IDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMAL 310 (347) Q Consensus 231 ~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~ 310 (347) +..|.-.++.|.+|+.++..|.. +-+||..... .+. .+.. T Consensus 305 --~~~~~~~~llG~PV~~~~~~~~~---------------------------~~GDf~~~~~-~~~----------~~~~ 344 (387) T protein:vir:94 305 --FFDTPAEKVFGKPVVFTDAAVKP---------------------------IVGDFNYFGI-NYD----------GTTY 344 (387) T ss_pred --ccccCCccccccceEEecCCCce---------------------------eeechhhhhh-hhh----------hhhh Confidence 22344457999999998754310 1123332211 111 1223 Q ss_pred eeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 311 ERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 311 e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++++...-...+++..+|+.++++|++.+.+.+++| T Consensus 345 ~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka~ 381 (387) T protein:vir:94 345 DTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKEN 381 (387) T ss_pred eecccccCCceEEEEEEEeCcEeechhheEEEEeecC Confidence 3444444334567778899999999999999999888 No 158 >protein:vir:9875 Length: 296 # NCBI annotation: hypothetical protein # Family: family:all:1178 # MgeID: mge:177 # MgeName: 315.5 # Cross-refs: genbank:acc:NP_795637;genbank:gi:28876404;genbank:GeneID:1257935 Probab=98.74 E-value=2.5e-09 Score=67.75 Aligned_cols=274 Identities=9% Similarity=0.020 Sum_probs=154.9 Q ss_pred CCccccccccccccccccccc-------chhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcc--eee Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGMSAG-------DKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRT--KAA 71 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~-------d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~--~~~ 71 (347) |-.-- |.|--+.-.+ ..| |+++|+.-+.+-+ .+++..|...+..|++++++.-+.. .++ T Consensus 1 ~~~~~------~~~e~nlt~~~dl~~~~siD--f~~~f~~~i~~L~----~~LGv~r~~pla~GstIkt~k~~~y~gda~ 68 (296) T protein:vir:98 1 MVTSR------TYPEENLIKSTDLKYPITID--VTNKFQENISKLL----EMLGVTRKISVSEGMTLKTYAGYDVTLAEG 68 (296) T ss_pred CCCcc------ccCcCCCcchhhhhhhhhhh--hHHHHhhhHHHHH----HHhhhcccccccCCCEEeeccceeeeeccc Confidence 32211 2332222111 233 8889988776554 3556677778888999966542222 335 Q ss_pred eeecCCCCCCccCCCCCc---eEEEEEEeeeecccccccHHHH-H-hCh-hhHHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015249. 72 YLQPGENLDDKRKDMKHT---ERTINIDGLLTADVLIYDIEDA-M-NHY-DVRSEYTAQLGESLAMAADGAVLAEMAKLC 145 (347) Q Consensus 72 ~~~~g~~~~~~~~~~~~~---~~~l~ID~~~~~~~~Idd~D~~-q-~~~-D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a 145 (347) +...|+.||-+ .+... ..+++|.++.-. + =||+ | +-| |...+..+++..++++++|..++..+..+. T Consensus 69 dVaEGe~Ipls--kvt~~~~~t~t~~ikK~rK~---t--TdEAIqlsGyg~aVgetd~qL~~~iq~kId~d~~t~LktaT 141 (296) T protein:vir:98 69 NVPEGEVIPLS--KVERKIHSEKKIELKKYRKA---T--TGEDIQMYGSNEAVTNTDNALVRQLQKKIRTDFVTALKTGT 141 (296) T ss_pred cccCCcccchh--hheeeecceEEEEeeccccc---c--CHHHHHhhcCCchhHHHHHHHHHHHHHhhhHHHHHHHhccc Confidence 67788888764 34433 366777554322 4 3566 4 444 499999999999999999999986652110 Q ss_pred hhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhh Q lcl|NC_015249. 146 NLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAA 225 (347) Q Consensus 146 ~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~ 225 (347) ++ ........-.++...+.++..+|.+.+ ....+++|+|...+.+|++..+... 
T Consensus 142 ---------------~t---------~~~t~~~lQ~Ala~~~~~l~~~feded--~~~~V~FVnP~D~a~ylg~a~it~q 195 (296) T protein:vir:98 142 ---------------GT---------QDALGAGLQGALASAWGKLQVLFEDYG--SERAIVFANSLDVAEYIAKAGITTQ 195 (296) T ss_pred ---------------ce---------eeechhhHHHHHHHHhhhhhhhccccC--CCceEEEEehHHHHHHhcCCccchh Confidence 00 001112233455667777778887764 2468999999999999998876543 Q ss_pred hhccccccccceEEEEeceEEEEecceecccccccc----cccccccccccccccccccccccccccceEEEEechhhhh Q lcl|NC_015249. 226 NYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDR----PEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVG 301 (347) Q Consensus 226 ~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~----~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~ 301 (347) ..-|..-+. ++.|.+|+.|+.+|.+..-... ..+-.+..+ ...+... ....|.+..+|+. T Consensus 196 t~fG~tyl~-----nfLG~~II~S~kV~~G~~~~T~~~Ni~~ay~~~~~--~~l~~~f--~~~~d~tglIGv~------- 259 (296) T protein:vir:98 196 TAFGLTYLV-----DFTGTVIISTNDVTKGEIWATVPENIIFAYINPNN--SELAKEF--NLYGDPTGYIGMN------- 259 (296) T ss_pred heechhhhh-----hccccEEEEcCcCCCceEEEeeecceEEEeecccc--cchhhhh--ccccccccceEEE------- Confidence 333333232 3889999999999965433211 111011100 0011000 1222344444432 Q ss_pred hhhhcceeeeeeechhhhcceeeeeeeecc--cccccceEEEEEEcCC Q lcl|NC_015249. 302 TVKLKDMALERARRANFQADQIIAKYAMGH--GGLRPEACGALVFNKA 347 (347) Q Consensus 302 ~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~--~~~Rpe~a~~i~~~~a 347 (347) .++....=-+...+.+|. 
=+-|+|+.+...++.| T Consensus 260 ------------h~~~~~~~t~eT~~~~~~~lfpE~~dgiv~~tI~~~ 295 (296) T protein:vir:98 260 ------------HFQENTTLTIQTLLVSGMLMYPERIDGIVKVTLTPG 295 (296) T ss_pred ------------eccccceeeehhHhHhHHHhcccccceEEEEEecCC Confidence 122222112222223333 3458999999999988 No 159 >protein:vir:3298 Length: 404 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:66 # MgeName: 933W # Cross-refs: genbank:acc:NP_049514;genbank:gi:9632520;genbank:GeneID:1262006 Probab=98.71 E-value=3.5e-09 Score=66.93 Aligned_cols=333 Identities=11% Similarity=0.065 Sum_probs=171.4 Q ss_pred CCcccccccccccccccc--cccchhhhhhhhhhhHHHHHHHHHHhh---------hccccccccc--ccceEEEeecCc Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGM--SAGDKLALFLKVFGGEVLTAFTRTSVT---------MNKHLVRSIQ--SGKSAQFPVLGR 67 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~--~~~d~~al~ie~f~g~V~~~f~~~s~~---------~~~~~~r~i~--~G~tv~i~~iG~ 67 (347) |......+- ....-.+. ..+ ...-+++.|.+.+...=+..+-+ .+.++..++. .|++|.|.-+-. T Consensus 1 ~~~~~~~~a-~~~~~~~lft~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K~aGd~vtf~L~~~ 78 (404) T protein:vir:32 1 MTTVTSAQA-NKLYQVALFTAAN-RNRSMVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNKQAGDEVTFSIMHK 78 (404) T ss_pred CCCcCCcch-hhhHHHHHHHHHh-cCChhHhhhhhhhhhhhhhccchhhccCCCCCccEEEeecCCCCCCcEEEEeEeee Confidence 433332211 11111110 000 11124667766643322222111 2333444443 399999998877 Q ss_pred ceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 68 TKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 68 ~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) .+-.....++.+.+..+++.-...+|+|||..-.-..=..+++-.+-+|+|++.-..++.-+++..||.+|.+++..... 
T Consensus 79 L~g~gv~Gd~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~~laG~rg~ 158 (404) T protein:vir:32 79 LSKRPTMGDERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGARGD 158 (404) T ss_pred cccCCcccCceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhccccc Confidence 77666666777777777888888999999976442222578888899999999999999999999999999999854321 Q ss_pred -cc-----cc--ccc------ccc-ccC-cceeecccccccccchhhhHH--HHHHHHHHHHHHhhhcCCCC-------C Q lcl|NC_015249. 148 -PS-----AS--DEN------IAG-LGK-AHVLEVGKQSELRGDQVKLGQ--AIIAQLTLARAKLTGNYVPS-------A 202 (347) Q Consensus 148 -~~-----~~--~~~------~~~-~~~-g~~i~~~~~~~~~~~~~~~~~--~~~~~l~~a~~~Lde~~VP~-------~ 202 (347) .. +. +.. +.. .|. .-.+..+. .+.+...... ..++.|-++.+.+++..-|. + T Consensus 159 ~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~---at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv~~~g~ 235 (404) T protein:vir:32 159 FVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGD---ATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLSGD 235 (404) T ss_pred cccccceeeccccccccceeecccCCCCCCcEEeccC---ccchhhhhhcccccHHHHHHHHHHHHHhCCCCcceEeccc Confidence 00 00 000 000 000 00111111 1111111111 12344556667776643332 2 Q ss_pred C-------CEEEeCHHHHHHHhcchh---hh---h-hhh---ccccccccceEEEEeceEEEEecceecccccccccccc Q lcl|NC_015249. 203 D-------RVFYTTPDNYSAILAALM---PN---A-ANY---QALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEG 265 (347) Q Consensus 203 g-------R~~vv~P~~~~~Ll~~~~---~~---~-~~~---~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~ 265 (347) . ++++++|.+|..|..++. +. . +.. ...+.+-.|.+|.++|+-|++.++.|+--......... 
T Consensus 236 ~~~~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~G~~gm~ngvii~~~~~~~Irf~~g~~~~~~ 315 (404) T protein:vir:32 236 ELHGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFKGECAMWRNILVRKYAGMPIRFYQGSKVLVS 315 (404) T ss_pred cccCccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCceecCeeEEcCEEEEecCCceeeecccceeeec Confidence 2 889999999999999852 22 1 111 23455788999999999999999887532221111111 Q ss_pred cc-cccccccccccccccccccccceEEEEechhhh--hhhhhc---ceeeeeeechhhhcceeeeeeeecccccc-c-- Q lcl|NC_015249. 266 AN-PTGQKHAFPETSSGDTRVALDNVVGLFNHRSAV--GTVKLK---DMALERARRANFQADQIIAKYAMGHGGLR-P-- 336 (347) Q Consensus 266 ~~-~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av--~~v~~~---~~~~e~~~d~~~~~d~i~~~~a~G~~~~R-p-- 336 (347) .+ ..+.... ....+.-..+|++-..|+ +.++.. ..-.|..+|-.++ -.|.....+|.+=.| | T Consensus 316 ~n~~~a~~~~--------~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~-~~i~~~~i~G~kK~rF~~~ 386 (404) T protein:vir:32 316 ENNLTATTKE--------VAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNR-TEIAISWINGLKKIRFPEK 386 (404) T ss_pred CCcccccccc--------ccccccchhheeecceeEEEEeeccCCCCceeEeeccccCch-hhhhhHHHhhhhhccccCC Confidence 11 1111111 111111123344434433 222221 1122333344333 334445555655455 4 Q ss_pred ----ceEEEEEEcCC Q lcl|NC_015249. 337 ----EACGALVFNKA 347 (347) Q Consensus 337 ----e~a~~i~~~~a 347 (347) .-=|+|+++.| T Consensus 387 ~g~~~DfGvi~idta 401 (404) T protein:vir:32 387 SGKMQDHGVIAVDTA 401 (404) T ss_pred CCceeeEEEEEeccc Confidence 23456666666 No 160 >protein:vir:104439 Length: 404 # NCBI annotation: putative virion structural protein # Family: family:all:974 # MgeID: mge:1471 # MgeName: 86 # Cross-refs: genbank:acc:YP_794063;genbank:gi:116222008;genbank:GeneID:4397504 Probab=98.71 E-value=3.5e-09 Score=66.93 Aligned_cols=333 Identities=11% Similarity=0.065 Sum_probs=171.4 Q ss_pred CCcccccccccccccccc--cccchhhhhhhhhhhHHHHHHHHHHhh---------hccccccccc--ccceEEEeecCc Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGM--SAGDKLALFLKVFGGEVLTAFTRTSVT---------MNKHLVRSIQ--SGKSAQFPVLGR 67 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~--~~~d~~al~ie~f~g~V~~~f~~~s~~---------~~~~~~r~i~--~G~tv~i~~iG~ 67 (347) |......+- ....-.+. ..+ ...-+++.|.+.+...=+..+-+ .+.++..++. .|++|.|.-+-. T Consensus 1 ~~~~~~~~a-~~~~~~~lft~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K~aGd~vtf~L~~~ 78 (404) T protein:vir:10 1 MTTVTSAQA-NKLYQVALFTAAN-RNRSMVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNKQAGDEVTFSIMHK 78 (404) T ss_pred CCCcCCcch-hhhHHHHHHHHHh-cCChhHhhhhhhhhhhhhhccchhhccCCCCCccEEEeecCCCCCCcEEEEeEeee Confidence 433332211 11111110 000 11124667766643322222111 2333444443 399999998877 Q ss_pred ceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 68 TKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 68 ~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) .+-.....++.+.+..+++.-...+|+|||..-.-..=..+++-.+-+|+|++.-..++.-+++..||.+|.+++..... T Consensus 79 L~g~gv~Gd~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~~laG~rg~ 158 (404) T protein:vir:10 79 LSKRPTMGDERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGARGD 158 (404) T ss_pred cccCCcccCceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhccccc Confidence 77666666777777777888888999999976442222578888899999999999999999999999999999854321 Q ss_pred -cc-----cc--ccc------ccc-ccC-cceeecccccccccchhhhHH--HHHHHHHHHHHHhhhcCCCC-------C Q lcl|NC_015249. 148 -PS-----AS--DEN------IAG-LGK-AHVLEVGKQSELRGDQVKLGQ--AIIAQLTLARAKLTGNYVPS-------A 202 (347) Q Consensus 148 -~~-----~~--~~~------~~~-~~~-g~~i~~~~~~~~~~~~~~~~~--~~~~~l~~a~~~Lde~~VP~-------~ 202 (347) .. +. +.. +.. .|. .-.+..+. .+.+...... ..++.|-++.+.+++..-|. 
+ T Consensus 159 ~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~---at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv~~~g~ 235 (404) T protein:vir:10 159 FVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGD---ATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLSGD 235 (404) T ss_pred cccccceeeccccccccceeecccCCCCCCcEEeccC---ccchhhhhhcccccHHHHHHHHHHHHHhCCCCcceEeccc Confidence 00 00 000 000 000 00111111 1111111111 12344556667776643332 2 Q ss_pred C-------CEEEeCHHHHHHHhcchh---hh---h-hhh---ccccccccceEEEEeceEEEEecceecccccccccccc Q lcl|NC_015249. 203 D-------RVFYTTPDNYSAILAALM---PN---A-ANY---QALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEG 265 (347) Q Consensus 203 g-------R~~vv~P~~~~~Ll~~~~---~~---~-~~~---~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~ 265 (347) . ++++++|.+|..|..++. +. . +.. ...+.+-.|.+|.++|+-|++.++.|+--......... T Consensus 236 ~~~~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~G~~gm~ngvii~~~~~~~Irf~~g~~~~~~ 315 (404) T protein:vir:10 236 ELHGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFKGECAMWRNILVRKYAGMPIRFYQGSKVLVS 315 (404) T ss_pred cccCccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCceecCeeEEcCEEEEecCCceeeecccceeeec Confidence 2 889999999999999852 22 1 111 23455788999999999999999887532221111111 Q ss_pred cc-cccccccccccccccccccccceEEEEechhhh--hhhhhc---ceeeeeeechhhhcceeeeeeeecccccc-c-- Q lcl|NC_015249. 266 AN-PTGQKHAFPETSSGDTRVALDNVVGLFNHRSAV--GTVKLK---DMALERARRANFQADQIIAKYAMGHGGLR-P-- 336 (347) Q Consensus 266 ~~-~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av--~~v~~~---~~~~e~~~d~~~~~d~i~~~~a~G~~~~R-p-- 336 (347) .+ ..+.... ....+.-..+|++-..|+ +.++.. ..-.|..+|-.++ -.|.....+|.+=.| | T Consensus 316 ~n~~~a~~~~--------~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~-~~i~~~~i~G~kK~rF~~~ 386 (404) T protein:vir:10 316 ENNLTATTKE--------VAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNR-TEIAISWINGLKKIRFPEK 386 (404) T ss_pred CCcccccccc--------ccccccchhheeecceeEEEEeeccCCCCceeEeeccccCch-hhhhhHHHhhhhhccccCC Confidence 11 1111111 111111123344434433 222221 1122333344333 334445555655455 4 Q ss_pred ----ceEEEEEEcCC Q lcl|NC_015249. 
337 ----EACGALVFNKA 347 (347) Q Consensus 337 ----e~a~~i~~~~a 347 (347) .-=|+|+++.| T Consensus 387 ~g~~~DfGvi~idta 401 (404) T protein:vir:10 387 SGKMQDHGVIAVDTA 401 (404) T ss_pred CCceeeEEEEEeccc Confidence 23456666666 No 161 >protein:vir:819 Length: 404 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:16 # MgeName: VT2-Sa # Cross-refs: genbank:acc:NP_050552;genbank:gi:9633449;genbank:GeneID:1262254 Probab=98.71 E-value=3.5e-09 Score=66.93 Aligned_cols=333 Identities=11% Similarity=0.065 Sum_probs=171.4 Q ss_pred CCcccccccccccccccc--cccchhhhhhhhhhhHHHHHHHHHHhh---------hccccccccc--ccceEEEeecCc Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGM--SAGDKLALFLKVFGGEVLTAFTRTSVT---------MNKHLVRSIQ--SGKSAQFPVLGR 67 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~--~~~d~~al~ie~f~g~V~~~f~~~s~~---------~~~~~~r~i~--~G~tv~i~~iG~ 67 (347) |......+- ....-.+. ..+ ...-+++.|.+.+...=+..+-+ .+.++..++. .|++|.|.-+-. T Consensus 1 ~~~~~~~~a-~~~~~~~lft~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K~aGd~vtf~L~~~ 78 (404) T protein:vir:81 1 MTTVTSAQA-NKLYQVALFTAAN-RNRSMVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNKQAGDEVTFSIMHK 78 (404) T ss_pred CCCcCCcch-hhhHHHHHHHHHh-cCChhHhhhhhhhhhhhhhccchhhccCCCCCccEEEeecCCCCCCcEEEEeEeee Confidence 433332211 11111110 000 11124667766643322222111 2333444443 399999998877 Q ss_pred ceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 68 TKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 68 ~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) .+-.....++.+.+..+++.-...+|+|||..-.-..=..+++-.+-+|+|++.-..++.-+++..||.+|.+++..... 
T Consensus 79 L~g~gv~Gd~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~~laG~rg~ 158 (404) T protein:vir:81 79 LSKRPTMGDERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGARGD 158 (404) T ss_pred cccCCcccCceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhccccc Confidence 77666666777777777888888999999976442222578888899999999999999999999999999999854321 Q ss_pred -cc-----cc--ccc------ccc-ccC-cceeecccccccccchhhhHH--HHHHHHHHHHHHhhhcCCCC-------C Q lcl|NC_015249. 148 -PS-----AS--DEN------IAG-LGK-AHVLEVGKQSELRGDQVKLGQ--AIIAQLTLARAKLTGNYVPS-------A 202 (347) Q Consensus 148 -~~-----~~--~~~------~~~-~~~-g~~i~~~~~~~~~~~~~~~~~--~~~~~l~~a~~~Lde~~VP~-------~ 202 (347) .. +. +.. +.. .|. .-.+..+. .+.+...... ..++.|-++.+.+++..-|. + T Consensus 159 ~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~---at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv~~~g~ 235 (404) T protein:vir:81 159 FVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGD---ATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLSGD 235 (404) T ss_pred cccccceeeccccccccceeecccCCCCCCcEEeccC---ccchhhhhhcccccHHHHHHHHHHHHHhCCCCcceEeccc Confidence 00 00 000 000 000 00111111 1111111111 12344556667776643332 2 Q ss_pred C-------CEEEeCHHHHHHHhcchh---hh---h-hhh---ccccccccceEEEEeceEEEEecceecccccccccccc Q lcl|NC_015249. 203 D-------RVFYTTPDNYSAILAALM---PN---A-ANY---QALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEG 265 (347) Q Consensus 203 g-------R~~vv~P~~~~~Ll~~~~---~~---~-~~~---~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~ 265 (347) . ++++++|.+|..|..++. +. . +.. ...+.+-.|.+|.++|+-|++.++.|+--......... 
T Consensus 236 ~~~~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~G~~gm~ngvii~~~~~~~Irf~~g~~~~~~ 315 (404) T protein:vir:81 236 ELHGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFKGECAMWRNILVRKYAGMPIRFYQGSKVLVS 315 (404) T ss_pred cccCccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCceecCeeEEcCEEEEecCCceeeecccceeeec Confidence 2 889999999999999852 22 1 111 23455788999999999999999887532221111111 Q ss_pred cc-cccccccccccccccccccccceEEEEechhhh--hhhhhc---ceeeeeeechhhhcceeeeeeeecccccc-c-- Q lcl|NC_015249. 266 AN-PTGQKHAFPETSSGDTRVALDNVVGLFNHRSAV--GTVKLK---DMALERARRANFQADQIIAKYAMGHGGLR-P-- 336 (347) Q Consensus 266 ~~-~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av--~~v~~~---~~~~e~~~d~~~~~d~i~~~~a~G~~~~R-p-- 336 (347) .+ ..+.... ....+.-..+|++-..|+ +.++.. ..-.|..+|-.++ -.|.....+|.+=.| | T Consensus 316 ~n~~~a~~~~--------~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~-~~i~~~~i~G~kK~rF~~~ 386 (404) T protein:vir:81 316 ENNLTATTKE--------VAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNR-TEIAISWINGLKKIRFPEK 386 (404) T ss_pred CCcccccccc--------ccccccchhheeecceeEEEEeeccCCCCceeEeeccccCch-hhhhhHHHhhhhhccccCC Confidence 11 1111111 111111123344434433 222221 1122333344333 334445555655455 4 Q ss_pred ----ceEEEEEEcCC Q lcl|NC_015249. 337 ----EACGALVFNKA 347 (347) Q Consensus 337 ----e~a~~i~~~~a 347 (347) .-=|+|+++.| T Consensus 387 ~g~~~DfGvi~idta 401 (404) T protein:vir:81 387 SGKMQDHGVIAVDTA 401 (404) T ss_pred CCceeeEEEEEeccc Confidence 23456666666 No 162 >protein:vir:10123 Length: 404 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:180 # MgeName: Stx2 converting bacteriophage II # Cross-refs: genbank:acc:NP_859253;genbank:gi:32171009;genbank:GeneID:2653345 Probab=98.71 E-value=3.5e-09 Score=66.93 Aligned_cols=333 Identities=11% Similarity=0.065 Sum_probs=171.4 Q ss_pred CCcccccccccccccccc--cccchhhhhhhhhhhHHHHHHHHHHhh---------hccccccccc--ccceEEEeecCc Q lcl|NC_015249. 
1 MAKMNGGQQIGKDQGKGM--SAGDKLALFLKVFGGEVLTAFTRTSVT---------MNKHLVRSIQ--SGKSAQFPVLGR 67 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~--~~~d~~al~ie~f~g~V~~~f~~~s~~---------~~~~~~r~i~--~G~tv~i~~iG~ 67 (347) |......+- ....-.+. ..+ ...-+++.|.+.+...=+..+-+ .+.++..++. .|++|.|.-+-. T Consensus 1 ~~~~~~~~a-~~~~~~~lft~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K~aGd~vtf~L~~~ 78 (404) T protein:vir:10 1 MTTVTSAQA-NKLYQVALFTAAN-RNRSMVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNKQAGDEVTFSIMHK 78 (404) T ss_pred CCCcCCcch-hhhHHHHHHHHHh-cCChhHhhhhhhhhhhhhhccchhhccCCCCCccEEEeecCCCCCCcEEEEeEeee Confidence 433332211 11111110 000 11124667766643322222111 2333444443 399999998877 Q ss_pred ceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 68 TKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 68 ~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) .+-.....++.+.+..+++.-...+|+|||..-.-..=..+++-.+-+|+|++.-..++.-+++..||.+|.+++..... T Consensus 79 L~g~gv~Gd~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~~laG~rg~ 158 (404) T protein:vir:10 79 LSKRPTMGDERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGARGD 158 (404) T ss_pred cccCCcccCceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhccccc Confidence 77666666777777777888888999999976442222578888899999999999999999999999999999854321 Q ss_pred -cc-----cc--ccc------ccc-ccC-cceeecccccccccchhhhHH--HHHHHHHHHHHHhhhcCCCC-------C Q lcl|NC_015249. 148 -PS-----AS--DEN------IAG-LGK-AHVLEVGKQSELRGDQVKLGQ--AIIAQLTLARAKLTGNYVPS-------A 202 (347) Q Consensus 148 -~~-----~~--~~~------~~~-~~~-g~~i~~~~~~~~~~~~~~~~~--~~~~~l~~a~~~Lde~~VP~-------~ 202 (347) .. +. +.. +.. .|. .-.+..+. .+.+...... ..++.|-++.+.+++..-|. 
+ T Consensus 159 ~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~---at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv~~~g~ 235 (404) T protein:vir:10 159 FVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGD---ATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLSGD 235 (404) T ss_pred cccccceeeccccccccceeecccCCCCCCcEEeccC---ccchhhhhhcccccHHHHHHHHHHHHHhCCCCcceEeccc Confidence 00 00 000 000 000 00111111 1111111111 12344556667776643332 2 Q ss_pred C-------CEEEeCHHHHHHHhcchh---hh---h-hhh---ccccccccceEEEEeceEEEEecceecccccccccccc Q lcl|NC_015249. 203 D-------RVFYTTPDNYSAILAALM---PN---A-ANY---QALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEG 265 (347) Q Consensus 203 g-------R~~vv~P~~~~~Ll~~~~---~~---~-~~~---~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~ 265 (347) . ++++++|.+|..|..++. +. . +.. ...+.+-.|.+|.++|+-|++.++.|+--......... T Consensus 236 ~~~~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~G~~gm~ngvii~~~~~~~Irf~~g~~~~~~ 315 (404) T protein:vir:10 236 ELHGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFKGECAMWRNILVRKYAGMPIRFYQGSKVLVS 315 (404) T ss_pred cccCccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCceecCeeEEcCEEEEecCCceeeecccceeeec Confidence 2 889999999999999852 22 1 111 23455788999999999999999887532221111111 Q ss_pred cc-cccccccccccccccccccccceEEEEechhhh--hhhhhc---ceeeeeeechhhhcceeeeeeeecccccc-c-- Q lcl|NC_015249. 266 AN-PTGQKHAFPETSSGDTRVALDNVVGLFNHRSAV--GTVKLK---DMALERARRANFQADQIIAKYAMGHGGLR-P-- 336 (347) Q Consensus 266 ~~-~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av--~~v~~~---~~~~e~~~d~~~~~d~i~~~~a~G~~~~R-p-- 336 (347) .+ ..+.... ....+.-..+|++-..|+ +.++.. ..-.|..+|-.++ -.|.....+|.+=.| | T Consensus 316 ~n~~~a~~~~--------~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~-~~i~~~~i~G~kK~rF~~~ 386 (404) T protein:vir:10 316 ENNLTATTKE--------VAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNR-TEIAISWINGLKKIRFPEK 386 (404) T ss_pred CCcccccccc--------ccccccchhheeecceeEEEEeeccCCCCceeEeeccccCch-hhhhhHHHhhhhhccccCC Confidence 11 1111111 111111123344434433 222221 1122333344333 334445555655455 4 Q ss_pred ----ceEEEEEEcCC Q lcl|NC_015249. 
337 ----EACGALVFNKA 347 (347) Q Consensus 337 ----e~a~~i~~~~a 347 (347) .-=|+|+++.| T Consensus 387 ~g~~~DfGvi~idta 401 (404) T protein:vir:10 387 SGKMQDHGVIAVDTA 401 (404) T ss_pred CCceeeEEEEEeccc Confidence 23456666666 No 163 >protein:vir:4159 Length: 315 # NCBI annotation: structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:87 # MgeName: psiM2 # Cross-refs: genbank:acc:NP_046968;genbank:gi:9630538;genbank:GeneID:1261712 Probab=98.61 E-value=2.1e-08 Score=62.67 Aligned_cols=305 Identities=13% Similarity=0.059 Sum_probs=155.4 Q ss_pred CCccccc--ccccccccccccccchhhhhh--hhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcc--eeeeee Q lcl|NC_015249. 1 MAKMNGG--QQIGKDQGKGMSAGDKLALFL--KVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRT--KAAYLQ 74 (347) Q Consensus 1 ma~~~~~--~~~~t~~~~~~~~~d~~al~i--e~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~--~~~~~~ 74 (347) |-.+..- ++. .+.-+....+|...-++ ++++. ....-++.|.++...++.+..++.+..|+.+|-. .....+ T Consensus 1 ~~~~~~~~~~~~-~~~~k~~t~~d~~Gg~l~P~~~~~-~i~~~~e~s~~l~~~~vi~~~~~~~~~i~~~g~~~~~~~g~~ 78 (315) T protein:vir:41 1 MLTIEDIRGGKP-FEIVPKIDVPDLGRGVLSVDRFGE-FVKAVRDSAVIIPEARIDNALKSYEKDISRLSLVLDVGPGRD 78 (315) T ss_pred CcccchhhcCCh-hhhhhhcCCcCCCCceechHHHHH-HHHHHHhhhhhhhhceeeeccccccccccccccCcccccccc Confidence 2211100 000 01111111122222233 56654 5566777898998888765555566667776532 221112 Q ss_pred cCCCCC-CccCCCCCceEEEEEEeeeeccccc--ccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccc Q lcl|NC_015249. 75 PGENLD-DKRKDMKHTERTINIDGLLTADVLI--YDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSAS 151 (347) Q Consensus 75 ~g~~~~-~~~~~~~~~~~~l~ID~~~~~~~~I--dd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~ 151 (347) .+++-. .+...++-.+.+|..-++. 
+...| +-+|+..-..|+.+.+..+.++++++..+.+++.-- .+ ...+- T Consensus 79 ~~~~~~~~~~~~~~f~~~~l~~~~l~-~~~~it~elL~D~~~~~~~e~~l~~~~a~~~a~~~~~~~~nGd--g~-s~~p~ 154 (315) T protein:vir:41 79 ETGQKLAPPESTAEVKTNTLYMREMV-TKVVIHEDAIEDNIEGKAFEQKIVTLLGEGISYVLEKYYLHGD--TS-SSDPL 154 (315) T ss_pred cccCcCCCCCCccccceeeeceeeee-eeccccHHHHHhhhccccHHHHHHHHHHHHHHHHHHHHhhccC--Cc-CcCcc Confidence 111111 1112344455556555543 33344 233433334689999999999999999887665310 00 00010 Q ss_pred ccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCC-CCCEEEeCHHHHHHHhcchhhhhhhhccc Q lcl|NC_015249. 152 DENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPS-ADRVFYTTPDNYSAILAALMPNAANYQAL 230 (347) Q Consensus 152 ~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~-~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~ 230 (347) -..+ .|.+-..+.................+.|.++...|..+.--. .+-..++++..+..|.+-. -.+..|.++ T Consensus 155 ~~~~----~G~l~~a~~~~~~~~~~~~a~~~~~d~l~~l~~sl~~~yr~~~~~~~~imn~~t~~~~rklk-~~~g~~lw~ 229 (315) T protein:vir:41 155 LRMS----DGWLKLASEKLTESDVDPEAEDWPMNLFDTMIESLPTPYRNNLPNMKFYVTWDIYRAYRDAL-KGRETGLGD 229 (315) T ss_pred cccc----ccceecccccccccccccccccccHHHHHHHHHhcChHHhhcCCceEEEEcHHHHHHHHHHh-ccCCCcccc Confidence 0111 222111100000000000001112344555555555433211 1335688999888775422 334567777 Q ss_pred cccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceee Q lcl|NC_015249. 231 IDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMAL 310 (347) Q Consensus 231 ~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~ 310 (347) ..+..|....+.|.+|+.++.+|........ 
-+-++|.+.+ .+-..++.+ T Consensus 230 ~~~~~g~~~tl~G~PV~~~~~m~~~~~~~~~--------------------ilf~d~~nl~----------~~~~~~i~i 279 (315) T protein:vir:41 230 QALTGANSILYDGRPVQYVPALEALNDGKSR--------------------ALFVVPTQLV----------YGFWRNIKV 279 (315) T ss_pred chhhcCCCceecccceEecccccccCCCCcc--------------------EEEecccceE----------EEeccccEE Confidence 7788888899999999999999854322111 1112333321 122345778 Q ss_pred eeeechhhhcceeeeeeeecccccccceEEEEEEcC Q lcl|NC_015249. 311 ERARRANFQADQIIAKYAMGHGGLRPEACGALVFNK 346 (347) Q Consensus 311 e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~ 346 (347) ++.|+.......+....+.|.+..-++++++-+++. T Consensus 280 ~~~~~a~~~~~~~~~~~r~d~~~~~~~~~a~~~~~v 315 (315) T protein:vir:41 280 VPDYDAEMRLTKYVASLRTDNHYEDEEGAVSATITV 315 (315) T ss_pred EeeecCCCCceEEEEEEEeceeEEeccceeEeeeeC Confidence 888888877788888888888776666666666666 No 164 >protein:vir:4197 Length: 314 # NCBI annotation: putative structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:88 # MgeName: psiM100 # Cross-refs: genbank:acc:NP_071822;genbank:gi:11863105;genbank:GeneID:1257607 Probab=98.61 E-value=3.1e-08 Score=61.75 Aligned_cols=299 Identities=15% Similarity=0.124 Sum_probs=161.8 Q ss_pred CCccccccccc-----ccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcce--eeee Q lcl|NC_015249. 1 MAKMNGGQQIG-----KDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTK--AAYL 73 (347) Q Consensus 1 ma~~~~~~~~~-----t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~--~~~~ 73 (347) |==++-..++. +..+ +| + |--++|+ ++.+.-+..|.++...++..-.+..+..|+++|... .... 
T Consensus 1 ~~~~~~~~~~~k~it~~d~~----gG--~-L~P~~~~-~~i~~l~e~s~i~~~a~vi~t~~s~~~~i~~i~~g~~~~~~~ 72 (314) T protein:vir:41 1 MDFLNKPFQITPKIDVPDLG----KG--I-LAVQRFG-EFVREVRENSAIIKDARVLNALKSYEVDISRISLGVELEPGR 72 (314) T ss_pred CchhhhHHHhhcccccccCC----Cc--e-eChHHHH-HHHHHHHhccchhhheeeecccCccceeecccccCccccccc Confidence 32222111111 1221 11 1 3347886 566788888999988887543344567888887431 1111 Q ss_pred ec-CCCCCCccCCCCCceEEEEEEeeeeccccc-ccHHHHHh-ChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc-c Q lcl|NC_015249. 74 QP-GENLDDKRKDMKHTERTINIDGLLTADVLI-YDIEDAMN-HYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLP-S 149 (347) Q Consensus 74 ~~-g~~~~~~~~~~~~~~~~l~ID~~~~~~~~I-dd~D~~q~-~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~-~ 149 (347) .- |+....+..+++-+.++|..-++.. .+.| +++=+... ..|+.+.++.+.++++++....+++.-- ....+ . T Consensus 73 ~~~~~~~~~~~~~~tf~~~~l~~~kl~~-~v~is~e~L~D~a~~~~le~~i~~~~Ae~~g~~~~~~~~nGd--g~~~s~~ 149 (314) T protein:vir:41 73 NTSGTKVAPTADEVTVSTNTLEMKELVT-KVVLEDEALEDNIEQSAFEQTITSLLASGVTYDLECFFLHAD--SSLTTGR 149 (314) T ss_pred ccccCCccCCcccccccceeeeeEEEEE-eecccHHHHHhhhchhhHHHHHHHHHHHHHHHHHHHHhhccc--cCCcCcc Confidence 11 1111112234555666777766644 3455 33322222 3589999999999999998877665321 00000 0 Q ss_pred ccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCC-CCEEEeCHHHHHHHhcchhhhhhhhc Q lcl|NC_015249. 150 ASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSA-DRVFYTTPDNYSAILAALMPNAANYQ 228 (347) Q Consensus 150 ~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~-gR~~vv~P~~~~~Ll~~~~~~~~~~~ 228 (347) +-...+.| ..-..+.. .............+.|.++...|....--.. .-..+++++.+..+.+-. -.+..+. 
T Consensus 150 ~~~~~p~G----~l~~a~~~--~~~~~~~~~~~~~~~~~~l~~sl~~~yr~~~~~~~~~m~~~t~~~~r~~l-~~~~~~l 222 (314) T protein:vir:41 150 ELYRINDG----WMKLAGNQ--YTDAEPEDENWPLNLFDGMMDELDTRYLQLKPRMKFYVSNEIYNGYRKQL-LVRETGL 222 (314) T ss_pred cchhcchh----hhhhcccc--eeecCccccccHHHHHHHHHHhcCchhhcCCCceEEEecHHHHHHHHHHH-hccCCcc Confidence 00111222 11100000 0000000112233445556666655432111 224556888887775421 1123345 Q ss_pred cccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcce Q lcl|NC_015249. 229 ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDM 308 (347) Q Consensus 229 ~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~ 308 (347) ++..+..|.-..+.|++|+.++.+|..+... ...++.+++-+..+-..++ T Consensus 223 ~~~~~~~~~~~~l~G~PV~~~~~~~~~~~~~------------------------------~~i~fgd~~nlv~~~~~~i 272 (314) T protein:vir:41 223 GDSALIGATGLQYDGIPIQYVPALDALGDDK------------------------------ARALLTVPTNLVYGFWRNI 272 (314) T ss_pred cchhhhCCCCceecceeeEecccccccCCCC------------------------------ceEEEechhheEEEeecee Confidence 6666778888899999999999997432211 1112222332222334557 Q ss_pred eeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
309 ALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 309 ~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .++.+|+.......+.+..+++....-+++|+..++..| T Consensus 273 r~~~~~~a~~~~~~~~~~~r~d~~~~~~~aa~~~~~~~~ 311 (314) T protein:vir:41 273 RIEPKRDAAMRRTEYIASLRADCNYEDENAAVAAVIDMS 311 (314) T ss_pred EEeecccCcCCeEEEEEEEEeceEEEEcCcEEEEEeecc Confidence 888888888888888999999999988999999999998 No 165 >protein:vir:9643 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:173 # MgeName: 315.1 # Cross-refs: genbank:acc:NP_795405;genbank:gi:28876178;genbank:GeneID:1257724 Probab=98.51 E-value=1.5e-07 Score=57.94 Aligned_cols=285 Identities=11% Similarity=0.019 Sum_probs=146.2 Q ss_pred CCccccccccccc--------ccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-Ccceee Q lcl|NC_015249. 1 MAKMNGGQQIGKD--------QGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAA 71 (347) Q Consensus 1 ma~~~~~~~~~t~--------~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~ 71 (347) .++... +..|. --+.+..++--.|.-+.+..++.+...+.|.++.++++.++. |. .+|++- +..++. T Consensus 59 ~~~~~~--~~lt~ee~~~~~~~~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~i~~~~~v~~~~-~~-~~i~~~~~~~~a~ 134 (377) T protein:vir:96 59 DLRDKN--RELTAEEIKFFNDIDKNVGGKDKFKLLPEETMVQVFDDLVAEHPLLKVINFKNTS-LR-LKALTAETSGTAV 134 (377) T ss_pred HhccCC--cccCHHHHHHHHHHHhcCCCCCCceecCHHHHHHHHHHHHhhhhhhhhceeEecC-Cc-eEEEEecCCccee Confidence 111000 00000 000111222222455889999999998999999999888764 43 556654 334555 Q ss_pred eeecCCCCCCccCCCCCceEEEEEEeeeecc-cccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 72 YLQPGENLDDKRKDMKHTERTINIDGLLTAD-VLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 72 ~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~-~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) -...+.++..+ .+++-.+++|.. .++.. 
..|..-=-..+.+|+-+.+.++.+.++++..|++++.-- T Consensus 135 wv~e~~~~~~~-~~~~f~~i~l~~--~kl~~~~~is~~ll~ds~~~le~~i~~~l~~~~~~~~~~a~i~G~--------- 202 (377) T protein:vir:96 135 WGDIFGEIKGQ-LKQAFKEQDFSQ--FKLTAFVVIPKDALKFGPKWLKQFITEQLKEAIAVALELAIVKGN--------- 202 (377) T ss_pred Eeecccccccc-cCccceeEeeee--eeEEeechhhHHHhhcchhhHHHHHHHHHHHHHHHHHhhceEecc--------- Confidence 44444444322 234445545544 44444 334322223467789999999999999999999875311 Q ss_pred ccccccccc---------------CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCC--C---CCCCEEEeCH Q lcl|NC_015249. 151 SDENIAGLG---------------KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYV--P---SADRVFYTTP 210 (347) Q Consensus 151 ~~~~~~~~~---------------~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~V--P---~~gR~~vv~P 210 (347) ....|.|.- ............ ......+..+++.+..+...+....- | ...-+.+++| T Consensus 203 G~~~P~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~a~~~mn~ 280 (377) T protein:vir:96 203 GLLQPVGLLKDLSQPTVDQSTGRDITTYKTDKEAIA--DLSDLDPDTAVELLVPVMKHLSVNDKKHPLKIAGQVKLLLNP 280 (377) T ss_pred CCCcceeeeeccccccccccccccccceeecccccc--ccccCChhHHHHHHHHHHHhhccccccccccccCceEEEEch Confidence 001111110 000000000000 00011233344444444444443321 1 1123567888 Q ss_pred HHHHHHhcchhhhhhhhccccccccceEEEEec--eEEEEecceeccccccccccccccccccccccccccccccccccc Q lcl|NC_015249. 211 DNYSAILAALMPNAANYQALIDPSTGSIRNVMG--FEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALD 288 (347) Q Consensus 211 ~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~i~G--~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~ 288 (347) ..|..++......+ .+|.-..+.| .+|++|+.+|... 
-+-+||+ T Consensus 281 ~t~~~~~~~~~~~~---------~~G~~~~~l~~p~~v~~s~~~p~~~-------------------------i~fgdf~ 326 (377) T protein:vir:96 281 EDRWTLEAKFTSRN---------QFGEYVTVLPHGITILESLAVETGK-------------------------AIAFVAN 326 (377) T ss_pred hhHHhccccccccC---------CCCCceeccCCCceEEecCCCCccc-------------------------EEEEEcC Confidence 87776642221111 2344445554 4578888887421 1223554 Q ss_pred ceEEEEechhhhhhhhhcceeeeeeechh--hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 289 NVVGLFNHRSAVGTVKLKDMALERARRAN--FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 289 ~~~~l~~~~~Av~~v~~~~~~~e~~~d~~--~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ... +.- ..++.++...+.. +-...+++.+++++++++|++.+++.+..+ T Consensus 327 ~Y~--i~~--------r~~~~i~~~~~~~~~~d~~~f~~~~r~dG~~~d~~a~~vl~l~~~ 377 (377) T protein:vir:96 327 RYD--AFM--------ATASTIEEYDQTFAMEDLQLYLTKNYFYGKAKDNHTAALLTLAGG 377 (377) T ss_pred cEE--EEE--------ecccEEEeehhhhhhcCCeEEEEEEEEcCEEecCCcEEEEEEecC Confidence 422 222 2345566554333 223458899999999999999999999999 No 166 >protein:vir:3158 Length: 321 # NCBI annotation: capsid protein gpE # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:316 # MgeName: PhiCh1 # Cross-refs: genbank:acc:NP_665929;genbank:gi:22091115;genbank:GeneID:951342 Probab=98.45 E-value=4e-08 Score=61.16 Aligned_cols=298 Identities=10% Similarity=0.003 Sum_probs=146.3 Q ss_pred CCcccccccccccccc--cccccchhhhh--hhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcc--eeeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGK--GMSAGDKLALF--LKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRT--KAAYLQ 74 (347) Q Consensus 1 ma~~~~~~~~~t~~~~--~~~~~d~~al~--ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~--~~~~~~ 74 (347) |+.-.-.++.. +.-+ ....+|...=| ...+..++...-+..|.++...++..+++ .+-+|+.+|-. 
....-+ T Consensus 1 ~~~k~~~~~l~-~~~~~~~~~~~~~~~g~~v~~~~~~~l~~~i~e~s~~l~~i~v~~v~~-~~~~i~~~~~~~~~~~~~~ 78 (321) T protein:vir:31 1 MASRTINNDLS-RITEKNALTVDDLDAGGTLPDPLWDEFWTDMIEETPLLDAIRTETVGA-KKTRIPTLNIGERHRRPQD 78 (321) T ss_pred CchHHHHHHHH-HHHHhccccccccCCcceeCHHHHHHHHHHHHHhhhhhhhceeeeccC-cceeeeeeccCCccccccc Confidence 65544322211 1111 11222322222 37788888888888888898888887654 34566666532 222111 Q ss_pred cCCCCCCccCCCCCceEEEEEEeeeeccccc--ccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc Q lcl|NC_015249. 75 PGENLDDKRKDMKHTERTINIDGLLTADVLI--YDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASD 152 (347) Q Consensus 75 ~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~I--dd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~ 152 (347) .++.. ....+++-+++++..-+.. +...| +-+|++....|+.+.+.+..++++++..++.++.- -..+.+.. T Consensus 79 e~~~~-~~~~~~~~~~~~~~~~k~~-~~~~it~e~L~d~a~~~d~e~~i~~~ia~~~a~~~~~~~~nG----d~~~~~~~ 152 (321) T protein:vir:31 79 EGEWN-ENESDVSTGTIDISTEKAT-VAWDLPREVVQENPEGEALADRILNLMTDAWSADVEDLAANG----DEDAEDSF 152 (321) T ss_pred ccccc-cccccceeeeeeeeeEEEE-eehhccHHHHHhhhcchhHHHHHHHHHHHHHHHHHHhheeec----cccCCCcc Confidence 22211 1112344455666665553 33334 23444333578999999999999999998765421 10111110 Q ss_pred cccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccc Q lcl|NC_015249. 153 ENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALID 232 (347) Q Consensus 153 ~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~ 232 (347) .. ...|.+-.........+ .......++.|.++...|+++.--..+-+.+++++.+..+++-..- +....+... 
T Consensus 153 ~~---~n~G~l~~a~~~~~~~~--~~~~~~~~d~l~~l~~~l~~~yr~~~~~v~im~~~~~~~~~~~l~~-~~~~~~~~~ 226 (321) T protein:vir:31 153 EN---QNDGFITVAEGDVETID--AADDILDNDLVIRTIAGLDSKYRARMNPALIVSEDQLLSYHYTLTD-RDTPLGDNV 226 (321) T ss_pred cc---cchhhhhhhcccccccc--ccccccCHHHHHHHHHhccHhHhcCCCeEEEechHHHHHHHHHHhc-CCCccccch Confidence 00 01121110000000000 0001112355666777777665322244678999987665432211 112234455 Q ss_pred cccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeee Q lcl|NC_015249. 233 PSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALER 312 (347) Q Consensus 233 ~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~ 312 (347) +..|...++.|++|+.++++|.... .-.++++.+ .+-..++..++ T Consensus 227 l~~~~~~tl~G~pvv~~~~mP~~~i-------------------------l~t~~~nl~----------~~~~~~~~~~~ 271 (321) T protein:vir:31 227 IMGEADVNPFSFPIIGSGLWPDDKA-------------------------MFTDPQNLI----------YALYRDLEIDV 271 (321) T ss_pred hhccccccccceeEEEcCCCCCCcE-------------------------EEeccccEE----------EEEeeccEEEE Confidence 6677777899999999999984221 011222322 11223345666 Q ss_pred eechhh---hcceeeeeeee--cccccccceEEEEE-EcCC Q lcl|NC_015249. 313 ARRANF---QADQIIAKYAM--GHGGLRPEACGALV-FNKA 347 (347) Q Consensus 313 ~~d~~~---~~d~i~~~~a~--G~~~~Rpe~a~~i~-~~~a 347 (347) .++... 
+.+.+...+++ +..+.++++++.++ ++-+ T Consensus 272 ~~~~~~~~~~~~~~~~~~~~~~~~~ve~~~a~a~~~~i~~~ 312 (321) T protein:vir:31 272 LTESDKVSERDLHARYFMRGDDDFAIENTEAVVLAEGLGDP 312 (321) T ss_pred eecCccccccceeeEeeeeeecceeEeccccEEEEecCCcc Confidence 655442 23445554443 44455666555544 3333 No 167 >protein:vir:100632 Length: 381 # NCBI annotation: 77ORF006 # Family: family:all:635 # MgeID: mge:1476 # MgeName: 77 # Cross-refs: genbank:acc:NP_958606;genbank:gi:41189521;genbank:GeneID:2743778 Probab=98.45 E-value=1.2e-07 Score=58.53 Aligned_cols=284 Identities=13% Similarity=0.064 Sum_probs=141.5 Q ss_pred CCccccccccccc----------ccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCc-ce Q lcl|NC_015249. 1 MAKMNGGQQIGKD----------QGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGR-TK 69 (347) Q Consensus 1 ma~~~~~~~~~t~----------~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~-~~ 69 (347) +... +... .+. .+-+ .+|. .|.-++|..++.+.....|.++.+.++.++ +|+ .++++... .. T Consensus 57 ~~~~-~~~~-l~~~e~~~~~~~~~~t~-~~Gg--~lvP~~~~~~I~~~l~~~spir~~a~v~~~-~~~-~~i~~~~~~~~ 129 (381) T protein:vir:10 57 SLPK-SAQT-LSANQRNFFMDINKSVG-YKEE--KLLPEETIDRIFEDLTTNHPLLADLGIKNA-GLR-LKFLKSETSGV 129 (381) T ss_pred Hhcc-cccc-cCHHHHHHHHHHhhcCC-CCCc--eecCHHHHHHHHHHHHhhcceeeeeeeEec-Ccc-eEEEeecCCcc Confidence 1100 0000 000 1111 1111 245699999999999999999999988875 444 45555433 33 Q ss_pred eeeeecCCCCCCccCCCCCceEEEEEEeeeecc-cccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc Q lcl|NC_015249. 70 AAYLQPGENLDDKRKDMKHTERTINIDGLLTAD-VLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLP 148 (347) Q Consensus 70 ~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~-~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~ 148 (347) +.-..-+..+..+ .+++-.+++ +...++.. 
..|..-=-..+.+|+-+.+..+.+.++++..|++++.- T Consensus 130 a~W~~e~~~~~~~-~~~~f~~i~--l~~~kl~a~i~is~elL~Ds~~~le~~i~~~la~~~a~~~~~afi~G-------- 198 (381) T protein:vir:10 130 AVWGKIYGEIKGQ-LDAAFSEET--AIQNKLTAFVVLPKDLNDFGPAWIERFVRVQIEEAFAVALETAFLKG-------- 198 (381) T ss_pred eEEeecccccccc-cCccceeEe--ecceeEEeeccccHHHHhccHHHHHHHHHHHHHHHHHHHhhceeEec-------- Confidence 3322222222211 123334444 44444443 33422112235678899999999999999999877521 Q ss_pred cccccccccc----cCcceeecccccccccchhh---hHHHHHHHHHHHHHHhh----hcCC-CCCCCEEEeCHHHHHHH Q lcl|NC_015249. 149 SASDENIAGL----GKAHVLEVGKQSELRGDQVK---LGQAIIAQLTLARAKLT----GNYV-PSADRVFYTTPDNYSAI 216 (347) Q Consensus 149 ~~~~~~~~~~----~~g~~i~~~~~~~~~~~~~~---~~~~~~~~l~~a~~~Ld----e~~V-P~~gR~~vv~P~~~~~L 216 (347) .....|.|. .++..+..+...+....... .....++.+......+. .+.. +..+.+++++|..+..| T Consensus 199 -dG~~qP~Gil~~~~~~~~~~~g~~~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~vmn~~t~~~l 277 (381) T protein:vir:10 199 -TGKDQPIGLNRQVQKGVSVTDGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEV 277 (381) T ss_pred -ccCCCceeeeecCCccccccccccccccccccccccchhhHHHHHHHHHHhhhhhhccccccccCceEEEEchhhHHhh Confidence 111112111 11111111111110000000 11122222222222221 1121 34467889999988888 Q ss_pred hcchhhhhhhhccccccccce-EEEE-eceEEEEecceecccccccccccccccccccccccccccccccccccceEEEE Q lcl|NC_015249. 217 LAALMPNAANYQALIDPSTGS-IRNV-MGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLF 294 (347) Q Consensus 217 l~~~~~~~~~~~~~~~~~~G~-Vg~i-~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~ 294 (347) +....+.+ .+|. +..+ .|.+|++++.+|... -.-+||+... 
+ T Consensus 278 ~~~~~~~~---------~~G~~v~~lp~g~~vv~~~~~p~~~-------------------------i~fGDfs~Y~--i 321 (381) T protein:vir:10 278 QAQYTHLN---------ANGVYVTALPFNLNVIESTVQEAGK-------------------------VLTYVKGLYD--G 321 (381) T ss_pred ccccccCC---------CCCceeecCCCCceeEEcCCCCcCc-------------------------EEEEEcccEE--E Confidence 65443221 1222 1111 477899999888421 1224666532 2 Q ss_pred echhhhhhhhhcceeeeeeechhhh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 295 NHRSAVGTVKLKDMALERARRANFQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 295 ~~~~Av~~v~~~~~~~e~~~d~~~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +-+. .++++...+.... ...+++++++++++++|++++++.++.. T Consensus 322 ~~r~--------~~~i~~~~~~~~~~d~~~f~a~~r~dG~~~~~~A~~v~~l~~~ 368 (381) T protein:vir:10 322 YLAG--------GINVQKFKETLALDDMDLYTAKQFAYGKAKDNKVAAVWKLDLK 368 (381) T ss_pred EEec--------ccEEEeechhhhhcCceEEEEEEEEcCEEecCCcEEEEEEeec Confidence 3232 3455554333222 2368899999999999999999888755 No 168 >protein:vir:95875 Length: 401 # NCBI annotation: major coat protein # Family: family:all:10944 # MgeID: mge:1586 # MgeName: N4 # Cross-refs: genbank:acc:YP_950534;genbank:gi:119952248;genbank:GeneID:5075702 Probab=98.44 E-value=8.2e-08 Score=59.44 Aligned_cols=322 Identities=12% Similarity=0.048 Sum_probs=167.7 Q ss_pred CCccccccccc-ccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccc--cccceEEEeecCcc-ee-eeeec Q lcl|NC_015249. 1 MAKMNGGQQIG-KDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSI--QSGKSAQFPVLGRT-KA-AYLQP 75 (347) Q Consensus 1 ma~~~~~~~~~-t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i--~~G~tv~i~~iG~~-~~-~~~~~ 75 (347) |-+-+--.+.. +.+ + ++.++ .+...=|...++..-.+.-++..+-.++++ .+|+|+.|.+--.- .. .-.+. 
T Consensus 1 ~~~~~a~~~~~~~s~-~-g~~~~--~~~t~y~~~k~L~~Aa~~lv~~~fA~~~piPkn~GkTIk~r~y~pl~~~~~pl~e 76 (401) T protein:vir:95 1 MLNYNAPTDGQKSSI-D-GANSD--QMQTFFWLKKAIITARKEQYFMPLASVTNMPKHYGKTIKVYEYVPLLDDRNINDQ 76 (401) T ss_pred CCccCCCcccccccc-c-ccccc--eeeehhhHHHHHhhhhhhhhhhhcccccccccccCCeEEEEecccccccccchhc Confidence 54443111100 011 1 12222 123333444444333334555555555554 46999999754322 11 11122 Q ss_pred CCCCCCc------------------------------c--CCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHH Q lcl|NC_015249. 76 GENLDDK------------------------------R--KDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTA 123 (347) Q Consensus 76 g~~~~~~------------------------------~--~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~ 123 (347) |.+..+. + ..++-.++...|-|+-.|..+-|.++..-....+...++. T Consensus 77 Gv~a~G~~~~~g~~y~~~rdv~~it~~m~~~t~~~~rvn~v~~~~~d~~g~l~qyG~~~e~Td~~~dt~~D~~l~~h~s~ 156 (401) T protein:vir:95 77 GIDASGATIVNGNLYGSSKDIGNITSKLPLLTENGGRVNRVGFTRIAREGSIHKFGFFYEFTQESIDFDSDDGLMEHLSR 156 (401) T ss_pred CCCcccccccCccccccccccceeecccccccccccccccccceeeeeeeeeeeccCccchhhhhhhhhcchHHHHHHHH Confidence 3222221 0 0122233455677877777766777766666667766655 Q ss_pred HHHHHH-HHHHHHHHHHHHHHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCC- Q lcl|NC_015249. 124 QLGESL-AMAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPS- 201 (347) Q Consensus 124 ~~g~aL-a~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~- 201 (347) ++-..= .+..|. +-+++.-++ ...+-. ++.++...+++. ..++. .-.++.|..+...|+++..|. 
T Consensus 157 ell~g~~~~t~d~-i~~dll~ag------~~viyA-g~ats~At~~~~-~~~~t----~vt~~~l~rl~~~L~~nRapk~ 223 (401) T protein:vir:95 157 ELMNGATQITEAV-LQKDLLAAA------GTVLYA-GAATSDATITGE-GSTPS----VVSYKNLMRLDQILTENRTPTQ 223 (401) T ss_pred HHhhhhhhhHHHH-HHHHHHhhc------CeeecC-Cccceeeecccc-ccccc----eechhHHHHHHHHHHhcccccc Confidence 444332 333443 333332111 000000 011111111111 11111 223677888999999977776 Q ss_pred ----------------CCCEEEeCH------HHHHHHhcchhhhhh-hhccccccccceEEEEeceEEEEecceeccccc Q lcl|NC_015249. 202 ----------------ADRVFYTTP------DNYSAILAALMPNAA-NYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAG 258 (347) Q Consensus 202 ----------------~gR~~vv~P------~~~~~Ll~~~~~~~~-~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~ 258 (347) .-|+.++.| .....|+.++.|+.. .|...+...+|.||++.+|++++++.+-.-... T Consensus 224 t~~i~~s~~~dTk~i~~s~va~~h~~L~~di~a~~D~~~~~~fi~v~kYa~~~~i~~gEiG~i~~vR~i~~p~~~~w~~a 303 (401) T protein:vir:95 224 TTIITGSRMIDTKVIGATRVMYVGSELVPELKAMKDLFGNKAFIETQHYADAGTIMNGEVGSIDKFRIIQVPEMLHWAGA 303 (401) T ss_pred hhhhhhhhccCccccccceEEEEecCchhHHHHHHHhcCCCCceehhhcCCccccccccccccCceeEEecccceeecCC Confidence 137888888 444677788889875 577788899999999999999998875422211 Q ss_pred cccccccccccccccccccc-ccccccccccceEEEEechhhhhhhhhcceee----e----ee-------echhhhcce Q lcl|NC_015249. 259 EDRPEEGANPTGQKHAFPET-SSGDTRVALDNVVGLFNHRSAVGTVKLKDMAL----E----RA-------RRANFQADQ 322 (347) Q Consensus 259 ~~~~~~~~~~~~~~~~~~~~-~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~----e----~~-------~d~~~~~d~ 322 (347) ..... +.+ ...+.+.. ..+.|+ -.-+||+-++|.+++.++..-. + .- -||.-|.=. T Consensus 304 g~~a~-~~~---~~y~~~~~~~gg~~d----Vyp~lV~G~dAf~~~~l~g~g~~~~~~~ivk~pG~~~ad~~DPlgQ~g~ 375 (401) T protein:vir:95 304 GAQAT-GAN---PGYRTSMVSGQEHYD----VYPMLVVGDDSFTSIGFQTDGKSLKFTVMTKMPGKETADRNDPYGETGF 375 (401) T ss_pred ccccc-ccc---cccccccccCCCcce----eeeeeEEccccceecccccCCccccceeEeecCCcCCCCCCCcccceeh Confidence 10000 000 00011111 111222 2457888899998887765421 1 11 355666666 Q ss_pred eeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
323 IIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 323 i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +-=++.+++.+|||+.-+-|++.+= T Consensus 376 vgwK~~~a~~vL~~e~m~~ies~a~ 400 (401) T protein:vir:95 376 SSIKWYYGILVKRPERLALIKTVAP 400 (401) T ss_pred hhhhhhhhhheeccceeEEEEeecC Confidence 6677888999999999998887666 No 169 >protein:vir:9509 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:170 # MgeName: phiN315 # Cross-refs: genbank:acc:NP_835556;genbank:gi:30043951;genbank:GeneID:1260537 Probab=98.42 E-value=2e-07 Score=57.32 Aligned_cols=285 Identities=14% Similarity=0.060 Sum_probs=146.7 Q ss_pred CCccccc----------ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-Ccce Q lcl|NC_015249. 1 MAKMNGG----------QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTK 69 (347) Q Consensus 1 ma~~~~~----------~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~ 69 (347) +....+- ....+ +- .+.|. .|.-+++..++.+.-...|.++.+.++.++. |+ ..|++. +... T Consensus 57 ~~~~~~~~lt~~e~~~~~~~~~--~~-~~~gg--~lvP~~~~~~I~~~l~~~s~i~~~~~v~~~~-~~-~~i~~~~~~~~ 129 (381) T protein:vir:95 57 SLPKSAQSLSANQRSFFMDINK--NV-NYKEE--KLLPEETIDRIFEDLTTNHPLLADLGIKNAG-LR-LKFLKSETSGV 129 (381) T ss_pred HhccCcccccHHHHHHHHHHhc--cc-CCCCc--eecCHHHHHHHHHHHHhhccceeheeeEecC-cc-eEEEEecCCcc Confidence 1111100 00001 11 11121 2456999999999999999999998887754 44 466665 3344 Q ss_pred eeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 70 AAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 70 ~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) +.-..-+..+... 
.+++-.+++|..-++ +.-..|..-=-..+.+|+.+.+.++.+.++++..|++++.-- T Consensus 130 a~w~~e~~~~~~~-~~~~f~~i~l~~~kl-~~~~~is~elL~Ds~~~ie~~i~~~la~~~a~~~~~a~i~G~-------- 199 (381) T protein:vir:95 130 AVWGKIYGEIKGQ-LDAAFSEETAIQNKL-TAFVVLPKDLNDFGPAWIERFVRVQIEEAFAVALETAFLKGT-------- 199 (381) T ss_pred eeeeccccccccc-ccccceeeeecceeE-EeechhhHHHhhcCHHHHHHHHHHHHHHHHHHHhhheeEecc-------- Confidence 4443433333321 123344555554443 333344322223366789999999999999999998875311 Q ss_pred cccccccccc----Ccceeecccccccccch---hhhHHHHHHHHHHHHHHhhhc----C-CCCCCCEEEeCHHHHHHHh Q lcl|NC_015249. 150 ASDENIAGLG----KAHVLEVGKQSELRGDQ---VKLGQAIIAQLTLARAKLTGN----Y-VPSADRVFYTTPDNYSAIL 217 (347) Q Consensus 150 ~~~~~~~~~~----~g~~i~~~~~~~~~~~~---~~~~~~~~~~l~~a~~~Lde~----~-VP~~gR~~vv~P~~~~~Ll 217 (347) ....|.|.- .+.....+...+..... .......++.|..+...|... . .+..+-+++++|..+..|+ T Consensus 200 -G~~qP~Gil~~~~~~~~~~~g~~~~~~~~~t~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~l~ 278 (381) T protein:vir:95 200 -GKDQPIGLNRQVQKGVSVTEGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQ 278 (381) T ss_pred -CCCCceeeeeccCcccccccccccccccccccccccchhhHHHHHHHHHhhccccccccccccCceEEEEccccHHhhc Confidence 011111110 00011111000000000 011222344454444444322 2 2344567789999888876 Q ss_pred cchhhhhhhhccccccccceEEEE--eceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEe Q lcl|NC_015249. 218 AALMPNAANYQALIDPSTGSIRNV--MGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFN 295 (347) Q Consensus 218 ~~~~~~~~~~~~~~~~~~G~Vg~i--~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~ 295 (347) .-....+ .+|..... .|.+|++|+.+|... -+-+||+... 
++ T Consensus 279 ~~~~~~~---------~~G~~v~~l~~g~~vv~s~~~p~~~-------------------------iifgDfs~Y~--i~ 322 (381) T protein:vir:95 279 AQYTHLN---------ANGVYVTALPFNLNVIESTVQEAGK-------------------------VLTYVKGLYD--GY 322 (381) T ss_pred cccccCC---------CCCceeecCCCCceEEecCCCCcCc-------------------------EEEEecccEE--EE Confidence 4332211 23332223 366789998887321 1224555532 23 Q ss_pred chhhhhhhhhcceeeeeeechhhh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 296 HRSAVGTVKLKDMALERARRANFQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 296 ~~~Av~~v~~~~~~~e~~~d~~~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) -+. .++++...+.... ...+++.+++++.+++|++.+++.++.. T Consensus 323 ~r~--------~~~i~~~~~~~~~~d~~~f~a~~r~dg~~~~~~A~~v~~l~~~ 368 (381) T protein:vir:95 323 LAG--------GINVQKFKETLALDDMDLYTAKQFAYGKAKDNKVAAVWKLDLK 368 (381) T ss_pred Eec--------ccEEEeechhHhhcCCeEEEEEEEEcCEEecCceEEEEEEEec Confidence 233 3455554332222 2368899999999999999999888776 No 170 >protein:vir:101291 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:1591 # MgeName: phiNM3 # Cross-refs: genbank:acc:YP_908831;genbank:gi:118725095;genbank:GeneID:4555862 Probab=98.42 E-value=2e-07 Score=57.32 Aligned_cols=285 Identities=14% Similarity=0.060 Sum_probs=146.7 Q ss_pred CCccccc----------ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-Ccce Q lcl|NC_015249. 1 MAKMNGG----------QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTK 69 (347) Q Consensus 1 ma~~~~~----------~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~ 69 (347) +....+- ....+ +- .+.|. .|.-+++..++.+.-...|.++.+.++.++. |+ ..|++. +... 
T Consensus 57 ~~~~~~~~lt~~e~~~~~~~~~--~~-~~~gg--~lvP~~~~~~I~~~l~~~s~i~~~~~v~~~~-~~-~~i~~~~~~~~ 129 (381) T protein:vir:10 57 SLPKSAQSLSANQRSFFMDINK--NV-NYKEE--KLLPEETIDRIFEDLTTNHPLLADLGIKNAG-LR-LKFLKSETSGV 129 (381) T ss_pred HhccCcccccHHHHHHHHHHhc--cc-CCCCc--eecCHHHHHHHHHHHHhhccceeheeeEecC-cc-eEEEEecCCcc Confidence 1111100 00001 11 11121 2456999999999999999999998887754 44 466665 3344 Q ss_pred eeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 70 AAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 70 ~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) +.-..-+..+... .+++-.+++|..-++ +.-..|..-=-..+.+|+.+.+.++.+.++++..|++++.-- T Consensus 130 a~w~~e~~~~~~~-~~~~f~~i~l~~~kl-~~~~~is~elL~Ds~~~ie~~i~~~la~~~a~~~~~a~i~G~-------- 199 (381) T protein:vir:10 130 AVWGKIYGEIKGQ-LDAAFSEETAIQNKL-TAFVVLPKDLNDFGPAWIERFVRVQIEEAFAVALETAFLKGT-------- 199 (381) T ss_pred eeeeccccccccc-ccccceeeeecceeE-EeechhhHHHhhcCHHHHHHHHHHHHHHHHHHHhhheeEecc-------- Confidence 4443433333321 123344555554443 333344322223366789999999999999999998875311 Q ss_pred cccccccccc----Ccceeecccccccccch---hhhHHHHHHHHHHHHHHhhhc----C-CCCCCCEEEeCHHHHHHHh Q lcl|NC_015249. 150 ASDENIAGLG----KAHVLEVGKQSELRGDQ---VKLGQAIIAQLTLARAKLTGN----Y-VPSADRVFYTTPDNYSAIL 217 (347) Q Consensus 150 ~~~~~~~~~~----~g~~i~~~~~~~~~~~~---~~~~~~~~~~l~~a~~~Lde~----~-VP~~gR~~vv~P~~~~~Ll 217 (347) ....|.|.- .+.....+...+..... .......++.|..+...|... . 
.+..+-+++++|..+..|+ T Consensus 200 -G~~qP~Gil~~~~~~~~~~~g~~~~~~~~~t~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~l~ 278 (381) T protein:vir:10 200 -GKDQPIGLNRQVQKGVSVTEGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQ 278 (381) T ss_pred -CCCCceeeeeccCcccccccccccccccccccccccchhhHHHHHHHHHhhccccccccccccCceEEEEccccHHhhc Confidence 011111110 00011111000000000 011222344454444444322 2 2344567789999888876 Q ss_pred cchhhhhhhhccccccccceEEEE--eceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEe Q lcl|NC_015249. 218 AALMPNAANYQALIDPSTGSIRNV--MGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFN 295 (347) Q Consensus 218 ~~~~~~~~~~~~~~~~~~G~Vg~i--~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~ 295 (347) .-....+ .+|..... .|.+|++|+.+|... -+-+||+... ++ T Consensus 279 ~~~~~~~---------~~G~~v~~l~~g~~vv~s~~~p~~~-------------------------iifgDfs~Y~--i~ 322 (381) T protein:vir:10 279 AQYTHLN---------ANGVYVTALPFNLNVIESTVQEAGK-------------------------VLTYVKGLYD--GY 322 (381) T ss_pred cccccCC---------CCCceeecCCCCceEEecCCCCcCc-------------------------EEEEecccEE--EE Confidence 4332211 23332223 366789998887321 1224555532 23 Q ss_pred chhhhhhhhhcceeeeeeechhhh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 296 HRSAVGTVKLKDMALERARRANFQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 296 ~~~Av~~v~~~~~~~e~~~d~~~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) -+. .++++...+.... ...+++.+++++.+++|++.+++.++.. 
T Consensus 323 ~r~--------~~~i~~~~~~~~~~d~~~f~a~~r~dg~~~~~~A~~v~~l~~~ 368 (381) T protein:vir:10 323 LAG--------GINVQKFKETLALDDMDLYTAKQFAYGKAKDNKVAAVWKLDLK 368 (381) T ss_pred Eec--------ccEEEeechhHhhcCCeEEEEEEEEcCEEecCceEEEEEEEec Confidence 233 3455554332222 2368899999999999999999888776 No 171 >protein:vir:108211 Length: 318 # NCBI annotation: gp9 # Family: family:all:6420 # MgeID: mge:2004 # MgeName: Giles # Cross-refs: genbank:acc:YP_001552338;genbank:gi:160700658;genbank:GeneID:5758931 Probab=98.42 E-value=9.1e-08 Score=59.19 Aligned_cols=290 Identities=11% Similarity=0.005 Sum_probs=151.5 Q ss_pred CCcccccccccccccccccccchhhhhh--hhhhhHHHHHHHHHHhhhc-ccccccccccceEEE----eecCcceeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL--KVFGGEVLTAFTRTSVTMN-KHLVRSIQSGKSAQF----PVLGRTKAAYL 73 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i--e~f~g~V~~~f~~~s~~~~-~~~~r~i~~G~tv~i----~~iG~~~~~~~ 73 (347) |.+-++-- ..+.++.-. ++ -+| ..|-......-.+.....+ +.+.-.-+++-++.| |.......... T Consensus 1 ~~~~~~i~----s~~~~~~it-v~-~ll~~P~~I~~~i~e~~~~~~iad~lf~~~~a~~~~~v~f~~~~p~~~~~d~e~V 74 (318) T protein:vir:10 1 MTAPTGIV----SVSDGPAIT-VR-ELVGNPLWIPTALKKMMVNQFISESLFRNGGANPNGVVAYNEGNPSFLEDDVADV 74 (318) T ss_pred CCCCCcce----eeecCCcee-hH-HhhCCchhHHHHHHHHHhccchhhhhhhcccccccceeEEEecccccccCcHhhc Confidence 66654321 222222100 11 011 1222121122222222222 222222345667887 44555677788 Q ss_pred ecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 74 QPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 74 ~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) .+|.+++.. +..+.+..+-.-+..--.+.|.|=-...+..|+.....++++.+++++.|+.++..+..+. + +... 
T Consensus 75 aEggEiP~~--~~~~G~~~ia~~~K~G~~~~vS~Em~~~n~~~~v~r~~~~l~Nti~r~~d~~a~dal~sa~--t-~~~~ 149 (318) T protein:vir:10 75 AEFGEIPVS--AGARGLPRTAFAVKKALGVRVSKEMIDENRVGAVNDQMLQLRNTFIRANDRSAKALLQSPI--V-PTLA 149 (318) T ss_pred cCccccccc--CCCCCchhhhhhehhccceeccHHHHhhcChhHHHHHHHHHHHHHHHHHHHHHHHHHhccc--c-cccc Confidence 899998764 3444444442222335667888888888999999999999999999999998876542211 0 0000 Q ss_pred ccccccCcceeecccccccccchhhhHHHHHHHHHHH---HHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccc Q lcl|NC_015249. 154 NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLA---RAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQAL 230 (347) Q Consensus 154 ~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a---~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~ 230 (347) ..+.-.++..... +...... ..+.....+..+ .+.+..-..| -.+|+.|..|..|++++.+... |.+. T Consensus 150 ~s~~w~~~~~~~~----d~~~A~e-~v~~a~~~~~~a~~~~~~~~~GY~p---dtIVlhP~~~~~l~~n~~~~~~-y~~~ 220 (318) T protein:vir:10 150 VPTAWDNGGKVRT----DIAIAIE-QISTAAPTAYPAGVGSSDEYFGFIP---DTIVMHYALLPILMDNENFMKV-YERN 220 (318) T ss_pred CCcCCCCcccccc----cchhhhh-hhhhhhhhhhhhhhhhhhhccCccc---eeeEECHHHHHHHhcchhhhhh-hhcc Confidence 0000000000000 0000000 000000011111 1111111223 3899999999999999876532 2211 Q ss_pred cc------cccceE-EEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhh Q lcl|NC_015249. 231 ID------PSTGSI-RNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTV 303 (347) Q Consensus 231 ~~------~~~G~V-g~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v 303 (347) .. -..|.+ ++++|++|+.|+++|... ++++++..+|+- T Consensus 221 a~~~~~~~~~tg~~~g~~lGl~vi~s~~~p~~~-----------------------------------alvlq~g~vG~~ 265 (318) T protein:vir:10 221 ANYVSTAPDWTGNFPGSVMGLNVIRSRTFPIDR-----------------------------------VLIMERGTVGFY 265 (318) T ss_pred chhhhhcccccccccceeeceEEeecCccCCCe-----------------------------------eEEEecCCccee Confidence 11 113333 678999999999999421 234444444432 Q ss_pred -hhcceeeeeeech-------hhhcceeeeeeeecccccccceEEEEE---Ec Q lcl|NC_015249. 
304 -KLKDMALERARRA-------NFQADQIIAKYAMGHGGLRPEACGALV---FN 345 (347) Q Consensus 304 -~~~~~~~e~~~d~-------~~~~d~i~~~~a~G~~~~Rpe~a~~i~---~~ 345 (347) -..+++.+.+|.. .-..|.++..+....++.+|-++.-|+ ++ T Consensus 266 ~d~~pl~~t~~~~egg~~~g~~~~s~~~~~~~~~~~~V~~PkA~~~itgi~~~ 318 (318) T protein:vir:10 266 SDTRPLQFTALYPEGNGPNGGPTESYRADASHKRALAVDQPKAALWLTGIVTP 318 (318) T ss_pred eccccceeeecccCCCCCCCCcchhhheehheeeeeeeeCcceeEEEeeccCC Confidence 2345666777743 456699999999999999999776543 33 No 172 >protein:vir:95963 Length: 395 # NCBI annotation: ORF009 # Family: family:all:635 # MgeID: mge:1594 # MgeName: 2638A # Cross-refs: genbank:acc:YP_239802;genbank:gi:66395459;genbank:GeneID:5132880 Probab=98.38 E-value=2.4e-07 Score=56.87 Aligned_cols=285 Identities=12% Similarity=0.064 Sum_probs=141.2 Q ss_pred CCccc----cccccc----------ccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecC Q lcl|NC_015249. 1 MAKMN----GGQQIG----------KDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLG 66 (347) Q Consensus 1 ma~~~----~~~~~~----------t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG 66 (347) +.... .+.+.. -+.+-+.++| .|.-+.+..++.+...+.|.++.++++.++. |+ +.|++.. T Consensus 61 ~~~~~~~~~r~~~~l~~ee~~~~~~~~~~t~~~gG---~liP~~~~~~Ii~~l~~~s~i~~~~~v~~~~-~~-~~i~~~~ 135 (395) T protein:vir:95 61 VVDNGILAKRSQDPLTSEERKFFNDINYDVGYTDE---KILPETVVERVFDDLQKDHPLLSKINFQNAG-IK-TRVIKAD 135 (395) T ss_pred HHHHHHHhhcCccccchHHHHHHHHHhhccCCCCc---eeccHHHHHHHHHHHHhhhhhhhhceeEecC-Cc-eEEEEec Confidence 00000 000000 0111111112 1445899999999999999999999887764 44 5677654 Q ss_pred cc-eeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015249. 67 RT-KAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLC 145 (347) Q Consensus 67 ~~-~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a 145 (347) .. .+.....+..+... 
.+++-+++++..-++ +.-..|.+-=-..+.+|+-+.+.++.++++++..|++++.-- T Consensus 136 ~~~~a~w~~e~~~~~~~-~~~~f~~i~l~~~kl-~~~~~iS~ell~ds~~~ie~~i~~~la~~ia~~~~~a~i~G~---- 209 (395) T protein:vir:95 136 PAGQAVWGKVFGEIKGQ-LDAAFREENFTQYKL-TCFVVLPDDLSTFGPAWIERFVRTQIQEAISVALESAIINGG---- 209 (395) T ss_pred CCcceEEeecccccCcc-ccccceeeeeceeeE-EEeecccHHHHhcchhHHHHHHHHHHHHHHHHHHhhheeecc---- Confidence 43 44333322333221 234445555555333 333444322233466889999999999999999998775210 Q ss_pred hhccccccccccccCcceeecccccc-----cccc--hhhhHHHHHHHHHHHHHHhhh----cC-CCCCCCEEEeCHHHH Q lcl|NC_015249. 146 NLPSASDENIAGLGKAHVLEVGKQSE-----LRGD--QVKLGQAIIAQLTLARAKLTG----NY-VPSADRVFYTTPDNY 213 (347) Q Consensus 146 ~~~~~~~~~~~~~~~g~~i~~~~~~~-----~~~~--~~~~~~~~~~~l~~a~~~Lde----~~-VP~~gR~~vv~P~~~ 213 (347) ...+ ..|.|.+-.....+. ..+. ........+..+.++...+.- .. ........+++|..+ T Consensus 210 -----G~~~--~qP~Gil~~~~~~~~~~~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~mn~~t~ 282 (395) T protein:vir:95 210 -----GAAK--TQPVGLMKDVNTNSGAVTDKASSGTLTFADADTTILELNDVLKNLSVDEKGKELKIDGKVALVVNPRDS 282 (395) T ss_pred -----CCCC--cCceeeeecccccccccccccccchhhhhhhHhhHHHHHHHHHhhccccccchhhhcCceEEEEcchhh Confidence 0000 001111110000000 0000 001111223333333333211 01 111233457788766 Q ss_pred HHHhcchhhhhhhhccccccccceEEEEe--ceEEEEecceecccccccccccccccccccccccccccccccccccceE Q lcl|NC_015249. 214 SAILAALMPNAANYQALIDPSTGSIRNVM--GFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVV 291 (347) Q Consensus 214 ~~Ll~~~~~~~~~~~~~~~~~~G~Vg~i~--G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~ 291 (347) ..+.... + ++ ..+|...++. |.+|++|+.+|... -.-+||+.. 
T Consensus 283 ~~~~g~~--~---~~----~~~G~~~~~lg~g~~v~~~~~~p~~~-------------------------i~fgdfs~y- 327 (395) T protein:vir:95 283 WDVQARY--T---YL----TANGGFVTVLPYNVTIITSEFVPEGK-------------------------LVAFVTDRY- 327 (395) T ss_pred hhcCCcc--e---ec----cCCCcceeccCCcceEEEcCCCCCCc-------------------------EEEEecccE- Confidence 5443211 1 11 1345555665 55689999998321 012355552 Q ss_pred EEEechhhhhhhhhcceeeeeeechh--hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 292 GLFNHRSAVGTVKLKDMALERARRAN--FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 292 ~l~~~~~Av~~v~~~~~~~e~~~d~~--~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++-+ .+++++...+.. +-...+++..++|+++++|++.+.+.+..+ T Consensus 328 -~i~~r--------~~~~i~~~~~~~~~~d~~~f~~~~r~dg~~~~~~A~~~l~i~~~ 376 (395) T protein:vir:95 328 -NAVRG--------GGLTVKKFDQTLALEDAVLFTAKTFAYGQPDDNKASAVYDLKVA 376 (395) T ss_pred -EEEEe--------cceEEEeccchhhhCCcEEEEEEEEECCEEeccccEEEEEeecc Confidence 22222 234444443322 223558899999999999999999999877 No 173 >protein:vir:78350 Length: 383 # NCBI annotation: Cps # Family: family:all:635 # MgeID: mge:1850 # MgeName: B025 # Cross-refs: genbank:acc:YP_001468644;genbank:gi:157325222;genbank:GeneID:5601696 Probab=98.34 E-value=2.7e-07 Score=56.59 Aligned_cols=285 Identities=13% Similarity=0.062 Sum_probs=136.7 Q ss_pred CCccc---------cc-ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcc-e Q lcl|NC_015249. 1 MAKMN---------GG-QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRT-K 69 (347) Q Consensus 1 ma~~~---------~~-~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~-~ 69 (347) +.... .. +.+.+.. .++|. .|.-++|..++.+...+.|.++.++++.++ +|+ .+|++.... . 
T Consensus 64 ~~~~g~~~lt~~e~~~~~~~~~~~---~~~gg--~lvP~~~~~~I~~~l~~~s~l~~~~~v~~~-~~~-~~i~~~~~~~~ 136 (383) T protein:vir:78 64 SASRTDKNITNEEIKFFNDINKEV---GYKEE--TLLPQTVVDEIFEDLTTEHPFLASIGMRTT-GLR-TKFLKSETSGV 136 (383) T ss_pred HhcCChhhhhHHHHHHHHHHhccC---CCCCc--cccCHHHHHHHHHHHHhhccceeeeeeEec-CCc-eEEEEEcCCcc Confidence 00000 00 0111111 11111 245699999999999999999999988875 455 467776544 3 Q ss_pred eeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 70 AAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 70 ~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) +.....+.++... .+++-.+++|..-++ +.-+.|..-=-..+.+|+.+.+.++.+.++++..|++++.-- T Consensus 137 a~w~~e~~~~~~~-~~~~f~~i~l~~~kl-~~~i~is~ell~Ds~~~ie~~i~~~l~~~~a~~~~~a~i~G~-------- 206 (383) T protein:vir:78 137 AVWGKIFGEIKGQ-LDATFSDEESIQNKL-TAFVVVPKDLEKFGPAWVKRFVVTQIEEAFAVALESAYIVGD-------- 206 (383) T ss_pred eEEeecccccccc-cCcceeeEeecceee-EeeccchHHHhhccHHHHHHHHHHHHHHHHHHHHhhheEecc-------- Confidence 3333333333221 234445556655444 444445322223356789999999999999999999875211 Q ss_pred cccccccccc----Ccceeecccccccccchhhh---HHHHHHHHHHHH---HHhhhcCC-CCCC-CEEEeCHHHHHHHh Q lcl|NC_015249. 150 ASDENIAGLG----KAHVLEVGKQSELRGDQVKL---GQAIIAQLTLAR---AKLTGNYV-PSAD-RVFYTTPDNYSAIL 217 (347) Q Consensus 150 ~~~~~~~~~~----~g~~i~~~~~~~~~~~~~~~---~~~~~~~l~~a~---~~Lde~~V-P~~g-R~~vv~P~~~~~Ll 217 (347) ....|.|.- .......+...+........ ....++.+.... ..+..... 
...+ ...+++|.-|+.++ T Consensus 207 -G~~qP~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~n~~~~~~~~ 285 (383) T protein:vir:78 207 -GNDKPIGLNRKVGKGSTVVDGVYAEKAATGTLTFANPKTTVNELTDVYKYHSVKENGHPLNVAGKVTLLVNPTDAWDVK 285 (383) T ss_pred -CCCCceeeeeccCCcccccccccccccccchhhhhhhHHHHHHHHHHHhccchhcccchhhhcCceEEEEcCcchhhhc Confidence 011111110 00000000000000000000 111111111111 11111000 0111 23456665444443 Q ss_pred cchhhhhhhhccccccccceEEEEece--EEEEecceecccccccccccccccccccccccccccccccccccceEEEEe Q lcl|NC_015249. 218 AALMPNAANYQALIDPSTGSIRNVMGF--EVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFN 295 (347) Q Consensus 218 ~~~~~~~~~~~~~~~~~~G~Vg~i~G~--~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~ 295 (347) ...... ..+|....+.|+ +|++|+.+|... -.-+||+... +. T Consensus 286 ~~~~~~---------~~~G~~~t~l~~~~~iv~s~~~p~~~-------------------------iifgdfs~Y~--i~ 329 (383) T protein:vir:78 286 KQYTSL---------NANGVYVTALPFNLNIIESLFVPEKK-------------------------AISYVAERYD--AL 329 (383) T ss_pred cchhcc---------CCCCceeeecCCCceEEecCCCCccc-------------------------EEEeeccceE--EE Confidence 211110 123444455544 588888887421 0113444422 22 Q ss_pred chhhhhhhhhcceeeeeeechhhh--cceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 296 HRSAVGTVKLKDMALERARRANFQ--ADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 296 ~~~Av~~v~~~~~~~e~~~d~~~~--~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ..+++.++.+.+..+. 
...+++.+++++++++|++.+++.++-+ T Consensus 330 --------~r~~~~i~~~~~~~f~~d~~~f~~~~r~dG~~~~~~A~~vl~~~~~ 375 (383) T protein:vir:78 330 --------IGGPLDIGTYDQTLAIEDLNLYAAKQFAYGKAKDDKAAAVWTLNIN 375 (383) T ss_pred --------ecccceEEecchhhhhcCceEEEEEEEEcCEEecCCeEEEEEEEec Confidence 2234556554332222 2568999999999999999999887777 No 174 >protein:vir:106647 Length: 303 # NCBI annotation: ORF011 # Family: family:all:1178 # MgeID: mge:1557 # MgeName: 187 # Cross-refs: genbank:acc:YP_239493;genbank:gi:66395226;genbank:GeneID:4555801 Probab=98.26 E-value=3.6e-07 Score=55.90 Aligned_cols=272 Identities=12% Similarity=0.080 Sum_probs=146.1 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecC----cceeeeeecC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLG----RTKAAYLQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG----~~~~~~~~~g 76 (347) |+--..=- .++..+ .+-..+ |+++|+.-+.+-++ .++..|...+..|.+++.++.- ....++...| T Consensus 1 M~~e~nl~-~~~dL~---~a~siD--F~~~f~~~i~~L~~----~LGv~r~~pla~Gt~iktyK~~~~~y~gda~dVaEG 70 (303) T protein:vir:10 1 MSAENNLI-NVEALG---KAKSID--FANKLGVGLNKLFE----ALAIQNKIPMNVGSALKQYRFKVEDSEKPNGDVAEG 70 (303) T ss_pred CCCCcCCc-chhhcc---cceeeh--hhhhhhhhHHHHHH----HhhhhccccccCCceeeeeeeeceeeccccccccCC Confidence 55443100 112332 222344 99999988876654 4445555556667777665432 1233566788 Q ss_pred CCCCCccCCCCC---ceEEEEEEeeeecccccccHHHH-H-hChh-hHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 77 ENLDDKRKDMKH---TERTINIDGLLTADVLIYDIEDA-M-NHYD-VRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 77 ~~~~~~~~~~~~---~~~~l~ID~~~~~~~~Idd~D~~-q-~~~D-~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) +.||-+ .+.. ...++++.++. . .+ =||+ | .-|+ ...+.-+++..++++.+|..++..+-.+. 
T Consensus 71 e~Ipls--kvt~~~~~t~~~~~kK~r--K-~t--TdEAIqlsGyg~aVgetd~qL~~~Iq~kIdnd~~~~lktaT----- 138 (303) T protein:vir:10 71 DVIPLT--KVTREQVDITELQFAKYR--K-ST--SAEAIQAHGYDLAINQTDNEMIKYVQKKFRAKFFETLKSAI----- 138 (303) T ss_pred cccchh--hheeeecceEEEEeeccc--c-cc--cHHHHHhhcCCchhHHHHHHHHHHHHhhhhHHHHHHHhhcc----- Confidence 888764 3442 34677776543 3 33 3444 4 4444 89999999999999999999987653210 Q ss_pred cccccccccCcceeecccccccccchhhhHHHHHHHHHHHHH-------HhhhcCCCCCCCEEEeCHHHHHHHhcchhhh Q lcl|NC_015249. 151 SDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARA-------KLTGNYVPSADRVFYTTPDNYSAILAALMPN 223 (347) Q Consensus 151 ~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~-------~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~ 223 (347) ++ ...+ ...... ++.|-+|.. .++|.++ .-+++|+|.-.+.+|.+.... T Consensus 139 ----------~t------~~~t-~~t~~s----~~glq~Al~~~~~kl~~~~ed~~---~~V~FvNP~Daa~yl~~A~i~ 194 (303) T protein:vir:10 139 ----------EN------GKRT-NKTKLS----AENLQGALSKGRANLSVLLDDEI---TPIAFVNPNDTAEYLANGFIN 194 (303) T ss_pred ----------cc------cccc-cceeec----HHHHHHHHHhhhhhccccccccc---cEEEEEchHHHHHHhhcCCcc Confidence 00 0000 000011 222333323 2344443 359999999999999887765 Q ss_pred hh-hhccccccccceEEEEeceEEEEecceeccccccccc----ccccccccccccccccccccccccccceEEEEechh Q lcl|NC_015249. 224 AA-NYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRP----EEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRS 298 (347) Q Consensus 224 ~~-~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~----~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~ 298 (347) .. .--|...+. ++.|+.|+.|+.+|.+..-.+.. .+-.+..| ..+ ..=.|..|.+..+|+. T Consensus 195 ~~~t~fG~n~L~-----nfLG~~II~S~kv~~G~~~~T~~~Ni~~ay~~~~g---~l~--~~f~~t~D~tglIGv~---- 260 (303) T protein:vir:10 195 STGAQFGVNLLT-----PYVGVKIVEFADVPQGEVWMTVAENLNVAYANPRG---ELS--RAFAFATDATGFVGVL---- 260 (303) T ss_pred hhhhhhhhhhhh-----hhhcceEEEeccCCCceEEEeeccceEEEEecCch---hhh--hhhhhccccccceEEE---- Confidence 33 223444454 49999999999999765332111 00001111 000 0002333444444432 Q ss_pred hhhhhhhcceeeeeeechhhhcceeeeeeeecc--cccccceEEEEEEcCC Q lcl|NC_015249. 
299 AVGTVKLKDMALERARRANFQADQIIAKYAMGH--GGLRPEACGALVFNKA 347 (347) Q Consensus 299 Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~--~~~Rpe~a~~i~~~~a 347 (347) .++.++.=-+...+.+|. =+-|+|+.+...+.++ T Consensus 261 ---------------h~~~~~~~t~eT~~~~~~~lfpE~~dgiv~~ti~~~ 296 (303) T protein:vir:10 261 ---------------HDIQPQRLTSDTIYASAISMFPENIDAVIKVTIKKD 296 (303) T ss_pred ---------------eccccceeeehhHhHhHHHhcccccceEEEEEEecc Confidence 222222222222333333 3458999999999888 No 175 >protein:vir:98635 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:1601 # MgeName: phi3396 # Cross-refs: genbank:acc:YP_001039923;genbank:gi:126011098;genbank:GeneID:4818471 Probab=98.14 E-value=1.1e-06 Score=53.25 Aligned_cols=285 Identities=12% Similarity=0.016 Sum_probs=137.8 Q ss_pred CCcccc---------cccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEee-cCccee Q lcl|NC_015249. 1 MAKMNG---------GQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPV-LGRTKA 70 (347) Q Consensus 1 ma~~~~---------~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~-iG~~~~ 70 (347) +++... +-+-....| +.+.+ -.+.-+.|..++.+...+.|.++..+++.++. |+ +++++ -+..++ T Consensus 59 ~~~~~~~~lt~ee~~~~~~~~~~~-~~~~g--g~~vP~~~~~~I~~~l~~~s~i~~~~~v~~~~-~~-~~~~~~~~~~~a 133 (377) T protein:vir:98 59 DLRDKNRELTAEEIKFFNDIDKNV-GGKDK--FKLLPEETMVQVFDDLVAEHPLLKVINFKNTS-LR-LKALTAETSGTA 133 (377) T ss_pred HhccCCcccCHHHHHHHHHHHhcc-CCCCC--ccccCHHHHHHHHHHHHHhhhhhhheeeEecC-cc-eEEEEecCCcce Confidence 111100 000000111 11122 22455899999999999999999998888764 44 46664 354555 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEeeeeccc-ccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGLLTADV-LIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~-~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) .-...+..+..+ .+++-. 
.+++...++..+ .|..-=-..+.+|+-+.+.++.+.++++..|++++.- T Consensus 134 ~w~~e~~~~~~~-~~~~f~--~i~l~~~kl~a~~~is~elL~ds~~~ie~~i~~~la~~~a~~~~~a~i~G--------- 201 (377) T protein:vir:98 134 VWGDIFGEIKGQ-LKQAFK--EQDFSQFKLTAFVVIPKDALKFGPKWIKQFITEQLKEAIAVALELAIVKG--------- 201 (377) T ss_pred eEeecccccCcc-cCccce--eEeecceeEEeeecccHHhhhccHhHHHHHHHHHHHHHHHHHHhhceEec--------- Confidence 544444443321 123333 455555554443 3422112236678999999999999999999887521 Q ss_pred ccccccccccC---cceeecccccccccchhhhHHHHHHH-----------HHHHHHHhhhcCC----CCCCCEEE-eCH Q lcl|NC_015249. 150 ASDENIAGLGK---AHVLEVGKQSELRGDQVKLGQAIIAQ-----------LTLARAKLTGNYV----PSADRVFY-TTP 210 (347) Q Consensus 150 ~~~~~~~~~~~---g~~i~~~~~~~~~~~~~~~~~~~~~~-----------l~~a~~~Lde~~V----P~~gR~~v-v~P 210 (347) .....|.|.-. +..+.......+.+ .....+.+.+. ...++..+....+ -..||++. ++| T Consensus 202 ~G~~qP~Gil~~~~~~~~~~~~~~~~~~-~~~~~~~~~~l~~~~~~~~~~~a~~~m~~~t~~~~~klkd~~G~~i~~~n~ 280 (377) T protein:vir:98 202 DGLLQPVGLLKDLSQPTVDQSTGRDITT-YKTDKEAIADLSDLTPDNAPKKLVPVMKHLSVNDKKRPLKIAGQVKLILNP 280 (377) T ss_pred cCCCcceeeeeccccccccccccccccc-ccchhhhHhhhhhhchhHHHHHHHHHHHHHHHHHHhhhhccCCceEEEecc Confidence 11111111110 00000000000000 00000011000 0001111110000 12455543 566 Q ss_pred HHHHHHhcchhhhhhhhccccccccceEEEEece--EEEEecceeccccccccccccccccccccccccccccccccccc Q lcl|NC_015249. 211 DNYSAILAALMPNAANYQALIDPSTGSIRNVMGF--EVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALD 288 (347) Q Consensus 211 ~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~i~G~--~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~ 288 (347) .-|+.++..... ...+|.-..+.|+ .|++|+.+|.... 
.-+||+ T Consensus 281 ~~~~~~~p~~~~---------~~~~G~~~t~lg~p~~vv~s~~~p~~~i-------------------------~fgdf~ 326 (377) T protein:vir:98 281 EDRWALEAQFTS---------RNQFGEYVTVLPHGITILESLAVETGKA-------------------------IAFVAN 326 (377) T ss_pred cchhhccccccc---------cCCCCccccccCCCceEEecCCCCcccE-------------------------EEEEec Confidence 555555321111 1134544456665 4778888874210 113444 Q ss_pred ceEEEEechhhhhhhhhcceeeeeeechhh--hcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 289 NVVGLFNHRSAVGTVKLKDMALERARRANF--QADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 289 ~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~--~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ... ++ ...++.++...+... -...+++.++++++++.|++.+++.+..+ T Consensus 327 ~Y~--i~--------~r~~~~i~~~~~~~~~~d~~~f~~~~r~dg~~~~~~a~~vl~i~~~ 377 (377) T protein:vir:98 327 RYD--AF--------MATASTIEEYDQTFAMEDLQLYLTKNYFYGKAKDNHTAALLTLAGG 377 (377) T ss_pred cee--EE--------eecceEEEeechhhhhcCceEEEEEEEEcCEEeccCcEEEEEEecC Confidence 422 22 223455655533332 22558899999999999999999999999 No 176 >protein:vir:80128 Length: 466 # NCBI annotation: Phage capsid protein # Family: family:all:635 # MgeID: mge:1877 # MgeName: bacteriophage bv1 # Cross-refs: genbank:acc:YP_001425603;genbank:gi:155042936;genbank:GeneID:5469556 Probab=98.14 E-value=5.3e-07 Score=54.99 Aligned_cols=289 Identities=11% Similarity=0.109 Sum_probs=139.2 Q ss_pred CCccccc---------------ccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec Q lcl|NC_015249. 1 MAKMNGG---------------QQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL 65 (347) Q Consensus 1 ma~~~~~---------------~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i 65 (347) |..-+.. .+. 
.+...+ .++-..+.-+.+...+.......+.+++.+++..+.+ .+++++- T Consensus 123 ~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~--~~g~~~~vP~~~~~~i~~~l~~~~~l~~~~~v~~~~g--~~~~~~~ 197 (466) T protein:vir:80 123 MPYEQRAALIARSEVKEFLAQVRTL-AQQKRA--VSGAELTIPDVMLELLRDNMHRYSKLISKVRLRPLKG--TARQNIA 197 (466) T ss_pred hhhhhHHHHHHHHHHHHHHHHHHHH-hhhhhh--hccccccccHHHHHHHHHhhhhhhhhhhheeeeecCc--eeEeeee Confidence 0000000 000 001111 1111123447788888888877788888888887654 3455554 Q ss_pred Ccc-eeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 66 GRT-KAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKL 144 (347) Q Consensus 66 G~~-~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~ 144 (347) +.. .+.-...|..++. .++.-.++++.+.++ +.-+.|.+-=-..+.+|+-+.+.++.+++++...|+.|+.-- T Consensus 198 ~~~~~a~wv~E~~~~~~--~~~~f~~i~~~~~k~-~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~ail~G~--- 271 (466) T protein:vir:80 198 GAIPEGVWTEAVANLNE--LSLSFSQIEVDGYKV-GGFIPIPNSTLEDSDLNLADEILDAIGQAIGFALDKAILYGT--- 271 (466) T ss_pred cCCcceeeccccccccc--ccccccceeecceee-eeehhhhHHHHhcchHHHHHHHHHHHHHHHHHHHhhheeecc--- Confidence 433 3333445555543 234455566655554 233344322223456789999999999999999999876311 Q ss_pred hhhccccccccccccCc-ceeeccccc--------cccc-------chhhhHHHHHHHHHHHHHHhhhcCCCCCCC-EEE Q lcl|NC_015249. 145 CNLPSASDENIAGLGKA-HVLEVGKQS--------ELRG-------DQVKLGQAIIAQLTLARAKLTGNYVPSADR-VFY 207 (347) Q Consensus 145 a~~~~~~~~~~~~~~~g-~~i~~~~~~--------~~~~-------~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR-~~v 207 (347) ....|.|.-.. ......... +... +....+...+..++.+.. +.+... 
..++ +.+ T Consensus 272 ------G~~~P~Gil~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~-~~~~~~w~ 343 (466) T protein:vir:80 272 ------GTKMPVGIVTRLAQTTQPPNWGTKAPAWTNLSTTNLLKIDPTGKSAEEFFSELVLKLS-KARANY-SNGMKFWA 343 (466) T ss_pred ------CCCCcceeeecccccccccccccccccccccchhhhhhhhhhccchhhHHHHHHHHHH-hhhccc-cCCceeEE Confidence 00111111000 000000000 0000 000011111212211111 111121 2233 356 Q ss_pred eCHHHHHHHhcchhhhh--hhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccc Q lcl|NC_015249. 208 TTPDNYSAILAALMPNA--ANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRV 285 (347) Q Consensus 208 v~P~~~~~Ll~~~~~~~--~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~ 285 (347) +++..+..|+.-.-..+ ..+... ..++ ..+.|.+|+.|+++|... -+-+ T Consensus 344 ~~~~~~~~l~~~~~~~~~~g~~~~~--~~~~--~~i~G~pvv~s~~~~~~~-------------------------~~~g 394 (466) T protein:vir:80 344 MSSNTHAVLMSKAITFNSAGALVAS--LNNT--MPIVGGDIVILDFIPDND-------------------------IIGG 394 (466) T ss_pred ecchhHHHhhcccccccCCcccccc--CCCc--ccccccceeecCccCccc-------------------------eeee Confidence 78888887764332211 112111 1122 248899999999998532 0112 Q ss_pred cccceEEEEechhhhhhhhhcceeeeeeechhhhc--ceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 286 ALDNVVGLFNHRSAVGTVKLKDMALERARRANFQA--DQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 286 ~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~--d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +|.... ++-+ ++++++...+..+.- ..+++.++++.++.+|++.+.+.+... 
T Consensus 395 ~~~~y~--i~~r--------~~~~i~~~~~~~f~~d~~~~r~~~r~dg~~~~~~afv~~~~~~~ 448 (466) T protein:vir:80 395 YGSLYL--LAER--------ADIKLAQSEHVRFIEDQTVFKGTARYDGKPVFGEGFVAVNIANA 448 (466) T ss_pred ccccEE--EEee--------cceEEEechhhhhhcCcEEEEEEEEEccEEeccCceEEEEecCC Confidence 333321 2222 235555554443323 458899999999999999998877666 No 177 >protein:vir:79928 Length: 393 # NCBI annotation: major head protein # Family: family:all:30335 # MgeID: mge:1874 # MgeName: 0305phi8-36 # Cross-refs: genbank:acc:YP_001429616;genbank:gi:156564106;genbank:GeneID:5525693 Probab=98.11 E-value=4.7e-07 Score=55.27 Aligned_cols=301 Identities=15% Similarity=0.155 Sum_probs=165.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeeeecCCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYLQPGENLD 80 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~~~g~~~~ 80 (347) |+.-.-+.+..+|-- ..+++..=|.-+.-++-|.++-..-.+-..++-.-.++.|.+-.|+.+|-.-+.....|++.+ T Consensus 59 m~G~~p~~eV~~~e~--mtt~~a~IliP~vis~v~~Eaaepl~~~~kl~qk~~L~~Grsm~F~~~g~~Ra~~IgEGgE~~ 136 (393) T protein:vir:79 59 MEGETPTNEVNLREF--MATPSAQILIPRVIVGTMREAAEPLYIGTKMLQKIRLKSGQSMIFPSIGIMRAYDVAEGQEIP 136 (393) T ss_pred hcCCCchhheehhhh--hcCCCcceechhhhhhhhhhcccchhHHHHHHHHHhhhcCcceeccchheeeecccccccccc Confidence 776555555444433 333333323447777777765433333333444445678999999999988777777788876 Q ss_pred CccCCCCCceEEEEEEeeeecccccccHHH--HHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccc Q lcl|NC_015249. 81 DKRKDMKHTERTINIDGLLTADVLIYDIED--AMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGL 158 (347) Q Consensus 81 ~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~--~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~ 158 (347) ...-+. .+.-.+++.+-+|. ..|.=-|+ -.+.+|+++-..+.++.+|+|..|+-++.+.-+-. 
+..-.++ T Consensus 137 ~~sld~-~T~dsv~~~~gK~G-~~Ia~SqEmIsDSg~Dvin~~l~aA~RaMaRkKee~a~n~fk~~g------htvfDa~ 208 (393) T protein:vir:79 137 EDSIDW-QTHESPEIRVGKSG-IRLRFTDEMISDSQWDLMSMMIKQAGRAMGRHKEQKAYHQFRSHG------HTVFDNY 208 (393) T ss_pred ccchhh-hcCCceeEEechhh-hhhhhHHHHhhcchHHHHHHHHHHHHHHHHhhhHHHHHhhhhccc------ceeeecc Confidence 643221 22224556565543 23432233 34789999999999999999999999887763211 1111111 Q ss_pred cCccee-ecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhh--h----hh---- Q lcl|NC_015249. 159 GKAHVL-EVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNA--A----NY---- 227 (347) Q Consensus 159 ~~g~~i-~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~--~----~~---- 227 (347) +.++.. ++|-+-+ ...+++-..+.|+++.-.---.. -.+-++++.|=.|+..-|....-. . +| T Consensus 209 st~t~ahptGr~~~----~~qNGTlSleDllDm~~av~~~h--yt~svi~MHPLAWnv~AKna~me~~~~na~gN~~~~~ 282 (393) T protein:vir:79 209 STNKLAHTTGLDKN----GVQNDTFSAEDFLDLIIAVMANE--YTPSDLMMHPLAWTVFAKNELMGSLQANPYGNYPAKG 282 (393) T ss_pred ccCccceeecCCcc----ccccccccHHHHHHHHHHHhccc--CCcceEEEcCchhhhhhhhhhhcceeeccccccCccc Confidence 122111 1111111 12222323344444333322222 134589999988888876643211 1 11 Q ss_pred -ccccc----cccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhh Q lcl|NC_015249. 228 -QALID----PSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGT 302 (347) Q Consensus 228 -~~~~~----~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~ 302 (347) ..+.. +.+|++ -..|+|+.||-+|..... ..+. -|..+ .+.+++..-++ T Consensus 283 ~~ts~algp~~i~~~~--~~nlnv~~sPfvp~d~k~------------~rFd-------~~~Vd-~NnvgvlLV~D---- 336 (393) T protein:vir:79 283 APSSMALGPDSIQGRL--PFNFNVNLSPFIPLDKKS------------RRFD-------VYAVD-RNNVGVLLVRD---- 336 (393) T ss_pred cchhhhhchhhhcccc--ccceeEEEeccccccccc------------ceee-------EEEee-cCCceEEEEec---- Confidence 11111 111211 135899999988864321 1111 11111 23344444233 Q ss_pred hhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEE----cCC Q lcl|NC_015249. 
303 VKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVF----NKA 347 (347) Q Consensus 303 v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~----~~a 347 (347) ++++++|.|+-+--.-|.-+-+||.|+|.--.++++.= .+- T Consensus 337 ----~i~tdq~ddk~rdiq~iKl~ERYG~gvLn~gkaiavakNI~~~k~ 381 (393) T protein:vir:79 337 ----DLKTDQWDEKARGLQNIKMIERYGIGILNEGKAIAVAKNISMDKS 381 (393) T ss_pred ----CcceeccccccccceeeeeeeeeceeeeeCCceEEEEecceeecc Confidence 68899999999999999999999999998877765432 222 No 178 >protein:vir:78387 Length: 349 # NCBI annotation: putative coat protein # Family: family:all:1522 # MgeID: mge:1851 # MgeName: SETP3 # Cross-refs: genbank:acc:YP_001110837;genbank:gi:134288598;genbank:GeneID:5179650 Probab=97.61 E-value=3.8e-05 Score=44.85 Aligned_cols=289 Identities=12% Similarity=0.053 Sum_probs=154.3 Q ss_pred CCcccccccccccccccccccchhhhh-hhhhhhHHHHHHHHHHhhh-c-ccc-ccc-----ccccceEEEeecCcceee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALF-LKVFGGEVLTAFTRTSVTM-N-KHL-VRS-----IQSGKSAQFPVLGRTKAA 71 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~-ie~f~g~V~~~f~~~s~~~-~-~~~-~r~-----i~~G~tv~i~~iG~~~~~ 71 (347) ||.+. . +|.- .| +|+|..+|.+...+.+.|. + .+. ... ..+|+.+.+|..+...-. T Consensus 1 Ma~T~--------l------~D~i-ipe~~vf~~Yv~~~~~e~~~l~qSGii~~d~~l~~~~~~gG~~~~iPf~~~L~g~ 65 (349) T protein:vir:78 1 MAITT--------I------GDIV-TGNIPVLASYMTEDPVEKTAFFDSGILTSTPYAAEIANGPSNIANLPFWKAIDTS 65 (349) T ss_pred CCceE--------E------eeee-ccCHHHHHHHHHHhhHHhhhhhhccceeccHHHHHHhhcCCCEEEeeeeecCCCC Confidence 66332 2 2211 11 2478888887777666432 2 222 111 246999999999765321 Q ss_pred ----eeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 72 ----YLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 72 ----~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) +-..+..=+.+++.+...+..-++ ..+-..|...|+-..-+--|+|..++++.+.--.|...+.++ .+.++... 
T Consensus 66 ~e~nv~~D~~~~~~t~~kitt~~~~a~~-~~r~kaw~~~Dla~~lsG~dpm~~Ia~~va~yW~r~~q~~Li-a~L~Gvf~ 143 (349) T protein:vir:78 66 IEPNYSNDVYQDIATPRAIQTGEMMARV-AYLNEGFGQADLTVELTSQNPLQSVASRLDNFWQRQAQRRLI-ATALGLYN 143 (349) T ss_pred cccccCCCCcccccccccccccceeeee-eeeccccchhHHHHHhhCchHHHHHHHHHHHHHhhHHHHHHH-HHHHHhhc Confidence 111111001123445444443322 444567778899888888899999999888877776554444 44454433 Q ss_pred ccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhc---CCCCCCCEEEeCHHHHHHHhcchhhhh Q lcl|NC_015249. 148 PSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGN---YVPSADRVFYTTPDNYSAILAALMPNA 224 (347) Q Consensus 148 ~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~---~VP~~gR~~vv~P~~~~~Ll~~~~~~~ 224 (347) ..........+.+..+..+.+.+. ..++ .+++|..+|... +....=-.+++-+..|..|.+...+ T Consensus 144 ~~~~a~~~~~~~~~~t~d~s~~a~------~~~~----~~~dA~~~lgda~~Gd~~~~lt~i~mHS~v~~~L~~~~li-- 211 (349) T protein:vir:78 144 DNVSATDAYHEQNDMVVDVSATLG------FDAG----AFIDATQTMGDALMGNGGEVLGAIAMHSFVYAQARKAQLI-- 211 (349) T ss_pred ccccccchhhhcccceeeeccccC------CChh----hhhhhHHHHHHHhccccccceeEEEEchHHHHHHHhhhhh-- Confidence 222222222122222222222211 1223 344555555543 1122226899999999998765432 Q ss_pred hhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhh Q lcl|NC_015249. 225 ANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVK 304 (347) Q Consensus 225 ~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~ 304 (347) +|.- ..-.+..|..++|.+|+....+|....++. .+....+|-+-|++..+ T Consensus 212 -~~i~-~s~~~~~i~ty~G~~VivDD~~Pv~~~g~~---------------------------~~yttylfg~GAi~~~~ 262 (349) T protein:vir:78 212 -DFIR-DAENNTMFATYQGYRVIVDDSMTVVGQGAQ---------------------------RKFISIIFGQGAIGYGE 262 (349) T ss_pred -hhcc-CcccCcccceecCeEEEEeCCCccccCCCC---------------------------ceEEEEEeecceEEEcc Confidence 2221 112355689999999999999997543211 12233455566666665 Q ss_pred hcc-eeeeeeechhhh----cceeeeeeeeccc--ccccceEEEEEE-----cCC Q lcl|NC_015249. 
305 LKD-MALERARRANFQ----ADQIIAKYAMGHG--GLRPEACGALVF-----NKA 347 (347) Q Consensus 305 ~~~-~~~e~~~d~~~~----~d~i~~~~a~G~~--~~Rpe~a~~i~~-----~~a 347 (347) ..+ +.+|..||+... .|.+..+++|.-. ...+..+.+..- ..+ T Consensus 263 ~~~~~~~et~rd~~~g~~~G~d~l~~R~~~~~hp~G~s~~~a~v~~~~~~~~~~s 317 (349) T protein:vir:78 263 GNPVMPLEYEREASRANGGGVETLWTRKTWLLHPFGYRFTSAVITGNGTETIARS 317 (349) T ss_pred CCCccceeeecccccCCcceeEEEEEeeEEEeeeeeeeeccccccCCccccccCC Confidence 543 247777888653 4888887777543 334443321100 011 No 179 >protein:vir:80446 Length: 367 # NCBI annotation: BcepGomrgp07 # Family: family:all:1522 # MgeID: mge:1882 # MgeName: BcepGomr # Cross-refs: genbank:acc:YP_001210227;genbank:gi:146329919;genbank:GeneID:5123555 Probab=97.60 E-value=3.9e-05 Score=44.77 Aligned_cols=294 Identities=12% Similarity=0.047 Sum_probs=155.1 Q ss_pred CCcccccccccccccccccccchhhhhh-hhhhhHHHHHHHHHHhh-hccccc--cc-----ccccceEEEeecCcceee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVT-MNKHLV--RS-----IQSGKSAQFPVLGRTKAA 71 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~i-e~f~g~V~~~f~~~s~~-~~~~~~--r~-----i~~G~tv~i~~iG~~~~~ 71 (347) |+..+. . |+. + + +|+ |+|..+|.+...+++-| .+-+-+ .. -.+|+.+.+|..+...-. T Consensus 1 M~~~~~---~-T~l------~--D-ii~pEvF~~Yv~~~~~e~~~l~qSGiv~~d~~l~~~~~~gG~~v~iPf~~~L~g~ 67 (367) T protein:vir:80 1 MPDFNN---Q-VRL------V--D-AVIPEVYTSYTAIDRPELTAFFLSGAVASNDFLSQFLSAPGRLINIPFWRDLDSL 67 (367) T ss_pred Ccchhh---h-hhh------h--h-ccchhhhhHHHhhhhhhhhhhhhcceeecCHHHHHHhhcCCCEEEeeeeccCCCC Confidence 666542 1 333 1 2 455 99999998877766643 222222 12 257999999999776432 Q ss_pred e--eecCCC-CCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh- Q lcl|NC_015249. 72 Y--LQPGEN-LDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL- 147 (347) Q Consensus 72 ~--~~~g~~-~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~- 147 (347) . +...++ .+.++..+...+..-.+ ...--.|...|+-..-+--|+|..+..+.+.--.|..-+ +|..+.++... 
T Consensus 68 ~~n~~~d~~~~~~t~~kittg~~~a~v-~~r~kaw~~~Dla~~lsG~dpm~~Ia~qva~yW~r~~q~-~Lla~L~Gvf~~ 145 (367) T protein:vir:80 68 EPNYGSDNPNVEAPIDGLGSGEMKTTK-TWLNKAYGAMDLTAELAGSNPMTRIRNRFGVYWTRQWQR-RIIAMAVGVYKS 145 (367) T ss_pred ccccCCCCCcccccccccccchheeee-ehhcccchhhhHHHHhhCchHHHHHHHHHHHHhhhhhHH-HHHHHHHHhhcc Confidence 1 111111 01122344444332222 334556778899999999999999999988766666544 44444444332 Q ss_pred cccccc-----------ccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHH Q lcl|NC_015249. 148 PSASDE-----------NIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAI 216 (347) Q Consensus 148 ~~~~~~-----------~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~L 216 (347) ...... .+....+..+..+.+.+. +.+.... .+.+.+|+..|.++ .+.=-.+++-+..|..| T Consensus 146 ~~a~~~~~~~~~~~~~a~~~~~~~~~~~Dis~~t~-~~~~~~s----~~~~~~A~~~lGD~--~~~l~~i~mHS~V~~~L 218 (367) T protein:vir:80 146 NLAGNFATIKTRGRVPAEVLGTAGDMVIDISGQTN-PADAVFN----REAFVDAAFTMGDH--VGSIAAIAVHSMVYKRM 218 (367) T ss_pred ccccchhhhhhhhccccccccccCceeeeeeccCC-Cccceec----HHHHHHHHHHhccc--cccccEEEEchHHHHHH Confidence 111110 011122333333322221 1122233 34567787888664 33447899999999998 Q ss_pred hcchhhhhhhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEec Q lcl|NC_015249. 217 LAALMPNAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNH 296 (347) Q Consensus 217 l~~~~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~ 296 (347) .+.. ++.---..+ .+..|+.++|.+|++...+|....+.. .+....+|- T Consensus 219 ~~~~-li~~i~~sd---~~~~i~ty~G~~VIvDD~~Pv~~~~a~---------------------------~~yttYlfg 267 (367) T protein:vir:80 219 TNND-EIEFIPDSK---GQLTIPTYMGKVVIVDDGMPVFGTGAD---------------------------KTYLSILFG 267 (367) T ss_pred Hhcc-ccccccCCC---CccccceecceeEEEeCCCcccccCCC---------------------------ceEEEEEEe Confidence 7764 322111111 145689999999999999997543211 012233555 Q ss_pred hhhhhhhhhcce-eeeeeechhhh----cceeeeeeeecc--cccccceEEEE-EE---------cCC Q lcl|NC_015249. 
297 RSAVGTVKLKDM-ALERARRANFQ----ADQIIAKYAMGH--GGLRPEACGAL-VF---------NKA 347 (347) Q Consensus 297 ~~Av~~v~~~~~-~~e~~~d~~~~----~d~i~~~~a~G~--~~~Rpe~a~~i-~~---------~~a 347 (347) +-|++..+..+. .+|..||+..+ .|.+..+.+|=- ......-+..+ -+ ... T Consensus 268 ~GAi~~~~~~~~~~~E~~Rd~~~~~~gG~d~L~~Rr~~~~hP~G~s~~~~~v~~~~~~~~~~~~~~~~ 335 (367) T protein:vir:80 268 GAAFGYADGAPQVPVAVGRRELRGNGSGLEYILERKEWIVHPGGFNWLDADVTIPDNTGSPSGITSGP 335 (367) T ss_pred cceeeecccCCccceecccchhhhcCCceEEEEeeeeEEeecceeeeccccccccccccccccccccc Confidence 556655554432 25888999864 256665544321 11222211111 00 000 No 180 >protein:vir:107687 Length: 319 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1518 # MgeName: T1 # Cross-refs: genbank:acc:YP_003898;genbank:gi:45686314;genbank:GeneID:2773027 Probab=97.10 E-value=0.00017 Score=41.22 Aligned_cols=294 Identities=10% Similarity=0.017 Sum_probs=137.5 Q ss_pred CCcccccc-------cccccccccccccchhhhhhhhhhhHHHHHHHHH----Hhhhcccccccccc--cceEEEe---e Q lcl|NC_015249. 1 MAKMNGGQ-------QIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRT----SVTMNKHLVRSIQS--GKSAQFP---V 64 (347) Q Consensus 1 ma~~~~~~-------~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~----s~~~~~~~~r~i~~--G~tv~i~---~ 64 (347) |-++++.- ...-+.|--..+.+..++|+.+...+++....+. -+.+.++.+++--+ -.++.+. . T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~da~~~~g~~~~~ql~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~ 80 (319) T protein:vir:10 1 MTTKKFDEADKSNVEMYLIQAGVKQDAAATMGIWTAQELHRIKSQSYEEDYPVGSALRVFPVTTELSPTDKTFEYMTFDK 80 (319) T ss_pred CCCcchhHHhhHHHHHHHhhccchhhhhhhhhhHHHHHHHHHHHHHHhhhhcceechhhcccccCCCCceEEEEeeeecc Confidence 77776541 1111122223333444567644444666444332 24556666653222 3445444 3 Q ss_pred cCcceeeeeec-CCCCCCccCCCCCceEEEEEEee-eecccccccHHHHH-hChhhHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 65 LGRTKAAYLQP-GENLDDKRKDMKHTERTINIDGL-LTADVLIYDIEDAM-NHYDVRSEYTAQLGESLAMAADGAVLAEM 141 (347) Q Consensus 65 iG~~~~~~~~~-g~~~~~~~~~~~~~~~~l~ID~~-~~~~~~Idd~D~~q-~~~D~r~~~~~~~g~aLa~~~D~~i~~~~ 141 (347) +|..+. +.. ..+++. .+..-.+....|-.. 
.-+.+.+.++..++ ...++-..-...+..++++..|+.+|.-. T Consensus 81 ~G~a~~--~~d~~~dip~--v~~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~i~f~G~ 156 (319) T protein:vir:10 81 VGTAQI--IADYTDDLPL--VDALGTSEFGKVFRLGNAYLISIDEIKAGQATGRPLSTRKASACQLAHDQLVNRLVFKGS 156 (319) T ss_pred ccceee--ecCccccccc--eeccceeeEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEEeec Confidence 455443 222 233332 223334444444332 23445556777774 67777788888899999999998876432 Q ss_pred HHHhhhccccccccccc-cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhc--CCCCCCCEEEeCHHHHHHHhc Q lcl|NC_015249. 142 AKLCNLPSASDENIAGL-GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGN--YVPSADRVFYTTPDNYSAILA 218 (347) Q Consensus 142 ~~~a~~~~~~~~~~~~~-~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~--~VP~~gR~~vv~P~~~~~Ll~ 218 (347) .+ ....|+ -..++... ..+.-..-..+.++.+++.|..+..+|..+ .+ ...-.++|+|+.|..|.. T Consensus 157 ~~---------~g~~GLlN~p~~~~~-~~~~~~~~~t~t~~~i~~di~~~~~~l~~~s~g~-~~p~~L~L~p~~~~~L~~ 225 (319) T protein:vir:10 157 AP---------HKIVSVFNHPNITKI-TSGKWIDVSTMKPETAEAELTQAIETIETITRGQ-HRATNILIPPSMRKVLAI 225 (319) T ss_pred cc---------ccceeEEeCCCceee-ecCCCCCccccCHHHHHHHHHHHHHHHHHhcCce-eeceEEEecHHHHHhhhc Confidence 11 111111 11111111 111111112245678899999988888754 22 123489999999999952 Q ss_pred chhhhhhhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEec-- Q lcl|NC_015249. 219 ALMPNAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNH-- 296 (347) Q Consensus 219 ~~~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~-- 296 (347) +..+.+..-...++. +..+++|...+.|...++. + +-+++++. 
T Consensus 226 --~~~~~~~t~l~~lk~----~~~~l~I~~~pel~~ag~~-----------g------------------~~~~v~y~~~ 270 (319) T protein:vir:10 226 --RMPETTMSYLDYFKS----QNSGIEIDSIAELEDIDGA-----------G------------------TKGVLVYEKN 270 (319) T ss_pred --ccCCCCeeHHHHHHH----hcCCceEEEeeeecccCCC-----------c------------------ceEEEEEecC Confidence 111111110111111 1235667776666421100 0 11222222 Q ss_pred hhhhhhhhhcceeeeeeechhhhcceeeeeeee-cccccccceEEEEEEcC Q lcl|NC_015249. 297 RSAVGTVKLKDMALERARRANFQADQIIAKYAM-GHGGLRPEACGALVFNK 346 (347) Q Consensus 297 ~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~-G~~~~Rpe~a~~i~~~~ 346 (347) ++-+...-.++++.-. ..++...+.+....+. |.-+.||++++.+. -- T Consensus 271 ~~~~~~~v~~~~~~~~-~e~~~l~~~~~~~~r~~Gv~i~~P~ai~~~d-GI 319 (319) T protein:vir:10 271 PMNMSIEIPEAFNMLP-AQPKDLHFKVPCTSKCTGLTIYRPMTIVLIT-GV 319 (319) T ss_pred CceEEEecCcceeeee-eeecCceEEEeeeeeeEEEEEEccceeEeee-cC Confidence 2222222223333221 2333444555555555 46788999765332 11 No 181 >protein:vir:94989 Length: 349 # NCBI annotation: hypothetical protein # Family: family:all:1522 # MgeID: mge:1547 # MgeName: KS7 # Cross-refs: genbank:acc:YP_224029;genbank:gi:62327316;genbank:GeneID:5176817 Probab=97.08 E-value=0.00018 Score=41.15 Aligned_cols=289 Identities=11% Similarity=0.030 Sum_probs=151.8 Q ss_pred CCcccccccccccccccccccchhhhh-hhhhhhHHHHHHHHHHhhh-c-ccc-ccc-----ccccceEEEeecCcceee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALF-LKVFGGEVLTAFTRTSVTM-N-KHL-VRS-----IQSGKSAQFPVLGRTKAA 71 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~-ie~f~g~V~~~f~~~s~~~-~-~~~-~r~-----i~~G~tv~i~~iG~~~~~ 71 (347) ||.+. . +|.- .| +|+|..+|.+...+.+.|. + .+. ... ..+|+.+.+|..+...-. 
T Consensus 1 Ma~T~--------l------~D~i-ipe~~vf~~Yv~~~~~e~~~l~qSGii~~d~~l~~~~~~gG~~~~iPf~~~l~g~ 65 (349) T protein:vir:94 1 MAITT--------I------GNIV-TGNIPVLASYMTEDPVEKTAFFNSGILTPTPYAAEIARGPSNIANLPFWKAIDTS 65 (349) T ss_pred CCceE--------E------eeee-ccChHHHHHHHHHhHHHhhhhhhccceeccHHHHHHHhcCCCEEEeeeeecCCCC Confidence 66333 2 1211 11 3478888888777666433 2 222 111 246999999988764321 Q ss_pred ---eeecCCCC-CCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 72 ---YLQPGENL-DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 72 ---~~~~g~~~-~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) .+..-++. +.++..+...+..-.+ ..+-..|...|+-..-+--|+|..++++.+.--.|...+.++ .+.++... T Consensus 66 ~e~n~~~dt~~~~~t~~kit~~~~~a~~-~~r~kaw~~~Dla~~lsG~dpm~~Ia~~va~yW~r~~q~~Li-a~L~Gvf~ 143 (349) T protein:vir:94 66 IEPNYSNDVYQDIATPRAIQTGEMMARV-AYLNEGFGQADLTVELTSQNPLQSVASRLDNFWQRQAQRRLI-ATALGLYN 143 (349) T ss_pred cccccCCCCcccccccccccccceeeee-eeeccccchhHHHHHhhCchHHHHHHHHHHHHHhhHHHHHHH-HHHHhhhc Confidence 11111111 1122344444432222 344566778899888888899999999988877777555444 44455443 Q ss_pred ccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhc---CCCCCCCEEEeCHHHHHHHhcchhhhh Q lcl|NC_015249. 148 PSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGN---YVPSADRVFYTTPDNYSAILAALMPNA 224 (347) Q Consensus 148 ~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~---~VP~~gR~~vv~P~~~~~Ll~~~~~~~ 224 (347) ......+.....+..+..+.+.+. 
..++.+ .+|..+|..+ +....=-.+++-+..|..|.+...+ T Consensus 144 ~~~~~~~~~~~~~~~~~d~~~~a~------~~~~~~----~~A~~~~Gdaa~Gd~~~~lt~i~mHS~v~~~L~~~~li-- 211 (349) T protein:vir:94 144 DNVSATDAYHEQNDMVVDVSATSG------FDAGAF----IDATQTMGDALMGNGGEVLGAIAMHSFVYAQARKAQLI-- 211 (349) T ss_pred ccccccccccccCceeEEecccCC------CChhhH----HHHHHHHHHHhccccccceeEEEEchHHHHHHHhcchh-- Confidence 222222222222233333322221 223333 3444444443 1111125789999999998765432 Q ss_pred hhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhh Q lcl|NC_015249. 225 ANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVK 304 (347) Q Consensus 225 ~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~ 304 (347) +|.- ..-.+..|..++|.+|++...+|....++. .+....+|-+-|++..+ T Consensus 212 -~~i~-~s~~~~~i~ty~G~~VivDD~~Pv~~~g~~---------------------------~~yttylfg~GAi~~~~ 262 (349) T protein:vir:94 212 -DFIR-DAENNTMFATYQGYRVIVDDSMTVVGQDTS---------------------------RKFISIIFGQGAIGYGE 262 (349) T ss_pred -hhcc-CcccCcccceecCcEEEEeCCCccccCCCC---------------------------ceEEEEEeecceEEeec Confidence 1211 111344688999999999999997543211 12233455566666666 Q ss_pred hcc-eeeeeeechhhh----cceeeeeeeecccc--cccceEEEEEE-----cCC Q lcl|NC_015249. 305 LKD-MALERARRANFQ----ADQIIAKYAMGHGG--LRPEACGALVF-----NKA 347 (347) Q Consensus 305 ~~~-~~~e~~~d~~~~----~d~i~~~~a~G~~~--~Rpe~a~~i~~-----~~a 347 (347) ..+ +.+|..||+... .|.+..+++|.-.| ..+..+.+..- ... 
T Consensus 263 ~~~~~~~E~~rd~~~g~~~G~d~L~~R~~~~~hp~G~s~~~a~v~~~~~~~~~~s 317 (349) T protein:vir:94 263 GNPEMPLEYEREASRANGGGVETLWTRKTWLLHPFGYSFTSAVITGNGTETIARS 317 (349) T ss_pred CCCCcceeeecccccCCcceeEEEEEeeEEEeeeeeeeecccccCCCccccccCC Confidence 553 347777888654 47888876665322 23332221100 001 No 182 >protein:vir:3969 Length: 287 # NCBI annotation: major capsid protein # Family: family:all:3269 # MgeID: mge:83 # MgeName: ul36 # Cross-refs: genbank:acc:NP_663677;genbank:gi:21716114;genbank:GeneID:951200 Probab=96.89 E-value=0.00023 Score=40.56 Aligned_cols=250 Identities=15% Similarity=0.136 Sum_probs=115.5 Q ss_pred chhhhhhhhhhhHHHHHHHHHHhhhccccc-----ccccccceEEEeecCcc--eeeeeecCCCC---CCccCCCCCc-- Q lcl|NC_015249. 22 DKLALFLKVFGGEVLTAFTRTSVTMNKHLV-----RSIQSGKSAQFPVLGRT--KAAYLQPGENL---DDKRKDMKHT-- 89 (347) Q Consensus 22 d~~al~ie~f~g~V~~~f~~~s~~~~~~~~-----r~i~~G~tv~i~~iG~~--~~~~~~~g~~~---~~~~~~~~~~-- 89 (347) =+.-.|-|+|.|.+.+-|+.+|.|++..-- --+.+.++..--+...+ .++.|..++.. .++.+.-.-. T Consensus 1 ~avr~y~Kq~~glL~~vf~~qa~F~~~FGg~lQ~~DGV~~N~taf~vKtsD~pVVi~~Y~Td~Nv~FGtGTg~ssRFG~r 80 (287) T protein:vir:39 1 MAIKYFTKQYAGMLPDLFAKKSAFLRAFGGVLQVKDGVTENDTFMELKVSDTDVVIQAYSTDANVGFGSGTGNTSRFGQR 80 (287) T ss_pred CCcccccHHHHHHHHHHHHHHHhhhhhcccceeeecCCcccceEEEEEecCcceEEecccCCCCcccccCCCccccccce Confidence 111147799999999999999998855432 12222333222222221 12222222221 0010100001 Q ss_pred eEEEEEEeee-e-ccccc-ccHHHHHhChh---hHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccCcce Q lcl|NC_015249. 90 ERTINIDGLL-T-ADVLI-YDIEDAMNHYD---VRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGKAHV 163 (347) Q Consensus 90 ~~~l~ID~~~-~-~~~~I-dd~D~~q~~~D---~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~ 163 (347) +-.+.+|+.. | |.+.| .-+|+.-.+-| ...+..+.++.|-++.+|..+-..|...+ .... 
.+ T Consensus 81 kEi~y~dt~V~Y~~~~~ihEGiD~~TVNnd~~aaVAdRL~Lqa~A~t~~~n~~~Gk~ls~~A------~~t~------~~ 148 (287) T protein:vir:39 81 KEVKSVNKQVSYDAPLAINEGIDDFTVNDIKDQVVAERLALHGVAWAQHVDKLLGKLLSDSA------SETL------TV 148 (287) T ss_pred eEEEEecccccceeccccccccccccccCChhHHHHHHHHhHHHHHHHHHHHHHHHHHHhhc------chhe------ee Confidence 1122222221 1 11111 23444444434 34555677888889999975533222111 0000 00 Q ss_pred eecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCC-CCEEEeCHHHHHHHhcchhhhhhhhccccccccceEEEEe Q lcl|NC_015249. 164 LEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSA-DRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIRNVM 242 (347) Q Consensus 164 i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~-gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~i~ 242 (347) +...+.+-..+-+|..++..++|-.. .....|.|+.|.+|..++-.+..--.+-+.-.+| |.++- T Consensus 149 -------------~~t~d~V~~LF~~a~~~yvNn~v~~~~~~~AyV~aevYnaiiD~~l~TsaK~SsaNiDen~-i~kFk 214 (287) T protein:vir:39 149 -------------KLDEDSVTKLFSDAHKKFVNNNVSIAVPWVAYVNADIYDLLIDSKLATTAKNSSANVDEQT-LYKFK 214 (287) T ss_pred -------------eecccchHHHHHHHHHHhhccceeeEEEEEEEEChhHHhHHhccccccccccceeeeccCC-cceec Confidence 01111122334455666666666433 4778999999999998886665433333333444 56888 Q ss_pred ceEEEEecc-----------------eecccccccccccccccccccccccccccccccccc-cceEEEEechh Q lcl|NC_015249. 243 GFEVIEVPH-----------------LTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVAL-DNVVGLFNHRS 298 (347) Q Consensus 243 G~~V~~sn~-----------------lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~-~~~~~l~~~~~ 298 (347) ||.+-|.|. .++.++...+.....+..|.+++.+ ...|.|-.+. 
.+.+.-+..++ T Consensus 215 Gf~l~e~P~~~~q~g~~a~fs~dnig~af~GI~vaR~i~sEdF~GvalQgA-gK~G~~i~e~Nk~Ai~k~t~~k 287 (287) T protein:vir:39 215 GFILSELPDEKFQLNEGAYFAADNVGVAGVGIQVTRAMDSEDFAGTALQAA-AKYGKYLPEKNKKAILKATVTK 287 (287) T ss_pred ceEEEecchHhhccCcEEEEccccceeecccceeEEeeecccccceeeecc-cccccccccccceEEEEEecCC Confidence 998888762 1222333334444444455554432 2223333332 22222222222 No 183 >protein:vir:97397 Length: 517 # NCBI annotation: major capsid protein # Family: family:all:11745 # MgeID: mge:1675 # MgeName: Q54 # Cross-refs: genbank:acc:YP_762590;genbank:gi:115304291;genbank:GeneID:5130600 Probab=96.81 E-value=0.00029 Score=40.00 Aligned_cols=281 Identities=12% Similarity=0.048 Sum_probs=110.2 Q ss_pred CCcccccccccc--cccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeec-CcceeeeeecCC Q lcl|NC_015249. 1 MAKMNGGQQIGK--DQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVL-GRTKAAYLQPGE 77 (347) Q Consensus 1 ma~~~~~~~~~t--~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~i-G~~~~~~~~~g~ 77 (347) ++......+... +.......+. +-...+...+...+...+.+...+++.++. ...++.- ....+..+..|. T Consensus 226 ~~~~~~~~~~~~~~~~~~~~~~~~---~~p~~~~~~i~~~~~~~~~i~~~~~~~~i~---~~~~~~~~~~~~a~~~~eG~ 299 (517) T protein:vir:97 226 SASLTKDPKAAWTAELKERGISGM---PAPAGILKRIQDAVNDEGSLLPFIRHENLP---TLVVGGDNALTQGTGHTTGT 299 (517) T ss_pred Hhcccccccceeeeeccccccccc---ccchHHHHHHHHhhhhhccceeeeeecccc---ceeeecccccceeeeeecCC Confidence 111111111000 0000000010 011233344444454445555555544432 2333322 222344555555 Q ss_pred CCCCccCCCCCceEEEEEEeeeecc-cccccHHHHHhChh----hHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc Q lcl|NC_015249. 78 NLDDKRKDMKHTERTINIDGLLTAD-VLIYDIEDAMNHYD----VRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASD 152 (347) Q Consensus 78 ~~~~~~~~~~~~~~~l~ID~~~~~~-~~Idd~D~~q~~~D----~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~ 152 (347) ..+. .+++-.++++.+ .+++. +.+..---..+.+| +.+-+..++.++|+++.++.++.--. .. 
T Consensus 300 ~kp~--s~~tf~~~~~~~--~~ia~~~~~S~qll~Ds~~dd~~~l~s~i~~~l~~~l~~~ee~a~l~GdG--------tg 367 (517) T protein:vir:97 300 DKTE--SNITLQTRVLTP--QYVYKYIKLPKIVMNSNATDIAGAILTYVMNRLPDMVIMAVNRAIIMGGV--------TG 367 (517) T ss_pred cccc--cccceeeEEeeH--hhhhhhhhhhHHHHHHhhhccHHHHHHHHHHHHHHHHHHHHHHHHhcccC--------CC Confidence 5432 234444444443 22222 12221111112334 77778899999999999988753100 00 Q ss_pred cccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccc Q lcl|NC_015249. 153 ENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANYQALID 232 (347) Q Consensus 153 ~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~ 232 (347) .... ..+......... .......+.+.+ ..|.....+..+-.+|++|..|..|.+-.. .++.|.=... T Consensus 368 ~~~~-----gi~~~a~~~~~~--~~~~~~~~~d~i----~~l~~a~~~a~~a~~vmn~~t~~~I~klKD-~~G~Yl~~~~ 435 (517) T protein:vir:97 368 VSET-----QIYPVVGDAWAT--NVTGTTNIQELL----EKLSVATPKAADSTLVIHRNDLAAIRFLKD-KNGNYVFPVG 435 (517) T ss_pred cccc-----cccccccccccc--cccccchHHHHH----HHHHHHhhhccCCEEEECHHHHHHHHHhhc-CCCCeeccCc Confidence 0000 111111111100 001111122222 222222222223456799999998865432 3455654444 Q ss_pred cccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeee Q lcl|NC_015249. 233 PSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALER 312 (347) Q Consensus 233 ~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~ 312 (347) ..++.+..++|+.-+.. .++... .... . .++ | .++- .+.+..-+ T Consensus 436 ~~~~~~~~l~G~~~~~~-~~~~~~---~~~~-----~---------~~~-y--------~i~~---------~~g~~~~~ 479 (517) T protein:vir:97 436 VSNQTIATHFGFNRLVQ-SVAVDE---KTAV-----S---------LSG-Y--------VTNG---------SRGMEFEQ 479 (517) T ss_pred CCcccccccCCcccccc-ccccCc---eeEe-----e---------ccc-c--------EEEe---------ecceeeee Confidence 55666666777422221 121100 0000 0 000 0 0000 01111112 Q ss_pred eechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 
313 ARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 313 ~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+|-.+-.+.+...++.|..++.|+..+-.++.-. T Consensus 480 ~fd~~~n~~~f~~~~~~~g~i~~~~r~a~~~~~p~ 514 (517) T protein:vir:97 480 GTILVENNKEYLFEMPISGSLEYKGTTAYGTYTPP 514 (517) T ss_pred eeecccCceeEeeeeeeccccccccceEEEEEcCC Confidence 23333445556666777777777776655444333 No 184 >protein:vir:94528 Length: 286 # NCBI annotation: major head protein # Family: family:all:3269 # MgeID: mge:1510 # MgeName: phiJL-1 # Cross-refs: genbank:acc:YP_223889;genbank:gi:62327101;genbank:GeneID:5075544 Probab=96.60 E-value=0.00048 Score=38.78 Aligned_cols=249 Identities=14% Similarity=0.083 Sum_probs=109.3 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccc-c---cccccceEEEeecCcc--eeeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLV-R---SIQSGKSAQFPVLGRT--KAAYLQ 74 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~-r---~i~~G~tv~i~~iG~~--~~~~~~ 74 (347) |+..+- |...| .|-|+|.+-+.+-|+.++.|++..-- + -+.+..+..--+...+ .++.|. T Consensus 1 m~t~N~--n~avr------------~Y~Kqf~glL~~vf~~qa~F~~~fgglQalDGV~~N~tafsvKt~D~pVVig~Y~ 66 (286) T protein:vir:94 1 MATTNN--DLPVR------------VYSKEFLQLLSTVYQAQSVFTPTFGALQALDGVPNNATAFSVKTNDMAVVVGEYS 66 (286) T ss_pred CCCCcc--cccee------------ehhHHHHHHHHHHHhhHHHhhhhhcchhhhhCCCccceEEEEeecCcceEEeccc Confidence 655442 22222 48899999999999999998855432 1 1222222222122111 122233 Q ss_pred cCCCCC---CccCCCCCc--eEEEEEEeee-e-ccccc-ccHHHHHhChhh---HHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 75 PGENLD---DKRKDMKHT--ERTINIDGLL-T-ADVLI-YDIEDAMNHYDV---RSEYTAQLGESLAMAADGAVLAEMAK 143 (347) Q Consensus 75 ~g~~~~---~~~~~~~~~--~~~l~ID~~~-~-~~~~I-dd~D~~q~~~D~---r~~~~~~~g~aLa~~~D~~i~~~~~~ 143 (347) .++..- ++.+.-.-. +-.+.+|+.. | |.+.| .-+|..-.+-|+ ..+..+.++.|-++.+|..+-..|.. 
T Consensus 67 TdeNv~FGtgTg~SsRFG~rkEi~y~dtdV~Y~~~~~iHEGiD~~TVNnd~~aaVAdRL~lQA~Akt~~~n~~~Gk~ls~ 146 (286) T protein:vir:94 67 TDANTAFGTGTSNSSRFGEMKEVIYADTDVPYTAGWAIHEGLDQMTVNNDLDAAVADRLNLQAQAKTRLFNVAMGEALAT 146 (286) T ss_pred CCCccccccCCccccccCceeeEEeecccccccccchhhhccccccccCChhHHHHHHHHHHHHHHHHHHHHHHHHHHHh Confidence 222210 000110001 1122233221 1 12222 344554444444 34455667888888888754322211 Q ss_pred HhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHh----hhcCCCCCCCEEEeCHHHHHHHhcc Q lcl|NC_015249. 144 LCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKL----TGNYVPSADRVFYTTPDNYSAILAA 219 (347) Q Consensus 144 ~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~L----de~~VP~~gR~~vv~P~~~~~Ll~~ 219 (347) .+ . + ...+|.+.++..+| ....|-. ..-+.|.|+.|.+|..+ T Consensus 147 ~A------~----------------------~-----t~~~D~V~~LF~~as~~yvn~ev~~-~~~ayV~~evYnaiiD~ 192 (286) T protein:vir:94 147 AG------T----------------------D-----LGAVDDVNALFESAVEKYTDLEVIA-PVRAYVTASVYNAIIDL 192 (286) T ss_pred hh------h----------------------h-----hhhhhhHHHHHHHHHHHhhhhheee-eeEEEEchhHHHHHhcc Confidence 10 0 0 00123344444444 4444433 33499999999999998 Q ss_pred hhhhhhhhccccccccceEEEEeceEEEEeccee----------------cccccccccccccccccccccccccccccc Q lcl|NC_015249. 220 LMPNAANYQALIDPSTGSIRNVMGFEVIEVPHLT----------------AGGAGEDRPEEGANPTGQKHAFPETSSGDT 283 (347) Q Consensus 220 ~~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp----------------~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y 283 (347) +-.++.--.+-+.-.+| |.++-||.|-|.|.-- +.++...+.....+..|.+++.+ ...+.| T Consensus 193 ~l~TsaK~SsaNiDeng-i~~FkGf~i~e~P~~~~~g~~aifs~dnig~aftGIn~aR~IesEdF~GValQgA-GK~G~~ 270 (286) T protein:vir:94 193 ANVTTAKNSAVNIDTNG-MLSFRGIAITKVPTQYMGGKAVIFAPDNVARVFTGINIARTIQAIDFAGVELQGA-GKYGTF 270 (286) T ss_pred ccccccccceeeeccCC-cceecceEEeecchhhccCceEEEccccceeeeccceeeeeeeccccCceeeecc-cccccc Confidence 86665433333333455 5688888888766311 11111222222222233332211 112222 Q ss_pred cccccceEEEEechhh Q lcl|NC_015249. 
284 RVALDNVVGLFNHRSA 299 (347) Q Consensus 284 ~~~~~~~~~l~~~~~A 299 (347) =.+.-|..-+-..|++ T Consensus 271 I~edNk~Ai~~~~~k~ 286 (286) T protein:vir:94 271 ILDDNKKAIFTATPKA 286 (286) T ss_pred ccccCceeEEEeecCC Confidence 2222222112122222 No 185 >protein:vir:103285 Length: 296 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1605 # MgeName: JK06 # Cross-refs: genbank:acc:YP_277465;genbank:gi:71834107;genbank:GeneID:3562396 Probab=96.60 E-value=0.00048 Score=38.77 Aligned_cols=274 Identities=12% Similarity=0.038 Sum_probs=131.5 Q ss_pred ccccccchhhhhh-hhhhhHHHHHHHH----HHhhhcccccccc-c-ccceEEEee---cCcceeeeeec-CCCCCCccC Q lcl|NC_015249. 16 KGMSAGDKLALFL-KVFGGEVLTAFTR----TSVTMNKHLVRSI-Q-SGKSAQFPV---LGRTKAAYLQP-GENLDDKRK 84 (347) Q Consensus 16 ~~~~~~d~~al~i-e~f~g~V~~~f~~----~s~~~~~~~~r~i-~-~G~tv~i~~---iG~~~~~~~~~-g~~~~~~~~ 84 (347) -+...+|.-..|+ +++. .++....+ .=..+.++.+++- - +-.++.++. .|..+ -+.. ..+++. . T Consensus 1 ~~~~~a~~~~~f~~~ql~-~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~--~~~~~~~dip~--v 75 (296) T protein:vir:10 1 MGVDKADAAGIWTVKQLT-ASLNKAYETEYDQNSVVNLFPVSNEIPGYAKYFEYPVFDGVGIAQ--IVADYTDDLPL--V 75 (296) T ss_pred CcccchhhhHHHHHHHHH-HHHHHHHhhhhcccccceecccccCCCCceeEEEeeeeeccCcee--EeCCCccccce--e Confidence 2333334433455 5554 44444432 2345566666542 2 134555544 34443 2222 223332 2 Q ss_pred CCCCceEEEEEEee-eecccccccHHHHH-hChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccC-c Q lcl|NC_015249. 85 DMKHTERTINIDGL-LTADVLIYDIEDAM-NHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGK-A 161 (347) Q Consensus 85 ~~~~~~~~l~ID~~-~~~~~~Idd~D~~q-~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~-g 161 (347) +..-.+....|-.. .-+.+.+.++..++ ...++-..-...++.++++..|+.+|--.. .....|+-. . 
T Consensus 76 ~~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~ka~aA~~~~~~~~n~~~f~G~~---------~~g~~GLlN~p 146 (296) T protein:vir:10 76 DALATERQGKVFRFGNAFLISIDEIKVGQATGQSLSTRKQSLAFEAHDKLLDKLVWSGST---------AHGIPSVFDYP 146 (296) T ss_pred eccceeEEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEEeecc---------cccceeEeecC Confidence 33334444444332 23444567777665 467787888888999999999987763211 111112110 1 Q ss_pred ceeecccccccccchhhhHHHHHHHHHHHHHHhhhc--CCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEE Q lcl|NC_015249. 162 HVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGN--YVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIR 239 (347) Q Consensus 162 ~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~--~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg 239 (347) ++....+.. +. ..++.+++.|.++...|.++ .+= ..-.++|+|+.|..|...- .+.+..-..-+++ T Consensus 147 ~v~~~~~~~--~W---~~~t~i~~Di~~~~~~l~~~s~g~~-~p~~l~L~p~~~~~L~~~~--~~~~~t~l~~ik~---- 214 (296) T protein:vir:10 147 NINNVVSGG--SW---SQPTTAVSDITSLLDIIETSTNGQH-RATHLLLPTTARRIMQNLV--PGTSVSYGEFFRQ---- 214 (296) T ss_pred CCccccccC--Cc---cCHHHHHHHHHHHHHHHHHhhCcee-cceeEEeCHHHHHHHhhcc--CCCCccHHHHHHH---- Confidence 110011111 11 22346788898888777654 221 1237888999999886321 1111100111111 Q ss_pred EEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEec--hhhhhhhhhcceeeeeeechh Q lcl|NC_015249. 240 NVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNH--RSAVGTVKLKDMALERARRAN 317 (347) Q Consensus 240 ~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~--~~Av~~v~~~~~~~e~~~d~~ 317 (347) +..+.+|...+.|...+++ + +..++++. ++-+...-.++++.- ...++ T Consensus 215 ~~~~l~i~~~~~l~~a~~~-----------g------------------~~~~v~~~~~~~~~~~~v~~~~~~~-~~e~~ 264 (296) T protein:vir:10 215 NNSGVTVEFVQYLNDYNGT-----------G------------------TSAAIAYEKDPNNMAIEIPEATNAL-PAQPK 264 (296) T ss_pred hcCCceEEEeeeeccCCCC-----------c------------------ceEEEEEEcCCceEEEEcCcceeee-ccccc Confidence 2245566666666421110 0 11223332 222222223343332 13444 Q ss_pred hhcceeeeeeee-cccccccceEEEE---EEc Q lcl|NC_015249. 
318 FQADQIIAKYAM-GHGGLRPEACGAL---VFN 345 (347) Q Consensus 318 ~~~d~i~~~~a~-G~~~~Rpe~a~~i---~~~ 345 (347) ...+.+....+. |.-+.||++++.+ .+. T Consensus 265 ~l~~~~~~~~~~~Gv~i~~P~ai~~~dGI~~~ 296 (296) T protein:vir:10 265 DLHFKIPVTSKATGLIVYRPLTMAVMKGITFA 296 (296) T ss_pred CceEEEeeEeeEEEEEEECCceeEEEeeeecC Confidence 455566666666 5789999988866 444 No 186 >protein:vir:4786 Length: 295 # NCBI annotation: hypothetical protein # Family: family:all:3269 # MgeID: mge:104 # MgeName: MM1 # Cross-refs: genbank:acc:NP_150166;swissprot:trembl:q94m45;genbank:gi:15088777;uniprot:Q94M45;genbank:GeneID:955980 Probab=95.91 E-value=0.00021 Score=40.74 Aligned_cols=275 Identities=15% Similarity=0.101 Sum_probs=111.6 Q ss_pred ccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccc-c---cccccceEEEeecCcc--eeeeeecCC Q lcl|NC_015249. 4 MNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLV-R---SIQSGKSAQFPVLGRT--KAAYLQPGE 77 (347) Q Consensus 4 ~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~-r---~i~~G~tv~i~~iG~~--~~~~~~~g~ 77 (347) |+.-+|.. .-.|-|+|.|-+.+-|+.++.|++..-- + -+.+.++..--+...+ .++.|..++ T Consensus 1 mp~N~n~a------------vr~Y~Kqf~glL~~vf~~qa~F~~~FGglQalDGV~~N~tafsvKt~D~pVVig~Y~Tde 68 (295) T protein:vir:47 1 MPSNQNNA------------VRRYEKQYAGILETVFGVRAAFSNALAPIQILDGVQENSKAFSVKTNNTPVVIGEYKTGE 68 (295) T ss_pred CCCCCCcc------------chhhhHHHHHHHHHHHhHHHHHhhhhcchhhhhCCCccceEEEEeecCcceEeecccCCC Confidence 22211111 1248899999999999999998855432 1 1222222222222211 223344444 Q ss_pred CCC----CccCCCCCce--EEEEEEeee-e-ccccc-ccHHHHHhChhhH---HHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015249. 78 NLD----DKRKDMKHTE--RTINIDGLL-T-ADVLI-YDIEDAMNHYDVR---SEYTAQLGESLAMAADGAVLAEMAKLC 145 (347) Q Consensus 78 ~~~----~~~~~~~~~~--~~l~ID~~~-~-~~~~I-dd~D~~q~~~D~r---~~~~~~~g~aLa~~~D~~i~~~~~~~a 145 (347) ..- ++.+.-.-.+ -.+.+|+.. 
| |.+.| .-+|..-.+-|+- .+..+.++.|-++.+|..+-..|...+ T Consensus 69 NvagFGtGTg~SsRFG~rkEi~y~dtdV~Y~~~~~iHEGiD~~TVNnd~~aaVAdRL~LQA~Akt~~~n~~~Gk~ls~~A 148 (295) T protein:vir:47 69 NDGGFGDNSGAQSRFGGVTEVKYENTDVNYDYTLTIHEGLDRYTVNNDLNAAVADRLKLQSEAQTRTVNKRIGKYLSDTA 148 (295) T ss_pred cccccccCCccccccCceeeEEeecccccccccchhhhccccccccCChhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Confidence 432 1111111111 122233221 2 22222 3455555444443 445566788888888875533222111 Q ss_pred hhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhh Q lcl|NC_015249. 146 NLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAA 225 (347) Q Consensus 146 ~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~ 225 (347) . . +..-++.+. +.+-..+-.+..+.....|-..- -+.|.|+.|.+|..++-.++. T Consensus 149 ------~-~-----------te~~td~t~------d~V~~LF~~as~~yvn~ev~~~~-~AyV~~evYnaiiD~~l~Tsa 203 (295) T protein:vir:47 149 ------T-K-----------TEALADFTD------DKVKALFNKLSAFYTNNEVTAPI-TVYLRSEFYNAIVDMASVTSA 203 (295) T ss_pred ------h-h-----------hhhhhcccc------hhHHHHHHHHHHHhhhhheeeee-EEEEchhHHHHHhcccccccc Confidence 0 0 000011111 12223344555666666664333 399999999999998866654 Q ss_pred hhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccc-cc-cccccceEEEEechhhhhhh Q lcl|NC_015249. 226 NYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSG-DT-RVALDNVVGLFNHRSAVGTV 303 (347) Q Consensus 226 ~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~y-~~~~~~~~~l~~~~~Av~~v 303 (347) --.+-+.-.+| |-++-||.+-|.|.--..++--.. +.....+-++..-... .. 
.+||+.+ .++-- +-.+ T Consensus 204 K~SsaNiDeng-i~~FkGf~i~e~P~~~~q~G~~ai----fs~dnig~aftGIn~aR~IesEdF~GV---alQ~~-~~~~ 274 (295) T protein:vir:47 204 KGATISLDENG-LPKYKGFTLEETPAQYFETGVIAI----FSPNGIIIPFVGISTARVIEAENFDGV---NCKLL-LRVV 274 (295) T ss_pred ccceeeeccCC-cceecceEEEeccHhhccCCcEEE----Eccccceeecccceeeeeeecccccch---HHHHH-HHHH Confidence 33333333455 568899999886544332211000 0001111111100000 00 1122222 11110 0000 Q ss_pred hhcceeeeeeechhhhcceeeeeeeecccccc Q lcl|NC_015249. 304 KLKDMALERARRANFQADQIIAKYAMGHGGLR 335 (347) Q Consensus 304 ~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~R 335 (347) -..-+++...+-+... -+ | +| T Consensus 275 ~~~~~~~~~~~~~~~~-~~----~------~~ 295 (295) T protein:vir:47 275 LTLLMTIRKQFTKLQE-LL----Y------RR 295 (295) T ss_pred HHHHHHHHHHHHHHHH-Hh----h------cC Confidence 0000111111100000 00 0 00 No 187 >protein:vir:80068 Length: 301 # NCBI annotation: gp8 # Family: family:all:463 # MgeID: mge:1876 # MgeName: B054 # Cross-refs: genbank:acc:YP_001468712;genbank:gi:157325292;genbank:GeneID:5601759 Probab=95.84 E-value=0.0014 Score=36.27 Aligned_cols=279 Identities=15% Similarity=0.102 Sum_probs=128.9 Q ss_pred cccchhhhhhhhhhhHH----HHHHHHHHhhhccccccccc--ccceEEEeecCcc-eeeeeecC-CCCCCccCCCCCce Q lcl|NC_015249. 19 SAGDKLALFLKVFGGEV----LTAFTRTSVTMNKHLVRSIQ--SGKSAQFPVLGRT-KAAYLQPG-ENLDDKRKDMKHTE 90 (347) Q Consensus 19 ~~~d~~al~ie~f~g~V----~~~f~~~s~~~~~~~~r~i~--~G~tv~i~~iG~~-~~~~~~~g-~~~~~~~~~~~~~~ 90 (347) --+|..+.|+.++...+ .+.....-..+.++.+++-- +..++.++..-.+ .++-+..+ .+++. .+..-.+ T Consensus 1 ~~~~~~g~f~~~~l~~id~~v~e~~~~~l~~r~l~~v~~~~~~~~~~~~~~~~~~~G~~~~~~~~~~dip~--~~~~~~~ 78 (301) T protein:vir:80 1 MQGKITATIEARDLQAIDNVIYEPKQEELTARSVFPQKFDVNEGAESYSFDVMTRSGAAKIIANGADDLPL--VDVDMVR 78 (301) T ss_pred CCccccchhhHHHHHHHHHHHHHhhhhhhhhhhhcccccCCCCceEEEEEeeeccceeEEEecCccccccc--cccccee Confidence 22233334554444444 44444444567777665432 2455665544222 23333322 23332 2233344 Q ss_pred EEEEEEee-eecccccccHHHHH-hChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccccc-Ccceee-- Q lcl|NC_015249. 
91 RTINIDGL-LTADVLIYDIEDAM-NHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLG-KAHVLE-- 165 (347) Q Consensus 91 ~~l~ID~~-~~~~~~Idd~D~~q-~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~-~g~~i~-- 165 (347) ....|-.. .-|.+.+.+++.++ ...++-..-...+..++++..|+.+|.-..+. ...|+- ..++.. T Consensus 79 ~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aa~~~~~~~~n~~~f~G~~~~---------g~~GLlN~p~~~~~~ 149 (301) T protein:vir:80 79 KSVPIYSIGIGLSYTIQDLRAARMQGTTVDAAKATTVRRAIAEKENSIAFRGEKKY---------AIKGAFEATGIQIDV 149 (301) T ss_pred EEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEeeecccc---------cceeeecCCCccccc Confidence 44444432 23455566777774 67778888888999999999999876432211 111100 011000 Q ss_pred ---cccccccccchhhhHHHHHHHHHHHHHHhhhc--CCCCCCCEEEeCHHHHHHHhcchhhhhh-hhccccccccceEE Q lcl|NC_015249. 166 ---VGKQSELRGDQVKLGQAIIAQLTLARAKLTGN--YVPSADRVFYTTPDNYSAILAALMPNAA-NYQALIDPSTGSIR 239 (347) Q Consensus 166 ---~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~--~VP~~gR~~vv~P~~~~~Ll~~~~~~~~-~~~~~~~~~~G~Vg 239 (347) .++... ..=..++++.+++.|.++..+|.++ .+ ...-.++|+|+.|..|..- +..+. +..-..-++. T Consensus 150 ~~~~~~~~~-~~w~~~t~~ei~~di~~~~~~l~~~s~g~-~~p~~L~L~p~~~~~L~~~-~~~~~~~~tvl~~l~~---- 222 (301) T protein:vir:80 150 SPTTGVGNV-SKWEKKTAEQIIDEIGEAHTKITVLPGYG-TASLKLCLPPKQFELINKK-RYSNEDSRSVLKVLQD---- 222 (301) T ss_pred ccCcccccc-cccccCCHHHHHHHHHHHHHHHHHhcCce-ecccEEEecHHHHHhhhhc-cccCCCCeeHHHHHHH---- Confidence 011111 1113456788999999999998764 22 1124799999999999621 11000 0000011111 Q ss_pred EEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEech--hhhhhhhhcceeeeeeechh Q lcl|NC_015249. 240 NVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHR--SAVGTVKLKDMALERARRAN 317 (347) Q Consensus 240 ~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~--~Av~~v~~~~~~~e~~~d~~ 317 (347) +.-+.+|...+.|...+.. + +.+++++.. +-+-..-.++++.-. 
-.++ T Consensus 223 ~~~~~~I~~~p~L~~~g~~-----------g------------------~~~~v~~~~~~d~~~~~v~~~~~~~~-~e~~ 272 (301) T protein:vir:80 223 NAWFSAIVRVPDLAGMGTA-----------G------------------SDSFAVIHDSNETAELIIPMDITRHP-EEYS 272 (301) T ss_pred HcCcceEEEcceeccCCCC-----------c------------------ccEEEEEecCCcEEEEEecCceeeec-ceec Confidence 1223466666666421100 0 112222221 211111122322111 1111 Q ss_pred hhcceeeeeeee-cccccccceEEEEEEc Q lcl|NC_015249. 318 FQADQIIAKYAM-GHGGLRPEACGALVFN 345 (347) Q Consensus 318 ~~~d~i~~~~a~-G~~~~Rpe~a~~i~~~ 345 (347) -..+.+....+. |.-+.||++++.+.== T Consensus 273 ~~~~~~~~~~r~~Gv~i~~P~ai~~~~GI 301 (301) T protein:vir:80 273 FPRTKVPFEERTAGVVVRFPAAIVRVDGI 301 (301) T ss_pred CceeEeeeeeeeEEEEEEccceEEEEecC Confidence 122334444555 5688899976643211 No 188 >protein:vir:4074 Length: 480 # NCBI annotation: major capsid (head) protein # Family: family:all:11745 # MgeID: mge:85 # MgeName: c2 # Cross-refs: genbank:acc:NP_043553;genbank:gi:9628687;genbank:GeneID:1261180 Probab=95.51 E-value=0.0019 Score=35.47 Aligned_cols=280 Identities=10% Similarity=0.026 Sum_probs=108.1 Q ss_pred CCccccccccc------cc-ccccccccchhhhhhhhhhhHH-HHHHH-HHHhhhcccccccccc--------cceEEEe Q lcl|NC_015249. 1 MAKMNGGQQIG------KD-QGKGMSAGDKLALFLKVFGGEV-LTAFT-RTSVTMNKHLVRSIQS--------GKSAQFP 63 (347) Q Consensus 1 ma~~~~~~~~~------t~-~~~~~~~~d~~al~ie~f~g~V-~~~f~-~~s~~~~~~~~r~i~~--------G~tv~i~ 63 (347) |-+........ .| .+....+.+ ..-|++...... ...-. -.++.+.+.+...+.. ..++ . T Consensus 171 ~~~~~~~~~~~~~~~~e~r~~~~~~~~~~-e~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~ 247 (480) T protein:vir:40 171 REASIPSEKPEDAERKFMRELGSKMAEMP-EQGFLREFANGADLNVVNSLGSITSKYARKSGIYDGAMKARFQGLTL--A 247 (480) T ss_pred hhhhccccchhhhhhHHHHHHHHHhccch-hhhhhhhhhhhccccccccccccccchhhheeechhhhhhhhhccee--e Confidence 11100000000 00 000000000 000000000000 00000 0011111111111100 0011 1 Q ss_pred ecCcce---e-eeeecCCCCCCccCCCCCceEEEEEEee--eec---ccccccHHHHHhChhhHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 
64 VLGRTK---A-AYLQPGENLDDKRKDMKHTERTINIDGL--LTA---DVLIYDIEDAMNHYDVRSEYTAQLGESLAMAAD 134 (347) Q Consensus 64 ~iG~~~---~-~~~~~g~~~~~~~~~~~~~~~~l~ID~~--~~~---~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D 134 (347) ..|... + .....+.... ....+...+ .++. ++. ......+|+ ..++.+.+..+.++.|+++.+ T Consensus 248 ~~g~~~~~~~~e~~~~~~~~~----~~~~~~~~~-~~~~v~~l~~~~k~t~~lLDD---a~~l~~~i~~~l~~~~~~~ee 319 (480) T protein:vir:40 248 EDGVDDTFISGTFKAGTDKNK----SQTATKRSL-RPQMAEAYLQMDKATVRGVND---SGALSEYVMSEMVNRVIQKVE 319 (480) T ss_pred eccccceeeeeeeeccccccc----ccccccchh-hHHHHHHHHHhHHHHHHHhhh---hHHHHHHHHHHHHHHHHHHHH Confidence 111110 0 1111111100 000011111 1110 111 111112222 235788889999999999998 Q ss_pred HHHHHHHHHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHH Q lcl|NC_015249. 135 GAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYS 214 (347) Q Consensus 135 ~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~ 214 (347) +.++..- ... .. ...+ +.+..+.. .....+...++.|+.+..+-..++- -++|++|..+. T Consensus 320 ~a~l~G~--------g~g--~~-~~~g--~~~~~~~~---~~~~~~~d~id~L~~al~~~y~~~a----~~~vmn~~t~~ 379 (480) T protein:vir:40 320 YNMILGS--------VDG--SN-GFYG--LKTATDGW---TKQIEYTDLFEGITDAVAECSISDA----ITIVMSPQTFA 379 (480) T ss_pred HHhhccC--------CCC--cc-cccc--ceeecccc---cccchhHHHHHHHHHhhhHHhhCCC----CEEEECHHHHH Confidence 8775310 000 00 0111 11111111 1112234445545544433222221 15789999999 Q ss_pred HHhcchhhhhhhhccccccccceEEEEeceEEEEecce-ecccccccccccccccccccccccccccccccccccceEEE Q lcl|NC_015249. 215 AILAALMPNAANYQALIDPSTGSIRNVMGFEVIEVPHL-TAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGL 293 (347) Q Consensus 215 ~Ll~~~~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~l-p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l 293 (347) .|.+-. -.++.|.=+..+..|....+.|++|++++.. |.... ... .+. 
...+ T Consensus 380 ~I~klK-D~~G~Yi~q~~~~~~~~~~llG~pvv~~~~~~~~~~~----~~~---------------------~~~-~~~~ 432 (480) T protein:vir:40 380 ELRKAK-GTDGHSRFNELATKEQIAQSFGAVNLETRVWMPKDEV----AVY---------------------NHD-EYVL 432 (480) T ss_pred HHHHhh-cCCCCeeccCcccccCcceecccceeeeeccccCCcc----eee---------------------eCC-ccEE Confidence 886543 4456676666778899999999998876433 21100 000 000 1123 Q ss_pred EechhhhhhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 294 FNHRSAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 294 ~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) ++.++ ......++.++-...+....+.|..+.||+++..+..+.. T Consensus 433 ~~d~~---------~~~~~~~~~~~~~~~~~~e~~v~g~~~~~~~~~~~~~~~~ 477 (480) T protein:vir:40 433 IGDLN---------VENYNDFDLRYNVEQWLSETLVGGSIRGKNRSAYLKKKGS 477 (480) T ss_pred EEecc---------cceecccccccchhhhhhhhhhceeeEccccEEEEEeccC Confidence 44332 1111223444555677788888999999997777777666 No 189 >protein:vir:95512 Length: 693 # NCBI annotation: Putative Clp protease # Family: family:all:62 # ACLAME annotation(s): go:0008236 - serine-type peptidase activity; phi:0000017 - phage prohead/capsid assembly # MgeID: mge:1574 # MgeName: F10 # Cross-refs: genbank:acc:YP_001293349;genbank:gi:148912770;genbank:GeneID:5228164 Probab=95.00 E-value=0.003 Score=34.43 Aligned_cols=298 Identities=13% Similarity=0.056 Sum_probs=129.5 Q ss_pred CCc------------ccccccccccccccccccchhhhhhhhhhhHHHHHHHHH-HhhhcccccccccccceEEEeecCc Q lcl|NC_015249. 1 MAK------------MNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRT-SVTMNKHLVRSIQSGKSAQFPVLGR 67 (347) Q Consensus 1 ma~------------~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~-s~~~~~~~~r~i~~G~tv~i~~iG~ 67 (347) ||- ++.- .+..|- -.+.++|==.|.-......++..|+.. +-++.|.+.++++-=+..+..++|. 
T Consensus 371 lAr~~L~~rg~~~~~~~~~-~~~~~a-~~htTSDFp~IL~~~~nk~l~~~y~~a~~t~~~~~~~~~~~DFk~~~~~~lg~ 448 (693) T protein:vir:95 371 LARASLVDRGIGVASLNAP-QMVGLA-FTHTSSDFGLILLDVANKSVLAGWEEAEETFPLWTKSGILTDFKPARRVGLGE 448 (693) T ss_pred HHHHHHHhcCCccCCCCHH-HHHHHH-HhcCcchhHHHHHHHHHHHHHHHHHhhhhHHHHHhccCCCCcccccceeecCC Confidence 221 1100 000010 013455533244455566777777755 4567777766655444444444444 Q ss_pred c-eeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_015249. 68 T-KAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCN 146 (347) Q Consensus 68 ~-~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~ 146 (347) . ++.....|.+.. ...+....-++.+.++- --|.|..-.-.=-+.+....+....|++-++..++.++..+..-.. T Consensus 449 ~~~L~~V~E~gEyk--~~t~~e~~e~~~l~tyG-~~~~iTRqaiINDDLga~~~ip~~~g~aA~~~~~~~vy~~L~~Np~ 525 (693) T protein:vir:95 449 FSSLRQVREGAEYK--YVTLGERGEQIILATYG-ELFSITRQAIINDDLQMLSDIPFKLGQAAKATIGDLVYAVLTGNPA 525 (693) T ss_pred CCChhhcCCCCcee--eeecCCccceeehhhcC-CeeeecHHhhhccchHHHHHHHHHHHHHHHHHHHHHHHHHHhcCcc Confidence 3 333333333321 11233333344444331 1111111111111223455677788999999999999876642111 Q ss_pred hccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcC----------CCCCCCEEEeCHHHHHHH Q lcl|NC_015249. 147 LPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNY----------VPSADRVFYTTPDNYSAI 216 (347) Q Consensus 147 ~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~----------VP~~gR~~vv~P~~~~~L 216 (347) +.+ +....-.+++++...+ +.. .. ++.|-.++..|..+. +--.++|+||||+..... 
T Consensus 526 m~D--Gk~LFhadH~Nl~tga-~sa------ls----~~sl~~a~~am~~qk~~~~~~~g~~L~i~P~~llvP~~le~~a 592 (693) T protein:vir:95 526 MSD--GKTLFHADHSNLLTGA-ASA------LS----IDSLSKAKTQMATQKAQVEKGKGRTLNIRPGFVLTPVALEDKA 592 (693) T ss_pred ccC--Ccceeecccccccccc-ccc------cC----hHHHHHHHHHHHHhhcchhccCCceeecccceEEecchHHHHH Confidence 111 1111112223322111 111 11 122333333332221 112368999999877654 Q ss_pred hcchhhhhhhhccccccccceEEEEece-EEEEecceecccccccccccccccccccccccccccccccccccceEEEEe Q lcl|NC_015249. 217 LAALMPNAANYQALIDPSTGSIRNVMGF-EVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFN 295 (347) Q Consensus 217 l~~~~~~~~~~~~~~~~~~G~Vg~i~G~-~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~ 295 (347) . ++++..+.-......|.+--+.|+ +|+..++|...+.+.|...+..... .+=+.| T Consensus 593 ~---~l~~s~~~~~a~~~~~~~NP~~~~~~vi~~prL~~~s~~~Wyl~a~~~~d--------------------tie~~y 649 (693) T protein:vir:95 593 N---QIINSESVPGADVNSGIVNPIRAFAQVIGEPRLDDASATAWYMAAKKGSD--------------------TIEVAY 649 (693) T ss_pred H---HHhccccccccccccccccchhccccccccceecCCCCCceEEecCCCCC--------------------eEEEEE Confidence 3 233322222222334444445564 7888999976555555543221111 111111 Q ss_pred chhhhhhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 296 HRSAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 296 ~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) =. + ...+.+|....-...|=.+..++=||++++..-++. 
+-+.| T Consensus 650 L~-----G-~~~P~ie~~~gf~~dG~~~kvr~D~G~~~iD~Rg~~--kn~GA 693 (693) T protein:vir:95 650 LD-----G-VDTPYLEQQEGFTVDGVASKVRIDAGVAPLDFRGLQ--KSNGA 693 (693) T ss_pred ec-----C-CCCCeEeecCCCCcceEEEEEEEeccCceeeccccc--cCCCC Confidence 00 0 122445544333333444455666788777655433 55555 No 190 >protein:vir:98871 Length: 314 # NCBI annotation: major capsid protein # Family: family:all:3269 # MgeID: mge:1568 # MgeName: BCJA1c # Cross-refs: genbank:acc:YP_164418;genbank:gi:56694908;genbank:GeneID:3197261 Probab=94.65 E-value=0.0038 Score=33.82 Aligned_cols=266 Identities=12% Similarity=0.088 Sum_probs=112.3 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccccc--c---cccccceEEEeecCcc--ee-ee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLV--R---SIQSGKSAQFPVLGRT--KA-AY 72 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~--r---~i~~G~tv~i~~iG~~--~~-~~ 72 (347) +-|++-.+..-++.+. +.-.|-|+|.|-+.+-|+.++.|++..-- + -+.+..+..--+...+ .+ +- T Consensus 11 ~~~~~~~~~~t~N~n~------avr~Y~Kqf~glL~~vf~~qa~F~~~FGg~lQalDGV~~N~tafsvKtsD~pVVig~~ 84 (314) T protein:vir:98 11 LNNIQFFASGTANQNK------AARSYQKEFRQLLQAVFRSQAYFRDFFGGGIEALDGVQHNDTAFYVKTSDIPVVVGNE 84 (314) T ss_pred ccceeeeeeccccCcc------ceeeecHHHHHHHHHHHhhHhhhhhhcccceeeccCCCccceEEEEeecccceeecCc Confidence 4555543322111111 12258899999999999999998865432 1 1222222221111111 11 11 Q ss_pred eecCCCC---CCccCCCCCc--eEEEEEEee-ee-ccccc-ccHHHHHhChhh---HHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 73 LQPGENL---DDKRKDMKHT--ERTINIDGL-LT-ADVLI-YDIEDAMNHYDV---RSEYTAQLGESLAMAADGAVLAEM 141 (347) Q Consensus 73 ~~~g~~~---~~~~~~~~~~--~~~l~ID~~-~~-~~~~I-dd~D~~q~~~D~---r~~~~~~~g~aLa~~~D~~i~~~~ 141 (347) |..++.. .++.+.-.-. +-.+.+|.. 
.| |.+.| .-+|..-.+-|+ ..+..+.++.|-++.+|..+-..| T Consensus 85 Y~TdeNvaFGtGTg~SsRFGprkEi~y~dtdVpY~~~~~iHEGiD~~TVNnd~~aaVAdRL~LQA~Akt~~~n~~~Gk~l 164 (314) T protein:vir:98 85 YNKDENVGFGEGTSRSTRFGPRREIIYQDTPVPYTWEWVYHEGIDKHTVNNDFQAAVADRLDLQANAKIKQFNAQHSKFI 164 (314) T ss_pred ccCCCCcccccCCccccccCceeEEEeecccccccccchhhhccccccccCChhHHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 1212110 0000100001 112223322 11 12222 344544444443 344556678888888887553222 Q ss_pred HHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchh Q lcl|NC_015249. 142 AKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALM 221 (347) Q Consensus 142 ~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~ 221 (347) ...+ +.+...++ ...+.+...+-.+..+....+|- ....+.|.|+.|.+|..++- T Consensus 165 S~~A----------------------s~te~ltd--~~~d~V~~LF~~as~~yvn~ev~-~~~~AyV~~evYnaiiD~~l 219 (314) T protein:vir:98 165 SSIA----------------------EKTETLTD--YSADNVLRLFNELSKYYVNIEAI-GTKAAKVSPELYNAIVDHPL 219 (314) T ss_pred Hhhh----------------------hhhhhhhh--cchhhHHHHHHHHHhhhhcceee-EEEEEEEchhHHhHhhcccc Confidence 2111 00000000 00111122222334444444442 24788999999999999886 Q ss_pred hhhhhhccccccccceEEEEeceEEEEecceec-----------------cccccccccccccccccccccccccccccc Q lcl|NC_015249. 222 PNAANYQALIDPSTGSIRNVMGFEVIEVPHLTA-----------------GGAGEDRPEEGANPTGQKHAFPETSSGDTR 284 (347) Q Consensus 222 ~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~-----------------~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~ 284 (347) .++.--.+-+.-.+| |.++-||.|-|.|.--. .++...+.....+..|.+++.+ ...|.|= T Consensus 220 ~TsaK~SsaNIDeng-i~~FkGf~i~e~P~~~~q~g~ia~~s~dnig~aftGIn~aR~IesEdF~GValQgA-GK~G~~I 297 (314) T protein:vir:98 220 TTSAKSSSANIDQNG-IVNFKGFAIQEIPESMLQSGDVAYTYITNIGKAFTGINTSRIIESEDFDGVALQGA-GKAGEFI 297 (314) T ss_pred ccccccceeeeccCC-cceecceEEEecchhhcCCCcEEEEccccceeecccceeeeeeecccccceeeecc-ccccccc Confidence 665433333333455 56888888877553221 1222223333333344333321 1222222 Q ss_pred ccccceEE--EEechhh Q lcl|NC_015249. 
285 VALDNVVG--LFNHRSA 299 (347) Q Consensus 285 ~~~~~~~~--l~~~~~A 299 (347) .+.-|..- +--.|++ T Consensus 298 ~edNk~Ai~k~t~tp~~ 314 (314) T protein:vir:98 298 LDDNKKAVAKVTSTPEG 314 (314) T ss_pred ccccceeeEEEecCCCC Confidence 22211111 1112222 No 191 >protein:vir:79548 Length: 652 # NCBI annotation: putative protease/scaffold protein # Family: family:all:62 # ACLAME annotation(s): go:0008236 - serine-type peptidase activity; phi:0000017 - phage prohead/capsid assembly # MgeID: mge:1871 # MgeName: cdtI # Cross-refs: genbank:acc:YP_001272518;genbank:gi:148609387;genbank:GeneID:5204384 Probab=94.60 E-value=0.004 Score=33.74 Aligned_cols=294 Identities=13% Similarity=0.072 Sum_probs=132.6 Q ss_pred CCccc---ccccccc-----cccc--cccccchhhhhhhhhhhHHHHHHHHH-HhhhcccccccccccceEEEeecCcc- Q lcl|NC_015249. 1 MAKMN---GGQQIGK-----DQGK--GMSAGDKLALFLKVFGGEVLTAFTRT-SVTMNKHLVRSIQSGKSAQFPVLGRT- 68 (347) Q Consensus 1 ma~~~---~~~~~~t-----~~~~--~~~~~d~~al~ie~f~g~V~~~f~~~-s~~~~~~~~r~i~~G~tv~i~~iG~~- 68 (347) ||-.. -|..+.+ -.+. .++++|==.|....-...++..|+.. .-++.|.+.++++-=|..+..++|.. T Consensus 336 lAr~~L~~~G~~~~~~~~~~~v~~A~~hsTsDFp~IL~~~~nk~l~~~y~~a~~t~~~~~~~~~~~DFk~~~~~~lg~~~ 415 (652) T protein:vir:79 336 YARMSLTERGIGVSSYNPMQMVGAAFTHSTSDFGNILLDVANKAILQGWEDAPETYEQWTRKGQLSDFKIAHRVGMGGFS 415 (652) T ss_pred HHHHHHHhhccCCCCCCHHHHHHHHhhcCcchHHHHHHHHHHHHHHHHHhhhHHHHHHHhccCCCccccccceeecCCCC Confidence 22111 0000000 0001 13455533233444455667777655 46777777776655444444445433 Q ss_pred eeeeeecCCCCCCccCCCCCceEEEEEEeee-----eccccc-ccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 69 KAAYLQPGENLDDKRKDMKHTERTINIDGLL-----TADVLI-YDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMA 142 (347) Q Consensus 69 ~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~-----~~~~~I-dd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~ 142 (347) ++.....|.++- ...+....-++.+.++- .-...| ||++ ....+....|++-++..++.++..+. 
T Consensus 416 ~L~~V~E~gEyk--~~t~~e~~e~~~l~tyG~~~~iTRqaiINDDL~-------a~~~ip~~~g~aA~~~~~~~vy~~l~ 486 (652) T protein:vir:79 416 ALRQVREGAEYK--YVTTGDKQATIALATYGELFSITRQAIINDDLN-------MLTDVPMKLGRAAKSTIADLVYAILT 486 (652) T ss_pred CccccCCCCccc--eeeecCccceeeeecccCeeeeehheeeccchh-------HHHHHHHHHHHHHHHHHHHHHHHHHh Confidence 444444444432 22344455566665531 111122 4454 44567778888888999998887664 Q ss_pred HHhhhccccccc-cccc-cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcC--CCCCCCEEEeCHHHHHHHhc Q lcl|NC_015249. 143 KLCNLPSASDEN-IAGL-GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNY--VPSADRVFYTTPDNYSAILA 218 (347) Q Consensus 143 ~~a~~~~~~~~~-~~~~-~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~--VP~~gR~~vv~P~~~~~Ll~ 218 (347) .-..+. .+.+ ..++ .++++...+ . .... .+++-+.++.+-.+-+ +--.+||++|||+......+ T Consensus 487 ~Np~~~--~DGk~LF~hA~H~Nl~~~a---a------~~~~-~l~~ar~aM~~Qk~g~~~l~i~P~~llvp~~le~~a~~ 554 (652) T protein:vir:79 487 SNPKIS--TDNVSLFDKAKHANVLESA---A------MDVA-SLDKARQLMRVQKEGERHLNIRPAFVLVPTAMESVANQ 554 (652) T ss_pred cCcccc--cCCceeecccccccccccc---c------CCHH-HHHHHHHHHHHhccCCccccccccEEEecchhHHHHHH Confidence 211111 0111 1111 233332211 1 1111 1222222222222222 22347999999997654422 Q ss_pred chhhhhhhhccccccccceEEEEece-EEEEecceecccccccccccccccccccccccccccccccccccceEEEEech Q lcl|NC_015249. 219 ALMPNAANYQALIDPSTGSIRNVMGF-EVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHR 297 (347) Q Consensus 219 ~~~~~~~~~~~~~~~~~G~Vg~i~G~-~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~ 297 (347) +++...........|.+--+.|+ +|+..++|...+.+.+......... .+=+.|=- T Consensus 555 ---ll~s~~v~~a~~~~~~~Np~~~~~~~i~eprL~~~s~~~wylaa~~~~d--------------------tiev~yL~ 611 (652) T protein:vir:79 555 ---VIRSSSVKGADINAGIINPVKDFATVIAEPRLDDNSQTTFYLAASKGSD--------------------TIEVAYLN 611 (652) T ss_pred ---HhccCCCcccccccccccccccccccccccccCCCCcccEEEecCCCCC--------------------eEEEEEec Confidence 22221211112223444445554 8889999975444443333221111 11111100 Q ss_pred hhhhhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcC Q lcl|NC_015249. 
298 SAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNK 346 (347) Q Consensus 298 ~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~ 346 (347) + ...|.+|....-.-.|=.+..++=||++++..-+++ +..+ T Consensus 612 -----G-~~~P~ie~~~gf~~dG~~~kvrlD~G~~~iD~RG~~--k~t~ 652 (652) T protein:vir:79 612 -----G-VDTPYIDQMEGFSVDGVTTKVRIDAGVAPVDHRGLV--KCTA 652 (652) T ss_pred -----C-CCCCeeeecCCCCcceEEEEEEEeccCceeecccee--eecC Confidence 0 223455555333333555666777888888777655 4444 No 192 >protein:vir:96079 Length: 382 # NCBI annotation: hypothetical protein ORF023 # Family: family:all:1653 # MgeID: mge:1597 # MgeName: F8 # Cross-refs: genbank:acc:YP_001294440;genbank:gi:149408337;genbank:GeneID:5237198 Probab=93.08 E-value=0.0089 Score=31.84 Aligned_cols=278 Identities=10% Similarity=0.002 Sum_probs=119.1 Q ss_pred CC-------------cccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccc---cceEEEee Q lcl|NC_015249. 1 MA-------------KMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQS---GKSAQFPV 64 (347) Q Consensus 1 ma-------------~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~---G~tv~i~~ 64 (347) |+ +.++ ..|.++-+ ..+-|++-|...+.+.-..--+.+.++.+.+ ++ -+++.|+. T Consensus 51 ~~~~~~~~~~~amDa~~~~---~~t~~~~g-----~p~~~l~~~~p~~~~~~~~p~~~~~l~pv~t-~g~W~~~t~ty~~ 121 (382) T protein:vir:96 51 LAKAGAFRSGSAMDSNFTA---PVTTPSIP-----TPIQFLQTWLPGFVKVMTAARKIDEIIGIDT-VGSWEDQEIVQGI 121 (382) T ss_pred hhhhhhhhhhcccccccCC---ccccCCcc-----HHHHHHhhhhhhhhhhhhhhhhhhhhccccc-cCCccceEEEEee Confidence 11 1111 11222111 3555777776544332222234445555544 22 25666654 Q ss_pred ---cCcceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHHH---hChhhHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 
65 ---LGRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDAM---NHYDVRSEYTAQLGESLAMAADGAVL 138 (347) Q Consensus 65 ---iG~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q---~~~D~r~~~~~~~g~aLa~~~D~~i~ 138 (347) +|..++ |..+++.|-...+.+-.++++.+=+ ..+.+.++++.+ +.+|+-++-...+..+|.+..|+..| T Consensus 122 ~e~~G~A~~--ygd~~D~Pl~d~~~~~~~r~v~~~~---~g~~yg~lE~~rAa~~~~~l~~~Ka~aA~~ale~~~N~i~f 196 (382) T protein:vir:96 122 VEPAGTAVE--YGDHTNIPLTSWNANFERRTIVRGE---LGLLVGTLEEGRASAIRLNSAETKRQQAAIGLEIFRNAIGF 196 (382) T ss_pred eecccceEE--eecccCCCccccccceeEEEEEEEE---EeeeecHHHHHHHHhhCCCcHHHHHHHHHHHHHHhhceEEE Confidence 566653 3445555333333444445554433 234455666655 57888888777888888888887544 Q ss_pred HHHHHHhhhcccccccccc-ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCC----CCC-CCEEEeCHHH Q lcl|NC_015249. 139 AEMAKLCNLPSASDENIAG-LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYV----PSA-DRVFYTTPDN 212 (347) Q Consensus 139 ~~~~~~a~~~~~~~~~~~~-~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~V----P~~-gR~~vv~P~~ 212 (347) .-. .+.......| +...++....++.. ..-..++.+.+++.|..+...|....- |.. ...++|||.. T Consensus 197 ~G~------~~g~~~~~yGllNdP~l~a~~t~a~-~~Wa~kT~~eI~~Di~~l~~~i~~qt~G~~~~~~~~~~L~LP~~~ 269 (382) T protein:vir:96 197 YGW------QSGLGNRTYGFLNDPNLPPFQTPPS-QGWATADWAGIIGDIREAVRQLRIQSQDQIDPKAEKITMALATSK 269 (382) T ss_pred Eee------ecCcCcceEEEEeCCCcccccccCC-CCcccccHHHHHHHHHHHHHHHHhccCCeeeecccceEEeechHH Confidence 211 0000000011 11111100011111 112345678899999998888866652 433 4578999999 Q ss_pred HHHHhcchhhhhhhhccccccccceEEEEeceEEEEecceecccccc-----------------------cccccccccc Q lcl|NC_015249. 213 YSAILAALMPNAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGE-----------------------DRPEEGANPT 269 (347) Q Consensus 213 ~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~-----------------------~~~~~~~~~~ 269 (347) |..|-....+ +..-...+++ +.-+++|...+.|-...... 
.........- T Consensus 270 ~~~Ls~~n~~---g~Tvl~~lk~----n~Pnl~i~t~peL~~a~~~g~g~~~~~~~~~~e~~~~~~~s~~~p~~f~q~~p 342 (382) T protein:vir:96 270 VDYLSVTTPY---GISVSDWIEQ----TYPKMRIVSAPELSGVQMQGKTPEDALVLFVEEVDASVDGSTDGGSVFSQLVQ 342 (382) T ss_pred HhhccccCcc---CccHHHHHHH----hcCCcEEEEccccccccCCCccceeEEEEecchhhhhcccccccCcceecccc Confidence 9988543111 0000011111 12334444444442110000 0000000000 Q ss_pred cccccccc-cccccccccccceE--EEEechhhhhhhhhc Q lcl|NC_015249. 270 GQKHAFPE-TSSGDTRVALDNVV--GLFNHRSAVGTVKLK 306 (347) Q Consensus 270 ~~~~~~~~-~~~~~y~~~~~~~~--~l~~~~~Av~~v~~~ 306 (347) .+....+. ....+|...++..+ .+|..|.|+.....| T Consensus 343 ~~~~~l~ve~~~~~~~~~~s~~t~Gv~i~~P~ai~~~~GI 382 (382) T protein:vir:96 343 SKFITLGVEKRAKSYVEDFSNGTAGALCKRPWAVVRYLGI 382 (382) T ss_pred ceeeeccceeecceeEeccccceeeeEEEcchhhhhccCC Confidence 00000011 12223333332222 224455555555444 No 193 >protein:vir:5942 Length: 523 # NCBI annotation: similar to major head protein # Family: family:all:364 # MgeID: mge:123 # MgeName: RM 378 # Cross-refs: genbank:acc:NP_835728;genbank:gi:30044131 Probab=89.12 E-value=0.028 Score=29.08 Aligned_cols=312 Identities=10% Similarity=-0.049 Sum_probs=134.7 Q ss_pred CCccc-----------------ccccccccccccccccchhhhhhhhhhhHH---HHHHHHHHhhhcccccccccccc-- Q lcl|NC_015249. 1 MAKMN-----------------GGQQIGKDQGKGMSAGDKLALFLKVFGGEV---LTAFTRTSVTMNKHLVRSIQSGK-- 58 (347) Q Consensus 1 ma~~~-----------------~~~~~~t~~~~~~~~~d~~al~ie~f~g~V---~~~f~~~s~~~~~~~~r~i~~G~-- 58 (347) |+... ......+++.++ ..+...++.++.|.+.+ -.+|........-.......+|. 
T Consensus 162 ~s~si~k~~vTa~s~agta~~~li~A~~~q~itg-~tga~fa~s~~~an~astAss~Al~gEA~t~~sTd~at~~~Gtt~ 240 (523) T protein:vir:59 162 SSGAVYYVDVPVASLPGVADVNTVRFWQYDDASG-DPENTVAYPLPRYNRIVGAVGSALYARLFFVTGSDFATVAGGTPS 240 (523) T ss_pred cccceeeeeccccccccccccccccccccccccc-cccccccchhhccccccccccccccccccccccccccccCCCccc Confidence 11110 000001111111 11111222222222111 11111000000000000000000 Q ss_pred ----eEEEe-ecCcceeeeeecCC-CCCCccCCCCCceEEEEEEeeee--------cccccccHHHHHh---ChhhHHHH Q lcl|NC_015249. 59 ----SAQFP-VLGRTKAAYLQPGE-NLDDKRKDMKHTERTINIDGLLT--------ADVLIYDIEDAMN---HYDVRSEY 121 (347) Q Consensus 59 ----tv~i~-~iG~~~~~~~~~g~-~~~~~~~~~~~~~~~l~ID~~~~--------~~~~Idd~D~~q~---~~D~r~~~ 121 (347) ..-+. ..|..+..--+.+. ...+. .+..-.|.-+.||+... +...+.-..+.++ -.|.-.|+ T Consensus 241 t~~~~~lyt~~~g~~t~~~~~~~~~~~~~~-~~~~~~eM~FsIeK~tVtAkSRaLKAeYT~ELAQDLKAiH~GLDAE~EL 319 (523) T protein:vir:59 241 TQDLDLVYYIDARNDFEDQSTDPDYPDPGF-QSLDIPEINLELRSRPVATKTRKLRAAWTPEAMQDLAAYHKGVDLENEI 319 (523) T ss_pred ccccccccccccccchhhcccccccccccc-ccccccceeeEEEeEEEeeecccccccccHHHHHHHHHHhcCCChhHHH Confidence 00000 11111111001010 00000 11223466777776533 3445555666666 38999999 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhh------ Q lcl|NC_015249. 122 TAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLT------ 195 (347) Q Consensus 122 ~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Ld------ 195 (347) +.=++.++..++.+-|+..+...+..-. ..+.....+..+-...+............++.+..+..++. 
T Consensus 320 anILStEImlEINR~ii~~~~~~a~~~~-----~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~e~~~~l~~~~~~~~n~i 394 (523) T protein:vir:59 320 VTLMSQYIAREIDLEILSTIMAHARRTD-----NYGFWSEVVGEYYDETSGNFVAGNFYGSKQEWLATLMIELNKVSNRI 394 (523) T ss_pred HHHHHHHHHHHhhHHHHHhHhhhheeee-----eccccccceeeecccccchhhhhhhhhhhHHHHHHHHHHHHHHHHHH Confidence 9999999999999999888754432211 11111111111111111100000000011222222222222 Q ss_pred -hcCCCCCCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEEEE-eceEEEEecceecccccccccccccccccccc Q lcl|NC_015249. 196 -GNYVPSADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIRNV-MGFEVIEVPHLTAGGAGEDRPEEGANPTGQKH 273 (347) Q Consensus 196 -e~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~i-~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~ 273 (347) .+----.+-|+|++|+..+.|-..+-+...+..-....-.-.+|.+ .|++||.-++-|..=.. .| T Consensus 395 ~~~t~~~~~~~~~~s~~v~~~l~~~~~~~~~~~~~~~~~~~~~~g~l~~~~~vy~d~~~~~dy~~----------~g--- 461 (523) T protein:vir:59 395 QQKTAVAGANFLVTSPQVAALLESMPGFTPGNDNRDGGTGIFYVGMVQGRYRLYKNIYQNQPVII----------MG--- 461 (523) T ss_pred HHhcccccccEEEEchhHHHHHHhccccccCCccccccccceeEEEecCceEEEecCCCCcceEE----------EE--- Confidence 1111124679999999999997666554332221111112134555 56789987776542111 11 Q ss_pred cccccccccccccc-cceEEEEechhhhhhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 274 AFPETSSGDTRVAL-DNVVGLFNHRSAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 274 ~~~~~~~~~y~~~~-~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) |+++. ..-.+|+|+|=. ++.. 
.....||..+-..|-.+.+||-.+.+|.+-+.+.++-- T Consensus 462 ---------~k~~~~~~~~~~~y~Py~----~l~~--~~~~~dp~s~qp~~~~~tRY~l~v~nP~~~~~~~~~~~ 521 (523) T protein:vir:59 462 ---------NQDLNTPWQTGAVYAPYV----PLLF--TPTIVDPVNFSYRRGLMTRYALEVVRPEFYGLLYVKLL 521 (523) T ss_pred ---------ecccCCcccccceecccc----hhhc--ccccccCCcccceeeeeeehhheecchhHhhhhhhhhc Confidence 11111 011478888862 2211 23346999999999999999999999998887776554 No 194 >protein:vir:10324 Length: 320 # NCBI annotation: ORF26 # Family: family:all:570 # MgeID: mge:182 # MgeName: VHML # Cross-refs: genbank:acc:NP_758919;genbank:gi:27311193;genbank:GeneID:956155 Probab=89.11 E-value=0.028 Score=29.08 Aligned_cols=288 Identities=12% Similarity=0.034 Sum_probs=103.8 Q ss_pred ccccccccccccchhhhhhhhhhhHHHHHHH-HHHhhhcccccccccccceEEEeecCcc--eeeeeecCCCCCCccCCC Q lcl|NC_015249. 10 IGKDQGKGMSAGDKLALFLKVFGGEVLTAFT-RTSVTMNKHLVRSIQSGKSAQFPVLGRT--KAAYLQPGENLDDKRKDM 86 (347) Q Consensus 10 ~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~-~~s~~~~~~~~r~i~~G~tv~i~~iG~~--~~~~~~~g~~~~~~~~~~ 86 (347) ++..|+.-+ .++..|- ...+. ..+|.|...+.. -+....+|.+.. .. T Consensus 1 i~~~P~~~g---------------~~~glff~~~~v~-----------T~~V~ie~~~~~l~lip~v~rg~~g~----~~ 50 (320) T protein:vir:10 1 MNLLPVNYG---------------DSRALFAREKKVR-----------TRTILVEEKNGVLTLIQSREPGSTEN----VA 50 (320) T ss_pred CCcCCchhh---------------hhhhhccCCCCcc-----------cceEEEEEecCceeeeeccCCCCCce----ee Confidence 333443321 1112221 11121 222333322211 112222222210 01 Q ss_pred CCceEEEEEEeeeecc--cccccHHHHH-----------hChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 87 KHTERTINIDGLLTAD--VLIYDIEDAM-----------NHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDE 153 (347) Q Consensus 87 ~~~~~~l~ID~~~~~~--~~Idd~D~~q-----------~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~ 153 (347) ...++.+..=..-|+. ..| +-|+.| +--+++.+...++...+.... |++..+..++.- ..+.+. 
T Consensus 51 ~~~~~~~~~f~~p~~~~~d~i-~a~eiq~~Ra~G~~~~~~~~~~v~~~l~~lr~~~~~T~-E~m~~~AL~G~i-ldadGt 127 (320) T protein:vir:10 51 KRGKRKVRSFVIPHLPLEDVI-LPDEYEGLRGFGTTALAAKSELVKERXETMKSSHDITH-EHLRMGAKKGQI-LDADGT 127 (320) T ss_pred cCCcceEEEEecceeccCCcc-CHHHHcCcccCCCchHHHHHHHHHHHHHHHHHHHHHHH-HHHHHhhhcCeE-EcCCCc Confidence 1111111110111110 001 112221 111222222222222222221 122111111110 000000 Q ss_pred cccc----ccC-cceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhh--h Q lcl|NC_015249. 154 NIAG----LGK-AHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAA--N 226 (347) Q Consensus 154 ~~~~----~~~-g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~--~ 226 (347) .+-. ++. ...+.. +..++.....++..+.+..+...|. ..+..+-.++++|++|..|+.|+.+-.. . T Consensus 128 v~~d~y~~fGi~~~~i~~----~l~~a~~dv~~~~~~~~~~i~~~l~--g~~~t~v~al~g~~f~~al~~h~~Vke~y~~ 201 (320) T protein:vir:10 128 VLYDLYAEFGITKKTIYF----GLDNKDANVAESCRQVLRHVEDNLR--GDVMKDVSVDVSEEFFDKFIKHASVKEVFLN 201 (320) T ss_pred EEEechhhhCCccceeEE----ecCCCCccHHHHHHHHHHHHHHHhc--cCCCCceEEEEChHHHHHHhcCHHHHHHHHh Confidence 0000 000 001111 1111122233445555555555564 4566777899999999999999876432 1 Q ss_pred hc-cccc----cccceEEEEeceEEEEecc-eecccccccccccccccccccccccccccccccccccceEEEEechhhh Q lcl|NC_015249. 227 YQ-ALID----PSTGSIRNVMGFEVIEVPH-LTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAV 300 (347) Q Consensus 227 ~~-~~~~----~~~G~Vg~i~G~~V~~sn~-lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av 300 (347) +. +... ...| ..+.|+.+++-.. .+-..++.. .....+.++.++....+. 
|....+.+=.-+++ T Consensus 202 ~~~~~~~l~~~~~~~--f~~gGi~~~~Y~g~~~d~~g~~~----~~I~~~~~~~~p~g~~~~----f~~~~apad~~e~v 271 (320) T protein:vir:10 202 HEAAVNRLGGDTRKG--FKFGGLIFNENRARHVDEEGKET----RFIKAGKGHAFPTGTTNT----FFTALAPADFNETA 271 (320) T ss_pred hhhhhhhccccccce--EEecCEEEEEcccEEEcCCCCee----EeecCCeeEEEEecCchh----heeeecccCcHhhc Confidence 11 1111 2233 2678888877432 111111111 123344444444322221 11112222112233 Q ss_pred hhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 301 GTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 301 ~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) .+. .+++=.+.+.++.-.+..+.+-...=+-..||++.+-++..++ T Consensus 272 nt~-g~p~y~k~~~~~~~~g~~l~~qS~PLpi~~rP~~lv~~~~~a~ 317 (320) T protein:vir:10 272 GTL-GKRYYAKMEPRRMGRGFDLHSQSNVLPMCCRPGVLVELDAAAQ 317 (320) T ss_pred CCc-ccccccccccccCCCeEEEEeeecccccccCcceEEEEEecCC Confidence 321 1222233444455445555555444467789999998888877 No 195 >protein:vir:95131 Length: 325 # NCBI annotation: hypothetical protein ORF010 # Family: family:all:47 # MgeID: mge:1552 # MgeName: PA73 # Cross-refs: genbank:acc:YP_001293417;genbank:gi:148912838;genbank:GeneID:5228206 Probab=80.88 E-value=0.091 Score=26.30 Aligned_cols=276 Identities=12% Similarity=0.035 Sum_probs=123.0 Q ss_pred ccccchhhhhhhhhhhHHHHHHHHH-----Hhhhc-----ccccccccccceEEEeecCcc-----eeeeeecCCCCCCc Q lcl|NC_015249. 18 MSAGDKLALFLKVFGGEVLTAFTRT-----SVTMN-----KHLVRSIQSGKSAQFPVLGRT-----KAAYLQPGENLDDK 82 (347) Q Consensus 18 ~~~~d~~al~ie~f~g~V~~~f~~~-----s~~~~-----~~~~r~i~~G~tv~i~~iG~~-----~~~~~~~g~~~~~~ 82 (347) ++-+|.- +|..++.+++.++ .+|-. .+.......|+-+..|..... 
....+....++ + T Consensus 1 m~lsD~~-----vfN~~~~~a~~e~~~q~~~~fn~as~gai~l~~~~~~Gd~~~~pf~~~l~g~~~~~~~~~~~~~v--t 73 (325) T protein:vir:95 1 MALSDLA-----VYSEYAYSAFSETLRQQVDLFNTATGGAIMLQSAAHQGDFSDVAFFAKVTGGLVRRRNAYGSGTV--A 73 (325) T ss_pred Cchhhhh-----hhhhhhhhhhhhhhhhhHhhhhhcccceeEeccccccCceeeccccccccccccccccCCCCcee--c Confidence 5555533 5777777776553 11111 111111223666666654322 11222211122 2 Q ss_pred cCCCCCceEEEEEEeeeecccccccHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccCcc Q lcl|NC_015249. 83 RKDMKHTERTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGKAH 162 (347) Q Consensus 83 ~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~ 162 (347) +..+...+..-++ -..-..+...|+-..-...|.+++++++.|..+++...+.++..+.++...+-.. .+ .. T Consensus 74 ~~kitt~~~~av~-~~r~~g~~~~d~~~~~~g~~~~~~~~~~Ig~~~a~~~~~~~l~~~~~~l~~a~~~---~~----~~ 145 (325) T protein:vir:95 74 EKVLKHLVDTSVK-VAAGTPPVRLDPGQFRWIQQNPEVAGAAMGQQLAVDTMADMLNVGLGSVYSALSQ---VS----DV 145 (325) T ss_pred cceeccccceeeE-EecccCcccccHHHHhhcCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc---cc----cc Confidence 2334433332222 1112222334555555667788999999999999887776666554433211100 11 11 Q ss_pred eeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchhhhhhhh--ccccccccceEEE Q lcl|NC_015249. 163 VLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSAILAALMPNAANY--QALIDPSTGSIRN 240 (347) Q Consensus 163 ~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~~~~~~~--~~~~~~~~G~Vg~ 240 (347) +..+.+..+. .+.... ++.|.+|+.+|.++. +.=..+++.+..|..|.+.. +++... ...+. ..|.. 
T Consensus 146 v~dis~~~~~-~~~~~s----~~~l~~A~~klGD~~--~~l~~~~MHS~v~~~L~~~~-L~~~~~~~~~~g~---~~i~t 214 (325) T protein:vir:95 146 VYDATANTDA-ADKLPT----WNNLNNGQAKFGDQS--SQIAAWIMHSTPMHKLYGSN-LTNGERLFTYGTV---NVVRD 214 (325) T ss_pred eeeeecccCc-cccccc----HHHHHHHHHHhcccc--cceeEEEEchHHHHHHHHhh-ccccccccccCCc---ccccc Confidence 1121111111 011112 356778888887753 11257889999999998653 433211 11111 13457 Q ss_pred EeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhhcceeeeee---echh Q lcl|NC_015249. 241 VMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERA---RRAN 317 (347) Q Consensus 241 i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~---~d~~ 317 (347) ++|-+|+.+..+|....++.. +...+.|-+-|++..+..++..... ++ + T Consensus 215 ~~G~~VIVdD~~p~~~~g~~~---------------------------~ytty~lg~GAi~~~~~~~~~~~~~~~~~~-~ 266 (325) T protein:vir:95 215 PFGKLLVMTDSPNLFAAGTPN---------------------------VYHILGLVPGGVLIGQNNDFDANEETKNGD-E 266 (325) T ss_pred cCCcEEEEeCCCCCCCccCce---------------------------eEEEEEEecCeEEecCCCCccccccccCcc-c Confidence 889999999999876543211 1122333344444444333322211 11 1 Q ss_pred hhcceeeeeeeec--ccccccceEEEEEEcCC Q lcl|NC_015249. 318 FQADQIIAKYAMG--HGGLRPEACGALVFNKA 347 (347) Q Consensus 318 ~~~d~i~~~~a~G--~~~~Rpe~a~~i~~~~a 347 (347) +.+.-+...+.|. -...+++.+..-+.|.- T Consensus 267 ~~~~~~~~~~tf~lhp~G~sw~~s~~g~sPt~ 298 (325) T protein:vir:95 267 NIIRTYQAEWSYNIGVKGFAWDKANGGKSPTD 298 (325) T ss_pred ceeeeeeeeeeEEeecceeeeecccccCCcCh Confidence 2222222222221 12223322211111111 No 196 >protein:vir:79642 Length: 329 # NCBI annotation: HsbB # Family: family:all:463 # MgeID: mge:1872 # MgeName: TLS # Cross-refs: genbank:acc:YP_001285525;genbank:gi:148734508;genbank:GeneID:5220000 Probab=79.80 E-value=0.1 Score=26.04 Aligned_cols=292 Identities=11% Similarity=0.034 Sum_probs=126.0 Q ss_pred CCccccccccccc-ccccccccchhhhhhhhhhhHHHHHHHHH----Hhhhccccccc-cc-ccceEEEeec---Cccee Q lcl|NC_015249. 
1 MAKMNGGQQIGKD-QGKGMSAGDKLALFLKVFGGEVLTAFTRT----SVTMNKHLVRS-IQ-SGKSAQFPVL---GRTKA 70 (347) Q Consensus 1 ma~~~~~~~~~t~-~~~~~~~~d~~al~ie~f~g~V~~~f~~~----s~~~~~~~~r~-i~-~G~tv~i~~i---G~~~~ 70 (347) =+++...++ ..+ ++.-..+.+.. .|+......++....+. -+.+.++.+++ +. +-.++.+... |..+. T Consensus 14 ~~~~~~~a~-~~~~~~~~~~~~~~~-~f~~~ql~~id~~v~e~~~~~l~~~~~i~i~~~~~~~~~~~t~~~~~~~G~a~~ 91 (329) T protein:vir:79 14 EFEANVIAN-HMQLRGAKNDASDMG-IWTSQELHKIKAQAYEKEYPAGSALRVFPVTSELSDTDKTFEYQTFDKVGHAKI 91 (329) T ss_pred hhhhhhHhh-hcccccceeccchhh-HHHHHHHHHHHHHHHhhhhcccchhhhcccccCCCCceeEEEeeeeecceeeee Confidence 011111111 011 11111122223 46543344555544332 24455556554 22 2345555544 44432 Q ss_pred eeeecCCCCCCccCCCCCceEEEEEEee-eecccccccHHHHH-hChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc Q lcl|NC_015249. 71 AYLQPGENLDDKRKDMKHTERTINIDGL-LTADVLIYDIEDAM-NHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLP 148 (347) Q Consensus 71 ~~~~~g~~~~~~~~~~~~~~~~l~ID~~-~~~~~~Idd~D~~q-~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~ 148 (347) - -...++++. .+..-.+....|-.. .-+.+.+.++..++ +..++-..-...+..++++..|+.+|.--.+ T Consensus 92 ~-~d~~~dip~--vd~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~i~f~G~~~----- 163 (329) T protein:vir:79 92 I-ADYTDDLST--VDALMTSEFGKVFRLGNAFLISIDEIKAGQRTGKSLSTRKANAAQNAHDQLVNHLVFKGSKP----- 163 (329) T ss_pred e-cCcccccce--eecccceeEEEEEEEEEEEEecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhccEEEeeccc----- Confidence 1 112234432 222223333333221 12344556777774 6777888888888899999999877532211 Q ss_pred ccccccccccc-Ccce--eecccccccccchhhhHHHHHHHHHHHHHHhhhc--CCCCCCCEEEeCHHHHHHHhcchhhh Q lcl|NC_015249. 149 SASDENIAGLG-KAHV--LEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGN--YVPSADRVFYTTPDNYSAILAALMPN 223 (347) Q Consensus 149 ~~~~~~~~~~~-~g~~--i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~--~VP~~gR~~vv~P~~~~~Ll~~~~~~ 223 (347) ....|+- ..++ ...++.. .+.=..++++.+++.|.++..+|.++ .+ ...-.++|+|+.|..|..- .. 
T Consensus 164 ----~g~~GLlN~p~v~~~~~~~~~-~~~w~~kt~~ei~~di~~~~~~l~~~s~g~-~~p~~L~Lpp~~~~~L~~~--~~ 235 (329) T protein:vir:79 164 ----HKIISVFEHPNLTTINSAGWN-NAAGTGKKPETAQDELEQAIEKIETLTNGQ-HRANMILIPPSMRKVLMVR--MP 235 (329) T ss_pred ----ccceeeecCCCccccccCCCC-CccccccCHHHHHHHHHHHHHHHHHhcCce-ecccEEEecHHHHHHhhcc--cC Confidence 1111111 0111 1111111 11122346788999999998888765 22 1124799999999888521 11 Q ss_pred hhhhcccccc-ccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEec--hhhh Q lcl|NC_015249. 224 AANYQALIDP-STGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNH--RSAV 300 (347) Q Consensus 224 ~~~~~~~~~~-~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~--~~Av 300 (347) +.+..-...+ +++ ..++|...+.|-..+ ..+ +-+++++. ++-+ T Consensus 236 ~~~~tvl~~lk~~~-----~~l~I~~~~el~~ag-----------~~g------------------~~~~v~y~~~~~~~ 281 (329) T protein:vir:79 236 ETTMSYLDYFKQQN-----GGITIESISELEDID-----------GAG------------------TKAALVYEKDPMNM 281 (329) T ss_pred CCCccHHHHHHHhC-----CCcEEEEcccccccC-----------CCC------------------ceEEEEEecCCceE Confidence 1111000111 121 234455544442100 000 11122221 2222 Q ss_pred hhhhhcceeeeeeechhhhcceeeeeeee-cccccccceEEEE---EEc Q lcl|NC_015249. 301 GTVKLKDMALERARRANFQADQIIAKYAM-GHGGLRPEACGAL---VFN 345 (347) Q Consensus 301 ~~v~~~~~~~e~~~d~~~~~d~i~~~~a~-G~~~~Rpe~a~~i---~~~ 345 (347) .....++++... ..++-..+.+....+. |.-+.||++++-+ ++. 
T Consensus 282 ~~~vp~~~~~l~-~q~~~~~~~v~~~~r~~Gv~i~~P~ai~~~dGI~~~ 329 (329) T protein:vir:79 282 SIEIPEAFNMLT-AQPKDLHFKVPCTSKCTGLTIYRPLTLVLIKGLVVG 329 (329) T ss_pred EEecCcceeeee-ceecCceEEEceeeeEEEEEEECcceeeeeeeeeeC Confidence 222223333221 2333344555555555 4688899976532 222 No 197 >protein:vir:8324 Length: 410 # NCBI annotation: gp41 # Family: family:all:30827 # MgeID: mge:154 # MgeName: Corndog # Cross-refs: genbank:acc:NP_817892;genbank:gi:29566325;genbank:GeneID:1259520 Probab=72.18 E-value=0.19 Score=24.59 Aligned_cols=275 Identities=15% Similarity=0.120 Sum_probs=112.2 Q ss_pred CCcccccccccccc----------------------------cccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccc Q lcl|NC_015249. 1 MAKMNGGQQIGKDQ----------------------------GKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVR 52 (347) Q Consensus 1 ma~~~~~~~~~t~~----------------------------~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r 52 (347) |....-..-.-.|. -....++|...-.-..|-+.+.+-....-...++...= T Consensus 89 ~r~~p~~~~veyRSaGE~lkal~~~~~Gd~~A~~~~e~~r~a~~~~~Tgd~~~~i~~~~v~d~i~li~q~r~i~slf~tL 168 (410) T protein:vir:83 89 MRGSPVGTEVEYRSAGEYMLDMWNSAQGNASAADRLEVYARAADHQKTGDLQGVIPDPIVGPVIDFIDSARPLVSTLGTL 168 (410) T ss_pred CcCCCCCCCcccccHHHHHHHHhccCCchHHHHHHHHHHHHhhccCcccccccccchhHhhhHHHHHhhccchhhhhhhC Confidence 33221000000000 00111222110011223333333332222112111111 Q ss_pred cccccceEEEeecCcc-eeee-------eecCCCCCCccCCCCCceEEEEEEeeeec----ccccccHHHHHhChhhHHH Q lcl|NC_015249. 53 SIQSGKSAQFPVLGRT-KAAY-------LQPGENLDDKRKDMKHTERTINIDGLLTA----DVLIYDIEDAMNHYDVRSE 120 (347) Q Consensus 53 ~i~~G~tv~i~~iG~~-~~~~-------~~~g~~~~~~~~~~~~~~~~l~ID~~~~~----~~~Idd~D~~q~~~D~r~~ 120 (347) .. .|.|...+..-.. ++.. -..|.+++- ..+.....+-.|+.+--. 
+-.|+ .++....+- T Consensus 169 P~-~g~T~eY~v~t~~~tV~~q~~~~kqa~EGd~L~~--gKl~~~t~tA~ikTyGGyt~LSRQ~IE-----Rs~v~~L~~ 240 (410) T protein:vir:83 169 PL-NNATFYRPIVSQRPAVGLQGVAGGASDEKTELDS--QKMVIDRLTVNAKTLGGYVNVSRQAID-----FSSPSALDL 240 (410) T ss_pred CC-CCCeeEEeeecccccccccccccccccccccccc--cceeeeeccceeehhcCcccccceeee-----cCChhhHHH Confidence 11 2777777544222 2211 123444432 344445555666665322 22232 233333333 Q ss_pred HHHHHHHHHHHHHHHHHHHHHHHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhc--C Q lcl|NC_015249. 121 YTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGN--Y 198 (347) Q Consensus 121 ~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~--~ 198 (347) ..+-++.+-|+.....+=..+. +. +++ ..+....+++++...|.++....+.+ + T Consensus 241 ~lraL~~AYA~atea~vra~L~------~t----~t~--------------~~a~~~~Tad~~~~~i~da~~~v~da~~~ 296 (410) T protein:vir:83 241 VVNGLGQQYAIETEALVGAALA------ST----STG--------------AVGYGNATADNVASAIWQAAGAVYTAVKG 296 (410) T ss_pred HHHHHHHHHHHHHHHHHHHHHH------Hh----hhh--------------hhhhhhccHHHHHHHHHHHHHHHhhhhcc Confidence 3344444444444333211111 00 000 00111235678888888998888886 5 Q ss_pred CCCCCCEEEeCHHHHHHHhcchhhhhhh---hcc--ccccccceEEEEeceEEEEecceecccccccccccccccccccc Q lcl|NC_015249. 199 VPSADRVFYTTPDNYSAILAALMPNAAN---YQA--LIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKH 273 (347) Q Consensus 199 VP~~gR~~vv~P~~~~~Ll~~~~~~~~~---~~~--~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~ 273 (347) + .=+++.|+|+.+..+.+--+..+.+ ..| .+.+-.|.-|.+.|++|.+.+.+|.+.. 
T Consensus 297 ~--~~~~i~vS~DVl~~~~~~f~~~~~~~~dt~Gfg~~~lg~gi~G~~~~ipVvm~~~a~AgTA---------------- 358 (410) T protein:vir:83 297 M--GRLVIAIAPDVLGDFGPLFAPVNPTNAHSTGFEAGRFGQGVMGSISGIPVVMSAALGSGDA---------------- 358 (410) T ss_pred c--eeeeEEechhhhhhccceeeccCCCCcccccccccccccchhhhhcccceEEecCCCcCee---------------- Confidence 4 2378999999976665433222332 222 2224467778999999999998874221 Q ss_pred cccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEc Q lcl|NC_015249. 274 AFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFN 345 (347) Q Consensus 274 ~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~ 345 (347) .|-...++-+-.+.++++++++-. ++--.+-++ +.+ +..+.-|++.+=++=. T Consensus 359 ------------~f~~~~Ai~~~eS~~gp~qL~d~~--i~nLt~~yS----gY~--a~a~~~~~gliPv~g~ 410 (410) T protein:vir:83 359 ------------YLFSTAAIECFEQRVGTLQVVEPS--VFGLQVAYA----GYF--STLVVNEDAIVPLVGS 410 (410) T ss_pred ------------eEeccceeeeeecCCceeEeeCCc--hhhhhhhhe----eee--eeccccccceeeeccC Confidence 011122333334455555555421 111111111 222 2223333333322222 No 198 >protein:vir:104342 Length: 314 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1593 # MgeName: RTP # Cross-refs: genbank:acc:YP_398971;genbank:gi:81343955;genbank:GeneID:3778874 Probab=71.96 E-value=0.19 Score=24.55 Aligned_cols=290 Identities=12% Similarity=0.025 Sum_probs=128.6 Q ss_pred CCccccccccc---cccc-ccccccchhhhhh-hhhhhHHHHHHHHH----Hhhhcccccccccc--cceEEEe---ecC Q lcl|NC_015249. 1 MAKMNGGQQIG---KDQG-KGMSAGDKLALFL-KVFGGEVLTAFTRT----SVTMNKHLVRSIQS--GKSAQFP---VLG 66 (347) Q Consensus 1 ma~~~~~~~~~---t~~~-~~~~~~d~~al~i-e~f~g~V~~~f~~~----s~~~~~~~~r~i~~--G~tv~i~---~iG 66 (347) || |.+..+.. ++.. .+....|....|+ ++.. .++....+. -..+.++.+++--+ -.++.+. 
..| T Consensus 1 ~~-~~~~~~~~~~~~~~~~~~~~~~d~~~~fl~~ql~-~id~~v~e~~~~~~~~~~~i~v~~~~~~~~et~~~~~~e~~G 78 (314) T protein:vir:10 1 MA-IKFDAEQAKITTHLEQMGVEKADAAGIWAVSQLT-AALNRAYEKEYAENSVVNIFPVTNEIPGHAKYFEYPEFDGVG 78 (314) T ss_pred Cc-cchHHHHHHHHHHHHhhcccchhhhHHHHHHHHH-HHHHHHhhhhccccccceeeccccCCCCceeEEEeeeecccc Confidence 32 22221111 1111 1123334322344 4443 555544332 23445555554211 2355544 445 Q ss_pred cceeeeeec-CCCCCCccCCCCCceEEEEEEee-eecccccccHHHH-HhChhhHHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 67 RTKAAYLQP-GENLDDKRKDMKHTERTINIDGL-LTADVLIYDIEDA-MNHYDVRSEYTAQLGESLAMAADGAVLAEMAK 143 (347) Q Consensus 67 ~~~~~~~~~-g~~~~~~~~~~~~~~~~l~ID~~-~~~~~~Idd~D~~-q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~ 143 (347) ..+. +.. +.+++. .+..-.+....|-.. .-+.+.+.++..+ +...++-..-...+..++++..|+.++.-.. T Consensus 79 ~a~~--~~d~~~dip~--vd~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~~~~~n~i~f~G~~- 153 (314) T protein:vir:10 79 IAQI--IADYSDDLPL--VDAFMTEKQGKVFRFGNAFLISTDEIKAGAATGQSLSARKQALAFEAHDNLLDKLVWSGSA- 153 (314) T ss_pred ceee--eCCcccccce--eecccceeEEEEEEEEeeEEecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEEeecc- Confidence 4442 222 233433 223334444433332 2233345566666 3566777777788888888888887653211 Q ss_pred HhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCC-CCCCEEEeCHHHHHHHhcchhh Q lcl|NC_015249. 144 LCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVP-SADRVFYTTPDNYSAILAALMP 222 (347) Q Consensus 144 ~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP-~~gR~~vv~P~~~~~Ll~~~~~ 222 (347) .....|+-....++..+ ..... .+++.+++.|..+..+|.++.-= ...-.++|+|+.|..|.. +. 
T Consensus 154 --------~~g~~GLlN~p~v~~~~-~~~~W---aT~~ei~~Di~~~~~~l~~~s~g~~~p~~l~Lpp~~~~~L~~--~~ 219 (314) T protein:vir:10 154 --------PHGIVSVFDQPNINNVV-ATPNW---SVPQNAIDDVTAMIDAVESSTQGLHHVTDILLPASARRVMQG--LV 219 (314) T ss_pred --------cccceeEeecCCCcccc-CCCCc---ccHHHHHHHHHHHHHHHHHhcCccccceeEEecHHHHHhhcc--cc Confidence 11111111110111111 11111 35678899999999999875210 012378999999987742 11 Q ss_pred hhhhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechh--hh Q lcl|NC_015249. 223 NAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRS--AV 300 (347) Q Consensus 223 ~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~--Av 300 (347) .+.+..-...+.+ +--+++|...+.|...++. .+.+++++..+ -+ T Consensus 220 ~~~~~tvl~~l~~----n~~~l~I~~~~el~~ag~~-----------------------------g~~~~v~y~~~~~~~ 266 (314) T protein:vir:10 220 PQTNLSYGELFTR----NNPGLTIRFLQFLDNYDGA-----------------------------GGKAALAFEKSPLNM 266 (314) T ss_pred cCCCccHHHHHHH----hCCCcEEEEcccccccCCC-----------------------------cceEEEEEecCCcEE Confidence 1111100111111 1235666666665421100 01122222221 11 Q ss_pred hhhhhcceeeeeeechhhhcceeeeeeee-cccccccceEE---EEEEc Q lcl|NC_015249. 301 GTVKLKDMALERARRANFQADQIIAKYAM-GHGGLRPEACG---ALVFN 345 (347) Q Consensus 301 ~~v~~~~~~~e~~~d~~~~~d~i~~~~a~-G~~~~Rpe~a~---~i~~~ 345 (347) .....++++.- -..++...+.+....+. |.-+.||.+++ -|.+. 
T Consensus 267 ~~~vp~~~~~l-~~e~~~~~~~~~~~~r~~Gv~i~~P~ai~~~dGI~~~ 314 (314) T protein:vir:10 267 SIEIPEVTNVL-PAQPKDLHFRYPVTSKATGLIVYRPLTMAVIKGITFA 314 (314) T ss_pred EEecCccceee-cceecCceEEEcceeeeEEEEEECcceeEeeeeeecC Confidence 11112222221 12333444555555666 56888999887 45555 No 199 >protein:vir:94070 Length: 339 # NCBI annotation: putative structural protein # Family: family:all:1653 # MgeID: mge:1493 # MgeName: OP2 # Cross-refs: genbank:acc:YP_453625;genbank:gi:84662661;genbank:GeneID:5142580 Probab=71.49 E-value=0.19 Score=24.48 Aligned_cols=284 Identities=10% Similarity=-0.028 Sum_probs=121.9 Q ss_pred CC--cccccccccccccccccccchhhhhhhh-hhhHHHHHH----HHHHhhhcccccccccc--cceEEEe---ecCcc Q lcl|NC_015249. 1 MA--KMNGGQQIGKDQGKGMSAGDKLALFLKV-FGGEVLTAF----TRTSVTMNKHLVRSIQS--GKSAQFP---VLGRT 68 (347) Q Consensus 1 ma--~~~~~~~~~t~~~~~~~~~d~~al~ie~-f~g~V~~~f----~~~s~~~~~~~~r~i~~--G~tv~i~---~iG~~ 68 (347) || ....+ |..+ .... .+|.. ....|+..+ ...-..+.++.+.+.-. -+++.+. ..|.. T Consensus 35 ~a~d~~~~~------~~~~--~~~~--~~i~a~~~~~i~~~vy~~~~~~~~~~~l~pv~t~g~w~~~t~~y~~~e~~G~a 104 (339) T protein:vir:94 35 YAMDAVNLT------PTLQ--TTAN--AGIPAWMTTFVDRRVIDIQLAPMAAAKIFPEVKKGDWTTTYGVFIIAEPVGQV 104 (339) T ss_pred hhccccccc------cccc--cccc--cchhhhhhhhhchhheeecccccchhhhcccccCCCCcccEEEEeeeecccce Confidence 11 11111 1111 1111 13322 223332222 11223455556555322 3577775 44555 Q ss_pred eeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHH---HhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHh Q lcl|NC_015249. 69 KAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDA---MNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLC 145 (347) Q Consensus 69 ~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~---q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a 145 (347) +. |..+.+.+-...+.+-.++++.+=+. .+.+..++.. 
++..|+-..-.+.+..+|.+..|+..+.-- T Consensus 105 ~~--ygd~ad~Pl~~~~v~~~~~~v~~~~~---g~~y~~~E~~~A~~~g~~l~~~Ka~aA~~al~~~~N~i~~~Gd---- 175 (339) T protein:vir:94 105 AT--YSDWSANGMSKANVNFESRQNYRYQT---WTEYGDLEMATYGEAGIDYVARQEISASLVMAKFANSSYLLGV---- 175 (339) T ss_pred EE--cccccCCCcccccceeeEEeEEEEEE---EEeecHHHHHHHHhhCCChHHHHHHHHHHHHHHhhceEEeeee---- Confidence 43 34455543332334444455544443 2234444432 356777777777788888888887544211 Q ss_pred hhcccccccccccc-CcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcC----CCCCCCEEEeCHHHHHHHhcch Q lcl|NC_015249. 146 NLPSASDENIAGLG-KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNY----VPSADRVFYTTPDNYSAILAAL 220 (347) Q Consensus 146 ~~~~~~~~~~~~~~-~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~----VP~~gR~~vv~P~~~~~Ll~~~ 220 (347) ....+.|+- ..++ +...+... .=..++++.+++.|..+...|-... -|.....+++||..|..|-.-. T Consensus 176 -----~~~~~~GLlN~P~l-~~~v~~s~-~Wa~kT~~eI~~Di~~~~~~l~~~s~g~~~~~~~~~L~LP~~~~~~L~~~n 248 (339) T protein:vir:94 176 -----AGIANYGLMNDPSL-PAPVAATV-NWATAAPEDIANDVVAMVGRLISQSGGLITGQERMVMALAPSALNNVNRTN 248 (339) T ss_pred -----cccceEEEEeCCCc-cccccCCC-CcccCCHHHHHHHHHHHHHHHHHhcCCeeeeccCcEEEecHHHHHhcccCC Confidence 111112211 1111 11111111 1123567888999988888876653 2455668999999999886432 Q ss_pred hhhhhhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhh Q lcl|NC_015249. 221 MPNAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAV 300 (347) Q Consensus 221 ~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av 300 (347) .+ +..-...+++ +.-+++|...+.|-.. + +.... .-..+..+.+.. -+.+. 
+ T Consensus 249 ~~---~~Tvl~~lk~----n~pnl~i~~~~el~~a-----------~--g~~~~----~~~~~~~~~~~~-~~~~p-~-- 300 (339) T protein:vir:94 249 NF---GLSAGAKIAQ----TYPNIQFVAVPEFDTA-----------S--GRLVQ----LWVPEVNGQPTG-EVAFA-E-- 300 (339) T ss_pred cC---CccHHHHHHH----hcCCcEEEEccccccC-----------C--CceEE----EEEEeccCCcce-EEEcc-h-- Confidence 11 0000111221 1335666665555210 0 00000 000111111111 12221 1 Q ss_pred hhhhhcceeeeeeechhhhcceeeeeeee-cccccccceEEEEEEcC Q lcl|NC_015249. 301 GTVKLKDMALERARRANFQADQIIAKYAM-GHGGLRPEACGALVFNK 346 (347) Q Consensus 301 ~~v~~~~~~~e~~~d~~~~~d~i~~~~a~-G~~~~Rpe~a~~i~~~~ 346 (347) +..-+.+ .++...+.+-...+. |.-+.||.+.+-+ .-- T Consensus 301 ---~~~~lpv----q~~~~~~~v~~~~rt~Gv~i~~P~ai~~~-~GI 339 (339) T protein:vir:94 301 ---KLRSHSI----ERYSTTTRQKHSGATFGAVIYQPWAVTQE-LGV 339 (339) T ss_pred ---hhhcccc----EEcCceEEecceeeeeeEEEEccceeeee-ecC Confidence 1111111 123334555555554 5688899865432 222 No 200 >protein:vir:97255 Length: 310 # NCBI annotation: hypothetical protein ORF017 # Family: family:all:1120 # MgeID: mge:1657 # MgeName: M6 # Cross-refs: genbank:acc:YP_001294525;genbank:gi:149408246;genbank:GeneID:5237120 Probab=70.89 E-value=0.2 Score=24.38 Aligned_cols=284 Identities=13% Similarity=0.057 Sum_probs=122.2 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCc---ceeee-eecC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGR---TKAAY-LQPG 76 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~---~~~~~-~~~g 76 (347) |...+ ..-|+-...|.+ ...|.+.|.+.|-+....+-..+. |++.++++.-. ..... 
..+- T Consensus 1 mpalt-------Laea~k~~~d~l-------~~~ViE~~~~~s~lL~~LpF~~ve-g~~~~ynR~~~~~~~~~~~v~~~~ 65 (310) T protein:vir:97 1 MASVT-------LAESAKLAQDEL-------VAGVIENIITVNRMFDVLPFDSIE-GNSLAYNRENVLGDVIMAGVGTTF 65 (310) T ss_pred Ccccc-------hHHHhhcCcchH-------HHHHHHHHhccchHHHhCCccccc-CCcceeeEeeccCCcccccccccc Confidence 55333 222332333322 345567776666444444444433 66787776632 22110 0000 Q ss_pred CCCCCccCCCCCceEEEEEEeeeecccccccHHHH--H---h-ChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 77 ENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDA--M---N-HYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 77 ~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~--q---~-~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) ..........+.++++..+ ..+. -+-++|.. + . -.|.+.+-.+...++|++++...++.. .+. T Consensus 66 ~~~g~~~~~~t~~~~~~~L---~i~~-g~~~Vd~~i~dl~~~~~~dq~~~Ql~~~iea~~~~~e~~lING-------D~a 134 (310) T protein:vir:97 66 SGAGAGKAAATFTKVNSNL---TTIM-GDAEVNGLIQATRSGDGNDQTAVQIASKAKSAGRKYQDQLING-------NGA 134 (310) T ss_pred cCCCccccccccceeeeee---eeee-ehhhhhhHHHhhhcCChHHHHHHHHHHHHHHHHHHHHHHhhcc-------ccC Confidence 0000000111223333322 2222 12245532 1 2 234566667777888888876654320 000 Q ss_pred cccccccc----cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhc-CCCCCCCEEEeCHHHHHHHhcchhhhhh Q lcl|NC_015249. 151 SDENIAGL----GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGN-YVPSADRVFYTTPDNYSAILAALMPNAA 225 (347) Q Consensus 151 ~~~~~~~~----~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~-~VP~~gR~~vv~P~~~~~Ll~~~~~~~~ 225 (347) .+. -.|+ ..+..+..++.... .+.. .+|.| |+.. +-..+..+++..|.++..|..--|-... 
T Consensus 135 ~n~-F~GL~~~~~~~q~i~~~~~gg~-----~t~d-~LDeL------l~~v~~~~g~p~~~l~~~~~~r~i~A~~R~~~~ 201 (310) T protein:vir:97 135 GNE-FAGLIQLCASGQKATTGATGSA-----ISFA-ILDEL------MDLVVDKDGQVDYLTMHARTLRSYKALLRALGG 201 (310) T ss_pred CCc-ccchhhcCCccceeecCCCCCC-----CCHH-HHHHH------HHHHhcCCCCCCEEEecHHHHHHHHHHHHHhcC Confidence 000 0011 11223332221111 1122 23333 2222 1122456999999876666554443322 Q ss_pred --hhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccc----cceEEEEechhh Q lcl|NC_015249. 226 --NYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVAL----DNVVGLFNHRSA 299 (347) Q Consensus 226 --~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~----~~~~~l~~~~~A 299 (347) -|....+.-.-.|-.+.|++|+.++.+|...... +...+.. --+.+-+. ...+||.-.... T Consensus 202 ~g~~~~~~~~~G~~v~~~~GiPi~~~d~ip~~~~~~-----~~~gtTs--------Iya~r~Ge~~~~~Gv~Gl~~~~~~ 268 (310) T protein:vir:97 202 ASINEVVELPSGAEVPAYSGTPIFRNDYIPTNQTKG-----GTTGCTT--------IFAGTLDDGSRTHGIAGLTATQAA 268 (310) T ss_pred CCCCCccccCCCCEEeeeCCeEEEEeCccCCCcccc-----ccCCcee--------EEEEeeCccccccceeccccCCcc Confidence 2333445566678899999999999999753211 0000000 00111111 123343211111 Q ss_pred hhhhhhcceeeeee---echhhhcceeeeeeeecccccccceEEEEE--Ec Q lcl|NC_015249. 300 VGTVKLKDMALERA---RRANFQADQIIAKYAMGHGGLRPEACGALV--FN 345 (347) Q Consensus 300 v~~v~~~~~~~e~~---~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~--~~ 345 (347) .+.++-. .++--..+.|. +| +|..++.|++++.|. +. 
T Consensus 269 -------glsVr~~G~~~~~~v~~~~V~-~Y-~~~av~~~~A~a~L~~V~~ 310 (310) T protein:vir:97 269 -------GIQVVDVGESEDSDEHIWRVK-WY-CGLALFSEKGLACADGITN 310 (310) T ss_pred -------ceeEEeCCcccCCcceeEEEE-Ee-eeEEEecccceeeeccccC Confidence 1222222 13333334442 22 688889999888774 44 No 201 >protein:vir:95258 Length: 368 # NCBI annotation: Phage conserved protein # Family: family:all:570 # MgeID: mge:1561 # MgeName: Felix 01 # Cross-refs: genbank:acc:NP_944891;genbank:gi:38707831;genbank:GeneID:2744044 Probab=70.07 E-value=0.21 Score=24.25 Aligned_cols=315 Identities=10% Similarity=0.034 Sum_probs=117.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhh-hHHHHHHHHH----Hhh--hcccccccccccceEEEeecCcc-ee-e Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFG-GEVLTAFTRT----SVT--MNKHLVRSIQSGKSAQFPVLGRT-KA-A 71 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~-g~V~~~f~~~----s~~--~~~~~~r~i~~G~tv~i~~iG~~-~~-~ 71 (347) |.+.-- + + -|+ -++-++..+. ..+ +++...+.++ ..+|.|...+.. ++ . T Consensus 1 ~~d~f~------~----------d-----~Fs~~~LT~ain~~p~~p~~l~~lglF~~~~v~-t~~v~iE~~~~~l~Lvp 58 (368) T protein:vir:95 1 MLTNSE------K----------S-----RFFLADLTGEVQSIPNTYGYISNLGLFRSAPIT-QTTFLMDLTDWDVSLLD 58 (368) T ss_pred Cccccc------C----------C-----cccHHHHHHHHHhcCCCcceecccccccCCCcc-ceEEEEEEEcCeEEEcc Confidence 322210 0 0 011 0011111100 011 1334444433 466777655333 22 3 Q ss_pred eeecCCCCCCccCCCC-CceEEEEEEeeeecccccccHHHHH------------hChhhHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 72 YLQPGENLDDKRKDMK-HTERTINIDGLLTADVLIYDIEDAM------------NHYDVRSEYTAQLGESLAMAADGAVL 138 (347) Q Consensus 72 ~~~~g~~~~~~~~~~~-~~~~~l~ID~~~~~~~~Idd~D~~q------------~~~D~r~~~~~~~g~aLa~~~D~~i~ 138 (347) ...+|.+.. .....+ -.-+.+.+--... ...| .-|+.| +.-+++.+....+-..+.... |++. 
T Consensus 59 ~~~rg~~~~-~~~~~~~r~~~~f~~ph~~~-~d~I-~a~eiQg~RafG~~~~l~~v~~~v~~kl~~~r~~~d~T~-E~~r 134 (368) T protein:vir:95 59 AVDRDSRKA-ETSAPERVRQISFPMMYFKE-VESI-TPDEIQGVRQPGTANELTTEAVVRAKKLMKIRTKFDITR-EFLF 134 (368) T ss_pred ccCCCCCCc-ccccCCceeEEEEecceecc-cccc-chHHHccccCCCChhHHHHHHHHHHHHHHHHHHHHHHHH-HHHH Confidence 333444321 111111 1222333311111 1111 122222 111222222222222222222 2221 Q ss_pred HHHHHHhhhcccccccccc----cc-CcceeecccccccccchhhhHHHHHHHHHHHHHHhh-hcCCCCCCCEEEeCHHH Q lcl|NC_015249. 139 AEMAKLCNLPSASDENIAG----LG-KAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLT-GNYVPSADRVFYTTPDN 212 (347) Q Consensus 139 ~~~~~~a~~~~~~~~~~~~----~~-~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Ld-e~~VP~~gR~~vv~P~~ 212 (347) .+..++.- ..+.+..+-. ++ ...++.. +..++.+.......+.+..+...|. ..-++..+-.++++|++ T Consensus 135 ~gAL~G~i-lDadGtvl~dly~eFGit~~~v~f----~l~~~~tdv~~~~~~~~~~i~d~l~g~~~~~~~~v~alcg~~F 209 (368) T protein:vir:95 135 MQALKGKV-VDARGTLYADLYKQFDVEKKTIYF----DLDNPNADIDASIEELRMHMEDEAKTGTVINGEEIHVVVDRVF 209 (368) T ss_pred HHhhcCee-ECCCCcEEecchhhhCCccceEEE----EeCCCCcCHHHHHHHHHHHHHHhhcccccccccceEEEEChHH Confidence 12112111 1111111100 00 0011111 1112222233333444444445564 34467778889999999 Q ss_pred HHHHhcchhhhhh--hhccc-------cccccc---------eEEEEeceEEEEec-ceeccccccccccc---cccccc Q lcl|NC_015249. 213 YSAILAALMPNAA--NYQAL-------IDPSTG---------SIRNVMGFEVIEVP-HLTAGGAGEDRPEE---GANPTG 270 (347) Q Consensus 213 ~~~Ll~~~~~~~~--~~~~~-------~~~~~G---------~Vg~i~G~~V~~sn-~lp~~~~~~~~~~~---~~~~~~ 270 (347) |..|..|+..-.. .++.. ..++.| ....+.|+.+.+-. 
.++...+......+ -...++ T Consensus 210 fd~L~~h~~Vkeay~~~~~a~~~~~lr~~~r~g~~~~~~~~~~~F~fgGi~f~eYrg~~~~~~g~~~~~v~~d~v~I~~g 289 (368) T protein:vir:95 210 FSKLTKHPKIRDAYLAQQTPLAWQQITGSLRTGGADGVQAHMNTFYYGGVKFVQYNGKFKDKRGKVHTLVSIDSVADTVG 289 (368) T ss_pred HHHhhcChhHHHHHHHHHhhhhhhhhccccccccccccccccceeEecCEEEEEcceeecCCCcceeeeecCCceeeccC Confidence 9999999875432 22211 112222 12467788887632 22221111111111 124466 Q ss_pred cccccccccc-ccccccccceEEEEechhhhhhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 271 QKHAFPETSS-GDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 271 ~~~~~~~~~~-~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++.++.... ....+-|....+-+=+-+++.+. .+++=...+..+.-++..+.+-.-.=+-..||+..+-++..+. T Consensus 290 ea~~~P~G~~~~~~~~~F~~~~aPad~~e~vNt~-g~p~Ya~~~~~~~~~g~~le~qSnpLpic~RP~~lv~~~~~a~ 366 (368) T protein:vir:95 290 VGHAFPNVAMLGEANNIFEVAYGPCPKMGYANTL-GQELYVFEYEKDRDEGIDFEAHSYMLPYCTRPQLLVDVRADAK 366 (368) T ss_pred ceEEEeecccccccCcceEEEecCCCcHhhcCCC-cccccceeeeccCCCeeEEEEeecccchhcccceeEEEEecCC Confidence 6677664421 01111222222322223444332 2222223333344455555555555566779998887776666 No 202 >protein:vir:3643 Length: 336 # NCBI annotation: gp12 # Family: family:all:1653 # MgeID: mge:75 # MgeName: Bcep781 # Cross-refs: genbank:acc:NP_705638;genbank:gi:23752323;genbank:GeneID:955719 Probab=69.78 E-value=0.22 Score=24.21 Aligned_cols=285 Identities=9% Similarity=-0.025 Sum_probs=110.5 Q ss_pred CC-cccccccccccccccccccchhhhhhhhhh--hHHHHHHHHHHhhhcccccccccc--cceEEEee---cCcceeee Q lcl|NC_015249. 1 MA-KMNGGQQIGKDQGKGMSAGDKLALFLKVFG--GEVLTAFTRTSVTMNKHLVRSIQS--GKSAQFPV---LGRTKAAY 72 (347) Q Consensus 1 ma-~~~~~~~~~t~~~~~~~~~d~~al~ie~f~--g~V~~~f~~~s~~~~~~~~r~i~~--G~tv~i~~---iG~~~~~~ 72 (347) .| +++++..+.+++|. +. |+.-|- +.++..+... +...++.+.+.=. -+++.|+. 
.|..+ - T Consensus 34 da~d~~~~~~~~~~~~~------~~--~l~~~i~p~~~~~~~~~~-~~~~l~pv~t~g~W~~~~~~~~~~e~~G~a~--~ 102 (336) T protein:vir:36 34 DAADLSPHLSSTGSSGI------PN--YLTTYVDPSVIDILVAPM-KAAELVGESKKGDWTTLVAAFITAEPTTKVA--T 102 (336) T ss_pred hhhhccCccccCCCcch------HH--HHHHhhccceEeeecchh-hhhhhccccccCCccceeEEEeeeeceeeEE--E Confidence 12 22222221122211 11 555554 3334444332 2233444433111 24555554 45554 3 Q ss_pred eecCCCCCCccCCCCCceEEEEEEeeeecccccccHH---HHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 73 LQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIE---DAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 73 ~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D---~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) |..+.++|. .+..-...+-.|--.- ..+.+..++ ..++.+|+-.+-.+.+..+|.+..++..+--. T Consensus 103 ygd~~D~P~--~d~~~~~~~~~v~~~~-~g~~yg~~E~~~Aa~~~~~l~~~Ka~aA~~ale~~~N~i~~~Gd-------- 171 (336) T protein:vir:36 103 YGDYSSDGD--SGANINYPQRQSYFFQ-TWTRWGERELEMAGAGRVDLASELNYSSALGLAKFLNGSYLFGV-------- 171 (336) T ss_pred eeccCCCce--eecccceeeeeEEEEE-eeeeeCHHHHHHHHHhCCCcHHHHHHHHHHHHHHhhCcEEEEec-------- Confidence 344445432 2222222222221111 112222222 23466777777777777888877776443211 Q ss_pred cccccccc-ccCccee-ecccccccccchhhhHHHHHHHHHHHHHHhhhcC---C-CCCCCEEEeCHHHHHHHhcchhhh Q lcl|NC_015249. 150 ASDENIAG-LGKAHVL-EVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNY---V-PSADRVFYTTPDNYSAILAALMPN 223 (347) Q Consensus 150 ~~~~~~~~-~~~g~~i-~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~---V-P~~gR~~vv~P~~~~~Ll~~~~~~ 223 (347) ......| +-..++. .++.+ ...-..++++.+++.|..+...|-... 
+ +...-.++|||..+..|-.-..+ T Consensus 172 -~~~~~yGllNdP~l~a~~t~~--t~~~~~~t~~ei~~Di~~~~~~l~~qt~G~i~~~~~~tL~LP~~~~~~Ls~~n~~- 247 (336) T protein:vir:36 172 -AGLENYGLINDPSLSAPITAT--TPWSGSPAVEAVVNEVVALFQVLQTQSQGIITQEDVLRMGLPPTAMSDLSKTNQY- 247 (336) T ss_pred -cccceEEEEecCCCccccccC--CCcccccCHHHHHHHHHHHHHHHHHhcCCeeeeccccEEEechHHHHhccCCCcc- Confidence 1111111 1111110 01111 111223456788999998888877743 2 24456899999999888532211 Q ss_pred hhhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhh Q lcl|NC_015249. 224 AANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTV 303 (347) Q Consensus 224 ~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v 303 (347) +..-...++. +.-+++|...+.+-..++. ... -...+.. +.+... +.++.. T Consensus 248 --g~Tvl~~lk~----n~Pnl~i~t~pEl~~a~g~-------------~~~---l~~~~~~-~~~t~~-~~~p~~----- 298 (336) T protein:vir:36 248 --GLAAAAKLKD----IFPKLEFVTIPEYDTASGR-------------LVQ---LWAPRVE-GKDTAT-CGFTEK----- 298 (336) T ss_pred --CccHHHHHHH----hcCccEEEEccccccCCCc-------------eEE---EEEEecC-CCccee-eecchh----- Confidence 0000011111 1344566666655211100 000 0000001 111111 111111 Q ss_pred hhcceeeeeeechhhhcceeeeeeee-cccccccceEEEEEEcC Q lcl|NC_015249. 304 KLKDMALERARRANFQADQIIAKYAM-GHGGLRPEACGALVFNK 346 (347) Q Consensus 304 ~~~~~~~e~~~d~~~~~d~i~~~~a~-G~~~~Rpe~a~~i~~~~ 346 (347) -.-+.+ .+....+.+....+. 
|.-+.||-+.+- ..-- T Consensus 299 -~~~l~v----q~~~~~~~v~~~~rt~Gv~i~~P~ai~~-~~GI 336 (336) T protein:vir:36 299 -MRAHSI----ERYSSYFRQKKSAGTWGAVIFRPFAVAQ-MIGV 336 (336) T ss_pred -hhccce----eecCceeEeccccceeeeeeeccchhee-eecC Confidence 000011 122222333333333 456666664432 2222 No 203 >protein:vir:94933 Length: 330 # NCBI annotation: putative phage structural protein # Family: family:all:1120 # MgeID: mge:1538 # MgeName: Xp15 # Cross-refs: genbank:acc:YP_239278;genbank:gi:66392060;genbank:GeneID:5076578 Probab=65.37 E-value=0.28 Score=23.58 Aligned_cols=285 Identities=12% Similarity=0.080 Sum_probs=122.9 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcc-eeeeeecCCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRT-KAAYLQPGENL 79 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~-~~~~~~~g~~~ 79 (347) |+..+=+. ++- |.-......|.+.|.+.|-++...+...+. |+..++++.-.. .+.-++.++.+ T Consensus 25 m~alTLae-------a~~-------l~~d~~~~~VIE~l~~~s~iL~~lpf~~ve-~~~~~~~r~~~lp~a~~r~~n~~~ 89 (330) T protein:vir:94 25 MPTVTLAE-------SAK-------LSQDHLVSGLIETIVEVNPLYEMMPFTEIE-GNALAYNRENVLGDVQFLAVGGTI 89 (330) T ss_pred hhhhhhhH-------Hhh-------cCchhhHHHHHHhhhccchHHhhccccccc-CCcceeeeeecCCcceeeeccccc Confidence 44333111 111 122345677778887776444444433333 556666654332 22333333333 Q ss_pred CCccCCCCCceEEEEEEeeeecccccccHHHHHhC-----hhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc Q lcl|NC_015249. 80 DDKRKDMKHTERTINIDGLLTADVLIYDIEDAMNH-----YDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN 154 (347) Q Consensus 80 ~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~q~~-----~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~ 154 (347) +.. .+.+...++.+ ...+. -+-++|+.-++ .|.+.+-.+...++|++++...++.- .+.... 
T Consensus 90 ~~~---~~~Tf~q~t~~-l~~l~-~~~~Vd~~iadl~g~~~d~~~~q~~~~ieal~~~~e~~linG-------Ds~~~~- 156 (330) T protein:vir:94 90 TAK---NPATFTKVTSE-LTTLI-GDAEVNGLIQATRSDFMDQTSVQVASKAKSIGRQYQASMITG-------DGTGNS- 156 (330) T ss_pred ccc---Ccceeeeeeec-hhhhh-hhHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHhhcc-------CCCCcc- Confidence 211 11121122221 11222 22356665532 36777777778888888876655421 000000 Q ss_pred cccc----cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcC-CCCCCCEEEeCHHHHHHHhcchhhhhh-h-h Q lcl|NC_015249. 155 IAGL----GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNY-VPSADRVFYTTPDNYSAILAALMPNAA-N-Y 227 (347) Q Consensus 155 ~~~~----~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~-VP~~gR~~vv~P~~~~~Ll~~~~~~~~-~-~ 227 (347) -.|+ ..+..+..++.+.. .+... +|.| |+... -|-..-+++++..+...|.+-.|-... . + T Consensus 157 F~GL~~~~~~~q~i~tg~~gg~-----~T~d~-LDeL------l~~v~~~~g~~~~~l~n~a~~r~I~a~~R~~~~~~v~ 224 (330) T protein:vir:94 157 FQGMMGLVAASQTISAGANGGT-----LTFEL-LDQL------LDLVKDKDGQVDYLMSSFAMRRKYFSLLRALGGAAIG 224 (330) T ss_pred ccchhhcCCcccEEecCCCCCC-----CCHHH-HHHH------HHHhcCCCCCCcEEEechhHHHHHHHHHHhccCCCCC Confidence 0011 12233333222111 11121 2333 33321 123345899888887777665553321 1 1 Q ss_pred ccccccccceEEEEeceEEEEecceeccccccccccccccccccccccccccccccc--cc--ccceEEEEechhhhhhh Q lcl|NC_015249. 228 QALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTR--VA--LDNVVGLFNHRSAVGTV 303 (347) Q Consensus 228 ~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~--~~--~~~~~~l~~~~~Av~~v 303 (347) .-..+.-...|-.+.|++|+.++.+|........ ...+. --+.+ .+ .-.++||-..... T Consensus 225 ~~~~~~~G~~v~~~~GvPi~~~d~ip~~~~~~~~----~~tts---------Iyav~~G~~~~~qgV~Gl~~~g~~---- 287 (330) T protein:vir:94 225 EVMTLPSGRQIPTYRGVPWFVNDFIPSNMTQGTA----TNATA---------IFAGTFDDGSNKYGIAGLTARGSA---- 287 (330) T ss_pred CcccccCCCEEeeeCCeEEEecccccCCCCcccC----CCcee---------EEEEeecccccccceEeecCCCCC---- Confidence 1233345666789999999999999975321000 00000 00111 11 1134454322211 Q ss_pred hhcceeeeeee--chh-hhcceeeeeeeecccccccceEEEEEE-cCC Q lcl|NC_015249. 
304 KLKDMALERAR--RAN-FQADQIIAKYAMGHGGLRPEACGALVF-NKA 347 (347) Q Consensus 304 ~~~~~~~e~~~--d~~-~~~d~i~~~~a~G~~~~Rpe~a~~i~~-~~a 347 (347) .+.++-.- +.. -..+.| ..-+|..++.|++++.|.= .-+ T Consensus 288 ---glsVr~~G~~~~k~v~~~~v--~~y~~~av~~~~a~~~L~~V~~g 330 (330) T protein:vir:94 288 ---GLRVQNVGAKENADETITRV--KMYCGFANFSQLGLAAIKGLIPG 330 (330) T ss_pred ---cceeeeCCCccccceeeEEE--EEeeeeEEechhheeeeccccCC Confidence 12222211 111 111222 2236778888887776642 222 No 204 >protein:vir:99424 Length: 360 # NCBI annotation: hypothetical protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:1595 # MgeName: BJ1 # Cross-refs: genbank:acc:YP_919080;genbank:gi:119757038;genbank:GeneID:4606077 Probab=63.62 E-value=0.31 Score=23.34 Aligned_cols=304 Identities=9% Similarity=0.038 Sum_probs=113.4 Q ss_pred CCcccccccc----cccccccc-cccchh--hhhhhhhhhHHHHHHHHHHhhhcccccccccccceEEEeecCcceeeee Q lcl|NC_015249. 1 MAKMNGGQQI----GKDQGKGM-SAGDKL--ALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQSGKSAQFPVLGRTKAAYL 73 (347) Q Consensus 1 ma~~~~~~~~----~t~~~~~~-~~~d~~--al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~G~tv~i~~iG~~~~~~~ 73 (347) |.+.....++ .++.-... ..+|.. .|=-+++...|...+ ..+-++...+... ...++..|+++|-....-+ T Consensus 1 ~~~~~~~~~~~n~~~~~i~k~~it~~~l~~g~L~p~~a~~Fl~~v~-~~t~iL~~~r~~~-~~s~~~ei~kig~G~r~~r 78 (360) T protein:vir:99 1 MSSNSTIDSVRNQNMNSLSQKDIGLAELDGFQLPVDVTEEFLERMQ-KGVQILGMADTMT-LARLEMEVPQFGVPRLSGH 78 (360) T ss_pred CcchhHHHHHhhhHHHHHHhhhccccccCceeecHHHHHHHHHHHh-hccchhhhcceee-cccccccccccccceeecc Confidence 5544432221 11211111 111111 122356666665555 4444445555543 3467777787776543332 Q ss_pred ecCCCCCCc-cCCCCCceEEE-EEEeeeeccccc-ccHHHH----HhC--hhhHHHHHHHHHHHHHHHH----------- Q lcl|NC_015249. 74 QPGENLDDK-RKDMKHTERTI-NIDGLLTADVLI-YDIEDA----MNH--YDVRSEYTAQLGESLAMAA----------- 133 (347) Q Consensus 74 ~~g~~~~~~-~~~~~~~~~~l-~ID~~~~~~~~I-dd~D~~----q~~--~D~r~~~~~~~g~aLa~~~----------- 133 (347) ...+....+ .-++....+.+ ..+...++...+ +++.+- +.. -.++..++++.|+-|.... 
T Consensus 79 ~~~e~~~~~~~~~~~~~~v~~~~~~~~~~~~~i~~~~~~~n~~~~~~~f~~~i~~~~ae~~~~Dle~l~~~g~~ds~d~~ 158 (360) T protein:vir:99 79 TRDEEGSRTENSEAESGSVKFNATDKSYYILVEPKRDALKNTHYGPDQFGDYIVDQFIERYGNDLGLMGIRAGASSGNLQ 158 (360) T ss_pred ccccCCCCCcCCcCccccCccccccceeeEeechHHHHHhhhhcccchhHHHHHHHHHHHHHHHHHHHHhhccchhcccc Confidence 221111110 01121222222 344443443333 222221 111 2356666666665443321 Q ss_pred --------HHHHHHHHHHHhh-hccc-----cccccccccCcceeeccc----ccccccchhhhHHHHHHHHHHHHHHhh Q lcl|NC_015249. 134 --------DGAVLAEMAKLCN-LPSA-----SDENIAGLGKAHVLEVGK----QSELRGDQVKLGQAIIAQLTLARAKLT 195 (347) Q Consensus 134 --------D~~i~~~~~~~a~-~~~~-----~~~~~~~~~~g~~i~~~~----~~~~~~~~~~~~~~~~~~l~~a~~~Ld 195 (347) +...-+ +.|.+. -... -..+....+....-..+. +.+...+.......+ +.++...|. T Consensus 159 ~~~~~d~fl~~~dG-wlKka~~~~~~id~a~d~t~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~l---f~~~~~~Lp 234 (360) T protein:vir:99 159 SIGGAAELDNTFKG-WIARAEGDAQSVDDAGDSTRIGLEDTATADADSMPSIANTDGSGNPQPVDTSL---FNETIQTLD 234 (360) T ss_pred cCcccchhhhhhHH-HHHHhhcccchhhccccccccccccccccccccchhhhccccccccccchHHH---HHHHHHhcc Confidence 111111 111111 0000 000000000000000000 000111111122222 334555565 Q ss_pred hcCC--CCCCCEEEeCHHHHHHHhcchhhhhhh-hccccccccceEEEEeceEEEEecceeccccccccccccccccccc Q lcl|NC_015249. 196 GNYV--PSADRVFYTTPDNYSAILAALMPNAAN-YQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQK 272 (347) Q Consensus 196 e~~V--P~~gR~~vv~P~~~~~Ll~~~~~~~~~-~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~ 272 (347) .+.- |...-+.+++|..+..... .+.++. -.|...+.++..-...|++|+..+.+|... 
T Consensus 235 ~kyr~~~~~~~~~~~s~~~~~~yr~--~L~~R~t~LGd~~l~g~~~~~~~Gipi~~v~~~pd~~---------------- 296 (360) T protein:vir:99 235 SRYRESDAYSPVLMTSPNQVQSYTM--SLTEREDPLGSAVIFGDSDITPFSYDLVGVNGFPDEY---------------- 296 (360) T ss_pred hhhhcCcccceEEEccCchHHHHHH--HHhccCcccchhheecccccccceeeeEEcCCCCCCc---------------- Confidence 5542 1112156677765444432 222222 255556766666678999999999998421 Q ss_pred ccccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhhcc---eeeeeeee-cccccccceEEEEEEcCC Q lcl|NC_015249. 273 HAFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQAD---QIIAKYAM-GHGGLRPEACGALVFNKA 347 (347) Q Consensus 273 ~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d---~i~~~~a~-G~~~~Rpe~a~~i~~~~a 347 (347) .++.+|+=+..+-..++.++...++.+..+ .+...+.. ---+.+=+-||++++..- T Consensus 297 -------------------~mlT~p~NLi~g~~~~iri~~~~e~~~~~~~~~~~~~~~~~~~D~~iee~~Av~~vt~~~ 356 (360) T protein:vir:99 297 -------------------MMFTDPNNLAFGLYEEMELDQSTDTDKVHEQRLHSRNWLEGQFDFQIKEQQAGVLVTDLE 356 (360) T ss_pred -------------------eEEeccCceeEEeeeeeEEeecccchhhhhhceeeeEEEEEEeeEEEEecccEEEEecCC Confidence 123334433333334444443333333222 12222221 122333333555555433 No 205 >protein:vir:101557 Length: 336 # NCBI annotation: gp12 # Family: family:all:1653 # MgeID: mge:1477 # MgeName: Bcep43 # Cross-refs: genbank:acc:NP_958117;genbank:gi:41057663;genbank:GeneID:2716814 Probab=59.62 E-value=0.39 Score=22.83 Aligned_cols=286 Identities=10% Similarity=-0.032 Sum_probs=111.4 Q ss_pred CC-cccccccccccccccccccchhhhhhhhhh--hHHHHHHHHHHhhhcccccccccc--cceEEEee---cCcceeee Q lcl|NC_015249. 1 MA-KMNGGQQIGKDQGKGMSAGDKLALFLKVFG--GEVLTAFTRTSVTMNKHLVRSIQS--GKSAQFPV---LGRTKAAY 72 (347) Q Consensus 1 ma-~~~~~~~~~t~~~~~~~~~d~~al~ie~f~--g~V~~~f~~~s~~~~~~~~r~i~~--G~tv~i~~---iG~~~~~~ 72 (347) .| +++++-.. ...+|-+. |+..|- +.++..++.. +...++.+.+.=. -+++.|+. 
.|..+ - T Consensus 34 da~d~~~~~~~------~~~~~i~~--~l~~~i~p~~~~~~~~p~-~a~~l~pv~t~g~W~~~~~~~~~~e~~G~a~--~ 102 (336) T protein:vir:10 34 DAADLSPHLSS------TGSSGIPN--YLTTYVDPAVIDILVAPM-KAAELVGESKKGDWTTLVAAFITAEPTTKVA--T 102 (336) T ss_pred hhhhccCcccc------CCCchhHH--HHHhhcccceeeehhhhh-hhhhhccccccCCccceeEEEeeeeceeeEE--E Confidence 12 22221111 12223333 666664 3333333322 2333444433111 24555554 45554 3 Q ss_pred eecCCCCCCccCCCCCceEEEEEEeeeecccccccHHH---HHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 73 LQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIED---AMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 73 ~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~---~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) |..+.++|. .+..-...+-.|--.- ..+.+...+. .++..|+-.+-.+.+..+|.+..++..+--. T Consensus 103 ygd~~D~P~--~d~~~~~~~~~v~~~~-~g~~yg~~El~~A~~~g~~l~~~Ka~aA~~ale~~~N~i~~~Gd-------- 171 (336) T protein:vir:10 103 YGDYSSDGD--SGANINYPQRQSYFFQ-TWTRWGERELEMAGAGRVDLASELNYSSALGLAKFLNGSYLFGV-------- 171 (336) T ss_pred eeccCCCce--eecccceeeeeEEEEE-eeeeeCHHHHHHHHHhCCCcHHHHHHHHHHHHHHhhCcEEEEec-------- Confidence 344445432 2222222222221111 1222222222 2356777777777777888887776443211 Q ss_pred cccccccc-ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcC---C-CCCCCEEEeCHHHHHHHhcchhhhh Q lcl|NC_015249. 150 ASDENIAG-LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNY---V-PSADRVFYTTPDNYSAILAALMPNA 224 (347) Q Consensus 150 ~~~~~~~~-~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~---V-P~~gR~~vv~P~~~~~Ll~~~~~~~ 224 (347) ......| +-..++ +...+.....-..++++.+++.|..+...|-.+. 
+ +...-.++|||..+..|-.-..+ T Consensus 172 -~~~~~yGllN~P~l-~a~~t~~t~~~~~~t~eei~~Di~~~~~~l~~qs~G~i~~~~~~tL~LP~~~~~~Ls~~n~~-- 247 (336) T protein:vir:10 172 -AGLENYGLINDPSL-SAPITATTPWSGSPAVEAVVNEVVALFQVLQTQSQGIITQEDVLRMGLPPTAMSDLSKTNQY-- 247 (336) T ss_pred -cccceEEEEeCCCC-ccccccCCCcccccCHHHHHHHHHHHHHHHHHhcCCeecccCcceEEecHHHHHhccCCCcc-- Confidence 1111111 111111 1000111111223456788999998888887743 2 24457899999999888532211 Q ss_pred hhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhh Q lcl|NC_015249. 225 ANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVK 304 (347) Q Consensus 225 ~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~ 304 (347) +..-...++. +.-+++|...+.+-..++. ... -...+.. +.+... +.++.. T Consensus 248 -g~Tvl~~lk~----n~Pnl~i~t~pEl~~a~G~-------------~~~---l~~~~~~-~~~t~~-~~~p~~------ 298 (336) T protein:vir:10 248 -GLAAAAKLKD----IFPKLEFVTIPEYDTASGR-------------LVQ---LWAPRVE-GKDTAT-CGFTEK------ 298 (336) T ss_pred -CccHHHHHHH----hcCccEEEEccccccCCCc-------------eEE---EEEEecC-CCccee-eecchh------ Confidence 0000011111 1344566666555211100 000 0000001 111111 111111 Q ss_pred hcceeeeeeechhhhcceeeeeeee-cccccccceEEEEEEcC Q lcl|NC_015249. 305 LKDMALERARRANFQADQIIAKYAM-GHGGLRPEACGALVFNK 346 (347) Q Consensus 305 ~~~~~~e~~~d~~~~~d~i~~~~a~-G~~~~Rpe~a~~i~~~~ 346 (347) -.-+.+ .+....+.+....+. 
|.-+.||-+.+- ..-- T Consensus 299 ~~~l~v----q~~~~~~~v~~~~rt~Gv~i~~P~ai~~-~~GI 336 (336) T protein:vir:10 299 MRAHSI----ERYSSYFRQKKSAGTWGAVIFRPFAVAQ-MIGV 336 (336) T ss_pred hhccce----eecCceeEeccccceeeeeeeccchhee-eecC Confidence 000001 122222333333333 456666664432 2222 No 206 >protein:vir:78558 Length: 336 # NCBI annotation: major capsid protein # Family: family:all:1653 # MgeID: mge:1854 # MgeName: BcepNY3 # Cross-refs: genbank:acc:YP_001294848;genbank:gi:149882911;genbank:GeneID:5291029 Probab=58.87 E-value=0.4 Score=22.74 Aligned_cols=286 Identities=11% Similarity=-0.010 Sum_probs=110.9 Q ss_pred CCccccccccccccccc-ccccchhhhhhhhhh--hHHHHHHHHHHhhhccccccc---ccccceEEEe---ecCcceee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKG-MSAGDKLALFLKVFG--GEVLTAFTRTSVTMNKHLVRS---IQSGKSAQFP---VLGRTKAA 71 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~-~~~~d~~al~ie~f~--g~V~~~f~~~s~~~~~~~~r~---i~~G~tv~i~---~iG~~~~~ 71 (347) .|+-. .|++. .+.+=+-+ |+.-|- +.++..+.... ...++.+.+ +. -+++.|+ ..|..+ T Consensus 34 da~d~-------~~~~~t~~~~g~~~-~l~~~i~p~~~~~~~~~~~-~~~l~~v~t~g~W~-~~~~~~~~~e~~G~a~-- 101 (336) T protein:vir:78 34 DAADL-------SPHLSSTGSSGIPN-YLTTYVDPSVIDILVAPMK-AAELVGESKKGDWT-TLVAAFITAEPTTTVA-- 101 (336) T ss_pred hhhhh-------ccccccCCCcchHH-HHHHhcccceeeehhhhhh-hhhhcccccCCCcc-ccEEEEeeeecceeeE-- Confidence 12211 22222 11111221 555554 33444444332 223333333 21 2455664 445554 Q ss_pred eeecCCCCCCccCCCCCceEEEEEEeee-ecccccccHHHH-HhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 72 YLQPGENLDDKRKDMKHTERTINIDGLL-TADVLIYDIEDA-MNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 72 ~~~~g~~~~~~~~~~~~~~~~l~ID~~~-~~~~~Idd~D~~-q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) -|..+.+++. .+..-++.+-.|-..- -+.+-+..+..+ ++..|+-.+-.+.+..+|.+..++..+.-. 
T Consensus 102 ~ygd~~D~P~--vd~~~~~~~~~v~~~~~g~~yg~~El~~A~~~g~~l~~~Ka~aA~~ale~~~N~~~~~Gd-------- 171 (336) T protein:vir:78 102 TYGDYSSDGD--SGTNINYPQRQSYFFQTWTRWGERELEMAGAGRVDLASELNYSSALGLAKFLNGSYLFGV-------- 171 (336) T ss_pred EeecccCCCe--eecceeeEEEEEEEEEeeeeecHHHHHHHHHhCCCcHHHHHHHHHHHHHHhhCeEEEEec-------- Confidence 3344445432 2232233222221111 111122222222 256777777777777777777776433211 Q ss_pred ccccccccc-cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcC---C-CCCCCEEEeCHHHHHHHhcchhhhh Q lcl|NC_015249. 150 ASDENIAGL-GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNY---V-PSADRVFYTTPDNYSAILAALMPNA 224 (347) Q Consensus 150 ~~~~~~~~~-~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~---V-P~~gR~~vv~P~~~~~Ll~~~~~~~ 224 (347) ....+.|+ -..++ +...+.....-..++++.+++.|..+...|.... + |...-.+++||..+..|-.-..+ T Consensus 172 -~~~~~~GllN~P~l-~a~~t~~~~~w~~~T~~~I~~Di~~~~~~l~~qt~g~~~~~~~~tL~Lp~~~~~~L~~~n~~-- 247 (336) T protein:vir:78 172 -AGLENYGLINDPSL-SAPITATTPWSGSPAVEAVVNEVVTLFQVLQTQSQGIITQEAVLHMGLPPTAMSDLSKTNQY-- 247 (336) T ss_pred -cccceEEEEeCCCC-CcccccCcCcccccCHHHHHHHHHHHHHHHHHhcCCeeeeccceEEEechHHHHhccCCCcc-- Confidence 11111111 11111 1000111111123557789999998888876554 2 44456899999999999643211 Q ss_pred hhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhh Q lcl|NC_015249. 225 ANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVK 304 (347) Q Consensus 225 ~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~ 304 (347) +..-...+++ +.-+++|...+.|-..+ +.... -...+.++..+-.+. 
+-.+--...++ T Consensus 248 -g~tv~~~lk~----n~Pnl~i~t~pel~~Ag-------------g~~~~---~~~~~~~~~~t~~~~-~p~~f~~lpvq 305 (336) T protein:vir:78 248 -GLSAAAKLKE----IFPKLEFVTIPEYDTAS-------------GRLVQ---LWAPRVEGKDTATCG-FTEKMRAHSIE 305 (336) T ss_pred -CccHHHHHHH----hcCccEEEEcccccccC-------------cceEE---EEEeeccCCcceeee-cchhhhcccee Confidence 0000111221 13345666655552111 00000 000011111111111 11011001111 Q ss_pred hcceeeeeeechhhhcceeeeeeee-cccccccceEEEEEEcC Q lcl|NC_015249. 305 LKDMALERARRANFQADQIIAKYAM-GHGGLRPEACGALVFNK 346 (347) Q Consensus 305 ~~~~~~e~~~d~~~~~d~i~~~~a~-G~~~~Rpe~a~~i~~~~ 346 (347) +....+.+....+. |.-+.||-+..-+ .-- T Consensus 306 -----------~~~~~~~v~~~~rt~Gv~i~~P~ai~~~-~GI 336 (336) T protein:vir:78 306 -----------RYSSYFRQKKSAGTWGAVIFRPFAVAQM-IGV 336 (336) T ss_pred -----------ecCceeEeccccceeeeeeeccchheee-ccC Confidence 12222333333333 4555666543322 222 No 207 >protein:vir:103886 Length: 302 # NCBI annotation: putative major head subunit protein # Family: family:all:776 # MgeID: mge:1522 # MgeName: D3112 # Cross-refs: genbank:acc:NP_938242;genbank:gi:38229147;genbank:GeneID:2648201 Probab=52.39 E-value=0.56 Score=21.98 Aligned_cols=278 Identities=14% Similarity=0.067 Sum_probs=118.6 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHH-HhhhcccccccccccceEEEeecCcc-eeeeeecCCC Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRT-SVTMNKHLVRSIQSGKSAQFPVLGRT-KAAYLQPGEN 78 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~-s~~~~~~~~r~i~~G~tv~i~~iG~~-~~~~~~~g~~ 78 (347) |.-.+ ....+|+. -|...+..+|+.. +-.+.+.+. .-+..++-+...+|.. .+... 
.|+- T Consensus 1 m~it~---------------~~l~~l~~-~~~~~~~~~y~~a~~~~~~~a~~-~~sdf~~~~~~~lg~~p~l~e~-~Ge~ 62 (302) T protein:vir:10 1 MLINK---------------QSLNAAFV-AIKTIFNNAFAAAPTTWQKIAME-VPSNTSSNDYKWLSTFPKMRRW-IGAK 62 (302) T ss_pred CcccH---------------HHHHHHHH-HHHHHHHHHHHhhhhhhhceeee-cCCCcceeeceecCCCCCcccc-ccce Confidence 32111 11122332 4555566666544 233333322 2234555555566543 22111 1211 Q ss_pred CCCccCCCCCceEEEEEEeeeecccccc--cHHHHHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccccc-- Q lcl|NC_015249. 79 LDDKRKDMKHTERTINIDGLLTADVLIY--DIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDEN-- 154 (347) Q Consensus 79 ~~~~~~~~~~~~~~l~ID~~~~~~~~Id--d~D~~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~-- 154 (347) ....+....-+|.+.++. -.+.|. +|... ++..-..+.+++|++-++..|+.++..+..+.+. ...+.+ T Consensus 63 ---~~~~l~~~~~~i~~~~~g-~~v~i~R~~i~nD--dlg~~~~~~~~~G~aaa~~~~~lv~~~L~~g~~~-~~~DG~~f 135 (302) T protein:vir:10 63 ---VVKNLKAYKYVVENEDFE-ATVEVDRNDIEDD--QIGIYSPQAKMAGYSAAQLPDELVYEAVNGAFTK-PCFDGQYF 135 (302) T ss_pred ---eeccccccceeEEeeccc-ceecccHHhhccc--ccchhHHHHHHHHHHHHhhHHHHHHHHHhccCCC-cccCCcce Confidence 123344455556555442 223332 33222 3567788899999999999999999877542211 001111 Q ss_pred -cccccCcce--eecccccccccchhhhHHHHHHHHHHHHHHhhhc---CCCCCCCEEEeCHHHHH---HHhcchhhhhh Q lcl|NC_015249. 155 -IAGLGKAHV--LEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGN---YVPSADRVFYTTPDNYS---AILAALMPNAA 225 (347) Q Consensus 155 -~~~~~~g~~--i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~---~VP~~gR~~vv~P~~~~---~Ll~~~~~~~~ 225 (347) -+.|+.+.- ..++...-...... .....+++.+.++.++... .+--..+++||+|.... 
.|+.+.+..+ T Consensus 136 F~~dH~~g~~~~~N~g~~~~~~~~~~-l~~~~~~aa~~am~~~k~~~G~~L~i~P~~LiVp~~le~~A~~ll~~~~~~~- 213 (302) T protein:vir:10 136 IDTDHPVGDASVSNKGTAPLSNASQA-AAKAGYGAARTAMKKFKDEEGRSLNVSPNVLLVGPALEDVAKMLLTNPKLAD- 213 (302) T ss_pred ecccccccccccccccchhhhhcccc-cchHHHHHHHHHHHHHhhhcccccccCCCEEEecchhHHHHHHHhhccccCC- Confidence 111111100 00000000000011 1122344444444443221 22223689999997544 3444443321 Q ss_pred hhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechhhhhhhhh Q lcl|NC_015249. 226 NYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRSAVGTVKL 305 (347) Q Consensus 226 ~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~ 305 (347) +......|. ++++.++.|. +++.|-..... ..-+. +++.. . T Consensus 214 ---g~~Np~~g~------~~~vv~p~L~--s~~aWyL~a~~-------------------~~i~~--~~l~g-------~ 254 (302) T protein:vir:10 214 ---NTPNPYVGT------AELVVDGRIE--SDTAWFLLDTT-------------------KPVKP--FIFQP-------R 254 (302) T ss_pred ---CCcceeccc------eEEEEeeccC--CCCceEEEecC-------------------Cccce--EEEcC-------c Confidence 222222332 5888888874 22223222110 01111 12211 2 Q ss_pred cceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 306 KDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 306 ~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +.+.++..-++...+=++...+.||+ +.-+++-..+.+. 
T Consensus 255 ~~P~~~~~~~~~~dgv~~k~~~d~Gv---d~R~~~G~~~wq~ 293 (302) T protein:vir:10 255 KQPEFVSQVNLDSDDVFNLRKLKFGA---EARAAAGYGFWQL 293 (302) T ss_pred cccEEEeccCCCCCceEEEEEEEEee---eeeeecchhhhhh Confidence 33556655555555556666666664 2223333323322 No 208 >protein:vir:107732 Length: 379 # NCBI annotation: gp23 # Family: family:all:1653 # MgeID: mge:1520 # MgeName: BcepB1A # Cross-refs: genbank:acc:YP_024871;genbank:gi:48697513;genbank:GeneID:2948349 Probab=51.64 E-value=0.58 Score=21.90 Aligned_cols=299 Identities=13% Similarity=0.012 Sum_probs=118.5 Q ss_pred CCcc--cccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhcccccccccc--cceEEEee---cCcceeeee Q lcl|NC_015249. 1 MAKM--NGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKHLVRSIQS--GKSAQFPV---LGRTKAAYL 73 (347) Q Consensus 1 ma~~--~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~~~r~i~~--G~tv~i~~---iG~~~~~~~ 73 (347) |.-. ..++..++-......+|=+. |+.-|...+.+....--+...++.+.+.=. -+++.|+. .|..+. | T Consensus 56 md~~~~~~~~~~~~~l~~~~~~g~~~--~l~~~~p~~i~~~tap~~a~~l~pv~t~g~W~~~~~~~~v~e~~G~A~~--y 131 (379) T protein:vir:10 56 MDSNDIGPIPTPLSPLSPVSIPGLIQ--FLQNWLPGHVRILTAVREADEFLGLSTVGQWDDEQIVQRVLEGLGTAQP--Y 131 (379) T ss_pred hccccccccccccCccccccccchHH--HHHhhcchHHHHHhhhhhhhhhcccccCCCceeeeEEEeeeeeeeeeEE--e Confidence 4432 22221111111111222233 777776444333333334455555554211 25555554 455543 3 Q ss_pred ecCCCCCCccCCCCCceEEEEEEeeeecccccccHHH---HHhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccc Q lcl|NC_015249. 74 QPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIED---AMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSA 150 (347) Q Consensus 74 ~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~---~q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~ 150 (347) ..+.+.+...-+.+-.++.+..=+ ..+.+.+++. .++..|+-.+-.+.+..+|.+..|+..|--. . . 
T Consensus 132 gd~~d~pl~d~~~~~~~r~v~~~~---~g~~yg~~El~~Aa~~g~~l~~~Ka~aA~~ale~~~N~i~f~G~------~-d 201 (379) T protein:vir:10 132 TDGGNMALMSWTPTFETRTVVRFE---AGLQVAPLEEARSSRVQVSSADEKRAMVGEALEVQRNRVAFYGY------N-D 201 (379) T ss_pred ccccCCCeeeeeeeeeeeeeEEEE---EEEeecHHHHHHHHHhCCChHHHHHHHHHHHHHHhhceEEEEee------c-C Confidence 444444221112222223222211 1222333332 2356777777777777888888776543211 0 0 Q ss_pred ccccccc------ccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhc---C-CCCC-CCEEEeCHHHHHHHhcc Q lcl|NC_015249. 151 SDENIAG------LGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGN---Y-VPSA-DRVFYTTPDNYSAILAA 219 (347) Q Consensus 151 ~~~~~~~------~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~---~-VP~~-gR~~vv~P~~~~~Ll~~ 219 (347) ....+.| .+.......++.+... =..++++.+++.|..+...|-.+ . .|.+ ...++++|..+..|..- T Consensus 202 ~~~~~yGllNdP~l~a~~t~atg~~~~t~-Wa~kT~~eI~~Di~~~~~~l~~qs~g~~~~~~~~~tL~LP~~~~~~L~~~ 280 (379) T protein:vir:10 202 GSGRTFGFLNDPNLPAYVAVPNGAGGSPL-WAQKTTLEIIADLRNGLTALQVQSMGRIKSNKTPITIGIPNAYENYITTP 280 (379) T ss_pred CCcceEEEEeCCCCcccccccCCcccccc-cccCCHHHHHHHHHHHHHHHHHhhCCeecccccceeEEecHHHHHhhccc Confidence 0111111 1111111111111111 12456777888888877765544 2 2543 34899999999999743 Q ss_pred hhhhhhhhccccccccceEEEEeceEEEEecceeccccccccccccccccccccccccccccccccccc---ceEEEEec Q lcl|NC_015249. 220 LMPNAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALD---NVVGLFNH 296 (347) Q Consensus 220 ~~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~---~~~~l~~~ 296 (347) ..+ +..-...+++ +.-+++|...+.|-..+++ +..... ......+.-+ ..+-..+. 
T Consensus 281 n~~---g~Tvl~~lk~----n~Pnl~i~t~pEL~~aggg-----------~~~~~~---~~~~~~~~~t~~~~~~~~~~p 339 (379) T protein:vir:10 281 TEL---GYSVAQYMRE----SYPNVTFVSAPELNDANGG-----------SSAIYY---YADAVENNGTDDGRTWLQVVP 339 (379) T ss_pred ccc---CccHHHHHHH----hcCCcEEEEcccccccCCC-----------ccEEEE---EeeccCCCccCCcceEEEecc Confidence 211 1000011111 2445677777666321100 000000 0000000000 11112222 Q ss_pred hhhhhhhhhcceeeeeeechhhhcceeeeeeee-cccccccceEEEEEEcC Q lcl|NC_015249. 297 RSAVGTVKLKDMALERARRANFQADQIIAKYAM-GHGGLRPEACGALVFNK 346 (347) Q Consensus 297 ~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~-G~~~~Rpe~a~~i~~~~ 346 (347) .. ..-+.+ .+....+.+....+. |.-+.||-++ +-.+-+ T Consensus 340 ~k------~~~l~v----e~~~~~~~~~~~~rt~Gv~ir~P~Ai-~~~~G~ 379 (379) T protein:vir:10 340 TK------MFTLGV----EKKIKGYAEGYTNATAGAMLKRPFAT-YRQTGA 379 (379) T ss_pred hh------hhhccc----eecCceeEeccccceeeeeeecchhh-heecCC Confidence 21 101111 112222333333333 4566677643 333333 No 209 >protein:vir:103181 Length: 457 # NCBI annotation: gp135 # Family: family:all:364 # MgeID: mge:1583 # MgeName: Syn9 # Cross-refs: genbank:acc:YP_717802;genbank:gi:113200639;genbank:GeneID:4239190 Probab=47.50 E-value=0.7 Score=21.43 Aligned_cols=306 Identities=17% Similarity=0.074 Sum_probs=140.7 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHHHhhhccc---ccccccccceEE------------Eeec Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVTMNKH---LVRSIQSGKSAQ------------FPVL 65 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~---~~r~i~~G~tv~------------i~~i 65 (347) |..=+| .=-.=|...++..+...+-.=|-|-.|.++.|.-..-..... 
..-+..+....- ...- T Consensus 97 mTgPTG-LIFAmRsrY~~q~~~~~a~~~EAl~nEadt~fSg~~~~~~~~~~~~~~~~~gt~~~~~~~~~~~~~~~~~~~~ 175 (457) T protein:vir:10 97 MTGPTG-LIFAMRTNYGAERNPAAAGYDEAFFNEPNAGFSGGPGAYDPGATGVTNDAEGTNPALLNDSPAGTYEQADDAT 175 (457) T ss_pred CCCcce-eeeeeeeeecCccccccccccceeeeccCcccCcccccccccccccccccccccccccCcccccccccccccc Confidence 322221 101112222222111111112333344555553211000000 000111111110 0111 Q ss_pred CcceeeeeecCCCCCCccCCCCCceEEEEEEeeee--------cccccccHHHHHh-C-hhhHHHHHHHHHHHHHHHHHH Q lcl|NC_015249. 66 GRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLT--------ADVLIYDIEDAMN-H-YDVRSEYTAQLGESLAMAADG 135 (347) Q Consensus 66 G~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~--------~~~~Idd~D~~q~-~-~D~r~~~~~~~g~aLa~~~D~ 135 (347) |-.++. ++.+.....+..-.+.-+.||+... +...+.-..+.++ | .|.-.|++.=++.++..++.+ T Consensus 176 gmsTA~----aE~lgd~~~n~~f~EMaFsIeK~tVtAKSRaLKAEYTiELAQDLKAiHGLDAEtELaNILStEImlEINR 251 (457) T protein:vir:10 176 GMSTAT----VEALDDSTANTAFREMGFSIEKVTVTARARALKAEYSIEMAQDLKAIHGLDAEQELANILSTEILAEINR 251 (457) T ss_pred chhhhh----hhccCCCCCccchhhheeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChhHHHHHHHHHHHHHHhhH Confidence 111111 1121100011122556677776533 4455666666666 4 889999999999999999999 Q ss_pred HHHHHHHHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcCCCCCCCEEEeCHHHHHH Q lcl|NC_015249. 
136 AVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNYVPSADRVFYTTPDNYSA 215 (347) Q Consensus 136 ~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~VP~~gR~~vv~P~~~~~ 215 (347) -|+..+...+..- +..+.....+.......+.....+.....+|...++|.....+-- --.+.|+|.+|+..++ T Consensus 252 eii~~l~~~a~~~-----~~~~~~~~gv~dl~~~~~g~~~~e~~k~L~~~i~~ean~i~~~T~-rg~gn~~i~S~~Va~~ 325 (457) T protein:vir:10 252 EVVRTIYTNAVAG-----AQNNTATAGVFDLDVDSNGRWSVEKFKGLLFQIERDANAIGHQTR-RGKGNILICSADVVSA 325 (457) T ss_pred HHHHhHhhhheee-----eccccccceeeeeeccccchhhHHHHHHHHHHHHHHHHHHHHhhc-cccceEEEEchhHHHH Confidence 9998876443211 111111122222222222122222222222443345554433322 2357999999999999 Q ss_pred Hhcch--hhhhhh--hc---cccccccceEEEE-eceEEEEe----cceecccccccccccccccccccccccccccccc Q lcl|NC_015249. 216 ILAAL--MPNAAN--YQ---ALIDPSTGSIRNV-MGFEVIEV----PHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDT 283 (347) Q Consensus 216 Ll~~~--~~~~~~--~~---~~~~~~~G~Vg~i-~G~~V~~s----n~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y 283 (347) |-... ++..+. .. +.++.....+|.+ .|++||.- +|-|..= . .. -| T Consensus 326 L~~sg~l~~~p~~~~~~~~~~~d~~~~~~~G~l~~r~~vy~D~Ya~~ns~~dy----~------~v------------G~ 383 (457) T protein:vir:10 326 LGMAGVLDYTPALNGNNGLAGVDDTSSTLVGTLNGRIKVYVDPYSANVADKHF----Y------VA------------GY 383 (457) T ss_pred HhhcccccccchhhccccccccccccceeEEEecCCeEEEEecccccCCccce----E------EE------------EE Confidence 87643 233211 11 1234556667776 56788875 3323211 1 01 12 Q ss_pred cccccceEEEEechhhhhhhhhcceeeeeeechhhhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 284 RVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 284 ~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +++..-..+|+|+|=. ++... 
+..||..+-..|-.+.+||- ..+|.+...=--.++ T Consensus 384 KG~~~~~~glfy~PYv----~l~~~---~~~dp~sfqP~~g~~tRY~l-~~NP~~~~~~~~~~~ 439 (457) T protein:vir:10 384 KGTSPYDAGLFYCPYV----PLQQV---RAINPDTFQPKIGFKTRYGM-VSNPFAGGLTQGSGA 439 (457) T ss_pred eCCcceecceeecccc----ccccc---CccCCccccceeeeeeeeee-eeccccccccccccc Confidence 2333334678888863 23332 23499999999999999998 678875542211111 No 210 >protein:vir:99576 Length: 388 # NCBI annotation: hypothetical protein # Family: family:all:1653 # MgeID: mge:1544 # MgeName: BcepF1 # Cross-refs: genbank:acc:YP_001039801;genbank:gi:126011051;genbank:GeneID:4818271 Probab=42.61 E-value=0.88 Score=20.89 Aligned_cols=298 Identities=12% Similarity=0.025 Sum_probs=106.8 Q ss_pred CCcccc---------------------------------ccc--ccccccccccccchhhhhhhhhhhHHHHHHHHHHhh Q lcl|NC_015249. 1 MAKMNG---------------------------------GQQ--IGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRTSVT 45 (347) Q Consensus 1 ma~~~~---------------------------------~~~--~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~s~~ 45 (347) |-++.. ++. ..|-.+ . =..+-|+.-|...|.+.-..--+. T Consensus 30 ~~~~~~~~l~~~g~~~~~~~~~~~~~~~~~~~~~~~a~da~~~~~~t~~~----~-gip~~~~~~~~p~~~~~~~~p~~~ 104 (388) T protein:vir:99 30 LTDMAVRELKKFGLVFDHATVKRQIELLHEGGVATQAFDSAYVAPTTQAS----I-PTPIQFLQQWLPGFVKVLTSARKI 104 (388) T ss_pred eechhhHhhhhcceeccCccchhhhhhhhhhhhhhcccCcccccccccCc----c-cHHHHHhhhhccceeeeeechhhh Confidence 111100 000 001111 1 123334444443332222111233 Q ss_pred hcccccccccc---cceEEEee---cCcceeeeeecCCCCCCccCCCCCceEEEEEEeeeecccccccHHHH---HhChh Q lcl|NC_015249. 46 MNKHLVRSIQS---GKSAQFPV---LGRTKAAYLQPGENLDDKRKDMKHTERTINIDGLLTADVLIYDIEDA---MNHYD 116 (347) Q Consensus 46 ~~~~~~r~i~~---G~tv~i~~---iG~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~~~~~Idd~D~~---q~~~D 116 (347) ..++.+.+ ++ -+++.|+. .|+.++ |..+.+.+-..-+.+-.++++..=+. .+.+.+++.. 
++.+| T Consensus 105 ~~l~pv~t-~g~W~~~~~~f~v~e~~G~A~~--ygd~~D~Pl~d~~~~~~~r~v~~~~~---g~~yg~~El~~A~~~g~~ 178 (388) T protein:vir:99 105 DEILGVKT-VGSWEDQEIVQGIVEPAGTAME--YGDLTNIPLSSWNVNFERRTIVRGEM---GIQVGLLEEGRASAMRIN 178 (388) T ss_pred hhhccccc-cCCccceeEEEeeeecceeEEE--eecccCCCceeccceeeeeeEEEEEe---eeeecHHHHHHHHhhCCC Confidence 44555544 22 34666654 365553 34455543322223333344433232 2334443332 35777 Q ss_pred hHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccccccc-ccCcce----eecccccccccchhhhHHHHHHHHHHHH Q lcl|NC_015249. 117 VRSEYTAQLGESLAMAADGAVLAEMAKLCNLPSASDENIAG-LGKAHV----LEVGKQSELRGDQVKLGQAIIAQLTLAR 191 (347) Q Consensus 117 ~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~~~~~~~~~-~~~g~~----i~~~~~~~~~~~~~~~~~~~~~~l~~a~ 191 (347) +-.+-...+..+|.+..++..|--.+ ......+.| +-..++ ...+.... ++-..++++.+++.|..+. T Consensus 179 l~~~Ka~AA~~ale~~~N~i~f~G~~------g~~~~~~yGllNdP~l~a~v~at~~~~~-~~Wa~kT~~eI~~Di~~~~ 251 (388) T protein:vir:99 179 SAEVKRQGAAVQLEIMRNAIGFYGWE------GKNGNRTFGFLNDPSLLPAIASTTPGGW-VSGGANAFQGIVGDLRLML 251 (388) T ss_pred cHHHHHHHHHHHHHhhhceEEEEeec------CCCccceEEEeeCCCcccccccccCCcC-cccccCCHHHHHHHHHHHH Confidence 87777777888888887765432111 110000111 111111 00111111 1123456788999999888 Q ss_pred HHhhhcC--C--CC-CCCEEEeCHHHHHHHhcchhhhhhhhccccccccceEEEEeceEEEEecceeccccccccccccc Q lcl|NC_015249. 192 AKLTGNY--V--PS-ADRVFYTTPDNYSAILAALMPNAANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGA 266 (347) Q Consensus 192 ~~Lde~~--V--P~-~gR~~vv~P~~~~~Ll~~~~~~~~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~ 266 (347) ..|.... + |. ....++|||..|..|-.-..+ +..-...++. +.-+++|...+.+-..+.+. 
T Consensus 252 ~~i~~qs~g~~~~~~~~~tL~LP~~~~~~Ls~~n~~---g~Tvl~~lk~----n~Pnl~i~t~pEl~~a~~tg------- 317 (388) T protein:vir:99 252 ITLRVQSEDNIDPEDVDITLVLPMNKVDMLSVVTDL---GISVRDWLKQ----TYPRVRVMSAPELQGGNPDD------- 317 (388) T ss_pred HHHHHhcCCeeeecccceEEEechHHHHhccccCcC---CccHHHHHHH----hcCCcEEEEecccccccccC------- Confidence 8875543 2 33 235799999999999533211 1000011111 13344555544442111000 Q ss_pred ccccccccccccc-----cccccccccceEEEEechh-hhhhhhhcceeeeeeechhhhcceeeeeeee-cccccccceE Q lcl|NC_015249. 267 NPTGQKHAFPETS-----SGDTRVALDNVVGLFNHRS-AVGTVKLKDMALERARRANFQADQIIAKYAM-GHGGLRPEAC 339 (347) Q Consensus 267 ~~~~~~~~~~~~~-----~~~y~~~~~~~~~l~~~~~-Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~-G~~~~Rpe~a 339 (347) .+.....-... .++-++..+..+ .+... -+..++.. ...+.+....+. |.-+.||-++ T Consensus 318 --g~~~~~~~~~~~~~~~~~~~~~~~t~~~--~~p~~~~~l~vq~~-----------~~~~~~~~~~rt~Gv~ir~P~Ai 382 (388) T protein:vir:99 318 --GKDIAYMFLDSVDTAVDGSTDGGDTWAQ--LVQSKFVTLGVEKR-----------VKNYVEAYSNATAGVMLKRPWAV 382 (388) T ss_pred --CceeEEEEecccccccccCccCcceeEE--ecccccccccceec-----------CceeEeccccceeeeEEeccchh Confidence 00000000000 000000000000 00000 00011111 111111111111 3344445433 Q ss_pred EEEEEcC Q lcl|NC_015249. 340 GALVFNK 346 (347) Q Consensus 340 ~~i~~~~ 346 (347) +-+ .-- T Consensus 383 ~~~-~GI 388 (388) T protein:vir:99 383 VRL-IGL 388 (388) T ss_pred hee-ccC Confidence 211 111 No 211 >protein:vir:106734 Length: 336 # NCBI annotation: gp13 # Family: family:all:1653 # MgeID: mge:1599 # MgeName: Bcep1 # Cross-refs: genbank:acc:NP_944321;genbank:gi:38638620;genbank:GeneID:2657363 Probab=41.40 E-value=0.93 Score=20.76 Aligned_cols=285 Identities=10% Similarity=-0.030 Sum_probs=107.0 Q ss_pred CCccccccccccccccc-ccccchhhhhhhhhh--hHHHHHHHHHHhhhcccccccccc---cceEEEe---ecCcceee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKG-MSAGDKLALFLKVFG--GEVLTAFTRTSVTMNKHLVRSIQS---GKSAQFP---VLGRTKAA 71 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~-~~~~d~~al~ie~f~--g~V~~~f~~~s~~~~~~~~r~i~~---G~tv~i~---~iG~~~~~ 71 (347) .|+- +.|++. 
.+.+=+-+ |+.-|- +-++..++... ...++.+.+ ++ -+.+.|+ ..|++.. T Consensus 34 da~d-------~~~~~~t~~~~g~~~-~l~~~i~p~~~~~~~~~~~-~~~l~~v~t-~g~w~~~~~~~~~~e~~G~a~~- 102 (336) T protein:vir:10 34 DAAD-------LSPHLSSTGSSGIPN-YLTTYVDPSVIDILVAPMK-AAELVGESK-KGDWTTLVAAFITAEPTTKVAT- 102 (336) T ss_pred hhhh-------hccccccCCCcchHH-HHHhhcCcceeeeeechhc-hhhhccccc-CCCcceeeEEEEeeeeeeeEEE- Confidence 1221 122222 11111221 555554 33344443332 233444433 22 2444444 4455542 Q ss_pred eeecCCCCCCccCCCCCceEEEEEEeee-ecccccccHHHH-HhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_015249. 72 YLQPGENLDDKRKDMKHTERTINIDGLL-TADVLIYDIEDA-MNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPS 149 (347) Q Consensus 72 ~~~~g~~~~~~~~~~~~~~~~l~ID~~~-~~~~~Idd~D~~-q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~~~ 149 (347) |-.+.+++. .+..-.+..-.|--.- -+..-+..+..+ ++..|+-.+-.+.+..+|.+..++..+.-. T Consensus 103 -ygd~~d~P~--~d~~~~~~~~~v~~~~~g~~yg~~El~~A~~~g~~l~~~Ka~aA~~ale~~~N~~~~~Gd-------- 171 (336) T protein:vir:10 103 -YGDYSSDGD--SGTNINYPQRQSYFFQTWTRWGERELEMAGAGRVDLASELNYSSALGLAKFLNGSYLFGV-------- 171 (336) T ss_pred -ccccCCCcc--eeeeeeeeeeeEEEEEEEEeeCHHHHHHHHHhCCCcHHHHHHHHHHHHHHhhCeEEEEee-------- Confidence 222334432 1222222222111110 111122222222 255667666666677777777665432211 Q ss_pred ccccccccc-cCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhcC---C-CCCCCEEEeCHHHHHHHhcchhhhh Q lcl|NC_015249. 150 ASDENIAGL-GKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGNY---V-PSADRVFYTTPDNYSAILAALMPNA 224 (347) Q Consensus 150 ~~~~~~~~~-~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~~---V-P~~gR~~vv~P~~~~~Ll~~~~~~~ 224 (347) ......|+ -..++ +...+.....-..++++.+++.|..+...|.... 
+ |.....+++||..+..|..-..+ T Consensus 172 -~~~~~~GllN~P~l-~a~~t~~~~~w~~~T~~eI~~Di~~~~~~l~~qt~g~i~~~~~~tL~Lp~~~~~~L~~~n~~-- 247 (336) T protein:vir:10 172 -AGLENYGLINDPSL-SAPITATTPWSGSPAVEAVVNEVVTLFQVLQTQSQGIITQEAVLHMGLPPTAMSDLSKTNQY-- 247 (336) T ss_pred -cccceEEEeecCCC-CcccccCcCcccccCHHHHHHHHHHHHHHHHHhcCCeeeeccceEEEechHHHHhccCCCcc-- Confidence 11111111 11111 1000111111123557789999998888875554 3 44456899999999999643211 Q ss_pred hhhccccccccceEEEEeceEEEEecceecccccccccccccccccccccccccccccccccccceEEEEechh-hhhhh Q lcl|NC_015249. 225 ANYQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTRVALDNVVGLFNHRS-AVGTV 303 (347) Q Consensus 225 ~~~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~-Av~~v 303 (347) +..-...+++ +.-+++|...+.|-..++ ..... .....++..+-.+ .+... -...+ T Consensus 248 -g~tv~~~lk~----n~Pnl~i~t~pel~~Agg-------------~~~~~---~~~~~~~~~t~~~--~~P~~f~~lpv 304 (336) T protein:vir:10 248 -GLSAAAKLKE----IFPKLEFVTIPEYDTASG-------------RLVQL---WAPRVEGKDTATC--GFTEKMRAHSI 304 (336) T ss_pred -CccHHHHHHH----hCCccEEEEcccccccCC-------------ceEEE---EEecccCCcceee--ecChhhhccce Confidence 0000111221 133456666655521110 00000 0000111011111 11110 00111 Q ss_pred hhcceeeeeeechhhhcceeeeeeee-cccccccceEEEEEEcC Q lcl|NC_015249. 304 KLKDMALERARRANFQADQIIAKYAM-GHGGLRPEACGALVFNK 346 (347) Q Consensus 304 ~~~~~~~e~~~d~~~~~d~i~~~~a~-G~~~~Rpe~a~~i~~~~ 346 (347) + +....+.+....+. 
|.-+.||-+..- ..-- T Consensus 305 q-----------~~~~~~~v~~~~rt~Gv~i~rP~ai~~-~~GI 336 (336) T protein:vir:10 305 E-----------RYSSYFRQKKSAGTWGAVIFRPFAVAQ-MLGV 336 (336) T ss_pred e-----------ecCceeEeccccceeeeeeeccchhee-eccC Confidence 1 12222333333333 445556654332 1122 No 212 >protein:vir:78148 Length: 123 # NCBI annotation: hypothetical protein # Family: family:all:4955 # MgeID: mge:1847 # MgeName: Min1 # Cross-refs: genbank:acc:YP_001294802;genbank:gi:149882823;genbank:GeneID:5309176 Probab=36.46 E-value=1.1 Score=20.40 Aligned_cols=118 Identities=13% Similarity=0.022 Sum_probs=54.0 Q ss_pred EeCHHHHHHHhcchhhhhhh--hccccccccceEEEEeceEEEEecceeccccccccccccccccccccccccccccccc Q lcl|NC_015249. 207 YTTPDNYSAILAALMPNAAN--YQALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKHAFPETSSGDTR 284 (347) Q Consensus 207 vv~P~~~~~Ll~~~~~~~~~--~~~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~ 284 (347) +|+--+|..++-+.-....- .+.+-.+..+.--+++|.+++.|+|||-.. .+.. ..+...+.+- ..-.+--|. T Consensus 1 vvsdlqfA~~~g~~v~~~aLpRE~aNp~ltG~lpV~~~GltWl~tpnlpg~~--a~vl-Dst~lGgmaD--E~l~~Pgya 75 (123) T protein:vir:78 1 MLSGAQFAKLIGILVDDKALPREQANIVLTGSLPVSAYGLTWVTSRHITGTD--PWLF-DVEQLGGMAD--EKLLSPEFA 75 (123) T ss_pred CcchhhHHHHhcchhcccccccccCCceEecCcceeeeceeeeecCCCCCCc--ccee-ehhhhccccc--cccCCCccc Confidence 55555677777554322111 123334555666779999999999999321 1111 1111111000 000000111 Q ss_pred ccccceEEEEechhhhhhhhhcceeeeeeechh--hhcceeeeeeeecccccccceEEEEEEcCC Q lcl|NC_015249. 285 VALDNVVGLFNHRSAVGTVKLKDMALERARRAN--FQADQIIAKYAMGHGGLRPEACGALVFNKA 347 (347) Q Consensus 285 ~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~--~~~d~i~~~~a~G~~~~Rpe~a~~i~~~~a 347 (347) +. ....++++..|..+ --+|.|+++-.-=.-+..|.+-+-|. 
..+ T Consensus 76 ~~-----------------~~~Gvevkt~Red~~~nD~yriRaRRvTvpiv~EP~Agv~lt-g~g 122 (123) T protein:vir:78 76 PA-----------------GNTGVEASTERAHQGVKDGYLVRGRRNTVAVVTEPMAGVRLT-GTG 122 (123) T ss_pred CC-----------------CCcceeEEeeccccCCCCceEEeeeecceeEEecCccceEEe-eec Confidence 10 01124555566666 55677777655555555555333222 222 No 213 >protein:vir:5670 Length: 514 # NCBI annotation: gp23 # Family: family:all:364 # MgeID: mge:119 # MgeName: KVP40 # Cross-refs: genbank:acc:NP_899609;genbank:gi:34419596;genbank:GeneID:2546039 Probab=35.82 E-value=1.2 Score=20.13 Aligned_cols=301 Identities=13% Similarity=0.104 Sum_probs=126.6 Q ss_pred CCccccccccc-----cccccc-ccccchhhhhhhhhhhHHHHHHHHHHhhhccc------------ccccccccceEEE Q lcl|NC_015249. 1 MAKMNGGQQIG-----KDQGKG-MSAGDKLALFLKVFGGEVLTAFTRTSVTMNKH------------LVRSIQSGKSAQF 62 (347) Q Consensus 1 ma~~~~~~~~~-----t~~~~~-~~~~d~~al~ie~f~g~V~~~f~~~s~~~~~~------------~~r~i~~G~tv~i 62 (347) -+|..+....+ ..++.+ ...++........-.|.+ |.........+ ....+.+|. + T Consensus 144 Eadt~fSG~~~~~~~~~~~~~~~~~~G~~~~~~~t~~~gd~---~~~~~~~~~~~~~~~~~~~~~t~~~~~~a~~~---~ 217 (514) T protein:vir:56 144 QADASFSGQAAASTIADFPTTGAATDGTPYKAEVTTSGGDV---SMRYFLALGAVTLAVAGQMTATEYTDGVAGGL---L 217 (514) T ss_pred ccCcCcccccccccccccccccccccccccccccccccccc---ccccccccccccccccccccccccccccccch---h Confidence 11111100000 000000 001111100000111111 11000000000 000111111 1 Q ss_pred eecC--cceeeeeecCCCCCCccCCCCCceEEEEEEeeee--------cccccccHHHHHh-C-hhhHHHHHHHHHHHHH Q lcl|NC_015249. 63 PVLG--RTKAAYLQPGENLDDKRKDMKHTERTINIDGLLT--------ADVLIYDIEDAMN-H-YDVRSEYTAQLGESLA 130 (347) Q Consensus 63 ~~iG--~~~~~~~~~g~~~~~~~~~~~~~~~~l~ID~~~~--------~~~~Idd~D~~q~-~-~D~r~~~~~~~g~aLa 130 (347) ..+| ..+...-..+ .+.++ .+..-.|.-+.||+... +...|.-..+.++ | .|.-.|++.=++.++. 
T Consensus 218 y~~~~Gm~Ta~aEal~-~lggs-~~~~f~EMaFsIdK~tVtAKSRaLKAEYTiELAQDLKAVHGLDAEtELsNILSTEIm 295 (514) T protein:vir:56 218 VEIDAGMATSQAELQE-NFNGS-SNNEWNEMSFRIDKQVVEAKSRQLKAQYSIELAQDLRAVHGLDADAELSGILANEVM 295 (514) T ss_pred hhhhhhhhhhhhhhcc-cCCCC-cccccceeeeEEEEEEEeeeccceeccccHHHHHHHHHhcCCChHHHHHHHHHHHHH Confidence 1111 1111100000 01111 11122466677776532 4455666666666 3 8899999999999999 Q ss_pred HHHHHHHHHHHHHHhhhccccccccccccCcceeecccccccccchhhhHHHHHHHHHHHHHHhhhc-C-----CC-CCC Q lcl|NC_015249. 131 MAADGAVLAEMAKLCNLPSASDENIAGLGKAHVLEVGKQSELRGDQVKLGQAIIAQLTLARAKLTGN-Y-----VP-SAD 203 (347) Q Consensus 131 ~~~D~~i~~~~~~~a~~~~~~~~~~~~~~~g~~i~~~~~~~~~~~~~~~~~~~~~~l~~a~~~Lde~-~-----VP-~~g 203 (347) .++.+-|++.+...+...... +..+.....+.......+..+ +...++.+..+..++.+. + -- -.+ T Consensus 296 lEINReii~~l~~~atv~~~~--~~~~~~~~G~~d~~~~~d~~~-----~~~~~e~~~~l~~~i~~~an~i~~~T~rg~g 368 (514) T protein:vir:56 296 VELNREIVNLVNSQAQIGKSG--WTQGAGAAGVFDFSDAVDVKG-----ARWAGEAYKALLIQIEKEANEIGRQTGRGNG 368 (514) T ss_pred HHhhHHHHHHHHhheeehhcc--ccccccccccccccccccccc-----chHHHHHHHHHHHHHHHHHHHHHhhcccccc Confidence 999999987776544322221 111111111222211111111 111233333333333321 1 11 247 Q ss_pred CEEEeCHHHHHHHhcchhh--------hhhhhc--cccccccceEEEEeceEEEEecceecccccccccccccccccccc Q lcl|NC_015249. 204 RVFYTTPDNYSAILAALMP--------NAANYQ--ALIDPSTGSIRNVMGFEVIEVPHLTAGGAGEDRPEEGANPTGQKH 273 (347) Q Consensus 204 R~~vv~P~~~~~Ll~~~~~--------~~~~~~--~~~~~~~G~Vg~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~ 273 (347) .|+|.+|+..++|-...-+ ..+... ....+.-|.+. .|++||.-++-|..=.. . 
T Consensus 369 n~~i~S~~Va~~L~~sg~l~~~~~~g~~~~~~~~d~~~~~~aG~l~--~~~~vy~D~y~~~dy~~----------v---- 432 (514) T protein:vir:56 369 NFIIASRNVVSALSMTDTLVGPAAQGMQDGSMNTDTNQTVFAGVLG--GRFKVYIDQYAVNDYFT----------V---- 432 (514) T ss_pred cEEEEchhHHHHHHhhhhhccccccCccccccccccCcceEEEEec--CceEEEecCCCCcceEE----------E---- Confidence 8999999999998653322 111111 11112224332 67899987776642111 1 Q ss_pred cccccccccccccccceEEEEechhhhhhhhhcceeeeeeechhhhcceeeeeeeecc--cccccceEEEEEEcCC Q lcl|NC_015249. 274 AFPETSSGDTRVALDNVVGLFNHRSAVGTVKLKDMALERARRANFQADQIIAKYAMGH--GGLRPEACGALVFNKA 347 (347) Q Consensus 274 ~~~~~~~~~y~~~~~~~~~l~~~~~Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~--~~~Rpe~a~~i~~~~a 347 (347) -|+++..-..+|+|+|= +++.+ -+.+||..+-..|-.+.+||- .|.-.+.+..+.+.-. T Consensus 433 --------G~KG~~~~~~glfyaPY----v~l~~---~~~~dp~sfqP~~g~~tRY~l~~NPy~~~~~~~~~~~~~ 493 (514) T protein:vir:56 433 --------GFKGSTEMDAGVFYSPY----VPLTP---LRGSDSKNFQPVIGFKTRYGVQVNPFADPTASATKVGNG 493 (514) T ss_pred --------EEecCcceecceeeccc----ccccc---ccccCCccccceeeeeeeeceeeCCCCCccccccccCCc Confidence 12233333467899887 22322 355799999999988888874 4544444443332111 No 214 >protein:vir:105778 Length: 358 # NCBI annotation: gp9 # Family: family:all:10995 # MgeID: mge:1501 # MgeName: ES18 # Cross-refs: genbank:acc:YP_224147;genbank:gi:62362222;genbank:GeneID:3342531 Probab=20.32 E-value=2.8 Score=18.14 Aligned_cols=301 Identities=12% Similarity=0.027 Sum_probs=113.5 Q ss_pred CCcccccccccccccccccccchhhhhhhhhhhHHHHHHHHH---Hhhhccccc-ccccccceEEEe-ecCcc--eeeee Q lcl|NC_015249. 1 MAKMNGGQQIGKDQGKGMSAGDKLALFLKVFGGEVLTAFTRT---SVTMNKHLV-RSIQSGKSAQFP-VLGRT--KAAYL 73 (347) Q Consensus 1 ma~~~~~~~~~t~~~~~~~~~d~~al~ie~f~g~V~~~f~~~---s~~~~~~~~-r~i~~G~tv~i~-~iG~~--~~~~~ 73 (347) || .+.+... -..-..|+.+++..-|=++|...+...|+.. .++-++... +++.=|||++.+ ++|.. ++... 
T Consensus 36 ma-an~a~~~-~~~~~~NAv~~v~~D~wr~~D~~~~q~fr~e~~~~l~NDLm~ls~sv~Igktv~~y~~~gd~~~~v~~S 113 (358) T protein:vir:10 36 IA-ANRSNMT-PEWLAVNAVGGFTRDFWAEIDRQVLQLRDQEVGMEIVNDLIGVQTVLPVGKTAKLYNVIGDIADDVSVS 113 (358) T ss_pred Hh-hhHHHhh-hhhheecccccCCHHHHHHHhhhhhhhcccchhHHHHhhhhhccccccHHHHHHHHhhhcCCCceEEEE Confidence 33 1212211 1223345556555447788899999899874 355566655 456668888775 34441 23222 Q ss_pred ecCCCCCCccCCCCCceEE-----EEEEeeeecccccccHHHH-HhChhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_015249. 74 QPGENLDDKRKDMKHTERT-----INIDGLLTADVLIYDIEDA-MNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNL 147 (347) Q Consensus 74 ~~g~~~~~~~~~~~~~~~~-----l~ID~~~~~~~~Idd~D~~-q~~~D~r~~~~~~~g~aLa~~~D~~i~~~~~~~a~~ 147 (347) =.|+... .+++.+.. |-|=+.-|...+= +..-. -..+|....--.+.-..+.+++-.++|.--.. ... T Consensus 114 msGQ~~~----~lD~~~y~~dGtpiPIfdsg~~f~WR-~~~~~~~~g~d~~~daQ~~~~~kv~~~~vdy~lNG~~~-I~v 187 (358) T protein:vir:10 114 IDGQAPF----SFDHTEYASDGDPIPVFTAGYGVNWR-HAAGLNSLGIDLVLDSQMAKMRKFNQKRVNYYLNGDPN-IQV 187 (358) T ss_pred ecccCcc----cccceeeeccCCEeeeeccCcccccc-chhhcCccccchhHHHHHHHHHHHHHHHHhhhhccCCc-eee Confidence 2233321 12222111 1111111111111 11111 13345544444556677777776666543211 112 Q ss_pred ccccccccccccCcceeecccc-----cccccchhhhHHHHHHHH-HHHHHHhhhcCCCCCCCEEEeCHHHHHHHhcchh Q lcl|NC_015249. 148 PSASDENIAGLGKAHVLEVGKQ-----SELRGDQVKLGQAIIAQL-TLARAKLTGNYVPSADRVFYTTPDNYSAILAALM 221 (347) Q Consensus 148 ~~~~~~~~~~~~~g~~i~~~~~-----~~~~~~~~~~~~~~~~~l-~~a~~~Lde~~VP~~gR~~vv~P~~~~~Ll~~~~ 221 (347) ...+..-+..++-...+..++. .+.++. ++++++..+ .++..+|..++--...-.++|+|+.++.|-.. 
- T Consensus 188 ~g~t~~Glrn~~n~~qv~l~~~s~g~NiDltta---t~~a~~~~f~~~l~~~~~~~N~~~~~~~~~vs~ei~~n~~r~-Y 263 (358) T protein:vir:10 188 QSYPAQGIKNHRNTKKINLGSGSGGANIDLTTA---DMTALFAFFGKGAFGTLARANKVAQYDVMWVSPEIWANLAQP-Y 263 (358) T ss_pred cCcccccccCCcceeEEEeccCCCcceeeeccC---CHHHHHHHHHHHHHHHHHhhcccceeeEEEEcHHHHhhhhcc-c Confidence 2222222333333333333322 333332 233344444 55555554444333445778999998877531 1 Q ss_pred hhhhhhccccccccceEEEEece-EEEEecceeccccccccccccccc--ccccccccccccccccccccceEEEEechh Q lcl|NC_015249. 222 PNAANYQALIDPSTGSIRNVMGF-EVIEVPHLTAGGAGEDRPEEGANP--TGQKHAFPETSSGDTRVALDNVVGLFNHRS 298 (347) Q Consensus 222 ~~~~~~~~~~~~~~G~Vg~i~G~-~V~~sn~lp~~~~~~~~~~~~~~~--~~~~~~~~~~~~~~y~~~~~~~~~l~~~~~ 298 (347) + ...+..... --.|.++.++ .|.+...|+-+-.-.......... -|.+...-..-...|..| ..-.+|. T Consensus 264 ~-~~~~~~gTI--l~~vl~~~~va~I~~~~~LsgNeii~~~~~~~vi~plvG~~~gt~~~pR~~p~dd---Y~f~vws-- 335 (358) T protein:vir:10 264 V-VNGVVSGNV--LNAVLPFAPVREIRQTFALSGNEFIAYVRRQDIISPLVGMAVGVVPLPRPLPNVN---YNFQIMS-- 335 (358) T ss_pred c-cccccchhh--HHHhhcccCcccccccccCCCccEEEEEeCCceeeeeecceeeeecCCCCCCCcc---hhhhhhh-- Confidence 1 111211111 1112333333 333444443211110000000000 000000000000001111 1111221 Q ss_pred hhhhhhhcceeeeeeechhhhcceeeeeeeeccccc Q lcl|NC_015249. 299 AVGTVKLKDMALERARRANFQADQIIAKYAMGHGGL 334 (347) Q Consensus 299 Av~~v~~~~~~~e~~~d~~~~~d~i~~~~a~G~~~~ 334 (347) |.| |+++ .|..- .+.+.+||... T Consensus 336 A~g------lqik--~D~~G-----ks~Vv~~~~~~ 358 (358) T protein:vir:10 336 AEG------LQIT--ADDQG-----LSGVVYGANLV 358 (358) T ss_pred hhc------eeee--ecccc-----ceeeEeecccC Confidence 111 2211 12211 12344555444 Done!