Query lcl|NC_020854.1_cdsid_YP_007675017.1 [gene=CPKG_00044] [protein=phage coat protein] [protein_id=YP_007675017.1] [location=24989..26017]
Match_columns 342
No_of_seqs 115 out of 179
Neff 7.4
Searched_HMMs 1612
Date Thu Nov 7 16:02:29 2013
Command /home/guerois/workspace/virfam/python/lib/hhsearch//hhsearch2 -i .//seq/seq_44 -d /home/guerois/workspace/virfam/python/profile_database/capsid_neck_tail.hhm -glob -cpu 7 -o .//seq/HHR/seq_44_vs_rec_db.hhr

 No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
 1 protein:vir:80446 Length: 367 100.0 4E-112 3E-115 631.1 25.2 318 1-339 1-367 (367)
 2 protein:vir:1583 Length: 351 # 100.0 4E-111 2E-114 626.1 29.0 318 1-342 1-331 (351)
 3 protein:vir:5974 Length: 324 # 100.0 9E-111 6E-114 623.9 29.9 314 1-341 1-324 (324)
 4 protein:vir:102944 Length: 330 100.0 5E-110 3E-113 620.0 28.1 316 1-341 1-330 (330)
 5 protein:vir:94989 Length: 349 100.0 1E-106 7E-110 601.4 27.4 315 1-341 1-349 (349)
 6 protein:vir:78387 Length: 349 100.0 2E-106 1E-109 600.0 27.3 315 1-341 1-349 (349)
 7 protein:vir:95131 Length: 325 100.0 2.5E-78 1.5E-81 446.1 25.8 316 1-338 1-325 (325)
 8 protein:vir:95107 Length: 270 100.0 6.3E-74 3.9E-77 421.9 22.6 267 1-315 1-270 (270)
 9 protein:vir:105334 Length: 276 100.0 9.8E-68 6.1E-71 388.0 22.4 268 1-313 1-276 (276)
 10 protein:vir:96262 Length: 274 100.0 8.2E-66 5.1E-69 377.4 22.2 271 1-325 1-274 (274)
 11 protein:vir:95898 Length: 274 100.0 8.2E-66 5.1E-69 377.4 22.2 271 1-325 1-274 (274)
 12 protein:vir:1239 Length: 274 # 100.0 3.5E-65 2.2E-68 373.9 24.2 267 1-312 1-274 (274)
 13 protein:vir:96833 Length: 275 100.0 2.7E-65 1.7E-68 374.5 22.8 265 1-310 3-275 (275)
 14 protein:vir:3613 Length: 272 # 100.0 6.9E-64 4.3E-67 366.9 22.2 265 1-308 1-272 (272)
 15 protein:vir:97433 Length: 274 100.0 2.1E-63 1.3E-66 364.2 23.7 267 1-312 1-274 (274)
 16 protein:vir:94494 Length: 274 100.0 2.1E-63 1.3E-66 364.2 23.7 267 1-312 1-274 (274)
 17 protein:vir:96792 Length: 315 100.0 4.5E-62 2.8E-65 356.9 25.4 302 1-341 1-315 (315)
 18 protein:vir:96123 Length: 274 100.0 8E-61 4.9E-64 350.1 23.8 267 1-315 1-274 (274)
 19 protein:vir:93742 Length: 274 100.0 2.2E-59 1.3E-62 342.2 23.5 267 1-312 1-274 (274)
 20 protein:vir:80930 Length: 278 100.0 1.4E-57 8.7E-61 332.3 21.9 269 1-308 1-278 (278)
 21 protein:vir:9820 Length: 272 # 100.0 6.6E-52 4.1E-55 301.2 23.5 263 1-309 1-272 (272)
 22 protein:vir:3033 Length: 272 # 100.0 6.6E-52 4.1E-55 301.2 23.5 263 1-309 1-272 (272)
 23 protein:vir:739 Length: 231 # 100.0 2.2E-49 1.4E-52 287.3 20.5 227 40-308 1-231 (231)
 24 protein:vir:7990 Length: 273 # 99.9 1E-24 6.3E-28 152.1 22.2 266 1-308 1-273 (273)
 25 protein:vir:9927 Length: 295 # 99.9 5.9E-25 3.7E-28 153.4 15.2 263 1-314 1-295 (295)
 26 protein:vir:105822 Length: 273 99.9 2.3E-23 1.4E-26 144.7 22.1 266 1-308 1-273 (273)
 27 protein:vir:102605 Length: 273 99.9 2.3E-23 1.4E-26 144.7 22.1 266 1-308 1-273 (273)
 28 protein:vir:80180 Length: 381 99.8 3.7E-21 2.3E-24 132.6 20.2 313 1-342 15-347 (381)
 29 protein:vir:94622 Length: 341 99.8 4E-21 2.5E-24 132.4 16.5 307 1-342 12-341 (341)
 30 protein:vir:106647 Length: 303 99.8 7.3E-21 4.5E-24 131.0 17.5 267 1-311 1-303 (303)
 31 protein:vir:9875 Length: 296 # 99.7 2.7E-20 1.7E-23 127.9 15.4 254 1-312 1-296 (296)
 32 protein:vir:108211 Length: 318 99.6 1.2E-17 7.3E-21 113.4 14.0 277 1-310 1-318 (318)
 33 protein:vir:99749 Length: 324 99.6 1.9E-15 1.2E-18 101.3 21.4 273 1-315 30-324 (324)
 34 protein:vir:9309 Length: 324 # 99.5 2.6E-15 1.6E-18 100.6 21.0 273 1-315 30-324 (324)
 35 protein:vir:99075 Length: 392 99.5 1.5E-15 9.6E-19 101.8 19.4 299 1-342 1-339 (392)
 36 protein:vir:9759 Length: 303 # 99.5 2.7E-15 1.7E-18 100.5 20.7 283 1-308 1-303 (303)
 37 protein:vir:97148 Length: 324 99.5 4.6E-15 2.8E-18 99.2 21.0 272 1-315 30-324 (324)
 38 protein:vir:103955 Length: 324 99.5 5.4E-15 3.3E-18 98.8 21.2 273 1-315 30-324 (324)
 39 protein:vir:1638 Length: 298 # 99.5 5.7E-15 3.5E-18 98.7 19.3 279 1-307 1-298 (298)
 40 protein:vir:96223 Length: 324 99.5 2.5E-14 1.6E-17 95.1 20.9 273 1-315 30-324 (324)
 41 protein:vir:41 Length: 299 # N 99.5 5.7E-14 3.5E-17 93.2 22.2 273 1-322 6-299 (299)
 42 protein:vir:78830 Length: 324 99.5 4.4E-14 2.7E-17 93.8 21.1 273 1-315 30-324 (324)
 43 protein:vir:96392 Length: 324 99.5 4.4E-14 2.7E-17 93.8 21.1 273 1-315 30-324 (324)
 44 protein:vir:94771 Length: 298 99.5 4.1E-14 2.5E-17 94.0 20.9 276 1-307 1-298 (298)
 45 protein:vir:4856 Length: 293 # 99.4 1.1E-13 6.8E-17 91.6 22.0 272 1-317 5-293 (293)
 46 protein:vir:105905 Length: 304 99.4 5.8E-14 3.6E-17 93.2 19.9 271 1-340 1-304 (304)
 47 protein:vir:94142 Length: 304 99.4 5.8E-14 3.6E-17 93.2 19.9 271 1-340 1-304 (304)
 48 protein:vir:9574 Length: 300 # 99.4 7.5E-14 4.7E-17 92.5 20.0 277 1-307 1-300 (300)
 49 protein:vir:7771 Length: 330 # 99.4 1.3E-13 7.9E-17 91.3 20.8 285 1-313 1-330 (330)
 50 protein:vir:78223 Length: 333 99.4 3E-13 1.9E-16 89.2 21.3 281 1-310 20-333 (333)
 51 protein:vir:81100 Length: 415 99.4 4.6E-13 2.9E-16 88.2 21.6 283 1-317 120-415 (415)
 52 protein:vir:79987 Length: 415 99.4 4.6E-13 2.9E-16 88.2 21.6 283 1-317 120-415 (415)
 53 protein:vir:98339 Length: 415 99.4 4.6E-13 2.9E-16 88.2 21.6 283 1-317 120-415 (415)
 54 protein:vir:2344 Length: 397 # 99.4 3.5E-13 2.2E-16 88.9 20.9 303 1-342 10-377 (397)
 55 protein:vir:80684 Length: 315 99.4 2.6E-13 1.6E-16 89.6 20.0 289 1-314 1-315 (315)
 56 protein:vir:8187 Length: 311 # 99.4 2.1E-13 1.3E-16 90.0 19.4 284 1-309 1-311 (311)
 57 protein:vir:108303 Length: 418 99.4 3E-13 1.8E-16 89.3 20.1 300 1-342 1-346 (418)
 58 protein:vir:78523 Length: 338 99.4 5.4E-13 3.3E-16 87.8 21.3 284 1-312 10-338 (338)
 59 protein:vir:104256 Length: 458 99.3 5.1E-13 3.1E-16 88.0 20.2 282 1-311 165-458 (458)
 60 protein:vir:9410 Length: 415 # 99.3 1E-12 6.2E-16 86.4 21.6 283 1-317 123-415 (415)
 61 protein:vir:100247 Length: 425 99.3 2.6E-13 1.6E-16 89.6 18.3 283 1-312 130-425 (425)
 62 protein:vir:95763 Length: 297 99.3 9.4E-13 5.8E-16 86.5 20.9 269 1-341 12-297 (297)
 63 protein:vir:191 Length: 385 # 99.3 1E-12 6.3E-16 86.4 21.1 268 1-309 105-385 (385)
 64 protein:vir:1886 Length: 385 # 99.3 1E-12 6.3E-16 86.4 21.1 268 1-309 105-385 (385)
 65 protein:vir:4600 Length: 415 # 99.3 1.7E-12 1.1E-15 85.1 21.7 283 1-317 120-415 (415)
 66 protein:vir:4700 Length: 415 # 99.3 1.7E-12 1.1E-15 85.1 21.7 283 1-317 120-415 (415)
 67 protein:vir:97053 Length: 390 99.3 1.4E-12 8.5E-16 85.6 20.8 268 1-309 113-390 (390)
 68 protein:vir:8102 Length: 543 # 99.3 1.2E-12 7.2E-16 86.0 20.4 275 1-339 251-543 (543)
 69 protein:vir:100135 Length: 418 99.3 1.9E-12 1.2E-15 84.8 21.6 267 1-307 136-418 (418)
 70 protein:vir:4456 Length: 401 # 99.3 7.5E-13 4.7E-16 87.0 19.3 283 1-311 107-401 (401)
 71 protein:vir:4953 Length: 397 # 99.3 2.2E-12 1.4E-15 84.5 21.7 272 1-317 109-397 (397)
 72 protein:vir:3364 Length: 347 # 99.3 3.6E-13 2.3E-16 88.8 16.8 286 1-310 1-347 (347)
 73 protein:vir:94576 Length: 347 99.3 2.2E-13 1.4E-16 90.0 15.2 282 1-306 1-347 (347)
 74 protein:vir:485 Length: 407 # 99.3 2.1E-12 1.3E-15 84.6 20.4 284 1-313 106-407 (407)
 75 protein:vir:81070 Length: 390 99.3 2.7E-12 1.7E-15 84.0 20.8 268 1-309 114-390 (390)
 76 protein:vir:3991 Length: 404 # 99.3 4.9E-12 3E-15 82.6 21.7 273 1-319 116-404 (404)
 77 protein:vir:80213 Length: 334 99.2 2.9E-13 1.8E-16 89.3 13.9 287 1-310 1-334 (334)
 78 protein:vir:94711 Length: 347 99.2 3.7E-13 2.3E-16 88.7 14.3 282 1-314 1-347 (347)
 79 protein:vir:4997 Length: 397 # 99.2 1.3E-11 8.3E-15 80.2 22.3 272 1-317 109-397 (397)
 80 protein:vir:5739 Length: 366 # 99.2 2.9E-12 1.8E-15 83.8 18.5 276 1-321 64-366 (366)
 81 protein:vir:1328 Length: 392 # 99.2 3.4E-12 2.1E-15 83.4 18.9 273 1-339 111-392 (392)
 82 protein:vir:104085 Length: 320 99.2 1E-11 6.3E-15 80.9 21.0 281 1-312 14-320 (320)
 83 protein:vir:4830 Length: 397 # 99.2 1.3E-11 7.8E-15 80.4 21.4 271 1-317 109-397 (397)
 84 protein:vir:10364 Length: 390 99.2 1.6E-11 1E-14 79.7 22.0 268 1-309 114-390 (390)
 85 protein:vir:2430 Length: 318 # 99.2 1.3E-11 7.8E-15 80.4 21.1 277 1-310 14-318 (318)
 86 protein:vir:81160 Length: 371 99.2 2.1E-11 1.3E-14 79.2 22.2 261 1-306 91-371 (371)
 87 protein:vir:2201 Length: 345 # 99.2 1.5E-12 9.1E-16 85.5 15.5 287 1-340 1-345 (345)
 88 protein:vir:6242 Length: 390 # 99.2 4.9E-12 3.1E-15 82.6 18.2 271 1-339 111-390 (390)
 89 protein:vir:78739 Length: 332 99.2 1.5E-12 9.2E-16 85.4 15.2 287 1-318 1-332 (332)
 90 protein:vir:1541 Length: 347 # 99.2 4.5E-12 2.8E-15 82.8 17.6 287 1-310 1-347 (347)
 91 protein:vir:4339 Length: 395 # 99.2 3.8E-11 2.4E-14 77.7 21.8 270 1-311 114-395 (395)
 92 protein:vir:2504 Length: 305 # 99.2 1E-11 6.3E-15 80.9 18.5 270 1-312 1-305 (305)
 93 protein:vir:6212 Length: 434 # 99.2 4.2E-11 2.6E-14 77.5 21.9 278 1-313 143-434 (434)
 94 protein:vir:10450 Length: 344 99.2 1.8E-12 1.1E-15 84.9 14.1 287 1-338 1-344 (344)
 95 protein:vir:8885 Length: 347 # 99.1 4.3E-12 2.6E-15 82.9 15.7 288 1-312 1-347 (347)
 96 protein:vir:1383 Length: 421 # 99.1 2.2E-11 1.4E-14 79.0 19.5 285 1-334 117-421 (421)
 97 protein:vir:99920 Length: 311 99.1 3.6E-11 2.2E-14 77.9 20.5 282 1-308 1-311 (311)
 98 protein:vir:80376 Length: 435 99.1 4.7E-11 2.9E-14 77.2 20.0 277 1-323 130-435 (435)
 99 protein:vir:100884 Length: 389 99.1 9.5E-11 5.9E-14 75.5 21.5 268 1-312 109-389 (389)
 100 protein:vir:7409 Length: 408 # 99.1 1.4E-10 8.4E-14 74.7 21.9 272 1-317 116-408 (408)
 101 protein:vir:1433 Length: 435 # 99.1 3E-11 1.8E-14 78.3 18.2 280 1-323 132-435 (435)
 102 protein:vir:4226 Length: 326 # 99.1 8.6E-11 5.3E-14 75.8 19.9 275 1-309 20-326 (326)
 103 protein:vir:1268 Length: 397 # 99.1 2.3E-10 1.4E-13 73.4 21.3 264 1-338 123-397 (397)
 104 protein:vir:94673 Length: 419 99.0 1.6E-10 1E-13 74.3 20.2 271 1-311 123-419 (419)
 105 protein:vir:100172 Length: 394 99.0 3.3E-10 2E-13 72.6 21.1 275 1-319 111-394 (394)
 106 protein:vir:3845 Length: 395 # 99.0 3.7E-10 2.3E-13 72.3 21.3 270 1-319 105-395 (395)
 107 protein:vir:105038 Length: 428 99.0 9.1E-11 5.7E-14 75.6 17.8 278 1-321 127-428 (428)
 108 protein:vir:102655 Length: 322 99.0 9.5E-11 5.9E-14 75.5 17.5 291 1-341 13-322 (322)
 109 protein:vir:1025 Length: 408 # 99.0 6.5E-10 4E-13 70.9 22.1 274 1-320 116-408 (408)
 110 protein:vir:3136 Length: 322 # 99.0 7.8E-11 4.8E-14 76.0 15.2 297 1-313 1-322 (322)
 111 protein:vir:102119 Length: 404 98.9 9.5E-10 5.9E-13 70.0 20.7 277 1-311 110-404 (404)
 112 protein:vir:101607 Length: 379 98.9 1.2E-09 7.7E-13 69.4 20.8 261 1-308 109-379 (379)
 113 protein:vir:3870 Length: 400 # 98.9 9.1E-10 5.6E-13 70.2 19.9 258 1-339 137-400 (400)
 114 protein:vir:102873 Length: 392 98.9 2E-09 1.2E-12 68.3 21.6 268 1-316 106-392 (392)
 115 protein:vir:102082 Length: 392 98.9 2E-09 1.2E-12 68.3 21.6 268 1-316 106-392 (392)
 116 protein:vir:107593 Length: 392 98.9 2E-09 1.2E-12 68.3 21.6 268 1-316 106-392 (392)
 117 protein:vir:105004 Length: 392 98.9 2E-09 1.2E-12 68.3 21.6 268 1-316 106-392 (392)
 118 protein:vir:6324 Length: 335 # 98.9 1.6E-10 9.9E-14 74.3 15.7 287 1-312 1-335 (335)
 119 protein:vir:81227 Length: 413 98.9 2.4E-09 1.5E-12 67.8 21.7 278 1-310 118-413 (413)
 120 protein:vir:4511 Length: 409 # 98.9 6.7E-10 4.2E-13 70.9 18.1 281 1-314 117-409 (409)
 121 protein:vir:8420 Length: 477 # 98.9 4.7E-10 2.9E-13 71.7 16.5 293 1-313 157-477 (477)
 122 protein:vir:9704 Length: 394 # 98.9 1.3E-09 8.2E-13 69.3 18.9 258 1-315 127-394 (394)
 123 protein:vir:103323 Length: 364 98.9 1.1E-09 6.9E-13 69.7 18.3 303 1-331 1-364 (364)
 124 protein:vir:100057 Length: 375 98.8 1.9E-09 1.2E-12 68.4 18.4 290 1-315 1-375 (375)
 125 protein:vir:78935 Length: 335 98.8 3.4E-10 2.1E-13 72.5 14.1 286 1-312 1-335 (335)
 126 protein:vir:96762 Length: 632 98.8 2.2E-09 1.3E-12 68.1 18.1 264 1-308 357-632 (632)
 127 protein:vir:4197 Length: 314 # 98.8 8.8E-09 5.5E-12 64.7 20.6 275 1-314 14-314 (314)
 128 protein:vir:95376 Length: 425 98.7 1.1E-08 6.6E-12 64.3 20.2 272 1-311 141-425 (425)
 129 protein:vir:9361 Length: 402 # 98.7 1.8E-09 1.1E-12 68.5 16.0 258 1-311 133-402 (402)
 130 protein:vir:2685 Length: 387 # 98.7 2E-09 1.3E-12 68.2 15.9 258 1-311 118-387 (387)
 131 protein:vir:94424 Length: 387 98.7 2E-09 1.3E-12 68.2 15.9 258 1-311 118-387 (387)
 132 protein:vir:96978 Length: 387 98.7 2E-09 1.3E-12 68.2 15.9 258 1-311 118-387 (387)
 133 protein:vir:93881 Length: 387 98.7 2.7E-09 1.7E-12 67.6 16.2 258 1-311 118-387 (387)
 134 protein:vir:1084 Length: 437 # 98.7 1.7E-08 1E-11 63.2 20.0 266 1-313 156-437 (437)
 135 protein:vir:4092 Length: 390 # 98.7 1.1E-08 6.8E-12 64.2 17.7 270 1-314 87-390 (390)
 136 protein:vir:99675 Length: 324 98.6 2.8E-09 1.8E-12 67.4 14.2 272 40-339 1-324 (324)
 137 protein:vir:962 Length: 397 # 98.6 3.6E-08 2.2E-11 61.4 19.9 254 1-306 132-397 (397)
 138 protein:vir:174 Length: 423 # 98.6 4.1E-08 2.5E-11 61.1 20.1 302 1-342 1-370 (423)
 139 protein:vir:4159 Length: 315 # 98.6 8.2E-08 5.1E-11 59.4 20.6 275 1-310 19-315 (315)
 140 protein:vir:101650 Length: 497 98.6 7E-08 4.3E-11 59.8 20.0 285 1-309 151-497 (497)
 141 protein:vir:7855 Length: 497 # 98.6 7E-08 4.3E-11 59.8 20.0 285 1-309 151-497 (497)
 142 protein:vir:105374 Length: 423 98.6 4E-08 2.5E-11 61.2 18.3 298 1-342 1-356 (423)
 143 protein:vir:93616 Length: 645 98.5 7.6E-08 4.7E-11 59.6 19.5 277 1-309 338-645 (645)
 144 protein:vir:78640 Length: 352 98.5 2E-08 1.2E-11 62.8 15.4 258 1-311 83-352 (352)
 145 protein:vir:79928 Length: 393 98.5 7.7E-08 4.8E-11 59.6 17.7 297 1-320 74-393 (393)
 146 protein:vir:3525 Length: 423 # 98.5 1.1E-07 6.7E-11 58.8 18.2 303 1-342 1-356 (423)
 147 protein:vir:105522 Length: 423 98.2 1.5E-06 9.5E-10 52.5 19.4 296 1-342 1-356 (423)
 148 protein:vir:93696 Length: 364 98.2 7.6E-07 4.7E-10 54.1 17.5 303 1-317 1-364 (364)
 149 protein:vir:107120 Length: 329 98.2 3.2E-06 2E-09 50.7 20.8 280 1-325 18-329 (329)
 150 protein:vir:97031 Length: 402 98.1 4.2E-07 2.6E-10 55.5 13.8 313 1-342 1-365 (402)
 151 protein:vir:97331 Length: 319 98.1 5.6E-06 3.5E-09 49.4 21.2 279 1-323 1-319 (319)
 152 protein:vir:94800 Length: 319 98.1 5.6E-06 3.5E-09 49.4 21.2 279 1-323 1-319 (319)
 153 protein:vir:80128 Length: 466 98.0 7.1E-07 4.4E-10 54.3 14.5 286 1-333 150-466 (466)
 154 protein:vir:7019 Length: 401 # 97.9 7.4E-07 4.6E-10 54.2 12.0 314 1-342 1-368 (401)
 155 protein:vir:105645 Length: 400 97.8 7.7E-07 4.8E-10 54.1 11.3 317 1-342 1-375 (400)
 156 protein:vir:79008 Length: 299 97.6 3.5E-05 2.2E-08 45.0 20.9 272 1-308 1-299 (299)
 157 protein:vir:3158 Length: 321 # 97.3 9.7E-05 6E-08 42.6 18.7 280 1-320 19-321 (321)
 158 protein:vir:819 Length: 404 # 97.2 4.5E-05 2.8E-08 44.4 13.9 293 1-315 1-404 (404)
 159 protein:vir:3298 Length: 404 # 97.2 4.5E-05 2.8E-08 44.4 13.9 293 1-315 1-404 (404)
 160 protein:vir:104439 Length: 404 97.2 4.5E-05 2.8E-08 44.4 13.9 293 1-315 1-404 (404)
 161 protein:vir:10123 Length: 404 97.2 4.5E-05 2.8E-08 44.4 13.9 293 1-315 1-404 (404)
 162 protein:vir:2770 Length: 318 # 97.0 0.00023 1.4E-07 40.5 17.8 247 1-293 1-318 (318)
 163 protein:vir:9643 Length: 377 # 96.8 0.00034 2.1E-07 39.6 15.4 267 1-308 79-377 (377)
 164 protein:vir:105610 Length: 430 96.8 0.00035 2.2E-07 39.5 18.1 304 1-328 1-430 (430)
 165 protein:vir:1781 Length: 221 # 96.6 0.00032 2E-07 39.7 13.5 188 85-284 1-221 (221)
 166 protein:vir:102335 Length: 312 96.1 0.00095 5.9E-07 37.2 19.3 267 1-310 1-312 (312)
 167 protein:vir:78920 Length: 290 96.1 0.00097 6E-07 37.1 20.1 267 1-308 1-290 (290)
 168 protein:vir:79712 Length: 285 95.7 0.0017 1E-06 35.8 18.6 269 1-308 1-285 (285)
 169 protein:vir:97397 Length: 517 94.7 0.0038 2.3E-06 33.9 14.3 260 1-307 237-517 (517)
 170 protein:vir:9509 Length: 381 # 94.5 0.0041 2.6E-06 33.6 15.4 272 1-320 76-381 (381)
 171 protein:vir:101291 Length: 381 94.5 0.0041 2.6E-06 33.6 15.4 272 1-320 76-381 (381)
 172 protein:vir:95963 Length: 395 94.4 0.0046 2.9E-06 33.4 15.1 274 1-318 86-395 (395)
 173 protein:vir:95451 Length: 313 94.4 0.0046 2.9E-06 33.4 14.1 290 1-309 1-313 (313)
 174 protein:vir:94933 Length: 330 93.5 0.0073 4.6E-06 32.3 17.5 281 1-318 25-330 (330)
 175 protein:vir:97255 Length: 310 93.2 0.0084 5.2E-06 32.0 19.9 278 1-320 1-310 (310)
 176 protein:vir:98635 Length: 377 91.9 0.014 8.4E-06 30.8 11.6 277 1-338 79-377 (377)
 177 protein:vir:99523 Length: 311 91.9 0.014 8.5E-06 30.8 20.5 278 1-306 1-311 (311)
 178 protein:vir:78090 Length: 302 89.6 0.025 1.6E-05 29.3 18.1 282 1-316 1-302 (302)
 179 protein:vir:100632 Length: 381 86.5 0.045 2.8E-05 28.0 16.6 272 1-319 76-381 (381)
 180 protein:vir:105464 Length: 346 85.2 0.054 3.4E-05 27.5 19.8 301 1-342 1-331 (346)
 181 protein:vir:95512 Length: 693 82.1 0.079 4.9E-05 26.6 19.5 271 1-308 394-693 (693)
 182 protein:vir:8324 Length: 410 # 79.2 0.11 6.6E-05 25.9 15.0 261 1-314 127-410 (410)
 183 protein:vir:79548 Length: 652 78.3 0.12 7.2E-05 25.7 18.9 267 1-308 359-652 (652)
 184 protein:vir:96490 Length: 348 76.5 0.14 8.4E-05 25.3 15.8 311 1-339 1-348 (348)
 185 protein:vir:78350 Length: 383 75.6 0.14 9E-05 25.2 13.8 263 1-312 83-383 (383)
 186 protein:vir:95875 Length: 401 70.9 0.2 0.00013 24.4 15.2 289 1-341 1-401 (401)
 187 protein:vir:4074 Length: 480 # 67.2 0.26 0.00016 23.8 12.4 264 1-310 175-480 (480)
 188 protein:vir:2106 Length: 430 # 54.4 0.5 0.00031 22.2 20.7 303 1-342 1-371 (430)
 189 protein:vir:4902 Length: 348 # 54.0 0.51 0.00032 22.2 14.7 308 1-339 1-348 (348)
 190 protein:vir:2736 Length: 348 # 50.5 0.61 0.00038 21.8 16.5 311 1-339 1-348 (348)

No 1
>protein:vir:80446 Length: 367 # NCBI annotation: BcepGomrgp07 # Family: family:all:1522 # MgeID: mge:1882 # MgeName: BcepGomr # Cross-refs: genbank:acc:YP_001210227;genbank:gi:146329919;genbank:GeneID:5123555
Probab=100.00 E-value=4.3e-112 Score=631.11 Aligned_cols=318 Identities=24% Similarity=0.398 Sum_probs=291.4

Q ss_pred Cc-----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhc--cCCCCEEEccccccCCCCcccccCCCc---e
Q lcl|NC_020854.
1 MA-----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNA--TEGGDFINVPFWKANLSGDFEVLSDSS---S 70 (342) Q Consensus 1 Ma-----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~--~~~G~ti~~P~~~~i~~gda~~~~~~~---~ 70 (342) |+ |+++|+|+||||++||.++..++++|+|||++++|++|.+ +.||++++||||++ |+|+++++.+++ + T Consensus 1 M~~~~~~T~l~Dii~pEvF~~Yv~~~~~e~~~l~qSGiv~~d~~l~~~~~~gG~~v~iPf~~~-L~g~~~n~~~d~~~~~ 79 (367) T protein:vir:80 1 MPDFNNQVRLVDAVIPEVYTSYTAIDRPELTAFFLSGAVASNDFLSQFLSAPGRLINIPFWRD-LDSLEPNYGSDNPNVE 79 (367) T ss_pred CcchhhhhhhhhccchhhhhHHHhhhhhhhhhhhhcceeecCHHHHHHhhcCCCEEEeeeecc-CCCCccccCCCCCccc Confidence 99 8999999999999999999999999999999999999975 57999999999996 589999996654 6 Q ss_pred echhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccc--------- Q lcl|NC_020854. 71 LTPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANT--------- 141 (342) Q Consensus 71 i~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~--------- 141 (342) ++|.||+++++++++++|+|||+++||+++++|+|||++|++|+++||.|+.|++||++|+|+|+++.++. T Consensus 80 ~t~~kittg~~~a~v~~r~kaw~~~Dla~~lsG~dpm~~Ia~qva~yW~r~~q~~Lla~L~Gvf~~~~a~~~~~~~~~~~ 159 (367) T protein:vir:80 80 APIDGLGSGEMKTTKTWLNKAYGAMDLTAELAGSNPMTRIRNRFGVYWTRQWQRRIIAMAVGVYKSNLAGNFATIKTRGR 159 (367) T ss_pred ccccccccchheeeeehhcccchhhhHHHHhhCchHHHHHHHHHHHHhhhhhHHHHHHHHHHhhccccccchhhhhhhhc Confidence 89999999999999999999999999999999999999999999999999999999999999999887765 Q ss_pred --------cchhheeeecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhccccee Q lcl|NC_020854. 142 --------SSSAFFDLCIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTST 213 (342) Q Consensus 142 --------~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~ 213 (342) .+++++|++.. 
++++.+.|++++|++|+++|||+.++|++++|||+||++|++++||+|++++++ T Consensus 160 ~~a~~~~~~~~~~~Dis~~--t~~~~~~~s~~~~~~A~~~lGD~~~~l~~i~mHS~V~~~L~~~~li~~i~~sd~----- 232 (367) T protein:vir:80 160 VPAEVLGTAGDMVIDISGQ--TNPADAVFNREAFVDAAFTMGDHVGSIAAIAVHSMVYKRMTNNDEIEFIPDSKG----- 232 (367) T ss_pred cccccccccCceeeeeecc--CCCccceecHHHHHHHHHHhccccccccEEEEchHHHHHHHhccccccccCCCC----- Confidence 46788888644 345677899999999999999999999999999999999999999999999974 Q ss_pred eeccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCc-ceeEeccCC----CcceeEE Q lcl|NC_020854. 214 TQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMA-MQTETDRDI----LAKSDAM 288 (342) Q Consensus 214 ~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~-~~ve~dr~~----~~g~~~l 288 (342) +..|++|+|++|||||+||+..++..++|+||||++|||+|+...| ..+|++|+. ++|+++| T Consensus 233 -------------~~~i~ty~G~~VIvDD~~Pv~~~~a~~~yttYlfg~GAi~~~~~~~~~~~E~~Rd~~~~~~gG~d~L 299 (367) T protein:vir:80 233 -------------QLTIPTYMGKVVIVDDGMPVFGTGADKTYLSILFGGAAFGYADGAPQVPVAVGRRELRGNGSGLEYI 299 (367) T ss_pred -------------ccccceecceeEEEeCCCcccccCCCceEEEEEEecceeeecccCCccceecccchhhhcCCceEEE Confidence 3679999999999999999999988999999999999999987765 334555544 5689999 Q ss_pred EEeeEEEeeecceeeecC-----------------cCCcChHHhcCCcCceeecCccccceEEEEecC Q lcl|NC_020854. 
289 SIDLHYVYHPVGAKWAVT-----------------TTNPTRAQLETVANWSKVYELKNIGIVRATNVS 339 (342) Q Consensus 289 ~~r~~y~~~~~G~s~~~~-----------------~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~ 339 (342) ++|+||++||+|+||+++ ..|||++||++++||+||||||+||||+||||- T Consensus 300 ~~Rr~~~~hP~G~s~~~~~v~~~~~~~~~~~~~~~~~sPt~~eLa~~~NW~~v~d~K~I~iv~~it~g 367 (367) T protein:vir:80 300 LERKEWIVHPGGFNWLDADVTIPDNTGSPSGITSGPPAITLANLANPDNWERVTYRKNVPMAFLVTKG 367 (367) T ss_pred EeeeeEEeecceeeecccccccccccccccccccccCCCChHHhcCCcccccccchhhcceEEEEecC Confidence 999999999999999753 257999999999999999999999999999999 No 2 >protein:vir:1583 Length: 351 # NCBI annotation: minor capsid protein # Family: family:all:1522 # MgeID: mge:32 # MgeName: phig1e # Cross-refs: genbank:acc:NP_695165;swissprot:trembl:o03966;genbank:gi:23455804;uniprot:O03966;genbank:GeneID:955561 Probab=100.00 E-value=3.5e-111 Score=626.12 Aligned_cols=318 Identities=27% Similarity=0.438 Sum_probs=289.4 Q ss_pred Cc-ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhc--cCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA-TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNA--TEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma-T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~--~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|||+||||++||+++++++++|+|||++++|++|.+ +.||++++||||++ ++||+++++++++|++++|+ T Consensus 1 MA~T~lsd~i~PEvf~~yv~~~~~~~~~l~qSG~i~~~~~l~~~~~~~G~~it~P~~~~-l~Gd~~~~~~~~~i~~~kit 79 (351) T protein:vir:15 1 MAETHLSDLIVPEVFGNYVVNQIIKTNRFVQSGILTPDPDLGPHLLEAGTRITVPFLND-LTGDPDNWTDSDDIDVNNLT 79 (351) T ss_pred CCceeeeeeechhHHHHHHhhhhHHhhhHhhcccccccHHHHHHhhcCCCEEEeccccc-CCCcccccCCCcccchheec Confidence 99 7889999999999999999999999999999999999975 35999999999996 58999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 
78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) +++++++|++|+|||+++|++.+++|+|||++|++|+++||+|++|++||++|+|+|++.. ..++++++++..+ + T Consensus 80 t~~~~a~i~~~~kg~~~tD~a~~~sg~dp~~~i~~q~a~~w~~~~q~~lla~l~gv~~~~~--~~~~~~~d~t~~~---~ 154 (351) T protein:vir:15 80 SGKQQGIKFYQTKAYGYTDLGTMISGAPVQETIGNRFAAFWQRADQKTLLSVLKGVMGVTK--IANSKVYDQTKVS---P 154 (351) T ss_pred ccceeEEEEeeccceehhhhhHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhchh--hcccceecccccc---c Confidence 9999999999999999999999999999999999999999999999999999999998754 3455667765433 4 Q ss_pred ccccccHHHHHHHHHHhCcccc-CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGD-KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~-~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) +++.|++++|++|+++|||+.+ .+++|+|||++|++|++++|++|++++++ ...|++|+|+ T Consensus 155 ~~~~is~~~l~~A~~~~GD~~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~------------------~~~i~t~~G~ 216 (351) T protein:vir:15 155 SEPMFGAKGFTGAIGLMGDLQDTAFGAIAVNSATYSLMKVQGLIETIQPQNG------------------ATPFEAYNGL 216 (351) T ss_pred cccccCHHHHHHHHHHhccccccceEEEEEChHHHHHHHhhhhhhhcccccc------------------Ccccceecce Confidence 5678999999999999999754 69999999999999999999999998863 2569999999 Q ss_pred eEEEeCCcceeccC-CCcceEEEEEecceeEeecCCc-ceeEeccCCCcceeEEEEeeEEEeeecceeeec-----CcCC Q lcl|NC_020854. 
237 RVIVSDDVNTAGSG-GSTEYATYFFTQGAVASGEQMA-MQTETDRDILAKSDAMSIDLHYVYHPVGAKWAV-----TTTN 309 (342) Q Consensus 237 ~VvvdD~~p~~~~~-~~~~y~t~l~~~GAi~~~~k~~-~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s~~~-----~~~s 309 (342) +|||||++|+..++ ..++|++|+|++|||+|..+.+ ++++||+..++|+++|++||||++||+||||++ ++.| T Consensus 217 ~VivdD~~p~~~~~~~~~~ytsyl~~~GAi~~~~~~~~ve~~rd~~~~~g~d~l~~r~~~~~hp~G~s~~~~~~~~~~~s 296 (351) T protein:vir:15 217 RIVLDDDIEIDLTDKTKPVSTSYIFAPGAVRYSTNMRSTETKYDPLINGGQDVIVQKRVGTIHVAGTSIKASFSPSKASF 296 (351) T ss_pred EEEEcCCCccccCCCCCceeEEEEEecceeeeecCCcCcceeecccCCCCceEEEEeeeeeeeeeeeeecccccccCcCC Confidence 99999999998655 4458999999999999988776 566677778889999999999999999999973 4679 Q ss_pred cChHHhcCCcCceee--cCccccceEEEEecCCCC Q lcl|NC_020854. 310 PTRAQLETVANWSKV--YELKNIGIVRATNVSNFD 342 (342) Q Consensus 310 Pt~~~L~~~~NW~~v--~d~k~i~~~~~~~~~~~~ 342 (342) ||++||++++||+|| ||||+||||+|++|++.+ T Consensus 297 Pt~~~L~~~~NW~~v~~~d~k~I~iv~~~~~~~~~ 331 (351) T protein:vir:15 297 PTIDELAKSSTWEVVDGIDVRSIGVVAYTAQLDPA 331 (351) T ss_pred cChHHhcCCcccccccCCCccccceEEEEEecCcc Confidence 999999999999999 899999999999999887 No 3 >protein:vir:5974 Length: 324 # NCBI annotation: hypothetical protein # Family: family:all:1522 # MgeID: mge:125 # MgeName: SPP1 # Cross-refs: genbank:acc:NP_690674;genbank:geneid:6329212;genbank:gi:22855068;goa:Q38582;uniprot:Q38582;genbank:GeneID:955303 Probab=100.00 E-value=8.9e-111 Score=623.92 Aligned_cols=314 Identities=39% Similarity=0.635 Sum_probs=289.5 Q ss_pred Cc-ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhc----cCCCCEEEccccccCCCCcccccCCCceechhh Q lcl|NC_020854. 
1 MA-TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNA----TEGGDFINVPFWKANLSGDFEVLSDSSSLTPGK 75 (342) Q Consensus 1 Ma-T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~----~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~ 75 (342) || |+++|||+||||++||+++++++++|+|||++++++++.+ ++||+++++|||++ ++||+++++++++|++++ T Consensus 1 MA~T~lsd~i~peVf~~yv~~~~~~~~~l~qSg~i~~~a~i~~~l~~~~~G~~i~~P~~~~-l~Gd~~~v~~~~~i~~~~ 79 (324) T protein:vir:59 1 MAYTKISDVIVPELFNPYVINTTTQLSAFFQSGIAATDDELNALAKKAGGGSTLNMPYWND-LDGDSQVLNDTDDLVPQK 79 (324) T ss_pred CCceeeeceechhHHHHHHHhhhHHHHHHhhcccccccHHHHHHhhccCCCCEEEeccccc-CCCcccccCCCcccchhh Confidence 99 7899999999999999999999999999999999988753 57999999999996 599999999999999999 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) |++++++++|++|+|||+++|++.+++|+|||++|++|+++||+|++|++||++|+|+|+++.++ ++.++++ T Consensus 80 l~t~~~~a~i~~~~k~~~~tD~a~~~sg~dp~~~i~~q~a~~~~~~~~~~lia~l~g~~~~~~~~---~~~~dvs----- 151 (324) T protein:vir:59 80 INAGQDKAVLILRGNAWSSHDLAATLSGSDPMQAIGSRVAAYWAREMQKIVFAELAGVFSNDDMK---DNKLDIS----- 151 (324) T ss_pred cccceeeEEEEeecCceeehhhhhhhccchHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc---cceeeee----- Confidence 99999999999999999999999999999999999999999999999999999999999876543 3444443 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 
156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) +.+.+.|++++|++|+++|||+.+++++|+|||++|++||++++++|++++++ .+.|++|+| T Consensus 152 a~~~~~~s~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~------------------~~~i~~~~G 213 (324) T protein:vir:59 152 GTADGIYSAETFVDASYKLGDHESLLTAIGMHSATMASAVKQDLIEFVKDSQS------------------GIRFPTYMN 213 (324) T ss_pred ccccceecHHHHHHHHHHhCCcccCcEEEEEchHHHHHHHHhhhhhhcccccc------------------Cceeeeecc Confidence 33456799999999999999999999999999999999999999999998874 256999999 Q ss_pred ceEEEeCCccee-ccCCCcceEEEEEecceeEeecCC-cceeEeccCCCcceeEEEEeeEEEeeecceeee---cCcCCc Q lcl|NC_020854. 236 LRVIVSDDVNTA-GSGGSTEYATYFFTQGAVASGEQM-AMQTETDRDILAKSDAMSIDLHYVYHPVGAKWA---VTTTNP 310 (342) Q Consensus 236 ~~VvvdD~~p~~-~~~~~~~y~t~l~~~GAi~~~~k~-~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s~~---~~~~sP 310 (342) ++|||||+||+. .+++.++|++|+|++|||+|.+++ ++.+|++|++++|+++|++|+||++||+||||+ .++.|| T Consensus 214 ~~VivdD~~p~~~~~~~~~~y~s~l~~~GAi~~~~~~~~v~vE~dRd~~~g~~~l~~r~~~~~~p~G~s~~~~~~~~~sP 293 (324) T protein:vir:59 214 KRVIVDDSMPVETLEDGTKVFTSYLFGAGALGYAEGQPEVPTETARNALGSQDILINRKHFVLHPRGVKFTENAMAGTTP 293 (324) T ss_pred cEEEEeCCCCccccCCCCceEEEEEEecCeEEEeecCCCcceecccCccccceEEEEeeEEEeEeeeEEecccccCCCCC Confidence 999999999985 455678999999999999999865 578999999999999999999999999999995 457899 Q ss_pred ChHHhcCCcCceeecCccccceEEEEecCCC Q lcl|NC_020854. 
311 TRAQLETVANWSKVYELKNIGIVRATNVSNF 341 (342) Q Consensus 311 t~~~L~~~~NW~~v~d~k~i~~~~~~~~~~~ 341 (342) |++||++++||+||+|||+||||+||+|.+- T Consensus 294 t~~~L~~~~NW~~v~~~k~i~i~~~~~~~~~ 324 (324) T protein:vir:59 294 TDEELANGANWQRVYDPKKIRIVQFKHRLQA 324 (324) T ss_pred ChhhhcCCcccccccCccccceEEEEeeccC Confidence 9999999999999999999999999999888 No 4 >protein:vir:102944 Length: 330 # NCBI annotation: major head protein # Family: family:all:1522 # MgeID: mge:1461 # MgeName: EJ-1 # Cross-refs: genbank:acc:NP_945286;genbank:gi:39653721;uniprot:Q708M6;genbank:GeneID:2672858 Probab=100.00 E-value=4.7e-110 Score=619.95 Aligned_cols=316 Identities=34% Similarity=0.605 Sum_probs=288.1 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhc--cCCCCEEEccccccCCCCcccccCCCc-eechh Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNA--TEGGDFINVPFWKANLSGDFEVLSDSS-SLTPG 74 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~--~~~G~ti~~P~~~~i~~gda~~~~~~~-~i~~~ 74 (342) || |+++|||+||||++||+++.+++++|+|||++++|++|.+ +.||+++++|||++ ++|+++++.+++ +|+++ T Consensus 1 Ma~~~T~l~d~i~pevf~~yv~~~~~~~~~l~qSG~i~~~~~i~~~~~~~G~~i~~P~~~~-l~G~~~~~~dg~~~i~~~ 79 (330) T protein:vir:10 1 MANELTKILDTITPQQYNAYMQQYTAAKSAFVQSGIAVSDERVSKNITSGGLLVNMPFWND-LTGDSEVLGNGDKALETG 79 (330) T ss_pred CCCCceEeeeeechhHHHHHHHHHhHHhhhhhhcccccccHHHHHHhhcCCCEEEeccccc-CCCcccccCCCccccchh Confidence 99 8999999999999999999999999999999999999976 36999999999996 599999998885 89999 Q ss_pred hcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 75 KITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 75 ~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) +|++++++++|++|+|||+++|++.+++|+|||++|++|+++||+|+.|++||++|+|+|++..+........ 
.+..+ T Consensus 80 ki~t~~~~a~i~~~~k~~~~tD~a~~~~g~dp~~~i~~q~a~~w~~~~q~~lla~l~gvf~~~~~~~~~~~~~-~~~~~- 157 (330) T protein:vir:10 80 KITAGADIACVLYRGRGWAANELTGVVAGSDPVRAILNRIGAYWLREDQKALIATLNGIFATGTAGEKGALEE-THVSD- 157 (330) T ss_pred hcccceeEEEEEeecceeeehhhhhhhcchhHHHHHHHHHHHHhhhhHHHHHHHHHHhhhhhhhcccchhhhh-hheec- Confidence 9999999999999999999999999999999999999999999999999999999999999876655443322 12222 Q ss_pred cccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) ...+.+.|++++|++|+++|||+.+++++|+|||++|++||+++||+|++++++ .+.|++|+ T Consensus 158 ~~~~~a~~s~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~------------------~~~i~~~~ 219 (330) T protein:vir:10 158 QSKASTGIDAGMVLDAKQLLGDSADQVTAIAMHSAVYTKLQKDNLIQYIQPTTA------------------TINIPTYL 219 (330) T ss_pred ccccccccCHHHHHHHHHHhccccccceEEEEcHHHHHHHHHhhhhhhhccccc------------------Cccccccc Confidence 234567899999999999999999999999999999999999999999998864 25699999 Q ss_pred cceEEEeCCcceeccCCCcceEEEEEecceeEeecCCc---ceeEeccCCCcceeEEEEeeEEEeeecceeeec-----C Q lcl|NC_020854. 235 GLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMA---MQTETDRDILAKSDAMSIDLHYVYHPVGAKWAV-----T 306 (342) Q Consensus 235 G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~---~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s~~~-----~ 306 (342) ||+|||||++|+. 
.++|++|+|++|||+|.++++ +.+|++|++++|+++|++|+||++||+||||++ + T Consensus 220 G~~VivdD~~p~~----~~~yt~yl~~~GAi~~~~~~~~~~v~~EtdRd~~~g~~~l~~r~~~~~hp~G~s~~~~~~~~~ 295 (330) T protein:vir:10 220 GYRVIIDDGIAPT----GDIYTSYLFRTGSIGLNTGNPSGLTTFETSREAAKGNDMIYTRRALVMHPYGVKWTGAEVDAG 295 (330) T ss_pred ceEEEEeCCCCCC----CCceeEEEEecCceeeecccCCccccccccCCccccceEEEEeeEEEeeeeeeeecccccccC Confidence 9999999999975 368999999999999987654 789999999999999999999999999999974 4 Q ss_pred cCCcChHHhcCCcCceeecCccccceEEEEecCCC Q lcl|NC_020854. 307 TTNPTRAQLETVANWSKVYELKNIGIVRATNVSNF 341 (342) Q Consensus 307 ~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~~~ 341 (342) +.|||++||++++||+||||||+||||+||+|.+- T Consensus 296 ~~sPt~~~L~~~~NW~~v~~~k~i~iv~~~~~~~~ 330 (330) T protein:vir:10 296 NITPSNADLAKFKNWKRVYEPKNIGIIALKHKIGK 330 (330) T ss_pred cCCcChHHhcCCcCcccccChhhcceEEEEEecCC Confidence 67999999999999999999999999999999877 No 5 >protein:vir:94989 Length: 349 # NCBI annotation: hypothetical protein # Family: family:all:1522 # MgeID: mge:1547 # MgeName: KS7 # Cross-refs: genbank:acc:YP_224029;genbank:gi:62327316;genbank:GeneID:5176817 Probab=100.00 E-value=1.1e-106 Score=601.39 Aligned_cols=315 Identities=25% Similarity=0.453 Sum_probs=280.3 Q ss_pred Cc-ceeccccchh--HHHHHHHhhhHHhhhhhhcCccccchhhhc--cCCCCEEEccccccCCCCccc-ccCCCc---ee Q lcl|NC_020854. 
1 MA-TLRSDIIIPE--VFTPYVIEQTTQRDAFLASGVVQPMTELNA--TEGGDFINVPFWKANLSGDFE-VLSDSS---SL 71 (342) Q Consensus 1 Ma-T~~~d~i~Pe--v~~~yv~~~~~~~~~f~~sg~~~~d~~l~~--~~~G~ti~~P~~~~i~~gda~-~~~~~~---~i 71 (342) || |+++|+|+|| ||++||.++..++++|+|||++++|++|.+ +.||++++||||++ |+|+.+ ++.+.+ ++ T Consensus 1 Ma~T~l~D~iipe~~vf~~Yv~~~~~e~~~l~qSGii~~d~~l~~~~~~gG~~~~iPf~~~-l~g~~e~n~~~dt~~~~~ 79 (349) T protein:vir:94 1 MAITTIGNIVTGNIPVLASYMTEDPVEKTAFFNSGILTPTPYAAEIARGPSNIANLPFWKA-IDTSIEPNYSNDVYQDIA 79 (349) T ss_pred CCceEEeeeeccChHHHHHHHHHhHHHhhhhhhccceeccHHHHHHHhcCCCEEEeeeeec-CCCCcccccCCCCccccc Confidence 99 7899999998 899999999999999999999999999975 57999999999995 578754 555543 69 Q ss_pred chhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc-----hhh Q lcl|NC_020854. 72 TPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSS-----SAF 146 (342) Q Consensus 72 ~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~-----~~~ 146 (342) +|.||+++++++++++|+|+|+++||+.+++|+|||++|++|+++||.|+.|+.||++|+|+|+++.+...+ +++ T Consensus 80 t~~kit~~~~~a~~~~r~kaw~~~Dla~~lsG~dpm~~Ia~~va~yW~r~~q~~Lia~L~Gvf~~~~~~~~~~~~~~~~~ 159 (349) T protein:vir:94 80 TPRAIQTGEMMARVAYLNEGFGQADLTVELTSQNPLQSVASRLDNFWQRQAQRRLIATALGLYNDNVSATDAYHEQNDMV 159 (349) T ss_pred ccccccccceeeeeeeeccccchhHHHHHhhCchHHHHHHHHHHHHHhhHHHHHHHHHHHhhhcccccccccccccCcee Confidence 999999999999999999999999999999999999999999999999999999999999999987665433 333 Q ss_pred eeeecccccccccccccHHHHHHHHHHhCcc-----ccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceee Q lcl|NC_020854. 147 FDLCIDSESGDTPTALSPRHVAEARAILGDQ-----GDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMA 221 (342) Q Consensus 147 ~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~-----~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~ 221 (342) +++ .+.+.+++++|++|+++|||. .++|++++|||+||++|++++||+|+++++. 
T Consensus 160 ~d~-------~~~a~~~~~~~~~A~~~~Gdaa~Gd~~~~lt~i~mHS~v~~~L~~~~li~~i~~s~~------------- 219 (349) T protein:vir:94 160 VDV-------SATSGFDAGAFIDATQTMGDALMGNGGEVLGAIAMHSFVYAQARKAQLIDFIRDAEN------------- 219 (349) T ss_pred EEe-------cccCCCChhhHHHHHHHHHHHhccccccceeEEEEchHHHHHHHhcchhhhccCccc------------- Confidence 333 245568999999999888876 6899999999999999999999999998763 Q ss_pred cccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCc-ceeEeccCC----CcceeEEEEeeEEEe Q lcl|NC_020854. 222 AAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMA-MQTETDRDI----LAKSDAMSIDLHYVY 296 (342) Q Consensus 222 ~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~-~~ve~dr~~----~~g~~~l~~r~~y~~ 296 (342) ++.|++|+||+|||||+||+.+++++++|+||+|++|||+|+++.| ..+|++|+. ++|+++|++|+||++ T Consensus 220 -----~~~i~ty~G~~VivDD~~Pv~~~g~~~~yttylfg~GAi~~~~~~~~~~~E~~rd~~~g~~~G~d~L~~R~~~~~ 294 (349) T protein:vir:94 220 -----NTMFATYQGYRVIVDDSMTVVGQDTSRKFISIIFGQGAIGYGEGNPEMPLEYEREASRANGGGVETLWTRKTWLL 294 (349) T ss_pred -----CcccceecCcEEEEeCCCccccCCCCceEEEEEeecceEEeecCCCCcceeeecccccCCcceeEEEEEeeEEEe Confidence 4679999999999999999999999999999999999999998764 344555443 568899999999999 Q ss_pred eecceeeecC----------cCCcChHHhcCCcCceeecCccccceEEEEecCCC Q lcl|NC_020854. 
297 HPVGAKWAVT----------TTNPTRAQLETVANWSKVYELKNIGIVRATNVSNF 341 (342) Q Consensus 297 ~~~G~s~~~~----------~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~~~ 341 (342) ||+|+||+++ +.|||++||++++||+||||||+||||+||||.+- T Consensus 295 hp~G~s~~~a~v~~~~~~~~~~sPt~aeLa~~~NW~~v~~~K~I~iv~~~~~~~a 349 (349) T protein:vir:94 295 HPFGYSFTSAVITGNGTETIARSASWQDLANAANWNRVVDRKHVPIAFLVTGVGA 349 (349) T ss_pred eeeeeeecccccCCCccccccCCCChHHhcCCcCcccccChhhcceEEEEeccCC Confidence 9999999754 25899999999999999999999999999999888 No 6 >protein:vir:78387 Length: 349 # NCBI annotation: putative coat protein # Family: family:all:1522 # MgeID: mge:1851 # MgeName: SETP3 # Cross-refs: genbank:acc:YP_001110837;genbank:gi:134288598;genbank:GeneID:5179650 Probab=100.00 E-value=2.1e-106 Score=599.98 Aligned_cols=315 Identities=24% Similarity=0.454 Sum_probs=279.5 Q ss_pred Cc-ceeccccchh--HHHHHHHhhhHHhhhhhhcCccccchhhhc--cCCCCEEEccccccCCCCccc-cc-CCC--cee Q lcl|NC_020854. 1 MA-TLRSDIIIPE--VFTPYVIEQTTQRDAFLASGVVQPMTELNA--TEGGDFINVPFWKANLSGDFE-VL-SDS--SSL 71 (342) Q Consensus 1 Ma-T~~~d~i~Pe--v~~~yv~~~~~~~~~f~~sg~~~~d~~l~~--~~~G~ti~~P~~~~i~~gda~-~~-~~~--~~i 71 (342) || |+|+|+|+|| ||++||.++..++++|+|||++++|++|.+ +.||++++||||++ |+|++| ++ +++ +++ T Consensus 1 Ma~T~l~D~iipe~~vf~~Yv~~~~~e~~~l~qSGii~~d~~l~~~~~~gG~~~~iPf~~~-L~g~~e~nv~~D~~~~~~ 79 (349) T protein:vir:78 1 MAITTIGDIVTGNIPVLASYMTEDPVEKTAFFDSGILTSTPYAAEIANGPSNIANLPFWKA-IDTSIEPNYSNDVYQDIA 79 (349) T ss_pred CCceEEeeeeccCHHHHHHHHHHhhHHhhhhhhccceeccHHHHHHhhcCCCEEEeeeeec-CCCCcccccCCCCccccc Confidence 99 7899999999 899999999999999999999999999975 57999999999995 577654 44 333 478 Q ss_pred chhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc-----chhh Q lcl|NC_020854. 72 TPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTS-----SSAF 146 (342) Q Consensus 72 ~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~-----~~~~ 146 (342) +++||+++++++++++|+|+|+++||+.+++|+|||++|++|+++||.|+.|++||++|+|+|+++.++.. 
++++ T Consensus 80 t~~kitt~~~~a~~~~r~kaw~~~Dla~~lsG~dpm~~Ia~~va~yW~r~~q~~Lia~L~Gvf~~~~~a~~~~~~~~~~t 159 (349) T protein:vir:78 80 TPRAIQTGEMMARVAYLNEGFGQADLTVELTSQNPLQSVASRLDNFWQRQAQRRLIATALGLYNDNVSATDAYHEQNDMV 159 (349) T ss_pred ccccccccceeeeeeeeccccchhHHHHHhhCchHHHHHHHHHHHHHhhHHHHHHHHHHHHhhcccccccchhhhcccce Confidence 99999999999999999999999999999999999999999999999999999999999999987655443 3445 Q ss_pred eeeecccccccccccccHHHHHHHHHHhCcc-----ccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceee Q lcl|NC_020854. 147 FDLCIDSESGDTPTALSPRHVAEARAILGDQ-----GDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMA 221 (342) Q Consensus 147 ~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~-----~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~ 221 (342) +++. +++.++++.|++|.++|||. .++|++++|||++|++|++++||+|+++++. T Consensus 160 ~d~s-------~~a~~~~~~~~dA~~~lgda~~Gd~~~~lt~i~mHS~v~~~L~~~~li~~i~~s~~------------- 219 (349) T protein:vir:78 160 VDVS-------ATLGFDAGAFIDATQTMGDALMGNGGEVLGAIAMHSFVYAQARKAQLIDFIRDAEN------------- 219 (349) T ss_pred eeec-------cccCCChhhhhhhHHHHHHHhccccccceeEEEEchHHHHHHHhhhhhhhccCccc------------- Confidence 5442 34468999999999888885 6899999999999999999999999998763 Q ss_pred cccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCc-ceeEeccCC----CcceeEEEEeeEEEe Q lcl|NC_020854. 222 AAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMA-MQTETDRDI----LAKSDAMSIDLHYVY 296 (342) Q Consensus 222 ~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~-~~ve~dr~~----~~g~~~l~~r~~y~~ 296 (342) ++.|++|+||+|||||+||+.++++.++|+||+|++|||+|+...| ..+|++|+. 
++|+++|++||||++ T Consensus 220 -----~~~i~ty~G~~VivDD~~Pv~~~g~~~~yttylfg~GAi~~~~~~~~~~~et~rd~~~g~~~G~d~l~~R~~~~~ 294 (349) T protein:vir:78 220 -----NTMFATYQGYRVIVDDSMTVVGQGAQRKFISIIFGQGAIGYGEGNPVMPLEYEREASRANGGGVETLWTRKTWLL 294 (349) T ss_pred -----CcccceecCeEEEEeCCCccccCCCCceEEEEEeecceEEEccCCCccceeeecccccCCcceeEEEEEeeEEEe Confidence 4679999999999999999999999999999999999999988765 345555554 468899999999999 Q ss_pred eecceeeecC----------cCCcChHHhcCCcCceeecCccccceEEEEecCCC Q lcl|NC_020854. 297 HPVGAKWAVT----------TTNPTRAQLETVANWSKVYELKNIGIVRATNVSNF 341 (342) Q Consensus 297 ~~~G~s~~~~----------~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~~~ 341 (342) ||+|+||+++ +.|||++||++++||+||||||+||||+||||.+- T Consensus 295 hp~G~s~~~a~v~~~~~~~~~~sPt~aeLa~~~NW~~v~~~K~I~iv~~~~~~~a 349 (349) T protein:vir:78 295 HPFGYRFTSAVITGNGTETIARSASWQDLANATNWNRVVDRKHVPIAFLVTGVGA 349 (349) T ss_pred eeeeeeeccccccCCccccccCCCChHHhcCCcCcccccChhhcceEEEEeccCC Confidence 9999999744 36899999999999999999999999999999888 No 7 >protein:vir:95131 Length: 325 # NCBI annotation: hypothetical protein ORF010 # Family: family:all:47 # MgeID: mge:1552 # MgeName: PA73 # Cross-refs: genbank:acc:YP_001293417;genbank:gi:148912838;genbank:GeneID:5228206 Probab=100.00 E-value=2.5e-78 Score=446.06 Aligned_cols=316 Identities=17% Similarity=0.183 Sum_probs=252.5 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhc--CccccchhhhccCCCCEEEccccccCCCC--cccccCCCceechhhc Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLAS--GVVQPMTELNATEGGDFINVPFWKANLSG--DFEVLSDSSSLTPGKI 76 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~s--g~~~~d~~l~~~~~G~ti~~P~~~~i~~g--da~~~~~~~~i~~~~l 76 (342) |+-....+|||+++..|++.-..+..+|..+ |+++..+.+.. 
|+++++|||+++.+| +.+++++.+++++.+| T Consensus 1 m~lsD~~vfN~~~~~a~~e~~~q~~~~fn~as~gai~l~~~~~~---Gd~~~~pf~~~l~g~~~~~~~~~~~~~vt~~ki 77 (325) T protein:vir:95 1 MALSDLAVYSEYAYSAFSETLRQQVDLFNTATGGAIMLQSAAHQ---GDFSDVAFFAKVTGGLVRRRNAYGSGTVAEKVL 77 (325) T ss_pred CchhhhhhhhhhhhhhhhhhhhhhHhhhhhcccceeEecccccc---CceeeccccccccccccccccCCCCceecccee Confidence 9865555688999988887543344455443 66765555443 999999999975333 4578999999999999 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +++++.+++++|++||..+|+++++++.|||.+++++|+.+|++.++.++|+++-+++....+. .++++++++.. .+ T Consensus 78 tt~~~~av~~~r~~g~~~~d~~~~~~g~~~~~~~~~~Ig~~~a~~~~~~~l~~~~~~l~~a~~~-~~~~v~dis~~--~~ 154 (325) T protein:vir:95 78 KHLVDTSVKVAAGTPPVRLDPGQFRWIQQNPEVAGAAMGQQLAVDTMADMLNVGLGSVYSALSQ-VSDVVYDATAN--TD 154 (325) T ss_pred ccccceeeEEecccCcccccHHHHhhcCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcc-cccceeeeecc--cC Confidence 9999999999999999999999999999999976666666666666555555444433322222 23566666543 33 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 
157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) ...+.+++++|++|+++|||+.++|++|+|||+||++|++++|+++.+..+..+ ...+++|+|| T Consensus 155 ~~~~~~s~~~l~~A~~klGD~~~~l~~~~MHS~v~~~L~~~~L~~~~~~~~~~g----------------~~~i~t~~G~ 218 (325) T protein:vir:95 155 AADKLPTWNNLNNGQAKFGDQSSQIAAWIMHSTPMHKLYGSNLTNGERLFTYGT----------------VNVVRDPFGK 218 (325) T ss_pred cccccccHHHHHHHHHHhcccccceeEEEEchHHHHHHHHhhccccccccccCC----------------cccccccCCc Confidence 455678999999999999999999999999999999999999998877554332 2347899999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcc--eeEEEEeeEEEeeecceeeec--CcCCcCh Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAK--SDAMSIDLHYVYHPVGAKWAV--TTTNPTR 312 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g--~~~l~~r~~y~~~~~G~s~~~--~~~sPt~ 312 (342) +|||||+||+.+++..++|++|+|++|||.|.+.+|+..+..+..+.+ ...+..|+.|++||+||||+. ++.|||+ T Consensus 219 ~VIVdD~~p~~~~g~~~~ytty~lg~GAi~~~~~~~~~~~~~~~~~~~~~~~~~~~~~tf~lhp~G~sw~~s~~g~sPt~ 298 (325) T protein:vir:95 219 LLVMTDSPNLFAAGTPNVYHILGLVPGGVLIGQNNDFDANEETKNGDENIIRTYQAEWSYNIGVKGFAWDKANGGKSPTD 298 (325) T ss_pred EEEEeCCCCCCCccCceeEEEEEEecCeEEecCCCCccccccccCcccceeeeeeeeeeEEeecceeeeecccccCCcCh Confidence 999999999999999999999999999999999999776655544332 234568888999999999964 4689999 Q ss_pred HHhcCCcCceeec-CccccceEEEEec Q lcl|NC_020854. 
313 AQLETVANWSKVY-ELKNIGIVRATNV 338 (342) Q Consensus 313 ~~L~~~~NW~~v~-d~k~i~~~~~~~~ 338 (342) +||++++||+||| +.|.++.|-++|| T Consensus 299 aeL~~~~NW~rv~~~~K~tagv~~~~~ 325 (325) T protein:vir:95 299 AALFTSTNWDKYATSHKDLAGVVVKTN 325 (325) T ss_pred HhhcCCcCcceecCCCccccceeEeeC Confidence 9999999999999 8999999999999 No 8 >protein:vir:95107 Length: 270 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1549 # MgeName: X2 # Cross-refs: genbank:acc:YP_240822;genbank:gi:66394683;genbank:GeneID:5133901 Probab=100.00 E-value=6.3e-74 Score=421.88 Aligned_cols=267 Identities=18% Similarity=0.216 Sum_probs=234.2 Q ss_pred Cc-ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccc Q lcl|NC_020854. 1 MA-TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITAD 79 (342) Q Consensus 1 Ma-T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~ 79 (342) || |+++|||+||||++||++++.++++|.+ ++..|++|.++ ||++|+||||+++ ||++++.||++|++++|+++ T Consensus 1 Ma~T~~~d~I~Pev~~~~V~e~~~~~~~~~~--~~~~d~~L~g~-~G~ti~~P~~~~i--gdae~~~eg~~i~~~~lt~~ 75 (270) T protein:vir:95 1 MTQTKKANLINPEVLANVVSAQMQNAIRFTP--YAVTDDTLVGQ-PGDTITRPKYAYI--GAAEDLQEGVAMDTTQMSMT 75 (270) T ss_pred CCceehhhhcchHHHHHHHHHHHHhHHhhcc--ccccccccCCC-CCCEEEeeeecCC--CccccccCCCccchhhcccc Confidence 99 9999999999999999999999999955 56678999885 9999999999976 99999999999999999999 Q ss_pred eeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccccc Q lcl|NC_020854. 80 KQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTP 159 (342) Q Consensus 80 ~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~ 159 (342) +++++|++|+|+|+++|++.+++++||++++++|++.||+|++|++|+++|+|++.+. 
+ T Consensus 76 ~~~a~i~~~gk~~~itD~a~~~~~~dp~~~~~~q~a~~~a~~~d~~li~~l~~a~~~~---------------------~ 134 (270) T protein:vir:95 76 TTKVTVKETGKAVEVTQTAIITNVNGTLQEASRQLAMSLADKVEIDYIAELNKSKQTA---------------------T 134 (270) T ss_pred hheeeeehhhCcceecHHHHhhhccchHHHHHHHHHHHHHHHHHHHHHHHhccccccc---------------------c Confidence 9999999999999999999999999999999999999999999999999999865431 2 Q ss_pred ccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEE Q lcl|NC_020854. 160 TALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVI 239 (342) Q Consensus 160 ~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vv 239 (342) ..++++.|++|+++|||+.+.+.+++|||++|++|||++++++.++.+. ...++.|++|+|++|| T Consensus 135 ~~~t~~~~~dA~~~lgd~~~~~~~i~vhs~~~~~Lrk~~~~~~~~~~~~---------------~~~~G~ig~~~G~~Vi 199 (270) T protein:vir:95 135 VSADATGILDAIEVFNSENDEDYVLYVNPKDYNKLVKSLFKVGGNVQDR---------------AISKGDLVEIVGVSDI 199 (270) T ss_pred cccCHHHHHHHHHHhccccCCCcEEEEcHHHHHHHHhhhcccccccccc---------------hhcccccceecceeEE Confidence 3468899999999999999999999999999999999998877655432 1234679999999999 Q ss_pred EeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeeccee-eecCcCCcChH-Hh Q lcl|NC_020854. 
240 VSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGAK-WAVTTTNPTRA-QL 315 (342) Q Consensus 240 vdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s-~~~~~~sPt~~-~L 315 (342) |+|++| ++|++|+|++|||+++.++++++|++|++.++.+.+++|+||++|+..-+ +.....+|+-+ |+ T Consensus 200 v~s~~~-------~~~~~~l~~~gAi~~~~~~~~~vEtdRd~~~~~d~i~~~~~y~v~~~~~skvv~~t~~~a~~~~~ 270 (270) T protein:vir:95 200 VKSKRV-------SENTAFLQRYGAMEIVNKKKPEAYTDFDILKRTHLLSTNYHYSVNLKDETGVVKVTFKPSGSLEM 270 (270) T ss_pred EeCCCC-------CceeEEEEeccceeeeecCCceeeeccchhhcccEEEeeeEEEEEEEccceEEEEEecCCCCcCC Confidence 999875 46899999999999999999999999999999999999999999998843 32222222211 11 No 9 >protein:vir:105334 Length: 276 # NCBI annotation: putative phage major capsid protein # Family: family:all:522 # MgeID: mge:1679 # MgeName: PH15 # Cross-refs: genbank:acc:YP_950669;genbank:gi:119967839;genbank:GeneID:4643213 Probab=100.00 E-value=9.8e-68 Score=387.96 Aligned_cols=268 Identities=21% Similarity=0.234 Sum_probs=238.9 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|+|+||||++||.+++.++++| ++++..|++|.++ ||++|+||+|+++ |+++++.||++|++++|+ T Consensus 1 Ma~~~T~l~d~i~Pev~~~~v~~~~~~~~~~--~~~~~~~~~l~g~-~G~ti~iP~~~~i--gda~~~~eg~~i~~~~lt 75 (276) T protein:vir:10 1 MAQGTTTKSTQIVPEVLAPMMQAELDKKLRF--AQFADIDSTLVGQ-PGDTLTFPAFVYS--GDATVVPEGQKIPVDKIE 75 (276) T ss_pred CCcceeehhhhhchHHHHHHHHHHHHhhhhh--cccceecccccCC-CCCEEEeeeecCC--CccccccCCCccCccccc Confidence 99 78999999999999999999999999 6678889999885 8999999999987 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 
78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++++.+++++++|+|.++|++.+.+++||++++.+|++.+|+|++|++++++|++.... . T Consensus 76 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~~~~~~~~~~a~~~d~~~~~~l~~~~~~--------------------~ 135 (276) T protein:vir:10 76 TNRREAKIHKIGKGTDITDEALLSGYGDPQGEAVRQHGLAIANKVDNDVLEALRGTKLT--------------------V 135 (276) T ss_pred cceeeEEeehccccccccHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhccccc--------------------c Confidence 99999999999999999999999999999999999999999999999999999863221 1 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) +...++++.|++|.++|||+..++.+++|||++|+.|+|+++++|++.++.. ..+..++.|++|+|++ T Consensus 136 ~~~~~t~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g------------~~~~~~G~ig~~~G~~ 203 (276) T protein:vir:10 136 SADIGTLAGLEAAIDTFDDEDLEPMVLFINPKDAGKLRSSASDNFTRATELG------------DNIIVKGAFGEALGAV 203 (276) T ss_pred cccccCHHHHHHHHHHhccccCcccEEEEcHHHHHHHHHhcccccccccccc------------ccceeccccceeccee Confidence 2235789999999999999999999999999999999999999999877632 1223457799999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecce----eee-cCcCCcCh Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGA----KWA-VTTTNPTR 312 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~----s~~-~~~~sPt~ 312 (342) ||+||.+| +|++|++++||+++..++++++|++|++.++.+.++.|+||++++.-= .-+ .++..|+. 
T Consensus 204 Vi~s~~~p--------~~t~~l~~~gAi~~~~~~~~~vE~dRd~~~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~~~~~~ 275 (276) T protein:vir:10 204 IVRSKKLD--------EGEAILAKRGAVKLITKRDFFLETDRDPSTKTTALYSDKHYVAYLYDESKAVKVTKGAGTTDSG 275 (276) T ss_pred EEEcCCCC--------cceEEEEeccceeeeecCCceeecccchhhcccEEEEeeEEEEEEEcCcceEEEecCCcCCcCC Confidence 99999885 578999999999999999999999999999999999999999987542 222 34667888 Q ss_pred H Q lcl|NC_020854. 313 A 313 (342) Q Consensus 313 ~ 313 (342) + T Consensus 276 ~ 276 (276) T protein:vir:10 276 A 276 (276) T ss_pred C Confidence 8 No 10 >protein:vir:96262 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1612 # MgeName: ROSA # Cross-refs: genbank:acc:YP_240311;genbank:gi:66395978;genbank:GeneID:5133339 Probab=100.00 E-value=8.2e-66 Score=377.40 Aligned_cols=271 Identities=19% Similarity=0.176 Sum_probs=236.2 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|+|+||||++||.+++.++++| ++++..|.+|.++ ||++|+||+|+++ |+++++.+|++|++++|+ T Consensus 1 m~~~~T~l~d~i~Pev~~~~v~~~~~~~l~~--~~~~~~~~~l~g~-~G~tv~iP~~~~i--g~a~~~~~g~~i~~~~lt 75 (274) T protein:vir:96 1 MAQGMTKLTNQIVPEVLAPMMQAELEKKLRF--ASFAEIDNTLVGQ-PGDTLTFPAFIYS--GDAKVVAEGEKIPTDILE 75 (274) T ss_pred CCcceeehhheechHHHHHHHHHHHHhhhhc--cccceecccccCC-CCCEEEeeeecCC--CccccccCCCccchhhcc Confidence 99 89999999999999999999988888 7889899999885 8999999999987 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 
78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++++.++|++++|+|.++|++.+.+++|||+++.+|++.+|++++|++|+++|++.... . T Consensus 76 ~~~~~~~i~~~~~a~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~vd~~i~~~l~~a~~~--------------------~ 135 (274) T protein:vir:96 76 TKKREAKIRKIAKGTSISDEALLSGYGDPQGEQVRQHGLAHANKVDDDVLEALKSAKLT--------------------V 135 (274) T ss_pred cceeEEEeeeeecceeehHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccccc--------------------c Confidence 99999999999999999999999999999999999999999999999999999763211 0 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ....++++.|++|.++|||+.+.++.++|||++|+.|+++++++|++.++.. .....++.|++|+|++ T Consensus 136 ~~~~~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g------------~~~~~~G~ig~~~G~~ 203 (274) T protein:vir:96 136 EADITKLTGLQTAIDKFNDEDLEPMVLFISPLDAGKLRGDATTNFTRATELG------------DDVIVKGAFGEALGAV 203 (274) T ss_pred cccccCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhcccccccccccc------------ccceeccccceecCeE Confidence 1235689999999999999999999999999999999999999999877632 1223356799999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecceeeecCcCCcChHHhcC Q lcl|NC_020854. 
238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGAKWAVTTTNPTRAQLET 317 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s~~~~~~sPt~~~L~~ 317 (342) |++||.+| +|++|++++|||++..++++++|++|++.++.|.++.|+||++++.- |+-.-..+ T Consensus 204 Vi~s~~~~--------~~t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~---------~~~~v~~t 266 (274) T protein:vir:96 204 IVRSNKLE--------AGTAILAKKGAVKLITKRDFFLETDRDPSTKTTALYSDKHYVAYLYD---------ESKAVKIT 266 (274) T ss_pred EEEeCCCC--------CceEEEEeccceeeeecCCcccccccccccccCEEEEeEEEEEEEEc---------CCcEEEEE Confidence 99999874 68999999999999999999999999999999999999999998743 22223333 Q ss_pred CcCceeec Q lcl|NC_020854. 318 VANWSKVY 325 (342) Q Consensus 318 ~~NW~~v~ 325 (342) ..+|.+-. T Consensus 267 k~~~~~~~ 274 (274) T protein:vir:96 267 KGSGSLEM 274 (274) T ss_pred cCCccccC Confidence 33444322 No 11 >protein:vir:95898 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1588 # MgeName: 71 # Cross-refs: genbank:acc:YP_240385;genbank:gi:66396054;genbank:GeneID:5133409 Probab=100.00 E-value=8.2e-66 Score=377.40 Aligned_cols=271 Identities=19% Similarity=0.176 Sum_probs=236.2 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|+|+||||++||.+++.++++| ++++..|.+|.++ ||++|+||+|+++ |+++++.+|++|++++|+ T Consensus 1 m~~~~T~l~d~i~Pev~~~~v~~~~~~~l~~--~~~~~~~~~l~g~-~G~tv~iP~~~~i--g~a~~~~~g~~i~~~~lt 75 (274) T protein:vir:95 1 MAQGMTKLTNQIVPEVLAPMMQAELEKKLRF--ASFAEIDNTLVGQ-PGDTLTFPAFIYS--GDAKVVAEGEKIPTDILE 75 (274) T ss_pred CCcceeehhheechHHHHHHHHHHHHhhhhc--cccceecccccCC-CCCEEEeeeecCC--CccccccCCCccchhhcc Confidence 99 89999999999999999999988888 7889899999885 8999999999987 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++++.++|++++|+|.++|++.+.+++|||+++.+|++.+|++++|++|+++|++.... . T Consensus 76 ~~~~~~~i~~~~~a~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~vd~~i~~~l~~a~~~--------------------~ 135 (274) T protein:vir:95 76 TKKREAKIRKIAKGTSISDEALLSGYGDPQGEQVRQHGLAHANKVDDDVLEALKSAKLT--------------------V 135 (274) T ss_pred cceeEEEeeeeecceeehHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccccc--------------------c Confidence 99999999999999999999999999999999999999999999999999999763211 0 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ....++++.|++|.++|||+.+.++.++|||++|+.|+++++++|++.++.. 
.....++.|++|+|++ T Consensus 136 ~~~~~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g------------~~~~~~G~ig~~~G~~ 203 (274) T protein:vir:95 136 EADITKLTGLQTAIDKFNDEDLEPMVLFISPLDAGKLRGDATTNFTRATELG------------DDVIVKGAFGEALGAV 203 (274) T ss_pred cccccCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhcccccccccccc------------ccceeccccceecCeE Confidence 1235689999999999999999999999999999999999999999877632 1223356799999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecceeeecCcCCcChHHhcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGAKWAVTTTNPTRAQLET 317 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s~~~~~~sPt~~~L~~ 317 (342) |++||.+| +|++|++++|||++..++++++|++|++.++.|.++.|+||++++.- |+-.-..+ T Consensus 204 Vi~s~~~~--------~~t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~---------~~~~v~~t 266 (274) T protein:vir:95 204 IVRSNKLE--------AGTAILAKKGAVKLITKRDFFLETDRDPSTKTTALYSDKHYVAYLYD---------ESKAVKIT 266 (274) T ss_pred EEEeCCCC--------CceEEEEeccceeeeecCCcccccccccccccCEEEEeEEEEEEEEc---------CCcEEEEE Confidence 99999874 68999999999999999999999999999999999999999998743 22223333 Q ss_pred CcCceeec Q lcl|NC_020854. 318 VANWSKVY 325 (342) Q Consensus 318 ~~NW~~v~ 325 (342) ..+|.+-. T Consensus 267 k~~~~~~~ 274 (274) T protein:vir:95 267 KGSGSLEM 274 (274) T ss_pred cCCccccC Confidence 33444322 No 12 >protein:vir:1239 Length: 274 # NCBI annotation: similar to phage B1 major head protein # Family: family:all:522 # MgeID: mge:25 # MgeName: phi ETA # Cross-refs: genbank:acc:NP_510938;genbank:gi:17426272;genbank:GeneID:927376 Probab=100.00 E-value=3.5e-65 Score=373.92 Aligned_cols=267 Identities=19% Similarity=0.191 Sum_probs=232.8 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|+|+||||++||.+++.++++| ++++..|.+|.++ ||++|+||+|+++ |+++++.+|++|++++|+ T Consensus 1 ma~~~T~l~d~iiPev~~~~v~~~~~~~l~~--~~~~~~d~~l~g~-~G~tv~iP~~~~i--g~a~~~~~g~~i~~~~lt 75 (274) T protein:vir:12 1 MAQGLTKTSNQIIPEVLAPMMQAQLEKKLRF--ASFAEVDSTLQGQ-PGDTLTFPAFVYS--GDAQVVAEGEKIPTDILE 75 (274) T ss_pred CCcceeehhhhhchHHHHHHHHHHHHhhhhh--cccceecccccCC-CCCEEEEeeecCC--CccccccCCCccchhhcc Confidence 99 88999999999999999999887777 7889999999885 8999999999977 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++++.++|++++|+|+++|++.+.+++|||+++.+|++.+|++++|+++++.+++... + . T Consensus 76 ~~~~~~~i~~~~~~~~i~D~~~~~~~~d~~~~~~~q~~~~~a~~vd~~~l~~~~~a~~-----------------~---~ 135 (274) T protein:vir:12 76 TKKREAKIRKIAKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAKL-----------------T---V 135 (274) T ss_pred cceeeEEeeeecceeeecHHHHHhcccchHHHHHHHHHHHHHHHHHHHHHHHHhcccc-----------------c---c Confidence 9999999999999999999999999999999999999999999999999998875210 0 1 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ....++++.|++|.++|||+.+.+++++|||.+|+.|+++++++|+++++.. 
.....++.|++|+|++ T Consensus 136 ~~~a~~~d~i~dA~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~fv~~s~~g------------~~~~~~G~ig~~~G~~ 203 (274) T protein:vir:12 136 NADITKLNGLQSAIDKFNDEDLEPMVLFINPLDAGKLRGDASTNFTRATELG------------DDIIVKGAFGEALGAI 203 (274) T ss_pred cccccCHHHHHHHHHHhccccccccEEEeCHHHHHHHHhhhhhhcccccccc------------ccceecccceeecCee Confidence 2235789999999999999999999999999999999999999999887632 1123356799999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecce----eeecCcCCcCh Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGA----KWAVTTTNPTR 312 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~----s~~~~~~sPt~ 312 (342) |++||.+| +|++|++++|||++..++++++|++|++.++.+.+++|+||++++.-= .-+.++-|-.. T Consensus 204 Vi~s~~~p--------~~t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~~~~~ 274 (274) T protein:vir:12 204 IVRSNKLE--------AGTAILAKKGAVKLILKRDFFLEVARDASTKTTALYSDKHYVAYLYDESKAVKITKGSGSLEM 274 (274) T ss_pred EEEeCCCC--------cceEEEEeccceeeeecCCceeccccchhhcccEEEeeeEEEEEEEcCCceEEEEcCCccccC Confidence 99999986 578999999999999999999999999999999999999999887542 22222211111 No 13 >protein:vir:96833 Length: 275 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1642 # MgeName: EW # Cross-refs: genbank:acc:YP_240157;genbank:gi:66395822;genbank:GeneID:5133174 Probab=100.00 E-value=2.7e-65 Score=374.55 Aligned_cols=265 Identities=18% Similarity=0.173 Sum_probs=234.4 Q ss_pred Cc--ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhccc Q lcl|NC_020854. 
1 MA--TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITA 78 (342) Q Consensus 1 Ma--T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~ 78 (342) || |+++|+|+||||++||.+++.++++| ++++..|++|.++ ||++|+||+|+++ |+++++.+|++|++++|++ T Consensus 3 ~~~~T~l~d~i~PEv~~~~v~~~~~~~~~~--~~~~~~~~~l~g~-~G~tv~iP~~~~i--g~a~~~~~g~~i~~~~lt~ 77 (275) T protein:vir:96 3 LENMTKLANMVNPEVLAPMMQAELDKKLKF--AQFADIDNTLVGQ-PGNTITFPAFVYS--GDAKVVPEGEEIPIDLIET 77 (275) T ss_pred CcccchhhhhhchHHHHHHHHHHHHHhhhh--cccceecccccCC-CCCEEEeeeeccC--CccccccCCCCcchhhccc Confidence 44 88999999999999999999999999 6788889999885 8999999999987 9999999999999999999 Q ss_pred ceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccc Q lcl|NC_020854. 79 DKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDT 158 (342) Q Consensus 79 ~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~ 158 (342) +++.+++++++|+|.++|++.+.+++||++++.+|++.+|++++|++++++|++.... .. T Consensus 78 ~~~~~~i~~~~~~~~i~D~~~~~~~~d~~~~~~~~~a~~~a~~~d~~ll~~l~~a~~~--------------------~~ 137 (275) T protein:vir:96 78 KKRQATIRKIGKGTVLTDEALLSGYGDPKGEAVRQHGLAIANKVDNDVLEALQGATLK--------------------VE 137 (275) T ss_pred ceeeEEeehhcccccccHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhccccc--------------------cc Confidence 9999999999999999999999999999999999999999999999999998762211 12 Q ss_pred cccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceE Q lcl|NC_020854. 159 PTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRV 238 (342) Q Consensus 159 ~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~V 238 (342) ...++++.|++|.++|||+.+.++.++|||++|+.|+|++.++|++.++.. 
.....++.|++|+|++| T Consensus 138 ~~~~~~d~i~dA~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g------------~~~~~~G~ig~~~G~~V 205 (275) T protein:vir:96 138 ADITKLAGLQTAIDKFNDEDLEPMVLFVNPLDAGKLRASATDNFTRATLLG------------DNVIVKGAFGEALGAII 205 (275) T ss_pred ccccCHHHHHHHHHHhccccCCccEEEeCHHHHHHHHhccccccccccccc------------ccceeccccceecCeeE Confidence 235789999999999999999999999999999999999999998876531 12234578999999999 Q ss_pred EEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecc------eeeecCcCCc Q lcl|NC_020854. 239 IVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVG------AKWAVTTTNP 310 (342) Q Consensus 239 vvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G------~s~~~~~~sP 310 (342) |+||.+| +|++|++++||+++..++++++|++|++.++.+.++.|+||++|+.- ++++.++..- T Consensus 206 i~s~~~p--------~~t~~i~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~~~~~ 275 (275) T protein:vir:96 206 VRSNKIK--------EGEAILAKRGAVKLITKRDFFLETERHASHKSTALFSDKHYVAYLYDESKVVKITKSASGLGV 275 (275) T ss_pred EEeCCCC--------cceEEEEeccceeeeecCCcccccccchhhcCcEEEEeEEEEEEEEcCccEEEEEecccccCC Confidence 9999885 57899999999999999999999999999999999999999999754 2455554433 No 14 >protein:vir:3613 Length: 272 # NCBI annotation: MHP # Family: family:all:522 # MgeID: mge:74 # MgeName: TP901-1 # Cross-refs: genbank:acc:NP_112699;genbank:gi:13786567;genbank:GeneID:921035 Probab=100.00 E-value=6.9e-64 Score=366.86 Aligned_cols=265 Identities=20% Similarity=0.241 Sum_probs=234.3 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|+|+||||++||.+++.++++| ++++..|..|.++ ||++|+||+|+++ |+++++.||++|++++|+ T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~~~~--~~~~~~~~~l~g~-~G~ti~iP~~~~~--gda~~~~eg~~i~~~~lt 75 (272) T protein:vir:36 1 MSKQKTTLADLVNPEVLAPIVSYELNKALRF--APLAQVDTTLQGQ-PGNTLKFPAFTYI--GDAADVAEGGEISLDKIG 75 (272) T ss_pred CCCcceehhhhhchHHHHHHHHHHHHhhhhh--ccccccccccccC-CCCEEEEeeeccC--ccccccCCCCccChhhcC Confidence 99 78999999999999999999999888 5577778899875 8999999999977 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++++++++++++|+|.++|++.+.+++|||.++.+|++.+|+|++|++|++.|+|.... T Consensus 76 ~~~~~~~i~~~~k~~~vtD~~~~~~~~d~~~~~~~~~a~~~a~~~d~~i~~~l~~~~~~--------------------- 134 (272) T protein:vir:36 76 TTTKSVTIKKAAKGTEITDEAALSGYGDPIGESNKQLGLSLANKVDDDLLSAAKTTSQT--------------------- 134 (272) T ss_pred CcceeEeeehhhccccccHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhcccccc--------------------- Confidence 99999999999999999999999999999999999999999999999999998763211 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 
158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) .++.++++.|++|+++|||+.+...+++|||++|..|++++.+++...+...++ ..++.|++|+|++ T Consensus 135 ~~~~~~~d~i~~A~~~lgd~~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~~~~~-------------~~~G~ig~~~G~~ 201 (272) T protein:vir:36 135 VSTKANVDGVQAALDIFNDEDAQAYVLIVNPKDAAKIRKDANAKNIGSEVGANA-------------LINGTYADVLGAQ 201 (272) T ss_pred ccccccHHHHHHHHHHhhhcCCCceEEEEcHHHHHHHhcccccccccccccccc-------------eeeeccceecCee Confidence 123567899999999999999999999999999999999988877765443221 2245689999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecc----eeeecCcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVG----AKWAVTTT 308 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G----~s~~~~~~ 308 (342) |++||.+|.. ...|++|++++||+++..++++++|++|++.++.+.+++|+||++|+.- +..+.+|. T Consensus 202 Vv~s~~~p~~----~~~~~~~~~~~gA~~~~~~~~~~vE~~R~~~~~~d~i~~~~~y~~~v~~~~~vv~~t~~g~ 272 (272) T protein:vir:36 202 IVRSKKLAEG----SALMFKIVSNSPALKLVLKRGVQVETDRDIVTKTTVITADEHYAAYLYDLTKVVNITFTGV 272 (272) T ss_pred EEEeCCCCCC----ceeEEEEEecccceeeeecCCcccccccchhhcCcEEEEEEEEEEEEEcCccEEEEeecCC Confidence 9999999853 3579999999999999999999999999999999999999999998744 45566666 No 15 >protein:vir:97433 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1676 # MgeName: 92 # Cross-refs: genbank:acc:YP_240749;genbank:gi:66396420;genbank:GeneID:5133789 Probab=100.00 E-value=2.1e-63 Score=364.21 Aligned_cols=267 Identities=19% Similarity=0.193 Sum_probs=231.7 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|+|+||||++||.+++.++.+| ++++..|.+|.++ ||++|+||+|+++ |+++++.+|++|++++|+ T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~l~~--~~~~~~d~~l~g~-~G~tv~iP~~~~~--g~a~~~~~g~~i~~~~lt 75 (274) T protein:vir:97 1 MPQGLTKTSDQIIPEVLAPMMQAQLEKKLRF--ASFAEVDSTLQGQ-PGDTLTFPAFVYS--GDAQVVAEGEKIPTDILE 75 (274) T ss_pred CCccceehhheechHHHHHHHHHhhhhhhhh--cccceecccccCC-CCCEEEEeeecCC--CccccccCCCcccccccc Confidence 99 88999999999999999999887766 8899999999875 8999999999977 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++++.++|++++|+|+++|++.+.+++|||+++.+|++.+|++++|+++++.|++.- . + . T Consensus 76 ~~~~~~~i~~~~~~~~i~D~~~~~~~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a~------~-----------~---~ 135 (274) T protein:vir:97 76 TKKREAKIRKIAKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAK------L-----------T---V 135 (274) T ss_pred cceeEEEeeeecceecccHHHHHhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccC------c-----------c---c Confidence 999999999999999999999999999999999999999999999999999987510 0 0 1 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ....++++.|++|.++|||+...++.++|||.+|+.|+++++++|++.++.. 
.....++.|++|+|++ T Consensus 136 ~~~~~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g------------~~~~~~G~ig~~~G~~ 203 (274) T protein:vir:97 136 NADITKLNGLQSAIDKFNDEDLEPMVLFVNPLDAGKLRGDASTNFTRATELG------------DDIIVKGAFGEALGAI 203 (274) T ss_pred cccccCHHHHHHHHHHhhccCCCceEEEeCHHHHHHHHhhhhhhccccCccc------------ccceeccccceecCee Confidence 2245789999999999999999999999999999999999999999877631 1122356799999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecce----eeecCcCCcCh Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGA----KWAVTTTNPTR 312 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~----s~~~~~~sPt~ 312 (342) |++||.+| +|++|++++||++++.++++.+|++|++.++.+.++.|+||++++.-= .-+.++.|-.. T Consensus 204 Vi~s~~~p--------~~t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~~~~~ 274 (274) T protein:vir:97 204 IVRTNKLE--------AGTAILAKKGAVKLILKRDFFLEVARDASTKTTALYSDKHYVAYLYDESKAVKITKGSGSLEM 274 (274) T ss_pred EEEcCCCC--------cceEEEEeCcceEeeecCCceeccccchhhcccEEEEEEEEEEEEEcCCceEEEecCcccccC Confidence 99999986 578999999999999999999999999999999999999999986432 11222212111 No 16 >protein:vir:94494 Length: 274 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1508 # MgeName: 88 # Cross-refs: genbank:acc:YP_240676;genbank:gi:66396348;genbank:GeneID:5133758 Probab=100.00 E-value=2.1e-63 Score=364.21 Aligned_cols=267 Identities=19% Similarity=0.193 Sum_probs=231.7 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|+|+||||++||.+++.++.+| ++++..|.+|.++ ||++|+||+|+++ |+++++.+|++|++++|+ T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~l~~--~~~~~~d~~l~g~-~G~tv~iP~~~~~--g~a~~~~~g~~i~~~~lt 75 (274) T protein:vir:94 1 MPQGLTKTSDQIIPEVLAPMMQAQLEKKLRF--ASFAEVDSTLQGQ-PGDTLTFPAFVYS--GDAQVVAEGEKIPTDILE 75 (274) T ss_pred CCccceehhheechHHHHHHHHHhhhhhhhh--cccceecccccCC-CCCEEEEeeecCC--CccccccCCCcccccccc Confidence 99 88999999999999999999887766 8899999999875 8999999999977 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++++.++|++++|+|+++|++.+.+++|||+++.+|++.+|++++|+++++.|++.- . + . T Consensus 76 ~~~~~~~i~~~~~~~~i~D~~~~~~~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a~------~-----------~---~ 135 (274) T protein:vir:94 76 TKKREAKIRKIAKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAK------L-----------T---V 135 (274) T ss_pred cceeEEEeeeecceecccHHHHHhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccC------c-----------c---c Confidence 999999999999999999999999999999999999999999999999999987510 0 0 1 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ....++++.|++|.++|||+...++.++|||.+|+.|+++++++|++.++.. 
.....++.|++|+|++ T Consensus 136 ~~~~~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g------------~~~~~~G~ig~~~G~~ 203 (274) T protein:vir:94 136 NADITKLNGLQSAIDKFNDEDLEPMVLFVNPLDAGKLRGDASTNFTRATELG------------DDIIVKGAFGEALGAI 203 (274) T ss_pred cccccCHHHHHHHHHHhhccCCCceEEEeCHHHHHHHHhhhhhhccccCccc------------ccceeccccceecCee Confidence 2245789999999999999999999999999999999999999999877631 1122356799999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecce----eeecCcCCcCh Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGA----KWAVTTTNPTR 312 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~----s~~~~~~sPt~ 312 (342) |++||.+| +|++|++++||++++.++++.+|++|++.++.+.++.|+||++++.-= .-+.++.|-.. T Consensus 204 Vi~s~~~p--------~~t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~~~~~ 274 (274) T protein:vir:94 204 IVRTNKLE--------AGTAILAKKGAVKLILKRDFFLEVARDASTKTTALYSDKHYVAYLYDESKAVKITKGSGSLEM 274 (274) T ss_pred EEEcCCCC--------cceEEEEeCcceEeeecCCceeccccchhhcccEEEEEEEEEEEEEcCCceEEEecCcccccC Confidence 99999986 578999999999999999999999999999999999999999986432 11222212111 No 17 >protein:vir:96792 Length: 315 # NCBI annotation: major capsid protein # Family: family:all:47 # MgeID: mge:1629 # MgeName: phiHSIC # Cross-refs: genbank:acc:YP_224246;genbank:gi:62362381;genbank:GeneID:3345731 Probab=100.00 E-value=4.5e-62 Score=356.92 Aligned_cols=302 Identities=19% Similarity=0.209 Sum_probs=225.2 Q ss_pred Ccce-eccc--cchhHHHHHHHhhhHHhhhhhhc--CccccchhhhccCCCCEEEccccccCCCC-cccccCCCceechh Q lcl|NC_020854. 1 MATL-RSDI--IIPEVFTPYVIEQTTQRDAFLAS--GVVQPMTELNATEGGDFINVPFWKANLSG-DFEVLSDSSSLTPG 74 (342) Q Consensus 1 MaT~-~~d~--i~Pev~~~yv~~~~~~~~~f~~s--g~~~~d~~l~~~~~G~ti~~P~~~~i~~g-da~~~~~~~~i~~~ 74 (342) |||+ ++|| |||.+...|++........|... |+++..+.. -.||+...|||+ +... ..+++++++++++. 
T Consensus 1 ~~~t~~sdl~vfn~~~~~a~~e~~~~~~~~Fnaas~Gai~l~~~~---~~GDf~~~~ff~-i~~~~~~rnv~~~~~~t~~ 76 (315) T protein:vir:96 1 MATTVNSDLVIYNDTAQTAYLERNMDNLAVFNENSRAAIGLNSEL---IEGDLKLRSFYK-VGGAIADRDVNSTATVAGT 76 (315) T ss_pred CceeeecceeeehhhhhhhHHhhhHHHHHHhhhhcCCcccccccc---cccccccccccc-cccchhhcccCCCccccce Confidence 9965 6997 56666666665444344445321 333222211 239999999998 5221 35688899999999 Q ss_pred hcccceeeeEeeeeccceeechHHHh-hhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccc Q lcl|NC_020854. 75 KITADKQVAAILHRGRAFEARDLAAL-AAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDS 153 (342) Q Consensus 75 ~lt~~~~~a~i~~~~k~~~~tD~a~~-~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~ 153 (342) +|++++++++++.++.+.-..+.+++ ..+.||+..++.....+|.+..+..+...|+++++....+.. ++ T Consensus 77 kit~~~dvaVk~~~~~~~~~~~~~~~a~~g~dp~~~~~~i~~~~~~~~l~~~l~~~l~~~~aai~~~t~-----~~---- 147 (315) T protein:vir:96 77 KIAADEMVSVKVPWKYGPYETTEEAFKRRARSPEEFSMLIGQDMADATMAGWIGYALNALQGAIGSNAG-----MN---- 147 (315) T ss_pred ecccccceeEEEeecCCchhccHHHHHHhhcCHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhccccc-----cc---- Confidence 99999999999988877444444444 358899988777777778777777777777777764433221 11 Q ss_pred ccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeee Q lcl|NC_020854. 154 ESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTY 233 (342) Q Consensus 154 ~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~ 233 (342) ..++.+.++.++|++|.++|||++++|++++|||++|++|++++|++++. 
++.+++.+ ...++| T Consensus 148 -~~~~~a~~~~~~l~dA~~klGD~~~~l~~~vMHS~v~~~L~~q~L~~~~~-~~~~~~~~--------------~~~~~~ 211 (315) T protein:vir:96 148 -VSGELATEGKKVLTKGLRTMGDKASSIAIWVMDSTSYFDIVDEAIDNKLY-EEAGVVVY--------------GGTPGT 211 (315) T ss_pred -ccccccccCHHHHHHHHHHhcccccCeeEEEEchHHHHHHHHhhhhhhcc-cccceeEe--------------cCcCcc Confidence 22456789999999999999999999999999999999999999988775 44332211 123667 Q ss_pred ccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeE----EEeeecceeeec-CcC Q lcl|NC_020854. 234 MGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLH----YVYHPVGAKWAV-TTT 308 (342) Q Consensus 234 ~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~----y~~~~~G~s~~~-~~~ 308 (342) +||||||||+||+ |++|+|++|||+|.+++++.... ....|++.|+.|++ |++||+||||+. ++. T Consensus 212 lGkrViVdD~~P~--------~~~~gl~~GAi~~~~~~~~~~~~--~~~~g~e~l~~~~r~e~tf~l~p~G~sw~~~~~~ 281 (315) T protein:vir:96 212 LGKPVLVTDQCPA--------TKIFGLVAGAVMITESQAPGMRS--YQIDDQENLAIGFRAEGTANVEVLGYKWKTKTNV 281 (315) T ss_pred cccEEEEECCCCc--------ceeeeeecceeeecCCCcccccc--ccCCCcceeEEEEeeeeEeeeeeeeEEeecCCCc Confidence 8999999999985 78999999999999998743322 23346677777765 999999999964 568 Q ss_pred CcChHHhcCCcCceeec-CccccceEEEEecCCC Q lcl|NC_020854. 
309 NPTRAQLETVANWSKVY-ELKNIGIVRATNVSNF 341 (342) Q Consensus 309 sPt~~~L~~~~NW~~v~-d~k~i~~~~~~~~~~~ 341 (342) |||++||++++||+||| +.|....|-++-.+.- T Consensus 282 sPt~aeLat~~NWekV~~~~K~tagv~~~~~~~~ 315 (315) T protein:vir:96 282 NPASATLATTTNWEKYATDDKATAGFIITLTTTP 315 (315) T ss_pred CCChHHhcCCcCcccccCCCcccceEEEEecCCC Confidence 99999999999999998 7899999988766555 No 18 >protein:vir:96123 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1602 # MgeName: 37 # Cross-refs: genbank:acc:YP_240078;genbank:gi:66395742;genbank:GeneID:5133103 Probab=100.00 E-value=8e-61 Score=350.07 Aligned_cols=267 Identities=20% Similarity=0.223 Sum_probs=231.2 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|+|+||||++|+.+++.++.+| ++++..|.+|.++ ||++++||+|+.+ |+++++.++++|++++++ T Consensus 1 ma~~~T~~~d~i~Pev~s~~v~~~~~~~~~~--~~~~~~~~~l~g~-~G~tv~ip~~~~~--g~~~~~~~g~~i~~~~it 75 (274) T protein:vir:96 1 MAQGTTKVSNLIVPEVLAPMMQAELDKKLRF--AQFADIDSTLVGQ-PGDTLTFPAFTYS--GDAQVIAEGEKIPVDQIG 75 (274) T ss_pred CCccccchhhhhhhHHHHHHHHHHHHhhhhh--cccccccccccCC-CCCEEEEEeeccC--CCccccCCCCcCchhhcc Confidence 99 88999999999999999999988888 7788889999875 8999999999966 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++++++++++++|+|.++|++.+.+++|||+++++|++.+|++++|+++++.|++.-. .. 
T Consensus 76 ~~~~~~~i~~~~~~~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~~d~~i~~~l~~a~~--------------------~~ 135 (274) T protein:vir:96 76 TSKREAKVRKIGKGTELTDEAVLSGFGDPQGEAVRQHGLAIANKVDNDVLEALKGATL--------------------TV 135 (274) T ss_pred cceeEEEEEeeeceeeecHHHHHhhcchHHHHHHHHHHHHHHHHHHHHHHHHHhcCCC--------------------Cc Confidence 9999999999999999999999999999999999999999999999999999875210 01 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ....++++.|++|.++|||+...++.++|||.+|+.|+++++++|+++++.. ......+.|++|+|++ T Consensus 136 ~~~~~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~~~~g------------~~~~~~g~ig~~~G~~ 203 (274) T protein:vir:96 136 EADITKLDGLQTAIDKFNDEDLEPMVLFVNPLDAGGLRTSASDNFTRPTQLG------------DNIIVKGAFGEALGAV 203 (274) T ss_pred CcccccHHHHHHHHHHhcccCCCceEEEeCHHHHHHHHhccccccccccccc------------ccceeecccceecCee Confidence 2245689999999999999999999999999999999999999999877532 1112346799999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecc----eeeecCcCCcChH Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVG----AKWAVTTTNPTRA 313 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G----~s~~~~~~sPt~~ 313 (342) |++||++| +|++|++++|||++..++++.+|++|+..++.+.++.|+||++++.- +..+.+... + T Consensus 204 Vi~s~~~p--------~~t~~l~~~gA~~~~~~~~~~vE~~Rd~~~~~d~i~~~~~yg~~~~~~~~vv~~t~~~~~---~ 272 (274) T protein:vir:96 204 IVRSNKLN--------KGEALLAKKGAVKLITKRDFFLEKDRDASRKSTALYSDKHYVAYLYDESKVVKITKGAGD---E 272 (274) T ss_pred EEEcCCCC--------cceEEEEeCcceeeeecCCcccccccchhhcccEEEEeeEEEEEEEcCccEEEEEcCccc---c Confidence 99999986 46899999999999999999999999999999999999999888643 233322111 1 Q ss_pred Hh Q lcl|NC_020854. 
314 QL 315 (342) Q Consensus 314 ~L 315 (342) -+ T Consensus 273 ~~ 274 (274) T protein:vir:96 273 VM 274 (274) T ss_pred cC Confidence 11 No 19 >protein:vir:93742 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1475 # MgeName: 55 # Cross-refs: genbank:acc:YP_240459;genbank:gi:66396126;genbank:GeneID:5133511 Probab=100.00 E-value=2.2e-59 Score=342.21 Aligned_cols=267 Identities=18% Similarity=0.184 Sum_probs=230.6 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|+|+||||++|+.+++.++.+| ++++..|.+|.++ ||++|+||+|+++ |+++++.+|++|++++++ T Consensus 1 ma~~~T~~~~~iiPev~~~~v~~~~~~~~~~--~~~~~~~~~l~g~-~G~tv~ip~~~~~--g~~~~~~eg~~i~~~~it 75 (274) T protein:vir:93 1 MPQGITKTSNQIIPEVLAPMMQAQLEKKLRF--ASFAEVDSTLQGQ-PGDTLTFPAFVYS--GDAQVVAEGEKIPTDILE 75 (274) T ss_pred CCccceehhheechHHHHHHHHHHHHhhhhh--cccccccccccCC-CCCEEEEEeeccC--CCcccccCCCcccccccc Confidence 99 88999999999999999999998888 5677888889875 8999999999977 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++++.+++++++++|.++|++.+.+++|||+++.+|++.+|++++|+++++.|++... .. 
T Consensus 76 ~~~~~~~i~~~~~~~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~~d~~~~~~~~~a~~--------------------~~ 135 (274) T protein:vir:93 76 TKKREAKIRKIAKGTSITDEALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAKL--------------------TV 135 (274) T ss_pred cceeEEEeeeecccccccHHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhcccc--------------------cc Confidence 9999999999999999999999999999999999999999999999999998865210 01 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ....++++.|++|.++|||+...+..++|||.+++.|+++++++|++.+... .....++.|++|+|++ T Consensus 136 ~~~~~~~d~i~dA~~~l~d~~~~~~~ivv~p~~~~~L~k~~~~~f~~~s~~g------------~~~~~~G~ig~~~G~~ 203 (274) T protein:vir:93 136 NADITKLNGLQSAIDKFNDEDLEPMVLFINPLDAGKLRGDASTNFTRATELG------------DDIIVKGAFGEALGAI 203 (274) T ss_pred cccccCHHHHHHHHHHhhhccCCccEEEeCHHHHHHHHhhhhhccccccccc------------ccceeecccceecCee Confidence 2235689999999999999999999999999999999999999999876531 1122356799999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecce----eeecCcCCcCh Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGA----KWAVTTTNPTR 312 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~----s~~~~~~sPt~ 312 (342) |++||.+| +|++|++++|||++..++++++|++|+..++.+.++.++||++++.-= .-+.++-|=.. 
T Consensus 204 Vi~s~~~p--------~~t~~l~~~gai~~~~~~~~~vE~~Rd~~~~~d~i~~~~~y~~~~~~~~~~v~~t~~~~s~~~ 274 (274) T protein:vir:93 204 IVRTNKLE--------AGTAILAKKGAVKLILKRDFFLEVARDASTKTTALYSDKHYVAYLYDESKAVKITKGSGSLEM 274 (274) T ss_pred EEEcCCCC--------cceEEEEeCCeEEEEecCCcccccccchhhcccEEEEEEEEEEEEEcCCceEEEeeCccccCC Confidence 99999986 578999999999999999999999999999999999999999886432 22222212111 No 20 >protein:vir:80930 Length: 278 # NCBI annotation: Cps # Family: family:all:522 # MgeID: mge:1886 # MgeName: A500 # Cross-refs: genbank:acc:YP_001468392;genbank:gi:157324966;genbank:GeneID:5601363 Probab=100.00 E-value=1.4e-57 Score=332.27 Aligned_cols=269 Identities=17% Similarity=0.189 Sum_probs=225.7 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++|+|+||||++||.+++.++.+|.+ ++..+..|.++ ||++++||+|+++ |+++++.++++|++++|+ T Consensus 1 Ma~~~T~~~~~iiPev~s~~v~~~~~~~~v~~~--~~~~~~~l~g~-~G~tv~ip~~~~~--g~a~~~~~g~~i~~~~lt 75 (278) T protein:vir:80 1 MADLTTKLANLIDPEVMGPMISAKLPKAIKFGK--IAPIDNSLEGQ-PGSEITVPKYKYI--GDAQDVAEGAAIDYSALE 75 (278) T ss_pred CCCcceehhheecHHHHHHHHHHHHHHhhhhcc--cceecccccCC-CCCEEEEeeeccC--CcceeecCCCcCcccccc Confidence 77 8999999999999999999999988854 56678888874 8999999999977 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++++++++++++++|.++|++.+.+++|||+++++|++.+|+|++|++|++.|+|........ .+ . 
T Consensus 76 ~~~~~~~i~~~~~a~~v~D~~~~~~~~d~~~~~~~~~a~~~a~~~d~~l~~~l~~a~~~~~~~-------------~t-~ 141 (278) T protein:vir:80 76 TESVKHGIKKAGKGVKLTDESVLSGYGDPVEEAQKQIRMAIASKVDNDILEEALTTTLEVKGA-------------IN-I 141 (278) T ss_pred cceeeEeeehhhccccccHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHHhccccccccc-------------cc-c Confidence 999999999999999999999999999999999999999999999999999998754221110 00 0 Q ss_pred ccccccHHHHHHHHHHhCcccc-CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGD-KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~-~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) ......++.|.+|..+|+++.. ....++|||.+|+.|+++++++|++.++.. .....++.|++|+|+ T Consensus 142 ~~~~~~~~~~~da~~~l~~~~~~~~~~ivv~p~~~~~L~k~~~~~~~~~~~~g------------~~~~~~G~ig~~~G~ 209 (278) T protein:vir:80 142 GLIDKIENTFTDAPDAIEDESITTTGVLFLNYKDTAKLREEAAGSWTKASQLG------------DDLLVKGAFGELLGW 209 (278) T ss_pred chhhhHHHHHHHHHHhhcccCCCcccEEEECHHHHHHHHhhhhhhcccccccc------------ccceeeccceeecce Confidence 1112246789999999988654 355799999999999999999998766532 112235679999999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecc----eeee-cCcC Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVG----AKWA-VTTT 308 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G----~s~~-~~~~ 308 (342) +|++||.+| +|++|++++|||++..++++++|++|+..++.+.++.|+||++++.- +.-+ .+|. 
T Consensus 210 ~Vi~s~~~p--------~~t~~l~~~gAi~~~~~~~~~vE~~Rd~~~~~d~i~~~~~yg~~v~~~~~~v~it~~a~~ 278 (278) T protein:vir:80 210 EIVRTKKLA--------DGNALAVKAGALKTFLKRNLLAESGRDMDHKLTKFNADQHYAVALVDETKAVKVVPVAGN 278 (278) T ss_pred eEEEcCCCC--------cceEEEEeccceeeeecCCcccccccchhhccceeeeeeEEEEEEEcCcceEEEeeccCC Confidence 999999986 46899999999999999999999999999999999999999999743 2222 2332 No 21 >protein:vir:9820 Length: 272 # NCBI annotation: putative major capsid/head protein # Family: family:all:522 # MgeID: mge:176 # MgeName: 315.4 # Cross-refs: genbank:acc:NP_795582;genbank:gi:28876339;genbank:GeneID:1257858 Probab=100.00 E-value=6.6e-52 Score=301.17 Aligned_cols=263 Identities=20% Similarity=0.262 Sum_probs=228.6 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++++|+||+|++|+.+++.++++| ++++..+.++.++ ||+++++|+|+.+ |+++++.||+++++++++ T Consensus 1 MA~~~T~~~~~~iPev~s~~v~~~~~~~~~~--~~~~~~~~~~~g~-~G~tv~iP~~~~~--~~a~~v~eg~~i~~~~~~ 75 (272) T protein:vir:98 1 MAVGTTKMAQMLDPEVLADMIDAEVGKAIRF--APLAEVDTTLEGQ-PGTTLTVPKWDYI--GDAEDVAEGEAIPMTQLG 75 (272) T ss_pred CCCccccchheechHHHHHHHHHHHHHHhhh--hccccccccccCC-CCCEEEEEEecCC--CCcccccCCCcccccccc Confidence 99 78999999999999999999998888 5677788888874 8999999999866 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) .++..+++++++++|.++|+..+.+..|++.++.+|++.+|+|++|+++++.|++.... 
T Consensus 76 ~~~~~~~~~~~~~~~~itd~~~~~s~~d~~~~~~~~~~~~~a~~~d~~i~~~~~~a~~~--------------------- 134 (272) T protein:vir:98 76 FKKTTMTIKKAGKGVEITDEAILSGYGDPVGQAAKQIVEAIDHKVDADVLDALSKSTQT--------------------- 134 (272) T ss_pred cceEEEEeeeeeeeeeecHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHhcccccc--------------------- Confidence 99999999999999999999999999999999999999999999999999988763210 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) .....+++.+++|.++|||+......|+|||.+|..|+++++++|++.++.. .....++.+++++|++ T Consensus 135 ~~~~~t~d~i~da~~~l~~~~~~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~------------~~~~~~g~ig~i~G~~ 202 (272) T protein:vir:98 135 VEATATVDGVSKALDIFNDEDDAETVIVMNPADASTLRLDAAKEWLGATEVG------------ANRVVSGVYGEVLGVQ 202 (272) T ss_pred cccccCHHHHHHHHHHHhccCCCccEEEEcHHHHHHHHHhcccccccccccc------------ccccccccchhhcCee Confidence 1223468999999999999999999999999999999999999988765521 1112345689999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecc----eee--ecCcCC Q lcl|NC_020854. 
238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVG----AKW--AVTTTN 309 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G----~s~--~~~~~s 309 (342) |++|+.+| ++++|++++||+++..++++.+|++|+..++.+.++.|+||++|+.- +.+ +.+++- T Consensus 203 Vi~s~~~p--------~~t~~~~~~~a~~~~~~~~~~ve~~r~~~~~~~~i~~~~~~~~~v~~~~~vv~~t~~~a~~~ 272 (272) T protein:vir:98 203 IVRSRKCP--------KGTAYMVRKGALRIMLKRNTMVETDRDITKAINQIVANKHYGVYLYKAEKAVKITLKDAAKK 272 (272) T ss_pred EEEcCCCC--------cceEEEEcCCeEEEEecCCceeeeccccccceeEEEEEEEEEEEEEcCCceEEEEecccccC Confidence 99999986 56899999999999999999999999999999999999999999744 233 233333 No 22 >protein:vir:3033 Length: 272 # NCBI annotation: major capsid protein # Family: family:all:522 # MgeID: mge:61 # MgeName: PhiNIH1.1 # Cross-refs: genbank:acc:NP_438146;genbank:gi:16271809;genbank:GeneID:929235 Probab=100.00 E-value=6.6e-52 Score=301.17 Aligned_cols=263 Identities=20% Similarity=0.262 Sum_probs=228.6 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || |+++++|+||+|++|+.+++.++++| ++++..+.++.++ ||+++++|+|+.+ |+++++.||+++++++++ T Consensus 1 MA~~~T~~~~~~iPev~s~~v~~~~~~~~~~--~~~~~~~~~~~g~-~G~tv~iP~~~~~--~~a~~v~eg~~i~~~~~~ 75 (272) T protein:vir:30 1 MAVGTTKMAQMLDPEVLADMIDAEVGKAIRF--APLAEVDTTLEGQ-PGTTLTVPKWDYI--GDAEDVAEGEAIPMTQLG 75 (272) T ss_pred CCCccccchheechHHHHHHHHHHHHHHhhh--hccccccccccCC-CCCEEEEEEecCC--CCcccccCCCcccccccc Confidence 99 78999999999999999999998888 5677788888874 8999999999866 999999999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 
78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) .++..+++++++++|.++|+..+.+..|++.++.+|++.+|+|++|+++++.|++.... T Consensus 76 ~~~~~~~~~~~~~~~~itd~~~~~s~~d~~~~~~~~~~~~~a~~~d~~i~~~~~~a~~~--------------------- 134 (272) T protein:vir:30 76 FKKTTMTIKKAGKGVEITDEAILSGYGDPVGQAAKQIVEAIDHKVDADVLDALSKSTQT--------------------- 134 (272) T ss_pred cceEEEEeeeeeeeeeecHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHhcccccc--------------------- Confidence 99999999999999999999999999999999999999999999999999988763210 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) .....+++.+++|.++|||+......|+|||.+|..|+++++++|++.++.. .....++.+++++|++ T Consensus 135 ~~~~~t~d~i~da~~~l~~~~~~~~~~vv~p~~~~~L~k~~~~~~~~~~~~~------------~~~~~~g~ig~i~G~~ 202 (272) T protein:vir:30 135 VEATATVDGVSKALDIFNDEDDAETVIVMNPADASTLRLDAAKEWLGATEVG------------ANRVVSGVYGEVLGVQ 202 (272) T ss_pred cccccCHHHHHHHHHHHhccCCCccEEEEcHHHHHHHHHhcccccccccccc------------ccccccccchhhcCee Confidence 1223468999999999999999999999999999999999999988765521 1112345689999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecc----eee--ecCcCC Q lcl|NC_020854. 
238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVG----AKW--AVTTTN 309 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G----~s~--~~~~~s 309 (342) |++|+.+| ++++|++++||+++..++++.+|++|+..++.+.++.|+||++|+.- +.+ +.+++- T Consensus 203 Vi~s~~~p--------~~t~~~~~~~a~~~~~~~~~~ve~~r~~~~~~~~i~~~~~~~~~v~~~~~vv~~t~~~a~~~ 272 (272) T protein:vir:30 203 IVRSRKCP--------KGTAYMVRKGALRIMLKRNTMVETDRDITKAINQIVANKHYGVYLYKAEKAVKITLKDAAKK 272 (272) T ss_pred EEEcCCCC--------cceEEEEcCCeEEEEecCCceeeeccccccceeEEEEEEEEEEEEEcCCceEEEEecccccC Confidence 99999986 56899999999999999999999999999999999999999999744 233 233333 No 23 >protein:vir:739 Length: 231 # NCBI annotation: major structural protein 4 # Family: family:all:522 # MgeID: mge:14 # MgeName: Tuc2009 # Cross-refs: genbank:acc:NP_108716;genbank:gi:13487838;genbank:GeneID:920884 Probab=100.00 E-value=2.2e-49 Score=287.31 Aligned_cols=227 Identities=19% Similarity=0.190 Sum_probs=192.5 Q ss_pred hhccCCCCEEEccccccCCCCcccccCCCceechhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHH Q lcl|NC_020854. 40 LNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVA 119 (342) Q Consensus 40 l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~ 119 (342) -++...||||++|+| | ||++++.||++|++++|++++++++|++++|+|.++|++.+.+++||++|+++|++++++ T Consensus 1 ~~~~~~Gdtit~P~~--i--Gda~~v~eG~~i~~~~l~~t~~~atIk~~gk~~~itD~a~l~~~gDp~~ea~~Q~~~~iA 76 (231) T protein:vir:73 1 ENGINLANLCEYPND--I--GDAADVAEGGEISLDKIGTTTKSVTIKKAAKGTEITDEAALSGYGDPIGESNKQLGLSLA 76 (231) T ss_pred CccccCCceEEeccc--c--cchhhhcCCCcCChhhccccceeeeEeeeccceeeeHHHHhhccCchHHHHHHHHHHHHH Confidence 233456999999988 6 999999999999999999999999999999999999999999999999999999999999 Q ss_pred HHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhh Q lcl|NC_020854. 
120 NQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRA 199 (342) Q Consensus 120 ~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~l 199 (342) +++|++++++|++.... .+..++++.+++|+++|||+.+...+++|||+++++||+..- T Consensus 77 ~kvD~di~~~~~~a~l~---------------------~~~~~t~d~i~~A~~~fgde~~~~~vivv~p~~~~~Lrk~~~ 135 (231) T protein:vir:73 77 NKVDDDLLKAAKTTSQT---------------------VSTKANVDGVQAALDIFNDEDAQAYVLIVNPKDAAKIRKDAN 135 (231) T ss_pred HhhhHHHHHhhcccccc---------------------ccccccHHHHHHHHHHhccccccceEEEEcchHHHhhhhccc Confidence 99999999998752210 123478999999999999999999999999999999999653 Q ss_pred hhhhhhhhcccceeeeccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEecc Q lcl|NC_020854. 200 IDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDR 279 (342) Q Consensus 200 i~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr 279 (342) ..+... ..+......+.|++++|++|++|+.+|.. ...+..|+.++||+++..|+++++|++| T Consensus 136 ~~~~~~-------------~~g~~i~~~G~iG~i~G~~Vi~S~~~~~~----~~~~~~~i~~~gAl~~~~k~~~~vEtdR 198 (231) T protein:vir:73 136 AKNIGS-------------EVGANALINGTYADVLGAQIVRSKKLAEG----SALMFKIVSNSPALKLVLKRGVQVETDR 198 (231) T ss_pred hhhhhh-------------hhccceeeecccceEcceEEEEcCCCCCC----ceeeeeEEeeccceeeeecccceeeccc Confidence 333321 12222334678999999999999999842 3467789999999999999999999999 Q ss_pred CCCcceeEEEEeeEEEeeeccee----eecCcC Q lcl|NC_020854. 280 DILAKSDAMSIDLHYVYHPVGAK----WAVTTT 308 (342) Q Consensus 280 ~~~~g~~~l~~r~~y~~~~~G~s----~~~~~~ 308 (342) ++..+.+.+++++||++|++-=+ -+.+|. 
T Consensus 199 d~~~k~~~i~~~~~y~v~l~~~~~vv~~t~~g~ 231 (231) T protein:vir:73 199 DIVTKTTVITADEHYAAYLYDLTKVVNITFTGV 231 (231) T ss_pred cccccccEEEEeEEEEEEEEcCccEEEEEeecC Confidence 99999999999999999986532 233444 No 24 >protein:vir:7990 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:151 # MgeName: Che8 # Cross-refs: genbank:acc:NP_817344;genbank:gi:29565772;genbank:GeneID:1258978 Probab=99.89 E-value=1e-24 Score=152.11 Aligned_cols=266 Identities=18% Similarity=0.142 Sum_probs=193.1 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) ||. ..+.||+|++++.+++.+.+.| +.++.+|.+..+ .+|++|+||.|... +..+...++..++++.++.++ T Consensus 1 MA~---~~~~pei~~~~v~~~~~~~lv~--~~l~~~~~~~~~-~~GdTv~ip~~~~~--~~~d~~~~~~~~~~~~~~~~~ 72 (273) T protein:vir:79 1 MAF---NNFIPELWSDMLLEEWTAQTVF--ANLVNREYEGIA-SKGNVVHIAGVVAP--TVKDYKAAGRQTSADAISDTG 72 (273) T ss_pred Ccc---hhhhHHHHHHHHHHHHHhhccc--hhhhhccccccc-cCCcEEEEeecCcc--cccccccCCCccCccccccce Confidence 885 3468999999999999998887 446767766655 47999999999865 555555678889999999999 Q ss_pred eeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccccc Q lcl|NC_020854. 81 QVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTP 159 (342) Q Consensus 81 ~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~ 159 (342) ...++. .+..++.++|+.......| +.++.+|.+...++++|+++++.+.+.-. ... 
.....+ T Consensus 73 ~~~tid~~~~~~~~i~d~d~~~~~~~-~~~~~~~~~~ala~~vD~~i~~~~~~a~~---~~~------------~~~~~~ 136 (273) T protein:vir:79 73 VDLLIDQEKSIDFLVDDIDRVQVAGS-LEAYTRAGATALATDTDKFIADMLVDNGT---ALT------------GSAPSD 136 (273) T ss_pred EEEEEeeecccceeeccHHHHhhccc-HHHHHHHHHHHHHHHHHHHHHHHHhhccc---ccc------------cccccc Confidence 999995 5899999999887777667 56799999999999999999988764210 000 000111 Q ss_pred ccccHHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 160 TALSPRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 160 ~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ..-.++.|.+|..+|+++. ..-..++++|.++..|++..- ++...+..+.. .. ..++.++.+.|.. T Consensus 137 ~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~Ll~~~~--~~~~~~~~~~~-----~~-----l~~G~ig~~~G~~ 204 (273) T protein:vir:79 137 ADDAFDLIASALKELTKANVPNVGRVVVVNAEMAFWLRSSGS--KLTSADTSGDA-----AG-----LRAGTIGNLLGAR 204 (273) T ss_pred hhhHHHHHHHHHHHhhhccCCccCcEEEECHHHHHHHhhchh--hhhhhhhcccc-----cc-----eeeeEeeEEeceE Confidence 1223678999999998864 245789999999999998631 22222211100 01 1235688999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cce-eeecCcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGA-KWAVTTT 308 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~-s~~~~~~ 308 (342) |+++..+|... .++++.+.++|+++. ++...+|..|+..+..+.++.+.+|++.+ .|+ ..+.+|. 
T Consensus 205 i~~s~~lp~~~-----~~~~~a~~~~A~~~a-~~~~~~e~~r~~~~~~~~v~~~~~yg~~v~~p~~vv~~~~~g~ 273 (273) T protein:vir:79 205 IVESNNLRDTD-----DEQFVAFHPSAAAYV-SQIDTVEALRDQDSFSDRIRALHVYGGKVVRPTGVVVFNKTGS 273 (273) T ss_pred EEecccccccC-----ceEEEEEeccceeee-eehhhhhcccCcccceeeeeeeeeeeeEEecCceEEEEeccCC Confidence 99999998632 256788899999874 45568999999999999999999987765 343 2233332 No 25 >protein:vir:9927 Length: 295 # NCBI annotation: hypothetical protein # Family: family:all:1178 # MgeID: mge:178 # MgeName: 315.6 # Cross-refs: genbank:acc:NP_795689;genbank:gi:28876459;genbank:GeneID:1258000 Probab=99.87 E-value=5.9e-25 Score=153.42 Aligned_cols=263 Identities=12% Similarity=0.095 Sum_probs=160.9 Q ss_pred Cc----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhc Q lcl|NC_020854. 1 MA----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKI 76 (342) Q Consensus 1 Ma----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~l 76 (342) || |+..||.+|+++ +++..-..+..+|.+ ++.....+.- .-|++|++|+|.++ |+++++.||+.|+.+++ T Consensus 1 mAe~nlt~~~dL~~~~si-dfv~~f~~~i~~L~~--~Lgi~r~~p~-a~G~tIt~pK~~~t--gda~dVaEGe~Iplskv 74 (295) T protein:vir:99 1 MAEKNLNTMADLGDIKSI-DFVNKFSKNINDLLK--LLGVTRRETL-TNDLKIQTYKWEVT--LDQTDPGEGETIPLSKV 74 (295) T ss_pred CCCcccccHhhccCceee-hhhHHhhhhHHHHHH--Hhcccccccc-ccCCeEEeeeeeee--cccccccCCcccchhhh Confidence 99 567899999987 444333233334433 2211112211 23999999999988 99999999999999999 Q ss_pred ccce---eeeEeeeeccceeechHH-HhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecc Q lcl|NC_020854. 77 TADK---QVAAILHRGRAFEARDLA-ALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCID 152 (342) Q Consensus 77 t~~~---~~a~i~~~~k~~~~tD~a-~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~ 152 (342) +..+ .++.+++.+|+. ||+| ++.+++||++|..+|+.+.+++++++++++.|++.... +. 
T Consensus 75 t~~~~~t~t~kikK~rK~t--TdEAIqlsGygdpvgead~qL~~~ia~kId~D~~~~lktat~t--~t------------ 138 (295) T protein:vir:99 75 TRTKDKDYTVKWFKKRRAT--TAEAIARHGAARAITEADKRIMRELQNGIKDAFFTFLKTKPTK--VK------------ 138 (295) T ss_pred eeeeeeeeEEEeeeecccc--cHHHHHhcCCCchhHHHHHHHHHHHHHhhhHHHHHHhccCcee--ee------------ Confidence 9874 677778888875 9999 58999999999999999999999999999999752111 00 Q ss_pred cccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceee Q lcl|NC_020854. 153 SESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPT 232 (342) Q Consensus 153 ~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~ 232 (342) + ..--..+..+..+++.|.|+.+.-.++||||+++++||+.+-++|...++- |..-+.. T Consensus 139 ---g-~~lq~a~a~~~~al~~f~Ee~~~~~V~FVnP~D~a~yl~~A~~~~~~a~~f-----------------G~~~L~n 197 (295) T protein:vir:99 139 ---G-VGLQKALSASWAKLATFNEFEGSPLVSFVSPLDVANYLGDTKVGADASNVF-----------------GMTLLKN 197 (295) T ss_pred ---h-hhHHHHHHHhhhhhhhcccccCCceEEEEehHHHHHHHhccccccchhhhh-----------------hhhhhhh Confidence 0 011224566777799999999888999999999999999998887765542 1223445 Q ss_pred eccce-EEEeCCcceeccCCCcceEEEEEecceeEeec--------CCcceeEec----------cCCCc--cee-EEEE Q lcl|NC_020854. 233 YMGLR-VIVSDDVNTAGSGGSTEYATYFFTQGAVASGE--------QMAMQTETD----------RDILA--KSD-AMSI 290 (342) Q Consensus 233 ~~G~~-VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~--------k~~~~ve~d----------r~~~~--g~~-~l~~ 290 (342) ++|.. ||++..+|-. +.|.-...=|-+.. .+.++.-+| ++... .++ .+.+ T Consensus 198 fLG~q~II~S~kv~~G--------~~~aT~~~Ni~~ay~~~~~g~l~~~f~~~~D~tglIg~~h~~~~~~~t~et~~~~~ 269 (295) T protein:vir:99 198 FLGMQNVIVMPSVPEG--------KIYSTAVENLVFASLNVKGGDLGGLFADFTDETGLIAAARNRQLSNLTYESVFFGA 269 (295) T ss_pred hhccceEEEcccCCCc--------eEEEeeccceEEEEecCCchhhhhhhhhccCcccceEEEeccccceeeehhhhHhH Confidence 89996 9988887631 22222222222110 111222222 21111 111 0000 Q ss_pred eeEEEeeecceee-e-cCcCCcChHH Q lcl|NC_020854. 
291 DLHYVYHPVGAKW-A-VTTTNPTRAQ 314 (342) Q Consensus 291 r~~y~~~~~G~s~-~-~~~~sPt~~~ 314 (342) -..|-=.+-|+=- + .+..+|--.- T Consensus 270 ~~lfpE~~dgiv~~tI~~~~~~~~~~ 295 (295) T protein:vir:99 270 NVLFAEIPEGVVEATIEAAAVPGIGG 295 (295) T ss_pred HHhcccccceEEEEEEecCcCCCCCC Confidence 0001111111100 0 1111221111 No 26 >protein:vir:105822 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1636 # MgeName: PMC # Cross-refs: genbank:acc:YP_655767;genbank:gi:109522090;genbank:GeneID:4157630 Probab=99.87 E-value=2.3e-23 Score=144.67 Aligned_cols=266 Identities=17% Similarity=0.145 Sum_probs=191.2 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) ||- +.+.||+|++.+.+++.+.+.| ..++.+|.+..+ .+|+++++|.|..+ +..+...++..++++.++..+ T Consensus 1 MA~---~~~~pe~~~~~v~~~~~~~lv~--~~l~~~~~~~~~-~~Gdtv~ip~~~~~--~~~d~~~~~~~~~~~~~~~~~ 72 (273) T protein:vir:10 1 MAF---NNFIPELWSDMLLEEWTAQTVF--ANLVNREYEGTA-SKGNVVHIAGVVAP--TVKDYKAAGRQTSADAISDTG 72 (273) T ss_pred Ccc---hhhhHHHHHHHHHHHHHhhhcc--chhhcccccccc-ccCceEEEeecccc--cccccccCCCccCccccccce Confidence 884 3567999999999999998887 446667766655 36999999999765 444334567778999999999 Q ss_pred eeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccccc Q lcl|NC_020854. 81 QVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTP 159 (342) Q Consensus 81 ~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~ 159 (342) ...++. .+..++.++|+.......| +.++.+|.+...++++|+++++.+.+.-.. . 
..+...+ T Consensus 73 ~~~tid~~~~~~~~i~d~d~~~~~~~-~~~~~~~~~~alA~~vD~~i~~~~~~a~~~---~------------~~~~~~~ 136 (273) T protein:vir:10 73 VDLLIDQEKSIDFLVDDIDRVQVAGS-LEAYTRAGATALATDTDKFIADMLVDNGTA---L------------TGSAPTD 136 (273) T ss_pred EEEEEeeeeecceEeecHHHhhhhcc-HHHHHHHHHHHHHHHHHHHHHHHHhccccc---c------------ccccccc Confidence 999885 5899999999887776666 567999999999999999999877642110 0 0000111 Q ss_pred ccccHHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 160 TALSPRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 160 ~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ..-.++.|.+|..+|.++. ..-..++++|+++..|++.. .++...+..+.. .. ..++.++.+.|.. T Consensus 137 ~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~L~~~~--~~~~~~~~~~~~-----~~-----l~~G~ig~i~G~~ 204 (273) T protein:vir:10 137 ADDAFDLIAKALKELTKANVPNVGRVVVVNAEMAFWLRSSG--SKLTSADTSGDA-----AG-----LRAGTIGNLLGAR 204 (273) T ss_pred hhHHHHHHHHHHHHhhhcCCCcCCCEEEECHHHHHHHhcch--hhhhhhhccccc-----cc-----eeeeeeeEEeceE Confidence 1223678999999998764 24578999999999999864 233333221110 11 1135689999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cce-eeecCcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGA-KWAVTTT 308 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~-s~~~~~~ 308 (342) |++++.+|.. ..++.+.+.++|+++. ++...+|..|+..+..+.++.+.+|++.+ .|+ ..+.+|. 
T Consensus 205 v~~s~~lp~~-----~~~~~~~~~~~A~~~a-~q~~~~e~~r~~~~~~~~v~~~~~yg~~v~~~~~~~~l~~~g~ 273 (273) T protein:vir:10 205 IVESNNLRDT-----DDEQFVAFHPSAAAYV-SQIDTVEALRDQDSFSDRIRALHVYGGKVVRPTGVVVFNKTGS 273 (273) T ss_pred EEEecccccC-----CccEEEEEeccceeee-eeeehhhcccCCCcceeeeeeeeeeeeeEeccceEEEEeccCC Confidence 9999999853 2356788899999876 45568999999999999999999988765 342 2333332 No 27 >protein:vir:102605 Length: 273 # NCBI annotation: gp6 # Family: family:all:2203 # MgeID: mge:1661 # MgeName: Llij # Cross-refs: genbank:acc:YP_655002;genbank:gi:109392192;genbank:GeneID:4157227 Probab=99.87 E-value=2.3e-23 Score=144.67 Aligned_cols=266 Identities=17% Similarity=0.145 Sum_probs=191.2 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) ||- +.+.||+|++.+.+++.+.+.| ..++.+|.+..+ .+|+++++|.|..+ +..+...++..++++.++..+ T Consensus 1 MA~---~~~~pe~~~~~v~~~~~~~lv~--~~l~~~~~~~~~-~~Gdtv~ip~~~~~--~~~d~~~~~~~~~~~~~~~~~ 72 (273) T protein:vir:10 1 MAF---NNFIPELWSDMLLEEWTAQTVF--ANLVNREYEGTA-SKGNVVHIAGVVAP--TVKDYKAAGRQTSADAISDTG 72 (273) T ss_pred Ccc---hhhhHHHHHHHHHHHHHhhhcc--chhhcccccccc-ccCceEEEeecccc--cccccccCCCccCccccccce Confidence 884 3567999999999999998887 446667766655 36999999999765 444334567778999999999 Q ss_pred eeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccccc Q lcl|NC_020854. 81 QVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTP 159 (342) Q Consensus 81 ~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~ 159 (342) ...++. .+..++.++|+.......| +.++.+|.+...++++|+++++.+.+.-.. . 
..+...+ T Consensus 73 ~~~tid~~~~~~~~i~d~d~~~~~~~-~~~~~~~~~~alA~~vD~~i~~~~~~a~~~---~------------~~~~~~~ 136 (273) T protein:vir:10 73 VDLLIDQEKSIDFLVDDIDRVQVAGS-LEAYTRAGATALATDTDKFIADMLVDNGTA---L------------TGSAPTD 136 (273) T ss_pred EEEEEeeeeecceEeecHHHhhhhcc-HHHHHHHHHHHHHHHHHHHHHHHHhccccc---c------------ccccccc Confidence 999885 5899999999887776666 567999999999999999999877642110 0 0000111 Q ss_pred ccccHHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 160 TALSPRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 160 ~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ..-.++.|.+|..+|.++. ..-..++++|+++..|++.. .++...+..+.. .. ..++.++.+.|.. T Consensus 137 ~~~~~~~i~~a~~~ld~~~vP~~~R~lvv~p~~~~~L~~~~--~~~~~~~~~~~~-----~~-----l~~G~ig~i~G~~ 204 (273) T protein:vir:10 137 ADDAFDLIAKALKELTKANVPNVGRVVVVNAEMAFWLRSSG--SKLTSADTSGDA-----AG-----LRAGTIGNLLGAR 204 (273) T ss_pred hhHHHHHHHHHHHHhhhcCCCcCCCEEEECHHHHHHHhcch--hhhhhhhccccc-----cc-----eeeeeeeEEeceE Confidence 1223678999999998764 24578999999999999864 233333221110 11 1135689999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cce-eeecCcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGA-KWAVTTT 308 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~-s~~~~~~ 308 (342) |++++.+|.. ..++.+.+.++|+++. ++...+|..|+..+..+.++.+.+|++.+ .|+ ..+.+|. 
T Consensus 205 v~~s~~lp~~-----~~~~~~~~~~~A~~~a-~q~~~~e~~r~~~~~~~~v~~~~~yg~~v~~~~~~~~l~~~g~ 273 (273) T protein:vir:10 205 IVESNNLRDT-----DDEQFVAFHPSAAAYV-SQIDTVEALRDQDSFSDRIRALHVYGGKVVRPTGVVVFNKTGS 273 (273) T ss_pred EEEecccccC-----CccEEEEEeccceeee-eeeehhhcccCCCcceeeeeeeeeeeeeEeccceEEEEeccCC Confidence 9999999853 2356788899999876 45568999999999999999999988765 342 2333332 No 28 >protein:vir:80180 Length: 381 # NCBI annotation: capsid protein # Family: family:all:2203 # MgeID: mge:1878 # MgeName: Pf-WMP3 # Cross-refs: genbank:acc:YP_001285797;genbank:gi:148747831;genbank:GeneID:5220456 Probab=99.81 E-value=3.7e-21 Score=132.58 Aligned_cols=313 Identities=15% Similarity=0.108 Sum_probs=186.9 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) |+|+....|+||+|++.+.+.+.+...|.+ + ..+..+.+ .+|++++||.+.. ..+.++.++..+++++++..+ T Consensus 15 ~~~t~~~~fiPev~s~~v~~~l~~~lv~~~--l-~~~~~~~~-~~GdTV~ip~~g~---~~a~d~~~g~~i~~~~~~~~~ 87 (381) T protein:vir:80 15 VDLSNVQVFIPEVWSSEVRMFRDQKFAALE--A-TKKIPFEG-KKGDLIHIPNISR---AAVYDKQPQTPVNLQARTDSE 87 (381) T ss_pred cchhhHHhhhhHHHHHHHHHHHHHhhhhhh--c-ccccccee-ecCceEEeeccCc---ceeeeecCCCcccccccCCce Confidence 777777888899999999999988877743 3 34455555 3799999999863 567889999999999999999 Q ss_pred eeeEe-eeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc--chhheeeeccccccc Q lcl|NC_020854. 81 QVAAI-LHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTS--SSAFFDLCIDSESGD 157 (342) Q Consensus 81 ~~a~i-~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~--~~~~~~~~~~~~~~~ 157 (342) ...++ +.+..++.++|+.......||+.++.+|++..++++.|+.+++.+..+-........ ............... 
T Consensus 88 ~~itID~~~~~~~~Idd~D~~~~~~D~~~~~~~~~~~aLA~~~D~~i~~~~~~~~~~~~~~~~t~~~~i~~~~~~~~~t~ 167 (381) T protein:vir:80 88 FTFTVTKYKESSFMIEDIVNTQASYTLRQYYTKEAGYALARDMDNFALAHRAVINAFPSQRIYSYDTTLGDGTVNAHLTG 167 (381) T ss_pred EEEEEeeeeecceeechHHHHhhccChHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccccccccccccccccccccc Confidence 98888 567888999999988888899999999999999999999999877543321110000 000001011111112 Q ss_pred ccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) ....+.++.|.+|..+|.++.- .-..++++|.+|..|++..- +.. .+.. .. ....++.++.+.| T Consensus 168 ~~~~~t~~~i~~a~~~Lde~~VP~egR~lvv~P~~~~~Ll~~~~--~~~-ad~~------~~-----~~l~~G~Ig~i~G 233 (381) T protein:vir:80 168 TPAPLTYAALLLAKQKLDEADVPQEGRIVMVSPAQYIDLLSINQ--FIS-VDFS------QV-----KPVTSGVVGTILG 233 (381) T ss_pred chhhHHHHHHHHHHHHHhhcCCCcCCcEEEeCHHHHHHHhhchh--hhh-hhhc------cc-----hhhhceeeeEEcc Confidence 2234567889999999987642 34689999999999997632 222 1111 00 1122456899999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEE---------ee-ecceeeec Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYV---------YH-PVGAKWAV 305 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~---------~~-~~G~s~~~ 305 (342) .+|+++..+|.... +.+.+..|+-.......-......+.-.....+...+.|- ++ ..|..|+. 
T Consensus 234 ~~Vv~Sn~lp~~~~------t~~~~~agap~~~~~~~~~~~~~g~~s~~a~av~~~k~yd~~~~~~~~~~~~~~g~~~~~ 307 (381) T protein:vir:80 234 MEVIVTTQIGINSL------TGYVNGQGAPTQPTPGVLGSPYLPDQAGTANVVNTGSASDLAVSLSYFGLPVFSGAGATA 307 (381) T ss_pred eEEEeecccccccc------cceeeeccccccccccccccccccccccceeeeeeeeeeceeeeeeeccceeeecceeee Confidence 99999999986432 1222333332211110000000111111112222222222 22 24455544 Q ss_pred CcCCcChHHhcCCcCcee-ecC----ccccceEEEEecCCCC Q lcl|NC_020854. 306 TTTNPTRAQLETVANWSK-VYE----LKNIGIVRATNVSNFD 342 (342) Q Consensus 306 ~~~sPt~~~L~~~~NW~~-v~d----~k~i~~~~~~~~~~~~ 342 (342) +...++..-.....-|-. |.+ +-.-|=++ +++.-| T Consensus 308 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~~~~ 347 (381) T protein:vir:80 308 ADGGQTLGSFGGANRWATAVVCHPDWLAVGVQQN--VKSESS 347 (381) T ss_pred cCCCceeeeehhhhhhhhhcccccccccccceeE--eecccc Confidence 434455443333334443 211 11222223 344333 No 29 >protein:vir:94622 Length: 341 # NCBI annotation: PfWMP4_37 # Family: family:all:2203 # MgeID: mge:1525 # MgeName: Pf-WMP4 # Cross-refs: genbank:acc:YP_762667;genbank:gi:115304375;genbank:GeneID:5142322 Probab=99.78 E-value=4e-21 Score=132.39 Aligned_cols=307 Identities=15% Similarity=0.089 Sum_probs=175.7 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) ++|....-|+||+|++++.+++.+++.|.+ ++ +|-+... ..|++|+||.+. 
...+.++.++..+++++++..+ T Consensus 12 ~~t~~v~~fipei~s~~i~~~l~~~~v~~~--~~-~d~~~~~-~~Gdtv~ip~~g---~~~~~d~~~~~~i~~~~~~~~~ 84 (341) T protein:vir:94 12 INTQRGQQFIPEQWLSEVQMFRKAKMLDTS--VV-KTWGAQV-KKGDTFHVPRIS---ELGVEDKATDVPVGVQPVNDTD 84 (341) T ss_pred ccchhHHHHHHHHHHHHHHHHHHhhcchhh--cc-ccccccc-cCCceEEEeccC---cceeeeecCCCccccccccCce Confidence 444555668899999999999999888844 33 4544433 359999999875 2667889999999999999999 Q ss_pred eeeEe-eeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccccc Q lcl|NC_020854. 81 QVAAI-LHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTP 159 (342) Q Consensus 81 ~~a~i-~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~ 159 (342) ...++ +.+..++.++|+.......|++.++.+|.+...++++|+++++.+.+.- ..+.. .... .......... T Consensus 85 ~~itiD~~~~~~~~i~d~d~~~~~~d~~~~~~~~~~~aLA~~~D~~i~~~~a~~~--~~~~~---~~~~-~~~~~~t~~~ 158 (341) T protein:vir:94 85 FVITVDTDRTTAVALDDLLEIQASYDLRAPYLEAMGYALAKDMTGSILGLRAAVQ--NTASQ---NVFS-SSNGAITGNG 158 (341) T ss_pred EEEEEeeeeecceeechHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHhhhcc--ccccC---cccc-CccccccCch Confidence 99988 6789999999999988889999999999999999999999988765321 11110 0000 0011111223 Q ss_pred ccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 160 TALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 160 ~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ..+.++.|.+|..+|.++.- .-..++++|.+|..|++..- |.+. +..+ .....++.++.+.|.. 
T Consensus 159 ~~~~~~~i~~a~~~Lde~~VP~~gR~lvv~P~~~~~Ll~~~~--~~~~-~~~g-----------~~~l~~G~ig~i~G~~ 224 (341) T protein:vir:94 159 QAFSFAVFLAARRLLLEADVPEEKIVLLISPGQESALFTIPQ--FISK-DFIN-----------NAPIAQGQIGSLMGVR 224 (341) T ss_pred hhhhHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHhhchh--hhhh-hccc-----------cchhheeeeeeEeceE Confidence 45677889999999987632 34778999999999987643 2222 1110 0012245688999999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEeecCCc--ceeEeccCCCc-----ceeEEEEeeEEEeeecceeeec----- Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMA--MQTETDRDILA-----KSDAMSIDLHYVYHPVGAKWAV----- 305 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~--~~ve~dr~~~~-----g~~~l~~r~~y~~~~~G~s~~~----- 305 (342) |+++..+|..... .|..+.+-.......+ ...+..|...+ ..-.++.+.-+.+-+.-..|-. T Consensus 225 V~~Sn~lp~~~~~------~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~gl~~~~~av~~~k~~~~~~~~~~~~~ 298 (341) T protein:vir:94 225 VIRTSLIGNNSAT------GWRNGAPTIAPAEATPGFTGSRYLPKQDSFTSLPATFTGNSRPVHTAVMCHMDWAAAVVSK 298 (341) T ss_pred EEEeccccccccc------cccccccceecccccccccccccccccccccccEEEEEEecccccceeeecchhhhccccc Confidence 9999999864311 1111111111110000 11111111110 1112222333333332222200 Q ss_pred -----CcCCcC-hHHhcCCcC--ceeecCccccceEEEEecCCCC Q lcl|NC_020854. 306 -----TTTNPT-RAQLETVAN--WSKVYELKNIGIVRATNVSNFD 342 (342) Q Consensus 306 -----~~~sPt-~~~L~~~~N--W~~v~d~k~i~~~~~~~~~~~~ 342 (342) ..-.|. -+++-.+.+ =-++..|+. 
.|.|.+-.+.= T Consensus 299 ~~~~~~~~~~~~~~~~i~~~~~~G~~~lrp~~--~v~~~~~~~~~ 341 (341) T protein:vir:94 299 APRVTQSFENREQVWLMVGRQAYGARLYRPLH--AVNIHTTGDTV 341 (341) T ss_pred cccccccchhhhhhhhhhhhhhhcccccCcce--eEEEecCcCCC Confidence 000010 112222222 000122222 12332222222 No 30 >protein:vir:106647 Length: 303 # NCBI annotation: ORF011 # Family: family:all:1178 # MgeID: mge:1557 # MgeName: 187 # Cross-refs: genbank:acc:YP_239493;genbank:gi:66395226;genbank:GeneID:4555801 Probab=99.78 E-value=7.3e-21 Score=130.99 Aligned_cols=267 Identities=12% Similarity=0.063 Sum_probs=157.6 Q ss_pred CcceeccccchhHHHHHHHhhhHH-----hhhhhhc-CccccchhhhccCCCCEEEccc---cccCCCCcccccCCCcee Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQ-----RDAFLAS-GVVQPMTELNATEGGDFINVPF---WKANLSGDFEVLSDSSSL 71 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~-----~~~f~~s-g~~~~d~~l~~~~~G~ti~~P~---~~~i~~gda~~~~~~~~i 71 (342) |+ ...++++++.+++-.+..+.+ -..|.+- |+.. -.-|. -|.+|++++ |.++ |+++++.||+.| T Consensus 1 M~-~e~nl~~~~dL~~a~siDF~~~f~~~i~~L~~~LGv~r-~~pla---~Gt~iktyK~~~~~y~--gda~dVaEGe~I 73 (303) T protein:vir:10 1 MS-AENNLINVEALGKAKSIDFANKLGVGLNKLFEALAIQN-KIPMN---VGSALKQYRFKVEDSE--KPNGDVAEGDVI 73 (303) T ss_pred CC-CCcCCcchhhcccceeehhhhhhhhhHHHHHHHhhhhc-ccccc---CCceeeeeeeeceeec--cccccccCCccc Confidence 87 567778888886433333322 2233221 2221 11121 266665554 5566 999999999999 Q ss_pred chhhcccc---eeeeEeeeeccceeechHH-HhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhhe Q lcl|NC_020854. 72 TPGKITAD---KQVAAILHRGRAFEARDLA-ALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFF 147 (342) Q Consensus 72 ~~~~lt~~---~~~a~i~~~~k~~~~tD~a-~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~ 147 (342) +.+|++.. ...+.+++.+|+. ||+| ++.+++||+++.-+|+.+.+++++++++++.|++..+.. 
T Consensus 74 plskvt~~~~~t~~~~~kK~rK~t--TdEAIqlsGyg~aVgetd~qL~~~Iq~kIdnd~~~~lktaT~t~---------- 141 (303) T protein:vir:10 74 PLTKVTREQVDITELQFAKYRKST--SAEAIQAHGYDLAINQTDNEMIKYVQKKFRAKFFETLKSAIENG---------- 141 (303) T ss_pred chhhheeeecceEEEEeecccccc--cHHHHHhhcCCchhHHHHHHHHHHHHhhhhHHHHHHHhhccccc---------- Confidence 99999975 5677788888877 9999 599999999999999999999999999999998633210 Q ss_pred eeecccccccccccccHHHHHHHHHHhC------ccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceee Q lcl|NC_020854. 148 DLCIDSESGDTPTALSPRHVAEARAILG------DQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMA 221 (342) Q Consensus 148 ~~~~~~~~~~~~~~~~~~~l~~A~~~~G------D~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~ 221 (342) +.......+.++|..|+..|- ++.+.-.++||||++++++|+.+-+. .+.++ T Consensus 142 -------~~t~~t~~s~~glq~Al~~~~~kl~~~~ed~~~~V~FvNP~Daa~yl~~A~i~-~~~t~-------------- 199 (303) T protein:vir:10 142 -------KRTNKTKLSAENLQGALSKGRANLSVLLDDEITPIAFVNPNDTAEYLANGFIN-STGAQ-------------- 199 (303) T ss_pred -------ccccceeecHHHHHHHHHhhhhhccccccccccEEEEEchHHHHHHhhcCCcc-hhhhh-------------- Confidence 011123457899999998774 45555679999999999999876543 22222 Q ss_pred cccccccceeeeccceEEEeCCccee---ccCCCcceEEEEEecc----eeEeec--CCcceeEeccCCCc--cee---- Q lcl|NC_020854. 222 AAYGGEVSVPTYMGLRVIVSDDVNTA---GSGGSTEYATYFFTQG----AVASGE--QMAMQTETDRDILA--KSD---- 286 (342) Q Consensus 222 ~~~~~~~~i~~~~G~~VvvdD~~p~~---~~~~~~~y~t~l~~~G----Ai~~~~--k~~~~ve~dr~~~~--g~~---- 286 (342) .|..-+..++|..||++..+|-. .+...+-...|.-..| +|.+.. ..-+.+-.+++... .++ T Consensus 200 ---fG~n~L~nfLG~~II~S~kv~~G~~~~T~~~Ni~~ay~~~~g~l~~~f~~t~D~tglIGv~h~~~~~~~t~eT~~~~ 276 (303) T protein:vir:10 200 ---FGVNLLTPYVGVKIVEFADVPQGEVWMTVAENLNVAYANPRGELSRAFAFATDATGFVGVLHDIQPQRLTSDTIYAS 276 (303) T ss_pred ---hhhhhhhhhhcceEEEeccCCCceEEEeeccceEEEEecCchhhhhhhhhccccccceEEEeccccceeeehhHhHh Confidence 12233456899999999988732 1222222222322222 111110 00022222222211 111 Q ss_pred --EEEEeeEEEeeecceeeecCcCCcC Q lcl|NC_020854. 
287 --AMSIDLHYVYHPVGAKWAVTTTNPT 311 (342) Q Consensus 287 --~l~~r~~y~~~~~G~s~~~~~~sPt 311 (342) .|..++.=++-.-=++=.+.+.-|+ T Consensus 277 ~~~lfpE~~dgiv~~ti~~~e~~~~~~ 303 (303) T protein:vir:10 277 AISMFPENIDAVIKVTIKKDEAGELPS 303 (303) T ss_pred HHHhcccccceEEEEEEeccccCCCCC Confidence 1112221111111122223345566 No 31 >protein:vir:9875 Length: 296 # NCBI annotation: hypothetical protein # Family: family:all:1178 # MgeID: mge:177 # MgeName: 315.5 # Cross-refs: genbank:acc:NP_795637;genbank:gi:28876404;genbank:GeneID:1257935 Probab=99.75 E-value=2.7e-20 Score=127.90 Aligned_cols=254 Identities=12% Similarity=0.117 Sum_probs=142.7 Q ss_pred Ccce----eccccchhHHH-----HHHHhhhHHhhhhhh-cCccccchhhhccCCCCEE-EccccccCCCCcccccCCCc Q lcl|NC_020854. 1 MATL----RSDIIIPEVFT-----PYVIEQTTQRDAFLA-SGVVQPMTELNATEGGDFI-NVPFWKANLSGDFEVLSDSS 69 (342) Q Consensus 1 MaT~----~~d~i~Pev~~-----~yv~~~~~~~~~f~~-sg~~~~d~~l~~~~~G~ti-~~P~~~~i~~gda~~~~~~~ 69 (342) |-|. -.+++.-..++ +|+.+-..+-..|.+ -|+. +-.-| .-|++| +.|.|.++ |+++++.||+ T Consensus 1 ~~~~~~~~e~nlt~~~dl~~~~siDf~~~f~~~i~~L~~~LGv~-r~~pl---a~GstIkt~k~~~y~--gda~dVaEGe 74 (296) T protein:vir:98 1 MVTSRTYPEENLIKSTDLKYPITIDVTNKFQENISKLLEMLGVT-RKISV---SEGMTLKTYAGYDVT--LAEGNVPEGE 74 (296) T ss_pred CCCccccCcCCCcchhhhhhhhhhhhHHHHhhhHHHHHHHhhhc-ccccc---cCCCEEeeccceeee--eccccccCCc Confidence 5531 13344444443 333322222223322 1221 11111 239999 77889988 9999999999 Q ss_pred eechhhcccce---eeeEeeeeccceeechHH-HhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchh Q lcl|NC_020854. 70 SLTPGKITADK---QVAAILHRGRAFEARDLA-ALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSA 145 (342) Q Consensus 70 ~i~~~~lt~~~---~~a~i~~~~k~~~~tD~a-~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~ 145 (342) .|+.+|++..+ ..+.+++.+|+. ||+| ++.+++||++|.-+|+.+.+++++++++++.|++..+.. 
T Consensus 75 ~Iplskvt~~~~~t~t~~ikK~rK~t--TdEAIqlsGyg~aVgetd~qL~~~iq~kId~d~~t~LktaT~t~-------- 144 (296) T protein:vir:98 75 VIPLSKVERKIHSEKKIELKKYRKAT--TGEDIQMYGSNEAVTNTDNALVRQLQKKIRTDFVTALKTGTGTQ-------- 144 (296) T ss_pred ccchhhheeeecceEEEEeecccccc--CHHHHHhhcCCchhHHHHHHHHHHHHHhhhHHHHHHHhccccee-------- Confidence 99999999875 777788889995 9999 599999999999999999999999999999998532210 Q ss_pred heeeecccccccccccccHHHHHH--------HHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeecc Q lcl|NC_020854. 146 FFDLCIDSESGDTPTALSPRHVAE--------ARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSG 217 (342) Q Consensus 146 ~~~~~~~~~~~~~~~~~~~~~l~~--------A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~ 217 (342) ..+.++|.. +..+|+|+.+.-.++||||.+++++++..-|. .. T Consensus 145 ---------------~~t~~~lQ~Ala~~~~~l~~~feded~~~~V~FVnP~D~a~ylg~a~it---~q----------- 195 (296) T protein:vir:98 145 ---------------DALGAGLQGALASAWGKLQVLFEDYGSERAIVFANSLDVAEYIAKAGIT---TQ----------- 195 (296) T ss_pred ---------------eechhhHHHHHHHHhhhhhhhccccCCCceEEEEehHHHHHHhcCCccc---hh----------- Confidence 012344444 44899999888999999999999999876441 11 Q ss_pred ceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEee--------cCCcceeEeccCCCccee--E Q lcl|NC_020854. 218 GSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASG--------EQMAMQTETDRDILAKSD--A 287 (342) Q Consensus 218 ~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~--------~k~~~~ve~dr~~~~g~~--~ 287 (342) ... +..-+..++|..||++..+|-. +.|.| ...=|-+. ..+.+++.+|.--.-|.+ . T Consensus 196 t~f-----G~tyl~nfLG~~II~S~kV~~G-----~~~~T---~~~Ni~~ay~~~~~~~l~~~f~~~~d~tglIGv~h~~ 262 (296) T protein:vir:98 196 TAF-----GLTYLVDFTGTVIISTNDVTKG-----EIWAT---VPENIIFAYINPNNSELAKEFNLYGDPTGYIGMNHFQ 262 (296) T ss_pred hee-----chhhhhhccccEEEEcCcCCCc-----eEEEe---eecceEEEeecccccchhhhhccccccccceEEEecc Confidence 001 1122334899999999988721 11111 11111110 111122222211110100 0 Q ss_pred EEEeeEE-Eeeecceee--------ecCcCCcCh Q lcl|NC_020854. 
288 MSIDLHY-VYHPVGAKW--------AVTTTNPTR 312 (342) Q Consensus 288 l~~r~~y-~~~~~G~s~--------~~~~~sPt~ 312 (342) ...+-.| .+..-|..+ ..+.++|.- T Consensus 263 ~~~~~t~eT~~~~~~~lfpE~~dgiv~~tI~~~~ 296 (296) T protein:vir:98 263 ENTTLTIQTLLVSGMLMYPERIDGIVKVTLTPGV 296 (296) T ss_pred ccceeeehhHhHhHHHhcccccceEEEEEecCCC Confidence 0000000 000111111 011112222 No 32 >protein:vir:108211 Length: 318 # NCBI annotation: gp9 # Family: family:all:6420 # MgeID: mge:2004 # MgeName: Giles # Cross-refs: genbank:acc:YP_001552338;genbank:gi:160700658;genbank:GeneID:5758931 Probab=99.61 E-value=1.2e-17 Score=113.40 Aligned_cols=277 Identities=13% Similarity=0.089 Sum_probs=171.1 Q ss_pred Cc-------------ceeccccc-hhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccC Q lcl|NC_020854. 1 MA-------------TLRSDIII-PEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLS 66 (342) Q Consensus 1 Ma-------------T~~~d~i~-Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~ 66 (342) |+ -+++|+++ |+++-.++.+...+ .|+..-+...-..-.+++-.-+-..|+|- .+|++++. T Consensus 1 ~~~~~~i~s~~~~~~itv~~ll~~P~~I~~~i~e~~~~--~~iad~lf~~~~a~~~~~v~f~~~~p~~~---~~d~e~Va 75 (318) T protein:vir:10 1 MTAPTGIVSVSDGPAITVRELVGNPLWIPTALKKMMVN--QFISESLFRNGGANPNGVVAYNEGNPSFL---EDDVADVA 75 (318) T ss_pred CCCCCcceeeecCCceehHHhhCCchhHHHHHHHHHhc--cchhhhhhhcccccccceeEEEecccccc---cCcHhhcc Confidence 54 25678876 99888888665532 33221111110000111223334447764 59999999 Q ss_pred CCceechhhcccce-eeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHH-hhhcccc--- Q lcl|NC_020854. 67 DSSSLTPGKITADK-QVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVF-GSLNANT--- 141 (342) Q Consensus 67 ~~~~i~~~~lt~~~-~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~-~~~~a~~--- 141 (342) |+..++....+.+. ..++.+++|+++.++||+....+.++++...+|++....|+.+++++++|.... ....+.. 
T Consensus 76 EggEiP~~~~~~G~~~ia~~~K~G~~~~vS~Em~~~n~~~~v~r~~~~l~Nti~r~~d~~a~dal~sa~t~~~~~s~~w~ 155 (318) T protein:vir:10 76 EFGEIPVSAGARGLPRTAFAVKKALGVRVSKEMIDENRVGAVNDQMLQLRNTFIRANDRSAKALLQSPIVPTLAVPTAWD 155 (318) T ss_pred CcccccccCCCCCchhhhhhehhccceeccHHHHhhcChhHHHHHHHHHHHHHHHHHHHHHHHHHhccccccccCCcCCC Confidence 99999999999954 455778999999999999999999999999999999999999999999885421 1111100 Q ss_pred -cchhheeeecccccccccccccHHHHHHHHH-----HhCccc----cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccc Q lcl|NC_020854. 142 -SSSAFFDLCIDSESGDTPTALSPRHVAEARA-----ILGDQG----DKLTAVAMHSKVYYDLVERRAIDYVSTADARGT 211 (342) Q Consensus 142 -~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~-----~~GD~~----~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~ 211 (342) ......|+... .+.+..|.. .++.+. -....|||||..++.|.++..+--....+++.. T Consensus 156 ~~~~~~~d~~~A-----------~e~v~~a~~~~~~a~~~~~~~~~GY~pdtIVlhP~~~~~l~~n~~~~~~y~~~a~~~ 224 (318) T protein:vir:10 156 NGGKVRTDIAIA-----------IEQISTAAPTAYPAGVGSSDEYFGFIPDTIVMHYALLPILMDNENFMKVYERNANYV 224 (318) T ss_pred Ccccccccchhh-----------hhhhhhhhhhhhhhhhhhhhhccCccceeeEECHHHHHHHhcchhhhhhhhccchhh Confidence 01111111100 111111111 122221 245799999999999998854322222222222 Q ss_pred eeeeccceeecccccccce-eeeccceEEEeCCcceeccCCCcceEEEEEecceeEe-ecCCcceeEeccCC----Ccce Q lcl|NC_020854. 212 STTQSGGSMAAAYGGEVSV-PTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVAS-GEQMAMQTETDRDI----LAKS 285 (342) Q Consensus 212 ~~~~~~~~~~~~~~~~~~i-~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~-~~k~~~~ve~dr~~----~~g~ 285 (342) +..... ...| +.++|++||+|-.+|-. +.|++-.|++++ +..+|+.++.-|.. +++. T Consensus 225 ~~~~~~---------tg~~~g~~lGl~vi~s~~~p~~--------~alvlq~g~vG~~~d~~pl~~t~~~~egg~~~g~~ 287 (318) T protein:vir:10 225 STAPDW---------TGNFPGSVMGLNVIRSRTFPID--------RVLIMERGTVGFYSDTRPLQFTALYPEGNGPNGGP 287 (318) T ss_pred hhcccc---------cccccceeeceEEeecCccCCC--------eeEEEecCCcceeeccccceeeecccCCCCCCCCc Confidence 211110 1122 35699999999888732 479999999997 56777777766643 3333 Q ss_pred e------EEEEeeEEEeeecceeeecCcCCc Q lcl|NC_020854. 
286 D------AMSIDLHYVYHPVGAKWAVTTTNP 310 (342) Q Consensus 286 ~------~l~~r~~y~~~~~G~s~~~~~~sP 310 (342) + ...-|..++..|+..-|-.+-.+| T Consensus 288 ~~s~~~~~~~~~~~~V~~PkA~~~itgi~~~ 318 (318) T protein:vir:10 288 TESYRADASHKRALAVDQPKAALWLTGIVTP 318 (318) T ss_pred chhhheehheeeeeeeeCcceeEEEeeccCC Confidence 2 233445667778999996665678 No 33 >protein:vir:99749 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1497 # MgeName: phiETA2 # Cross-refs: genbank:acc:YP_001004307;genbank:gi:122891761;genbank:GeneID:4712304 Probab=99.56 E-value=1.9e-15 Score=101.25 Aligned_cols=273 Identities=11% Similarity=0.067 Sum_probs=175.8 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) |++..+...+|+.+..-+.+...+.+.+.+ ++.. + ..+|.++++|.+.. .+++..+.|++.++..+++..+ T Consensus 30 ~~~~~~~~lip~~~~~~ii~~~~~~s~l~~--~~~~---~--~~~~~~~~~p~~~~--~~~a~~v~Eg~~~~~~~~~~~~ 100 (324) T protein:vir:99 30 MMHEKKDGTLLNDFTTPILQEVMENSKIMR--LGKY---E--PMEGTEKKFTFWAD--KPGAYWVGEGQKIETSKATWVN 100 (324) T ss_pred eccCCCcceechhHHHHHHHHHHhhchhhh--hcce---e--eccCCceEEEEEec--CcceeEeccCccccccccceeE Confidence 334445556788777777677777666644 2211 1 12466788999874 4788889999999999999988 Q ss_pred eeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPT 160 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~ 160 (342) .....++.+..+.++++...-+..+..+.+.+++++.+++++++.+|. | .. .+.....+.. ........... 
T Consensus 101 v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~ai~~~~d~~~l~---G---~g-~~~~~~~~~~-~~~~~~~~~~~ 172 (324) T protein:vir:99 101 ATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL---N---QG-NNPFGKSIAQ-SIEKTNKVIKG 172 (324) T ss_pred EEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhh---c---CC-CCccCccccc-cccccceeccc Confidence 888889999989998877665667888999999999999999988764 1 11 1111111111 11122233445 Q ss_pred cccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 161 ALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 161 ~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) .++++.|.++...+.+......+|+||+..+..|++. ++++++.+.. ...-++++|++|++ T Consensus 173 ~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l------~d~~g~~~~~-------------~~~~~~l~G~PVv~ 233 (324) T protein:vir:99 173 DFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKI------VDPETKERIY-------------DRNSDTLDGLPVVN 233 (324) T ss_pred cCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh------hcCCCceeec-------------CCCCccccceeEEe Confidence 6789999999999988777778899999999999864 3344332221 12346789999999 Q ss_pred eCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEeeecc---e Q lcl|NC_020854. 241 SDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYHPVG---A 301 (342) Q Consensus 241 dD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~~~G---~ 301 (342) ++.++.. +...++--..-+.++...++.++..++.. .....+....++...|.- + T Consensus 234 ~~~~~~~------~~~~i~gd~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~ 307 (324) T protein:vir:99 234 LKSSNLK------RGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAF 307 (324) T ss_pred ecCCCCC------cceEEEEecccEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEecccce Confidence 9877542 22222211222345666677777766532 234556666666655432 2 Q ss_pred ee-e--cCcCCcChHHh Q lcl|NC_020854. 
302 KW-A--VTTTNPTRAQL 315 (342) Q Consensus 302 s~-~--~~~~sPt~~~L 315 (342) .- + ..+..++.+|. T Consensus 308 ~~lt~a~~~~~~~~~~~ 324 (324) T protein:vir:99 308 AKLVPADKKTDSVPGEV 324 (324) T ss_pred EEEEeccCCCCCCCCCC Confidence 21 2 23334555555 No 34 >protein:vir:9309 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:165 # MgeName: phi 11 # Cross-refs: genbank:acc:NP_803287;genbank:gi:29028597;genbank:GeneID:1258044 Probab=99.54 E-value=2.6e-15 Score=100.57 Aligned_cols=273 Identities=10% Similarity=0.061 Sum_probs=174.7 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) |++..+.-.+|+.+.+-+.+...+.+.+.+ ++.. + ..+|..+++|.+.. .+.+..+.|++.++..+++..+ T Consensus 30 ~~~~~~~~liP~~~~~~ii~~~~~~s~l~~--l~~~---~--~~~~~~~~ip~~~~--~~~a~~v~Eg~~~~~~~~~f~~ 100 (324) T protein:vir:93 30 MMHEKKDGTLLNDFTTPILQEVMENSKIMQ--LGKY---E--PMEGTEKKFTFWAD--KPGAYWVGEGQKIETSKATWVN 100 (324) T ss_pred cccCCCcceechhHHHHHHHHHHhhchhhh--hcce---e--eccCCceEEEEEec--CcceeeecCCccccccccceeE Confidence 333445556788887777777777766644 2211 1 13466788999864 3778889999999999999888 Q ss_pred eeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPT 160 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~ 160 (342) .....++.+.-+.++++...-+..|..+.+.+++++.++++.++.+|. |. .++.....+.... ......... 
T Consensus 101 i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~aia~~~d~a~l~---G~----g~~~~~~~~~~~~-~~~~~~~~~ 172 (324) T protein:vir:93 101 ATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL---NQ----GNNPFGKSIAQSI-EKTNKVIKG 172 (324) T ss_pred EEEEeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhc---CC----CCCCcCccccccc-cccceeccc Confidence 888889999889999887666667889999999999999999987763 21 1111111111111 111222345 Q ss_pred cccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 161 ALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 161 ~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) .++++.+.++...+.+.......|+||+..+..|++. ++++++.+.. +..-++++|++|++ T Consensus 173 ~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l------~d~~G~~~~~-------------~~~~~~l~G~PVv~ 233 (324) T protein:vir:93 173 DFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKI------VDPETKERIY-------------DRNSDSLDGLPVVN 233 (324) T ss_pred cccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh------hCCCCCeeec-------------CCCCCcccceeeEe Confidence 6789999999998888777788999999999999874 3344332211 23356789999998 Q ss_pred eCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEeeecc---e Q lcl|NC_020854. 241 SDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYHPVG---A 301 (342) Q Consensus 241 dD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~~~G---~ 301 (342) ++..+.. +...|+--..-+.++...++.++..++.. .....+....+|.+.+.- + T Consensus 234 ~~~~~~~------~~~i~~gdfs~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~n~~~~r~~~r~d~~v~~~~a~ 307 (324) T protein:vir:93 234 LKSSNLK------RGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAF 307 (324) T ss_pred ecCCCCC------cceEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccce Confidence 8765432 22222211223345666667777766542 234566677777666543 2 Q ss_pred ee-e--cCcCCcChHHh Q lcl|NC_020854. 
302 KW-A--VTTTNPTRAQL 315 (342) Q Consensus 302 s~-~--~~~~sPt~~~L 315 (342) .- + .++..+|..+. T Consensus 308 ~~l~~a~~~~~~~~~~~ 324 (324) T protein:vir:93 308 AKLVPADKRTDSVPGEV 324 (324) T ss_pred EEEecccccCCCCCCCC Confidence 22 1 22333444444 No 35 >protein:vir:99075 Length: 392 # NCBI annotation: gp30 # Family: family:all:10837 # MgeID: mge:1671 # MgeName: Wildcat # Cross-refs: genbank:acc:YP_655895;genbank:gi:109521467;genbank:GeneID:4158040 Probab=99.54 E-value=1.5e-15 Score=101.79 Aligned_cols=299 Identities=11% Similarity=0.029 Sum_probs=165.7 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccc--hhhhccCCCCEEEccccccCCCCc--ccccCCCceechhhc Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPM--TELNATEGGDFINVPFWKANLSGD--FEVLSDSSSLTPGKI 76 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d--~~l~~~~~G~ti~~P~~~~i~~gd--a~~~~~~~~i~~~~l 76 (342) ||- .+|+||++++.+.+.+.+.+.|. .++-+| .++.+ .+||+|+||.+...-..+ ......+..++++++ T Consensus 1 Ma~---~~~~p~~~a~~~l~~l~~~lv~~--~lv~~~~~~~~~~-~~GdtV~i~~~~~~~~~~~~~~~~~~~~~~~~~~~ 74 (392) T protein:vir:99 1 MAN---AFSKPTAVVDTAIQMLQNELILT--NLVWLNGIGDFAH-KFNDTITVRVPAPSRGHTRKLRGAGAERNLTVSDF 74 (392) T ss_pred Ccc---ccccHHHHHHHHHHHHHhhccch--hhhcccccccccc-CCCCeEEEeecccccceeeeccccccCCccccccc Confidence 883 34999999999999999988884 355454 35655 369999999887431122 122345678999999 Q ss_pred ccceeeeEe-eeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 77 TADKQVAAI-LHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 77 t~~~~~a~i-~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) +..+...++ +.+.+++.++|+.......|++.++.+|.+...+++++.+++..+.+.-... .... 
T Consensus 75 ~~~~~~~~id~~k~~~~~i~d~e~~~~~~~~~~~~~~~a~~ala~~vd~~i~~~~~~a~~~~--------------~~~~ 140 (392) T protein:vir:99 75 TEDSFPVTLTDVAYHLGVLTDEELTFDLESFATQILPRQVRGVADILEEGVRDMIVGAPYEA--------------AGAV 140 (392) T ss_pred ccceEEEEEeeeeecceeechHHHhhhhhhhHHHHHHHHHHHHHHHHHHHHHHHHhcccccc--------------cccc Confidence 999999988 6899999999999999999999999999999999999999998776521100 0000 Q ss_pred ccccccccHHHHHHHHHHhCccc-cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQG-DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) ........++.|.+|..+|.+.. ..-+.+++.|..+..|+++.-+ ++..... ........++.++.+. T Consensus 141 ~~~~~~~~~~~i~~a~~~L~~~~vP~~R~~vv~p~~~~~l~~~~~~--~~~~~~g---------~~~~~~l~~G~vg~i~ 209 (392) T protein:vir:99 141 HEVAPDEFFKGVNGARRALNELYIPQGRVLVVGTAVTEQILNDDRF--IKYESQG---------QSAVSALQEARLGRIY 209 (392) T ss_pred cccChhhhHHHHHHHHHHHhhcCCCCCCEEEEcHHHHHHHhcccce--eeccccc---------chhhhhhhcceeeeee Confidence 11122345788999999998843 3347899999999999987432 2111100 0000011245688999 Q ss_pred cceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCC--CcceeEEEEeeE--E---------Eee-ecc Q lcl|NC_020854. 235 GLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDI--LAKSDAMSIDLH--Y---------VYH-PVG 300 (342) Q Consensus 235 G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~--~~g~~~l~~r~~--y---------~~~-~~G 300 (342) |..|+++..+|... .+.+.+.++....+.+...+-.... -.+...+..++. 
| .++ +.| T Consensus 210 G~~v~~s~~~~~~t--------~~a~~~~a~~~at~a~v~~~~~~~~~s~s~~~~v~~~~~~~~~~t~~s~~~~v~~~~g 281 (392) T protein:vir:99 210 GYEIVESTLIPHGD--------AYLYHPTAFIMATRAPAPPMGAVRSTAISGDQRIAMRWLVDYDSTITSNRSLIDTYFG 281 (392) T ss_pred eeEEEeeccccccc--------ceeeeccccccccccccccccccceeEEecccceecceeecccceeeccccccceeEE Confidence 99999998876432 2344445554444433211110000 000000000000 0 000 011 Q ss_pred eeeecCcCCcChHHhcCCcCceeecC-ccccc-------------------eEEEEecCCCC Q lcl|NC_020854. 301 AKWAVTTTNPTRAQLETVANWSKVYE-LKNIG-------------------IVRATNVSNFD 342 (342) Q Consensus 301 ~s~~~~~~sPt~~~L~~~~NW~~v~d-~k~i~-------------------~~~~~~~~~~~ 342 (342) +.-......+. +-...+.....+ ....| +-....+. .| T Consensus 282 ~~~v~~~~~~~---~~~~~~~~~~~~~v~v~~v~~~~~~~~~~~~~~~~~~~t~~~~~~-~~ 339 (392) T protein:vir:99 282 LKVVEDPNGVG---FVRARKIHLIPGSIEVAPEAGANATITAAAGEDHTVQLKVTDANG-DD 339 (392) T ss_pred EEEEeeccccc---eeeeeeeeeecceeeeeeeecccceeEeeeccceeEEEEEEecCC-cc Confidence 11000000000 000000100000 00000 00000000 00 No 36 >protein:vir:9759 Length: 303 # NCBI annotation: putative structural protein # Family: family:all:966 # MgeID: mge:175 # MgeName: 315.3 # Cross-refs: genbank:acc:NP_795521;genbank:gi:28876283;genbank:GeneID:1257824 Probab=99.54 E-value=2.7e-15 Score=100.47 Aligned_cols=283 Identities=9% Similarity=0.016 Sum_probs=172.1 Q ss_pred Ccce-eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccc Q lcl|NC_020854. 1 MATL-RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITAD 79 (342) Q Consensus 1 MaT~-~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~ 79 (342) |+|. -...++|+.+..-+.+...+.+.+.+. +... ..++..+++|.+.. .+.+.-+.|++.++..+++-. 
T Consensus 1 m~t~t~gg~liP~~~~~~ii~~l~~~s~i~~l--~~~~-----~~~~~~~~ip~~~~--~~~a~wv~E~~~~~~s~~~f~ 71 (303) T protein:vir:97 1 MGTETSKASLFDKHLVSDLINKVKGHSSLAKL--SSQK-----PIPFNGSKEFTFTL--DSDIDVVAENGKKTHGGLSLE 71 (303) T ss_pred CcccCCCCeEcchhHHHHHHHHHHhhchhhhh--ccee-----ecCCCceEEEEEec--CcceEEeecCcccccccccee Confidence 9964 466778888877777777777777553 2221 13466789999864 478889999999999998887 Q ss_pred eeeeEeeeeccceeechHHHh---hhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 80 KQVAAILHRGRAFEARDLAAL---AAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 80 ~~~a~i~~~~k~~~~tD~a~~---~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +.....++.+.-+.++++-.. -...+..+.+.++++++.+++++..++.-..+.-+............... ..... T Consensus 72 ~v~l~~~kl~~~~~iS~ell~~~~d~~~~l~~~i~~~la~a~~~~ld~a~l~G~~~~~g~~~~~~~~~~~~~~~-~~~~~ 150 (303) T protein:vir:97 72 PVTIVPIKVEYGARLSDEFLYATEEEKIDILKAFNEGFAKKLARGIDLMAMHGINPRTKKASDVIGTNHFDSKV-TQVVK 150 (303) T ss_pred eEEeeeEEEEEeehhhHHHhhcCccchHHHHHHHHHHHHHHHHHHHHhhhhcccccCCcccccccccccccccc-ccccc Confidence 777777788887888777542 23346788899999999999999877742211011100000000000000 11111 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) .+.....++.+.++..++-+.......|+|||..+..|+++ +++++..+.. 
.........++++|+ T Consensus 151 ~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~L~~l------kd~~g~~~~~--------~~~~~~~~~~~l~G~ 216 (303) T protein:vir:97 151 FTESEDADANIEAAVNLIQGAEGVVTGLAMDTEFSTALAKV------TNGEMGPKMY--------PELAWGANPDSINGL 216 (303) T ss_pred cccccchHHHHHHHHHHHhhcCCCccEEEEcHHHHHHHHHh------hccCCCeEEe--------cCccCCCCCceecce Confidence 22334567899999988877777778899999999999864 3343322211 111112334689999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----------cceeEEEEeeEEEeeec---ce Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----------AKSDAMSIDLHYVYHPV---GA 301 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----------~g~~~l~~r~~y~~~~~---G~ 301 (342) +|++++.+|.......++ ..++|+. .++.++..+.+++|..+... .....+.+..++...|. .| T Consensus 217 Pv~~s~~v~~~~~~~~~~-~~~~~Gdf~~~~~~~~~~~~~~~~~~~~~~d~~~~~~~~~n~~~~r~~~r~~~~v~~p~af 295 (303) T protein:vir:97 217 KSSVNTTVGAGADEAESK-DLVIIGDFESMFKWGYAKQIPMEIIKYGDPDNSGKDLKGYNQIYLRAEAYIGWGILDAKSF 295 (303) T ss_pred eeEEecccCCccccCCCc-cEEEEeeccccEEEEEecCcEEEEeeccCCCCcchhhhhcCcEEEEEEEEeccEeecccce Confidence 999999998654433333 3455553 55667776777776544211 12234555555544442 23 Q ss_pred ee-ecCcC Q lcl|NC_020854. 302 KW-AVTTT 308 (342) Q Consensus 302 s~-~~~~~ 308 (342) .. +.+.. T Consensus 296 ~~l~~~~~ 303 (303) T protein:vir:97 296 ARVTKGEV 303 (303) T ss_pred EEeeCCCC Confidence 22 23333 No 37 >protein:vir:97148 Length: 324 # NCBI annotation: ORF010 # Family: family:all:507 # MgeID: mge:1654 # MgeName: 85 # Cross-refs: genbank:acc:YP_239726;genbank:gi:66394880;genbank:GeneID:5130881 Probab=99.53 E-value=4.6e-15 Score=99.19 Aligned_cols=272 Identities=11% Similarity=0.088 Sum_probs=175.2 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) +++......+|+.+..-+.+...+.+.+.+ ++. .+ ..+|..+++|.+.. .+.+..+.|++.++..+++..+ T Consensus 30 ~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~--~~~---~~--~~~~~~~~ip~~~~--~~~a~~v~Eg~~~~~~~~~f~~ 100 (324) T protein:vir:97 30 MMHEKKDGTLMNEFTTPILQEVMENSKIMQ--LGK---YE--PMEGTEKKFTFWAD--KPGAYWVGEGQKIETSKATWVN 100 (324) T ss_pred cccCCCcceechhHHHHHHHHHHhhcchhh--hcc---ee--eccCCceEEEEEec--CcceeEeccCccccccccceeE Confidence 334556678899887777777777666644 221 11 13466789999874 4778889999999999999999 Q ss_pred eeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPT 160 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~ 160 (342) .....++.+.-+.++++...-+.-+..+.+.+++++.++++.++.+|. | . .+......+.. ........... T Consensus 101 v~~~~~k~~~~~~is~ell~ds~~~l~~~i~~~l~~aia~~~d~a~l~---G---~-g~~~~~~gi~~-~~~~~~~~~~~ 172 (324) T protein:vir:97 101 ATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL---N---Q-GNNPFGKSIAQ-SIEKTNKVIKG 172 (324) T ss_pred EEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhc---c---C-CCCccCccccc-cccccceeccc Confidence 988889999889998877666667888999999999999999987774 1 1 11111111111 11122233445 Q ss_pred cccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 161 ALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 161 ~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) .++++.|.++...+.+......+|+||+..+..|++. ++++++.+.. 
...-++++|++|++ T Consensus 173 ~~~~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l------kd~~g~~~~~-------------~~~~~tl~G~PV~~ 233 (324) T protein:vir:97 173 DFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKI------VDPETKERIY-------------DRNSDTLDGLPVVN 233 (324) T ss_pred cCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh------hcCCCceeec-------------CCCCccccceeeEe Confidence 6789999999999988777778999999999999864 3344332211 23346789999999 Q ss_pred eCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEeeecc--- Q lcl|NC_020854. 241 SDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYHPVG--- 300 (342) Q Consensus 241 dD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~~~G--- 300 (342) ++..+.. +.. ++|+. .-+.++...++.++..++.. .....+....+|...|.- T Consensus 234 ~~~~~~~------~~~-~~~gd~~~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~~~~a 306 (324) T protein:vir:97 234 LKSSNLK------RGE-LITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKA 306 (324) T ss_pred ecCCCCC------cce-EEEEecccEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccc Confidence 8876542 222 33332 22335566677777766532 233556666666655433 Q ss_pred eee-e--cCcCCcChHHh Q lcl|NC_020854. 301 AKW-A--VTTTNPTRAQL 315 (342) Q Consensus 301 ~s~-~--~~~~sPt~~~L 315 (342) +.- + .++..-+.+|. T Consensus 307 ~~~l~~~~~~~~~~~~~~ 324 (324) T protein:vir:97 307 FAKLVPADKKTDSVPGEV 324 (324) T ss_pred eEEEEeccCCCCCCCCCC Confidence 222 1 12222233333 No 38 >protein:vir:103955 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1662 # MgeName: phiNM # Cross-refs: genbank:acc:YP_873992;genbank:gi:118430767;genbank:GeneID:4525449 Probab=99.52 E-value=5.4e-15 Score=98.82 Aligned_cols=273 Identities=11% Similarity=0.063 Sum_probs=173.5 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) |++..+...+|+.+..-+.+...+.+.|.+. +. .+ ..++..+++|.+.. .++++.+.|++.++..+.+..+ T Consensus 30 ~~~~~~~~liP~~~~~~ii~~~~~~s~l~~~--~~---~~--~~~~~~~~~p~~~~--~~~a~~v~Eg~~~~~~~~~~~~ 100 (324) T protein:vir:10 30 MMHEKKDGTLLNDFTTPILQEVMENSKIMQL--GK---YE--PMEGTEKKFTFWAD--KPGAYWVGEGQKIETSKATWVN 100 (324) T ss_pred eccCCCcceechhHHHHHHHHHHhhchhhhh--cc---ee--eccCCceEEEEEeC--CcceeEeccCccccccccceeE Confidence 4444555577877777666766666666442 21 11 12456789999874 4788899999999999999888 Q ss_pred eeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPT 160 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~ 160 (342) .....++.+..+.++++...-+..+..+.+.+++++.++++.++.+|. | .. .+.....+.. ........... T Consensus 101 v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~ai~~~~d~a~l~---G---~g-~~~~~~~i~~-~~~~~~~~~~~ 172 (324) T protein:vir:10 101 ATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL---N---QG-NNPFGKSIAQ-SIEKTNKVIKG 172 (324) T ss_pred EEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhh---c---CC-CCccCccccc-cccccceeccc Confidence 888888999888998877666667888999999999999999987764 1 11 1111111111 11122223445 Q ss_pred cccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 161 ALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 161 ~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) .++++.|.++...+.+......+|+||+..+..|++. ++++++.+.. 
...-++++|++|++ T Consensus 173 ~~t~~~i~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l------~d~~g~~~~~-------------~~~~~~l~G~PV~~ 233 (324) T protein:vir:10 173 DFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKI------VDPETKERIY-------------DRNSDTLDGLPVVN 233 (324) T ss_pred cCCHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh------hccCCceeec-------------CCCCccccceeEEe Confidence 7789999999999988777778999999999999864 3344332211 12346789999999 Q ss_pred eCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEeeecc---e Q lcl|NC_020854. 241 SDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYHPVG---A 301 (342) Q Consensus 241 dD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~~~G---~ 301 (342) ++.++.. +...++--..-+.++..+++.++..++.. .....+...++|...|.- + T Consensus 234 ~~~~~~~------~~~~~~gd~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~A~ 307 (324) T protein:vir:10 234 LKSSNLK------RGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAF 307 (324) T ss_pred ecCCCCC------cceEEEEecccEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEecccce Confidence 8876532 22222211222345666667777665532 233556666667655543 2 Q ss_pred ee-e--cCcCCcChHHh Q lcl|NC_020854. 302 KW-A--VTTTNPTRAQL 315 (342) Q Consensus 302 s~-~--~~~~sPt~~~L 315 (342) .- + .++..++-++. T Consensus 308 ~~l~~a~~~~~~~~~~~ 324 (324) T protein:vir:10 308 AKLVPADKKTDSVPGEV 324 (324) T ss_pred EEEEeccCCCCCCCCCC Confidence 22 2 12222344444 No 39 >protein:vir:1638 Length: 298 # NCBI annotation: Structural protein # Family: family:all:966 # MgeID: mge:33 # MgeName: r1t # Cross-refs: genbank:acc:NP_695059;genbank:gi:23455750;genbank:GeneID:955469 Probab=99.50 E-value=5.7e-15 Score=98.68 Aligned_cols=279 Identities=12% Similarity=0.046 Sum_probs=163.3 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) ||+.-..++.||+..+.+ +.+.+.+.+.+- +..- ..++..+++|.+.. .+.+.-+.|+++++..+++-.+ T Consensus 1 ma~~gG~lvp~~~~~~ii-~~~~~~s~i~~l--~~~~-----~~~~~~~~ip~~~~--~~~a~~v~E~~~~~~~~~~f~~ 70 (298) T protein:vir:16 1 MVLNKGTLFDPTLVTDLI-SKVAGKSSIARL--SAQK-----PIPFNGEKVFTFTM--DSEIDVVAESGKKTHGGVTLAP 70 (298) T ss_pred CcccCcceechhHHHHHH-HHHHhhhhhhhh--ccee-----eccCCceEEEEEec--CcceEEecCCccccccccceeE Confidence 998888888888877766 444455555442 2111 12455678998864 4778889999999999988777 Q ss_pred eeeEeeeeccceeechHHHhhh---cchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALAA---GSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~~---~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) .....++.+....++++....+ ..+..+.+.+++++.+.++++..++.-...--+....-................. T Consensus 71 v~l~~~k~a~~~~iS~ell~~s~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~~~~~g~~~~~~~~~~~~~~~~~~~~~~ 150 (298) T protein:vir:16 71 QTMVPIKVEYGARISDEFMYASDEEKINILQEFNDGFAKKVARGIDLMAFHGVNPRLGTASAVIGTNHFDSKVTQKVEAP 150 (298) T ss_pred EEEeeeeEEEeehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHHhhccccCCCCcccccccccccccccccccccc Confidence 7777778887788877765322 3477888999999999999998887521000000000000000000000011111 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ......+..+.++..++.....+..+|+|||+.+..|+++ ++++++.+.. .. 
.....-++++|++ T Consensus 151 ~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l------kd~~G~~i~~--~~-------~~~~~~~~l~G~P 215 (298) T protein:vir:16 151 RGIADPNGAIENAVELLTGVDADVTGIAINPSFRSALAKQ------KDLQDNALFP--EL-------KWGATPDTINGLP 215 (298) T ss_pred cccccHHHHHHHHHHHhhhcCCCccEEEEcHHHHHHHHHh------hccCCCeeec--Cc-------ccCCCCceeccee Confidence 1112225678888888877777788999999999999875 3454443321 10 0122346899999 Q ss_pred EEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCC----------cceeEEEEeeEEEeee---ccee Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDIL----------AKSDAMSIDLHYVYHP---VGAK 302 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~----------~g~~~l~~r~~y~~~~---~G~s 302 (342) |++++.+|.... ..++ .++++ ..++.++....++++..+... .+...+.+..++...+ ..+. T Consensus 216 V~~~~~v~~~~~--~~~~-~~~~GDfs~~~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~v~~ra~~r~d~~v~~~~a~~ 292 (298) T protein:vir:16 216 VDVNKTVSDMSL--TQRD-RAIIGDFANGFKWGYAKEVPLEVIQYGDPDNSGLDLKGYNQVYIRAELFLGWGILDATKFA 292 (298) T ss_pred eEEecccccccC--CCcc-EEEEeeccceEEEEEecCceEEEeeccCCcCcchhhhhcCcEEEEEEEEEccEeecccceE Confidence 999999985432 2333 34444 245566665566665544321 1223444444444332 3333 Q ss_pred ee-cCc Q lcl|NC_020854. 303 WA-VTT 307 (342) Q Consensus 303 ~~-~~~ 307 (342) .- .+. T Consensus 293 ~l~~at 298 (298) T protein:vir:16 293 RVTEAN 298 (298) T ss_pred EEeecC Confidence 31 111 No 40 >protein:vir:96223 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1607 # MgeName: 69 # Cross-refs: genbank:acc:YP_239571;genbank:gi:66395304;genbank:GeneID:5132771 Probab=99.47 E-value=2.5e-14 Score=95.11 Aligned_cols=273 Identities=11% Similarity=0.059 Sum_probs=171.0 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) |++....-++|+.+..-+.+...+.+.+.+ ++.. + ..+|..+++|.+.. .+++..+.|++.++..+++-.+ T Consensus 30 ~~~~~~~~lip~~~~~~ii~~~~~~s~l~~--l~~~---~--~~~~~~~~~p~~~~--~~~a~~v~Eg~~~~~~~~~f~~ 100 (324) T protein:vir:96 30 MMHEKKDGTLLNDFTTPILQEVMENSKIMQ--LGKY---E--PMEGTEKKFTFWAD--KPGAYWVGEGQKIETSKATWVN 100 (324) T ss_pred cccCCCcceechhHHHHHHHHHHhhchhhh--hcce---e--eccCCceEEEEEec--CcceeeecCCccccccccceeE Confidence 444445556677777666666666665544 2211 1 12466788998864 3778889999999999999999 Q ss_pred eeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPT 160 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~ 160 (342) .....++.+.-+.++++...-+..+..+.+.+++++.++++.++.+|. |. . ++.....+... .......... T Consensus 101 v~~~~~k~~~~~~is~ell~ds~~~l~~~i~~~l~~aia~~~d~~~l~---G~---g-~~~~~~~~~~~-~~~~~~~~~~ 172 (324) T protein:vir:96 101 ATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL---NQ---G-NNPFGKSIAQS-IKKTNKVIKG 172 (324) T ss_pred EEEEeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhh---cC---C-CCCcCcccccc-ccccceeccc Confidence 888889999889998877665667888999999999999999987764 21 1 11111111111 1112223345 Q ss_pred cccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 161 ALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 161 ~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) .++++.|.++..++.+......+|+||+..+..|++. ++++++.+.. 
+..-++++|++|++ T Consensus 173 ~~~~~~i~~~~~~i~~~~~~~~~~i~n~~~~~~L~~l------kd~~G~~~~~-------------~~~~~~l~G~PV~~ 233 (324) T protein:vir:96 173 DFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKI------VDPETKERIY-------------DRNSDSLDGLPVVN 233 (324) T ss_pred ccchHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh------hCCCCCeeec-------------CCCCCcccceeeEe Confidence 6789999999998888777778999999999999864 3344332211 23356789999998 Q ss_pred eCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEeeecc---e Q lcl|NC_020854. 241 SDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYHPVG---A 301 (342) Q Consensus 241 dD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~~~G---~ 301 (342) +...+.. +...++--..-+.++...++.++..++.. .....+...+++.+.|.- + T Consensus 234 ~~~~~~~------~~~~~~gd~s~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~n~v~~r~~~r~d~~v~~~~a~ 307 (324) T protein:vir:96 234 LKSSNLK------RGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAF 307 (324) T ss_pred ecCCCCC------cceEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEecccce Confidence 8765432 11222211223345555667777666532 233566677777666543 2 Q ss_pred ee-e--cCcCCcChHHh Q lcl|NC_020854. 302 KW-A--VTTTNPTRAQL 315 (342) Q Consensus 302 s~-~--~~~~sPt~~~L 315 (342) .. + ..+..-+..+. T Consensus 308 ~~l~~a~~~~~~~~~~~ 324 (324) T protein:vir:96 308 AKLVPADKRTDSVPGEV 324 (324) T ss_pred EEEecccccCCCCCCCC Confidence 22 1 12222222222 No 41 >protein:vir:41 Length: 299 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:2 # MgeName: A118 # Cross-refs: genbank:acc:NP_463467;swissprot:trembl:q9t1b7;genbank:gi:16798789;uniprot:Q9T1B7;genbank:GeneID:922353 Probab=99.46 E-value=5.7e-14 Score=93.20 Aligned_cols=273 Identities=10% Similarity=0.031 Sum_probs=167.8 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |++ ......+|+.+..-+.+.+.+.+.+.+ ++. .+. .+|...++|.+.. ..+..+.|++.++..+.+ T Consensus 6 ~~~~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~--~~~---~~~--~~~~~~~~~~~~~---~~a~~v~E~~~~~~~~~~ 75 (299) T protein:vir:41 6 DTTTMQSAKTGSIPINISEQIITGVKNGSAAMK--LAK---AVP--MTKPEEEFTFMSG---VGAFWVDEAERIQTSKPT 75 (299) T ss_pred CcccccCCCceecchhHHHHHHHHHHhcchhhh--hce---eee--cCCCcEEEEEEcC---CceeeeecCccccccccc Confidence 552 233457788887767777777766644 221 121 3466788898862 556778999999999988 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) -++-....++.+.-+.++++...-+..|..+.+.+++++.+++++++.++. | ...... .+ +........... T Consensus 76 f~~v~l~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~a~~~~~d~a~l~---G---~g~~~~-~g-il~~~~~~~~~~ 147 (299) T protein:vir:41 76 FTKAKMRSKKMGVIIPTTKENLNYSVTNFFSLMQAEIVEAFYKKFDQAVFT---G---VESPYN-WN-ILKSATDASNLV 147 (299) T ss_pred eeEEEEeeEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHHhh---c---ccCccc-cc-ccccccccceee Confidence 888888888888889999988777777888999999999999999987764 2 111000 11 111111122222 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ....++++.|.++..++-+....-.+|+|||..+..|++. ++++++++.. .. 
.....++++|++ T Consensus 148 ~~~~~~~~~l~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l------kd~~G~~l~~--~~--------~~~~~~~l~G~P 211 (299) T protein:vir:41 148 EETANKYDDLNEAIGLIEAEDLEPNGIATIRKQRVKYRST------KDGNGMPIFN--TA--------TSNGVDDVLGLP 211 (299) T ss_pred ccccccHHHHHHHHHhhhcccCCcCEEEEcHHHHHHHHHh------hccCCceeec--CC--------cCCCCceeccee Confidence 3445678999999998888777788999999999999874 3344333321 11 112346899999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCCCc----------------ceeEEEEeeEEEeeecc Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDILA----------------KSDAMSIDLHYVYHPVG 300 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~~~----------------g~~~l~~r~~y~~~~~G 300 (342) |+++|.+|... .. ..++|+.-+ +.++..+++.++..|+... ....+....++...++- T Consensus 212 V~~~~~~~~~~---~~--~~~~~gdfs~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~d~~v~~ 286 (299) T protein:vir:41 212 IAYTPKYTFGD---KD--ISELVGDWNQAYYGILRGVEYEILTEATLTTVADETGKPLNLAERDMAAIKATFEVGFMVVK 286 (299) T ss_pred eEEecccCCCC---Cc--eEEEEEecccEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEeccEEec Confidence 99999998532 11 123333221 2356666777777765431 12333333443333321 Q ss_pred -eeeecCcCCcChHHhcCCcCce Q lcl|NC_020854. 301 -AKWAVTTTNPTRAQLETVANWS 322 (342) Q Consensus 301 -~s~~~~~~sPt~~~L~~~~NW~ 322 (342) -++..- +...|+ T Consensus 287 ~~A~~~l----------~~~aa~ 299 (299) T protein:vir:41 287 DEAFSAV----------QPKAGN 299 (299) T ss_pred ccceEEE----------EeccCC Confidence 111100 000000 No 42 >protein:vir:78830 Length: 324 # NCBI annotation: major head protein # Family: family:all:507 # MgeID: mge:1858 # MgeName: 80alpha # Cross-refs: genbank:acc:YP_001285361;genbank:gi:148717889;genbank:GeneID:5246961 Probab=99.45 E-value=4.4e-14 Score=93.82 Aligned_cols=273 Identities=11% Similarity=0.071 Sum_probs=172.5 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) |.+......+|+.+..-+.+...+.+.+.+ ++.. + ..+|..+++|.+.. .+++..+.|++.++..+++..+ T Consensus 30 ~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~--l~~~---~--~~~~~~~~~p~~~~--~~~a~~v~Eg~~~~~~~~~~~~ 100 (324) T protein:vir:78 30 MMHEKKDGTLMNEFTTPILQEVMENSKIMQ--LGKY---E--PMEGTEKKFTFWAD--KPGAYWVGEGQKIETSKATWVN 100 (324) T ss_pred cccCcCccccchhHHHHHHHHHHhhchhhh--hcce---e--eccCCceEEEEEec--CcceeEecCCccccccccceeE Confidence 444556678888887766666666666644 2221 2 13466788998864 4778889999999999999888 Q ss_pred eeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPT 160 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~ 160 (342) .....++.+..+.++++...-+..|..+.+.+++++.++++.++.+|. |. . ++.....+... .......... T Consensus 101 v~~~~~k~~~~~~is~ell~ds~~~l~~~i~~~la~ai~~~~d~a~l~---G~---g-~~~~~~gi~~~-~~~~~~~~~~ 172 (324) T protein:vir:78 101 ATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL---NQ---G-NNPFGKSIAQS-IEKTNKVIKG 172 (324) T ss_pred EEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhc---cC---C-CCCcCcccccc-ccccceeccc Confidence 888888999888998877666667888999999999999999987764 21 1 11111111111 1112222345 Q ss_pred cccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 161 ALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 161 ~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) .++++.|.++...+........+|+||++.+..|++. +++++..+. 
....-++++|++|++ T Consensus 173 ~~t~~~i~~~~~~l~~~~~~~~~~vmn~~~~~~L~~l------~d~~G~~~~-------------~~~~~~~l~G~PV~~ 233 (324) T protein:vir:78 173 DFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKI------VDPETKERI-------------YDRNSDSLDGLPVVN 233 (324) T ss_pred cccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh------hccCCCeee-------------cCCCCCcccceeeEe Confidence 6789999999999988777888999999999999864 233332211 123346789999999 Q ss_pred eCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEeeecc-eee Q lcl|NC_020854. 241 SDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYHPVG-AKW 303 (342) Q Consensus 241 dD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~~~G-~s~ 303 (342) ++..+.. +...++--..-+.++...++.+|..++.. .....+...+++...+.- =+| T Consensus 234 ~~~~~~~------~~~~~~gd~~~~~~g~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~~~~A~ 307 (324) T protein:vir:78 234 LKSSNLK------RGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAF 307 (324) T ss_pred eCCCCCC------cceEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEecccce Confidence 8766532 22222211222345666667777766532 234556666666666543 122 Q ss_pred ---e--cCcCCcChHHh Q lcl|NC_020854. 304 ---A--VTTTNPTRAQL 315 (342) Q Consensus 304 ---~--~~~~sPt~~~L 315 (342) + +.+..-|-.|. T Consensus 308 ~~l~~a~~~~~~~~~~~ 324 (324) T protein:vir:78 308 AKLVPADKRTDSVPGEV 324 (324) T ss_pred EEEecccccCCCCCCCC Confidence 1 11111122222 No 43 >protein:vir:96392 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1613 # MgeName: 53 # Cross-refs: genbank:acc:YP_239648;genbank:gi:66395381;genbank:GeneID:5132868 Probab=99.45 E-value=4.4e-14 Score=93.82 Aligned_cols=273 Identities=11% Similarity=0.071 Sum_probs=172.5 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) |.+......+|+.+..-+.+...+.+.+.+ ++.. + ..+|..+++|.+.. .+++..+.|++.++..+++..+ T Consensus 30 ~~~~~~~~~iP~~~~~~ii~~~~~~s~l~~--l~~~---~--~~~~~~~~~p~~~~--~~~a~~v~Eg~~~~~~~~~~~~ 100 (324) T protein:vir:96 30 MMHEKKDGTLMNEFTTPILQEVMENSKIMQ--LGKY---E--PMEGTEKKFTFWAD--KPGAYWVGEGQKIETSKATWVN 100 (324) T ss_pred cccCcCccccchhHHHHHHHHHHhhchhhh--hcce---e--eccCCceEEEEEec--CcceeEecCCccccccccceeE Confidence 444556678888887766666666666644 2221 2 13466788998864 4778889999999999999888 Q ss_pred eeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPT 160 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~ 160 (342) .....++.+..+.++++...-+..|..+.+.+++++.++++.++.+|. |. . ++.....+... .......... T Consensus 101 v~~~~~k~~~~~~is~ell~ds~~~l~~~i~~~la~ai~~~~d~a~l~---G~---g-~~~~~~gi~~~-~~~~~~~~~~ 172 (324) T protein:vir:96 101 ATMRAFKLGVILPVTKEFLNYTYSQFFEEMKPMIAEAFYKKFDEAGIL---NQ---G-NNPFGKSIAQS-IEKTNKVIKG 172 (324) T ss_pred EEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhc---cC---C-CCCcCcccccc-ccccceeccc Confidence 888888999888998877666667888999999999999999987764 21 1 11111111111 1112222345 Q ss_pred cccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 161 ALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 161 ~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) .++++.|.++...+........+|+||++.+..|++. +++++..+. 
....-++++|++|++ T Consensus 173 ~~t~~~i~~~~~~l~~~~~~~~~~vmn~~~~~~L~~l------~d~~G~~~~-------------~~~~~~~l~G~PV~~ 233 (324) T protein:vir:96 173 DFTQDNIIDLEALLEDDELEANAFISKTQNRSLLRKI------VDPETKERI-------------YDRNSDSLDGLPVVN 233 (324) T ss_pred cccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh------hccCCCeee-------------cCCCCCcccceeeEe Confidence 6789999999999988777888999999999999864 233332211 123346789999999 Q ss_pred eCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEeeecc-eee Q lcl|NC_020854. 241 SDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYHPVG-AKW 303 (342) Q Consensus 241 dD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~~~G-~s~ 303 (342) ++..+.. +...++--..-+.++...++.+|..++.. .....+...+++...+.- =+| T Consensus 234 ~~~~~~~------~~~~~~gd~~~~~~g~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~~~r~~~r~d~~v~~~~A~ 307 (324) T protein:vir:96 234 LKSSNLK------RGELITGDFDKLIYGIPQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMVALRATMHVALHIADDKAF 307 (324) T ss_pred eCCCCCC------cceEEEEecceEEEEEecCcEEEEeecccccccccccccchhhhhcCcEEEEEEEEEccEEecccce Confidence 8766532 22222211222345666667777766532 234556666666666543 122 Q ss_pred ---e--cCcCCcChHHh Q lcl|NC_020854. 304 ---A--VTTTNPTRAQL 315 (342) Q Consensus 304 ---~--~~~~sPt~~~L 315 (342) + +.+..-|-.|. T Consensus 308 ~~l~~a~~~~~~~~~~~ 324 (324) T protein:vir:96 308 AKLVPADKRTDSVPGEV 324 (324) T ss_pred EEEecccccCCCCCCCC Confidence 1 11111122222 No 44 >protein:vir:94771 Length: 298 # NCBI annotation: major head protein # Family: family:all:966 # MgeID: mge:1529 # MgeName: phi LC3 # Cross-refs: genbank:acc:NP_996706;genbank:gi:45597421;genbank:GeneID:2769044 Probab=99.45 E-value=4.1e-14 Score=93.99 Aligned_cols=276 Identities=13% Similarity=0.061 Sum_probs=162.8 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) ||+.-..+|.||+..+++ +.+.+.+.+.+. +. .+ ..++..+++|.+.. .+.+.-+.|+++++..+.+-.+ T Consensus 1 ma~~gG~lip~~~~~~ii-~~~~~~s~i~~~--~~---~~--~~~~~~~~~p~~~~--~~~a~~v~Eg~~~~~~~~~f~~ 70 (298) T protein:vir:94 1 MVLNKGTLFDPELVTDLI-SKVAGKSSIARL--SA---QK--PIPFNGEKVFTFTM--DSEIDVVAESGKKTHGGVTLAP 70 (298) T ss_pred CeeccccccChhHHHHHH-HHHHhhchhhhh--cc---ee--eccCCceEEEEEec--CcceEEeeCCccccccccceeE Confidence 998777777666655544 444444444331 11 11 13455678998864 3677789999999999998888 Q ss_pred eeeEeeeeccceeechHHHhh---hcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc---hhheeeecccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALA---AGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSS---SAFFDLCIDSE 154 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~---~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~---~~~~~~~~~~~ 154 (342) .....++.+..+.++++.... ...+..+.+.+++++.++++++..++. |.-.....+... ........... T Consensus 71 v~l~~~k~~~~~~iS~ell~~~~~~~~~l~~~i~~~la~ai~~~~d~~~l~---G~~~~~g~~~~~~~~~~~~~~~~~~~ 147 (298) T protein:vir:94 71 QTMVPIKVEYGARISDEFMYASDEEKINILQAFNDGFAKKVARGIDLMAFH---GVNPRLGTASAVIGTNHFDSKVTQKV 147 (298) T ss_pred EEEeeeEEEEeeehhHHHhccCCccHHHHHHHHHHHHHHHHHHHHHHHhhc---ccccCCCccccccccccccccccccc Confidence 888888888888888876432 234667889999999999999887774 211001111100 00011111111 Q ss_pred cccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) ...+......+.+.++..++........+|+|||+.+..|++. ++++++.+.. .. 
.....-++++ T Consensus 148 ~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l------kd~~G~~l~~--~~-------~~~~~~~tl~ 212 (298) T protein:vir:94 148 EAPRGIADPNGAIENAVELLTGVDADVTGIAINPSFRSALAKQ------KDLQGNALFP--EL-------KWGATPDTIN 212 (298) T ss_pred ccccccccHHHHHHHHHHhhhhcCCCccEEEEcHHHHHHHHHh------hccCCCeeec--Cc-------ccCCCCceec Confidence 1112223346678999988888777788999999999999874 3444433321 10 1123346899 Q ss_pred cceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----------cceeEEEEeeEEEeee---c Q lcl|NC_020854. 235 GLRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----------AKSDAMSIDLHYVYHP---V 299 (342) Q Consensus 235 G~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----------~g~~~l~~r~~y~~~~---~ 299 (342) |++|++++.+|.... ..+. ..+++. .++.|+...++.++..+... .....+.+..++.+.+ . T Consensus 213 G~PV~~~~~v~~~~~--~~~~-~~~~Gdfs~~~~~~~~~~~~~~~~~~~~~d~~~~~~f~~~~v~~r~~~r~~~~~~~~~ 289 (298) T protein:vir:94 213 GLPVDVNKTVSDMSL--TQRD-RAIIGDFANGFKWGYAKEVPLEVIQYGDPDNSGLDLKGYNQVYIRAELFLGWGILDAT 289 (298) T ss_pred ceeeEEecccccccC--CCcc-EEEEeeccceEEEEEecCceEEEeecCCCcCcchhhhhcCcEEEEEEEEeccEeeccc Confidence 999999999885432 2222 345553 44566666777776655321 1223455555554443 2 Q ss_pred ceee-ecCc Q lcl|NC_020854. 300 GAKW-AVTT 307 (342) Q Consensus 300 G~s~-~~~~ 307 (342) .|.. +.+. T Consensus 290 a~~~l~~~t 298 (298) T protein:vir:94 290 KFARVTEAN 298 (298) T ss_pred ceEEEEecC Confidence 2222 1111 No 45 >protein:vir:4856 Length: 293 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:106 # MgeName: DT1 # Cross-refs: genbank:acc:NP_049396;genbank:gi:9632424;genbank:GeneID:1258532 Probab=99.43 E-value=1.1e-13 Score=91.64 Aligned_cols=272 Identities=10% Similarity=0.018 Sum_probs=174.2 Q ss_pred Ccc-ee--ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hhc Q lcl|NC_020854. 
1 MAT-LR--SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GKI 76 (342) Q Consensus 1 MaT-~~--~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~l 76 (342) |++ +. .-..+|+.+.+-+.+...+...+.+ ++..-+ +. ...-...+|.+.. ..+.+..+.|++.++. ++. T Consensus 5 ~~~~t~~~gg~liP~~~~~~Ii~~~~~~~~l~~--~~~~~~-~~--~~~g~~~~~~~~~-~~~~a~~v~Eg~~~~~~~~~ 78 (293) T protein:vir:48 5 KTDHSGSDAGLTIPQDIRTAINTLVRQYDSLQE--YVNVEN-VT--TLTGSRVYEKWTD-ITGLANIDDEAGKIADIDDP 78 (293) T ss_pred ecccccCcCceEechhHHHHHHHHHHhhhhhhh--hceeee-cc--CCcceEEEEeecC-CCcceeeecCCccccccccc Confidence 773 33 3367899988877788777777754 221111 11 1123556777753 3467788999999974 567 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +-.+-....++.+..+.++++...-+.-|..+.+.+++++.+.+..++.++..+... . T Consensus 79 ~~~~i~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~i~~g~~~~----------------------~ 136 (293) T protein:vir:48 79 KLSLIKYTIKRYAGISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILGVVDKL----------------------P 136 (293) T ss_pred ceeEEEEeeeEEEEeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHhHHhhccccc----------------------c Confidence 777777788888888888888766666788888999999999999988776543210 1 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) .....++++.|.++...+......-..|+||+..+..|+++ ++++++.+.... 
..+..-++++|+ T Consensus 137 ~~~~~~~~d~i~~~~~~l~~~~~~~a~~vmn~~~~~~L~~l------kd~~g~~l~~~~---------~~~~~~~~l~G~ 201 (293) T protein:vir:48 137 TKPTLTKWDDIIDLEAKVDPAIKQTSFFLTNTSGFTALKKV------KNALGDYLMERD---------VKSPTGYSIAGF 201 (293) T ss_pred ccccccCHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHh------hccCCceEeecC---------cCCCCCceecce Confidence 12345678999999988877777778999999999999875 345544333211 123445789999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCC----CcceeEEEEeeEEEeee---cceee---e Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDI----LAKSDAMSIDLHYVYHP---VGAKW---A 304 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~----~~g~~~l~~r~~y~~~~---~G~s~---~ 304 (342) +|++.+..++...+. ++ .+++|+. -++.+....++.++.++.. ......+....++.+.+ ..+.. + T Consensus 202 Pv~~~~~~~~~~~~~-~~-~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~l~~~ 279 (293) T protein:vir:48 202 AVKEISDRWLPNASS-GV-MPLYFGDLKQAVTLFDRQQMSLLSTNIGGGAFETDTTKVRVIDRFDVVATDTEAFVPASFK 279 (293) T ss_pred eeEEecccccCCccC-Cc-eEEEEEeccceEEEEEecceEEEEecccchhhhcCeEEEEEEEeeCcEEecccceEEEEee Confidence 999877666543322 22 2345553 3555666677788777643 34556677777776654 34433 2 Q ss_pred cCcCC-cChHHhcC Q lcl|NC_020854. 305 VTTTN-PTRAQLET 317 (342) Q Consensus 305 ~~~~s-Pt~~~L~~ 317 (342) .+... ++..-.+- T Consensus 280 ~~~~~~~~~~~~~~ 293 (293) T protein:vir:48 280 AIADQKGNIGSTAV 293 (293) T ss_pred ccccCCccccccCC Confidence 22212 22222222 No 46 >protein:vir:105905 Length: 304 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:1514 # MgeName: phiETA3 # Cross-refs: genbank:acc:YP_001004375;genbank:gi:122891830;genbank:GeneID:4712376 Probab=99.42 E-value=5.8e-14 Score=93.17 Aligned_cols=271 Identities=10% Similarity=0.043 Sum_probs=161.5 Q ss_pred Cc-----------ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCc Q lcl|NC_020854. 
1 MA-----------TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSS 69 (342) Q Consensus 1 Ma-----------T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~ 69 (342) || |...-..+|+.+.+-+.+.+.+.+.+.+. +...+ .++...++|.+.. .+.+..+.|++ T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~~~l~~~--~~~~~-----~~~~~~~ip~~~~--~~~a~~v~E~~ 71 (304) T protein:vir:10 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMANSAIMKL--AKNEP-----MTAQKKKFTYLAK--GVGAYWVSETE 71 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhccchhhh--cceee-----ccCCceEEEEEeC--CcceEEeecCc Confidence 55 22334578888877667777766666542 22211 2456688999874 46777899999 Q ss_pred eechhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc---hhh Q lcl|NC_020854. 70 SLTPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSS---SAF 146 (342) Q Consensus 70 ~i~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~---~~~ 146 (342) .++..+.+..+-....++.+..+.++++...-+.-|..+.+.+++++.++++.+..++. | .......+ ... T Consensus 72 ~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~ia~~~d~~~l~---G---~g~~~~~~~~~~~~ 145 (304) T protein:vir:10 72 RIQTSKPEYAQAEMEAKKIGVIIPLSKEFLKWTAKDFFNEVKPLIAEAFYKAFDQAVIF---G---TKSPYNTSTSGKPL 145 (304) T ss_pred ccccccceeeEEEEEEEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhee---c---cCCCcccccccccc Confidence 99988888888888888888888888887766677888999999999999998887764 1 11110000 000 Q ss_pred eeeecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccc Q lcl|NC_020854. 147 FDLCIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGG 226 (342) Q Consensus 147 ~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~ 226 (342) ...........+.....++.|.++...+.+......+|+||+..+..|++. ++++++++. 
T Consensus 146 ~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~l------kd~~G~~l~-------------- 205 (304) T protein:vir:10 146 VEGAEEKGNVVTDTNNLYVDLSALMATIEDEELDPNGVLTTRSFRSKMRNA------LDANDRPLF-------------- 205 (304) T ss_pred cccccccccccccccchHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHHh------hccCCcEee-------------- Confidence 111111122223445678999999998888777788999999999999864 344433322 Q ss_pred ccceeeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCC------------------cceeE Q lcl|NC_020854. 227 EVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL------------------AKSDA 287 (342) Q Consensus 227 ~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~------------------~g~~~ 287 (342) +...++++|++|++++.+|...+ +.. ++|+. --+.++.-.++.++..++.. ..+.. T Consensus 206 ~~~~~~l~G~PV~~~~~~~~~~~----~~~-~~~gd~~~~~~~~~~~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~~~~~ 280 (304) T protein:vir:10 206 DANGNEIMGLPLSYTGADVYDKK----KSL-ALMGDWDYARYGILQGIEYAISEDATLTTLQASDASGQPVSLFERDMFA 280 (304) T ss_pred cCCCccccceeeEEecccccCCC----CcE-EEEEehhhEEEEEecceEEEEeecceeeeecccccCccchhhhhcCcEE Confidence 12246789999999999986432 112 22221 11223333444444433321 11233 Q ss_pred EEEeeEEEeeecceeeecCcCCcChHHhcCCcCceeecCccccceEEEEecCC Q lcl|NC_020854. 
288 MSIDLHYVYHPVGAKWAVTTTNPTRAQLETVANWSKVYELKNIGIVRATNVSN 340 (342) Q Consensus 288 l~~r~~y~~~~~G~s~~~~~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~~ 340 (342) +....+|...|.- ++++ .+++++| T Consensus 281 ~r~~~r~~~~v~~--------------------------~~a~---~~l~~a~ 304 (304) T protein:vir:10 281 LRATMHIAYMNVK--------------------------PEAF---ATLKPTE 304 (304) T ss_pred EEEEEEeccEeec--------------------------ccce---EEEEecC Confidence 3444444433311 1111 1111111 No 47 >protein:vir:94142 Length: 304 # NCBI annotation: ORF013 # Family: family:all:507 # MgeID: mge:1494 # MgeName: 96 # Cross-refs: genbank:acc:YP_240234;genbank:gi:66395898;genbank:GeneID:5133311 Probab=99.42 E-value=5.8e-14 Score=93.17 Aligned_cols=271 Identities=10% Similarity=0.043 Sum_probs=161.5 Q ss_pred Cc-----------ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCc Q lcl|NC_020854. 1 MA-----------TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSS 69 (342) Q Consensus 1 Ma-----------T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~ 69 (342) || |...-..+|+.+.+-+.+.+.+.+.+.+. +...+ .++...++|.+.. .+.+..+.|++ T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~~~l~~~--~~~~~-----~~~~~~~ip~~~~--~~~a~~v~E~~ 71 (304) T protein:vir:94 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMANSAIMKL--AKNEP-----MTAQKKKFTYLAK--GVGAYWVSETE 71 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhccchhhh--cceee-----ccCCceEEEEEeC--CcceEEeecCc Confidence 55 22334578888877667777766666542 22211 2456688999874 46777899999 Q ss_pred eechhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc---hhh Q lcl|NC_020854. 70 SLTPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSS---SAF 146 (342) Q Consensus 70 ~i~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~---~~~ 146 (342) .++..+.+..+-....++.+..+.++++...-+.-|..+.+.+++++.++++.+..++. | .......+ ... 
T Consensus 72 ~~~~~~~~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~ia~~~d~~~l~---G---~g~~~~~~~~~~~~ 145 (304) T protein:vir:94 72 RIQTSKPEYAQAEMEAKKIGVIIPLSKEFLKWTAKDFFNEVKPLIAEAFYKAFDQAVIF---G---TKSPYNTSTSGKPL 145 (304) T ss_pred ccccccceeeEEEEEEEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhee---c---cCCCcccccccccc Confidence 99988888888888888888888888887766677888999999999999998887764 1 11110000 000 Q ss_pred eeeecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccc Q lcl|NC_020854. 147 FDLCIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGG 226 (342) Q Consensus 147 ~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~ 226 (342) ...........+.....++.|.++...+.+......+|+||+..+..|++. ++++++++. T Consensus 146 ~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~l------kd~~G~~l~-------------- 205 (304) T protein:vir:94 146 VEGAEEKGNVVTDTNNLYVDLSALMATIEDEELDPNGVLTTRSFRSKMRNA------LDANDRPLF-------------- 205 (304) T ss_pred cccccccccccccccchHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHHh------hccCCcEee-------------- Confidence 111111122223445678999999998888777788999999999999864 344433322 Q ss_pred ccceeeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCC------------------cceeE Q lcl|NC_020854. 227 EVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL------------------AKSDA 287 (342) Q Consensus 227 ~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~------------------~g~~~ 287 (342) +...++++|++|++++.+|...+ +.. ++|+. --+.++.-.++.++..++.. ..+.. T Consensus 206 ~~~~~~l~G~PV~~~~~~~~~~~----~~~-~~~gd~~~~~~~~~~~~~i~~~~e~~~~~~~~~~~~g~~~~~f~~~~~~ 280 (304) T protein:vir:94 206 DANGNEIMGLPLSYTGADVYDKK----KSL-ALMGDWDYARYGILQGIEYAISEDATLTTLQASDASGQPVSLFERDMFA 280 (304) T ss_pred cCCCccccceeeEEecccccCCC----CcE-EEEEehhhEEEEEecceEEEEeecceeeeecccccCccchhhhhcCcEE Confidence 12246789999999999986432 112 22221 11223333444444433321 11233 Q ss_pred EEEeeEEEeeecceeeecCcCCcChHHhcCCcCceeecCccccceEEEEecCC Q lcl|NC_020854. 
288 MSIDLHYVYHPVGAKWAVTTTNPTRAQLETVANWSKVYELKNIGIVRATNVSN 340 (342) Q Consensus 288 l~~r~~y~~~~~G~s~~~~~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~~ 340 (342) +....+|...|.- ++++ .+++++| T Consensus 281 ~r~~~r~~~~v~~--------------------------~~a~---~~l~~a~ 304 (304) T protein:vir:94 281 LRATMHIAYMNVK--------------------------PEAF---ATLKPTE 304 (304) T ss_pred EEEEEEeccEeec--------------------------ccce---EEEEecC Confidence 3444444433311 1111 1111111 No 48 >protein:vir:9574 Length: 300 # NCBI annotation: gp40 # Family: family:all:966 # MgeID: mge:171 # MgeName: SM1 # Cross-refs: genbank:acc:NP_862879;genbank:gi:32469471;genbank:GeneID:1461316 Probab=99.42 E-value=7.5e-14 Score=92.53 Aligned_cols=277 Identities=10% Similarity=0.017 Sum_probs=162.4 Q ss_pred Cc-ce-e-ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA-TL-R-SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma-T~-~-~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) || ++ - ..+|.||+ .+=+.+.+.+.+.+.+ +... . ..++..+++|.+.. .+++.-+.|++.++..+.+ T Consensus 1 ma~~t~~~G~lip~~~-~~~ii~~l~~~s~i~~--l~~~---~--~~~~~~~~~p~~~~--~~~a~wv~Eg~~~~~s~~~ 70 (300) T protein:vir:95 1 MSEAQLSKGNLFNPEL-VTKVINKVKGHSSIAK--LSPQ---K--PIPFNGQREFVFDF--DSDIDIVAENGKKTHGGVS 70 (300) T ss_pred CcccccCCcceechhh-HHHHHHHHHhhhhhhh--hcce---e--eccCCceEEEEEec--CcceEEeeCCccccccccc Confidence 99 22 2 44555554 4445555655555544 1111 1 13455678998764 3677889999999999988 Q ss_pred cceeeeEeeeeccceeechHHHh---hhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAAL---AAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~---~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) -.+.....++.+.-+.++++-.. -+.-+..+.+.+++++..+++.++.++.-...--+.... .......+ ..... 
T Consensus 71 f~~v~l~~~k~~~~~~iS~ell~~~~d~~~~l~~~i~~~l~~aia~~~d~~~l~G~~~~~g~~~~-~~~~~~~~-~~~~~ 148 (300) T protein:vir:95 71 LDPVTIVPLKVEYGARVSDEFLHASEEAKVDMLTDFVEGFSKKLARGLDIMSIHGINPRTKQAST-IIGDNCFD-KKVTQ 148 (300) T ss_pred ceeeEeeeEEEEEeehhhHHHhccCCCCHHHHHHHHHHHHHHHHHHHHHHhhhhcccCCCCCCcc-cccccccc-cccce Confidence 88777777788877888777542 234578889999999999999999888421100000000 00000000 00111 Q ss_pred cccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) .........++.+.++..++.+...+..+|+|||..+..|+++ ++++++.+..... ....-++++ T Consensus 149 ~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~L~~l------kd~~G~~i~~~~~---------~~~~~~~l~ 213 (300) T protein:vir:95 149 TVPFKDTNPDESMEDAVGMIDGSERDITGAILDPIFTTALSKM------KNAEGGKLYPELA---------WGGVPDAIN 213 (300) T ss_pred eecccccchHHHHHHHHHHhhhcCCCccEEEECHHHHHHHHHh------hccCCCeeccCcc---------ccCCCceec Confidence 1122345567889999999888777788999999999999875 3455443321000 012357899 Q ss_pred cceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccC--CC--------cceeEEEEeeEEEeeecc-- Q lcl|NC_020854. 235 GLRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRD--IL--------AKSDAMSIDLHYVYHPVG-- 300 (342) Q Consensus 235 G~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~--~~--------~g~~~l~~r~~y~~~~~G-- 300 (342) |++|++++.+|...+.. +. .++++. .++.++....++++..+. .. ..+..+....++.+.+.- T Consensus 214 G~Pv~~s~~v~~~~~~~--~~-~~~~GDf~~~~~~~~~~~~~~~v~~~~~~d~~~~~~f~~~~v~~r~~~r~d~~v~~~~ 290 (300) T protein:vir:95 214 GLAVDKNRTVSYSQTDP--KN-TAIVGDFETMFKWGYAKEVPMEIIKYGDPDNSGRDLKGYNQIYIRCEAYIGWGIMDAA 290 (300) T ss_pred ceeeEEecCCCCCCCCC--cc-EEEEeeccceEEEEEecccEEEEeeccCCCCcchhhhhcCcEEEEEEEeecceeeccc Confidence 99999999998644322 22 233332 344455555555554432 11 122455666666555433 Q ss_pred -eee-e-cCc Q lcl|NC_020854. 
301 -AKW-A-VTT 307 (342) Q Consensus 301 -~s~-~-~~~ 307 (342) |.- + .+| T Consensus 291 a~~~l~~~~g 300 (300) T protein:vir:95 291 SFARIVKTGG 300 (300) T ss_pred ceEEEecCCC Confidence 322 2 334 No 49 >protein:vir:7771 Length: 330 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:149 # MgeName: Bxz2 # Cross-refs: genbank:acc:NP_817605;genbank:gi:29566035;genbank:GeneID:1259229 Probab=99.41 E-value=1.3e-13 Score=91.27 Aligned_cols=285 Identities=13% Similarity=0.114 Sum_probs=163.3 Q ss_pred Cc---------c---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCC Q lcl|NC_020854. 1 MA---------T---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDS 68 (342) Q Consensus 1 Ma---------T---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~ 68 (342) || + .-..++.|++...++ +.+.+.+.+.+ ++.. . ..++..+.+|.+.. .+.+.-+.|+ T Consensus 1 m~~~~~~a~~~~~t~~~g~~i~~~~~~~ii-~~~~~~s~l~~--~~~~----~-~~~~~~~~~p~~~~--~~~a~~v~Eg 70 (330) T protein:vir:77 1 MAGSTVPSTQVALTGDFSAFLTPEQSQDYF-AEIEKTSIVQR--IARK----V-PMGPTGISIPHWTG--AVSASWTGEA 70 (330) T ss_pred CcccccchhhccccCCCcceechhHHHHHH-HHHHhccchhh--hcce----e-eccCCceEEEEEcC--CcceeEecCC Confidence 44 1 123457777765544 44555555543 2211 1 12455688999864 3677788999 Q ss_pred ceechhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH------HHHHHHhhhccccc Q lcl|NC_020854. 69 SSLTPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLS------CLQGVFGSLNANTS 142 (342) Q Consensus 69 ~~i~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla------~L~g~~~~~~a~~~ 142 (342) +.++..+.+..+-....++.+.-+.++++...-+..|..+.+.+++++.+++++++.+|. ...|++...... 
T Consensus 71 ~~~~~~~~~f~~i~~~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~ai~~~~~~~~l~G~g~~~~~~g~~~~~~~~-- 148 (330) T protein:vir:77 71 ERKPITKGSFGKQELEPVKITTIFAESAEVVRLNPLNYLNTMRTKIAEAIALKFDAAAIHGIDKPSAFKGYLAETTKV-- 148 (330) T ss_pred CccccccceeeEEEEeEEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhcccCCCCcccccccccccc-- Confidence 999999998888888888888888888877666667888899999999999999988773 111111110000 Q ss_pred chhheeeecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeec Q lcl|NC_020854. 143 SSAFFDLCIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAA 222 (342) Q Consensus 143 ~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~ 222 (342) .........+........++.|.+++.++.........|+||+..+..|+++ +++++..+... ...... T Consensus 149 ---~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l------kd~~G~~l~~~--~~~~~~ 217 (330) T protein:vir:77 149 ---VSLADTNLTTASGPQGNAYLAVNNALSLLVNSGKKWTGTLLDNVTEPILNTA------VDGNGRPLFVE--STYTEQ 217 (330) T ss_pred ---ceeecccccccccccchhHHHHHHHHHhhhhcCCCccEEEEcHHHHHHHHHH------hccCCceeecC--cccccc Confidence 0000011111122233446778888888887777778999999999999874 34544433221 111000 Q ss_pred ccccccceeeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCC-------------------- Q lcl|NC_020854. 223 AYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDI-------------------- 281 (342) Q Consensus 223 ~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~-------------------- 281 (342) .....-.+++|++|+++|.+|-... .+++. ++|+. -.+.++....+.++..++. T Consensus 218 --~~~~~~~~l~G~PV~~~~~~p~~~~--~~~~~-~~~gd~s~~~i~~~~~~~i~~~~e~~~~~~~~~~~~~~~~~~~~f 292 (330) T protein:vir:77 218 --VGAIREGRILGRPTYVADNVVNGTV--GNRVV-GVMGDFSQVIWGQIGGLSFDVTDQATLDFGEEQGGVWVPKLISLW 292 (330) T ss_pred --ccccCCceecceeeEEeccccCCCC--CCccE-EEEEecceEEEEEecCcEEEEeecceeeecccccccccccccchh Confidence 1122346899999999999985322 23333 33332 2233444445555443331 Q ss_pred CcceeEEEEeeEEEeee---cceee---ecCcCCcChH Q lcl|NC_020854. 
282 LAKSDAMSIDLHYVYHP---VGAKW---AVTTTNPTRA 313 (342) Q Consensus 282 ~~g~~~l~~r~~y~~~~---~G~s~---~~~~~sPt~~ 313 (342) .+....+....++.+.| ..+.- +.++..|.-+ T Consensus 293 ~~~~~~~r~~~r~d~~v~~~~a~~~i~~~~~~~~~~~~ 330 (330) T protein:vir:77 293 QHNMVAVRCEAEFAFMVNDKDAFVKLTDQVAGTDPEEE 330 (330) T ss_pred hcCcEEEEEEEEeccEEecccceEEEEeccCCcCCCCC Confidence 12234455555555443 22222 2233334433 No 50 >protein:vir:78223 Length: 333 # NCBI annotation: Putative major head protein # Family: family:all:966 # MgeID: mge:1849 # MgeName: Bethlehem # Cross-refs: genbank:acc:YP_001491666;genbank:gi:157786490;genbank:GeneID:5625701 Probab=99.38 E-value=3e-13 Score=89.22 Aligned_cols=281 Identities=9% Similarity=0.047 Sum_probs=162.4 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCC------CCcccccCCCceechh Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANL------SGDFEVLSDSSSLTPG 74 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~------~gda~~~~~~~~i~~~ 74 (342) |+..-++ .+|+.+..-+.+.+.+.+.+.+. +..- ..++..+.+|.+.... .|.+..+.|++.++.. T Consensus 20 ~~~~~~~-liP~~~~~~ii~~l~~~s~l~~~--~~~~-----~~~~~~~~~p~~~~~~~a~~v~eg~~~~~~e~~~~~~~ 91 (333) T protein:vir:78 20 LAHVPSD-LLPKEIVGPIFDKAQESSLVLRM--GEQI-----PISYGETIIPTTVKRPEVGQVGVGTSNEQREGGLKPLS 91 (333) T ss_pred eecCCcc-ccchhHHHHHHHHHHhhchhhhh--ccee-----eccCCceEEEEEeCCceeEeecCccccccccccccccc Confidence 2222233 44666555555555555555442 2111 1245667888887531 1344445666777777 Q ss_pred hcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH--------HHHHHhhhcccccchhh Q lcl|NC_020854. 75 KITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSC--------LQGVFGSLNANTSSSAF 146 (342) Q Consensus 75 ~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~--------L~g~~~~~~a~~~~~~~ 146 (342) +.+-.+-....++.+.-+.++++...-+..+..+.+.+++++.+++.+++.+|.- .+|+.... . 
T Consensus 92 ~~~f~~i~l~~~kl~~~~~is~ell~~s~~~~~~~i~~~la~ai~~~~d~~~l~G~g~~~~~~~~g~~~~~--------~ 163 (333) T protein:vir:78 92 GTAWDTRSVSPIKLATIVTVSEEFARMNPSGLYTKLQGDLAYAIGRGIDLAVFHGKSPLTGSALQGIDTDN--------V 163 (333) T ss_pred ccceeEEEEeeEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHHhcccCCCCCcccccccccc--------c Confidence 7777777777778888888888776666778889999999999999999888741 11111000 0 Q ss_pred eeeecccccccccccccHHHHHHHHHHhCccc-cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccc Q lcl|NC_020854. 147 FDLCIDSESGDTPTALSPRHVAEARAILGDQG-DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYG 225 (342) Q Consensus 147 ~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~ 225 (342) ...............+.++.|.++...+.... ....+|+|||..+..|++... ++++++..+.. ... T Consensus 164 ~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~~~~---~~d~~G~~i~~---------~~~ 231 (333) T protein:vir:78 164 IANTTNVDYLQETGDPLLDRLLDGYDLVSANTDVEFNGWAVDPRFRAHLLRAQA---YRDANGNVDPS---------RIN 231 (333) T ss_pred ccccccccccccccchhHHHHHHHHHhhccccccCceEEEEcchHHHHHHHHhh---hcCCCCceeec---------Ccc Confidence 00001112223344567888999988776543 345689999999999987542 34444333321 111 Q ss_pred cccceeeeccceEEEeCCcceec-cCCCcceEEEEEec-ceeEeecCCcceeEeccCCC-------------cceeEEEE Q lcl|NC_020854. 226 GEVSVPTYMGLRVIVSDDVNTAG-SGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL-------------AKSDAMSI 290 (342) Q Consensus 226 ~~~~i~~~~G~~VvvdD~~p~~~-~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~-------------~g~~~l~~ 290 (342) ....-++++|++|++++.+|... +...+++. ++|+. .-+.++...++.++.++... .+...+.. T Consensus 232 ~~~~~~~l~G~Pv~~~~~i~~~~~~~~~~~~~-~~~gD~~~~~~g~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~v~~r~ 310 (333) T protein:vir:78 232 LAAQTGDVLGLPAQFGRAVGGDLGAAVDSKTR-IIGGDFSQLKFGFADEIRIKMSDTATLTDSGSATVSMWQTNQIAILI 310 (333) T ss_pred ccCCCceeeceeeEEccccCCCccccCCCccE-EEEEecccEEEEEeeccEEEEeccccccccccceeehhhcCcEEEEE Confidence 12345789999999999998653 22223333 33332 22335555667777666531 22344556 Q ss_pred eeEEEeeecc---eeeecCcCCc Q lcl|NC_020854. 
291 DLHYVYHPVG---AKWAVTTTNP 310 (342) Q Consensus 291 r~~y~~~~~G---~s~~~~~~sP 310 (342) ..++.+.++- +..-.....| T Consensus 311 ~~r~d~~v~~~~a~~~l~~~~a~ 333 (333) T protein:vir:78 311 EVTFGWLLGDKQAFVKFVDDEQP 333 (333) T ss_pred EEEEccEEecccceEEEeccCCC Confidence 5666655433 4443344567 No 51 >protein:vir:81100 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:1891 # MgeName: tp310-1 # Cross-refs: genbank:acc:YP_001429874;genbank:gi:156603927;genbank:GeneID:5525320 Probab=99.37 E-value=4.6e-13 Score=88.21 Aligned_cols=283 Identities=12% Similarity=0.038 Sum_probs=165.6 Q ss_pred Cc----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh-h Q lcl|NC_020854. 1 MA----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG-K 75 (342) Q Consensus 1 Ma----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~-~ 75 (342) ++ |.-.-.++|+.+.+-+.+...+...+.+ ++...+ + ..+.-.+.+|.+.. ...+..+.|+.+++.. . T Consensus 120 ~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~--~~~~~~-~--~~~~~~~~~~~~~~--~~~~~~v~E~~~~~~~~~ 192 (415) T protein:vir:81 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDK--YVTVKR-V--TNGSGKYPVVRQSE--VAALEKVEELEENPELAV 192 (415) T ss_pred hhccccccccccccchHHHHHHHHHHHhhhhhhh--heeeee-c--cCCceeEEEEeecC--CccceeeccccccCcccc Confidence 11 2235678999888777777666666633 222111 1 11122444555543 3566678888888754 4 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+-.+-....++.+.-+.+++....-+.-|..+.+.+++++.+.+..++.++.-+. . ........ ........ 
T Consensus 193 ~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g---~---g~~~~~~~-~~~~~~~~ 265 (415) T protein:vir:81 193 KPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVIT---K---GSTGSTSS-GFEKEGKK 265 (415) T ss_pred cceeeEEeeeeeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc---c---Cccccccc-cccccccc Confidence 45566666777777778887776555556778889999999999998888766331 0 11111111 11111222 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) .......+++.|.+++..+.+....-.+|+||+..+..|++. ++++++++.... ..+...++++| T Consensus 266 ~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~l------kd~~G~~l~~~~---------~~~~~~~~l~G 330 (415) T protein:vir:81 266 LEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM------KDKLGNYLIQPD---------VKEKTQQRLLG 330 (415) T ss_pred cccccccchhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh------hccCCceeeccC---------cCCCCCceecc Confidence 334456789999999999988877788999999999999863 455544433211 11234578999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cceee---ecCc Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGAKW---AVTT 307 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~s~---~~~~ 307 (342) ++|++++.+|.... +. .+++|+ ..++.+.....+.++..+..... +.+..-.++-..| ..+-+ +.+. T Consensus 331 ~pV~~~~~~~~~~~---~~-~~~~~Gd~~~~~~~~~~~~~~v~~~~~~~~~-~~~~~~~r~d~~v~~~~a~~~~~~~~~~ 405 (415) T protein:vir:81 331 AKIEILPDEVLGQK---GN-NTLIIGNLKDAIVLFDRSQYQASWTDYMHFG-ECLMIAVRQDCRILDYKSAIVIEYDDSE 405 (415) T ss_pred eeeEEecccccCCC---Cc-cEEEEEehhccEEEEeecceEEEEeccccCc-eEEEEEEEeccEEeccccEEEEEEeccC Confidence 99999999986542 22 245666 44555666667777766644322 2333333343333 33322 2221 Q ss_pred CCcChHHhcC Q lcl|NC_020854. 
308 TNPTRAQLET 317 (342) Q Consensus 308 ~sPt~~~L~~ 317 (342) ..|-+--|+. T Consensus 406 ~~~~~~~~~~ 415 (415) T protein:vir:81 406 RGEGDLGLEA 415 (415) T ss_pred CCCCccccCC Confidence 1122222222 No 52 >protein:vir:79987 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:1875 # MgeName: tp310-3 # Cross-refs: genbank:acc:YP_001430002;genbank:gi:156604057;genbank:GeneID:5525447 Probab=99.37 E-value=4.6e-13 Score=88.21 Aligned_cols=283 Identities=12% Similarity=0.038 Sum_probs=165.6 Q ss_pred Cc----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh-h Q lcl|NC_020854. 1 MA----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG-K 75 (342) Q Consensus 1 Ma----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~-~ 75 (342) ++ |.-.-.++|+.+.+-+.+...+...+.+ ++...+ + ..+.-.+.+|.+.. ...+..+.|+.+++.. . T Consensus 120 ~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~--~~~~~~-~--~~~~~~~~~~~~~~--~~~~~~v~E~~~~~~~~~ 192 (415) T protein:vir:79 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDK--YVTVKR-V--TNGSGKYPVVRQSE--VAALEKVEELEENPELAV 192 (415) T ss_pred hhccccccccccccchHHHHHHHHHHHhhhhhhh--heeeee-c--cCCceeEEEEeecC--CccceeeccccccCcccc Confidence 11 2235678999888777777666666633 222111 1 11122444555543 3566678888888754 4 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+-.+-....++.+.-+.+++....-+.-|..+.+.+++++.+.+..++.++.-+. . ........ ........ 
T Consensus 193 ~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g---~---g~~~~~~~-~~~~~~~~ 265 (415) T protein:vir:79 193 KPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVIT---K---GSTGSTSS-GFEKEGKK 265 (415) T ss_pred cceeeEEeeeeeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc---c---Cccccccc-cccccccc Confidence 45566666777777778887776555556778889999999999998888766331 0 11111111 11111222 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) .......+++.|.+++..+.+....-.+|+||+..+..|++. ++++++++.... ..+...++++| T Consensus 266 ~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~l------kd~~G~~l~~~~---------~~~~~~~~l~G 330 (415) T protein:vir:79 266 LEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM------KDKLGNYLIQPD---------VKEKTQQRLLG 330 (415) T ss_pred cccccccchhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh------hccCCceeeccC---------cCCCCCceecc Confidence 334456789999999999988877788999999999999863 455544433211 11234578999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cceee---ecCc Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGAKW---AVTT 307 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~s~---~~~~ 307 (342) ++|++++.+|.... +. .+++|+ ..++.+.....+.++..+..... +.+..-.++-..| ..+-+ +.+. T Consensus 331 ~pV~~~~~~~~~~~---~~-~~~~~Gd~~~~~~~~~~~~~~v~~~~~~~~~-~~~~~~~r~d~~v~~~~a~~~~~~~~~~ 405 (415) T protein:vir:79 331 AKIEILPDEVLGQK---GN-NTLIIGNLKDAIVLFDRSQYQASWTDYMHFG-ECLMIAVRQDCRILDYKSAIVIEYDDSE 405 (415) T ss_pred eeeEEecccccCCC---Cc-cEEEEEehhccEEEEeecceEEEEeccccCc-eEEEEEEEeccEEeccccEEEEEEeccC Confidence 99999999986542 22 245666 44555666667777766644322 2333333343333 33322 2221 Q ss_pred CCcChHHhcC Q lcl|NC_020854. 
308 TNPTRAQLET 317 (342) Q Consensus 308 ~sPt~~~L~~ 317 (342) ..|-+--|+. T Consensus 406 ~~~~~~~~~~ 415 (415) T protein:vir:79 406 RGEGDLGLEA 415 (415) T ss_pred CCCCccccCC Confidence 1122222222 No 53 >protein:vir:98339 Length: 415 # NCBI annotation: putative capsid protein # Family: family:all:21 # MgeID: mge:1581 # MgeName: phiPVL(108) # Cross-refs: genbank:acc:YP_918931;genbank:gi:119443693;genbank:GeneID:4594501 Probab=99.37 E-value=4.6e-13 Score=88.21 Aligned_cols=283 Identities=12% Similarity=0.038 Sum_probs=165.6 Q ss_pred Cc----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh-h Q lcl|NC_020854. 1 MA----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG-K 75 (342) Q Consensus 1 Ma----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~-~ 75 (342) ++ |.-.-.++|+.+.+-+.+...+...+.+ ++...+ + ..+.-.+.+|.+.. ...+..+.|+.+++.. . T Consensus 120 ~~~~~~~~~gg~~iP~~~~~~ii~~~~~~~~l~~--~~~~~~-~--~~~~~~~~~~~~~~--~~~~~~v~E~~~~~~~~~ 192 (415) T protein:vir:98 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDK--YVTVKR-V--TNGSGKYPVVRQSE--VAALEKVEELEENPELAV 192 (415) T ss_pred hhccccccccccccchHHHHHHHHHHHhhhhhhh--heeeee-c--cCCceeEEEEeecC--CccceeeccccccCcccc Confidence 11 2235678999888777777666666633 222111 1 11122444555543 3566678888888754 4 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+-.+-....++.+.-+.+++....-+.-|..+.+.+++++.+.+..++.++.-+. . ........ ........ 
T Consensus 193 ~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g---~---g~~~~~~~-~~~~~~~~ 265 (415) T protein:vir:98 193 KPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVIT---K---GSTGSTSS-GFEKEGKK 265 (415) T ss_pred cceeeEEeeeeeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc---c---Cccccccc-cccccccc Confidence 45566666777777778887776555556778889999999999998888766331 0 11111111 11111222 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) .......+++.|.+++..+.+....-.+|+||+..+..|++. ++++++++.... ..+...++++| T Consensus 266 ~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~l~~l------kd~~G~~l~~~~---------~~~~~~~~l~G 330 (415) T protein:vir:98 266 LEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM------KDKLGNYLIQPD---------VKEKTQQRLLG 330 (415) T ss_pred cccccccchhHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh------hccCCceeeccC---------cCCCCCceecc Confidence 334456789999999999988877788999999999999863 455544433211 11234578999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cceee---ecCc Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGAKW---AVTT 307 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~s~---~~~~ 307 (342) ++|++++.+|.... +. .+++|+ ..++.+.....+.++..+..... +.+..-.++-..| ..+-+ +.+. T Consensus 331 ~pV~~~~~~~~~~~---~~-~~~~~Gd~~~~~~~~~~~~~~v~~~~~~~~~-~~~~~~~r~d~~v~~~~a~~~~~~~~~~ 405 (415) T protein:vir:98 331 AKIEILPDEVLGQK---GN-NTLIIGNLKDAIVLFDRSQYQASWTDYMHFG-ECLMIAVRQDCRILDYKSAIVIEYDDSE 405 (415) T ss_pred eeeEEecccccCCC---Cc-cEEEEEehhccEEEEeecceEEEEeccccCc-eEEEEEEEeccEEeccccEEEEEEeccC Confidence 99999999986542 22 245666 44555666667777766644322 2333333343333 33322 2221 Q ss_pred CCcChHHhcC Q lcl|NC_020854. 
308 TNPTRAQLET 317 (342) Q Consensus 308 ~sPt~~~L~~ 317 (342) ..|-+--|+. T Consensus 406 ~~~~~~~~~~ 415 (415) T protein:vir:98 406 RGEGDLGLEA 415 (415) T ss_pred CCCCccccCC Confidence 1122222222 No 54 >protein:vir:2344 Length: 397 # NCBI annotation: gp14 # Family: family:all:507 # MgeID: mge:51 # MgeName: Bxb1 # Cross-refs: genbank:acc:NP_075281;genbank:gi:12657868;genbank:GeneID:920118 Probab=99.37 E-value=3.5e-13 Score=88.88 Aligned_cols=303 Identities=14% Similarity=0.091 Sum_probs=167.8 Q ss_pred Cc----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhc Q lcl|NC_020854. 1 MA----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKI 76 (342) Q Consensus 1 Ma----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~l 76 (342) |+ +.-..++.||+..+++.. ..+.+.+.+ ++. .+ ..++..+++|.+.. ...+.-+.|++.++..+. T Consensus 10 ~~~~~t~~~~g~l~~~~~~~ii~~-l~~~s~i~~--l~~---~~--~~~~~~~~ip~~~~--~~~a~wv~Eg~~~~~s~~ 79 (397) T protein:vir:23 10 IAQTKDTMFTGYLDPVQAKDYFAE-AEKTSIVQR--VAQ---KI--PMGATGIVIPHWTG--DVSAQWIGEGDMKPITKG 79 (397) T ss_pred HhhccCCCCccccchhHHHHHHHH-HHhccchhh--hcc---ee--eccCCceEEEEEcC--CcceEEecCCcccccccc Confidence 44 234578999988776654 334344433 221 12 13466789999874 367778899999999999 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +-.+.....++.+..+.++++...-+.-|....+.+++++.+++++++.+|.-. ..... ........ .... 
T Consensus 80 ~f~~v~l~~~k~~~~v~iS~ell~ds~~~l~~~i~~~l~~aia~~~d~a~l~G~------gt~~~-~~~~~~~~--~~~~ 150 (397) T protein:vir:23 80 NMTKRDVHPAKIATIFVASAETVRANPANYLGTMRTKVATAIAMAFDNAALHGT------NAPSA-FQGYLDQS--NKTQ 150 (397) T ss_pred ceeEEEEeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhhcc------cCCcc-cccccccc--ccee Confidence 988888889999999999998877777889999999999999999999887421 11111 01111111 1111 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) ........+.+.++...+-+.......|+||++.+..|++. ++++++.+..-...+. .......++++|+ T Consensus 151 ~~~~~~~~~~~~~~~~~l~~~~~~~a~~vmn~~~~~~L~~l------kd~~G~~i~~~~~~~~----~~~~~~~~tl~G~ 220 (397) T protein:vir:23 151 SISPNAYQGLGVSGLTKLVTDGKKWTHTLLDDTVEPVLNGS------VDANGRPLFVESTYES----LTTPFREGRILGR 220 (397) T ss_pred eecccchhHHHHHHHHhhhhcccCCCEEEEcHHHHHHHHHh------hccCCceeeccccccc----ccccccCceeeee Confidence 22334566778888887777777788999999999999874 3444443322111111 0112234689999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEeeec Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYHPV 299 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~~~ 299 (342) +|++++.+|-.. . ..+++. .-+.++..+++.++..|+.. 
.....+....++.+.++ T Consensus 221 Pv~~s~~~~~g~------~-~~~~gDfs~~~i~~~~~i~i~~~~e~~~~~~~~~~~~~~~lf~~d~v~~ra~~r~d~~v~ 293 (397) T protein:vir:23 221 PTILSDHVAEGD------V-VGYAGDFSQIIWGQVGGLSFDVTDQATLNLGSQESPNFVSLWQHNLVAVRVEAEYGLLIN 293 (397) T ss_pred eEEEeCCCCCCc------e-EEEEeecceEEEEEEeceEEEEeeeeeeeeccccccceeeeeeccceeEEEEeeecccee Confidence 999999987321 1 112221 11224444445555444321 12345555555554432 Q ss_pred ---ceee---ecC--------------------------cC--CcCh----HHhcCCcCc------eeecCccccceEEE Q lcl|NC_020854. 300 ---GAKW---AVT--------------------------TT--NPTR----AQLETVANW------SKVYELKNIGIVRA 335 (342) Q Consensus 300 ---G~s~---~~~--------------------------~~--sPt~----~~L~~~~NW------~~v~d~k~i~~~~~ 335 (342) .+.. +.. ++ +.|. +.|+.-.|| +...+ .-| ..+ T Consensus 294 ~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~--~~~-~~~ 370 (397) T protein:vir:23 294 DVNAFVKLTFDPVLTTYALDLDGASAGNFTLSLDGKTSANIAYNASTATVKSAIVAIDDGVSADDVTVTGS--AGD-YTI 370 (397) T ss_pred cccceEEEeeccccceeeecccccCcceEEEEecCccccCcccccchhhhHHHhhhcccccccceeeeecC--Cce-eEE Confidence 2221 000 00 0011 122222222 22111 000 111 Q ss_pred EecCCCC Q lcl|NC_020854. 336 TNVSNFD 342 (342) Q Consensus 336 ~~~~~~~ 342 (342) ++.-.+- T Consensus 371 ~~~~~~~ 377 (397) T protein:vir:23 371 TVPGTLT 377 (397) T ss_pred Eeccccc Confidence 1111000 No 55 >protein:vir:80684 Length: 315 # NCBI annotation: gp6 # Family: family:all:966 # MgeID: mge:1884 # MgeName: PA6 # Cross-refs: genbank:acc:YP_001285582;genbank:gi:148727088;genbank:GeneID:5247055 Probab=99.37 E-value=2.6e-13 Score=89.55 Aligned_cols=289 Identities=10% Similarity=0.003 Sum_probs=159.8 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) ||+ ......+|+.+.+-+.+.+.+.+.+.+- +.. + ..++..+++|.+.. 
.+.+.-+.|++.++..+.+ T Consensus 1 Ma~~~~~~gg~~vP~~~~~~ii~~l~~~s~i~~l--~~~---i--~~~~~~~~ip~~~~--~~~a~wv~Eg~~~~~s~~~ 71 (315) T protein:vir:80 1 MADDFLSAGKLELPGSMIGAVRDRAIDSGVLAKL--SPE---Q--PTIFGPVKGAVFSG--VPRAKIVGEGEVKPSASVD 71 (315) T ss_pred CCCCcCCcCceEcchHHHHHHHHHHHhhchhhhh--cce---e--ecCCCceEEEEEeC--CcceEEeeCCccccccccc Confidence 993 3466899999988788888887777552 211 1 12456788999874 3677789999999998888 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchH----HHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDP----MAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDS 153 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp----~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~ 153 (342) -++-....++.+.-..++++...-+..+. ...+.++++++.++++++.++. |.=......... +....... T Consensus 72 f~~v~l~~~kl~~~~~iS~ell~~s~~~~~~~l~~~i~~~la~ai~~~~d~a~~~---G~~~~~~~~~~~--~~~~~~~~ 146 (315) T protein:vir:80 72 VSAFTAQPIKVVTQQRVSDEFMWADADYRLGVLQDLISPALGASIGRAVDLIAFH---GIDPATGKAASA--VHTSLNKT 146 (315) T ss_pred eeeeEeeeeeEEeeehhhHHHhhcCchhHHHHHHHHHHHHHHHHHHHHHhhheee---ccCCCCCccccc--cccccccc Confidence 77777777777777777776544444443 2557888888888888876663 210000000000 00000001 Q ss_pred ccccccccccHHHHHHHHHHhC-ccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceee Q lcl|NC_020854. 154 ESGDTPTALSPRHVAEARAILG-DQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPT 232 (342) Q Consensus 154 ~~~~~~~~~~~~~l~~A~~~~G-D~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~ 232 (342) ..........+..+.++..++- .+...-.+|+|||+.+..|++... .+.+.... 
....+ ....+.-++ T Consensus 147 ~~~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~imn~~~~~~L~~l~~------~~g~~~~g---~~~~~--~~~~g~~~t 215 (315) T protein:vir:80 147 KNIVDATDSATADLVKAVGLIAGAGLQVPNGVALDPAFSFALSTEVY------PKGSPLAG---QPMYP--AAGFAGLDN 215 (315) T ss_pred cceeeccccchHHHHHHHHHHhhccCccceEEEEcHHHHHHHHHHhh------ccCCcccc---ccccc--ccccCCCce Confidence 1111112234566788877664 444555689999999999997642 22111110 00000 111223468 Q ss_pred eccceEEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCCC----------cceeEEEEeeEEEeee--- Q lcl|NC_020854. 233 YMGLRVIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDIL----------AKSDAMSIDLHYVYHP--- 298 (342) Q Consensus 233 ~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~~----------~g~~~l~~r~~y~~~~--- 298 (342) ++|++|++++.||.............+|+.=. +.|+.-+...++..+... ..+..+....++..++ T Consensus 216 l~G~PV~~~~~~~~~~~~~~~~~~~~~~GDfs~~~~g~~~~~~i~i~~~~~~~~~~~~~~~~~~v~~r~~~r~~~~v~~~ 295 (315) T protein:vir:80 216 WRGLNVGASSTVSGAPEMSPASGVKAIVGDFSRVHWGFQRNFPIELIEYGDPDQTGRDLKGHNEVMVRAEAVLYVAIESL 295 (315) T ss_pred ecceeeEecCcCCcccccccccccEEEEeecccEEEEEecCeeEEEeccccccCcccchhhcCcEEEEEEEEecceeecc Confidence 99999999999986533222111123333211 234444455666555432 2234555556665443 Q ss_pred cceee-ec---CcCCcChHH Q lcl|NC_020854. 299 VGAKW-AV---TTTNPTRAQ 314 (342) Q Consensus 299 ~G~s~-~~---~~~sPt~~~ 314 (342) ..|.. +. ...+|-.+. T Consensus 296 ~a~~~l~~~~a~~~~~~~~~ 315 (315) T protein:vir:80 296 DSFAVVKEKAAPKPNPPAEN 315 (315) T ss_pred cceEEEeeccCCCCCCCCCC Confidence 33333 11 122333333 No 56 >protein:vir:8187 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:153 # MgeName: Che9d # Cross-refs: genbank:acc:NP_817980;genbank:gi:29566414;genbank:GeneID:2700968 Probab=99.36 E-value=2.1e-13 Score=90.04 Aligned_cols=284 Identities=12% Similarity=0.031 Sum_probs=159.8 Q ss_pred Ccce-eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccc Q lcl|NC_020854. 
1 MATL-RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITAD 79 (342) Q Consensus 1 MaT~-~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~ 79 (342) |||. ....++|+.|.+=+.+...+.+.+.+- +..-+ .++..+++|.+.. .+.+.-+.|++.++..+.+-. T Consensus 1 mat~~~gg~lvP~~~~~~ii~~~~~~s~i~~~--~~~i~-----~~~~~~~~p~~~~--~~~a~wv~Eg~~~~~~~~~f~ 71 (311) T protein:vir:81 1 MVALATGTFQLPKHLVPGVWQKAQGQSVLARL--SMAEP-----QEFGEQQYMTLTA--PPRGEVVGEGAQKSESTATFA 71 (311) T ss_pred CceecCCceEcchhHHHHHHHHHHhcchhhhh--cceee-----cCCCceEEEEEeC--CceeEEeecCcccccccceee Confidence 9865 578999999988777777777766542 21111 2455688999864 377778899999999888877 Q ss_pred eeeeEeeeeccceeechHHHhhh---cchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 80 KQVAAILHRGRAFEARDLAALAA---GSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 80 ~~~a~i~~~~k~~~~tD~a~~~~---~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +.....++.+.-+.++++-...+ ..+..+.+.+++++.++++++..++.--..--+........... ........+ T Consensus 72 ~v~l~~~kl~~~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~a~l~G~~~~~~~~~~gi~~~~~-~~~~~~~~~ 150 (311) T protein:vir:81 72 PVTAIPRKVQVTQRFSQEVKWADESRQLGVLQTMADLSGVALGRALDLIGIHGINPLTGAALSGSPAKIL-DTTNIVELT 150 (311) T ss_pred EEEEeeEEEEEeehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHHHhhhccccCCCCccccccccccc-ccceeeeec Confidence 77777777776677766643222 33567889999999999999988764210000000000000000 000000111 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) ..........+.++..++-+...+..+|+|||..+..|+++ ++++++.+.... 
.....-++++|+ T Consensus 151 ~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l------kd~~G~~l~~~~---------~~~~~~~tl~G~ 215 (311) T protein:vir:81 151 TGTSATPDLAVEAAVGLVLGDNLSPDGVALDNTFSFMLATQ------RDSQGRKLYPEL---------GFGTDVASFAGL 215 (311) T ss_pred ccccchHHHHHHHHHHHhhhcCCCceEEEEcHHHHHHHHhh------hccCCCeeecCc---------cccCCCceecce Confidence 11122223445666667766555667899999999999874 345444332210 112345789999 Q ss_pred eEEEeCCcceeccCCCcceEE---------EEEecc-eeEeecCCcceeEeccCCC---------cceeEEEEeeEEEee Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYAT---------YFFTQG-AVASGEQMAMQTETDRDIL---------AKSDAMSIDLHYVYH 297 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t---------~l~~~G-Ai~~~~k~~~~ve~dr~~~---------~g~~~l~~r~~y~~~ 297 (342) +|++++.+|............ ++|+.= -+.++.-+++.++..++.. .+...+....++... T Consensus 216 Pv~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~gDfs~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~r~~~r~d~~ 295 (311) T protein:vir:81 216 NAAVSDTVRGGPEAVTASTGVYRTTNPNVKAIAGDFSAFRWGVQVSIPLELIEFGDPDGLGDLKRQNQIAIRAEVVYGIG 295 (311) T ss_pred eEEecccccccccccccccchhcccCCccEEEEEecccEEEEEeccceEEEeccCCCCcchhhhhcCcEEEEEEEEeccE Confidence 999999887432211111111 222221 1233444455666555432 223455555666555 Q ss_pred ec---ceee-ecCcCC Q lcl|NC_020854. 298 PV---GAKW-AVTTTN 309 (342) Q Consensus 298 ~~---G~s~-~~~~~s 309 (342) |. .|.. +.+..+ T Consensus 296 v~~~~a~~~l~~a~~~ 311 (311) T protein:vir:81 296 IMSTDAFAVVRDADES 311 (311) T ss_pred eecccceEEEEeeccC Confidence 43 3332 222222 No 57 >protein:vir:108303 Length: 418 # NCBI annotation: hypothetical protein # Family: family:all:1412 # MgeID: mge:2007 # MgeName: BA3 # Cross-refs: genbank:acc:YP_001552282;genbank:gi:160700607;genbank:GeneID:5758819 Probab=99.36 E-value=3e-13 Score=89.27 Aligned_cols=300 Identities=12% Similarity=0.028 Sum_probs=172.6 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccc--hhhhccCCCCEEEccccccCCCCcccccCCCceechhhccc Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPM--TELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITA 78 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d--~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~ 78 (342) |||+-..++.||++++-+.+.+.+.+.|. .++-+| +++.. -||+|+||.... . .+.++..++++.++. T Consensus 1 m~~~~N~~ltp~iia~~~l~~l~~~lV~~--~lv~r~y~~e~~~--~GDTV~I~vp~~---~---~v~dg~~~~~~~~te 70 (418) T protein:vir:10 1 MAVQDNNLLTDDVIAKEALRLLKNNLVMA--KCVYRNYEKTFGK--VGDTIRLKLPYR---V---KSASGRTLVKQPMVD 70 (418) T ss_pred CCccccccccHHHHHHHHHHHHHHhccch--hhhcCCCchHHhh--CCCEEEEeeCCc---e---eecccCCcccccccc Confidence 99988888999999999999998888774 356554 34443 399999997542 2 234567889999998 Q ss_pred ceeeeEe-eeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 79 DKQVAAI-LHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 79 ~~~~a~i-~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ++...++ +....++.++|+...+...|.+.++.++.+...++++|.++++.+++.-. .. .+ .+ T Consensus 71 ~~v~l~id~~k~~~~~itD~e~a~~~~d~~~~~l~~A~~aLA~~vD~~ia~l~~~a~~-----~~---------gt-~g- 134 (418) T protein:vir:10 71 QTIPFKIAYQEHVGLEYTVKDKTLDIMQFSERYLKSGMVQIANQIDRSLALTLKKAFH-----SS---------GT-PG- 134 (418) T ss_pred ceEEEEEecccccceeechHHHhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHhhccc-----cc---------cc-CC- Confidence 8877776 57788999999999888999999999999999999999999987765310 00 00 00 Q ss_pred ccccccHHHHHHHHHHhCcccc---CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGD---KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~---~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) ...-.++.+.+|..+|.+..= ..+.+|+.|..|..|.+.....+.+... ....-++.|+.+. 
T Consensus 135 -t~~~~~~~i~~a~~~Ld~~~VP~~G~R~lVv~P~~~~~L~~~~~~~~~~~~~--------------~~~lr~G~IG~i~ 199 (418) T protein:vir:10 135 -VRPGAFIDFANAGAKQTTYAVPQDGMRHAVLDPFTCASLSDEVTKLFKESMV--------------EQAYKMGYRGNVA 199 (418) T ss_pred -cCcchHHHHHHHHHHHHhcCCCCCCceEEEeCHHHHHHHhhhcccccccccc--------------chhhheeeeeeee Confidence 111247889999999988632 2478999999999998765432221110 0112245688999 Q ss_pred cceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecc-------eeeecC- Q lcl|NC_020854. 235 GLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVG-------AKWAVT- 306 (342) Q Consensus 235 G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G-------~s~~~~- 306 (342) |..|++++.+|....+.. -.+.+..+.++-..........-........-|.+.--..+.+|+.. -.|... T Consensus 200 GF~V~~S~nip~~tag~~-~~t~~v~ga~~~~~~~~~~~~t~s~~g~l~~Gd~~ti~gv~~v~~~t~~~~~~~~~f~V~~ 278 (418) T protein:vir:10 200 AYEVYESQNLPKHTVGDH-GGTPLVNGTVVNGDTVGFDGGTASTTGFLKAGDVITFGGVFGVNPQNYETTGLLQEFVVLE 278 (418) T ss_pred ceEEEEecCCCccccccc-ccceeeecccccceeEEEeecceeeccceeeccEEEECceeecccccccccccceEEEEEe Confidence 999999999996543321 11222222221111111000111111222222333333333333322 123210 Q ss_pred ------------cCCcCh-----------HHhcCCcCceeec-Cc-cccceEEE-------EecCCCC Q lcl|NC_020854. 307 ------------TTNPTR-----------AQLETVANWSKVY-EL-KNIGIVRA-------TNVSNFD 342 (342) Q Consensus 307 ------------~~sPt~-----------~~L~~~~NW~~v~-d~-k~i~~~~~-------~~~~~~~ 342 (342) .++|.. .+.-...+.+-|- .| ...++-.+ .-|.-|. 
T Consensus 279 ~~~~~~~~~~tv~i~p~~~~~~~~~~~~~~~~~~~~~~~~v~a~~a~~~~it~~~~a~~~~~~nl~f~ 346 (418) T protein:vir:10 279 DVDTDAGGAGSIKISPSLNDGTATINNENGDPVSLTAYQNVTALPADNAPITVLGAANTTYEQNYLFH 346 (418) T ss_pred eccccccCcceeEeccccccccccccccccccccccCCCcccccccCcceeeeecccccceeeeeeee Confidence 112332 2222222222221 11 11111111 1111222 No 58 >protein:vir:78523 Length: 338 # NCBI annotation: Putative head structural protein # Family: family:all:507 # MgeID: mge:1853 # MgeName: U2 # Cross-refs: genbank:acc:YP_001491585;genbank:gi:157786408;genbank:GeneID:5625675 Probab=99.36 E-value=5.4e-13 Score=87.85 Aligned_cols=284 Identities=11% Similarity=0.042 Sum_probs=164.7 Q ss_pred Cccee---------ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccC------CCCccccc Q lcl|NC_020854. 1 MATLR---------SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKAN------LSGDFEVL 65 (342) Q Consensus 1 MaT~~---------~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i------~~gda~~~ 65 (342) |++.. .--++|+.+..-+.+.+.+.+.+.+ ++.. + ..++..+++|.+... ..+.+..+ T Consensus 10 ~~~~~~~~~~~~~~~~~liP~~~~~~ii~~~~~~s~l~~--l~~~---~--~~~~~~~~ip~~~~~~~a~~v~~~~~~~~ 82 (338) T protein:vir:78 10 NTAGSNHQGRLAHVPSDLLPKEIVGPIFDKAQESSLVLR--LGEN---I--PISYGETIIPTTVKRPEVGQVGVGTSNEQ 82 (338) T ss_pred hhcccccccceecccccccchHHHHHHHHHHHhhchhhh--hcce---e--eccCCceEEEEEecCccceeecccccccc Confidence 22111 1116888887777777777777644 2211 1 135678888887532 12445567 Q ss_pred CCCceechhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH--------HHHHHhhh Q lcl|NC_020854. 66 SDSSSLTPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSC--------LQGVFGSL 137 (342) Q Consensus 66 ~~~~~i~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~--------L~g~~~~~ 137 (342) .|++.++..+++-.+-....++.+.-..++++...-+..|..+.+.+++++.++++++..+|.- ..|+.... 
T Consensus 83 ~Eg~~~~~~~~~f~~v~l~~~k~~~~~~is~ell~ds~~~~~~~i~~~la~a~~~~~d~~~l~G~g~~~~~~~~gi~~~~ 162 (338) T protein:vir:78 83 REGGTKPLSGTAWDTRSVAPIKLATIVTVSEEFARMNPSGLYTKLQADLAYAIGRGIDLAVFHGKSPLTGSALQGIDTNN 162 (338) T ss_pred cccccccccccceeEEEEEEEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhhcccCCCcccccccccccc Confidence 8899999999888888888888888889988776666678899999999999999999888741 11111100 Q ss_pred cccccchhheeeecccccccccccccHHHHHHHHHHhCc-cccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeec Q lcl|NC_020854. 138 NANTSSSAFFDLCIDSESGDTPTALSPRHVAEARAILGD-QGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQS 216 (342) Q Consensus 138 ~a~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD-~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~ 216 (342) . .... .......+.....++.|.++..++.. ......+|+||+..+..|.+... +++++++.+.... T Consensus 163 ~-------~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~m~~~~~~~L~~~~~---l~d~~g~~l~~~~- 230 (338) T protein:vir:78 163 V-------IVNT-TNVDYLQTGTTPLLDRFLDGYDLVSANTDVDFNGWAADPRYRARLLRSQA---YRDANGNVDPTRI- 230 (338) T ss_pred c-------cccc-cccccccccchhhHHHHHHHHHHhhhhccccceEEEEchHHHHHHHHHhh---hccCCCceeeccc- Confidence 0 0000 01111112223456778888776643 33456789999999999976432 3445444332111 Q ss_pred cceeecccccccceeeeccceEEEeCCcceecc-CCCcceEEEEEecc-eeEeecCCcceeEeccCCC------------ Q lcl|NC_020854. 217 GGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGS-GGSTEYATYFFTQG-AVASGEQMAMQTETDRDIL------------ 282 (342) Q Consensus 217 ~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~-~~~~~y~t~l~~~G-Ai~~~~k~~~~ve~dr~~~------------ 282 (342) .....-++++|++|+++|.+|.... ....+...| |+.- -+.++.-.++.++..++.. T Consensus 231 --------~~~~~~~~l~G~PV~~~~~ip~~~~~~~~~~~~~~-~gdfs~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~ 301 (338) T protein:vir:78 231 --------NLAASAGDLLGLPVQFGKAVGGDLGAATDSKVRVV-GGDFSQLKYGFADEIRVKMSDTATLTDNTSPTPQTV 301 (338) T ss_pred --------ccCCCCceeeeeeEEEccccCccccccCCcccEEE-EEecceEEEEeecccEEEEeecccccccccccccch Confidence 1123347899999999999985422 222233333 3322 2335555667777666532 Q ss_pred ----cceeEEEEeeEEEeee---cceeeecCcCCcCh Q lcl|NC_020854. 
283 ----AKSDAMSIDLHYVYHP---VGAKWAVTTTNPTR 312 (342) Q Consensus 283 ----~g~~~l~~r~~y~~~~---~G~s~~~~~~sPt~ 312 (342) .....+....++...| ..+.--.....|.- T Consensus 302 ~~~~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~~~~ 338 (338) T protein:vir:78 302 SMWQTNQIAILIEVTFGWLLGDKQAFVKFVDDEDPDA 338 (338) T ss_pred hhhhcCcEEEEEEEEeccEeecccceEEEecccCCCC Confidence 1223344444554443 22333222233433 No 59 >protein:vir:104256 Length: 458 # NCBI annotation: major head protein precursor # Family: family:all:27070 # MgeID: mge:1504 # MgeName: T5 # Cross-refs: genbank:acc:YP_006977;genbank:gi:46401878;genbank:GeneID:2777673 Probab=99.34 E-value=5.1e-13 Score=88.00 Aligned_cols=282 Identities=12% Similarity=0.092 Sum_probs=161.2 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh------ Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG------ 74 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~------ 74 (342) +++......+|+.+.+-+.+...+.+.+.+ ++..-+ .+|....+|..... +.+.-+.++...+-. T Consensus 165 ~~~~~g~~~ip~~~~~~ii~~~~~~~~l~~--~~~~~~-----~~~~~~~~~~~~~~--~~a~~v~e~~~~~~~~~~~~~ 235 (458) T protein:vir:10 165 SSVEVSSESYETIFSQRIIRDLQKELVVGA--LFEELP-----MSSKILTMLVEPDA--GKATWVAASTYGTDTTTGEEV 235 (458) T ss_pred ccCccccceehhhHhHHHHHHHHhhhhHHh--hcceee-----cCCcceEEEEecCC--cceeecccccccccccccccc Confidence 112234557788777777666666665533 222111 23555566654432 455445666554422 Q ss_pred hcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-----HHHHHhhhcccccchhheee Q lcl|NC_020854. 75 KITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSC-----LQGVFGSLNANTSSSAFFDL 149 (342) Q Consensus 75 ~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~-----L~g~~~~~~a~~~~~~~~~~ 149 (342) +.+-.+-....++.+.-+.+++....-+.-+..+.+.+++++.+.++.+..+|.- -+|++....... .. 
T Consensus 236 ~~~~~~i~~~~~k~~~~v~is~ell~ds~~~~~~~i~~~l~~~i~~~~d~~~l~G~G~~~p~Gi~~~~~~~~------~~ 309 (458) T protein:vir:10 236 KGALKEIHFSTYKLAAKSFITDETEEDAIFSLLPLLRKRLIEAHAVSIEEAFMTGDGSGKPKGLLTLASEDS------AK 309 (458) T ss_pred cccceeeEeeeeeEEeeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHHHhhcCCCCCccceeeecccccc------cc Confidence 2233333344455666667777654444557788899999999999999877631 111111100000 00 Q ss_pred ecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccc Q lcl|NC_020854. 150 CIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVS 229 (342) Q Consensus 150 ~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~ 229 (342) .....++.....++++.|.++...+......-..|+||+..+..|++. ++++++++....... ...... T Consensus 310 ~~~~~~~~~~~~~~~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~l~~l------kd~~G~~i~~~~~~~-----~~~~~~ 378 (458) T protein:vir:10 310 VVTEAKADGSVLVTAKTISKLRRKLGRHGLKLSKLVLIVSMDAYYDLL------EDEEWQDVAQVGNDS-----VKLQGQ 378 (458) T ss_pred eeecccccccccccHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHhh------cccCCceeecccccc-----ccccCc Confidence 111223334456789999999988887776778999999999988764 345544332211111 111234 Q ss_pred eeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecce-eeecCcC Q lcl|NC_020854. 230 VPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGA-KWAVTTT 308 (342) Q Consensus 230 i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~-s~~~~~~ 308 (342) .++++|++|+++|.||..+.. .......|+. ++.+.....+.+++|+....+...++++.++...++-- .|..... T Consensus 379 ~~~l~G~pv~~~~~~p~~~~~--~~~~~~~f~~-~~~~~~~~~~~v~~d~~~~~~~~~~~~~~r~~~~v~~~~a~v~~~~ 455 (458) T protein:vir:10 379 VGRIYGLPVVVSEYFPAKANS--AEFAVIVYKD-NFVMPRQRAVTVERERQAGKQRDAYYVTQRVNLQRYFANGVVSGTY 455 (458) T ss_pred CceecceeeEEccccccccCC--cceEEEEecc-cEEEEEeeceEEEeecccCCCceEEEEEEEecceEecccceEEEee Confidence 568999999999999865322 2222223333 34456666688888888888888888888887554322 2222222 Q ss_pred CcC Q lcl|NC_020854. 
309 NPT 311 (342) Q Consensus 309 sPt 311 (342) +.+ T Consensus 456 aa~ 458 (458) T protein:vir:10 456 AAS 458 (458) T ss_pred ccC Confidence 222 No 60 >protein:vir:9410 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:167 # MgeName: phi 13 # Cross-refs: genbank:acc:NP_803388;genbank:gi:29028700;genbank:GeneID:1258136 Probab=99.34 E-value=1e-12 Score=86.36 Aligned_cols=283 Identities=12% Similarity=0.042 Sum_probs=165.7 Q ss_pred Cc-ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh-hccc Q lcl|NC_020854. 1 MA-TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG-KITA 78 (342) Q Consensus 1 Ma-T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~-~lt~ 78 (342) .. |.-....+|+.+.+-+.+...+...+.+ ++..-+. ..+...+.+|.+.. .+.+..+.|+..++.. ..+- T Consensus 123 ~~~~~~g~~~iP~~~~~~ii~~~~~~~~l~~--~~~~~~~---~~~~~~~~~~~~~~--~~~~~~v~Eg~~~~~~~~~~~ 195 (415) T protein:vir:94 123 SLKTDSGFVVIPEEIVTDILKLKEVEFNLDK--YVTVKRV---TNGSGKYPVVRQSE--VAALEKVEELEENPELAVKPF 195 (415) T ss_pred ccccccccccCcHHHHHHHHHHHHhhhhhhh--hcceeec---cCCceeEEEEeecC--Cccceeccccccccccccccc Confidence 11 2235568899888877777777776644 2211111 11233555666643 3667788899988754 4455 Q ss_pred ceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccc Q lcl|NC_020854. 79 DKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDT 158 (342) Q Consensus 79 ~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~ 158 (342) .+-....++.+.-+.++++...-+.-|..+.+.+++++.+.+..++.++.-+.. ........ ........... 
T Consensus 196 ~~i~~~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~~~~~~~~~~il~g~g~------g~~~~~~~-~~~~~~~~~~~ 268 (415) T protein:vir:94 196 FQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVITK------GSTGSTSS-GFEKEGKKLEV 268 (415) T ss_pred eeeEeeheeeeeechhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhcccc------Cccccccc-ccccccccccc Confidence 666666677777778887765555567788899999999999998887763310 11100111 11111122233 Q ss_pred cccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceE Q lcl|NC_020854. 159 PTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRV 238 (342) Q Consensus 159 ~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~V 238 (342) .+..+++.|.++...+.+..-.-.+|+|||..+..|++. ++++++++.... ..+...++++|++| T Consensus 269 ~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l------kd~~G~~l~~~~---------~~~~~~~~l~G~pV 333 (415) T protein:vir:94 269 KKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM------KDKLGNYLIQPD---------VKEKTQQRLLGAKI 333 (415) T ss_pred ccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh------hccCCCeeeccC---------cCCCCCceecceee Confidence 445678999999998888777778999999999999864 455554433211 11234578999999 Q ss_pred EEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cceee---ecCcCCc Q lcl|NC_020854. 239 IVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGAKW---AVTTTNP 310 (342) Q Consensus 239 vvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~s~---~~~~~sP 310 (342) ++++.+|....+ . ..++|+ ..++.+.....+.++..+.... ++.+..-.++.+.| ..+.. +.+...| T Consensus 334 ~~~~~~~~~~~~---~-~~i~~gd~~~~~~~~~~~~~~v~~~~~~~~-~~~~r~~~r~d~~~~~~~a~~~~~~~~~~~~~ 408 (415) T protein:vir:94 334 EILPDEVLGQKG---N-NTLIIGNLKDAIVLFDRSQYQASWTDYMHF-GECLMIAVRQDCRILDYKSAIVIEYDDSERGE 408 (415) T ss_pred EEecccccCCCC---c-cEEEEEehhccEEEEeecceEEEEeccccC-ceEEEEEEEeccEEeccccEEEEEEeccCCCC Confidence 999999865422 2 235555 3445555555667766554332 23444444444443 33322 2221112 Q ss_pred ChHHhcC Q lcl|NC_020854. 
311 TRAQLET 317 (342) Q Consensus 311 t~~~L~~ 317 (342) -+--|+. T Consensus 409 ~~~~~~~ 415 (415) T protein:vir:94 409 GDLGLEA 415 (415) T ss_pred CccccCC Confidence 2222222 No 61 >protein:vir:100247 Length: 425 # NCBI annotation: gp76 # Family: family:all:21 # MgeID: mge:1619 # MgeName: Bcep176 # Cross-refs: genbank:acc:YP_355412;genbank:gi:77864702;genbank:GeneID:3725969 Probab=99.34 E-value=2.6e-13 Score=89.58 Aligned_cols=283 Identities=9% Similarity=-0.010 Sum_probs=164.1 Q ss_pred Ccc-e--eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhc- Q lcl|NC_020854. 1 MAT-L--RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKI- 76 (342) Q Consensus 1 MaT-~--~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~l- 76 (342) |.+ + -.-.++|+.+..-+.+...+.+.+.+. +..-+ .++...++|.... ...+.-+.|++.++..+. T Consensus 130 l~~~t~~~gG~lvP~~~~~~ii~~~~~~s~l~~l--~~~~~-----~~~~~~~~~~~~~--~~~a~wv~E~~~~~~~~~~ 200 (425) T protein:vir:10 130 LNKGEDSEGGYLTPIEWDRTITNKLVLISPMRQL--CRVQP-----VSKAGFSKLFNMG--GTTSGWVGEASQRPQTNAA 200 (425) T ss_pred hhcCcCCCCceeccHhHHHHHHHHHHhhhhhhhh--ceeee-----ccCCceEEEEEcC--Ccceeeecccccccccccc Confidence 432 2 223467777776666666666666542 21111 2334566776542 356667788888876654 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-----HHHHHhhhccc-ccchhheeee Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSC-----LQGVFGSLNAN-TSSSAFFDLC 150 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~-----L~g~~~~~~a~-~~~~~~~~~~ 150 (342) +-.+.....++.+.-+.++++...-+.-|..+.+.+++++.++++.+..+|.= -+|++...... .......+ . 
T Consensus 201 ~f~~v~~~~~k~~~~i~iS~ell~ds~~~l~~~i~~~la~ai~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~-~ 279 (425) T protein:vir:10 201 TFQPLSFASGEIYANPAATQQILDDAEIDLESWLATEVQTEFAKQEGKAFLAGDGTNKPNGLLTYIAGGANAAKHPFG-A 279 (425) T ss_pred ccceeeeeheeeEeehHhHHHHHhcchhHHHHHHHHHHHHHHHHHHHhhhhcccCCCCcceeeecccccccccccccc-c Confidence 44444555566666667766655445557788899999999999999877641 00111100000 00000000 0 Q ss_pred cccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccce Q lcl|NC_020854. 151 IDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSV 230 (342) Q Consensus 151 ~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i 230 (342) .......+...++++.|.+....+......-.+|+||+..+..|+++ ++++++++.. .. ...+.- T Consensus 280 ~~~~~~~~~~~~~~d~l~~l~~~l~~~~~~~a~~vmn~~~~~~L~~l------kD~~G~~l~~--~~-------~~~g~~ 344 (425) T protein:vir:10 280 IEVVNSGAAADITSDGIIDLVYDLPSAFTGNARFAMNRNTQRQVRKL------KDGQGNYLWQ--PS-------YVAGQP 344 (425) T ss_pred cccccccccccccHHHHHHHHhhhhhhhccCCEEEEchHHHHHHHHh------hcCCCceeec--cC-------ccCCCC Confidence 01112224456788899998887777766777899999999999874 3454443321 11 112234 Q ss_pred eeeccceEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecc-eeeecCc Q lcl|NC_020854. 231 PTYMGLRVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVG-AKWAVTT 307 (342) Q Consensus 231 ~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G-~s~~~~~ 307 (342) ++++|++|+++|.||..+++. ..++|+ ..++.+.....+.+..++....+...+....++...|.- -+|..-. T Consensus 345 ~~l~G~PV~~~~~~p~~~~~~----~~i~~Gd~~~~~~i~~~~~~~v~~d~~~~~~~~~~~~~~r~d~~v~~~~A~~~l~ 420 (425) T protein:vir:10 345 ATLAGYPVTEVPDMPDVAANS----TPILFGDFQQTYLIIDRIGVRVLRDPYTAKPYVLFYTTKRVGGGLLNPEPMRAMK 420 (425) T ss_pred ceecceeeEEecCcCCccCCc----cEEEEEehhccEEEEEecceEEEecccccCCcEEEEEEEEeccEeecccceEEEE Confidence 689999999999999655432 234554 344556666667788888777888888888777666532 2221110 Q ss_pred CCcCh Q lcl|NC_020854. 
308 TNPTR 312 (342) Q Consensus 308 ~sPt~ 312 (342) ..-+. T Consensus 421 ~~as~ 425 (425) T protein:vir:10 421 VAASE 425 (425) T ss_pred eeccC Confidence 00011 No 62 >protein:vir:95763 Length: 297 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1578 # MgeName: SMP # Cross-refs: genbank:acc:YP_950590;genbank:gi:119953785;genbank:GeneID:5076833 Probab=99.33 E-value=9.4e-13 Score=86.52 Aligned_cols=269 Identities=9% Similarity=0.010 Sum_probs=161.5 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) |.|......+|+.+.+=+.+...+.+.+.+. +..-+ ..++..+.+|.... ...+.-+.|++.++..+.+..+ T Consensus 12 ~~t~~~~~lvP~~~~~~ii~~~~~~s~l~~~--~~~~~----~~~~~~~~~~~~~~--~~~a~~v~Eg~~~~~~~~~f~~ 83 (297) T protein:vir:95 12 LVSQKKDGTLHKEFTDIIMKEVAQNSLVMQL--GQYQE----MEGEQEKTVYVQTD--GISAYWVNETEKIKTDKPEVVP 83 (297) T ss_pred cccCCCcceechhHHHHHHHHHHhhchhhhh--cceee----cCCCccEEEEEEcC--CceeEEeecCccccccccceeE Confidence 3334455577888876666666666666442 22211 11233456775542 3667789999999999988888 Q ss_pred eeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPT 160 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~ 160 (342) .....++.+..+.++++...-+.-|..+.+.+++++.++++.++.+|. | .. ++. ...+.... ......... 
T Consensus 84 v~l~~~k~~~~~~is~ell~ds~~~l~~~i~~~la~ai~~~~d~a~l~---G---~g-~~~-~~gi~~~~-~~~~~~~~~ 154 (297) T protein:vir:95 84 VTLKAHKLGIILVTSREALNYTWKKFFEDMKPQIVEAFYKKIDEAGLL---G---HD-TPF-ANSVAKAA-KDANKVIGG 154 (297) T ss_pred EEEeeEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHHhc---c---cC-Ccc-cccccccc-cccceeccc Confidence 888888999999999887766667889999999999999999998873 2 11 111 11111111 112223345 Q ss_pred cccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 161 ALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 161 ~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) .++++.+.++...+.+......+|+||++.+..|++. ++++++.+. +...++++|++|++ T Consensus 155 ~~t~~~i~~~~~~l~~~~~~~~~~v~~~~~~~~L~~l------~d~~G~~i~--------------~~~~~~l~G~Pv~~ 214 (297) T protein:vir:95 155 PINYDNILKLQDALYDADVEPNAFVSKIQNRSALREA------RDGNKVSIY--------------DKAANTIDGITTVD 214 (297) T ss_pred ccCHHHHHHHHHHhhhccCCcCEEEEcHHHHHHHHHh------hccCCceee--------------cCCCCcccceeeEe Confidence 6789999999999988887888999999999999864 233333221 22346789999998 Q ss_pred eCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEeeecceee Q lcl|NC_020854. 241 SDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYHPVGAKW 303 (342) Q Consensus 241 dD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~~~G~s~ 303 (342) +...+... .+ ++|+. -.+.++...++.++..++.. .....+....++...|. T Consensus 215 ~~~~~~~~------~~-~~~gd~s~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~~~~~d~~v~---- 283 (297) T protein:vir:95 215 LKSARFEK------GD-LLAGDFDNLIYGVPYNITYKISEEGQISTITNADGTPINLFEQEMIAIRATMDIAVMIT---- 283 (297) T ss_pred ecCCCCCC------ce-EEEEecccEEEEEecCeEEEEeeccccccccccCccchhhhhcCcEEEEEEEEeccEee---- Confidence 77655322 22 33332 22335566666776655432 12223333333333321 Q ss_pred ecCcCCcChHHhcCCcCceeecCccccceEEEEecCCC Q lcl|NC_020854. 
304 AVTTTNPTRAQLETVANWSKVYELKNIGIVRATNVSNF 341 (342) Q Consensus 304 ~~~~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~~~ 341 (342) +++++.. ++--++. T Consensus 284 ----------------------~~~a~~~--l~~at~~ 297 (297) T protein:vir:95 284 ----------------------KTDAFAK--LTPAERV 297 (297) T ss_pred ----------------------cccceEE--EeecCCC Confidence 1111111 1111111 No 63 >protein:vir:191 Length: 385 # NCBI annotation: major head subunit precursor # Family: family:all:585 # MgeID: mge:6 # MgeName: HK97 # Cross-refs: genbank:acc:NP_037701;genbank:gi:9634158;genbank:GeneID:1262530 Probab=99.33 E-value=1e-12 Score=86.35 Aligned_cols=268 Identities=11% Similarity=0.070 Sum_probs=159.6 Q ss_pred Cccee---ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MATLR---SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT~~---~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |.+.. ..++.|++. ..+.+...+.+.+.+ ++ +.+ ..+|..+++|.+.. ..+.+..+.|++.++..+.+ T Consensus 105 ~~~~~~~~g~~i~~~~~-~~ii~~~~~~~~l~~--~~---~~~--~~~~~~~~~~~~~~-~~~~a~~v~E~~~~~~~~~~ 175 (385) T protein:vir:19 105 LGSDADSAGSLIQPMQI-PGIIMPGLRRLTIRD--LL---AQG--RTSSNALEYVREEV-FTNNADVVAEKALKPESDIT 175 (385) T ss_pred hccccccCCceecchhh-hHHHHHhhhccchhh--hc---cee--cccCcceEEEEEec-CCcceeeeccCccccccccc Confidence 44322 234555554 445555555555543 11 111 12466788998763 24566778999999999998 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) -.+.....++.+..+.++++.-.-+ .+..+.+.+++++.+++.++..+|. | ......-.+............. 
T Consensus 176 ~~~~~~~~~k~~~~~~is~ell~d~-~~l~~~i~~~la~a~~~~~d~~~l~---G---~g~~~~~~Gi~~~~~~~~~~~~ 248 (385) T protein:vir:19 176 FSKQTANVKTIAHWVQASRQVMDDA-PMLQSYINNRLMYGLALKEEGQLLN---G---DGTGDNLEGLNKVATAYDTSLN 248 (385) T ss_pred eeEEEEeeeeEEEeehhhHHHHhhH-HHHHHHHHHHHHHHHHHHHHHHHHh---c---cCCCCccccccccccccccccc Confidence 8888888888888888888754333 4667779999999999999887764 2 1111111111111111111122 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ......++.|.++...+......-.+|+||+..+..|++. ++++++.+.. .. ....-++++|++ T Consensus 249 ~~~~~~~d~i~~~~~~l~~~~~~~~~~~~~~~~~~~l~~l------kd~~G~~l~~--------~~--~~~~~~~l~G~p 312 (385) T protein:vir:19 249 ATGDTRADIIAHAIYQVTESEFSASGIVLNPRDWHNIALL------KDNEGRYIFG--------GP--QAFTSNIMWGLP 312 (385) T ss_pred ccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh------hcCCCceecc--------Cc--ccCCCceeccee Confidence 2334567889999988877777788999999999999864 3455443321 11 123457889999 Q ss_pred EEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCC----CcceeEEEEeeEEEeeec---ceee-ecCc Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDI----LAKSDAMSIDLHYVYHPV---GAKW-AVTT 307 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~----~~g~~~l~~r~~y~~~~~---G~s~-~~~~ 307 (342) |++++.+|-. +.+|+. .++.+.....+.++..+.. ..+...++...++.++++ .+.. +.+. T Consensus 313 V~~~~~~p~~---------~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~r~~~~v~~~~a~~~~~~~a 383 (385) T protein:vir:19 313 VVPTKAQAAG---------TFTVGGFDMASQVWDRMDATVEVSREDRDNFVKNMLTILCEERLALAHYRPTAIIKGTFSS 383 (385) T ss_pred eEEcCcCCCC---------cEEEeecccEEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEecc Confidence 9999999731 233332 2344555555666554433 234556777777766653 3322 1111 Q ss_pred CC Q lcl|NC_020854. 
308 TN 309 (342) Q Consensus 308 ~s 309 (342) .+ T Consensus 384 a~ 385 (385) T protein:vir:19 384 GS 385 (385) T ss_pred CC Confidence 11 No 64 >protein:vir:1886 Length: 385 # NCBI annotation: major capsid subunit precursor # Family: family:all:585 # MgeID: mge:41 # MgeName: HK022 # Cross-refs: genbank:acc:NP_037666;genbank:gi:9634124;genbank:GeneID:1262513 Probab=99.33 E-value=1e-12 Score=86.35 Aligned_cols=268 Identities=11% Similarity=0.070 Sum_probs=159.6 Q ss_pred Cccee---ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MATLR---SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT~~---~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |.+.. ..++.|++. ..+.+...+.+.+.+ ++ +.+ ..+|..+++|.+.. ..+.+..+.|++.++..+.+ T Consensus 105 ~~~~~~~~g~~i~~~~~-~~ii~~~~~~~~l~~--~~---~~~--~~~~~~~~~~~~~~-~~~~a~~v~E~~~~~~~~~~ 175 (385) T protein:vir:18 105 LGSDADSAGSLIQPMQI-PGIIMPGLRRLTIRD--LL---AQG--RTSSNALEYVREEV-FTNNADVVAEKALKPESDIT 175 (385) T ss_pred hccccccCCceecchhh-hHHHHHhhhccchhh--hc---cee--cccCcceEEEEEec-CCcceeeeccCccccccccc Confidence 44322 234555554 445555555555543 11 111 12466788998763 24566778999999999998 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) -.+.....++.+..+.++++.-.-+ .+..+.+.+++++.+++.++..+|. | ......-.+............. 
T Consensus 176 ~~~~~~~~~k~~~~~~is~ell~d~-~~l~~~i~~~la~a~~~~~d~~~l~---G---~g~~~~~~Gi~~~~~~~~~~~~ 248 (385) T protein:vir:18 176 FSKQTANVKTIAHWVQASRQVMDDA-PMLQSYINNRLMYGLALKEEGQLLN---G---DGTGDNLEGLNKVATAYDTSLN 248 (385) T ss_pred eeEEEEeeeeEEEeehhhHHHHhhH-HHHHHHHHHHHHHHHHHHHHHHHHh---c---cCCCCccccccccccccccccc Confidence 8888888888888888888754333 4667779999999999999887764 2 1111111111111111111122 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ......++.|.++...+......-.+|+||+..+..|++. ++++++.+.. .. ....-++++|++ T Consensus 249 ~~~~~~~d~i~~~~~~l~~~~~~~~~~~~~~~~~~~l~~l------kd~~G~~l~~--------~~--~~~~~~~l~G~p 312 (385) T protein:vir:18 249 ATGDTRADIIAHAIYQVTESEFSASGIVLNPRDWHNIALL------KDNEGRYIFG--------GP--QAFTSNIMWGLP 312 (385) T ss_pred ccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh------hcCCCceecc--------Cc--ccCCCceeccee Confidence 2334567889999988877777788999999999999864 3455443321 11 123457889999 Q ss_pred EEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCC----CcceeEEEEeeEEEeeec---ceee-ecCc Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDI----LAKSDAMSIDLHYVYHPV---GAKW-AVTT 307 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~----~~g~~~l~~r~~y~~~~~---G~s~-~~~~ 307 (342) |++++.+|-. +.+|+. .++.+.....+.++..+.. ..+...++...++.++++ .+.. +.+. T Consensus 313 V~~~~~~p~~---------~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~~~~~r~~~~v~~~~a~~~~~~~a 383 (385) T protein:vir:18 313 VVPTKAQAAG---------TFTVGGFDMASQVWDRMDATVEVSREDRDNFVKNMLTILCEERLALAHYRPTAIIKGTFSS 383 (385) T ss_pred eEEcCcCCCC---------cEEEeecccEEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEecc Confidence 9999999731 233332 2344555555666554433 234556777777766653 3322 1111 Q ss_pred CC Q lcl|NC_020854. 
308 TN 309 (342) Q Consensus 308 ~s 309 (342) .+ T Consensus 384 a~ 385 (385) T protein:vir:18 384 GS 385 (385) T ss_pred CC Confidence 11 No 65 >protein:vir:4600 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:101 # MgeName: PVL # Cross-refs: genbank:acc:NP_058445;genbank:gi:9635171;genbank:GeneID:1262708 Probab=99.32 E-value=1.7e-12 Score=85.09 Aligned_cols=283 Identities=12% Similarity=0.063 Sum_probs=163.7 Q ss_pred Cc----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hh Q lcl|NC_020854. 1 MA----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GK 75 (342) Q Consensus 1 Ma----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~ 75 (342) ++ |.-....+|+.+.+.+.+...+...+.+ ++..-+. +.+...+.+|.+.. ...+..+.|+..++. +. T Consensus 120 ~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~--~~~~~~~---~~~~~~~~~~~~~~--~~~~~~v~Eg~~~~~~~~ 192 (415) T protein:vir:46 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDK--YVTVKRV---TNGSGKYPVVRQSE--VAALEKVEELEENPELAV 192 (415) T ss_pred hhccccccCCcccccHHHHHHHHHHHHhhhhhhh--hcceeec---cCCceeEEEEEecC--Ccceeecccccccccccc Confidence 11 2345678999998888888777777754 2211111 11112344444432 356667888888875 34 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+-.+-....++.+..+.++++...-+.-|..+.+.+++++.+.+..+..++.-+. ........ ......... 
T Consensus 193 ~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~il~g~g------~g~~~~~~-~~~~~~~~~ 265 (415) T protein:vir:46 193 KPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVIT------KGSTGSTS-SGFEKEGKK 265 (415) T ss_pred cceeeEEeeeeeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc------cCCccccc-cccccccce Confidence 45555566667777777777766544555778889999999999999988776331 00000000 111111122 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) .......+++.|.++...+.+....-.+|+||+..+..|++. ++++++++.... ..+..-++++| T Consensus 266 ~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l------kd~~G~~i~~~~---------~~~~~~~~l~G 330 (415) T protein:vir:46 266 LEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM------KDKLGNYLIQPD---------VKEKTQQRLLG 330 (415) T ss_pred eccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh------hccCCCeeeccC---------cCCCCCccccc Confidence 234456788999999988888777788999999999998763 445544433211 11233468999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cce---eeecCc Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGA---KWAVTT 307 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~---s~~~~~ 307 (342) ++|++++.+|....+. ..++|+ .-++.+....++.++..+.... ++.+..-.++.+.| ..+ +++.+. T Consensus 331 ~pV~~~~~~~~~~~~~----~~~~~gd~~~~~~~~~~~~~~v~~~~~~~~-~~~~~~~~r~d~~v~~~~a~~~~~~~~~~ 405 (415) T protein:vir:46 331 AKIEILPDEVLGQKGN----NTLIIGNLKDAIVLFDRSQYQASWTDYMHF-GECLMIAVRQDCRILDYKSAIVIEYDDSE 405 (415) T ss_pred eeeEEeccccccCCCc----cEEEEEehhccEEEEeecceEEEeeccccC-ceEEEEEEEeccEEeccccEEEEEeeccC Confidence 9999999998654322 235665 3345556656667766553322 23444444444443 333 222221 Q ss_pred CCcChHHhcC Q lcl|NC_020854. 
308 TNPTRAQLET 317 (342) Q Consensus 308 ~sPt~~~L~~ 317 (342) .-|-+--|+. T Consensus 406 ~~~~~~~~~~ 415 (415) T protein:vir:46 406 RGEGDLGLEA 415 (415) T ss_pred CCCCCccCCC Confidence 1122222222 No 66 >protein:vir:4700 Length: 415 # NCBI annotation: phi PVL ORF 7 homologue # Family: family:all:21 # MgeID: mge:102 # MgeName: phiPV83 # Cross-refs: genbank:acc:NP_061632;genbank:gi:9635719;genbank:GeneID:1262976 Probab=99.32 E-value=1.7e-12 Score=85.09 Aligned_cols=283 Identities=12% Similarity=0.063 Sum_probs=163.7 Q ss_pred Cc----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hh Q lcl|NC_020854. 1 MA----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GK 75 (342) Q Consensus 1 Ma----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~ 75 (342) ++ |.-....+|+.+.+.+.+...+...+.+ ++..-+. +.+...+.+|.+.. ...+..+.|+..++. +. T Consensus 120 ~~~~~~t~~g~~~iP~~~~~~ii~~~~~~~~l~~--~~~~~~~---~~~~~~~~~~~~~~--~~~~~~v~Eg~~~~~~~~ 192 (415) T protein:vir:47 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVEFNLDK--YVTVKRV---TNGSGKYPVVRQSE--VAALEKVEELEENPELAV 192 (415) T ss_pred hhccccccCCcccccHHHHHHHHHHHHhhhhhhh--hcceeec---cCCceeEEEEEecC--Ccceeecccccccccccc Confidence 11 2345678999998888888777777754 2211111 11112344444432 356667888888875 34 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+-.+-....++.+..+.++++...-+.-|..+.+.+++++.+.+..+..++.-+. ........ ......... 
T Consensus 193 ~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~il~g~g------~g~~~~~~-~~~~~~~~~ 265 (415) T protein:vir:47 193 KPFFQLAYDINTHRGYFRISREAIEDAKVNVLQELKLWMARTIAATRNKAIIDVIT------KGSTGSTS-SGFEKEGKK 265 (415) T ss_pred cceeeEEeeeeeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc------cCCccccc-cccccccce Confidence 45555566667777777777766544555778889999999999999988776331 00000000 111111122 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) .......+++.|.++...+.+....-.+|+||+..+..|++. ++++++++.... ..+..-++++| T Consensus 266 ~~~~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l------kd~~G~~i~~~~---------~~~~~~~~l~G 330 (415) T protein:vir:47 266 LEVKKAKSLDDIKDAINLNVKPNYEHNVAIVSQTMFAKLDKM------KDKLGNYLIQPD---------VKEKTQQRLLG 330 (415) T ss_pred eccccccchHHHHHHHHhhhhhccCCCEEEEcHHHHHHHHHh------hccCCCeeeccC---------cCCCCCccccc Confidence 234456788999999988888777788999999999998763 445544433211 11233468999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cce---eeecCc Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGA---KWAVTT 307 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~---s~~~~~ 307 (342) ++|++++.+|....+. ..++|+ .-++.+....++.++..+.... ++.+..-.++.+.| ..+ +++.+. T Consensus 331 ~pV~~~~~~~~~~~~~----~~~~~gd~~~~~~~~~~~~~~v~~~~~~~~-~~~~~~~~r~d~~v~~~~a~~~~~~~~~~ 405 (415) T protein:vir:47 331 AKIEILPDEVLGQKGN----NTLIIGNLKDAIVLFDRSQYQASWTDYMHF-GECLMIAVRQDCRILDYKSAIVIEYDDSE 405 (415) T ss_pred eeeEEeccccccCCCc----cEEEEEehhccEEEEeecceEEEeeccccC-ceEEEEEEEeccEEeccccEEEEEeeccC Confidence 9999999998654322 235665 3345556656667766553322 23444444444443 333 222221 Q ss_pred CCcChHHhcC Q lcl|NC_020854. 
308 TNPTRAQLET 317 (342) Q Consensus 308 ~sPt~~~L~~ 317 (342) .-|-+--|+. T Consensus 406 ~~~~~~~~~~ 415 (415) T protein:vir:47 406 RGEGDLGLEA 415 (415) T ss_pred CCCCCccCCC Confidence 1122222222 No 67 >protein:vir:97053 Length: 390 # NCBI annotation: putative head protein # Family: family:all:585 # MgeID: mge:1653 # MgeName: OP1 # Cross-refs: genbank:acc:YP_453565;genbank:gi:84662600;genbank:GeneID:5142468 Probab=99.31 E-value=1.4e-12 Score=85.63 Aligned_cols=268 Identities=8% Similarity=0.019 Sum_probs=162.0 Q ss_pred Cc---cee-ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhc Q lcl|NC_020854. 1 MA---TLR-SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKI 76 (342) Q Consensus 1 Ma---T~~-~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~l 76 (342) +. |.- ..++.|+.... +.+...+...+.+ ++...+ .++..+++|.+.. .++.+..+.|++.++..++ T Consensus 113 ~~~~~~~~~g~lip~~~~~~-ii~~~~~~~~i~~--~~~~~~-----~~~~~~~~~~~~~-~~~~a~~v~Eg~~~~~~~~ 183 (390) T protein:vir:97 113 ASTDAAGSAGALTTPNRLPG-FITPPDARLTVRD--LIGSGR-----TDSALIEYVQETG-FVNNAAIVAEGALKPESSL 183 (390) T ss_pred hhcccccccccccchhhhHH-HHHHHhhhhhhHh--hcceee-----ccCCceEEEEEec-CCcceeeecCCcccccccc Confidence 22 222 33555555555 4444555555543 222211 2456788888763 3467788999999999999 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +..+-....++.+.-..++++...-+ .+....+.+++++.++++++..+|. | ....+.-.+.+.......... 
T Consensus 184 ~~~~i~~~~~k~~~~~~is~ell~ds-~~l~~~i~~~la~a~~~~~d~a~l~---G---~g~~~~p~Gi~~~~~~~~~~~ 256 (390) T protein:vir:97 184 KFAKKTDTTHVIAHTMKATRQILSDA-PQLASYMNNRLIRGLKVKEDAEILR---G---TGANDGLLGLIPQATTYAAPT 256 (390) T ss_pred ceeEEEEeeeeEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHhh---c---CCCCccccceeeccccccccc Confidence 88888888888888888888754333 4677779999999999999987764 2 111110011111111111122 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) .......++.+.++...+.+..-...+|+|||.++..|++. +++++.++.. .. ....-++++|+ T Consensus 257 ~~~~~~~~d~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l------kd~~G~~l~~--------~~--~~~~~~~l~G~ 320 (390) T protein:vir:97 257 TIAGATRVDQLRLAMLQASLAEYPASGIVINPIDWAAIELA------KDANNQYLIG--------NA--RGTLTPTLWGL 320 (390) T ss_pred cccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh------hcCCCceeec--------Cc--cCCCCceecce Confidence 23345567889999988888877888999999999999864 3454443321 11 12334688999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCC---CcceeEEEEeeEEEeeecc-eeeecCcCC Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDI---LAKSDAMSIDLHYVYHPVG-AKWAVTTTN 309 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~---~~g~~~l~~r~~y~~~~~G-~s~~~~~~s 309 (342) +|+++|.+|-. +++++ ..++.+.....+.++..+.. 
..+...+....+|...|+- -+|.....+ T Consensus 321 pV~~~~~~~~~---------~~~~gd~~~~~~~~~~~~~~i~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~v~~~~a 390 (390) T protein:vir:97 321 PVVATQAMAPG---------EFLVGAFDLAAQIFDQWDARVEIGYVNDDFQRNMVTVLAEERLALVVYRPEALITGSFA 390 (390) T ss_pred eeEEcCCCCCC---------cEEEEeccceEEEEEecceEEEEeecccccccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 99999998732 13333 23455666666777765532 3455567777777666543 122222222 No 68 >protein:vir:8102 Length: 543 # NCBI annotation: gp6 # Family: family:all:21 # MgeID: mge:152 # MgeName: Che9c # Cross-refs: genbank:acc:NP_817683;genbank:gi:29566114;genbank:GeneID:1259308 Probab=99.31 E-value=1.2e-12 Score=86.03 Aligned_cols=275 Identities=9% Similarity=0.044 Sum_probs=158.5 Q ss_pred Cccee--c-cccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MATLR--S-DIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT~~--~-d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) +.++. . .+|.+++....+...+.+.+.+.+ ++ ... ..+|+ +.+|.-. ..+.+..+.|+..++..+++ T Consensus 251 ~~~t~~~gg~lip~~~~~~ii~~~~~~~~~l~~--~~----~~~-~~~g~-~~~~~~~--~~~~a~~v~Eg~~~~~~~~~ 320 (543) T protein:vir:81 251 MGLTKADGGYLVPFQLDPTVIITSNGSLNDIRR--FA----RQV-VATGD-VWHGVSS--AAVQWSWDAEFEEVSDDSPE 320 (543) T ss_pred cccccccCcccCchhhhhHHHHHHHhhhchhhh--hc----ccc-cCCcc-eEEEEec--CCcceeecccCccccccccc Confidence 22222 2 234445555555555555454533 21 111 12343 4567543 34677789999999999998 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH------HHHHHHhhhcccccchhheeeec Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLS------CLQGVFGSLNANTSSSAFFDLCI 151 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla------~L~g~~~~~~a~~~~~~~~~~~~ 151 (342) -++-....++.+.-+.++.....-+ .|....+.++++..+++..+..+|. ...|++..... .. 
T Consensus 321 ~~~i~~~~~k~~~~~~is~ell~d~-~~~~~~i~~~l~~~~~~~~d~ail~G~Gt~~~p~Gi~~~~~~----------~~ 389 (543) T protein:vir:81 321 FGQPEIPVKKAQGFVPISIEALQDE-ANVTETVALLFAEGKDELEAVTLTTGTGQGNQPTGIVTALAG----------TA 389 (543) T ss_pred cceeeeeeeeeEeeehhhHHHHhcc-HHHHHHHHHHHHHHHHHHHHHHHhccCCCCcccccchhhccc----------cc Confidence 8888888888888888888765434 4888889999999999999887763 22333221100 01 Q ss_pred ccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccccccee Q lcl|NC_020854. 152 DSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVP 231 (342) Q Consensus 152 ~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~ 231 (342) ......+...++++.+.++...+......-.+|+||+.++..|++. +++++.+++. . ...+.-+ T Consensus 390 ~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~l~~l------kd~~G~~l~~--~--------~~~g~~~ 453 (543) T protein:vir:81 390 AEIAPVTAETFALADVYAVYEQLAARHRRQGAWLANNLIYNKIRQF------DTQGGAGLWT--T--------IGNGEPS 453 (543) T ss_pred ccccccccccccHHHHHHHHHhhhccccCCcEEEEcHHHHHHHHHh------hcCCCceecc--C--------cCCCCCc Confidence 1112234456788999999888776666667899999999999864 3444433321 1 1122346 Q ss_pred eeccceEEEeCCcceeccC--CCcceEEEEEec-ceeEeecCCcceeEeccCCCc------ceeEEEEeeEEEeeeccee Q lcl|NC_020854. 232 TYMGLRVIVSDDVNTAGSG--GSTEYATYFFTQ-GAVASGEQMAMQTETDRDILA------KSDAMSIDLHYVYHPVGAK 302 (342) Q Consensus 232 ~~~G~~VvvdD~~p~~~~~--~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~~------g~~~l~~r~~y~~~~~G~s 302 (342) +++|++|+++|.||..... +.+.+ .++|+. -.+.++...++.+++++.... +...+....++.+.+ T Consensus 454 ~l~G~pv~~~~~~~~~~~~~~~~~~~-~i~~gd~~~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~r~d~~v---- 528 (543) T protein:vir:81 454 QLLGRPVGEAEAMDANWNTSASADNF-VLLYGNFQNYVIADRIGMTVEFIPHLFGTNRRPNGSRGWFAYYRMGADV---- 528 (543) T ss_pred cccceeeEEeccccccccccccCCcc-eEEEeeccceeEEeecccEEEEeccccccchhhcCceEEEEEEeeccEe---- Confidence 8999999999999875422 22332 344432 123345555677777665433 233444444444332 Q ss_pred eecCcCCcChHHhcCCcCceeecCccccceEEEEecC Q lcl|NC_020854. 
303 WAVTTTNPTRAQLETVANWSKVYELKNIGIVRATNVS 339 (342) Q Consensus 303 ~~~~~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~ 339 (342) .+++++.++.+.+.+ T Consensus 529 ----------------------~~~~A~~~l~~~~~a 543 (543) T protein:vir:81 529 ----------------------VNPNAFRLLNVETAS 543 (543) T ss_pred ----------------------ecccceEEEEecccC Confidence 112222222222222 No 69 >protein:vir:100135 Length: 418 # NCBI annotation: gp5 # Family: family:all:585 # MgeID: mge:1639 # MgeName: phi1026b # Cross-refs: genbank:acc:NP_945035;genbank:gi:38707895;genbank:GeneID:2744182 Probab=99.31 E-value=1.9e-12 Score=84.80 Aligned_cols=267 Identities=7% Similarity=0.011 Sum_probs=162.4 Q ss_pred Ccce--eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhccc Q lcl|NC_020854. 1 MATL--RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITA 78 (342) Q Consensus 1 MaT~--~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~ 78 (342) +.+. -...++|+.+..-+.+...+...|.+ ++..-+ .+|..+++|.+.. .++.+..+.|++.++..+.+- T Consensus 136 ~~~~~~~~g~lvp~~~~~~ii~~~~~~~~l~~--~~~~~~-----~~~~~~~~~~~~~-~~~~a~~v~E~~~~~~~~~~f 207 (418) T protein:vir:10 136 VGSGVSGSNSLVVADRQAGIIAPPQRKMTIRD--LLMPGQ-----TSSSSIEYTVETG-FTNNAAAVAEGAQKPTSDLKF 207 (418) T ss_pred ccCCCCCCccccchhHHHHHHHHHhhhhhHHh--hcceee-----ccCCceeEEEEec-CCCceeeeccCccccccccce Confidence 3332 34456777777666667777666654 222211 3466788888764 245666789999998888877 Q ss_pred ceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccc Q lcl|NC_020854. 79 DKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDT 158 (342) Q Consensus 79 ~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~ 158 (342) .+-....++.+....+++.....+ .+..+.+.+++++..++..++.+|. | ......-.+.+............ 
T Consensus 208 ~~v~~~~~k~~~~~~is~ell~ds-~~l~~~i~~~l~~a~~~~~d~a~l~---G---~g~~~~p~Gi~~~~~~~~~~~~~ 280 (418) T protein:vir:10 208 NLKNQPVRTIAHLFKASRQILDDA-PALQSYIDGRARYGLQLTEEGQILK---G---DGTGANILGILPQASAFMPSITL 280 (418) T ss_pred eeEEEeeeeEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHhc---c---CCCCccccccccccccccccccc Confidence 777777777777777777654434 4777779999999999998887763 1 11111001111111111112222 Q ss_pred cccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceE Q lcl|NC_020854. 159 PTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRV 238 (342) Q Consensus 159 ~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~V 238 (342) .....++.+.++...+........+|+||+.++..|++. +++++..+.. .. ....-++++|++| T Consensus 281 ~~~~~~~~i~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l------kd~~G~~i~~--------~~--~~~~~~~l~G~pV 344 (418) T protein:vir:10 281 ANATPIDKIRLALLQAVLAEFPATGIVLNPIDWASIELT------KDSQGRYIVG--------NP--VNGTTPRLWNLPV 344 (418) T ss_pred cccccHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh------hcCCCceecc--------cc--ccCCCceecceee Confidence 334467888999888877777778999999999998764 3454433321 11 1233568999999 Q ss_pred EEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeee---cceee---e-- Q lcl|NC_020854. 239 IVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHP---VGAKW---A-- 304 (342) Q Consensus 239 vvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~---~G~s~---~-- 304 (342) ++++.+|-. .++|+. -++.+.....+.++.++... .+...+....++.+.+ ..|.+ + T Consensus 345 ~~~~~~p~~---------~~~~gd~s~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~~~d~~~~~~~a~~~~~~~~~ 415 (418) T protein:vir:10 345 VETQAMTAN---------EFLVGAFSMAAQIFDRMEIEVLLSTENVDDFEKNMVSIRAEERLALAVYRPESFVTGALVEQ 415 (418) T ss_pred EEcCCCCCC---------cEEEeeccceEEEEEecceEEEEecccchhhhcCceEEEEEEeeccEEecccceEEEEeccC Confidence 999999732 134443 23445556667777666542 4445566666665543 44544 1 Q ss_pred cCc Q lcl|NC_020854. 
305 VTT 307 (342) Q Consensus 305 ~~~ 307 (342) .+| T Consensus 416 ~~g 418 (418) T protein:vir:10 416 AGG 418 (418) T ss_pred CCC Confidence 223 No 70 >protein:vir:4456 Length: 401 # NCBI annotation: Major capsid protein precursor # Family: family:all:21 # MgeID: mge:96 # MgeName: ST64B # Cross-refs: genbank:acc:NP_700379;genbank:gi:23505451;genbank:GeneID:955658 Probab=99.31 E-value=7.5e-13 Score=87.05 Aligned_cols=283 Identities=8% Similarity=-0.040 Sum_probs=160.4 Q ss_pred Ccce-ec--cccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhc- Q lcl|NC_020854. 1 MATL-RS--DIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKI- 76 (342) Q Consensus 1 MaT~-~~--d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~l- 76 (342) |++. .+ -..+|+.+.+-+.+.+.+.+.|.+ ++..- ..+|....+|.... ...+.-+.|+..++.... T Consensus 107 ~~~~~~~~GG~~iP~~~~~~ii~~~~~~~~l~~--~~~~~-----~~~~~~~~~~~~~~--~~~a~wv~E~~~~~~~~~~ 177 (401) T protein:vir:44 107 LQVGTDEDGGYAVPEELDRSILSLLKDEVVMRQ--EATVI-----TVGGSDYKKLVNLG--GTASGWVGETDTRSQTATS 177 (401) T ss_pred hhcCCCCCCceeccHhHHHHHHHHHHhhhhhhh--hceee-----ecCCCceEEEEecC--CccceeeccccccCccccc Confidence 5532 22 357788877766666666555543 22211 12355566676542 233445678877764443 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH-----HHHHHHhhhcccccchhheeeec Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLS-----CLQGVFGSLNANTSSSAFFDLCI 151 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla-----~L~g~~~~~~a~~~~~~~~~~~~ 151 (342) +-.+-....++.+.-..++++...-+.-|..+.+.+++++.+.++.+..+|. ..+|++................. 
T Consensus 178 ~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~ai~~~~~~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~ 257 (401) T protein:vir:44 178 RLGLIEPFMGEIYGNPQATQKMLDDAFFNVEAWINSELATEFAEQEEIAFTTGDGTKKPKGFLAYESTEESDKARAFGKL 257 (401) T ss_pred cceeeeeehhheeeehhhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhhccCCCCccceeeccccccccccccccccc Confidence 3343344555555555666655444455778889999999999998887774 11122211100000000000000 Q ss_pred ccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccccccee Q lcl|NC_020854. 152 DSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVP 231 (342) Q Consensus 152 ~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~ 231 (342) ..........++++.+.++...+......-.+|+||+..+..|+++ ++++++++.... ...+... T Consensus 258 ~~~~t~~~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~L~~l------kd~~G~~l~~~~---------~~~g~~~ 322 (401) T protein:vir:44 258 QHIVSGEATAVTADAIIKLIYTLRKAHRTGAKFMMNNNSLFAIRLL------KDTEGNYLWRPG---------LELGQPS 322 (401) T ss_pred cccccccccccCHHHHHHHHHhcchhhhcCCEEEEcHHHHHHHHHh------hccCCceeecCC---------cCCCCCc Confidence 1111224456789999999988877666677899999999999864 345544432211 1123346 Q ss_pred eeccceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecc-eeeecCcC Q lcl|NC_020854. 232 TYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVG-AKWAVTTT 308 (342) Q Consensus 232 ~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G-~s~~~~~~ 308 (342) +++|++|+++|.+|..+++. .+++|+. -++.+.....+.+++++....+...+.+..++...|.- -.|..-.. T Consensus 323 ~l~G~PVv~~~~~p~~~~~~----~~i~~Gd~~~~~~i~~~~~~~~~~~~~~~~~~v~~~a~~r~d~~~~~~~a~~~l~~ 398 (401) T protein:vir:44 323 SLAGYGIAENEQMPDIAADA----KAIAFGNFKRGYTIVDRIGTRILRDPYTNKPFVGFYTTKRTGGMLVDSQAIKLLKI 398 (401) T ss_pred eecceeeEEecCcCCccCCc----cEEEEeehhccEEEEEecceEEeeeccccCCcEEEEEEEEeccEEecccceEEEEe Confidence 79999999999999765432 2344442 34555555567788888777777778887777666532 12211100 Q ss_pred CcC Q lcl|NC_020854. 
309 NPT 311 (342) Q Consensus 309 sPt 311 (342) ..+ T Consensus 399 ~aa 401 (401) T protein:vir:44 399 AAA 401 (401) T ss_pred ecC Confidence 111 No 71 >protein:vir:4953 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:108 # MgeName: Sfi19 # Cross-refs: genbank:acc:NP_049929;genbank:gi:9632900;genbank:GeneID:1262076 Probab=99.30 E-value=2.2e-12 Score=84.45 Aligned_cols=272 Identities=11% Similarity=0.041 Sum_probs=170.6 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hhc Q lcl|NC_020854. 1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GKI 76 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~l 76 (342) |++ .-...++|+.+...+.+...+.+.|.+ ++...+. ... -|. +.+|.+.. ..+.+..+.|++.++. +++ T Consensus 109 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~--~~~~~~~-~~~-~~~-~~~~~~~~-~~~~a~~v~E~~~~~~~~~~ 182 (397) T protein:vir:49 109 KTDASGSDAGLTIPQDIQTAIHTLVSQYDSLQE--YVNVENV-TTL-TGS-RVYEKWTD-ITGLANIDDEAGKIADVDDP 182 (397) T ss_pred hhccccccCcccccHhHHHHHHHHHHhhhhHHh--hhceeec-ccC-ccc-eEEEeecc-CCcceeeecCcccccccccc Confidence 542 235678899998888877777776644 2222111 111 133 34565542 3466788999999885 566 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +-.+-....++.+.-+.++++...-+.-|..+.+.+++++.+++..+..++.-.. .. . 
T Consensus 183 ~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~ai~~G~g------~~----------------~ 240 (397) T protein:vir:49 183 KLSLIKYTIKRYAGISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILEAIA------AL----------------P 240 (397) T ss_pred ceeeEEeeeeeEEeeehhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHHhhcc------cc----------------c Confidence 7777777777888777888776555556778889999999999999887765321 00 0 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) ......+++.+.++...+-.....-++|+||+..+..|+++ ++++++++.. .. .....-++++|+ T Consensus 241 ~~~~~~~~d~i~~~~~~l~~~~~~~a~~vmn~~~~~~l~~l------kd~~G~~l~~--~~-------~~~~~~~~l~G~ 305 (397) T protein:vir:49 241 TKPTLTKWDDIIDLEAKVDPAIKQTSFFLTNTSGFTALKKV------KNALGDYLME--RD-------VKSPTGYSIDGF 305 (397) T ss_pred cccccccHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHh------hcCCCceeec--cC-------cCCCCCceecce Confidence 12234578889999888877777788999999999999875 3454443321 11 112345789999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCC----CcceeEEEEeeEEEeeec---ceee---e Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDI----LAKSDAMSIDLHYVYHPV---GAKW---A 304 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~----~~g~~~l~~r~~y~~~~~---G~s~---~ 304 (342) +|++.+..+...... +. ..++|+ ..++.+....++.+++++.. ..+...+....++.+.++ .+.. + T Consensus 306 PV~~~~~~~~~~~~~-~~-~~i~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~ 383 (397) T protein:vir:49 306 AVKEVADRWLANGTG-GA-MPLYFGDLKQAVTLFDRQHMSLLSTNIGGGAFETDTTKVRVIDRFDVVATDTEAFVPASFK 383 (397) T ss_pred eeEEecccccccccC-Cc-eeEEEeeccceEEEEeecceEEEEeccccchhhcCceeEEEEeeeCcEEecccceEEEEee Confidence 999877654433222 22 245555 34566666777888877743 245566777777766653 3332 2 Q ss_pred c-CcCCcChHHhcC Q lcl|NC_020854. 
305 V-TTTNPTRAQLET 317 (342) Q Consensus 305 ~-~~~sPt~~~L~~ 317 (342) . +...|+..-++- T Consensus 384 ~~~~~~~~~~~~~~ 397 (397) T protein:vir:49 384 AIADQKGNLGSTAV 397 (397) T ss_pred cccCCCCCcccccC Confidence 1 222333333333 No 72 >protein:vir:3364 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:67 # MgeName: T3 # Cross-refs: genbank:acc:NP_523335;genbank:gi:17570826;genbank:GeneID:927448 Probab=99.29 E-value=3.6e-13 Score=88.77 Aligned_cols=286 Identities=14% Similarity=0.122 Sum_probs=176.1 Q ss_pred Cc---------cee--------cc-ccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcc Q lcl|NC_020854. 1 MA---------TLR--------SD-IIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDF 62 (342) Q Consensus 1 Ma---------T~~--------~d-~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda 62 (342) || |+. .+ +++ |+|...|...+.+.+.|... + ...++ .+|+++.||... .... T Consensus 1 ~~~~~~~~~~~t~~g~~~~~~~~~al~i-e~~~g~V~~~f~~~s~~~~~--v-~~r~~---~~G~sv~i~~iG---~~t~ 70 (347) T protein:vir:33 1 MANIQGGQQIGTNQGKGQSAADKLALFL-KVFGGEVLTAFARTSVTMPR--H-MLRSI---ASGKSAQFPVIG---RTKA 70 (347) T ss_pred CCCCccCcccccccccCCcccchHHHHH-HHHHHHHHHHHHHHHhhhhh--h-ccccc---cccceeEeeecc---ceee Confidence 55 221 13 677 99999999999888888552 2 22233 359999999764 3667 Q ss_pred cccCCCceec--hhhcccceeeeEeee-eccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_020854. 63 EVLSDSSSLT--PGKITADKQVAAILH-RGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNA 139 (342) Q Consensus 63 ~~~~~~~~i~--~~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a 139 (342) +.+..|+.++ +..+...+..-++=. 
.--.+.+.|+-..-+..|++.++.++.+...+++.|+.++..|..+.....+ T Consensus 71 ~~~~~g~~l~~~~~~~~~~e~~ltiD~~~y~~~~VddiD~~q~~~D~~~~~~~~~g~aLA~~~D~~i~~~l~~~~~~~~~ 150 (347) T protein:vir:33 71 AYLKPGENLDDKRKDIKHTEKVIHIDGLLTADVLIYDIEDAMNHYDVRAEYTAQLGESLAMAADGAVLAELAGLVNLPDG 150 (347) T ss_pred eeecCCCCCCCCCCCCccceEEEEechhhhhhHHHhhHHHHhcCCchhHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcc Confidence 7888888875 455666666555433 2335778899888888999999999999999999999998877544321111 Q ss_pred cc-cchhheee----eccccccc-cc----ccccHHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhhh Q lcl|NC_020854. 140 NT-SSSAFFDL----CIDSESGD-TP----TALSPRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTAD 207 (342) Q Consensus 140 ~~-~~~~~~~~----~~~~~~~~-~~----~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s~ 207 (342) .. ....+... .....++. .+ +..-++.|.+|..+|.+.. ..=..++|.|..|..|.+..-+ . ..+ T Consensus 151 ~~~~~~~~~~~~~~~~~~~~tg~~~d~~~~a~~i~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~--~-~~d 227 (347) T protein:vir:33 151 SNENIEGLGKPTVLTLVKPTTGSLTDPVELGKAIIAQLTIARASLTKNYVPAADRTFYTTPDNYSAILAALMP--N-AAN 227 (347) T ss_pred cccccccccccccccccccccccccchhhhHHHHHHHHHHHHHHHhhcCCCccCcEEEeCHHHHHHHhccccc--c-ccc Confidence 00 00000000 00000100 00 1112456777777887642 2347899999999999876422 1 111 Q ss_pred cccceeeeccceeecccccccceeeeccceEEEeCCcceeccCCC------c------------------ceEEEEEecc Q lcl|NC_020854. 208 ARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGS------T------------------EYATYFFTQG 263 (342) Q Consensus 208 ~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~------~------------------~y~t~l~~~G 263 (342) .. ... ....+.++.++|.+|+.+..+|....... + ...++++-+. 
T Consensus 228 ~~------~~~-----~~~~G~V~~i~G~~V~~Sn~lp~~~~~~~~~~~~ag~~~~~~~~~~~~~~~a~~~~~gl~~h~~ 296 (347) T protein:vir:33 228 YQ------ALL-----DPERGTIRNVMGFEVVEVPHLTAGGAGDTREDAPADQKHAFPATSSTTVKVALDNVVGLFQHRS 296 (347) T ss_pred cc------ccc-----ccccceeEEEeceeEEEecccccCccccccccccccccccccCCcccceeccccceeeeeecch Confidence 11 001 12245689999999999999986321000 0 0124567788 Q ss_pred eeEeecCCcceeEeccCCCcceeEEEEeeEEEeeec---c-eeeecCcCCc Q lcl|NC_020854. 264 AVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPV---G-AKWAVTTTNP 310 (342) Q Consensus 264 Ai~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~---G-~s~~~~~~sP 310 (342) |++.....++.+|..|+.....+.+...+.|+..+. + ..+.-.+.|- T Consensus 297 A~g~v~~~~~~~e~~r~~~~~~d~i~~~~~~G~~vlrP~~av~i~~~~~~~ 347 (347) T protein:vir:33 297 AVGTVKLKDLALERARRANYQADQIIAKYAMGHGGLRPEAAGAIVLPKVSE 347 (347) T ss_pred hheeeeeeceeeeeccchhhhhHhhhhhhhcCCceecccceEEEecCCCCC Confidence 888888888999999998877777666666655432 2 2222222222 No 73 >protein:vir:94576 Length: 347 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1516 # MgeName: Berlin # Cross-refs: genbank:acc:YP_919012;genbank:gi:119637776;genbank:GeneID:5179336 Probab=99.28 E-value=2.2e-13 Score=89.97 Aligned_cols=282 Identities=14% Similarity=0.094 Sum_probs=179.8 Q ss_pred Cc---------ce--------ecc-ccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcc Q lcl|NC_020854. 1 MA---------TL--------RSD-IIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDF 62 (342) Q Consensus 1 Ma---------T~--------~~d-~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda 62 (342) || |+ ..| |+. |+|...|...+.+.+.|... +.. -++ .+|+++.+|+.. 
...+ T Consensus 1 ma~~~~~~~~~t~~g~~~~~~d~~al~i-e~~~geV~~~f~~~s~~~~~--~~~-rti---~~G~sv~~~~iG---~~~~ 70 (347) T protein:vir:94 1 MANMNGGQQMGKDQGKGMSAGDKLALFL-KVFGGEVLTAFTRTSVTMNK--HLV-RSI---QSGKSAQFPVLG---RTKA 70 (347) T ss_pred CCccccccccccccccCCcccchHHHHH-HHHhHHHHHHHHHHHhhhhh--hhh-eec---cccceEEeeecc---ceeE Confidence 44 22 223 566 99999999999999988653 221 133 469999999754 3566 Q ss_pred cccCCCceec--hhhcccceeeeEeeee-ccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_020854. 63 EVLSDSSSLT--PGKITADKQVAAILHR-GRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNA 139 (342) Q Consensus 63 ~~~~~~~~i~--~~~lt~~~~~a~i~~~-~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a 139 (342) +.+.+|+++. .+.+...+.+-+|=.. --.+.+.|+-...+..|++.++.++.+...+++.|+.++..|.-.-+...+ T Consensus 71 ~~~~~G~~l~~~~~~~~~~e~~ltID~~~y~~~~VddiD~~q~~~D~rs~~~~~~g~ALA~~~D~~i~~~l~~~a~~~~~ 150 (347) T protein:vir:94 71 AYLQPGENLDDKRKDMKHTEKTINIDGLLTADVLIYDIEDAMNHYDVRSEYTAQLGESLAMAADGAVLAEMAKLCNLPTA 150 (347) T ss_pred eeeecCcCCCCCcCCccccceEEEEcchhhhhhhhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccc Confidence 7788998874 3567777777655443 445778899988888999999999999999999999888765322111000 Q ss_pred ccc---c-hhheeeecccccc-ccccccc----HHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhhhc Q lcl|NC_020854. 140 NTS---S-SAFFDLCIDSESG-DTPTALS----PRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTADA 208 (342) Q Consensus 140 ~~~---~-~~~~~~~~~~~~~-~~~~~~~----~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s~~ 208 (342) ... + ..-..+....... ......+ ++.|.+|..+|.++. ..-..+++.|+.|..|.+.....+ .+. 
T Consensus 151 ~~~~~~g~~~~~~v~i~~~~~~~~~~~~~~~~~~d~i~~a~~~Lde~dVP~~~R~~vv~P~~y~~LLk~~~~~~---~~~ 227 (347) T protein:vir:94 151 NNENIAGLGKAHVLEVGDQATLQGDQVKLGQAIIAQLTLARAKLTGNYVPSSDRVFYTTPDNYSAILAALMPNA---ANY 227 (347) T ss_pred cccccccCCcceeEeeeccccccccccccHHHHHHHHHHHHHHhhhcCCCCCCCEEEeChHHHHHHHHhhcccc---ccc Confidence 000 0 0000001111000 0111122 455666667766542 235788999999999987422111 111 Q ss_pred ccceeeeccceeecccccccceeeeccceEEEeCCcceec--------------------cCCCcce-------EEEEEe Q lcl|NC_020854. 209 RGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAG--------------------SGGSTEY-------ATYFFT 261 (342) Q Consensus 209 ~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~--------------------~~~~~~y-------~t~l~~ 261 (342) ... + ....+.++.++|.+|+++..+|... ....++| ..+++- T Consensus 228 ~~~------~-----~~~~G~V~~v~G~~V~~Sn~~p~~~~~~~~~~~~~~~~~~~~~~~~~~~~~y~~d~~~~~~l~~~ 296 (347) T protein:vir:94 228 QAL------I-----DPSTGSIRNVMGFEVIEVPHLTAGGAGDNRAEEGVAPTNQKHAFPDTASGDTRVALDNVVGLFNH 296 (347) T ss_pred ccc------c-----ccccceeEEeeceEEEEcCccccccCcccccccccccccccccccccccccccccccceEEEEec Confidence 111 0 1124679999999999999998642 1112334 357777 Q ss_pred cceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecce------eeecC Q lcl|NC_020854. 
262 QGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGA------KWAVT 306 (342) Q Consensus 262 ~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~------s~~~~ 306 (342) +-|++.....++.+|+.|+.....+.|.+++.|+..|+== .++.+ T Consensus 297 ~~A~~tv~~~~~~~e~~~~~~~~~~~i~~~~a~G~g~~rPe~a~~i~~~~a 347 (347) T protein:vir:94 297 RSAVGTVKLKDMALERARRANFQADQIIAKYAMGHGGLRPEACGALVFKKA 347 (347) T ss_pred hhhhhhhhhcccceeeeechhhhhhhhhhhhhhcCcccccceeEEEEecCC Confidence 8888888888999999999999999999998888876431 22222 No 74 >protein:vir:485 Length: 407 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:11 # MgeName: P27 # Cross-refs: genbank:acc:NP_543092;swissprot:trembl:q8w627;genbank:gi:18249904;uniprot:Q8W627;genbank:GeneID:929693 Probab=99.28 E-value=2.1e-12 Score=84.55 Aligned_cols=284 Identities=8% Similarity=-0.017 Sum_probs=165.1 Q ss_pred Ccc-ee--ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhc- Q lcl|NC_020854. 1 MAT-LR--SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKI- 76 (342) Q Consensus 1 MaT-~~--~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~l- 76 (342) |.+ +. .-.++|+.+..-+.+.+.+.+.+.+ ++..-+ .++..+.+|.... ...+.-+.|++.++..+. T Consensus 106 ~~~~t~~~gG~~iP~~~~~~I~~~~~~~~~l~~--~~~~~~-----~~~~~~~~~~~~~--~~~a~~v~E~~~~~~~~~~ 176 (407) T protein:vir:48 106 LQVGNDEDGGYAIPEELDRTILTLLKDEVVMRQ--EATVIT-----LGGSDYKKLVNLG--GTTSGWVGETDARPETATS 176 (407) T ss_pred hhcccCCCCcccccHhHHHHHHHHHHhhhhhhh--hceeee-----cCCCceEEEEecC--Ccceeeecccccccccccc Confidence 442 22 2357899988888777777665543 222111 2344667776542 355556788888875554 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-----HHHHHhhhcccccchhheee-e Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSC-----LQGVFGSLNANTSSSAFFDL-C 150 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~-----L~g~~~~~~a~~~~~~~~~~-~ 150 (342) +..+.....++.+.-+.++++...-+..|..+.+.++++..+.++.+..++.- -+|++....... ....... . 
T Consensus 177 ~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~~~a~l~G~G~~~p~Gil~~~~~~~-~~~~~~~~~ 255 (407) T protein:vir:48 177 KLGLIEPFMGEIYGNPQATQKMLDDAFFNVEDWINSELALEFAEQEEIAFTSGDGSKKPKGFLAYESTDE-DDKTRAFGK 255 (407) T ss_pred cceeEEeeeeeeEeehhhHHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhhccCCCCccceeeecccccc-ccccccccc Confidence 34444555556666667776665555668888899999999999988876530 011111000000 0000000 0 Q ss_pred cccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccce Q lcl|NC_020854. 151 IDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSV 230 (342) Q Consensus 151 ~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i 230 (342) ...........++++.|.+....+......-.+|+||+..+..|+++ ++++++++.. .. ...+.. T Consensus 256 ~~~~~~~~~~~~~~d~i~~l~~~l~~~~~~~a~~v~n~~~~~~L~~l------kD~~Gr~l~~--~~-------~~~g~~ 320 (407) T protein:vir:48 256 LQHIASGAASGVTADAIIKLIYTLRKAHRSGAKFMMNNSSLFAIRLL------KDNDGNYLWR--PG-------IELGQP 320 (407) T ss_pred ccccccccccccChHHHHHHHHhhchhhhcCCEEEEcHHHHHHHHHh------hccCCceeec--cC-------cCCCCC Confidence 00112233456788999999887776666667899999999998764 3455443321 11 112334 Q ss_pred eeeccceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cceee-- Q lcl|NC_020854. 231 PTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGAKW-- 303 (342) Q Consensus 231 ~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~s~-- 303 (342) ++++|++|+++|.||..+.++ .+++|+. .++.+.....+.+++++....+...+....++.+.| ..|.. T Consensus 321 ~~l~G~PV~~~~~~p~~~~~~----~~i~~Gd~~~~~~i~~~~~~~i~~d~~~~~~~~~~~~~~r~d~~v~~~~a~~~l~ 396 (407) T protein:vir:48 321 SSLAGYGIVENEQMPDIAADA----KAIAFGNFKRGYTIVDRIGTRILRDPYTNKPFVGFYTTKRTGGMLVDSQAIKLMK 396 (407) T ss_pred ceecceeeEEecCcCCccCCc----cEEEEEeccccEEEEEeeceEEEeeccccCCcEEEEEEEEeccEEecccceEEEE Confidence 689999999999999754332 2344442 344555555677888877777888888888876665 33332 Q ss_pred -ecCcCCcChH Q lcl|NC_020854. 
304 -AVTTTNPTRA 313 (342) Q Consensus 304 -~~~~~sPt~~ 313 (342) +.+..+-.-+ T Consensus 397 ~~aa~~~~~~~ 407 (407) T protein:vir:48 397 IGAATRQKAAA 407 (407) T ss_pred eeccCCCCCCC Confidence 2221111111 No 75 >protein:vir:81070 Length: 390 # NCBI annotation: p09 # Family: family:all:585 # MgeID: mge:1889 # MgeName: Xop411 # Cross-refs: genbank:acc:YP_001285679;genbank:gi:148727187;genbank:GeneID:5247115 Probab=99.28 E-value=2.7e-12 Score=83.98 Aligned_cols=268 Identities=7% Similarity=0.018 Sum_probs=162.0 Q ss_pred Cccee---ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MATLR---SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT~~---~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) +.+.. .-++.||....++ +...+.+.+.+ ++.. + ..++..+++|.+.. ..+.+..+.|++.++..+.+ T Consensus 114 ~~~~~~~~g~~~~~~~~~~ii-~~~~~~~~l~~--~~~~---~--~~~~~~~~~~~~~~-~~~~a~~v~Eg~~~~~~~~~ 184 (390) T protein:vir:81 114 STDAAGSAGALTTPNRLPGFI-TPPDARLTVRD--LIGS---G--RTDSALIEYVQETG-FVNNAAIVAEGALKPESSLK 184 (390) T ss_pred ccccccCCcceechhhhHHHH-HHHhhhhhhhh--hcce---e--eccCCceEEEEEec-CCcceeeecCCcccccccce Confidence 22222 2267777665544 44555555543 2211 1 12466788898863 24567788999999999998 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ..+.....++.+....+++....-+ .+....+.+++++.+++..+..+|. | ....+.-.+.+........... 
T Consensus 185 ~~~i~~~~~k~~~~~~is~ell~d~-~~~~~~i~~~l~~~~~~~~d~a~l~---G---~g~~~~~~Gi~~~~~~~~~~~~ 257 (390) T protein:vir:81 185 FAKKTDTTHVIAHTMKATRQILSDA-PQLASYMNNRLIRGLKVKEDAEILR---G---TGANDGLLGLIPQATTYAAPTT 257 (390) T ss_pred eeEEEEeeeEEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHHh---c---CCCCCcccceeecccccccccc Confidence 8888888888888888888765444 4677779999999999999987663 2 1110100111111111111222 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) ......++.|.++...+........+|+|||.++..|++. ++++++.+.. .+ ....-++++|++ T Consensus 258 ~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~l------kd~~G~~l~~--------~~--~~~~~~~l~G~p 321 (390) T protein:vir:81 258 IAGATRVDQLRLAMLQASLAEYNPSGIVINPIDWAAIELA------KDANNQYLIG--------NA--RGTLTPTLWGLP 321 (390) T ss_pred cccchhHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh------hcCCCceeec--------Cc--ccccCceeccee Confidence 3334567889999988888777788999999999999864 3444433221 11 123346899999 Q ss_pred EEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCC---CcceeEEEEeeEEEeeecc-eeeecCcCC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDI---LAKSDAMSIDLHYVYHPVG-AKWAVTTTN 309 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~---~~g~~~l~~r~~y~~~~~G-~s~~~~~~s 309 (342) |++++.+|.. +++|+. .++.+.....+.++.++.. 
..+...+....++..++.- =+|.....+ T Consensus 322 v~~~~~~p~~---------~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~v~~r~~~r~d~~v~~~~a~v~~t~a 390 (390) T protein:vir:81 322 VVATQAMAPG---------EFLVGAFDLAAQIFDQWDARVEIGYVGEDFQRNMITVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred eEEcCCCCCC---------cEEEEehhceEEEEEecceEEEEecccchhhcCcEEEEEEEeeccEEecccceEEEEeC Confidence 9999998732 133332 3444555556777766532 2355566677777665432 122222222 No 76 >protein:vir:3991 Length: 404 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:319 # MgeName: BK5-T # Cross-refs: genbank:acc:NP_116499;genbank:gi:14251132;genbank:GeneID:921252 Probab=99.27 E-value=4.9e-12 Score=82.59 Aligned_cols=273 Identities=11% Similarity=0.043 Sum_probs=166.7 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccC-CCCcccccCCCceech-hh Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKAN-LSGDFEVLSDSSSLTP-GK 75 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i-~~gda~~~~~~~~i~~-~~ 75 (342) |. +.-....+|+.+.+.+.+...+.+.|.+ ++..-+ .++....+|+|... ..+.+..+.|++.++. ++ T Consensus 116 ~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~--~~~~~~-----~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~ 188 (404) T protein:vir:39 116 ETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQ--YVRVES-----VSTSNGSRVYEKWTDVTPLTVMDAEDGKIPDLDN 188 (404) T ss_pred hhcccccCCceeccHHHHHHHHHHHHhhhhHHh--hcceee-----ccCCcceEEEEeecCCccceeeecCccccccccc Confidence 33 2234567899998888877777776644 221111 23444555666432 2245567889988874 66 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+..+-....++.+.-+.++++...-+.-|....+.+++++.+.+..++.+|.-.. . 
+ T Consensus 189 ~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~~il~g~g----~------~------------ 246 (404) T protein:vir:39 189 PRLTIIKYLIKRYAGIITATNTLLKDTAENILAWLSSWIAKKVVVTRNQAIIAAMG----T------V------------ 246 (404) T ss_pred cceeeEEeeeeeEEeeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHHhccc----c------c------------ Confidence 77777777888888888888877665666788889999999999999987765321 0 0 Q ss_pred ccccccccHHHHHHHHHH-hCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAI-LGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) .......+++.+.+++.. +......-.+|+||+..+..|++. ++++++++.. .. .....-++++ T Consensus 247 ~~~~~~~~~~~i~~~~~~~~~~~~~~~a~~v~n~~~~~~L~~l------kd~~G~~l~~--~~-------~~~~~~~~l~ 311 (404) T protein:vir:39 247 PKKPTIAKFDDVITMINTSVDPAIIATSSLLTNQSGLNKLALV------KTAEGKYLLE--PD-------PTKPNSYLIK 311 (404) T ss_pred ccccccccHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh------hccCCceeec--cC-------cCCCCcceec Confidence 011234467788888764 444445567899999999999864 3454443321 11 1123457899 Q ss_pred cceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeeecc---eee-e Q lcl|NC_020854. 235 GLRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHPVG---AKW-A 304 (342) Q Consensus 235 G~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~~G---~s~-~ 304 (342) |++|++.|..++...+ .+.+ +++++. .++.+....++.++.++... .....+....+|.+.++- |.. + T Consensus 312 G~pV~~~~~~~~~~~~-~~~~-~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~ 389 (404) T protein:vir:39 312 GKKVIVVADRWLPNSG-STVY-PLYYGDMSQAITLFDRENMSLLPTNIGAGAFETDTTKIRVIDRFDVKTTDSEALVAGS 389 (404) T ss_pred ceeEEEecccccCccC-CCcc-EEEEEeccccEEEEeecceEEEEeccchhhhhhceeeEEEEeeeccEEecccceEEEE Confidence 9999998875544322 2222 344553 35556666777888777643 455677787887777644 222 1 Q ss_pred cCcCCcChHHhcCCc Q lcl|NC_020854. 
305 VTTTNPTRAQLETVA 319 (342) Q Consensus 305 ~~~~sPt~~~L~~~~ 319 (342) ....+|.-+....|. T Consensus 390 ~~~~a~~~~~~~~~~ 404 (404) T protein:vir:39 390 FTAIADQVGNFTAGK 404 (404) T ss_pred eeccccCCCCCCCCC Confidence 111122222222233 No 77 >protein:vir:80213 Length: 334 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1879 # MgeName: LKA1 # Cross-refs: genbank:acc:YP_001522884;genbank:gi:158345177;genbank:GeneID:5687476 Probab=99.24 E-value=2.9e-13 Score=89.28 Aligned_cols=287 Identities=12% Similarity=0.019 Sum_probs=179.2 Q ss_pred Ccce-------------ec--cccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCccccc Q lcl|NC_020854. 1 MATL-------------RS--DIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVL 65 (342) Q Consensus 1 MaT~-------------~~--d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~ 65 (342) |+|- -+ +++. |+|...|...+.+++.|... .... ++ .+|+++.+|+-. ...++.. T Consensus 1 m~~~~~~~~t~~~~~~~~~~~~l~l-e~~~geV~~af~~~s~~~~~--~~~r-~i---~~G~s~~~~~iG---~~~~~~~ 70 (334) T protein:vir:80 1 MTYPAANTHTRPGWGGANSDVSLHI-EEHLGLVDASFMYSSKFASW--MNVR-SL---RGTNQLRVDRVG---ASTIAGR 70 (334) T ss_pred CCCCcCCCccccccccccchheehh-hhhhhHHHHHHHHhhhhhcc--ceee-ec---cccceEEEeeec---ceeeeee Confidence 6632 12 3333 89999998888888888642 2221 23 469999999643 3567788 Q ss_pred CCCceechhhcccceeeeEeee-eccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-HHHHhhhcc---- Q lcl|NC_020854. 66 SDSSSLTPGKITADKQVAAILH-RGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCL-QGVFGSLNA---- 139 (342) Q Consensus 66 ~~~~~i~~~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L-~g~~~~~~a---- 139 (342) ..|+.++.+.+.+.+.+-+|=. 
.--.+.+.|+-...+..|...++.+|.+...++..|+.++..| ++.....-+ T Consensus 71 ~~g~~l~~~~~~~~~~~l~ID~~l~~~~~VddiD~~q~~~D~rse~~~~~G~aLA~~~D~~~~~~l~kaa~~~~~~~~~~ 150 (334) T protein:vir:80 71 KAGEELVVQKNVSDKLNLTVDTVLYARHFFDKFDEWTSNLDVRKETAREDGIALARQYDQACIIQLQKCGDFLAPAHLKP 150 (334) T ss_pred cCCCCCCCCCcccCceEEEEeeeeehhhhHhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccccc Confidence 9999999999999887776654 5556889999998899999999999999999999999887654 444321111 Q ss_pred cccchhheeeecccccccccccccHHHH----HHHHHHhCcccc-----CeEEEEEchHHHHHHHhhhhhhhhhhhhccc Q lcl|NC_020854. 140 NTSSSAFFDLCIDSESGDTPTALSPRHV----AEARAILGDQGD-----KLTAVAMHSKVYYDLVERRAIDYVSTADARG 210 (342) Q Consensus 140 ~~~~~~~~~~~~~~~~~~~~~~~~~~~l----~~A~~~~GD~~~-----~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~ 210 (342) ....+....+.. .+.. ....-+++.+ .+|.+.|.++.- .-..++|.|++|..|.+..-+ +. .+. T Consensus 151 ~~~~G~~~~~~~-~g~~-~~~~~~~~~l~~a~~~a~~~L~e~dvp~~~~~~R~~vv~P~~y~~Ll~~~r~--~n-~d~-- 223 (334) T protein:vir:80 151 AFHDGILLPSTI-SGLA-ADAAADADVLVAAHRQGVEAMVFRDLGDQLMSEGVTLLDPVIFSFLLEHDRL--MN-VEF-- 223 (334) T ss_pred cccCCcceeecc-cccc-cchhhhHHHHHHHHHHHHHHHHhcCCCCCcCCceEEEeChHHHHHHhccccc--cc-cee-- Confidence 111111211111 1111 1122234344 455555654322 247999999999999986322 11 110 Q ss_pred ceeeeccceeecccccccceeeeccceEEEeCCcceecc------CCCcce-------EEEEEecceeEeecCCcceeEe Q lcl|NC_020854. 211 TSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGS------GGSTEY-------ATYFFTQGAVASGEQMAMQTET 277 (342) Q Consensus 211 ~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~------~~~~~y-------~t~l~~~GAi~~~~k~~~~ve~ 277 (342) ....+ +.......+..++|.+|+.+..+|.... +..++| ..+++.+.|++...-.++..|. 
T Consensus 224 --~~s~~----~~~~~~g~i~~v~G~~V~~Sn~~P~~~~t~~~~g~~~~~~agd~t~~~~~~~~~~Al~t~~~~~~~~e~ 297 (334) T protein:vir:80 224 --GAKEG----GNSFVGGRIAMLNGVRVVETPRFPQSAITANALGADFNVTDAEVRRKMITFIPSMALISAQVHPVSAQF 297 (334) T ss_pred --ccccc----cccccceeEEEEeceEEEeecCCCCccccccccccccccccccccceEEEEEeCceEEEEEEeecceee Confidence 00000 0111246689999999999999996531 122233 2356678899988888888999 Q ss_pred ccCCCcceeEEEEeeEEEeeecc---eee-ecCcCCc Q lcl|NC_020854. 278 DRDILAKSDAMSIDLHYVYHPVG---AKW-AVTTTNP 310 (342) Q Consensus 278 dr~~~~g~~~l~~r~~y~~~~~G---~s~-~~~~~sP 310 (342) .|+.....+.+.+.+.|+..+.= ..- .-...+| T Consensus 298 ~~~~~~~~d~i~~~~a~G~g~lRPeaa~vv~~~~~~~ 334 (334) T protein:vir:80 298 WEEKKDFGHYLDTFQSYNIGQRRPDAVAVHDITVTNP 334 (334) T ss_pred eechhhHHHHHHHHHHcCCceeccceEEEEEEeeecC Confidence 99888777766666555554321 100 0122345 No 78 >protein:vir:94711 Length: 347 # NCBI annotation: capsid # Family: family:all:975 # MgeID: mge:1528 # MgeName: K1F # Cross-refs: genbank:acc:YP_338120;genbank:gi:77118198;genbank:GeneID:3707734 Probab=99.24 E-value=3.7e-13 Score=88.74 Aligned_cols=282 Identities=15% Similarity=0.097 Sum_probs=166.2 Q ss_pred Cc--c--ee----------c-------cccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCC Q lcl|NC_020854. 1 MA--T--LR----------S-------DIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLS 59 (342) Q Consensus 1 Ma--T--~~----------~-------d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~ 59 (342) || + .+ + ..|.|||+..| .+.+.|.. .+ ...++ .+|+++.||... . T Consensus 1 m~~~~~~~~~t~~g~~~~~~d~~al~ik~f~~eV~~~f-----~~~s~~~~--~~-~~r~i---~~G~sv~i~~iG---~ 66 (347) T protein:vir:94 1 MANVPGQKIGTDQGKGKSSSDALALFLKVFAGEVLTAF-----TRRSVTAD--KH-IVRTI---QNGKSAQFPVMG---R 66 (347) T ss_pred CCCCCccccccccccCCccccHHHHHHHHHhHHHHHHH-----HHHHhhhc--cc-ccccc---cccceEEEeccc---c Confidence 44 1 11 1 25666666654 34444532 22 11222 369999999764 3 Q ss_pred CcccccCCCcee--chhhcccceeeeEeeee-ccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_020854. 
60 GDFEVLSDSSSL--TPGKITADKQVAAILHR-GRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGS 136 (342) Q Consensus 60 gda~~~~~~~~i--~~~~lt~~~~~a~i~~~-~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~ 136 (342) .....+..|+.+ ++..+...+..-+|-.. --.+.+.|+-......|++.+..++.+..+++++|+.++..|..+.+. T Consensus 67 ~tv~~~t~G~~l~~~~~~~~~~e~~itID~~~~~~~~VddiD~~q~~~D~~~~~~~~~g~aLa~~~D~~i~~~~~~~aa~ 146 (347) T protein:vir:94 67 TSGVYLAPGERLSDKRKGIKHTEKVITIDGLLTADVMIFDIEDAMNHYDVAGEYSNQLGEALAIAADGAVLAEMAILCNL 146 (347) T ss_pred eeeeeecCCCCcCCCCCCCCcceEEEEecchhhhhHHhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHHhcc Confidence 677788999988 45566666665554432 335678888888888899999999999999999999999887543322 Q ss_pred hcccc------cchhheeeeccccccccccccc----HHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhh Q lcl|NC_020854. 137 LNANT------SSSAFFDLCIDSESGDTPTALS----PRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVS 204 (342) Q Consensus 137 ~~a~~------~~~~~~~~~~~~~~~~~~~~~~----~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~ 204 (342) ..+.. ............. ..+...+ ++.|.+|...|.+.. ..=..++|.|..|..|.+...+.- T Consensus 147 ~~~~~~~~~g~~~~s~~~~~~~~~--~~~~~~~~~~~~~~i~~a~~~Lde~~VP~~~R~~vv~P~~~~~Ll~~~~~~~-- 222 (347) T protein:vir:94 147 PAASNENIAGLGTASVLEVGKKAD--LDTPAKLGEAIIGQLTIARAKLTSNYVPAGDRYFYTTPDNYSAILAALMPNA-- 222 (347) T ss_pred ccccccccCCCcccceeecccccc--ccchhhhHHHHHHHHHHHHHHHhhcCCCCCCcEEEeCHHHHHHHhccchhhh-- Confidence 11110 0011111000110 0111112 345566667776542 134788999999999976543211 Q ss_pred hhhcccceeeeccceeecccccccceeeeccceEEEeCCcceeccCCC---------------------cce-------E Q lcl|NC_020854. 205 TADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGS---------------------TEY-------A 256 (342) Q Consensus 205 ~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~---------------------~~y-------~ 256 (342) .+.. ... ....+.++.++|.+|+++..+|+...+.. .+| . 
T Consensus 223 -~~~~------~~~-----~~~~G~Vg~i~G~~V~~Sn~lp~~~~t~~~~~~~~~~~aG~~~~~~~~~~~~~~~~~~~~~ 290 (347) T protein:vir:94 223 -ANYA------ALI-----DPETGNIRNVMGFVVVEVPHLVQGGAGETRGDDGITIASGQKHAFPATASSDVKVTMDNVV 290 (347) T ss_pred -hhcc------ccc-----cccccceEEEeceEEEecCcccccccccccccCcceecCcccccccccchhhhccccccee Confidence 1110 000 11235689999999999999996322110 011 3 Q ss_pred EEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecceeeecCc-CCcChHH Q lcl|NC_020854. 257 TYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGAKWAVTT-TNPTRAQ 314 (342) Q Consensus 257 t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s~~~~~-~sPt~~~ 314 (342) ..+|-+-|++.....++.+|..|+.....+.+.+.+.|+..+.= +...+ ..-+-+| T Consensus 291 ~l~~h~~A~~~v~~~~~~~e~~r~~~~~~d~i~~~~~~G~~~~r--P~~a~~~~~~~A~ 347 (347) T protein:vir:94 291 GLFSHRSAVGTVKLRDLALERDRDVDAQGDLIVGKYAMGHGGLR--PEAAGALVFSPAE 347 (347) T ss_pred EEEeehhhhhhhhcccccccchhchhhHHHHhhhhhhhcCcccc--cceeEEEEecCCC Confidence 46667778888888889999999999988888877777766531 11000 0000111 No 79 >protein:vir:4997 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:109 # MgeName: Sfi21 # Cross-refs: genbank:acc:NP_049971;genbank:gi:9632943;genbank:GeneID:1262106 Probab=99.23 E-value=1.3e-11 Score=80.18 Aligned_cols=272 Identities=10% Similarity=0.027 Sum_probs=166.4 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhc- Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKI- 76 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~l- 76 (342) |+ +.-....+|+.+..-+.+...+.+.+.+ ++..-+ +. .+.-.+.+|.+.. ..+.+..+.|++.++.... 
T Consensus 109 ~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~--~~~~~~-~~--~~~~~~~~~~~~~-~~~~a~~v~E~~~~~~~~~~ 182 (397) T protein:vir:49 109 KTDGSGSDAGLTIPQDIRTAINTLVRQFDSLQE--YVNVEN-VT--TLTGSRVYEKWAD-ITGLAKLDDEGGQIGQNDDP 182 (397) T ss_pred hhccCCccCcceecHHHHHHHHHHHHhhhhHhh--hcceee-cc--CCcceEEEEeecc-CCcceeeecccccccccccc Confidence 44 2334567899988777777777776644 221111 11 1112344555542 3466777889988875554 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +-..-....++.+.-+.++++...-+.-|....+.+++++.+++..+..++.-. + . + . T Consensus 183 ~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~ail~G~----g---~---~------------~ 240 (397) T protein:vir:49 183 KLSLIRYAIKRYAGISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILEAI----G---T---L------------P 240 (397) T ss_pred ceeeeEeeeeeeEeehhhHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHHhcc----c---c---c------------c Confidence 444555566667766777776554455677888999999999999888765421 0 0 0 1 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) .....++++.|.++...+......-+.|+|||..+..|+++ ++++++++.. .. 
.....-++++|+ T Consensus 241 ~~~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~l~~l------kd~~g~~l~~--~~-------~~~g~~~~l~G~ 305 (397) T protein:vir:49 241 NKPTLAKWDDIIDLQAKVDPAIKQTSLFLTNTSGFTALKKV------KNAMGDYLME--RD-------VKSPTGYSIDGF 305 (397) T ss_pred ccccccCHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHh------hccCCceeec--cc-------ccCCCCceecce Confidence 12345678999999888887777788999999999999874 3455443321 10 112334689999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeee---cceee---e Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHP---VGAKW---A 304 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~---~G~s~---~ 304 (342) +|++.+..++..... ++ .+++|+ ..++.++....+.+++++... .+...+....++...| ..+.+ + T Consensus 306 pV~~~~~~~~~~~~~-~~-~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~r~d~~~~~~~a~~~~~~~ 383 (397) T protein:vir:49 306 VVKEISDRFLPNGTG-GA-MPLYFGDLKQAVTLFDRQHLSLLSTNIGGGAFETDTTKVRVIDRFDVVSTDTEAFVPASFK 383 (397) T ss_pred eeEEecccccccccC-Cc-eeEEEeeccceEEEEeecccEEEEeccccchhhcCeeeEEEEEeeccEEecccceEEEEec Confidence 999877665443222 22 245566 346677777788888887542 4556677777766654 44443 2 Q ss_pred c-CcCCcChHHhcC Q lcl|NC_020854. 305 V-TTTNPTRAQLET 317 (342) Q Consensus 305 ~-~~~sPt~~~L~~ 317 (342) . +...|.....+. T Consensus 384 ~~~~~~~~~~~~~~ 397 (397) T protein:vir:49 384 AIADQKAKLSTAGA 397 (397) T ss_pred ccccccCcccccCC Confidence 1 122233333333 No 80 >protein:vir:5739 Length: 366 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:122 # MgeName: PY54 # Cross-refs: genbank:acc:NP_892050;genbank:gi:33770513;interpro:IPR006444;uniprot:Q7Y410;genbank:GeneID:1732928 Probab=99.23 E-value=2.9e-12 Score=83.83 Aligned_cols=276 Identities=13% Similarity=0.121 Sum_probs=151.6 Q ss_pred Cc--cee--ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhc Q lcl|NC_020854. 
1 MA--TLR--SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKI 76 (342) Q Consensus 1 Ma--T~~--~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~l 76 (342) |+ |.. .-..+|+.+..-+.+.+.+.+.+.+.|+ .. +.. +...+++|.+.. ...+.-+.|++.++..+. T Consensus 64 ~a~~~~~~~Gg~lvP~~~~~~ii~~l~~~s~l~~lg~-~~---v~~--~~g~~~~p~~t~--~~~a~wv~E~~~~~~s~~ 135 (366) T protein:vir:57 64 MAISTAAGSGGALIPQNMQNEVIELLRDRTVVRILGA-RS---IPL--PNGNLSMPRLSG--GATAGYVGEGKDVVATGA 135 (366) T ss_pred hhccccccCCccccchhHHHHHHHHHhhhcchhhhce-ee---eec--CCCceEEEEEeC--CcceeeeccCcccccccc Confidence 33 222 3356798887766666666555544321 11 111 122588998863 366777899999999888 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH------HHHHHHhhhcccccchhheeee Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLS------CLQGVFGSLNANTSSSAFFDLC 150 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla------~L~g~~~~~~a~~~~~~~~~~~ 150 (342) +-++-....++.+.-+.++++...-+.-+..+.+.+++++.+.++.++.+|. .-+|++....+.. . T Consensus 136 ~f~~i~~~~~k~~~~~~iS~ell~ds~~~~~~~i~~~l~~a~~~~~d~a~l~G~G~~~~p~Gi~~~~~~~~-------~- 207 (366) T protein:vir:57 136 TFDDVKLSAKTMIALVPVSNQLIGRAGFNVEQLLLGDILSAIATREDKAFLRDDGTGDTPKGMKAVATAAN-------R- 207 (366) T ss_pred ceeEEEEeeEEEEEeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHhhccCCCCccccceeecccccc-------c- Confidence 8777777777777777777766555555777889999999999999987763 2222221111000 0 Q ss_pred cccccccccccccHHHHHHHHH-HhCcc--ccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccc Q lcl|NC_020854. 151 IDSESGDTPTALSPRHVAEARA-ILGDQ--GDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGE 227 (342) Q Consensus 151 ~~~~~~~~~~~~~~~~l~~A~~-~~GD~--~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~ 227 (342) ....++.....-..+.+.+.+. .+.+. ...-..|+||+..+..|+++ ++++++.+.. 
+ T Consensus 208 ~~~~~~t~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~vmn~~~~~~L~~l------kd~~G~~l~~-------------~ 268 (366) T protein:vir:57 208 LVAWTGTAINLTTIDEYLDSLILKHMDSNSNMIRCGWGLSNRTYMTLFGL------RDGNGNKVYP-------------E 268 (366) T ss_pred eeeccccccchhhHHHHHHHHHHhhhccccccccCEEEecHHHHHHHHhh------hccCCceecc-------------C Confidence 0000011111112333444433 23332 23467899999999999874 3444433321 1 Q ss_pred cceeeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCC----cc---------eeEEEEeeE Q lcl|NC_020854. 228 VSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL----AK---------SDAMSIDLH 293 (342) Q Consensus 228 ~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~----~g---------~~~l~~r~~ 293 (342) ..-++++|++|++++.+|.......+.. .++|+. .-+.++...++.++..|++. .+ ...+....+ T Consensus 269 ~~~g~l~G~Pvv~s~~ip~~~~~~~~~~-~i~~gdfs~~~i~~~~~i~i~~~~ea~~~~~~g~~~~~f~~~~~~iR~~~~ 347 (366) T protein:vir:57 269 MSQGILKGYPIQRTSAIPANLGDDGNES-EIYFCDFNDVVIGEDGMMKVDFSTEATYKDADGQLVSAFARNQSLIRVVTE 347 (366) T ss_pred CCCCeecceeeEEccccccccccCCCcc-EEEEEecceEEEEEecceEEEEeeccccccccccchhhhhcCceeEEeeee Confidence 1235789999999999987543233333 333332 23334555666676666532 11 122323332 Q ss_pred EEeeecceeeecCcCCcChHHhcCCcCc Q lcl|NC_020854. 294 YVYHPVGAKWAVTTTNPTRAQLETVANW 321 (342) Q Consensus 294 y~~~~~G~s~~~~~~sPt~~~L~~~~NW 321 (342) +-+.| ..|.-=.+-++.+| T Consensus 348 ~d~~v---------~~~~a~~~lt~~~~ 366 (366) T protein:vir:57 348 HDIGF---------RHPEGLVLGTGVIW 366 (366) T ss_pred eCcEe---------eccccEEEEecccC Confidence 22222 12333334455556 No 81 >protein:vir:1328 Length: 392 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:28 # MgeName: phi-C31 # Cross-refs: genbank:acc:NP_047927;swissprot:trembl:q9zwv6;genbank:gi:9631145;uniprot:Q9ZWV6;genbank:GeneID:2715889 Probab=99.23 E-value=3.4e-12 Score=83.43 Aligned_cols=273 Identities=7% Similarity=0.013 Sum_probs=162.7 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) .. +--..++.|+++.+.+.+...+...+.+ ++..-+ ..+|..+.+|.... .+.+.-+.|++.++..+.+ T Consensus 111 ~~~t~~~~g~~~~~~~~~~~i~~~~~~~~~l~~--~~~~~~----~~~~~~~~~~~~~~--~~~a~~v~E~~~~~~~~~~ 182 (392) T protein:vir:13 111 RDGTKAGNPNVLSRTLYGQLIAQAVERSAIMRG--GASTFT----TSDANPMDFTVITG--RATAGIVGETAEIPESYPA 182 (392) T ss_pred hcccccCCCccccccchHHHHHHHHhhhhhhhh--cceeee----cCCCceeEEEEEcC--Ccceeeecccccccccccc Confidence 11 2234478888888888776655544432 221111 13467788998764 3666678999999988888 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeee--ccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLC--IDSES 155 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~--~~~~~ 155 (342) .++.....++.+.-+.++++...-+.-|..+.+.+++++.+++..+..+|. | .....+ .+.+.... ....+ T Consensus 183 f~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~l~---G---~Gt~~p-~Gil~~~~~~~~~~~ 255 (392) T protein:vir:13 183 TTQRSMGGFKYGFASVVSYEFATDQVLDLVGFLVSDAGPAIGDAMGRHFLT---G---TGTGQP-RGILTDATGANAAFG 255 (392) T ss_pred eeeEEeeeeeEEeeehhHHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhc---c---cCCccc-ccccccccccccccc Confidence 777777777777777787776555555777789999999999999987774 1 110000 00111000 01111 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) ......++++.|.++...+......-.+|+||+..+..|++. ++++++++... . 
.....-.+++| T Consensus 256 ~~~~~~~~~d~l~~~~~~l~~~~~~~a~~v~n~~~~~~l~~l------kd~~G~~l~~~--~-------~~~g~~~~l~G 320 (392) T protein:vir:13 256 EADADSKVSDALIDLFHEVPSAYRKNAKFVVNDLRAAQMRKL------KDANGQYLWQS--A-------LTVGAPDTFNG 320 (392) T ss_pred ccccccccHHHHHHHHHhhhhhhhcCCEEEEcHHHHHHHHHh------hccCCceeecC--C-------cCCCCCceecc Confidence 223456778899998877766655667899999999988763 44554433221 1 11123358999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCC--CcceeEEEEeeEEEeeecce-eeecCcCCcC Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDI--LAKSDAMSIDLHYVYHPVGA-KWAVTTTNPT 311 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~--~~g~~~l~~r~~y~~~~~G~-s~~~~~~sPt 311 (342) ++|+++|.+|.. +++|+. ..+.++...++.+++.++. ..+.+.+....++.+.|.-- .|. T Consensus 321 ~Pv~~~~~~~~~---------~i~~Gdf~~~~i~~~~~~~i~~~~~~~~~~~~~~~r~~~r~d~~~~~~~A~~------- 384 (392) T protein:vir:13 321 KVVETDDGMPAD---------KVLFADLSKYRVRFAGSLRVDRSVDAKFSTDQIVYRFLQRADGLLVDARGAK------- 384 (392) T ss_pred eeeEEcCCCCCC---------cEEEeeccceeEEeecceEEEeeccccccCCcEEEEEEEEeccEEecccceE------- Confidence 999999999742 234443 2234455555666554443 44556677766666655321 121 Q ss_pred hHHhcCCcCceeecCccccceEEEEecC Q lcl|NC_020854. 312 RAQLETVANWSKVYELKNIGIVRATNVS 339 (342) Q Consensus 312 ~~~L~~~~NW~~v~d~k~i~~~~~~~~~ 339 (342) ++.++.-. T Consensus 385 --------------------~~~~~~aa 392 (392) T protein:vir:13 385 --------------------VLTVTPAA 392 (392) T ss_pred --------------------EEEeeccC Confidence 11111111 No 82 >protein:vir:104085 Length: 320 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:1656 # MgeName: Che12 # Cross-refs: genbank:acc:YP_655596;genbank:gi:109392467;genbank:GeneID:4156953 Probab=99.22 E-value=1e-11 Score=80.85 Aligned_cols=281 Identities=15% Similarity=0.142 Sum_probs=156.0 Q ss_pred Cc----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhc Q lcl|NC_020854. 
1 MA----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKI 76 (342) Q Consensus 1 Ma----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~l 76 (342) |+ +.-+.+|.|++..++ .+.+.+.+.+.+ ++. .+ ..+|..+++|.+.. .+++.-+.|++.++..+. T Consensus 14 ~~~t~~~~~~~~ip~~~~~~i-i~~~~~~s~l~~--~~~---~~--~~~~~~~~~p~~~~--~~~a~~v~E~~~~~~~~~ 83 (320) T protein:vir:10 14 IAQTGDTMFKGYLEPEQAKDY-FAEAEKTSIVQQ--FAQ---KV--PMGTTGQKIPHWIG--DVSAQWIGEGDMKPITKG 83 (320) T ss_pred hhccccccccccccHHHHHHH-HHHHHhccchhh--hcc---ee--eccCCceEEEEEeC--CcceEEecCCcccccccc Confidence 44 223567766665444 455555555543 121 11 13466789999874 377778999999999999 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +..+.....++.+..+.++++...-+..|..+.+.+++++.+++++++.+|. |. ++...................+ T Consensus 84 ~f~~v~~~~~k~~~~~~is~ell~ds~~~l~~~i~~~l~~a~a~~~d~a~l~---G~-g~~~~~~~~~~~~~~~~~~~~~ 159 (320) T protein:vir:10 84 NMTSQNIAPHKIATIFVASAETVRANPANYLGTMRTKVATAFAMAFDSAALN---GT-DSPFPTYLAQTTKSVSLADPGG 159 (320) T ss_pred ceeEEEEeeEEEEEeehhhHHHHhcChHHHHHHHHHHHHHHHHHHHHHHhhc---cc-CCCCCcccccccccccceeccc Confidence 9888888889999999999888666667888899999999999999988764 10 0000000000000000011111 Q ss_pred ccccccc--HHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 157 DTPTALS--PRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 157 ~~~~~~~--~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) .+.+.+. 
...+.++..........-.+|+||+..+..|+++ ++++++.+..-...+..+ ....-.+++ T Consensus 160 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l------kd~~G~~l~~~~~~~~~~----~~~~~~~i~ 229 (320) T protein:vir:10 160 ATASDLTAYDAVAVNGLSLLVNAKKKWTHTLLDDIVEPILNGA------KDKNGRPLFIESTYTDEN----SPFRAGRIV 229 (320) T ss_pred ccccccccHHHHHHHHHhhhhcccCCCcEEEEcHHHHHHHHHh------hccCCceeeccccccCcc----ccccCceee Confidence 1111111 1235566666666666788999999999999874 344443332211111111 122345789 Q ss_pred cceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEee Q lcl|NC_020854. 235 GLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYH 297 (342) Q Consensus 235 G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~ 297 (342) |++|++++.+|-. +.. .+|+. .-+.++...++.++..|+.. ..+..+....++.++ T Consensus 230 g~pv~~~~~~~~~------~~~-~~~gd~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~~~d~~ 302 (320) T protein:vir:10 230 SRPTILSDHVADG------TTV-GYMGDFRNVIWGQVGGLSFDVTDQATLNLGTPTEPNFVSLWQHNLVAVRVEAEYAFH 302 (320) T ss_pred eeeeEecCCCCCC------ceE-EEEeecceEEEEEecCeEEEEeecceeeeccccccccchhhhcCcEEEEEEEeeccE Confidence 9999999988632 111 11221 11224555556666555432 122334444555544 Q ss_pred ecc-eeee--cCcCCcCh Q lcl|NC_020854. 298 PVG-AKWA--VTTTNPTR 312 (342) Q Consensus 298 ~~G-~s~~--~~~~sPt~ 312 (342) +.- -+|. ....+|.- T Consensus 303 v~~~~a~~~l~~~~ap~~ 320 (320) T protein:vir:10 303 NNDKDAFVKLTNVVTPDA 320 (320) T ss_pred EecccceEEEEeccCCCC Confidence 422 2331 11112322 No 83 >protein:vir:4830 Length: 397 # NCBI annotation: MPL-7201 # Family: family:all:21 # MgeID: mge:105 # MgeName: 7201 # Cross-refs: genbank:acc:NP_038327;genbank:gi:9634653;genbank:GeneID:1262632 Probab=99.22 E-value=1.3e-11 Score=80.35 Aligned_cols=271 Identities=11% Similarity=0.003 Sum_probs=167.7 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEcccccc-CCCCcccccCCCceechh-h Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKA-NLSGDFEVLSDSSSLTPG-K 75 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~-i~~gda~~~~~~~~i~~~-~ 75 (342) |+ +.....++|+.+..-+.+...+.+.+.+. +..-+ .++...++|++.. ...+.+..+.|++.++.. + T Consensus 109 ~~~~t~~~gg~~iP~~~~~~ii~~~~~~~~l~~~--~~~~~-----~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~ 181 (397) T protein:vir:48 109 KTDASGSDAGLTIPQDIQTAIHTLVRQYDSLQEY--VNVEN-----VTTLTGSRVYEKWADITGLAKLDDEAGSIGTNDD 181 (397) T ss_pred hhccCCccccccccHHHHHHHHHHHHHHHHHHhh--hceee-----ccCCcceEEEEeecCCCcceeeeccccccccccc Confidence 43 23456788999888888888777777542 21111 2344445555432 123456678888888644 4 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+-.+-....++.+.-..++++.-.-+.-|..+.+.+++++.+++..+..++.-.. .. T Consensus 182 ~~~~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~v~~~l~~~~~~~~d~~il~G~g------~~---------------- 239 (397) T protein:vir:48 182 PKLYPIRYAIKRYAGISTVTNSLLADSAENILAWLSGWIAKKVVVTRNKAILEAIA------TL---------------- 239 (397) T ss_pred cceeeEEeeheeeeeehhhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhccc------cc---------------- Confidence 66666666677777777887776555566788889999999999999888765221 00 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) .......+++.|.++...+......-++|+||+..+..|++. +++++.++... 
......-++++| T Consensus 240 ~~~~~~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~L~~l------kd~~G~~i~~~---------~~~~~~~~~l~G 304 (397) T protein:vir:48 240 PTKPTLTKWDDIIDLQAKVDPAIKQTSFFLTNTSGFTALKKV------KNAFGDYLMER---------DVKSPTGYSIDG 304 (397) T ss_pred ccccccccHHHHHHHHHHhhhhhcCCCEEEECHHHHHHHHHh------hcCCCceeecc---------CcCCCCCceecc Confidence 112234578889999888877777789999999999999874 34544433211 111234578999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCC----CcceeEEEEeeEEEeee---cceee--- Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDI----LAKSDAMSIDLHYVYHP---VGAKW--- 303 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~----~~g~~~l~~r~~y~~~~---~G~s~--- 303 (342) ++|++.|..+.... ..++ .+++|+ ..++.++....+.++.++.. ..+...+....++.+.+ ..+.+ T Consensus 305 ~PV~~~~~~~~~~~-~~~~-~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~~~~ 382 (397) T protein:vir:48 305 FAVKEVADRWLANA-SSGA-MPLYFGDLKQAVTLFDRQQMSLLSTNIGGGAFETDTTKIRVIDRFDVVATDTESFVPASF 382 (397) T ss_pred ceeEEecccccCCc-CCCc-eEEEEEeccceEEEEeecceEEEEeccchhhhhcCceeEEEEeeeccEEecccceEEEEe Confidence 99998876544322 2222 245555 33556666667778777643 34556666666665554 44544 Q ss_pred ec-CcCCcChHHhcC Q lcl|NC_020854. 304 AV-TTTNPTRAQLET 317 (342) Q Consensus 304 ~~-~~~sPt~~~L~~ 317 (342) +. +...|+...++- T Consensus 383 ~~~~~~~~~~~~~~~ 397 (397) T protein:vir:48 383 KAIADQKGNLGSTAV 397 (397) T ss_pred cccccCCCCccccCC Confidence 22 222344444443 No 84 >protein:vir:10364 Length: 390 # NCBI annotation: head protein; major capsid subunit precursor # Family: family:all:585 # MgeID: mge:183 # MgeName: Xp10 # Cross-refs: genbank:acc:NP_858956;genbank:gi:32128421;genbank:GeneID:2648357 Probab=99.21 E-value=1.6e-11 Score=79.74 Aligned_cols=268 Identities=8% Similarity=0.024 Sum_probs=158.1 Q ss_pred Ccc--e-eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MAT--L-RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT--~-~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) +++ . -.-++.|+++...+. ...+.+.+.+ ++... ..++..+++|.|.. ..+.+..+.|++.++..+.+ T Consensus 114 ~~~~~~~~g~~~~~~~~~~ii~-~~~~~~~l~~--~~~~~-----~~~~~~~~~~~~~~-~~~~a~~v~Eg~~~~~~~~~ 184 (390) T protein:vir:10 114 STDAAGSAGALTTPNRLPGFIT-QPDARLTVRD--LIGSG-----RTDSALIEYVQETG-FVNNAAIVAEGALKPESSLK 184 (390) T ss_pred hcccccccccccchhHHHHHHH-HHHhhchhhh--hccee-----eccCCceEEEEEec-CCcceeeecCCccccccccc Confidence 221 1 133678887766554 3344444433 22211 13466789999874 34667788999999998888 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) ..+.....++.+..+.+++....-+ .+....+.+++++.+++..++.+|. | ....+.-.+.+........... T Consensus 185 ~~~i~~~~~k~~~~~~is~ell~d~-~~l~~~i~~~l~~~~~~~~~~~il~---G---~G~~~~p~Gi~~~~~~~~~~~~ 257 (390) T protein:vir:10 185 FAKKTDTTHVIAHTMKATRQILSDA-PQLASYMNNRLIRGLKVKEDAEILR---G---TGANDGLLGLIPQATTYAAPTT 257 (390) T ss_pred eeEEEEeeEEEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHhh---c---CCCCcccccccccccccccccc Confidence 7777777788777777877653333 3667779999999999999887763 2 1111101111111111111122 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) .......+.+.++...+.+....-.+|+||+..+..|++. +++++..+.. . .. 
...-++++|++ T Consensus 258 ~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l------kd~~g~~l~~--~------~~--~~~~~~l~G~p 321 (390) T protein:vir:10 258 IAGATRVDQLRLAMLQASLAEYPASGIVINPIDWAAIELA------KDANNQYLIG--N------AR--GTLTPTLWGLP 321 (390) T ss_pred ccccchHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh------hcCCCceeec--C------Cc--CcCCceeccee Confidence 2334457888999888888777888999999999998864 3444433221 1 11 12346789999 Q ss_pred EEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCC---CcceeEEEEeeEEEeeecce-eeecCcCC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDI---LAKSDAMSIDLHYVYHPVGA-KWAVTTTN 309 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~---~~g~~~l~~r~~y~~~~~G~-s~~~~~~s 309 (342) |++++.+|.. .++|+ ..++.+.....+.++..+.. ..+...+....++.+.|+-- .|.....+ T Consensus 322 v~~~~~~p~~---------~~~~gdf~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~~~~a 390 (390) T protein:vir:10 322 VVATQAMAPG---------EFLVGAFDLAAQIFDQWDARVEIGYVNDDFQRNMVTVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred eEEcCCCCCC---------cEEEEeccceEEEEEecceEEEEeecccccccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 9999999732 12333 22344444455666655432 33556666777776665432 12222222 No 85 >protein:vir:2430 Length: 318 # NCBI annotation: major head subunit # Family: family:all:507 # MgeID: mge:52 # MgeName: D29 # Cross-refs: genbank:acc:NP_046832;genbank:gi:9630400;genbank:GeneID:1261582 Probab=99.21 E-value=1.3e-11 Score=80.35 Aligned_cols=277 Identities=12% Similarity=0.071 Sum_probs=156.4 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |+ +...--++|+.+..-+.+.+.+.+.+.+ ++.. + ..++..+.+|.+... 
+.+.-+.|++.++..+.+ T Consensus 14 ~~~~~~~~~~~~ip~~~~~~ii~~~~~~~~l~~--~~~~---~--~~~~~~~~ip~~~~~--~~a~~v~Eg~~~~~~~~~ 84 (318) T protein:vir:24 14 IAQTGDTMFKGYLEPEQAKDYFAEAEKTSIVQQ--FAQK---V--PMGTTGQKIPHWVGD--VSAQWIGEGDMKPITKGN 84 (318) T ss_pred hhcccCcccceeechhHHHHHHHHHHhhchhhh--hcce---e--eccCCceEEEEEeCC--cceEEecCCccccccccc Confidence 44 2222234566666555666666665544 2221 1 134667889988754 778889999999999988 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeee-cccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLC-IDSESG 156 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~-~~~~~~ 156 (342) -.+-....++.+..+.++++...-+..|..+.+.+++++.+++++++.++. |. ..... ........ ...... T Consensus 85 f~~i~~~~~k~~~~~~iS~e~l~ds~~~~~~~i~~~l~~~~~~~~d~a~l~---G~---g~~~~-~~~~~~~~~~~~~~~ 157 (318) T protein:vir:24 85 MTSQTIAPHKIATIFVASAETVRANPANYLGTMRTKVATAFAMAFDGAAMH---GT---DSPFP-TYIGQTTKAISIADT 157 (318) T ss_pred eeEEEEeeEEEEEeehhhHHHhhcChHHHHHHHHHHHHHHHHHHHHHhhhc---cc---CCCCC-ccccccccccccccc Confidence 777777778888888898877666667889999999999999999988864 21 10000 00010010 011111 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) .+......+.+.++....-.....-.+|+|||..+..|+++ ++++++.+......+.. 
.....-..+.|+ T Consensus 158 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l------kd~~G~~l~~~~~~~~~----~~~~~~~~i~g~ 227 (318) T protein:vir:24 158 TGATTVYDQVAVNGLSLLVNDGKKWTHTLLDDITEPILNGA------KDQNGRPLFIESTYGEA----ASPFRSGRIVAR 227 (318) T ss_pred ccccchHHHHHHHHHHhhccccCCCCEEEEcHHHHHHHHHh------hccCCceeecCccccCc----cccccCceEEEE Confidence 12223333455666666555555667899999999999864 33443332211111100 011223467899 Q ss_pred eEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCC----------------cceeEEEEeeEEEeeec Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL----------------AKSDAMSIDLHYVYHPV 299 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y~~~~~ 299 (342) +|++++.+|... . ..+++. .-+.++...++.++..|+.. .....+....++.+.|. T Consensus 228 pv~~~~~~~~~~------~-~~~~gdfs~~~~~~~~~l~i~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~ 300 (318) T protein:vir:24 228 PTILSDHVVEGT------T-VGFMGDFSQLIWGQIGGLSFDVTDQATLNLGTVESPNFVSLWQHNLVAVRVEAEYAFHCN 300 (318) T ss_pred eeEEeCCCCCCc------c-EEEEeecceEEEEEecCeEEEEeeccceeccccccccchhhhhcCcEEEEEEEEEccEEe Confidence 999999886321 1 122221 12345666666776655432 23345566666655542 Q ss_pred c-eee---ec---CcCCc Q lcl|NC_020854. 300 G-AKW---AV---TTTNP 310 (342) Q Consensus 300 G-~s~---~~---~~~sP 310 (342) - -+| +. ++-+- T Consensus 301 ~~~a~~~i~~~~a~~~~~ 318 (318) T protein:vir:24 301 DAEAFVALTNVVSGGGEG 318 (318) T ss_pred cccceEEEEeeccCCCCC Confidence 2 222 11 11111 No 86 >protein:vir:81160 Length: 371 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:1892 # MgeName: Geobacillus virus E2 # Cross-refs: genbank:acc:YP_001285811;genbank:gi:148747732;genbank:GeneID:5247203 Probab=99.21 E-value=2.1e-11 Score=79.15 Aligned_cols=261 Identities=8% Similarity=0.027 Sum_probs=157.9 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceec-hhhc Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLT-PGKI 76 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~-~~~l 76 (342) |+ +.....++|+.+...+.+...+.+.+.+ ++...+ .++...++|+....-.+.+..+.|++.++ .++. T Consensus 91 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~i~~--~~~~~~-----~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~ 163 (371) T protein:vir:81 91 MSEGSNQDGGYTVPQDIQTRINELRESKDALQN--LITVEP-----VTTLSGSRVFKKRSQQTGFVEVAEGAAIGEKATP 163 (371) T ss_pred hccCCCccCceeecHhHHHHHHHHHHhhhhhhh--hceeee-----ccCCceeEEEEeecCCcceeeecccccccccccc Confidence 44 2335578898888878777777776654 222111 12333333333322235667789998886 4667 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +-.+.....++.+.-+.++++...-+.-|....+.+++++.+.+..+..++.-... . T Consensus 164 ~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~a~~~~~~~~i~~g~g~-----------------------~ 220 (371) T protein:vir:81 164 QFTLLQYQVKKYAGFFRVTNELLNDSTEAIVNTLVRWIGDESRVTRNGLIINVLNT-----------------------K 220 (371) T ss_pred ceeeEEeeeeEEEEeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc-----------------------c Confidence 77777777778877778877765444456678899999999999888776653210 0 Q ss_pred cccccccHHHHHHHHHH-hCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAI-LGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) ......+++.+..+... +-.....-.+|+||+..+..|++. ++++++.+.... 
.....-++++| T Consensus 221 ~~~~~~~~~~i~~~~~~~l~~~~~~~a~~vmn~~~~~~L~~l------kd~~g~~l~~~~---------~~~~~~~~l~G 285 (371) T protein:vir:81 221 AKTAIADLDGLKQIINVQLDPVFRSTSSVIVNQDAFNWLDTL------KDQNGQYLLQPS---------ISSPTGRQLLG 285 (371) T ss_pred cccccccHHHHHHHHHhhcchhhhcCCEEEEcHHHHHHHHHh------hccCCCeeeecc---------cCCCCCceecc Confidence 11223467777777643 334444567899999999999864 344444332111 11234578999 Q ss_pred ceEEEeCCcceeccC---CCcceEEEEEec--ceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeeec---ceee Q lcl|NC_020854. 236 LRVIVSDDVNTAGSG---GSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHPV---GAKW 303 (342) Q Consensus 236 ~~VvvdD~~p~~~~~---~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~~---G~s~ 303 (342) ++|+++|.+|..... .......++|+. -++.+.....+.++.++... .+...+....++...+. .+.+ T Consensus 286 ~pV~~~~~~~~~~~~~~~~~~~~~~i~~Gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~v~~~~~~r~d~~~~~~~a~~~ 365 (371) T protein:vir:81 286 LPVVIVSNKVLANRVDGGTGAQFAPIIVGDLKEAVVMFDRQRTEIMSSNVAMDAFETDATLWRAIERMDVKMRDDEAFVF 365 (371) T ss_pred eeEEEecccccCccccccccCCcceEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEE Confidence 999999999854211 111223455552 23445555667777766543 45566777777666543 3322 Q ss_pred ---ecC Q lcl|NC_020854. 304 ---AVT 306 (342) Q Consensus 304 ---~~~ 306 (342) +.+ T Consensus 366 ~~~~~A 371 (371) T protein:vir:81 366 GEVQLA 371 (371) T ss_pred EEEecC Confidence 223 No 87 >protein:vir:2201 Length: 345 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:49 # MgeName: T7 # Cross-refs: genbank:acc:NP_041998;swissprot:sw:p19726;genbank:gi:9627469;goa:P19726;uniprot:P19726;genbank:GeneID:1261026 Probab=99.20 E-value=1.5e-12 Score=85.47 Aligned_cols=287 Identities=13% Similarity=0.100 Sum_probs=174.1 Q ss_pred Cccee-------------------ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCc Q lcl|NC_020854. 
1 MATLR-------------------SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGD 61 (342) Q Consensus 1 MaT~~-------------------~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gd 61 (342) ||... -+++. |+|...|...+.+++.|.. .+.. -++ .+|+++.+|+.. ... T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~al~l-e~f~geV~~~f~~~s~~~~--~~~~-r~i---~~gks~~~~~iG---~~~ 70 (345) T protein:vir:22 1 MASMTGGQQMGTNQGKGVVAAGDKLALFL-KVFGGEVLTAFARTSVTTS--RHMV-RSI---SSGKSAQFPVLG---RTQ 70 (345) T ss_pred CcccccchhcccccccccccCCchhHHHH-HHHhHHHHHHHHHHhhhcc--ccee-eec---cccceEEEeeec---ceE Confidence 55111 14555 8898899888888888853 3322 233 469999999753 356 Q ss_pred ccccCCCceechhh--cccceeeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc Q lcl|NC_020854. 62 FEVLSDSSSLTPGK--ITADKQVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLN 138 (342) Q Consensus 62 a~~~~~~~~i~~~~--lt~~~~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~ 138 (342) +.....|+++.... +...+.+-+|= ..--.+.+.|+-...+..|.+.+..+|.+...++++|+.++..|..+-.... T Consensus 71 ~~~~~~G~~l~~~~~~~~~~e~~ltID~~~y~~~~VddiD~~q~~~D~r~~~s~~~G~aLA~~~D~~i~~~l~k~a~~~~ 150 (345) T protein:vir:22 71 AAYLAPGENLDDKRKDIKHTEKVITIDGLLTADVLIYDIEDAMNHYDVRSEYTSQLGESLAMAADGAVLAEIAGLCNVES 150 (345) T ss_pred EEeeecCCCCCCCCCCcccceEEEEecchhhhhhhHhhHHHHhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHHhhcccc Confidence 77788999886543 55555444433 3444678889999899999999999999999999999999987743211100 Q ss_pred ------ccccchhheeeecccccccccccc----cHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhh Q lcl|NC_020854. 139 ------ANTSSSAFFDLCIDSESGDTPTAL----SPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTA 206 (342) Q Consensus 139 ------a~~~~~~~~~~~~~~~~~~~~~~~----~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s 206 (342) +........++...... .+.... -++.|.+|..+|.+..= .=..++|.|..|..|.+...+.-..+. 
T Consensus 151 ~~~~~~~~~~~~~~~~~~~~g~~-~t~~~~~~~~~~~ai~~a~~~Lde~~VP~~~R~~vv~P~~y~~Ll~~~~~~~~~~~ 229 (345) T protein:vir:22 151 KYNENIEGLGTATVIETTQNKAA-LTDQVALGKEIIAALTKARAALTKNYVPAADRVFYCDPDSYSAILAALMPNAANYA 229 (345) T ss_pred ccccccccccccccccccccccc-ccccccCHHHHHHHHHHHHHHhhhcCCCccCCEEEeChHHHHHHhccccccccccc Confidence 00001111111111110 011111 25667777777765421 347899999999999877543211110 Q ss_pred hcccceeeeccceeecccccccceeeeccceEEEeCCcceeccCC---------------C---------cceEEEEEec Q lcl|NC_020854. 207 DARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGG---------------S---------TEYATYFFTQ 262 (342) Q Consensus 207 ~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~---------------~---------~~y~t~l~~~ 262 (342) ..+ ....+.++.++|.+|+++..+|....+. . ......+|-+ T Consensus 230 ---------~~~-----~~~~G~V~~i~G~~V~~sn~lp~~~~~~~~~~~~~~~~~~~~~~g~~~~~~~~~~~~~l~~h~ 295 (345) T protein:vir:22 230 ---------ALI-----DPEKGSIRNVMGFEVVEVPHLTAGGAGTAREGTTGQKHVFPANKGEGNVKVAKDNVIGLFMHR 295 (345) T ss_pred ---------ccc-----ccccceEEEEeceEEEecccccccccCccccCcccccccccccccceeeeeccCceEEEEEeh Confidence 011 1224678999999999999887421110 0 0113467778 Q ss_pred ceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecceeeecCcCCcChHHhcCCcCceeecCccccceEEEEecCC Q lcl|NC_020854. 263 GAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGAKWAVTTTNPTRAQLETVANWSKVYELKNIGIVRATNVSN 340 (342) Q Consensus 263 GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s~~~~~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~~ 340 (342) .|++.....++.+|..|+.....+.+.+.+.|+..+.= |.. 
.+.+++|.+ T Consensus 296 ~A~~~v~~~~~~~e~~r~~~~~~d~I~~~~a~G~~vlR--------------------------Pea--a~~i~~~~~ 345 (345) T protein:vir:22 296 SAVGTVKLRDLALERARRANFQADQIIAKYAMGHGGLR--------------------------PEA--AGAVVFKVE 345 (345) T ss_pred hheeeeeeecceeeeeechhHHHHHHHHHHhcCCcccc--------------------------cce--eEEEEEeeC Confidence 88988888889999999888777777777666655421 000 111111111 No 88 >protein:vir:6242 Length: 390 # NCBI annotation: gp36 # Family: family:all:21 # MgeID: mge:131 # MgeName: phi-BT1 # Cross-refs: genbank:acc:NP_813696;swissprot:trembl:q859c1;genbank:gi:29366756;interpro:IPR006444;uniprot:Q859C1;genbank:GeneID:1258897 Probab=99.19 E-value=4.9e-12 Score=82.57 Aligned_cols=271 Identities=7% Similarity=0.012 Sum_probs=158.8 Q ss_pred Cc-c--eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA-T--LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma-T--~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) +. | --..++.|+++...+.....+...+.+ ++..-+ ...|..+.+|.+.. .+.+.-+.|++.++..+.+ T Consensus 111 ~~~t~~~~g~~~~~~~~~~~i~~~~~~~~~l~~--~~~~~~----~~~~~~~~~p~~~~--~~~a~wv~E~~~~~~~~~~ 182 (390) T protein:vir:62 111 RDGTKAGNPNVLSRTLYGQLIAQAVERSAIMRG--GATTFT----TSDANPLDFTVITG--RSSASIVGETAEIPESYPA 182 (390) T ss_pred hcccccCCCccccccchHHHHHHHHhhhhhhhh--cceeee----cCCCceeEEEEEcC--Ccceeeecccccccccccc Confidence 22 2 224578888888777665554444422 221111 22466788998763 3566678999999988888 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH---HHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSC---LQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~---L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) .++-....++.+.-..++++.-.-+.-|..+.+.++++..++++.+..+|.- -+|++..... ..... 
T Consensus 183 f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~l~G~G~p~Gi~~~~~~----------~~~~~ 252 (390) T protein:vir:62 183 TAQRSMGGFKYGFASVVSYEFATDQVLDLVGFLVSDAGPAIGDAMGRHFITGTGQPRGILTDASP----------ATATF 252 (390) T ss_pred eeeeEeeeeeEEeehHHHHHHHhhhhHHHHHHHHHHHHHHHHHHHHhhhhccCCccccccccccc----------cccce Confidence 7777777777777777777765555557778899999999999999877640 0111111000 00111 Q ss_pred cccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) .......++++.|++....+......-.+|+||+..+..|++. ++.+++.+.. .+ ...+.-.+++ T Consensus 253 ~~~~~~~~~~~~l~~~~~~l~~~~~~~a~~vmn~~~~~~L~~l------kd~~g~~l~~--~~-------~~~g~~~~l~ 317 (390) T protein:vir:62 253 LATDTDSKVSDALIDLFHEVPSAYRANAKYVVNDLRAAQMRKL------KDANGQYLWQ--SG-------LTVGAPSLFN 317 (390) T ss_pred ecccccccchHHHHHHHHhhhhhhhcCCEEEEchHHHHHHHHh------hccCCCeeec--CC-------cCCCccceec Confidence 1123345788889888877766555566899999999998763 4444433221 11 1112336899 Q ss_pred cceEEEeCCcceeccCCCcceEEEEEecc-eeEeecCCcceeEeccCC--CcceeEEEEeeEEEeeecceeeecCcCCcC Q lcl|NC_020854. 235 GLRVIVSDDVNTAGSGGSTEYATYFFTQG-AVASGEQMAMQTETDRDI--LAKSDAMSIDLHYVYHPVGAKWAVTTTNPT 311 (342) Q Consensus 235 G~~VvvdD~~p~~~~~~~~~y~t~l~~~G-Ai~~~~k~~~~ve~dr~~--~~g~~~l~~r~~y~~~~~G~s~~~~~~sPt 311 (342) |++|+++|.+|.. .++|+.- .+.+....++.++...+. ..+.+.+.+..++.+.|.- T Consensus 318 G~Pv~~~~~~p~~---------~i~~gd~s~~~i~~~~~~~v~~~~~~~~~~~~~~~~~~~r~d~~~~~----------- 377 (390) T protein:vir:62 318 GKVVETDDGMPAD---------KILFADLSKYRVRFAGSLRVDRSVDAKFSTDQIVYRFLQRADGLLVD----------- 377 (390) T ss_pred ccceEEecCCCCc---------cEEEeeccceeEEeecceEEEeeccccccCCcEEEEEEEEeCcEeec----------- Confidence 9999999999742 1333321 123444455666554443 4455666666666555432 Q ss_pred hHHhcCCcCceeecCccccceEEEEecC Q lcl|NC_020854. 
312 RAQLETVANWSKVYELKNIGIVRATNVS 339 (342) Q Consensus 312 ~~~L~~~~NW~~v~d~k~i~~~~~~~~~ 339 (342) ++++.+..++..+ T Consensus 378 ---------------~~A~~~l~~~~~a 390 (390) T protein:vir:62 378 ---------------ARGAKVLTVTPGA 390 (390) T ss_pred ---------------hhheEEEEeecCC Confidence 2222222211111 No 89 >protein:vir:78739 Length: 332 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:1856 # MgeName: Syn5 # Cross-refs: genbank:acc:YP_001285448;genbank:gi:148724482;genbank:GeneID:5220210 Probab=99.19 E-value=1.5e-12 Score=85.44 Aligned_cols=287 Identities=13% Similarity=0.168 Sum_probs=168.5 Q ss_pred Ccceecc---------------------ccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCC Q lcl|NC_020854. 1 MATLRSD---------------------IIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLS 59 (342) Q Consensus 1 MaT~~~d---------------------~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~ 59 (342) |+ +|+| +++ |+|...|.+.+.+.+.|.. ++. ..++ .+|+++.||+.. . T Consensus 1 ~~-~~~~~~~~~~~~~~~~~~~~d~~~al~l-e~~~geV~~~f~~~s~~~~--~~~-~r~i---~~G~tv~i~~ig---~ 69 (332) T protein:vir:78 1 MT-TLSNFSLPNQANGGARNADYDVRYATAL-KLFSGEVFTAFNNASIFKG--LVR-SYDL---RGGKSKQFMFTG---K 69 (332) T ss_pred Cc-ccccccCCccccCCccccccccchhhhh-hhhhhhHHHHHHHHhhhhh--ccc-cccc---cccceEEEEecc---c Confidence 43 2222 444 6777777777777777743 222 1233 369999999764 3 Q ss_pred CcccccCCCceechhh-cccceeeeEeee-eccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Q lcl|NC_020854. 60 GDFEVLSDSSSLTPGK-ITADKQVAAILH-RGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSL 137 (342) Q Consensus 60 gda~~~~~~~~i~~~~-lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~ 137 (342) ....++..|+.+.+.+ +.+.+.+-+|=. .--++.+.|+-...+..|.+.+++++.+...++++|+.++..|...... 
T Consensus 70 ~~~~~~~~g~~l~~~~~~~~~~~~l~ID~~ky~~~~VddiD~~q~~~dl~~~~~~~~g~aLA~~~D~~i~~~l~~aa~~- 148 (332) T protein:vir:78 70 LSAGYHTPGTPIVGDAGIKANEKTLVMDDLLVSSQFVYSLDEIFSQYSTRAEVSKQIGEALATHYDERIARVLAKASAE- 148 (332) T ss_pred eeEeeecCCCCCCCCCCCCCceEEEEEehhhhhHHHHHhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc- Confidence 6677889999998864 888888766654 5567888999988888999999999999999999999999877542211 Q ss_pred cccccchhheeeec-ccccccccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceee Q lcl|NC_020854. 138 NANTSSSAFFDLCI-DSESGDTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTT 214 (342) Q Consensus 138 ~a~~~~~~~~~~~~-~~~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~ 214 (342) .+ ....+.....+ .+.+...++..-++.|.+|..+|.+..= .=..+++.|..|..|.+..--.+++ .+. + T Consensus 149 ~~-~~~~~~g~~~~~~~~~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~d~~~~n-~~~-----~ 221 (332) T protein:vir:78 149 AS-PVTGEPGGFHVNIGAGNTNDAQAIVDGFFEAAAVLDERSAPQEGRVAVLSPRQYYSLISSVDTNILN-REI-----G 221 (332) T ss_pred cC-cccccccccccccCCccccCHHHHHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHHhhcCceeee-eec-----c Confidence 11 10111100000 0111111122234667777777765422 2367889999999998742111111 000 0 Q ss_pred eccceeecccccccceeeeccceEEEeCCcceeccC---------CC-------cceEEEEEecceeEeecCCcceeE-- Q lcl|NC_020854. 215 QSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSG---------GS-------TEYATYFFTQGAVASGEQMAMQTE-- 276 (342) Q Consensus 215 ~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~---------~~-------~~y~t~l~~~GAi~~~~k~~~~ve-- 276 (342) +..+.+ .....++.++|.+|+.+..+|..... .. 
.+..+++|-+-|++.....++.+| T Consensus 222 ~~~~~~----~~g~~i~~i~G~~V~~Sn~lp~~~g~~~~~~~~~~~~n~~~~~~~~~~~~~~h~~a~~~v~~~~~~~~~t 297 (332) T protein:vir:78 222 NSQGDM----NSGKGLYSIAGIRILKSNNLAGLYGQDLSSAAVTGENNDYQVDASALAGLIFHREAAGCIQSVAPTIQTT 297 (332) T ss_pred ccccce----ecceeeeEEeeeEEEecCccccCcccccccccccccccccccccccceEEeecccceeeeeeeccchhhh Confidence 000111 11124789999999999999864211 11 133568888889888777666444 Q ss_pred -eccCCCcceeEEEEeeEEEeeecceeeecCcCCcChHHhcCC Q lcl|NC_020854. 277 -TDRDILAKSDAMSIDLHYVYHPVGAKWAVTTTNPTRAQLETV 318 (342) Q Consensus 277 -~dr~~~~g~~~l~~r~~y~~~~~G~s~~~~~~sPt~~~L~~~ 318 (342) .+|+..+..+.+...+.|+..+. .+... ..|.++ T Consensus 298 ~~~~~~~~~~d~i~~~~~~G~~v~--rPe~~------v~l~~a 332 (332) T protein:vir:78 298 SGDFNVQYQGDLIVGKLAMGCGSL--RTSVA------GSFQAA 332 (332) T ss_pred hcccchhhhHhhhhhhhhhcCcee--cccce------EEEeeC Confidence 46677666555554444443221 11100 011111 No 90 >protein:vir:1541 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:31 # MgeName: phiYeO3-12 # Cross-refs: genbank:acc:NP_052109;swissprot:trembl:q9t107;genbank:gi:9634035;uniprot:Q9T107;genbank:GeneID:1262383 Probab=99.19 E-value=4.5e-12 Score=82.81 Aligned_cols=287 Identities=15% Similarity=0.130 Sum_probs=170.7 Q ss_pred Cccee-cc-c--------cch-------hHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCccc Q lcl|NC_020854. 1 MATLR-SD-I--------IIP-------EVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFE 63 (342) Q Consensus 1 MaT~~-~d-~--------i~P-------ev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~ 63 (342) ||.+. .. + ..+ |+|...|...+.+.+.|... + ...++ .+|+++.+|... 
....+ T Consensus 1 ma~~~~~~~~~t~~~~~~~~~~~~a~~ie~f~g~V~~~f~~~s~~~~~--~-~~~~~---~~G~sv~i~~ig---~~t~~ 71 (347) T protein:vir:15 1 MANIQGGQQIGTNQGKGQSAADKLALFLKVFGGEVLTAFARTSVTMPR--H-MLRSI---ASGKSAQFPVIG---RTKAA 71 (347) T ss_pred CCccccCCccccccccCCCcchHHHHHHHHHHHHHHHHHHHhhhhhhc--c-ccccc---cccceeEeeecc---ceeee Confidence 76211 11 0 233 55566666666666666432 2 11222 369999999764 35677 Q ss_pred ccCCCceec--hhhcccceeeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc- Q lcl|NC_020854. 64 VLSDSSSLT--PGKITADKQVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNA- 139 (342) Q Consensus 64 ~~~~~~~i~--~~~lt~~~~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a- 139 (342) ++..|+.++ +..++..+..-+|= ..--++.+.|+...-...|++.++.++.+...+++.|+.++..|.++....-+ T Consensus 72 ~~~~g~~l~~~~~~~~~~e~~ltID~~~~~~~~VddlD~~q~~~D~~~~~~~~~g~aLA~~~D~~i~~~l~~~~~~~~~~ 151 (347) T protein:vir:15 72 YLKPGENLDDKRKDIKHTEKVIHIDGLLTADVLIYDIEDAMNHYDVRAEYTAQLGESLAMAADGAVLAELAGLVNLPDAS 151 (347) T ss_pred eeccCCCCCCCCCCCccceEEEEechhhhhhHHhhhHHHHhcCCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccc Confidence 888898874 45567777666654 33446778899988888899999999999999999999999988765421111 Q ss_pred -cccch---hheeeeccccccc-cccccc----HHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhhhc Q lcl|NC_020854. 140 -NTSSS---AFFDLCIDSESGD-TPTALS----PRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTADA 208 (342) Q Consensus 140 -~~~~~---~~~~~~~~~~~~~-~~~~~~----~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s~~ 208 (342) ..... ..........+++ ..+... .+.+.+|..+|.++. ..=..++|.|..|..|.+.. ++.. .+. 
T Consensus 152 ~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~i~d~~~~a~~~Lde~~VP~~gR~~vv~P~~y~~LL~~~--~~~~-~d~ 228 (347) T protein:vir:15 152 NENIEGLGKPTVLTLVKPTTGDLTDPVELGKAIIAQLTIARASLTKNYVPAADRTFYTTPDNYSAILAAL--MPNA-ANY 228 (347) T ss_pred cccccccCccccccccccccccchhhhhHHHHHHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHhccc--cccc-ccc Confidence 00000 0000000011111 111112 445556666776542 23478999999999998864 2221 111 Q ss_pred ccceeeeccceeecccccccceeeeccceEEEeCCcceeccCCC------------------------cceEEEEEecce Q lcl|NC_020854. 209 RGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGS------------------------TEYATYFFTQGA 264 (342) Q Consensus 209 ~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~------------------------~~y~t~l~~~GA 264 (342) +.... ...+.++.++|.+|+.+..+|....... ...-.+++-+-| T Consensus 229 ------~~~~~-----~~~G~Vg~i~G~~V~~Sn~lp~~~~t~~~~~~~~g~~~~~~~~~~~~~~~~f~~~~~l~~h~~A 297 (347) T protein:vir:15 229 ------QALID-----HERGTIRNVMGFEVVEVPHLTAGGAGDTREDAPADQKHAFPATSSTTVKVALDNVVGLFQHRSA 297 (347) T ss_pred ------ccccc-----ccceEEEEEeceEEEecccccccccccccccccccccccccccccceeeeccccceeeeeccce Confidence 11111 2346689999999999999986422100 011235556777 Q ss_pred eEeecCCcceeEeccCCCcceeEEEEeeEEEeeecc----eeeecCcCCc Q lcl|NC_020854. 
265 VASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVG----AKWAVTTTNP 310 (342) Q Consensus 265 i~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G----~s~~~~~~sP 310 (342) ++.....++.+|..|+.....+.+...+.|+..+.- ..+.-.+.|- T Consensus 298 ~g~v~~~~~~~e~~~~~~~~~d~i~~~~~~G~~vlrP~~av~~~~~~~~~ 347 (347) T protein:vir:15 298 VGTVKLKDLALERARRANYQADQIIAKYAMGHGGLRPEAAGAIVLPKVSE 347 (347) T ss_pred eeeeEeeceeeeecccchhhhhhhehhhhcCCceeccccEEEEecCCCCC Confidence 888888888999999988877777776666555422 2222222222 No 91 >protein:vir:4339 Length: 395 # NCBI annotation: major head protein # Family: family:all:585 # MgeID: mge:93 # MgeName: D3 # Cross-refs: genbank:acc:NP_061502;genbank:gi:9635591;genbank:GeneID:1262860 Probab=99.17 E-value=3.8e-11 Score=77.71 Aligned_cols=270 Identities=10% Similarity=0.058 Sum_probs=157.0 Q ss_pred Cccee---ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MATLR---SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT~~---~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) +++.. .-++.|+ +..-+.+.+.+.+.|.+ ++...+ .+|..+++|.+.. .++.+..+.|++.++..+.+ T Consensus 114 ~~~~~~~~g~~vp~~-~~~~ii~~~~~~~~l~~--l~~~~~-----~~~~~~~~~~~~~-~~~~a~~v~E~~~~~~~~~~ 184 (395) T protein:vir:43 114 ITSIDGSGGALVAPD-RRPGVVAAPQRRLTIRD--LVAPGT-----TESNSVEYVRETG-FVNNAAPVSEGTQKPYSDLT 184 (395) T ss_pred hcccCCCCccccchh-hHHHHHHHHHhhhhHHh--hcccee-----cCCCceEEEEEec-CCCceeeecCCccccccccc Confidence 22221 2245555 44445555666666643 222221 2466788888753 24567788999999988888 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheee--eccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDL--CIDSES 155 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~--~~~~~~ 155 (342) ..+-....++.+..+.+++....-+ .+....+.+++++.+++..+..+|. | +...+.-.+.+... ...+.. 
T Consensus 185 ~~~i~~~~~k~~~~~~is~ell~d~-~~l~~~v~~~la~a~~~~~d~~~l~---G---~g~~~~~~Gi~~~~~~~~~~~~ 257 (395) T protein:vir:43 185 FELENAPVRTIAHLFKASRQILDDA-SALQSYIDARARYGLMLVEECQLLY---G---NGTGANLHGIIPQAQAYAPPSG 257 (395) T ss_pred eeEEEEeeeeEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHHh---c---cCCCCccccccccccccccccc Confidence 7777777888887788887754333 3566778999999999998887764 2 11111101111110 111111 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) ........++.+.++...+......-.+|+||+..+..|+++ +++++..+.. . .....-++++| T Consensus 258 ~~~~~~~~~~~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l------kd~~G~~i~~--------~--~~~~~~~~l~G 321 (395) T protein:vir:43 258 VVVTAEQRIDRIRLAILQAQLAEFPASGIVLNPIDWALIELN------KDAENRYIIG--------S--PQNGTTPTLWR 321 (395) T ss_pred cccccchhHHHHHHHHHhhccccCCCcEEEEcHHHHHHHHHh------hccCCceecc--------c--cccCCCceecc Confidence 122334467888888888877776778999999999998764 3444433321 1 11234578999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeeecc-eeeecCcC Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHPVG-AKWAVTTT 308 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~~G-~s~~~~~~ 308 (342) ++|+++|.+|-. .++|+. .++.+.....+.++.++... .....+....++.++++- -+|..-.. T Consensus 322 ~pVv~~~~~~~~---------~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~~~~ 392 (395) T protein:vir:43 322 LPVVETQAITQD---------EFLTGAFSLGAQIFDRMDIEVLVSTENDKDFENNMVTIRAEERLAFAVYRPEAFVTGSL 392 (395) T ss_pred eeeEEcCCCCCC---------cEEEEeccceEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEEEe Confidence 999999998732 123332 23334444456666655432 455677777777776532 12321111 Q ss_pred CcC Q lcl|NC_020854. 
309 NPT 311 (342) Q Consensus 309 sPt 311 (342) +++ T Consensus 393 taa 395 (395) T protein:vir:43 393 TAS 395 (395) T ss_pred ccC Confidence 222 No 92 >protein:vir:2504 Length: 305 # NCBI annotation: major capsid subunit gp9 # Family: family:all:507 # MgeID: mge:53 # MgeName: TM4 # Cross-refs: genbank:acc:NP_569745;genbank:gi:18496895;genbank:GeneID:932268 Probab=99.16 E-value=1e-11 Score=80.87 Aligned_cols=270 Identities=10% Similarity=0.084 Sum_probs=149.3 Q ss_pred Cc-ce--eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceec----- Q lcl|NC_020854. 1 MA-TL--RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLT----- 72 (342) Q Consensus 1 Ma-T~--~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~----- 72 (342) || +. .....+|+.+.+-+.+++.+.+.+.+ ++. .+ ..++..+++|.+... .++.-+.|++.++ T Consensus 1 ma~~t~~~gg~liP~~~~~~Ii~~~~~~s~l~~--l~~---~~--~~~~~~~~~p~~~~~--~~a~wv~E~~~~~~~~~~ 71 (305) T protein:vir:25 1 MADISRAEVASLIQEAYSDTLLAAAKQGSTVLS--AFQ---NV--NMGTKTTHLPVLATL--PEADWVGESATDPKGVKP 71 (305) T ss_pred CCCccCCccceecCHHHHHHHHHHHHhhchhhh--hcc---ee--eccCCcEEEEEEeCC--cceEEeeccccccccccc Confidence 99 22 34456788887777777777666644 221 11 124667889998743 6777788887654 Q ss_pred hhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH---HHHHhhhcccccchhheee Q lcl|NC_020854. 73 PGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCL---QGVFGSLNANTSSSAFFDL 149 (342) Q Consensus 73 ~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L---~g~~~~~~a~~~~~~~~~~ 149 (342) ..+.+-.+-....++.+..+.++++...-+..|....+.+++++.++++.++.++.-- ++.+........ .... 
T Consensus 72 ~s~~~f~~i~~~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~~~a~~~d~a~~~G~g~~~~~~~~~~~~~~-~~~~-- 148 (305) T protein:vir:25 72 TSKVTWANRTLVAEEIAVIIPVHENVIDDATVAVLTEVAELGGQAIGKKLDQAVIFGTDKPASWVSPALIPAA-VTAG-- 148 (305) T ss_pred ccccceeeEEeeeEEEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHhhhheeccCCCCCcccccccccc-cccc-- Confidence 3445555556667788888888888776677788999999999999999998887410 000000000000 0000 Q ss_pred ecccccccccccccHH----HHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccc Q lcl|NC_020854. 150 CIDSESGDTPTALSPR----HVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYG 225 (342) Q Consensus 150 ~~~~~~~~~~~~~~~~----~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~ 225 (342) ............. .+.++.....+.......|+||+..+..|++. ++++++.+.. T Consensus 149 ---~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~l------kd~~G~~i~~------------ 207 (305) T protein:vir:25 149 ---QAVEVVGGVANESDIVGATNRAAKAVASAGWAPDTLLSSLALRYEVANI------RDANGNPVFR------------ 207 (305) T ss_pred ---ccccccccchhhhHHHHHHHHHHHhhhhcccccceeEecHHHHHHHHHh------hccCCceeec------------ Confidence 0000111122222 23333333434444556799999999999864 3344333221 Q ss_pred cccceeeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCC----c--------ceeEEEEee Q lcl|NC_020854. 226 GEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL----A--------KSDAMSIDL 292 (342) Q Consensus 226 ~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~----~--------g~~~l~~r~ 292 (342) =++++|++|+++|.+|...+ +.. .+|+. --+.++...++.++..++.. . ....+.+.. T Consensus 208 ----~~~l~G~Pv~~~~~~~~~~~----~~~-~~~gd~s~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~R~~~ 278 (305) T protein:vir:25 208 ----DDSFAGFRTFFNRNGAWDAD----AAI-EVIADSSRVKIGVRQDITVKFLDQATLGTGENQINLAERDMVALRLKA 278 (305) T ss_pred ----CCcccccceEEcCccCCCCC----ccE-EEEEecceEEEEEecCeEEEEeeeeeeecCCceeeeeecCcEEEEEEE Confidence 13789999999999876432 222 23332 22344555566666665431 1 123344444 Q ss_pred EEEeee---cceeee-c---CcCCcCh Q lcl|NC_020854. 
293 HYVYHP---VGAKWA-V---TTTNPTR 312 (342) Q Consensus 293 ~y~~~~---~G~s~~-~---~~~sPt~ 312 (342) ++..++ ..+-+- . +...|+- T Consensus 279 r~~~~v~~p~a~v~~~~~~~~~~~pa~ 305 (305) T protein:vir:25 279 RFAYVLGVSATAQGANKTPVAVVAPAA 305 (305) T ss_pred eecceeeCcccEEEEccccccccCCCC Confidence 444332 333221 1 1123333 No 93 >protein:vir:6212 Length: 434 # NCBI annotation: prohead protease # Family: family:all:21 # MgeID: mge:128 # MgeName: phBC6A52 # Cross-refs: genbank:acc:NP_852592;genbank:gi:31415852;genbank:GeneID:1489210 Probab=99.16 E-value=4.2e-11 Score=77.46 Aligned_cols=278 Identities=10% Similarity=0.033 Sum_probs=154.9 Q ss_pred Cccee--ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccc---cCCCceechhh Q lcl|NC_020854. 1 MATLR--SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEV---LSDSSSLTPGK 75 (342) Q Consensus 1 MaT~~--~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~---~~~~~~i~~~~ 75 (342) +.+.. .-.++|+.|..-+.+.+.+.+.+.+- +..-+ .+| .+.+|.+... +.+.- ..++..++... T Consensus 143 ~~~~t~~GG~lvP~~~~~~Ii~~l~~~~~i~~~--~~~~~-----~~~-~~~~p~~~~~--~~a~~~~~~~e~~~~~~~~ 212 (434) T protein:vir:62 143 LGLVTGNGSVTIPDFLSKEIITYAQEENFLRRL--GTGVK-----TKE-NIKYPVLVKK--AEAQGHKNERTNNEMPETD 212 (434) T ss_pred hcccccccceecchhhHHHHHHhhhhhhhhhhh--cceec-----cCC-ceEEEEEecC--Ccccceecccccccccccc Confidence 22222 23578998887777777776666442 22111 123 3677876432 33333 24466777777 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+-.+-....++.+.-+.++++...-+.-|..+.+.+++++.+.+..++.+|. 
|..........+ +....+ T Consensus 213 ~~f~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~d~~~l~------G~G~~~~~~g~~---~~~~~~ 283 (434) T protein:vir:62 213 IEFDEIELSPTEFDALATVTKKLLARTGLPIEQIVMDELKKAYVRKETQYMVN------GDEANNINDGAL---AKKAVE 283 (434) T ss_pred cceeeEEeeheeeEeehhhHHHHHhcchHHHHHHHHHHHHHHHHHHHHHHHhc------cCCCCcccccee---eccccc Confidence 76666677777777777777776554555778889999999999999887763 111111111111 111222 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) ..+.....++.|.++...+-.....-+.|+||+.++..|+++ ++++++.+.. ..... ..+.-.+++| T Consensus 284 ~~~~~~~~~d~l~~l~~~l~~~~~~~a~~v~n~~~~~~L~~l------kd~~G~~l~~--~~~~~-----~~g~~~tl~G 350 (434) T protein:vir:62 284 FKTDEKNLYDALVKMKNTPVKEVRKKARWVLNTAALTKIETM------KTDDGFPLLR--PFNQA-----EGGIGYTLLG 350 (434) T ss_pred ccccccchhhHHHHHHhhcchhhhcCCEEEEcHHHHHHHHHh------hccCCCEeec--cCCCc-----cCCCCceecc Confidence 233445678889998888877666777899999999999874 3455443321 11111 1123457999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEecc-eeEeec-CCcceeEec--cCCCcceeEEEEeeEEE---ee-ecceee-ecC Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFTQG-AVASGE-QMAMQTETD--RDILAKSDAMSIDLHYV---YH-PVGAKW-AVT 306 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~~G-Ai~~~~-k~~~~ve~d--r~~~~g~~~l~~r~~y~---~~-~~G~s~-~~~ 306 (342) ++|+++|.+|....+. . ..++|+.= ...++. .....+++. +....++..+.+..++- +| |.-... +.. 
T Consensus 351 ~pV~~~~~~~~~~~~~--~-~~i~~Gdfs~~~i~~~~g~~~i~~~~~~~~~~~~v~~~~~~r~Dgk~i~~~~~~~~~~~~ 427 (434) T protein:vir:62 351 FPVEEEDAIDIPDSPD--T-PVFYFGDFSKFYIQDVIGSLEVQKLVELFSRTNRVGFRIWNLLDAQLIHSPFEVPVYKYV 427 (434) T ss_pred eeeEEecCccCccCCC--c-eEEEEeeccceEEEEeeceeEEEeehhhhcccCceEEEEEeeecceeecCcccceEEEEE Confidence 9999999998654322 2 23444311 111222 223444443 33345555566666553 22 222221 122 Q ss_pred cCCcChH Q lcl|NC_020854. 307 TTNPTRA 313 (342) Q Consensus 307 ~~sPt~~ 313 (342) ++.||-+ T Consensus 428 ~~~~~~~ 434 (434) T protein:vir:62 428 LKAPTGA 434 (434) T ss_pred eccCCCC Confidence 2333333 No 94 >protein:vir:10450 Length: 344 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:184 # MgeName: phiA1122 # Cross-refs: genbank:acc:NP_848297;genbank:gi:30387487;genbank:GeneID:1733971 Probab=99.16 E-value=1.8e-12 Score=84.93 Aligned_cols=287 Identities=14% Similarity=0.125 Sum_probs=176.1 Q ss_pred Ccce-------------------eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCc Q lcl|NC_020854. 1 MATL-------------------RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGD 61 (342) Q Consensus 1 MaT~-------------------~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gd 61 (342) ||.+ .-+++. |+|...|...+.+++.|.. .+..- ++ .+|+++.+|+.. ... T Consensus 1 ma~~~~~~~~n~~~~~~~~~~~~~~al~i-e~~~geV~~~f~~~s~~~~--~~~~r-~i---~~g~s~~~~~iG---~~~ 70 (344) T protein:vir:10 1 MANMTGGQQLGTNQGKDVMAAGDKLALFL-KVFGGEVLTAFARTSVTTS--RHMVR-SI---SSGKSAQFPVLG---RTQ 70 (344) T ss_pred CccccccccCCcccCCccCCccchhHHHH-HHHHHHHHHHHHHHhhhcc--cceee-ee---cccceEEEEeec---eeE Confidence 6611 123455 8898889888888888853 33322 33 469999999753 255 Q ss_pred ccccCCCceech--hhcccceeeeEeee-eccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc Q lcl|NC_020854. 62 FEVLSDSSSLTP--GKITADKQVAAILH-RGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLN 138 (342) Q Consensus 62 a~~~~~~~~i~~--~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~ 138 (342) +.....|+++.. 
+.+...+.+-+|=. .--.+.+.|+-...+..|.+.++.++.+...++.+|+.++..|..+-+... T Consensus 71 ~~~~~~G~~l~~t~~~~~~~e~~l~ID~~~y~~~~VdDiD~~q~~~D~r~~~~~~~G~aLA~~~D~~i~~~la~~a~~~~ 150 (344) T protein:vir:10 71 AAYLAPGENLDDIRKDIKHTEKVITIDGLLTADVLIYDIEDAMNHYDVRSEYTSQLGESLAMAADGAVLAEIAGLCNVES 150 (344) T ss_pred EEeeecCCCCCCCCCCcccceEEEEEcchhhhhhhhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccc Confidence 667888888863 56777777766653 445678889999899999999999999999999999999887743221100 Q ss_pred ---ccc---cchhheeeeccccccccccccc----HHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhh Q lcl|NC_020854. 139 ---ANT---SSSAFFDLCIDSESGDTPTALS----PRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTA 206 (342) Q Consensus 139 ---a~~---~~~~~~~~~~~~~~~~~~~~~~----~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s 206 (342) ... ..+.+......+.. .+....+ ++.|.+|...|.++. ..=..++|.|..|..|.+...+....+. T Consensus 151 ~~~~~~~g~~~~~~~~~~~~~~~-~t~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~~~~~ 229 (344) T protein:vir:10 151 QYNENITGLGTATVIETTQDKTT-LTDQVALGKEIIAALTKARAALTKNYVPSSDRVFYCDPDSYSAILAALMPNAANYA 229 (344) T ss_pred ccccccccccccceeeccccccc-ccchhhhHHHHHHHHHHHHHHHhhcCCCccCCEEEeChHHHHHHhhcccccccccc Confidence 000 00111111111111 1122222 345666666666532 1336789999999999877554222111 Q ss_pred hcccceeeeccceeecccccccceeeeccceEEEeCCcceecc--------CCC---------------cceEEEEEecc Q lcl|NC_020854. 207 DARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGS--------GGS---------------TEYATYFFTQG 263 (342) Q Consensus 207 ~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~--------~~~---------------~~y~t~l~~~G 263 (342) ..+ ....+.++.++|.+|+++..+|.... +.. 
..-...+|-+- T Consensus 230 ---------~~~-----~~~~G~V~~v~G~~V~~Sn~lp~~~~~~~~~~~tg~~~~~~~~~~~~~~~~~s~~~~l~~h~~ 295 (344) T protein:vir:10 230 ---------ALI-----DPEKGSIRNVMGFEVVEVPHLTAGGAGTSREGTTGQKHAFPATKSGNDKVAKDNVIGLFMHRS 295 (344) T ss_pred ---------ccc-----ceeeeEEEEEeceEEEeccccccccCCcccccccCccccccCCcccceeeecceeEEEeechh Confidence 111 12246789999999999999884211 000 01123455667 Q ss_pred eeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecceeeecCcCCcChHHhcCCcCceeecCccccceEEEEec Q lcl|NC_020854. 264 AVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGAKWAVTTTNPTRAQLETVANWSKVYELKNIGIVRATNV 338 (342) Q Consensus 264 Ai~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s~~~~~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~ 338 (342) |++.....++.+|..|+.....+.+.+.+.|+..+ ..|+....++++|| T Consensus 296 A~~~v~~~~~~~e~~r~~~~~~d~i~g~~~~G~~v--------------------------lRPe~a~~v~~~~~ 344 (344) T protein:vir:10 296 AVGTVKLRDLALERARRANFQADQIIAKYAMGHGG--------------------------LRPEAAGAVVFKTK 344 (344) T ss_pred hhhhhhhccceeecccchhHHHHHHHHHhhcccce--------------------------ecccceEEEEeecC Confidence 77777778888888888877777666665555443 23333344444444 No 95 >protein:vir:8885 Length: 347 # NCBI annotation: major capsid protein A # Family: family:all:975 # MgeID: mge:161 # MgeName: gh-1 # Cross-refs: genbank:acc:NP_813774;genbank:gi:29366729;genbank:GeneID:1258837 Probab=99.14 E-value=4.3e-12 Score=82.92 Aligned_cols=288 Identities=15% Similarity=0.101 Sum_probs=174.8 Q ss_pred Cc---------ce--------e-ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcc Q lcl|NC_020854. 1 MA---------TL--------R-SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDF 62 (342) Q Consensus 1 Ma---------T~--------~-~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda 62 (342) || |. . -++++ |+|...|...+.+.+.|.. .+. .-++ .+|+++.+|... .... 
T Consensus 1 ~a~~~~~~~~~~~~g~~~~~~d~~al~i-e~~~geV~~~f~~~s~~~~--~~~-~r~i---~~G~sv~~~~iG---~~~~ 70 (347) T protein:vir:88 1 MANATGGQQIGANQGKGQSAADKLALFL-KVFGGEVLTAFVRRSVTMD--KHM-VRTI---QNGKSASFPVMG---RTKG 70 (347) T ss_pred CCCcccchhhhccCCCCccccchHHHHH-HHHHHHHHHHHHHHhhhhh--ccc-cccc---cCcceEEEeeec---ceee Confidence 66 11 1 24566 9999999888888888854 222 2233 369999999754 2556 Q ss_pred cccCCCceech--hhcccceeeeEeee-eccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc Q lcl|NC_020854. 63 EVLSDSSSLTP--GKITADKQVAAILH-RGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNA 139 (342) Q Consensus 63 ~~~~~~~~i~~--~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a 139 (342) +....|+.+.. ..+...+..-+|=. .-..+.+.|+-..-...|++.+..++.+...+|+.|+.++..|........+ T Consensus 71 ~~~~~g~~l~~~~~~~~~~~~~i~ID~~~y~~~~Vdd~D~~q~~~D~r~~~~~~~g~aLA~~~D~~i~~~l~~~a~~~~~ 150 (347) T protein:vir:88 71 YYLAPGENLDDKRKDIKHSEKVIQIDGLLTSDVLIYDIEDAMNHYDVRAEYSAQLGEALAIAADGAVLAEMAKLCNLPAA 150 (347) T ss_pred eeeccccCCCCCCCCCccceEEEEEechhhhhhhhhhHHHHhhcCCchHHHHHHHHHHHHHHHHHHHHHHHHHhhccccc Confidence 66788888653 46777776666654 3556788899988888899999999999999999999998777433221111 Q ss_pred c------ccchhheeeeccccccccc--ccccHHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhhhcc Q lcl|NC_020854. 140 N------TSSSAFFDLCIDSESGDTP--TALSPRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTADAR 209 (342) Q Consensus 140 ~------~~~~~~~~~~~~~~~~~~~--~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~ 209 (342) . ........+.......+.. ...-++.|.+|...|.+.. ..-..+++.|..|..|.+...+.- .+.. 
T Consensus 151 ~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~---~~~~ 227 (347) T protein:vir:88 151 SNENIAGLGQAVVLNIGAAADLVDVEARGKAILKGLTLARARLTKNYVPAGDRRFYCAPEDYSAILSALMPNA---ANYA 227 (347) T ss_pred cccccCCccccccccccccccccchhhhHHHHHHHHHHHHHHHhhcCCCCCCCEEEeCHHHHHHHhcchhhhh---hhhc Confidence 0 0000000110000000000 0111566788888886643 145889999999999987543211 1111 Q ss_pred cceeeeccceeecccccccceeeeccceEEEeCCcceeccCCC---------------------------cceEEEEEec Q lcl|NC_020854. 210 GTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGS---------------------------TEYATYFFTQ 262 (342) Q Consensus 210 ~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~---------------------------~~y~t~l~~~ 262 (342) . ... ...+.++.++|.+|+++..+|+...+.. .+-..+.+-+ T Consensus 228 ~------~~~-----~~~G~vg~i~G~~V~~s~nlp~~~~~~~~~~~~~~~t~~~~~~~~~~~~~~~~d~~~~~~l~~~~ 296 (347) T protein:vir:88 228 A------LID-----PETGNIRNVMGFEVIEVPHLTVGGAGDNNPADGVAPTNQKHIFPATATGDDRVAQNNVVGLFNHR 296 (347) T ss_pred c------ccc-----hhcceeeeeccceEEEeecccccccccccccccccccccccccccccccccccccCcEEEEEech Confidence 0 001 1134678899999999999996432211 0111244455 Q ss_pred ceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecceeeec-CcCCcCh Q lcl|NC_020854. 263 GAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGAKWAV-TTTNPTR 312 (342) Q Consensus 263 GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s~~~-~~~sPt~ 312 (342) -|++.....++.+|..|+.....+.+.+.+.|+..+.==.-.. 
-..+++- T Consensus 297 ~a~g~v~~~d~~~e~~r~~~~~~d~i~~~~~~G~~~~rPe~a~~~~~~~a~ 347 (347) T protein:vir:88 297 SAVGTVKLKDMALERARRPEFQADQIIGKYAMGHGGLRPEAAGALVFTPAA 347 (347) T ss_pred hhhhheecccceeeeeechhhHHHHhhhhhhhcCceeccceEEEEEeCCCC Confidence 6677777777889999999988888888777776653211100 0001111 No 96 >protein:vir:1383 Length: 421 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:314 # MgeName: phi3626 # Cross-refs: genbank:acc:NP_612835;genbank:gi:20065969;genbank:GeneID:935826 Probab=99.14 E-value=2.2e-11 Score=78.99 Aligned_cols=285 Identities=9% Similarity=0.048 Sum_probs=165.4 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) +++.-....+|+.+..-+.....+.+.+.+ ++..- ..++..+.+|.+..........+.|+..++..+++-.. T Consensus 117 ~t~~~gg~liP~~~~~~Ii~~~~~~~~l~~--l~~~~-----~~~~~~~~~~~~~~~~~~~~~~~~E~~~~~~s~~~f~~ 189 (421) T protein:vir:13 117 MSSTNNGAVIPQEFVNEFEKLKEGYPSLKE--HCHVI-----PVNRNAGKMPVRAGASVDKLANLAKDTELVKAMLKTQP 189 (421) T ss_pred cccCCcceecchhhHHHHHHHHHhhhhhhh--hceee-----eccCCceEEEEeecCCccceeeccccccccccccceeE Confidence 223345556777765545555544444432 22211 12355677887764322334558899999888888777 Q ss_pred eeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccc Q lcl|NC_020854. 81 QVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPT 160 (342) Q Consensus 81 ~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~ 160 (342) -....++.+.-+.++++...-+.-|....+.+++++.+.+..+..++..++|++... . 
T Consensus 190 i~~~~~k~~~~v~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~i~~~~~g~~~~~----------------------~ 247 (421) T protein:vir:13 190 MAYDIDDYGLLAPIDNSLLEDSEINFLEFVNEEFAEFAVNTENAEIVKQAKAVLAEE----------------------T 247 (421) T ss_pred EEeeeeeeEeehhhhHHHHhhhHHHHHHHHHHHHHHHHHHHhhhhHhhhhhhccccc----------------------c Confidence 777778888777887776544555677889999999999999998888887754211 1 Q ss_pred cccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 161 ALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 161 ~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) ..+++.|.+++..+......-.+|+||+..+..|+++ ++++++.+.. . .....-++++|++|++ T Consensus 248 ~~~~d~i~~~~~~l~~~~~~~a~~v~n~~~~~~l~~l------kd~~G~~i~~--~--------~~~~~~~tl~G~pV~~ 311 (421) T protein:vir:13 248 INDYAGLVKTINSLVPNARKRAIIVTNSDGRAYLDGL------MDKQGRPLLK--E--------LSDGGDLVFKGRPVIE 311 (421) T ss_pred ccchHHHHHHHHHhhhhhcCCCEEEEcHHHHHHHHHh------hcCCCceeec--C--------cCCCCCceecceeeEE Confidence 2357788999888877776778999999999999864 3455444332 1 1123346899999999 Q ss_pred eCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCC--cceeEEEEeeEEEeeecc---------eeee--- Q lcl|NC_020854. 241 SDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL--AKSDAMSIDLHYVYHPVG---------AKWA--- 304 (342) Q Consensus 241 dD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~--~g~~~l~~r~~y~~~~~G---------~s~~--- 304 (342) +|.+|.... +. ..++|+. -++.++....+.++..++.. .+...+....+|.+.+.- .+|. T Consensus 312 ~~~~~~~~~---~~-~~~~~gd~~~~~~~~~~~~~~v~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~a~v 387 (421) T protein:vir:13 312 LEESIFDVG---DE-TKFIVSDFKTLIKFMDRKQYLIDQSKEAGYTKNETIARIIERFDVNSPLDKSSDAEKIRKFGVIV 387 (421) T ss_pred eccccccCC---Cc-eEEEEEeccccEEEEEecceEEEeecccccccCeeEEEEEeeecceeecchhhheeeecccceee Confidence 999985432 22 2444553 23555666677787777654 444566666666444321 1221 Q ss_pred -c---CcCCcChHHhcCCcCceeecCccccceEE Q lcl|NC_020854. 
305 -V---TTTNPTRAQLETVANWSKVYELKNIGIVR 334 (342) Q Consensus 305 -~---~~~sPt~~~L~~~~NW~~v~d~k~i~~~~ 334 (342) . ...+++-...-+..-=+..-++.+.-.-+ T Consensus 388 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 421 (421) T protein:vir:13 388 KLQEVLKSSPRSGKNKNESKEEIKEEGEATQQNE 421 (421) T ss_pred ccccccCCCCcCCCCccccchheeeccccccCCC Confidence 0 00011111000000000000111100000 No 97 >protein:vir:99920 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:1611 # MgeName: Halo # Cross-refs: genbank:acc:YP_655524;genbank:gi:109392294;genbank:GeneID:4157089 Probab=99.14 E-value=3.6e-11 Score=77.87 Aligned_cols=282 Identities=13% Similarity=0.117 Sum_probs=154.1 Q ss_pred Ccceecc--ccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhccc Q lcl|NC_020854. 1 MATLRSD--IIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITA 78 (342) Q Consensus 1 MaT~~~d--~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~ 78 (342) |||..++ ..+|+.+.+-+.+.+.+.+.+.+.+-..+ .++..+++|.+... +.+.-+.|++.++..+.+- T Consensus 1 Mat~tt~~g~~vP~~~~~~ii~~~~~~s~l~~~~~~i~-------~~~~~~~~p~~~~~--~~a~wv~Eg~~~~~~~~~f 71 (311) T protein:vir:99 1 MATFGTGNLKNLPRNIADGMVKDVVQGSTVAVLSARKP-------QRFGNEDIITFNGR--PKAEFVGEGQQKSSTTGEF 71 (311) T ss_pred CceecCCCceeccHHHHHHHHHHHHhhchhhhhcceee-------ccCCceEEEEEeCC--ceeEEeecCccccccccee Confidence 9965433 46788887766666666666644221111 23455789998743 6777899999999988887 Q ss_pred ceeeeEeeeeccceeechHHHhh---hcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 79 DKQVAAILHRGRAFEARDLAALA---AGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 79 ~~~~a~i~~~~k~~~~tD~a~~~---~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+.....++.+--+.++++-... +..+..+.+.+++++.+++++++.+|.-...-.+... ..... 
.........+ T Consensus 72 ~~v~l~~~k~~~~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d~~~l~G~g~~~g~~~-~g~~~-~~~~~~~~~~ 149 (311) T protein:vir:99 72 DFVTSTPKKAQVTMRFNEEVQWADEDYQLGVLQTLSEAGAEALARALDLGLYHRINPLTGTVI-PGWSN-YLGAASKRVE 149 (311) T ss_pred eEEEEeeEEEEEeehhhHHHhhcccccHHHHHHHHHHHHHHHHHHHHHHHhhcccCcccCccc-ccccc-ccccccceee Confidence 77777777777777777775422 2346788999999999999999888752110000000 00000 0000001111 Q ss_pred c-cccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceee Q lcl|NC_020854. 156 G-DTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPT 232 (342) Q Consensus 156 ~-~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~ 232 (342) . ..........+.++..++..... ...+|+|||..+..|+++ ++++++.+... ......-++ T Consensus 150 ~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~l------kd~~G~~l~~~---------~~~~~~~~~ 214 (311) T protein:vir:99 150 LTADTIANPDLAIEAAVGLLVANGHPTPVNGLALHPSIAWGLSTA------RYTDGRKKFPE---------LGLGIGVSS 214 (311) T ss_pred ccccccchhHHHHHHHHHHHhhhccCCCccEEEEcHHHHHHHHhh------hccCCCeeecC---------cccCCCCce Confidence 1 11112223445556665544322 234599999999999874 34444333210 011123468 Q ss_pred eccceEEEeCCcceeccCCCc-------ceEEEEEe--cceeEeecCCcceeEeccCCC---------cceeEEEEeeEE Q lcl|NC_020854. 233 YMGLRVIVSDDVNTAGSGGST-------EYATYFFT--QGAVASGEQMAMQTETDRDIL---------AKSDAMSIDLHY 294 (342) Q Consensus 233 ~~G~~VvvdD~~p~~~~~~~~-------~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~---------~g~~~l~~r~~y 294 (342) ++|++|++++.+|.......+ ...-++++ ...+.++..+.+.++..+... .....+....++ T Consensus 215 l~G~Pv~~s~~i~~~~~~~~~~~~~~~~~~~~~~~Gdf~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~~~~r~~~r~ 294 (311) T protein:vir:99 215 FEGIDASVSDTVNGGDEADPDDEDLDAARAVRGIVGDFANGIHWGVQRDIPVELIKYGDPDGQGDLKRHNQIALRLEIVY 294 (311) T ss_pred ecceeeEeecccccccccccccchhhccCcceEEEeeccccEEEEEecCceEEEeecCCCCcchhhhhcCcEEEEEEEee Confidence 999999999988732110000 01112222 234556555566665544321 123445555666 Q ss_pred Eeeecc---eeeecCcC Q lcl|NC_020854. 
295 VYHPVG---AKWAVTTT 308 (342) Q Consensus 295 ~~~~~G---~s~~~~~~ 308 (342) ...++- +..+.+.. T Consensus 295 d~~v~~~~~v~~~~~~A 311 (311) T protein:vir:99 295 GWYVFTDRFVVIENAVA 311 (311) T ss_pred cceecChhHeeeecccC Confidence 655544 23322222 No 98 >protein:vir:80376 Length: 435 # NCBI annotation: gp6, major capsid head protein # Family: family:all:21 # MgeID: mge:1881 # MgeName: phi644-2 # Cross-refs: genbank:acc:YP_001111085;genbank:gi:134288639;genbank:GeneID:4960624 Probab=99.11 E-value=4.7e-11 Score=77.21 Aligned_cols=277 Identities=13% Similarity=0.136 Sum_probs=148.3 Q ss_pred Cc-----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhh Q lcl|NC_020854. 1 MA-----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGK 75 (342) Q Consensus 1 Ma-----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~ 75 (342) ++ +.....++|+.+...+.+.+.+.+.+.+-+. . .+.. ....+.+|.+.. .+++.-+.|++.++..+ T Consensus 130 ~~~~~~~~~~gg~lvP~~~~~~ii~~l~~~~~i~~~~~-~---~v~~--~~~~~~~p~~~~--~~~a~~v~E~~~~~~~~ 201 (435) T protein:vir:80 130 MSLNTLSPGAGGVLVPENLSSEVIELLRPKSVVRKLGA-R---TLPL--SNGNITIPRLKG--GAIVGYIGADTDIPTTQ 201 (435) T ss_pred hhhcccCCCCCccccchhHHHHHHHHHhhhchhhhccc-e---eeec--CCCceEEEEEeC--CcceeeeccCccccccc Confidence 22 1223457888887777666655555533211 0 1111 122578898863 36667789999999888 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcc--hHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeec-- Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGS--DPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCI-- 151 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~--dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~-- 151 (342) .+-.+-....++.+..+.++++...-+.- +..+.+.+++++.+.++.+..++. | ....+.-.+.+..... 
T Consensus 202 ~~f~~i~~~~~k~~~~~~is~ell~ds~~~~~l~~~i~~~l~~a~~~~~d~a~l~---G---~G~~~~p~Gi~~~~~~~~ 275 (435) T protein:vir:80 202 QQFDDLKLTAKKMAALVPIANDLIKYAGVNPNVDQIVVGDLTAAIGAREDKAFIR---D---DGTANTPKGLRFWALPGN 275 (435) T ss_pred cceeeEEEeeEEEEEeehhhHHHHHhhcccHHHHHHHHHHHHHHHHHHHHHHhhc---c---CCCCCcccceeecccccc Confidence 87777777777888778887776444433 455779999999999999987764 2 1111111111111000 Q ss_pred -ccccccccccccHHHHHHHHHHhCcc--ccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccccc Q lcl|NC_020854. 152 -DSESGDTPTALSPRHVAEARAILGDQ--GDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEV 228 (342) Q Consensus 152 -~~~~~~~~~~~~~~~l~~A~~~~GD~--~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~ 228 (342) ...............+.++...+-.. ...-.+|+|||.++..|++. +++++..+.. .. T Consensus 276 ~~~~~~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~l------kd~~G~~l~~-------------~~ 336 (435) T protein:vir:80 276 VITASDGSTLQKIETDLGKAILALENADANLTQPGWIMAPRTFRFLEGL------RDGNGNKVYP-------------EL 336 (435) T ss_pred eeecccccchhhHHHHHHHHHHHhhccccccccCEEEEcHHHHHHHHhh------hccCCceecc-------------CC Confidence 00111111111133455665544332 22346799999999998764 3454443321 11 Q ss_pred ceeeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCC-------------cceeEEEEeeEE Q lcl|NC_020854. 229 SVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDIL-------------AKSDAMSIDLHY 294 (342) Q Consensus 229 ~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~-------------~g~~~l~~r~~y 294 (342) .-++++|++|+++|.+|.......++ ..++|+. .-+.++...++.++..+... .....+....++ T Consensus 337 ~~~~l~G~pv~~~~~~p~~~~~~~~~-~~i~~gd~s~~~i~~~~~~~i~~~~~~~~~~~~~~~~~~f~~n~~~~r~~~r~ 415 (435) T protein:vir:80 337 ANGMLKGYPVGKTTQVPINLGEAGKE-SEIYFTDFGDVFIGEEETLEIDYSKEATYKDADGHMVSAFQRDQTLIRVIAKN 415 (435) T ss_pred CCCeEeeeeeEEeccccccccCCCCc-ceEEEEEcccEEEEeecceEEEEeccccccccccchhhhhhcCcceeeeeeee Confidence 22478999999999998754333222 2334442 12334555667777776542 122334444444 Q ss_pred Eeee---cceeeecCcCCcChHHhcCCcCcee Q lcl|NC_020854. 
295 VYHP---VGAKWAVTTTNPTRAQLETVANWSK 323 (342) Q Consensus 295 ~~~~---~G~s~~~~~~sPt~~~L~~~~NW~~ 323 (342) .+.| ..+.. -++.+|-- T Consensus 416 d~~~~~~~a~~~------------l~~~~~~~ 435 (435) T protein:vir:80 416 DFGPRHVESIAV------------LSGVAWGA 435 (435) T ss_pred CcEeecccceEE------------EeccCCCC Confidence 3332 22222 11111111 No 99 >protein:vir:100884 Length: 389 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1473 # MgeName: Lc-Nu # Cross-refs: genbank:acc:YP_358764;genbank:gi:78000028;genbank:GeneID:3726155 Probab=99.11 E-value=9.5e-11 Score=75.53 Aligned_cols=268 Identities=12% Similarity=0.078 Sum_probs=153.4 Q ss_pred Cc-ceec--cccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hhc Q lcl|NC_020854. 1 MA-TLRS--DIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GKI 76 (342) Q Consensus 1 Ma-T~~~--d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~l 76 (342) |+ ++.+ -.++|+.+..-+.+...+.+.+.+ ++..-+ .++...++|.++. .++....+.|++.++. ++. T Consensus 109 ~~~~t~~~gg~~vP~~~~~~i~~~~~~~~~l~~--~~~~~~-----~~~~~~~~~~~~~-~~~~~~~~~E~~~~~~~~~~ 180 (389) T protein:vir:10 109 TSKVTSTEAGVLIPEEIIYDPTAEVNSVVDLST--LVTKTP-----VTTPKGTYPILKR-ATDRFSSVAELAENPKLAEP 180 (389) T ss_pred hcccccCCcceeehHHHHHHHHHHHHhhhhHHh--hcceee-----ccCCeeEEEEEec-CCCccccccccccccccccc Confidence 54 2233 367888887766666666666643 222111 2355677888764 2455567888887764 566 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +-.+-....++.+.-+.++++...-+.-|..+.+.+++++...+..+..++..+.+.. .. 
T Consensus 181 ~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~i~~g~~~~~--------------------~~ 240 (389) T protein:vir:10 181 EFNKVDWSVATYRGAIPLSEEAIADSAVDLTALVGQSIKEKSVNTYNAMIAPVLQSFT--------------------AK 240 (389) T ss_pred cceeeeeeheeeEeeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhhhhcccc--------------------cc Confidence 6666666777778777887776554555677889999999999888877765443210 01 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) ......+++.+.++....=+... -.+|+||+..+..|+++ ++++++++......+. .....-.+++|+ T Consensus 241 ~~~~~~~~d~l~~~~~~~~~~~~-~a~~~~n~~~~~~L~~l------kd~~G~~i~~~~~~~~-----~~~~~~~~l~G~ 308 (389) T protein:vir:10 241 KTTTDTLVDSLKHILNVDLDPAY-SRALVVTQSLFNTLDTL------KDKNGRYLLHDASDSI-----TDGTAKGTILGV 308 (389) T ss_pred cccccccHHHHHHHHHhhhhhhh-CcEEEecHHHHHHHHHh------hccCCCeeeecCcccc-----cccccccccccc Confidence 12234466777776653222222 26899999999999874 3455554433221111 112334689999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCCcceeEEEEeeEEE---eeecceee---e-cCc Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDILAKSDAMSIDLHYV---YHPVGAKW---A-VTT 307 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~---~~~~G~s~---~-~~~ 307 (342) +|++.+++.....+ .. .+++|+. -++.+.....+.+++.++..-.+ .+..-.++. +++..+.+ + .++ T Consensus 309 pV~~~~~~~~~~~~-~~--~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~~-~~~~~~r~d~~~~~~~a~~~~~~~~~~~ 384 (389) T protein:vir:10 309 PVYVVGDTLLGSLA-GD--QKAFVGDLKRGVLFTDRQQVTLAWEDSKIYGK-YLGAAFRFGVQKADSKAGYFVTNTDVPG 384 (389) T ss_pred eeEEecccccCCCC-Cc--eEEEEeeccccEEEEeecceEEEeeccccccc-eEEEEEEeccEEecccceEEEEeeccCC Confidence 99877654332221 11 2466663 34556666667777766544333 222223333 23444444 1 222 Q ss_pred CCcCh Q lcl|NC_020854. 
308 TNPTR 312 (342) Q Consensus 308 ~sPt~ 312 (342) .+|+- T Consensus 385 ~~~~~ 389 (389) T protein:vir:10 385 SALGK 389 (389) T ss_pred CCCCC Confidence 33443 No 100 >protein:vir:7409 Length: 408 # NCBI annotation: major structural protein # Family: family:all:21 # MgeID: mge:146 # MgeName: P335 # Cross-refs: genbank:acc:NP_839926;genbank:gi:30089896;genbank:GeneID:1260683 Probab=99.10 E-value=1.4e-10 Score=74.69 Aligned_cols=272 Identities=12% Similarity=0.053 Sum_probs=160.6 Q ss_pred Cc-ce--eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hhc Q lcl|NC_020854. 1 MA-TL--RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GKI 76 (342) Q Consensus 1 Ma-T~--~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~l 76 (342) |. +. -...++|+.+...+.+...+.+.|.+ ++..-+. +.+.-.+.+|.+... .+.+..+.|++.++. ++. T Consensus 116 ~~~~~~~~gg~~vP~~~~~~Ii~~~~~~~~l~~--~~~~~~~---~~~~~~~~~~~~~~~-~~~~~~v~E~~~~~~~~~~ 189 (408) T protein:vir:74 116 ETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQ--YVRVESV---STSSGSRVYEKWTDV-TPLKAMDEEDGKIPDLDNP 189 (408) T ss_pred hcccccCCCceeechhHhhHHHHHHhhhcchhh--hcceeec---cCCcceEEEEeecCC-ccccccccccccccccccc Confidence 33 22 24467899988888887777776644 2221111 111123445555421 233456778888774 567 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +-..-....++.+....++++...-+.-|..+.+.+++++.+.++.+..+|.-. + . + . 
T Consensus 190 ~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~~il~G~----G---~---~------------~ 247 (408) T protein:vir:74 190 RLTIIKYLIKRYAGIITATNTLLKDTAENILAWLSSWIAKKVVVTRNQAIIAAM----G---T---V------------P 247 (408) T ss_pred ceeeEEeeeeeEEeeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhcc----c---c---c------------c Confidence 777777788888888888887765566678888999999999999998766421 0 0 0 0 Q ss_pred cccccccHHHHHHHHH-HhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 157 DTPTALSPRHVAEARA-ILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) ......+++.+.+++. .+-.....-.+|+||+..+..|+++ ++++++.+... . .....-++++| T Consensus 248 ~~~~~~~~~~i~~~~~~~l~~~~~~~a~~v~n~~~~~~l~~l------kd~~G~~l~~~--------~-~~~~~~~~l~G 312 (408) T protein:vir:74 248 KKPTIANFDDVITMINTSVDPAIIATSSLLTNQSGLNKLALV------KTAEGKYLLEP--------D-PTKPNSYLIKG 312 (408) T ss_pred cccccccHHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHh------hcCCCceEecc--------C-cCCCCCceecc Confidence 1223456788888765 3434444456899999999999864 34544433221 1 11233468999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeee---cceee--- Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHP---VGAKW--- 303 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~---~G~s~--- 303 (342) ++|++.+..++...+ .+++ +++|+ ..++.+.....+.++.++... .....+....++.+.+ ..|.+ T Consensus 313 ~pV~~~~~~~~~~~~-~~~~-~i~~gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~~~~~~a~~~~~~ 390 (408) T protein:vir:74 313 KQVIVVADRWLPNSG-STVY-PLYYGDMSQAITLFDRENMSLLPTNIGAGAFETDTTKIRVIDRFDVKATDSEALVAGSF 390 (408) T ss_pred eeeEEecCccccccc-CCcc-eEEEEehhccEEEEEecceEEEEeccccchhhcceeeEEEEEeeCcEEecccceEEEEe Confidence 999987653322221 2222 34555 335666767778888777542 4556677777776665 34433 Q ss_pred ec----CcCCcChHHhcC Q lcl|NC_020854. 
304 AV----TTTNPTRAQLET 317 (342) Q Consensus 304 ~~----~~~sPt~~~L~~ 317 (342) +. .+.+|+.+.=+- T Consensus 391 ~~~~~~~~~~~~~~~~~~ 408 (408) T protein:vir:74 391 TAIADQVGNFKTTTSTAV 408 (408) T ss_pred ecccCCCCCCCCCccccC Confidence 11 122333322222 No 101 >protein:vir:1433 Length: 435 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:30 # MgeName: phiE125 # Cross-refs: genbank:acc:NP_536362;genbank:gi:17975167;genbank:GeneID:929171 Probab=99.09 E-value=3e-11 Score=78.29 Aligned_cols=280 Identities=13% Similarity=0.141 Sum_probs=147.3 Q ss_pred Ccc-e--eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MAT-L--RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT-~--~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |.+ + -.-.++|+.+..-+.+.+.+.+.+.+.+. . .+.. ....+++|.+.. .+.+..+.|++.++..+.+ T Consensus 132 ~~~~t~~~gg~~vP~~~~~~ii~~l~~~~~i~~~~~-~---~~~~--~~~~~~~p~~~~--~~~a~~v~E~~~~~~~~~~ 203 (435) T protein:vir:14 132 LNTLSPGAGGVLVPENLSSEVIELLRPKSVVRKLGA-R---TLPL--SNGNITIPRLKG--GAIVGYIGADTDIPTTQQQ 203 (435) T ss_pred cccCCcCCCccccchhHHHHHHHHHhhhchhhhhcc-e---eeec--CCCceEEEEEeC--CcceeeeccCccccccccc Confidence 221 1 12246788876666555555554433211 0 1111 122578898863 3667778999999988887 Q ss_pred cceeeeEeeeeccceeechHHHhhhcch--HHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeee---cc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSD--PMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLC---ID 152 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~d--p~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~---~~ 152 (342) -..-....++.+..+.++++.-.-++-+ ..+.+.+++++.+.++.++.++. +...++.-.+...... +. 
T Consensus 204 f~~i~~~~~k~~~~~~iS~ell~ds~~~~~l~~~i~~~l~~ai~~~~d~a~l~------G~G~~~~p~Gi~~~~~~~~~~ 277 (435) T protein:vir:14 204 FDDLKLTAKKMAALVPIANDLIKYAGVNPNVDQIVVGDLTAAIGAREDKAFIR------DDGTANTPKGLRFWALPSNVI 277 (435) T ss_pred eeEEEeeeEEEEEeehhhHHHHHhhccCHHHHHHHHHHHHHHHHHHHHHHhhc------cCCCCccccceeeccccccee Confidence 7777777777777778877654434434 44669999999999999988763 2111111111111000 00 Q ss_pred cccccccccccHHHHHHHHHHhCcc--ccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccce Q lcl|NC_020854. 153 SESGDTPTALSPRHVAEARAILGDQ--GDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSV 230 (342) Q Consensus 153 ~~~~~~~~~~~~~~l~~A~~~~GD~--~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i 230 (342) ..............+.+....+-.. .-.-.+|+||+..+..|++. ++++++.+.. ...- T Consensus 278 ~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~v~n~~~~~~L~~l------kd~~G~~l~~-------------~~~~ 338 (435) T protein:vir:14 278 TASDASTLQKIETDLGKVILALENADANLTQPGWIMAPRTFRFLEGL------RDGNGNKVYP-------------ELAN 338 (435) T ss_pred ccccccchhhHHHHHHHHHHHhhhccccccCCEEEEcHHHHHHHHHh------hccCCceecc-------------CCCC Confidence 1111111111223445554444332 11235799999999999864 3455443321 1223 Q ss_pred eeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCCc-------------ceeEEEEeeEEEe Q lcl|NC_020854. 231 PTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDILA-------------KSDAMSIDLHYVY 296 (342) Q Consensus 231 ~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~~-------------g~~~l~~r~~y~~ 296 (342) ++++|++|++++.+|.......+. ..++|+. .-+.++...++.++.++.... ....+....++.+ T Consensus 339 g~l~G~Pv~~~~~~p~~~~~~~~~-~~i~~gd~s~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~ 417 (435) T protein:vir:14 339 GMLKGYPVGKTTQVPINLGETGKE-SEIYFTDFGDVFIGEEETLEIDYSKEATYKDADGHMVSAFQRDQTLIRVIAKNDF 417 (435) T ss_pred CeeecceeEeeccccccccCCCcc-ceEEEeecccEEEEEecccEEEEeccccccccccchhhhhhcChhheeeeeeeCc Confidence 578999999999998754332222 2344442 223345556677777765421 1233333333333 Q ss_pred eecceeeecCcCCcChHHhcCCcCcee Q lcl|NC_020854. 
297 HPVGAKWAVTTTNPTRAQLETVANWSK 323 (342) Q Consensus 297 ~~~G~s~~~~~~sPt~~~L~~~~NW~~ 323 (342) .| ..|..--.-++.+|-- T Consensus 418 ~~---------~~~~a~~~l~~~~~~~ 435 (435) T protein:vir:14 418 GP---------RHVESIAVLAGVAWGA 435 (435) T ss_pred ee---------ecccceEEEecCCCCC Confidence 22 1222222333333332 No 102 >protein:vir:4226 Length: 326 # NCBI annotation: observed 35.2Kd protein # Family: family:all:507 # MgeID: mge:89 # MgeName: L5 # Cross-refs: genbank:acc:NP_039681;swissprot:sw:q05223;genbank:gi:9625447;uniprot:Q05223;genbank:GeneID:2942929 Probab=99.07 E-value=8.6e-11 Score=75.76 Aligned_cols=275 Identities=13% Similarity=0.099 Sum_probs=150.2 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |.| .-+.++.|++..+ +.+.+.+.+.+.+ ++.. + ..++..+++|.+.. .+.+.-+.|++.++..+.+ T Consensus 20 ~~~~~~~~g~~ip~~~~~~-ii~~~~~~s~i~~--~~~~---~--~~~~~~~~~p~~~~--~~~a~~v~Eg~~~~~~~~~ 89 (326) T protein:vir:42 20 AQTGDSMFEGYLEPEQAQD-YFAEAEKISIVQQ--FAQK---I--PMGTTGQKIPHWTG--DVSASWIGEGDMKPITKGN 89 (326) T ss_pred eeccccCCcceechhhHHH-HHHHHHhcchhhh--hcce---e--eccCCceEEEEEeC--CcceEEecCCccccccccc Confidence 432 2345666665544 4445555444433 2211 1 13466788998874 3667788999999999999 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhhee----eeccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFD----LCIDS 153 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~----~~~~~ 153 (342) ..+.....++.+..+.++++...-+..|..+.+.+++++.+++++++.++. | .. .+........ ..+.. 
T Consensus 90 f~~i~~~~~k~~~~v~iS~ell~~s~~~~~~~i~~~l~~a~~~~~d~a~l~---G---~g-s~~p~gi~~~~~~~~~~~~ 162 (326) T protein:vir:42 90 MTSQTIAPHKIATIFVASAETVRANPANYLGTMRTKVATAFAMAFDNAAIN---G---TD-SPFPTFLAQTTKEVSLVDP 162 (326) T ss_pred eeEEEEeeEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHHHHhhc---c---cC-CCccccccccccccceeec Confidence 888888889999999999887666677888999999999999999988763 1 11 0000000000 00001 Q ss_pred cccccccccc-HHH-HHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccccccee Q lcl|NC_020854. 154 ESGDTPTALS-PRH-VAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVP 231 (342) Q Consensus 154 ~~~~~~~~~~-~~~-l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~ 231 (342) ........+. .+. +.++..........-..|+||++.+..|+++ ++++++.+....... ........+ T Consensus 163 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~n~~~~~~L~~l------kd~~G~~l~~~~~~~----~~~~~~~~~ 232 (326) T protein:vir:42 163 DGTGSNADLTVYDAVAVNALSLLVNAGKKWTHTLLDDITEPILNGA------KDKSGRPLFIESTYT----EENSPFRLG 232 (326) T ss_pred ccccccccchhHHHHHHHHHhhhhhhccCccEEEEeHHHHHHHHHh------hccCCceeecccccc----CccccccCc Confidence 1111111112 222 3344444445555677899999999999874 345444333211110 011123456 Q ss_pred eeccceEEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCCC----------------cceeEEEEeeEE Q lcl|NC_020854. 232 TYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDIL----------------AKSDAMSIDLHY 294 (342) Q Consensus 232 ~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~~----------------~g~~~l~~r~~y 294 (342) +++|++|++++.+|-.. . ..+++.=. +.++.-.++.++..++.. ..+..+....++ T Consensus 233 ~l~G~pv~~~~~~~~~~------~-~~~~Gd~s~~~~~~~~~~~v~~~~e~~~~~~~~~~~~~~~~~~~d~~~~r~~~~~ 305 (326) T protein:vir:42 233 RIVARPTILSDHVASGT------V-VGYQGDFRQLVWGQVGGLSFDVTDQATLNLGTPQAPNFVSLWQHNLVAVRVEAEY 305 (326) T ss_pred eeeeeeEEEcCCCCCCc------e-EEEEeecceEEEEEecceEEEEeecceeeecccccccchhhhhcCcEEEEEEEEe Confidence 89999999999886421 1 11122110 112333344444433321 223445555555 Q ss_pred Eeeecc-eee-----ecCcCC Q lcl|NC_020854. 
295 VYHPVG-AKW-----AVTTTN 309 (342) Q Consensus 295 ~~~~~G-~s~-----~~~~~s 309 (342) .+.|.- -+| +.++.+ T Consensus 306 d~~v~~~~a~~~l~~~~~~~~ 326 (326) T protein:vir:42 306 AFHCNDKDAFVKLTNVDATEA 326 (326) T ss_pred ccEEecccceEEEeeccccCC Confidence 554422 122 122222 No 103 >protein:vir:1268 Length: 397 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:329 # MgeName: phi-105 # Cross-refs: genbank:acc:NP_690760;genbank:gi:22855000;genbank:GeneID:955203 Probab=99.05 E-value=2.3e-10 Score=73.44 Aligned_cols=264 Identities=12% Similarity=-0.026 Sum_probs=151.2 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hhc Q lcl|NC_020854. 1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GKI 76 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~l 76 (342) |++ .....++|+.+.+.+.+...+.+.+.+ ++..-+. ..+.-.+.+|... ..+.+.-+.|++.++. +.. T Consensus 123 ~~~~~~~~gg~lvP~~~~~~ii~~~~~~~~l~~--~~~~~~~---~~~~~~~~~~~~~--~~~~a~~v~Eg~~~~~~~~~ 195 (397) T protein:vir:12 123 MSGINDEDGGILIPEDIGRQIHEFKRQFEPLEQ--YVTVEPV---TTRSGTRLLEKNA--DMVPFSPVEELGNLPEIDQP 195 (397) T ss_pred ccccccccCcccCchhHHHHHHHhhhhhhhHHh--hcceeec---cCCceeEEEEEec--CCcceeeecccccccccccc Confidence 552 234468899998888877777666644 2211111 1111244455443 2356778899988864 456 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +..+-....++.+..+.++++...-+.-|....+.++++..+++..+..++.-.. . 
+ T Consensus 196 ~~~~v~~~~~k~~~~~~is~e~l~ds~~~l~~~i~~~l~~~~~~~~d~~il~G~g------~-----------------~ 252 (397) T protein:vir:12 196 RFTKVSYSIIDYGGIMTLSNSMLNDSDQAIMTYVAKWFAKKSVVTRNNLILAAIA------S-----------------L 252 (397) T ss_pred cceeEEeeheeeEeeehhhHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHHhccc------c-----------------c Confidence 6666666777777777777776555555777889999999999998877665221 0 0 Q ss_pred cccccccHHHHHHHHH-HhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 157 DTPTALSPRHVAEARA-ILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) ......+++.+.++.. .+-.....-.+|+||+..+..|++. ++++++.+. +.. ..+..-++++| T Consensus 253 ~~~g~~~~~~i~~~~~~~l~~~~~~~a~~~~n~~~~~~L~~l------kd~~G~~l~--~~~-------~~~g~~~~l~G 317 (397) T protein:vir:12 253 KKVDIDGLDGIKKALNVTLDPMVAPGSIVLTNQDGYDWLDTL------KDGTGRYLL--QPD-------PTNPTKKLLDG 317 (397) T ss_pred cccccccHHHHHHHHhhccchhhhCCCEEEEcHHHHHHHHHh------hccCCceee--ccc-------ccCCCCccccc Confidence 1122346777888775 3433444557899999999999864 344433322 111 11233478999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeeecceeeecCcCC Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHPVGAKWAVTTTN 309 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~~G~s~~~~~~s 309 (342) ++|++++.+..... .++ ..++|+. .++.+.....+.++.++... .+...+....++...+.- T Consensus 318 ~pv~~~~~~~~~~~--~~~-~~~~~gd~~~~~~~~~~~~~~i~~~~~~~~~f~~~~~~~r~~~r~d~~~~~--------- 385 (397) T protein:vir:12 318 RPVVPFTNRVLKTQ--KGK-APLIIGNLKEAIVLFDREQQSIASTDTGAGAFETNSTKVRGIEREDVRKWD--------- 385 (397) T ss_pred eeeEEecccccccC--CCc-cEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEec--------- Confidence 99987765432221 122 2355653 34555555666777665432 344556666555554422 Q ss_pred cChHHhcCCcCceeecCccccceEEEEec Q lcl|NC_020854. 
310 PTRAQLETVANWSKVYELKNIGIVRATNV 338 (342) Q Consensus 310 Pt~~~L~~~~NW~~v~d~k~i~~~~~~~~ 338 (342) ++++..+.+.-+ T Consensus 386 -----------------~~a~~~~~~t~~ 397 (397) T protein:vir:12 386 -----------------EDAVVFGQITVE 397 (397) T ss_pred -----------------ccceEEEEEeeC Confidence 111111111111 No 104 >protein:vir:94673 Length: 419 # NCBI annotation: major capsid protein # Family: family:all:585 # MgeID: mge:1527 # MgeName: mu1/6 # Cross-refs: genbank:acc:YP_579208;genbank:gi:93007444;genbank:GeneID:5076792 Probab=99.04 E-value=1.6e-10 Score=74.27 Aligned_cols=271 Identities=10% Similarity=0.049 Sum_probs=157.0 Q ss_pred Cc----ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEE--------ccccccCCCCcccccCCC Q lcl|NC_020854. 1 MA----TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFIN--------VPFWKANLSGDFEVLSDS 68 (342) Q Consensus 1 Ma----T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~--------~P~~~~i~~gda~~~~~~ 68 (342) +. +.-...+.|+.+...+.........+.+ ++..-+. .+..++ +|.|.. .+.+.-+.|+ T Consensus 123 ~~~~~~~~~~~~~~p~~~~~~i~~~~~~~~~i~~--~~~~~~~-----~~~~~~~~~~~~~~~~~~~~--~~~a~~v~Eg 193 (419) T protein:vir:94 123 APAGTITNPNVPHLPQLVPGIVPTTPDLPLLVAD--LLDQQNA-----DYNVLEYIRDTSGTAGAGST--WNKAAVVPEG 193 (419) T ss_pred cccccccCCcccccchhhhHHHHHHHhhhhhhhh--cceeeec-----cCCceeeeeecccccccccc--CcccceecCC Confidence 11 1223467888888777655444333322 2322221 222333 344443 2556778899 Q ss_pred ceechhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH-----HHHHHHhhhcccccc Q lcl|NC_020854. 69 SSLTPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLS-----CLQGVFGSLNANTSS 143 (342) Q Consensus 69 ~~i~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla-----~L~g~~~~~~a~~~~ 143 (342) +.++..+++-.+-....++.+....++.+...-+ .+....+.+++++.+++..+..+|. ..+|++...... 
T Consensus 194 ~~~~~~~~~~~~i~~~~~k~~~~~~is~ell~d~-~~l~~~i~~~la~a~~~~~d~aii~G~G~~~p~Gi~~~~~~~--- 269 (419) T protein:vir:94 194 TAKPQSTLSFDTITTTLKTVAHWLPITRQAADDN-SQLMGYIQGRLTYGLRFLRDRQLLNGNGSTEMQGILTTPGIG--- 269 (419) T ss_pred ccccccccceeeEEeeeeeEEEeehhhHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHHHhccCcccccceecccccc--- Confidence 9999888888877777888887777777654433 4667779999999999999988874 222222111100 Q ss_pred hhheeeecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecc Q lcl|NC_020854. 144 SAFFDLCIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAA 223 (342) Q Consensus 144 ~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~ 223 (342) .+..............++.|.++...+-.......+|+||+..+..|++.. ++...+. ..+ + T Consensus 270 ----~~~~~~~~~~~t~~~~~~~l~~~~~~~~~~~~~~~~~v~n~~~~~~l~~~k------~~~~~~~-~~~-------~ 331 (419) T protein:vir:94 270 ----TYQQPKPTAPATDEPPLVDIRRAKTVAEIAGFPPDGVVVHPQDWESIELDQ------APGSGVF-RVI-------A 331 (419) T ss_pred ----cccccccccccccchhHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHHh------hcCCCce-eec-------C Confidence 011111111223344578899998887776667779999999999998653 2221111 100 1 Q ss_pred cccccceeeeccceEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCC----CcceeEEEEeeEEEee Q lcl|NC_020854. 224 YGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDI----LAKSDAMSIDLHYVYH 297 (342) Q Consensus 224 ~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~----~~g~~~l~~r~~y~~~ 297 (342) ...+..-++++|++|++++.+|-. +++|+ ..++.+....++.++.++.. ..+.+.+....++.+. T Consensus 332 ~~~~~~~~~l~G~pV~~~~~~~~~---------~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~r~~~r~d~~ 402 (419) T protein:vir:94 332 NVQGEATPRIWGLNVVSTVAIAQG---------TALVGGFRQGATLWSRQGITVLMTDSHADFFTANTLVILAEFRANLA 402 (419) T ss_pred CcccCCCccccceeeEEcCCCCCc---------cEEEeeccceEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccE Confidence 112334568999999999998731 12222 22334455556666665543 3566777777777766 Q ss_pred ecc---eeeecCcCCcC Q lcl|NC_020854. 
298 PVG---AKWAVTTTNPT 311 (342) Q Consensus 298 ~~G---~s~~~~~~sPt 311 (342) |+- +..-.-...|| T Consensus 403 v~~~~a~~~~~~~aa~~ 419 (419) T protein:vir:94 403 VYQPKAFVRVTFAAATT 419 (419) T ss_pred EeccccEEEEEeccCCC Confidence 533 32212223455 No 105 >protein:vir:100172 Length: 394 # NCBI annotation: putative major head protein # Family: family:all:21 # MgeID: mge:1524 # MgeName: phi AT3 # Cross-refs: genbank:acc:YP_025031;genbank:gi:48697264;genbank:GeneID:2948270 Probab=99.02 E-value=3.3e-10 Score=72.59 Aligned_cols=275 Identities=11% Similarity=0.078 Sum_probs=149.4 Q ss_pred Cc-ce--eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceec-hhhc Q lcl|NC_020854. 1 MA-TL--RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLT-PGKI 76 (342) Q Consensus 1 Ma-T~--~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~-~~~l 76 (342) +. ++ -....+|+.+..-+.....+.+.+.+ ++..- ..++...++|.... .++....+.|++.++ .+.. T Consensus 111 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~~~l~~--~~~~~-----~~~~~~~~~~~~~~-~~~~~~~~~E~~~~~~~~~~ 182 (394) T protein:vir:10 111 AGHVTSTEAGVLIPEEIIYDPTAEVNSVVDLST--LVTKT-----PVTTPKGTYPILKR-ATDRFSSVAELAENPALAEP 182 (394) T ss_pred hcccccccCceeccHHHHHHHHHHHHhhhhhhh--hceee-----eccCCceEEEEEec-CCCccccccccccccccccc Confidence 22 12 23366787776555555555555532 22111 13566777887653 346666788887776 3556 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +-.+-....++.+.-+.++++.-.-+.-|....+.+++++..++..+..++..+.. . ... 
T Consensus 183 ~~~~v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~il~g~g~----~----------------~~~ 242 (394) T protein:vir:10 183 EFEQVDWSVSTYRGAIPLSEEAIADSAVDLTSLVGQSINEKSVNTYNAMIAPVLQS----F----------------TAK 242 (394) T ss_pred cceeEEeeeeeeEeeehhHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhhcccc----c----------------ccc Confidence 66666666777776677777654444456777799999999999888877653311 0 000 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) ...+..+++.+.++....=+... -++|+||+.++..|+++ ++++++.+......+.. ....-.+++|+ T Consensus 243 ~~~~~~~~d~l~~~~~~~~~~~~-~a~~vmn~~~~~~l~~l------kd~~G~~i~~~~~~~~~-----~~~~~~~L~G~ 310 (394) T protein:vir:10 243 ATTTDTLVDSLKHILNVDLDPAY-SRALVVTQSLFNTLDTL------KDKNGRYLLHDASDSIT-----DGTAKGTVLGV 310 (394) T ss_pred cccccccHHHHHHHHHhhhhhhc-cCEEEecHHHHHHHHHh------hccCCCeeeeccccccc-----cCCcccccccc Confidence 11223456777777654333332 26899999999999874 45655544332221111 12333679999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCCcceeEEEEeeEEEe---eecceeeecCcCCcC Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDILAKSDAMSIDLHYVY---HPVGAKWAVTTTNPT 311 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~---~~~G~s~~~~~~sPt 311 (342) +|++.|.+..... . ++ ..++|+. -++.+.....+.+++.++.... +.+..-.++.+ ++..+.|-.-..... T Consensus 311 PV~~~~~~~~~~~-~-~~-~~i~~gd~s~~~~~~~~~~~~v~~~~~~~~~-~~~~~~~r~d~~~~~~~ai~~~~~~~~~~ 386 (394) T protein:vir:10 311 PVYVVGDALLGSA-A-GD-QKAFVGDLKRGVLFADRQQVTLAWEDSKIYG-RYLGAAFRFGVKQADSNAGYFVTNTDAAS 386 (394) T ss_pred eeEEecccccCCC-C-Cc-eEEEEeeccccEEEEeecceEEEEecccccc-eeEEEEEEeccEEeccccEEEEEeecccC Confidence 9998776543322 1 11 2345442 2344555556677666654433 23333333333 345565521111111 Q ss_pred hHHhcCCc Q lcl|NC_020854. 
312 RAQLETVA 319 (342) Q Consensus 312 ~~~L~~~~ 319 (342) .+.=++|. T Consensus 387 ~~~~~~~~ 394 (394) T protein:vir:10 387 GSTSGTGK 394 (394) T ss_pred CCCCCCCC Confidence 11222333 No 106 >protein:vir:3845 Length: 395 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:322 # MgeName: phi adh # Cross-refs: genbank:acc:NP_050151;swissprot:trembl:q9t1f6;genbank:gi:9633043;uniprot:Q9T1F6;genbank:GeneID:1262163 Probab=99.02 E-value=3.7e-10 Score=72.29 Aligned_cols=270 Identities=11% Similarity=0.010 Sum_probs=155.7 Q ss_pred Cc---ce--eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccC-CCCcccccCCCceechh Q lcl|NC_020854. 1 MA---TL--RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKAN-LSGDFEVLSDSSSLTPG 74 (342) Q Consensus 1 Ma---T~--~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i-~~gda~~~~~~~~i~~~ 74 (342) |+ +. -.-.++|+.+..-+.+...+.+.+.+ ++..-+ .++....+|+|... ..+.+..+.|++.++.. T Consensus 105 ~~~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~--~~~~~~-----~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~ 177 (395) T protein:vir:38 105 VTSGTTGTGNAGLTIPEDIQLQIRTLTRSFTSLES--LANVEN-----VTTSHGSRVYEKLADITPLKDLDDESALIGDN 177 (395) T ss_pred HhhccCccCCCceecchhHhhHHHHHHHhhcchhh--hcceee-----ccCCcceEEEEeeccCCccccccccccccccc Confidence 22 22 24467888887777777777666644 221111 23444456666532 12344567888888643 Q ss_pred -hcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccc Q lcl|NC_020854. 75 -KITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDS 153 (342) Q Consensus 75 -~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~ 153 (342) ..+-..-....++.+....++++...-+.-|..+.+.+++++.+.+..+..++.-.. . .. 
T Consensus 178 ~~~~f~~v~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~il~g~g----~--~~------------- 238 (395) T protein:vir:38 178 DDPELTVVKYLIHRYAGITTVTNTLLKDTVDNIIQWLVNWAAKKDVVTRNAKILEVMG----K--AP------------- 238 (395) T ss_pred cccceeeEEeeeeeeEeehhhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc----c--cc------------- Confidence 455555555666677667776665444555678889999999999998877765221 0 00 Q ss_pred ccccccccccHHHHHHHHH-HhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceee Q lcl|NC_020854. 154 ESGDTPTALSPRHVAEARA-ILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPT 232 (342) Q Consensus 154 ~~~~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~ 232 (342) ......+++.+.+++. .+......-.+|+||+..+..|++. +++++.++.. .. ..+..-.+ T Consensus 239 ---~~~~~~~~~~i~~~~~~~l~~~~~~~a~~v~n~~~~~~L~~l------kd~~G~~l~~--~~-------~~~~~~~~ 300 (395) T protein:vir:38 239 ---KKPTISQFDNIKDLENNTLDPAIESTSSFITNQSGYNILSKV------KDADGRYLMQ--PD-------VTSPDKYL 300 (395) T ss_pred ---cccccccHHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHHh------hccCCceeec--cC-------cCCCCcce Confidence 0112335677777765 3444455567899999999999874 4455443321 10 11234568 Q ss_pred eccceEEEeCCcceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeeec---ceee Q lcl|NC_020854. 233 YMGLRVIVSDDVNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHPV---GAKW 303 (342) Q Consensus 233 ~~G~~VvvdD~~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~~---G~s~ 303 (342) ++|++|+++|.++..... +. .+++|+ ..++.++...++.++..+... .....+....+|...|. .|.. T Consensus 301 l~G~pV~~~~~~~~~~~~--~~-~~i~~gd~~~~~~i~~~~~~~i~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~ 377 (395) T protein:vir:38 301 IDGKPVIRIADKWLPDVS--GS-HPLYFGDLKQGITLFDRQQMQIDTTNVGAGSFEHDTTKLRFIDRFDVQLIDDGAFAA 377 (395) T ss_pred eccceeEEecccccCcCC--Cc-ceEEEEeccccEEEEEecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEE Confidence 999999999987665432 22 245566 345666777777888777543 44556666666665543 3322 Q ss_pred ---ecC-cCCcChHHhcCCc Q lcl|NC_020854. 
304 ---AVT-TTNPTRAQLETVA 319 (342) Q Consensus 304 ---~~~-~~sPt~~~L~~~~ 319 (342) +.. ...|.-.. +|. T Consensus 378 ~~~~~~~~~~~~~~~--~~~ 395 (395) T protein:vir:38 378 ASFKTVANQAQGTAG--TGK 395 (395) T ss_pred EEeecccCCCCCccC--CCC Confidence 211 11122112 222 No 107 >protein:vir:105038 Length: 428 # NCBI annotation: major capsid head protein precursor # Family: family:all:21 # MgeID: mge:1465 # MgeName: phiKO2 # Cross-refs: genbank:acc:YP_006586;genbank:gi:46402092;genbank:GeneID:2777903 Probab=99.02 E-value=9.1e-11 Score=75.62 Aligned_cols=278 Identities=11% Similarity=0.100 Sum_probs=149.5 Q ss_pred Ccce--eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhccc Q lcl|NC_020854. 1 MATL--RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITA 78 (342) Q Consensus 1 MaT~--~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~ 78 (342) +.+. -.-..+|+-+.+-+.+.+.+.+.+.+-|. . .+.. +...+++|.+.. .+.+.-+.|++.++..+.+- T Consensus 127 ~~~~~~~gg~liP~~~~~~ii~~l~~~~~l~~~~~-~---~~~~--~~g~~~~p~~~~--~~~a~~v~Eg~~~~~~~~~f 198 (428) T protein:vir:10 127 ISTAAGSGGVLIPQNIHSEVIELLRDRTIVRKLGA-R---SIPL--PNGNMSLPRLAG--GATASYTGENQDAKVSEARF 198 (428) T ss_pred hcccccCCccccchhHHHHHHHHHhhhchhhhhcc-e---eeec--CCcceEEEEEeC--CcceeeeccCccccccccce Confidence 2222 23367888776656566666555544221 0 1111 122478898753 36677789999999888776 Q ss_pred ceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeec--ccccc Q lcl|NC_020854. 79 DKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCI--DSESG 156 (342) Q Consensus 79 ~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~--~~~~~ 156 (342) .+-....++.+.-+.++++...-+.-+..+.+.+++++.+.++.++.+|. | ....+.-.+.+..... ..... 
T Consensus 199 ~~i~~~~~k~~~~v~is~ell~ds~~~l~~~i~~~l~~ai~~~~d~~~l~---G---~G~~~~p~Gi~~~~~~~~~~~~~ 272 (428) T protein:vir:10 199 DDVKLTAKTMIAMVPISNALIGRAGFNVEQLVLQDILTAISVREDKAFMR---D---DGTGDTPIGMKARATQWNRLLPW 272 (428) T ss_pred eeEEeeeEEEEEeehhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhc---c---CCCCccccccccccccccccccc Confidence 66666667777777887776544555778889999999999999987763 1 1111000001000000 00001 Q ss_pred cccccccHHHH---HHHHHHh---CccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccce Q lcl|NC_020854. 157 DTPTALSPRHV---AEARAIL---GDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSV 230 (342) Q Consensus 157 ~~~~~~~~~~l---~~A~~~~---GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i 230 (342) ...+..+.+.+ .+++..+ +.....-..|+||+..+..|++. ++++++.+.. +..- T Consensus 273 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~n~~~~~~L~~l------kd~~G~~i~~-------------~~~~ 333 (428) T protein:vir:10 273 AADAAVNLDTIDTYLDSIILMSMDGNSNMISSGWGMSNRTYMKLFGL------RDGNGNKVYP-------------EMAQ 333 (428) T ss_pred cccccccHHHHHHHHHHHHHhhhccccccccCEEEEcHHHHHHHHHh------hccCCceecc-------------CCCC Confidence 11223334433 3333222 12233356899999999999874 3454443331 1122 Q ss_pred eeeccceEEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCCCc----c---------eeEEEEeeEEEe Q lcl|NC_020854. 231 PTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDILA----K---------SDAMSIDLHYVY 296 (342) Q Consensus 231 ~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~~~----g---------~~~l~~r~~y~~ 296 (342) ++++|++|+++|.+|.......+ ...++|+.-+ +.++...++.++.+|.... + ...+....++.+ T Consensus 334 g~l~G~pv~~~~~~p~~~~~~~~-~~~i~~gd~s~~~i~~~~~i~i~~~~~~~~~~~~~~~~~~f~~~~~~~R~~~r~d~ 412 (428) T protein:vir:10 334 GMLKGYPIQRTSAIPANLGEGGK-ESEIYFADFNDVVIGEDGNMKVDFSKEASYIDTDGKLVSAFSRNQSLIRVVTEHDI 412 (428) T ss_pred CeeeceeeEEeccccccccCCCc-cceEEEEecceEEEEEecceEEEeecccccccccccccchhhcchhheeeeeeeCc Confidence 47899999999999875322222 2344444322 2334445566766665321 1 111222222211 Q ss_pred eecceeeecCcCCcChHHhcCCcCc Q lcl|NC_020854. 
297 HPVGAKWAVTTTNPTRAQLETVANW 321 (342) Q Consensus 297 ~~~G~s~~~~~~sPt~~~L~~~~NW 321 (342) ....|.-=.+.++.+| T Consensus 413 ---------~v~~p~a~~~~t~~~~ 428 (428) T protein:vir:10 413 ---------GFRHPEGLVLGTGVLF 428 (428) T ss_pred ---------eeeccceEEEEeccCC Confidence 1234555556666677 No 108 >protein:vir:102655 Length: 322 # NCBI annotation: Hypothetical protein # Family: family:all:6384 # MgeID: mge:1624 # MgeName: VP2 # Cross-refs: genbank:acc:YP_052979;genbank:gi:50282923;genbank:GeneID:2948122 Probab=99.00 E-value=9.5e-11 Score=75.52 Aligned_cols=291 Identities=11% Similarity=0.008 Sum_probs=167.4 Q ss_pred CcceeccccchhHHHHHHH-hhhHHhhhhhhcCccccchhhhccCCCCEEEcccccc---CCCCccccc-CCCc-eechh Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVI-EQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKA---NLSGDFEVL-SDSS-SLTPG 74 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~-~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~---i~~gda~~~-~~~~-~i~~~ 74 (342) |+|++...|+ +.|.+-+. ....+.++|.+. +. ... +..+++++..|--.. .-.+....- .+++ +.++. T Consensus 13 Ms~~i~~~fv-~qy~~~v~~~~qq~~s~L~~t--V~-~~~--~~~~~~~~~~~~~~~~~~~~~~~~~~~~~d~~~dtp~~ 86 (322) T protein:vir:10 13 IAGDIDQAFV-QTYETTLRILSQQKSAKLKQY--CQ-HKN--ESSESHNWETLASMDPDAVKRKRSRQQSADGTYPTPVN 86 (322) T ss_pred eechhhhHHH-HHHHHHHHHHHHHhhhhhhcc--cc-ccc--ccccccceeecccccccccccccccccccCcccCCCcc Confidence 8998888888 77776554 444455677543 32 222 123456655543211 101222221 1111 45555 Q ss_pred hcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 75 KITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 75 ~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) .+..+...+......-++-+.|+...-...||....+++.+.+++|+.|+.+++.+.|.- .....+..+.. ..+.. 
T Consensus 87 ~~~~~~r~~~~~d~~~~~~VDd~D~~k~~~D~~~~~~~~~a~AL~R~~D~~I~~a~~g~a---~~~~~gt~v~~-~ss~~ 162 (322) T protein:vir:10 87 NKPFAKRRTNVDTYDTGHVVEQEDISQMLLDPNSALITSQAYAMARKTDDLIIAGAWKPA---SIKGTGQPVEF-LATQE 162 (322) T ss_pred ccccceEEEeecccccceecchHHHHHhhcCchHHHHHHHHHHhhhHHHHHHHhhhhccc---ccccccccccc-CCCcc Confidence 666666666666566677777888777888999999999999999999998887665432 11111111110 01111 Q ss_pred cccccccccHHHHHHHHHHhCcccc--C-eEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccccccee Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQGD--K-LTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVP 231 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~~~--~-~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~ 231 (342) .++....++.+.|.+|.++|..+.- + -..+++.|..+.+|.+..-+ ..+|..+.... ..++.++ T Consensus 163 i~~g~~g~t~~kl~~a~~~l~~~dvp~d~~R~~vv~p~~~~~LL~d~~~---ts~D~~~~~~l----------~~~G~ig 229 (322) T protein:vir:10 163 IGDGTKPISFDYVTEITERFLENEIEPEVSKVIVIGPTQARKLLQITEA---TSADYTSAMDL----------QSKGIIT 229 (322) T ss_pred cccCccchhHHHHHHHHHHHHhcCCCCCCCeEEEeCHHHHHHHhcchhh---hhhhcccchhh----------hhcCeee Confidence 2233457788899999999987422 2 36899999999999875432 23433221111 1234588 Q ss_pred eeccceEEEeCCcceecc----------CCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeecce Q lcl|NC_020854. 232 TYMGLRVIVSDDVNTAGS----------GGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGA 301 (342) Q Consensus 232 ~~~G~~VvvdD~~p~~~~----------~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~ 301 (342) .++|..++++.++|..++ ....+..||..-+.|+.+....++.++.++++.+.. 
+.+.|..|.+|- T Consensus 230 ~~lGf~~i~s~~lp~~~~t~~~~~~~~~~~~~~~~~~a~~k~Av~~a~~~dv~~~i~~~~~~~~----a~~I~~~~~~Ga 305 (322) T protein:vir:10 230 NWMGYTWIVSTRLDKFDPTQWGMAAEDGPQGDEIWCIAMTDMALGYHSCKDIWTKVAEDPSASF----AWRIYSAFTADC 305 (322) T ss_pred eeeeEEEEEeccCCccccccccccccCCCCccceeEEEEecCceeEEEeeeeeEEeeccCCcch----hhhhhhhhhhCc Confidence 999999999999986432 233466799999999999998888777665555432 122222222221 Q ss_pred eeecCcCCcChHHhcCCcCceeecCccccceEEEEecCCC Q lcl|NC_020854. 302 KWAVTTTNPTRAQLETVANWSKVYELKNIGIVRATNVSNF 341 (342) Q Consensus 302 s~~~~~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~~~ 341 (342) .= .++|.| |++.-+--| T Consensus 306 ~r---------------------i~~~gV--v~i~~~e~~ 322 (322) T protein:vir:10 306 VR---------------------VEDEHI--FKLRLKNSL 322 (322) T ss_pred eE---------------------eccCcE--EEEEEeccC Confidence 10 011100 000000000 No 109 >protein:vir:1025 Length: 408 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:20 # MgeName: bIL286 # Cross-refs: genbank:acc:NP_076679;genbank:gi:13095788;genbank:GeneID:920362 Probab=99.00 E-value=6.5e-10 Score=70.94 Aligned_cols=274 Identities=11% Similarity=0.049 Sum_probs=158.3 Q ss_pred Cc-ceec--cccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEcccccc-CCCCcccccCCCceech-hh Q lcl|NC_020854. 1 MA-TLRS--DIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKA-NLSGDFEVLSDSSSLTP-GK 75 (342) Q Consensus 1 Ma-T~~~--d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~-i~~gda~~~~~~~~i~~-~~ 75 (342) |. +..+ ...+|+.+...+.+...+.+.+.+ ++..-+ .++....+|+++. ...+.+.-+.|++.++- +. 
T Consensus 116 ~~~~t~~~gg~~vP~~~~~~Ii~~~~~~~~l~~--~~~~~~-----~~~~~~~~~~~~~~~~~~~a~~v~E~~~~~~~~~ 188 (408) T protein:vir:10 116 ETSGSDSAAGLTIPQDIRTMINTLVRQYDSLQQ--YVRVES-----VSTSNGSRVYEKWTDVTPLTVMDAEDGKIPDLDN 188 (408) T ss_pred hhcccccCCceeccHhHHHHHHHHHHhhchhhh--hcceee-----ccCCcceEEEeeccccccceeeecCccccccccC Confidence 33 2333 367899998888877777776644 222111 1233344444432 12345566888888874 44 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+-.+-....++.+....++++.-.-+.-|....+.+++++.+.+..+..++.-..+ . T Consensus 189 ~~~~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~~~~il~g~g~-------~--------------- 246 (408) T protein:vir:10 189 PQLTIIKYLIKRYAGIITATNTSLKDTAENILAWLSSWIAKKVVVTRNQAIIEVMKA-------A--------------- 246 (408) T ss_pred cceeeEEeeeeeEEeeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHHhhcccc-------c--------------- Confidence 555666666667777777777665445567788899999999999988877653211 0 Q ss_pred ccccccccHHHHHHHHH-HhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 156 GDTPTALSPRHVAEARA-ILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) .......+++.+.+++. .+-.....-..|+||+..+..|++. ++++++.+.. .. 
.....-.+++ T Consensus 247 ~~~~~~~~~~~l~~~~~~~~~~~~~~~a~~v~n~~~~~~l~~l------kd~~G~~i~~--~~-------~~~~~~~~l~ 311 (408) T protein:vir:10 247 PKKPTIAKFDDVITMINTAVDPAIIATSSLLTNQSGLNKLALV------KTAEGKYLLE--PD-------PTKPNSYLIK 311 (408) T ss_pred ccccccccHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh------hccCCceEec--cC-------cCCCCCceec Confidence 01122346778888764 3434444456899999999999874 3444443321 11 1123346899 Q ss_pred cceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeeec---cee--- Q lcl|NC_020854. 235 GLRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHPV---GAK--- 302 (342) Q Consensus 235 G~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~~---G~s--- 302 (342) |++|++.+..++...+ .+.+ .++|+. .++.+.....+.++.++... .+...+....++.+.+. +|. T Consensus 312 G~PV~~~~~~~~~~~~-~~~~-~i~~gd~~~~~~~~~~~~~~v~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~~~ 389 (408) T protein:vir:10 312 GKQVIVVADRWLPNTG-STVY-PLYYGDMSQAITLFDRENMSLLPTNIGAGAFETDTTKIRVIDRFDVKATDSEALVAGS 389 (408) T ss_pred ceeeEEecccccCccC-CCce-EEEEEehhccEEEEEecceEEEEcccccchhhcCceEEEEEEeeccEEeccccEEEEE Confidence 9999997654332222 2232 355653 34666667778888777653 45667888888776653 333 Q ss_pred eec-CcCCcChHHhcCCcC Q lcl|NC_020854. 303 WAV-TTTNPTRAQLETVAN 320 (342) Q Consensus 303 ~~~-~~~sPt~~~L~~~~N 320 (342) ++. +...|...-.+.++- T Consensus 390 ~~~~~~~~~~~~~~~~~~~ 408 (408) T protein:vir:10 390 FSAIADQVGNFKTTTSTAV 408 (408) T ss_pred eeccccCCCCCCCCCcccC Confidence 221 111122111111111 No 110 >protein:vir:3136 Length: 322 # NCBI annotation: hypothetical protein # Family: family:all:11728 # MgeID: mge:64 # MgeName: VpV262 # Cross-refs: genbank:acc:NP_640318;genbank:gi:21234405;genbank:GeneID:956058 Probab=98.95 E-value=7.8e-11 Score=76.00 Aligned_cols=297 Identities=9% Similarity=0.028 Sum_probs=158.9 Q ss_pred Ccc----e-eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhh Q lcl|NC_020854. 
1 MAT----L-RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGK 75 (342) Q Consensus 1 MaT----~-~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~ 75 (342) |+| . ....|.||+|++.+...+.++..+.. ++... .. + -||+|.||-.. .-...++..+++|+++. T Consensus 1 ~~~~n~ts~~qafi~~EiWsa~il~~l~~~Lv~~~--~~~~~-d~-g--~GDtV~InsIg---~~tV~dY~~~~~i~~d~ 71 (322) T protein:vir:31 1 MSTGNNTSNTQALIVSEIWADEIEDILHEKLLDVN--IARVV-DF-P--DGDKLTIPSVG---TPVVRSRPEQGDFTFDN 71 (322) T ss_pred CCCCCCcccceEEeehhhhHHHHHHHhhhhhhhhh--hhccc-cc-C--CCCeEEecccc---ccccccccCCCCccccc Confidence 994 2 23348899999999887777554422 22211 11 1 29999998643 35567788999999999 Q ss_pred cccceeeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 76 ITADKQVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 76 lt~~~~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) +++.+..-+|- ..-.++.+.| ...=..+|.++.+.++.+...++..|.-+...|+.--+.......-..+.+....-. T Consensus 72 ltt~~~~l~IDq~KYfaf~VdD-D~~Qa~~dl~~~~~~~aa~ala~~~D~fva~lL~~gA~~~~~~~~p~vin~~~~~iv 150 (322) T protein:vir:31 72 LDTGEISIILRDEVYAGNAISK-KLRQDSRWISNVGAMLPAEQARAIMERYQTDLLALGNAQFAGQNDPNVINGVPHRFV 150 (322) T ss_pred CCCceEEEEEehhhhhccccch-hHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhccCCcceecCCcccee Confidence 99988876653 2444567777 333356788888888889888888888887766542211111110001110000001 Q ss_pred cccccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceee Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPT 232 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~ 232 (342) ...+.....++.|.++..+|.+..= .=..+||.|.+++.|+....+..+. .+.+.. .....+...+ ..-++. 
T Consensus 151 ~~gt~~~~ay~~lv~l~~kLdkanVP~~gR~vVV~P~~~~~L~~i~~~~~l~-~D~rf~-~i~~sG~a~g----~~~Vg~ 224 (322) T protein:vir:31 151 GTGTDQTMDVTDFSRVNYVMTQSKMPMGGMIGIIDPSVAHHLETITNISNIS-NNPRWE-GIVESGIAPD----MQFVRS 224 (322) T ss_pred ccCCCchhhHHHHHHHHHHhccccCCCCCeEEEeCchhhhhhhhhhhhhhhh-cccccc-ccccccchhh----HHHHHH Confidence 1123445678999999999977432 3478899999999886644332211 111110 0111111111 123788 Q ss_pred eccceEEEeCCcceec----------cCCCcceEEEEEe--cceeE--eecCCcceeEeccCCCcceeEEEEeeEEEeee Q lcl|NC_020854. 233 YMGLRVIVSDDVNTAG----------SGGSTEYATYFFT--QGAVA--SGEQMAMQTETDRDILAKSDAMSIDLHYVYHP 298 (342) Q Consensus 233 ~~G~~VvvdD~~p~~~----------~~~~~~y~t~l~~--~GAi~--~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~ 298 (342) .+|..|++|..++... ....++|..|... .|+.- ..-++=+..|..|+..+-.+-+..+.+|+..+ T Consensus 225 ~~GF~V~~SN~l~~~~~~i~aG~d~~~t~ag~~n~f~~~~~~~~~~~~~~~~~l~~~e~~r~~~~~~d~~~~~~~~g~g~ 304 (322) T protein:vir:31 225 VYGIDLFVSNLLADANETINAGGDARSTTAGKCNMFMNVSDMGLLPFVVAWKEMPTTKSFIDDYNDDLNTATTARWGNGL 304 (322) T ss_pred HhceeeeeeccccccccccccCcccccccceeecccccccchhhhhhhhHhhhhhhhhcccCccccccceeeeeeeccee Confidence 9999999999875210 0011111111110 01110 01112234466676666555555555555443 Q ss_pred c-ceee--ecCcCCcChH Q lcl|NC_020854. 299 V-GAKW--AVTTTNPTRA 313 (342) Q Consensus 299 ~-G~s~--~~~~~sPt~~ 313 (342) . --+- -.+...|+-= T Consensus 305 ~r~e~l~~~~a~~~~~~~ 322 (322) T protein:vir:31 305 VRDENLVCVLANADKVTF 322 (322) T ss_pred ecccceEEEEeccccccC Confidence 1 1111 0111112111 No 111 >protein:vir:102119 Length: 404 # NCBI annotation: phage major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1641 # MgeName: phiSM101 # Cross-refs: genbank:acc:YP_699941;genbank:gi:110804052;genbank:GeneID:4206662 Probab=98.94 E-value=9.5e-10 Score=70.04 Aligned_cols=277 Identities=9% Similarity=-0.004 Sum_probs=151.1 Q ss_pred Ccc-e--eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhh-- Q lcl|NC_020854. 
1 MAT-L--RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGK-- 75 (342) Q Consensus 1 MaT-~--~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~-- 75 (342) |++ . -.-..+|+.+..-+.+...+.+.|.+ ++...+. ..+.-.+.+|... ....+..+.|++.++.+. T Consensus 110 ~~~~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~--l~~~~~~---~~~~g~~~~~~~~--~~~~~~~v~e~~~~~~~~~~ 182 (404) T protein:vir:10 110 ISENIDEDGGYAVPEDIQTKINTRLKDTTDLYN--MVDYEPV---FTRSGSRTYEKRS--KQKPMKPLSENQQIPTNGDN 182 (404) T ss_pred hccccCCCCceeechhHHHHHHHHHhhhhhHhh--hhceeec---cCCccceEEEEec--CCcceeeccccccccccccc Confidence 442 2 23356788776666666666555533 2211111 1112234455443 235566788888776543 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) ++...-....++.+.-+.++++...-+..+....+.+++++.+++..+..+|. | ........+........+ T Consensus 183 ~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~~il~---G---~g~~~~~~gi~~~~~~~~-- 254 (404) T protein:vir:10 183 GKLERFNFKLKDLADFMSIPNDLLKFADKSLEDWIINWFVDKVRITRNAEILY---G---AGGDEHATGIMTANKFKK-- 254 (404) T ss_pred cceeeeEeeheeeEeeehhhHHHHhhcHHHHHHHHHHHHHHHHHHHHHHHHhh---c---CCCCCcccceeeccccce-- Confidence 44455555566666667777765444445677779999999999999887763 2 111111111111111111 Q ss_pred ccccccccHHHHHHHHHH-hCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAI-LGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) ........++.+.+++.. +-.....-.+|+|||..+..|+++ ++++++.+.. . 
.......++++ T Consensus 255 ~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~v~n~~~~~~L~~l------kd~~G~~l~~--~-------~~~~~~~~~l~ 319 (404) T protein:vir:10 255 ITLPKSPALKDFKKCKNVELLNVFKATSSWIVNQDGFNYLDSL------EDKTGRPYLQ--P-------DPKDPTQYRFL 319 (404) T ss_pred eeccccccHHHHHHHHHhhhhccccCCCEEEEcHHHHHHHHHh------hccCCceeec--c-------CcCCCCCcccc Confidence 112233457778777653 333333446799999999999875 3444433321 1 11233456899 Q ss_pred cceEEEeCC-cceeccCCCcceEEEEEe--cceeEeecCCcceeEeccCC----CcceeEEEEeeEEEeee---cceee- Q lcl|NC_020854. 235 GLRVIVSDD-VNTAGSGGSTEYATYFFT--QGAVASGEQMAMQTETDRDI----LAKSDAMSIDLHYVYHP---VGAKW- 303 (342) Q Consensus 235 G~~VvvdD~-~p~~~~~~~~~y~t~l~~--~GAi~~~~k~~~~ve~dr~~----~~g~~~l~~r~~y~~~~---~G~s~- 303 (342) |++|++.+. +|... .++ .+++|+ ..++.+.....+.++.+++. ..+...+....++.+.+ ..+.. T Consensus 320 G~PV~~~~~~~~~~~---~~~-~~~~~gd~s~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~~~~r~d~~v~~~~a~~~~ 395 (404) T protein:vir:10 320 GLPVIELPNDLLLST---ESA-IPVLLGDTKEAYKYVSDGAYELATTNIGAGAFETNTTKARIIMRIDGNVKDSEALLIA 395 (404) T ss_pred ceeeEEecccccCCC---CCc-cEEEEEeccccEEEEEecceEEEEeccccchhhcCceEEEEEEeeccEEecccceEEE Confidence 999985543 43222 111 345666 34566666666777776543 24666777777777665 34443 Q ss_pred e-cCcCCcC Q lcl|NC_020854. 304 A-VTTTNPT 311 (342) Q Consensus 304 ~-~~~~sPt 311 (342) + .+..+|. T Consensus 396 ~~~~aa~~~ 404 (404) T protein:vir:10 396 EIPVESVQA 404 (404) T ss_pred EeecccCCC Confidence 1 2234566 No 112 >protein:vir:101607 Length: 379 # NCBI annotation: major capsid protein precursor # Family: family:all:585 # MgeID: mge:1646 # MgeName: 11b # Cross-refs: genbank:acc:YP_112497;genbank:gi:53793597;uniprot:Q5ZGF6;genbank:GeneID:3101715 Probab=98.92 E-value=1.2e-09 Score=69.41 Aligned_cols=261 Identities=11% Similarity=0.028 Sum_probs=148.0 Q ss_pred Ccc--eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhccc Q lcl|NC_020854. 
1 MAT--LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITA 78 (342) Q Consensus 1 MaT--~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~ 78 (342) |.+ .... .+|+.+..-+.+...+.+.+.+ ++..- ...+..+++|.....-.+....+.|++.++..+++. T Consensus 109 ~~~~~~~~~-~ip~~~~~~ii~~~~~~~~i~~--~~~~~-----~~~~~~~~~~~~~~~~~~~~~~v~Eg~~~~~~~~~f 180 (379) T protein:vir:10 109 MTLPVNLTG-AQPKDYNFDVVLNPSQMLNVSD--IVGAV-----SISGGTYTFVRENGAGEGAIGAQVEGATKGQKDYDI 180 (379) T ss_pred cccCCCCcc-ccchhhhhHHHHhHHhhhhHHh--hceee-----eccCCceEEEEeecCCCcccccccCCccccccccce Confidence 332 2333 4577666666665555555532 22221 123567888887543233445578999998888887 Q ss_pred ceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccc Q lcl|NC_020854. 79 DKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDT 158 (342) Q Consensus 79 ~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~ 158 (342) ++-....++.+.-..++++.-.-+ .+....+.+++++..+++.+..++..+. ++...... . . T Consensus 181 ~~i~~~~~k~~~~~~iS~ell~D~-~~l~~~i~~~la~~~~~~~~~~~~~g~~-------~~~~~~~~------~----~ 242 (379) T protein:vir:10 181 SMIDVNTDFIAGFTRYSKKMANNL-PFLTSFIPNALRRDYAKAENAAFNAVLA-------ANATASTE------I----I 242 (379) T ss_pred eeeEeeeeeEEeeehhhHHHHhhH-HHHHHHHHHHHHHHHHHHHHHHHhcccc-------cccccccc------c----c Confidence 777777777777677776653222 2455667788888888887776655332 11101000 0 1 Q ss_pred cccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceE Q lcl|NC_020854. 159 PTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRV 238 (342) Q Consensus 159 ~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~V 238 (342) ....+.+.+.++...+.+..-.-.+|+||+..+..|++. ++++++.+. +.+.. 
.....-.+++|++| T Consensus 243 ~~~~~~d~i~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l------kd~~G~~l~--~~~~~-----~~~~~~~~l~G~pv 309 (379) T protein:vir:10 243 TNKNKVEMLINEIAKQENLDFPVTAIVLRPTDYYDILVT------QKSVGAGYG--LPGVV-----TQDNGVLRINGIPL 309 (379) T ss_pred cCcccHHHHHHHHHhhhhccCCCCEEEEcHHHHHHHHHh------hccCCceec--cCCcc-----CCCCCcceecceee Confidence 122346789999888877777778999999999999864 344443221 11111 11233458899999 Q ss_pred EEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccC----CCcceeEEEEeeEEEeeecc---eee-ecCcC Q lcl|NC_020854. 239 IVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRD----ILAKSDAMSIDLHYVYHPVG---AKW-AVTTT 308 (342) Q Consensus 239 vvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~----~~~g~~~l~~r~~y~~~~~G---~s~-~~~~~ 308 (342) ++++.||- ++...-=|...++.+. +...++..+. -....+.+....++.+.++- +-+ +-+.. T Consensus 310 v~s~~~~a------g~~~~gdf~~~~~~~~--~~~~i~~~~~~~~~f~~~~~~~r~~~R~~~~v~~p~a~v~~~~~~~ 379 (379) T protein:vir:10 310 FRATWLAA------NKYYVGDWTRVTKVTT--EGLSLEFSEVEGTNFVKNNITARIEAQVALAVEQPAALIFGDFTAV 379 (379) T ss_pred EecCCCCC------CceEEeecccEEEEEE--eceEEEEeecccccccCCcEEEEEEEEeccEEecCccEEEEEecCC Confidence 99998863 1111111223333332 2344444433 23556677888888777643 222 11222 No 113 >protein:vir:3870 Length: 400 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:82 # MgeName: A2 # Cross-refs: genbank:acc:NP_680487;swissprot:trembl:q8ltc0;genbank:gi:22296527;interpro:IPR006444;uniprot:Q8LTC0;genbank:GeneID:951713 Probab=98.92 E-value=9.1e-10 Score=70.16 Aligned_cols=258 Identities=10% Similarity=0.040 Sum_probs=145.6 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hhcccc Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GKITAD 79 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~lt~~ 79 (342) +++.....++|+.+..-+.+...+.+.+.+ ++..- ..++...++|.+.. .++.+..+.|+...+. ...+-. 
T Consensus 137 ~~~~~gg~~vP~~~~~~ii~~~~~~~~l~~--~~~~~-----~~~~~~~~~~~~~~-~~~~~~~~~E~~~~~~~~~~~f~ 208 (400) T protein:vir:38 137 VKAADAASTIPETISNTPQRELQTVVDLKP--FTNVF-----QASTQKGTYPTVAN-ATTKMVTVAELEKNPAMAKPEFK 208 (400) T ss_pred ccccCCcccccHHHHHHHHHHHHhhhhhhh--cceeE-----eccCcceEEEEEec-CCCccccccccccccccccccce Confidence 122234568888887777766666665533 22111 12455778888763 3466677888887764 345555 Q ss_pred eeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccccc Q lcl|NC_020854. 80 KQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTP 159 (342) Q Consensus 80 ~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~ 159 (342) .-....++.+.-+.++++...-+.-|..+.+.+++++...+..+..++..+.+ +... T Consensus 209 ~i~~~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~~~~~~~~~~i~~~~~~-----------------------~~~~ 265 (400) T protein:vir:38 209 PVNWSVETYRQALPVSQESIDDSAIDLVGLIAQNGQQIKVNTTNGAVATLLKG-----------------------FTAK 265 (400) T ss_pred eeEeehhheeeehhhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhhhhcccc-----------------------cccc Confidence 55566667777777777654444556667788888888777766555432211 0112 Q ss_pred ccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEE Q lcl|NC_020854. 160 TALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVI 239 (342) Q Consensus 160 ~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vv 239 (342) ...+++.+.++....-+... -++|+|||..+..|+++ ++++++++.... 
.....-++++|++|+ T Consensus 266 ~~~~~~~~~~~~~~~~~~~~-~a~~v~~~~~~~~l~~l------kd~~G~~i~~~~---------~~~~~~~~l~G~pv~ 329 (400) T protein:vir:38 266 TISSVDDLKHINNVDLDPAY-SRVIIASQSFYNFLDTV------KDGNGRYLLQDS---------ILTPSGKSVLGMPIA 329 (400) T ss_pred ccccHHHHHHHHHhhhhhhh-CcEEEEcHHHHHHHHHh------hccCCCeeeecC---------cCCCCccccccceeE Confidence 23456777777665544433 37899999999999875 445544443211 112334689999999 Q ss_pred EeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCCcceeEEEEeeEEEee---ecceeeecCcCCcChHH Q lcl|NC_020854. 240 VSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYH---PVGAKWAVTTTNPTRAQ 314 (342) Q Consensus 240 vdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~---~~G~s~~~~~~sPt~~~ 314 (342) ++|.+|....+. ..++|+. -++.+.....+.++..++.... +.+....++.+. +..+.+ T Consensus 330 ~~~~~~~~~~g~----~~~~~gd~s~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~r~d~~~~~~~a~~~----------- 393 (400) T protein:vir:38 330 VVSDDTLGAAGE----AHAFLGDIKRAILFANRADFMVRWVDDQIYG-QFLQAGMRFGVSVADEKAGYF----------- 393 (400) T ss_pred EecccccCCCCc----eEEEEEeccccEEEEeecceEEEEecccccc-eeEEEEEEeccEEecccceEE----------- Confidence 999998654322 2345543 2344454455666655443222 233333333332 233333 Q ss_pred hcCCcCceeecCccccceEEEEecC Q lcl|NC_020854. 315 LETVANWSKVYELKNIGIVRATNVS 339 (342) Q Consensus 315 L~~~~NW~~v~d~k~i~~~~~~~~~ 339 (342) ..+.+.. T Consensus 394 ------------------l~~~~~a 400 (400) T protein:vir:38 394 ------------------LTYTPKA 400 (400) T ss_pred ------------------EEeecCC Confidence 1111111 No 114 >protein:vir:102873 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1492 # MgeName: Cherry # Cross-refs: genbank:acc:YP_338137;genbank:gi:77020198;genbank:GeneID:3703782 Probab=98.92 E-value=2e-09 Score=68.32 Aligned_cols=268 Identities=9% Similarity=0.009 Sum_probs=150.7 Q ss_pred Cc--c-eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCE--EEccccccCCCCcccccCCCceechh- Q lcl|NC_020854. 
1 MA--T-LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDF--INVPFWKANLSGDFEVLSDSSSLTPG- 74 (342) Q Consensus 1 Ma--T-~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~t--i~~P~~~~i~~gda~~~~~~~~i~~~- 74 (342) |+ | .-...++|+.+...+.+...+.+.+.+ ++..-+ .++.. ..+|.... .+.+.-+.|+..++.. T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~--~~~~~~-----~~~~~~~~~~~~~~~--~~~a~~v~E~~~~~~~~ 176 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSFDALEQ--YVTVEP-----VRTRSGSRVLEKNSD--MIPFAEITEMGEIPETD 176 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhhhhhhh--hceeee-----ccCCceeEEEEeecC--Cccceeecccccccccc Confidence 44 2 234567899888888777777776644 221111 12333 34454432 2556678888888643 Q ss_pred hcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 75 KITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 75 ~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) ..+-.+-....++.+.-..++++.-.-+.-|..+.+.+++++.+.+..+..++..... T Consensus 177 ~~~~~~v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~---------------------- 234 (392) T protein:vir:10 177 NPKFSNVQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIEK---------------------- 234 (392) T ss_pred cccceeEEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhcccc---------------------- Confidence 3455555556667776677776654334446678899999999998888776542210 Q ss_pred cccccccccHHHHHHHHH-HhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeee Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARA-ILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTY 233 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~ 233 (342) +......+++.+.+++. .+......-..|+||+..+..|++. ++++++.+.... 
.....-+++ T Consensus 235 -~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~l------kd~~G~~l~~~~---------~~~~~~~tl 298 (392) T protein:vir:10 235 -LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKL------KDKDGKYILQSD---------PTQKNKKLF 298 (392) T ss_pred -ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh------hccCCCeEeecC---------ccCCccccc Confidence 11123456788888874 4555555568899999999999874 455554433211 112345688 Q ss_pred ccceEEEe-CCcceec-cCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeee---ccee Q lcl|NC_020854. 234 MGLRVIVS-DDVNTAG-SGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHP---VGAK 302 (342) Q Consensus 234 ~G~~Vvvd-D~~p~~~-~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~---~G~s 302 (342) +|+++|+. |.++... ....+. ..++|+. -++.+.....+.++.++... .++..+....++...| .++. T Consensus 299 lG~~~v~~~~~~~~~~~~~~~~~-~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~ 377 (392) T protein:vir:10 299 AGTNPVVVVSNRFLKSKGTTAKK-APLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAV 377 (392) T ss_pred cCcccEEEecccccCCCcccCCc-eEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceE Confidence 99876654 3333221 112222 3455553 34555666667777776432 3445666666665554 3443 Q ss_pred e-ecCcCCcChHHhc Q lcl|NC_020854. 303 W-AVTTTNPTRAQLE 316 (342) Q Consensus 303 ~-~~~~~sPt~~~L~ 316 (342) . +.+...|+-+--+ T Consensus 378 ~l~~~~~a~~~~~~~ 392 (392) T protein:vir:10 378 YGEIDLSAPVEQPQG 392 (392) T ss_pred EEEecccccccCCCC Confidence 3 1111122211111 No 115 >protein:vir:102082 Length: 392 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1503 # MgeName: Fah # Cross-refs: genbank:acc:YP_512315;genbank:gi:89152484;genbank:GeneID:3953075 Probab=98.92 E-value=2e-09 Score=68.32 Aligned_cols=268 Identities=9% Similarity=0.009 Sum_probs=150.7 Q ss_pred Cc--c-eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCE--EEccccccCCCCcccccCCCceechh- Q lcl|NC_020854. 
1 MA--T-LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDF--INVPFWKANLSGDFEVLSDSSSLTPG- 74 (342) Q Consensus 1 Ma--T-~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~t--i~~P~~~~i~~gda~~~~~~~~i~~~- 74 (342) |+ | .-...++|+.+...+.+...+.+.+.+ ++..-+ .++.. ..+|.... .+.+.-+.|+..++.. T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~--~~~~~~-----~~~~~~~~~~~~~~~--~~~a~~v~E~~~~~~~~ 176 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSFDALEQ--YVTVEP-----VRTRSGSRVLEKNSD--MIPFAEITEMGEIPETD 176 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhhhhhhh--hceeee-----ccCCceeEEEEeecC--Cccceeecccccccccc Confidence 44 2 234567899888888777777776644 221111 12333 34454432 2556678888888643 Q ss_pred hcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 75 KITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 75 ~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) ..+-.+-....++.+.-..++++.-.-+.-|..+.+.+++++.+.+..+..++..... T Consensus 177 ~~~~~~v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~---------------------- 234 (392) T protein:vir:10 177 NPKFSNVQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIEK---------------------- 234 (392) T ss_pred cccceeEEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhcccc---------------------- Confidence 3455555556667776677776654334446678899999999998888776542210 Q ss_pred cccccccccHHHHHHHHH-HhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeee Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARA-ILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTY 233 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~ 233 (342) +......+++.+.+++. .+......-..|+||+..+..|++. ++++++.+.... 
.....-+++ T Consensus 235 -~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~l------kd~~G~~l~~~~---------~~~~~~~tl 298 (392) T protein:vir:10 235 -LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKL------KDKDGKYILQSD---------PTQKNKKLF 298 (392) T ss_pred -ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh------hccCCCeEeecC---------ccCCccccc Confidence 11123456788888874 4555555568899999999999874 455554433211 112345688 Q ss_pred ccceEEEe-CCcceec-cCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeee---ccee Q lcl|NC_020854. 234 MGLRVIVS-DDVNTAG-SGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHP---VGAK 302 (342) Q Consensus 234 ~G~~Vvvd-D~~p~~~-~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~---~G~s 302 (342) +|+++|+. |.++... ....+. ..++|+. -++.+.....+.++.++... .++..+....++...| .++. T Consensus 299 lG~~~v~~~~~~~~~~~~~~~~~-~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~ 377 (392) T protein:vir:10 299 AGTNPVVVVSNRFLKSKGTTAKK-APLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAV 377 (392) T ss_pred cCcccEEEecccccCCCcccCCc-eEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceE Confidence 99876654 3333221 112222 3455553 34555666667777776432 3445666666665554 3443 Q ss_pred e-ecCcCCcChHHhc Q lcl|NC_020854. 303 W-AVTTTNPTRAQLE 316 (342) Q Consensus 303 ~-~~~~~sPt~~~L~ 316 (342) . +.+...|+-+--+ T Consensus 378 ~l~~~~~a~~~~~~~ 392 (392) T protein:vir:10 378 YGEIDLSAPVEQPQG 392 (392) T ss_pred EEEecccccccCCCC Confidence 3 1111122211111 No 116 >protein:vir:107593 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1491 # MgeName: Gamma # Cross-refs: genbank:acc:YP_338188;genbank:gi:77020144;genbank:GeneID:3703724 Probab=98.92 E-value=2e-09 Score=68.32 Aligned_cols=268 Identities=9% Similarity=0.009 Sum_probs=150.7 Q ss_pred Cc--c-eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCE--EEccccccCCCCcccccCCCceechh- Q lcl|NC_020854. 
1 MA--T-LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDF--INVPFWKANLSGDFEVLSDSSSLTPG- 74 (342) Q Consensus 1 Ma--T-~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~t--i~~P~~~~i~~gda~~~~~~~~i~~~- 74 (342) |+ | .-...++|+.+...+.+...+.+.+.+ ++..-+ .++.. ..+|.... .+.+.-+.|+..++.. T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~--~~~~~~-----~~~~~~~~~~~~~~~--~~~a~~v~E~~~~~~~~ 176 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSFDALEQ--YVTVEP-----VRTRSGSRVLEKNSD--MIPFAEITEMGEIPETD 176 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhhhhhhh--hceeee-----ccCCceeEEEEeecC--Cccceeecccccccccc Confidence 44 2 234567899888888777777776644 221111 12333 34454432 2556678888888643 Q ss_pred hcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 75 KITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 75 ~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) ..+-.+-....++.+.-..++++.-.-+.-|..+.+.+++++.+.+..+..++..... T Consensus 177 ~~~~~~v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~---------------------- 234 (392) T protein:vir:10 177 NPKFSNVQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIEK---------------------- 234 (392) T ss_pred cccceeEEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhcccc---------------------- Confidence 3455555556667776677776654334446678899999999998888776542210 Q ss_pred cccccccccHHHHHHHHH-HhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeee Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARA-ILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTY 233 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~ 233 (342) +......+++.+.+++. .+......-..|+||+..+..|++. ++++++.+.... 
.....-+++ T Consensus 235 -~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~l------kd~~G~~l~~~~---------~~~~~~~tl 298 (392) T protein:vir:10 235 -LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKL------KDKDGKYILQSD---------PTQKNKKLF 298 (392) T ss_pred -ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh------hccCCCeEeecC---------ccCCccccc Confidence 11123456788888874 4555555568899999999999874 455554433211 112345688 Q ss_pred ccceEEEe-CCcceec-cCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeee---ccee Q lcl|NC_020854. 234 MGLRVIVS-DDVNTAG-SGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHP---VGAK 302 (342) Q Consensus 234 ~G~~Vvvd-D~~p~~~-~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~---~G~s 302 (342) +|+++|+. |.++... ....+. ..++|+. -++.+.....+.++.++... .++..+....++...| .++. T Consensus 299 lG~~~v~~~~~~~~~~~~~~~~~-~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~ 377 (392) T protein:vir:10 299 AGTNPVVVVSNRFLKSKGTTAKK-APLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAV 377 (392) T ss_pred cCcccEEEecccccCCCcccCCc-eEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceE Confidence 99876654 3333221 112222 3455553 34555666667777776432 3445666666665554 3443 Q ss_pred e-ecCcCCcChHHhc Q lcl|NC_020854. 303 W-AVTTTNPTRAQLE 316 (342) Q Consensus 303 ~-~~~~~sPt~~~L~ 316 (342) . +.+...|+-+--+ T Consensus 378 ~l~~~~~a~~~~~~~ 392 (392) T protein:vir:10 378 YGEIDLSAPVEQPQG 392 (392) T ss_pred EEEecccccccCCCC Confidence 3 1111122211111 No 117 >protein:vir:105004 Length: 392 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:1490 # MgeName: W Beta # Cross-refs: genbank:acc:YP_459969;genbank:gi:85701384;genbank:GeneID:3882145 Probab=98.92 E-value=2e-09 Score=68.32 Aligned_cols=268 Identities=9% Similarity=0.009 Sum_probs=150.7 Q ss_pred Cc--c-eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCE--EEccccccCCCCcccccCCCceechh- Q lcl|NC_020854. 
1 MA--T-LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDF--INVPFWKANLSGDFEVLSDSSSLTPG- 74 (342) Q Consensus 1 Ma--T-~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~t--i~~P~~~~i~~gda~~~~~~~~i~~~- 74 (342) |+ | .-...++|+.+...+.+...+.+.+.+ ++..-+ .++.. ..+|.... .+.+.-+.|+..++.. T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~s~l~~--~~~~~~-----~~~~~~~~~~~~~~~--~~~a~~v~E~~~~~~~~ 176 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSFDALEQ--YVTVEP-----VRTRSGSRVLEKNSD--MIPFAEITEMGEIPETD 176 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhhhhhhh--hceeee-----ccCCceeEEEEeecC--Cccceeecccccccccc Confidence 44 2 234567899888888777777776644 221111 12333 34454432 2556678888888643 Q ss_pred hcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 75 KITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 75 ~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) ..+-.+-....++.+.-..++++.-.-+.-|..+.+.+++++.+.+..+..++..... T Consensus 177 ~~~~~~v~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~~~~g~g~---------------------- 234 (392) T protein:vir:10 177 NPKFSNVQYAVKDRAGILPLSRSLLQDSDQNILKYVTKWLGKKSKVTRNVLILGVIEK---------------------- 234 (392) T ss_pred cccceeEEeeeeeEEEeehhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhcccc---------------------- Confidence 3455555556667776677776654334446678899999999998888776542210 Q ss_pred cccccccccHHHHHHHHH-HhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeee Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARA-ILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTY 233 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~-~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~ 233 (342) +......+++.+.+++. .+......-..|+||+..+..|++. ++++++.+.... 
.....-+++ T Consensus 235 -~~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~l------kd~~G~~l~~~~---------~~~~~~~tl 298 (392) T protein:vir:10 235 -LTKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDKL------KDKDGKYILQSD---------PTQKNKKLF 298 (392) T ss_pred -ccccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHHh------hccCCCeEeecC---------ccCCccccc Confidence 11123456788888874 4555555568899999999999874 455554433211 112345688 Q ss_pred ccceEEEe-CCcceec-cCCCcceEEEEEec--ceeEeecCCcceeEeccCCC----cceeEEEEeeEEEeee---ccee Q lcl|NC_020854. 234 MGLRVIVS-DDVNTAG-SGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDIL----AKSDAMSIDLHYVYHP---VGAK 302 (342) Q Consensus 234 ~G~~Vvvd-D~~p~~~-~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~----~g~~~l~~r~~y~~~~---~G~s 302 (342) +|+++|+. |.++... ....+. ..++|+. -++.+.....+.++.++... .++..+....++...| .++. T Consensus 299 lG~~~v~~~~~~~~~~~~~~~~~-~~~~~gdfs~~~~i~~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~ 377 (392) T protein:vir:10 299 AGTNPVVVVSNRFLKSKGTTAKK-APLIIGDLKEAIVLFKREDMELASTDVGGKAFTRNTLDLRAIQRDDVQMWDNEAAV 377 (392) T ss_pred cCcccEEEecccccCCCcccCCc-eEEEEEehhceEEEEeecceEEEEeccccchhhcCceEEEEEEeeccEEecccceE Confidence 99876654 3333221 112222 3455553 34555666667777776432 3445666666665554 3443 Q ss_pred e-ecCcCCcChHHhc Q lcl|NC_020854. 303 W-AVTTTNPTRAQLE 316 (342) Q Consensus 303 ~-~~~~~sPt~~~L~ 316 (342) . +.+...|+-+--+ T Consensus 378 ~l~~~~~a~~~~~~~ 392 (392) T protein:vir:10 378 YGEIDLSAPVEQPQG 392 (392) T ss_pred EEEecccccccCCCC Confidence 3 1111122211111 No 118 >protein:vir:6324 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:132 # MgeName: phiKMV # Cross-refs: genbank:acc:NP_877471;genbank:gi:33300843;uniprot:Q7Y2D3;genbank:GeneID:1482613 Probab=98.92 E-value=1.6e-10 Score=74.28 Aligned_cols=287 Identities=12% Similarity=0.055 Sum_probs=171.9 Q ss_pred Ccc-------------eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCC Q lcl|NC_020854. 
1 MAT-------------LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSD 67 (342) Q Consensus 1 MaT-------------~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~ 67 (342) |++ -..+|+. |+|..-|...+.+++.|.. ....- +| .+|+++.+|+-. ...++.... T Consensus 1 ms~~~~~tr~~~~~s~~d~al~l-e~f~geV~~af~~~s~~~~--~~~~r-ti---~~g~s~~~~~iG---~~~~~~~~p 70 (335) T protein:vir:63 1 MSFLNDLTRPNYAGKNADVDIHL-EEHLGIVDKHFAYTSKFAP--LMNIR-DL---RGSNVVRLDRLG---NVEAKGRRA 70 (335) T ss_pred CCCcccchhhhcccccchhheeh-hhhhhhHHHHHHhhhhhcc--cccee-ee---ccceeEEEeeee---eeeeecccC Confidence 652 1123444 7777777777777777753 22211 23 469999999853 256777888 Q ss_pred CceechhhcccceeeeEeee-eccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-HHHHhhhcccc---- Q lcl|NC_020854. 68 SSSLTPGKITADKQVAAILH-RGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCL-QGVFGSLNANT---- 141 (342) Q Consensus 68 ~~~i~~~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L-~g~~~~~~a~~---- 141 (342) |+.+........+..-+|=. .--...+.|+-...+.-|...+++++++...++..|+.++.+| ++.-....+.. T Consensus 71 G~~l~~~~~~~~k~~itVD~ll~a~~~I~dlDe~~~~yDvRse~s~e~G~aLA~~~D~~~~~~i~~aa~~~a~~~~~~~~ 150 (335) T protein:vir:63 71 GEELERSRVVNDKWNLTVDTLLYLRHQFDHQDEWTQSFDMRKEVAELDGQELARKFDQACLIQVIKAAAMDAPVDLEDAF 150 (335) T ss_pred CcCcCCCCccccceEEEecceeechhhhhhHHHHhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHhhccccCccccCCCc Confidence 99888887777665554432 2233558888888888899999999999999999999988544 33221111110 Q ss_pred cchhheeeecccccccccccccHHHHH----HHHHHhCccc--c---CeEEEEEchHHHHHHHhhhhhhhhhhhhcccce Q lcl|NC_020854. 142 SSSAFFDLCIDSESGDTPTALSPRHVA----EARAILGDQG--D---KLTAVAMHSKVYYDLVERRAIDYVSTADARGTS 212 (342) Q Consensus 142 ~~~~~~~~~~~~~~~~~~~~~~~~~l~----~A~~~~GD~~--~---~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~ 212 (342) ..+....+..... ++.-+++.|. 
+|.+.|-+++ + .-.+++|.|++|..|.+..-+--......+ T Consensus 151 ~~G~~~~~~~tg~----~~~~~~~~l~~a~~~a~~~L~e~dVP~~~~~dr~~vv~P~~y~~Ll~~~~l~n~~~~~s~--- 223 (335) T protein:vir:63 151 SPGVLEKLDLTGL----TAKQAADKIVRMHRRVVETFIDRDLGDAVYSEGLTPMSPRVFSLLLEHDKLMNVEYQATG--- 223 (335) T ss_pred CCCcceeeeeccC----cccccHHHHHHHHHHHHHHHHhccCCCcccCceEEEeChHHHHHHhcccccccccccccc--- Confidence 0111111111111 1111344444 5666676543 2 237899999999999986321111011100 Q ss_pred eeeccceeecccccccceeeeccceEEEeCCcceeccCC-------------CcceEEEEEecceeEeecCCcceeEecc Q lcl|NC_020854. 213 TTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGG-------------STEYATYFFTQGAVASGEQMAMQTETDR 279 (342) Q Consensus 213 ~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~-------------~~~y~t~l~~~GAi~~~~k~~~~ve~dr 279 (342) .+ .....+.+...+|.+|+.+..+|...... ..+...+++-+.|++...-.++..|..+ T Consensus 224 ---~~-----~~~~~g~v~~v~Gv~V~~sn~lP~~~~t~~~lg~a~n~~~~d~~~~~~~~~~~~Al~t~~~~~vt~e~~~ 295 (335) T protein:vir:63 224 ---AT-----NDYVKSRVAILNGVKVLETPRFATKAIAAHPLGRHFNVSAEESERQIALFLPSKTLITAQVAPVQAKLWE 295 (335) T ss_pred ---cc-----ccccCceeEEeeceEEEeeccCCCCCcccccccccCCccccccceeEEEEEecceEEEEEEeecccceee Confidence 00 01124678999999999999998643211 1123577888899999888888888888 Q ss_pred CCCcceeEEEEeeEEEeeecc------eeeecCc-CCcCh Q lcl|NC_020854. 280 DILAKSDAMSIDLHYVYHPVG------AKWAVTT-TNPTR 312 (342) Q Consensus 280 ~~~~g~~~l~~r~~y~~~~~G------~s~~~~~-~sPt~ 312 (342) +.....+.+.+.+.|++.++= +..+..| .+-|. 
T Consensus 296 ~~~~~~~~i~~~~a~G~g~lRPe~a~~i~~tg~~~~~~~~ 335 (335) T protein:vir:63 296 DNEKFSWVLDTFQMYNIGARRPDTAGAIELKGIGAFDITA 335 (335) T ss_pred ccchhhHHhHHHHHcCCcccccceEEEEEEcCCCceeecC Confidence 887777777777776665432 1111111 11111 No 119 >protein:vir:81227 Length: 413 # NCBI annotation: gp6, major capsid protein # Family: family:all:585 # MgeID: mge:1893 # MgeName: BFK20 # Cross-refs: genbank:acc:YP_001456736;genbank:gi:157168379;hssp:P49861;interpro:IPR006444;uniprot:Q9MBJ9;genbank:GeneID:5580350 Probab=98.90 E-value=2.4e-09 Score=67.82 Aligned_cols=278 Identities=13% Similarity=0.062 Sum_probs=147.8 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccC--CCCcccccCCCceechhh Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKAN--LSGDFEVLSDSSSLTPGK 75 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i--~~gda~~~~~~~~i~~~~ 75 (342) ++ +....-.+|+.+.+-+.+...+.+.+.+ ++.. + ..+|..+.+|..... ..+.+..+.|++.++-.. T Consensus 118 ~~~~~~~~~~~~vp~~~~~~ii~~~~~~~~l~~--~~~~---~--~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~ 190 (413) T protein:vir:81 118 STATLTDEFQGGYGTTWNRNIIYRRREKLVVAD--LMDN---L--TMTNTTIKYLMEKANRVVEGGFKTVAEGGKKPYMR 190 (413) T ss_pred hhcccccccccccchhhHHHHHHHHhhhhhHHh--hcce---e--eccCCceeEEEeccccccccccceecCcccccccC Confidence 22 2345556788887777777777666643 2221 1 124556677765532 224566789998887666 Q ss_pred ccc-ceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 76 ITA-DKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 76 lt~-~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) +.. .......++.+.-..++++.-.-+ ......+.+++++.+++..++.+|. | ......-.+.+......+. 
T Consensus 191 ~~~f~~i~~~~~k~~~~~~iS~ell~ds-~~l~~~i~~~la~~~~~~~d~~~l~---G---~G~~~~~~Gi~~~~~~~~~ 263 (413) T protein:vir:81 191 FADFDIVTESLSKIAGLTKITDEMIEDY-DFLVSYINARLLEELAIEEERQLLL---G---DGTGNNLTGLLKRDGIQTL 263 (413) T ss_pred cccceeeEeeeeeEEEeehhhHHHHHHH-HHHHHHHHHHHHHHHHHHHHHHHhc---c---CCCCCcccccccccccccc Confidence 654 334445556666667776643223 2455568888888888888887664 1 1111100011100000011 Q ss_pred cccccccccHHHHHHHHHHhCcc-ccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeee Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQ-GDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTY 233 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~-~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~ 233 (342) . ...+.-.++.+.++...+-.. ...-.+|+||+..+..|+++ ++++++.+......+.... .+...-+++ T Consensus 264 ~-~~~~~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~l~~l------kd~~G~~l~~~~~~~~~~~--~~~~~~~~l 334 (413) T protein:vir:81 264 A-VSNKDELADSIYKAMTNISLATPFQADALVINPLDYQELRLA------KDANGQYYGGGVFQGQYGS--GGIMLDPAP 334 (413) T ss_pred c-ccccchhHHHHHHHHHHhhhhccCCCcEEEEcHHHHHHHHHh------hccCCceeccccccccccc--cccccCcee Confidence 1 111122355566666544322 22334699999999998865 3455444332111111110 111234688 Q ss_pred ccceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCC----CcceeEEEEeeEEEeeecc---eee- Q lcl|NC_020854. 234 MGLRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDI----LAKSDAMSIDLHYVYHPVG---AKW- 303 (342) Q Consensus 234 ~G~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~----~~g~~~l~~r~~y~~~~~G---~s~- 303 (342) +|++|+++|.+|-. .++|+. .++.+.....+.++.++.. ......+....+|.+.|.- +.. T Consensus 335 ~G~pv~~s~~~~~~---------~~~~gd~~~~~~~~~~~~~~v~~~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~l 405 (413) T protein:vir:81 335 WGLRTVQSQVVPVG---------KPVVGAFRSAASVLRKGGVRIDSTNTNVDDFENNLITVRAEERVGLMVTFPEAIVQL 405 (413) T ss_pred cceeeEEcCCCCcc---------cEEEEecccEEEEEEecceEEEEeccccchhhcCcEEEEEEEeeccEEecccceEEE Confidence 99999999998732 133332 2344444455667766654 3455677777777666533 322 Q ss_pred e-cCcCCc Q lcl|NC_020854. 
304 A-VTTTNP 310 (342) Q Consensus 304 ~-~~~~sP 310 (342) + .+..+| T Consensus 406 ~~~~~~~p 413 (413) T protein:vir:81 406 DVAEVVTP 413 (413) T ss_pred EecCCCCC Confidence 1 233467 No 120 >protein:vir:4511 Length: 409 # NCBI annotation: capsid # Family: family:all:21 # MgeID: mge:97 # MgeName: V # Cross-refs: genbank:acc:NP_599037;genbank:gi:19548995;genbank:GeneID:935211 Probab=98.89 E-value=6.7e-10 Score=70.88 Aligned_cols=281 Identities=10% Similarity=0.066 Sum_probs=153.4 Q ss_pred Cc-ceec--cccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA-TLRS--DIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma-T~~~--d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |. +... -.++|+.+..-+.+.+.+.+.+.+ ++..-+ ...+..+.+|.-.. ....+..+.|++.++...++ T Consensus 117 ~~~~~~~~gg~liP~~~~~~ii~~~~~~~~l~~--~~~~~~----~~~~~~~~~~~~~~-~~~~~~~v~E~~~~~~~~~~ 189 (409) T protein:vir:45 117 QGVAQDEKGGYTVPETFLAKVVEKMKSYGGIAS--VAQILT----TSDGRTMEWATADG-TSEVGVLLGENEEAGEEDTD 189 (409) T ss_pred ccCccCcCCceeccHhHHHHHHHHHHhhhhhhh--hceeee----cCCCceEEEEeecc-Cccccccccccccccccccc Confidence 43 2223 256787776655555555554533 221111 12355566665442 12345577888888877776 Q ss_pred cceeeeEeeeec-cceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc--cccchhheeeecccc Q lcl|NC_020854. 78 ADKQVAAILHRG-RAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNA--NTSSSAFFDLCIDSE 154 (342) Q Consensus 78 ~~~~~a~i~~~~-k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a--~~~~~~~~~~~~~~~ 154 (342) -....-.-++.. +-+.++++...-+.-|..+.+.+++++.+.++.+..+|. | .... ..-.+.+... .... 
T Consensus 190 f~~~~l~~~k~~~~~i~is~ell~ds~~~l~~~i~~~la~a~~~~~~~a~l~---G---~G~~~~~~p~Gil~~~-~~~~ 262 (409) T protein:vir:45 190 FGMGSLGALKMTSKIIRVSNELLQDSAIDMEAYLARRIAERIGRGEARYLIQ---G---TGAGTPKQPKGLAASV-TGTT 262 (409) T ss_pred cceeeeeeeeeeeeehhhhHHHHhccHHHHHHHHHHHHHHHHHHHHHHHhhc---c---CCCCCccccceeeecc-cccc Confidence 555444444443 334566665444445778889999999999998888764 1 1100 0000111111 1111 Q ss_pred cccccccccHHHHHHHHHHhCccccCeE--EEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceee Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQGDKLT--AVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPT 232 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~~~~~~--~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~ 232 (342) .......++++.|.++...+........ +|+||+..+..|++. ++++++.+.. .. .....-.+ T Consensus 263 ~~~~~~~~~~d~i~~l~~~l~~~~~~~a~~~~~~n~~~~~~l~~l------kd~~G~~i~~--~~-------~~~~~~~~ 327 (409) T protein:vir:45 263 QTAAANAVKWQEILALKHSIDPAYRRGPKFRLAFNDNTLKLISEM------EDGQGRPLWL--PD-------IVGVAPAS 327 (409) T ss_pred ccccccccchHHHHHHHHhhhhhhccCCeEEEEECHHHHHHHHHh------hcCCCceeec--cC-------cCCCCCce Confidence 2233456788999999888776554444 568899999998864 3455443321 10 11233468 Q ss_pred eccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEe--ccCCCcceeEEEEeeEEEeeecce-eeecCcC Q lcl|NC_020854. 233 YMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTET--DRDILAKSDAMSIDLHYVYHPVGA-KWAVTTT 308 (342) Q Consensus 233 ~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~--dr~~~~g~~~l~~r~~y~~~~~G~-s~~~~~~ 308 (342) ++|+||+++|.+|..+. +++ +++|+. .-+.+....+..++. |+....+...+....+|...|.-- +|..-.. T Consensus 328 l~G~PV~~~~~~p~~~~---~~~-~i~~Gd~~~~~i~~~~~~~~~~~~d~~~~~~~~~~~~~~r~d~~~~~~~A~~~l~~ 403 (409) T protein:vir:45 328 VLNVPYVIDQEIDDIGA---GKK-FMFCGDFDRFIIRRVRYMILKRLVERYAEYDQTGFLAFHRFDCILEDTSAIKALVG 403 (409) T ss_pred ecceeeEEecCcCCccC---Ccc-EEEEeehhhhheeeccceEEEEeecccccCCcEEEEEEEEeccEeechhheEEEEe Confidence 99999999999986543 222 344433 112244455555554 445556777888888888776532 2311100 Q ss_pred CcChHH Q lcl|NC_020854. 
309 NPTRAQ 314 (342) Q Consensus 309 sPt~~~ 314 (342) .|+-+- T Consensus 404 k~s~~~ 409 (409) T protein:vir:45 404 KGSVGG 409 (409) T ss_pred ccCCCC Confidence 000000 No 121 >protein:vir:8420 Length: 477 # NCBI annotation: gp15 # Family: family:all:21 # MgeID: mge:155 # MgeName: Omega # Cross-refs: genbank:acc:NP_818316;genbank:gi:29566752;genbank:GeneID:1260033 Probab=98.86 E-value=4.7e-10 Score=71.72 Aligned_cols=293 Identities=15% Similarity=0.095 Sum_probs=143.9 Q ss_pred Cccee---ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceec----- Q lcl|NC_020854. 1 MATLR---SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLT----- 72 (342) Q Consensus 1 MaT~~---~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~----- 72 (342) +.+.. ..++.||.+..-+.+.+.+.+.+.+ ++..- .+. .++..+.+|..... ...+.-+.|++.++ T Consensus 157 ~~~~~~~gg~lv~~~~~~~~ii~~l~~~~~i~~--~~~~~-~~~--~~~~~~~ip~~~~~-~~~a~~~~Eg~~~~~~~~~ 230 (477) T protein:vir:84 157 LDRNGGTGGYAVPPLWMMNRFIELARAGRTYAN--LCPTE-PLP--GGTSSINIPKILTG-TSTAIQAADNAALTAPSAH 230 (477) T ss_pred ccccCCCcceeeccchhHHHHHHHhhhcchHHH--hhcee-eec--CCcceeEEEEEecC-cceeeeeccCccccccccc Confidence 22222 3467888776655555555444433 11111 111 12446788875321 11223456665543 Q ss_pred hhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH------HHHHHHhhhcccccchhh Q lcl|NC_020854. 73 PGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLS------CLQGVFGSLNANTSSSAF 146 (342) Q Consensus 73 ~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla------~L~g~~~~~~a~~~~~~~ 146 (342) ..+++-+.-....++.+.-+.+++..-.-+.-+..+.+.++++..++++.+..+|. ..+|++....... 
T Consensus 231 ~s~~~f~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~~~~~~d~~~l~G~Gt~~~p~Gi~~~~~~~~----- 305 (477) T protein:vir:84 231 EVDLTDGFVQANVKTIAGQQGIAIQLLDQAAVSVDEFVFRDLAADYANKLNVQVISGTGSNNQVVGVRATAGITQ----- 305 (477) T ss_pred ccccceeeEEEeeeeEEeeeHHHHHHHhccchhHHHHHHHHHHHHHHHHHHHHHhccCCCCCccceeeecccccc----- Confidence 34455555555666666666666665444455777889999999999999987763 1222221100000 Q ss_pred eeeecccccccccccccHHHHHHHHHHhCcccc-CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccc----eee Q lcl|NC_020854. 147 FDLCIDSESGDTPTALSPRHVAEARAILGDQGD-KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGG----SMA 221 (342) Q Consensus 147 ~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~-~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~----~~~ 221 (342) ... ..............+.+.++......... .-.+|+|||..+..|++. ++++.+++....... ... T Consensus 306 ~~~-~~~~~t~~~~~~~~~~i~~~~~~~~~~~~~~~~~~v~~~~~~~~l~~l------kd~~G~~l~~~~~~~~~~~~~~ 378 (477) T protein:vir:84 306 VTA-TSAGSALEKHQIIYQKIADAIQRVHTSRFLEPEVIVMHPRRWASFHAI------FAGDDRPLIVPSGPGFNNLGVL 378 (477) T ss_pred ccc-cccccchhhHHHHHHHHHHHHhhccccccCCccEEEEcHHHHHHHHHh------hccCCCeeeecCcccccccccc Confidence 000 00000000011223445666554444333 345899999999988875 345444333211110 011 Q ss_pred cccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCCCcc--eeEEEEeeEEEe-- Q lcl|NC_020854. 222 AAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDILAK--SDAMSIDLHYVY-- 296 (342) Q Consensus 222 ~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~~~g--~~~l~~r~~y~~-- 296 (342) ....++...++++|++|++++.+|... +..+....++|+.-+ +.++. 
....++.++....+ +..+.....+.+ T Consensus 379 ~~~~~~~~~~~l~G~pVv~s~~~p~~~-~~~~d~~~i~~gd~~~~~i~~-~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~ 456 (477) T protein:vir:84 379 TEVASQRVVGQMHGLPVVTDPTLPTTL-GTGTDQDVIHVLRASDLALFE-SSVRMRALQETRAENLSVLLQVYGYLAFTA 456 (477) T ss_pred cccccccccchhcccceEecCcccccc-cccCCcceEEEEEeceEEEEe-eceeEEeccccccccceeeeeehhhhhhhh Confidence 112234566799999999999998642 222222344444322 22233 23445555544333 333322221211 Q ss_pred --eecceee-ecC-cCCcChH Q lcl|NC_020854. 297 --HPVGAKW-AVT-TTNPTRA 313 (342) Q Consensus 297 --~~~G~s~-~~~-~~sPt~~ 313 (342) ||.-|.= |.. ...||.+ T Consensus 457 ~r~~~afv~~t~~~~~~~~~~ 477 (477) T protein:vir:84 457 ARFPQSVVEIGGTALTAPTFA 477 (477) T ss_pred hccccceEEeecccccccccC Confidence 5555432 332 3468887 No 122 >protein:vir:9704 Length: 394 # NCBI annotation: hypothetical protein # Family: family:all:21 # MgeID: mge:174 # MgeName: 315.2 # Cross-refs: genbank:acc:NP_795466;genbank:gi:28876225;genbank:GeneID:1257769 Probab=98.86 E-value=1.3e-09 Score=69.26 Aligned_cols=258 Identities=10% Similarity=0.016 Sum_probs=142.3 Q ss_pred Cc--ce--eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hh Q lcl|NC_020854. 1 MA--TL--RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GK 75 (342) Q Consensus 1 Ma--T~--~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~ 75 (342) +. .+ -....+|+.+..-+.+...+...+.+ ++..-+ .++...++|.+.. .++.+.-+.|+..++. +. T Consensus 127 ~~~~~t~~~gg~liP~~~~~~ii~~~~~~~~l~~--~~~~~~-----~~~~~~~~~~~~~-~~~~~~~v~E~~~~~~~~~ 198 (394) T protein:vir:97 127 QKDGIKKENAKPVSSEEILYTPAREVKTVVDLKP--FTTVYQ-----AKKASGKYPVLQR-ATTKMVTVAELEKNPALAK 198 (394) T ss_pred hccccccccccccChHHHHHHHHHHhhhhhhhhh--hceeee-----ccCcceEEEEEec-CCCccceeccccccccccc Confidence 11 22 23467888887666666655555533 221111 2344577888763 3455667888888864 44 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 
76 ITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) .+-..-....++.+.-+.++++.-.-+.-|....+.+++++...+..+..++..+.+ T Consensus 199 ~~~~~v~l~~~k~~~~i~is~ell~ds~~~~~~~i~~~la~~~~~~~~~~i~~g~~~----------------------- 255 (394) T protein:vir:97 199 PDFKDVAWNIDTYRGAIPLSQESIDDADVDLVGIVSESISQIKVNTTNDAIAKVLKS----------------------- 255 (394) T ss_pred ccceeEEeehhheeeehhhHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHHhhcccc----------------------- Confidence 555555556666776666766554434456777799999999888877766543211 Q ss_pred ccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) .......+++.+.++....-+... -+.|+||+.++..|++. ++++++++.... ..+..-++++| T Consensus 256 ~~~~~~~~~~~~~~~~~~~~~~~~-~a~~v~n~~~~~~l~~l------kd~~G~~i~~~~---------~~~~~~~~l~G 319 (394) T protein:vir:97 256 FTTKTVKNLDEIKALLNGGFDPAY-NVSLIVSQSFYQTLDTL------KDGNGRYLLQDD---------ITAVSGKVLLG 319 (394) T ss_pred ccccccccHHHHHHHHHhhhhhhh-CCEEEEcHHHHHHHHHh------hccCCCeeeecC---------cCCCCCceecc Confidence 011223467788887765444322 36799999999998764 455554443211 11233468999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCCcceeEEEEeeEEEeee---cceeeecCcCCc Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHP---VGAKWAVTTTNP 310 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~---~G~s~~~~~~sP 310 (342) ++|+++++.++.. + +++|+. -++.+.....+.++..+.....+ .+....++...| ..+.. 
-..+| T Consensus 320 ~pv~~~~~~~~~~----~---~~~~gd~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~r~d~~v~~~~a~~~--~~~~~ 389 (394) T protein:vir:97 320 KPVFVLSDEVLGA----N---KAFIGDFKRGVLFADRKDLGLRWADNEIYGQ-YLQAVLRFGVSKVDDKAGYY--VTFTP 389 (394) T ss_pred ceeEEecccccCC----c---cEEEeeccccEEEEEecceEEEEecccccce-eEEEEEEEccEEecccceEE--EEecc Confidence 9999987654422 1 234442 23445555556666554433222 333333333222 22222 12233 Q ss_pred ChHHh Q lcl|NC_020854. 311 TRAQL 315 (342) Q Consensus 311 t~~~L 315 (342) +-+-| T Consensus 390 ~~~p~ 394 (394) T protein:vir:97 390 EPLPL 394 (394) T ss_pred cccCC Confidence 33333 No 123 >protein:vir:103323 Length: 364 # NCBI annotation: major capsid-like protein # Family: family:all:2806 # MgeID: mge:1609 # MgeName: Era103 # Cross-refs: genbank:acc:YP_001039668;genbank:gi:125999997;genbank:GeneID:4818399 Probab=98.86 E-value=1.1e-09 Score=69.66 Aligned_cols=303 Identities=16% Similarity=0.086 Sum_probs=177.1 Q ss_pred Cc-----ce------eccc-cchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCC Q lcl|NC_020854. 1 MA-----TL------RSDI-IIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDS 68 (342) Q Consensus 1 Ma-----T~------~~d~-i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~ 68 (342) |+ |+ -++. +-=|+|..-|...+.+.+.|.. .... -++ .+|+++.+|+-.. ..+.....| T Consensus 1 ms~~n~~t~~~~~~~~~~~al~le~f~geV~taf~~~s~~~~--~~~~-rti---~~gkS~q~~~iG~---~~~~~~~~G 71 (364) T protein:vir:10 1 MSNPNVLTQPAVSASGEVDSLLIEKFNNRVHEQYLKGENLLQ--WFDV-QEV---VGTNSVSNKYIGE---TELQVLSPG 71 (364) T ss_pred CCCcccccccccccccchhhhhhhhhhhhHHHHHHHHHhhcC--ccee-eee---cccceEEeeeeee---eEEeeeccC Confidence 66 21 1222 2227777777777777777743 2222 123 4699999998532 456677888 Q ss_pred ceechhhcccceeeeEeee-eccceeechHHHhhhcch-HHHHHHHHHHHHHHHHHHHHHHHHHHHHH-hhhcccc---- Q lcl|NC_020854. 
69 SSLTPGKITADKQVAAILH-RGRAFEARDLAALAAGSD-PMAAIGAKVADYVANQRQKDLLSCLQGVF-GSLNANT---- 141 (342) Q Consensus 69 ~~i~~~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~d-p~~~i~~qia~~~~~~~~~~lla~L~g~~-~~~~a~~---- 141 (342) +.+.++.+...+..-+|=. .--...+.|+-.....-| +-.+++++++.+.++.+|+.++..++... +...... T Consensus 72 ~~ld~~~~~~~k~~itID~ll~a~~~V~diDe~q~~~D~vR~e~s~e~G~ALA~~~Dq~i~~~v~~aa~a~~~~~~~~~~ 151 (364) T protein:vir:10 72 KSPDASPTEFDKNRLVVDTTVIARNTVAHFHDVQNDIDGLKSKLSVNQAKKLKKMEDSMVIQQLVLGGISNTEAIRKNPR 151 (364) T ss_pred cccCCCCcccCcEEEEecceeeechhhhhHHHHhcCccchhHHHHHHHHHHHHHHHHHHHHHHHHhhhhhcccccccCCc Confidence 8898888888776655543 222355788887777667 67799999999999999998876554321 1111110 Q ss_pred cchhheeeecccccccccccccH----HHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhh-hhh--hhhhhhcccce Q lcl|NC_020854. 142 SSSAFFDLCIDSESGDTPTALSP----RHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERR-AID--YVSTADARGTS 212 (342) Q Consensus 142 ~~~~~~~~~~~~~~~~~~~~~~~----~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~-li~--~~~~s~~~~~~ 212 (342) ....-..+.... .......++ ++|.+|.+.|.++.= .-.+++|.|+.|..|.+.. |+. |.. T Consensus 152 ~~~~g~~i~~~~--~a~~~~~~~~~l~~ai~~a~~~LdEkdVP~~~R~~vv~P~~y~~Ll~~~~lvn~d~~~-------- 221 (364) T protein:vir:10 152 VAGHGFSIHIVG--LASSFLTSPQYMMAAIEMAMEQQTEQEVDTSELCGLMPWTAFNCLRDADRIVDKSYTI-------- 221 (364) T ss_pred ccCCcceeeecc--cCcchhhhHHHHHHHHHHHHHHHhhcCCCccccEEEeChHHHHHHhcCCccccccccc-------- Confidence 001110111111 111222333 344566677766422 3488999999999998853 221 111 Q ss_pred eeeccceeecccccccceeeeccceEEEeCCcceecc-------------------------CCCcceEEEEEecceeEe Q lcl|NC_020854. 213 TTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGS-------------------------GGSTEYATYFFTQGAVAS 267 (342) Q Consensus 213 ~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~-------------------------~~~~~y~t~l~~~GAi~~ 267 (342) ..++. ...+.+....|.+|+.+..+|.... +...+....+|-+-|++. 
T Consensus 222 --~~~~~-----~~~G~v~~v~Gv~Vv~Sn~lP~~~~~~~~t~~~t~h~ls~~~~g~~y~v~~d~~~~~~~~f~~~Al~t 294 (364) T protein:vir:10 222 --AASDN-----TVDGFVLKSWNTPIVPSNRFPKLSDNTEGTGNTKHHKLSNAGNGNRYDVTAGQTSAQAVLFTQDALLV 294 (364) T ss_pred --cCCCc-----cccceeEEEeceEEEeccccccccccccccccccccccccccCCcccccccccceeEEEEEecceEEE Confidence 01111 1246788899999999999985311 111245577888899999 Q ss_pred ecCCcceeEeccCCCcceeEEEEeeEEEeeeccee----ee-cCcCCcC---hHHhcCCcCceeecCccccc Q lcl|NC_020854. 268 GEQMAMQTETDRDILAKSDAMSIDLHYVYHPVGAK----WA-VTTTNPT---RAQLETVANWSKVYELKNIG 331 (342) Q Consensus 268 ~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~s----~~-~~~~sPt---~~~L~~~~NW~~v~d~k~i~ 331 (342) ....++.+|..|+.....+.+.+.+.|++.++==. .+ .+...|. ++-|+-+ |-+ +.-.|.+. T Consensus 295 v~~~~~t~e~~~~~~~~~~~ida~~a~G~g~lRPeaa~~i~~~~~~~~~~~~~~~~~~~-~~~-~~~~~~~~ 364 (364) T protein:vir:10 295 GRTISITGDIFYEKKEKTWYIDTFLAEGAIPDRWEAVAVVTAADTAELATDHNAILARA-NRK-VTLTKSVN 364 (364) T ss_pred EEEecceeeeeeccceeeeeeeeehcccCcccCccceEEEEecCCCCCccchhhhhhhc-ccc-EEEEEecC Confidence 99999999999999998888888777776653311 11 1111122 1223222 211 11111111 No 124 >protein:vir:100057 Length: 375 # NCBI annotation: T7-like capsid protein # Family: family:all:975 # MgeID: mge:1604 # MgeName: P-SSP7 # Cross-refs: genbank:acc:YP_214206;genbank:gi:61806429;genbank:GeneID:3294737 Probab=98.82 E-value=1.9e-09 Score=68.44 Aligned_cols=290 Identities=12% Similarity=0.159 Sum_probs=169.8 Q ss_pred Cc-------------c--------eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCC Q lcl|NC_020854. 1 MA-------------T--------LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLS 59 (342) Q Consensus 1 Ma-------------T--------~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~ 59 (342) |+ | -.-+|+. |+|..-|...+.+.+.|.. .+. ..++ .+|+++.||+-. . 
T Consensus 1 ~~~~~~~~~~~~n~~t~~~~~~~~~~~al~l-e~f~geV~~~f~~~si~~~--~~~-~rti---~~Gksv~f~~iG---~ 70 (375) T protein:vir:10 1 MANANQVALGRSNLSTGTGYGGATDKYALYL-KLFSGEMFKGFQHETIARD--LVT-KRTL---KNGKSLQFIYTG---R 70 (375) T ss_pred CccccccccCccccCCccccccccchHHHHH-HHHhHHHHHHHHHHHhhhc--ccc-cccc---ccCceEEEEeee---e Confidence 33 1 1114555 7887777777777777743 332 2233 359999999753 3 Q ss_pred CcccccCCCceechh---hcccceeeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHH-HHH Q lcl|NC_020854. 60 GDFEVLSDSSSLTPG---KITADKQVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQ-GVF 134 (342) Q Consensus 60 gda~~~~~~~~i~~~---~lt~~~~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~-g~~ 134 (342) .....+..|+++... .+.+.+.+-+|= ..-..+.+.|+-...+..|.+.++.++.+...++++|+.++..|. +.- T Consensus 71 ~t~~~~t~G~~i~~~~~~d~~~te~~l~ID~~~y~~~~VdDiD~aqa~~Dlr~e~s~~~G~aLA~~~D~~i~~~l~kaa~ 150 (375) T protein:vir:10 71 MTSSFHTPGTPILGNADKAPPVAEKTIVMDDLLISSAFVYDLDETLAHYELRGEISKKIGYALAEKYDRLIFRSITRGAR 150 (375) T ss_pred eEEeeecCCcCcCCccccCCCCCceEEEecchhhhhhhHhhHHHHhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHHhhh Confidence 566778888887533 444555554443 334568889999988999999999999999999999999997664 322 Q ss_pred hhhcccccch----hheeeeccccccccccccc----HHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhh Q lcl|NC_020854. 135 GSLNANTSSS----AFFDLCIDSESGDTPTALS----PRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVS 204 (342) Q Consensus 135 ~~~~a~~~~~----~~~~~~~~~~~~~~~~~~~----~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~ 204 (342) ... ...... ....+...+.+ .+...++ ++.|.+|..+|.++. +.-..++|.|.+|..|.+.....++. 
T Consensus 151 ~~~-p~~~~~~~~~Gg~~i~~~sg~-~~~~~~ta~~~~~ai~~a~~~Lde~~VP~~~R~~vv~P~~y~~Ll~~~d~~~~~ 228 (375) T protein:vir:10 151 SAS-PVSATNFVEPGGTQIRVGSGT-NESDAFTASALVNAFYDAAAAMDEKGVSSQGRCAVLNPRQYYALIQDIGSNGLV 228 (375) T ss_pred hcc-ccccccccccCcceeeecccc-ccccccCHHHHHHHHHHHHHHHhhcCCCCCCCEEEeChHHHHHHHhcCCcccee Confidence 110 000000 00011111111 1112233 455666666666532 23467899999999998763222222 Q ss_pred hhhcccceeeeccceeecccccccceeeeccceEEEeCCcceecc--------------------------------CCC Q lcl|NC_020854. 205 TADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGS--------------------------------GGS 252 (342) Q Consensus 205 ~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~--------------------------------~~~ 252 (342) ..+. +..+. ...+.+..+.|.+|+.+..+|.... +.. T Consensus 229 n~d~------~~~~~-----~~~g~v~~i~Gv~V~~Sn~lP~~~~~~~~~g~~~~~~a~~~~~~~~~~~~~~~~~~~g~~ 297 (375) T protein:vir:10 229 NRDV------QGSAL-----QSGNGVIEIAGIHIYKSMNIPFLGKYGVKYGGTTGETSPGNLGSHIGPTPENANATGGVN 297 (375) T ss_pred eecc------cccce-----eccceEEEEeceEEEEeccccccccccccccccccccchhhhhccccccCCcceeecccc Confidence 2111 11111 1134577889999999999985421 011 Q ss_pred cce----------EEEEEecceeEeecCCcceeEe---ccCCCcceeEEEEeeEEEeeecceee----ecCcCCcChHHh Q lcl|NC_020854. 
253 TEY----------ATYFFTQGAVASGEQMAMQTET---DRDILAKSDAMSIDLHYVYHPVGAKW----AVTTTNPTRAQL 315 (342) Q Consensus 253 ~~y----------~t~l~~~GAi~~~~k~~~~ve~---dr~~~~g~~~l~~r~~y~~~~~G~s~----~~~~~sPt~~~L 315 (342) ++| ...+|-+-|++..+-.++.+|+ ++++....+.+.+++.|+..++==.- ..++ |.++.+ T Consensus 298 ~~y~~d~~~~~~~~~~~~~~~A~g~v~~~~~~~~~~~~~~~~~~q~~~i~~~~a~G~~~lrp~~av~l~~~~--~~~~~~ 375 (375) T protein:vir:10 298 NDYGTNAELGAKSCGLIFQKEAAGVVEAIGPQVQVTNGDVSVIYQGDVILGRMAMGADYLNPAAAVELYIGA--TAPSAF 375 (375) T ss_pred ccccccccccCceEEEEEchhheeeeeeeccccccccchhhheeeeeeeeeeeeeccCccCceeEEEEecCc--CccccC Confidence 111 2356667777777666666664 46888888999888888776543221 1111 222222 No 125 >protein:vir:78935 Length: 335 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1860 # MgeName: LKD16 # Cross-refs: genbank:acc:YP_001522824;genbank:gi:158345059;genbank:GeneID:5687425 Probab=98.81 E-value=3.4e-10 Score=72.47 Aligned_cols=286 Identities=11% Similarity=0.058 Sum_probs=170.8 Q ss_pred Ccc-------------eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCC Q lcl|NC_020854. 1 MAT-------------LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSD 67 (342) Q Consensus 1 MaT-------------~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~ 67 (342) |++ -..+++. |+|...|...+.+++.|.. ....- ++ .+|+++.+|+-. ...++.... T Consensus 1 ms~~~~~t~~~~~~s~~d~al~l-e~f~geV~~af~~~s~~~~--~~~~r-ti---~~g~s~~~~~iG---~~~~~~~~p 70 (335) T protein:vir:78 1 MSFLNDLTRPNYAGKNADVDIHL-EEHLGIVDKHFAYTSKFAP--LMNIR-DL---RGSNVVRLDRLG---NVEAKGRRA 70 (335) T ss_pred CCccccccccccccccchhhhhh-hhhhhHHHHHHHHhhhhcc--cccee-ee---ccceeEEEeeee---eeeeccccc Confidence 551 1234555 8888888888888888854 32222 23 469999999753 245666788 Q ss_pred CceechhhcccceeeeEeee-eccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHH-HHHHhhhccc---c- Q lcl|NC_020854. 
68 SSSLTPGKITADKQVAAILH-RGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCL-QGVFGSLNAN---T- 141 (342) Q Consensus 68 ~~~i~~~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L-~g~~~~~~a~---~- 141 (342) |+.+..+.+...+..-+|=. .--..-+.|+-...+.-|-..++++|++...++.+|+.++..| ++.-...... . T Consensus 71 G~~l~~~~~~~~k~~itID~ll~a~~~VddlDe~~~~yDvR~e~s~~~G~aLA~~~Dq~~~~~l~~aa~~~a~~~~~~~~ 150 (335) T protein:vir:78 71 GEELERSRVVNDKWNLTVDTLLYLRHQFDHQDEWTQSFDMRKEVAELDGQELARKFDQACLIQVIKAAAMDAPVDLEDAF 150 (335) T ss_pred CcccCCCCcccCCeEEEecceeechhhHhhHHHhhcCchhHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccccccCCCc Confidence 88888888877665555532 2334558888888888899999999999999999999887544 4321111000 0 Q ss_pred cchhheeeecccccccccccccHHHHHHHH----HHhCccc--c---CeEEEEEchHHHHHHHhhh-hhhhhhhhhcccc Q lcl|NC_020854. 142 SSSAFFDLCIDSESGDTPTALSPRHVAEAR----AILGDQG--D---KLTAVAMHSKVYYDLVERR-AIDYVSTADARGT 211 (342) Q Consensus 142 ~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~----~~~GD~~--~---~~~~ivmhS~v~~~L~~~~-li~~~~~s~~~~~ 211 (342) ..+.. .....+ + .++.-++..+.+|. +.|-++. + .-.+++|.|++|..|++.. +++ ......+ T Consensus 151 ~~G~~-~~~~~t--g-~~~~~~~~~l~~a~~~a~~~l~ekdvP~~~~~~rv~vv~P~~y~~Ll~~~~l~n-~~~~~s~-- 223 (335) T protein:vir:78 151 SPGVL-EKLDLT--G-LTAKEAAEKIVRMHRRVVETFIERDLGDAVYSEGLTPMSPRVFSLLLEHDKLMS-VEYQATG-- 223 (335) T ss_pred CCCcc-eeeeec--c-ccccccHHHHHHHHHHHHHHHHhccCCCCCCCccEEEeChHHHHHHhccccccc-ccccccc-- Confidence 00111 111111 1 12222455555554 3454221 1 1378999999999999863 221 0001000 Q ss_pred eeeeccceeecccccccceeeeccceEEEeCCcceeccCC------Cc-------ceEEEEEecceeEeecCCcceeEec Q lcl|NC_020854. 212 STTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGG------ST-------EYATYFFTQGAVASGEQMAMQTETD 278 (342) Q Consensus 212 ~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~------~~-------~y~t~l~~~GAi~~~~k~~~~ve~d 278 (342) .++ ....+.+...+|.+|+.+..+|...... .+ .-..+++-+.|++...-.++..|.. 
T Consensus 224 ----~~~-----~~~~g~v~~v~Gv~V~~Sn~lP~~~~t~~~lg~a~n~~~~d~~~~~~~~~~~~Al~t~~~~~~~~e~~ 294 (335) T protein:vir:78 224 ----ATN-----DYVKSRVAILNGVKVLETPRFATKAISAHPLGRHFNVSAEEAERQIALFLPSKTLITAQVAPVQAKLW 294 (335) T ss_pred ----ccc-----ccccceeEEeeceEEEeeccCCCCCCccccccccCCcccccccceEEEEEecceEEEEEEEeccccee Confidence 000 1124578999999999999999642110 01 1145677888888888888888888 Q ss_pred cCCCcceeEEEEeeEEEeeecc------eeeecCc-CCcCh Q lcl|NC_020854. 279 RDILAKSDAMSIDLHYVYHPVG------AKWAVTT-TNPTR 312 (342) Q Consensus 279 r~~~~g~~~l~~r~~y~~~~~G------~s~~~~~-~sPt~ 312 (342) |+.....+.+.+.+.|++.++= +..+..+ .+-|. T Consensus 295 ~~~~~~~~~i~~~~a~G~g~lRPe~a~~i~~tg~~~~~~~~ 335 (335) T protein:vir:78 295 EDHDQFSWVLDTFQMYNIGARRPDTAGAIELKGIEAFDITA 335 (335) T ss_pred eccchhhHhhhHHHHcCCcccCcceEEEEEecCCCcccccC Confidence 8887777777777776665432 1111111 11111 No 126 >protein:vir:96762 Length: 632 # NCBI annotation: putative phage-related protein # Family: family:all:21 # MgeID: mge:1628 # MgeName: VP882 # Cross-refs: genbank:acc:YP_001039818;genbank:gi:126010917;genbank:GeneID:5076272 Probab=98.80 E-value=2.2e-09 Score=68.09 Aligned_cols=264 Identities=8% Similarity=0.017 Sum_probs=152.1 Q ss_pred Ccc-ee---ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhc Q lcl|NC_020854. 1 MAT-LR---SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKI 76 (342) Q Consensus 1 MaT-~~---~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~l 76 (342) |.+ +. ..+|.|+++.+-+.+.+...+.+.+.|+ . .+... .| .+++|.... 
.+.+.-+.|++.++..++ T Consensus 357 ~~~~t~~~gg~lvp~~~~~~~iie~lr~~s~i~~l~~-~---~~~~~-~g-~~~ip~~~~--~~~a~wv~E~~~~~~s~~ 428 (632) T protein:vir:96 357 LEKKTAGKGGELVATELLSEEFIDILRNKAIIGQMGA-R---MLPGL-VG-DVDIPKKTS--GANFYWIGEDEDVQDSDF 428 (632) T ss_pred hhcccccccccccccccchHHHHHHHhhcchhhhhcc-e---EeecC-Cc-ceEEEEEeC--CceeEeecCCcccccccc Confidence 232 11 3467777776655555555554433221 1 11111 23 578898763 356666899999998888 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +-++-....++.+.-+.++.....-+.-+..+.+.++++..++++.++.+|. | ...++.-.+.+....+.. .. T Consensus 429 ~f~~i~l~~~k~~~~v~iS~ell~ds~~~~~~~i~~~l~~a~~~~~d~a~l~---G---~G~~~~p~Gi~~~~~~~~-~~ 501 (632) T protein:vir:96 429 DFTTLSFSPKTIAGAVPVTRKLRKQSSIHVENLIREDLIEGIGVALDLAMLT---G---TGLANDPVGLLNMTGVPA-LT 501 (632) T ss_pred ceeeEEeeeeEEEEehhhHHHHHhccchHHHHHHHHHHHHHHHHHHHHHhhc---c---cCCCCccceeeecccccc-ee Confidence 7777777777777666776665444555667778999999999999987763 2 111111111111111111 11 Q ss_pred cccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) .+.+.+++..+.++..++..... .-.+|+|||.....|++..+ +++++..+.. 
-++++ T Consensus 502 ~~~~~~~~~~i~~~~~~i~~~~~~~~~~~~~~~~~~~~~l~~~~l----~d~~G~~i~~----------------~~~l~ 561 (632) T protein:vir:96 502 YPAGGVDWASVVDMETKISTFNADAGRLAYLTSVTQRGAAKKAQV----FDNTGERIWQ----------------NNEVN 561 (632) T ss_pred cccccCCHHHHHHHHHHHhhcccccCccEEEEchhHHHHHHHHhc----cCCCCceeec----------------CCeec Confidence 23345678888888776654332 34589999999999987654 2333322211 13678 Q ss_pred cceEEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCC--CcceeEEEEeeEEEe---eecceeeecCcC Q lcl|NC_020854. 235 GLRVIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDI--LAKSDAMSIDLHYVY---HPVGAKWAVTTT 308 (342) Q Consensus 235 G~~VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~--~~g~~~l~~r~~y~~---~~~G~s~~~~~~ 308 (342) |++|++++.+|-. +.+|+.-+ +.++.-..+.++.++.. ..+...+....++-+ ||..|.|..... T Consensus 562 G~pv~~s~~ip~~---------~~~~gd~s~~~i~~~~~~~i~~~~~~~~~~~~v~~~~~~~~d~~v~~~~af~~~k~~A 632 (632) T protein:vir:96 562 GYRAEASNQIPAD---------TWIFGDWSQIVIAMWGVLDLKVDPYTKAASDGLVLRVFQDVDAGVRRKEAFCIAKKGA 632 (632) T ss_pred ccceEeccccccC---------cEEEeecceEEEEEecceEEEEccccccccCceEEEEEeecCceeechhhhhheeecC Confidence 9999999998742 13333322 22333344666666644 456666666666644 456666632221 No 127 >protein:vir:4197 Length: 314 # NCBI annotation: putative structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:88 # MgeName: psiM100 # Cross-refs: genbank:acc:NP_071822;genbank:gi:11863105;genbank:GeneID:1257607 Probab=98.77 E-value=8.8e-09 Score=64.74 Aligned_cols=275 Identities=11% Similarity=0.055 Sum_probs=156.9 Q ss_pred Cccee--ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcc---cc-cCCCceechh Q lcl|NC_020854. 1 MATLR--SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDF---EV-LSDSSSLTPG 74 (342) Q Consensus 1 MaT~~--~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda---~~-~~~~~~i~~~ 74 (342) |++.. .-..+||.+..++ +.+.+.+.|.+ .+....++ +.....+|.+... ... .+ ..+.+..+.. 
T Consensus 14 it~~d~~gG~L~P~~~~~~i-~~l~e~s~i~~--~a~vi~t~----~s~~~~i~~i~~g--~~~~~~~~~~~~~~~~~~~ 84 (314) T protein:vir:41 14 IDVPDLGKGILAVQRFGEFV-REVRENSAIIK--DARVLNAL----KSYEVDISRISLG--VELEPGRNTSGTKVAPTAD 84 (314) T ss_pred cccccCCCceeChHHHHHHH-HHHHhccchhh--heeeeccc----CccceeecccccC--cccccccccccCCccCCcc Confidence 54322 2358999997766 56767676654 34333332 2345677766421 111 11 1222334444 Q ss_pred hcccceeeeEeeeeccceeechHHHhhh--cchHHHHHHHHHHHHHHHHHHHHHHH-------------HHHHHHhhhcc Q lcl|NC_020854. 75 KITADKQVAAILHRGRAFEARDLAALAA--GSDPMAAIGAKVADYVANQRQKDLLS-------------CLQGVFGSLNA 139 (342) Q Consensus 75 ~lt~~~~~a~i~~~~k~~~~tD~a~~~~--~~dp~~~i~~qia~~~~~~~~~~lla-------------~L~g~~~~~~a 139 (342) +.+-++..-..++..--+.++++.-.-+ +.|.-+.+..++++.+.+..+..++. .-+|++... T Consensus 85 ~~tf~~~~l~~~kl~~~v~is~e~L~D~a~~~~le~~i~~~~Ae~~g~~~~~~~~nGdg~~~s~~~~~~~p~G~l~~a-- 162 (314) T protein:vir:41 85 EVTVSTNTLEMKELVTKVVLEDEALEDNIEQSAFEQTITSLLASGVTYDLECFFLHADSSLTTGRELYRINDGWMKLA-- 162 (314) T ss_pred cccccceeeeeEEEEEeecccHHHHHhhhchhhHHHHHHHHHHHHHHHHHHHHhhccccCCcCcccchhcchhhhhhc-- Confidence 4444444444455445577777664433 45888999999999999988877754 112222110 Q ss_pred cccchhheeeecccccccccccccHHHHHHHHHHhCcccc---CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeec Q lcl|NC_020854. 140 NTSSSAFFDLCIDSESGDTPTALSPRHVAEARAILGDQGD---KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQS 216 (342) Q Consensus 140 ~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~---~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~ 216 (342) ..++ .+ ....+..+..+.|.++...+-++.- .-.+|+||+.++..+++.-- ... 
T Consensus 163 ---~~~~-----~~-~~~~~~~~~~~~~~~l~~sl~~~yr~~~~~~~~~m~~~t~~~~r~~l~------~~~-------- 219 (314) T protein:vir:41 163 ---GNQY-----TD-AEPEDENWPLNLFDGMMDELDTRYLQLKPRMKFYVSNEIYNGYRKQLL------VRE-------- 219 (314) T ss_pred ---ccce-----ee-cCccccccHHHHHHHHHHhcCchhhcCCCceEEEecHHHHHHHHHHHh------ccC-------- Confidence 0000 01 1122334667888899998988643 35689999999998886421 110 Q ss_pred cceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCCCcceeEEEEeeEEE Q lcl|NC_020854. 217 GGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDILAKSDAMSIDLHYV 295 (342) Q Consensus 217 ~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~ 295 (342) ..+..+......-.+++|++|+....||....++ ..++|+.-. +.+...+.+.++++|+...++..++.+.++- T Consensus 220 -~~l~~~~~~~~~~~~l~G~PV~~~~~~~~~~~~~----~~i~fgd~~nlv~~~~~~ir~~~~~~a~~~~~~~~~~~r~d 294 (314) T protein:vir:41 220 -TGLGDSALIGATGLQYDGIPIQYVPALDALGDDK----ARALLTVPTNLVYGFWRNIRIEPKRDAAMRRTEYIASLRAD 294 (314) T ss_pred -CcccchhhhCCCCceecceeeEecccccccCCCC----ceEEEechhheEEEeeceeEEeecccCcCCeEEEEEEEEec Confidence 0111111223445568999999999887654322 245555443 3356677788999999998999888888887 Q ss_pred eeecceeeec-CcCCcChHH Q lcl|NC_020854. 296 YHPVGAKWAV-TTTNPTRAQ 314 (342) Q Consensus 296 ~~~~G~s~~~-~~~sPt~~~ 314 (342) ++..-..+.. +-...+++- T Consensus 295 ~~~~~~~aa~~~~~~~~~~~ 314 (314) T protein:vir:41 295 CNYEDENAAVAAVIDMSSGG 314 (314) T ss_pred eEEEEcCcEEEEEeeccCCC Confidence 7765443421 111111111 No 128 >protein:vir:95376 Length: 425 # NCBI annotation: phage major capsid protein # Family: family:all:635 # MgeID: mge:1567 # MgeName: GBSV1 # Cross-refs: genbank:acc:YP_764476;genbank:gi:115334630;genbank:GeneID:5179263 Probab=98.74 E-value=1.1e-08 Score=64.28 Aligned_cols=272 Identities=13% Similarity=0.163 Sum_probs=151.9 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) .+|.-...++|+.+.+.+.+.+.+.+.+.+ ++..-+ .+|+ ..+|.... .+.+.-+.|++.++.....+.. T Consensus 141 ~~~~~gg~~vP~~~~~~Ii~~l~~~~~i~~--~~~~~~-----~~g~-~~ip~~~~--~~~a~~v~E~~~~~~~~~~~f~ 210 (425) T protein:vir:95 141 RAVAGGELTIPEVVVNRIMDIMGDYTTLYP--LVDKIR-----VKGT-TRILVDTD--TSPATWIEQSGALPTGDVGTIA 210 (425) T ss_pred cccccCceeccHHHHHHHHHHHHhhhhHHH--hhceee-----cCce-eEEEEecC--Cccccccccccccccccccccc Confidence 223345568899998888888877777755 222111 2354 47887643 3667778999998777765444 Q ss_pred e-eeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccc-cchhheeeecccccccc Q lcl|NC_020854. 81 Q-VAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANT-SSSAFFDLCIDSESGDT 158 (342) Q Consensus 81 ~-~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~-~~~~~~~~~~~~~~~~~ 158 (342) + .-..++.+.-+.++++...-+..+....+.++++..+++..++.+|. | ....+. ..+.+..+......... T Consensus 211 ~i~l~~~k~~~~~~iS~ell~ds~~~l~~~i~~~l~~~i~~~~d~~il~---G---~G~~~~~p~Gil~~~~~~~~~~~~ 284 (425) T protein:vir:95 211 SIDFDGFKVGKVTFVDNYLLQDSIINLDDYVTKKIARAIAKALDLAIVK---G---TGAANKQPLGIIPSLPPENQVTVE 284 (425) T ss_pred eeeeeheeeeeeehhhHHHHhccHHHHHHHHHHHHHHHHHHHHHHHhhc---c---CCCCccccceeecccccccccccc Confidence 3 44555666666777776555566777889999999999999987765 1 110000 00111111111111223 Q ss_pred cccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHH-HHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 159 PTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYD-LVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 159 ~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~-L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) ....+++.+.++...+.-... .-.+|+||+.++.. |.. +..+++++++.+. +. 
.....++++| T Consensus 285 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~v~~~~~~~~~l~~---l~~~kd~~g~~i~--~~---------~~~~~~~l~G 350 (425) T protein:vir:95 285 ADNNLLKNLVKQIGLIDTGDDSVGEIVAVMKRSTYYNRLVE---FSIQVDSNGNVVG--KL---------PNLRTPDLLG 350 (425) T ss_pred cccchHHHHHHHHHhhhhhccccCceEEEEeChHHHHHHHH---HHhhcCCCCceee--cc---------CCCCCccccc Confidence 345577888888776554333 34568999887543 433 2334555543321 11 1234678999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCC--CcceeEEEEeeEEEeee---cceee-ecC-c Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDI--LAKSDAMSIDLHYVYHP---VGAKW-AVT-T 307 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~--~~g~~~l~~r~~y~~~~---~G~s~-~~~-~ 307 (342) ++|+++|.+|.. .++|+.-. ..++...++.++..++. ..+.+.+....++-..| ..|.. +.. . T Consensus 351 ~pvv~~~~~~~~---------~i~~Gd~~~~~~~~~~~~~i~~~~~~~f~~~~~~~~~~~r~d~~~~~~~a~~~~~i~~~ 421 (425) T protein:vir:95 351 LRVVFNNFLDDD---------TVLFGEFEQYTLVERENITIDSSTHVKFTEDQTAFRGKGRFDGKPVKPEAFVLVTITDP 421 (425) T ss_pred eeeEEcCcCCCc---------cEEEEecccEEEEeecceEEEeecccccccCceEEEEEEeeCcEeecccceEEEEecCc Confidence 999999999742 13333211 22334444555554443 34556666666655443 34433 111 0 Q ss_pred CCcC Q lcl|NC_020854. 308 TNPT 311 (342) Q Consensus 308 ~sPt 311 (342) ..+. T Consensus 422 ~~g~ 425 (425) T protein:vir:95 422 VQGA 425 (425) T ss_pred CCCC Confidence 1111 No 129 >protein:vir:9361 Length: 402 # NCBI annotation: SLT orf 37-like protein # Family: family:all:658 # MgeID: mge:166 # MgeName: phi 12 # Cross-refs: genbank:acc:NP_803339;genbank:gi:29028650;genbank:GeneID:1258088 Probab=98.74 E-value=1.8e-09 Score=68.47 Aligned_cols=258 Identities=13% Similarity=0.122 Sum_probs=144.9 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |.+ .-...++|+-+..-+.+.+.+.+.+.+ ++..- +++|. .+|.... ..+++.-+.|++.++..+.+ T Consensus 133 ~~~~t~~~GG~lIP~~~~~~Ii~~~~~~~~l~~--~~~v~-----~~~~~--~~p~~~~-~~~~a~~v~Eg~~~~~~~~~ 202 (402) T protein:vir:93 133 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLRE--KARLT-----NIKGL--EIPRVSY-TLDDDDFITDVETAKELKAK 202 (402) T ss_pred hccCCCcCCccccchhHHHHHHHhHHhhhhhhh--hceee-----ecCCc--eeeeeec-cCCccccccccccccccccc Confidence 332 224467898887777776666666643 22211 12333 3465442 23556667888888877766 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) -.+-.-..++.+.-+.++.+.-.-+..|..+.+.+++++.+.+...+.++....|. +...+.+. ...... T Consensus 203 f~~i~~~~~k~~~~i~iS~ell~Ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~------g~p~g~~~----~~~~~~ 272 (402) T protein:vir:93 203 GDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPKS------GLEHMSFY----NGSVKE 272 (402) T ss_pred cceeeecceeeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCc------cccceeee----cccccc Confidence 55555555555544566655443345577788999999998887655554332211 11011111 111111 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) .+....++.|.++...+......-..|+||+.++..|++. +++.. + .... 
..-.+++|++ T Consensus 273 ~~~~~~~d~l~~~~~~l~~~y~~na~~imn~~t~~~~~~~-----~~d~~--~-------~~~~------~~~~~llG~P 332 (402) T protein:vir:93 273 VEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIISV-----LSNGT--T-------NFFD------TPAEKVFGKP 332 (402) T ss_pred ccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHH-----HhcCC--C-------cccc------cCCccccccc Confidence 2223346788888877776666677899999999887653 12211 1 1111 1124789999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEe--ecCCcceeEeccCCCcceeEEEEeeEEEeeec---ceee---e-cCcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVAS--GEQMAMQTETDRDILAKSDAMSIDLHYVYHPV---GAKW---A-VTTT 308 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~--~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~---G~s~---~-~~~~ 308 (342) |+++|+++.. +| |-|.+ ..-....+...++...+...+....++-..|. .+.. + .++. T Consensus 333 V~~t~~~~~i-----------~~--GDf~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~ik~~~~~ 399 (402) T protein:vir:93 333 VVFTDAAVKP-----------IV--GDFNYFGINYDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGP 399 (402) T ss_pred eEEecCCCce-----------ee--echhhhhhhhhhhhhhhhhcccCCceEEEEEEEeCcEEechhheEEEEeecCCCC Confidence 9999987532 22 22221 11122444556666677888888888876653 3433 1 3344 Q ss_pred CcC Q lcl|NC_020854. 309 NPT 311 (342) Q Consensus 309 sPt 311 (342) .|| T Consensus 400 ~~~ 402 (402) T protein:vir:93 400 LPS 402 (402) T ss_pred CCC Confidence 566 No 130 >protein:vir:2685 Length: 387 # NCBI annotation: hypothetical protein # Family: family:all:658 # MgeID: mge:57 # MgeName: phiSLT # Cross-refs: genbank:acc:NP_075504;genbank:gi:12719433;genbank:GeneID:920169 Probab=98.73 E-value=2e-09 Score=68.24 Aligned_cols=258 Identities=12% Similarity=0.124 Sum_probs=146.3 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |. 
+.-...++|+-+..-+.+.+.+.+.+.+ ++..- .++| ..+|.... ..+++.-+.|++.++..+.+ T Consensus 118 ~~~~~~~~gG~lIP~~~~~~Ii~~~~~~~~l~~--~~~~~-----~~~~--~~~p~~~~-~~~~a~~v~Eg~~~~~~~~~ 187 (387) T protein:vir:26 118 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLRE--KARLT-----NIKG--LEIPRVSY-TLDDDDFITDVETAKELKAK 187 (387) T ss_pred hccCCCCCCceeechhHHHHHHHHHHhhchhhh--hceee-----ecCC--ceeeeeec-cCCccccccccccccccccc Confidence 33 2234578898887766666666555533 22211 1223 34565442 23566668889888877777 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) -++-.-..++.+--+.++++.-.-+..|..+.+.+++++.+.+...+.++....| .+...+.+. ...... T Consensus 188 f~~v~l~~~k~~~~i~iS~ell~ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g------~g~~~g~~~----~~~~~~ 257 (387) T protein:vir:26 188 GDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPK------SGLEHMSFY----NGSVKE 257 (387) T ss_pred cceeeechheeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCC------ccccceeee----cccccc Confidence 6655555555555566666654445557778899999998888766655533221 111001111 111111 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) .+....++.|.++...+-.....-..|+||+..+..|++. +++.. . .... 
..-.+++|++ T Consensus 258 ~~~~~~~d~i~~~~~~l~~~y~~na~~imn~~t~~~~~~~-----~~~~~-~--------~~~~------~~~~~llG~P 317 (387) T protein:vir:26 258 VEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIISV-----LSNGT-T--------NFFD------TPAEKVFGKP 317 (387) T ss_pred ccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHH-----HhcCC-C--------cccc------cCCccccccc Confidence 2223357888888877776666667899999998887653 12211 1 1111 1224789999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeE--eecCCcceeEeccCCCcceeEEEEeeEEEeeec---ceee---e-cCcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVA--SGEQMAMQTETDRDILAKSDAMSIDLHYVYHPV---GAKW---A-VTTT 308 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~--~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~---G~s~---~-~~~~ 308 (342) |+++|+++.. +|+ -|. |..-....+.+.++...+.+.+....+|-..|. .+.. + .++. T Consensus 318 V~~~~~~~~~-----------~~G--Df~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka~~~~ 384 (387) T protein:vir:26 318 VVFTDAAVKP-----------IVG--DFNYFGINYDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGP 384 (387) T ss_pred eEEecCCCce-----------eee--chhhhhhhhhhhhheecccccCCceEEEEEEEeCcEeechhheEEEEeecCCCC Confidence 9999987532 221 111 111122445566777778888888888877764 2333 1 2344 Q ss_pred CcC Q lcl|NC_020854. 309 NPT 311 (342) Q Consensus 309 sPt 311 (342) .|| T Consensus 385 ~~~ 387 (387) T protein:vir:26 385 LPS 387 (387) T ss_pred CCC Confidence 566 No 131 >protein:vir:94424 Length: 387 # NCBI annotation: ORF010 # Family: family:all:658 # MgeID: mge:1506 # MgeName: 47 # Cross-refs: genbank:acc:YP_240005;genbank:gi:66395666;genbank:GeneID:5133084 Probab=98.73 E-value=2e-09 Score=68.24 Aligned_cols=258 Identities=12% Similarity=0.124 Sum_probs=146.3 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |. +.-...++|+-+..-+.+.+.+.+.+.+ ++..- .++| ..+|.... 
..+++.-+.|++.++..+.+ T Consensus 118 ~~~~~~~~gG~lIP~~~~~~Ii~~~~~~~~l~~--~~~~~-----~~~~--~~~p~~~~-~~~~a~~v~Eg~~~~~~~~~ 187 (387) T protein:vir:94 118 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLRE--KARLT-----NIKG--LEIPRVSY-TLDDDDFITDVETAKELKAK 187 (387) T ss_pred hccCCCCCCceeechhHHHHHHHHHHhhchhhh--hceee-----ecCC--ceeeeeec-cCCccccccccccccccccc Confidence 33 2234578898887766666666555533 22211 1223 34565442 23566668889888877777 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) -++-.-..++.+--+.++++.-.-+..|..+.+.+++++.+.+...+.++....| .+...+.+. ...... T Consensus 188 f~~v~l~~~k~~~~i~iS~ell~ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g------~g~~~g~~~----~~~~~~ 257 (387) T protein:vir:94 188 GDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPK------SGLEHMSFY----NGSVKE 257 (387) T ss_pred cceeeechheeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCC------ccccceeee----cccccc Confidence 6655555555555566666654445557778899999998888766655533221 111001111 111111 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) .+....++.|.++...+-.....-..|+||+..+..|++. +++.. . .... 
..-.+++|++ T Consensus 258 ~~~~~~~d~i~~~~~~l~~~y~~na~~imn~~t~~~~~~~-----~~~~~-~--------~~~~------~~~~~llG~P 317 (387) T protein:vir:94 258 VEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIISV-----LSNGT-T--------NFFD------TPAEKVFGKP 317 (387) T ss_pred ccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHH-----HhcCC-C--------cccc------cCCccccccc Confidence 2223357888888877776666667899999998887653 12211 1 1111 1224789999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeE--eecCCcceeEeccCCCcceeEEEEeeEEEeeec---ceee---e-cCcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVA--SGEQMAMQTETDRDILAKSDAMSIDLHYVYHPV---GAKW---A-VTTT 308 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~--~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~---G~s~---~-~~~~ 308 (342) |+++|+++.. +|+ -|. |..-....+.+.++...+.+.+....+|-..|. .+.. + .++. T Consensus 318 V~~~~~~~~~-----------~~G--Df~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka~~~~ 384 (387) T protein:vir:94 318 VVFTDAAVKP-----------IVG--DFNYFGINYDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGP 384 (387) T ss_pred eEEecCCCce-----------eee--chhhhhhhhhhhhheecccccCCceEEEEEEEeCcEeechhheEEEEeecCCCC Confidence 9999987532 221 111 111122445566777778888888888877764 2333 1 2344 Q ss_pred CcC Q lcl|NC_020854. 309 NPT 311 (342) Q Consensus 309 sPt 311 (342) .|| T Consensus 385 ~~~ 387 (387) T protein:vir:94 385 LPS 387 (387) T ss_pred CCC Confidence 566 No 132 >protein:vir:96978 Length: 387 # NCBI annotation: ORF009 # Family: family:all:658 # MgeID: mge:1643 # MgeName: 42e # Cross-refs: genbank:acc:YP_239859;genbank:gi:66395517;genbank:GeneID:5133011 Probab=98.73 E-value=2e-09 Score=68.24 Aligned_cols=258 Identities=12% Similarity=0.124 Sum_probs=146.3 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |. +.-...++|+-+..-+.+.+.+.+.+.+ ++..- .++| ..+|.... 
..+++.-+.|++.++..+.+ T Consensus 118 ~~~~~~~~gG~lIP~~~~~~Ii~~~~~~~~l~~--~~~~~-----~~~~--~~~p~~~~-~~~~a~~v~Eg~~~~~~~~~ 187 (387) T protein:vir:96 118 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLRE--KARLT-----NIKG--LEIPRVSY-TLDDDDFITDVETAKELKAK 187 (387) T ss_pred hccCCCCCCceeechhHHHHHHHHHHhhchhhh--hceee-----ecCC--ceeeeeec-cCCccccccccccccccccc Confidence 33 2234578898887766666666555533 22211 1223 34565442 23566668889888877777 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) -++-.-..++.+--+.++++.-.-+..|..+.+.+++++.+.+...+.++....| .+...+.+. ...... T Consensus 188 f~~v~l~~~k~~~~i~iS~ell~ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g------~g~~~g~~~----~~~~~~ 257 (387) T protein:vir:96 188 GDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPK------SGLEHMSFY----NGSVKE 257 (387) T ss_pred cceeeechheeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCC------ccccceeee----cccccc Confidence 6655555555555566666654445557778899999998888766655533221 111001111 111111 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) .+....++.|.++...+-.....-..|+||+..+..|++. +++.. . .... 
..-.+++|++ T Consensus 258 ~~~~~~~d~i~~~~~~l~~~y~~na~~imn~~t~~~~~~~-----~~~~~-~--------~~~~------~~~~~llG~P 317 (387) T protein:vir:96 258 VEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIISV-----LSNGT-T--------NFFD------TPAEKVFGKP 317 (387) T ss_pred ccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHH-----HhcCC-C--------cccc------cCCccccccc Confidence 2223357888888877776666667899999998887653 12211 1 1111 1224789999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeE--eecCCcceeEeccCCCcceeEEEEeeEEEeeec---ceee---e-cCcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVA--SGEQMAMQTETDRDILAKSDAMSIDLHYVYHPV---GAKW---A-VTTT 308 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~--~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~---G~s~---~-~~~~ 308 (342) |+++|+++.. +|+ -|. |..-....+.+.++...+.+.+....+|-..|. .+.. + .++. T Consensus 318 V~~~~~~~~~-----------~~G--Df~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~Dg~v~~~~A~~~l~~ka~~~~ 384 (387) T protein:vir:96 318 VVFTDAAVKP-----------IVG--DFNYFGINYDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGP 384 (387) T ss_pred eEEecCCCce-----------eee--chhhhhhhhhhhhheecccccCCceEEEEEEEeCcEeechhheEEEEeecCCCC Confidence 9999987532 221 111 111122445566777778888888888877764 2333 1 2344 Q ss_pred CcC Q lcl|NC_020854. 309 NPT 311 (342) Q Consensus 309 sPt 311 (342) .|| T Consensus 385 ~~~ 387 (387) T protein:vir:96 385 LPS 387 (387) T ss_pred CCC Confidence 566 No 133 >protein:vir:93881 Length: 387 # NCBI annotation: ORF011 # Family: family:all:658 # MgeID: mge:1485 # MgeName: 3A # Cross-refs: genbank:acc:YP_239938;genbank:gi:66395599;genbank:GeneID:5130947 Probab=98.72 E-value=2.7e-09 Score=67.55 Aligned_cols=258 Identities=12% Similarity=0.108 Sum_probs=145.3 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |.+ .-...++|+-|..-+.+.+.+.+.+.+ ++.. . ..+|. ++|.... 
..+++.-+.|++.++..+.+ T Consensus 118 l~~~t~s~gG~~IP~~~~~~Ii~~~~~~~~l~~--~~~v----~-~~~~~--~~p~~~~-~~~~a~~v~E~~~~~~~~~~ 187 (387) T protein:vir:93 118 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLRE--KARL----T-NIKGL--EIPRVSY-TLDDDDFITDVETAKELKLK 187 (387) T ss_pred hccCcCCCCceeechhHHHHHHHHHHhhchhhh--heee----e-ecCCc--eEEEEee-cCCccccccCcccccccccc Confidence 442 234578898887777777666666643 2221 1 12232 3464332 23556668888888877776 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) -++-.-..++.+.-+.++++...-+..|..+.+.+++++.+.+..+..++....|. +.....+. ...... T Consensus 188 f~~v~~~~~k~~~~~~iS~ell~Ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g~---------g~p~g~l~-~~~~~~ 257 (387) T protein:vir:93 188 GDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPKS---------GLDHMSFY-NGSVKE 257 (387) T ss_pred cceeeeeheeeeeechhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHhHhhcCCCc---------cccceeee-cccccc Confidence 66555555555555666655443345577778999999988877665555322211 11011111 011111 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) .+....++.|.++...+......-..|+||+.++..|++. +++.+ + ..+. 
..-.+++|++ T Consensus 258 v~~~~~~d~i~~~~~~l~~~~~~~a~~~mn~~t~~~~~~~-----~~d~~--~-------~~~~------~~~~~llG~P 317 (387) T protein:vir:93 258 VEGADMYDAIINALADLHEDYRDNATIYMRYADYVKIISV-----LSNGT--T-------NFFD------TPAEKVFGKP 317 (387) T ss_pred ccccchHHHHHHHHhccChhhhcCCEEEEechHHHHHHHH-----HhcCC--C-------cccc------cCCccccccc Confidence 1223347889998887877766777899999998777643 12221 1 1111 1124789999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEe--ecCCcceeEeccCCCcceeEEEEeeEEEeeec-ceee------ecCcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVAS--GEQMAMQTETDRDILAKSDAMSIDLHYVYHPV-GAKW------AVTTT 308 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~--~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~-G~s~------~~~~~ 308 (342) |+++|+++..- || -|.+ ..-....+.+++....+...++.+.++-..|. .-+| +.++. T Consensus 318 V~~~~~~~~~~-----------~G--Df~~~~~~~~~~~~~~~~~~~~~~~~~~~~~r~d~~v~~~eA~~~l~~k~~~~~ 384 (387) T protein:vir:93 318 VVFTDAAVKPI-----------VG--DFNYFGINYDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKENTGS 384 (387) T ss_pred eEEecCCCcee-----------ee--ehhhhheehhhheeeecccccCCceeEEEEeeeCceeechhheEEEEeecCCCC Confidence 99999875321 21 1211 11112445566677778888888888877764 2233 12344 Q ss_pred CcC Q lcl|NC_020854. 309 NPT 311 (342) Q Consensus 309 sPt 311 (342) .|+ T Consensus 385 ~~~ 387 (387) T protein:vir:93 385 LPS 387 (387) T ss_pred CCC Confidence 566 No 134 >protein:vir:1084 Length: 437 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:21 # MgeName: bIL309 # Cross-refs: genbank:acc:NP_076738;genbank:gi:13095848;genbank:GeneID:920418 Probab=98.70 E-value=1.7e-08 Score=63.23 Aligned_cols=266 Identities=12% Similarity=0.045 Sum_probs=135.7 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceec-hhhc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLT-PGKI 76 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~-~~~l 76 (342) ++ +...-..+|+.+...+... .+...+.+ .+.. . 
..++..+.+|.+.. ..+....+.++..++ .+.. T Consensus 156 ~~~~~~~~~g~lvp~~~~~~i~~~-~~~~~l~~--~~~~---~--~~~~~~~~~~~~~~-~~~~~~~~~e~~~~~e~~~~ 226 (437) T protein:vir:10 156 VTGIALKDGKVIIPETILTPEKEV-HQFPRLGS--LVRT---E--SVTTTTGKLPIFNN-STDLLTAHTEYGQTTKNATP 226 (437) T ss_pred hhhcccccccccchHHHHHHHHHh-hhhhhhhh--ccee---E--eeccCceeeEEeec-cccccccccccccccccccc Confidence 22 1223356787776554332 22222211 1111 1 12334566777653 234556677777765 3444 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +-+.-....++.+.-+.++.+.-.-+.-|..+.+.+.+++.+.+..+..++.-+.+ . .. T Consensus 227 ~~~~v~~~~~k~~~~~~is~ell~ds~~~~~~~i~~~l~~~~~~~~~~~i~~g~g~----~-----------------~~ 285 (437) T protein:vir:10 227 VITPILWDLKTYTGGYVFSQELISDSSYDWQAELQSRLIELRDNTDDSLIITALTD----G-----------------IK 285 (437) T ss_pred cceeeeeehhheeeehhhhHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhhhcc----c-----------------cc Confidence 44554555556666566666554334446777799999999998888777663311 0 00 Q ss_pred cccccccHHHHHHHHHH-hCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAI-LGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~-~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) ........+.+.+++.. +-.....-.+|+||+..+..|+++ ++++++.+.... 
.....-++++| T Consensus 286 ~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~l~~l------kd~~g~~~~~~~---------~~~~~~~~l~G 350 (437) T protein:vir:10 286 KTTSTYLLGDLKKVLNVTLKPQDSAAASIVMSQSAYNLFDMA------TDAMGRPLLQPN---------VTAATGYTLLG 350 (437) T ss_pred ccccccchhhHHHHHHhhhhhhhhcCCEEEEcHHHHHHHHHh------hccCCCeeeccC---------ccCCCCccccc Confidence 11122344556666542 222333456899999999999874 445544433211 11233468999 Q ss_pred ceEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCCcceeEEEEeeEE---Eeeecceeeec----- Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDILAKSDAMSIDLHY---VYHPVGAKWAV----- 305 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y---~~~~~G~s~~~----- 305 (342) ++|++++++++...+. +. .+++|+. -++.+.....+.++...+.....+.+..-.+| +++|..+..-. T Consensus 351 ~pv~~~~~~~~~~~~~-~~-~~~~~gd~~~~~~~~~r~~~~~~~~~~~~~~~~~~~~~~r~d~~~~~~~a~~~l~~~~~~ 428 (437) T protein:vir:10 351 KTVVIVDDKLFPSASA-GD-VNIVVAPLKKAVINFKLTEITGQFQDTYDIWYKQLGIFLRQNVVQASKDLIVNLTGKLKA 428 (437) T ss_pred ceeEEecccccCCcCC-Cc-eEEEEeeccccEEEEeeeceEEEEecccccccceeeEEEEEccEEecccceEEEEeeccc Confidence 9999998875543322 22 2344452 34555555556665543322222222222233 23455554411 Q ss_pred -CcCCcChH Q lcl|NC_020854. 306 -TTTNPTRA 313 (342) Q Consensus 306 -~~~sPt~~ 313 (342) +...|+-. T Consensus 429 ~~~~~~~~~ 437 (437) T protein:vir:10 429 VTVVQSTAV 437 (437) T ss_pred cccCCCCCC Confidence 11122222 No 135 >protein:vir:4092 Length: 390 # NCBI annotation: major capsid protein a # Family: family:all:635 # MgeID: mge:86 # MgeName: 2389 # Cross-refs: genbank:acc:NP_510986;swissprot:trembl:q8w604;genbank:gi:17488508;uniprot:Q8W604;genbank:GeneID:1260361 Probab=98.65 E-value=1.1e-08 Score=64.21 Aligned_cols=270 Identities=11% Similarity=-0.017 Sum_probs=146.6 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hhcccc Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GKITAD 79 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~lt~~ 79 (342) .++.-...++|+.+..-+.+.+.+.+.+.+ ++..-+ .++....+|.+.. .+.+.-+.|+..++. .+.+-+ T Consensus 87 ~~~~~gg~lvP~~~~~~I~~~~~~~s~i~~--~~~~~~-----~~~~~~~i~~~~~--~~~a~~~~E~~~~~~~~~~~f~ 157 (390) T protein:vir:40 87 NGFAGVTALLPPTVFERVFEDLTVEHPLLS--KINFVN-----TTATTEWIISVGD--VATAWWGPLCAEIKEVLDNGFD 157 (390) T ss_pred cCcccCcccccHHHHHHHHHHHHhhhhhhh--hceeee-----cCCceeEEEEEcC--CcceeeeccccccCccccccce Confidence 224456778899888777777766666643 222211 2456677887753 367777788877764 455555 Q ss_pred eeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhhee---eecccccc Q lcl|NC_020854. 80 KQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFD---LCIDSESG 156 (342) Q Consensus 80 ~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~---~~~~~~~~ 156 (342) +-.-..++.+.-+.++++...-+.-|..+.+.+++++.++++.++.+|. | .. .+.-.+.+.. .+...... T Consensus 158 ~i~l~~~k~~~~i~iS~ell~ds~~~l~~~i~~~la~~i~~~~~~a~l~---G---~G-~~~P~Gil~~~~~~~~~~~~~ 230 (390) T protein:vir:40 158 KIQTGMYKLSAYIPVCNAMLDLGPSWLDQYVRTILGEAMALGLEAGIVN---G---SG-KDQPIGMMRDLNNVTAGEHPV 230 (390) T ss_pred eeEeeeeeEEEeehhhHHHHhcchHHHHHHHHHHHHHHHHHHHHhhhhc---c---cC-CCccceeeecccccccccccc Confidence 5555556666567777776666666778889999999999999987764 2 10 0000000000 00001111 Q ss_pred cccccccHHHHHH----HHHHhCcc---ccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccc Q lcl|NC_020854. 157 DTPTALSPRHVAE----ARAILGDQ---GDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVS 229 (342) Q Consensus 157 ~~~~~~~~~~l~~----A~~~~GD~---~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~ 229 (342) .+...++.....+ ....+++. ...-.+|+||+..+..+.+. +..+++.++..+. 
T Consensus 231 ~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~a~~i~n~~t~~~~l~~--~~~~~d~~G~~v~----------------- 291 (390) T protein:vir:40 231 KTATPLTDLTPATLATKVMLPLTDNGKKSVSDAILVINPADYWSKIYA--ATSYMTPQGVWVT----------------- 291 (390) T ss_pred ccccccchhhHHHHHHHHHHHhhcchhhhhcCceEEEcchhHHHHHHH--HhhccCCCCcccc----------------- Confidence 1122233222222 22233332 23457899999886544321 2234444432211 Q ss_pred eeeeccceEEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCC--CcceeEEEEeeEEEeeecc---e-- Q lcl|NC_020854. 230 VPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDI--LAKSDAMSIDLHYVYHPVG---A-- 301 (342) Q Consensus 230 i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~--~~g~~~l~~r~~y~~~~~G---~-- 301 (342) -....|++|++++.||-. .++|+.-. +.+....++.+++++.. ..+++.+....++-..|.- | T Consensus 292 ~~~~~g~pvv~~~~~p~~---------~i~~Gd~s~~~i~~~~~~~v~~~~~~~f~~~~~~~r~~~r~dg~v~~~~A~~~ 362 (390) T protein:vir:40 292 GILPVPLEIVQSVAVPVG---------KAVAGRAKDYFMGIGSEQVIRTSTEYRLLDDETLYYAKQYANGRPKDNSSFLV 362 (390) T ss_pred ccCCCceeEEEcCCCCCC---------cEEEEeeceEEEEeecceEEEecchhhhhcCcEEEEEEEEeCCEEecccceEE Confidence 112259999999999732 13333221 22344455667766654 4577888888888777643 2 Q ss_pred -eee--------------cCcCCcChHH Q lcl|NC_020854. 302 -KWA--------------VTTTNPTRAQ 314 (342) Q Consensus 302 -s~~--------------~~~~sPt~~~ 314 (342) +.+ +.+.+|+-+| T Consensus 363 l~~~~~~~~~~~~~~~~~~~~~~~~~~~ 390 (390) T protein:vir:40 363 FDITGLEGSPAIDVNVVNNATPSETPAE 390 (390) T ss_pred EEeeccCCCCCCCcceeeCCCCCCCCCC Confidence 221 1122344444 No 136 >protein:vir:99675 Length: 324 # NCBI annotation: Major capsid protein # Family: family:all:975 # MgeID: mge:1523 # MgeName: VP4 # Cross-refs: genbank:acc:YP_249589;genbank:gi:68299740;genbank:GeneID:3799990 Probab=98.64 E-value=2.8e-09 Score=67.44 Aligned_cols=272 Identities=12% Similarity=0.069 Sum_probs=156.3 Q ss_pred hh-ccCCCCEEEccccccCCCCcccccCCCcee--chhhcccceeeeEeee-eccceeechHHHhhhcchHHHHHHHHHH Q lcl|NC_020854. 
40 LN-ATEGGDFINVPFWKANLSGDFEVLSDSSSL--TPGKITADKQVAAILH-RGRAFEARDLAALAAGSDPMAAIGAKVA 115 (342) Q Consensus 40 l~-~~~~G~ti~~P~~~~i~~gda~~~~~~~~i--~~~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~dp~~~i~~qia 115 (342) |. .-.+|+++.+|+.. .-.......|++| +++.+...+.+-+|=. .--.+.+.|+-...+..|++.+..+|.+ T Consensus 1 ~vr~i~~g~s~~~~~iG---~~~~~~~~~G~~l~~~~~~~~~~e~~itID~~l~~~~~VdDiD~~qa~~Dlr~e~s~~~G 77 (324) T protein:vir:99 1 MTRTITSGKSAQFPVMG---RTKARYLKQGQSLDDGREDIKHTEKVITIDGLLTTDVLIYDIEDAMNHYDVRSEYSTQMG 77 (324) T ss_pred CeeeeecCceEEEeeee---eeEeccccCCCCcCCCcCCcCcccEEEEecchhhhhhhhhhHHHHhcCccchhHHHHHHH Confidence 22 12469999999753 2456678888888 4677888887666543 4445788999988888999999999999 Q ss_pred HHHHHHHHHHHHHHHHHHHhhhcccccch-----hheeeecccccccccccccH----HHHHHHHHHhCcc--ccCeEEE Q lcl|NC_020854. 116 DYVANQRQKDLLSCLQGVFGSLNANTSSS-----AFFDLCIDSESGDTPTALSP----RHVAEARAILGDQ--GDKLTAV 184 (342) Q Consensus 116 ~~~~~~~~~~lla~L~g~~~~~~a~~~~~-----~~~~~~~~~~~~~~~~~~~~----~~l~~A~~~~GD~--~~~~~~i 184 (342) ...++.+|+.++..+..+........... ....+.... .......+. +.|.+|.++|-.+ -..-..+ T Consensus 78 ~aLA~~~Dq~i~~~~a~~~~~~a~~~~~~~~~~g~~~~~~~~~--~~~~~~~~~~~~~dai~~a~~~Lde~~VP~~gR~~ 155 (324) T protein:vir:99 78 EALAMAADVANYAEMAKLVNSRKETTNENIEGLGAASLVKITG--KKEDPAKYGTQVIQALTYARAAFAKKYIPAGDRTF 155 (324) T ss_pred HHHHHHHHHHHHHHHHHhhhcccccccCCcccCCccceecccc--cccccccCHHHHHHHHHHHHHHHhhcCCCCCCCEE Confidence 99999999999877654432211111000 001111111 112223343 4445555555443 1134789 Q ss_pred EEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEEeCCcceeccC-------------- Q lcl|NC_020854. 185 AMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSG-------------- 250 (342) Q Consensus 185 vmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~-------------- 250 (342) +|.|.+|..|.+...+..... . ..+. ...+.++.++|.+|+.+..+|..... 
T Consensus 156 vv~P~~y~~Ll~~~~~~~~~~---~------~~~~-----~~~G~V~~i~Gf~V~~Sn~lp~~~~t~~~~a~~~~~~~~~ 221 (324) T protein:vir:99 156 YTDPDTYSAILAALMPNAANY---A------ALID-----PETGNIRNVMGFEVVETPHMTAQMVTNPTDAFDGTGHIFP 221 (324) T ss_pred EeChHHHHHHhhccccccccc---c------cccc-----eecceEEEEeceEEEecCCccccccccccccccccccccc Confidence 999999998876654332111 0 1111 22467899999999999999864211 Q ss_pred ------CCcce-------EEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeEEEeeec---c---eeeecC---cC Q lcl|NC_020854. 251 ------GSTEY-------ATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPV---G---AKWAVT---TT 308 (342) Q Consensus 251 ------~~~~y-------~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~---G---~s~~~~---~~ 308 (342) ...+| ..++|-+-|++...-.++.+|..|+.....+.+...+.|+..+. + +.+..+ +. T Consensus 222 ~~~~~~~~~ky~~d~~~~~gl~~~~~a~~tv~~~~~~~e~~~~~~~~~d~i~~~~a~G~~~lRPe~a~~v~l~~~~~~~~ 301 (324) T protein:vir:99 222 ATGDSTTTGKMTVGADNVVGLFVHRSAVATLKLKDMALERARRPEYQADQIIAKYAMGHGGLRPEAVGAIIFEDGETPAV 301 (324) T ss_pred cccccccccccccccCceeEEEEehhheEEEeeecceecceechhhHHHhhhhhhhhcCcccccceEEEEEEccCccccc Confidence 01122 23566666777777778889999998887787777777766543 1 122111 11 Q ss_pred CcChHH-hcCCcCceeecCccccceEEEEecC Q lcl|NC_020854. 309 NPTRAQ-LETVANWSKVYELKNIGIVRATNVS 339 (342) Q Consensus 309 sPt~~~-L~~~~NW~~v~d~k~i~~~~~~~~~ 339 (342) .|.-.+ ++...- + +--++ ++.. T Consensus 302 ~~~~~~~~~~~~~-~------~~~~~--~~~~ 324 (324) T protein:vir:99 302 APDVITGVASFAA-P------ASTRA--KSSA 324 (324) T ss_pred cchhhhhhccccC-c------cccee--eecC Confidence 222110 000000 0 00000 0000 No 137 >protein:vir:962 Length: 397 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:19 # MgeName: bIL285 # Cross-refs: genbank:acc:NP_076616;genbank:gi:13095724;genbank:GeneID:920264 Probab=98.63 E-value=3.6e-08 Score=61.42 Aligned_cols=254 Identities=10% Similarity=0.052 Sum_probs=129.5 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceech-hhc Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTP-GKI 76 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~-~~l 76 (342) ++ +.-....+|+.+...+.+.. +...+.+ .+.. . ..++....+|.... .++.+..+.|++..+. ... T Consensus 132 ~~~~~~~~~~~~vp~~~~~~i~~~~-~~~~l~~--~~~~---~--~~~~~~~~~~~~~~-~~~~~~~~~E~~~~~~~~~~ 202 (397) T protein:vir:96 132 RDGFTSVEGGALIPQELLQPQLEPK-DIVDLSK--YVRS---V--PVNSASGKFPVISK-SGSKMATVQQLEKNPQLANP 202 (397) T ss_pred hhcccccccccchhHHHHHHHHHhh-hhhhHHH--hhhh---c--cccccceeEEEEec-cCCccccccccccccccccc Confidence 22 22344566766655554422 2222211 1110 0 12344556665442 2344556777776653 445 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESG 156 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~ 156 (342) +..+-....++.+.-..++.....-+..|....+.+++++...+..+..++..+. .+ T Consensus 203 ~~~~i~~~~~~~~~~~~~s~ell~ds~~~l~~~i~~~l~~~~~~~~~~~i~~g~g-----------------------~~ 259 (397) T protein:vir:96 203 KMVEIDYSVATRRGYIPISQEMIDDASYDVTGLIADEIQDQSLNTKNADIAAVLK-----------------------TA 259 (397) T ss_pred cccceeecHhHhhcchhhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHhhccc-----------------------cc Confidence 5555455555555555555444332334556668888888888777766554321 01 Q ss_pred cccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccc Q lcl|NC_020854. 157 DTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGL 236 (342) Q Consensus 157 ~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~ 236 (342) ......+++.+.++....-+... -++|+|||..+..|++. ++++++.+... 
......-++++|+ T Consensus 260 ~~~~~~~~d~~~~~~~~~~~~~~-~a~~v~n~~~~~~l~~l------kd~~G~~~~~~---------~~~~~~~~~l~G~ 323 (397) T protein:vir:96 260 TAKSVVGVDGLKDLINKEIKKVY-DVKLFISASMYSELDKL------KDKNGRYLLQD---------SITAASGKQLLGK 323 (397) T ss_pred ccccccchHHHHHHHHHhhhhhc-CcEEEEcHHHHHHHHHh------hccCCCeEecc---------CccCCCccccccc Confidence 12234567888888765444333 36899999999999874 34554433211 1122345789999 Q ss_pred eEEEeCCcceeccCCCcceEEEEEec--ceeEeecCCcceeEeccCCCcceeEEEEeeEEEe---eecceee---ecC Q lcl|NC_020854. 237 RVIVSDDVNTAGSGGSTEYATYFFTQ--GAVASGEQMAMQTETDRDILAKSDAMSIDLHYVY---HPVGAKW---AVT 306 (342) Q Consensus 237 ~VvvdD~~p~~~~~~~~~y~t~l~~~--GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~---~~~G~s~---~~~ 306 (342) ||++++..+...+ .++ .+++|+. .++.++....+.+++.+.... .+.+.+-.++.. ||..|.. +.+ T Consensus 324 pv~~~~~~~~~~~--~~~-~~~~~gd~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~r~d~~~~~~~a~~~~~~~~a 397 (397) T protein:vir:96 324 EVVVLDDDVIGKS--VGN-VVGFIGDAKAFASFFDRKQVSVSWVDNNIY-GQLLAGIIRYDVKATDKKAGFYVTFTIG 397 (397) T ss_pred ceEEecccccCCC--CCc-eEEEEeehhcceEeEeecceEEEEeccccc-ceeEEEEEEEccEEecccceEEEEeecC Confidence 9998776544321 222 3455553 234455555566665544332 223333333322 2333322 223 No 138 >protein:vir:174 Length: 423 # NCBI annotation: capsid protein # Family: family:all:1412 # MgeID: mge:5 # MgeName: HK620 # Cross-refs: genbank:acc:NP_112079;genbank:gi:13559869;genbank:GeneID:920999 Probab=98.62 E-value=4.1e-08 Score=61.08 Aligned_cols=302 Identities=14% Similarity=0.036 Sum_probs=164.3 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccch--hhhccCCCCEEEccccccCCCCcccccC--CCceechhhc Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMT--ELNATEGGDFINVPFWKANLSGDFEVLS--DSSSLTPGKI 76 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~--~l~~~~~G~ti~~P~~~~i~~gda~~~~--~~~~i~~~~l 76 (342) ||.++. -++||+++.-+.+.+.+.+.|.+ ++-++- ++..+.-||+|+||.=.. ....++. 
.+..++++.+ T Consensus 1 MaN~ll-T~ip~iia~~al~~l~~~lV~~~--lVnr~y~~e~~~~k~GDTV~I~~p~~---~~~~~~~~~~~~~~~~~~l 74 (423) T protein:vir:17 1 MPNNLD-SNVSQIVLKKFLPGFMSDLVLAK--TVDRQLLAGEINSSTGDSVSFKRPHQ---FSSLRTPTGDISGQNKNNL 74 (423) T ss_pred Cccchh-hhhHHHHHHHHHHHHHhhcccch--hhcccCCcchhhcccCCEEEEeeCCc---ceeecccCcccCCcccCcc Confidence 996543 23799999888888888777744 555543 343223499999985332 3333443 3345778899 Q ss_pred ccceeeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 77 TADKQVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 77 t~~~~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) +.++...++- ....++.++|+...+.-.|. .++.++.....++++|.+|++.+.+. + .... .+ T Consensus 75 ~e~~v~l~id~~k~va~~v~d~E~~~~i~~~-~~~l~~A~~aLA~~vd~~ia~~~~~~-a----~~~~----------gt 138 (423) T protein:vir:17 75 ISGKATGRVGNYITVAVEYQQLEEAIKLNQL-EEILAPVRQRIVTDLETELAHFMMNN-G----ALSL----------GS 138 (423) T ss_pred ccceeEEEeeceeeeeeeecHHHHhcChhHH-HHHHHHHHHHHHHHHHHHHHHHHhhc-c----cccc----------cc Confidence 8887766664 45667888888876666674 67777778889999999988765331 1 1100 00 Q ss_pred ccccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccce-ee Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSV-PT 232 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i-~~ 232 (342) . ....-.++.+.++..+|.+..= .-..+|+.|..+..|.+..-.-+..++-. ....-+..+ +. 
T Consensus 139 ~-~t~~~a~~~i~~a~~~Ld~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~~~~~~-------------~~alr~g~i~G~ 204 (423) T protein:vir:17 139 P-NTPITKWSDVAQTASFLKDLGVNEGENYAVMDPWSAQRLADAQTGLHASDQLV-------------RTAWENAQIPTN 204 (423) T ss_pred C-CcccccHHHHHHHHHHHHhccCCcCCCEEEeChHHHHHHhccccceecccccc-------------hHHHhhccceee Confidence 0 0111136889999999987522 34778999999999986532111111100 001112344 68 Q ss_pred eccceEEEeCCcceeccCCCcceEEEEEecc-----eeEeecCCc---ceeEe--ccCCCcceeEEEEeeEEEeeecc-- Q lcl|NC_020854. 233 YMGLRVIVSDDVNTAGSGGSTEYATYFFTQG-----AVASGEQMA---MQTET--DRDILAKSDAMSIDLHYVYHPVG-- 300 (342) Q Consensus 233 ~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~G-----Ai~~~~k~~---~~ve~--dr~~~~g~~~l~~r~~y~~~~~G-- 300 (342) +.|..|.+|..+|....+.... +.....+ +...+.... +.... +-+.....|.+..-..+.+|+.- T Consensus 205 i~GFdvy~Snnip~~T~gt~~~--t~~~~~~~~v~~~a~~~~~~~~~~~~~~~~~~~g~l~~GD~~t~aGv~~v~~~tk~ 282 (423) T protein:vir:17 205 FGGIRALMSNGLASRTQGAFGG--TLTVKTQPTVTYNAVKDSYQFTVTLTGATTSVTGFLKAGDQVKFTNTYWLQQQTKQ 282 (423) T ss_pred ecceEEEEeCCCccccccceec--eeeecccccccccccccccceeeeeeeeeeeccCceeecceEEecceeeecccccc Confidence 8999999999999653322211 1111111 111111111 11111 22323333455444555555422 Q ss_pred ----------eeeecC-------------cCCcC-------------hHHhcCCcCcee------------ecCccccce Q lcl|NC_020854. 301 ----------AKWAVT-------------TTNPT-------------RAQLETVANWSK------------VYELKNIGI 332 (342) Q Consensus 301 ----------~s~~~~-------------~~sPt-------------~~~L~~~~NW~~------------v~d~k~i~~ 332 (342) .+|+.. .++|. .+..++++.|+. +|.+.++++ T Consensus 283 v~~~~~t~~~~~~~v~~~~~~~a~~~~tv~i~p~~i~~~~~~~~~~v~a~~a~~~~vT~~~~a~~t~~~nl~~~~~a~~l 362 (423) T protein:vir:17 283 ALYNGATPISFTATVTADANSDSSGDVTVTLSGVPIYDTTNPQYNSVSRQVAAGDAVSVVGTASQTMKPNLFYNKFFCGL 362 (423) T ss_pred cccccccccceEEEEEecccccccCceEEEecCccccccCCcccccceecccCCceeeccccccCCeeEEEEecCcceEE Confidence 233210 01221 133444444443 566667776 Q ss_pred EEEEecCCCC Q lcl|NC_020854. 333 VRATNVSNFD 342 (342) Q Consensus 333 ~~~~~~~~~~ 342 (342) +..-- +.. 
T Consensus 363 ~~~pl--~~~ 370 (423) T protein:vir:17 363 GSIPL--PKL 370 (423) T ss_pred EEEcc--cCC Confidence 65511 111 No 139 >protein:vir:4159 Length: 315 # NCBI annotation: structural protein # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:87 # MgeName: psiM2 # Cross-refs: genbank:acc:NP_046968;genbank:gi:9630538;genbank:GeneID:1261712 Probab=98.58 E-value=8.2e-08 Score=59.44 Aligned_cols=275 Identities=10% Similarity=0.024 Sum_probs=141.9 Q ss_pred Ccceecc----ccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEcccccc---CCCCcccccCCCceech Q lcl|NC_020854. 1 MATLRSD----IIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKA---NLSGDFEVLSDSSSLTP 73 (342) Q Consensus 1 MaT~~~d----~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~---i~~gda~~~~~~~~i~~ 73 (342) |. .+| .+.||.+..++ +.+.+.+.|.+ .+...... .+.+..++.-.. ...|. ....+++..+. T Consensus 19 ~t--~~d~~Gg~l~P~~~~~~i-~~~~e~s~~l~--~~~vi~~~----~~~~~~i~~~g~~~~~~~g~-~~~~~~~~~~~ 88 (315) T protein:vir:41 19 ID--VPDLGRGVLSVDRFGEFV-KAVRDSAVIIP--EARIDNAL----KSYEKDISRLSLVLDVGPGR-DETGQKLAPPE 88 (315) T ss_pred cC--CcCCCCceechHHHHHHH-HHHHhhhhhhh--hceeeecc----ccccccccccccCccccccc-ccccCcCCCCC Confidence 33 244 48899998866 56777777755 23222221 122333332110 11111 11112222222 Q ss_pred hhcccceeeeEeeeeccceeechHHHhhh--cchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc-----chhh Q lcl|NC_020854. 74 GKITADKQVAAILHRGRAFEARDLAALAA--GSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTS-----SSAF 146 (342) Q Consensus 74 ~~lt~~~~~a~i~~~~k~~~~tD~a~~~~--~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~-----~~~~ 146 (342) .+.+-++..-..++..--+.+++..-.-+ +.|..+.+..++++.++++.+..++. | ...+... 
.+-+ T Consensus 89 ~~~~f~~~~l~~~~l~~~~~it~elL~D~~~~~~~e~~l~~~~a~~~a~~~~~~~~n---G---dg~s~~p~~~~~~G~l 162 (315) T protein:vir:41 89 STAEVKTNTLYMREMVTKVVIHEDAIEDNIEGKAFEQKIVTLLGEGISYVLEKYYLH---G---DTSSSDPLLRMSDGWL 162 (315) T ss_pred CccccceeeeceeeeeeeccccHHHHHhhhccccHHHHHHHHHHHHHHHHHHHHhhc---c---CCcCcCccccccccce Confidence 23333333323333333356666664433 35888999999999999988877664 2 1111000 0111 Q ss_pred eeee---cccccccccccccHHHHHHHHHHhCcccc---CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeecccee Q lcl|NC_020854. 147 FDLC---IDSESGDTPTALSPRHVAEARAILGDQGD---KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSM 220 (342) Q Consensus 147 ~~~~---~~~~~~~~~~~~~~~~l~~A~~~~GD~~~---~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~ 220 (342) .... ...........+..+.|.+....+-.+.- .-.+|+||+.++..+|+.. +.+.. . + T Consensus 163 ~~a~~~~~~~~~~~~a~~~~~d~l~~l~~sl~~~yr~~~~~~~~imn~~t~~~~rklk------~~~g~--~-------l 227 (315) T protein:vir:41 163 KLASEKLTESDVDPEAEDWPMNLFDTMIESLPTPYRNNLPNMKFYVTWDIYRAYRDAL------KGRET--G-------L 227 (315) T ss_pred ecccccccccccccccccccHHHHHHHHHhcChHHhhcCCceEEEEcHHHHHHHHHHh------ccCCC--c-------c Confidence 0000 01111122334567778888887777543 3568999999999998753 12211 1 1 Q ss_pred ecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCCCcceeEEEEeeEEEeeec Q lcl|NC_020854. 221 AAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDILAKSDAMSIDLHYVYHPV 299 (342) Q Consensus 221 ~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~ 299 (342) ..+......-.+++|++|+..+.||....+.. .++|+.-. +.|+..+.+.++++|+...+...++.+.+..++.. T Consensus 228 w~~~~~~g~~~tl~G~PV~~~~~m~~~~~~~~----~ilf~d~~nl~~~~~~~i~i~~~~~a~~~~~~~~~~~r~d~~~~ 303 (315) T protein:vir:41 228 GDQALTGANSILYDGRPVQYVPALEALNDGKS----RALFVVPTQLVYGFWRNIKVVPDYDAEMRLTKYVASLRTDNHYE 303 (315) T ss_pred ccchhhcCCCceecccceEecccccccCCCCc----cEEEecccceEEEeccccEEEeeecCCCCceEEEEEEEeceeEE Confidence 11112234456899999999999987654332 35555432 45666777888999988777766666665433321 Q ss_pred ceeeec-CcCCc Q lcl|NC_020854. 
300 GAKWAV-TTTNP 310 (342) Q Consensus 300 G~s~~~-~~~sP 310 (342) --.+.. +-..- T Consensus 304 ~~~~~a~~~~~v 315 (315) T protein:vir:41 304 DEEGAVSATITV 315 (315) T ss_pred eccceeEeeeeC Confidence 111100 00000 No 140 >protein:vir:101650 Length: 497 # NCBI annotation: gp13 # Family: family:all:585 # MgeID: mge:1515 # MgeName: 244 # Cross-refs: genbank:acc:YP_654768;genbank:gi:109302766;genbank:GeneID:4156084 Probab=98.57 E-value=7e-08 Score=59.81 Aligned_cols=285 Identities=10% Similarity=0.036 Sum_probs=140.1 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |++ .-.-..+|+.+..-+.+...+.+.+.+ ++.. + ..++..+++|.... .++.+.-+.|++.++..+.+ T Consensus 151 ~~~~~~~~gg~~vp~~~~~~ii~~~~~~~~i~~--l~~~---~--~~~~~~~~~~~~~~-~~~~a~wv~E~~~~~~s~~~ 222 (497) T protein:vir:10 151 NPFGSTGTFAPGILPTFLPGIVEQLFYELSLAD--LISS---R--PVTSPNLSYLTESA-AHNNAAAVAEAGTYPFSSEE 222 (497) T ss_pred hhcccCcccccccchhhhHHHHHHHHhhhhHHh--hccc---c--ccCCCceEEEEEcC-CCCcceeeccCccccccccc Confidence 331 122234555555555555555555543 2221 1 12345688898652 23566788999999988887 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-----HHHHHhhhcccccc---hhheee Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSC-----LQGVFGSLNANTSS---SAFFDL 149 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~-----L~g~~~~~~a~~~~---~~~~~~ 149 (342) -++-....++.+--..++++.-.-+ .+-...|.++++..+++..+..+|.= ..|++....+.... ....+. 
T Consensus 223 f~~i~~~~~k~a~~~~iS~ell~d~-~~l~~~i~~~l~~~i~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~ 301 (497) T protein:vir:10 223 FARVYEQVGKVANALTITDEGLRDA-PELFNFVQGRLLEGIQRKEEVQLLAGGGYPGVNGLLQRSTGFTASSASSLFGAT 301 (497) T ss_pred ceeeEeeeeeeEeecHhHHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHhhcCCCcccccccccccccccccccccchhhh Confidence 7766666666665555555442222 23455688899999998888777641 11121110000000 000000 Q ss_pred e------cccccccccccccHHHH---------------------------------HHHHHHhCcc-ccCeEEEEEchH Q lcl|NC_020854. 150 C------IDSESGDTPTALSPRHV---------------------------------AEARAILGDQ-GDKLTAVAMHSK 189 (342) Q Consensus 150 ~------~~~~~~~~~~~~~~~~l---------------------------------~~A~~~~GD~-~~~~~~ivmhS~ 189 (342) . ............+...+ ..+....-.. ...-.+|+||+. T Consensus 302 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~vmn~~ 381 (497) T protein:vir:10 302 SATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAFVDIQLTLFQTPNAVVMNPR 381 (497) T ss_pred hhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHHhhhhhhcccCCCeEEEchH Confidence 0 00000000000111111 1111111110 112237999999 Q ss_pred HHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeec Q lcl|NC_020854. 190 VYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGE 269 (342) Q Consensus 190 v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~ 269 (342) ++..|++. ++++++.+.....+.....+ ...-++++|+||++++.+|.. ++..=-|..+++.+.. T Consensus 382 ~~~~l~~l------kd~~G~~i~~~~~~~~~~~~---~~~~~~l~G~pV~~t~~~~~~------~~~~Gd~~~~~~~i~~ 446 (497) T protein:vir:10 382 DWELLRLT------KDANGQYMGGNFFGNAYGNP---VNGGKNIWGVPVVTTPLIPLG------TILVGHFAPSVIQTAR 446 (497) T ss_pred HHHHHHHh------hcCCCceeccCccccccccc---ccCCceeeceeeEecCCCCCC------ceEEeecccceEEEEE Confidence 99999875 34554433322222111111 123458899999999999742 2111113456676766 Q ss_pred CCcceeEeccCC----CcceeEEEEeeEEEeee---cceee---ec-CcCC Q lcl|NC_020854. 
270 QMAMQTETDRDI----LAKSDAMSIDLHYVYHP---VGAKW---AV-TTTN 309 (342) Q Consensus 270 k~~~~ve~dr~~----~~g~~~l~~r~~y~~~~---~G~s~---~~-~~~s 309 (342) ...+.++..+.. ......+....++.+.| ..|.. +. +..| T Consensus 447 r~~~~v~~~~~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~~~~~~ 497 (497) T protein:vir:10 447 REGVTMQMTNSNGTDFVDGKVTVRAEERLGLLVYRPSAFQLIQLKKGATGS 497 (497) T ss_pred ecccEEEeecccchhhhcCcEEEEEEEeecceeeccccEEEEEecCCccCC Confidence 666666665432 24556777777776644 44432 21 1222 No 141 >protein:vir:7855 Length: 497 # NCBI annotation: gp12 # Family: family:all:585 # MgeID: mge:150 # MgeName: CJW1 # Cross-refs: genbank:acc:NP_817462;genbank:gi:29565891;genbank:GeneID:1259081 Probab=98.57 E-value=7e-08 Score=59.81 Aligned_cols=285 Identities=10% Similarity=0.036 Sum_probs=140.1 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |++ .-.-..+|+.+..-+.+...+.+.+.+ ++.. + ..++..+++|.... .++.+.-+.|++.++..+.+ T Consensus 151 ~~~~~~~~gg~~vp~~~~~~ii~~~~~~~~i~~--l~~~---~--~~~~~~~~~~~~~~-~~~~a~wv~E~~~~~~s~~~ 222 (497) T protein:vir:78 151 NPFGSTGTFAPGILPTFLPGIVEQLFYELSLAD--LISS---R--PVTSPNLSYLTESA-AHNNAAAVAEAGTYPFSSEE 222 (497) T ss_pred hhcccCcccccccchhhhHHHHHHHHhhhhHHh--hccc---c--ccCCCceEEEEEcC-CCCcceeeccCccccccccc Confidence 331 122234555555555555555555543 2221 1 12345688898652 23566788999999988887 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-----HHHHHhhhcccccc---hhheee Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSC-----LQGVFGSLNANTSS---SAFFDL 149 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~-----L~g~~~~~~a~~~~---~~~~~~ 149 (342) -++-....++.+--..++++.-.-+ .+-...|.++++..+++..+..+|.= ..|++....+.... ....+. 
T Consensus 223 f~~i~~~~~k~a~~~~iS~ell~d~-~~l~~~i~~~l~~~i~~~~d~~~l~G~G~~~p~Gil~~~~~~~~~~~~~~~~~~ 301 (497) T protein:vir:78 223 FARVYEQVGKVANALTITDEGLRDA-PELFNFVQGRLLEGIQRKEEVQLLAGGGYPGVNGLLQRSTGFTASSASSLFGAT 301 (497) T ss_pred ceeeEeeeeeeEeecHhHHHHHHhH-HHHHHHHHHHHHHHHHHHHHHHhhcCCCcccccccccccccccccccccchhhh Confidence 7766666666665555555442222 23455688899999998888777641 11121110000000 000000 Q ss_pred e------cccccccccccccHHHH---------------------------------HHHHHHhCcc-ccCeEEEEEchH Q lcl|NC_020854. 150 C------IDSESGDTPTALSPRHV---------------------------------AEARAILGDQ-GDKLTAVAMHSK 189 (342) Q Consensus 150 ~------~~~~~~~~~~~~~~~~l---------------------------------~~A~~~~GD~-~~~~~~ivmhS~ 189 (342) . ............+...+ ..+....-.. ...-.+|+||+. T Consensus 302 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~vmn~~ 381 (497) T protein:vir:78 302 SATVSNVKFPADGTNGAFVGQDTVASLKYGRVVTGAAGSGSGVAGSYPTAAEIAENVFDAFVDIQLTLFQTPNAVVMNPR 381 (497) T ss_pred hhhhhhhhhhcccccchhhhhhHHHHHHHHHhhhhhhhhccchhccccchhhhhhHHHHHHhhhhhhcccCCCeEEEchH Confidence 0 00000000000111111 1111111110 112237999999 Q ss_pred HHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeec Q lcl|NC_020854. 190 VYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGE 269 (342) Q Consensus 190 v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~ 269 (342) ++..|++. ++++++.+.....+.....+ ...-++++|+||++++.+|.. ++..=-|..+++.+.. T Consensus 382 ~~~~l~~l------kd~~G~~i~~~~~~~~~~~~---~~~~~~l~G~pV~~t~~~~~~------~~~~Gd~~~~~~~i~~ 446 (497) T protein:vir:78 382 DWELLRLT------KDANGQYMGGNFFGNAYGNP---VNGGKNIWGVPVVTTPLIPLG------TILVGHFAPSVIQTAR 446 (497) T ss_pred HHHHHHHh------hcCCCceeccCccccccccc---ccCCceeeceeeEecCCCCCC------ceEEeecccceEEEEE Confidence 99999875 34554433322222111111 123458899999999999742 2111113456676766 Q ss_pred CCcceeEeccCC----CcceeEEEEeeEEEeee---cceee---ec-CcCC Q lcl|NC_020854. 
270 QMAMQTETDRDI----LAKSDAMSIDLHYVYHP---VGAKW---AV-TTTN 309 (342) Q Consensus 270 k~~~~ve~dr~~----~~g~~~l~~r~~y~~~~---~G~s~---~~-~~~s 309 (342) ...+.++..+.. ......+....++.+.| ..|.. +. +..| T Consensus 447 r~~~~v~~~~~~~~~f~~n~v~~r~~~r~~~~v~~p~A~~~l~~~~~~~~~ 497 (497) T protein:vir:78 447 REGVTMQMTNSNGTDFVDGKVTVRAEERLGLLVYRPSAFQLIQLKKGATGS 497 (497) T ss_pred ecccEEEeecccchhhhcCcEEEEEEEeecceeeccccEEEEEecCCccCC Confidence 666666665432 24556777777776644 44432 21 1222 No 142 >protein:vir:105374 Length: 423 # NCBI annotation: gene 5 protein # Family: family:all:1412 # MgeID: mge:1556 # MgeName: Sf6 # Cross-refs: genbank:acc:NP_958181;genbank:gi:41057283;genbank:GeneID:2716621 Probab=98.56 E-value=4e-08 Score=61.16 Aligned_cols=298 Identities=14% Similarity=0.069 Sum_probs=155.5 Q ss_pred Cccee-ccccchhHHHHHHHhhhHHhhhhhhcCccccch--hhhccCCCCEEEccccccCCCCcccccC--CCceechhh Q lcl|NC_020854. 1 MATLR-SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMT--ELNATEGGDFINVPFWKANLSGDFEVLS--DSSSLTPGK 75 (342) Q Consensus 1 MaT~~-~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~--~l~~~~~G~ti~~P~~~~i~~gda~~~~--~~~~i~~~~ 75 (342) ||.++ .. +||+++.-+.+.+.+.+.| ..++.++- ++..+.-||+|+||.=. .....++. ++..++++. T Consensus 1 MaN~llT~--~p~iia~~aL~~l~~~lV~--~~lVnr~y~~ef~~~k~GDTV~I~~p~---~~~~~d~~~~~~~~~~~~d 73 (423) T protein:vir:10 1 MPNNLDSN--VSQIVLKKFLPGFMSDLVL--AKTVDRQLLAGEINSSTGDSVSFKRPH---QFSSLRTPTGDISGQNKNN 73 (423) T ss_pred Cccchhhh--hHHHHHHHHHHHHHhhccc--chhhcccCCCcccccccCCEEEEeeCC---ceeeeccCCccccccccCc Confidence 99553 43 7999998888888887777 34555543 34322249999987543 23444444 334678889 Q ss_pred cccceeeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 76 ITADKQVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 76 lt~~~~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) ++.++...++- ....++.++|+...+.-.+. .++.++.....++++|.+|++.+.+. +.+.. . 
T Consensus 74 l~e~~v~l~id~~k~va~~v~d~E~~~~i~~~-~~~l~~A~~aLA~~vd~~ia~~~~~~-~~~~~--------------g 137 (423) T protein:vir:10 74 LISGKATGRVGNYITVAVEYQQLEEAIKLNQL-EEILAPVRQRIVTDLETELAHFMMNN-GALSL--------------G 137 (423) T ss_pred cccceeEEEeeceeeeeeeechHHHhcChhhH-HHHHHHHHHHHHHHHHHHHHHHHhhc-ccccc--------------c Confidence 98888777764 45667888888866666664 67777778889999999998765432 11000 0 Q ss_pred cccccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccce-e Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSV-P 231 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i-~ 231 (342) +..+ ..-.++.+.++..+|.+..- .-..+|+.|..+..|.+..-.-+. .+... ....-+..+ + T Consensus 138 t~~t-~~~a~~~i~~a~~~Ld~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~--~~~~~-----------~~alr~g~i~G 203 (423) T protein:vir:10 138 SPNT-PITKWSDVAQTASFLKDLGVNEGENYAVMDPWSAQRLADAQTGLHA--SDQLV-----------RTAWENAQIPT 203 (423) T ss_pred cCCc-ccchHHHHHHHHHHHHhccCCcCCCEEEeChHHHHHHhccccceec--ccccc-----------hhhhhhcccee Confidence 0001 11136788999998887522 346789999999999864321111 11100 001112334 6 Q ss_pred eeccceEEEeCCcceeccCCCcceEEEEEecceeE-e----ecCCccee------EeccCCCcceeEEEEeeEEEeeecc Q lcl|NC_020854. 232 TYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVA-S----GEQMAMQT------ETDRDILAKSDAMSIDLHYVYHPVG 300 (342) Q Consensus 232 ~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~-~----~~k~~~~v------e~dr~~~~g~~~l~~r~~y~~~~~G 300 (342) .+.|..|.+|..+|....+.... +.....+... . +... 
..+ .++-......|.+.--..|.+|+.- T Consensus 204 ~i~GFdv~~Snnip~~T~gt~~~--t~~~~~~~~v~~~a~~~a~~-~~~~~~~~~~~~~~~l~~GD~~t~aGv~~v~~~t 280 (423) T protein:vir:10 204 NFGGIRALMSNGLASRTQGAFGG--TLTVKTQPTVTYNAVKDSYQ-FTVTLTGATASVTGFLKAGDQVKFTNTYWLQQQT 280 (423) T ss_pred eecceEEEEeCCCcccccccccc--ceeeeecceeccccccccce-eeeeeeeccccccCceeecceEEecceeeecccc Confidence 88999999999999654332221 1111112111 1 1111 111 1111223333455555555555533 Q ss_pred e------------eeecC-------------cCCcCh-------------HHhcCCcCceeecCccccceEEEEecCCCC Q lcl|NC_020854. 301 A------------KWAVT-------------TTNPTR-------------AQLETVANWSKVYELKNIGIVRATNVSNFD 342 (342) Q Consensus 301 ~------------s~~~~-------------~~sPt~-------------~~L~~~~NW~~v~d~k~i~~~~~~~~~~~~ 342 (342) . +|+.. .++|.. +..++++.|+.+.- +.+-+.-|.=+. T Consensus 281 k~~~~~~~t~~~~~~~v~a~~~~~~~g~~tv~i~p~~i~~~~~~~~~~v~a~~a~~~~vT~~~~----a~~t~~~nl~~~ 356 (423) T protein:vir:10 281 KQALYNGATPISFTATVTADANSDSGGDVTVTLSGVPIYDTTNPQYNSVSRQVEAGDAVSVVGT----ASQTMKPNLFYN 356 (423) T ss_pred cccccccccCcceEEEEEeeeeeccCCceeeeccCccccccCCcccccccccccCCceeecccc----ccCCeeEEEEec Confidence 2 33211 011211 22233333332210 000000111111 No 143 >protein:vir:93616 Length: 645 # NCBI annotation: putative major head protein/prohead protease # Family: family:all:21 # MgeID: mge:157 # MgeName: phi 4795 # Cross-refs: genbank:acc:YP_001449293;genbank:gi:157166041;goa:Q6H9U8;interpro:IPR006433;uniprot:Q6H9U8;genbank:GeneID:5580438 Probab=98.54 E-value=7.6e-08 Score=59.62 Aligned_cols=277 Identities=13% Similarity=0.037 Sum_probs=141.2 Q ss_pred Ccce---eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 1 MATL---RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT~---~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |+|. -..+++|+.+..-+.+.+.+.+.+.+.+... -+.+.. .|| .+++|.... 
.+.+.-+.|++.++..+.+ T Consensus 338 ~~~~~~~~Gg~~vp~~~~~~ii~~l~~~svv~~l~~~~-~~~~~~-~~~-~~~ip~~t~--~~~a~wv~Eg~~~~~s~~~ 412 (645) T protein:vir:93 338 TTTDPQWAGSLSEYQEYAQDFIDYLRPQTIIGRFGQGG-IPALRQ-VPF-NIRVHAQVS--GGAAGWVGEGKTKPLTKFD 412 (645) T ss_pred ccccccccCCccCchhhHHHHHHhhhhhhhHHhhcccc-cccccc-ccC-ceeeeeeec--CcceEEeccCccccccccc Confidence 3221 2567899998877777777777666644321 111111 233 356887642 3566668999999988887 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) -++-....++.+--..++++-..-+.-+..+.+.+++++.++++.++.+|.-..+ ..++.....+. ...... T Consensus 413 f~~v~l~~~kla~~~~iS~ell~ds~~~~~~~i~~~l~~aia~~~d~a~l~g~g~----~~~~~~p~gi~---~~~~~~- 484 (645) T protein:vir:93 413 FESITFSHAKVSAIAVLTEELIRFSSPAADALVRNALAEAVVARLDTDFVDPKKA----AVADVSPASIT---HDVKGT- 484 (645) T ss_pred eeEEEEeeEEEEEeehhHHHHHhhchHHHHHHHHHHHHHHHHHHHHHHhhcCCCc----ccCCcccccee---cccccc- Confidence 7666666666666556655543334446566789999999999999877642111 11111111111 111111 Q ss_pred ccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG 235 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G 235 (342) .........+..+...+-+... .-.+|+|||..+..|+++ ++++++.++. .. 
...-++++| T Consensus 485 ~~~~~~~~d~~~~~~~~~~a~~~~~~a~~vmn~~~~~~L~~l------kd~~G~~~~~--------~~---~~~~~tL~G 547 (645) T protein:vir:93 485 ASSGNPDADAEAAFGQFVAANLQPTGAVWLMSSTNALALSMR------KNALGQKEYP--------DM---TLLGGSFQG 547 (645) T ss_pred ccccchHHHHHHHHHHHHhcCCCccccEEEEcHHHHHHHHhc------cccCCceeec--------CC---CCCCceeec Confidence 1111223445566555543322 235899999999999875 3343332210 00 111258999 Q ss_pred ceEEEeCCcceeccCCCcceE-EEEEecceeEeecCCcceeEe--ccCCC--------------cceeEEEEeeEEEee- Q lcl|NC_020854. 236 LRVIVSDDVNTAGSGGSTEYA-TYFFTQGAVASGEQMAMQTET--DRDIL--------------AKSDAMSIDLHYVYH- 297 (342) Q Consensus 236 ~~VvvdD~~p~~~~~~~~~y~-t~l~~~GAi~~~~k~~~~ve~--dr~~~--------------~g~~~l~~r~~y~~~- 297 (342) +||++++.+|-.-. ..... .|+--.|.+.+...+...++. ++... .....+.+..++.+. T Consensus 548 ~PV~~s~~vp~~~~--~gd~s~~~ig~~~~v~i~~s~~a~~~~~~~~~~~~~~~~~~~~v~lf~~d~vaira~~r~d~~~ 625 (645) T protein:vir:93 548 LPVIVSQYVGDQLV--LVNAPDIYLADDGGVAVDMSREASLEMQSEPTGDSTTPSPVELVSMFQTGSVAIRAERWINWRR 625 (645) T ss_pred eeeEEeccCCccee--EeccccEEEEEecceEEEeecceeEEEeecccccccccccccchhHhhcCceEEEEEEEEccee Confidence 99999999873110 01111 122223444444433322221 11110 111233333333322 Q ss_pred --------ecceeeecCcCC Q lcl|NC_020854. 298 --------PVGAKWAVTTTN 309 (342) Q Consensus 298 --------~~G~s~~~~~~s 309 (342) +-|+.|-.+.-. T Consensus 626 ~~p~a~~~lt~~~~g~~~~~ 645 (645) T protein:vir:93 626 RRTAAVAVITGVNYGSASGG 645 (645) T ss_pred eCccceEEEecccCCcccCC Confidence 345566222111 No 144 >protein:vir:78640 Length: 352 # NCBI annotation: phage capsid # Family: family:all:658 # MgeID: mge:1855 # MgeName: tp310-2 # Cross-refs: genbank:acc:YP_001429943;genbank:gi:156603997;genbank:GeneID:5525386 Probab=98.50 E-value=2e-08 Score=62.80 Aligned_cols=258 Identities=13% Similarity=0.134 Sum_probs=144.0 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |. +.-...++|+-+..-+.+.+.+.+.+.+ ++.. . ..+|. ++|.-.. ..+++.-+.|+..++..+.+ T Consensus 83 l~~~~~~~gG~lIP~~~~~~Ii~~l~~~s~l~~--~~~v----~-~~~~~--~~p~~~~-~~~~a~~v~E~~~~~~~~~~ 152 (352) T protein:vir:78 83 LPTGNDSGGDKLLPKTLSKEIVSEPFAKNQLRE--KARL----T-NIKGL--EIPRVSY-TLDDDDFITDVETAKELKLK 152 (352) T ss_pred hccCCCCCCceeccHhHHHHHHHHHHhhcchhh--heee----E-ecCCc--eEEEEec-CCCccccccccccccccccc Confidence 43 2334568898777666666666555533 2221 1 12333 3464432 23567778899988888776 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) -++-....++.+.-+.+++..-.-+..|..+.+.+++++.+.+.....++..-.| .+.-...+.. ... .. T Consensus 153 f~~v~~~~~k~~~~i~is~ell~Ds~~~l~~~i~~~la~~~~~~e~~~~~~~g~g------~~~~~g~l~~---~~~-~~ 222 (352) T protein:vir:78 153 GDTVKFTTNKFKVFAAISDTVIHGSDVDLVNWVENALQSGLAAKERKDALAVSPK------SGLEHMSFYN---GSV-KE 222 (352) T ss_pred ceeeeecceeEEeechhhHHHHhhhhHHHHHHHHHHHHHHHHHHHHHhhhhcCCC------Ccccccceec---ccc-cc Confidence 6655555555555566666544334567778899999998887655544432111 1110111111 111 11 Q ss_pred ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccce Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLR 237 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~ 237 (342) .+....++.|.++...+-.....-.+|+||+..+..|++.. + ....+ .+. 
..-.+++|++ T Consensus 223 ~t~~~~~d~i~~~~~~l~~~~~~~a~~~mn~~t~~~l~~~~-----~-~~~~~--------~~~------~~~~~llG~P 282 (352) T protein:vir:78 223 VEGANMYDAIINALADLHEDYRDNATIYMRYADYVKIISVL-----S-NGTTN--------FFD------TPAEKVFGKP 282 (352) T ss_pred ccccchHHHHHHHHhccChhhhcCCEEEEehHHHHHHHHHH-----h-ccCCc--------ccc------cCCccccccc Confidence 11222467888888777666656688999999998887631 1 11111 111 1124689999 Q ss_pred EEEeCCcceeccCCCcceEEEEEecceeEee--cCCcceeEeccCCCcceeEEEEeeEEEeeec-ceeee------cCcC Q lcl|NC_020854. 238 VIVSDDVNTAGSGGSTEYATYFFTQGAVASG--EQMAMQTETDRDILAKSDAMSIDLHYVYHPV-GAKWA------VTTT 308 (342) Q Consensus 238 VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~--~k~~~~ve~dr~~~~g~~~l~~r~~y~~~~~-G~s~~------~~~~ 308 (342) |+++|+++.. +| |-|.+. .-....+++.++...+.+.+..+.++-..|. .-+|. .++. T Consensus 283 V~~~~~~~~~-----------~~--Gdf~~~~~~~~~~~~~~~~~~~~g~~~f~~~~r~Dg~~~~~eA~~~l~~~a~~~~ 349 (352) T protein:vir:78 283 VVFTDAAVKP-----------IV--GDFNYFGINYDGTTYDTDKDVKKGEYLFVLTAWYDQQRTLDSAFRIAKAKESTGS 349 (352) T ss_pred eEEecCCCce-----------eE--eehhhhhhhhhhheeeeeccccCCeeEEEEEeeeCceeechhheEEEEeecccCC Confidence 9999977532 22 223221 1122455666677778888888888877764 23332 2233 Q ss_pred CcC Q lcl|NC_020854. 309 NPT 311 (342) Q Consensus 309 sPt 311 (342) -|+ T Consensus 350 ~~~ 352 (352) T protein:vir:78 350 LPS 352 (352) T ss_pred CCC Confidence 465 No 145 >protein:vir:79928 Length: 393 # NCBI annotation: major head protein # Family: family:all:30335 # MgeID: mge:1874 # MgeName: 0305phi8-36 # Cross-refs: genbank:acc:YP_001429616;genbank:gi:156564106;genbank:GeneID:5525693 Probab=98.47 E-value=7.7e-08 Score=59.58 Aligned_cols=297 Identities=12% Similarity=0.109 Sum_probs=175.5 Q ss_pred CcceeccccchhHHHHHHHh---hhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIE---QTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~---~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |||-.+.+.+|-|+..-+.+ ++.=-.+++| -.+|.. |..-.+|.+.. --+-++.||..++-..|. T Consensus 74 mtt~~a~IliP~vis~v~~Eaaepl~~~~kl~q------k~~L~~---Grsm~F~~~g~---~Ra~~IgEGgE~~~~sld 141 (393) T protein:vir:79 74 MATPSAQILIPRVIVGTMREAAEPLYIGTKMLQ------KIRLKS---GQSMIFPSIGI---MRAYDVAEGQEIPEDSID 141 (393) T ss_pred hcCCCcceechhhhhhhhhhcccchhHHHHHHH------HHhhhc---Ccceeccchhe---eeeccccccccccccchh Confidence 99999999999999877766 2222223322 113322 55666776652 355678899988888887 Q ss_pred -cceeeeEee--eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 78 -ADKQVAAIL--HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 78 -~~~~~a~i~--~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) +..+...+. +.|-.+.++|+...-++=|-+.-..++.....+|+.+..++..++.-=-...-+.++......+--+. T Consensus 142 ~~T~dsv~~~~gK~G~~Ia~SqEmIsDSg~Dvin~~l~aA~RaMaRkKee~a~n~fk~~ghtvfDa~st~t~ahptGr~~ 221 (393) T protein:vir:79 142 WQTHESPEIRVGKSGIRLRFTDEMISDSQWDLMSMMIKQAGRAMGRHKEQKAYHQFRSHGHTVFDNYSTNKLAHTTGLDK 221 (393) T ss_pred hhcCCceeEEechhhhhhhhHHHHhhcchHHHHHHHHHHHHHHHHhhhHHHHHhhhhcccceeeeccccCccceeecCCc Confidence 333333332 34455788898887788899999999999999999999999887641100000111111111111111 Q ss_pred cccccccccHHHHHHH-HHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeee Q lcl|NC_020854. 155 SGDTPTALSPRHVAEA-RAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTY 233 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A-~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~ 233 (342) .+-...+|+.+.|.|- .+.+-+.. ...+++|||-......|+++.+-++..- -+++.-..-. 
-.....-.-+ T Consensus 222 ~~~qNGTlSleDllDm~~av~~~hy-t~svi~MHPLAWnv~AKna~me~~~~na-~gN~~~~~~~-----ts~algp~~i 294 (393) T protein:vir:79 222 NGVQNDTFSAEDFLDLIIAVMANEY-TPSDLMMHPLAWTVFAKNELMGSLQANP-YGNYPAKGAP-----SSMALGPDSI 294 (393) T ss_pred cccccccccHHHHHHHHHHHhcccC-CcceEEEcCchhhhhhhhhhhcceeecc-ccccCccccc-----hhhhhchhhh Confidence 1234568999999996 45666554 4679999999999999998877665321 1111000000 0000111123 Q ss_pred cc-----ceEEEeCCcceeccCCCcceEEEEEecceeEeecCCc-ceeEeccCCCcceeEEEEeeEEEeeecc------- Q lcl|NC_020854. 234 MG-----LRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMA-MQTETDRDILAKSDAMSIDLHYVYHPVG------- 300 (342) Q Consensus 234 ~G-----~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~-~~ve~dr~~~~g~~~l~~r~~y~~~~~G------- 300 (342) +| ..|+++--+|.+. +..++--|.+-..-.++..=++ +.+++-.++..+-+-+--+.+|++++.. T Consensus 295 ~~~~~~nlnv~~sPfvp~d~--k~~rFd~~~Vd~NnvgvlLV~D~i~tdq~ddk~rdiq~iKl~ERYG~gvLn~gkaiav 372 (393) T protein:vir:79 295 QGRLPFNFNVNLSPFIPLDK--KSRRFDVYAVDRNNVGVLLVRDDLKTDQWDEKARGLQNIKMIERYGIGILNEGKAIAV 372 (393) T ss_pred ccccccceeEEEeccccccc--ccceeeEEEeecCCceEEEEecCcceeccccccccceeeeeeeeeceeeeeCCceEEE Confidence 33 3788888777754 4566777777777777655443 4444444666677778888888886632 Q ss_pred ---eeeecCcCCcChHHhcCCcC Q lcl|NC_020854. 301 ---AKWAVTTTNPTRAQLETVAN 320 (342) Q Consensus 301 ---~s~~~~~~sPt~~~L~~~~N 320 (342) ++++..- |..--+.+-.| T Consensus 373 akNI~~~k~y--~~P~~~~~~~~ 393 (393) T protein:vir:79 373 AKNISMDKSY--AEPMLIKNVGN 393 (393) T ss_pred Eecceeeccc--ccchhhhccCC Confidence 2222221 22222333333 No 146 >protein:vir:3525 Length: 423 # NCBI annotation: major head protein # Family: family:all:1412 # MgeID: mge:72 # MgeName: APSE-1 # Cross-refs: genbank:acc:NP_050985;genbank:gi:9633571;genbank:GeneID:1262318 Probab=98.45 E-value=1.1e-07 Score=58.76 Aligned_cols=303 Identities=12% Similarity=0.026 Sum_probs=154.9 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccch--hhhccCCCCEEEccccccCCCCcccccCC--Cceechhhc Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMT--ELNATEGGDFINVPFWKANLSGDFEVLSD--SSSLTPGKI 76 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~--~l~~~~~G~ti~~P~~~~i~~gda~~~~~--~~~i~~~~l 76 (342) ||.++.- ++||+++.-..+.+.+.+.|. .++-++- ++..+.-||+|+||.=. .....++.. +..+.++.+ T Consensus 1 MAN~llT-~iP~iia~~al~~l~~~lV~~--~lV~r~y~ge~~~a~~GDTV~I~~p~---~~~v~d~~~~~~~~~~~~~~ 74 (423) T protein:vir:35 1 MANNLES-NISQIVLKKFLPGFMSDIVLC--KTVDRQLLSGEINSNTGDSVSFKRPH---QFKSERTETGDITGKDKNGL 74 (423) T ss_pred Cccchhh-hhHHHHHHHHHHHHHhhcccc--hhcccCCCcccccccCCCEEEEeeCC---cceeecccCcCCCCcccccc Confidence 9965432 379999988888888877774 4565553 33222249999998543 245556643 467888999 Q ss_pred ccceeeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 77 TADKQVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 77 t~~~~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) +..+-..++= ....++.++|+...+...|. ..+..+.+.+.+++++.+++..+..-. ++. +. + T Consensus 75 ~e~~v~l~id~~k~~a~~v~d~e~~l~i~~~-~~~l~~a~~ala~~vd~~l~~~l~~~a----~~~----vg-----t-- 138 (423) T protein:vir:35 75 FSAKATGKVGKYITVAVEWTQIEEALKLNQL-DQILSPIHERMVTDLETELAHFMMNNG----ALS----LG-----S-- 138 (423) T ss_pred ccceeeEEeccceeccceeCHHHHHhhHHHH-HHHHHHHHHHHHHHHHHHHHHHHhhcc----ccc----cc-----c-- Confidence 8777555553 45667899999887777776 455666678899999999987653211 100 00 0 Q ss_pred ccccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccce-ee Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSV-PT 232 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i-~~ 232 (342) .....-.++.+.+|..+|.+..- .-+.+|+.|..+..|.+... ++...+... ....-+..+ +. 
T Consensus 139 -~~t~~~~~~~i~~a~~~Ld~~~vP~~~R~~Vv~p~~~a~Ll~~~~--~~~~~~~~~-----------~~alr~g~i~G~ 204 (423) T protein:vir:35 139 -PNTAIKKWADVAQTASFIKDIGIKTGENYAIMDPWSAQRLADAQS--GLHAADQLV-----------RTAWENAQISGN 204 (423) T ss_pred -ccCCcchHHHHHHHHHHHHHhcCCcCCCEEEeCHHHHHHHhcccc--ceeccccch-----------hHHHhhccceee Confidence 00111236889999999976422 34889999999999986421 111111100 001112334 78 Q ss_pred eccceEEEeCCcceeccCCCcceEEEEEecceeEe--ecCCcce---------eEeccCCCcceeEEEEeeEEEeeecc- Q lcl|NC_020854. 233 YMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVAS--GEQMAMQ---------TETDRDILAKSDAMSIDLHYVYHPVG- 300 (342) Q Consensus 233 ~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~--~~k~~~~---------ve~dr~~~~g~~~l~~r~~y~~~~~G- 300 (342) +.|..|++|..+|....+.... +.... ++... ....+.. ..++-+.....|.+.....+.+|+.- T Consensus 205 i~GFdv~~Snnvp~~T~gt~~~--~~~v~-~a~~v~~~a~~~~~~~~~~~~~~~~~~~g~l~~GD~~t~aGv~~v~~~t~ 281 (423) T protein:vir:35 205 FGGIRALMSNGLASRKQGDFDG--AITVK-TAPNVDYLSVKDSYQFTVALTGATPSKTGFLKAGDQLKFTSTHWLNQQSK 281 (423) T ss_pred ecceEEEEcCCCcccccccccc--ceeec-cccccccccccccccceeeeeeeeeccCCcEEecceEEeeeeeecccccc Confidence 8999999999999653322111 11111 11111 1000000 01111222333444444444444311 Q ss_pred -----------eeeecC-------------cCCcChHHhcCCcCceee-cCccccceEEEEecC--------CCC Q lcl|NC_020854. 301 -----------AKWAVT-------------TTNPTRAQLETVANWSKV-YELKNIGIVRATNVS--------NFD 342 (342) Q Consensus 301 -----------~s~~~~-------------~~sPt~~~L~~~~NW~~v-~d~k~i~~~~~~~~~--------~~~ 342 (342) ..|... .++|..--.+..+...-| ..+++-.-+-++-.. =+. 
T Consensus 282 ~~~~~~~t~~~~~~~V~~~~~~~a~g~~~v~i~p~~~~~~~~~~~~~v~a~~a~~~~vt~~~~a~~~~~~nl~~~ 356 (423) T protein:vir:35 282 QTLYNGSTAMSFTATVLEETNSTASGDVTVKLSGVPIYDEKNSQYNAVDAKVKAGDAVSIIGTAKQQMKPNLFYN 356 (423) T ss_pred ceeecccCCceeEEEEeccccccccCceeEEccccccccCCCcccccccccccCCceeeeeecCCCceeEEEeec Confidence 122110 012221111111122211 122222222111110 111 No 147 >protein:vir:105522 Length: 423 # NCBI annotation: phage major head protein # Family: family:all:1412 # MgeID: mge:1463 # MgeName: phiSG1 # Cross-refs: genbank:acc:YP_516191;genbank:gi:89885994;genbank:GeneID:3964382 Probab=98.21 E-value=1.5e-06 Score=52.47 Aligned_cols=296 Identities=11% Similarity=0.033 Sum_probs=148.0 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccch--hhhccCCCCEEEccccccCCCCcccccCCCce---echhh Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMT--ELNATEGGDFINVPFWKANLSGDFEVLSDSSS---LTPGK 75 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~--~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~---i~~~~ 75 (342) ||.++.. ++||++++-..+.+.+.+.| ..++-++- ++..+.=||||+||.=. + +...+. .+.+ .+++. T Consensus 1 MANsl~~-l~p~iia~~al~~l~~~lV~--~~lV~r~y~~ef~~ak~GDTV~I~~P~-~--~~~~d~-~~~~~t~~~~~~ 73 (423) T protein:vir:10 1 MANNLDA-NVSQIVLKKFLPGFMSDLVL--CKTVDRQLLAGEINSSTGDSVSFKRPH-Q--FKSERT-MDGDITGKSKNS 73 (423) T ss_pred Ccccccc-ccHHHHHHHHHHHHHhhccc--chhhccCCCccccccccCCEEEEeeCC-c--eeeecc-cCcccCcccccc Confidence 8844432 68999999988888888777 44565553 33322249999988643 2 333332 1222 24566 Q ss_pred cccceeeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 76 ITADKQVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 76 lt~~~~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) +...+-..++- ....++.++|+...+.-.+. .++.++.....++++|.+|...+... .++. . 
.+ T Consensus 74 l~e~~v~l~id~~k~~a~~v~d~E~~l~i~~~-~~~l~~A~~aLA~~vd~~ia~~~~~~----~~~~----v-----gt- 138 (423) T protein:vir:10 74 LISAKATGEVGNYITVAVEYRQIEEALKLNQL-DQILVPINERMVTDLETELALFMMKH----GALS----L-----GS- 138 (423) T ss_pred cccceEEEEecceeeeeeeeChHHHhcChhHH-HHHHHHHHHHHHHHHHHHHHHHhhhc----cccc----c-----cc- Confidence 66655555443 45567888888876666676 67888888889999999987544211 1110 0 00 Q ss_pred cccccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhh-hhhhhhhhhhcccceeeeccceeecccccccce- Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVER-RAIDYVSTADARGTSTTQSGGSMAAAYGGEVSV- 230 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~-~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i- 230 (342) . ....-.++.+.+|..+|.+..- .-..+||.|..+..|.+. ..+.. .+..+ ... .-+..| T Consensus 139 -~-~t~~~a~~~~a~a~~~L~~~~vP~~~R~~Vv~p~~~a~Ll~~~~~~~~---~~~~~----------~~a-lr~~~i~ 202 (423) T protein:vir:10 139 -P-NTPIKKWSDVAQTASFLKDLGINSGENYAVMDPWAAQRLADAQSGLHV---SEQLV----------RTA-WENAQIS 202 (423) T ss_pred -c-ccccccHHHHHHHHHHHhhccCCcCCCEEEeCHHHHHHHhhhhhhhcc---ccccc----------hHH-HHhcccc Confidence 0 0111136788999888887422 346789999999999753 22211 11100 011 112334 Q ss_pred eeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcc-e-------eE----eccCCCcceeEEEEeeEEEeee Q lcl|NC_020854. 231 PTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAM-Q-------TE----TDRDILAKSDAMSIDLHYVYHP 298 (342) Q Consensus 231 ~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~-~-------ve----~dr~~~~g~~~l~~r~~y~~~~ 298 (342) +.+.|..|.+|..+|....+..+- ++. ..|+..+.. .++ + .. 
.--.....-|++.--..+.+|+ T Consensus 203 G~~~GFdi~~Sn~vp~~T~g~~~g--a~~-~~~~~~vt~-a~~~~~~~~~~~~~~~T~s~~g~l~~GD~~t~aGv~~v~~ 278 (423) T protein:vir:10 203 GNFGGIRALMSNGLASRTQGAFGG--KLT-VKGTPEVNY-DSVKDSYAFTATLTGATASKKGFLKVGDQLQFDDTHWLNQ 278 (423) T ss_pred eeecceEEEEecCCcccccccccc--eee-eeeeeEEEe-cccccccccccceeeccceeceeEEecceEeecceeeecc Confidence 788999999999998643322110 111 122222211 110 0 00 0011122234555555555554 Q ss_pred cc------------eeeecCcCCcChHHhcCCcCceeecCccccceEEEE--------------------------ecCC Q lcl|NC_020854. 299 VG------------AKWAVTTTNPTRAQLETVANWSKVYELKNIGIVRAT--------------------------NVSN 340 (342) Q Consensus 299 ~G------------~s~~~~~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~--------------------------~~~~ 340 (342) .= .+|++.. +-..++.+.. .....|+.|+-+... -|.= T Consensus 279 ~tk~~l~~~~~~~~~~~~V~~---~~~~~a~~~~-tv~i~p~~~~~~~~~~~~~V~a~~a~~~~vT~~~~~~~t~~~nl~ 354 (423) T protein:vir:10 279 QSKQTLYNGASALSFTATVME---DANAHSSGDV-TVKISGVPIFDAGYPQYNAVDRLLAEGDTVSVIGTSKQAMKPNLF 354 (423) T ss_pred cccceeecccCCcceEEEEEe---cccccccCce-EEEeccccccccCcccccceeccccCCceeEEeeccCCceeEEEE Confidence 22 1332200 0000011111 122222222222111 1111 Q ss_pred CC Q lcl|NC_020854. 341 FD 342 (342) Q Consensus 341 ~~ 342 (342) +. T Consensus 355 ~~ 356 (423) T protein:vir:10 355 YN 356 (423) T ss_pred ec Confidence 11 No 148 >protein:vir:93696 Length: 364 # NCBI annotation: Bcep22gp55 # Family: family:all:974 # MgeID: mge:1470 # MgeName: Bcep22 # Cross-refs: genbank:acc:NP_944284;genbank:gi:38640361;genbank:GeneID:2658350 Probab=98.20 E-value=7.6e-07 Score=54.12 Aligned_cols=303 Identities=14% Similarity=0.095 Sum_probs=158.8 Q ss_pred CcceeccccchhHH---HHHHHhhhHHhh----hhhhc---CccccchhhhccCCCCEEEccccccCCCCcccccCCCce Q lcl|NC_020854. 1 MATLRSDIIIPEVF---TPYVIEQTTQRD----AFLAS---GVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSS 70 (342) Q Consensus 1 MaT~~~d~i~Pev~---~~yv~~~~~~~~----~f~~s---g~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~ 70 (342) ||++.-..=+|+.. +.-+.....+++ +|... .+|+.-.+|.- .+|++++++.-.. 
|.|++---++.-+ T Consensus 1 Ma~T~~~~~~p~a~~~ws~~l~~~~~~~s~f~~~l~G~~~~~~I~~~~dL~k-~~Gd~v~f~L~~~-L~g~gv~Gd~~le 78 (364) T protein:vir:93 1 MSQTVIPFGDPKAVKRWSADLAVDVRKKSYFEQRFIGTSENAVIQRKTELES-DAGDRITFDLSVH-LRGKPTYGDARVE 78 (364) T ss_pred CceeccCcCCHHHHHHHHHHHHHHHHhhCccccccccCCCCCcEEEeeecCC-CCCceEEeeeeee-cccCCcccCceee Confidence 99777666778743 333333333343 33322 34444455554 5799999998775 5676533333333 Q ss_pred echhhcccceeeeEeeeeccceeech-HHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcc---------- Q lcl|NC_020854. 71 LTPGKITADKQVAAILHRGRAFEARD-LAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNA---------- 139 (342) Q Consensus 71 i~~~~lt~~~~~a~i~~~~k~~~~tD-~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a---------- 139 (342) ---+.|+..++..+|=...+++.... .++.-+--|...+....++.||++..|..++-.|.|..+.+.. T Consensus 79 Gnee~L~~~~~~i~idq~r~~V~~~g~ms~qRt~~dlr~~ar~~L~~w~~~~~d~~~f~~laGarg~~~~~~~~~~~~~~ 158 (364) T protein:vir:93 79 GKEESLRFYQDEVRIDQVRHSVSAGGRMSRKRTVHNIRRIARDRLGDYFYKFTDELLFIYLSGARGINLDFIETPDFTGY 158 (364) T ss_pred ccccceeEEeeEEEEeeccccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhcccccccccccccCcccc Confidence 34788999999999988888886533 3455566788889999999999999999999988875432211 Q ss_pred -------cccchhheee-ecccccccccccccHHHHHHHHHH---hCcc-------------ccCeEEEEEchHHHHHHH Q lcl|NC_020854. 140 -------NTSSSAFFDL-CIDSESGDTPTALSPRHVAEARAI---LGDQ-------------GDKLTAVAMHSKVYYDLV 195 (342) Q Consensus 140 -------~~~~~~~~~~-~~~~~~~~~~~~~~~~~l~~A~~~---~GD~-------------~~~~~~ivmhS~v~~~L~ 195 (342) ..++-++.-- ..+...=+++..++.+.+.+|... +|-. 
.++..+++|||-.+.+|+ T Consensus 159 ~~N~v~aPt~~r~~~~~~at~~~~l~stD~~sl~~id~a~~~a~~~~~~~~~~~~~~Pv~~~g~~~yV~~l~p~q~~~Lr 238 (364) T protein:vir:93 159 AGNPLDAPDVDHLLYGGVATSKASLAATDIMAPLVIEKAVEKAAMMQAENPDVANMVPVSIDGDDHYVCVMSEYQATDMR 238 (364) T ss_pred cccccCCCCCCcEEeccccCchhhccccccccHHHHHHHHHHHHHhCCCCCCCcccceeEecCcceeEEEEcchhhhhhh Confidence 1111111100 011111123467889989888653 3311 125679999999999999 Q ss_pred hhh---hhhhhhhhhcccceeeeccceeecccccccceeeeccceEEEeCCcc----eeccCCCcceEEEEEecceeEee Q lcl|NC_020854. 196 ERR---AIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVN----TAGSGGSTEYATYFFTQGAVASG 268 (342) Q Consensus 196 ~~~---li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p----~~~~~~~~~y~t~l~~~GAi~~~ 268 (342) ... .++|.+.+.+. ++..+- .=.+.++.|+|..|.---.++ ....+....-.++++|.=|+.+. T Consensus 239 ~~t~~~w~d~qk~A~~~---~g~~nP------lF~G~~gm~ngvii~~~~~vi~~~~~~~~~~v~~~ralllGaQA~~~a 309 (364) T protein:vir:93 239 TAAGGTWIDFQKAAAAA---EGRNNP------IFKGGLGMINNVVLHKHRNVIRFNDYGAGANVEAARALFMGRQAGVIA 309 (364) T ss_pred hcCCHHHHHHHHHhhhc---ccccCC------ceecCeeeEcCeEEeccCCcccccccccCccccchhhheecceeeEEE Confidence 643 45555543211 111111 112568899997665332222 11111111234688887775553 Q ss_pred cCC----cc-eeEeccCCCcceeEEEEeeEEEeeecceeeecC---cC-CcChHHhcC Q lcl|NC_020854. 269 EQM----AM-QTETDRDILAKSDAMSIDLHYVYHPVGAKWAVT---TT-NPTRAQLET 317 (342) Q Consensus 269 ~k~----~~-~ve~dr~~~~g~~~l~~r~~y~~~~~G~s~~~~---~~-sPt~~~L~~ 317 (342) .+. ++ -+|...|-+.+..+.. + .++.++-..|... .+ =||.+.+-. 
T Consensus 310 ~g~~~g~~~~w~Ee~~D~gn~~~i~~-~--~i~G~kK~rF~~~DfGvi~idtaa~~~~ 364 (364) T protein:vir:93 310 YGTANGLRFDWEETVKDYGNEPAIAA-G--FIAGMKKARFNNKDFGVISIDTAAKKHS 364 (364) T ss_pred eecCCCCCceeeecccCCCCchhhhh-h--hHhhhhhcccCCccceEEEecccccccC Confidence 221 22 2332222222111110 0 0111111112100 00 022111111 No 149 >protein:vir:107120 Length: 329 # NCBI annotation: conserved phage protein # Family: family:all:701 # MgeID: mge:1571 # MgeName: CNPH82 # Cross-refs: genbank:acc:YP_950606;genbank:gi:119953686;genbank:GeneID:4643129 Probab=98.17 E-value=3.2e-06 Score=50.72 Aligned_cols=280 Identities=11% Similarity=-0.021 Sum_probs=151.6 Q ss_pred Ccc---------eeccccchhH------HHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCccccc Q lcl|NC_020854. 1 MAT---------LRSDIIIPEV------FTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVL 65 (342) Q Consensus 1 MaT---------~~~d~i~Pev------~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~ 65 (342) -|| +-...+.|+- |.+.+.+.+...+ + -+..+. +..... .+|++|.||.+.- ....++ T Consensus 18 ~~~~~~~~~~~~~~~~~~~~nt~~l~~k~~~~LD~~~~~~~-~-s~~~~~-N~~~e~-~~g~tVkIp~i~~---~gl~DY 90 (329) T protein:vir:10 18 NATGKLKLNLQHFANKSVEPGDTLLKNKHVGILEKVTAANS-Y-SAPAVI-SNDAIF-MQGRSFTVIKGDV---TELKDY 90 (329) T ss_pred cccceeEEehhhhcCCccCCchhHHHHHHHHHHHHHHHhhc-e-eeeeec-ccceee-ccCcEEEEeeecc---cccccc Confidence 111 1122345543 3344444333221 1 122222 333333 4799999999863 345688 Q ss_pred CCCceechhhcccceeeeEee-eeccceeechHHHhhhcc--hHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc Q lcl|NC_020854. 66 SDSSSLTPGKITADKQVAAIL-HRGRAFEARDLAALAAGS--DPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTS 142 (342) Q Consensus 66 ~~~~~i~~~~lt~~~~~a~i~-~~~k~~~~tD~a~~~~~~--dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~ 142 (342) +.+...+++.++.....-++- .|+.+|.+.|.-..-+.. ......+++.....+..+|...++.|.+- +.. 
T Consensus 91 ~R~~g~~~g~vt~~~~t~tidqdR~~~F~VD~~D~dEtn~~l~a~~i~~~~~~~~v~pEiDay~~skla~~-----a~~- 164 (329) T protein:vir:10 91 KRNATNEFDHPQIQETTYFLDQEKYWGRFVDALDRRDTEGNIDINYVVAKQASEVVAPYLDNLRFATLARN-----KAK- 164 (329) T ss_pred cCCCCccccccccceeEEEeecccceeeecchhhHhhhhhhhhHHHHHHHHHHHHhhhHHHHHHHHHHHhh-----ccc- Confidence 877788889998888886665 466666666555433221 22233444555556666777777665321 100 Q ss_pred chhheeeecccccccccccccHHHHHHHHHHhCccc-cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceee Q lcl|NC_020854. 143 SSAFFDLCIDSESGDTPTALSPRHVAEARAILGDQG-DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMA 221 (342) Q Consensus 143 ~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~ 221 (342) . ... ..++.-.++.|.++..+|.+.. ..-..++|.|.++.-|.+...+ +...+.. T Consensus 165 --~------~~~--~~t~~nay~~i~~a~~~Lde~~vp~~Rvl~VtP~~~~~Lk~~~~f--~~~~~~~------------ 220 (329) T protein:vir:10 165 --H------LTV--GSGADAQYDAVLDVSVELDEIGAGASRILFVTPKFYKGIKKFVIE--LPQGDNR------------ 220 (329) T ss_pred --c------ccc--ccCHHHHHHHHHHHHHHHHhcCCCCCcEEEeCHHHHHHHHhhhhh--hcccccc------------ Confidence 0 000 1112224788899998898753 2456899999999999876433 3322211 Q ss_pred cccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEecc-CCCcceeEEEEeeEEEeeec- Q lcl|NC_020854. 222 AAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDR-DILAKSDAMSIDLHYVYHPV- 299 (342) Q Consensus 222 ~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr-~~~~g~~~l~~r~~y~~~~~- 299 (342) .....++.++.+.|.+|+...+... +.+..++.-++|+..-.|.. .+|..+ ..+.....+..|..|.+-+. T Consensus 221 ~~~~~~g~Vg~idG~~Ii~vps~~~------k~in~ii~~~~A~~~~~K~~-~~~~~~p~~~~~a~~v~gr~yyd~~V~~ 293 (329) T protein:vir:10 221 QQVLGKGVQGELDGFTIVKVPSKML------QGVEAMAVIGEVMASPIQAN-EAKLNSNVPGMFGTLAEQMLYTGAFVPE 293 (329) T ss_pred ccceeeeeeeeecCeEEEEecCCcc------cceeEEEEcCCceeeeeeee-eeeeeCCCCccchheeeeeeeeeeEEEc Confidence 1112245688899999987543321 23344555577887766654 344333 33333477788888877653 Q ss_pred ----ceeeecC----cCCcCh---HHhcCCcCceeec Q lcl|NC_020854. 
300 ----GAKWAVT----TTNPTR---AQLETVANWSKVY 325 (342) Q Consensus 300 ----G~s~~~~----~~sPt~---~~L~~~~NW~~v~ 325 (342) |+ |... ..+.+. ..+++++.|+.=. T Consensus 294 ~k~~~I-~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~ 329 (329) T protein:vir:10 294 HLQKYI-FTIGGKEVETNRDGVDAHADETNASADTGA 329 (329) T ss_pred cccCEE-EEecccCcccCCCCCCccccccccccccCC Confidence 33 3211 112221 2456666666533 No 150 >protein:vir:97031 Length: 402 # NCBI annotation: 31 # Family: family:all:2806 # MgeID: mge:1644 # MgeName: K1-5 # Cross-refs: genbank:acc:YP_654132;genbank:gi:108862016;genbank:GeneID:5075980 Probab=98.07 E-value=4.2e-07 Score=55.54 Aligned_cols=313 Identities=14% Similarity=0.055 Sum_probs=174.5 Q ss_pred Cc-----ce------eccc-cchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCC Q lcl|NC_020854. 1 MA-----TL------RSDI-IIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDS 68 (342) Q Consensus 1 Ma-----T~------~~d~-i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~ 68 (342) |+ |+ -++. +-=|+|..-|...+.+.+.|.. .... -++ .+|+++.+|+-+. ..++....| T Consensus 1 Ms~~n~~t~~~~~~s~~~~al~le~f~geV~taF~~~si~~~--~~~v-rti---~~GkS~qf~~iG~---~~a~y~~~G 71 (402) T protein:vir:97 1 MSTPNTLTNVAVSASGEVDSLLIEKFNGKVNEQYLKGENILS--YFDV-QTV---TGTNTVSNKYLGE---TELQVLAPG 71 (402) T ss_pred CCCcccccccccccccchhhhhhhhhhhhHHHHHHHHHhhcC--ccee-eee---cccceEEEEEEee---eEEeeeccc Confidence 66 21 1222 2227777777777777777743 2222 123 4699999998532 455667888 Q ss_pred ceechhhcccceeeeEeee-eccceeechHHHhhhcch-HHHHHHHHHHHHHHHHHHHHHHHHHHH-HHhhhcccccchh Q lcl|NC_020854. 69 SSLTPGKITADKQVAAILH-RGRAFEARDLAALAAGSD-PMAAIGAKVADYVANQRQKDLLSCLQG-VFGSLNANTSSSA 145 (342) Q Consensus 69 ~~i~~~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~d-p~~~i~~qia~~~~~~~~~~lla~L~g-~~~~~~a~~~~~~ 145 (342) +.+..+.+...+..-+|=. .--...+.|+-.....-| +-.+++++++...++.+|+.++..++. ..+...+-..... 
T Consensus 72 ~~ldg~~~~~~k~~ItID~lL~a~~~V~diDeaq~~yD~vRse~s~e~G~ALA~~~Dq~ii~~i~~aa~a~t~~~~~~~~ 151 (402) T protein:vir:97 72 QSPNATPTQADKNQLVIDTTVIARNTVAHIHDVQGDIDSLKPKLAMNQAKQLKRLEDQMAIQQMLLGGIANTKAERNKPR 151 (402) T ss_pred cccCCCCcccccEEEEeCceeechhhhhhHHHHHhcccchhHHHHHHHHHHHHHHHHHHHHHHHHHhhccccccccccCc Confidence 8888888887766555532 112345778877666666 678999999999999999988765432 1111111100000 Q ss_pred -hee-eecccccccccccccHHHHH----HHHHHhCccc--cCeEEEEEchHHHHHHHhhh-hh--hhhhhhhcccceee Q lcl|NC_020854. 146 -FFD-LCIDSESGDTPTALSPRHVA----EARAILGDQG--DKLTAVAMHSKVYYDLVERR-AI--DYVSTADARGTSTT 214 (342) Q Consensus 146 -~~~-~~~~~~~~~~~~~~~~~~l~----~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~-li--~~~~~s~~~~~~~~ 214 (342) ... ...........+..+...|. +|.+.|.++. ..-.+++|.|..|..|++.. |+ +|... T Consensus 152 ~~~~g~s~~~~~t~~~a~~~~~~l~~ai~~a~~~LdEkdVP~~dRv~vv~P~~y~~Ll~~~rl~n~d~~~~--------- 222 (402) T protein:vir:97 152 VKGHGFSINVNVTESEALANPQYVMAAVEYALEQQLEQEVDISDVAIMMPWKFFNALRDADRIVDKTYTIS--------- 222 (402) T ss_pred ccccccccccccccchhhcCHHHHHHHHHHHHHHHHhcCCCccccEEEeChHHHHHHhhcccccchhhccc--------- Confidence 000 00000001112233555555 4555555421 13379999999999999863 22 11110 Q ss_pred eccceeecccccccceeeeccceEEEeCCcceecc-------------------CCCcceEEEEEecceeEeecCCccee Q lcl|NC_020854. 215 QSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGS-------------------GGSTEYATYFFTQGAVASGEQMAMQT 275 (342) Q Consensus 215 ~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~-------------------~~~~~y~t~l~~~GAi~~~~k~~~~v 275 (342) .++. ...+.+...+|.+|+.+..+|...+ +...+...++|-+-|++...-.++.. 
T Consensus 223 -~~g~-----~~~G~v~~v~Gv~Vv~SnnlP~~a~~it~~~ls~a~~G~~y~~t~d~t~~~~~~f~~~Av~tvk~~~vT~ 296 (402) T protein:vir:97 223 -QSGA-----TINGFVLSSYNCPVIPSNRFPTFAQDQAHHLLSNEDNGYRYDPIAEMNGAVAVLFTSDALLVGRTIEVTG 296 (402) T ss_pred -cCCc-----cccceeEEEeceEEEecCccccccccccccccccCCCCccCCcCcccceeEEEEEecceEEEEEeecccc Confidence 1111 1246788899999999999986321 11123356788888998888888888 Q ss_pred EeccCCCcceeEEEEeeEEEeeecc------eeeecCcCCcChHHhcCCcCceee-cCccccceEEEEecCCCC Q lcl|NC_020854. 276 ETDRDILAKSDAMSIDLHYVYHPVG------AKWAVTTTNPTRAQLETVANWSKV-YELKNIGIVRATNVSNFD 342 (342) Q Consensus 276 e~dr~~~~g~~~l~~r~~y~~~~~G------~s~~~~~~sPt~~~L~~~~NW~~v-~d~k~i~~~~~~~~~~~~ 342 (342) |..|+.....+.|.+.+.|++.|+= +..+.......-.+|.+.-+=-+. ..+|+ ..|+++=- T Consensus 297 ~~~~d~r~~~~~id~~~a~G~g~~RPeaa~vv~~~~~~t~~~~~~~~~~~~~~~~~~~~~~-----~~~~~~~~ 365 (402) T protein:vir:97 297 DIFYEKKEKTYYIDTFMAEGAIPDRWEAVSVVTTKRDATTGDAGGPGDDHATVLARAQRKA-----VYVKTEGA 365 (402) T ss_pred chhhchhHHHHHHHHHHHhCCcccCccceEEEEEecccccccCCccccchhhhhcccccce-----EEEecccc Confidence 8888888888877777777776632 112111111111233333221111 11221 12233322 No 151 >protein:vir:97331 Length: 319 # NCBI annotation: ORF011 # Family: family:all:701 # MgeID: mge:1666 # MgeName: 52A # Cross-refs: genbank:acc:YP_240611;genbank:gi:66396278;genbank:GeneID:5133687 Probab=98.06 E-value=5.6e-06 Score=49.37 Aligned_cols=279 Identities=10% Similarity=-0.054 Sum_probs=145.9 Q ss_pred Cc------cee--ccc-----cchhH--------HHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCC Q lcl|NC_020854. 1 MA------TLR--SDI-----IIPEV--------FTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLS 59 (342) Q Consensus 1 Ma------T~~--~d~-----i~Pev--------~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~ 59 (342) |- |-. .++ -.||- |.+.+.+.. ....+ |+....+....+ .+|++|.||.+.. 
T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~nt~~l~~k~~~~LD~~~-~~~~~--s~~~~~N~~~e~-~gg~tVkIp~i~~--- 73 (319) T protein:vir:97 1 MNKTIKNATGMLKLNLQHFANKSVEPGQTLLKNKHVGILERVT-AVNAY--STPALISNDAIF-MEGRSFTVMKGDT--- 73 (319) T ss_pred CCcccccccceeEeehhhhhccCCCcchHHHHHHHHHHHHHHH-HHhhh--hhhcccCcceEe-ccCcEEEEeeecc--- Confidence 22 110 111 11222 333333222 22222 222222344444 4799999999863 Q ss_pred CcccccCCCceechhhcccceeeeEee-eeccceeechHHHhhhcc--hHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_020854. 60 GDFEVLSDSSSLTPGKITADKQVAAIL-HRGRAFEARDLAALAAGS--DPMAAIGAKVADYVANQRQKDLLSCLQGVFGS 136 (342) Q Consensus 60 gda~~~~~~~~i~~~~lt~~~~~a~i~-~~~k~~~~tD~a~~~~~~--dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~ 136 (342) ....+++.++..+++.++.....-++- .|+.+|.+.|.-..-+.. ......+++.....+..+|...++.|.+- T Consensus 74 ~gl~DY~R~~g~~~g~vt~~~~t~tidqdR~~~F~VD~~D~~Etn~~l~a~~i~~~~~~~~v~PEiDay~~skla~~--- 150 (319) T protein:vir:97 74 TELKDYKRNATNEFDHPKIEETTYFLDQEKYWGRFVDALDRKDTEGNIDINYVVARQGAEVVAPYLDNLRFATLARN--- 150 (319) T ss_pred cccccccCCCCcccCCcccceeEEEeecccccccccchhhHhhhhchhhHHHHHHHHHHHHhhhhhhHHHHHHHHhh--- Confidence 345688877788899998888887764 477777777666533322 23333445555555666677766655321 Q ss_pred hcccccchhheeeecccccccccccccHHHHHHHHHHhCccc-cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeee Q lcl|NC_020854. 137 LNANTSSSAFFDLCIDSESGDTPTALSPRHVAEARAILGDQG-DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQ 215 (342) Q Consensus 137 ~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~ 215 (342) +. .. .+. ..++.-.++.|.++..+|-+.. ..-..++|.|.++.-|++... |.+..+.. 
T Consensus 151 --a~---~~------~~~--~~t~~n~y~~i~~a~~~Lde~~VP~~Rvl~Vtp~~~~~L~~~~~--f~~~~~~~------ 209 (319) T protein:vir:97 151 --KA---KH------LTV--GTGSDAQYDAVLDVSVELDEIKAPENRVLFVSPTFYKGIKKFVI--ALPQGDTR------ 209 (319) T ss_pred --cc---cc------ccc--ccCHHHHHHHHHHHHHHHHhcCCCCCcEEEeCHHHHHHHHhhhh--hhcccccc------ Confidence 10 00 000 0111224788888888886642 345789999999999987643 44433211 Q ss_pred ccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEecc-CCCcceeEEEEeeEE Q lcl|NC_020854. 216 SGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDR-DILAKSDAMSIDLHY 294 (342) Q Consensus 216 ~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr-~~~~g~~~l~~r~~y 294 (342) .....++.++.+.|.+|+...+. ..+.+..++.-++|+..-.|-. .+|..+ ..+...+.+..|..| T Consensus 210 ------~~~~~~g~Vg~idG~~Vi~vps~------~~k~in~i~~h~~A~~~~~k~~-~~~~~~p~~~~~a~~v~gr~y~ 276 (319) T protein:vir:97 210 ------QQVLGKGVQGELDGFVIVKVPTK------LLQGLQAIAVVGEVLASPIQAD-LAKTNSNIPGMFGTLAEQLLYT 276 (319) T ss_pred ------ccceeeeeceeecCeEEEEeccc------ccccceEEEEcCCeeeeeeeee-eeeccCCCccccceeeeeeeee Confidence 11123567888999999864322 1123334444477776655533 333322 343334777788877 Q ss_pred Eeeeccee----eecC----------cCCcChHHhcCCcCcee Q lcl|NC_020854. 295 VYHPVGAK----WAVT----------TTNPTRAQLETVANWSK 323 (342) Q Consensus 295 ~~~~~G~s----~~~~----------~~sPt~~~L~~~~NW~~ 323 (342) .+-+.-=+ |... +..|++++-.-....+. 
T Consensus 277 d~~V~~~k~~~Iy~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 319 (319) T protein:vir:97 277 GAFVPEHLQKYIFTIGGTEVATKRDGVDAHADNVAKPSGSLEM 319 (319) T ss_pred eeEEeccccceEEEeecCCcccCCCccccccccccCCcccccC Confidence 77764211 4322 22333333222222222 No 152 >protein:vir:94800 Length: 319 # NCBI annotation: ORF012 # Family: family:all:701 # MgeID: mge:1531 # MgeName: 29 # Cross-refs: genbank:acc:YP_240536;genbank:gi:66396203;genbank:GeneID:5133580 Probab=98.06 E-value=5.6e-06 Score=49.37 Aligned_cols=279 Identities=10% Similarity=-0.054 Sum_probs=145.9 Q ss_pred Cc------cee--ccc-----cchhH--------HHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCC Q lcl|NC_020854. 1 MA------TLR--SDI-----IIPEV--------FTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLS 59 (342) Q Consensus 1 Ma------T~~--~d~-----i~Pev--------~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~ 59 (342) |- |-. .++ -.||- |.+.+.+.. ....+ |+....+....+ .+|++|.||.+.. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~nt~~l~~k~~~~LD~~~-~~~~~--s~~~~~N~~~e~-~gg~tVkIp~i~~--- 73 (319) T protein:vir:94 1 MNKTIKNATGMLKLNLQHFANKSVEPGQTLLKNKHVGILERVT-AVNAY--STPALISNDAIF-MEGRSFTVMKGDT--- 73 (319) T ss_pred CCcccccccceeEeehhhhhccCCCcchHHHHHHHHHHHHHHH-HHhhh--hhhcccCcceEe-ccCcEEEEeeecc--- Confidence 22 110 111 11222 333333222 22222 222222344444 4799999999863 Q ss_pred CcccccCCCceechhhcccceeeeEee-eeccceeechHHHhhhcc--hHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_020854. 60 GDFEVLSDSSSLTPGKITADKQVAAIL-HRGRAFEARDLAALAAGS--DPMAAIGAKVADYVANQRQKDLLSCLQGVFGS 136 (342) Q Consensus 60 gda~~~~~~~~i~~~~lt~~~~~a~i~-~~~k~~~~tD~a~~~~~~--dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~ 136 (342) ....+++.++..+++.++.....-++- .|+.+|.+.|.-..-+.. 
......+++.....+..+|...++.|.+- T Consensus 74 ~gl~DY~R~~g~~~g~vt~~~~t~tidqdR~~~F~VD~~D~~Etn~~l~a~~i~~~~~~~~v~PEiDay~~skla~~--- 150 (319) T protein:vir:94 74 TELKDYKRNATNEFDHPKIEETTYFLDQEKYWGRFVDALDRKDTEGNIDINYVVARQGAEVVAPYLDNLRFATLARN--- 150 (319) T ss_pred cccccccCCCCcccCCcccceeEEEeecccccccccchhhHhhhhchhhHHHHHHHHHHHHhhhhhhHHHHHHHHhh--- Confidence 345688877788899998888887764 477777777666533322 23333445555555666677766655321 Q ss_pred hcccccchhheeeecccccccccccccHHHHHHHHHHhCccc-cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeee Q lcl|NC_020854. 137 LNANTSSSAFFDLCIDSESGDTPTALSPRHVAEARAILGDQG-DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQ 215 (342) Q Consensus 137 ~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~ 215 (342) +. .. .+. ..++.-.++.|.++..+|-+.. ..-..++|.|.++.-|++... |.+..+.. T Consensus 151 --a~---~~------~~~--~~t~~n~y~~i~~a~~~Lde~~VP~~Rvl~Vtp~~~~~L~~~~~--f~~~~~~~------ 209 (319) T protein:vir:94 151 --KA---KH------LTV--GTGSDAQYDAVLDVSVELDEIKAPENRVLFVSPTFYKGIKKFVI--ALPQGDTR------ 209 (319) T ss_pred --cc---cc------ccc--ccCHHHHHHHHHHHHHHHHhcCCCCCcEEEeCHHHHHHHHhhhh--hhcccccc------ Confidence 10 00 000 0111224788888888886642 345789999999999987643 44433211 Q ss_pred ccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEecc-CCCcceeEEEEeeEE Q lcl|NC_020854. 216 SGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDR-DILAKSDAMSIDLHY 294 (342) Q Consensus 216 ~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr-~~~~g~~~l~~r~~y 294 (342) .....++.++.+.|.+|+...+. ..+.+..++.-++|+..-.|-. 
.+|..+ ..+...+.+..|..| T Consensus 210 ------~~~~~~g~Vg~idG~~Vi~vps~------~~k~in~i~~h~~A~~~~~k~~-~~~~~~p~~~~~a~~v~gr~y~ 276 (319) T protein:vir:94 210 ------QQVLGKGVQGELDGFVIVKVPTK------LLQGLQAIAVVGEVLASPIQAD-LAKTNSNIPGMFGTLAEQLLYT 276 (319) T ss_pred ------ccceeeeeceeecCeEEEEeccc------ccccceEEEEcCCeeeeeeeee-eeeccCCCccccceeeeeeeee Confidence 11123567888999999864322 1123334444477776655533 333322 343334777788877 Q ss_pred Eeeeccee----eecC----------cCCcChHHhcCCcCcee Q lcl|NC_020854. 295 VYHPVGAK----WAVT----------TTNPTRAQLETVANWSK 323 (342) Q Consensus 295 ~~~~~G~s----~~~~----------~~sPt~~~L~~~~NW~~ 323 (342) .+-+.-=+ |... +..|++++-.-....+. T Consensus 277 d~~V~~~k~~~Iy~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 319 (319) T protein:vir:94 277 GAFVPEHLQKYIFTIGGTEVATKRDGVDAHADNVAKPSGSLEM 319 (319) T ss_pred eeEEeccccceEEEeecCCcccCCCccccccccccCCcccccC Confidence 77764211 4322 22333333222222222 No 153 >protein:vir:80128 Length: 466 # NCBI annotation: Phage capsid protein # Family: family:all:635 # MgeID: mge:1877 # MgeName: bacteriophage bv1 # Cross-refs: genbank:acc:YP_001425603;genbank:gi:155042936;genbank:GeneID:5469556 Probab=98.04 E-value=7.1e-07 Score=54.28 Aligned_cols=286 Identities=12% Similarity=0.063 Sum_probs=141.6 Q ss_pred Cccee--ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhccc Q lcl|NC_020854. 1 MATLR--SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITA 78 (342) Q Consensus 1 MaT~~--~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~ 78 (342) +-+.. ..+++|+-+...+.+.+.+.+.|.+ .+...+ .+| +..+|.-... 
..+.-+.|++.++..+.+- T Consensus 150 ~~~~~~g~~~~vP~~~~~~i~~~l~~~~~l~~--~~~v~~-----~~g-~~~~~~~~~~--~~a~wv~E~~~~~~~~~~f 219 (466) T protein:vir:80 150 QKRAVSGAELTIPDVMLELLRDNMHRYSKLIS--KVRLRP-----LKG-TARQNIAGAI--PEGVWTEAVANLNELSLSF 219 (466) T ss_pred hhhhhccccccccHHHHHHHHHhhhhhhhhhh--heeeee-----cCc-eeEeeeecCC--cceeecccccccccccccc Confidence 11223 3478999888888777777666643 221111 123 3455544332 4444567888887766665 Q ss_pred ceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-----HHHHHhhhcccccchhheeeeccc Q lcl|NC_020854. 79 DKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSC-----LQGVFGSLNANTSSSAFFDLCIDS 153 (342) Q Consensus 79 ~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~-----L~g~~~~~~a~~~~~~~~~~~~~~ 153 (342) +.-....++.+.-+.+++..-.-+..|..+.+.+++++.+++..+..+|.- =.|++.......... ... T Consensus 220 ~~i~~~~~k~~~~~~iS~ell~ds~~~l~~~i~~~la~~~~~~~~~ail~G~G~~~P~Gil~~~~~~~~~~------~~~ 293 (466) T protein:vir:80 220 SQIEVDGYKVGGFIPIPNSTLEDSDLNLADEILDAIGQAIGFALDKAILYGTGTKMPVGIVTRLAQTTQPP------NWG 293 (466) T ss_pred cceeecceeeeeehhhhHHHHhcchHHHHHHHHHHHHHHHHHHHhhheeeccCCCCcceeeeccccccccc------ccc Confidence 555555666665567766665556667788899999999999888877641 001211100000000 000 Q ss_pred ccccccccccHHH--------------HHHHHHH----hCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeee Q lcl|NC_020854. 154 ESGDTPTALSPRH--------------VAEARAI----LGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQ 215 (342) Q Consensus 154 ~~~~~~~~~~~~~--------------l~~A~~~----~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~ 215 (342) ........++... +.+.... ..-......+|+||+..+..|+...... +..+.... 
T Consensus 294 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~w~~~~~~~~~l~~~~~~~-----~~~g~~~~- 367 (466) T protein:vir:80 294 TKAPAWTNLSTTNLLKIDPTGKSAEEFFSELVLKLSKARANYSNGMKFWAMSSNTHAVLMSKAITF-----NSAGALVA- 367 (466) T ss_pred cccccccccchhhhhhhhhhccchhhHHHHHHHHHHhhhccccCCceeEEecchhHHHhhcccccc-----cCCccccc- Confidence 0000001111111 1121111 1222334567999999999988764320 11111110 Q ss_pred ccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCC--cceeEEEEeeE Q lcl|NC_020854. 216 SGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDIL--AKSDAMSIDLH 293 (342) Q Consensus 216 ~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~--~g~~~l~~r~~ 293 (342) . ...-..++|++|++++.||... ..+..... ..+.....+.+++++... .+++.+.+..+ T Consensus 368 ------~----~~~~~~i~G~pvv~s~~~~~~~-------~~~g~~~~-y~i~~r~~~~i~~~~~~~f~~d~~~~r~~~r 429 (466) T protein:vir:80 368 ------S----LNNTMPIVGGDIVILDFIPDND-------IIGGYGSL-YLLAERADIKLAQSEHVRFIEDQTVFKGTAR 429 (466) T ss_pred ------c----CCCcccccccceeecCccCccc-------eeeecccc-EEEEeecceEEEechhhhhhcCcEEEEEEEE Confidence 0 0112347899999999997531 12222222 234445566666666554 56678888888 Q ss_pred EEeeecce-ee---ecCcCCcChHHhcCCcCceeecCccccceE Q lcl|NC_020854. 294 YVYHPVGA-KW---AVTTTNPTRAQLETVANWSKVYELKNIGIV 333 (342) Q Consensus 294 y~~~~~G~-s~---~~~~~sPt~~~L~~~~NW~~v~d~k~i~~~ 333 (342) +...|+-- +| +.+..+|.=.. 
+.+.+.-+.|=| T Consensus 430 ~dg~~~~~~afv~~~~~~~~~~~~~-------~~~~~~~~~~~~ 466 (466) T protein:vir:80 430 YDGKPVFGEGFVAVNIANANPTTSI-------TFAPDEANVPEV 466 (466) T ss_pred EccEEeccCceEEEEecCCCcccce-------eeecCcCcCCCC Confidence 87666331 22 11222211100 001111111111 No 154 >protein:vir:7019 Length: 401 # NCBI annotation: major capsid protein # Family: family:all:2806 # MgeID: mge:141 # MgeName: SP6 # Cross-refs: genbank:acc:NP_853592;genbank:gi:31711674;genbank:GeneID:1481800 Probab=97.89 E-value=7.4e-07 Score=54.21 Aligned_cols=314 Identities=16% Similarity=0.117 Sum_probs=164.2 Q ss_pred Ccc-----ee------c-cccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCC Q lcl|NC_020854. 1 MAT-----LR------S-DIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDS 68 (342) Q Consensus 1 MaT-----~~------~-d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~ 68 (342) |++ +- + +-+-=|+|.--|...+.++..|.. ..... +| .+|++..+|+-. .-.++....| T Consensus 1 Ms~~n~~t~~~~~~sg~~~al~Le~f~GeV~taF~~~si~~~--~~~vR-ti---~~gkS~qf~~~G---~s~~~~~~pG 71 (401) T protein:vir:70 1 MSTPNNLTNVAVSASGEVDSLLIEKFNGKVNEQYLKGENIMS--YFDVQ-TV---TGTNTVSNKYLG---ETELQVLAPG 71 (401) T ss_pred CCCCccccccccccccchhHhHHhHhcchHHHHHHHHhhhcc--cceee-ee---cccceEEEEEee---eeEeeeecCC Confidence 661 10 0 112225555555556666666632 22111 22 469999999753 2455678888 Q ss_pred ceechhhcccceeeeEeee-eccceeechHHHhhhcch-HHHHHHHHHHHHHHHHHHHHHHHHHH-HHHhhhcc---cc- Q lcl|NC_020854. 69 SSLTPGKITADKQVAAILH-RGRAFEARDLAALAAGSD-PMAAIGAKVADYVANQRQKDLLSCLQ-GVFGSLNA---NT- 141 (342) Q Consensus 69 ~~i~~~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~d-p~~~i~~qia~~~~~~~~~~lla~L~-g~~~~~~a---~~- 141 (342) +.+..+.+.+.+.+=+|=. .--...+.|+-...+.-| +-.+++++++...++.+|..++..++ +.++...+ +. 
T Consensus 72 ~~ld~~~~~~dK~~ItID~lL~a~~~V~dlDe~q~~yD~vRse~s~e~G~ALA~~~Dq~iiq~i~~aa~ana~~~~~~p~ 151 (401) T protein:vir:70 72 QSPAATSTQADKNQLVIDATVIARNTVAHLHDVQGDIDSLKPKLATNQAKQLKRMEDEMLIQQMMLGGIANTQAKRTNPR 151 (401) T ss_pred CCcCCCCcccccEEEEeCceeehhhhhhhHHHHHhcccccchHHHHHHHHHHHHHHHHHHHHHHHHhccccccccccCCC Confidence 9998888888876655533 223467778887777667 66799999999999999998876553 11211101 11 Q ss_pred cchhheeeecccccccccccccHHHHHHH----HHHhCcc--ccCeEEEEEchHHHHHHHhh-hhhhhhhhhhcccceee Q lcl|NC_020854. 142 SSSAFFDLCIDSESGDTPTALSPRHVAEA----RAILGDQ--GDKLTAVAMHSKVYYDLVER-RAIDYVSTADARGTSTT 214 (342) Q Consensus 142 ~~~~~~~~~~~~~~~~~~~~~~~~~l~~A----~~~~GD~--~~~~~~ivmhS~v~~~L~~~-~li~~~~~s~~~~~~~~ 214 (342) ...+-..+.... .......+...|.+| .+.|-++ ...-.++++.|..|.-|++. +|++- + +.. T Consensus 152 ~~~~G~~i~v~~--~~~~~~~~~~~l~~ai~dA~~~LdEkdVP~~r~vvl~pp~~Ys~Ll~~d~L~nr----d----~~~ 221 (401) T protein:vir:70 152 VKGHGFSINVEV--AEGEALVNPQYVMAAVEFALEQQLEQEVDISDVAILMPWRYFNVLRDADRIVDK----T----YTI 221 (401) T ss_pred cCCCceEEeccc--cccccccCHHHHHHHHHHHHHHHHhcCCCccceEEEcCHHHHHHHHhcCcccch----h----hcc Confidence 111111222211 122233444555544 3333221 11234555566666566553 34421 1 000 Q ss_pred eccceeecccccccceeeeccceEEEeCCcceecc-------------------CCCcceEEEEEecceeEeecCCccee Q lcl|NC_020854. 215 QSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGS-------------------GGSTEYATYFFTQGAVASGEQMAMQT 275 (342) Q Consensus 215 ~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~-------------------~~~~~y~t~l~~~GAi~~~~k~~~~v 275 (342) ..++. ...+.+....|.+|+.+..+|.... +...+....+|-+-|++...-.++.. 
T Consensus 222 s~~g~-----~~~G~v~~vaGv~Vv~SnnlP~~a~~it~~~ls~a~~G~~y~~~~d~s~~~~v~f~~~Av~tvk~~~lt~ 296 (401) T protein:vir:70 222 SQSGA-----TIQGFTLSSYNCPVIPSNRFPKYSQGQTHHLLSNEDNGYRYDPLPAMNGAIAVLFTADALLVGRSIDVTG 296 (401) T ss_pred ccCCc-----cccceEEEEeceEEEeeccccccccccccccccccCCCccCCCCccccceeEEEEehhheEEEEeecccc Confidence 11111 1245677899999999999986321 11123456788888998888888888 Q ss_pred EeccCCCcceeEEEEeeEEEeeecc-----e---eeecCcCCcChHHhcCCcCceee-cCccccceEEEEecCCCC Q lcl|NC_020854. 276 ETDRDILAKSDAMSIDLHYVYHPVG-----A---KWAVTTTNPTRAQLETVANWSKV-YELKNIGIVRATNVSNFD 342 (342) Q Consensus 276 e~dr~~~~g~~~l~~r~~y~~~~~G-----~---s~~~~~~sPt~~~L~~~~NW~~v-~d~k~i~~~~~~~~~~~~ 342 (342) |..|++....+.|-+.+.|++.|+= + +++.....|.-.++..-+ -.++ ..+|++ -+++...+- T Consensus 297 ~~~~d~r~~~~~id~~~a~g~g~~RPeaa~vv~~k~~~~~~~~~~~~~~~~~-~~~~~~~~~~~---~~~~~~~~~ 368 (401) T protein:vir:70 297 DIFYEKKEKTYYIDTFMAEGAIPDRWEAVSVVTTKRNTTTGAVEGTDGAQHT-IVKNRAQRKAV---YVKNAAPVA 368 (401) T ss_pred chhhhhhhhHHHHHHHHHhCCcccchhheEEEeecCcccccccccCCcchhh-hhhhhccceeE---Eeccccchh Confidence 8889988888877777777666532 1 111000011111111110 0111 122222 111111111 No 155 >protein:vir:105645 Length: 400 # NCBI annotation: putative major capsid protein # Family: family:all:2806 # MgeID: mge:1674 # MgeName: K1E # Cross-refs: genbank:acc:YP_425009;genbank:gi:83571757;uniprot:Q2WC43;genbank:GeneID:3837286 Probab=97.83 E-value=7.7e-07 Score=54.09 Aligned_cols=317 Identities=16% Similarity=0.115 Sum_probs=169.2 Q ss_pred Ccc-----ee-------ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCC Q lcl|NC_020854. 1 MAT-----LR-------SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDS 68 (342) Q Consensus 1 MaT-----~~-------~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~ 68 (342) |++ +- .+-+-=|+|.--|...+.++..|.. ..... +| .+|++..+|+-. 
.-.++....| T Consensus 1 Ms~~n~~t~p~~~gsg~~~aL~Le~f~GeV~taF~~~si~~~--~~~vR-tI---~~gkS~qf~~lG---~s~a~y~~pG 71 (400) T protein:vir:10 1 MSTPNNLTNVAVSASGEVDSLLIEKFNGKVNEQYLKGENIMS--YFDVQ-TV---TGTNTVSNKYLG---ETELQVLAPG 71 (400) T ss_pred CCCCccccccccccccchhhhHHhHhcchHHHHHHHHhhhcc--cceee-ee---cccceEEEEEee---eeEEeeecCC Confidence 661 10 0112235565556566666666632 22111 22 469999999753 2456678889 Q ss_pred ceechhhcccceeeeEeee-eccceeechHHHhhhcch-HHHHHHHHHHHHHHHHHHHHHHHHH-HHHHhhhccc----c Q lcl|NC_020854. 69 SSLTPGKITADKQVAAILH-RGRAFEARDLAALAAGSD-PMAAIGAKVADYVANQRQKDLLSCL-QGVFGSLNAN----T 141 (342) Q Consensus 69 ~~i~~~~lt~~~~~a~i~~-~~k~~~~tD~a~~~~~~d-p~~~i~~qia~~~~~~~~~~lla~L-~g~~~~~~a~----~ 141 (342) +.+..+.+...+.+=+|=. .---..+.|+-.....-| +-.|++++++...++.+|+.++..+ .+.++...+. . T Consensus 72 ~~ldg~~~~~dk~~ItIDtLL~a~~~V~dlDd~q~~yD~vRse~s~e~G~ALA~~~Dq~iiq~i~~a~~a~t~~~~~~~~ 151 (400) T protein:vir:10 72 QSPAATSTQADKNQLVIDATVIARNTVAHLHDVQGDIDSLKPKLATNQAKQLKKMEDEMLIQQMLLGGIANTQAKRTNPR 151 (400) T ss_pred CCcCCCCcccCcEEEEeCceeeecchhhhHHHHhhccccccHHHHHHHHHHHHHHHHHHHHHHHHHhcccccccccccCC Confidence 9998888877776655532 222356777777777667 7899999999999999999887644 3322211111 1 Q ss_pred cchhheeeecccccccccccccHHHHH----HHHHHhCcc-c-cCeEEEEEchHHHHHHHhh-hhhhhhhhhhcccceee Q lcl|NC_020854. 142 SSSAFFDLCIDSESGDTPTALSPRHVA----EARAILGDQ-G-DKLTAVAMHSKVYYDLVER-RAIDYVSTADARGTSTT 214 (342) Q Consensus 142 ~~~~~~~~~~~~~~~~~~~~~~~~~l~----~A~~~~GD~-~-~~~~~ivmhS~v~~~L~~~-~li~~~~~s~~~~~~~~ 214 (342) ...+...+.+... +..+..+...|. +|.+.|-++ - ..-.++++.|..|.-|+.. +|++-- +.. 
T Consensus 152 g~~~g~s~~v~~~--~~~~~~~~~~l~~A~~~A~~~LdEkdVP~~d~vvl~pp~~Ys~Ll~~dkLvnrd--------f~~ 221 (400) T protein:vir:10 152 VKGHGFSVNVEVN--EGEALVNPQYVMAAVEFALEQQLEQEVDISDVAILMPWRYFNVLRDADRIVDKS--------YTI 221 (400) T ss_pred ccccccceeeccc--ccccccCHHHHHHHHHHHHHHHHhcCCCccceEEEcCHHHHHHHHhCCcccchh--------ccc Confidence 1112222222211 222233555555 443333221 1 1223666666677666553 243211 111 Q ss_pred eccceeecccccccceeeeccceEEEeCCcceec-------------------cCCCcceEEEEEecceeEeecCCccee Q lcl|NC_020854. 215 QSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAG-------------------SGGSTEYATYFFTQGAVASGEQMAMQT 275 (342) Q Consensus 215 ~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~-------------------~~~~~~y~t~l~~~GAi~~~~k~~~~v 275 (342) ..++. ...+.+..+.|.+|+.+..+|... ++...+....+|-+-|++..+-.++.. T Consensus 222 s~~g~-----~~~g~v~~v~Gv~Iv~Sn~lP~~a~~~~~~~lS~a~~G~~y~~t~d~s~~~av~F~~sAv~tvk~~~lt~ 296 (400) T protein:vir:10 222 SQSGA-----TIQGFVLSSYNCPVIPSNRFPKYSQGQKHHLLSNEDNGYRYDPIAEMNGAIAVLFTADALLVGRSIDVIG 296 (400) T ss_pred cCCCc-----cccceEEEEeceEEEeeCcCCcccCcccccccccCCCCccCCccccccceeEEEEehhheEEEEeecccc Confidence 11111 124567789999999999998531 112223456788888999988888999 Q ss_pred EeccCCCcceeEEEEeeEEEeeecceee-----ecCcCCcC--------hHHhcCCcCceeecCccccceEEEEecCCCC Q lcl|NC_020854. 276 ETDRDILAKSDAMSIDLHYVYHPVGAKW-----AVTTTNPT--------RAQLETVANWSKVYELKNIGIVRATNVSNFD 342 (342) Q Consensus 276 e~dr~~~~g~~~l~~r~~y~~~~~G~s~-----~~~~~sPt--------~~~L~~~~NW~~v~d~k~i~~~~~~~~~~~~ 342 (342) |..|++....+.|-+.+.|++.|+==.. +.-+..|. 
-..+-+-+|-+.||- |+-.-+.----..+- T Consensus 297 ~~~~d~r~~~~~id~~~a~G~g~~RPeaa~vv~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~~~~~~ 375 (400) T protein:vir:10 297 DIFYEKKEKTYYIDTFMSEGAIPDRWEAVSVVTTKRQSTGAVDSGNAAQHTQVLNRAQRKAVYV-KNAAPAGAFAAASLS 375 (400) T ss_pred ccccchhhHHHHHHHHHHhCCcccchhheEEEEecCCcccccccCcchhHHHHHhhcccceEEE-ecccccccccccccc Confidence 9999999999999998888888743111 11111111 112222222222221 000000000001111 No 156 >protein:vir:79008 Length: 299 # NCBI annotation: putative main capsid protein # Family: family:all:701 # MgeID: mge:1861 # MgeName: phiC2 # Cross-refs: genbank:acc:YP_001110725;genbank:gi:134287342;genbank:GeneID:4955182 Probab=97.63 E-value=3.5e-05 Score=45.01 Aligned_cols=272 Identities=8% Similarity=0.019 Sum_probs=134.3 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchh---hhccCCCCEEEccccccCCCCcccccCCCc-eechhhc Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTE---LNATEGGDFINVPFWKANLSGDFEVLSDSS-SLTPGKI 76 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~---l~~~~~G~ti~~P~~~~i~~gda~~~~~~~-~i~~~~l 76 (342) ||+ + +. +|.|.+.+.+++.+.+.+ |.+...+. +.. .||++|.||.... ....+++-++ --....+ T Consensus 1 MA~-~-n~--a~~~~~~Ld~~~~~~l~~---~~L~~~~~~~~v~~-~gg~tVkI~~i~~---~gl~DY~R~~~g~~~g~~ 69 (299) T protein:vir:79 1 MAA-L-NY--AKEYSNVLAQAYPYTLNF---GDLYATPNNGRYRW-TGSKTIEIPTIST---TGRVDSNRDTIAVAQRNY 69 (299) T ss_pred Ccc-c-hh--HHHHHHHHHHHHHhhcee---eeeccCcccceeee-cCCCEEEEecccc---ccccccccCCCccccccc Confidence 993 2 22 388888888888776544 44433322 222 4799999998653 3345666544 3344466 Q ss_pred ccceeeeEe-eeeccceeechHHH-hhhcchHHHHHHHHHH-HHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccc Q lcl|NC_020854. 77 TADKQVAAI-LHRGRAFEARDLAA-LAAGSDPMAAIGAKVA-DYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDS 153 (342) Q Consensus 77 t~~~~~a~i-~~~~k~~~~tD~a~-~~~~~dp~~~i~~qia-~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~ 153 (342) +...+.-++ +.|..+|.+.+.-. ++.+.-.++++.+++. ..-+..+|+..++.|-. . +...+... . 
T Consensus 70 ~~~~~t~~ldqdr~~~f~vD~~Dvdet~~~~~~a~v~~~~~~~~v~pEiDay~~skl~~---~--a~~~g~~~------~ 138 (299) T protein:vir:79 70 DNAWEPKVLTNQRKWSTLVHPADINQTNYVASIGNITKVYNEEQKFPEMDAYCISKIYA---D--WTALGNTA------D 138 (299) T ss_pred CcceeEEEeeccccceeccchhhHHHHhhhhHHHHHHHHHHHHHhhhHhhHHHHHHHHH---h--hhhcCCcc------c Confidence 655555444 45888888884432 2222223344333332 33334456655654421 0 10000000 0 Q ss_pred ccccccccccHHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccccccee Q lcl|NC_020854. 154 ESGDTPTALSPRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVP 231 (342) Q Consensus 154 ~~~~~~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~ 231 (342) .+..+... -++.|.++..+|-+.. ..-..++|.|.++.-|.+...|. +..+.. ... ..-++.++ T Consensus 139 ~~~~T~~n-~y~~i~~~~~~lde~~vP~~~rvl~vtp~~~~~L~~~~~f~--k~~~~~-----~~~------~~~~g~Vg 204 (299) T protein:vir:79 139 TTVLTTTN-VLEVFDKLMEKMTEARVPENGRILYVTPVVNTLIKNAKEIQ--RTVNIK-----DAG------TSLNRQTT 204 (299) T ss_pred ccccCHHH-HHHHHHHHHHHHHhcCCCCCCeEEEeCHHHHHHHhhchhhh--cccccc-----ccc------ceeeeeee Confidence 00001111 2566777777777643 24588999999999998765432 222111 000 11246688 Q ss_pred eeccceEEE--eCCccee----c---cCCCcc-eEEEEEecceeEeecCCcceeEeccCCCcceeEEE-EeeEEEeeec- Q lcl|NC_020854. 232 TYMGLRVIV--SDDVNTA----G---SGGSTE-YATYFFTQGAVASGEQMAMQTETDRDILAKSDAMS-IDLHYVYHPV- 299 (342) Q Consensus 232 ~~~G~~Vvv--dD~~p~~----~---~~~~~~-y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~-~r~~y~~~~~- 299 (342) .+.|.+|+. ++.+... . .+...+ ..-+++-++|+.--.|.+..--..|+.++..+.++ .|..+-+-++ T Consensus 205 ~idG~~Ii~Vps~r~~t~~~~~~G~~~~~~ak~in~ii~~~~a~~~~~K~~~~~~~~P~~~~~~~~~~~~r~y~d~~v~~ 284 (299) T protein:vir:79 205 DIDTVKIIKVPSNLMKTAYDFTTGWKVGAGAKQIFMSLVHPSAIITPVSYQFSKLDEPTAVTEGKYFYFEESFEDVFILN 284 (299) T ss_pred eecceEEEEechhhcCccceeccCccccCcccccceEEEcCCeeeeeEeeeeEEeecCCCCCccceeeeeeeeeeeeeec Confidence 899999986 4444321 0 111122 22344445555544454422224566655445444 3443334333 Q ss_pred ----ce--eeecCcC Q lcl|NC_020854. 
300 ----GA--KWAVTTT 308 (342) Q Consensus 300 ----G~--s~~~~~~ 308 (342) |+ +.+.++. T Consensus 285 nk~~~i~~~~~~a~~ 299 (299) T protein:vir:79 285 KKADAIQFVVEGAGA 299 (299) T ss_pred cccCeEEEEeeecCC Confidence 33 2222222 No 157 >protein:vir:3158 Length: 321 # NCBI annotation: capsid protein gpE # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:316 # MgeName: PhiCh1 # Cross-refs: genbank:acc:NP_665929;genbank:gi:22091115;genbank:GeneID:951342 Probab=97.32 E-value=9.7e-05 Score=42.59 Aligned_cols=280 Identities=11% Similarity=0.041 Sum_probs=123.8 Q ss_pred Ccceecc---ccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCC-ceechhhc Q lcl|NC_020854. 1 MATLRSD---IIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDS-SSLTPGKI 76 (342) Q Consensus 1 MaT~~~d---~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~-~~i~~~~l 76 (342) |.+...+ .+.|++...++ +++.+.+.|.+ .+..-+. ......+|.|..- ........++ ......+. T Consensus 19 ~~~~~~~~g~~v~~~~~~~l~-~~i~e~s~~l~--~i~v~~v-----~~~~~~i~~~~~~-~~~~~~~~e~~~~~~~~~~ 89 (321) T protein:vir:31 19 LTVDDLDAGGTLPDPLWDEFW-TDMIEETPLLD--AIRTETV-----GAKKTRIPTLNIG-ERHRRPQDEGEWNENESDV 89 (321) T ss_pred ccccccCCcceeCHHHHHHHH-HHHHHhhhhhh--hceeeec-----cCcceeeeeeccC-Ccccccccccccccccccc Confidence 3322211 45666555544 44666666654 2322221 1233456666421 1111122222 22333334 Q ss_pred ccceeeeEeeeeccceeechHHHhh-h-cchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccc----hhheee- Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALA-A-GSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSS----SAFFDL- 149 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~-~-~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~----~~~~~~- 149 (342) +.++..-..++..--+.++..--.- + +.|..+.+.+++++.+++..+...+. |...+.+.. .-++.. 
T Consensus 90 ~~~~~~~~~~k~~~~~~it~e~L~d~a~~~d~e~~i~~~ia~~~a~~~~~~~~n------Gd~~~~~~~~~~n~G~l~~a 163 (321) T protein:vir:31 90 STGTIDISTEKATVAWDLPREVVQENPEGEALADRILNLMTDAWSADVEDLAAN------GDEDAEDSFENQNDGFITVA 163 (321) T ss_pred eeeeeeeeeEEEEeehhccHHHHHhhhcchhHHHHHHHHHHHHHHHHHHhheee------ccccCCCcccccchhhhhhh Confidence 4444444444444445555544322 2 45888889998988888776665542 222222110 111110 Q ss_pred eccc-ccccccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccc Q lcl|NC_020854. 150 CIDS-ESGDTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGG 226 (342) Q Consensus 150 ~~~~-~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~ 226 (342) .... ....+...++.+.|.++...+-.... .-.+|+||+..+..+++. + .+.+.. +..+... T Consensus 164 ~~~~~~~~~~~~~~~~d~l~~l~~~l~~~yr~~~~~v~im~~~~~~~~~~~-l----~~~~~~----------~~~~~l~ 228 (321) T protein:vir:31 164 EGDVETIDAADDILDNDLVIRTIAGLDSKYRARMNPALIVSEDQLLSYHYT-L----TDRDTP----------LGDNVIM 228 (321) T ss_pred ccccccccccccccCHHHHHHHHHhccHhHhcCCCeEEEechHHHHHHHHH-H----hcCCCc----------cccchhh Confidence 0001 11123345778889999888866432 234799999998776642 1 111110 1111111 Q ss_pred ccceeeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccCCCc---ce----eEEEEeeEEEeee Q lcl|NC_020854. 227 EVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDILA---KS----DAMSIDLHYVYHP 298 (342) Q Consensus 227 ~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~~---g~----~~l~~r~~y~~~~ 298 (342) ...-.+++|++|++++.||-. .++|+. --+.|+..+.+.+++.++... +. ..+..+..|++-- T Consensus 229 ~~~~~tl~G~pvv~~~~mP~~---------~il~t~~~nl~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ve~ 299 (321) T protein:vir:31 229 GEADVNPFSFPIIGSGLWPDD---------KAMFTDPQNLIYALYRDLEIDVLTESDKVSERDLHARYFMRGDDDFAIEN 299 (321) T ss_pred ccccccccceeEEEcCCCCCC---------cEEEeccccEEEEEeeccEEEEeecCccccccceeeEeeeeeecceeEec Confidence 233447899999999999832 122222 112233334445554444221 11 2233334444333 Q ss_pred cc-eeeecCcCCcChHHhcCCcC Q lcl|NC_020854. 
299 VG-AKWAVTTTNPTRAQLETVAN 320 (342) Q Consensus 299 ~G-~s~~~~~~sPt~~~L~~~~N 320 (342) ++ +.+...-+-| -+-|+..+. T Consensus 300 ~~a~a~~~~i~~~-~~~~~~~~~ 321 (321) T protein:vir:31 300 TEAVVLAEGLGDP-LEHLEEETS 321 (321) T ss_pred cccEEEEecCCcc-hhcccCCCC Confidence 32 1221110000 111111111 No 158 >protein:vir:819 Length: 404 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:16 # MgeName: VT2-Sa # Cross-refs: genbank:acc:NP_050552;genbank:gi:9633449;genbank:GeneID:1262254 Probab=97.25 E-value=4.5e-05 Score=44.43 Aligned_cols=293 Identities=14% Similarity=0.120 Sum_probs=141.0 Q ss_pred Cccee-cc-ccchh--HHHHHHHhhh-------------HHh---hhhhhc---CccccchhhhccCCCCEEEccccccC Q lcl|NC_020854. 1 MATLR-SD-IIIPE--VFTPYVIEQT-------------TQR---DAFLAS---GVVQPMTELNATEGGDFINVPFWKAN 57 (342) Q Consensus 1 MaT~~-~d-~i~Pe--v~~~yv~~~~-------------~~~---~~f~~s---g~~~~d~~l~~~~~G~ti~~P~~~~i 57 (342) |+|.- .. ++.-+ .|+....... ..+ ..+.++ ..|+.-.+|.- ..|+.|+++.-.. T Consensus 1 ~~~~~~~~a~~~~~~~lft~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K-~aGd~vtf~L~~~- 78 (404) T protein:vir:81 1 MTTVTSAQANKLYQVALFTAANRNRSMVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNK-QAGDEVTFSIMHK- 78 (404) T ss_pred CCCcCCcchhhhHHHHHHHHHhcCChhHhhhhhhhhhhhhhccchhhccCCCCCccEEEeecCCC-CCCcEEEEeEeee- Confidence 55321 11 11111 1222111100 000 011111 22333344443 4799999998775 Q ss_pred CCCcccccCCCceechhhcccceeeeEeeeeccceeech-HHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_020854. 58 LSGDFEVLSDSSSLTPGKITADKQVAAILHRGRAFEARD-LAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGS 136 (342) Q Consensus 58 ~~gda~~~~~~~~i~~~~lt~~~~~a~i~~~~k~~~~tD-~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~ 136 (342) |.|++---++.-+---+.|+..++..+|=...+++.... .++.-+--|...+....+++||++..|..++-.|.|..+. 
T Consensus 79 L~g~gv~Gd~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~~laG~rg~ 158 (404) T protein:vir:81 79 LSKRPTMGDERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGARGD 158 (404) T ss_pred cccCCcccCceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhccccc Confidence 567653333333334788999999999988888875543 2345566788889999999999999999999988876542 Q ss_pred h-----------------------cccccchhhee-eecccccccccccccHHHHHHHHHHh-------------Ccc-- Q lcl|NC_020854. 137 L-----------------------NANTSSSAFFD-LCIDSESGDTPTALSPRHVAEARAIL-------------GDQ-- 177 (342) Q Consensus 137 ~-----------------------~a~~~~~~~~~-~~~~~~~~~~~~~~~~~~l~~A~~~~-------------GD~-- 177 (342) - .|...+-++.- -.++...=+++..|+.+.+.++.... ||+ T Consensus 159 ~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv~~~g~~~~ 238 (404) T protein:vir:81 159 FVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLSGDELH 238 (404) T ss_pred cccccceeeccccccccceeecccCCCCCCcEEeccCccchhhhhhcccccHHHHHHHHHHHHHhCCCCcceEecccccc Confidence 0 01111111110 00011111234567877777775443 333 Q ss_pred -ccCeEEEEEchHHHHHHHhhh----hhhhhhhh------hcccceeeeccceeecccccccceeeeccceEE------- Q lcl|NC_020854. 178 -GDKLTAVAMHSKVYYDLVERR----AIDYVSTA------DARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVI------- 239 (342) Q Consensus 178 -~~~~~~ivmhS~v~~~L~~~~----li~~~~~s------~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vv------- 239 (342) .+...+++|||-.+++|+.+- ..+..+.+ +.++++. +..+.|+|+.|. T Consensus 239 ~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~--------------G~~gm~ngvii~~~~~~~I 304 (404) T protein:vir:81 239 GEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFK--------------GECAMWRNILVRKYAGMPI 304 (404) T ss_pred CccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCcee--------------cCeeEEcCEEEEecCCcee Confidence 113689999999999999983 33333322 2233333 345556664443 Q ss_pred ---------EeCCcceec----cCCCcceEEEEEecceeEeecCC-----cceeEeccCCCcceeEEEEeeEEEeeecce Q lcl|NC_020854. 
240 ---------VSDDVNTAG----SGGSTEYATYFFTQGAVASGEQM-----AMQTETDRDILAKSDAMSIDLHYVYHPVGA 301 (342) Q Consensus 240 ---------vdD~~p~~~----~~~~~~y~t~l~~~GAi~~~~k~-----~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~ 301 (342) +++.-.-.. .....+-..+++|.=|+.+.-+. +.-.|...|-+.+..+.... +.|+ T Consensus 305 rf~~g~~~~~~~n~~~a~~~~~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~~~i~~~~------i~G~ 378 (404) T protein:vir:81 305 RFYQGSKVLVSENNLTATTKEVAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNRTEIAISW------INGL 378 (404) T ss_pred eecccceeeecCCccccccccccccccchhheeecceeEEEEeeccCCCCceeEeeccccCchhhhhhHH------Hhhh Confidence 222211000 01111124478887776553221 12223322222221111100 1222 Q ss_pred e---eecC-c-------C-CcChHHh Q lcl|NC_020854. 302 K---WAVT-T-------T-NPTRAQL 315 (342) Q Consensus 302 s---~~~~-~-------~-sPt~~~L 315 (342) + |.+. + + =||.+-| T Consensus 379 kK~rF~~~~g~~~DfGvi~idta~~~ 404 (404) T protein:vir:81 379 KKIRFPEKSGKMQDHGVIAVDTAVKL 404 (404) T ss_pred hhccccCCCCceeeEEEEEecccccC Confidence 1 2111 0 0 0222222 No 159 >protein:vir:3298 Length: 404 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:66 # MgeName: 933W # Cross-refs: genbank:acc:NP_049514;genbank:gi:9632520;genbank:GeneID:1262006 Probab=97.25 E-value=4.5e-05 Score=44.43 Aligned_cols=293 Identities=14% Similarity=0.120 Sum_probs=141.0 Q ss_pred Cccee-cc-ccchh--HHHHHHHhhh-------------HHh---hhhhhc---CccccchhhhccCCCCEEEccccccC Q lcl|NC_020854. 1 MATLR-SD-IIIPE--VFTPYVIEQT-------------TQR---DAFLAS---GVVQPMTELNATEGGDFINVPFWKAN 57 (342) Q Consensus 1 MaT~~-~d-~i~Pe--v~~~yv~~~~-------------~~~---~~f~~s---g~~~~d~~l~~~~~G~ti~~P~~~~i 57 (342) |+|.- .. ++.-+ .|+....... ..+ ..+.++ ..|+.-.+|.- ..|+.|+++.-.. 
T Consensus 1 ~~~~~~~~a~~~~~~~lft~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K-~aGd~vtf~L~~~- 78 (404) T protein:vir:32 1 MTTVTSAQANKLYQVALFTAANRNRSMVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNK-QAGDEVTFSIMHK- 78 (404) T ss_pred CCCcCCcchhhhHHHHHHHHHhcCChhHhhhhhhhhhhhhhccchhhccCCCCCccEEEeecCCC-CCCcEEEEeEeee- Confidence 55321 11 11111 1222111100 000 011111 22333344443 4799999998775 Q ss_pred CCCcccccCCCceechhhcccceeeeEeeeeccceeech-HHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_020854. 58 LSGDFEVLSDSSSLTPGKITADKQVAAILHRGRAFEARD-LAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGS 136 (342) Q Consensus 58 ~~gda~~~~~~~~i~~~~lt~~~~~a~i~~~~k~~~~tD-~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~ 136 (342) |.|++---++.-+---+.|+..++..+|=...+++.... .++.-+--|...+....+++||++..|..++-.|.|..+. T Consensus 79 L~g~gv~Gd~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~~laG~rg~ 158 (404) T protein:vir:32 79 LSKRPTMGDERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGARGD 158 (404) T ss_pred cccCCcccCceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhccccc Confidence 567653333333334788999999999988888875543 2345566788889999999999999999999988876542 Q ss_pred h-----------------------cccccchhhee-eecccccccccccccHHHHHHHHHHh-------------Ccc-- Q lcl|NC_020854. 137 L-----------------------NANTSSSAFFD-LCIDSESGDTPTALSPRHVAEARAIL-------------GDQ-- 177 (342) Q Consensus 137 ~-----------------------~a~~~~~~~~~-~~~~~~~~~~~~~~~~~~l~~A~~~~-------------GD~-- 177 (342) - .|...+-++.- -.++...=+++..|+.+.+.++.... ||+ T Consensus 159 ~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv~~~g~~~~ 238 (404) T protein:vir:32 159 FVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLSGDELH 238 (404) T ss_pred cccccceeeccccccccceeecccCCCCCCcEEeccCccchhhhhhcccccHHHHHHHHHHHHHhCCCCcceEecccccc Confidence 0 01111111110 00011111234567877777775443 333 Q ss_pred -ccCeEEEEEchHHHHHHHhhh----hhhhhhhh------hcccceeeeccceeecccccccceeeeccceEE------- Q lcl|NC_020854. 
178 -GDKLTAVAMHSKVYYDLVERR----AIDYVSTA------DARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVI------- 239 (342) Q Consensus 178 -~~~~~~ivmhS~v~~~L~~~~----li~~~~~s------~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vv------- 239 (342) .+...+++|||-.+++|+.+- ..+..+.+ +.++++. +..+.|+|+.|. T Consensus 239 ~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~--------------G~~gm~ngvii~~~~~~~I 304 (404) T protein:vir:32 239 GEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFK--------------GECAMWRNILVRKYAGMPI 304 (404) T ss_pred CccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCcee--------------cCeeEEcCEEEEecCCcee Confidence 113689999999999999983 33333322 2233333 345556664443 Q ss_pred ---------EeCCcceec----cCCCcceEEEEEecceeEeecCC-----cceeEeccCCCcceeEEEEeeEEEeeecce Q lcl|NC_020854. 240 ---------VSDDVNTAG----SGGSTEYATYFFTQGAVASGEQM-----AMQTETDRDILAKSDAMSIDLHYVYHPVGA 301 (342) Q Consensus 240 ---------vdD~~p~~~----~~~~~~y~t~l~~~GAi~~~~k~-----~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~ 301 (342) +++.-.-.. .....+-..+++|.=|+.+.-+. +.-.|...|-+.+..+.... +.|+ T Consensus 305 rf~~g~~~~~~~n~~~a~~~~~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~~~i~~~~------i~G~ 378 (404) T protein:vir:32 305 RFYQGSKVLVSENNLTATTKEVAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNRTEIAISW------INGL 378 (404) T ss_pred eecccceeeecCCccccccccccccccchhheeecceeEEEEeeccCCCCceeEeeccccCchhhhhhHH------Hhhh Confidence 222211000 01111124478887776553221 12223322222221111100 1222 Q ss_pred e---eecC-c-------C-CcChHHh Q lcl|NC_020854. 302 K---WAVT-T-------T-NPTRAQL 315 (342) Q Consensus 302 s---~~~~-~-------~-sPt~~~L 315 (342) + |.+. 
+ + =||.+-| T Consensus 379 kK~rF~~~~g~~~DfGvi~idta~~~ 404 (404) T protein:vir:32 379 KKIRFPEKSGKMQDHGVIAVDTAVKL 404 (404) T ss_pred hhccccCCCCceeeEEEEEecccccC Confidence 1 2111 0 0 0222222 No 160 >protein:vir:104439 Length: 404 # NCBI annotation: putative virion structural protein # Family: family:all:974 # MgeID: mge:1471 # MgeName: 86 # Cross-refs: genbank:acc:YP_794063;genbank:gi:116222008;genbank:GeneID:4397504 Probab=97.25 E-value=4.5e-05 Score=44.43 Aligned_cols=293 Identities=14% Similarity=0.120 Sum_probs=141.0 Q ss_pred Cccee-cc-ccchh--HHHHHHHhhh-------------HHh---hhhhhc---CccccchhhhccCCCCEEEccccccC Q lcl|NC_020854. 1 MATLR-SD-IIIPE--VFTPYVIEQT-------------TQR---DAFLAS---GVVQPMTELNATEGGDFINVPFWKAN 57 (342) Q Consensus 1 MaT~~-~d-~i~Pe--v~~~yv~~~~-------------~~~---~~f~~s---g~~~~d~~l~~~~~G~ti~~P~~~~i 57 (342) |+|.- .. ++.-+ .|+....... ..+ ..+.++ ..|+.-.+|.- ..|+.|+++.-.. T Consensus 1 ~~~~~~~~a~~~~~~~lft~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K-~aGd~vtf~L~~~- 78 (404) T protein:vir:10 1 MTTVTSAQANKLYQVALFTAANRNRSMVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNK-QAGDEVTFSIMHK- 78 (404) T ss_pred CCCcCCcchhhhHHHHHHHHHhcCChhHhhhhhhhhhhhhhccchhhccCCCCCccEEEeecCCC-CCCcEEEEeEeee- Confidence 55321 11 11111 1222111100 000 011111 22333344443 4799999998775 Q ss_pred CCCcccccCCCceechhhcccceeeeEeeeeccceeech-HHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_020854. 58 LSGDFEVLSDSSSLTPGKITADKQVAAILHRGRAFEARD-LAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGS 136 (342) Q Consensus 58 ~~gda~~~~~~~~i~~~~lt~~~~~a~i~~~~k~~~~tD-~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~ 136 (342) |.|++---++.-+---+.|+..++..+|=...+++.... .++.-+--|...+....+++||++..|..++-.|.|..+. 
T Consensus 79 L~g~gv~Gd~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~~laG~rg~ 158 (404) T protein:vir:10 79 LSKRPTMGDERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGARGD 158 (404) T ss_pred cccCCcccCceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhccccc Confidence 567653333333334788999999999988888875543 2345566788889999999999999999999988876542 Q ss_pred h-----------------------cccccchhhee-eecccccccccccccHHHHHHHHHHh-------------Ccc-- Q lcl|NC_020854. 137 L-----------------------NANTSSSAFFD-LCIDSESGDTPTALSPRHVAEARAIL-------------GDQ-- 177 (342) Q Consensus 137 ~-----------------------~a~~~~~~~~~-~~~~~~~~~~~~~~~~~~l~~A~~~~-------------GD~-- 177 (342) - .|...+-++.- -.++...=+++..|+.+.+.++.... ||+ T Consensus 159 ~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv~~~g~~~~ 238 (404) T protein:vir:10 159 FVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLSGDELH 238 (404) T ss_pred cccccceeeccccccccceeecccCCCCCCcEEeccCccchhhhhhcccccHHHHHHHHHHHHHhCCCCcceEecccccc Confidence 0 01111111110 00011111234567877777775443 333 Q ss_pred -ccCeEEEEEchHHHHHHHhhh----hhhhhhhh------hcccceeeeccceeecccccccceeeeccceEE------- Q lcl|NC_020854. 178 -GDKLTAVAMHSKVYYDLVERR----AIDYVSTA------DARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVI------- 239 (342) Q Consensus 178 -~~~~~~ivmhS~v~~~L~~~~----li~~~~~s------~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vv------- 239 (342) .+...+++|||-.+++|+.+- ..+..+.+ +.++++. +..+.|+|+.|. T Consensus 239 ~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~--------------G~~gm~ngvii~~~~~~~I 304 (404) T protein:vir:10 239 GEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFK--------------GECAMWRNILVRKYAGMPI 304 (404) T ss_pred CccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCcee--------------cCeeEEcCEEEEecCCcee Confidence 113689999999999999983 33333322 2233333 345556664443 Q ss_pred ---------EeCCcceec----cCCCcceEEEEEecceeEeecCC-----cceeEeccCCCcceeEEEEeeEEEeeecce Q lcl|NC_020854. 
240 ---------VSDDVNTAG----SGGSTEYATYFFTQGAVASGEQM-----AMQTETDRDILAKSDAMSIDLHYVYHPVGA 301 (342) Q Consensus 240 ---------vdD~~p~~~----~~~~~~y~t~l~~~GAi~~~~k~-----~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~ 301 (342) +++.-.-.. .....+-..+++|.=|+.+.-+. +.-.|...|-+.+..+.... +.|+ T Consensus 305 rf~~g~~~~~~~n~~~a~~~~~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~~~i~~~~------i~G~ 378 (404) T protein:vir:10 305 RFYQGSKVLVSENNLTATTKEVAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNRTEIAISW------INGL 378 (404) T ss_pred eecccceeeecCCccccccccccccccchhheeecceeEEEEeeccCCCCceeEeeccccCchhhhhhHH------Hhhh Confidence 222211000 01111124478887776553221 12223322222221111100 1222 Q ss_pred e---eecC-c-------C-CcChHHh Q lcl|NC_020854. 302 K---WAVT-T-------T-NPTRAQL 315 (342) Q Consensus 302 s---~~~~-~-------~-sPt~~~L 315 (342) + |.+. + + =||.+-| T Consensus 379 kK~rF~~~~g~~~DfGvi~idta~~~ 404 (404) T protein:vir:10 379 KKIRFPEKSGKMQDHGVIAVDTAVKL 404 (404) T ss_pred hhccccCCCCceeeEEEEEecccccC Confidence 1 2111 0 0 0222222 No 161 >protein:vir:10123 Length: 404 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:180 # MgeName: Stx2 converting bacteriophage II # Cross-refs: genbank:acc:NP_859253;genbank:gi:32171009;genbank:GeneID:2653345 Probab=97.25 E-value=4.5e-05 Score=44.43 Aligned_cols=293 Identities=14% Similarity=0.120 Sum_probs=141.0 Q ss_pred Cccee-cc-ccchh--HHHHHHHhhh-------------HHh---hhhhhc---CccccchhhhccCCCCEEEccccccC Q lcl|NC_020854. 1 MATLR-SD-IIIPE--VFTPYVIEQT-------------TQR---DAFLAS---GVVQPMTELNATEGGDFINVPFWKAN 57 (342) Q Consensus 1 MaT~~-~d-~i~Pe--v~~~yv~~~~-------------~~~---~~f~~s---g~~~~d~~l~~~~~G~ti~~P~~~~i 57 (342) |+|.- .. ++.-+ .|+....... ..+ ..+.++ ..|+.-.+|.- ..|+.|+++.-.. 
T Consensus 1 ~~~~~~~~a~~~~~~~lft~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~~~~~I~~~~dL~K-~aGd~vtf~L~~~- 78 (404) T protein:vir:10 1 MTTVTSAQANKLYQVALFTAANRNRSMVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNK-QAGDEVTFSIMHK- 78 (404) T ss_pred CCCcCCcchhhhHHHHHHHHHhcCChhHhhhhhhhhhhhhhccchhhccCCCCCccEEEeecCCC-CCCcEEEEeEeee- Confidence 55321 11 11111 1222111100 000 011111 22333344443 4799999998775 Q ss_pred CCCcccccCCCceechhhcccceeeeEeeeeccceeech-HHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhh Q lcl|NC_020854. 58 LSGDFEVLSDSSSLTPGKITADKQVAAILHRGRAFEARD-LAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGS 136 (342) Q Consensus 58 ~~gda~~~~~~~~i~~~~lt~~~~~a~i~~~~k~~~~tD-~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~ 136 (342) |.|++---++.-+---+.|+..++..+|=...+++.... .++.-+--|...+....+++||++..|..++-.|.|..+. T Consensus 79 L~g~gv~Gd~~lEGnee~L~~~s~~i~Idq~r~~V~~~g~msqQRt~~dlr~~ar~~L~~w~~~~~d~~~~~~laG~rg~ 158 (404) T protein:vir:10 79 LSKRPTMGDERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGARGD 158 (404) T ss_pred cccCCcccCceeeccccceeEEeeEEEEeeecccccccCchhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHHhccccc Confidence 567653333333334788999999999988888875543 2345566788889999999999999999999988876542 Q ss_pred h-----------------------cccccchhhee-eecccccccccccccHHHHHHHHHHh-------------Ccc-- Q lcl|NC_020854. 137 L-----------------------NANTSSSAFFD-LCIDSESGDTPTALSPRHVAEARAIL-------------GDQ-- 177 (342) Q Consensus 137 ~-----------------------~a~~~~~~~~~-~~~~~~~~~~~~~~~~~~l~~A~~~~-------------GD~-- 177 (342) - .|...+-++.- -.++...=+++..|+.+.+.++.... ||+ T Consensus 159 ~~n~~~~vp~~~~~~~~~~~~N~v~APt~~r~~~~g~at~~~~l~stD~~s~~~Id~~~~~~~~~~~pi~Pv~~~g~~~~ 238 (404) T protein:vir:10 159 FVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLSGDELH 238 (404) T ss_pred cccccceeeccccccccceeecccCCCCCCcEEeccCccchhhhhhcccccHHHHHHHHHHHHHhCCCCcceEecccccc Confidence 0 01111111110 00011111234567877777775443 333 Q ss_pred -ccCeEEEEEchHHHHHHHhhh----hhhhhhhh------hcccceeeeccceeecccccccceeeeccceEE------- Q lcl|NC_020854. 
178 -GDKLTAVAMHSKVYYDLVERR----AIDYVSTA------DARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVI------- 239 (342) Q Consensus 178 -~~~~~~ivmhS~v~~~L~~~~----li~~~~~s------~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vv------- 239 (342) .+...+++|||-.+++|+.+- ..+..+.+ +.++++. +..+.|+|+.|. T Consensus 239 ~~~~~yV~~~~p~q~~~Lr~dt~~~~w~d~q~~A~a~~rg~~nPlF~--------------G~~gm~ngvii~~~~~~~I 304 (404) T protein:vir:10 239 GEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRAKGFNHPLFK--------------GECAMWRNILVRKYAGMPI 304 (404) T ss_pred CccceEEEEechHHHHHHhhCCCcHHHHHHHHHHhhccccccCCcee--------------cCeeEEcCEEEEecCCcee Confidence 113689999999999999983 33333322 2233333 345556664443 Q ss_pred ---------EeCCcceec----cCCCcceEEEEEecceeEeecCC-----cceeEeccCCCcceeEEEEeeEEEeeecce Q lcl|NC_020854. 240 ---------VSDDVNTAG----SGGSTEYATYFFTQGAVASGEQM-----AMQTETDRDILAKSDAMSIDLHYVYHPVGA 301 (342) Q Consensus 240 ---------vdD~~p~~~----~~~~~~y~t~l~~~GAi~~~~k~-----~~~ve~dr~~~~g~~~l~~r~~y~~~~~G~ 301 (342) +++.-.-.. .....+-..+++|.=|+.+.-+. +.-.|...|-+.+..+.... +.|+ T Consensus 305 rf~~g~~~~~~~n~~~a~~~~~aa~~~v~RallLGaQAl~~A~g~~~g~~~~w~Ee~~D~g~~~~i~~~~------i~G~ 378 (404) T protein:vir:10 305 RFYQGSKVLVSENNLTATTKEVAAATNIDRAMLLGAQALANAYGQKAGGHFNMVEKKTDMDNRTEIAISW------INGL 378 (404) T ss_pred eecccceeeecCCccccccccccccccchhheeecceeEEEEeeccCCCCceeEeeccccCchhhhhhHH------Hhhh Confidence 222211000 01111124478887776553221 12223322222221111100 1222 Q ss_pred e---eecC-c-------C-CcChHHh Q lcl|NC_020854. 302 K---WAVT-T-------T-NPTRAQL 315 (342) Q Consensus 302 s---~~~~-~-------~-sPt~~~L 315 (342) + |.+. 
+ + =||.+-| T Consensus 379 kK~rF~~~~g~~~DfGvi~idta~~~ 404 (404) T protein:vir:10 379 KKIRFPEKSGKMQDHGVIAVDTAVKL 404 (404) T ss_pred hhccccCCCCceeeEEEEEecccccC Confidence 1 2111 0 0 0222222 No 162 >protein:vir:2770 Length: 318 # NCBI annotation: hypothetical protein # Family: family:all:974 # MgeID: mge:59 # MgeName: Stx2 converting bacteriophage I # Cross-refs: genbank:acc:NP_612887;genbank:gi:20065804;genbank:GeneID:935710 Probab=96.97 E-value=0.00023 Score=40.52 Aligned_cols=247 Identities=12% Similarity=0.090 Sum_probs=129.1 Q ss_pred Ccce-eccccchh-------------------HHHHHHHhhhH---HhhhhhhcC---ccccchhhhccCCCCEEEcccc Q lcl|NC_020854. 1 MATL-RSDIIIPE-------------------VFTPYVIEQTT---QRDAFLASG---VVQPMTELNATEGGDFINVPFW 54 (342) Q Consensus 1 MaT~-~~d~i~Pe-------------------v~~~yv~~~~~---~~~~f~~sg---~~~~d~~l~~~~~G~ti~~P~~ 54 (342) |+|. -.| |+ .|+.-+..... .+..|.+++ .|+.-.+|.- ..||.|+++.- T Consensus 1 mt~~~~~~---~~~~~~~~~ft~~~~~~~~vk~ws~~l~~~~~~~~~~~~~~g~~~~~~I~r~~dL~K-~~GD~Vtf~L~ 76 (318) T protein:vir:27 1 MTTVTSAQ---ANKLFQVALFTAANRNRSMVNILTEQQEAPKAVSPDKKSTKQTSAGAPVVRITDLNK-QAGDEVTFSIM 76 (318) T ss_pred CCccCCCC---hHHHHHHHHHHHHhcCChHHHHHHHhhhhHHHhhhhhhcccCCCCCceEEEeccCCC-CCccEEEEeEe Confidence 7743 233 33 12222211111 112333332 2444455543 47999999987 Q ss_pred ccCCCCcccccCCCceechhhcccceeeeEeeeeccceeechH-HHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_020854. 55 KANLSGDFEVLSDSSSLTPGKITADKQVAAILHRGRAFEARDL-AALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGV 133 (342) Q Consensus 55 ~~i~~gda~~~~~~~~i~~~~lt~~~~~a~i~~~~k~~~~tD~-a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~ 133 (342) .. |.|+.---++.-+---+.|+..++..+|=...+++....- ++.-+--|...++...+++||++..|..++-.|-|. 
T Consensus 77 ~~-L~g~gv~Gd~~lEGnee~L~~~~d~l~IDq~r~~V~~gg~msqqRt~~dlR~~ar~~L~~w~~~~~Dq~~~v~laGa 155 (318) T protein:vir:27 77 HK-LSKRPTMGDERVEGRGEDLSHADFSLKINQGRHLVDAGGRMSQQRTKFNLASSARTLLGTYFNDLQDQCAIVHLAGA 155 (318) T ss_pred ec-cccCccccCceeeccccceEEEeeEEEEeeeccccccccchhhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhc Confidence 75 5777633333333447889999999988777777755432 234455688888999999999999999999999777 Q ss_pred Hhh-----------------------hcccccchhhee-eecccccccccccccHHHHHHHHHHh-------------Cc Q lcl|NC_020854. 134 FGS-----------------------LNANTSSSAFFD-LCIDSESGDTPTALSPRHVAEARAIL-------------GD 176 (342) Q Consensus 134 ~~~-----------------------~~a~~~~~~~~~-~~~~~~~~~~~~~~~~~~l~~A~~~~-------------GD 176 (342) .+. ..|..++-++.- -.++...=+++..|+.+.+.++.... || T Consensus 156 rg~~~n~~~~~p~~~~~~~~~~~~N~v~aPt~~r~~~~g~at~~~~l~stD~~s~~lid~~~~~~~~~a~pi~PV~v~g~ 235 (318) T protein:vir:27 156 RGDFVADDTILPTAEHPEFKKIMINDVLPPTHDRHFFGGDATSFEQIEAADIFSIGLVDNLSLFIDEMAHPLQPVRLSGD 235 (318) T ss_pred ccccccccceEecccCccchhhhhcccCCCCCCcEEeccCccchhhhhhcccccHHHHHHHHHHHHHhCCCCcceeeccc Confidence 642 011111111110 00111111344577777776664433 22 Q ss_pred c---ccCeEEEEEchHHHHHHHhhh----hhhhhhhhhcccceeeeccceeecccccccceeeeccceEEEeCCcceecc Q lcl|NC_020854. 177 Q---GDKLTAVAMHSKVYYDLVERR----AIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGS 249 (342) Q Consensus 177 ~---~~~~~~ivmhS~v~~~L~~~~----li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~ 249 (342) + .+...+++|||-.+++|+.+- ..+..+.+.+.. .+..+- .=.+.++.|+|..+.---.+| T Consensus 236 ~~~~~~~~yV~~~~p~q~~~Lrtdt~~~~w~d~q~~A~~r~--~g~knP------LF~G~~gm~ngvil~~~~~vp---- 303 (318) T protein:vir:27 236 ELHGEDPYYVLYVTPRQWNDWYTSTSGKDWNQMMVRAVNRA--KGFNHP------LFKGECAMWRNILVRKYAGMP---- 303 (318) T ss_pred cccCCcceEEEEechHHHHHHhhcCCCHHHHHHHHHHHhcc--cccCCC------ceecceeeecCEEEeecCCcc---- Confidence 2 123689999999999999872 333333222111 000000 001345666664443222222 Q ss_pred CCCcceEEEEEecceeEeecCCcceeEeccCCCcceeEEEEeeE Q lcl|NC_020854. 
250 GGSTEYATYFFTQGAVASGEQMAMQTETDRDILAKSDAMSIDLH 293 (342) Q Consensus 250 ~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~g~~~l~~r~~ 293 (342) |.+. .|.++-++|.. T Consensus 304 ---------------Irf~--------------~G~~v~~~~~~ 318 (318) T protein:vir:27 304 ---------------IRFY--------------QGQRFWYQRIT 318 (318) T ss_pred ---------------EEEc--------------CCCeeeeeecC Confidence 1221 22233333322 No 163 >protein:vir:9643 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:173 # MgeName: 315.1 # Cross-refs: genbank:acc:NP_795405;genbank:gi:28876178;genbank:GeneID:1257724 Probab=96.79 E-value=0.00034 Score=39.63 Aligned_cols=267 Identities=10% Similarity=-0.015 Sum_probs=124.8 Q ss_pred Cc-c--eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh-hc Q lcl|NC_020854. 1 MA-T--LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG-KI 76 (342) Q Consensus 1 Ma-T--~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~-~l 76 (342) +. + .-...++|+-+..-+.+.+.+.+.+.+ ++. ... .+| ...+|.= ..++.+.-+.++..++.+ +. T Consensus 79 ~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~i~~--~~~----v~~-~~~-~~~i~~~--~~~~~a~wv~e~~~~~~~~~~ 148 (377) T protein:vir:96 79 DKNVGGKDKFKLLPEETMVQVFDDLVAEHPLLK--VIN----FKN-TSL-RLKALTA--ETSGTAVWGDIFGEIKGQLKQ 148 (377) T ss_pred HhcCCCCCCceecCHHHHHHHHHHHHhhhhhhh--hce----eEe-cCC-ceEEEEe--cCCcceeEeecccccccccCc Confidence 22 2 233568888887777777777666644 222 111 234 3456742 234666666777766543 22 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHH---------HHHHHHhhhcccccchhhe Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLS---------CLQGVFGSLNANTSSSAFF 147 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla---------~L~g~~~~~~a~~~~~~~~ 147 (342) +-++-....++...-..++..--.-+.-|..+.+.+++++.+++..+..++. +|+.+-+..++........ 
T Consensus 149 ~f~~i~l~~~kl~~~~~is~~ll~ds~~~le~~i~~~l~~~~~~~~~~a~i~G~G~~~P~Gil~~~~~~~~~~~~~~~~~ 228 (377) T protein:vir:96 149 AFKEQDFSQFKLTAFVVIPKDALKFGPKWLKQFITEQLKEAIAVALELAIVKGNGLLQPVGLLKDLSQPTVDQSTGRDIT 228 (377) T ss_pred cceeEeeeeeeEEeechhhHHHhhcchhhHHHHHHHHHHHHHHHHHhhceEeccCCCcceeeeecccccccccccccccc Confidence 2233333333333334444444334566777889999999999988877764 1110000000000000000 Q ss_pred eeecccccccccccccHHHHHHHHHH----h---Cc----cccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeec Q lcl|NC_020854. 148 DLCIDSESGDTPTALSPRHVAEARAI----L---GD----QGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQS 216 (342) Q Consensus 148 ~~~~~~~~~~~~~~~~~~~l~~A~~~----~---GD----~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~ 216 (342) ..............++.+.+.+-+.. + |. ....-.+|+|||.++.+++.+- .|. +++ T Consensus 229 ~~~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~~~~~~--~~~-~~~--------- 296 (377) T protein:vir:96 229 TYKTDKEAIADLSDLDPDTAVELLVPVMKHLSVNDKKHPLKIAGQVKLLLNPEDRWTLEAKF--TSR-NQF--------- 296 (377) T ss_pred ceeeccccccccccCChhHHHHHHHHHHHhhccccccccccccCceEEEEchhhHHhccccc--ccc-CCC--------- Confidence 00000111111223445555443322 1 11 1123457999999988764322 121 111 Q ss_pred cceeecccccccceeeecc--ceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCC--CcceeEEEEee Q lcl|NC_020854. 217 GGSMAAAYGGEVSVPTYMG--LRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDI--LAKSDAMSIDL 292 (342) Q Consensus 217 ~~~~~~~~~~~~~i~~~~G--~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~--~~g~~~l~~r~ 292 (342) +...+.+| .+|+.++.+|.. + .++.--+-..+.....+.+++.+.. ..+++.+.+.. T Consensus 297 -----------G~~~~~l~~p~~v~~s~~~p~~------~--i~fgdf~~Y~i~~r~~~~i~~~~~~~~~~d~~~f~~~~ 357 (377) T protein:vir:96 297 -----------GEYVTVLPHGITILESLAVETG------K--AIAFVANRYDAFMATASTIEEYDQTFAMEDLQLYLTKN 357 (377) T ss_pred -----------CCceeccCCCceEEecCCCCcc------c--EEEEEcCcEEEEEecccEEEeehhhhhhcCCeEEEEEE Confidence 11223344 457777777632 1 1111112244455555666655443 35667777777 Q ss_pred EEEeeecc-eee---ecCcC Q lcl|NC_020854. 
293 HYVYHPVG-AKW---AVTTT 308 (342) Q Consensus 293 ~y~~~~~G-~s~---~~~~~ 308 (342) ++--.|.- -+| +.++- T Consensus 358 r~dG~~~d~~a~~vl~l~~~ 377 (377) T protein:vir:96 358 YFYGKAKDNHTAALLTLAGG 377 (377) T ss_pred EEcCEEecCCcEEEEEEecC Confidence 76555422 222 21211 No 164 >protein:vir:105610 Length: 430 # NCBI annotation: virion structural protein # Family: family:all:974 # MgeID: mge:1540 # MgeName: F116 # Cross-refs: genbank:acc:YP_164307;genbank:gi:56692923;genbank:GeneID:3197221 Probab=96.76 E-value=0.00035 Score=39.50 Aligned_cols=304 Identities=13% Similarity=0.085 Sum_probs=150.4 Q ss_pred Ccceeccc--cchh---HHHHHHHhhhHHh----hhhhhc----------------Cc---cccchhhhccCCCCEEEcc Q lcl|NC_020854. 1 MATLRSDI--IIPE---VFTPYVIEQTTQR----DAFLAS----------------GV---VQPMTELNATEGGDFINVP 52 (342) Q Consensus 1 MaT~~~d~--i~Pe---v~~~yv~~~~~~~----~~f~~s----------------g~---~~~d~~l~~~~~G~ti~~P 52 (342) |++....+ =.|+ +++.-+.....+. ++|... +. |+.-.+|.- ..||.|+++ T Consensus 1 ~~~a~T~~~~~~p~a~~~ws~~l~~~~~k~~~~~~kl~G~~~~~~~~~~~~~~~~ts~~~pI~r~~dL~K-~~GD~Vtf~ 79 (430) T protein:vir:10 1 MTASKTTMRYGDPNAMIQQAAGLFALCQGRNSTLNRLTGKMPSGTSDAEKKTKGQSSLELPIVQAQDLGR-NKGDEVRFH 79 (430) T ss_pred CcceeeecccCChhHHHHHHHHHHHHHhhhhhhHHHhhccccccccchhhhccCCCCCCccEEEeccCCC-CCccEEEEe Confidence 88332222 2333 4444443333222 344331 11 445555643 469999999 Q ss_pred ccccCCCCcccccCCCceechhhcccceeeeEeeeeccceeechH-HHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_020854. 53 FWKANLSGDFEVLSDSSSLTPGKITADKQVAAILHRGRAFEARDL-AALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQ 131 (342) Q Consensus 53 ~~~~i~~gda~~~~~~~~i~~~~lt~~~~~a~i~~~~k~~~~tD~-a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~ 131 (342) .-.. 
|.|++---+..-+---+.|+.+++..+|=...+++....- ++.-+--|...+....+++||++..|..++-.|- T Consensus 80 L~~~-L~g~gv~Gd~~lEGnee~L~~~~d~l~IDq~R~~V~~gg~msqQRt~~dlR~~ar~~L~~w~~~~~Dq~~~v~la 158 (430) T protein:vir:10 80 FVQP-ANAFPIMGSEYAEGKGTGLKIGSDQLRVNQARFPVDLGDVMSQIRNPYDLRRLGRPKAKWFMDAYLDQSMLVHLA 158 (430) T ss_pred Eeec-cccCceecCceeeccccceEEEeeEEEEeeeccccccCCchhhhhhhhHHHHHHHHHHHHHHHHHHHHHHHHHHh Confidence 8875 5777533233233347889999999999888888877643 4555667888899999999999999999999888 Q ss_pred HHHhh-----------------------hcccccchhheeeecc---------cccccccccccHHHHHHHHHHhCcc-- Q lcl|NC_020854. 132 GVFGS-----------------------LNANTSSSAFFDLCID---------SESGDTPTALSPRHVAEARAILGDQ-- 177 (342) Q Consensus 132 g~~~~-----------------------~~a~~~~~~~~~~~~~---------~~~~~~~~~~~~~~l~~A~~~~GD~-- 177 (342) |..+. ..|..++-++.--... ..+-+++..|+.+.+.+|....... T Consensus 159 Garg~~~~~~~~~~~~~~~~~~~~~~N~v~aPt~nrh~~~~G~at~~~~~~~~~~sl~stD~~s~~~id~a~~~a~~~~~ 238 (430) T protein:vir:10 159 GARGNHYNKEWCLPLETHPKLADMLVNRVKAPTKNRHFVASADAITGVAPNAGEYNITTADVLDVDVVDSIATYMDQIEL 238 (430) T ss_pred hhhcccccccccccccCCcchhhhhccccCCCCCceeEeecccccccccccccccchhhhcccCHHHHHHHHHHHHhhCC Confidence 76331 1112222222200000 0011234568888887776544321 Q ss_pred --------cc------CeEEEEEchHHHHHHHhhh-hhhhhhhhhcccceeeeccceeecccccccceeeeccceEEEeC Q lcl|NC_020854. 178 --------GD------KLTAVAMHSKVYYDLVERR-AIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSD 242 (342) Q Consensus 178 --------~~------~~~~ivmhS~v~~~L~~~~-li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD 242 (342) .+ .+.+++|||-.+++|+.+- ..+| +..-......+.. 
...=.+.++.|+|..|.-- T Consensus 239 ~i~Pv~v~gd~~~g~~~~yV~~~~p~q~~~Lr~dt~~~~w-q~~~~a~a~~g~~------nPlF~G~~gm~ngvii~~~- 310 (430) T protein:vir:10 239 PPPPVKFEGDEAAEDSPIRVLLCSPAQYNSFAKQEKFRSW-QAAALARASNAKQ------HPIFRVDAGLWSNTLIIKM- 310 (430) T ss_pred CCcceEeecccccCCccEEEEEechHHHHHHhhCcchHHH-HHHHHHhhccccc------CCceecceeeecCeEEecC- Confidence 12 2589999999999999984 3333 2111101111000 1111367888988766532 Q ss_pred Cccee----------c----------------cCCCcceEEEEEecceeEeecCCc-------ceeEeccCCCcceeEEE Q lcl|NC_020854. 243 DVNTA----------G----------------SGGSTEYATYFFTQGAVASGEQMA-------MQTETDRDILAKSDAMS 289 (342) Q Consensus 243 ~~p~~----------~----------------~~~~~~y~t~l~~~GAi~~~~k~~-------~~ve~dr~~~~g~~~l~ 289 (342) ..|+- . ......-..+++|.-|+.+..+.. .=.|...|-+.+..+.. T Consensus 311 ~~virf~~g~~~~~~a~~~~~~~~~~~~~a~~~~~~~v~RalllGaQA~~~A~g~~~~~g~~f~w~Ee~~D~g~~~~i~~ 390 (430) T protein:vir:10 311 PKPIRFYAGDTIKYCAAYNSEAESSAVVSDSFGNQYAVDRALLLGGQALAQAWAASEHSGMPFFWSEKDMDHGDKLELLI 390 (430) T ss_pred CceeeecCCCccccccCCcccccccccccccccccccchhhhhccchhheeeeeccCCCCcceeeeeeccccCchhhhhh Confidence 11110 0 001112234566666665532221 11222222222111110 Q ss_pred EeeEEEeeeccee---eecC---cCC---------cChHHhcCCcCceeecCcc Q lcl|NC_020854. 290 IDLHYVYHPVGAK---WAVT---TTN---------PTRAQLETVANWSKVYELK 328 (342) Q Consensus 290 ~r~~y~~~~~G~s---~~~~---~~s---------Pt~~~L~~~~NW~~v~d~k 328 (342) . . +.|++ |... 
+.+ ||.+-+- .++| T Consensus 391 ~---~---i~G~kK~rF~~~~~~~~~~~DfGvi~idtaa~~~--------~~~~ 430 (430) T protein:vir:10 391 G---A---ILGCSKIRFAVEATNGLEYTDHGVMAIDTAVKII--------GPRK 430 (430) T ss_pred h---H---HhccceeeecCCCCCCceeeeeEEEEhhhhhhhh--------cCCC Confidence 0 0 12221 2110 000 1111111 1111 No 165 >protein:vir:1781 Length: 221 # NCBI annotation: minor capsid protein # Family: family:all:975 # MgeID: mge:38 # MgeName: P60 # Cross-refs: genbank:acc:NP_570347;genbank:gi:18640506;genbank:GeneID:932719 Probab=96.55 E-value=0.00032 Score=39.72 Aligned_cols=188 Identities=14% Similarity=0.132 Sum_probs=94.4 Q ss_pred ee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccch-hheeeecccccccccccc Q lcl|NC_020854. 85 IL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSS-AFFDLCIDSESGDTPTAL 162 (342) Q Consensus 85 i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~-~~~~~~~~~~~~~~~~~~ 162 (342) |= -.--.+.+.|+-..-+..|.+.++.+|.+...++.+|+.++..+...... .+..+.. ...+.......+.+...+ T Consensus 1 iD~lL~a~~~VdDiD~aqa~~dvr~e~t~e~G~ALA~~~D~~i~~~~~~aA~~-~~p~~~~~~g~~~~~~a~~t~~~~~l 79 (221) T protein:vir:17 1 MDDLLVASQFVYDLDEILAQWNTRSEISKQIGEALAIHYDERIARVLASASIA-AAPVTGQDGGFSVNIGAGNTNNAQAI 79 (221) T ss_pred CCcchhHHHHHHhHHHHHhhhHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhh-cCcccccccCcceeccccccCCHHHH Confidence 11 12234667888888889999999999999999999999998766532211 1111110 001111111111111122 Q ss_pred cHHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 163 SPRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 163 ~~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) ++.|.+|.++|-++. ..=..+++.|..|..|.+..- .++...+ .. ...+.++ +...++.+.|.+|+. 
T Consensus 80 -~dai~~a~~~LdekdVP~~gR~~vv~P~~y~~LL~~~d-~~~~n~d----~~-~s~g~~~----~g~~i~~v~G~~V~~ 148 (221) T protein:vir:17 80 -VDGFFEAAAVLDERSAPMDGRVAVLSPRQYYSLISSVD-TNILNRE----IG-NTQGDMN----TGKGLYVNAGIRIYK 148 (221) T ss_pred -HHHHHHHHHHHhhcCCCCCCCEEEeCcHHHHHHHHhcC-cceeeee----cc-ccccccc----ccceeeeecCcEEEE Confidence 466666766665432 245577889999999986310 0110000 00 0011111 112478899999999 Q ss_pred eCCcceeccCC---------------------CcceEEEEEecceeEee-----cCCccee---EeccCCCcc Q lcl|NC_020854. 241 SDDVNTAGSGG---------------------STEYATYFFTQGAVASG-----EQMAMQT---ETDRDILAK 284 (342) Q Consensus 241 dD~~p~~~~~~---------------------~~~y~t~l~~~GAi~~~-----~k~~~~v---e~dr~~~~g 284 (342) +..+|.....+ ..+-..+++-+-|++.. ..||+.+ -.-|.+.+. T Consensus 149 SnnlP~~~gt~~~~~ag~~~~~~~~~~~yr~~fs~~~glv~~~~Avgtvkl~~~~~~~~~~~~~~~~~~~~~~ 221 (221) T protein:vir:17 149 SNVLASLYGTNLVTDPGDATTSGENNGSYRPAITDRAGLVFHKEAADTVEVLLPPSRPPLVISMFSIRRPDRR 221 (221) T ss_pred eccCCcccccccccCCccccccccccccccccccceEEEEEcchheeeeeeecCCCCCceeeeeeeccCCCCC Confidence 99998632110 00112344445555442 1333221 111222222 No 166 >protein:vir:102335 Length: 312 # NCBI annotation: putative capsid protein # Family: family:all:701 # MgeID: mge:1566 # MgeName: phi CD119 # Cross-refs: genbank:acc:YP_529560;genbank:gi:90592716;genbank:GeneID:3974467 Probab=96.15 E-value=0.00095 Score=37.16 Aligned_cols=267 Identities=14% Similarity=0.148 Sum_probs=127.1 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCce--echhhccc Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSS--LTPGKITA 78 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~--i~~~~lt~ 78 (342) ||.++ --.+.|.+.+.+.+...+ + |+.+..++.+..-.||++|.||...- ....+++-+.. -+...++. 
T Consensus 1 Mantl---~ya~~~~~~LD~~~~~~~-~--s~~l~~~~~~v~~~ggktVkIp~i~~---~gl~DY~R~~g~~~~~g~v~~ 71 (312) T protein:vir:10 1 MANTL---AYGQVLQQGLDKQATQEL-L--TGWMDSNAKQIKYEGGKEVKIGKLST---DGLGDYSRGSANAYVGGDVKF 71 (312) T ss_pred CCcch---hHHHHHHHHHHHHHHhhh-c--cccccCCCceEEEecCcEEEEEeeec---ccccccccccCCccccccccc Confidence 88444 123667777777766644 3 56676665544335799999998763 33345554333 34455666 Q ss_pred ceeeeEee-eeccceeec----hHHH-hhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecc Q lcl|NC_020854. 79 DKQVAAIL-HRGRAFEAR----DLAA-LAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCID 152 (342) Q Consensus 79 ~~~~a~i~-~~~k~~~~t----D~a~-~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~ 152 (342) ..+.-+.- -|+..|.+. |++. .++-++.|++ +..+.-.-.+|+.-++.|-.-.. ..+... +. . T Consensus 72 ~~et~tl~qDR~~~F~vD~mDvDETn~~~s~anv~~e---f~r~~vvPEiDayrfskla~~a~--~~~~~~----~~--~ 140 (312) T protein:vir:10 72 EYETKTMTQDRGRKFTLDAMDVDETNFLVTATTVMGE---FQRLKVIPEIDAYRLSRLATIAI--GIKGDT----NV--E 140 (312) T ss_pred cceeEEeeecccceeeccccchhhHhhHHHHHHHHHH---HHHhhhcchhhHHHHHHHHhhhh--cccccc----cc--c Confidence 66665554 578888777 4443 2334444444 22233333445555554421100 001000 00 0 Q ss_pred cccccccccccHHHHHHHHHHhCccc-cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccccccee Q lcl|NC_020854. 153 SESGDTPTALSPRHVAEARAILGDQG-DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVP 231 (342) Q Consensus 153 ~~~~~~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~ 231 (342) ...+.+ +.--++.|.++..+|-|.. ..-.+++|.|.++.-|++. 
........+ ......+..++ T Consensus 141 ~~~~~T-~~ni~~~i~~~~~~lde~~vp~~rvl~vTp~~~~lLk~~-~~~~~~~~~-------------~~~~~i~~~V~ 205 (312) T protein:vir:10 141 YSYSVN-SSTIINKIKTGIKIIRENGYNGPLVCHLTYDSMFAIEEK-VLEKLTAVT-------------FAQGGIQTQVP 205 (312) T ss_pred cccccC-HHHHHHHHHHHHHHHHHccCCCceEEEeChHHHHHHhhh-hhceecccc-------------cccceeeeeee Confidence 000001 1112455666666676642 2456799999999766653 222211111 01111256678 Q ss_pred eeccceEEEeCCcceeccCCCcceEEEEEecc--------------------------eeEe-ecCCcceeEeccCCCcc Q lcl|NC_020854. 232 TYMGLRVIVSDDVNTAGSGGSTEYATYFFTQG--------------------------AVAS-GEQMAMQTETDRDILAK 284 (342) Q Consensus 232 ~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~G--------------------------Ai~~-~~k~~~~ve~dr~~~~g 284 (342) .+.|.+|+. +| ...=|+.|-|-.| ...+ -.|.+..--..|+.+.+ T Consensus 206 ~iDgv~Ii~---VP-----s~r~~t~~~f~dG~t~~~~~gg~~~~~~ak~INfiiv~~~a~i~~~K~~~~~if~P~~~~~ 277 (312) T protein:vir:10 206 SIDGCALIK---TP-----QNRMYSSILLNDGTTSNQTAGGYLKGTKALDTNFIIAPVDVPLAITKQDKMRIFDPETNQT 277 (312) T ss_pred eecccEEEE---ch-----hhhccceeeeccCcccccccCceeecCcccccceEEeCCceeeceeeeeeeeeeCCCCCCC Confidence 888888873 22 1122333333333 2112 22322111234555554 Q ss_pred ee--EEEEeeEEEeeecc-----e--eeecCcCCc Q lcl|NC_020854. 285 SD--AMSIDLHYVYHPVG-----A--KWAVTTTNP 310 (342) Q Consensus 285 ~~--~l~~r~~y~~~~~G-----~--s~~~~~~sP 310 (342) .+ .+..|..+-+-++. + +.+.+..-- T Consensus 278 ~d~~~~~~R~Y~D~fv~~nk~~~Iyv~~k~a~~~~ 312 (312) T protein:vir:10 278 ANAWSMDYRRYHDLWVTDNKANSVYANFKDAKPVG 312 (312) T ss_pred cceeeeeeeeeeeeeeeccccCeEEEEeecccCCC Confidence 43 56666666666543 2 222211101 No 167 >protein:vir:78920 Length: 290 # NCBI annotation: Cps # Family: family:all:701 # MgeID: mge:1859 # MgeName: A006 # Cross-refs: genbank:acc:YP_001468846;genbank:gi:157325479;genbank:GeneID:5601917 Probab=96.13 E-value=0.00097 Score=37.12 Aligned_cols=267 Identities=10% Similarity=0.063 Sum_probs=134.7 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcccce Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITADK 80 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~~ 80 (342) ||=-.. +.|.+-+.+++.+... ||.+.. +.... .+|++|.||...- ....+++-++...-..++... T Consensus 1 Main~a-----~~~~~~Ld~~~~~~~~---t~~l~~-~~~~~-~ggktVkI~~i~~---~gl~DY~R~~g~~~g~v~~~~ 67 (290) T protein:vir:78 1 MAINYV-----DKYGKELDQKLVFGTY---TNELET-PNLLW-LDAKTFKIQTITT---TGLKAHTRNKGYNEGSASNTN 67 (290) T ss_pred CchhHH-----HHHHHHHHHHHHhhhe---eeeccc-cceee-ccCCEEEEeeecc---CcccccccCCCcccCccccce Confidence 883222 4566666666655443 333332 23333 4799999998652 445567766667667776666 Q ss_pred eeeEee-eeccceeec--hH--HH-hhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccc Q lcl|NC_020854. 81 QVAAIL-HRGRAFEAR--DL--AA-LAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSE 154 (342) Q Consensus 81 ~~a~i~-~~~k~~~~t--D~--a~-~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~ 154 (342) +.-++- .|+.+|.+. |. +. .++-.+.+ +++..+.-+-.+|+..++.|-+-.+. .+ .... . T Consensus 68 et~tl~qdR~~~F~vD~~DvDEt~~~~~~~nv~---~ef~~~~v~PEiDayr~skla~~a~~-----~~-~~~~-----~ 133 (290) T protein:vir:78 68 KSYTIDFDRDVEFFVDVMDVDETGQALSAANVT---KEFNSRHAGPEMDAYRFSKLATAAKT-----NS-NSVA-----E 133 (290) T ss_pred eeEEeeccccceeeccccchhHHhhhhhHHHHH---HHHHHHHhhhhhhHHHHHHHHhhhhc-----cC-cccc-----c Confidence 665543 577777775 43 22 23333433 44444555666677777765332111 11 0100 0 Q ss_pred cccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeec Q lcl|NC_020854. 155 SGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYM 234 (342) Q Consensus 155 ~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~ 234 (342) + .++.--++.|.++..+|-+-...-..++|.|.++.-|.+...| .+..+.... +....+..++.+. 
T Consensus 134 t--~t~~n~~~~i~~~~~~ldevp~~~rvl~vtp~~~~lL~~~~~f--~r~~~~~~~----------~~~~i~~~V~~id 199 (290) T protein:vir:78 134 E--ITKDNVFTKLKAAIRKVKKYGTQNLVMYVSPDVMAALELSDDF--VRAINVQNI----------GPSSIETRITAID 199 (290) T ss_pred c--cCHHHHHHHHHHHHHHHHhcCCCCeEEEECHHHHHHHhhChhh--hcccccccc----------ccccccceeeeec Confidence 0 0111235667777777744334567889999999988776533 222211000 0001245688889 Q ss_pred cceEEEeC---Ccce-----e--ccCCCcceEEEEEec-ceeEeecCCcceeEeccCCCccee--EEEEeeEEEeeecce Q lcl|NC_020854. 235 GLRVIVSD---DVNT-----A--GSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRDILAKSD--AMSIDLHYVYHPVGA 301 (342) Q Consensus 235 G~~VvvdD---~~p~-----~--~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~~~~g~~--~l~~r~~y~~~~~G~ 301 (342) |.+|+..- .+-. . ..++..+..-|++.+ +|..--.|.+..--.+|+.+..-+ .+..|..+-+-++.- T Consensus 200 G~~ii~vps~~r~~t~~~f~~G~~~~~~ak~in~ii~~~~a~i~~~K~~~~~~~~P~~~~~~d~~~~~~r~y~d~~v~~n 279 (290) T protein:vir:78 200 GTRIVEVEAEDRFYDTFDFTDGYKPAAGAKKLNFLLVNKGSVVGGAKHASIYLHAPGSVGQGDGWLYQYRVYHDIFVLDQ 279 (290) T ss_pred CcEEEEecccchhhhhhhhcccccccCCccceeEEEEcCCceeeeeeeeEEEeeCCCCCcCcceeeeeeeeeeeeeeecc Confidence 99887421 1110 0 011223333444443 333333444422223455554443 666777666666543 Q ss_pred ee----ecCcC Q lcl|NC_020854. 302 KW----AVTTT 308 (342) Q Consensus 302 s~----~~~~~ 308 (342) +- .+... T Consensus 280 k~~~i~~~~~~ 290 (290) T protein:vir:78 280 QKDGVIASTEV 290 (290) T ss_pred ccCeeEEEeeC Confidence 22 11111 No 168 >protein:vir:79712 Length: 285 # NCBI annotation: major capsid protein gp34 # Family: family:all:701 # MgeID: mge:1873 # MgeName: LL-H # Cross-refs: genbank:acc:YP_001285883;genbank:gi:148750840;genbank:GeneID:5220414 Probab=95.65 E-value=0.0017 Score=35.80 Aligned_cols=269 Identities=13% Similarity=0.065 Sum_probs=133.7 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhh-ccCCCCEEEccccccCCCCcccccCCCceechhhcccc Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELN-ATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITAD 79 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~-~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~~ 79 (342) ||=-+. +.+.+.+.+++...... +.+........ .-.||++|.+|.... .....+++-+...+.+.++.. T Consensus 1 Main~~-----~k~~~~ld~~~~~~~~~--~~l~~~~n~~~~~~~gak~VkIp~ist--~~gl~dY~R~~g~~~g~v~~~ 71 (285) T protein:vir:79 1 MTVVLD-----SKDLARIDEEYKADSQV--WSYLTGGNGVTQRFRGHNEVRINKLSG--FVDATAYKRGQDNARKTISVG 71 (285) T ss_pred Ccchhh-----HHHHHHHHHHHHHhhhh--hhhcccCCcceeEecCCCEEEEeeecc--cccccccccccCcccccccee Confidence 773222 34556666666555444 33444332221 114799999998742 123445666666778888777 Q ss_pred eeeeEee-eeccceeechHHHhhhcchHHHHHHHHHHH-HHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccccc Q lcl|NC_020854. 80 KQVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVAD-YVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGD 157 (342) Q Consensus 80 ~~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~-~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~ 157 (342) .+.-++- -|+..+.+......-.+.=.++++.+++.+ .-.-.+|+.-++.|-+- +. . . . +. +- T Consensus 72 ~et~tl~~DR~~~f~iD~mDvdEn~~~~~~ni~~ef~~~~vvPEiDayrfskla~~-----a~---~-~---~--~~-~~ 136 (285) T protein:vir:79 72 KETVKLTHEDWFGYDLDQFDMDENGAYTVENVVREHNKMITIPHRDKVAVQKLFDS-----AA---K-K---A--TD-SI 136 (285) T ss_pred eeEEEeeccccceecccccchhhhhhhhHHHHHHHHHhhhhcchhhHHHHHHHHhh-----cc---c-c---c--cc-cc Confidence 6666554 477777665222211222234444433322 22233455555544321 10 0 0 0 00 01 Q ss_pred ccccccHHHHHHHHHHhCccc-cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceeeecc- Q lcl|NC_020854. 158 TPTALSPRHVAEARAILGDQG-DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMG- 235 (342) Q Consensus 158 ~~~~~~~~~l~~A~~~~GD~~-~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G- 235 (342) +... -++.|.+|..+|-|.. ..-.+++|.|.++.-|++...|. +..+...... 
.+ ..+..++.+-| T Consensus 137 T~~n-v~~~i~~~~~~lde~~vp~~rvl~vTp~~~~~Lk~s~~~~--r~~~~~~~~~--~~-------~i~~~V~~lDg~ 204 (285) T protein:vir:79 137 TKDN-ALDAYDTAEAYMFDNEVPGGFVMFVSSAYYTALKQSAAVT--RTFSTDGTMV--IN-------GIDRRVAQLDGG 204 (285) T ss_pred CHHH-HHHHHHHHHHHHHHcCCCCceEEEEChHHHHHHHhhhhhh--eeccccccee--cc-------ceeeeeccccce Confidence 1122 2566777777776643 24568999999999988776443 2222100000 00 11345666777 Q ss_pred ceEEE--eCCcceeccCCCcceEEEEEecc-eeEeecCCcceeEeccCCCccee--EEEEeeEEEeeecceee------e Q lcl|NC_020854. 236 LRVIV--SDDVNTAGSGGSTEYATYFFTQG-AVASGEQMAMQTETDRDILAKSD--AMSIDLHYVYHPVGAKW------A 304 (342) Q Consensus 236 ~~Vvv--dD~~p~~~~~~~~~y~t~l~~~G-Ai~~~~k~~~~ve~dr~~~~g~~--~l~~r~~y~~~~~G~s~------~ 304 (342) .+|+. +|.+...+ ..+-.-|++.+. |..=-.|.+..--.+|..+.+-+ .+..|..+-+-++.-+- . T Consensus 205 v~ii~Vps~r~kt~~---~~k~Infiiv~~~a~i~~~K~~~~~~f~P~~~~~~d~~~~~~R~Y~d~fv~~nk~~~Iy~~~ 281 (285) T protein:vir:79 205 VPIVRVSSDRLKGLG---ITNHVNFILTPLSAIAPIVKYDSVSVIDPSTDRSGNRWTIKGLSYYDAIVLDNAKKGIYVAA 281 (285) T ss_pred eEEEEcchhhccCcC---cchhccEEEecCceeccceeeeeeEeECCCCCCCcceeeeeeeeeeeeeehhhccceeeeee Confidence 56664 34443221 123344555544 44334455544445666555443 66677766666643221 1 Q ss_pred cCcC Q lcl|NC_020854. 305 VTTT 308 (342) Q Consensus 305 ~~~~ 308 (342) .+++ T Consensus 282 ~a~~ 285 (285) T protein:vir:79 282 TAGV 285 (285) T ss_pred cccC Confidence 2233 No 169 >protein:vir:97397 Length: 517 # NCBI annotation: major capsid protein # Family: family:all:11745 # MgeID: mge:1675 # MgeName: Q54 # Cross-refs: genbank:acc:YP_762590;genbank:gi:115304291;genbank:GeneID:5130600 Probab=94.68 E-value=0.0038 Score=33.86 Aligned_cols=260 Identities=11% Similarity=0.069 Sum_probs=102.3 Q ss_pred Cccee-----ccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhh Q lcl|NC_020854. 
1 MATLR-----SDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGK 75 (342) Q Consensus 1 MaT~~-----~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~ 75 (342) +.+.+ ..+..|.-+..-+....... ++++..-. ..+.....+|.-.. ...+..+.+|...+... T Consensus 237 ~~~~~~~~~~~~~~~p~~~~~~i~~~~~~~-----~~i~~~~~----~~~i~~~~~~~~~~--~~~a~~~~eG~~kp~s~ 305 (517) T protein:vir:97 237 WTAELKERGISGMPAPAGILKRIQDAVNDE-----GSLLPFIR----HENLPTLVVGGDNA--LTQGTGHTTGTDKTESN 305 (517) T ss_pred eeeecccccccccccchHHHHHHHHhhhhh-----ccceeeee----eccccceeeecccc--cceeeeeecCCcccccc Confidence 11111 12222322211111111110 11211110 11223344454321 23445567888777778 Q ss_pred cccceeeeEeeeeccceeechHHHhhhcch----HHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeec Q lcl|NC_020854. 76 ITADKQVAAILHRGRAFEARDLAALAAGSD----PMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCI 151 (342) Q Consensus 76 lt~~~~~a~i~~~~k~~~~tD~a~~~~~~d----p~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~ 151 (342) ++-.+.....+..+.-+.++..-..-+.-| -..-|.++++..++++.+..+|. | .. ..........+.. T Consensus 306 ~tf~~~~~~~~~ia~~~~~S~qll~Ds~~dd~~~l~s~i~~~l~~~l~~~ee~a~l~---G---dG-tg~~~~gi~~~a~ 378 (517) T protein:vir:97 306 ITLQTRVLTPQYVYKYIKLPKIVMNSNATDIAGAILTYVMNRLPDMVIMAVNRAIIM---G---GV-TGVSETQIYPVVG 378 (517) T ss_pred cceeeEEeeHhhhhhhhhhhHHHHHHhhhccHHHHHHHHHHHHHHHHHHHHHHHHhc---c---cC-CCccccccccccc Confidence 877777777777666555544322111222 23348888888888888877653 2 11 1110111111110 Q ss_pred ccccccccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccc Q lcl|NC_020854. 152 DSESGDTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVS 229 (342) Q Consensus 152 ~~~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~ 229 (342) .. .+......+.+.+.+..+-+... .-..|+||+.++..|+++ |+++++.+. +.. .+... 
T Consensus 379 ~~---~~~~~~~~~~~~d~i~~l~~a~~~a~~a~~vmn~~t~~~I~kl------KD~~G~Yl~--~~~-------~~~~~ 440 (517) T protein:vir:97 379 DA---WATNVTGTTNIQELLEKLSVATPKAADSTLVIHRNDLAAIRFL------KDKNGNYVF--PVG-------VSNQT 440 (517) T ss_pred cc---ccccccccchHHHHHHHHHHHhhhccCCEEEECHHHHHHHHHh------hcCCCCeec--cCc-------CCccc Confidence 00 11111122333343333333222 235799999999999876 345544332 111 12234 Q ss_pred eeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeE--eccCCCcceeEEEEeeEE---Eeeecceee- Q lcl|NC_020854. 230 VPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTE--TDRDILAKSDAMSIDLHY---VYHPVGAKW- 303 (342) Q Consensus 230 i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve--~dr~~~~g~~~l~~r~~y---~~~~~G~s~- 303 (342) ..+.+|..-++ |.... +. .+..+..|= .+.....+.+. +||.. .++.+..+.+- +.++.-+.+ T Consensus 441 ~~~l~G~~~~~----~~~~~---~~-~~~~~~~~y-~i~~~~g~~~~~~fd~~~--n~~~f~~~~~~~g~i~~~~r~a~~ 509 (517) T protein:vir:97 441 IATHFGFNRLV----QSVAV---DE-KTAVSLSGY-VTNGSRGMEFEQGTILVE--NNKEYLFEMPISGSLEYKGTTAYG 509 (517) T ss_pred ccccCCccccc----ccccc---Cc-eeEeecccc-EEEeecceeeeeeeeccc--CceeEeeeeeeccccccccceEEE Confidence 45555632222 21111 11 111122221 11111112221 22322 23334444332 233333333 Q ss_pred --e--cCc Q lcl|NC_020854. 304 --A--VTT 307 (342) Q Consensus 304 --~--~~~ 307 (342) + .+| T Consensus 510 ~~~p~~~~ 517 (517) T protein:vir:97 510 TYTPPVAG 517 (517) T ss_pred EEcCCCCC Confidence 1 223 No 170 >protein:vir:9509 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:170 # MgeName: phiN315 # Cross-refs: genbank:acc:NP_835556;genbank:gi:30043951;genbank:GeneID:1260537 Probab=94.54 E-value=0.0041 Score=33.65 Aligned_cols=272 Identities=9% Similarity=-0.000 Sum_probs=128.2 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh-hc Q lcl|NC_020854. 
1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG-KI 76 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~-~l 76 (342) |.+ .-...++|+-+.+-+.+.+.+.+.+.+ ++. +.. .+|. ..+|.-. ..+.+.-+.++..++.+ +. T Consensus 76 ~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~i~~--~~~----v~~-~~~~-~~i~~~~--~~~~a~w~~e~~~~~~~~~~ 145 (381) T protein:vir:95 76 INKNVNYKEEKLLPEETIDRIFEDLTTNHPLLA--DLG----IKN-AGLR-LKFLKSE--TSGVAVWGKIYGEIKGQLDA 145 (381) T ss_pred HhcccCCCCceecCHHHHHHHHHHHHhhcccee--hee----eEe-cCcc-eEEEEec--CCcceeeecccccccccccc Confidence 332 234578899888888888877776644 222 211 2354 4677533 23555555665555432 22 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeee----cc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLC----ID 152 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~----~~ 152 (342) +-++-....++.+.-+.++..--.-+..|..+.+.+++++.+++..+..++. | .. .+.-.+.+.++. .. T Consensus 146 ~f~~i~l~~~kl~~~~~is~elL~Ds~~~ie~~i~~~la~~~a~~~~~a~i~---G---~G-~~qP~Gil~~~~~~~~~~ 218 (381) T protein:vir:95 146 AFSEETAIQNKLTAFVVLPKDLNDFGPAWIERFVRVQIEEAFAVALETAFLK---G---TG-KDQPIGLNRQVQKGVSVT 218 (381) T ss_pred cceeeeecceeEEeechhhHHHhhcCHHHHHHHHHHHHHHHHHHHhhheeEe---c---cC-CCCceeeeeccCcccccc Confidence 3233333334444444444443333455777789999999998887766653 1 11 000000000000 00 Q ss_pred ----------cccccccccccHHHHHHHHHHhC---c----cccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeee Q lcl|NC_020854. 153 ----------SESGDTPTALSPRHVAEARAILG---D----QGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQ 215 (342) Q Consensus 153 ----------~~~~~~~~~~~~~~l~~A~~~~G---D----~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~ 215 (342) ............+.|.+....+. . ....-..|+||+.++.+|+++.. + +++++. 
T Consensus 219 ~g~~~~~~~~~t~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~l~~~~~--~-~~~~G~------ 289 (381) T protein:vir:95 219 EGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYT--H-LNANGV------ 289 (381) T ss_pred cccccccccccccccccchhhHHHHHHHHHhhccccccccccccCceEEEEccccHHhhccccc--c-CCCCCc------ Confidence 00000111111233333333222 1 12234578999999999886542 1 111111 Q ss_pred ccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccC--CCcceeEEEEee Q lcl|NC_020854. 216 SGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRD--ILAKSDAMSIDL 292 (342) Q Consensus 216 ~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~--~~~g~~~l~~r~ 292 (342) ++...+ .|.+|+.++.||-. + .+|+. .-..+.....+.+++.+. ...+++.+.+.. T Consensus 290 --------~v~~l~----~g~~vv~s~~~p~~------~---iifgDfs~Y~i~~r~~~~i~~~~~~~~~~d~~~f~a~~ 348 (381) T protein:vir:95 290 --------YVTALP----FNLNVIESTVQEAG------K---VLTYVKGLYDGYLAGGINVQKFKETLALDDMDLYTAKQ 348 (381) T ss_pred --------eeecCC----CCceEEecCCCCcC------c---EEEEecccEEEEEecccEEEeechhHhhcCCeEEEEEE Confidence 011111 36778999988731 1 22221 123344555555555443 456777888888 Q ss_pred EEEeeec---ceee---ecCcCCcChHHhcCCcC Q lcl|NC_020854. 293 HYVYHPV---GAKW---AVTTTNPTRAQLETVAN 320 (342) Q Consensus 293 ~y~~~~~---G~s~---~~~~~sPt~~~L~~~~N 320 (342) ++-..|. .+.. +..+..|+- +...-+- T Consensus 349 r~dg~~~~~~A~~v~~l~~~~~~~~~-~~~~~~~ 381 (381) T protein:vir:95 349 FAYGKAKDNKVAAVWKLDLKGHKPAL-EGTEETL 381 (381) T ss_pred EEcCEEecCceEEEEEEEecCCCcCc-ccccccC Confidence 8766553 2222 222222221 1111111 No 171 >protein:vir:101291 Length: 381 # NCBI annotation: hypothetical protein # Family: family:all:635 # MgeID: mge:1591 # MgeName: phiNM3 # Cross-refs: genbank:acc:YP_908831;genbank:gi:118725095;genbank:GeneID:4555862 Probab=94.54 E-value=0.0041 Score=33.65 Aligned_cols=272 Identities=9% Similarity=-0.000 Sum_probs=128.2 Q ss_pred Ccc---eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh-hc Q lcl|NC_020854. 
1 MAT---LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG-KI 76 (342) Q Consensus 1 MaT---~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~-~l 76 (342) |.+ .-...++|+-+.+-+.+.+.+.+.+.+ ++. +.. .+|. ..+|.-. ..+.+.-+.++..++.+ +. T Consensus 76 ~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~i~~--~~~----v~~-~~~~-~~i~~~~--~~~~a~w~~e~~~~~~~~~~ 145 (381) T protein:vir:10 76 INKNVNYKEEKLLPEETIDRIFEDLTTNHPLLA--DLG----IKN-AGLR-LKFLKSE--TSGVAVWGKIYGEIKGQLDA 145 (381) T ss_pred HhcccCCCCceecCHHHHHHHHHHHHhhcccee--hee----eEe-cCcc-eEEEEec--CCcceeeecccccccccccc Confidence 332 234578899888888888877776644 222 211 2354 4677533 23555555665555432 22 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeee----cc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLC----ID 152 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~----~~ 152 (342) +-++-....++.+.-+.++..--.-+..|..+.+.+++++.+++..+..++. | .. .+.-.+.+.++. .. T Consensus 146 ~f~~i~l~~~kl~~~~~is~elL~Ds~~~ie~~i~~~la~~~a~~~~~a~i~---G---~G-~~qP~Gil~~~~~~~~~~ 218 (381) T protein:vir:10 146 AFSEETAIQNKLTAFVVLPKDLNDFGPAWIERFVRVQIEEAFAVALETAFLK---G---TG-KDQPIGLNRQVQKGVSVT 218 (381) T ss_pred cceeeeecceeEEeechhhHHHhhcCHHHHHHHHHHHHHHHHHHHhhheeEe---c---cC-CCCceeeeeccCcccccc Confidence 3233333334444444444443333455777789999999998887766653 1 11 000000000000 00 Q ss_pred ----------cccccccccccHHHHHHHHHHhC---c----cccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeee Q lcl|NC_020854. 153 ----------SESGDTPTALSPRHVAEARAILG---D----QGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQ 215 (342) Q Consensus 153 ----------~~~~~~~~~~~~~~l~~A~~~~G---D----~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~ 215 (342) ............+.|.+....+. . ....-..|+||+.++.+|+++.. + +++++. 
T Consensus 219 ~g~~~~~~~~~t~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~a~~~mn~~t~~~l~~~~~--~-~~~~G~------ 289 (381) T protein:vir:10 219 EGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYT--H-LNANGV------ 289 (381) T ss_pred cccccccccccccccccchhhHHHHHHHHHhhccccccccccccCceEEEEccccHHhhccccc--c-CCCCCc------ Confidence 00000111111233333333222 1 12234578999999999886542 1 111111 Q ss_pred ccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccC--CCcceeEEEEee Q lcl|NC_020854. 216 SGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRD--ILAKSDAMSIDL 292 (342) Q Consensus 216 ~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~--~~~g~~~l~~r~ 292 (342) ++...+ .|.+|+.++.||-. + .+|+. .-..+.....+.+++.+. ...+++.+.+.. T Consensus 290 --------~v~~l~----~g~~vv~s~~~p~~------~---iifgDfs~Y~i~~r~~~~i~~~~~~~~~~d~~~f~a~~ 348 (381) T protein:vir:10 290 --------YVTALP----FNLNVIESTVQEAG------K---VLTYVKGLYDGYLAGGINVQKFKETLALDDMDLYTAKQ 348 (381) T ss_pred --------eeecCC----CCceEEecCCCCcC------c---EEEEecccEEEEEecccEEEeechhHhhcCCeEEEEEE Confidence 011111 36778999988731 1 22221 123344555555555443 456777888888 Q ss_pred EEEeeec---ceee---ecCcCCcChHHhcCCcC Q lcl|NC_020854. 293 HYVYHPV---GAKW---AVTTTNPTRAQLETVAN 320 (342) Q Consensus 293 ~y~~~~~---G~s~---~~~~~sPt~~~L~~~~N 320 (342) ++-..|. .+.. +..+..|+- +...-+- T Consensus 349 r~dg~~~~~~A~~v~~l~~~~~~~~~-~~~~~~~ 381 (381) T protein:vir:10 349 FAYGKAKDNKVAAVWKLDLKGHKPAL-EGTEETL 381 (381) T ss_pred EEcCEEecCceEEEEEEEecCCCcCc-ccccccC Confidence 8766553 2222 222222221 1111111 No 172 >protein:vir:95963 Length: 395 # NCBI annotation: ORF009 # Family: family:all:635 # MgeID: mge:1594 # MgeName: 2638A # Cross-refs: genbank:acc:YP_239802;genbank:gi:66395459;genbank:GeneID:5132880 Probab=94.37 E-value=0.0046 Score=33.39 Aligned_cols=274 Identities=10% Similarity=0.042 Sum_probs=125.0 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh-hc Q lcl|NC_020854. 
1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG-KI 76 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~-~l 76 (342) |. +.-.-.++|+.+.+-+.+.+.+.+.+.+ ++.. . ..+|. ..+|.... .+.+.-+.+...++.+ +. T Consensus 86 ~~~~t~~~gG~liP~~~~~~Ii~~l~~~s~i~~--~~~v----~-~~~~~-~~i~~~~~--~~~a~w~~e~~~~~~~~~~ 155 (395) T protein:vir:95 86 INYDVGYTDEKILPETVVERVFDDLQKDHPLLS--KINF----Q-NAGIK-TRVIKADP--AGQAVWGKVFGEIKGQLDA 155 (395) T ss_pred HhhccCCCCceeccHHHHHHHHHHHHhhhhhhh--hcee----E-ecCCc-eEEEEecC--CcceEEeecccccCccccc Confidence 22 1223457888888888888888777755 2221 1 12353 57886542 3555444454555432 22 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHH-------HHHHHhhhcccccchhheee Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSC-------LQGVFGSLNANTSSSAFFDL 149 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~-------L~g~~~~~~a~~~~~~~~~~ 149 (342) +-++-....++...-+.++..--.-+.-|-.+.+.+++++.+++..++.++.- =+|++.......... .... T Consensus 156 ~f~~i~l~~~kl~~~~~iS~ell~ds~~~ie~~i~~~la~~ia~~~~~a~i~G~G~~~~qP~Gil~~~~~~~~~~-~~~~ 234 (395) T protein:vir:95 156 AFREENFTQYKLTCFVVLPDDLSTFGPAWIERFVRTQIQEAISVALESAIINGGGAAKTQPVGLMKDVNTNSGAV-TDKA 234 (395) T ss_pred cceeeeeceeeEEEeecccHHHHhcchhHHHHHHHHHHHHHHHHHHhhheeeccCCCCcCceeeeeccccccccc-cccc Confidence 22232333334443344544443334556677899999999999888766530 012221100000000 0000 Q ss_pred ecccccccccccccHHHHHHHHHHhC-------ccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeec Q lcl|NC_020854. 150 CIDSESGDTPTALSPRHVAEARAILG-------DQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAA 222 (342) Q Consensus 150 ~~~~~~~~~~~~~~~~~l~~A~~~~G-------D~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~ 222 (342) . ..........+....+.++...+. .....-..|+||+..+.+++.+-+ |. ++. 
T Consensus 235 ~-~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~mn~~t~~~~~g~~~--~~-~~~--------------- 295 (395) T protein:vir:95 235 S-SGTLTFADADTTILELNDVLKNLSVDEKGKELKIDGKVALVVNPRDSWDVQARYT--YL-TAN--------------- 295 (395) T ss_pred c-cchhhhhhhHhhHHHHHHHHHhhccccccchhhhcCceEEEEcchhhhhcCCcce--ec-cCC--------------- Confidence 0 000001111222333333332221 122344578999998876653321 11 111 Q ss_pred ccccccceeeec--cceEEEeCCcceeccCCCcceEEEEEecce-eEeecCCcceeEeccCC--CcceeEEEEeeEEEee Q lcl|NC_020854. 223 AYGGEVSVPTYM--GLRVIVSDDVNTAGSGGSTEYATYFFTQGA-VASGEQMAMQTETDRDI--LAKSDAMSIDLHYVYH 297 (342) Q Consensus 223 ~~~~~~~i~~~~--G~~VvvdD~~p~~~~~~~~~y~t~l~~~GA-i~~~~k~~~~ve~dr~~--~~g~~~l~~r~~y~~~ 297 (342) +...+.+ |.+|++++.||-. .++|+.=. ..++....+.+++.+.. ..+++.+++..++... T Consensus 296 -----G~~~~~lg~g~~v~~~~~~p~~---------~i~fgdfs~y~i~~r~~~~i~~~~~~~~~~d~~~f~~~~r~dg~ 361 (395) T protein:vir:95 296 -----GGFVTVLPYNVTIITSEFVPEG---------KLVAFVTDRYNAVRGGGLTVKKFDQTLALEDAVLFTAKTFAYGQ 361 (395) T ss_pred -----CcceeccCCcceEEEcCCCCCC---------cEEEEecccEEEEEecceEEEeccchhhhCCcEEEEEEEEECCE Confidence 1122333 6778999998731 12333211 22334444555554443 4567778888777666 Q ss_pred ecc---eee-----ecCcC----CcC-hHHhcCC Q lcl|NC_020854. 298 PVG---AKW-----AVTTT----NPT-RAQLETV 318 (342) Q Consensus 298 ~~G---~s~-----~~~~~----sPt-~~~L~~~ 318 (342) |.- +.. .++.. +|. ..=.+.+ T Consensus 362 ~~~~~A~~~l~i~~~~~~~~~~~~~~~~~~~~~~ 395 (395) T protein:vir:95 362 PDDNKASAVYDLKVASAPRRQTSAGGTTDGIAEA 395 (395) T ss_pred EeccccEEEEEeeccCCCCCCCCCCCCCCccccC Confidence 543 222 11111 111 1111111 No 173 >protein:vir:95451 Length: 313 # NCBI annotation: hypothetical protein ORF044 # Family: family:all:11728 # MgeID: mge:1570 # MgeName: PA11 # Cross-refs: genbank:acc:YP_001294637;genbank:gi:149408203;genbank:GeneID:5237018 Probab=94.36 E-value=0.0046 Score=33.38 Aligned_cols=290 Identities=13% Similarity=0.146 Sum_probs=158.5 Q ss_pred Cc-c-eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhccc Q lcl|NC_020854. 
1 MA-T-LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITA 78 (342) Q Consensus 1 Ma-T-~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~ 78 (342) |- | ...-+|..|.+++.+...+-++ |..-.+...-..+. -|+++++|-.. +..-+.-.|.++.+.+.|.+ T Consensus 1 ~~~TSNT~A~I~SE~~s~~I~~~LH~~--LL~~~~~R~V~DF~---~G~~L~I~tiG---s~~~~~~~E~~~~~~~~i~T 72 (313) T protein:vir:95 1 MQLTSNTRAFIESEQYSKFILLNLHDG--LLPETFYRNVSDFG---SGETLHIKTIG---SVTLQEAEEDTPLIYNPIET 72 (313) T ss_pred CcccccchheehhhhHHHHHHHHhhcc--ccchhhhhhhccCC---CCCEEEecccC---ceeeeccccCCCeeeccccc Confidence 65 3 3456799999999987666543 32222222222222 39999999753 34445667888999999999 Q ss_pred ceeeeEee-eeccceeechHHHhh--hcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeeccccc Q lcl|NC_020854. 79 DKQVAAIL-HRGRAFEARDLAALA--AGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSES 155 (342) Q Consensus 79 ~~~~a~i~-~~~k~~~~tD~a~~~--~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~ 155 (342) ++-.-.+. +.|.+|-++|--..- .-.+-|++.+...+.++...+|-++|+.=..-|+.+.....-.-+-...+++ T Consensus 73 GEIt~~i~~Y~G~A~~vt~~LR~D~~~I~~~~A~~~AE~~RAI~E~~~TD~L~~G~~~FA~~~~P~~vNG~PH~~V~~-- 150 (313) T protein:vir:95 73 GEITFQITEYKGDAWYVTDDLREDGTDIDRLMAERAAESTRAIQETFETDFLKTGAEYFAANPGPHNVNGFPHVIVSA-- 150 (313) T ss_pred ceEEEEEEeecCChhhhhhhhhhcchhHHHHhhhcchhhHHHHHHHHhhHHHhhchhhhccCCCCcccccccceEEec-- Confidence 99877665 789999887765432 2457777777788999999999999998777786543332222222223333 Q ss_pred ccccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhh-hhhhhcccceeeeccceeecccccccceee Q lcl|NC_020854. 156 GDTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDY-VSTADARGTSTTQSGGSMAAAYGGEVSVPT 232 (342) Q Consensus 156 ~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~-~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~ 232 (342) .+..++..+.|..-...|--..- .=.+.++.|.+.+.|.-.--|.. + ++ ..-.+..-++.++.+ -+-. 
T Consensus 151 -~T~~~~~~~~~~~~~~~~~~a~~P~~G~v~IvDP~~~~~L~~l~~It~~v--t~--~~k~I~ESG~A~~~~----Fi~~ 221 (313) T protein:vir:95 151 -ETNGVFALKHLIAMRLAFDKANVPAEGRVFIVDPVAEATLNGLVTITHDV--TD--FGKMILESGMARGQR----FIMN 221 (313) T ss_pred -cCCceehhhHHHHhhhhhhhccCCccceEEEEcchhhhhhhhhheeeccc--cc--ccceeeeccCCchhH----HHHH Confidence 34567888888888777754332 34567899999999875432211 1 11 111111112222211 1223 Q ss_pred eccceEEEeCCccee-----ccCCCcce-EEEEEe--cce--eEeecCCcceeEecc--CCCcceeEEEEeeEEEeee-- Q lcl|NC_020854. 233 YMGLRVIVSDDVNTA-----GSGGSTEY-ATYFFT--QGA--VASGEQMAMQTETDR--DILAKSDAMSIDLHYVYHP-- 298 (342) Q Consensus 233 ~~G~~VvvdD~~p~~-----~~~~~~~y-~t~l~~--~GA--i~~~~k~~~~ve~dr--~~~~g~~~l~~r~~y~~~~-- 298 (342) .-|+-+++++.+.+. .+.+.+++ .-|.-. -|- |...-++-+..|..| +...-++....||-+++.- T Consensus 222 ~YG~Di~~SN~L~~AN~~D~~tT~~G~~~NlFM~i~D~~~~P~~~AWr~MP~s~~~~~~~~~~~~~~~~~R~G~Gi~R~~ 301 (313) T protein:vir:95 222 LYGWDILTSNRLHVANYNDGTTTGNGYVGNLFMCILDDQTKPIMGAWRRMPKSEGERNKDRARDEHVVRCRYGFGIQRLD 301 (313) T ss_pred HhhhhhhhhhhhhhccccccccccCceeeeeeeeeecccccceeeeeccccccccccccccccccceeeeeecccceeec Confidence 347777777655432 11122222 111110 110 111112222333333 3444556666776665542 Q ss_pred -cceeeecCcCC Q lcl|NC_020854. 299 -VGAKWAVTTTN 309 (342) Q Consensus 299 -~G~s~~~~~~s 309 (342) .|.--+++... T Consensus 302 ~L~~~~~~A~~~ 313 (313) T protein:vir:95 302 TLGLLATSATAY 313 (313) T ss_pred ceeEEEeccccC Confidence 22222222222 No 174 >protein:vir:94933 Length: 330 # NCBI annotation: putative phage structural protein # Family: family:all:1120 # MgeID: mge:1538 # MgeName: Xp15 # Cross-refs: genbank:acc:YP_239278;genbank:gi:66392060;genbank:GeneID:5076578 Probab=93.50 E-value=0.0073 Score=32.28 Aligned_cols=281 Identities=15% Similarity=0.158 Sum_probs=116.1 Q ss_pred Ccc-eeccc--cchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhcc Q lcl|NC_020854. 
1 MAT-LRSDI--IIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKIT 77 (342) Q Consensus 1 MaT-~~~d~--i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt 77 (342) |++ ++++. ..|.-+..-|.+.+.+.+.+.+ .+ +-.... |...+.+.=+-+ +.+.-.+-+..+++++=. T Consensus 25 m~alTLaea~~l~~d~~~~~VIE~l~~~s~iL~--~l-pf~~ve----~~~~~~~r~~~l--p~a~~r~~n~~~~~~~~~ 95 (330) T protein:vir:94 25 MPTVTLAESAKLSQDHLVSGLIETIVEVNPLYE--MM-PFTEIE----GNALAYNRENVL--GDVQFLAVGGTITAKNPA 95 (330) T ss_pred hhhhhhhHHhhcCchhhHHHHHHhhhccchHHh--hc-cccccc----CCcceeeeeecC--CcceeeeccccccccCcc Confidence 773 44442 4566666666666666665543 11 001111 111111211112 222111222223332212 Q ss_pred cceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHH---HHHHHHHHHHHHH------HHHHHHhhhcccccchhhee Q lcl|NC_020854. 78 ADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVA---DYVANQRQKDLLS------CLQGVFGSLNANTSSSAFFD 148 (342) Q Consensus 78 ~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia---~~~~~~~~~~lla------~L~g~~~~~~a~~~~~~~~~ 148 (342) +..+.---.+...++..=|....-.+++|++....|+. +...++.+..+|. ...|+..... .. T Consensus 96 Tf~q~t~~l~~l~~~~~Vd~~iadl~g~~~d~~~~q~~~~ieal~~~~e~~linGDs~~~~F~GL~~~~~--------~~ 167 (330) T protein:vir:94 96 TFTKVTSELTTLIGDAEVNGLIQATRSDFMDQTSVQVASKAKSIGRQYQASMITGDGTGNSFQGMMGLVA--------AS 167 (330) T ss_pred eeeeeeechhhhhhhHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHHHHHHhhccCCCCccccchhhcCC--------cc Confidence 22222222333333333333322247788888777665 4566666666654 0112211100 01 Q ss_pred eecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeeccccccc Q lcl|NC_020854. 149 LCIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEV 228 (342) Q Consensus 149 ~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~ 228 (342) ....+ +.+.+.++.+.|.+.+.+.=.....-..|+||.+...+++... +.....++.- ..-. .-.. 
T Consensus 168 q~i~t--g~~gg~~T~d~LDeLl~~v~~~~g~~~~~l~n~a~~r~I~a~~-----R~~~~~~v~~-----~~~~--~~G~ 233 (330) T protein:vir:94 168 QTISA--GANGGTLTFELLDQLLDLVKDKDGQVDYLMSSFAMRRKYFSLL-----RALGGAAIGE-----VMTL--PSGR 233 (330) T ss_pred cEEec--CCCCCCCCHHHHHHHHHHhcCCCCCCcEEEechhHHHHHHHHH-----HhccCCCCCC-----cccc--cCCC Confidence 11111 2234567788888877765443445668888777666665431 1110000000 0000 0125 Q ss_pred ceeeeccceEEEeCCcceeccC-C-CcceEEEEEec-------ceeEeecCCcceeEeccCCCc----ceeEEEEeeEEE Q lcl|NC_020854. 229 SVPTYMGLRVIVSDDVNTAGSG-G-STEYATYFFTQ-------GAVASGEQMAMQTETDRDILA----KSDAMSIDLHYV 295 (342) Q Consensus 229 ~i~~~~G~~VvvdD~~p~~~~~-~-~~~y~t~l~~~-------GAi~~~~k~~~~ve~dr~~~~----g~~~l~~r~~y~ 295 (342) .+.+|+|.+|+..|..|...+. + .+.-..|.+.- |-.++..+..+.+ .-|+.+. .....-.++.+. T Consensus 234 ~v~~~~GvPi~~~d~ip~~~~~~~~~~ttsIyav~~G~~~~~qgV~Gl~~~g~~gl-sVr~~G~~~~k~v~~~~v~~y~~ 312 (330) T protein:vir:94 234 QIPTYRGVPWFVNDFIPSNMTQGTATNATAIFAGTFDDGSNKYGIAGLTARGSAGL-RVQNVGAKENADETITRVKMYCG 312 (330) T ss_pred EEeeeCCeEEEecccccCCCCcccCCCceeEEEEeecccccccceEeecCCCCCcc-eeeeCCCccccceeeEEEEEeee Confidence 6899999999999999886432 1 12223354442 3345543322221 1233331 112122223232 Q ss_pred eeecceeeecCcCCcChHHhcCC Q lcl|NC_020854. 296 YHPVGAKWAVTTTNPTRAQLETV 318 (342) Q Consensus 296 ~~~~G~s~~~~~~sPt~~~L~~~ 318 (342) +.++= .+.. --....+.| T Consensus 313 ~av~~-~~a~----~~L~~V~~g 330 (330) T protein:vir:94 313 FANFS-QLGL----AAIKGLIPG 330 (330) T ss_pred eEEec-hhhe----eeeccccCC Confidence 22210 0000 001111111 No 175 >protein:vir:97255 Length: 310 # NCBI annotation: hypothetical protein ORF017 # Family: family:all:1120 # MgeID: mge:1657 # MgeName: M6 # Cross-refs: genbank:acc:YP_001294525;genbank:gi:149408246;genbank:GeneID:5237120 Probab=93.21 E-value=0.0084 Score=31.97 Aligned_cols=278 Identities=14% Similarity=0.132 Sum_probs=115.3 Q ss_pred Cc-ceecc--ccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCC-Cccc--ccC--CC-cee Q lcl|NC_020854. 
1 MA-TLRSD--IIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLS-GDFE--VLS--DS-SSL 71 (342) Q Consensus 1 Ma-T~~~d--~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~-gda~--~~~--~~-~~i 71 (342) |+ -++++ ...+.-+..-|.+.+.+.+.+.+ .+ +-..+.+ |. +.|+..-. +++. .++ .+ ... T Consensus 1 mpaltLaea~k~~~d~l~~~ViE~~~~~s~lL~--~L-pF~~veg---~~----~~ynR~~~~~~~~~~~v~~~~~~~g~ 70 (310) T protein:vir:97 1 MASVTLAESAKLAQDELVAGVIENIITVNRMFD--VL-PFDSIEG---NS----LAYNRENVLGDVIMAGVGTTFSGAGA 70 (310) T ss_pred CcccchHHHhhcCcchHHHHHHHHHhccchHHH--hC-CcccccC---Cc----ceeeEeeccCCcccccccccccCCCc Confidence 88 34433 35566666666666666665543 11 0001111 11 11111100 0000 000 00 111 Q ss_pred chhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHH---HHHHHHHHHHHHH------HHHHHHhhhccccc Q lcl|NC_020854. 72 TPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVA---DYVANQRQKDLLS------CLQGVFGSLNANTS 142 (342) Q Consensus 72 ~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia---~~~~~~~~~~lla------~L~g~~~~~~a~~~ 142 (342) .+..-+.........-.+....+.-.-+.+-.++|+.+++.|+. ++..++.+..+|. .-.|+... .+. T Consensus 71 ~~~~~t~~~~~~~L~i~~g~~~Vd~~i~dl~~~~~~dq~~~Ql~~~iea~~~~~e~~lINGD~a~n~F~GL~~~--~~~- 147 (310) T protein:vir:97 71 GKAAATFTKVNSNLTTIMGDAEVNGLIQATRSGDGNDQTAVQIASKAKSAGRKYQDQLINGNGAGNEFAGLIQL--CAS- 147 (310) T ss_pred cccccccceeeeeeeeeeehhhhhhHHHhhhcCChHHHHHHHHHHHHHHHHHHHHHHhhccccCCCcccchhhc--CCc- Confidence 12222222222222333333333322223334778888877765 6778888888876 11111111 000 Q ss_pred chhheeeecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeec Q lcl|NC_020854. 143 SSAFFDLCIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAA 222 (342) Q Consensus 143 ~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~ 222 (342) . ..... ..+.+.++.+.|.+.+.+.=+.......++|||+.+.+++... +.....+++-.. .. 
T Consensus 148 -~----q~i~~--~~~gg~~t~d~LDeLl~~v~~~~g~p~~~l~~~~~~r~i~A~~-----R~~~~~g~~~~~-----~~ 210 (310) T protein:vir:97 148 -G----QKATT--GATGSAISFAILDELMDLVVDKDGQVDYLTMHARTLRSYKALL-----RALGGASINEVV-----EL 210 (310) T ss_pred -c----ceeec--CCCCCCCCHHHHHHHHHHHhcCCCCCCEEEecHHHHHHHHHHH-----HHhcCCCCCCcc-----cc Confidence 0 01111 1123456778887777765444455678999998655554321 111111111000 00 Q ss_pred ccccccceeeeccceEEEeCCcceeccC--CCcceEEEEEecc-------eeEee-cCCc-ceeEeccCC--CcceeEEE Q lcl|NC_020854. 223 AYGGEVSVPTYMGLRVIVSDDVNTAGSG--GSTEYATYFFTQG-------AVASG-EQMA-MQTETDRDI--LAKSDAMS 289 (342) Q Consensus 223 ~~~~~~~i~~~~G~~VvvdD~~p~~~~~--~~~~y~t~l~~~G-------Ai~~~-~k~~-~~ve~dr~~--~~g~~~l~ 289 (342) ....++.+|+|.+|+..|..|+..+. ..+.-..|.+.-| -+++. .+.| +.+ ++... .+...... T Consensus 211 --~~G~~v~~~~GiPi~~~d~ip~~~~~~~~~gtTsIya~r~Ge~~~~~Gv~Gl~~~~~~glsV-r~~G~~~~~~v~~~~ 287 (310) T protein:vir:97 211 --PSGAEVPAYSGTPIFRNDYIPTNQTKGGTTGCTTIFAGTLDDGSRTHGIAGLTATQAAGIQV-VDVGESEDSDEHIWR 287 (310) T ss_pred --CCCCEEeeeCCeEEEEeCccCCCccccccCCceeEEEEeeCccccccceeccccCCccceeE-EeCCcccCCcceeEE Confidence 11246789999999999999986422 2222234554433 23322 1222 222 22111 11111111 Q ss_pred EeeEEEeeecceeeecCcCCcC-hHHhcCCcC Q lcl|NC_020854. 290 IDLHYVYHPVGAKWAVTTTNPT-RAQLETVAN 320 (342) Q Consensus 290 ~r~~y~~~~~G~s~~~~~~sPt-~~~L~~~~N 320 (342) ..+.+.+.+ .+|. .+-|.+-.| T Consensus 288 V~~Y~~~av---------~~~~A~a~L~~V~~ 310 (310) T protein:vir:97 288 VKWYCGLAL---------FSEKGLACADGITN 310 (310) T ss_pred EEEeeeEEE---------ecccceeeeccccC Confidence 122222221 1221 122333333 No 176 >protein:vir:98635 Length: 377 # NCBI annotation: major coat protein # Family: family:all:635 # MgeID: mge:1601 # MgeName: phi3396 # Cross-refs: genbank:acc:YP_001039923;genbank:gi:126011098;genbank:GeneID:4818471 Probab=91.94 E-value=0.014 Score=30.82 Aligned_cols=277 Identities=11% Similarity=-0.008 Sum_probs=117.1 Q ss_pred Ccce---eccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhh-c Q lcl|NC_020854. 
1 MATL---RSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGK-I 76 (342) Q Consensus 1 MaT~---~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~-l 76 (342) |.++ -....+|+-+.+-+.+.+.+.+.+.+ ++. +.. .+|+ ..+|.-. .++.+.-+.++.+++.+. . T Consensus 79 ~~~~~~~~gg~~vP~~~~~~I~~~l~~~s~i~~--~~~----v~~-~~~~-~~~~~~~--~~~~a~w~~e~~~~~~~~~~ 148 (377) T protein:vir:98 79 DKNVGGKDKFKLLPEETMVQVFDDLVAEHPLLK--VIN----FKN-TSLR-LKALTAE--TSGTAVWGDIFGEIKGQLKQ 148 (377) T ss_pred HhccCCCCCccccCHHHHHHHHHHHHHhhhhhh--hee----eEe-cCcc-eEEEEec--CCcceeEeecccccCcccCc Confidence 4422 23568898888888877777666644 221 221 2354 4788543 345555566666665332 2 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeee---ccc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLC---IDS 153 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~---~~~ 153 (342) +-++-....++...-..++..--.-+..|..+.+.+++++.+++..+..++. |.. .+.-.+.+.+.. +.. T Consensus 149 ~f~~i~l~~~kl~a~~~is~elL~ds~~~ie~~i~~~la~~~a~~~~~a~i~------G~G-~~qP~Gil~~~~~~~~~~ 221 (377) T protein:vir:98 149 AFKEQDFSQFKLTAFVVIPKDALKFGPKWIKQFITEQLKEAIAVALELAIVK------GDG-LLQPVGLLKDLSQPTVDQ 221 (377) T ss_pred cceeEeecceeEEeeecccHHhhhccHhHHHHHHHHHHHHHHHHHHhhceEe------ccC-CCcceeeeeccccccccc Confidence 2222222233333224444444334556777889999999999888766653 111 111011110000 000 Q ss_pred cccc--ccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceee--------eccceeecc Q lcl|NC_020854. 154 ESGD--TPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTT--------QSGGSMAAA 223 (342) Q Consensus 154 ~~~~--~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~--------~~~~~~~~~ 223 (342) ..+. .......+.+.+.....=.....-.+|+||......+++.+ +.+++.+-.. ....... 
T Consensus 222 ~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~a~~~m~~~t~~~~~klk------d~~G~~i~~~n~~~~~~~~p~~~~~-- 293 (377) T protein:vir:98 222 STGRDITTYKTDKEAIADLSDLTPDNAPKKLVPVMKHLSVNDKKRPL------KIAGQVKLILNPEDRWALEAQFTSR-- 293 (377) T ss_pred ccccccccccchhhhHhhhhhhchhHHHHHHHHHHHHHHHHHHhhhh------ccCCceEEEecccchhhcccccccc-- Confidence 0000 00000111222221111111122234555555554444321 1111111100 0000000 Q ss_pred cccccceeeeccce--EEEeCCcceeccCCCcceEEEEEecceeEeecCCcceeEeccCC--CcceeEEEEeeEEEeeec Q lcl|NC_020854. 224 YGGEVSVPTYMGLR--VIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDI--LAKSDAMSIDLHYVYHPV 299 (342) Q Consensus 224 ~~~~~~i~~~~G~~--VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~--~~g~~~l~~r~~y~~~~~ 299 (342) ..++...+.+|++ |+.++.+|-. + ..|.- -....+.....+.+++.++. ..+++.+.+..++--.|. T Consensus 294 -~~~G~~~t~lg~p~~vv~s~~~p~~------~-i~fgd-f~~Y~i~~r~~~~i~~~~~~~~~~d~~~f~~~~r~dg~~~ 364 (377) T protein:vir:98 294 -NQFGEYVTVLPHGITILESLAVETG------K-AIAFV-ANRYDAFMATASTIEEYDQTFAMEDLQLYLTKNYFYGKAK 364 (377) T ss_pred -CCCCccccccCCCceEEecCCCCcc------c-EEEEE-ecceeEEeecceEEEeechhhhhcCceEEEEEEEEcCEEe Confidence 0123344666665 6667777631 1 11111 11233445555666654443 356677777766655443 Q ss_pred c-eeeecCcCCcChHHhcCCcCceeecCccccceEEEEec Q lcl|NC_020854. 300 G-AKWAVTTTNPTRAQLETVANWSKVYELKNIGIVRATNV 338 (342) Q Consensus 300 G-~s~~~~~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~ 338 (342) - -+|. +..+.-. 
T Consensus 365 ~~~a~~---------------------------vl~i~~~ 377 (377) T protein:vir:98 365 DNHTAA---------------------------LLTLAGG 377 (377) T ss_pred ccCcEE---------------------------EEEEecC Confidence 2 2221 1111000 No 177 >protein:vir:99523 Length: 311 # NCBI annotation: putative protein # Family: family:all:701 # MgeID: mge:1559 # MgeName: Lj928 # Cross-refs: genbank:acc:NP_958538;genbank:gi:41179320;genbank:GeneID:2717161 Probab=91.92 E-value=0.014 Score=30.81 Aligned_cols=278 Identities=11% Similarity=0.058 Sum_probs=124.9 Q ss_pred Ccceeccc-cc-hhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechhhccc Q lcl|NC_020854. 1 MATLRSDI-II-PEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPGKITA 78 (342) Q Consensus 1 MaT~~~d~-i~-Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~~lt~ 78 (342) |.|--..| ++ -+.|.+.+.+++.... + ||.+..++.-. -.||++|.||...- ....+++-+.--..+.++- T Consensus 1 ~~~~an~mAlnya~~~~~~Ld~~~~~~~-~--t~~l~~~~~~~-~~Gak~VkIp~i~~---~gl~dY~R~~g~~~g~v~~ 73 (311) T protein:vir:99 1 MPTDAETRGFNYVTKDGNLLDQKITAGL-F--TAALGTPEVDL-VNGGRSFTLKTIST---SGLKDHTRGKGFNSGTISD 73 (311) T ss_pred CCCcchhhHHHHHHHHHHHHHHHHHhhh-c--ccceecCchhe-eecCCEEEEEeeee---ccccccccccCccccceee Confidence 66432222 21 3455666666665532 2 56655433211 24799999999763 3334554444455677777 Q ss_pred ceeeeEee-eeccceeec--h--HHH-hhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecc Q lcl|NC_020854. 79 DKQVAAIL-HRGRAFEAR--D--LAA-LAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCID 152 (342) Q Consensus 79 ~~~~a~i~-~~~k~~~~t--D--~a~-~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~ 152 (342) ..+.-+.- -|+..|.+. | ++. .++-++-+++ +..+.-.-.+|..-++.|-.......+........ 
T Consensus 74 ~~et~tl~~DR~~~f~vD~mDvdETn~~~~~ani~~~---f~r~~vvPEiDayrfskla~~a~~~~~~~~~~~~~----- 145 (311) T protein:vir:99 74 EKTIYTMGQDRDVEFYLDRQDVDETDNELAMANISNV---FITEHVQPELDSYRFSKIATSFDNLDGTDTEGTLL----- 145 (311) T ss_pred eeeEEEeeeccceeeecchhchhhhhhhhHHHHHHHH---HHHhhhcchhhHHHHHHHHhhhhcccccccchhhh----- Confidence 66665554 477777776 3 221 2222333333 22222333345555554422111100000000000 Q ss_pred cccccccccccH----HHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhh-hhhhhcccceeeeccceeecccccc Q lcl|NC_020854. 153 SESGDTPTALSP----RHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDY-VSTADARGTSTTQSGGSMAAAYGGE 227 (342) Q Consensus 153 ~~~~~~~~~~~~----~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~-~~~s~~~~~~~~~~~~~~~~~~~~~ 227 (342) .........++. +.|-.++..|=|-...-.+++|.|.++.-|.+...|.- +...+ . .....+ T Consensus 146 ~~~~~~~~~lt~~nvl~~l~~~~~~~~~v~~~~rvl~vTp~~~~lLk~~~~~~r~~~~~~-----------~--~~~~i~ 212 (311) T protein:vir:99 146 AKTHKTEETLDETNAYSQLKTGIGKVRKYGTQNLVGYVSSEVMDALERSKEFTRNITNQN-----------V--GTTALE 212 (311) T ss_pred ccccccccccCHHHHHHHHHHHHHHHHhcCCCCeEEEEChHHHHHHhhchhhheeeeccc-----------c--cccccc Confidence 000011223333 34455555564433345899999999997776543321 11110 0 001124 Q ss_pred cceeeeccceEEEe-C--Cccee-------ccCCCcceEEEEEecc-eeEeecCCcceeEeccCCCcce--eEEEEeeEE Q lcl|NC_020854. 228 VSVPTYMGLRVIVS-D--DVNTA-------GSGGSTEYATYFFTQG-AVASGEQMAMQTETDRDILAKS--DAMSIDLHY 294 (342) Q Consensus 228 ~~i~~~~G~~Vvvd-D--~~p~~-------~~~~~~~y~t~l~~~G-Ai~~~~k~~~~ve~dr~~~~g~--~~l~~r~~y 294 (342) ..++.+.|.+||-. + .+... ..+...+..-|++.+. |..--.|....--.+|+.+..- -.+..|..+ T Consensus 213 ~~V~~lDgv~Ii~V~ps~r~~t~~~ft~G~~~~~~ak~INfiiv~~~a~i~~~K~~~v~~f~P~~~~~gd~~l~~~R~Y~ 292 (311) T protein:vir:99 213 SRITSIDGVQLIEVYESNRFMTKYDFTDGAKPTEDAKAINFLVVAKPAVISIVKENAVFLFAPGQHTDGDGYLYQNRLYH 292 (311) T ss_pred cccceecCeEEEEecCchhhcchhhhcCCccccCcccccceEEeCCCeeeeeeeeeeeeeeCCCCCCCcceeeeeeeeee Confidence 56888888887633 2 22100 0111223333444433 3333334332223445554333 355666666 Q ss_pred Eeeecc-----e--eeecC Q lcl|NC_020854. 
295 VYHPVG-----A--KWAVT 306 (342) Q Consensus 295 ~~~~~G-----~--s~~~~ 306 (342) -+-++. + +.+.+ T Consensus 293 D~fv~~nk~~~Iyv~~k~A 311 (311) T protein:vir:99 293 DLFIKKHKRDGIFVSVKKA 311 (311) T ss_pred eeeeeccccCeEEEeeecC Confidence 665543 2 12222 No 178 >protein:vir:78090 Length: 302 # NCBI annotation: Cps # Family: family:all:701 # MgeID: mge:1844 # MgeName: P35 # Cross-refs: genbank:acc:YP_001468790;genbank:gi:157325371;genbank:GeneID:5601852 Probab=89.63 E-value=0.025 Score=29.34 Aligned_cols=282 Identities=12% Similarity=0.105 Sum_probs=122.0 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCC--CCcccccCCCceechhhccc Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANL--SGDFEVLSDSSSLTPGKITA 78 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~--~gda~~~~~~~~i~~~~lt~ 78 (342) ||.++ +. .+.|.+.+.+.+...+ + |+.+..++....-.||++|.+|...-.- .....+++-+..-....++- T Consensus 1 Mantl-~y--a~~~~~~Ld~~~~~~~-~--t~~l~~~~~~v~~~Gak~vkIp~is~~~~~TsGl~dy~R~~g~~~g~v~~ 74 (302) T protein:vir:78 1 MANSL-AL--AQIYQDNIDKAIAVNS-K--SAFLEANPNNVQYNGGNTIKIADISFGSGTTGDLKAYNRSTGFTQGSVTL 74 (302) T ss_pred CCchh-HH--HHHHHHHHHHHHHhhh-c--eeecccCCceEEEecCcEEEEEEEEeeccccccccccccccCccccceee Confidence 88443 11 2556677777665543 2 5666555444323579999999986110 01112333233333344444 Q ss_pred ceeeeEe-eeeccceeec----hHHH-hhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecc Q lcl|NC_020854. 79 DKQVAAI-LHRGRAFEAR----DLAA-LAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCID 152 (342) Q Consensus 79 ~~~~a~i-~~~~k~~~~t----D~a~-~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~ 152 (342) ..+.-+. +-|+..|.+. |++. .++-++-|++ +..+.-.-.+|+.-++.|-+.-. + .+...+.. 
T Consensus 75 ~~et~tlt~DR~~~f~vD~mDvdETn~~~~~ani~~e---f~r~~vvPEiDayrfskla~~a~---~---~~~~~~~~-- 143 (302) T protein:vir:78 75 AWSDYTLDYDLAQSFQIDAMDVDETKNLATVGNVLSE---YQRTKIVPAIDKYRFTKLANDGT---G---VGGVIDLS-- 143 (302) T ss_pred eeeeEEeeeccceeeeccccchhhhhhhhHHHHHHHH---HHHhhhcchhhHHHHHHHHHhhh---c---cCcccccc-- Confidence 4444333 2466666665 3322 2223333333 12222233345555554422110 0 00111100 Q ss_pred cccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccceee Q lcl|NC_020854. 153 SESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPT 232 (342) Q Consensus 153 ~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~ 232 (342) ... .++.--.+.+..++..|.|. .-.+++|.|.++.-|.+..++.-..... ..+.+ ..+..++. T Consensus 144 ~~~--~t~~nvl~~i~~~~~~~~e~--~~~vl~vtp~~~~~Lk~a~~~~~~~~~~-----~~~~~-------~i~~~V~~ 207 (302) T protein:vir:78 144 KPD--ASAQALMGDIATAMELVDDS--NQLILVTSPTTLAGLLNTALIRESKNTQ-----VLRRG-------EVDTKITF 207 (302) T ss_pred ccc--hhHHHHHHHHHHHHHHhhcc--CCeEEEEChHHHHHHhcchhhccceecc-----ccccc-------cccceeee Confidence 000 01111234566677778885 3578999999999888765442211110 00001 12456777 Q ss_pred eccceEEE--eCCccee-------ccCCCcceEEEEEecc-eeEeecCCcceeEeccCCCccee--EEEEeeEEEeeecc Q lcl|NC_020854. 233 YMGLRVIV--SDDVNTA-------GSGGSTEYATYFFTQG-AVASGEQMAMQTETDRDILAKSD--AMSIDLHYVYHPVG 300 (342) Q Consensus 233 ~~G~~Vvv--dD~~p~~-------~~~~~~~y~t~l~~~G-Ai~~~~k~~~~ve~dr~~~~g~~--~l~~r~~y~~~~~G 300 (342) +.|.+|+. +|.+... ..+...+..-|++.+. |..--.|.+..--.+|+.+..-+ .+..|..+-+-++. T Consensus 208 lDgv~Ii~VPs~r~~t~~~f~~G~~~~~~ak~INfiiv~~~a~ia~~K~~~~~if~P~~~~~gd~~l~~~R~Y~D~fV~~ 287 (302) T protein:vir:78 208 IQDVEVLQVPSEYLYDKVAPKVGVPDYTGAKKIPYMIFKRDAPTGIVKTDKVRVFEPDTNQSADAYKVDLRLYHDLIVPK 287 (302) T ss_pred ecccEEEEchhhhcccceeccCCccccCCccceeEEEECCCeeeeeeeeeeeEeeCCCCCCCcceeeeeeeeEeeeeeec Confidence 77777762 1111100 0011112223333332 32223333322224566655444 66677766666644 Q ss_pred eeeecCcCCcChHHhc Q lcl|NC_020854. 
301 AKWAVTTTNPTRAQLE 316 (342) Q Consensus 301 ~s~~~~~~sPt~~~L~ 316 (342) -+-..--.|+. +.++ T Consensus 288 nk~~gI~~~~~-~~~~ 302 (302) T protein:vir:78 288 NQRPGIIKASF-GTIA 302 (302) T ss_pred cccCeEEEeec-cccC Confidence 33210000000 0000 No 179 >protein:vir:100632 Length: 381 # NCBI annotation: 77ORF006 # Family: family:all:635 # MgeID: mge:1476 # MgeName: 77 # Cross-refs: genbank:acc:NP_958606;genbank:gi:41189521;genbank:GeneID:2743778 Probab=86.54 E-value=0.045 Score=27.98 Aligned_cols=272 Identities=11% Similarity=0.034 Sum_probs=124.4 Q ss_pred Cc-c--eeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh-hc Q lcl|NC_020854. 1 MA-T--LRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG-KI 76 (342) Q Consensus 1 Ma-T--~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~-~l 76 (342) |. . .-....+|+-|.+-+.+.+.+.+.+.+ ++. ... .+|. ..+|.-. .++.+.-+.+...++.+ +. T Consensus 76 ~~~~t~~~Gg~lvP~~~~~~I~~~l~~~spir~--~a~----v~~-~~~~-~~i~~~~--~~~~a~W~~e~~~~~~~~~~ 145 (381) T protein:vir:10 76 INKSVGYKEEKLLPEETIDRIFEDLTTNHPLLA--DLG----IKN-AGLR-LKFLKSE--TSGVAVWGKIYGEIKGQLDA 145 (381) T ss_pred HhhcCCCCCceecCHHHHHHHHHHHHhhcceee--eee----eEe-cCcc-eEEEeec--CCcceEEeecccccccccCc Confidence 32 2 223478888888777777777665544 221 211 2343 4567433 23555444554555432 22 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeee----cc Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLC----ID 152 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~----~~ 152 (342) +-++-....++...-..++..--.-+.-|-.+.+.+++++.+++..+..++. | .. .+.-.+.+.++. .. 
T Consensus 146 ~f~~i~l~~~kl~a~i~is~elL~Ds~~~le~~i~~~la~~~a~~~~~afi~---G---dG-~~qP~Gil~~~~~~~~~~ 218 (381) T protein:vir:10 146 AFSEETAIQNKLTAFVVLPKDLNDFGPAWIERFVRVQIEEAFAVALETAFLK---G---TG-KDQPIGLNRQVQKGVSVT 218 (381) T ss_pred cceeEeecceeEEeeccccHHHHhccHHHHHHHHHHHHHHHHHHHhhceeEe---c---cc-CCCceeeeecCCcccccc Confidence 2222223333444334454444333455667779999999988887765543 2 11 111001110000 00 Q ss_pred cc------cccccccccHHHHHH-------HHHHhCc----cccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeee Q lcl|NC_020854. 153 SE------SGDTPTALSPRHVAE-------ARAILGD----QGDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQ 215 (342) Q Consensus 153 ~~------~~~~~~~~~~~~l~~-------A~~~~GD----~~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~ 215 (342) .. ...+...++...+.+ .....+. ....-..|+||+..+.+|++... + ++++++ T Consensus 219 ~g~~~~~~~~~~~t~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~vmn~~t~~~l~~~~~--~-~~~~G~------ 289 (381) T protein:vir:10 219 DGAYPEKEEQGTLTFANPRATVNELTQVFKYHSTNEKGKSVAVKGNVTMVVNPSDAFEVQAQYT--H-LNANGV------ 289 (381) T ss_pred ccccccccccccccccchhhHHHHHHHHHHhhhhhhccccccccCceEEEEchhhHHhhccccc--c-CCCCCc------ Confidence 00 000001111122111 1111111 11223468999999999976532 1 122211 Q ss_pred ccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccC--CCcceeEEEEee Q lcl|NC_020854. 216 SGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRD--ILAKSDAMSIDL 292 (342) Q Consensus 216 ~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~--~~~g~~~l~~r~ 292 (342) ++..++ .|.+|++++.||-. + ++|+. +-..+....++.+++.+. ...+++.+.+.. T Consensus 290 --------~v~~lp----~g~~vv~~~~~p~~------~---i~fGDfs~Y~i~~r~~~~i~~~~~~~~~~d~~~f~a~~ 348 (381) T protein:vir:10 290 --------YVTALP----FNLNVIESTVQEAG------K---VLTYVKGLYDGYLAGGINVQKFKETLALDDMDLYTAKQ 348 (381) T ss_pred --------eeecCC----CCceeEEcCCCCcC------c---EEEEEcccEEEEEecccEEEeechhhhhcCceEEEEEE Confidence 111111 37789999988731 1 23322 123344455555655443 345777787777 Q ss_pred EEEeeec---ceee---ecCcCCcChHHhcCCc Q lcl|NC_020854. 
293 HYVYHPV---GAKW---AVTTTNPTRAQLETVA 319 (342) Q Consensus 293 ~y~~~~~---G~s~---~~~~~sPt~~~L~~~~ 319 (342) ++--.|. -+.- +..+.-|.-++-+-.- T Consensus 349 r~dG~~~~~~A~~v~~l~~~~~~~~~~~~~~~~ 381 (381) T protein:vir:10 349 FAYGKAKDNKVAAVWKLDLKGHKPALEDTEETL 381 (381) T ss_pred EEcCEEecCCcEEEEEEeecCCccccccccccC Confidence 7765443 2221 2233334433322221 No 180 >protein:vir:105464 Length: 346 # NCBI annotation: putative phage major capsid protein # Family: family:all:701 # MgeID: mge:1502 # MgeName: KC5a # Cross-refs: genbank:acc:YP_529874;genbank:gi:90592614;genbank:GeneID:3974528 Probab=85.22 E-value=0.054 Score=27.51 Aligned_cols=301 Identities=11% Similarity=0.006 Sum_probs=130.2 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccch-hhhccCCCCEEEccccccCCCCcccccCCCceec-hhhccc Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMT-ELNATEGGDFINVPFWKANLSGDFEVLSDSSSLT-PGKITA 78 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~-~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~-~~~lt~ 78 (342) ||=-.+ +.|.+.+.+++.....-..++...+.. .... .||++|.||...- .....+++-..-.. .+.++. T Consensus 1 Mainya-----~~~~~~Ld~~~~~~~lts~~l~~~~~~~~v~~-~ggktVkIp~is~--tsGl~DY~R~~g~~~~g~v~~ 72 (346) T protein:vir:10 1 MTINYA-----EKYQAAVQQAFYDGHLYSAELWNSPSNSIIKF-DGAKHIKVPRLEI--TSGRKDRQRRTITTPVANYSN 72 (346) T ss_pred CcchhH-----HHHHHHHHHHHHhhhccchhhcccccccceEe-cCCCEEEEEEeee--ecccccccccCCccccccccc Confidence 883333 345555555553321110112222222 2222 4799999998741 11234554443332 356666 Q ss_pred ceeeeEee-eeccceeec--h--HHH-hhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecc Q lcl|NC_020854. 79 DKQVAAIL-HRGRAFEAR--D--LAA-LAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCID 152 (342) Q Consensus 79 ~~~~a~i~-~~~k~~~~t--D--~a~-~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~ 152 (342) ..+.-++- -|+..|.+. | ++. .++-++.|++ ++.+.-.-.+|+..++.|-..... ... .... 
T Consensus 73 ~~et~tl~qDR~~~F~vD~mDvDETn~~~~~anv~~e---f~r~~vvPEiDayrfskLa~~a~~---~~~-~~~~----- 140 (346) T protein:vir:10 73 DWDSYELKNERYWSTLVDPSDIDETNMVVSLANITKQ---FNLDSKMPEKDRYMFSHLYSGKEA---AHD-GGIT----- 140 (346) T ss_pred ceeEEEeeccccceecccccchHHHHHHhHHHHHHHH---HHHHhhcchhhHHHHHHHHHhhhh---hcc-cccc----- Confidence 66665554 577777776 3 222 2333444444 122222233466656554211110 000 0000 Q ss_pred cccccccccccHHHHHHHHHHhCccc--cCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceeecccccccce Q lcl|NC_020854. 153 SESGDTPTALSPRHVAEARAILGDQG--DKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMAAAYGGEVSV 230 (342) Q Consensus 153 ~~~~~~~~~~~~~~l~~A~~~~GD~~--~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i 230 (342) + ...+ +.--++.|.++..+|-|.. ..-..++|.|.++.-|.+...| .+..+.. + ....+..+ T Consensus 141 ~-~a~T-~~ni~~~i~~~~~~lde~~vp~~~rvl~vTp~~~~lLk~s~~f--~k~~~v~-----~-------~~~i~~~V 204 (346) T protein:vir:10 141 T-NTLD-EKNILPAFDNMMLDFDEARIPSTNRILYVTPKTNAILKRAEAM--NRALTLK-----D-------PNNIQRTV 204 (346) T ss_pred c-cccC-HHHHHHHHHHHHHHHHHccCCCCCeEEEECHHHHHHHhhchhh--eeccccc-----c-------ccccceee Confidence 0 0001 1112466777777776642 2447899999999988776543 2222210 1 11225678 Q ss_pred eeeccceEEE--eCCccee-------ccCCCcceEEEEEecc-eeEeecCCcceeEeccCCCc-ceeEEEEeeEEEeeec Q lcl|NC_020854. 231 PTYMGLRVIV--SDDVNTA-------GSGGSTEYATYFFTQG-AVASGEQMAMQTETDRDILA-KSDAMSIDLHYVYHPV 299 (342) Q Consensus 231 ~~~~G~~Vvv--dD~~p~~-------~~~~~~~y~t~l~~~G-Ai~~~~k~~~~ve~dr~~~~-g~~~l~~r~~y~~~~~ 299 (342) +.+.|.+|+. ++.+... ..++..+..-|++.+. |..--.|....--.++.... 
|.-.+..|..+-+-++ T Consensus 205 ~siDGv~Ii~VPs~r~~t~~~f~~G~~~~t~ak~INfiiv~~~A~ia~~K~~~~~if~P~~~~~g~~l~~~R~Y~D~fv~ 284 (346) T protein:vir:10 205 YSLDDVTIRVVPSDLMQTAYDFSDGSKIIDTAKQIEMFLIYNGVQIAPEKYSFVGFDQPSAATSGNYLYYEQSYDDVLLL 284 (346) T ss_pred eeecCeEEEEcchhhcccchhhccCccccCCccceeEEEECCceeeeeeeeeeeEeeCCCCCcccceeeeeeeeeeeeee Confidence 8999999974 4554311 0112233444554444 33323343322223343222 2235667776666664 Q ss_pred c-----eee--ecCcCCcC--hHHhcCCcCceeecCccccceEEEEecCCCC Q lcl|NC_020854. 300 G-----AKW--AVTTTNPT--RAQLETVANWSKVYELKNIGIVRATNVSNFD 342 (342) Q Consensus 300 G-----~s~--~~~~~sPt--~~~L~~~~NW~~v~d~k~i~~~~~~~~~~~~ 342 (342) . +-. ..+...+. -+.=++++|=..|-+.|+ ..++-..| T Consensus 285 ~nk~~~Iyv~~~~a~~~~~~~~~~~~kpt~~~~~~~~~~-----~~~~~~~~ 331 (346) T protein:vir:10 285 NTKTKGIQFVVSDKPKKDQEQSGQDAKPTAESTLEEIKA-----YLDKNHID 331 (346) T ss_pred ccccceEEEeeecccccCccCcccccCcccccchHHHHH-----Hhcccccc Confidence 3 322 11111111 011122222222222221 11222222 No 181 >protein:vir:95512 Length: 693 # NCBI annotation: Putative Clp protease # Family: family:all:62 # ACLAME annotation(s): go:0008236 - serine-type peptidase activity; phi:0000017 - phage prohead/capsid assembly # MgeID: mge:1574 # MgeName: F10 # Cross-refs: genbank:acc:YP_001293349;genbank:gi:148912770;genbank:GeneID:5228164 Probab=82.14 E-value=0.079 Score=26.61 Aligned_cols=271 Identities=13% Similarity=0.113 Sum_probs=129.8 Q ss_pred Cc-c-eeccc--cchhHHHHHHHhhhH---Hh-hhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceec Q lcl|NC_020854. 1 MA-T-LRSDI--IIPEVFTPYVIEQTT---QR-DAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLT 72 (342) Q Consensus 1 Ma-T-~~~d~--i~Pev~~~yv~~~~~---~~-~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~ 72 (342) || | +-||- |--+|...-+.+... .. -+|...| ..+.+ .+...+.+=- . ++-+.|.|+.++. 
T Consensus 394 ~a~~htTSDFp~IL~~~~nk~l~~~y~~a~~t~~~~~~~~---~~~DF---k~~~~~~lg~---~--~~L~~V~E~gEyk 462 (693) T protein:vir:95 394 LAFTHTSSDFGLILLDVANKSVLAGWEEAEETFPLWTKSG---ILTDF---KPARRVGLGE---F--SSLRQVREGAEYK 462 (693) T ss_pred HHHhcCcchhHHHHHHHHHHHHHHHHHhhhhHHHHHhccC---CCCcc---cccceeecCC---C--CChhhcCCCCcee Confidence 22 1 12221 222222222211111 11 1111101 11111 1222333211 1 4556789999998 Q ss_pred hhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecc Q lcl|NC_020854. 73 PGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCID 152 (342) Q Consensus 73 ~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~ 152 (342) ...++-...+-.+...|+-+++|-.+-.=---+.+..+-.+++.+-++.+.+.+.+.|.+ ...-.+....++.-|.. T Consensus 463 ~~t~~e~~e~~~l~tyG~~~~iTRqaiINDDLga~~~ip~~~g~aA~~~~~~~vy~~L~~---Np~m~DGk~LFhadH~N 539 (693) T protein:vir:95 463 YVTLGERGEQIILATYGELFSITRQAIINDDLQMLSDIPFKLGQAAKATIGDLVYAVLTG---NPAMSDGKTLFHADHSN 539 (693) T ss_pred eeecCCccceeehhhcCCeeeecHHhhhccchHHHHHHHHHHHHHHHHHHHHHHHHHHhc---CccccCCcceeeccccc Confidence 888887777778888999999988774433335566788888888888888888888743 11111112222211111 Q ss_pred cccccccccccHHHHHHHHHHhCccc--------c----CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeecccee Q lcl|NC_020854. 153 SESGDTPTALSPRHVAEARAILGDQG--------D----KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSM 220 (342) Q Consensus 153 ~~~~~~~~~~~~~~l~~A~~~~GD~~--------~----~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~ 220 (342) - .+.+.+.++.++|..|...|.-.. . 
.-+.|++++......++.---..++.++.+ T Consensus 540 l-~tga~sals~~sl~~a~~am~~qk~~~~~~~g~~L~i~P~~llvP~~le~~a~~l~~s~~~~~a~~~----------- 607 (693) T protein:vir:95 540 L-LTGAASALSIDSLSKAKTQMATQKAQVEKGKGRTLNIRPGFVLTPVALEDKANQIINSESVPGADVN----------- 607 (693) T ss_pred c-ccccccccChHHHHHHHHHHHHhhcchhccCCceeecccceEEecchHHHHHHHHhccccccccccc----------- Confidence 1 112345788999999877765432 1 235788888887777653211122211110 Q ss_pred ecccccccceeeeccc-eEEEeCCcceeccCCCcceEEEEEecce---eE--eecCCc-ceeEeccCCCcceeEEEEeeE Q lcl|NC_020854. 221 AAAYGGEVSVPTYMGL-RVIVSDDVNTAGSGGSTEYATYFFTQGA---VA--SGEQMA-MQTETDRDILAKSDAMSIDLH 293 (342) Q Consensus 221 ~~~~~~~~~i~~~~G~-~VvvdD~~p~~~~~~~~~y~t~l~~~GA---i~--~~~k~~-~~ve~dr~~~~g~~~l~~r~~ 293 (342) ...+--|+|+ .||++-.+. ....+. =|++...+ |. |-.+.+ +.+|+...-.----.+-.|.- T Consensus 608 ------~~~~NP~~~~~~vi~~prL~---~~s~~~--Wyl~a~~~~dtie~~yL~G~~~P~ie~~~gf~~dG~~~kvr~D 676 (693) T protein:vir:95 608 ------SGIVNPIRAFAQVIGEPRLD---DASATA--WYMAAKKGSDTIEVAYLDGVDTPYLEQQEGFTVDGVASKVRID 676 (693) T ss_pred ------cccccchhccccccccceec---CCCCCc--eEEecCCCCCeEEEEEecCCCCCeEeecCCCCcceEEEEEEEe Confidence 0111224454 344433331 111111 24444333 33 334443 455555543333334556666 Q ss_pred EEeeeccee-e-ecCcC Q lcl|NC_020854. 294 YVYHPVGAK-W-AVTTT 308 (342) Q Consensus 294 y~~~~~G~s-~-~~~~~ 308 (342) |++.+.+|. | ++.|. T Consensus 677 ~G~~~iD~Rg~~kn~GA 693 (693) T protein:vir:95 677 AGVAPLDFRGLQKSNGA 693 (693) T ss_pred ccCceeeccccccCCCC Confidence 666665553 1 34443 No 182 >protein:vir:8324 Length: 410 # NCBI annotation: gp41 # Family: family:all:30827 # MgeID: mge:154 # MgeName: Corndog # Cross-refs: genbank:acc:NP_817892;genbank:gi:29566325;genbank:GeneID:1259520 Probab=79.19 E-value=0.11 Score=25.91 Aligned_cols=261 Identities=13% Similarity=0.101 Sum_probs=135.7 Q ss_pred Cc----ce-e---ccccchhHHHHHHHhhhHHhhhhhhc--CccccchhhhccCCCCEEEccccccCCCCcccc------ Q lcl|NC_020854. 
1 MA----TL-R---SDIIIPEVFTPYVIEQTTQRDAFLAS--GVVQPMTELNATEGGDFINVPFWKANLSGDFEV------ 64 (342) Q Consensus 1 Ma----T~-~---~d~i~Pev~~~yv~~~~~~~~~f~~s--g~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~------ 64 (342) |+ +. - +..|-|+..++-+ +|+++ +++...++|- .+|.|+.-|.-. -+..+ T Consensus 127 ~r~a~~~~~Tgd~~~~i~~~~v~d~i--------~li~q~r~i~slf~tLP--~~g~T~eY~v~t----~~~tV~~q~~~ 192 (410) T protein:vir:83 127 YARAADHQKTGDLQGVIPDPIVGPVI--------DFIDSARPLVSTLGTLP--LNNATFYRPIVS----QRPAVGLQGVA 192 (410) T ss_pred HHHhhccCcccccccccchhHhhhHH--------HHHhhccchhhhhhhCC--CCCCeeEEeeec----ccccccccccc Confidence 22 11 1 2234444333322 33333 1222223443 358877665332 22211 Q ss_pred ---cCCCceechhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccc Q lcl|NC_020854. 65 ---LSDSSSLTPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANT 141 (342) Q Consensus 65 ---~~~~~~i~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~ 141 (342) -+||++++..|++.++..+.|+..|-...++--+-+-+.-.-+....+-++-+.++.-++..-+.|.+.+....+-. T Consensus 193 ~kqa~EGd~L~~gKl~~~t~tA~ikTyGGyt~LSRQ~IERs~v~~L~~~lraL~~AYA~atea~vra~L~~t~t~~~a~~ 272 (410) T protein:vir:83 193 GGASDEKTELDSQKMVIDRLTVNAKTLGGYVNVSRQAIDFSSPSALDLVVNGLGQQYAIETEALVGAALASTSTGAVGYG 272 (410) T ss_pred cccccccccccccceeeeeccceeehhcCcccccceeeecCChhhHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhhhhhh Confidence 35899999999999999999998887766666565666666666666666666666666666666655443221110 Q ss_pred cchhheeeecccccccccccccHHHHHHHHHHhCcccc--CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccce Q lcl|NC_020854. 142 SSSAFFDLCIDSESGDTPTALSPRHVAEARAILGDQGD--KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGS 219 (342) Q Consensus 142 ~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~--~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~ 219 (342) ......- ...+.+|..++-|... .+..|.+++.|..++-+. .+.-..+.. 
...+ T Consensus 273 -------------~~Tad~~--~~~i~da~~~v~da~~~~~~~~i~vS~DVl~~~~~~---f~~~~~~~~--dt~G---- 328 (410) T protein:vir:83 273 -------------NATADNV--ASAIWQAAGAVYTAVKGMGRLVIAIAPDVLGDFGPL---FAPVNPTNA--HSTG---- 328 (410) T ss_pred -------------hccHHHH--HHHHHHHHHHHhhhhccceeeeEEechhhhhhccce---eeccCCCCc--cccc---- Confidence 0011111 2356678888888733 566779999996555432 111111111 1100 Q ss_pred eecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCC--cceeEeccCCCcceeEEEEeeEEEee Q lcl|NC_020854. 220 MAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQM--AMQTETDRDILAKSDAMSIDLHYVYH 297 (342) Q Consensus 220 ~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~--~~~ve~dr~~~~g~~~l~~r~~y~~~ 297 (342) ......|..--+.++|.+|++..+.+- .+.+++-+-||.++++. |+.. .+.++.+-...+. ..|. T Consensus 329 fg~~~lg~gi~G~~~~ipVvm~~~a~A--------gTA~f~~~~Ai~~~eS~~gp~qL-~d~~i~nLt~~yS--gY~a-- 395 (410) T protein:vir:83 329 FEAGRFGQGVMGSISGIPVVMSAALGS--------GDAYLFSTAAIECFEQRVGTLQV-VEPSVFGLQVAYA--GYFS-- 395 (410) T ss_pred ccccccccchhhhhcccceEEecCCCc--------CeeeEeccceeeeeecCCceeEe-eCCchhhhhhhhe--eeee-- Confidence 001111233456788999999887642 35788889999998876 3444 4555544333332 1111 Q ss_pred ecceeeecCcCCcChHH Q lcl|NC_020854. 298 PVGAKWAVTTTNPTRAQ 314 (342) Q Consensus 298 ~~G~s~~~~~~sPt~~~ 314 (342) .+.-+ ..++=|--.. T Consensus 396 -~a~~~-~~gliPv~g~ 410 (410) T protein:vir:83 396 -TLVVN-EDAIVPLVGS 410 (410) T ss_pred -ecccc-ccceeeeccC Confidence 11111 1111111000 No 183 >protein:vir:79548 Length: 652 # NCBI annotation: putative protease/scaffold protein # Family: family:all:62 # ACLAME annotation(s): go:0008236 - serine-type peptidase activity; phi:0000017 - phage prohead/capsid assembly # MgeID: mge:1871 # MgeName: cdtI # Cross-refs: genbank:acc:YP_001272518;genbank:gi:148609387;genbank:GeneID:5204384 Probab=78.33 E-value=0.12 Score=25.72 Aligned_cols=267 Identities=10% Similarity=0.103 Sum_probs=127.1 Q ss_pred Cc-c-eeccccchhHHHH----HHHhhhHHh----hhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCce Q lcl|NC_020854. 
1 MA-T-LRSDIIIPEVFTP----YVIEQTTQR----DAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSS 70 (342) Q Consensus 1 Ma-T-~~~d~i~Pev~~~----yv~~~~~~~----~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~ 70 (342) +| | .-+| -|.+|.+ -+.+..... -+|..-| ..+.+ .+...+.+-- -|+-+.|.|+.+ T Consensus 359 ~A~~hsTsD--Fp~IL~~~~nk~l~~~y~~a~~t~~~~~~~~---~~~DF---k~~~~~~lg~-----~~~L~~V~E~gE 425 (652) T protein:vir:79 359 AAFTHSTSD--FGNILLDVANKAILQGWEDAPETYEQWTRKG---QLSDF---KIAHRVGMGG-----FSALRQVREGAE 425 (652) T ss_pred HHhhcCcch--HHHHHHHHHHHHHHHHHhhhHHHHHHHhccC---CCccc---cccceeecCC-----CCCccccCCCCc Confidence 22 1 1233 1322222 221111111 1221111 11111 1223333321 256678899999 Q ss_pred echhhcccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccc--cchhhee Q lcl|NC_020854. 71 LTPGKITADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANT--SSSAFFD 148 (342) Q Consensus 71 i~~~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~--~~~~~~~ 148 (342) +....++-+..+-.+...|+-+++|-.+-.----+.+..+-..++.+-++.+.+.+.+.|.+ + ... .+..+.+ T Consensus 426 yk~~t~~e~~e~~~l~tyG~~~~iTRqaiINDDL~a~~~ip~~~g~aA~~~~~~~vy~~l~~----N-p~~~~DGk~LF~ 500 (652) T protein:vir:79 426 YKYVTTGDKQATIALATYGELFSITRQAIINDDLNMLTDVPMKLGRAAKSTIADLVYAILTS----N-PKISTDNVSLFD 500 (652) T ss_pred cceeeecCccceeeeecccCeeeeehheeeccchhHHHHHHHHHHHHHHHHHHHHHHHHHhc----C-cccccCCceeec Confidence 99888887777778889999999987774322224566778888888888888888887743 1 111 1112210 Q ss_pred eecccccccccccccHHHHHHHHHHhCcccc-------CeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeeeccceee Q lcl|NC_020854. 
149 LCIDSESGDTPTALSPRHVAEARAILGDQGD-------KLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQSGGSMA 221 (342) Q Consensus 149 ~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~-------~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~~~~~~~ 221 (342) |.+-..-.+.+.++.+.|..|..+|..+.+ .-+.|+++|...+..++.-.-..++.++.+ T Consensus 501 -hA~H~Nl~~~aa~~~~~l~~ar~aM~~Qk~g~~~l~i~P~~llvp~~le~~a~~ll~s~~v~~a~~~------------ 567 (652) T protein:vir:79 501 -KAKHANVLESAAMDVASLDKARQLMRVQKEGERHLNIRPAFVLVPTAMESVANQVIRSSSVKGADIN------------ 567 (652) T ss_pred -ccccccccccccCCHHHHHHHHHHHHHhccCCccccccccEEEecchhHHHHHHHhccCCCcccccc------------ Confidence 111111112356899999999876654322 245788999887776553111222222111 Q ss_pred cccccccceeeeccc-eEEEeCCcceeccCCCcceEEEEEecc---eeE--eecCCc-ceeEeccCCCcceeEEEEeeEE Q lcl|NC_020854. 222 AAYGGEVSVPTYMGL-RVIVSDDVNTAGSGGSTEYATYFFTQG---AVA--SGEQMA-MQTETDRDILAKSDAMSIDLHY 294 (342) Q Consensus 222 ~~~~~~~~i~~~~G~-~VvvdD~~p~~~~~~~~~y~t~l~~~G---Ai~--~~~k~~-~~ve~dr~~~~g~~~l~~r~~y 294 (342) ...+..+.|. .||++-.+. +. ....-|++.+. .|. |-++.+ +.+|+...-..---.+-.|.-| T Consensus 568 -----~~~~Np~~~~~~~i~eprL~--~~---s~~~wylaa~~~~dtiev~yL~G~~~P~ie~~~gf~~dG~~~kvrlD~ 637 (652) T protein:vir:79 568 -----AGIINPVKDFATVIAEPRLD--DN---SQTTFYLAASKGSDTIEVAYLNGVDTPYIDQMEGFSVDGVTTKVRIDA 637 (652) T ss_pred -----cccccccccccccccccccC--CC---CcccEEEecCCCCCeEEEEEecCCCCCeeeecCCCCcceEEEEEEEec Confidence 0112223454 444433331 11 11123444333 233 334433 4555544322223345556666 Q ss_pred Eeeeccee-eecCcC Q lcl|NC_020854. 295 VYHPVGAK-WAVTTT 308 (342) Q Consensus 295 ~~~~~G~s-~~~~~~ 308 (342) ++.+.+|- |..++. 
T Consensus 638 G~~~iD~RG~~k~t~ 652 (652) T protein:vir:79 638 GVAPVDHRGLVKCTA 652 (652) T ss_pred cCceeeccceeeecC Confidence 66654442 222211 No 184 >protein:vir:96490 Length: 348 # NCBI annotation: head protein # Family: family:all:1083 # MgeID: mge:1620 # MgeName: 2972 # Cross-refs: genbank:acc:YP_238492;genbank:gi:66391768;genbank:GeneID:5176912 Probab=76.46 E-value=0.14 Score=25.35 Aligned_cols=311 Identities=10% Similarity=0.089 Sum_probs=119.1 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhh-------hccCCCCEEEccccccCCCCcccccCCCceech Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTEL-------NATEGGDFINVPFWKANLSGDFEVLSDSSSLTP 73 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l-------~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~ 73 (342) |++ +.|+|++..++.|+.+.......|+...+....+.. .+. .|..+..||-.+. .. +.-..- T Consensus 1 M~~-i~d~f~~~~l~~~i~~~~~~~~~~l~~~~Fp~~~~~~~~~~~~~~~-~~~~~~a~~v~~~--~~------~~~~~r 70 (348) T protein:vir:96 1 MGL-IYDKVTASNIAGYFNTLQENVDSTLGESIFPARKQLGTKLSYIKGA-SGQSVALKAAAFD--TN------VTIRDR 70 (348) T ss_pred Ccc-hhhccCHHHHHHHHHhcccchhhhhhhhcCCCccccceeEEEEeec-CCceeEeeeecCC--CC------cceecc Confidence 995 799999999999997654433333322222211111 111 2333334443221 11 111111 Q ss_pred hhcccceeeeEeeeeccceeechHHHh---h-hcch-HHHHHHHHHH-------HHHHHHHHHHHHHHHH-HHHhhhccc Q lcl|NC_020854. 74 GKITADKQVAAILHRGRAFEARDLAAL---A-AGSD-PMAAIGAKVA-------DYVANQRQKDLLSCLQ-GVFGSLNAN 140 (342) Q Consensus 74 ~~lt~~~~~a~i~~~~k~~~~tD~a~~---~-~~~d-p~~~i~~qia-------~~~~~~~~~~lla~L~-g~~~~~~a~ 140 (342) +...+..-.....+....+.+.|.-.+ . ++.+ +.+.+.++++ +.+.++.+..+...|. |-+.-. . 
T Consensus 71 ~~~~~~~~~~p~i~~~~~i~~~d~~~l~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~~~--~ 148 (348) T protein:vir:96 71 VSAEIHDEQMPFFKEALLVKENDRQQLNLVKDTGNEALINTIVAGIFNDDVTLINGARARLEAMRMQVLATGKIAFT--S 148 (348) T ss_pred cceeeeeeecCccccccccCHHHHHHHHhhhccCCchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCeeEee--c Confidence 112222212122223334455554332 1 1222 2344555544 3455555555555442 322100 0 Q ss_pred ccchhhe--------eeecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhh-hhhhhhhcccc Q lcl|NC_020854. 141 TSSSAFF--------DLCIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAI-DYVSTADARGT 211 (342) Q Consensus 141 ~~~~~~~--------~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li-~~~~~s~~~~~ 211 (342) ....... .++....=.++.+. -.+.|.+.+...-++......++|.++++..|+++..+ +.++....+. T Consensus 149 ~~~~~~vdfg~~~~~~~t~~~~W~~~~ad-p~~di~~~~~~~~~~G~~~~~~i~~~~~~~~l~~~~~v~~~~~~~~~~~- 226 (348) T protein:vir:96 149 DGVNKDIDYGVKADHKKQVSKSWAEPGAT-PLADLEDAIETARELGLNPERAIMNAKTFGLIRKAASTVKAIKPLAGDG- 226 (348) T ss_pred CCeeEEEeccCCcccceeeccccCCCCCC-HHHHHHHHHHHHHhcCCcccEEEeCHHHHHHHhcCHHHHHHHhccCCcc- Confidence 0000000 01111111111111 12334444444434444677899999999999987433 2232221110 Q ss_pred eeeeccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCc-------ceeEeccCCCcc Q lcl|NC_020854. 212 STTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMA-------MQTETDRDILAK 284 (342) Q Consensus 212 ~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~-------~~ve~dr~~~~g 284 (342) +.+.. .....-++++.|..|++=|.--...+++.. -++-.|.+.+..... ...|... 
...+ T Consensus 227 ------~~~~~-~~~~~~~~~~~g~~i~~y~~~y~d~~G~~~----~~~p~~~v~l~~~~~~G~~~yg~~~e~~~-~~~~ 294 (348) T protein:vir:96 227 ------SSVTK-AELQNYVADNYGVEIVLENGTYRNEKGEVS----KFFPDGHLTLIPNGPLGNTVFGTTPEESD-LFAD 294 (348) T ss_pred ------ccccH-HHHHHHHhhhcCceEEEEccEEEecCCcEe----ccccCCeEEEEcCCCceeEEeccChhhhh-hhhc Confidence 01111 011123445557677665543322222211 122333332221100 0001000 0000 Q ss_pred eeEEEEeeEEEeeecceeeecCcCCcChHHhcCCcCceeec-CccccceEEEEecC Q lcl|NC_020854. 285 SDAMSIDLHYVYHPVGAKWAVTTTNPTRAQLETVANWSKVY-ELKNIGIVRATNVS 339 (342) Q Consensus 285 ~~~l~~r~~y~~~~~G~s~~~~~~sPt~~~L~~~~NW~~v~-d~k~i~~~~~~~~~ 339 (342) ++....-...--+++-.+|.. ..|....+...++.=-|. ++..+-++.+..-. T Consensus 295 ~~~~~~~~~~~~~~~~~~~~~--~dP~~~~~~~~s~plPv~~~~~~~~~a~Vl~~~ 348 (348) T protein:vir:96 295 NTVNADVEIVDSGIAVTTTKT--TDPVNVQTKVSMVALPSFERLGDVYMLTVIPGV 348 (348) T ss_pred ccccccceecCCeeEEEeeec--CCCceEEEEEeeeeeccccCCCcEEEEEEecCC Confidence 000000000000111124433 246666666555553333 45555555555554 No 185 >protein:vir:78350 Length: 383 # NCBI annotation: Cps # Family: family:all:635 # MgeID: mge:1850 # MgeName: B025 # Cross-refs: genbank:acc:YP_001468644;genbank:gi:157325222;genbank:GeneID:5601696 Probab=75.59 E-value=0.14 Score=25.18 Aligned_cols=263 Identities=9% Similarity=-0.022 Sum_probs=117.4 Q ss_pred Cc---ceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCCCceechh-hc Q lcl|NC_020854. 1 MA---TLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSDSSSLTPG-KI 76 (342) Q Consensus 1 Ma---T~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~~-~l 76 (342) |. +.-....+|+-+.+-+.+.+.+.+.+.+ ++. +.. .+|+ ..+|.-.. .+.+.-+.++..++.. +. 
T Consensus 83 ~~~~~~~~gg~lvP~~~~~~I~~~l~~~s~l~~--~~~----v~~-~~~~-~~i~~~~~--~~~a~w~~e~~~~~~~~~~ 152 (383) T protein:vir:78 83 INKEVGYKEETLLPQTVVDEIFEDLTTEHPFLA--SIG----MRT-TGLR-TKFLKSET--SGVAVWGKIFGEIKGQLDA 152 (383) T ss_pred HhccCCCCCccccCHHHHHHHHHHHHhhcccee--eee----eEe-cCCc-eEEEEEcC--CcceEEeecccccccccCc Confidence 33 2344578899888888888877766644 222 221 3455 47886542 3555445555555322 22 Q ss_pred ccceeeeEeeeeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeec----c Q lcl|NC_020854. 77 TADKQVAAILHRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCI----D 152 (342) Q Consensus 77 t~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~----~ 152 (342) +-++-.-..++...-+.++..--.-+.-|-.+.+.+++++.+++..+..++. | .. .+.-.+.+.++.. . T Consensus 153 ~f~~i~l~~~kl~~~i~is~ell~Ds~~~ie~~i~~~l~~~~a~~~~~a~i~---G---~G-~~qP~Gil~~~~~~~~~~ 225 (383) T protein:vir:78 153 TFSDEESIQNKLTAFVVVPKDLEKFGPAWVKRFVVTQIEEAFAVALESAYIV---G---DG-NDKPIGLNRKVGKGSTVV 225 (383) T ss_pred ceeeEeecceeeEeeccchHHHhhccHHHHHHHHHHHHHHHHHHHHhhheEe---c---cC-CCCceeeeeccCCccccc Confidence 2222222333343334444443333445667789999999999888877653 1 11 1100011110000 0 Q ss_pred cc---cccccccccH---HHH---HHHHHH-----hCcc---ccCeEEEEEchHHHHHHHhhhhhhhhhhhhcccceeee Q lcl|NC_020854. 153 SE---SGDTPTALSP---RHV---AEARAI-----LGDQ---GDKLTAVAMHSKVYYDLVERRAIDYVSTADARGTSTTQ 215 (342) Q Consensus 153 ~~---~~~~~~~~~~---~~l---~~A~~~-----~GD~---~~~~~~ivmhS~v~~~L~~~~li~~~~~s~~~~~~~~~ 215 (342) .. ...+...++. ..+ ..+... +.+. ...-..|+||+..|..++.. .... 
..+ T Consensus 226 ~~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~n~~~~~~~~~~--~~~~-~~~-------- 294 (383) T protein:vir:78 226 DGVYAEKAATGTLTFANPKTTVNELTDVYKYHSVKENGHPLNVAGKVTLLVNPTDAWDVKKQ--YTSL-NAN-------- 294 (383) T ss_pred ccccccccccchhhhhhhHHHHHHHHHHHhccchhcccchhhhcCceEEEEcCcchhhhccc--hhcc-CCC-------- Confidence 00 0000111111 111 111111 1111 11123477887766544321 1110 011 Q ss_pred ccceeecccccccceeeec--cceEEEeCCcceeccCCCcceEEEEEec-ceeEeecCCcceeEeccC--CCcceeEEEE Q lcl|NC_020854. 216 SGGSMAAAYGGEVSVPTYM--GLRVIVSDDVNTAGSGGSTEYATYFFTQ-GAVASGEQMAMQTETDRD--ILAKSDAMSI 290 (342) Q Consensus 216 ~~~~~~~~~~~~~~i~~~~--G~~VvvdD~~p~~~~~~~~~y~t~l~~~-GAi~~~~k~~~~ve~dr~--~~~g~~~l~~ 290 (342) +...+.+ |.+|+.++.+|-.. .+|+. ....+....++.+++.+. ...+++.+.. T Consensus 295 ------------G~~~t~l~~~~~iv~s~~~p~~~---------iifgdfs~Y~i~~r~~~~i~~~~~~~f~~d~~~f~~ 353 (383) T protein:vir:78 295 ------------GVYVTALPFNLNIIESLFVPEKK---------AISYVAERYDALIGGPLDIGTYDQTLAIEDLNLYAA 353 (383) T ss_pred ------------CceeeecCCCceEEecCCCCccc---------EEEeeccceEEEecccceEEecchhhhhcCceEEEE Confidence 1122333 55577788776321 12221 123445555666665443 3456677777 Q ss_pred eeEEEeeecc---e-----eeecCcCCcCh Q lcl|NC_020854. 291 DLHYVYHPVG---A-----KWAVTTTNPTR 312 (342) Q Consensus 291 r~~y~~~~~G---~-----s~~~~~~sPt~ 312 (342) ..++--.|.- + +-..+...|.- T Consensus 354 ~~r~dG~~~~~~A~~vl~~~~~~~~~~~~~ 383 (383) T protein:vir:78 354 KQFAYGKAKDDKAAAVWTLNINPAEQTPEG 383 (383) T ss_pred EEEEcCEEecCCeEEEEEEEecCCCCCCCC Confidence 7776655422 2 22223333433 No 186 >protein:vir:95875 Length: 401 # NCBI annotation: major coat protein # Family: family:all:10944 # MgeID: mge:1586 # MgeName: N4 # Cross-refs: genbank:acc:YP_950534;genbank:gi:119952248;genbank:GeneID:5075702 Probab=70.86 E-value=0.2 Score=24.38 Aligned_cols=289 Identities=15% Similarity=0.111 Sum_probs=133.2 Q ss_pred Cc----------ceeccccchhHHHHHHHhhhHHhh----hhhhcCccccchhhhccCCCCEEEccccccCCCCccc-cc Q lcl|NC_020854. 
1 MA----------TLRSDIIIPEVFTPYVIEQTTQRD----AFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFE-VL 65 (342) Q Consensus 1 Ma----------T~~~d~i~Pev~~~yv~~~~~~~~----~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~-~~ 65 (342) |- +...--.-||+=.-|-.++..... .|-+ .+..-+ +- +--|.||.+=+|-++ -++. .+ T Consensus 1 ~~~~~a~~~~~~~s~~g~~~~~~~t~y~~~k~L~~Aa~~lv~~~--fA~~~p-iP-kn~GkTIk~r~y~pl--~~~~~pl 74 (401) T protein:vir:95 1 MLNYNAPTDGQKSSIDGANSDQMQTFFWLKKAIITARKEQYFMP--LASVTN-MP-KHYGKTIKVYEYVPL--LDDRNIN 74 (401) T ss_pred CCccCCCcccccccccccccceeeehhhHHHHHhhhhhhhhhhh--cccccc-cc-cccCCeEEEEecccc--cccccch Confidence 11 111011233332223222222221 2211 111111 11 123778887666654 2222 23 Q ss_pred CCCce--------------------ech--------------hhcccceeeeEeeeeccceeechHHHhhhcchHHHH-H Q lcl|NC_020854. 66 SDSSS--------------------LTP--------------GKITADKQVAAILHRGRAFEARDLAALAAGSDPMAA-I 110 (342) Q Consensus 66 ~~~~~--------------------i~~--------------~~lt~~~~~a~i~~~~k~~~~tD~a~~~~~~dp~~~-i 110 (342) ++|.+ |+. .+++--...+.+++.|.=.++||+..+..-++-+.+ + T Consensus 75 ~eGv~a~G~~~~~g~~y~~~rdv~~it~~m~~~t~~~~rvn~v~~~~~d~~g~l~qyG~~~e~Td~~~dt~~D~~l~~h~ 154 (401) T protein:vir:95 75 DQGIDASGATIVNGNLYGSSKDIGNITSKLPLLTENGGRVNRVGFTRIAREGSIHKFGFFYEFTQESIDFDSDDGLMEHL 154 (401) T ss_pred hcCCCcccccccCccccccccccceeecccccccccccccccccceeeeeeeeeeeccCccchhhhhhhhhcchHHHHHH Confidence 34331 111 133333444555666666689999988776666665 2 Q ss_pred HHHH----HHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccccccccHHHHHHHHHHhCc-c-------- Q lcl|NC_020854. 111 GAKV----ADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDTPTALSPRHVAEARAILGD-Q-------- 177 (342) Q Consensus 111 ~~qi----a~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD-~-------- 177 (342) ...+ +.-..+.++++||+....++-. 
+........+........++++.+-++...+-+ + T Consensus 155 s~ell~g~~~~t~d~i~~dll~ag~~viyA-------g~ats~At~~~~~~~~t~vt~~~l~rl~~~L~~nRapk~t~~i 227 (401) T protein:vir:95 155 SRELMNGATQITEAVLQKDLLAAAGTVLYA-------GAATSDATITGEGSTPSVVSYKNLMRLDQILTENRTPTQTTII 227 (401) T ss_pred HHHHhhhhhhhHHHHHHHHHHhhcCeeecC-------CccceeeeccccccccceechhHHHHHHHHHHhcccccchhhh Confidence 2222 2233445566666432111110 000111223344456667888888887655443 1 Q ss_pred ----------ccCeEEEEEchHHHHHHHhh-------hhhhhhhhhhcccceeeeccceeecccccccceeeeccceEEE Q lcl|NC_020854. 178 ----------GDKLTAVAMHSKVYYDLVER-------RAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIV 240 (342) Q Consensus 178 ----------~~~~~~ivmhS~v~~~L~~~-------~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~~~G~~Vvv 240 (342) -..-.+.+|||....+|+.+ +.++-.++.+...++ +..|+.+-+.|+|+ T Consensus 228 ~~s~~~dTk~i~~s~va~~h~~L~~di~a~~D~~~~~~fi~v~kYa~~~~i~--------------~gEiG~i~~vR~i~ 293 (401) T protein:vir:95 228 TGSRMIDTKVIGATRVMYVGSELVPELKAMKDLFGNKAFIETQHYADAGTIM--------------NGEVGSIDKFRIIQ 293 (401) T ss_pred hhhhccCccccccceEEEEecCchhHHHHHHHhcCCCCceehhhcCCccccc--------------cccccccCceeEEe Confidence 12345689999888888755 356666666654433 46778888889888 Q ss_pred eCCcc--------ee------------ccCCCcceEEEEEecceeEeec-----CC-cceeEeccCCCc---c-eeEEEE Q lcl|NC_020854. 241 SDDVN--------TA------------GSGGSTEYATYFFTQGAVASGE-----QM-AMQTETDRDILA---K-SDAMSI 290 (342) Q Consensus 241 dD~~p--------~~------------~~~~~~~y~t~l~~~GAi~~~~-----k~-~~~ve~dr~~~~---g-~~~l~~ 290 (342) +.-+- .. +.++..+|..++++.-|++... +. ++.+-..+ .+. + .|-|-. T Consensus 294 ~p~~~~w~~ag~~a~~~~~~y~~~~~~~gg~~dVyp~lV~G~dAf~~~~l~g~g~~~~~~~ivk~-pG~~~ad~~DPlgQ 372 (401) T protein:vir:95 294 VPEMLHWAGAGAQATGANPGYRTSMVSGQEHYDVYPMLVVGDDSFTSIGFQTDGKSLKFTVMTKM-PGKETADRNDPYGE 372 (401) T ss_pred cccceeecCCcccccccccccccccccCCCcceeeeeeEEccccceecccccCCccccceeEeec-CCcCCCCCCCcccc Confidence 77522 11 2245558999999999998621 11 11221111 110 0 111111 Q ss_pred eeEEEeeecceeeecC--cCCcChHHhcCCcCceeecCccccceEEEEecCCC Q lcl|NC_020854. 
291 DLHYVYHPVGAKWAVT--TTNPTRAQLETVANWSKVYELKNIGIVRATNVSNF 341 (342) Q Consensus 291 r~~y~~~~~G~s~~~~--~~sPt~~~L~~~~NW~~v~d~k~i~~~~~~~~~~~ 341 (342) | .--|++|--+ ..+| =-++++.+-++| T Consensus 373 ~-----g~vgwK~~~a~~vL~~-------------------e~m~~ies~a~~ 401 (401) T protein:vir:95 373 T-----GFSSIKWYYGILVKRP-------------------ERLALIKTVAPL 401 (401) T ss_pred e-----ehhhhhhhhhhheecc-------------------ceeEEEEeecCC Confidence 1 0112222111 1111 124555555555 No 187 >protein:vir:4074 Length: 480 # NCBI annotation: major capsid (head) protein # Family: family:all:11745 # MgeID: mge:85 # MgeName: c2 # Cross-refs: genbank:acc:NP_043553;genbank:gi:9628687;genbank:GeneID:1261180 Probab=67.20 E-value=0.26 Score=23.83 Aligned_cols=264 Identities=11% Similarity=0.055 Sum_probs=81.3 Q ss_pred Cc-ceeccccchhH--HHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccC-------------------C Q lcl|NC_020854. 1 MA-TLRSDIIIPEV--FTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKAN-------------------L 58 (342) Q Consensus 1 Ma-T~~~d~i~Pev--~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i-------------------~ 58 (342) .+ +...+....|. +..++...... .|.++.. .-..+.....| ..-+|.+... . T Consensus 175 ~~~~~~~~~~~~e~r~~~~~~~~~~e~--~~~~~~~--~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 249 (480) T protein:vir:40 175 IPSEKPEDAERKFMRELGSKMAEMPEQ--GFLREFA--NGADLNVVNSL-GSITSKYARKSGIYDGAMKARFQGLTLAED 249 (480) T ss_pred ccccchhhhhhHHHHHHHHHhccchhh--hhhhhhh--hhccccccccc-cccccchhhheeechhhhhhhhhcceeeec Confidence 11 11111111111 12222111000 1111000 00000000000 0001110000 0 Q ss_pred CCcccc-----cCCCceechhhcccceeeeEee--eeccceeechHHHhhhcc--hHHHHHHHHHHHHHHHHHHHHHHHH Q lcl|NC_020854. 59 SGDFEV-----LSDSSSLTPGKITADKQVAAIL--HRGRAFEARDLAALAAGS--DPMAAIGAKVADYVANQRQKDLLSC 129 (342) Q Consensus 59 ~gda~~-----~~~~~~i~~~~lt~~~~~a~i~--~~~k~~~~tD~a~~~~~~--dp~~~i~~qia~~~~~~~~~~lla~ 129 (342) .++... ...+..-++...++ ..+. ...+-+.....+..+.-. +-...+.++++..+.++.++.+|. 
T Consensus 250 g~~~~~~~~e~~~~~~~~~~~~~~~----~~~~~~~v~~l~~~~k~t~~lLDDa~~l~~~i~~~l~~~~~~~ee~a~l~- 324 (480) T protein:vir:40 250 GVDDTFISGTFKAGTDKNKSQTATK----RSLRPQMAEAYLQMDKATVRGVNDSGALSEYVMSEMVNRVIQKVEYNMIL- 324 (480) T ss_pred cccceeeeeeeeccccccccccccc----chhhHHHHHHHHHhHHHHHHHhhhhHHHHHHHHHHHHHHHHHHHHHHhhc- Confidence 000000 00011111111000 0000 011111111111111111 223336777777777776665543 Q ss_pred HHHHHhhhcccccchhheeeecccccccccccccHHHHHHHHHHhCccccCeE-EEEEchHHHHHHHhhhhhhhhhhhhc Q lcl|NC_020854. 130 LQGVFGSLNANTSSSAFFDLCIDSESGDTPTALSPRHVAEARAILGDQGDKLT-AVAMHSKVYYDLVERRAIDYVSTADA 208 (342) Q Consensus 130 L~g~~~~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~-~ivmhS~v~~~L~~~~li~~~~~s~~ 208 (342) | . +... .....+.... .+.+......+.+.+-...+-.....-+ .|+||+.+...|++. |++++ T Consensus 325 --G---~-g~g~--~~~~g~~~~~-~~~~~~~~~~d~id~L~~al~~~y~~~a~~~vmn~~t~~~I~kl------KD~~G 389 (480) T protein:vir:40 325 --G---S-VDGS--NGFYGLKTAT-DGWTKQIEYTDLFEGITDAVAECSISDAITIVMSPQTFAELRKA------KGTDG 389 (480) T ss_pred --c---C-CCCc--cccccceeec-ccccccchhHHHHHHHHHhhhHHhhCCCCEEEECHHHHHHHHHh------hcCCC Confidence 2 1 0010 1111111111 1111222233444444444433333334 589999999998875 44554 Q ss_pred ccceeeeccceeecccccccceeeeccceEEEeC-----CcceeccCCCcceEEEEEecceeEeecCCcceeEeccCCCc Q lcl|NC_020854. 209 RGTSTTQSGGSMAAAYGGEVSVPTYMGLRVIVSD-----DVNTAGSGGSTEYATYFFTQGAVASGEQMAMQTETDRDILA 283 (342) Q Consensus 209 ~~~~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD-----~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~~~ve~dr~~~~ 283 (342) + +..+. ........+.+|++|++++ ++|.+... .. ++.++.. ..+...+.++.. T Consensus 390 ~--Yi~q~-------~~~~~~~~~llG~pvv~~~~~~~~~~~~~~~~--~~---------~~~~~d~-~~~~~~~~~~~~ 448 (480) T protein:vir:40 390 H--SRFNE-------LATKEQIAQSFGAVNLETRVWMPKDEVAVYNH--DE---------YVLIGDL-NVENYNDFDLRY 448 (480) T ss_pred C--eeccC-------cccccCcceecccceeeeeccccCCcceeeeC--Cc---------cEEEEec-ccceeccccccc Confidence 3 32221 2234567899999998874 33333211 11 2223321 122222323222 Q ss_pred ceeEEEEeeEEEeeecce---ee--ecCcCCc Q lcl|NC_020854. 
284 KSDAMSIDLHYVYHPVGA---KW--AVTTTNP 310 (342) Q Consensus 284 g~~~l~~r~~y~~~~~G~---s~--~~~~~sP 310 (342) -...+..+.+-..++... .+ ..++.-- T Consensus 449 ~~~~~~~e~~v~g~~~~~~~~~~~~~~~~~~~ 480 (480) T protein:vir:40 449 NVEQWLSETLVGGSIRGKNRSAYLKKKGSLGV 480 (480) T ss_pred chhhhhhhhhhceeeEccccEEEEEeccCcCC Confidence 222233333322222221 11 1111111 No 188 >protein:vir:2106 Length: 430 # NCBI annotation: coat protein # Family: family:all:1412 # MgeID: mge:46 # MgeName: P22 # Cross-refs: genbank:acc:NP_059630;genbank:gi:9635538;genbank:GeneID:1262831 Probab=54.44 E-value=0.5 Score=22.22 Aligned_cols=303 Identities=15% Similarity=0.091 Sum_probs=121.6 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchhhhccCCCCEEEccccccCCCCcccccCC-Cceechhhcccc Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTELNATEGGDFINVPFWKANLSGDFEVLSD-SSSLTPGKITAD 79 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~l~~~~~G~ti~~P~~~~i~~gda~~~~~-~~~i~~~~lt~~ 79 (342) ||+++..+...-+ +=+.+-+.+.+.|.++.-.-+......+.-||+|.+|.=. . +....-.+ ... +..+..+ T Consensus 1 Ma~~~~~~lti~~--~eal~~~~n~lV~a~~~~~~r~~d~~~~r~Gdti~ip~p~-~--~~~~~G~~~t~~--~~~~~e~ 73 (430) T protein:vir:21 1 MALNEGQIVTLAV--DEIIETISAITPMAQKAKKYTPPAASMQRSSNTIWMPVEQ-E--SPTQEGWDLTDK--ATGLLEL 73 (430) T ss_pred CccccchhhHHHH--HHHHHHhhhhhhhhhhhhccCCchhhhhcccceEEeeccc-c--ccccccccccCC--Cccceee Confidence 9988655533222 2233344455556543222222222223459999998521 1 22111110 111 1223333 Q ss_pred eeeeEee-eeccceeechHHHhhhcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccccchhheeeecccccccc Q lcl|NC_020854. 80 KQVAAIL-HRGRAFEARDLAALAAGSDPMAAIGAKVADYVANQRQKDLLSCLQGVFGSLNANTSSSAFFDLCIDSESGDT 158 (342) Q Consensus 80 ~~~a~i~-~~~k~~~~tD~a~~~~~~dp~~~i~~qia~~~~~~~~~~lla~L~g~~~~~~a~~~~~~~~~~~~~~~~~~~ 158 (342) +-..++- .++-.++.+ +.++...+...++.+.-...++.+++.+|++.+... +..+... .... .+.. 
T Consensus 74 ~v~~~~~~~~~V~~~~~--~kEl~~~~~~er~l~pAm~~LA~~Vd~dl~~~~~~~-~~~v~~~--------~~~t-~~~~ 141 (430) T protein:vir:21 74 NVAVNMGEPDNDFFQLR--ADDLRDETAYRRRIQSAARKLANNVELKVANMAAEM-GSLVITS--------PDAI-GTNT 141 (430) T ss_pred eEeEEEeeeccceEEee--hhHhcChhhHHHHHHHHHHHHHHHHHHHHHHHhhhh-hhccccc--------cCCC-CCCC Confidence 3332222 233335555 333556677788888888999999999999875321 1111100 0000 1111 Q ss_pred cccccHHHHHHHHHHhCcc---ccCeEEEEEchHHHHHHHhh-hhhhhhhhhhcccceeeeccceeecccccccceee-e Q lcl|NC_020854. 159 PTALSPRHVAEARAILGDQ---GDKLTAVAMHSKVYYDLVER-RAIDYVSTADARGTSTTQSGGSMAAAYGGEVSVPT-Y 233 (342) Q Consensus 159 ~~~~~~~~l~~A~~~~GD~---~~~~~~ivmhS~v~~~L~~~-~li~~~~~s~~~~~~~~~~~~~~~~~~~~~~~i~~-~ 233 (342) ... .+.+.++.+.|-+. .+.-...+++|..+..|-.. .-+.+-... .. .. .-+..|+. + T Consensus 142 ~~~--~~~~A~a~~~L~~~~vP~~~~R~~~~~p~~~~~l~~~l~~~~~~~~~---~~----------~A-~r~g~i~r~~ 205 (430) T protein:vir:21 142 ADA--WNFVADAEEIMFSRELNRDMGTSYFFNPQDYKKAGYDLTKRDIFGRI---PE----------EA-YRDGTIQRQV 205 (430) T ss_pred Ccc--hhhHHHHHHHHHHhcCCCCCCcEEEeChHHHHHHhhhhccccccccc---hh----------HH-Hhhccccccc Confidence 122 34455555566654 33356889999999988542 111111000 00 00 01223332 3 Q ss_pred ccce-EEEeCCcceeccCCCcceEEEEEecceeEe-------e-cCCccee--------EeccCCCcceeEEEEeeEEEe Q lcl|NC_020854. 234 MGLR-VIVSDDVNTAGSGGSTEYATYFFTQGAVAS-------G-EQMAMQT--------ETDRDILAKSDAMSIDLHYVY 296 (342) Q Consensus 234 ~G~~-VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~-------~-~k~~~~v--------e~dr~~~~g~~~l~~r~~y~~ 296 (342) .|.. +.-++.+|....+...-++ ..||-.+ . 
.+.-.++ .+.-.....-|.+.-...|.+ T Consensus 206 ~Gfd~~~~s~~~~~~t~gt~t~~t----v~gA~~~~~~~~tv~~~g~~~~~d~~~~~it~s~tg~l~~GD~ftiaGV~~v 281 (430) T protein:vir:21 206 AGFDDVLRSPKLPVLTKSTATGIT----VSGAQSFKPVAWQLDNDGNKVNVDNRFATVTLSATTGMKRGDKISFAGVKFL 281 (430) T ss_pred chhhhhhhcCCcccccCccCcCce----eccccccccccceeccccccccccccceeeeeecccceecccEEEecceeee Confidence 4543 3445556554322222111 2233211 0 0000000 000111222233443444444 Q ss_pred eecc-----e--eeec-----C---cCCcC-----h--------------HHhcCCcCc----------eeecCccccce Q lcl|NC_020854. 297 HPVG-----A--KWAV-----T---TTNPT-----R--------------AQLETVANW----------SKVYELKNIGI 332 (342) Q Consensus 297 ~~~G-----~--s~~~-----~---~~sPt-----~--------------~~L~~~~NW----------~~v~d~k~i~~ 332 (342) |+.= . .|.+ + .++|. + +.+++++.= +++|.+.+|.+ T Consensus 282 ~~itk~~~~~l~qf~V~a~~~~ttv~I~Pai~~~~~~~~~~~~~~y~nVsaspa~~aavT~v~~a~~~~Nl~fh~~A~~L 361 (430) T protein:vir:21 282 GQMAKNVLAQDATFSVVRVVDGTHVEITPKPVALDDVSLSPEQRAYANVNTSLADAMAVNILNVKDARTNVFWADDAIRI 361 (430) T ss_pred ccccccccCCcceEEEEEecCCceeEEeecccccccccccccccccceeccccccCceeEEeccCCcccceeEccceeEE Confidence 4311 1 1110 0 01121 1 122222211 23455555555 Q ss_pred EEEEecCCCC Q lcl|NC_020854. 333 VRATNVSNFD 342 (342) Q Consensus 333 ~~~~~~~~~~ 342 (342) +-.---.+-+ T Consensus 362 a~~pl~~p~~ 371 (430) T protein:vir:21 362 VSQPIPANHE 371 (430) T ss_pred EEecccCCCC Confidence 5443322222 No 189 >protein:vir:4902 Length: 348 # NCBI annotation: gp348 # Family: family:all:1083 # MgeID: mge:107 # MgeName: Sfi11 # Cross-refs: genbank:acc:NP_056680;genbank:gi:9635015;genbank:GeneID:1262657 Probab=54.00 E-value=0.51 Score=22.17 Aligned_cols=308 Identities=11% Similarity=0.110 Sum_probs=121.7 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchh----h---hccCCCCEEEccccccCCCCcccccCCCceech Q lcl|NC_020854. 
1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTE----L---NATEGGDFINVPFWKANLSGDFEVLSDSSSLTP 73 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~----l---~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~ 73 (342) |++ +.|+|+|..+..|+.+.......|+...+...... + .+ ..|..+-.|+-... .. +.-..- T Consensus 1 M~~-l~d~f~~~~l~~~v~~~~~~~~~~l~~~~Fp~~~~~~~~~~~~~~-~~~~~~~a~~v~~~--~~------~~~~~r 70 (348) T protein:vir:49 1 MGL-IYDKVTASNIAGYFNALQENVDSTLGESIFPARKQLGTKLSYITG-ASGQSVALKAAAFD--TN------VTVRDR 70 (348) T ss_pred Ccc-hhhhcCHHHHHHHHHhccccchhhhHhhcCCCccccCceeEEEEe-ecCceeeeeeecCC--CC------cceecc Confidence 995 89999999999999865433333322212211111 1 11 12444444543322 11 111111 Q ss_pred hhcccceeeeEeeeeccceeechHHHh---hhcc-hH-HHHHHHHHH-------HHHHHHHHHHHHHHHH-HHHhhhccc Q lcl|NC_020854. 74 GKITADKQVAAILHRGRAFEARDLAAL---AAGS-DP-MAAIGAKVA-------DYVANQRQKDLLSCLQ-GVFGSLNAN 140 (342) Q Consensus 74 ~~lt~~~~~a~i~~~~k~~~~tD~a~~---~~~~-dp-~~~i~~qia-------~~~~~~~~~~lla~L~-g~~~~~~a~ 140 (342) +..++..-.....+....+.+.|.-.+ .... .+ .+.+.++++ +...+..+..+...|. |-+.- .. T Consensus 71 ~~~~~~~~~~p~i~~~~~i~~~d~~~l~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~i--~~ 148 (348) T protein:vir:49 71 VSAEMHDEQMPFFKEAMLVKENDRQQLNLVKDSGNAALVNTIVAGIFNDNLTLVNGARARLEAMRMQVLATGKIAF--TS 148 (348) T ss_pred cceeeeeeecCccccccccCHHHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeEEE--ec Confidence 122222222222233444566664332 2212 11 223334433 3455555555555542 32211 00 Q ss_pred ccchhheee--------ecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhh-hhhhhhhcccc Q lcl|NC_020854. 141 TSSSAFFDL--------CIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAI-DYVSTADARGT 211 (342) Q Consensus 141 ~~~~~~~~~--------~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li-~~~~~s~~~~~ 211 (342) .......|. +....=.++.+.+ ...|.+.....-+.......++|.++++..|+++.-+ +.+....... 
T Consensus 149 ~g~~~~vdyg~~~~~~~t~~~~W~~~~adp-~~di~~~~~~~~~~G~~~~~ii~~~~~~~~l~~~~~v~~~~~~~~~~~- 226 (348) T protein:vir:49 149 DGVNKDIDYGVKPDHKKQVSKSWAEPGATP-LADLEDAIETARELGLNPERAVMNAKTFGLIRKAASTVKVIKPLAGDG- 226 (348) T ss_pred CCceEEEeecCCcccceeeeeccCCCCCCH-HHHHHHHHHHHHhcCCcccEEEeCHHHHHHHhcCHHHHHHhhccCccc- Confidence 000011111 0000000111111 1234444444434445677899999999999987433 2332211110 Q ss_pred eeeeccceeecccccccceeeeccceEEEeCCcceeccCCCcceEEEEEecceeEeecCCc-------ceeEec-c--CC Q lcl|NC_020854. 212 STTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTEYATYFFTQGAVASGEQMA-------MQTETD-R--DI 281 (342) Q Consensus 212 ~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~y~t~l~~~GAi~~~~k~~-------~~ve~d-r--~~ 281 (342) +.+. .......+.++.|.+|++=|.--...+++.. .++-.+.+.+..... ...|.+ . +. T Consensus 227 ------~~i~-~~~~~~~~~~~~g~~i~~y~~~y~d~dG~~~----~~~p~~~v~l~~~~~~G~~~yg~~~e~~~~~~~~ 295 (348) T protein:vir:49 227 ------SSVT-KAELDNYIADNFGVTVVLENGTYRNEKGEVS----KFFPDGHLTLIPNGPLGNTVFGTTPEESDLFADN 295 (348) T ss_pred ------cccc-HHHHHHHHHhhcCceEEEEeeEEEecCCcEe----eeecCCeEEEecCCCcceeEEecChhhhhhcccc Confidence 0111 0111223445567667665554333332221 122333333221110 001111 0 10 Q ss_pred CcceeEEEEeeEEEeeecceeeecCcCCcChHHhcCCcCceee-cCccccceEEEEecC Q lcl|NC_020854. 282 LAKSDAMSIDLHYVYHPVGAKWAVTTTNPTRAQLETVANWSKV-YELKNIGIVRATNVS 339 (342) Q Consensus 282 ~~g~~~l~~r~~y~~~~~G~s~~~~~~sPt~~~L~~~~NW~~v-~d~k~i~~~~~~~~~ 339 (342) ......-..+.++.+ -+|... .|....+...+..=-| .++..+-++.+++-. 
T Consensus 296 ~~~~~~~~~~~~~~~----~~~~~~--dP~~~~~~~~s~~lPv~~~~~~~~~a~Vl~~~ 348 (348) T protein:vir:49 296 TVNADVEIVDNGIAV----TTTKTT--DPVNVQTKVSMVALPSFERLDDVYMLTVIPAV 348 (348) T ss_pred ccccceeecCCeEEE----eeeecC--CCceEEEEEeeeccccccCCCcEEEEEEecCC Confidence 111111111221211 134332 4555555544444333 356677777777666 No 190 >protein:vir:2736 Length: 348 # NCBI annotation: putative structural protein # Family: family:all:1083 # MgeID: mge:58 # MgeName: O1205 # Cross-refs: genbank:acc:NP_695109;genbank:gi:23455878;genbank:GeneID:955608 Probab=50.52 E-value=0.61 Score=21.77 Aligned_cols=311 Identities=10% Similarity=0.098 Sum_probs=120.8 Q ss_pred CcceeccccchhHHHHHHHhhhHHhhhhhhcCccccchh----h---hccCCCCEEEccccccCCCCcccccCCCceech Q lcl|NC_020854. 1 MATLRSDIIIPEVFTPYVIEQTTQRDAFLASGVVQPMTE----L---NATEGGDFINVPFWKANLSGDFEVLSDSSSLTP 73 (342) Q Consensus 1 MaT~~~d~i~Pev~~~yv~~~~~~~~~f~~sg~~~~d~~----l---~~~~~G~ti~~P~~~~i~~gda~~~~~~~~i~~ 73 (342) |+ ++.|+|+|..+..|+.+.......|+...+...... + .++ .|..+-.|+-... ..+ .-..- T Consensus 1 M~-~i~d~f~~~~l~~~v~~~~~~~~~~l~~~~Fp~~~~~~~~~~~~~~~-~~~~~~a~~v~~~--~~~------~~~~r 70 (348) T protein:vir:27 1 MG-LIYDKVTASNIAGYFNALQENVSSTLGESIFPARKQLGTKLSYIKGA-SGQSVALKAAAFD--TNV------TIRDR 70 (348) T ss_pred Cc-chhhhcCHHHHHHHHHhccchhhhhhHhhcCCCccccceeEEEEeec-cCceeEeeeecCC--CCc------ceecc Confidence 99 489999999999999765443333322222221111 1 111 2333333433221 111 11111 Q ss_pred hhcccceeeeEeeeeccceeechHHHh---hhcchH--HHHHHHHH-------HHHHHHHHHHHHHHHH-HHHHhhhccc Q lcl|NC_020854. 74 GKITADKQVAAILHRGRAFEARDLAAL---AAGSDP--MAAIGAKV-------ADYVANQRQKDLLSCL-QGVFGSLNAN 140 (342) Q Consensus 74 ~~lt~~~~~a~i~~~~k~~~~tD~a~~---~~~~dp--~~~i~~qi-------a~~~~~~~~~~lla~L-~g~~~~~~a~ 140 (342) +.+++.+-.....+...-+.+.|.-.+ ....+| +..+.+++ .+.+.++.+..+...| +|-+.-. . 
T Consensus 71 ~~~~~~~~~~p~i~~~~~i~~~d~~~~~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~~al~~Gki~i~--~ 148 (348) T protein:vir:27 71 VSAEMHDEQMPFFKEAMLVKENDRQQLNLVKDSGNAVLVNTIVAGIFNDNLTLVNGARARLEAMRMQVLATGKIAFT--S 148 (348) T ss_pred cceeeeeeecCccccccccCHHHHHHHHHhhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCeeEEe--c Confidence 111211111122223334455554332 222221 22343433 3455555555555555 2332110 1 Q ss_pred ccchhhee--------eecccccccccccccHHHHHHHHHHhCccccCeEEEEEchHHHHHHHhhhhh-hhhhhhhcccc Q lcl|NC_020854. 141 TSSSAFFD--------LCIDSESGDTPTALSPRHVAEARAILGDQGDKLTAVAMHSKVYYDLVERRAI-DYVSTADARGT 211 (342) Q Consensus 141 ~~~~~~~~--------~~~~~~~~~~~~~~~~~~l~~A~~~~GD~~~~~~~ivmhS~v~~~L~~~~li-~~~~~s~~~~~ 211 (342) .......| ++....=.++.+.+ .+.|.+....+-+..-....++|.++++..|+++.-+ +.+...... T Consensus 149 ~~~~~~vdfg~~~~~~~t~~~~W~~~~adp-~~di~~~~~~~~~~G~~~~~ii~~~~~~~~l~~~~~v~~~~~~~~~~-- 225 (348) T protein:vir:27 149 DGVNKDIDYGVKPDHKKQVSKSWAEPGATP-LADLEDAIETARELGLNPERAVMNAKTFGLIRKAASTVKVIKPLAGD-- 225 (348) T ss_pred CCeeEEEeecCCcccceeeeeccCCCCCCH-HHHHHHHHHHHHhcCCcccEEEECHHHHHHHhcCHHHHHHhcccCcc-- Confidence 00111111 11001001111111 2445555555544445678899999999999987433 222211100 Q ss_pred eeeeccceeecccccccceeeeccceEEEeCCcceeccCCCcc----eEEEEEecceeEee-cCCcceeEeccCCCcc-- Q lcl|NC_020854. 212 STTQSGGSMAAAYGGEVSVPTYMGLRVIVSDDVNTAGSGGSTE----YATYFFTQGAVASG-EQMAMQTETDRDILAK-- 284 (342) Q Consensus 212 ~~~~~~~~~~~~~~~~~~i~~~~G~~VvvdD~~p~~~~~~~~~----y~t~l~~~GAi~~~-~k~~~~ve~dr~~~~g-- 284 (342) .+.+.. .....-++++.|..|++-|.--...+++... .+..++..|.++.. .+..++ +.+...+.. 
T Consensus 226 -----~~~i~~-~~~~~~~~~~~g~~i~~yd~~y~d~~G~~~~~~p~~~vvl~~~~~~G~~~yG~~~e-~~~~~~~~~~~ 298 (348) T protein:vir:27 226 -----GSAVTK-AELENYIADNFGVSIVLENGTYRNDKGEVSKFYPDGHLTLIPNGPLGNTVFGTTPE-ESDLFADNTVN 298 (348) T ss_pred -----ccccCH-HHHHHHHHhhcCceEEEEeeEEEcCCCcCcccccCCeEEEEcCCcceeEEeccCcc-hhhhhhccccc Confidence 011111 1112234455677777666543333333222 22333333433321 111111 111111000 Q ss_pred eeEEEEeeEEEeeecceeeecCcCCcChHHhcCCcCceeec-CccccceEEEEecC Q lcl|NC_020854. 285 SDAMSIDLHYVYHPVGAKWAVTTTNPTRAQLETVANWSKVY-ELKNIGIVRATNVS 339 (342) Q Consensus 285 ~~~l~~r~~y~~~~~G~s~~~~~~sPt~~~L~~~~NW~~v~-d~k~i~~~~~~~~~ 339 (342) ..+-....+ .+--+|.. ..|....+...+..=-|. ++.++-++.+.+-. T Consensus 299 ~~~~~~~~~----~~~~~~~~--~dP~~~~~~~~s~~lPv~~~~~~~~~a~Vl~~~ 348 (348) T protein:vir:27 299 AEVEIVDNG----IAVTTTKT--TDPVNVQTKVSMVALPSFERLDDVYMLTVIPAV 348 (348) T ss_pred cceeeeCCe----eEEEeeec--CCCceEEEEEeeeeeccccCCCcEEEEEEecCC Confidence 001001111 11123432 245555555554444333 45566666665555 Done!