Query lcl|NC_015466.1_cdsid_YP_004421842.1 [gene=RDJLphi1_gp74] [protein=capsid related protein] [protein_id=YP_004421842.1] [location=46906..47940]
Match_columns 344
No_of_seqs 136 out of 163
Neff 7.7
Searched_HMMs 1612
Date Thu Nov 7 13:44:55 2013
Command /home/guerois/workspace/virfam/python/lib/hhsearch//hhsearch2 -i .//seq/seq_74 -d /home/guerois/workspace/virfam/python/profile_database/capsid_neck_tail.hhm -glob -cpu 7 -o .//seq/HHR/seq_74_vs_rec_db.hhr

No Hit Prob E-value P-value Score SS Cols Query HMM Template HMM
1 protein:vir:99888 Length: 309 100.0 1.6E-95 9.7E-99 540.3 22.8 307 5-344 1-308 (309)
2 protein:vir:107882 Length: 307 100.0 4.9E-93 3E-96 526.7 23.2 301 4-344 1-307 (307)
3 protein:vir:79078 Length: 307 100.0 1.7E-92 1.1E-95 523.7 23.5 301 4-344 1-307 (307)
4 protein:vir:106590 Length: 349 100.0 1.6E-36 9.7E-40 216.9 16.6 322 1-343 1-349 (349)
5 protein:vir:96490 Length: 348 100.0 3.6E-35 2.3E-38 209.4 20.6 321 1-344 1-346 (348)
6 protein:vir:98480 Length: 348 100.0 4.9E-35 3E-38 208.7 20.3 323 1-344 1-348 (348)
7 protein:vir:2736 Length: 348 # 100.0 6.1E-35 3.8E-38 208.2 19.7 318 1-344 1-346 (348)
8 protein:vir:4902 Length: 348 # 100.0 2.1E-34 1.3E-37 205.2 19.7 320 1-344 1-346 (348)
9 protein:vir:78006 Length: 409 99.9 2.5E-29 1.5E-32 177.4 19.7 331 1-344 3-389 (409)
10 protein:vir:79503 Length: 409 99.9 2.5E-29 1.5E-32 177.4 19.7 331 1-344 3-389 (409)
11 protein:vir:6378 Length: 346 # 99.9 2.4E-27 1.5E-30 166.5 18.3 314 1-343 1-346 (346)
12 protein:vir:393 Length: 341 # 99.9 4.3E-24 2.7E-27 148.7 19.5 311 1-344 1-341 (341)
13 protein:vir:3424 Length: 341 # 99.9 2E-23 1.2E-26 145.1 18.3 315 1-344 1-341 (341)
14 protein:vir:108211 Length: 318 99.3 5E-14 3.1E-17 93.5 11.1 284 1-344 1-317 (318)
15 protein:vir:10324 Length: 320 98.9 2.6E-10 1.6E-13 73.1 15.0 294 1-344 4-315 (320)
16 protein:vir:95258 Length: 368 97.8 2.4E-06 1.5E-09 51.4 13.2 321 1-344 1-364 (368)
17 protein:vir:9820 Length: 272 # 96.9 7.4E-05 4.6E-08 43.2 12.4 261 1-344 1-269
(272)
18 protein:vir:3033 Length: 272 # 96.9 7.4E-05 4.6E-08 43.2 12.4 261 1-344 1-269 (272)
19 protein:vir:93742 Length: 274 96.9 8.8E-05 5.5E-08 42.8 12.5 266 1-344 1-270 (274)
20 protein:vir:80930 Length: 278 96.8 5.9E-05 3.7E-08 43.8 11.1 275 1-344 1-277 (278)
21 protein:vir:97433 Length: 274 96.8 0.00012 7.5E-08 42.0 12.8 266 1-344 1-270 (274)
22 protein:vir:94494 Length: 274 96.8 0.00012 7.5E-08 42.0 12.8 266 1-344 1-270 (274)
23 protein:vir:80684 Length: 315 96.6 0.00028 1.7E-07 40.0 13.6 303 1-344 1-306 (315)
24 protein:vir:94771 Length: 298 96.3 0.00023 1.4E-07 40.6 11.3 294 1-343 1-298 (298)
25 protein:vir:96123 Length: 274 96.3 0.00022 1.3E-07 40.7 11.1 269 1-344 1-270 (274)
26 protein:vir:7771 Length: 330 # 96.3 0.00017 1E-07 41.3 10.4 299 1-344 1-323 (330)
27 protein:vir:9574 Length: 300 # 96.2 0.00064 4E-07 38.1 13.3 294 1-344 1-300 (300)
28 protein:vir:1239 Length: 274 # 96.1 0.00037 2.3E-07 39.4 11.6 265 1-344 1-270 (274)
29 protein:vir:3613 Length: 272 # 96.0 0.00031 1.9E-07 39.8 10.5 266 1-344 1-272 (272)
30 protein:vir:1638 Length: 298 # 95.4 0.001 6.5E-07 36.9 11.3 291 1-343 1-298 (298)
31 protein:vir:739 Length: 231 # 95.4 0.0018 1.1E-06 35.7 12.6 231 41-344 1-231 (231)
32 protein:vir:96262 Length: 274 95.2 0.0016 9.9E-07 35.9 11.6 264 1-344 1-269 (274)
33 protein:vir:95898 Length: 274 95.2 0.0016 9.9E-07 35.9 11.6 264 1-344 1-269 (274)
34 protein:vir:8187 Length: 311 # 95.2 0.0026 1.6E-06 34.8 12.9 302 1-344 1-310 (311)
35 protein:vir:105334 Length: 276 95.0 0.0015 9.4E-07 36.0 11.0 266 1-344 1-270 (276)
36 protein:vir:1886 Length: 385 # 95.0 0.0028 1.7E-06 34.6 12.3 272 1-344 105-384 (385)
37 protein:vir:191 Length: 385 # 95.0 0.0028 1.7E-06 34.6 12.3 272 1-344 105-384 (385)
38 protein:vir:9309 Length: 324 # 94.9 0.0019 1.2E-06 35.5 11.3 284 1-344 27-315 (324)
39 protein:vir:96223 Length: 324 94.9 0.0018 1.1E-06 35.6 11.2 284 1-344 27-315 (324)
40 protein:vir:78830 Length: 324 94.9 0.0016 1E-06 35.9 10.9 287 1-344 27-315 (324)
41
protein:vir:96392 Length: 324 94.9 0.0016 1E-06 35.9 10.9 287 1-344 27-315 (324)
42 protein:vir:78523 Length: 338 94.8 0.0024 1.5E-06 34.9 11.6 309 1-344 1-335 (338)
43 protein:vir:104085 Length: 320 94.8 0.0019 1.2E-06 35.5 10.9 293 1-344 14-317 (320)
44 protein:vir:99920 Length: 311 94.6 0.0039 2.4E-06 33.8 12.9 303 1-344 1-311 (311)
45 protein:vir:99749 Length: 324 94.0 0.0057 3.5E-06 32.9 12.9 287 1-344 27-315 (324)
46 protein:vir:2430 Length: 318 # 93.8 0.0035 2.1E-06 34.1 10.4 290 1-344 14-313 (318)
47 protein:vir:81070 Length: 390 93.8 0.0058 3.6E-06 32.9 11.4 270 1-342 113-390 (390)
48 protein:vir:96833 Length: 275 93.4 0.0056 3.5E-06 32.9 10.7 263 1-344 3-271 (275)
49 protein:vir:94142 Length: 304 93.3 0.0081 5E-06 32.1 11.5 287 1-343 1-304 (304)
50 protein:vir:105905 Length: 304 93.3 0.0081 5E-06 32.1 11.5 287 1-343 1-304 (304)
51 protein:vir:78223 Length: 333 93.0 0.008 5E-06 32.1 11.0 311 1-344 8-332 (333)
52 protein:vir:97053 Length: 390 92.9 0.0094 5.9E-06 31.7 11.2 270 1-342 113-390 (390)
53 protein:vir:103955 Length: 324 92.5 0.011 7E-06 31.3 12.7 284 1-344 27-315 (324)
54 protein:vir:97255 Length: 310 92.3 0.012 7.4E-06 31.1 11.1 298 1-344 1-310 (310)
55 protein:vir:4226 Length: 326 # 92.3 0.012 7.5E-06 31.1 11.3 292 1-344 1-323 (326)
56 protein:vir:94673 Length: 419 92.2 0.013 7.8E-06 31.0 12.7 284 1-344 123-417 (419)
57 protein:vir:94933 Length: 330 92.0 0.013 8.3E-06 30.9 12.6 299 1-344 25-329 (330)
58 protein:vir:97148 Length: 324 91.7 0.015 9.2E-06 30.6 13.0 284 1-344 27-315 (324)
59 protein:vir:95107 Length: 270 91.4 0.012 7.7E-06 31.0 10.0 260 1-344 1-267 (270)
60 protein:vir:10364 Length: 390 91.3 0.016 1E-05 30.4 11.2 270 1-342 114-390 (390)
61 protein:vir:41 Length: 299 # N 91.2 0.017 1E-05 30.3 11.9 288 1-344 6-298 (299)
62 protein:vir:4339 Length: 395 # 90.6 0.02 1.2E-05 29.9 11.6 275 1-344 114-395 (395)
63 protein:vir:2344 Length: 397 # 89.6 0.019 1.2E-05 30.0 9.6 288 1-344 10-306 (397)
64 protein:vir:81227 Length: 413 88.9 0.029
1.8E-05 29.0 14.9 278 1-344 118-410 (413)
65 protein:vir:94070 Length: 339 88.6 0.031 1.9E-05 28.8 14.9 279 1-337 43-339 (339)
66 protein:vir:9759 Length: 303 # 87.8 0.036 2.2E-05 28.5 12.5 292 1-344 1-303 (303)
67 protein:vir:100135 Length: 418 87.1 0.041 2.5E-05 28.2 12.2 273 1-344 136-415 (418)
68 protein:vir:80376 Length: 435 86.8 0.043 2.7E-05 28.1 11.4 297 1-344 132-433 (435)
69 protein:vir:4600 Length: 415 # 86.5 0.045 2.8E-05 28.0 11.7 281 1-344 120-404 (415)
70 protein:vir:4700 Length: 415 # 86.5 0.045 2.8E-05 28.0 11.7 281 1-344 120-404 (415)
71 protein:vir:80068 Length: 301 85.9 0.049 3E-05 27.8 13.4 287 1-342 1-301 (301)
72 protein:vir:5255 Length: 304 # 83.8 0.054 3.3E-05 27.6 8.6 289 1-335 1-304 (304)
73 protein:vir:2504 Length: 305 # 81.9 0.081 5E-05 26.6 12.6 292 1-344 1-298 (305)
74 protein:vir:9410 Length: 415 # 81.0 0.089 5.5E-05 26.3 11.2 276 1-344 119-404 (415)
75 protein:vir:8102 Length: 543 # 80.4 0.095 5.9E-05 26.2 13.2 287 1-344 249-542 (543)
76 protein:vir:105004 Length: 392 79.9 0.1 6.2E-05 26.1 12.0 272 1-344 106-384 (392)
77 protein:vir:102082 Length: 392 79.9 0.1 6.2E-05 26.1 12.0 272 1-344 106-384 (392)
78 protein:vir:107593 Length: 392 79.9 0.1 6.2E-05 26.1 12.0 272 1-344 106-384 (392)
79 protein:vir:102873 Length: 392 79.9 0.1 6.2E-05 26.1 12.0 272 1-344 106-384 (392)
80 protein:vir:102119 Length: 404 79.5 0.1 6.4E-05 26.0 12.4 276 1-344 110-400 (404)
81 protein:vir:1433 Length: 435 # 77.4 0.13 7.8E-05 25.5 11.1 296 1-344 132-433 (435)
82 protein:vir:3643 Length: 336 # 77.3 0.13 7.8E-05 25.5 15.4 282 1-337 31-336 (336)
83 protein:vir:99576 Length: 388 75.0 0.15 9.4E-05 25.1 14.1 290 1-337 73-388 (388)
84 protein:vir:79642 Length: 329 73.7 0.17 0.0001 24.8 13.3 289 1-332 26-329 (329)
85 protein:vir:79987 Length: 415 72.9 0.18 0.00011 24.7 10.9 277 1-344 120-404 (415)
86 protein:vir:81100 Length: 415 72.9 0.18 0.00011 24.7 10.9 277 1-344 120-404 (415)
87 protein:vir:98339 Length: 415 72.9 0.18 0.00011 24.7 10.9 277 1-344 120-404 (415)
88 protein:vir:4456 Length: 401 # 71.9 0.19 0.00012 24.5 11.5 287 1-344 107-401 (401)
89 protein:vir:78090 Length: 302 71.5 0.14 8.4E-05 25.3 7.0 288 1-344 1-302 (302)
90 protein:vir:3158 Length: 321 # 71.4 0.2 0.00012 24.5 11.0 286 1-344 1-311 (321)
91 protein:vir:78558 Length: 336 65.9 0.28 0.00017 23.6 14.2 285 1-337 31-336 (336)
92 protein:vir:107687 Length: 319 64.8 0.29 0.00018 23.5 13.0 280 1-342 1-319 (319)
93 protein:vir:1084 Length: 437 # 60.3 0.38 0.00023 22.9 10.5 263 1-344 156-427 (437)
94 protein:vir:3845 Length: 395 # 59.3 0.4 0.00025 22.8 11.5 271 1-344 105-383 (395)
95 protein:vir:107732 Length: 379 57.3 0.44 0.00027 22.6 12.4 286 1-337 49-379 (379)
96 protein:vir:101557 Length: 336 54.0 0.51 0.00032 22.2 15.5 282 1-337 39-336 (336)
97 protein:vir:106734 Length: 336 52.9 0.54 0.00034 22.0 14.2 280 1-337 31-336 (336)
98 protein:vir:1383 Length: 421 # 50.2 0.62 0.00038 21.7 11.7 262 1-344 116-394 (421)
99 protein:vir:95763 Length: 297 47.2 0.71 0.00044 21.4 11.8 282 1-344 9-296 (297)
100 protein:vir:80213 Length: 334 46.8 0.72 0.00045 21.4 9.7 294 1-342 1-334 (334)
101 protein:vir:103285 Length: 296 39.2 1 0.00064 20.5 13.1 281 1-344 1-295 (296)
102 protein:vir:81160 Length: 371 37.6 1.1 0.00069 20.3 11.5 271 1-344 91-371 (371)
103 protein:vir:79928 Length: 393 37.0 1.1 0.00071 20.3 11.9 293 1-344 59-393 (393)
104 protein:vir:100884 Length: 389 37.0 1.1 0.00071 20.3 11.6 267 1-344 109-384 (389)
105 protein:vir:100172 Length: 394 36.7 1.2 0.00072 20.2 10.6 263 1-344 111-384 (394)
106 protein:vir:102335 Length: 312 36.2 1.2 0.00074 20.2 8.3 284 1-336 1-312 (312)
107 protein:vir:10450 Length: 344 34.5 1.3 0.0008 20.0 8.1 318 1-344 1-344 (344)
108 protein:vir:94711 Length: 347 33.1 1.4 0.00085 19.8 11.8 307 1-344 1-346 (347)
109 protein:vir:100247 Length: 425 33.0 1.4 0.00086 19.8 12.1 286 1-344 130-424 (425)
110 protein:vir:104342 Length: 314 31.0 1.5 0.00095 19.6 11.2 283 1-340 19-314 (314)
111 protein:vir:4856 Length: 293 # 27.3 1.9 0.0012 19.1 10.8
272 1-344 5-281 (293)
112 protein:vir:94622 Length: 341 26.9 1.9 0.0012 19.1 9.3 304 1-344 3-339 (341)
113 protein:vir:78739 Length: 332 25.6 2 0.0013 18.9 7.6 307 1-342 1-332 (332)
114 protein:vir:4997 Length: 397 # 25.3 2.1 0.0013 18.9 10.1 268 1-344 109-386 (397)
115 protein:vir:79712 Length: 285 22.4 2.5 0.0015 18.4 9.2 273 1-344 1-283 (285)
116 protein:vir:1541 Length: 347 # 21.8 2.6 0.0016 18.4 10.2 320 1-344 1-346 (347)
117 protein:vir:4830 Length: 397 # 21.2 2.6 0.0016 18.3 11.5 269 1-344 109-386 (397)
118 protein:vir:80180 Length: 381 20.8 2.7 0.0017 18.2 9.8 301 1-344 1-344 (381)
119 protein:vir:485 Length: 407 # 20.5 2.8 0.0017 18.2 12.0 286 1-344 106-400 (407)

No 1
>protein:vir:99888 Length: 309 # NCBI annotation: capsid protein # Family: family:all:908 # MgeID: mge:1480 # MgeName: B3 # Cross-refs: genbank:acc:YP_164075;genbank:gi:56692607;genbank:GeneID:3192616
Probab=100.00 E-value=1.6e-95 Score=540.34 Aligned_cols=307 Identities=24% Similarity=0.311 Sum_probs=279.2

Q ss_pred CCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccccccc
Q lcl|NC_015466. 5 QPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGNDTYF 84 (344)
Q Consensus 5 ~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~~~~~ 84 (344)
|.+ .+|++||+||+||+||+|++ |||+.|||+|||++++|+|++|++++.|+..+++|++|+++++++++.++++++
T Consensus 1 ~~~-~~~~~dp~LT~~A~gy~n~~--~Ia~~l~P~vpV~~~~~~~~~f~~~e~F~~~~t~r~~~~~~~~v~~~~~~~~~~ 77 (309)
T protein:vir:99 1 MSN-APFPIDPELTAIAIAYRNGR--MISDEVLPRVPVGKQEFKFWKYDLAQGFTVPETLVGRKSKPNEVEFSATDETGS 77 (309)
T ss_pred CCC-CCcCcCHhHHHHHhhccChh--hhhhhcCCccccCccccceeeechhhcccccchhhccCCCcceEeecccCceee
Confidence 333 36899999999999999986 999999999999999999999999755555678999999999999999999999

Q ss_pred cccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccccceeec
Q lcl|NC_015466.
85 ARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASFDPTNA 164 (344) Q Consensus 85 ~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~~k~tl 164 (344) |++|+|+.++|.+++.+++.++||+++|++.++++|.|++|+++|.++++++ .|+++||++| T Consensus 78 ~~~~~L~~~i~~~~~~~a~~~~d~~~~Av~~l~~~i~l~rE~~~A~lv~~~a------------------~y~~~~k~~L 139 (309) T protein:vir:99 78 TEDHGLDAPVPQADIDNAPTNYNPLGHATEQTTNLILLDREARTSKLVFSPN------------------SYAAGNKTTL 139 (309) T ss_pred ecccceeecCCchhhhhccCCCCHHHHHHHHHHHHHHHHHHHHHHHHhcChh------------------hcCCCceEEe Confidence 9999999999999999999999999999999999999999999999988766 4577899999 Q ss_pred ccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHHHhCCC Q lcl|NC_015466. 165 SNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLADLFEVD 244 (344) Q Consensus 165 ~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~~~gl~ 244 (344) +|++ +|||++|||++||++|++++ |++||+|+||+++|++|++||+|++++++++. ..+++|+++|+++||++ T Consensus 140 sgt~--~wsd~~SDPi~~i~~~~~~~----g~~PN~~vlg~~~~~~l~~hp~i~~~ik~~~~-~~g~it~~~la~l~~ve 212 (309) T protein:vir:99 140 SGAD--QWSDPTSNPLPVITDALDSV----ILRPNIGVLGRRTATILRRHPKIVKAYNGSLG-DEGMVPMAFLQELLELD 212 (309) T ss_pred cCcc--ccCCCCCCcHHHHHHHHHhh----CCCcceEEechHHHHHHhhCHHHHHHhcCCCc-cccccCHHHHHHHhCcc Confidence 9997 79999999999999998764 99999999999999999999999999998764 56899999999999999 Q ss_pred eEEEEEEEEeccccCCCCccceeCCCceEEEEecCCC-cccccccccceeecccccCCcCCcccccccCCCCceEEEeec Q lcl|NC_015466. 
245 KVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATP-GIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIESDRIEID 323 (344) Q Consensus 245 ~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~-~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~~~vr~~ 323 (344) +|+||+++||++.++|++++++||+++++|+|+++.+ ++++|||||||+| +.+..|..+++ ..+..++++||++ T Consensus 213 ~V~vg~a~~n~a~~g~~~~~~~iwg~~~~L~y~~~~~~~~~~ps~G~t~~~----~~r~~g~~~d~-~~~~~g~~~vr~~ 287 (309) T protein:vir:99 213 AIYIGEARLNIARPGQNPNLIRAWGPHASFIYRDRLADTRNGTTFGLTAQW----GDRVSGSIADP-NIGLRGGQRVRVG 287 (309) T ss_pred eEEeecceeeccccccccccccccCCcEEEEEcCCCCCCcccccccceeec----ccccCCceeee-eeccCCceEEEEe Confidence 9999999999999999999999999999999998876 5889999999976 45666665433 3445556889999 Q ss_pred cccceeeeccccchhhhcccC Q lcl|NC_015466. 324 MSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 324 ~~~~~~v~~~~~g~l~~~~va 344 (344) +++||+|+|+|+||||+|||| T Consensus 288 ~~~k~~i~~~d~G~li~~~va 308 (309) T protein:vir:99 288 ESVKELVTAPDLGFFFENAVA 308 (309) T ss_pred ccccchhcchhcchhhhhccc Confidence 999999999999999999999 No 2 >protein:vir:107882 Length: 307 # NCBI annotation: gp34 # Family: family:all:908 # MgeID: mge:1565 # MgeName: BcepMu # Cross-refs: genbank:acc:YP_024707;genbank:gi:48696944;genbank:GeneID:2845970 Probab=100.00 E-value=4.9e-93 Score=526.66 Aligned_cols=301 Identities=22% Similarity=0.306 Sum_probs=272.7 Q ss_pred CCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecc-cccc Q lcl|NC_015466. 4 TQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEI-GNDT 82 (344) Q Consensus 4 ~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~-~~~~ 82 (344) ||....+|++||+||+||+||+|++ |||++|||+|||++++|+|++|++++|..+ +++|++++++++++++. 
+..+ T Consensus 1 m~~~~~~~~~dp~LT~~A~gy~n~~--~ia~~l~P~vpv~~~~~k~~~f~~eaF~~~-~t~r~~~~~~~~v~~~~~~~~~ 77 (307) T protein:vir:10 1 MGRLSKLRIVDPVLTNLAIGYTNAE--FIGQSLMPVVEVEKEGGKIPKFGKESFRLY-KTERALRARSNRMNPEDLGSID 77 (307) T ss_pred CCCCCCCcccChhHHHHHHhhcchh--hhhhhcCCcccccccccceeeECcccccch-hhhcccCCCcceeecccccccc Confidence 8888999999999999999999985 999999999999999999999999998666 57899999888888764 4567 Q ss_pred cccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccccee Q lcl|NC_015466. 83 YFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASFDPT 162 (344) Q Consensus 83 ~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~~k~ 162 (344) +.|++|+|+.++| +|+++++.++|+++|++.++++|.|++|+++|.++++.. .|+++||+ T Consensus 78 ~~~~~~~L~~~id--~r~~~~~~~~~~~~av~~l~d~I~l~~E~~~A~l~~~~~------------------~y~~~~k~ 137 (307) T protein:vir:10 78 IVLDEHDLEYPID--YREDQESAFPLEQAAVQTATEAIQLRREKMVADLAQNPN------------------SYAGGNKK 137 (307) T ss_pred cccccccccccCC--hhhcCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHhcCcc------------------ccCCCceE Confidence 8899998886665 688999999999999999999999999999999988755 45778999 Q ss_pred ecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHHHhC Q lcl|NC_015466. 
163 NASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLADLFE 242 (344) Q Consensus 163 tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~~~g 242 (344) +|+|++ +|||++|||++||++|+++|++.+|++||+|+||+++|++|++||+|++++++++ .+++|+++|+++|| T Consensus 138 tLsGt~--~Wsd~~sDPi~di~~~~~ai~~~~g~~Pn~~vlg~~a~~al~~hp~i~e~lk~~~---~g~it~~~la~ll~ 212 (307) T protein:vir:10 138 QLSATE--KFTAAGSDPVGVIEDGKEAIRTKIGRRPNTMVIGASAYKTLKAHPQLIEKIKYSM---KGIVTVDLLKEIFE 212 (307) T ss_pred Eecccc--ccCCCCCCcHHHHHHHHHHHHhhhCCccceEEeCHHHHHHHhcCHHHHHHhCCcc---ccccCHHHHHHHhC Confidence 999997 7999999999999999999999999999999999999999999999999999865 47999999999999 Q ss_pred CCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCC-----CcccccccccceeecccccCCcCCcccccccCCCCce Q lcl|NC_015466. 243 VDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPAT-----PGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIES 317 (344) Q Consensus 243 l~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~-----~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~ 317 (344) +++|.||+++||.+ ++.+++||+++++|+|+++. +++++|||||||++.|+ .+++.|. +++|+ T Consensus 213 v~~i~vg~a~~~~~----~~~~~~iw~~~~vl~yv~~~~~~~~~~~~epsfGyT~~~~g~-------~~~d~~~-~~~~~ 280 (307) T protein:vir:10 213 VENIAVGEAIYADD----KDRFTDIWGANIVLAYVPLQRGGQQRTPYEPSYGYTLRKKGN-------PVVDTRI-EDGKL 280 (307) T ss_pred ceeEEEeeeeeecc----CCccceeCCCceEEEecccccCCCCCcccccccceeEEEcCC-------eEeecee-cCCce Confidence 99999999999875 35789999999999998764 46778999999987653 5565544 58899 Q ss_pred EEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
318 DRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 318 ~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) |+|||+++++|+|+++|+||||+|||- T Consensus 281 ~~~r~~~~~~~~i~~~~~G~li~~~~~ 307 (307) T protein:vir:10 281 ELVRSTDIFRPYLLGADAGYLISGING 307 (307) T ss_pred eEEeccccccceeecccccceeccCCC Confidence 999999999999999999999999999 No 3 >protein:vir:79078 Length: 307 # NCBI annotation: gp8 # Family: family:all:908 # MgeID: mge:1862 # MgeName: phiE255 # Cross-refs: genbank:acc:YP_001111208;genbank:gi:134288798;genbank:GeneID:4960752 Probab=100.00 E-value=1.7e-92 Score=523.65 Aligned_cols=301 Identities=21% Similarity=0.303 Sum_probs=272.4 Q ss_pred CCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccccccee-cccccc Q lcl|NC_015466. 4 TQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTY-EIGNDT 82 (344) Q Consensus 4 ~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~-~~~~~~ 82 (344) ||....+|++||+||+||+||+|++ |||+.|||+|||++++|+|++|++++|..+ +++|++++.+++++. .++..+ T Consensus 1 m~~~~~~~~~dp~LT~~A~gy~n~~--~Iad~lfP~vpV~~~~~k~~~f~~e~f~~~-~t~ra~~~~~~~v~~~~~~~~~ 77 (307) T protein:vir:79 1 MGRLSKLRIVDPVLTNLAIGYTNAE--FIGQTLMPVVEVEKEGGKIPKFGKESFRLY-QTERALRAKSNRMNPEDIDSVD 77 (307) T ss_pred CCCCCCCcccCHHHHHHHhhccchh--hhhhhcCCcccccccccceeeecccccccc-ccccccCCCcceeeeecccccc Confidence 8888999999999999999999986 999999999999999999999999998765 578899998888886 456678 Q ss_pred cccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccccee Q lcl|NC_015466. 
83 YFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASFDPT 162 (344) Q Consensus 83 ~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~~k~ 162 (344) +.|.+|+|+.+ +++|+++++.++|+++|++.++++|.|++|+|+|++++++++ |+++||+ T Consensus 78 ~~~~~~~l~~~--id~r~~~~~~~~~~~~Av~~l~d~I~l~~E~~~A~l~~~~~~------------------y~~~~k~ 137 (307) T protein:vir:79 78 VNLDEHDLEYP--IDYREDQESAFPLEQAAVQTATDAIQLRREKMIADLSQNPSS------------------YAAGNKK 137 (307) T ss_pred ccccccchhhc--ccchhcCCCCCCHHHHHHHHHHHHHHhHHHHHHHHHhccccc------------------cCCCceE Confidence 88898887754 456889999999999999999999999999999999998653 5678999 Q ss_pred ecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHHHhC Q lcl|NC_015466. 163 NASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLADLFE 242 (344) Q Consensus 163 tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~~~g 242 (344) +|+|++ +|||++|||++||++|+++|++.+|++||+|+||.++|++|++||+|++++++++ .+++|+++|+++|| T Consensus 138 tLsgt~--~Wsd~~sDPi~di~~~~~ai~~~~g~~Pn~~vlg~~a~~~l~~h~~i~~~lk~~~---~g~it~~~la~l~~ 212 (307) T protein:vir:79 138 QLSATE--KFTAANSDPVGVIEDGKEAIRTKIGRRPNTMVIGASAYKTLKAHPQLIEKIKYSM---KGIVTVDLLKEIFE 212 (307) T ss_pred EEccCc--ccCCCCCCcHHHHHHHHHHHHHhhCCccceEEeCHHHHHHHhcCHHHHHHhcCcc---ccccCHHHHHHHhC Confidence 999997 7999999999999999999999999999999999999999999999999999865 37999999999999 Q ss_pred CCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCC-----CcccccccccceeecccccCCcCCcccccccCCCCce Q lcl|NC_015466. 243 VDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPAT-----PGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIES 317 (344) Q Consensus 243 l~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~-----~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~ 317 (344) +++|.||+++||.+ ++.++++|+++++|+|+++. +++++|||||||++.|. .+++.|. 
+++|+ T Consensus 213 v~~V~vg~a~y~~~----~~~~~~iw~~~~~l~y~~~~~~~~~~~~~~ps~Gyt~~~~g~-------~~~d~~~-~~~~~ 280 (307) T protein:vir:79 213 VENIAVGEAIYADD----KDRFTDIWGANIVLAYVPLQRGGQQRTPYEPSYGYTLRKKGN-------PVVDTRI-EDGKL 280 (307) T ss_pred ceeEEEeeeeeecc----cccchhcCCCceEEEecccccCCCCCcccccccceeEEecCc-------eEEeccc-CCCce Confidence 99999999999875 36789999999999999764 35789999999998864 3455544 58899 Q ss_pred EEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 318 DRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 318 ~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) |+|||+++++|+|+++|+||||+|||- T Consensus 281 ~~vrv~~~~~~~i~~~~~G~li~~~v~ 307 (307) T protein:vir:79 281 ELVRATDIFRPYLLGADAGYLISGING 307 (307) T ss_pred eEEeecccccceeeccccchhhccCCC Confidence 999999999999999999999999999 No 4 >protein:vir:106590 Length: 349 # NCBI annotation: putative major head protein # Family: family:all:1083 # MgeID: mge:1598 # MgeName: Lj965 # Cross-refs: genbank:acc:NP_958585;genbank:gi:41179245;genbank:GeneID:2717126 Probab=100.00 E-value=1.6e-36 Score=216.89 Aligned_cols=322 Identities=11% Similarity=0.035 Sum_probs=200.4 Q ss_pred CCCCC--CCCccc---eecccccceeeeeEc--CcchhhhhhhCcccccCCccceeeeechhh----cccccccccccCc Q lcl|NC_015466. 1 MPFTQ--PSRSDV---HVNRPLTNISIGYVQ--DASHFVAGQVFPQVSVGKQSDAYFTYERGD----FNRDEMQERTPGT 69 (344) Q Consensus 1 m~~~~--~~~~~~---~~dp~LT~iA~~Y~n--~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~----~~~~~~~~ra~g~ 69 (344) |||-| .+.+.| ..|...+....+|.| +...|+++.+||.+++....+++++..+.. .+...+....++. T Consensus 1 ~~~~~~~~~~~~~~~~~~d~~~~~~l~~~~~~~~~~~~l~~~~Fp~~~~~~~~~~~~~~~~~~~~~a~~v~~~~~~~~~~ 80 (349) T protein:vir:10 1 MKNQKLQLDLQRFATPILDMFSQNTVLDYTRNRQYPEMLGDTLFPAVKVPTLEVDILKAGSRVPTIASVSAFDAEAEIGT 80 (349) T ss_pred CCcchhhHHHHHHHHHhhcccCHHHHHHHHHhcCcchhhHhhcCCccccccceeEEEeeccCcceeeeeecCCCCcceec Confidence 88755 333333 234333444444543 223599999999988776666666544322 1111112222222 Q ss_pred ccccceecccccccccccccccccccHHHHHhccCCCCHHH-------HHHHHHHHHHhhhHHHHHHHHHhhhhhhcccc Q lcl|NC_015466. 
70 ESAGGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDR-------EATIFVTQKGLINREVNWAAAYFTAGAPGDTW 142 (344) Q Consensus 70 ~~~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~-------~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~ 142 (344) +... ...+ ...+....+.+... .......+ ....+.+ +.+..+.+.|.+|.|++++++++++++...+ T Consensus 81 r~~~-~~~~-~~p~ik~~~~i~e~-dl~~~~~~-~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~q~l~~Gki~~~~- 155 (349) T protein:vir:10 81 REAS-KMTA-ELAYVKRKMQITEE-MLIKLQSP-RNTAEENYLKQYVFDDIDAMVQAVKARGEKMTMEMFATGKITDKK- 155 (349) T ss_pred ccce-eEEe-eccccccccccCHH-HHHHHhhc-cCcchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeeEEcC- Confidence 2110 0111 11121111222111 01111111 1122222 2345566789999999999999998764332 Q ss_pred cccccccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhc Q lcl|NC_015466. 143 TFDVDGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRID 222 (344) Q Consensus 143 ~~~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~ 222 (344) .|+ ..++.++.+|+++|+|++ +||+++|||++||++|+++ .|.+|++++||+++|++|++|++|+++++ T Consensus 156 ----~g~-~vD~g~~~~~~~~lt~~~--~Ws~~~adpi~Di~~~~~~----~g~~p~~~vm~~~~~~~l~~~~~i~~~~~ 224 (349) T protein:vir:10 156 ----NGI-AIDYGVPKKHQETLSGTK--TWDKSDASIIDNLQDWSDS----LDVTPTRALTSKKVLRILMRSTEIKEAIF 224 (349) T ss_pred ----CcE-EEecccCccceeEecCcc--cCCCCCCCHHHHHHHHHHH----hCCCccEEEeCHHHHHHHhcCHHHHHHhc Confidence 232 245667899999999986 6999999999999999865 49999999999999999999999999998 Q ss_pred cCCCccccccCHHHHHHHh---CCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeeccccc Q lcl|NC_015466. 223 RGQTSGAAKANLVTLADLF---EVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVG 299 (344) Q Consensus 223 ~~~~~~~~~vt~~~la~~~---gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g 299 (344) +.+. ...++..++..+| |.++|.+++.+|...........+++|+++.+++. +.+.+|.+.||.|.+..++.. 
T Consensus 225 ~~~~--~~~~~~~~~~~~l~~~~~~~i~~yd~~y~d~~~~~~~t~~~~~p~~~v~l~--~~~~~G~~~yG~~~e~~~~~~ 300 (349) T protein:vir:10 225 GKDT--GRVVGQADLDQWMTAQGLPIIRAYDGKYRDEDSRGNLTTNSYFPEDRIVLF--NDEVPGQKIYGPTPEENRLIS 300 (349) T ss_pred cccc--ccccCHHHHHHHHHhcCCceEEEEeeEEEeecCCCceeecccccCCeEEEe--cCCCceeEEeeccchhhhhcc Confidence 7543 3477888888877 67789999999865322222345678887766654 456689999999987765432 Q ss_pred CCc-----C-CcccccccCCCCceEEEeeccccceeeeccccchhhhccc Q lcl|NC_015466. 300 SGN-----E-GMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIV 343 (344) Q Consensus 300 ~~~-----~-~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~v 343 (344) +.. . +..+..+.+.....+++.+....-|++.-+++=|.. .|| T Consensus 301 g~~~~~~~~~~~~~~~~~~~dP~~~~~~~~s~~lPv~~~~~~~~~a-~Vl 349 (349) T protein:vir:10 301 SNAQVSNVGNIMAKIYETSEDPIGTWILASATMLPSFASADDVFQA-KVL 349 (349) T ss_pred cccceeeccceEEEeeeecCCCceEEEEEeeeeeeeecCCCcEEEE-EeC Confidence 221 1 222333455667788888888888888777665544 566 No 5 >protein:vir:96490 Length: 348 # NCBI annotation: head protein # Family: family:all:1083 # MgeID: mge:1620 # MgeName: 2972 # Cross-refs: genbank:acc:YP_238492;genbank:gi:66391768;genbank:GeneID:5176912 Probab=100.00 E-value=3.6e-35 Score=209.39 Aligned_cols=321 Identities=10% Similarity=0.038 Sum_probs=207.5 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceec-cc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYE-IG 79 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~-~~ 79 (344) ||+ -..+-....|+.+.....++...|+.+.+||.+++....+.+++..+..... .+-++.+.......++ +. 
T Consensus 1 M~~----i~d~f~~~~l~~~i~~~~~~~~~~l~~~~Fp~~~~~~~~~~~~~~~~~~~~~--a~~v~~~~~~~~~~r~~~~ 74 (348) T protein:vir:96 1 MGL----IYDKVTASNIAGYFNTLQENVDSTLGESIFPARKQLGTKLSYIKGASGQSVA--LKAAAFDTNVTIRDRVSAE 74 (348) T ss_pred Ccc----hhhccCHHHHHHHHHhcccchhhhhhhhcCCCccccceeEEEEeecCCceeE--eeeecCCCCcceeccccee Confidence 553 3334344667776655655544689999999988776666666644332111 0111122211111111 11 Q ss_pred ccccccccccccccccHHHHHh----cc-CCCCHHHHH-------HHHHHHHHhhhHHHHHHHHHhhhhhhccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRAN----AD-NPISLDREA-------TIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVD 147 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~----a~-~~~~~~~~a-------~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~ 147 (344) ...+....-.....+...++.. .. +.....+.+ +..+.+.|.++.|++++++++++++...+.... T Consensus 75 ~~~~~~p~i~~~~~i~~~d~~~l~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~~~~~~~~-- 152 (348) T protein:vir:96 75 IHDEQMPFFKEALLVKENDRQQLNLVKDTGNEALINTIVAGIFNDDVTLINGARARLEAMRMQVLATGKIAFTSDGVN-- 152 (348) T ss_pred eeeeecCccccccccCHHHHHHHHhhhccCCchHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCeeEeecCCee-- Confidence 1111111111112222222211 11 111222222 344667889999999999999988754443221 Q ss_pred ccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCc Q lcl|NC_015466. 148 GVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTS 227 (344) Q Consensus 148 gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~ 227 (344) ....+..+..|++++++ +||+++|||++||++|++++++ .|.+|++++||+++|++|++|++|++++++.+.. 
T Consensus 153 --~~vdfg~~~~~~~t~~~----~W~~~~adp~~di~~~~~~~~~-~G~~~~~~i~~~~~~~~l~~~~~v~~~~~~~~~~ 225 (348) T protein:vir:96 153 --KDIDYGVKADHKKQVSK----SWAEPGATPLADLEDAIETARE-LGLNPERAIMNAKTFGLIRKAASTVKAIKPLAGD 225 (348) T ss_pred --EEEeccCCcccceeecc----ccCCCCCCHHHHHHHHHHHHHh-cCCcccEEEeCHHHHHHHhcCHHHHHHHhccCCc Confidence 22344567899999976 4999999999999999999865 6999999999999999999999999999876543 Q ss_pred cccccCHHHHHHHhC---CCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccC---- Q lcl|NC_015466. 228 GAAKANLVTLADLFE---VDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGS---- 300 (344) Q Consensus 228 ~~~~vt~~~la~~~g---l~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~---- 300 (344) .+.++++++.++++ ..+|.+++.+|. ++++..+++|+++.+++. +.+.+|...||.|.+...+... T Consensus 226 -~~~~~~~~~~~~~~~~~g~~i~~y~~~y~----d~~G~~~~~~p~~~v~l~--~~~~~G~~~yg~~~e~~~~~~~~~~~ 298 (348) T protein:vir:96 226 -GSSVTKAELQNYVADNYGVEIVLENGTYR----NEKGEVSKFFPDGHLTLI--PNGPLGNTVFGTTPEESDLFADNTVN 298 (348) T ss_pred -cccccHHHHHHHHhhhcCceEEEEccEEE----ecCCcEeccccCCeEEEE--cCCCceeEEeccChhhhhhhhccccc Confidence 35788999988873 346877777764 356677889998877776 3456899999998765543221 Q ss_pred -----CcCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
301 -----GNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 301 -----~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) -..+..+..|.+....++++.+....-|++.-+++-|.+ .++| T Consensus 299 ~~~~~~~~~~~~~~~~~~dP~~~~~~~~s~plPv~~~~~~~~~a-~Vl~ 346 (348) T protein:vir:96 299 ADVEIVDSGIAVTTTKTTDPVNVQTKVSMVALPSFERLGDVYML-TVIP 346 (348) T ss_pred ccceecCCeeEEEeeecCCCceEEEEEeeeeeccccCCCcEEEE-EEec Confidence 123456788889999999999888888888888776655 6666 No 6 >protein:vir:98480 Length: 348 # NCBI annotation: ORFp38 # Family: family:all:1083 # MgeID: mge:1589 # MgeName: VWB # Cross-refs: genbank:acc:NP_958280;genbank:gi:41057254;uniprot:Q38595;genbank:GeneID:2732864 Probab=100.00 E-value=4.9e-35 Score=208.69 Aligned_cols=323 Identities=15% Similarity=0.081 Sum_probs=209.3 Q ss_pred CCCCCCCCccceec-ccccceeeeeEc--CcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceec Q lcl|NC_015466. 1 MPFTQPSRSDVHVN-RPLTNISIGYVQ--DASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYE 77 (344) Q Consensus 1 m~~~~~~~~~~~~d-p~LT~iA~~Y~n--~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~ 77 (344) |+.+.. ..| ++ +.|+.+.+.|.+ +...|+.+.+||.+++....+++++-.+..... ..-++.+.......++ T Consensus 1 M~~~~~--~d~-~~~~~l~~~i~~~~~~~~~~~~l~~~~fp~~~~~~~~~~~~~~~~~~~~~--a~~~~~~~~~~~~~r~ 75 (348) T protein:vir:98 1 MSWTLD--TEF-IEPTQLTGLIREALRDLQVNRFRLARWLPNVDVDDITFEFLRGGGGLAET--ASYRSWDTESKIGRRE 75 (348) T ss_pred Ccchhh--hhc-cCHHHHHHHHHHHhhccCcchhhHHhcCCCccccceEEEEEeccCCceee--eeeecCCCccceeecc Confidence 776664 445 45 569999888753 234589999999988776666666543321110 0111222221111111 Q ss_pred -ccccccccccccccccccHHHHHhccCCC-----CHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccc Q lcl|NC_015466. 78 -IGNDTYFARTRAYHRDVPEQVRANADNPI-----SLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVAS 151 (344) Q Consensus 78 -~~~~~~~~~~~~l~~~v~~~~~~~a~~~~-----~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~ 151 (344) ++...+....-+....+...++....... +...+.+..+.+.|..+.|++++++++++++...+..+ . 
T Consensus 76 g~~~~~~~~~~i~~~~~i~~~d~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~~~g~~~------~ 149 (348) T protein:vir:98 76 GLAKVMGELPPISEKIPLNEYDRLRLRKLSRDEALPFIARDAQRLARNIGARFEVARGSALVNATVPVTELQQ------T 149 (348) T ss_pred cceeeeeeccccccccccCHHHHHHhcCChHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeEEEecCce------E Confidence 11111111111111122222222221111 11122345567888999999999999998764433221 1 Q ss_pred ccccccccceeecccccccccCC-CCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCc-cc Q lcl|NC_015466. 152 SPTAPASFDPTNASNNDKLHWSD-ASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTS-GA 229 (344) Q Consensus 152 ~~~~~~~~~k~tl~~t~~~~Wsd-~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~-~~ 229 (344) ..+..+..|+++. +. +||+ ++|||++||++|++++++.+|.+|++++||+++|++|++|++|++++++.+.. .. T Consensus 150 vDyg~~~~~~~t~--~~--~Ws~~~~adp~~di~~~~~~~~~~~G~~p~~~vm~~~~~~~l~~~~~i~~~~~~~~~~~~~ 225 (348) T protein:vir:98 150 VDFGRIGSHSVVA--AV--LWSVHATATPISDLESWVATYEDTNGQSPGVILMPKAAVSHMRQCEEVIRQVFPLAPSGTA 225 (348) T ss_pred EccccCccccccc--cc--ccCCCCCCCHHHHHHHHHHHHHHccCCcceEEEeCHHHHHHHhcCHHHHHHHhccCccccc Confidence 2345567777654 32 6975 78999999999999999999999999999999999999999999999876543 23 Q ss_pred cccCHHHHHHHh---CCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecC-------CCcccccccccceeecccc- Q lcl|NC_015466. 230 AKANLVTLADLF---EVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPA-------TPGIMTPSAGYTFNWTGLV- 298 (344) Q Consensus 230 ~~vt~~~la~~~---gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~-------~~~~~~~s~G~T~~~~~~~- 298 (344) .+++.+++.+++ |++.|.+++++|... +..+++|+++.++++... .+.+|.+.||.|.+..... 
T Consensus 226 ~~~~~~~~~~~~~~~g~~~i~~~d~~~~~~-----g~~~~~~p~~~i~l~p~~~~~~~~~~~~~G~t~~G~~~e~~~~~~ 300 (348) T protein:vir:98 226 PMVSVEQLNTVLSSMGLPPIEVYDAKVAVD-----GVSTRITPANAIALLPEPGATDAAQPTELGATLLGTTAESLEDDY 300 (348) T ss_pred cccCHHHHHHHHHhhCCeEEEEeeeEEEcC-----CceeceecCCeEEEEecCCcccccccccccceecccchhhhcccc Confidence 578888877665 799999999988642 344678888877776542 2347888889887655431 Q ss_pred ---cCCcCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 299 ---GSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 299 ---g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +....+.++..|.+....++++.+....-|++.-+++-| +-.||| T Consensus 301 ~~~~~~~~~i~~~~~~~~dP~~~~~~~~s~~lPv~~~~~~~~-~a~Vl~ 348 (348) T protein:vir:98 301 ALAPGEQPGIVAATWKTKDPVRLWTHAAAVGIPVLREPNLTF-KAQVLA 348 (348) T ss_pred ccceeccCceeeeeeeecCCcEEEEEEeeeeeccccCCCcEE-EEEEeC Confidence 122335677889999999999999888888887776544 457888 No 7 >protein:vir:2736 Length: 348 # NCBI annotation: putative structural protein # Family: family:all:1083 # MgeID: mge:58 # MgeName: O1205 # Cross-refs: genbank:acc:NP_695109;genbank:gi:23455878;genbank:GeneID:955608 Probab=100.00 E-value=6.1e-35 Score=208.18 Aligned_cols=318 Identities=11% Similarity=0.074 Sum_probs=205.9 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceec-c- Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYE-I- 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~-~- 78 (344) ||+ -..+.....|+.+...-.++...|+.+.+||..++....+.+++..+..... 
.+-++.+.......++ + T Consensus 1 M~~----i~d~f~~~~l~~~v~~~~~~~~~~l~~~~Fp~~~~~~~~~~~~~~~~~~~~~--a~~v~~~~~~~~~~r~~~~ 74 (348) T protein:vir:27 1 MGL----IYDKVTASNIAGYFNALQENVSSTLGESIFPARKQLGTKLSYIKGASGQSVA--LKAAAFDTNVTIRDRVSAE 74 (348) T ss_pred Ccc----hhhhcCHHHHHHHHHhccchhhhhhHhhcCCCccccceeEEEEeeccCceeE--eeeecCCCCcceeccccee Confidence 554 4445334567775443333333589999999887766666666544332110 0111122111111111 1 Q ss_pred --cccccccccccccccccHHHHHhcc---C--CCCHHH-------HHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccc Q lcl|NC_015466. 79 --GNDTYFARTRAYHRDVPEQVRANAD---N--PISLDR-------EATIFVTQKGLINREVNWAAAYFTAGAPGDTWTF 144 (344) Q Consensus 79 --~~~~~~~~~~~l~~~v~~~~~~~a~---~--~~~~~~-------~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~ 144 (344) +.+...++. ...+...++.+.+ . .....+ +.+..+.+.|..+.|++++++++++++...+... T Consensus 75 ~~~~~~p~i~~---~~~i~~~d~~~~~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~~al~~Gki~i~~~~~ 151 (348) T protein:vir:27 75 MHDEQMPFFKE---AMLVKENDRQQLNLVKDSGNAVLVNTIVAGIFNDNLTLVNGARARLEAMRMQVLATGKIAFTSDGV 151 (348) T ss_pred eeeeecCcccc---ccccCHHHHHHHHHhhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCeeEEecCCe Confidence 111111111 1122223332221 1 111112 2345566788999999999999988765433222 Q ss_pred cccccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccC Q lcl|NC_015466. 145 DVDGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRG 224 (344) Q Consensus 145 ~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~ 224 (344) . ....+..+..|++++++. ||+++|||++||++|++.++ ..|.+|++++||+++|++|++|++|++++++. 
T Consensus 152 ~----~~vdfg~~~~~~~t~~~~----W~~~~adp~~di~~~~~~~~-~~G~~~~~ii~~~~~~~~l~~~~~v~~~~~~~ 222 (348) T protein:vir:27 152 N----KDIDYGVKPDHKKQVSKS----WAEPGATPLADLEDAIETAR-ELGLNPERAVMNAKTFGLIRKAASTVKVIKPL 222 (348) T ss_pred e----EEEeecCCcccceeeeec----cCCCCCCHHHHHHHHHHHHH-hcCCcccEEEECHHHHHHHhcCHHHHHHhccc Confidence 1 123345678999999864 99999999999999999986 47999999999999999999999999999765 Q ss_pred CCccccccCHHHHHHHh---CCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCC Q lcl|NC_015466. 225 QTSGAAKANLVTLADLF---EVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSG 301 (344) Q Consensus 225 ~~~~~~~vt~~~la~~~---gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~ 301 (344) +.. .+.++++++.++| +..+|.+++.+|. ++++..+++|+++.+++. +.+.+|...||.|.+..++..+. T Consensus 223 ~~~-~~~i~~~~~~~~~~~~~g~~i~~yd~~y~----d~~G~~~~~~p~~~vvl~--~~~~~G~~~yG~~~e~~~~~~~~ 295 (348) T protein:vir:27 223 AGD-GSAVTKAELENYIADNFGVSIVLENGTYR----NDKGEVSKFYPDGHLTLI--PNGPLGNTVFGTTPEESDLFADN 295 (348) T ss_pred Ccc-ccccCHHHHHHHHHhhcCceEEEEeeEEE----cCCCcCcccccCCeEEEE--cCCcceeEEeccCcchhhhhhcc Confidence 433 3678999998887 4557988888884 356677888998777666 34568999999998765543221 Q ss_pred ---------cCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
302 ---------NEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 302 ---------~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ..+..+..|.+....++++.+....-|++.-+++=|.+ .|++ T Consensus 296 ~~~~~~~~~~~~~~~~~~~~~dP~~~~~~~~s~~lPv~~~~~~~~~a-~Vl~ 346 (348) T protein:vir:27 296 TVNAEVEIVDNGIAVTTTKTTDPVNVQTKVSMVALPSFERLDDVYML-TVIP 346 (348) T ss_pred ccccceeeeCCeeEEEeeecCCCceEEEEEeeeeeccccCCCcEEEE-EEec Confidence 12355778889999999999888888888888765554 5665 No 8 >protein:vir:4902 Length: 348 # NCBI annotation: gp348 # Family: family:all:1083 # MgeID: mge:107 # MgeName: Sfi11 # Cross-refs: genbank:acc:NP_056680;genbank:gi:9635015;genbank:GeneID:1262657 Probab=100.00 E-value=2.1e-34 Score=205.22 Aligned_cols=320 Identities=11% Similarity=0.058 Sum_probs=203.5 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceec-c- Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYE-I- 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~-~- 78 (344) ||+ -..+-.-..|+.+......+...|+.+.+||..++..-.+.+++..++..... +-++.+.......++ + T Consensus 1 M~~----l~d~f~~~~l~~~v~~~~~~~~~~l~~~~Fp~~~~~~~~~~~~~~~~~~~~~a--~~v~~~~~~~~~~r~~~~ 74 (348) T protein:vir:49 1 MGL----IYDKVTASNIAGYFNALQENVDSTLGESIFPARKQLGTKLSYITGASGQSVAL--KAAAFDTNVTVRDRVSAE 74 (348) T ss_pred Ccc----hhhhcCHHHHHHHHHhccccchhhhHhhcCCCccccCceeEEEEeecCceeee--eeecCCCCcceeccccee Confidence 654 33342334566644333333335899999998877666666666554322110 111111111111110 0 Q ss_pred ---cccccccccccccccccHHHHHhccCCC--CHHHHH-------HHHHHHHHhhhHHHHHHHHHhhhhhhcccccccc Q lcl|NC_015466. 79 ---GNDTYFARTRAYHRDVPEQVRANADNPI--SLDREA-------TIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDV 146 (344) Q Consensus 79 ---~~~~~~~~~~~l~~~v~~~~~~~a~~~~--~~~~~a-------~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~ 146 (344) ..-.+....+.+.. .++...+...+.. ...+.+ +..+.+.|..+.|++++++++++++...+.... 
T Consensus 75 ~~~~~~p~i~~~~~i~~-~d~~~l~~~~~~~~~~~~~~~~~~i~~d~~~l~~~i~~r~E~m~~qal~~Gki~i~~~g~~- 152 (348) T protein:vir:49 75 MHDEQMPFFKEAMLVKE-NDRQQLNLVKDSGNAALVNTIVAGIFNDNLTLVNGARARLEAMRMQVLATGKIAFTSDGVN- 152 (348) T ss_pred eeeeecCccccccccCH-HHHHHHHHHhccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhCCeEEEecCCce- Confidence 01111111121211 1111122222111 112222 344557789999999999999888754443221 Q ss_pred cccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCC Q lcl|NC_015466. 147 DGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQT 226 (344) Q Consensus 147 ~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~ 226 (344) ....+..+.+|+++++++ ||+++|||++||++|++.+++ +|..|++++||+++|++|++|++|++++++.+. T Consensus 153 ---~~vdyg~~~~~~~t~~~~----W~~~~adp~~di~~~~~~~~~-~G~~~~~ii~~~~~~~~l~~~~~v~~~~~~~~~ 224 (348) T protein:vir:49 153 ---KDIDYGVKPDHKKQVSKS----WAEPGATPLADLEDAIETARE-LGLNPERAVMNAKTFGLIRKAASTVKVIKPLAG 224 (348) T ss_pred ---EEEeecCCcccceeeeec----cCCCCCCHHHHHHHHHHHHHh-cCCcccEEEeCHHHHHHHhcCHHHHHHhhccCc Confidence 123345578999999864 999999999999999999875 699999999999999999999999999977554 Q ss_pred ccccccCHHHHHHHh---CCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCC-- Q lcl|NC_015466. 227 SGAAKANLVTLADLF---EVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSG-- 301 (344) Q Consensus 227 ~~~~~vt~~~la~~~---gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~-- 301 (344) . .+.++++++.+++ +..+|.+++.+|. +++++.+++|+++.+++. +.+.+|.+.||.|.+......+. 
T Consensus 225 ~-~~~i~~~~~~~~~~~~~g~~i~~y~~~y~----d~dG~~~~~~p~~~v~l~--~~~~~G~~~yg~~~e~~~~~~~~~~ 297 (348) T protein:vir:49 225 D-GSSVTKAELDNYIADNFGVTVVLENGTYR----NEKGEVSKFFPDGHLTLI--PNGPLGNTVFGTTPEESDLFADNTV 297 (348) T ss_pred c-cccccHHHHHHHHHhhcCceEEEEeeEEE----ecCCcEeeeecCCeEEEe--cCCCcceeEEecChhhhhhcccccc Confidence 3 3578888888776 4567888888774 346677889998877766 34568999999998754432211 Q ss_pred -------cCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 302 -------NEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 302 -------~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ..+.++..|.+....++++.+....-|++.-+++-|.. .+++ T Consensus 298 ~~~~~~~~~~~~~~~~~~~dP~~~~~~~~s~~lPv~~~~~~~~~a-~Vl~ 346 (348) T protein:vir:49 298 NADVEIVDNGIAVTTTKTTDPVNVQTKVSMVALPSFERLDDVYML-TVIP 346 (348) T ss_pred ccceeecCCeEEEeeeecCCCceEEEEEeeeccccccCCCcEEEE-EEec Confidence 23456778888888899998888888888887766554 5666 No 9 >protein:vir:78006 Length: 409 # NCBI annotation: major head protein # Family: family:all:11999 # MgeID: mge:1843 # MgeName: P23-45 # Cross-refs: genbank:acc:YP_001467942;genbank:gi:157265383;genbank:GeneID:5600496 Probab=99.94 E-value=2.5e-29 Score=177.43 Aligned_cols=331 Identities=10% Similarity=0.023 Sum_probs=189.9 Q ss_pred CCCCCCCCccceecc----------------cccceeeeeEcCcchhhhhhhCcccccCCccceeeee----chhhcccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNR----------------PLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTY----ERGDFNRD 60 (344) Q Consensus 1 m~~~~~~~~~~~~dp----------------~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~----~k~~~~~~ 60 (344) .|-+.++.-+.+-|| -|+.++..+.. ..++.+.+||... 
+....+++ .++...++ T Consensus 3 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ia~~~~~~p~--~~~L~d~~FP~~~---~f~t~l~~~~~~~kg~kk~~ 77 (409) T protein:vir:78 3 VPININNALARVRDPLSIGGLKFPTTKEIQEAVAAIADKFNQ--ENDLVDRFFPEDS---TFASELELYLLRTQDAEQTG 77 (409) T ss_pred eccccchhhhhhcCcchhcceecCchHHHHHHHHHHHHhcCC--ccchhhccCCCCc---cccceEEEEeeeccCccccc Confidence 344444444333344 12333333322 3478999999732 22222333 23332222 Q ss_pred cccccccCccccc-----ceeccc---cccccccc-ccccccccHHHHHhccCCCCHH------HHHHHHHHHHHhhhHH Q lcl|NC_015466. 61 EMQERTPGTESAG-----GTYEIG---NDTYFART-RAYHRDVPEQVRANADNPISLD------REATIFVTQKGLINRE 125 (344) Q Consensus 61 ~~~~ra~g~~~~~-----~~~~~~---~~~~~~~~-~~l~~~v~~~~~~~a~~~~~~~------~~a~~~~~~~i~l~~E 125 (344) ....-..+....+ ..++.. .++...++ +.++. .....+...++..... .+....+.+.|..|.| T Consensus 78 ~~~~~~~~d~~~pv~~r~~~~~~~~~t~epp~iK~k~~i~e-~dl~~~~~~~n~~~~~~i~~~i~~D~~~L~~~I~~R~E 156 (409) T protein:vir:78 78 MTFVHQVGSTSLPVEARVAKVDLAKATWSPLAFKESRVWDE-KEILYLGRLADEVQAGVINEQIAESLTWLMARMRNRRR 156 (409) T ss_pred ceEeeecCCccccccccceeeeeeeecccccccccccccCH-HHHHHHhCCCChhHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 2111111111111 111111 11111222 22221 0111122222222111 1224456678889999 Q ss_pred HHHHHHHhhhhhhccccc-ccccccccc-cccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCC--CcceE Q lcl|NC_015466. 126 VNWAAAYFTAGAPGDTWT-FDVDGVASS-PTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGF--EPNVL 201 (344) Q Consensus 126 ~~~a~~~~~~~~~~~~~~-~~~~gv~~~-~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~--~Pn~~ 201 (344) +|+++++.++++...+.. ....|++.. ++..+++|++++++++ +|++++|||++||++|++.+.+.+|. 
+|+.+ T Consensus 157 ~Ma~q~L~tGki~i~g~~~~~~~g~~~~vDyg~pa~hkvtlTgt~--~W~~~~AdPi~DIe~w~~~i~~~~g~~~t~~~~ 234 (409) T protein:vir:78 157 WLTWQVMRTGRITIQPNDPYNPNGLKYVIDYGVTDIELPLPQKFD--AKDGNGNSAVDPIQYFRDLIKAATYFPDRRPVA 234 (409) T ss_pred HHHHHHHhCCeEEEEecCCCccccceEEEecCCCcccceeecccc--cCCCCCCChHHHHHHHHHHHHHhcCCCCCccEE Confidence 999999998887543322 222344333 3456889999999886 69999999999999999999999987 56789 Q ss_pred EeCHHHHHHHh-cCHHHHHHhccCCCccc---cccC---------HHHHHHHhCCCeEEEEEEEEeccccCCCCccceeC Q lcl|NC_015466. 202 TLGKAVYDALV-DHPDIVGRIDRGQTSGA---AKAN---------LVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIG 268 (344) Q Consensus 202 v~~~~v~~~L~-~h~~i~~~i~~~~~~~~---~~vt---------~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw 268 (344) +|+.++|++|+ +|+.|+++++..+.... ..++ .+.+...+|| +|.+++.+|.. +++..++++ T Consensus 235 imt~~~~~~l~~~n~~ik~~l~~~~~~~~~~~~~~~~~~l~~~~~ln~~~~~~GL-~I~vYd~~Y~d----edGt~k~~~ 309 (409) T protein:vir:78 235 IIVGPGFDEVLADNTFVQKYVEYEKGWVVGQNTVQPPREVYRQAALDIFKRYTGL-EVMVYDKTYRD----QDGSVKYWI 309 (409) T ss_pred EEcHHHHHHHHhCcHHHHHhhhcccccccccccccchhhhcchhHhHhhhhhcCc-eEEEEeeEEEe----cCCccccee Confidence 99999997766 55667777765332111 1122 2345566688 58888887743 466777777 Q ss_pred CCceEEEEecCCCcccccccccceeecc--c-ccCCcCCcccccccCCCCceEEEeeccccceeeeccccc-hhhhcccC Q lcl|NC_015466. 269 GKHALLSYAPATPGIMTPSAGYTFNWTG--L-VGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLG-YFFGGIVA 344 (344) Q Consensus 269 ~~~~~l~~~~~~~~~~~~s~G~T~~~~~--~-~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g-~l~~~~va 344 (344) +++.+++...+.+.+|.+.||.|-+... 
- ......+..+..|.++....++++...+.-|+++++..- |++-++== T Consensus 310 Pd~~vvLl~ap~g~LG~T~yGa~~~~~~~~~~v~~~g~~i~~~~~~~~dP~~~~~~~~~~~~p~l~~~~~~~~~~~~~~~ 389 (409) T protein:vir:78 310 PVGELIVLNQSTGPVGRFVYTAHVAGQRNGKVVYATGPYLTVKDHLQDDPPYYAIIAGFHGLPQLSGYNTEDFSFHRFKW 389 (409) T ss_pred cCCeEEEEcCCcccccceecccccccccchhhhccccceeEecccccCCcceeeeecceEEeeeeecCCccceeehhhhh Confidence 7765544445667799999998643211 0 111122345678889999999999999999999987543 44444321 No 10 >protein:vir:79503 Length: 409 # NCBI annotation: major head protein # Family: family:all:11999 # MgeID: mge:1870 # MgeName: P74-26 # Cross-refs: genbank:acc:YP_001468058;genbank:gi:157265500;genbank:GeneID:5600620 Probab=99.94 E-value=2.5e-29 Score=177.43 Aligned_cols=331 Identities=10% Similarity=0.023 Sum_probs=189.9 Q ss_pred CCCCCCCCccceecc----------------cccceeeeeEcCcchhhhhhhCcccccCCccceeeee----chhhcccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNR----------------PLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTY----ERGDFNRD 60 (344) Q Consensus 1 m~~~~~~~~~~~~dp----------------~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~----~k~~~~~~ 60 (344) .|-+.++.-+.+-|| -|+.++..+.. ..++.+.+||... +....+++ .++...++ T Consensus 3 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ia~~~~~~p~--~~~L~d~~FP~~~---~f~t~l~~~~~~~kg~kk~~ 77 (409) T protein:vir:79 3 VPININNALARVRDPLSIGGLKFPTTKEIQEAVAAIADKFNQ--ENDLVDRFFPEDS---TFASELELYLLRTQDAEQTG 77 (409) T ss_pred eccccchhhhhhcCcchhcceecCchHHHHHHHHHHHHhcCC--ccchhhccCCCCc---cccceEEEEeeeccCccccc Confidence 344444444333344 12333333322 3478999999732 22222333 23332222 Q ss_pred cccccccCccccc-----ceeccc---cccccccc-ccccccccHHHHHhccCCCCHH------HHHHHHHHHHHhhhHH Q lcl|NC_015466. 61 EMQERTPGTESAG-----GTYEIG---NDTYFART-RAYHRDVPEQVRANADNPISLD------REATIFVTQKGLINRE 125 (344) Q Consensus 61 ~~~~ra~g~~~~~-----~~~~~~---~~~~~~~~-~~l~~~v~~~~~~~a~~~~~~~------~~a~~~~~~~i~l~~E 125 (344) ....-..+....+ ..++.. .++...++ +.++. .....+...++..... 
.+....+.+.|..|.| T Consensus 78 ~~~~~~~~d~~~pv~~r~~~~~~~~~t~epp~iK~k~~i~e-~dl~~~~~~~n~~~~~~i~~~i~~D~~~L~~~I~~R~E 156 (409) T protein:vir:79 78 MTFVHQVGSTSLPVEARVAKVDLAKATWSPLAFKESRVWDE-KEILYLGRLADEVQAGVINEQIAESLTWLMARMRNRRR 156 (409) T ss_pred ceEeeecCCccccccccceeeeeeeecccccccccccccCH-HHHHHHhCCCChhHHHHHHHHHHHHHHHHHHHHHHHHH Confidence 2111111111111 111111 11111222 22221 0111122222222111 1224456678889999 Q ss_pred HHHHHHHhhhhhhccccc-ccccccccc-cccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCC--CcceE Q lcl|NC_015466. 126 VNWAAAYFTAGAPGDTWT-FDVDGVASS-PTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGF--EPNVL 201 (344) Q Consensus 126 ~~~a~~~~~~~~~~~~~~-~~~~gv~~~-~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~--~Pn~~ 201 (344) +|+++++.++++...+.. ....|++.. ++..+++|++++++++ +|++++|||++||++|++.+.+.+|. +|+.+ T Consensus 157 ~Ma~q~L~tGki~i~g~~~~~~~g~~~~vDyg~pa~hkvtlTgt~--~W~~~~AdPi~DIe~w~~~i~~~~g~~~t~~~~ 234 (409) T protein:vir:79 157 WLTWQVMRTGRITIQPNDPYNPNGLKYVIDYGVTDIELPLPQKFD--AKDGNGNSAVDPIQYFRDLIKAATYFPDRRPVA 234 (409) T ss_pred HHHHHHHhCCeEEEEecCCCccccceEEEecCCCcccceeecccc--cCCCCCCChHHHHHHHHHHHHHhcCCCCCccEE Confidence 999999998887543322 222344333 3456889999999886 69999999999999999999999987 56789 Q ss_pred EeCHHHHHHHh-cCHHHHHHhccCCCccc---cccC---------HHHHHHHhCCCeEEEEEEEEeccccCCCCccceeC Q lcl|NC_015466. 202 TLGKAVYDALV-DHPDIVGRIDRGQTSGA---AKAN---------LVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIG 268 (344) Q Consensus 202 v~~~~v~~~L~-~h~~i~~~i~~~~~~~~---~~vt---------~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw 268 (344) +|+.++|++|+ +|+.|+++++..+.... ..++ .+.+...+|| +|.+++.+|.. 
+++..++++ T Consensus 235 imt~~~~~~l~~~n~~ik~~l~~~~~~~~~~~~~~~~~~l~~~~~ln~~~~~~GL-~I~vYd~~Y~d----edGt~k~~~ 309 (409) T protein:vir:79 235 IIVGPGFDEVLADNTFVQKYVEYEKGWVVGQNTVQPPREVYRQAALDIFKRYTGL-EVMVYDKTYRD----QDGSVKYWI 309 (409) T ss_pred EEcHHHHHHHHhCcHHHHHhhhcccccccccccccchhhhcchhHhHhhhhhcCc-eEEEEeeEEEe----cCCccccee Confidence 99999997766 55667777765332111 1122 2345566688 58888887743 466777777 Q ss_pred CCceEEEEecCCCcccccccccceeecc--c-ccCCcCCcccccccCCCCceEEEeeccccceeeeccccc-hhhhcccC Q lcl|NC_015466. 269 GKHALLSYAPATPGIMTPSAGYTFNWTG--L-VGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLG-YFFGGIVA 344 (344) Q Consensus 269 ~~~~~l~~~~~~~~~~~~s~G~T~~~~~--~-~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g-~l~~~~va 344 (344) +++.+++...+.+.+|.+.||.|-+... - ......+..+..|.++....++++...+.-|+++++..- |++-++== T Consensus 310 Pd~~vvLl~ap~g~LG~T~yGa~~~~~~~~~~v~~~g~~i~~~~~~~~dP~~~~~~~~~~~~p~l~~~~~~~~~~~~~~~ 389 (409) T protein:vir:79 310 PVGELIVLNQSTGPVGRFVYTAHVAGQRNGKVVYATGPYLTVKDHLQDDPPYYAIIAGFHGLPQLSGYNTEDFSFHRFKW 389 (409) T ss_pred cCCeEEEEcCCcccccceecccccccccchhhhccccceeEecccccCCcceeeeecceEEeeeeecCCccceeehhhhh Confidence 7765544445667799999998643211 0 111122345678889999999999999999999987543 44444321 No 11 >protein:vir:6378 Length: 346 # NCBI annotation: capsid protein E # Family: family:all:1021 # MgeID: mge:133 # MgeName: BcepNazgul # Cross-refs: genbank:acc:NP_918991;genbank:gi:34610166;genbank:GeneID:2559600 Probab=99.92 E-value=2.4e-27 Score=166.54 Aligned_cols=314 Identities=12% Similarity=0.065 Sum_probs=179.3 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccc-cCCccceeeeechhh-----cccccccccccCcccccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVS-VGKQSDAYFTYERGD-----FNRDEMQERTPGTESAGG 74 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~-v~~~~~~~~~~~k~~-----~~~~~~~~ra~g~~~~~~ 74 (344) |= + | .-..||.+-....+. .++.+.+||..+ ....... +.+.++. 
+..+..+...++++...+ T Consensus 1 ~d--~-----f-~~~~l~~~i~~~p~~--~~l~~~~fp~~~~~~t~~i~-i~~~~g~~~la~~v~~~~~~~~~~~~g~~~ 69 (346) T protein:vir:63 1 ME--I-----F-DTLTLAGVIQSGPAL--SMYWQGFYPNEITFDTDEIL-FDLVFKDKKLAPFVAPNVQGRVIAARGYTT 69 (346) T ss_pred CC--c-----c-CHHHHHHHHHhcCCc--cchhhhcCccccccccceEE-EEEecCceeeeeeecCCCCcceecccceee Confidence 10 0 1 113445543333332 478999998643 2222222 1222222 111111111111111111 Q ss_pred eecccccccccccccccccccHHHHHh----ccCCCCHHHH-------HHHHHHHHHhhhHHHHHHHHHhhhhhhccccc Q lcl|NC_015466. 75 TYEIGNDTYFARTRAYHRDVPEQVRAN----ADNPISLDRE-------ATIFVTQKGLINREVNWAAAYFTAGAPGDTWT 143 (344) Q Consensus 75 ~~~~~~~~~~~~~~~l~~~v~~~~~~~----a~~~~~~~~~-------a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~ 143 (344) ..+. -++....+.+.. .....++. ..+...+.++ .+..+.+.|.++.|+++++++..+.+...+.. T Consensus 70 -~~~~-~p~i~~~~~i~~-~d~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~~i~~~~E~m~~~al~~gki~~~g~~ 146 (346) T protein:vir:63 70 -KTFR-PAYVKPKDVINP-NRTLKRRAGEQPIIGGMSLQERFQAVVADSQLEQRQRIENRIEWMCAMATIYGYVDVVGEA 146 (346) T ss_pred -eEee-cCccCccceeCH-HHHHHHhhhhhhccCCcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCCEEEeeCCc Confidence 1111 111111111110 01111221 1233344433 23455677888999999999988766544333 Q ss_pred ccccccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhcc Q lcl|NC_015466. 144 FDVDGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDR 223 (344) Q Consensus 144 ~~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~ 223 (344) .....++ ++.+..|+++++++. 
+|++++|||++||++|++++++++|.+|++++||+++|++|++|++|++++++ T Consensus 147 ~~~~~vd---fg~~~~~~~~lt~~~--~W~~~~adp~~di~~~~~~~~~~~g~~~~~~i~~~~~~~~l~~~~~v~~~~~~ 221 (346) T protein:vir:63 147 FPMQRVD---FGRDPALTVQLTGGA--AWDQATSDPLGNIQTMRTTAWKKSNSTITRLTMGLDAWSLFSQKPAVVELLNL 221 (346) T ss_pred eeEEEEe---eCCCccceeeecccc--cCCCCCCCHHHHHHHHHHHHHHccCCceEEEEECHHHHHHHhcCHHHHHHHhh Confidence 2222222 345778999999875 69999999999999999999999999999999999999999999999999976 Q ss_pred CCCccccccCHH------------HHHHHh---CCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCccccccc Q lcl|NC_015466. 224 GQTSGAAKANLV------------TLADLF---EVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSA 288 (344) Q Consensus 224 ~~~~~~~~vt~~------------~la~~~---gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~ 288 (344) .+....+.+... .+..++ |+ +|.+++.+| .+.++..+++|+++.++++ +.+.+|...| T Consensus 222 ~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~~~~gi-~i~~y~~~y----~d~~G~~~~~ip~~~v~~~--p~~~~g~~~y 294 (346) T protein:vir:63 222 FYKGSTSDFNRSRLDDGSPVQYQGTIGGYNGMGTL-ELYTYHDTY----TGDDNTEQEILGSYDVVGT--GPGLQGTQCF 294 (346) T ss_pred hccccccccchhhcccchhhhhhhhHhhhhccCCe-EEEEeccEE----EcCCCceeccccCCeEEEE--ecCCcceEEE Confidence 433222222222 122222 33 366666665 3445667788887776666 3455788889 Q ss_pred ccceeecccccCCcCCcccccccCCCCceEEEeeccccceeeeccccchhhhccc Q lcl|NC_015466. 289 GYTFNWTGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIV 343 (344) Q Consensus 289 G~T~~~~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~v 343 (344) |.+..... 
+......++..|......++++.+....-+++.-+++=|.++ += T Consensus 295 g~~~d~~~--~~~~~~~~~~~~~~~dp~~~~~~~~s~plPv~~~p~~~~~~~-V~ 346 (346) T protein:vir:63 295 GAIMDFKN--GLVPTRMFPKMWEEEDPSVAMLMTQSAPLMVPAQPNASFRMT-VK 346 (346) T ss_pred eecccccc--CcccceeeeEEEEecCCCEEEEEEeeeccceecCCCcEEEEE-eC Confidence 98765432 334445667778888888888887766666666666543331 11 No 12 >protein:vir:393 Length: 341 # NCBI annotation: gp8 # Family: family:all:1021 # MgeID: mge:325 # MgeName: N15 # Cross-refs: genbank:acc:NP_046903;genbank:gi:9630472;genbank:GeneID:1261647 Probab=99.87 E-value=4.3e-24 Score=148.69 Aligned_cols=311 Identities=9% Similarity=-0.025 Sum_probs=173.2 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCccccc-CCccceeeeechhhcccccccccccCcccccceec-c Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSV-GKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYE-I 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v-~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~-~ 78 (344) |- + | .-+.|+++-....+.+ .++.+.+||.... ..+... +.+.++..... +-+.++........+ + T Consensus 1 ~d--~-----f-~~~~L~~~i~~~~~~~-~~l~~~~Fp~~~~~~t~~v~-~~~~~~~~~la--p~v~~~~~~~~~~~~~~ 68 (341) T protein:vir:39 1 MS--V-----Y-TTAQLLAVNEKKFKFD-PLFLRIFFRETYPFSTEKVY-LSQIPGLVNMA--LYVSPIVSGKVIRSRGG 68 (341) T ss_pred CC--c-----c-CHHHHHHHHHhhcCcc-chhHhhcCCcccccCcceEE-EEEecCCceee--EEecCCCCcceecccce Confidence 11 0 1 1234666555554443 5899999996432 222221 22233321111 111222222222111 1 Q ss_pred cc---cccccccccccccccH---HHHHhccC---CCCHHHHH-------HHHHHHHHhhhHHHHHHHHHhhhhhhcccc Q lcl|NC_015466. 79 GN---DTYFARTRAYHRDVPE---QVRANADN---PISLDREA-------TIFVTQKGLINREVNWAAAYFTAGAPGDTW 142 (344) Q Consensus 79 ~~---~~~~~~~~~l~~~v~~---~~~~~a~~---~~~~~~~a-------~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~ 142 (344) +. +...++. ...+.. ..|..+++ ..++.++. +..+.+.|..+.|++++++++++++...+. 
T Consensus 69 ~~~~~~~p~i~~---~~~i~~~d~~~r~~g~~~~~~~~~~~~~~~~i~~~~~~l~~~i~~r~E~m~~qaL~~Gki~i~~~ 145 (341) T protein:vir:39 69 STSEFTPGYVKP---KHEVNPLMTLRRLPDEDPQNLADPVYRRRRIILQNMKDEELAIAQVEEKQAVAAVLSGKYTMTGE 145 (341) T ss_pred eeeeEeccccCc---ccccCHHHHHHHhhcccccccCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCceEEEcC Confidence 00 0111111 111111 12333332 22333332 344667788899999999998887643222 Q ss_pred cccccccccccccccccceeecccccccccCCCCC---ChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHH Q lcl|NC_015466. 143 TFDVDGVASSPTAPASFDPTNASNNDKLHWSDASS---TPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVG 219 (344) Q Consensus 143 ~~~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~S---DPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~ 219 (344) .+.. ...++..+..|++++++++ +||++++ ||+.||++|. +..|..|++++||+++|++|++|++|++ T Consensus 146 g~~~---~~vDfg~~~~~~~~lt~~~--~W~~~~~~~~d~l~di~~~~----~~~g~~~~~ii~~~~~~~~l~~~~~v~~ 216 (341) T protein:vir:39 146 AFEP---VEVDMGRSAGNNIVQAGAA--AWSSRDKETYDPTDDIEAYA----LNASGVVNIIVFDPKGWALFRSFKAVKE 216 (341) T ss_pred CCcE---EEEeccCCccceeEecCCc--cCCCCCCchHHHHHHHHHHH----HhcCCceEEEEeChHHHHHHhcCHHHHH Confidence 2111 1233456788999999876 6999864 6777777775 4568999999999999999999999999 Q ss_pred HhccCCCcccccc----CHHHHHH---HhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce Q lcl|NC_015466. 220 RIDRGQTSGAAKA----NLVTLAD---LFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF 292 (344) Q Consensus 220 ~i~~~~~~~~~~v----t~~~la~---~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~ 292 (344) ++++.......+- ....... .++...|.+++.+|.. ++..+++|+++.++++ +.+.+|.+.||.|. 
T Consensus 217 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~i~~y~~~y~d-----~g~~~~~ip~~~~~l~--p~~~~g~~~yg~~~ 289 (341) T protein:vir:39 217 KLDTRRGSNSELETALKDLGKAVSYKGMYGDVAIVVYSGQYIE-----NDVKKNYLPDLTMVLG--NTQARGLRTYGCIL 289 (341) T ss_pred HHhhcccccccccchhhhhhhHhhhhhhhcCceEEEEccEEEe-----cCcEEeeecCCeEEEe--eCCCcceEEEeccc Confidence 9976433322221 1222223 3455679888888753 2344666666665554 34557888999886 Q ss_pred eeccccc-CCcCCcccccccCC-CCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 293 NWTGLVG-SGNEGMRIKRFYLD-AIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 293 ~~~~~~g-~~~~~~~~~~~~~~-~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ....... ......+++.|... ...++++.+....-|++.-+++=|.. .|| T Consensus 290 d~~~~~~~~~~~~~~~~~~~~~~dp~~~~~~~~s~plPv~~~p~~~~~a--~V~ 341 (341) T protein:vir:39 290 DADAQREGINASTRYPKNWVQTGDPAREFTMIQSAPLMLLADPDEFVSV--KLA 341 (341) T ss_pred chhhcccceeeeeeeeeeeeecCCCcEEEEEEeccccceeeCCCcEEEE--EeC Confidence 5543211 12223344454443 56888888888888888777765554 344 No 13 >protein:vir:3424 Length: 341 # NCBI annotation: capsid component # Family: family:all:1021 # MgeID: mge:70 # MgeName: lambda # Cross-refs: genbank:acc:NP_040587;genbank:gi:9626251;genbank:GeneID:2703482 Probab=99.85 E-value=2e-23 Score=145.08 Aligned_cols=315 Identities=10% Similarity=-0.017 Sum_probs=173.5 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccc-cCCccceeeeechhhccc-ccccccccCcccccceecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVS-VGKQSDAYFTYERGDFNR-DEMQERTPGTESAGGTYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~-v~~~~~~~~~~~k~~~~~-~~~~~ra~g~~~~~~~~~~ 78 (344) |- + | .-..|+++-....+.. .|+++.+||... +...... +.+.++.... +.+.+..+|.....-.++. 
T Consensus 1 ~d--~-----f-~~~~L~~~i~~~~~~~-~~l~d~~fp~~~~~~t~~v~-~~~~~~~~~lap~v~~~~~~~~~~~~~~~~ 70 (341) T protein:vir:34 1 MS--M-----Y-TTAQLLAANEQKFKFD-PLFLRLFFRESYPFTTEKVY-LSQIPGLVNMALYVSPIVSGEVIRSRGGST 70 (341) T ss_pred CC--C-----c-CHHHHHHHHHhccCcc-chhHHhcCCcccccccceEE-EEEeeCCeeEEEeecCCCCcceeccCceee Confidence 11 0 1 1234666555555543 589999999743 2222221 2333332221 1111112221111111111 Q ss_pred c--ccccccccccccccccHHHHHhccC---CCCHHHHHH-------HHHHHHHhhhHHHHHHHHHhhhhhhcccccccc Q lcl|NC_015466. 79 G--NDTYFARTRAYHRDVPEQVRANADN---PISLDREAT-------IFVTQKGLINREVNWAAAYFTAGAPGDTWTFDV 146 (344) Q Consensus 79 ~--~~~~~~~~~~l~~~v~~~~~~~a~~---~~~~~~~a~-------~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~ 146 (344) . .-.+....+.+. +.....|..+++ ..++.++.. ..+.+.|..+.|++++++++++++...+..+.. T Consensus 71 ~~~~~p~i~~~~~i~-~~d~~~r~~g~~~~~~~~~~~~~~~~i~~~l~~l~~~i~~~~E~m~~qaL~~Gki~~~~~g~~~ 149 (341) T protein:vir:34 71 SEFTPGYVKPKHEVN-PQMTLRRLPDEDPQNLADPAYRRRRIIMQNMRDEELAIAQVEEMQAVSAVLKGKYTMTGEAFDP 149 (341) T ss_pred eEEecCccCccceeC-HHHHHHHhhccccccCcCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhcCcEEEecCCccE Confidence 0 011111111111 111122333332 334444333 334567888999999999998876433221111 Q ss_pred cccccccccccccceeecccccccccCCCC---CChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhcc Q lcl|NC_015466. 147 DGVASSPTAPASFDPTNASNNDKLHWSDAS---STPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDR 223 (344) Q Consensus 147 ~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~---SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~ 223 (344) . ..+++.+..|++++++++ +|++++ +||++||++|. 
+..|..|++++||+++|++|++|++|++++++ T Consensus 150 ~---~vDfg~~~~~~~~~t~~~--~W~~~~~~~~d~l~di~~~~----~~~g~~~~~~i~~~~~~~~l~~~~~v~~~~~~ 220 (341) T protein:vir:34 150 V---EVDMGRSEENNITQSGGT--EWSKRDKSTYDPTDDIEAYA----LNASGVVNIIVFDPKGWALFRSFKAVKEKLDT 220 (341) T ss_pred E---EEEeCCCCccceEecCCc--cCCcCCCchHHHHHHHHHHH----HhcCCceEEEEeCHHHHHHHhcCHHHHHHHhh Confidence 1 233455788999999876 699875 46777777664 45799999999999999999999999999976 Q ss_pred CCCcccccc----CHHHHHHH---hCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecc Q lcl|NC_015466. 224 GQTSGAAKA----NLVTLADL---FEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTG 296 (344) Q Consensus 224 ~~~~~~~~v----t~~~la~~---~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~ 296 (344) ...+...+. ...+...+ ++...|.+++.+|.. ++..+++|+++.++++. .+.+|.+.||.|..... T Consensus 221 ~~~~~~~~~~~~~~~~~~~~~~~~~~g~~i~~y~~~y~d-----dG~~~~~ip~~~v~l~p--~g~~g~~~yg~~~d~~~ 293 (341) T protein:vir:34 221 RRGSNSELETAVKDLGKAVSYKGMYGDVAIVVYSGQYVE-----NGVKKNFLPDNTMVLGN--TQARGLRTYGCIQDADA 293 (341) T ss_pred cccccccccccccccccceeeeeecCCceEEEEcCEEEE-----CCcEEeeecCCeEEEee--CCCcceEEEeecccccc Confidence 443322221 12222222 345568888888742 35567788877777664 44578889998865433 Q ss_pred cc-cCCcCCccccccc-CCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 297 LV-GSGNEGMRIKRFY-LDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 297 ~~-g~~~~~~~~~~~~-~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .. +......+++.|. 
.+...++++.+....-+++.-+++=|..+ || T Consensus 294 ~~~~~~~~~~~~~~~~~~~dp~~~~~~~~s~pLPv~~~pd~~~~a~--V~ 341 (341) T protein:vir:34 294 QREGINASARYPKNWVTTGDPAREFTMIQSAPLMLLADPDEFVSVQ--LA 341 (341) T ss_pred cccceeeeeEeeeeeeecCCCcEEEEEEcccceeeeeCCCcEEEEE--eC Confidence 21 1111223344443 34567888888888788887777655543 44 No 14 >protein:vir:108211 Length: 318 # NCBI annotation: gp9 # Family: family:all:6420 # MgeID: mge:2004 # MgeName: Giles # Cross-refs: genbank:acc:YP_001552338;genbank:gi:160700658;genbank:GeneID:5758931 Probab=99.27 E-value=5e-14 Score=93.48 Aligned_cols=284 Identities=14% Similarity=0.120 Sum_probs=156.6 Q ss_pred CCCCCCCCccceecccccceeeeeEcC------------cchhhhhhhCccccc-CCccceeeeechhhccccccccccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQD------------ASHFVAGQVFPQVSV-GKQSDAYFTYERGDFNRDEMQERTP 67 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~------------~~~~ia~~lfP~v~v-~~~~~~~~~~~k~~~~~~~~~~ra~ 67 (344) |.+-++ -.+..-++.||-= .+-+. +..||++.||=.+.- ....++|.+ +...|.......++. T Consensus 1 ~~~~~~-i~s~~~~~~itv~--~ll~~P~~I~~~i~e~~~~~~iad~lf~~~~a~~~~~v~f~~-~~p~~~~~d~e~VaE 76 (318) T protein:vir:10 1 MTAPTG-IVSVSDGPAITVR--ELVGNPLWIPTALKKMMVNQFISESLFRNGGANPNGVVAYNE-GNPSFLEDDVADVAE 76 (318) T ss_pred CCCCCc-ceeeecCCceehH--HhhCCchhHHHHHHHHHhccchhhhhhhcccccccceeEEEe-cccccccCcHhhccC Confidence 332211 1111112222210 01111 124788888865432 222333322 122233333455777 Q ss_pred Ccccccceecccccccc-cccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccc Q lcl|NC_015466. 68 GTESAGGTYEIGNDTYF-ARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDV 146 (344) Q Consensus 68 g~~~~~~~~~~~~~~~~-~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~ 146 (344) |+.......+.+..... 
.+..+++..+.++ .+..+..++-+++...+.+-|.+..+.++-.++.++.+ T Consensus 77 ggEiP~~~~~~G~~~ia~~~K~G~~~~vS~E--m~~~n~~~~v~r~~~~l~Nti~r~~d~~a~dal~sa~t--------- 145 (318) T protein:vir:10 77 FGEIPVSAGARGLPRTAFAVKKALGVRVSKE--MIDENRVGAVNDQMLQLRNTFIRANDRSAKALLQSPIV--------- 145 (318) T ss_pred cccccccCCCCCchhhhhhehhccceeccHH--HHhhcChhHHHHHHHHHHHHHHHHHHHHHHHHHhcccc--------- Confidence 87777666655444442 3455666555554 44567788889999999999998888887777665542 Q ss_pred cccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHH---------------HhcCCCcceEEeCHHHHHHH Q lcl|NC_015466. 147 DGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVL---------------EETGFEPNVLTLGKAVYDAL 211 (344) Q Consensus 147 ~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~---------------~~~G~~Pn~~v~~~~v~~~L 211 (344) .++..++ .|++ .++|..|+-++++.+. .+-|++||+|+|....|..| T Consensus 146 -------------~~~~~s~----~w~~-~~~~~~d~~~A~e~v~~a~~~~~~a~~~~~~~~~GY~pdtIVlhP~~~~~l 207 (318) T protein:vir:10 146 -------------PTLAVPT----AWDN-GGKVRTDIAIAIEQISTAAPTAYPAGVGSSDEYFGFIPDTIVMHYALLPIL 207 (318) T ss_pred -------------ccccCCc----CCCC-cccccccchhhhhhhhhhhhhhhhhhhhhhhhccCccceeeEECHHHHHHH Confidence 1111222 2775 4677777777665442 46699999999999999999 Q ss_pred hcCHHHHHHhccCCCccccccCH---HHH-HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccc Q lcl|NC_015466. 212 VDHPDIVGRIDRGQTSGAAKANL---VTL-ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPS 287 (344) Q Consensus 212 ~~h~~i~~~i~~~~~~~~~~vt~---~~l-a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s 287 (344) ++|+.+++.+... .+. ..... 
..+ -++|||.-| ..+.|+.+-+|++ +.+.+|.-+ T Consensus 208 ~~n~~~~~~y~~~-a~~-~~~~~~~tg~~~g~~lGl~vi-----------------~s~~~p~~~alvl--q~g~vG~~~ 266 (318) T protein:vir:10 208 MDNENFMKVYERN-ANY-VSTAPDWTGNFPGSVMGLNVI-----------------RSRTFPIDRVLIM--ERGTVGFYS 266 (318) T ss_pred hcchhhhhhhhcc-chh-hhhcccccccccceeeceEEe-----------------ecCccCCCeeEEE--ecCCcceee Confidence 9999999876432 211 11101 111 123444311 1223333333333 334444222 Q ss_pred cccceeecccccCCcCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 288 AGYTFNWTGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 288 ~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ...-...+.++.++. ..+.+-..+|++|+.+.+-.-|+-|-+.++|++.+. T Consensus 267 d~~pl~~t~~~~egg------~~~g~~~~s~~~~~~~~~~~~V~~PkA~~~itgi~~ 317 (318) T protein:vir:10 267 DTRPLQFTALYPEGN------GPNGGPTESYRADASHKRALAVDQPKAALWLTGIVT 317 (318) T ss_pred ccccceeeecccCCC------CCCCCcchhhheehheeeeeeeeCcceeEEEeeccC Confidence 111122222221111 123456688999999999999999999999999999 No 15 >protein:vir:10324 Length: 320 # NCBI annotation: ORF26 # Family: family:all:570 # MgeID: mge:182 # MgeName: VHML # Cross-refs: genbank:acc:NP_758919;genbank:gi:27311193;genbank:GeneID:956155 Probab=98.86 E-value=2.6e-10 Score=73.11 Aligned_cols=294 Identities=8% Similarity=-0.008 Sum_probs=120.9 Q ss_pred CCCCCCCCcc-c-eeccc-ccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceec Q lcl|NC_015466. 1 MPFTQPSRSD-V-HVNRP-LTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYE 77 (344) Q Consensus 1 m~~~~~~~~~-~-~~dp~-LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~ 77 (344) +|+......- | +-.|+ -+.|++..++.... 
|.|.++ +|...+....+ T Consensus 4 ~P~~~g~~~glff~~~~v~T~~V~ie~~~~~l~-----lip~v~-------------------------rg~~g~~~~~~ 53 (320) T protein:vir:10 4 LPVNYGDSRALFAREKKVRTRTILVEEKNGVLT-----LIQSRE-------------------------PGSTENVAKRG 53 (320) T ss_pred CCchhhhhhhhccCCCCcccceEEEEEecCcee-----eeeccC-------------------------CCCCceeecCC Confidence 3322210000 1 11121 23345555444321 122221 11111111110 Q ss_pred cc-ccccccccccccccccHHHH----HhccCCCCH----HHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccc Q lcl|NC_015466. 78 IG-NDTYFARTRAYHRDVPEQVR----ANADNPISL----DREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDG 148 (344) Q Consensus 78 ~~-~~~~~~~~~~l~~~v~~~~~----~~a~~~~~~----~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~g 148 (344) -. -..+...-...+..+.-++. .-++..... ..+.+..+.+.+.+.+|++++..++ +.++ ..+| T Consensus 54 ~~~~~~f~~p~~~~~d~i~a~eiq~~Ra~G~~~~~~~~~~v~~~l~~lr~~~~~T~E~m~~~AL~-G~il------dadG 126 (320) T protein:vir:10 54 KRKVRSFVIPHLPLEDVILPDEYEGLRGFGTTALAAKSELVKERXETMKSSHDITHEHLRMGAKK-GQIL------DADG 126 (320) T ss_pred cceEEEEecceeccCCccCHHHHcCcccCCCchHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhhc-CeEE------cCCC Confidence 00 00000000011111111111 111111111 1233344556777788888887763 4432 2223 Q ss_pred cccccc---cccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCc---ceEEeCHHHHHHHhcCHHHHHHhc Q lcl|NC_015466. 149 VASSPT---APASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEP---NVLTLGKAVYDALVDHPDIVGRID 222 (344) Q Consensus 149 v~~~~~---~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~P---n~~v~~~~v~~~L~~h~~i~~~i~ 222 (344) .....+ +.-+.+.+.. 
.+++++.|+.+.+.++++.|.++.+-.+ -++++|+++|++|..||+|+++++ T Consensus 127 tv~~d~y~~fGi~~~~i~~------~l~~a~~dv~~~~~~~~~~i~~~l~g~~~t~v~al~g~~f~~al~~h~~Vke~y~ 200 (320) T protein:vir:10 127 TVLYDLYAEFGITKKTIYF------GLDNKDANVAESCRQVLRHVEDNLRGDVMKDVSVDVSEEFFDKFIKHASVKEVFL 200 (320) T ss_pred cEEEechhhhCCccceeEE------ecCCCCccHHHHHHHHHHHHHHHhccCCCCceEEEEChHHHHHHhcCHHHHHHHH Confidence 211111 1111122211 2455677888888888888877655444 378999999999999999999986 Q ss_pred cCCCccccccCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCc Q lcl|NC_015466. 223 RGQTSGAAKANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGN 302 (344) Q Consensus 223 ~~~~~~~~~vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~ 302 (344) +.... ...-.+....-|..--|.+- .|+..+.+.++..+++.+++-..++....+++....|+.- ..-...++-. T Consensus 201 ~~~~~--~~~l~~~~~~~f~~gGi~~~--~Y~g~~~d~~g~~~~~I~~~~~~~~p~g~~~~f~~~~apa-d~~e~vnt~g 275 (320) T protein:vir:10 201 NHEAA--VNRLGGDTRKGFKFGGLIFN--ENRARHVDEEGKETRFIKAGKGHAFPTGTTNTFFTALAPA-DFNETAGTLG 275 (320) T ss_pred hhhhh--hhhccccccceEEecCEEEE--EcccEEEcCCCCeeEeecCCeeEEEEecCchhheeeeccc-CcHhhcCCcc Confidence 54322 11111212122211112221 1444444445555555555544444333334444444432 1001112223 Q ss_pred CCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
303 EGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 303 ~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .....+.|.....+++.+..-...-++..-| +.|++-.++ T Consensus 276 ~p~y~k~~~~~~~~g~~l~~qS~PLpi~~rP--~~lv~~~~~ 315 (320) T protein:vir:10 276 KRYYAKMEPRRMGRGFDLHSQSNVLPMCCRP--GVLVELDAA 315 (320) T ss_pred cccccccccccCCCeEEEEeeecccccccCc--ceEEEEEec Confidence 3455666666666666654443333322212 222222222 No 16 >protein:vir:95258 Length: 368 # NCBI annotation: Phage conserved protein # Family: family:all:570 # MgeID: mge:1561 # MgeName: Felix 01 # Cross-refs: genbank:acc:NP_944891;genbank:gi:38707831;genbank:GeneID:2744044 Probab=97.78 E-value=2.4e-06 Score=51.42 Aligned_cols=321 Identities=11% Similarity=0.009 Sum_probs=131.4 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhh-hCcccccCCccceeeeechhhcccccccccccCccccc-ceecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQ-VFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAG-GTYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~-lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~-~~~~~ 78 (344) |-+- .+...|-+ ..||+--.-..+.+ .+|++. ||+..+|......+ +...+.+.. .+....|+.... ...+. T Consensus 1 ~~d~-f~~d~Fs~-~~LT~ain~~p~~p-~~l~~lglF~~~~v~t~~v~i-E~~~~~l~L--vp~~~rg~~~~~~~~~~~ 74 (368) T protein:vir:95 1 MLTN-SEKSRFFL-ADLTGEVQSIPNTY-GYISNLGLFRSAPITQTTFLM-DLTDWDVSL--LDAVDRDSRKAETSAPER 74 (368) T ss_pred Cccc-ccCCcccH-HHHHHHHHhcCCCc-ceecccccccCCCccceEEEE-EEEcCeEEE--ccccCCCCCCcccccCCc Confidence 2111 22333321 34666554555544 367766 88877766444332 222222222 122223322111 11110 Q ss_pred -cccccccccccccccccHHHHHhccCCCC------HHHHHH----HHHHHHHhhhHHHHHHHHHhhhhhhccccccccc Q lcl|NC_015466. 79 -GNDTYFARTRAYHRDVPEQVRANADNPIS------LDREAT----IFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVD 147 (344) Q Consensus 79 -~~~~~~~~~~~l~~~v~~~~~~~a~~~~~------~~~~a~----~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~ 147 (344) +-....+.-..++..+.-++..+-. .|. 
..+..+ +.+.+.+.+.+|.++...++ +.+ .+.+ T Consensus 75 r~~~~f~~ph~~~~d~I~a~eiQg~R-afG~~~~l~~v~~~v~~kl~~~r~~~d~T~E~~r~gAL~-G~i------lDad 146 (368) T protein:vir:95 75 VRQISFPMMYFKEVESITPDEIQGVR-QPGTANELTTEAVVRAKKLMKIRTKFDITREFLFMQALK-GKV------VDAR 146 (368) T ss_pred eeEEEEecceeccccccchHHHcccc-CCCChhHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHhhc-Cee------ECCC Confidence 0011112212223333333332221 121 112222 33344555566666665543 332 3333 Q ss_pred cccccccc---ccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcC---CC---cceEEeCHHHHHHHhcCHHHH Q lcl|NC_015466. 148 GVASSPTA---PASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETG---FE---PNVLTLGKAVYDALVDHPDIV 218 (344) Q Consensus 148 gv~~~~~~---~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G---~~---Pn~~v~~~~v~~~L~~h~~i~ 218 (344) |.-..+++ .-+.+.+.. ..++++.|+-+.+++|...|.++.+ .. .-.+++|+..|++|..||+|+ T Consensus 147 Gtvl~dly~eFGit~~~v~f------~l~~~~tdv~~~~~~~~~~i~d~l~g~~~~~~~~v~alcg~~Ffd~L~~h~~Vk 220 (368) T protein:vir:95 147 GTLYADLYKQFDVEKKTIYF------DLDNPNADIDASIEELRMHMEDEAKTGTVINGEEIHVVVDRVFFSKLTKHPKIR 220 (368) T ss_pred CcEEecchhhhCCccceEEE------EeCCCCcCHHHHHHHHHHHHHHhhcccccccccceEEEEChHHHHHhhcChhHH Confidence 32111111 111111111 2456889999999999999998762 23 357888999999999999999 Q ss_pred HHhccCCCccccccCHHHHHHH------hCCCeEEEEEEE---EeccccCCCCccceeCCCceEEE-----EecCC---- Q lcl|NC_015466. 219 GRIDRGQTSGAAKANLVTLADL------FEVDKVLVMKAV---RNTAKKGQTASHSFIGGKHALLS-----YAPAT---- 280 (344) Q Consensus 219 ~~i~~~~~~~~~~vt~~~la~~------~gl~~I~v~~a~---yn~~~~~~~~~~~~iw~~~~~l~-----~~~~~---- 280 (344) +++++........-.+..+..- -.......|.-+ |+....+.++...++++++.+.+ |.-|. 
T Consensus 221 eay~~~~~a~~~~~lr~~~r~g~~~~~~~~~~~F~fgGi~f~eYrg~~~~~~g~~~~~v~~d~v~I~~gea~~~P~G~~~ 300 (368) T protein:vir:95 221 DAYLAQQTPLAWQQITGSLRTGGADGVQAHMNTFYYGGVKFVQYNGKFKDKRGKVHTLVSIDSVADTVGVGHAFPNVAML 300 (368) T ss_pred HHHHHHHhhhhhhhhccccccccccccccccceeEecCEEEEEcceeecCCCcceeeeecCCceeeccCceEEEeecccc Confidence 9876543221111111111100 001112222222 33333444555555555442221 10021 Q ss_pred ---CcccccccccceeecccccCCcCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 281 ---PGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 281 ---~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +++..+.|+.-- .-....+-........|.....++..+++-...-++..=| +.|++-..+ T Consensus 301 ~~~~~~F~~~~aPad-~~e~vNt~g~p~Ya~~~~~~~~~g~~le~qSnpLpic~RP--~~lv~~~~~ 364 (368) T protein:vir:95 301 GEANNIFEVAYGPCP-KMGYANTLGQELYVFEYEKDRDEGIDFEAHSYMLPYCTRP--QLLVDVRAD 364 (368) T ss_pred cccCcceEEEecCCC-cHhhcCCCcccccceeeeccCCCeeEEEEeecccchhccc--ceeEEEEec Confidence 233334444321 0011112222233333333444555554444333222222 223333222 No 17 >protein:vir:9820 Length: 272 # NCBI annotation: putative major capsid/head protein # Family: family:all:522 # MgeID: mge:176 # MgeName: 315.4 # Cross-refs: genbank:acc:NP_795582;genbank:gi:28876339;genbank:GeneID:1257858 Probab=96.92 E-value=7.4e-05 Score=43.23 Aligned_cols=261 Identities=7% Similarity=-0.024 Sum_probs=133.3 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhC-ccccc-------CCccceeeeechhhcccccccccccCcccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVF-PQVSV-------GKQSDAYFTYERGDFNRDEMQERTPGTESA 72 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lf-P~v~v-------~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~ 72 (344) |+++.+...+..+-.+++++.+.- +....+| +.+.+ +....++++|.... .. 
.--+-|...- T Consensus 1 MA~~~T~~~~~~iPev~s~~v~~~------~~~~~~~~~~~~~~~~~~g~~G~tv~iP~~~~~~--~a--~~v~eg~~i~ 70 (272) T protein:vir:98 1 MAVGTTKMAQMLDPEVLADMIDAE------VGKAIRFAPLAEVDTTLEGQPGTTLTVPKWDYIG--DA--EDVAEGEAIP 70 (272) T ss_pred CCCccccchheechHHHHHHHHHH------HHHHhhhhccccccccccCCCCCEEEEEEecCCC--Cc--ccccCCCccc Confidence 998766666554444555543221 1111222 22221 12234455553110 00 1111233333 Q ss_pred cceecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccc Q lcl|NC_015466. 73 GGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASS 152 (344) Q Consensus 73 ~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~ 152 (344) ..+..++......+..+.-..+.++.+. .+..++.....+.+...+....|..+...+-. . T Consensus 71 ~~~~~~~~~~~~~~~~~~~~~itd~~~~--~s~~d~~~~~~~~~~~~~a~~~d~~i~~~~~~--------------a--- 131 (272) T protein:vir:98 71 MTQLGFKKTTMTIKKAGKGVEITDEAIL--SGYGDPVGQAAKQIVEAIDHKVDADVLDALSK--------------S--- 131 (272) T ss_pred ccccccceEEEEeeeeeeeeeecHHHHh--hccccHHHHHHHHHHHHHHHHHHHHHHHHhcc--------------c--- Confidence 3334444444444444433344444443 34567777777777777776666554432210 0 Q ss_pred cccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcccccc Q lcl|NC_015466. 153 PTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKA 232 (344) Q Consensus 153 ~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~v 232 (344) +++.+ ++..+.+|.++...+.+ .+..+..++|+++++..|++++.+. .++....+. 
+.+ T Consensus 132 ----------~~~~~--------~~~t~d~i~da~~~l~~-~~~~~~~~vv~p~~~~~L~k~~~~~-~~~~~~~~~-~~~ 190 (272) T protein:vir:98 132 ----------TQTVE--------ATATVDGVSKALDIFND-EDDAETVIVMNPADASTLRLDAAKE-WLGATEVGA-NRV 190 (272) T ss_pred ----------ccccc--------cccCHHHHHHHHHHHhc-cCCCccEEEEcHHHHHHHHHhcccc-ccccccccc-ccc Confidence 00000 11224456666666544 4788999999999999998665432 222222221 223 Q ss_pred CHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccC Q lcl|NC_015466. 233 NLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYL 312 (344) Q Consensus 233 t~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~ 312 (344) ..-++..++|++ |++.... ....++++. .+ .+|+ +... +..++.++. T Consensus 191 ~~g~ig~i~G~~-Vi~s~~~---------------p~~t~~~~~---~~-----a~~~-~~~~--------~~~ve~~r~ 237 (272) T protein:vir:98 191 VSGVYGEVLGVQ-IVRSRKC---------------PKGTAYMVR---KG-----ALRI-MLKR--------NTMVETDRD 237 (272) T ss_pred ccccchhhcCee-EEEcCCC---------------CcceEEEEc---CC-----eEEE-EecC--------Cceeeeccc Confidence 333456778885 5443221 122233321 11 1111 1111 123566677 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ...+...+++...+--+++-+++...++-.-| T Consensus 238 ~~~~~~~i~~~~~~~~~v~~~~~vv~~t~~~a 269 (272) T protein:vir:98 238 ITKAINQIVANKHYGVYLYKAEKAVKITLKDA 269 (272) T ss_pred cccceeEEEEEEEEEEEEEcCCceEEEEeccc Confidence 78889999999999999999998888888877 No 18 >protein:vir:3033 Length: 272 # NCBI annotation: major capsid protein # Family: family:all:522 # MgeID: mge:61 # MgeName: PhiNIH1.1 # Cross-refs: genbank:acc:NP_438146;genbank:gi:16271809;genbank:GeneID:929235 Probab=96.92 E-value=7.4e-05 Score=43.23 Aligned_cols=261 Identities=7% Similarity=-0.024 Sum_probs=133.3 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhC-ccccc-------CCccceeeeechhhcccccccccccCcccc Q lcl|NC_015466. 
1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVF-PQVSV-------GKQSDAYFTYERGDFNRDEMQERTPGTESA 72 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lf-P~v~v-------~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~ 72 (344) |+++.+...+..+-.+++++.+.- +....+| +.+.+ +....++++|.... .. .--+-|...- T Consensus 1 MA~~~T~~~~~~iPev~s~~v~~~------~~~~~~~~~~~~~~~~~~g~~G~tv~iP~~~~~~--~a--~~v~eg~~i~ 70 (272) T protein:vir:30 1 MAVGTTKMAQMLDPEVLADMIDAE------VGKAIRFAPLAEVDTTLEGQPGTTLTVPKWDYIG--DA--EDVAEGEAIP 70 (272) T ss_pred CCCccccchheechHHHHHHHHHH------HHHHhhhhccccccccccCCCCCEEEEEEecCCC--Cc--ccccCCCccc Confidence 998766666554444555543221 1111222 22221 12234455553110 00 1111233333 Q ss_pred cceecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccc Q lcl|NC_015466. 73 GGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASS 152 (344) Q Consensus 73 ~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~ 152 (344) ..+..++......+..+.-..+.++.+. .+..++.....+.+...+....|..+...+-. . T Consensus 71 ~~~~~~~~~~~~~~~~~~~~~itd~~~~--~s~~d~~~~~~~~~~~~~a~~~d~~i~~~~~~--------------a--- 131 (272) T protein:vir:30 71 MTQLGFKKTTMTIKKAGKGVEITDEAIL--SGYGDPVGQAAKQIVEAIDHKVDADVLDALSK--------------S--- 131 (272) T ss_pred ccccccceEEEEeeeeeeeeeecHHHHh--hccccHHHHHHHHHHHHHHHHHHHHHHHHhcc--------------c--- Confidence 3334444444444444433344444443 34567777777777777776666554432210 0 Q ss_pred cccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcccccc Q lcl|NC_015466. 153 PTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKA 232 (344) Q Consensus 153 ~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~v 232 (344) +++.+ ++..+.+|.++...+.+ .+..+..++|+++++..|++++.+. .++....+. 
+.+ T Consensus 132 ----------~~~~~--------~~~t~d~i~da~~~l~~-~~~~~~~~vv~p~~~~~L~k~~~~~-~~~~~~~~~-~~~ 190 (272) T protein:vir:30 132 ----------TQTVE--------ATATVDGVSKALDIFND-EDDAETVIVMNPADASTLRLDAAKE-WLGATEVGA-NRV 190 (272) T ss_pred ----------ccccc--------cccCHHHHHHHHHHHhc-cCCCccEEEEcHHHHHHHHHhcccc-ccccccccc-ccc Confidence 00000 11224456666666544 4788999999999999998665432 222222221 223 Q ss_pred CHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccC Q lcl|NC_015466. 233 NLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYL 312 (344) Q Consensus 233 t~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~ 312 (344) ..-++..++|++ |++.... ....++++. .+ .+|+ +... +..++.++. T Consensus 191 ~~g~ig~i~G~~-Vi~s~~~---------------p~~t~~~~~---~~-----a~~~-~~~~--------~~~ve~~r~ 237 (272) T protein:vir:30 191 VSGVYGEVLGVQ-IVRSRKC---------------PKGTAYMVR---KG-----ALRI-MLKR--------NTMVETDRD 237 (272) T ss_pred ccccchhhcCee-EEEcCCC---------------CcceEEEEc---CC-----eEEE-EecC--------Cceeeeccc Confidence 333456778885 5443221 122233321 11 1111 1111 123566677 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ...+...+++...+--+++-+++...++-.-| T Consensus 238 ~~~~~~~i~~~~~~~~~v~~~~~vv~~t~~~a 269 (272) T protein:vir:30 238 ITKAINQIVANKHYGVYLYKAEKAVKITLKDA 269 (272) T ss_pred cccceeEEEEEEEEEEEEEcCCceEEEEeccc Confidence 78889999999999999999998888888877 No 19 >protein:vir:93742 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1475 # MgeName: 55 # Cross-refs: genbank:acc:YP_240459;genbank:gi:66396126;genbank:GeneID:5133511 Probab=96.87 E-value=8.8e-05 Score=42.81 Aligned_cols=266 Identities=10% Similarity=0.009 Sum_probs=138.4 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCccccc----CCccceeeeechhhcccccccccccCccccccee Q lcl|NC_015466. 
1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSV----GKQSDAYFTYERGDFNRDEMQERTPGTESAGGTY 76 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v----~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~ 76 (344) |++.++...+..+=.+++++.+.=-... ++ +.+.+.+ ..+.|...++++-.. ......-.-|....+.+. T Consensus 1 ma~~~T~~~~~iiPev~~~~v~~~~~~~--~~---~~~~~~~~~~l~g~~G~tv~ip~~~~-~g~~~~~~eg~~i~~~~i 74 (274) T protein:vir:93 1 MPQGITKTSNQIIPEVLAPMMQAQLEKK--LR---FASFAEVDSTLQGQPGDTLTFPAFVY-SGDAQVVAEGEKIPTDIL 74 (274) T ss_pred CCccceehhheechHHHHHHHHHHHHhh--hh---hcccccccccccCCCCCEEEEEeecc-CCCcccccCCCccccccc Confidence 9998887777644335555432211110 11 1122211 122243333332110 001111112333333344 Q ss_pred cccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccc Q lcl|NC_015466. 77 EIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAP 156 (344) Q Consensus 77 ~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~ 156 (344) ..+......+..+..-.+.+..+ .....++...+.+.+...+....+..+...+-.+. T Consensus 75 t~~~~~~~i~~~~~~~~i~D~~~--~~~~~d~~~~~~~~~~~~~a~~~d~~~~~~~~~a~-------------------- 132 (274) T protein:vir:93 75 ETKKREAKIRKIAKGTSITDEAL--LSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAK-------------------- 132 (274) T ss_pred ccceeEEEeeeecccccccHHHH--HhhccchHHHHHHHHHHHHHHHHHHHHHHHHhccc-------------------- Confidence 44444444444444334444444 34456788877778777777766655544331110 Q ss_pred cccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHH Q lcl|NC_015466. 157 ASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVT 236 (344) Q Consensus 157 ~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~ 236 (344) ++ .+.+......|.++...+.+. +..+..+++++.++..|++++.+. .+.....+. 
+.+..-+ T Consensus 133 -------~~-------~~~~~~~~d~i~dA~~~l~d~-~~~~~~ivv~p~~~~~L~k~~~~~-f~~~s~~g~-~~~~~G~ 195 (274) T protein:vir:93 133 -------LT-------VNADITKLNGLQSAIDKFNDE-DLEPMVLFINPLDAGKLRGDASTN-FTRATELGD-DIIVKGA 195 (274) T ss_pred -------cc-------ccccccCHHHHHHHHHHhhhc-cCCccEEEeCHHHHHHHHhhhhhc-ccccccccc-cceeecc Confidence 00 011122355677777776665 569999999999999999877653 222222222 3344446 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCCc Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIE 316 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~ 316 (344) +..++|++ |++... +....++|+. ...+|+ +... ...++..+....+ T Consensus 196 ig~~~G~~-Vi~s~~---------------~p~~t~~l~~--------~gai~~-~~~~--------~~~vE~~Rd~~~~ 242 (274) T protein:vir:93 196 FGEALGAI-IVRTNK---------------LEAGTAILAK--------KGAVKL-ILKR--------DFFLEVARDASTK 242 (274) T ss_pred cceecCee-EEEcCC---------------CCcceEEEEe--------CCeEEE-EecC--------Ccccccccchhhc Confidence 66777875 433211 1122233331 112221 1111 2346777888889 Q ss_pred eEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 317 SDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 317 ~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +..+++...+--+++-+..-..++-+-| T Consensus 243 ~d~i~~~~~y~~~~~~~~~~v~~t~~~~ 270 (274) T protein:vir:93 243 TTALYSDKHYVAYLYDESKAVKITKGSG 270 (274) T ss_pred ccEEEEEEEEEEEEEcCCceEEEeeCcc Confidence 9999999999888888888777777777 No 20 >protein:vir:80930 Length: 278 # NCBI annotation: Cps # Family: family:all:522 # MgeID: mge:1886 # MgeName: A500 # Cross-refs: genbank:acc:YP_001468392;genbank:gi:157324966;genbank:GeneID:5601363 Probab=96.82 E-value=5.9e-05 Score=43.76 Aligned_cols=275 Identities=11% Similarity=-0.023 Sum_probs=133.1 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCccc-ccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 
1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQV-SVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v-~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) |++|++...+..+=.+++++.+.=.... ++--.+...- ....+.|...++++-..... ...-.-|...-+.+...+ T Consensus 1 Ma~~~T~~~~~iiPev~s~~v~~~~~~~--~v~~~~~~~~~~l~g~~G~tv~ip~~~~~g~-a~~~~~g~~i~~~~lt~~ 77 (278) T protein:vir:80 1 MADLTTKLANLIDPEVMGPMISAKLPKA--IKFGKIAPIDNSLEGQPGSEITVPKYKYIGD-AQDVAEGAAIDYSALETE 77 (278) T ss_pred CCCcceehhheecHHHHHHHHHHHHHHh--hhhcccceecccccCCCCCEEEEeeeccCCc-ceeecCCCcCcccccccc Confidence 9998777766644335665432211110 1111111110 11112233333332110000 000111222222233333 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ......+..+. .....+.....+..++...+.+++...+....+..+...+..+.. T Consensus 78 ~~~~~i~~~~~--a~~v~D~~~~~~~~d~~~~~~~~~a~~~a~~~d~~l~~~l~~a~~---------------------- 133 (278) T protein:vir:80 78 SVKHGIKKAGK--GVKLTDESVLSGYGDPVEEAQKQIRMAIASKVDNDILEEALTTTL---------------------- 133 (278) T ss_pred eeeEeeehhhc--cccccHHHHhhccccHHHHHHHHHHHHHHHHHHHHHHHHHhcccc---------------------- Confidence 33333333333 233344444455677888888888888887777666554421110 Q ss_pred ceeecccccccccCCCCC-ChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASS-TPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLA 238 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~S-DPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la 238 (344) +. ++ ..+.... +-+..+-+..+++.......+.++++++.++..|++++.+. .+.....+. +.+.--++. 
T Consensus 134 -~~--~~----~~t~~~~~~~~~~~~da~~~l~~~~~~~~~~ivv~p~~~~~L~k~~~~~-~~~~~~~g~-~~~~~G~ig 204 (278) T protein:vir:80 134 -EV--KG----AINIGLIDKIENTFTDAPDAIEDESITTTGVLFLNYKDTAKLREEAAGS-WTKASQLGD-DLLVKGAFG 204 (278) T ss_pred -cc--cc----ccccchhhhHHHHHHHHHHhhcccCCCcccEEEECHHHHHHHHhhhhhh-ccccccccc-cceeeccce Confidence 00 00 0111112 22445566677777777777788999999999999877653 222222222 233334555 Q ss_pred HHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCCceE Q lcl|NC_015466. 239 DLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIESD 318 (344) Q Consensus 239 ~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~~ 318 (344) .++|++ |++.+. +.....+++. .+ .+|+ +... ...++.++....+++ T Consensus 205 ~~~G~~-Vi~s~~---------------~p~~t~~l~~---~g-----Ai~~-~~~~--------~~~vE~~Rd~~~~~d 251 (278) T protein:vir:80 205 ELLGWE-IVRTKK---------------LADGNALAVK---AG-----ALKT-FLKR--------NLLAESGRDMDHKLT 251 (278) T ss_pred eeccee-EEEcCC---------------CCcceEEEEe---cc-----ceee-eecC--------Ccccccccchhhccc Confidence 666764 433221 1112233332 11 2221 2111 234677788888999 Q ss_pred EEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 319 RIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 319 ~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .++....+--+++-++.-..++-+-+ T Consensus 252 ~i~~~~~yg~~v~~~~~~v~it~~a~ 277 (278) T protein:vir:80 252 KFNADQHYAVALVDETKAVKVVPVAG 277 (278) T ss_pred eeeeeeEEEEEEEcCcceEEEeeccC Confidence 99999888888877777666644433 No 21 >protein:vir:97433 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1676 # MgeName: 92 # Cross-refs: genbank:acc:YP_240749;genbank:gi:66396420;genbank:GeneID:5133789 Probab=96.82 E-value=0.00012 Score=42.05 Aligned_cols=266 Identities=11% Similarity=0.006 Sum_probs=138.8 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCccccc----CCccceeeeechhhcccccccccccCccccccee Q lcl|NC_015466. 
1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSV----GKQSDAYFTYERGDFNRDEMQERTPGTESAGGTY 76 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v----~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~ 76 (344) |++.++.....++=.+++.+...=.... ++ +.+.+.+ ..+.|...++++=... .....-.-|....+.+. T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~--l~---~~~~~~~d~~l~g~~G~tv~iP~~~~~-g~a~~~~~g~~i~~~~l 74 (274) T protein:vir:97 1 MPQGLTKTSDQIIPEVLAPMMQAQLEKK--LR---FASFAEVDSTLQGQPGDTLTFPAFVYS-GDAQVVAEGEKIPTDIL 74 (274) T ss_pred CCccceehhheechHHHHHHHHHhhhhh--hh---hcccceecccccCCCCCEEEEeeecCC-CccccccCCCccccccc Confidence 9987776666643336665443211111 11 1122211 1223443333321100 00111112333333334 Q ss_pred cccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccc Q lcl|NC_015466. 77 EIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAP 156 (344) Q Consensus 77 ~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~ 156 (344) ..+......+..+..-.+.+..+ ..+..||...+.+.+...+....+..+...+-.+.. T Consensus 75 t~~~~~~~i~~~~~~~~i~D~~~--~~~~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a~~------------------- 133 (274) T protein:vir:97 75 ETKKREAKIRKIAKGTSITDEAL--LSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAKL------------------- 133 (274) T ss_pred ccceeEEEeeeecceecccHHHH--HhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccCc------------------- Confidence 44444444444444334444443 344557777777777777777666655544321110 Q ss_pred cccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHH Q lcl|NC_015466. 157 ASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVT 236 (344) Q Consensus 157 ~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~ 236 (344) .-+++.-....|.++...+.+. +..+..+++++.++..|++++.+. .++....+. +.+..-. 
T Consensus 134 ---------------~~~~~~~~~d~i~dA~~~l~d~-~~~~~~ivv~p~~~~~L~k~~~~~-f~~~s~~g~-~~~~~G~ 195 (274) T protein:vir:97 134 ---------------TVNADITKLNGLQSAIDKFNDE-DLEPMVLFVNPLDAGKLRGDASTN-FTRATELGD-DIIVKGA 195 (274) T ss_pred ---------------cccccccCHHHHHHHHHHhhcc-CCCceEEEeCHHHHHHHHhhhhhh-ccccCcccc-cceeccc Confidence 0011122356777777776655 668999999999999999877653 333333222 3444445 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCCc Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIE 316 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~ 316 (344) +..++|++ |++.+. +.....+|+. .+.+|+ +... +..++..|....+ T Consensus 196 ig~~~G~~-Vi~s~~---------------~p~~t~~l~~--------~gA~~~-~~~~--------~~~vE~~Rd~~~~ 242 (274) T protein:vir:97 196 FGEALGAI-IVRTNK---------------LEAGTAILAK--------KGAVKL-ILKR--------DFFLEVARDASTK 242 (274) T ss_pred cceecCee-EEEcCC---------------CCcceEEEEe--------CcceEe-eecC--------Cceeccccchhhc Confidence 66777764 433211 1112233331 122321 2211 2346788888889 Q ss_pred eEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 317 SDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 317 ~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +..++...++--+++-+..-..++-+.| T Consensus 243 ~d~i~~~~~y~~~~~~~~~vv~~t~~~~ 270 (274) T protein:vir:97 243 TTALYSDKHYVAYLYDESKAVKITKGSG 270 (274) T ss_pred ccEEEEEEEEEEEEEcCCceEEEecCcc Confidence 9999999999888888888888888887 No 22 >protein:vir:94494 Length: 274 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1508 # MgeName: 88 # Cross-refs: genbank:acc:YP_240676;genbank:gi:66396348;genbank:GeneID:5133758 Probab=96.82 E-value=0.00012 Score=42.05 Aligned_cols=266 Identities=11% Similarity=0.006 Sum_probs=138.8 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCccccc----CCccceeeeechhhcccccccccccCccccccee Q lcl|NC_015466. 
1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSV----GKQSDAYFTYERGDFNRDEMQERTPGTESAGGTY 76 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v----~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~ 76 (344) |++.++.....++=.+++.+...=.... ++ +.+.+.+ ..+.|...++++=... .....-.-|....+.+. T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~~~~~~--l~---~~~~~~~d~~l~g~~G~tv~iP~~~~~-g~a~~~~~g~~i~~~~l 74 (274) T protein:vir:94 1 MPQGLTKTSDQIIPEVLAPMMQAQLEKK--LR---FASFAEVDSTLQGQPGDTLTFPAFVYS-GDAQVVAEGEKIPTDIL 74 (274) T ss_pred CCccceehhheechHHHHHHHHHhhhhh--hh---hcccceecccccCCCCCEEEEeeecCC-CccccccCCCccccccc Confidence 9987776666643336665443211111 11 1122211 1223443333321100 00111112333333334 Q ss_pred cccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccc Q lcl|NC_015466. 77 EIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAP 156 (344) Q Consensus 77 ~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~ 156 (344) ..+......+..+..-.+.+..+ ..+..||...+.+.+...+....+..+...+-.+.. T Consensus 75 t~~~~~~~i~~~~~~~~i~D~~~--~~~~~dp~~~~~~~~a~a~a~~vd~~~~~~l~~a~~------------------- 133 (274) T protein:vir:94 75 ETKKREAKIRKIAKGTSITDEAL--LSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAKL------------------- 133 (274) T ss_pred ccceeEEEeeeecceecccHHHH--HhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccCc------------------- Confidence 44444444444444334444443 344557777777777777777666655544321110 Q ss_pred cccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHH Q lcl|NC_015466. 157 ASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVT 236 (344) Q Consensus 157 ~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~ 236 (344) .-+++.-....|.++...+.+. +..+..+++++.++..|++++.+. .++....+. +.+..-. 
T Consensus 134 ---------------~~~~~~~~~d~i~dA~~~l~d~-~~~~~~ivv~p~~~~~L~k~~~~~-f~~~s~~g~-~~~~~G~ 195 (274) T protein:vir:94 134 ---------------TVNADITKLNGLQSAIDKFNDE-DLEPMVLFVNPLDAGKLRGDASTN-FTRATELGD-DIIVKGA 195 (274) T ss_pred ---------------cccccccCHHHHHHHHHHhhcc-CCCceEEEeCHHHHHHHHhhhhhh-ccccCcccc-cceeccc Confidence 0011122356777777776655 668999999999999999877653 333333222 3444445 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCCc Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIE 316 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~ 316 (344) +..++|++ |++.+. +.....+|+. .+.+|+ +... +..++..|....+ T Consensus 196 ig~~~G~~-Vi~s~~---------------~p~~t~~l~~--------~gA~~~-~~~~--------~~~vE~~Rd~~~~ 242 (274) T protein:vir:94 196 FGEALGAI-IVRTNK---------------LEAGTAILAK--------KGAVKL-ILKR--------DFFLEVARDASTK 242 (274) T ss_pred cceecCee-EEEcCC---------------CCcceEEEEe--------CcceEe-eecC--------Cceeccccchhhc Confidence 66777764 433211 1112233331 122321 2211 2346788888889 Q ss_pred eEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 317 SDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 317 ~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +..++...++--+++-+..-..++-+.| T Consensus 243 ~d~i~~~~~y~~~~~~~~~vv~~t~~~~ 270 (274) T protein:vir:94 243 TTALYSDKHYVAYLYDESKAVKITKGSG 270 (274) T ss_pred ccEEEEEEEEEEEEEcCCceEEEecCcc Confidence 9999999999888888888888888887 No 23 >protein:vir:80684 Length: 315 # NCBI annotation: gp6 # Family: family:all:966 # MgeID: mge:1884 # MgeName: PA6 # Cross-refs: genbank:acc:YP_001285582;genbank:gi:148727088;genbank:GeneID:5247055 Probab=96.64 E-value=0.00028 Score=40.05 Aligned_cols=303 Identities=10% Similarity=-0.019 Sum_probs=129.0 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 
1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) |+........+.+-+.+.+--+..-.. ..+-.++++.+|+.....+++++..+.-.. . .+-|......+.+|+. T Consensus 1 Ma~~~~~~gg~~vP~~~~~~ii~~l~~--~s~i~~l~~~i~~~~~~~~ip~~~~~~~a~-w---v~Eg~~~~~s~~~f~~ 74 (315) T protein:vir:80 1 MADDFLSAGKLELPGSMIGAVRDRAID--SGVLAKLSPEQPTIFGPVKGAVFSGVPRAK-I---VGEGEVKPSASVDVSA 74 (315) T ss_pred CCCCcCCcCceEcchHHHHHHHHHHHh--hchhhhhcceeecCCCceEEEEEeCCcceE-E---eeCCccccccccceee Confidence 888777666666655553322221111 134456678888887777777764321111 1 1123333333344544 Q ss_pred cccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASFD 160 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~~ 160 (344) .+...+.-+....+.++.+++.. .+ +...+...|.......++..+-.+..++.+.. ...+. ....+ T Consensus 75 v~l~~~kl~~~~~iS~ell~~s~--~~----~~~~l~~~i~~~la~ai~~~~d~a~~~G~~~~-~~~~~----~~~~~-- 141 (315) T protein:vir:80 75 FTAQPIKVVTQQRVSDEFMWADA--DY----RLGVLQDLISPALGASIGRAVDLIAFHGIDPA-TGKAA----SAVHT-- 141 (315) T ss_pred eEeeeeeEEeeehhhHHHhhcCc--hh----HHHHHHHHHHHHHHHHHHHHHhhheeeccCCC-CCccc----ccccc-- Confidence 44433332223334444443322 11 11112222222222222222222222221100 00000 00000 Q ss_pred eeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcccccc---CHHHH Q lcl|NC_015466. 161 PTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKA---NLVTL 237 (344) Q Consensus 161 k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~v---t~~~l 237 (344) .. ..++ +=...+.+...||.+....+.......++..+|+++++.+|++ ++.. .+.......+. 
....- T Consensus 142 ~~-~~~~---~~~~~~~~~~~d~~~~~~~~~~~~~~~~~~~imn~~~~~~L~~---l~~~-~g~~~~g~~~~~~~~~g~~ 213 (315) T protein:vir:80 142 SL-NKTK---NIVDATDSATADLVKAVGLIAGAGLQVPNGVALDPAFSFALST---EVYP-KGSPLAGQPMYPAAGFAGL 213 (315) T ss_pred cc-cccc---ceeeccccchHHHHHHHHHHhhccCccceEEEEcHHHHHHHHH---Hhhc-cCCcccccccccccccCCC Confidence 00 0001 1112455677888888888877777788899999999999862 2211 11111111111 11111 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCCce Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIES 317 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~ 317 (344) ..++|+| |.+.+..-.....+......-+.|+.--+++... -+.+++...+ ++ ..+... .--.... T Consensus 214 ~tl~G~P-V~~~~~~~~~~~~~~~~~~~~~~GDfs~~~~g~~--------~~~~i~i~~~-~~-~~~~~~---~~~~~~~ 279 (315) T protein:vir:80 214 DNWRGLN-VGASSTVSGAPEMSPASGVKAIVGDFSRVHWGFQ--------RNFPIELIEY-GD-PDQTGR---DLKGHNE 279 (315) T ss_pred ceeccee-eEecCcCCcccccccccccEEEEeecccEEEEEe--------cCeeEEEecc-cc-ccCccc---chhhcCc Confidence 3577877 4443332111100000001111222111111100 0112211110 00 000000 1112355 Q ss_pred EEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 318 DRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 318 ~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ..+|+.+.++-.|.-+++=..|+++.| T Consensus 280 v~~r~~~r~~~~v~~~~a~~~l~~~~a 306 (315) T protein:vir:80 280 VMVRAEAVLYVAIESLDSFAVVKEKAA 306 (315) T ss_pred EEEEEEEEecceeecccceEEEeeccC Confidence 778999999999999999999999998 No 24 >protein:vir:94771 Length: 298 # NCBI annotation: major head protein # Family: family:all:966 # MgeID: mge:1529 # MgeName: phi LC3 # Cross-refs: genbank:acc:NP_996706;genbank:gi:45597421;genbank:GeneID:2769044 Probab=96.32 E-value=0.00023 Score=40.58 Aligned_cols=294 Identities=12% Similarity=0.026 Sum_probs=124.2 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 
1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) |.. ..-..+.|.+.+--+..-.. ..+-..+++.+|+.....+++++..+.-. .-.+-|.........|+. T Consensus 1 ma~----~gG~lip~~~~~~ii~~~~~--~s~i~~~~~~~~~~~~~~~~p~~~~~~~a----~~v~Eg~~~~~~~~~f~~ 70 (298) T protein:vir:94 1 MVL----NKGTLFDPELVTDLISKVAG--KSSIARLSAQKPIPFNGEKVFTFTMDSEI----DVVAESGKKTHGGVTLAP 70 (298) T ss_pred Cee----ccccccChhHHHHHHHHHHh--hchhhhhcceeeccCCceEEEEEecCcce----EEeeCCccccccccceeE Confidence 442 22233444443211111111 13455667888887766777775322111 111223333333344444 Q ss_pred cccccccccccccccHHHHHhc-cCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPEQVRANA-DNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~~~~~~a-~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) .+...+.-+....+.++...+. .+..++++...+.+.+.|.+..| ..++++.....+......+. .. .. T Consensus 71 v~l~~~k~~~~~~iS~ell~~~~~~~~~l~~~i~~~la~ai~~~~d----~~~l~G~~~~~g~~~~~~~~----~~--~~ 140 (298) T protein:vir:94 71 QTMVPIKVEYGARISDEFMYASDEEKINILQAFNDGFAKKVARGID----LMAFHGVNPRLGTASAVIGT----NH--FD 140 (298) T ss_pred EEEeeeEEEEeeehhHHHhccCCccHHHHHHHHHHHHHHHHHHHHH----HHhhcccccCCCcccccccc----cc--cc Confidence 4333333232333444433322 23334444444444444443333 33333311111111111000 00 00 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc--ccccCHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG--AAKANLVTL 237 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~--~~~vt~~~l 237 (344) +..+. ......+..+++.||.+....+... +..++..+|+++.|.+|++ ++..+... 
....+...- T Consensus 141 ~~~~~----~~~~~~~~~~~~~~i~~~~~~~~~~-~~~~~~~vmn~~~~~~l~~-------lkd~~G~~l~~~~~~~~~~ 208 (298) T protein:vir:94 141 SKVTQ----KVEAPRGIADPNGAIENAVELLTGV-DADVTGIAINPSFRSALAK-------QKDLQGNALFPELKWGATP 208 (298) T ss_pred ccccc----ccccccccccHHHHHHHHHHhhhhc-CCCccEEEEcHHHHHHHHH-------hhccCCCeeecCcccCCCC Confidence 11100 0113345678899999999887654 7889999999999998863 22221100 011111112 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceE-EEEecCCCcccccccccceeecccccCCcCCcccccccCCCCc Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHAL-LSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIE 316 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~-l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~ 316 (344) ..++|+| |++.+..-... +. ....-+.|+..- +.|... -+.+++.... ++ ..+..+..| ..+ T Consensus 209 ~tl~G~P-V~~~~~v~~~~--~~-~~~~~~~Gdfs~~~~~~~~--------~~~~~~~~~~-~~-~d~~~~~~f---~~~ 271 (298) T protein:vir:94 209 DTINGLP-VDVNKTVSDMS--LT-QRDRAIIGDFANGFKWGYA--------KEVPLEVIQY-GD-PDNSGLDLK---GYN 271 (298) T ss_pred ceeccee-eEEeccccccc--CC-CccEEEEeeccceEEEEEe--------cCceEEEeec-CC-CcCcchhhh---hcC Confidence 3567887 54444332111 10 001112232211 111000 0111111110 00 001101111 134 Q ss_pred eEEEeeccccceeeeccccchhhhccc Q lcl|NC_015466. 317 SDRIEIDMSYDQKKVAADLGYFFGGIV 343 (344) Q Consensus 317 ~~~vr~~~~~~~~v~~~~~g~l~~~~v 343 (344) ...+|+....+-.+.-+.+=..|+++. T Consensus 272 ~v~~r~~~r~~~~~~~~~a~~~l~~~t 298 (298) T protein:vir:94 272 QVYIRAELFLGWGILDATKFARVTEAN 298 (298) T ss_pred cEEEEEEEEeccEeecccceEEEEecC Confidence 556888888888888888888888888 No 25 >protein:vir:96123 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1602 # MgeName: 37 # Cross-refs: genbank:acc:YP_240078;genbank:gi:66395742;genbank:GeneID:5133103 Probab=96.30 E-value=0.00022 Score=40.66 Aligned_cols=269 Identities=10% Similarity=-0.021 Sum_probs=136.8 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCccc-ccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 
1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQV-SVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v-~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) |++..+......+-.+++++.+.=-+.. ++--.+.+.- ....+.|...++++-.. ......-.-|......+...+ T Consensus 1 ma~~~T~~~d~i~Pev~s~~v~~~~~~~--~~~~~~~~~~~~l~g~~G~tv~ip~~~~-~g~~~~~~~g~~i~~~~it~~ 77 (274) T protein:vir:96 1 MAQGTTKVSNLIVPEVLAPMMQAELDKK--LRFAQFADIDSTLVGQPGDTLTFPAFTY-SGDAQVIAEGEKIPVDQIGTS 77 (274) T ss_pred CCccccchhhhhhhHHHHHHHHHHHHhh--hhhcccccccccccCCCCCEEEEEeecc-CCCccccCCCCcCchhhcccc Confidence 8887766666644446666443211111 1111111110 01112233333322110 001111122333333344444 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ......+..+..-.+.+..+ ..+..+|...+.+.+...+....+..+...+-.+. T Consensus 78 ~~~~~i~~~~~~~~i~D~~~--~~~~~d~~~~~~~~~~~~~a~~~d~~i~~~l~~a~----------------------- 132 (274) T protein:vir:96 78 KREAKVRKIGKGTELTDEAV--LSGFGDPQGEAVRQHGLAIANKVDNDVLEALKGAT----------------------- 132 (274) T ss_pred eeEEEEEeeeceeeecHHHH--HhhcchHHHHHHHHHHHHHHHHHHHHHHHHHhcCC----------------------- Confidence 44334433333333333333 34455777777777777777666666554432110 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLAD 239 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~ 239 (344) ++ .+.+ ..-...|.++...+.+. ...+..+++++.++..|++++.+. .+.....+. +.+..-++.. 
T Consensus 133 ----~~------~~~~-~~~~d~i~dA~~~l~d~-~~~~~~ivv~p~~~~~L~k~~~~~-f~~~~~~g~-~~~~~g~ig~ 198 (274) T protein:vir:96 133 ----LT------VEAD-ITKLDGLQTAIDKFNDE-DLEPMVLFVNPLDAGGLRTSASDN-FTRPTQLGD-NIIVKGAFGE 198 (274) T ss_pred ----CC------cCcc-cccHHHHHHHHHHhccc-CCCceEEEeCHHHHHHHHhccccc-ccccccccc-cceeecccce Confidence 00 1111 11145666676666555 568999999999999999877542 222222221 2333345677 Q ss_pred HhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCCceEE Q lcl|NC_015466. 240 LFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIESDR 319 (344) Q Consensus 240 ~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~~~ 319 (344) ++|++ |++.+. +....++++. ...+|+ +... +..++.++....++.. T Consensus 199 ~~G~~-Vi~s~~---------------~p~~t~~l~~--------~gA~~~-~~~~--------~~~vE~~Rd~~~~~d~ 245 (274) T protein:vir:96 199 ALGAV-IVRSNK---------------LNKGEALLAK--------KGAVKL-ITKR--------DFFLEKDRDASRKSTA 245 (274) T ss_pred ecCee-EEEcCC---------------CCcceEEEEe--------Ccceee-eecC--------CcccccccchhhcccE Confidence 77865 433221 1112233331 112222 1111 2346788888899999 Q ss_pred EeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 320 IEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 320 vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ++....+--+++-++.-..++.+.| T Consensus 246 i~~~~~yg~~~~~~~~vv~~t~~~~ 270 (274) T protein:vir:96 246 LYSDKHYVAYLYDESKVVKITKGAG 270 (274) T ss_pred EEEeeEEEEEEEcCccEEEEEcCcc Confidence 9999999999999999999998888 No 26 >protein:vir:7771 Length: 330 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:149 # MgeName: Bxz2 # Cross-refs: genbank:acc:NP_817605;genbank:gi:29566035;genbank:GeneID:1259229 Probab=96.29 E-value=0.00017 Score=41.29 Aligned_cols=299 Identities=11% Similarity=0.004 Sum_probs=124.3 Q ss_pred CC---------CCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccc Q lcl|NC_015466. 
1 MP---------FTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTES 71 (344) Q Consensus 1 m~---------~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~ 71 (344) |. -++.+...+.+......|-..-++ ..+-..+++.+|+.....+++++..+.-.. -..-|... T Consensus 1 m~~~~~~a~~~~~t~~~g~~i~~~~~~~ii~~~~~---~s~l~~~~~~~~~~~~~~~~p~~~~~~~a~----~v~Eg~~~ 73 (330) T protein:vir:77 1 MAGSTVPSTQVALTGDFSAFLTPEQSQDYFAEIEK---TSIVQRIARKVPMGPTGISIPHWTGAVSAS----WTGEAERK 73 (330) T ss_pred CcccccchhhccccCCCcceechhHHHHHHHHHHh---ccchhhhcceeeccCCceEEEEEcCCccee----EecCCCcc Confidence 22 111111111111111111100111 122344567778777667788764322111 11123333 Q ss_pred ccceecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccc Q lcl|NC_015466. 72 AGGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVAS 151 (344) Q Consensus 72 ~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~ 151 (344) ...+.+|...++..+.-+....+..+..++ +.++.+....+.+.+.+....| ..++++ .+......|.-. T Consensus 74 ~~~~~~f~~i~~~~~k~~~~~~is~ell~d--s~~~~~~~i~~~l~~ai~~~~~----~~~l~G----~g~~~~~~g~~~ 143 (330) T protein:vir:77 74 PITKGSFGKQELEPVKITTIFAESAEVVRL--NPLNYLNTMRTKIAEAIALKFD----AAAIHG----IDKPSAFKGYLA 143 (330) T ss_pred ccccceeeEEEEeEEEEEEeehhhHHHHhc--chHHHHHHHHHHHHHHHHHHHH----HHhhcc----cCCCCccccccc Confidence 333344444444443333333444444443 2344454444444444443333 333322 221111111111 Q ss_pred ccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccc Q lcl|NC_015466. 152 SPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAK 231 (344) Q Consensus 152 ~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~ 231 (344) ... ......++.....+..+.+.+.+|.+.+..+... +..++..+|+++.|.+|+. ++ ..+. 
.-+ T Consensus 144 ~~~-----~~~~~~~~~~~~~~~~~~~~~~~l~~~~~~~~~~-~~~~~~~vmn~~~~~~l~~---lk----d~~G--~~l 208 (330) T protein:vir:77 144 ETT-----KVVSLADTNLTTASGPQGNAYLAVNNALSLLVNS-GKKWTGTLLDNVTEPILNT---AV----DGNG--RPL 208 (330) T ss_pred ccc-----ccceeecccccccccccchhHHHHHHHHHhhhhc-CCCccEEEEcHHHHHHHHH---Hh----ccCC--cee Confidence 000 0011111111124456778899999998886544 7888999999999988873 22 2110 011 Q ss_pred cCH----H-----HHHHHhCCCeEEEEEEEEeccccCCCCccceeC-CCceEEEEecCCCcccccccccceeeccc---- Q lcl|NC_015466. 232 ANL----V-----TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIG-GKHALLSYAPATPGIMTPSAGYTFNWTGL---- 297 (344) Q Consensus 232 vt~----~-----~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw-~~~~~l~~~~~~~~~~~~s~G~T~~~~~~---- 297 (344) ... . .-..++|+| |++.+..-. +..+....++ ++.--+.+.... |.++..... T Consensus 209 ~~~~~~~~~~~~~~~~~l~G~P-V~~~~~~p~----~~~~~~~~~~~gd~s~~~i~~~~--------~~~i~~~~e~~~~ 275 (330) T protein:vir:77 209 FVESTYTEQVGAIREGRILGRP-TYVADNVVN----GTVGNRVVGVMGDFSQVIWGQIG--------GLSFDVTDQATLD 275 (330) T ss_pred ecCccccccccccCCceeccee-eEEeccccC----CCCCCccEEEEEecceEEEEEec--------CcEEEEeecceee Confidence 111 0 112466877 434333211 1111111111 111000000000 111111100 Q ss_pred ccCC-cCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 298 VGSG-NEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 298 ~g~~-~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ++.. 
........+..=......+|+.+.++-.+.-+.+=..++.+.| T Consensus 276 ~~~~~~~~~~~~~~~~f~~~~~~~r~~~r~d~~v~~~~a~~~i~~~~~ 323 (330) T protein:vir:77 276 FGEEQGGVWVPKLISLWQHNMVAVRCEAEFAFMVNDKDAFVKLTDQVA 323 (330) T ss_pred ecccccccccccccchhhcCcEEEEEEEEeccEEecccceEEEEeccC Confidence 0000 0000111112223356788999999999999998889988888 No 27 >protein:vir:9574 Length: 300 # NCBI annotation: gp40 # Family: family:all:966 # MgeID: mge:171 # MgeName: SM1 # Cross-refs: genbank:acc:NP_862879;genbank:gi:32469471;genbank:GeneID:1461316 Probab=96.21 E-value=0.00064 Score=38.09 Aligned_cols=294 Identities=10% Similarity=-0.023 Sum_probs=124.9 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) |+.++.+.-.++.......|- ..-... .+-.++++.+|+.....+++++..+.-.. . .+-|........+|+. T Consensus 1 ma~~t~~~G~lip~~~~~~ii-~~l~~~--s~i~~l~~~~~~~~~~~~~p~~~~~~~a~-w---v~Eg~~~~~s~~~f~~ 73 (300) T protein:vir:95 1 MSEAQLSKGNLFNPELVTKVI-NKVKGH--SSIAKLSPQKPIPFNGQREFVFDFDSDID-I---VAENGKKTHGGVSLDP 73 (300) T ss_pred CcccccCCcceechhhHHHHH-HHHHhh--hhhhhhcceeeccCCceEEEEEecCcceE-E---eeCCccccccccccee Confidence 998888776654333333332 221111 23346788888887777888865332111 1 1123333333445554 Q ss_pred cccccccccccccccHHHHHhc-cCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPEQVRANA-DNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~~~~~~a-~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) .....+.-+...++..+...+. .+..++++.....+.+.|.+..| ..++++....++...... .....+.. 
T Consensus 74 v~l~~~k~~~~~~iS~ell~~~~d~~~~l~~~i~~~l~~aia~~~d----~~~l~G~~~~~g~~~~~~----~~~~~~~~ 145 (300) T protein:vir:95 74 VTIVPLKVEYGARVSDEFLHASEEAKVDMLTDFVEGFSKKLARGLD----IMSIHGINPRTKQASTII----GDNCFDKK 145 (300) T ss_pred eEeeeEEEEEeehhhHHHhccCCCCHHHHHHHHHHHHHHHHHHHHH----HhhhhcccCCCCCCcccc----cccccccc Confidence 4443333333333444444322 33445555444444444443333 233332110011000000 00001111 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc--ccccCHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG--AAKANLVTL 237 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~--~~~vt~~~l 237 (344) ...+ -+...+++..+|.+....+.. .+..|+..+|+++.+.+|+. ++..+... +...+-..- T Consensus 146 ~~~~--------~~~~~~~~~~~i~~~~~~~~~-~~~~~~~~vmn~~~~~~L~~-------lkd~~G~~i~~~~~~~~~~ 209 (300) T protein:vir:95 146 VTQT--------VPFKDTNPDESMEDAVGMIDG-SERDITGAILDPIFTTALSK-------MKNAEGGKLYPELAWGGVP 209 (300) T ss_pred ccee--------ecccccchHHHHHHHHHHhhh-cCCCccEEEECHHHHHHHHH-------hhccCCCeeccCccccCCC Confidence 1111 112346778889888877654 57899999999999988852 22211100 011111122 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccc-eeCCCc--eEEEEecCCCcccccccccceeecccccCCcCCcccccccCCC Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHS-FIGGKH--ALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDA 314 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~-~iw~~~--~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~ 314 (344) ..++|+| |++.+..- .+...... -+.++. .+.+.... +.++.... +++.-... +..| . T Consensus 210 ~~l~G~P-v~~s~~v~----~~~~~~~~~~~~GDf~~~~~~~~~~---------~~~~~v~~-~~~~d~~~-~~~f---~ 270 (300) T protein:vir:95 210 DAINGLA-VDKNRTVS----YSQTDPKNTAIVGDFETMFKWGYAK---------EVPMEIIK-YGDPDNSG-RDLK---G 270 (300) T ss_pred ceeccee-eEEecCCC----CCCCCCccEEEEeeccceEEEEEec---------ccEEEEee-ccCCCCcc-hhhh---h Confidence 4577887 44443321 11111111 122332 11111111 11222111 00000000 0111 1 Q ss_pred CceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
315 IESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 315 ~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .....+|+.+.++-.|.-+.+=..|+++-- T Consensus 271 ~~~v~~r~~~r~d~~v~~~~a~~~l~~~~g 300 (300) T protein:vir:95 271 YNQIYIRCEAYIGWGIMDAASFARIVKTGG 300 (300) T ss_pred cCcEEEEEEEeecceeecccceEEEecCCC Confidence 233556777776666666665555544433 No 28 >protein:vir:1239 Length: 274 # NCBI annotation: similar to phage B1 major head protein # Family: family:all:522 # MgeID: mge:25 # MgeName: phi ETA # Cross-refs: genbank:acc:NP_510938;genbank:gi:17426272;genbank:GeneID:927376 Probab=96.14 E-value=0.00037 Score=39.37 Aligned_cols=265 Identities=11% Similarity=-0.000 Sum_probs=133.8 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhC-ccccc----CCccceeeeechhhcccccccccccCcccccce Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVF-PQVSV----GKQSDAYFTYERGDFNRDEMQERTPGTESAGGT 75 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lf-P~v~v----~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~ 75 (344) |++.++.....++=.+++.+... ++....+| |.+.+ ..+.|...++++=...- ....-.-|....+.+ T Consensus 1 ma~~~T~l~d~iiPev~~~~v~~------~~~~~l~~~~~~~~d~~l~g~~G~tv~iP~~~~ig-~a~~~~~g~~i~~~~ 73 (274) T protein:vir:12 1 MAQGLTKTSNQIIPEVLAPMMQA------QLEKKLRFASFAEVDSTLQGQPGDTLTFPAFVYSG-DAQVVAEGEKIPTDI 73 (274) T ss_pred CCcceeehhhhhchHHHHHHHHH------HHHhhhhhcccceecccccCCCCCEEEEeeecCCC-ccccccCCCccchhh Confidence 88877766666333355553321 12222121 33222 12234444433210000 000011122222223 Q ss_pred ecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccc Q lcl|NC_015466. 76 YEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTA 155 (344) Q Consensus 76 ~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~ 155 (344) ...+......+..+..-.+.+. ....+..||...+.+.+...+....+..+...+..+. 
T Consensus 74 lt~~~~~~~i~~~~~~~~i~D~--~~~~~~~d~~~~~~~q~~~~~a~~vd~~~l~~~~~a~------------------- 132 (274) T protein:vir:12 74 LETKKREAKIRKIAKGTSITDE--ALLSGYGDPQGEQVRQHGLAHANKVDNDVLEALMGAK------------------- 132 (274) T ss_pred cccceeeEEeeeecceeeecHH--HHHhcccchHHHHHHHHHHHHHHHHHHHHHHHHhccc------------------- Confidence 3333333334333333333333 3344455677777777776666655554443332110 Q ss_pred ccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH Q lcl|NC_015466. 156 PASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV 235 (344) Q Consensus 156 ~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~ 235 (344) .+.+.+......|.++...+.++ ...+..+++++.++..|++++.+. .+.....+. +.+..- T Consensus 133 ---------------~~~~~~a~~~d~i~dA~~~lgd~-~~~~~~ivv~p~~~~~L~k~~~~~-fv~~s~~g~-~~~~~G 194 (274) T protein:vir:12 133 ---------------LTVNADITKLNGLQSAIDKFNDE-DLEPMVLFINPLDAGKLRGDASTN-FTRATELGD-DIIVKG 194 (274) T ss_pred ---------------ccccccccCHHHHHHHHHHhccc-cccccEEEeCHHHHHHHHhhhhhh-ccccccccc-cceecc Confidence 01122334456666776666554 468999999999999999887543 333332222 344444 Q ss_pred HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCC Q lcl|NC_015466. 236 TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAI 315 (344) Q Consensus 236 ~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~ 315 (344) .+..++|++ |++.+. +.....+|+ +.+.+|+ +.. ....++..|.... T Consensus 195 ~ig~~~G~~-Vi~s~~---------------~p~~t~~l~--------~~gA~~~-~~~--------~~~~vE~~Rd~~~ 241 (274) T protein:vir:12 195 AFGEALGAI-IVRSNK---------------LEAGTAILA--------KKGAVKL-ILK--------RDFFLEVARDAST 241 (274) T ss_pred cceeecCee-EEEeCC---------------CCcceEEEE--------eccceee-eec--------CCceeccccchhh Confidence 666777865 433221 111122332 1222332 111 1234678888889 Q ss_pred ceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
316 ESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +...++...++--+++-++.-..++-+-| T Consensus 242 ~~d~i~~~~~y~~~~~~~~~vv~~t~~~~ 270 (274) T protein:vir:12 242 KTTALYSDKHYVAYLYDESKAVKITKGSG 270 (274) T ss_pred cccEEEeeeEEEEEEEcCCceEEEEcCCc Confidence 99999999999888888877777776666 No 29 >protein:vir:3613 Length: 272 # NCBI annotation: MHP # Family: family:all:522 # MgeID: mge:74 # MgeName: TP901-1 # Cross-refs: genbank:acc:NP_112699;genbank:gi:13786567;genbank:GeneID:921035 Probab=95.99 E-value=0.00031 Score=39.81 Aligned_cols=266 Identities=11% Similarity=0.019 Sum_probs=134.0 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhh-hCcccc----cCCccceeeeechhhcccccccccccCcccccce Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQ-VFPQVS----VGKQSDAYFTYERGDFNRDEMQERTPGTESAGGT 75 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~-lfP~v~----v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~ 75 (344) |++.++..+..++=.+++++.+. + +..-. +.+.+. ...+.|+..++++-...... ..-.-|....+.+ T Consensus 1 ma~~~T~~~d~iiPev~~~~v~~----~--~~~~~~~~~~~~~~~~l~g~~G~ti~iP~~~~~gda-~~~~eg~~i~~~~ 73 (272) T protein:vir:36 1 MSKQKTTLADLVNPEVLAPIVSY----E--LNKALRFAPLAQVDTTLQGQPGNTLKFPAFTYIGDA-ADVAEGGEISLDK 73 (272) T ss_pred CCCcceehhhhhchHHHHHHHHH----H--HHhhhhhccccccccccccCCCCEEEEeeeccCccc-cccCCCCccChhh Confidence 99888777777555566654321 1 11111 112221 12233444444331111111 1112233334444 Q ss_pred ecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccc Q lcl|NC_015466. 76 YEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTA 155 (344) Q Consensus 76 ~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~ 155 (344) ...+......+..+..-.+ .+.....+..+|...+.+++...+....+..+...+... . 
T Consensus 74 lt~~~~~~~i~~~~k~~~v--tD~~~~~~~~d~~~~~~~~~a~~~a~~~d~~i~~~l~~~--------------~----- 132 (272) T protein:vir:36 74 IGTTTKSVTIKKAAKGTEI--TDEAALSGYGDPIGESNKQLGLSLANKVDDDLLSAAKTT--------------S----- 132 (272) T ss_pred cCCcceeEeeehhhccccc--cHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhccc--------------c----- Confidence 4445544445444443333 334444456678888888888777766665554332110 0 Q ss_pred ccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH Q lcl|NC_015466. 156 PASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV 235 (344) Q Consensus 156 ~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~ 235 (344) .+.+ .+.=..+|.++...+.+. +..+..++++++++..|++++.+...-. ..+. ..+--- T Consensus 133 ------~~~~----------~~~~~d~i~~A~~~lgd~-~~~~~~ivv~p~~~~~L~k~~~~~~~~~--~~~~-~~~~~G 192 (272) T protein:vir:36 133 ------QTVS----------TKANVDGVQAALDIFNDE-DAQAYVLIVNPKDAAKIRKDANAKNIGS--EVGA-NALING 192 (272) T ss_pred ------cccc----------ccccHHHHHHHHHHhhhc-CCCceEEEEcHHHHHHHhcccccccccc--cccc-cceeee Confidence 0001 111133566677766655 5679999999999999998877654321 1111 122223 Q ss_pred HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCce-EEEEecCCCcccccccccceeecccccCCcCCcccccccCCC Q lcl|NC_015466. 236 TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHA-LLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDA 314 (344) Q Consensus 236 ~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~-~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~ 314 (344) .+..++|++ |++.+.. +.++. ...|....+. +| ++.-. ...++..|... T Consensus 193 ~ig~~~G~~-Vv~s~~~---------------p~~~~~~~~~~~~~gA-----~~-~~~~~--------~~~vE~~R~~~ 242 (272) T protein:vir:36 193 TYADVLGAQ-IVRSKKL---------------AEGSALMFKIVSNSPA-----LK-LVLKR--------GVQVETDRDIV 242 (272) T ss_pred ccceecCee-EEEeCCC---------------CCCceeEEEEEecccc-----ee-eeecC--------Ccccccccchh Confidence 455677865 5443321 11111 1111111111 22 12211 23467888888 Q ss_pred CceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
315 IESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 315 ~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .++..++....+--+|+-++.-..++++-- T Consensus 243 ~~~d~i~~~~~y~~~v~~~~~vv~~t~~g~ 272 (272) T protein:vir:36 243 TKTTVITADEHYAAYLYDLTKVVNITFTGV 272 (272) T ss_pred hcCcEEEEEEEEEEEEEcCccEEEEeecCC Confidence 899999999998888887776555544333 No 30 >protein:vir:1638 Length: 298 # NCBI annotation: Structural protein # Family: family:all:966 # MgeID: mge:33 # MgeName: r1t # Cross-refs: genbank:acc:NP_695059;genbank:gi:23455750;genbank:GeneID:955469 Probab=95.43 E-value=0.001 Score=36.93 Aligned_cols=291 Identities=13% Similarity=0.033 Sum_probs=125.9 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) |. .... ..+.|.+.+--+..-.. ..+-..+++.+|+.....+++......... .+ +-|.........|+. T Consensus 1 ma---~~gG-~lvp~~~~~~ii~~~~~--~s~i~~l~~~~~~~~~~~~ip~~~~~~~a~-~v---~E~~~~~~~~~~f~~ 70 (298) T protein:vir:16 1 MV---LNKG-TLFDPTLVTDLISKVAG--KSSIARLSAQKPIPFNGEKVFTFTMDSEID-VV---AESGKKTHGGVTLAP 70 (298) T ss_pred Cc---ccCc-ceechhHHHHHHHHHHh--hhhhhhhcceeeccCCceEEEEEecCcceE-Ee---cCCccccccccceeE Confidence 44 2222 23444333322222221 234556678888876666777653221111 11 122222222334443 Q ss_pred cccccccccccccccHHHHHhcc-CCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPEQVRANAD-NPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~~~~~~a~-~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) .++..+.-+....+..+...+.+ ...++++...+.+.+.+.+..| ..++++.-...+......+.. . .. 
T Consensus 71 v~l~~~k~a~~~~iS~ell~~s~d~~~~l~~~i~~~la~ai~~~~d----~~~l~G~~~~~g~~~~~~~~~----~--~~ 140 (298) T protein:vir:16 71 QTMVPIKVEYGARISDEFMYASDEEKINILQEFNDGFAKKVARGID----LMAFHGVNPRLGTASAVIGTN----H--FD 140 (298) T ss_pred EEEeeeeEEEeehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHH----HHhhccccCCCCccccccccc----c--cc Confidence 33333333333334444444332 3344555444455555443333 333333111111111010100 0 00 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH----H Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL----V 235 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~----~ 235 (344) +..+. .........+++.||.+....+.. .+..++..+|+++.|.+|++ + +..+ +.-+... . T Consensus 141 ~~~~~----~~~~~~~~~~~~~~i~~~~~~~~~-~~~~~~~~vmn~~~~~~l~~---l----kd~~--G~~i~~~~~~~~ 206 (298) T protein:vir:16 141 SKVTQ----KVEAPRGIADPNGAIENAVELLTG-VDADVTGIAINPSFRSALAK---Q----KDLQ--DNALFPELKWGA 206 (298) T ss_pred ccccc----ccccccccccHHHHHHHHHHHhhh-cCCCccEEEEcHHHHHHHHH---h----hccC--CCeeecCcccCC Confidence 00110 012444677889999999988765 46888999999999998863 2 2221 1111111 1 Q ss_pred HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCce--EEEEecCCCcccccccccceeecccccCCcCCcccccccCC Q lcl|NC_015466. 236 TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHA--LLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLD 313 (344) Q Consensus 236 ~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~--~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~ 313 (344) .-..++|+| |.+.+..-.... .....-+.|+.- +.+.. .-+.++..... ++ ..+..+..| T Consensus 207 ~~~~l~G~P-V~~~~~v~~~~~---~~~~~~~~GDfs~~~~~~~---------~~~~~~~~~~~-~~-~~~~~~~~f--- 268 (298) T protein:vir:16 207 TPDTINGLP-VDVNKTVSDMSL---TQRDRAIIGDFANGFKWGY---------AKEVPLEVIQY-GD-PDNSGLDLK--- 268 (298) T ss_pred CCceeccee-eEEecccccccC---CCccEEEEeeccceEEEEE---------ecCceEEEeec-cC-CcCcchhhh--- Confidence 123577887 444433221110 111122323321 11110 11122222111 00 000001111 Q ss_pred CCceEEEeeccccceeeeccccchhhhccc Q lcl|NC_015466. 
314 AIESDRIEIDMSYDQKKVAADLGYFFGGIV 343 (344) Q Consensus 314 ~~~~~~vr~~~~~~~~v~~~~~g~l~~~~v 343 (344) ..+...+|+.+..+-.+.-+++=..|+++- T Consensus 269 ~~~~v~~ra~~r~d~~v~~~~a~~~l~~at 298 (298) T protein:vir:16 269 GYNQVYIRAELFLGWGILDATKFARVTEAN 298 (298) T ss_pred hcCcEEEEEEEEEccEeecccceEEEeecC Confidence 124456888888888888888888888888 No 31 >protein:vir:739 Length: 231 # NCBI annotation: major structural protein 4 # Family: family:all:522 # MgeID: mge:14 # MgeName: Tuc2009 # Cross-refs: genbank:acc:NP_108716;genbank:gi:13487838;genbank:GeneID:920884 Probab=95.42 E-value=0.0018 Score=35.70 Aligned_cols=231 Identities=10% Similarity=0.017 Sum_probs=124.7 Q ss_pred ccCCccceeeeechhhcccccccccccCcccccceecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHH Q lcl|NC_015466. 41 SVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKG 120 (344) Q Consensus 41 ~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i 120 (344) .-+-..|..++|++ +.- ....-+-|....+.+...+..+...+..+..-.+.+..+-. +--||...+.+++...| T Consensus 1 ~~~~~~Gdtit~P~--~iG-da~~v~eG~~i~~~~l~~t~~~atIk~~gk~~~itD~a~l~--~~gDp~~ea~~Q~~~~i 75 (231) T protein:vir:73 1 ENGINLANLCEYPN--DIG-DAADVAEGGEISLDKIGTTTKSVTIKKAAKGTEITDEAALS--GYGDPIGESNKQLGLSL 75 (231) T ss_pred CccccCCceEEecc--ccc-chhhhcCCCcCChhhccccceeeeEeeeccceeeeHHHHhh--ccCchHHHHHHHHHHHH Confidence 22333444444442 111 11233345555555566666666676666555555555544 34577777777777666 Q ss_pred hhhHHHHHHHHHhhhhhhcccccccccccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcce Q lcl|NC_015466. 121 LINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNV 200 (344) Q Consensus 121 ~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~ 200 (344) ....+..+...+.. 
+ .|+-+++.-+..|.++.+...++ ...|.+ T Consensus 76 A~kvD~di~~~~~~--------------------------------a---~l~~~~~~t~d~i~~A~~~fgde-~~~~~v 119 (231) T protein:vir:73 76 ANKVDDDLLKAAKT--------------------------------T---SQTVSTKANVDGVQAALDIFNDE-DAQAYV 119 (231) T ss_pred HHhhhHHHHHhhcc--------------------------------c---cccccccccHHHHHHHHHHhccc-cccceE Confidence 55555443332211 0 25555667788899999888776 579999 Q ss_pred EEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCC Q lcl|NC_015466. 201 LTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPAT 280 (344) Q Consensus 201 ~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~ 280 (344) ++++++.+..||+.+...+.- ... ..+++---.+..++|++ |++.+ ....+..-...+++ .. T Consensus 120 ivv~p~~~~~Lrk~~~~~~~~--~~~-g~~i~~~G~iG~i~G~~-Vi~S~----~~~~~~~~~~~~i~----------~~ 181 (231) T protein:vir:73 120 LIVNPKDAAKIRKDANAKNIG--SEV-GANALINGTYADVLGAQ-IVRSK----KLAEGSALMFKIVS----------NS 181 (231) T ss_pred EEEcchHHHhhhhccchhhhh--hhh-ccceeeecccceEcceE-EEEcC----CCCCCceeeeeEEe----------ec Confidence 999999999999877654331 111 11233333555666764 43321 11111100001111 11 Q ss_pred CcccccccccceeecccccCCcCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 281 PGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 281 ~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +.++ ++.-. 
+..++..|+...++..+...+++--+++-+..=..++..-- T Consensus 182 gAl~------~~~k~--------~~~vEtdRd~~~k~~~i~~~~~y~v~l~~~~~vv~~t~~g~ 231 (231) T protein:vir:73 182 PALK------LVLKR--------GVQVETDRDIVTKTTVITADEHYAAYLYDLTKVVNITFTGV 231 (231) T ss_pred ccee------eeecc--------cceeeccccccccccEEEEeEEEEEEEEcCccEEEEEeecC Confidence 1121 12111 23467778888899999888888777666654444322222 No 32 >protein:vir:96262 Length: 274 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1612 # MgeName: ROSA # Cross-refs: genbank:acc:YP_240311;genbank:gi:66395978;genbank:GeneID:5133339 Probab=95.20 E-value=0.0016 Score=35.92 Aligned_cols=264 Identities=9% Similarity=-0.028 Sum_probs=129.3 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhC-cccccC----CccceeeeechhhcccccccccccCcccccce Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVF-PQVSVG----KQSDAYFTYERGDFNRDEMQERTPGTESAGGT 75 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lf-P~v~v~----~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~ 75 (344) |++-++....+++=.+++++.+.= +..-.+| |.+.+. .+.|...++++=... .....-.-|....+.+ T Consensus 1 m~~~~T~l~d~i~Pev~~~~v~~~------~~~~l~~~~~~~~~~~l~g~~G~tv~iP~~~~i-g~a~~~~~g~~i~~~~ 73 (274) T protein:vir:96 1 MAQGMTKLTNQIVPEVLAPMMQAE------LEKKLRFASFAEIDNTLVGQPGDTLTFPAFIYS-GDAKVVAEGEKIPTDI 73 (274) T ss_pred CCcceeehhheechHHHHHHHHHH------HHhhhhccccceecccccCCCCCEEEeeeecCC-CccccccCCCccchhh Confidence 998777667664333666644321 1111111 222111 122333333220000 0001111122222223 Q ss_pred ecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccc Q lcl|NC_015466. 76 YEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTA 155 (344) Q Consensus 76 ~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~ 155 (344) ...+......+..+.. ....+.+...+..+|...+.+.+...+....+..+...+-.+. 
T Consensus 74 lt~~~~~~~i~~~~~a--~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~vd~~i~~~l~~a~------------------- 132 (274) T protein:vir:96 74 LETKKREAKIRKIAKG--TSISDEALLSGYGDPQGEQVRQHGLAHANKVDDDVLEALKSAK------------------- 132 (274) T ss_pred cccceeEEEeeeeecc--eeehHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccc------------------- Confidence 3333333333333332 3333444444455777777777777776666655443332110 Q ss_pred ccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH Q lcl|NC_015466. 156 PASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV 235 (344) Q Consensus 156 ~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~ 235 (344) + .++.... -...|.++...+.+. ...+..++++++++..|++++.+. .+.....+. +.+..- T Consensus 133 --------~------~~~~~~~-~~d~i~~A~~~lgd~-~~~~~~ivv~p~~~~~L~k~~~~~-f~~~s~~g~-~~~~~G 194 (274) T protein:vir:96 133 --------L------TVEADIT-KLTGLQTAIDKFNDE-DLEPMVLFISPLDAGKLRGDATTN-FTRATELGD-DVIVKG 194 (274) T ss_pred --------c------ccccccc-CHHHHHHHHHHhccc-cccccEEEeCHHHHHHHHhhcccc-ccccccccc-cceecc Confidence 0 1211111 145566677766554 568999999999999999887553 232222221 334444 Q ss_pred HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCC Q lcl|NC_015466. 236 TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAI 315 (344) Q Consensus 236 ~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~ 315 (344) .+..++|++ |++.+. +.....+|+ +.+.+|+ +... ...++..|.... T Consensus 195 ~ig~~~G~~-Vi~s~~---------------~~~~t~~l~--------~~gA~~~-~~~~--------~~~vE~~Rd~~~ 241 (274) T protein:vir:96 195 AFGEALGAV-IVRSNK---------------LEAGTAILA--------KKGAVKL-ITKR--------DFFLETDRDPST 241 (274) T ss_pred ccceecCeE-EEEeCC---------------CCCceEEEE--------eccceee-eecC--------Cccccccccccc Confidence 566777766 433211 111222332 1223332 2211 234678888888 Q ss_pred ceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
316 ESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +...++..+.+--+++-++.-..++ -.+ T Consensus 242 ~~d~i~~~~~y~~~~~~~~~~v~~t-k~~ 269 (274) T protein:vir:96 242 KTTALYSDKHYVAYLYDESKAVKIT-KGS 269 (274) T ss_pred ccCEEEEeEEEEEEEEcCCcEEEEE-cCC Confidence 9999999999888887776666554 222 No 33 >protein:vir:95898 Length: 274 # NCBI annotation: ORF014 # Family: family:all:522 # MgeID: mge:1588 # MgeName: 71 # Cross-refs: genbank:acc:YP_240385;genbank:gi:66396054;genbank:GeneID:5133409 Probab=95.20 E-value=0.0016 Score=35.92 Aligned_cols=264 Identities=9% Similarity=-0.028 Sum_probs=129.3 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhC-cccccC----CccceeeeechhhcccccccccccCcccccce Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVF-PQVSVG----KQSDAYFTYERGDFNRDEMQERTPGTESAGGT 75 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lf-P~v~v~----~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~ 75 (344) |++-++....+++=.+++++.+.= +..-.+| |.+.+. .+.|...++++=... .....-.-|....+.+ T Consensus 1 m~~~~T~l~d~i~Pev~~~~v~~~------~~~~l~~~~~~~~~~~l~g~~G~tv~iP~~~~i-g~a~~~~~g~~i~~~~ 73 (274) T protein:vir:95 1 MAQGMTKLTNQIVPEVLAPMMQAE------LEKKLRFASFAEIDNTLVGQPGDTLTFPAFIYS-GDAKVVAEGEKIPTDI 73 (274) T ss_pred CCcceeehhheechHHHHHHHHHH------HHhhhhccccceecccccCCCCCEEEeeeecCC-CccccccCCCccchhh Confidence 998777667664333666644321 1111111 222111 122333333220000 0001111122222223 Q ss_pred ecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccc Q lcl|NC_015466. 76 YEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTA 155 (344) Q Consensus 76 ~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~ 155 (344) ...+......+..+.. ....+.+...+..+|...+.+.+...+....+..+...+-.+. 
T Consensus 74 lt~~~~~~~i~~~~~a--~~i~D~~~~~~~~d~~~~~~~~~~~~~a~~vd~~i~~~l~~a~------------------- 132 (274) T protein:vir:95 74 LETKKREAKIRKIAKG--TSISDEALLSGYGDPQGEQVRQHGLAHANKVDDDVLEALKSAK------------------- 132 (274) T ss_pred cccceeEEEeeeeecc--eeehHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHHhccc------------------- Confidence 3333333333333332 3333444444455777777777777776666655443332110 Q ss_pred ccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH Q lcl|NC_015466. 156 PASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV 235 (344) Q Consensus 156 ~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~ 235 (344) + .++.... -...|.++...+.+. ...+..++++++++..|++++.+. .+.....+. +.+..- T Consensus 133 --------~------~~~~~~~-~~d~i~~A~~~lgd~-~~~~~~ivv~p~~~~~L~k~~~~~-f~~~s~~g~-~~~~~G 194 (274) T protein:vir:95 133 --------L------TVEADIT-KLTGLQTAIDKFNDE-DLEPMVLFISPLDAGKLRGDATTN-FTRATELGD-DVIVKG 194 (274) T ss_pred --------c------ccccccc-CHHHHHHHHHHhccc-cccccEEEeCHHHHHHHHhhcccc-ccccccccc-cceecc Confidence 0 1211111 145566677766554 568999999999999999887553 232222221 334444 Q ss_pred HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCC Q lcl|NC_015466. 236 TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAI 315 (344) Q Consensus 236 ~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~ 315 (344) .+..++|++ |++.+. +.....+|+ +.+.+|+ +... ...++..|.... T Consensus 195 ~ig~~~G~~-Vi~s~~---------------~~~~t~~l~--------~~gA~~~-~~~~--------~~~vE~~Rd~~~ 241 (274) T protein:vir:95 195 AFGEALGAV-IVRSNK---------------LEAGTAILA--------KKGAVKL-ITKR--------DFFLETDRDPST 241 (274) T ss_pred ccceecCeE-EEEeCC---------------CCCceEEEE--------eccceee-eecC--------Cccccccccccc Confidence 566777766 433211 111222332 1223332 2211 234678888888 Q ss_pred ceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
316 ESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +...++..+.+--+++-++.-..++ -.+ T Consensus 242 ~~d~i~~~~~y~~~~~~~~~~v~~t-k~~ 269 (274) T protein:vir:95 242 KTTALYSDKHYVAYLYDESKAVKIT-KGS 269 (274) T ss_pred ccCEEEEeEEEEEEEEcCCcEEEEE-cCC Confidence 9999999999888887776666554 222 No 34 >protein:vir:8187 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:153 # MgeName: Che9d # Cross-refs: genbank:acc:NP_817980;genbank:gi:29566414;genbank:GeneID:2700968 Probab=95.18 E-value=0.0026 Score=34.76 Aligned_cols=302 Identities=10% Similarity=0.029 Sum_probs=121.1 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) |.-...+. +.+-+.+.+--+..-.. ..+-..+++.+|+.....+++++..+.-.. -.+-|......+..|+. T Consensus 1 mat~~~gg--~lvP~~~~~~ii~~~~~--~s~i~~~~~~i~~~~~~~~~p~~~~~~~a~----wv~Eg~~~~~~~~~f~~ 72 (311) T protein:vir:81 1 MVALATGT--FQLPKHLVPGVWQKAQG--QSVLARLSMAEPQEFGEQQYMTLTAPPRGE----VVGEGAQKSESTATFAP 72 (311) T ss_pred CceecCCc--eEcchhHHHHHHHHHHh--cchhhhhcceeecCCCceEEEEEeCCceeE----EeecCcccccccceeeE Confidence 66444433 23322222211111111 123445677788776667777764321111 11223333333344444 Q ss_pred cccccccccccccccHHHHHhc-cCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPEQVRANA-DNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~~~~~~a-~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) .+...+.-+-...+.++..++. .+..++++...+.+.+.+....| ...+++..-+. .....|.... ..++. 
T Consensus 73 v~l~~~kl~~~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d----~a~l~G~~~~~--~~~~~gi~~~--~~~~~ 144 (311) T protein:vir:81 73 VTAIPRKVQVTQRFSQEVKWADESRQLGVLQTMADLSGVALGRALD----LIGIHGINPLT--GAALSGSPAK--ILDTT 144 (311) T ss_pred EEEeeEEEEEeehhhHHHhhcCcccHHHHHHHHHHHHHHHHHHHHH----HhhhccccCCC--Cccccccccc--ccccc Confidence 4333333222233444433322 22334444333334443332222 22232210000 0111111110 11222 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc--ccccCHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG--AAKANLVTL 237 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~--~~~vt~~~l 237 (344) +.++.+++ ....+..+|......+. ..+..|+..+|++..|.+|++ ++..+... ....+...- T Consensus 145 ~~~~~~~~-------~~~~~~~~i~~~~~~~~-~~~~~~~~~vmn~~~~~~l~~-------lkd~~G~~l~~~~~~~~~~ 209 (311) T protein:vir:81 145 NIVELTTG-------TSATPDLAVEAAVGLVL-GDNLSPDGVALDNTFSFMLAT-------QRDSQGRKLYPELGFGTDV 209 (311) T ss_pred eeeeeccc-------ccchHHHHHHHHHHHhh-hcCCCceEEEEcHHHHHHHHh-------hhccCCCeeecCccccCCC Confidence 33333322 34466677888877765 457899999999999988863 22221100 011111223 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccceeCCCc-eEEEEecCCCccccc----ccccceeecccccCCcCCcccccccC Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKH-ALLSYAPATPGIMTP----SAGYTFNWTGLVGSGNEGMRIKRFYL 312 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~-~~l~~~~~~~~~~~~----s~G~T~~~~~~~g~~~~~~~~~~~~~ 312 (344) ..++|+| |.+-+..-..-...........+... ..+++ +++..- .-+.+++.... +. ..+. ...| T Consensus 210 ~tl~G~P-v~~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~----gDfs~~~i~~~~~~~~~~~~~-~~-~~~~-~~~~-- 279 (311) T protein:vir:81 210 ASFAGLN-AAVSDTVRGGPEAVTASTGVYRTTNPNVKAIA----GDFSAFRWGVQVSIPLELIEF-GD-PDGL-GDLK-- 279 (311) T ss_pred ceeccee-EEecccccccccccccccchhcccCCccEEEE----EecccEEEEEeccceEEEecc-CC-CCcc-hhhh-- Confidence 4567877 43322211100000000011111111 11111 111110 00112211100 00 0000 0001 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ..+...+|+.+.++-.+.-+++-..|+.++- T Consensus 280 -~~~~v~~r~~~r~d~~v~~~~a~~~l~~a~~ 310 (311) T protein:vir:81 280 -RQNQIAIRAEVVYGIGIMSTDAFAVVRDADE 310 (311) T ss_pred -hcCcEEEEEEEEeccEeecccceEEEEeecc Confidence 1245678888888888888888888888877 No 35 >protein:vir:105334 Length: 276 # NCBI annotation: putative phage major capsid protein # Family: family:all:522 # MgeID: mge:1679 # MgeName: PH15 # Cross-refs: genbank:acc:YP_950669;genbank:gi:119967839;genbank:GeneID:4643213 Probab=95.02 E-value=0.0015 Score=36.04 Aligned_cols=266 Identities=11% Similarity=-0.016 Sum_probs=126.7 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccc----cCCccceeeeechhhcccccccccccCccccccee Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVS----VGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTY 76 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~----v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~ 76 (344) |++.++..+.+++=.+++.+.+.--... ..+.|.+. ...+.|+..++++=...-. ...-.-|....+.++ T Consensus 1 Ma~~~T~l~d~i~Pev~~~~v~~~~~~~-----~~~~~~~~~~~~l~g~~G~ti~iP~~~~igd-a~~~~eg~~i~~~~l 74 (276) T protein:vir:10 1 MAQGTTTKSTQIVPEVLAPMMQAELDKK-----LRFAQFADIDSTLVGQPGDTLTFPAFVYSGD-ATVVPEGQKIPVDKI 74 (276) T ss_pred CCcceeehhhhhchHHHHHHHHHHHHhh-----hhhcccceecccccCCCCCEEEeeeecCCCc-cccccCCCccCcccc Confidence 9987776676644445555432211111 11122222 2223455555543111111 111112333333334 Q ss_pred cccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccc Q lcl|NC_015466. 77 EIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAP 156 (344) Q Consensus 77 ~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~ 156 (344) ..+......+..+..-.+.+ .....+..||...+.+.+...+....+..+...+-.+. 
T Consensus 75 t~~~~~a~i~~~~k~~~~tD--~a~~~~~~dp~~~~~~~~~~~~a~~~d~~~~~~l~~~~-------------------- 132 (276) T protein:vir:10 75 ETNRREAKIHKIGKGTDITD--EALLSGYGDPQGEAVRQHGLAIANKVDNDVLEALRGTK-------------------- 132 (276) T ss_pred ccceeeEEeehccccccccH--HHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhccc-------------------- Confidence 44444444444443333333 33344455777777777777766655554443321100 Q ss_pred cccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHH Q lcl|NC_015466. 157 ASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVT 236 (344) Q Consensus 157 ~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~ 236 (344) ..++....+ ...|.++...+.+. ...+++++++++++..|++...+ +.+.....+ .+.+..-+ T Consensus 133 -------------~~~~~~~~t-~d~i~~A~~~lgd~-~~~~~~ivv~p~~~~~L~k~~~~-~f~~~s~~g-~~~~~~G~ 195 (276) T protein:vir:10 133 -------------LTVSADIGT-LAGLEAAIDTFDDE-DLEPMVLFINPKDAGKLRSSASD-NFTRATELG-DNIIVKGA 195 (276) T ss_pred -------------ccccccccC-HHHHHHHHHHhccc-cCcccEEEEcHHHHHHHHHhccc-ccccccccc-ccceeccc Confidence 012222222 34455555565554 56899999999999999864332 122222222 23444456 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCCc Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIE 316 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~ 316 (344) +..++|++ |++.+. +.....+|+ ..+. +|+ +.. .+..++..+....+ T Consensus 196 ig~~~G~~-Vi~s~~---------------~p~~t~~l~---~~gA-----i~~-~~~--------~~~~vE~dRd~~~~ 242 (276) T protein:vir:10 196 FGEALGAV-IVRSKK---------------LDEGEAILA---KRGA-----VKL-ITK--------RDFFLETDRDPSTK 242 (276) T ss_pred cceeccee-EEEcCC---------------CCcceEEEE---eccc-----eee-eec--------CCceeecccchhhc Confidence 66777865 433221 111222332 1111 221 111 12346777888888 Q ss_pred eEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
317 SDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 317 ~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ...+++..++--+++-+..-..++-+-- T Consensus 243 ~d~i~~~~~y~~~~~~~~~vv~~t~~~~ 270 (276) T protein:vir:10 243 TTALYSDKHYVAYLYDESKAVKVTKGAG 270 (276) T ss_pred ccEEEEeeEEEEEEEcCcceEEEecCCc Confidence 9999888888666666544333331111 No 36 >protein:vir:1886 Length: 385 # NCBI annotation: major capsid subunit precursor # Family: family:all:585 # MgeID: mge:41 # MgeName: HK022 # Cross-refs: genbank:acc:NP_037666;genbank:gi:9634124;genbank:GeneID:1262513 Probab=94.96 E-value=0.0028 Score=34.58 Aligned_cols=272 Identities=11% Similarity=-0.016 Sum_probs=113.7 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) |.........+++......|-...+.. ..-..++|.+|+.....+++......-.-.. ..-|.........|.. T Consensus 105 ~~~~~~~~g~~i~~~~~~~ii~~~~~~---~~l~~~~~~~~~~~~~~~~~~~~~~~~~a~~---v~E~~~~~~~~~~~~~ 178 (385) T protein:vir:18 105 LGSDADSAGSLIQPMQIPGIIMPGLRR---LTIRDLLAQGRTSSNALEYVREEVFTNNADV---VAEKALKPESDITFSK 178 (385) T ss_pred hccccccCCceecchhhhHHHHHhhhc---cchhhhcceecccCcceEEEEEecCCcceee---eccCccccccccceeE Confidence 333322222222111111111111111 2233457888887777777775321100001 1123333333445555 Q ss_pred cccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASFD 160 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~~ 160 (344) .....+..+....++.+..+.. .+++......+.+.+....| ..++++. +......|... 
....+ T Consensus 179 ~~~~~~k~~~~~~is~ell~d~---~~l~~~i~~~la~a~~~~~d----~~~l~G~----g~~~~~~Gi~~----~~~~~ 243 (385) T protein:vir:18 179 QTANVKTIAHWVQASRQVMDDA---PMLQSYINNRLMYGLALKEE----GQLLNGD----GTGDNLEGLNK----VATAY 243 (385) T ss_pred EEEeeeeEEEeehhhHHHHhhH---HHHHHHHHHHHHHHHHHHHH----HHHHhcc----CCCCccccccc----ccccc Confidence 5555444444445555544432 12343333333333333222 2233221 11111122111 11111 Q ss_pred eeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH---HHH Q lcl|NC_015466. 161 PTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL---VTL 237 (344) Q Consensus 161 k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~---~~l 237 (344) .. .++..+.+.+.+|.+.+..+ ...+..++.++|+++.|.+|+. ++ ..+ +.-+.+. ..- T Consensus 244 ~~--------~~~~~~~~~~d~i~~~~~~l-~~~~~~~~~~~~~~~~~~~l~~---lk----d~~--G~~l~~~~~~~~~ 305 (385) T protein:vir:18 244 DT--------SLNATGDTRADIIAHAIYQV-TESEFSASGIVLNPRDWHNIAL---LK----DNE--GRYIFGGPQAFTS 305 (385) T ss_pred cc--------cccccccchHHHHHHHHHhh-ccccCCCCEEEEcHHHHHHHHH---hh----cCC--CceeccCcccCCC Confidence 11 12334456777888887776 4568889999999999998873 22 111 1111110 111 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccC---- Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYL---- 312 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~---- 312 (344) ..++|+| |++.+. + +.+.+++ ++.+.+|.. ...+. .++...+ T Consensus 306 ~~l~G~p-V~~~~~---------------~-p~~~~~~--------gd~~~~~~~~~~~~~--------~v~~~~~~~~~ 352 (385) T protein:vir:18 306 NIMWGLP-VVPTKA---------------Q-AAGTFTV--------GGFDMASQVWDRMDA--------TVEVSREDRDN 352 (385) T ss_pred ceeccee-eEEcCc---------------C-CCCcEEE--------eecccEEEEEEecce--------EEEEeccccch Confidence 2345665 322111 1 1111221 122222221 11111 1111111 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) -......+|+...++-.+.-+.+-..++-..| T Consensus 353 ~~~~~~~~~~~~r~~~~v~~~~a~~~~~~~aa 384 (385) T protein:vir:18 353 FVKNMLTILCEERLALAHYRPTAIIKGTFSSG 384 (385) T ss_pred hhcCcEEEEEEEeeccEEecccceEEEEeccC Confidence 11345567777777777777766666666555 No 37 >protein:vir:191 Length: 385 # NCBI annotation: major head subunit precursor # Family: family:all:585 # MgeID: mge:6 # MgeName: HK97 # Cross-refs: genbank:acc:NP_037701;genbank:gi:9634158;genbank:GeneID:1262530 Probab=94.96 E-value=0.0028 Score=34.58 Aligned_cols=272 Identities=11% Similarity=-0.016 Sum_probs=113.7 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) |.........+++......|-...+.. ..-..++|.+|+.....+++......-.-.. ..-|.........|.. T Consensus 105 ~~~~~~~~g~~i~~~~~~~ii~~~~~~---~~l~~~~~~~~~~~~~~~~~~~~~~~~~a~~---v~E~~~~~~~~~~~~~ 178 (385) T protein:vir:19 105 LGSDADSAGSLIQPMQIPGIIMPGLRR---LTIRDLLAQGRTSSNALEYVREEVFTNNADV---VAEKALKPESDITFSK 178 (385) T ss_pred hccccccCCceecchhhhHHHHHhhhc---cchhhhcceecccCcceEEEEEecCCcceee---eccCccccccccceeE Confidence 333322222222111111111111111 2233457888887777777775321100001 1123333333445555 Q ss_pred cccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASFD 160 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~~ 160 (344) .....+..+....++.+..+.. .+++......+.+.+....| ..++++. +......|... 
....+ T Consensus 179 ~~~~~~k~~~~~~is~ell~d~---~~l~~~i~~~la~a~~~~~d----~~~l~G~----g~~~~~~Gi~~----~~~~~ 243 (385) T protein:vir:19 179 QTANVKTIAHWVQASRQVMDDA---PMLQSYINNRLMYGLALKEE----GQLLNGD----GTGDNLEGLNK----VATAY 243 (385) T ss_pred EEEeeeeEEEeehhhHHHHhhH---HHHHHHHHHHHHHHHHHHHH----HHHHhcc----CCCCccccccc----ccccc Confidence 5555444444445555544432 12343333333333333222 2233221 11111122111 11111 Q ss_pred eeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH---HHH Q lcl|NC_015466. 161 PTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL---VTL 237 (344) Q Consensus 161 k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~---~~l 237 (344) .. .++..+.+.+.+|.+.+..+ ...+..++.++|+++.|.+|+. ++ ..+ +.-+.+. ..- T Consensus 244 ~~--------~~~~~~~~~~d~i~~~~~~l-~~~~~~~~~~~~~~~~~~~l~~---lk----d~~--G~~l~~~~~~~~~ 305 (385) T protein:vir:19 244 DT--------SLNATGDTRADIIAHAIYQV-TESEFSASGIVLNPRDWHNIAL---LK----DNE--GRYIFGGPQAFTS 305 (385) T ss_pred cc--------cccccccchHHHHHHHHHhh-ccccCCCCEEEEcHHHHHHHHH---hh----cCC--CceeccCcccCCC Confidence 11 12334456777888887776 4568889999999999998873 22 111 1111110 111 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccC---- Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYL---- 312 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~---- 312 (344) ..++|+| |++.+. + +.+.+++ ++.+.+|.. ...+. .++...+ T Consensus 306 ~~l~G~p-V~~~~~---------------~-p~~~~~~--------gd~~~~~~~~~~~~~--------~v~~~~~~~~~ 352 (385) T protein:vir:19 306 NIMWGLP-VVPTKA---------------Q-AAGTFTV--------GGFDMASQVWDRMDA--------TVEVSREDRDN 352 (385) T ss_pred ceeccee-eEEcCc---------------C-CCCcEEE--------eecccEEEEEEecce--------EEEEeccccch Confidence 2345665 322111 1 1111221 122222221 11111 1111111 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) -......+|+...++-.+.-+.+-..++-..| T Consensus 353 ~~~~~~~~~~~~r~~~~v~~~~a~~~~~~~aa 384 (385) T protein:vir:19 353 FVKNMLTILCEERLALAHYRPTAIIKGTFSSG 384 (385) T ss_pred hhcCcEEEEEEEeeccEEecccceEEEEeccC Confidence 11345567777777777777766666666555 No 38 >protein:vir:9309 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:165 # MgeName: phi 11 # Cross-refs: genbank:acc:NP_803287;genbank:gi:29028597;genbank:GeneID:1258044 Probab=94.94 E-value=0.0019 Score=35.49 Aligned_cols=284 Identities=10% Similarity=-0.041 Sum_probs=125.3 Q ss_pred CCCCCCCCcc-ceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSD-VHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~-~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) +..++..... ++.....+.|-..-++ ..+-..+++.+|+.....+|+++....-. .-.+-|........+|+ T Consensus 27 ~~~~~~~~~~~liP~~~~~~ii~~~~~---~s~l~~l~~~~~~~~~~~~ip~~~~~~~a----~~v~Eg~~~~~~~~~f~ 99 (324) T protein:vir:93 27 DNVMMHEKKDGTLLNDFTTPILQEVME---NSKIMQLGKYEPMEGTEKKFTFWADKPGA----YWVGEGQKIETSKATWV 99 (324) T ss_pred ccccccCCCcceechhHHHHHHHHHHh---hchhhhhcceeeccCCceEEEEEecCcce----eeecCCcccccccccee Confidence 2222222222 2222222222211111 12344456778887766777775322111 11223444444445555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..++..+.-+.-..+.++..++. ..+++....+.+.+.+....| ..++++. +......+... 
T Consensus 100 ~i~~~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~l~~aia~~~d----~a~l~G~----g~~~~~~~~~~-------- 161 (324) T protein:vir:93 100 NATMRAFKLGVILPVTKEFLNYT--YSQFFEEMKPMIAEAFYKKFD----EAGILNQ----GNNPFGKSIAQ-------- 161 (324) T ss_pred EEEEEeEEEEEeehhhHHHHhcc--hHHHHHHHHHHHHHHHHHHHH----HHHhcCC----CCCCcCccccc-------- Confidence 55555444444444555555433 234444444444444433222 2222221 11111111110 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLAD 239 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~ 239 (344) .... ......+.+.+.||.+....+.. .+..++.++|+++.|.+|++ + +..+ +..+.....-.. T Consensus 162 ---~~~~---~~~~~~~~~~~~~i~~~~~~l~~-~~~~~~~~v~n~~~~~~L~~---l----~d~~--G~~~~~~~~~~~ 225 (324) T protein:vir:93 162 ---SIEK---TNKVIKGDFTQDNIIDLEALLED-DELEANAFISKTQNRSLLRK---I----VDPE--TKERIYDRNSDS 225 (324) T ss_pred ---cccc---cceeccccccHHHHHHHHHhhhh-ccCCCCEEEEcHHHHHHHHH---h----hCCC--CCeeecCCCCCc Confidence 0001 12334566778999999888755 47889999999999998863 2 2211 111221111234 Q ss_pred HhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccc----cCCcCCcccccccCCCC Q lcl|NC_015466. 240 LFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLV----GSGNEGMRIKRFYLDAI 315 (344) Q Consensus 240 ~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~----g~~~~~~~~~~~~~~~~ 315 (344) ++|+| |++.. +.....+ .-+.++.--+++.... +.++....+. +....+.....| .. T Consensus 226 l~G~P-Vv~~~-----~~~~~~~--~i~~gdfs~~~~~~~~--------~~~i~~~~~~~~~~~~~~~~~~~~~f---~~ 286 (324) T protein:vir:93 226 LDGLP-VVNLK-----SSNLKRG--ELITGDFDKLIYGIPQ--------LIEYKIDETAQLSTVKNEDGTPVNLF---EQ 286 (324) T ss_pred cccee-eEeec-----CCCCCcc--eEEEEecceEEEEEec--------CcEEEEeecccccccccccccchhhh---hc Confidence 67877 32211 1111111 1122221111111111 1112111110 000001111111 23 Q ss_pred ceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
316 ESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ....+|+.+.++-.+.-+++-..|+++.+ T Consensus 287 n~~~~r~~~r~d~~v~~~~a~~~l~~a~~ 315 (324) T protein:vir:93 287 DMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred CcEEEEEEEEeccEEecccceEEEecccc Confidence 45688999999999999999999998888 No 39 >protein:vir:96223 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1607 # MgeName: 69 # Cross-refs: genbank:acc:YP_239571;genbank:gi:66395304;genbank:GeneID:5132771 Probab=94.92 E-value=0.0018 Score=35.61 Aligned_cols=284 Identities=10% Similarity=-0.043 Sum_probs=124.5 Q ss_pred CCCCCCCCcc-ceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSD-VHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~-~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) +-.++..... ++.....+.|-..-++. .+-..+++.+|+.....+|+++....-. .-.+-|........+|+ T Consensus 27 ~~~~~~~~~~~lip~~~~~~ii~~~~~~---s~l~~l~~~~~~~~~~~~~p~~~~~~~a----~~v~Eg~~~~~~~~~f~ 99 (324) T protein:vir:96 27 DNVMMHEKKDGTLLNDFTTPILQEVMEN---SKIMQLGKYEPMEGTEKKFTFWADKPGA----YWVGEGQKIETSKATWV 99 (324) T ss_pred ccccccCCCcceechhHHHHHHHHHHhh---chhhhhcceeeccCCceEEEEEecCcce----eeecCCcccccccccee Confidence 1111111111 11111112221111111 2234457888888777778776422111 11223444444445565 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..++..+..+....+..+..++. ..+++....+.+.+.|....| ..++.+. +......+... 
T Consensus 100 ~v~~~~~k~~~~~~is~ell~ds--~~~l~~~i~~~l~~aia~~~d----~~~l~G~----g~~~~~~~~~~-------- 161 (324) T protein:vir:96 100 NATMRAFKLGVILPVTKEFLNYT--YSQFFEEMKPMIAEAFYKKFD----EAGILNQ----GNNPFGKSIAQ-------- 161 (324) T ss_pred EEEEEeEEEEEeehhhHHHHhcc--hHHHHHHHHHHHHHHHHHHHH----HHhhhcC----CCCCcCccccc-------- Confidence 55555555444444555555433 234454444444444443333 2223221 11111111110 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLAD 239 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~ 239 (344) ....+ .....+.+...+|.+....+. ..+..|+.++|+++.|..|+. ++ ..+ +..+.....-.. T Consensus 162 ---~~~~~---~~~~~~~~~~~~i~~~~~~i~-~~~~~~~~~i~n~~~~~~L~~---lk----d~~--G~~~~~~~~~~~ 225 (324) T protein:vir:96 162 ---SIKKT---NKVIKGDFTQDNIIDLEALLE-DDELEANAFISKTQNRSLLRK---IV----DPE--TKERIYDRNSDS 225 (324) T ss_pred ---ccccc---ceecccccchHHHHHHHHhhh-hccCCCCEEEEcHHHHHHHHH---hh----CCC--CCeeecCCCCCc Confidence 00001 122345567888999888775 457899999999999988862 22 111 111121111234 Q ss_pred HhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeeccccc----CCcCCcccccccCCCC Q lcl|NC_015466. 240 LFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVG----SGNEGMRIKRFYLDAI 315 (344) Q Consensus 240 ~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g----~~~~~~~~~~~~~~~~ 315 (344) ++|+| |.+.. +..... ..-+.++.--+++.... +.++....+.. ....+.. ++--.. T Consensus 226 l~G~P-V~~~~-----~~~~~~--~~~~~gd~s~~~~~~~~--------~~~i~~~~~~~~~~~~~~~~~~---~~~~~~ 286 (324) T protein:vir:96 226 LDGLP-VVNLK-----SSNLKR--GELITGDFDKLIYGIPQ--------LIEYKIDETAQLSTVKNEDGTP---VNLFEQ 286 (324) T ss_pred cccee-eEeec-----CCCCCc--ceEEEEecceEEEEEec--------CcEEEEeecccccccccccccc---hhhhhc Confidence 67887 32211 111111 11222222111111110 12222111100 0000000 111122 Q ss_pred ceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
316 ESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ....+|+.+.++-.+.-+++-..|+.+.+ T Consensus 287 n~v~~r~~~r~d~~v~~~~a~~~l~~a~~ 315 (324) T protein:vir:96 287 DMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred CcEEEEEEEEeccEEecccceEEEecccc Confidence 45678999999999999999888998888 No 40 >protein:vir:78830 Length: 324 # NCBI annotation: major head protein # Family: family:all:507 # MgeID: mge:1858 # MgeName: 80alpha # Cross-refs: genbank:acc:YP_001285361;genbank:gi:148717889;genbank:GeneID:5246961 Probab=94.91 E-value=0.0016 Score=35.89 Aligned_cols=287 Identities=10% Similarity=-0.065 Sum_probs=125.4 Q ss_pred CCCCCCCCccc-eecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSDV-HVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~-~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) +-.+.+..... +.....+.|-..-++. ..-..+++.+|+.....++++.....-. .-.+-|........+|. T Consensus 27 ~~~~~~~~~~~~iP~~~~~~ii~~~~~~---s~l~~l~~~~~~~~~~~~~p~~~~~~~a----~~v~Eg~~~~~~~~~~~ 99 (324) T protein:vir:78 27 DNVMMHEKKDGTLMNEFTTPILQEVMEN---SKIMQLGKYEPMEGTEKKFTFWADKPGA----YWVGEGQKIETSKATWV 99 (324) T ss_pred ccccccCcCccccchhHHHHHHHHHHhh---chhhhhcceeeccCCceEEEEEecCcce----eEecCCcccccccccee Confidence 22222222222 1222222222122221 2234467888888777788876432111 11223444444444555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..+...+.-+.-..+..+..++. ..+++....+.+.+.+.+..| .+++++. +......+... 
T Consensus 100 ~v~~~~~k~~~~~~is~ell~ds--~~~l~~~i~~~la~ai~~~~d----~a~l~G~----g~~~~~~gi~~-------- 161 (324) T protein:vir:78 100 NATMRAFKLGVILPVTKEFLNYT--YSQFFEEMKPMIAEAFYKKFD----EAGILNQ----GNNPFGKSIAQ-------- 161 (324) T ss_pred EEEEeeEEEEEeehhhHHHHhcc--hHHHHHHHHHHHHHHHHHHHH----HHHhccC----CCCCcCccccc-------- Confidence 55444444333334444444433 244444444444444443333 2223221 11111111111 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLAD 239 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~ 239 (344) ..........+.....+|.+....+. ..+..++..+|++++|.+|+. + +..+ + ..++....-.. T Consensus 162 ------~~~~~~~~~~~~~t~~~i~~~~~~l~-~~~~~~~~~vmn~~~~~~L~~---l----~d~~-G-~~~~~~~~~~~ 225 (324) T protein:vir:78 162 ------SIEKTNKVIKGDFTQDNIIDLEALLE-DDELEANAFISKTQNRSLLRK---I----VDPE-T-KERIYDRNSDS 225 (324) T ss_pred ------cccccceeccccccHHHHHHHHHhhh-hccCCCCEEEEcHHHHHHHHH---h----hccC-C-CeeecCCCCCc Confidence 00011233355677888999888765 457899999999999988862 2 2221 1 11221111234 Q ss_pred HhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeeccccc-CCcCCcccccccCCCCceE Q lcl|NC_015466. 240 LFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVG-SGNEGMRIKRFYLDAIESD 318 (344) Q Consensus 240 ~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g-~~~~~~~~~~~~~~~~~~~ 318 (344) ++|+| |.+..+ .....+ .-+.++..-+++... -+.+++...+.. +.....--..|.-=..... T Consensus 226 l~G~P-V~~~~~-----~~~~~~--~~~~gd~~~~~~g~~--------~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~ 289 (324) T protein:vir:78 226 LDGLP-VVNLKS-----SNLKRG--ELITGDFDKLIYGIP--------QLIEYKIDETAQLSTVKNEDGTPVNLFEQDMV 289 (324) T ss_pred cccee-eEeeCC-----CCCCcc--eEEEEecceEEEEEe--------cCcEEEEeecccccccccccccchhhhhcCcE Confidence 67877 322111 101111 112222111111110 012222211100 0000000000111123567 Q ss_pred EEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
319 RIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 319 ~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .+|+.+.++-.+.-+++=..|+.+.+ T Consensus 290 ~~r~~~r~d~~v~~~~A~~~l~~a~~ 315 (324) T protein:vir:78 290 ALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred EEEEEEEEccEEecccceEEEecccc Confidence 78999999999999998888988887 No 41 >protein:vir:96392 Length: 324 # NCBI annotation: ORF011 # Family: family:all:507 # MgeID: mge:1613 # MgeName: 53 # Cross-refs: genbank:acc:YP_239648;genbank:gi:66395381;genbank:GeneID:5132868 Probab=94.91 E-value=0.0016 Score=35.89 Aligned_cols=287 Identities=10% Similarity=-0.065 Sum_probs=125.4 Q ss_pred CCCCCCCCccc-eecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSDV-HVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~-~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) +-.+.+..... +.....+.|-..-++. ..-..+++.+|+.....++++.....-. .-.+-|........+|. T Consensus 27 ~~~~~~~~~~~~iP~~~~~~ii~~~~~~---s~l~~l~~~~~~~~~~~~~p~~~~~~~a----~~v~Eg~~~~~~~~~~~ 99 (324) T protein:vir:96 27 DNVMMHEKKDGTLMNEFTTPILQEVMEN---SKIMQLGKYEPMEGTEKKFTFWADKPGA----YWVGEGQKIETSKATWV 99 (324) T ss_pred ccccccCcCccccchhHHHHHHHHHHhh---chhhhhcceeeccCCceEEEEEecCcce----eEecCCcccccccccee Confidence 22222222222 1222222222122221 2234467888888777788876432111 11223444444444555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..+...+.-+.-..+..+..++. ..+++....+.+.+.+.+..| .+++++. +......+... 
T Consensus 100 ~v~~~~~k~~~~~~is~ell~ds--~~~l~~~i~~~la~ai~~~~d----~a~l~G~----g~~~~~~gi~~-------- 161 (324) T protein:vir:96 100 NATMRAFKLGVILPVTKEFLNYT--YSQFFEEMKPMIAEAFYKKFD----EAGILNQ----GNNPFGKSIAQ-------- 161 (324) T ss_pred EEEEeeEEEEEeehhhHHHHhcc--hHHHHHHHHHHHHHHHHHHHH----HHHhccC----CCCCcCccccc-------- Confidence 55444444333334444444433 244444444444444443333 2223221 11111111111 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLAD 239 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~ 239 (344) ..........+.....+|.+....+. ..+..++..+|++++|.+|+. + +..+ + ..++....-.. T Consensus 162 ------~~~~~~~~~~~~~t~~~i~~~~~~l~-~~~~~~~~~vmn~~~~~~L~~---l----~d~~-G-~~~~~~~~~~~ 225 (324) T protein:vir:96 162 ------SIEKTNKVIKGDFTQDNIIDLEALLE-DDELEANAFISKTQNRSLLRK---I----VDPE-T-KERIYDRNSDS 225 (324) T ss_pred ------cccccceeccccccHHHHHHHHHhhh-hccCCCCEEEEcHHHHHHHHH---h----hccC-C-CeeecCCCCCc Confidence 00011233355677888999888765 457899999999999988862 2 2221 1 11221111234 Q ss_pred HhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeeccccc-CCcCCcccccccCCCCceE Q lcl|NC_015466. 240 LFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVG-SGNEGMRIKRFYLDAIESD 318 (344) Q Consensus 240 ~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g-~~~~~~~~~~~~~~~~~~~ 318 (344) ++|+| |.+..+ .....+ .-+.++..-+++... -+.+++...+.. +.....--..|.-=..... T Consensus 226 l~G~P-V~~~~~-----~~~~~~--~~~~gd~~~~~~g~~--------~~~~i~~~~~~~~~~~~~~~~~~~~~f~~d~~ 289 (324) T protein:vir:96 226 LDGLP-VVNLKS-----SNLKRG--ELITGDFDKLIYGIP--------QLIEYKIDETAQLSTVKNEDGTPVNLFEQDMV 289 (324) T ss_pred cccee-eEeeCC-----CCCCcc--eEEEEecceEEEEEe--------cCcEEEEeecccccccccccccchhhhhcCcE Confidence 67877 322111 101111 112222111111110 012222211100 0000000000111123567 Q ss_pred EEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
319 RIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 319 ~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .+|+.+.++-.+.-+++=..|+.+.+ T Consensus 290 ~~r~~~r~d~~v~~~~A~~~l~~a~~ 315 (324) T protein:vir:96 290 ALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred EEEEEEEEccEEecccceEEEecccc Confidence 78999999999999998888988887 No 42 >protein:vir:78523 Length: 338 # NCBI annotation: Putative head structural protein # Family: family:all:507 # MgeID: mge:1853 # MgeName: U2 # Cross-refs: genbank:acc:YP_001491585;genbank:gi:157786408;genbank:GeneID:5625675 Probab=94.83 E-value=0.0024 Score=34.93 Aligned_cols=309 Identities=12% Similarity=0.012 Sum_probs=119.6 Q ss_pred CCC---------CCCCCcc-------ceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhc--cccc- Q lcl|NC_015466. 1 MPF---------TQPSRSD-------VHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDF--NRDE- 61 (344) Q Consensus 1 m~~---------~~~~~~~-------~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~--~~~~- 61 (344) |+. ....+.. ++.....+.|- ..-.. ..+-..+++.+|+..-..+++++..... .... T Consensus 1 ~~~~~e~~~~~~~~~~~~~~~~~~~~liP~~~~~~ii-~~~~~--~s~l~~l~~~~~~~~~~~~ip~~~~~~~a~~v~~~ 77 (338) T protein:vir:78 1 MATLNELAPNTAGSNHQGRLAHVPSDLLPKEIVGPIF-DKAQE--SSLVLRLGENIPISYGETIIPTTVKRPEVGQVGVG 77 (338) T ss_pred CcchHHhhhhhcccccccceecccccccchHHHHHHH-HHHHh--hchhhhhcceeeccCCceEEEEEecCccceeeccc Confidence 221 1111111 11111111111 11111 1233456777888777777777643211 0000 Q ss_pred -ccccccCcccccceecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcc Q lcl|NC_015466. 62 -MQERTPGTESAGGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGD 140 (344) Q Consensus 62 -~~~ra~g~~~~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~ 140 (344) .....-|.........|....+..+.-+....+.++..++ +..+++....+.+.+.+....| ..++++.-... 
T Consensus 78 ~~~~~~Eg~~~~~~~~~f~~v~l~~~k~~~~~~is~ell~d--s~~~~~~~i~~~la~a~~~~~d----~~~l~G~g~~~ 151 (338) T protein:vir:78 78 TSNEQREGGTKPLSGTAWDTRSVAPIKLATIVTVSEEFARM--NPSGLYTKLQADLAYAIGRGID----LAVFHGKSPLT 151 (338) T ss_pred ccccccccccccccccceeEEEEEEEEEEEeehhhHHHHhc--CHHHHHHHHHHHHHHHHHHHHH----HHhhcccCCCc Confidence 0001112222222333443333333322222333333332 2234444433444444433332 33333221100 Q ss_pred cccccccccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHH Q lcl|NC_015466. 141 TWTFDVDGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGR 220 (344) Q Consensus 141 ~~~~~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~ 220 (344) . ..+....+..+.....+ ...........+.+|.+....+.......++..+|+++.+.+|+.-.++ T Consensus 152 ~---------~~~~gi~~~~~~~~~~~-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~m~~~~~~~L~~~~~l--- 218 (338) T protein:vir:78 152 G---------SALQGIDTNNVIVNTTN-VDYLQTGTTPLLDRFLDGYDLVSANTDVDFNGWAADPRYRARLLRSQAY--- 218 (338) T ss_pred c---------ccccccccccccccccc-cccccccchhhHHHHHHHHHHhhhhccccceEEEEchHHHHHHHHHhhh--- Confidence 0 00111111111111100 0011122345677888888888777788899999999999998643332 Q ss_pred hccCCCcc--ccccCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccc Q lcl|NC_015466. 221 IDRGQTSG--AAKANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLV 298 (344) Q Consensus 221 i~~~~~~~--~~~vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~ 298 (344) +..+... .....-..-..++|+| |.+.+..-............-+.++..-+++... -|+++....+. 
T Consensus 219 -~d~~g~~l~~~~~~~~~~~~l~G~P-V~~~~~ip~~~~~~~~~~~~~~~gdfs~~~~~~~--------~~~~i~~~~~~ 288 (338) T protein:vir:78 219 -RDANGNVDPTRINLAASAGDLLGLP-VQFGKAVGGDLGAATDSKVRVVGGDFSQLKYGFA--------DEIRVKMSDTA 288 (338) T ss_pred -ccCCCceeecccccCCCCceeeeee-EEEccccCccccccCCcccEEEEEecceEEEEee--------cccEEEEeecc Confidence 2211100 0011111123566776 4444333211110001111112222211111100 11222222110 Q ss_pred ----cCCcCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 299 ----GSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 299 ----g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +....+. .+.--......+|+.+.++-.+.-+++=..|+++-| T Consensus 289 ~~~~~~~~~~~---~~~~~~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 335 (338) T protein:vir:78 289 TLTDNTSPTPQ---TVSMWQTNQIAILIEVTFGWLLGDKQAFVKFVDDED 335 (338) T ss_pred ccccccccccc---chhhhhcCcEEEEEEEEeccEeecccceEEEecccC Confidence 0000000 001112355678888888888888888778888777 No 43 >protein:vir:104085 Length: 320 # NCBI annotation: gp17 # Family: family:all:507 # MgeID: mge:1656 # MgeName: Che12 # Cross-refs: genbank:acc:YP_655596;genbank:gi:109392467;genbank:GeneID:4156953 Probab=94.80 E-value=0.0019 Score=35.54 Aligned_cols=293 Identities=8% Similarity=-0.041 Sum_probs=114.9 Q ss_pred CCCCCCCCcc-ceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSD-VHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~-~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) |..++..... ++.......|-..-++. .+-..+++.+|+.....+++++....-.. 
-.+-|........+|+ T Consensus 14 ~~~t~~~~~~~~ip~~~~~~ii~~~~~~---s~l~~~~~~~~~~~~~~~~p~~~~~~~a~----~v~E~~~~~~~~~~f~ 86 (320) T protein:vir:10 14 IAQTGDTMFKGYLEPEQAKDYFAEAEKT---SIVQQFAQKVPMGTTGQKIPHWIGDVSAQ----WIGEGDMKPITKGNMT 86 (320) T ss_pred hhccccccccccccHHHHHHHHHHHHhc---cchhhhcceeeccCCceEEEEEeCCcceE----EecCCcccccccccee Confidence 4444332221 22111112121111111 23445678888877777777764321111 1112333333344555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc- Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS- 158 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~- 158 (344) ..++.++..+....+..+..++ +.++++....+.+.+.+....| ..++++. +. +.........+ T Consensus 87 ~v~~~~~k~~~~~~is~ell~d--s~~~l~~~i~~~l~~a~a~~~d----~a~l~G~----g~-----~~~~~~~~~~~~ 151 (320) T protein:vir:10 87 SQNIAPHKIATIFVASAETVRA--NPANYLGTMRTKVATAFAMAFD----SAALNGT----DS-----PFPTYLAQTTKS 151 (320) T ss_pred EEEEeeEEEEEeehhhHHHHhc--ChHHHHHHHHHHHHHHHHHHHH----HHhhccc----CC-----CCCccccccccc Confidence 5544444444444444544443 2345555444444444443333 2233221 11 00000000000 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcccccc--CHHH Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKA--NLVT 236 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~v--t~~~ 236 (344) .+....++. .++ .......++.+....+ ...+..+.+.+|+++.|.+|+. ++..- +......... .... 
T Consensus 152 ~~~~~~~~~---~~~-~~~~~~~~~~~~~~~~-~~~~~~~~~~v~n~~~~~~L~~---lkd~~-G~~l~~~~~~~~~~~~ 222 (320) T protein:vir:10 152 VSLADPGGA---TAS-DLTAYDAVAVNGLSLL-VNAKKKWTHTLLDDIVEPILNG---AKDKN-GRPLFIESTYTDENSP 222 (320) T ss_pred ccceecccc---ccc-ccccHHHHHHHHHhhh-hcccCCCcEEEEcHHHHHHHHH---hhccC-CceeeccccccCcccc Confidence 011111111 011 1122333444444333 3457789999999999999973 33210 0000000000 0000 Q ss_pred H--HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCc-ccccccccceeeccc----ccCCcCCccccc Q lcl|NC_015466. 237 L--ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPG-IMTPSAGYTFNWTGL----VGSGNEGMRIKR 309 (344) Q Consensus 237 l--a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~-~~~~s~G~T~~~~~~----~g~~~~~~~~~~ 309 (344) + ..++|+|-+ +.. . +.++..++++.+...- +++ .-|.+.....+ .++...+..+.. T Consensus 223 ~~~~~i~g~pv~-~~~-----~----------~~~~~~~~~~gd~~~~~~~~-~~~~~i~~~~~~~~~~~~~~~~~~~~~ 285 (320) T protein:vir:10 223 FRAGRIVSRPTI-LSD-----H----------VADGTTVGYMGDFRNVIWGQ-VGGLSFDVTDQATLNLGTPTEPNFVSL 285 (320) T ss_pred ccCceeeeeeeE-ecC-----C----------CCCCceEEEEeecceEEEEE-ecCeEEEEeecceeeeccccccccchh Confidence 1 123444422 111 1 1111222222111000 000 00112211111 000000100111 Q ss_pred ccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 310 FYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 310 ~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) | ......+|+.+.++-.+.-+++=..|++++| T Consensus 286 f---~~~~~~~r~~~~~d~~v~~~~a~~~l~~~~a 317 (320) T protein:vir:10 286 W---QHNLVAVRVEAEYAFHNNDKDAFVKLTNVVT 317 (320) T ss_pred h---hcCcEEEEEEEeeccEEecccceEEEEeccC Confidence 1 1255678899999999999999889999999 No 44 >protein:vir:99920 Length: 311 # NCBI annotation: gp7 # Family: family:all:966 # MgeID: mge:1611 # MgeName: Halo # Cross-refs: genbank:acc:YP_655524;genbank:gi:109392294;genbank:GeneID:4157089 Probab=94.63 E-value=0.0039 Score=33.79 Aligned_cols=303 Identities=11% Similarity=0.013 Sum_probs=121.1 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 
1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) |+-+..+....+.....+.|-..-+. ..+-..+++.+|+.....++++...+.-. .-.+-|........+|.. T Consensus 1 Mat~tt~~g~~vP~~~~~~ii~~~~~---~s~l~~~~~~i~~~~~~~~~p~~~~~~~a----~wv~Eg~~~~~~~~~f~~ 73 (311) T protein:vir:99 1 MATFGTGNLKNLPRNIADGMVKDVVQ---GSTVAVLSARKPQRFGNEDIITFNGRPKA----EFVGEGQQKSSTTGEFDF 73 (311) T ss_pred CceecCCCceeccHHHHHHHHHHHHh---hchhhhhcceeeccCCceEEEEEeCCcee----EEeecCcccccccceeeE Confidence 88666555443322222222211111 23455667788888766777775322110 111223333333444544 Q ss_pred cccccccccccccccHHHHHhc-cCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPEQVRANA-DNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~~~~~~a-~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) .++..+.-+....+..+..+.. .+..++++...+.+.+.|....| ..++++.-.+.+. ...+..... -... T Consensus 74 v~l~~~k~~~~~~iS~ell~~~~d~~~~l~~~i~~~la~ai~~~~d----~~~l~G~g~~~g~--~~~g~~~~~--~~~~ 145 (311) T protein:vir:99 74 VTSTPKKAQVTMRFNEEVQWADEDYQLGVLQTLSEAGAEALARALD----LGLYHRINPLTGT--VIPGWSNYL--GAAS 145 (311) T ss_pred EEEeeEEEEEeehhhHHHhhcccccHHHHHHHHHHHHHHHHHHHHH----HHhhcccCcccCc--ccccccccc--cccc Confidence 4443333232333444443322 23344555444444444443333 3333321110000 000100000 0111 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhc-CCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccC----H Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEET-GFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKAN----L 234 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~-G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt----~ 234 (344) +.+++++. ...++..||.+....+.... ...+|..+|+++.|..|+. + +..+ +.-+.. 
- T Consensus 146 ~~~~~~~~-------~~~~~~~~i~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~---l----kd~~--G~~l~~~~~~~ 209 (311) T protein:vir:99 146 KRVELTAD-------TIANPDLAIEAAVGLLVANGHPTPVNGLALHPSIAWGLST---A----RYTD--GRKKFPELGLG 209 (311) T ss_pred ceeecccc-------ccchhHHHHHHHHHHHhhhccCCCccEEEEcHHHHHHHHh---h----hccC--CCeeecCcccC Confidence 22222211 23456677887777665443 4567889999999988863 2 2221 111111 1 Q ss_pred HHHHHHhCCCeEEEEEEEEeccccCCCC-ccceeCCCceEEEEecCCCcc-cccccccceeecccccCCcCCcccccccC Q lcl|NC_015466. 235 VTLADLFEVDKVLVMKAVRNTAKKGQTA-SHSFIGGKHALLSYAPATPGI-MTPSAGYTFNWTGLVGSGNEGMRIKRFYL 312 (344) Q Consensus 235 ~~la~~~gl~~I~v~~a~yn~~~~~~~~-~~~~iw~~~~~l~~~~~~~~~-~~~s~G~T~~~~~~~g~~~~~~~~~~~~~ 312 (344) ..-..++|+| |.+.+..- ...+-.. ....+.++...+++.+-+..+ ..-+-+.++.... .++ ..+. +..| T Consensus 210 ~~~~~l~G~P-v~~s~~i~--~~~~~~~~~~~~~~~~~~~~~~Gdf~~~~~~~~~~~~~~~~~~-~~~-~~~~-~~~~-- 281 (311) T protein:vir:99 210 IGVSSFEGID-ASVSDTVN--GGDEADPDDEDLDAARAVRGIVGDFANGIHWGVQRDIPVELIK-YGD-PDGQ-GDLK-- 281 (311) T ss_pred CCCceeccee-eEeecccc--cccccccccchhhccCcceEEEeeccccEEEEEecCceEEEee-cCC-CCcc-hhhh-- Confidence 1124577887 43332211 0000001 111122222222221100000 0000111111110 000 0000 0111 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .....-+|+.+.++=.|.- ++..-+++++| T Consensus 282 -~~d~~~~r~~~r~d~~v~~-~~~v~~~~~~A 311 (311) T protein:vir:99 282 -RHNQIALRLEIVYGWYVFT-DRFVVIENAVA 311 (311) T ss_pred -hcCcEEEEEEEeecceecC-hhHeeeecccC Confidence 1234457777777766554 57778899999 No 45 >protein:vir:99749 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1497 # MgeName: phiETA2 # Cross-refs: genbank:acc:YP_001004307;genbank:gi:122891761;genbank:GeneID:4712304 Probab=94.01 E-value=0.0057 Score=32.90 Aligned_cols=287 Identities=9% Similarity=-0.061 Sum_probs=122.8 Q ss_pred CCCCCCCCcc-ceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSD-VHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~-~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) +-.++..... ++...+.+.|-..-+.. .+-..+++.+|+.....+++++.... ...-.+-|.........|. T Consensus 27 ~~~~~~~~~~~lip~~~~~~ii~~~~~~---s~l~~~~~~~~~~~~~~~~p~~~~~~----~a~~v~Eg~~~~~~~~~~~ 99 (324) T protein:vir:99 27 DNVMMHEKKDGTLLNDFTTPILQEVMEN---SKIMRLGKYEPMEGTEKKFTFWADKP----GAYWVGEGQKIETSKATWV 99 (324) T ss_pred cceeccCCCcceechhHHHHHHHHHHhh---chhhhhcceeeccCCceEEEEEecCc----ceeEeccCcccccccccee Confidence 1111111111 11111112211111111 12334567778776667777753211 0111222333333344555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..+...+.-+.-..+.++..++. ..+++....+.+.+.+....|. .++++. +......+... 
T Consensus 100 ~v~~~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~l~~ai~~~~d~----~~l~G~----g~~~~~~~~~~-------- 161 (324) T protein:vir:99 100 NATMRAFKLGVILPVTKEFLNYT--YSQFFEEMKPMIAEAFYKKFDE----AGILNQ----GNNPFGKSIAQ-------- 161 (324) T ss_pred EEEEeeEEEEEeehhhHHHHhcc--hHHHHHHHHHHHHHHHHHHHHH----HhhhcC----CCCccCccccc-------- Confidence 54444444333334444444433 3455555555555555443332 223221 11111111110 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLAD 239 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~ 239 (344) ...+ .....++.+...+|.+....+. ..+..++..+|+++.|..|++ + +..+. ..+.....-.. T Consensus 162 ---~~~~---~~~~~~~~~~~~~i~~~~~~l~-~~~~~~~~~v~n~~~~~~L~~---l----~d~~g--~~~~~~~~~~~ 225 (324) T protein:vir:99 162 ---SIEK---TNKVIKGDFTQDNIIDLEALLE-DDELEANAFISKTQNRSLLRK---I----VDPET--KERIYDRNSDT 225 (324) T ss_pred ---cccc---cceeccccCCHHHHHHHHHhhh-hccCCCCEEEEcHHHHHHHHH---h----hcCCC--ceeecCCCCcc Confidence 0011 1233356677899999988775 457889999999999998862 2 22211 11111111134 Q ss_pred HhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCc-CCcccccccCCCCceE Q lcl|NC_015466. 240 LFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGN-EGMRIKRFYLDAIESD 318 (344) Q Consensus 240 ~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~-~~~~~~~~~~~~~~~~ 318 (344) ++|+| |++.. +.....+ .-+.++..-+++. + .-|.+++...+..... ...-...++-=..... T Consensus 226 l~G~P-Vv~~~-----~~~~~~~--~~i~gd~~~~~~~-------~-~~~~~i~~~~~~~~~~~~~~~~~~~~~f~~~~~ 289 (324) T protein:vir:99 226 LDGLP-VVNLK-----SSNLKRG--ELITGDFDKLIYG-------I-PQLIEYKIDETAQLSTVKNEDGTPVNLFEQDMV 289 (324) T ss_pred cccee-EEeec-----CCCCCcc--eEEEEecccEEEE-------E-ecCcEEEEeecccccccccccccchhhhhcCcE Confidence 67877 32221 1111111 1122221111111 1 0123333322210000 0000000111123567 Q ss_pred EEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
319 RIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 319 ~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .+|+.+.++-.+.-+.+=..|+.+.+ T Consensus 290 ~~r~~~r~d~~v~~~~a~~~lt~a~~ 315 (324) T protein:vir:99 290 ALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred EEEEEEEEccEEecccceEEEEeccC Confidence 78888899989998888888888888 No 46 >protein:vir:2430 Length: 318 # NCBI annotation: major head subunit # Family: family:all:507 # MgeID: mge:52 # MgeName: D29 # Cross-refs: genbank:acc:NP_046832;genbank:gi:9630400;genbank:GeneID:1261582 Probab=93.84 E-value=0.0035 Score=34.08 Aligned_cols=290 Identities=10% Similarity=-0.012 Sum_probs=117.4 Q ss_pred CCCCCCCCccceecccc-cceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPL-TNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~~~dp~L-T~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) |..+........+-+.+ +.| +..-.+. .+-..+++.+|+.....++++.....-. .-..-|.........|+ T Consensus 14 ~~~~~~~~~~~~ip~~~~~~i-i~~~~~~--~~l~~~~~~~~~~~~~~~ip~~~~~~~a----~~v~Eg~~~~~~~~~f~ 86 (318) T protein:vir:24 14 IAQTGDTMFKGYLEPEQAKDY-FAEAEKT--SIVQQFAQKVPMGTTGQKIPHWVGDVSA----QWIGEGDMKPITKGNMT 86 (318) T ss_pred hhcccCcccceeechhHHHHH-HHHHHhh--chhhhhcceeeccCCceEEEEEeCCcce----EEecCCcccccccccee Confidence 55554433333333322 222 1111111 2344567888887777777775422111 01112333333334454 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..++.++.-+...++.++..++ ..++.++...+.+.+.+....| ..++++.-.. ...+.. +.. 
T Consensus 87 ~i~~~~~k~~~~~~iS~e~l~d--s~~~~~~~i~~~l~~~~~~~~d----~a~l~G~g~~-----~~~~~~------~~~ 149 (318) T protein:vir:24 87 SQTIAPHKIATIFVASAETVRA--NPANYLGTMRTKVATAFAMAFD----GAAMHGTDSP-----FPTYIG------QTT 149 (318) T ss_pred EEEEeeEEEEEeehhhHHHhhc--ChHHHHHHHHHHHHHHHHHHHH----HhhhcccCCC-----CCcccc------ccc Confidence 4444444433333444444432 2234444444444444443333 2233221100 000100 000 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHH-- Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTL-- 237 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~l-- 237 (344) ..+..+ ..+..+++...++.+.+..+ ...+..+...+|+++.|.+|+. ++.. .++.--...+.-... T Consensus 150 ~~~~~~-----~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~v~n~~~~~~L~~---lkd~--~G~~l~~~~~~~~~~~~ 218 (318) T protein:vir:24 150 KAISIA-----DTTGATTVYDQVAVNGLSLL-VNDGKKWTHTLLDDITEPILNG---AKDQ--NGRPLFIESTYGEAASP 218 (318) T ss_pred cccccc-----ccccccchHHHHHHHHHHhh-ccccCCCCEEEEcHHHHHHHHH---hhcc--CCceeecCccccCcccc Confidence 111111 12223444445555555443 4557888899999999999973 3221 011000000111111 Q ss_pred ---HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccc----cCCcCCcccccc Q lcl|NC_015466. 238 ---ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLV----GSGNEGMRIKRF 310 (344) Q Consensus 238 ---a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~----g~~~~~~~~~~~ 310 (344) ..++|+|-+ ....-.. +...-+.++..-+++... -|.+++...+. +....+.. + T Consensus 219 ~~~~~i~g~pv~------~~~~~~~--~~~~~~~gdfs~~~~~~~--------~~l~i~~~~~~~~~~~~~~~~~~---~ 279 (318) T protein:vir:24 219 FRSGRIVARPTI------LSDHVVE--GTTVGFMGDFSQLIWGQI--------GGLSFDVTDQATLNLGTVESPNF---V 279 (318) T ss_pred ccCceEEEEeeE------EeCCCCC--CccEEEEeecceEEEEEe--------cCeEEEEeeccceeccccccccc---h Confidence 123333311 1111110 010111222111111110 01222111110 00000110 1 Q ss_pred cCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
311 YLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 311 ~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +.=......+|+.+.++-.+.-+.+=..|+.+.| T Consensus 280 ~~f~~~~~~~r~~~r~d~~v~~~~a~~~i~~~~a 313 (318) T protein:vir:24 280 SLWQHNLVAVRVEAEYAFHCNDAEAFVALTNVVS 313 (318) T ss_pred hhhhcCcEEEEEEEEEccEEecccceEEEEeecc Confidence 1112356778999999999999998888999888 No 47 >protein:vir:81070 Length: 390 # NCBI annotation: p09 # Family: family:all:585 # MgeID: mge:1889 # MgeName: Xop411 # Cross-refs: genbank:acc:YP_001285679;genbank:gi:148727187;genbank:GeneID:5247115 Probab=93.77 E-value=0.0058 Score=32.86 Aligned_cols=270 Identities=9% Similarity=-0.022 Sum_probs=114.7 Q ss_pred CCCCCCCC-ccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSR-SDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~-~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) ++...... ..+.+......|-...++. ..-..+++.+|+.....+++......-.-. -.+-|.........|+ T Consensus 113 ~~~~~~~~~g~~~~~~~~~~ii~~~~~~---~~l~~~~~~~~~~~~~~~~~~~~~~~~~a~---~v~Eg~~~~~~~~~~~ 186 (390) T protein:vir:81 113 ASTDAAGSAGALTTPNRLPGFITPPDAR---LTVRDLIGSGRTDSALIEYVQETGFVNNAA---IVAEGALKPESSLKFA 186 (390) T ss_pred hccccccCCcceechhhhHHHHHHHhhh---hhhhhhcceeeccCCceEEEEEecCCccee---eecCCcccccccceee Confidence 22222111 1122211222222111111 122334677777766677777543221111 1223444444445565 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ...+..+..+....++.+..++.. +++......+.+.+....| ..++++. +......|.-. .... 
T Consensus 187 ~i~~~~~k~~~~~~is~ell~d~~---~~~~~i~~~l~~~~~~~~d----~a~l~G~----g~~~~~~Gi~~----~~~~ 251 (390) T protein:vir:81 187 KKTDTTHVIAHTMKATRQILSDAP---QLASYMNNRLIRGLKVKED----AEILRGT----GANDGLLGLIP----QATT 251 (390) T ss_pred EEEEeeeEEEEeehhhHHHHHhHH---HHHHHHHHHHHHHHHHHHH----HHHHhcC----CCCCcccceee----cccc Confidence 555555554444555555554331 2443333334443333222 2333321 11111111110 0000 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH---HH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL---VT 236 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~---~~ 236 (344) ... .-...+.+++.+|.+.+..+. ..+..++..+|++++|.+|+. ++ .++. .-+... .. T Consensus 252 ~~~--------~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~v~~~~~~~~l~~---lk----d~~G--~~l~~~~~~~~ 313 (390) T protein:vir:81 252 YAA--------PTTIAGATRVDQLRLAMLQAS-LAEYNPSGIVINPIDWAAIEL---AK----DANN--QYLIGNARGTL 313 (390) T ss_pred ccc--------ccccccchhHHHHHHHHHhhc-cccCCCCEEEEcHHHHHHHHH---hh----cCCC--ceeecCccccc Confidence 000 011234567788888887764 557899999999999988862 22 2110 011110 11 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccC--- Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYL--- 312 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~--- 312 (344) -..++|+| |++.+. + +.+.+++ |+.+.+|.. ...+ ..++..++ T Consensus 314 ~~~l~G~p-v~~~~~---------------~-p~~~~~~--------gd~~~~~~~~~~~~--------~~v~~~~~~~~ 360 (390) T protein:vir:81 314 TPTLWGLP-VVATQA---------------M-APGEFLV--------GAFDLAAQIFDQWD--------ARVEIGYVGED 360 (390) T ss_pred Cceeccee-eEEcCC---------------C-CCCcEEE--------EehhceEEEEEecc--------eEEEEecccch Confidence 12456776 322211 1 1111111 222222221 1111 11111111 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcc Q lcl|NC_015466. 
313 DAIESDRIEIDMSYDQKKVAADLGYFFGGI 342 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~ 342 (344) -..+...+|+...++-.+.-+++-..++=+ T Consensus 361 ~~~~~v~~r~~~r~d~~v~~~~a~v~~t~a 390 (390) T protein:vir:81 361 FQRNMITVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred hhcCcEEEEEEEeeccEEecccceEEEEeC Confidence 123456688888888888888777666555 No 48 >protein:vir:96833 Length: 275 # NCBI annotation: ORF015 # Family: family:all:522 # MgeID: mge:1642 # MgeName: EW # Cross-refs: genbank:acc:YP_240157;genbank:gi:66395822;genbank:GeneID:5133174 Probab=93.38 E-value=0.0056 Score=32.94 Aligned_cols=263 Identities=10% Similarity=-0.033 Sum_probs=119.4 Q ss_pred CCCCCCCCccceecc-cccceeeeeEcCcchhhhhhhC-ccccc----CCccceeeeechhhcccccccccccCcccccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNR-PLTNISIGYVQDASHFVAGQVF-PQVSV----GKQSDAYFTYERGDFNRDEMQERTPGTESAGG 74 (344) Q Consensus 1 m~~~~~~~~~~~~dp-~LT~iA~~Y~n~~~~~ia~~lf-P~v~v----~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~ 74 (344) |+++ +....+ +.| +++.+.+.- +..-.+| |.+.+ ..+.|...++++=... .....-.-|...-+. T Consensus 3 ~~~~-T~l~d~-i~PEv~~~~v~~~------~~~~~~~~~~~~~~~~l~g~~G~tv~iP~~~~i-g~a~~~~~g~~i~~~ 73 (275) T protein:vir:96 3 LENM-TKLANM-VNPEVLAPMMQAE------LDKKLKFAQFADIDNTLVGQPGNTITFPAFVYS-GDAKVVPEGEEIPID 73 (275) T ss_pred Cccc-chhhhh-hchHHHHHHHHHH------HHHhhhhcccceecccccCCCCCEEEeeeeccC-CccccccCCCCcchh Confidence 5543 333334 445 666644321 1111111 22211 1223444443321100 001111122333333 Q ss_pred eecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccc Q lcl|NC_015466. 
75 TYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPT 154 (344) Q Consensus 75 ~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~ 154 (344) +...+......+..+..-.+.+ .....+..||...+.+.+...+....+..+...+-.+ T Consensus 74 ~lt~~~~~~~i~~~~~~~~i~D--~~~~~~~~d~~~~~~~~~a~~~a~~~d~~ll~~l~~a------------------- 132 (275) T protein:vir:96 74 LIETKKRQATIRKIGKGTVLTD--EALLSGYGDPKGEAVRQHGLAIANKVDNDVLEALQGA------------------- 132 (275) T ss_pred hcccceeeEEeehhcccccccH--HHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHHhcc------------------- Confidence 3344444434434333333333 3333444567777777776666655554443322110 Q ss_pred cccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH Q lcl|NC_015466. 155 APASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL 234 (344) Q Consensus 155 ~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~ 234 (344) ++ ..+.+..+ ...|.++...+.+. +..++.++++++++..|++.+.+. .+.....+. +.+.. T Consensus 133 --------~~------~~~~~~~~-~d~i~dA~~~lgd~-~~~~~~ivv~p~~~~~L~k~~~~~-f~~~~~~g~-~~~~~ 194 (275) T protein:vir:96 133 --------TL------KVEADITK-LAGLQTAIDKFNDE-DLEPMVLFVNPLDAGKLRASATDN-FTRATLLGD-NVIVK 194 (275) T ss_pred --------cc------cccccccC-HHHHHHHHHHhccc-cCCccEEEeCHHHHHHHHhccccc-ccccccccc-cceec Confidence 00 01111111 44566666666554 568999999999999999876532 222222221 23444 Q ss_pred HHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCC Q lcl|NC_015466. 235 VTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDA 314 (344) Q Consensus 235 ~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~ 314 (344) -++..++|++ |++.+. +.....+++. .+.+|+ +... ...++..|... 
T Consensus 195 G~ig~~~G~~-Vi~s~~---------------~p~~t~~i~~--------~gA~~~-~~~~--------~~~vE~~Rd~~ 241 (275) T protein:vir:96 195 GAFGEALGAI-IVRSNK---------------IKEGEAILAK--------RGAVKL-ITKR--------DFFLETERHAS 241 (275) T ss_pred cccceecCee-EEEeCC---------------CCcceEEEEe--------ccceee-eecC--------Ccccccccchh Confidence 4566777775 433221 1112233331 122321 1111 23467778888 Q ss_pred CceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 315 IESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 315 ~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .++..+++...+--+++-++.-.-++-.=| T Consensus 242 ~~~d~i~~~~~y~~~~~~~~~vv~~t~~~~ 271 (275) T protein:vir:96 242 HKSTALFSDKHYVAYLYDESKVVKITKSAS 271 (275) T ss_pred hcCcEEEEeEEEEEEEEcCccEEEEEeccc Confidence 888888888887666665543333332222 No 49 >protein:vir:94142 Length: 304 # NCBI annotation: ORF013 # Family: family:all:507 # MgeID: mge:1494 # MgeName: 96 # Cross-refs: genbank:acc:YP_240234;genbank:gi:66395898;genbank:GeneID:5133311 Probab=93.30 E-value=0.0081 Score=32.06 Aligned_cols=287 Identities=8% Similarity=-0.002 Sum_probs=115.4 Q ss_pred CCCCC--------CCCcccee-cccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccc Q lcl|NC_015466. 1 MPFTQ--------PSRSDVHV-NRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTES 71 (344) Q Consensus 1 m~~~~--------~~~~~~~~-dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~ 71 (344) |+.-. +....+.+ ......|-..-++. .+-..+++.+|+.....+++++....-.. .. +-+... T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~---~~l~~~~~~~~~~~~~~~ip~~~~~~~a~-~v---~E~~~~ 73 (304) T protein:vir:94 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMAN---SAIMKLAKNEPMTAQKKKFTYLAKGVGAY-WV---SETERI 73 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhc---cchhhhcceeeccCCceEEEEEeCCcceE-Ee---ecCccc Confidence 33211 11111111 11111111111111 12334567778776666777764321111 11 112222 Q ss_pred ccceecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccc Q lcl|NC_015466. 
72 AGGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVAS 151 (344) Q Consensus 72 ~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~ 151 (344) .....+|+......+..+....+..+..+ .+.++++....+.+.+.+....| ..++++. +..... +.. T Consensus 74 ~~~~~~~~~i~~~~~k~~~~~~iS~ell~--ds~~~l~~~i~~~l~~~ia~~~d----~~~l~G~----g~~~~~-~~~- 141 (304) T protein:vir:94 74 QTSKPEYAQAEMEAKKIGVIIPLSKEFLK--WTAKDFFNEVKPLIAEAFYKAFD----QAVIFGT----KSPYNT-STS- 141 (304) T ss_pred ccccceeeEEEEEEEEEEEeehhhHHHHh--cchHHHHHHHHHHHHHHHHHHHH----hhheecc----CCCccc-ccc- Confidence 22233444433333333323333333333 33455555444445444443322 2233221 110000 000 Q ss_pred ccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccc Q lcl|NC_015466. 152 SPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAK 231 (344) Q Consensus 152 ~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~ 231 (344) . +.+....+........+.+.+.||.+....+.. .+..+...+|+++.|.+|++ ++. .+ +.-+ T Consensus 142 -----~--~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~-~~~~~~~~v~~~~~~~~L~~---lkd----~~--G~~l 204 (304) T protein:vir:94 142 -----G--KPLVEGAEEKGNVVTDTNNLYVDLSALMATIED-EELDPNGVLTTRSFRSKMRN---ALD----AN--DRPL 204 (304) T ss_pred -----c--ccccccccccccccccccchHHHHHHHHHHhhh-ccCCcCEEEEcHHHHHHHHH---hhc----cC--CcEe Confidence 0 000000000111333556779999998887654 57889999999999999973 322 11 1111 Q ss_pred cCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCc--eEEEEecCCCcccccccccceeeccc--cc----CCcC Q lcl|NC_015466. 232 ANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKH--ALLSYAPATPGIMTPSAGYTFNWTGL--VG----SGNE 303 (344) Q Consensus 232 vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~--~~l~~~~~~~~~~~~s~G~T~~~~~~--~g----~~~~ 303 (344) .... ...++|+| |++.+..-.. .....-+.++. +++... -|.++....+ .+ .... 
T Consensus 205 ~~~~-~~~l~G~P-V~~~~~~~~~-----~~~~~~~~gd~~~~~~~~~----------~~~~i~~~~e~~~~~~~~~~~~ 267 (304) T protein:vir:94 205 FDAN-GNEIMGLP-LSYTGADVYD-----KKKSLALMGDWDYARYGIL----------QGIEYAISEDATLTTLQASDAS 267 (304) T ss_pred ecCC-Ccccccee-eEEecccccC-----CCCcEEEEEehhhEEEEEe----------cceEEEEeecceeeeecccccC Confidence 1111 13467887 4333322110 11111122221 111110 0111111110 00 0000 Q ss_pred CcccccccCCCCceEEEeeccccceeeeccccchhhhccc Q lcl|NC_015466. 304 GMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIV 343 (344) Q Consensus 304 ~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~v 343 (344) +..+ +-=......+|+.+.++-.+.-+++-..++.+= T Consensus 268 g~~~---~~f~~~~~~~r~~~r~~~~v~~~~a~~~l~~a~ 304 (304) T protein:vir:94 268 GQPV---SLFERDMFALRATMHIAYMNVKPEAFATLKPTE 304 (304) T ss_pred ccch---hhhhcCcEEEEEEEEeccEeecccceEEEEecC Confidence 0000 001224566788888888888888888887776 No 50 >protein:vir:105905 Length: 304 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:1514 # MgeName: phiETA3 # Cross-refs: genbank:acc:YP_001004375;genbank:gi:122891830;genbank:GeneID:4712376 Probab=93.30 E-value=0.0081 Score=32.06 Aligned_cols=287 Identities=8% Similarity=-0.002 Sum_probs=115.4 Q ss_pred CCCCC--------CCCcccee-cccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccc Q lcl|NC_015466. 1 MPFTQ--------PSRSDVHV-NRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTES 71 (344) Q Consensus 1 m~~~~--------~~~~~~~~-dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~ 71 (344) |+.-. +....+.+ ......|-..-++. .+-..+++.+|+.....+++++....-.. .. +-+... 
T Consensus 1 ma~~~~~~~~~~~t~~gg~lip~~~~~~ii~~~~~~---~~l~~~~~~~~~~~~~~~ip~~~~~~~a~-~v---~E~~~~ 73 (304) T protein:vir:10 1 MATPTYTPGNVILSDFKNGVIPAEQGTLIMKDIMAN---SAIMKLAKNEPMTAQKKKFTYLAKGVGAY-WV---SETERI 73 (304) T ss_pred CcccccccccccccCCCceecchhHHHHHHHHHHhc---cchhhhcceeeccCCceEEEEEeCCcceE-Ee---ecCccc Confidence 33211 11111111 11111111111111 12334567778776666777764321111 11 112222 Q ss_pred ccceecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccc Q lcl|NC_015466. 72 AGGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVAS 151 (344) Q Consensus 72 ~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~ 151 (344) .....+|+......+..+....+..+..+ .+.++++....+.+.+.+....| ..++++. +..... +.. T Consensus 74 ~~~~~~~~~i~~~~~k~~~~~~iS~ell~--ds~~~l~~~i~~~l~~~ia~~~d----~~~l~G~----g~~~~~-~~~- 141 (304) T protein:vir:10 74 QTSKPEYAQAEMEAKKIGVIIPLSKEFLK--WTAKDFFNEVKPLIAEAFYKAFD----QAVIFGT----KSPYNT-STS- 141 (304) T ss_pred ccccceeeEEEEEEEEEEEeehhhHHHHh--cchHHHHHHHHHHHHHHHHHHHH----hhheecc----CCCccc-ccc- Confidence 22233444433333333323333333333 33455555444445444443322 2233221 110000 000 Q ss_pred ccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccc Q lcl|NC_015466. 152 SPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAK 231 (344) Q Consensus 152 ~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~ 231 (344) . +.+....+........+.+.+.||.+....+.. .+..+...+|+++.|.+|++ ++. 
.+ +.-+ T Consensus 142 -----~--~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~l~~-~~~~~~~~v~~~~~~~~L~~---lkd----~~--G~~l 204 (304) T protein:vir:10 142 -----G--KPLVEGAEEKGNVVTDTNNLYVDLSALMATIED-EELDPNGVLTTRSFRSKMRN---ALD----AN--DRPL 204 (304) T ss_pred -----c--ccccccccccccccccccchHHHHHHHHHHhhh-ccCCcCEEEEcHHHHHHHHH---hhc----cC--CcEe Confidence 0 000000000111333556779999998887654 57889999999999999973 322 11 1111 Q ss_pred cCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCc--eEEEEecCCCcccccccccceeeccc--cc----CCcC Q lcl|NC_015466. 232 ANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKH--ALLSYAPATPGIMTPSAGYTFNWTGL--VG----SGNE 303 (344) Q Consensus 232 vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~--~~l~~~~~~~~~~~~s~G~T~~~~~~--~g----~~~~ 303 (344) .... ...++|+| |++.+..-.. .....-+.++. +++... -|.++....+ .+ .... T Consensus 205 ~~~~-~~~l~G~P-V~~~~~~~~~-----~~~~~~~~gd~~~~~~~~~----------~~~~i~~~~e~~~~~~~~~~~~ 267 (304) T protein:vir:10 205 FDAN-GNEIMGLP-LSYTGADVYD-----KKKSLALMGDWDYARYGIL----------QGIEYAISEDATLTTLQASDAS 267 (304) T ss_pred ecCC-Ccccccee-eEEecccccC-----CCCcEEEEEehhhEEEEEe----------cceEEEEeecceeeeecccccC Confidence 1111 13467887 4333322110 11111122221 111110 0111111110 00 0000 Q ss_pred CcccccccCCCCceEEEeeccccceeeeccccchhhhccc Q lcl|NC_015466. 
304 GMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIV 343 (344) Q Consensus 304 ~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~v 343 (344) +..+ +-=......+|+.+.++-.+.-+++-..++.+= T Consensus 268 g~~~---~~f~~~~~~~r~~~r~~~~v~~~~a~~~l~~a~ 304 (304) T protein:vir:10 268 GQPV---SLFERDMFALRATMHIAYMNVKPEAFATLKPTE 304 (304) T ss_pred ccch---hhhhcCcEEEEEEEEeccEeecccceEEEEecC Confidence 0000 001224566788888888888888888887776 No 51 >protein:vir:78223 Length: 333 # NCBI annotation: Putative major head protein # Family: family:all:966 # MgeID: mge:1849 # MgeName: Bethlehem # Cross-refs: genbank:acc:YP_001491666;genbank:gi:157786490;genbank:GeneID:5625701 Probab=92.99 E-value=0.008 Score=32.07 Aligned_cols=311 Identities=10% Similarity=-0.010 Sum_probs=116.9 Q ss_pred CCCCCCC--Cccc------eecc-cccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhh---cccccc-ccccc Q lcl|NC_015466. 1 MPFTQPS--RSDV------HVNR-PLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGD---FNRDEM-QERTP 67 (344) Q Consensus 1 m~~~~~~--~~~~------~~dp-~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~---~~~~~~-~~ra~ 67 (344) .++.+.. ...+ .+-+ +.+.|-..-+. ..+-.++++.+|+.....++++..... +..... ....- T Consensus 8 ~~~~~~~~~~g~~~~~~~~liP~~~~~~ii~~l~~---~s~l~~~~~~~~~~~~~~~~p~~~~~~~a~~v~eg~~~~~~e 84 (333) T protein:vir:78 8 LPNSAGSNHQGRLAHVPSDLLPKEIVGPIFDKAQE---SSLVLRMGEQIPISYGETIIPTTVKRPEVGQVGVGTSNEQRE 84 (333) T ss_pred hhhcccccccCceecCCccccchhHHHHHHHHHHh---hchhhhhcceeeccCCceEEEEEeCCceeEeecCcccccccc Confidence 1111111 1000 0111 11111100111 123345567777766556666653221 100000 00000 Q ss_pred CcccccceecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccc Q lcl|NC_015466. 68 GTESAGGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVD 147 (344) Q Consensus 68 g~~~~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~ 147 (344) +.........|... .+..+.+...++....-.-.+..++++.....+.+.+.+..| ..++++. +.... 
T Consensus 85 ~~~~~~~~~~f~~i--~l~~~kl~~~~~is~ell~~s~~~~~~~i~~~la~ai~~~~d----~~~l~G~----g~~~~-- 152 (333) T protein:vir:78 85 GGLKPLSGTAWDTR--SVSPIKLATIVTVSEEFARMNPSGLYTKLQGDLAYAIGRGID----LAVFHGK----SPLTG-- 152 (333) T ss_pred cccccccccceeEE--EEeeEEEEEeehhhHHHHhcCHHHHHHHHHHHHHHHHHHHHH----HHHhccc----CCCCC-- Confidence 11111222233332 333333333333332221122334444444444444443333 2223211 11000 Q ss_pred ccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCc Q lcl|NC_015466. 148 GVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTS 227 (344) Q Consensus 148 gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~ 227 (344) ..+....+.... .+.+...........-+.+|.+.+..+.....+.++..+|+++.|..|++....+..- ++.- T Consensus 153 ---~~~~g~~~~~~~-~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~~~~~~d~~--G~~i 226 (333) T protein:vir:78 153 ---SALQGIDTDNVI-ANTTNVDYLQETGDPLLDRLLDGYDLVSANTDVEFNGWAVDPRFRAHLLRAQAYRDAN--GNVD 226 (333) T ss_pred ---cccccccccccc-cccccccccccccchhHHHHHHHHHhhccccccCceEEEEcchHHHHHHHHhhhcCCC--Ccee Confidence 000000000000 0001111122233445778888888887777888999999999999997544333210 1100 Q ss_pred cccccCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeeccccc-CCcCCcc Q lcl|NC_015466. 228 GAAKANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVG-SGNEGMR 306 (344) Q Consensus 228 ~~~~vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g-~~~~~~~ 306 (344) -...+....-..++|+| |.+.+..=............-+.++.--+++ +.. -|.+.....+-. ....+.. 
T Consensus 227 ~~~~~~~~~~~~l~G~P-v~~~~~i~~~~~~~~~~~~~~~~gD~~~~~~-------g~~-~~~~i~~~~~~~~~~~~~~~ 297 (333) T protein:vir:78 227 PSRINLAAQTGDVLGLP-AQFGRAVGGDLGAAVDSKTRIIGGDFSQLKF-------GFA-DEIRIKMSDTATLTDSGSAT 297 (333) T ss_pred ecCccccCCCceeecee-eEEccccCCCccccCCCccEEEEEecccEEE-------EEe-eccEEEEeccccccccccce Confidence 00111111224567876 4443322100000000000111111110100 000 012222111100 0000110 Q ss_pred cccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 307 IKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 307 ~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) + +.-..+...+|+.+.++-.+.-+++-..|+++-| T Consensus 298 ~---~~~~~~~v~~r~~~r~d~~v~~~~a~~~l~~~~a 332 (333) T protein:vir:78 298 V---SMWQTNQIAILIEVTFGWLLGDKQAFVKFVDDEQ 332 (333) T ss_pred e---ehhhcCcEEEEEEEEEccEEecccceEEEeccCC Confidence 1 1112345668888999999999999999999999 No 52 >protein:vir:97053 Length: 390 # NCBI annotation: putative head protein # Family: family:all:585 # MgeID: mge:1653 # MgeName: OP1 # Cross-refs: genbank:acc:YP_453565;genbank:gi:84662600;genbank:GeneID:5142468 Probab=92.87 E-value=0.0094 Score=31.68 Aligned_cols=270 Identities=9% Similarity=-0.012 Sum_probs=108.6 Q ss_pred CCCCCCCC-ccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSR-SDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~-~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) +....+.. ..+.....+..|-...+... -|.+ +++.+|+.....+++.+....-.-.. 
.+-|.........|+ T Consensus 113 ~~~~~~~~~g~lip~~~~~~ii~~~~~~~--~i~~-~~~~~~~~~~~~~~~~~~~~~~~a~~---v~Eg~~~~~~~~~~~ 186 (390) T protein:vir:97 113 ASTDAAGSAGALTTPNRLPGFITPPDARL--TVRD-LIGSGRTDSALIEYVQETGFVNNAAI---VAEGALKPESSLKFA 186 (390) T ss_pred hhcccccccccccchhhhHHHHHHHhhhh--hhHh-hcceeeccCCceEEEEEecCCcceee---ecCCcccccccccee Confidence 22122211 11212222222222222221 2233 46777877666677776432111001 122333333444555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..++..+..+....++.+..++. .+++......+.+.+....| ..++++. +......|.-. .... T Consensus 187 ~i~~~~~k~~~~~~is~ell~ds---~~l~~~i~~~la~a~~~~~d----~a~l~G~----g~~~~p~Gi~~----~~~~ 251 (390) T protein:vir:97 187 KKTDTTHVIAHTMKATRQILSDA---PQLASYMNNRLIRGLKVKED----AEILRGT----GANDGLLGLIP----QATT 251 (390) T ss_pred EEEEeeeeEEEeehhhHHHHHhH---HHHHHHHHHHHHHHHHHHHH----HHHhhcC----CCCccccceee----cccc Confidence 55554444444444555544432 12333333333333333222 3333321 11111111110 0000 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH---HH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL---VT 236 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~---~~ 236 (344) .. . .-...+.+.+.+|.+.+..+ ...+..++.++|+++.|.+|++ ++. .+ +.-+... .. 
T Consensus 252 ~~--~------~~~~~~~~~~d~~~~~~~~~-~~~~~~~~~~v~n~~~~~~L~~---lkd----~~--G~~l~~~~~~~~ 313 (390) T protein:vir:97 252 YA--A------PTTIAGATRVDQLRLAMLQA-SLAEYPASGIVINPIDWAAIEL---AKD----AN--NQYLIGNARGTL 313 (390) T ss_pred cc--c------cccccccchHHHHHHHHHhh-ccccCCCCEEEEcHHHHHHHHH---hhc----CC--CceeecCccCCC Confidence 00 0 01123456667777776655 4557889999999999999873 332 11 1111111 01 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccc-eeecccccCCcCCcccccccCC-- Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYT-FNWTGLVGSGNEGMRIKRFYLD-- 313 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T-~~~~~~~g~~~~~~~~~~~~~~-- 313 (344) -..++|+| |++.+.. +.+.++ +|+.+.+|. +.+.+. .++.+++. T Consensus 314 ~~~l~G~p-V~~~~~~----------------~~~~~~--------~gd~~~~~~~~~~~~~--------~i~~~~~~~~ 360 (390) T protein:vir:97 314 TPTLWGLP-VVATQAM----------------APGEFL--------VGAFDLAAQIFDQWDA--------RVEIGYVNDD 360 (390) T ss_pred Cceeccee-eEEcCCC----------------CCCcEE--------EEeccceEEEEEecce--------EEEEeecccc Confidence 12456776 3332211 111112 122222222 111211 11211111 Q ss_pred -CCceEEEeeccccceeeeccccchhhhcc Q lcl|NC_015466. 314 -AIESDRIEIDMSYDQKKVAADLGYFFGGI 342 (344) Q Consensus 314 -~~~~~~vr~~~~~~~~v~~~~~g~l~~~~ 342 (344) ......+|+.+.++-.+.-+++-..++=+ T Consensus 361 f~~~~~~~r~~~r~d~~v~~~~a~v~~~~a 390 (390) T protein:vir:97 361 FQRNMVTVLAEERLALVVYRPEALITGSFA 390 (390) T ss_pred cccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 23445577777777777766554444333 No 53 >protein:vir:103955 Length: 324 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1662 # MgeName: phiNM # Cross-refs: genbank:acc:YP_873992;genbank:gi:118430767;genbank:GeneID:4525449 Probab=92.48 E-value=0.011 Score=31.27 Aligned_cols=284 Identities=11% Similarity=-0.034 Sum_probs=119.8 Q ss_pred CCCCCCCCcc-ceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 
1 MPFTQPSRSD-VHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~-~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) +-.|+..... ++...+.+.|-..-++. .+-..+++.+|+.....+++++....- ..-.+-|.........|+ T Consensus 27 ~~~~~~~~~~~liP~~~~~~ii~~~~~~---s~l~~~~~~~~~~~~~~~~p~~~~~~~----a~~v~Eg~~~~~~~~~~~ 99 (324) T protein:vir:10 27 DNVMMHEKKDGTLLNDFTTPILQEVMEN---SKIMQLGKYEPMEGTEKKFTFWADKPG----AYWVGEGQKIETSKATWV 99 (324) T ss_pred cceeccCCCcceechhHHHHHHHHHHhh---chhhhhcceeeccCCceEEEEEeCCcc----eeEeccCcccccccccee Confidence 1111111111 11111222221111111 123345677888776677777642211 111222333333334454 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..+...+.-+....+..+..++. ..+++....+.+.+.+....| ..++++ .+......+.. T Consensus 100 ~v~~~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~l~~ai~~~~d----~a~l~G----~g~~~~~~~i~--------- 160 (324) T protein:vir:10 100 NATMRAFKLGVILPVTKEFLNYT--YSQFFEEMKPMIAEAFYKKFD----EAGILN----QGNNPFGKSIA--------- 160 (324) T ss_pred EEEEeeEEEEEeehhhHHHHhcc--hHHHHHHHHHHHHHHHHHHHH----HHhhhc----CCCCccCcccc--------- Confidence 44444433333333444434332 344555444445444443332 222222 11111111110 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLAD 239 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~ 239 (344) .+.........+.....+|.+....+.. .+..++..+|+++.|..|++ ++ ..+. ..+.....-.. 
T Consensus 161 -----~~~~~~~~~~~~~~t~~~i~~~~~~l~~-~~~~~~~~v~n~~~~~~L~~---l~----d~~g--~~~~~~~~~~~ 225 (324) T protein:vir:10 161 -----QSIEKTNKVIKGDFTQDNIIDLEALLED-DELEANAFISKTQNRSLLRK---IV----DPET--KERIYDRNSDT 225 (324) T ss_pred -----ccccccceeccccCCHHHHHHHHHhhhh-ccCCCCEEEEcHHHHHHHHH---hh----ccCC--ceeecCCCCcc Confidence 0000112334566788999999888754 57889999999999998862 22 2111 11111111134 Q ss_pred HhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeeccccc----CCcCCcccccccCCCC Q lcl|NC_015466. 240 LFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVG----SGNEGMRIKRFYLDAI 315 (344) Q Consensus 240 ~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g----~~~~~~~~~~~~~~~~ 315 (344) ++|+| |.+..+ .....+ .-+.++.--+++... -|.+++...+.. ....+.. ++-=.. T Consensus 226 l~G~P-V~~~~~-----~~~~~~--~~~~gd~~~~~~~~~--------~~~~i~~~~~~~~~~~~~~~~~~---~~~~~~ 286 (324) T protein:vir:10 226 LDGLP-VVNLKS-----SNLKRG--ELITGDFDKLIYGIP--------QLIEYKIDETAQLSTVKNEDGTP---VNLFEQ 286 (324) T ss_pred cccee-EEeecC-----CCCCcc--eEEEEecccEEEEEe--------cCcEEEEeecccccccccccccc---hhhhhc Confidence 67877 322211 111111 112121111111000 012232221100 0000000 111123 Q ss_pred ceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 316 ESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ....+|+.+.++-.+.-+.+=..|+++.+ T Consensus 287 ~~~~~r~~~r~d~~v~~~~A~~~l~~a~~ 315 (324) T protein:vir:10 287 DMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred CcEEEEEEEEEccEEecccceEEEEeccC Confidence 56778888888888888888888888888 No 54 >protein:vir:97255 Length: 310 # NCBI annotation: hypothetical protein ORF017 # Family: family:all:1120 # MgeID: mge:1657 # MgeName: M6 # Cross-refs: genbank:acc:YP_001294525;genbank:gi:149408246;genbank:GeneID:5237120 Probab=92.31 E-value=0.012 Score=31.13 Aligned_cols=298 Identities=10% Similarity=0.070 Sum_probs=116.6 Q ss_pred CC-CCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccc-----ccc Q lcl|NC_015466. 
1 MP-FTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTES-----AGG 74 (344) Q Consensus 1 m~-~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~-----~~~ 74 (344) || .+..-......|..--.+ +......++++ .++|..+|..- .+.|.++.-.. ...-++++.+. .+. T Consensus 1 mpaltLaea~k~~~d~l~~~V-iE~~~~~s~lL--~~LpF~~veg~---~~~ynR~~~~~-~~~~~~v~~~~~~~g~~~~ 73 (310) T protein:vir:97 1 MASVTLAESAKLAQDELVAGV-IENIITVNRMF--DVLPFDSIEGN---SLAYNRENVLG-DVIMAGVGTTFSGAGAGKA 73 (310) T ss_pred CcccchHHHhhcCcchHHHHH-HHHHhccchHH--HhCCcccccCC---cceeeEeeccC-CcccccccccccCCCcccc Confidence 77 222111111112211111 11111111222 45577666532 34554443211 12223333222 222 Q ss_pred eecccccccccccccccccccHHHHHhc-cCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccc Q lcl|NC_015466. 75 TYEIGNDTYFARTRAYHRDVPEQVRANA-DNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSP 153 (344) Q Consensus 75 ~~~~~~~~~~~~~~~l~~~v~~~~~~~a-~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~ 153 (344) ...+...++.|..-+-+..|+....+-. ....+.....++..++.+..+.| ..+.++-. ..+...|-.... T Consensus 74 ~~t~~~~~~~L~i~~g~~~Vd~~i~dl~~~~~~dq~~~Ql~~~iea~~~~~e----~~lINGD~----a~n~F~GL~~~~ 145 (310) T protein:vir:97 74 AATFTKVNSNLTTIMGDAEVNGLIQATRSGDGNDQTAVQIASKAKSAGRKYQ----DQLINGNG----AGNEFAGLIQLC 145 (310) T ss_pred ccccceeeeeeeeeeehhhhhhHHHhhhcCChHHHHHHHHHHHHHHHHHHHH----HHhhcccc----CCCcccchhhcC Confidence 3334444455554333333322211211 22222222222222233322222 22333221 111111221111 Q ss_pred ccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccC Q lcl|NC_015466. 154 TAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKAN 233 (344) Q Consensus 154 ~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt 233 (344) .....+... +..+.-=..||+++.+.+.+. ...|..++|+++...+++. +.......+.. 
..+ T Consensus 146 ---~~~q~i~~~-------~~gg~~t~d~LDeLl~~v~~~-~g~p~~~l~~~~~~r~i~A---~~R~~~~~g~~---~~~ 208 (310) T protein:vir:97 146 ---ASGQKATTG-------ATGSAISFAILDELMDLVVDK-DGQVDYLTMHARTLRSYKA---LLRALGGASIN---EVV 208 (310) T ss_pred ---CccceeecC-------CCCCCCCHHHHHHHHHHHhcC-CCCCCEEEecHHHHHHHHH---HHHHhcCCCCC---Ccc Confidence 111222221 111222247999999887533 5679999999987555541 11111111110 111 Q ss_pred HH----HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCccccc Q lcl|NC_015466. 234 LV----TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKR 309 (344) Q Consensus 234 ~~----~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~ 309 (344) .+ ++-.+-|+|-+.+ +..=....++.....+ -+++. .+|+-. .+.+..|++.++..|..+.. T Consensus 209 ~~~~G~~v~~~~GiPi~~~-d~ip~~~~~~~~~gtT------sIya~-----r~Ge~~--~~~Gv~Gl~~~~~~glsVr~ 274 (310) T protein:vir:97 209 ELPSGAEVPAYSGTPIFRN-DYIPTNQTKGGTTGCT------TIFAG-----TLDDGS--RTHGIAGLTATQAAGIQVVD 274 (310) T ss_pred ccCCCCEEeeeCCeEEEEe-CccCCCccccccCCce------eEEEE-----eeCccc--cccceeccccCCccceeEEe Confidence 11 1223346663322 1110000000000011 11121 122211 01222344445555555555 Q ss_pred cc-CCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 310 FY-LDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 310 ~~-~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .. 
........+||.|-+-..+..+++.-.|+|+.= T Consensus 275 ~G~~~~~~v~~~~V~~Y~~~av~~~~A~a~L~~V~~ 310 (310) T protein:vir:97 275 VGESEDSDEHIWRVKWYCGLALFSEKGLACADGITN 310 (310) T ss_pred CCcccCCcceeEEEEEeeeEEEecccceeeeccccC Confidence 55 345566778889999999999998888888888 No 55 >protein:vir:4226 Length: 326 # NCBI annotation: observed 35.2Kd protein # Family: family:all:507 # MgeID: mge:89 # MgeName: L5 # Cross-refs: genbank:acc:NP_039681;swissprot:sw:q05223;genbank:gi:9625447;uniprot:Q05223;genbank:GeneID:2942929 Probab=92.29 E-value=0.012 Score=31.11 Aligned_cols=292 Identities=11% Similarity=-0.024 Sum_probs=112.7 Q ss_pred CCCC-------------------CCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhccccc Q lcl|NC_015466. 1 MPFT-------------------QPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDE 61 (344) Q Consensus 1 m~~~-------------------~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~ 61 (344) |..+ ......+...+....|- ..-.. ..+-..+++.+|+.....+++++..+.-.. T Consensus 1 ~~~~~~r~~~~~~~~e~~a~~~~~~~~g~~ip~~~~~~ii-~~~~~--~s~i~~~~~~~~~~~~~~~~p~~~~~~~a~-- 75 (326) T protein:vir:42 1 MAVNPDRTTPFLGVNDPKVAQTGDSMFEGYLEPEQAQDYF-AEAEK--ISIVQQFAQKIPMGTTGQKIPHWTGDVSAS-- 75 (326) T ss_pred CCCCccchhhhcCcchhhheeccccCCcceechhhHHHHH-HHHHh--cchhhhhcceeeccCCceEEEEEeCCcceE-- Confidence 2211 11111121111111111 11111 122334677888877777777764322111 Q ss_pred ccccccCcccccceecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccc Q lcl|NC_015466. 62 MQERTPGTESAGGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDT 141 (344) Q Consensus 62 ~~~ra~g~~~~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~ 141 (344) -.+-|.........|...++..+..+...++..+..++ +.++++....+.+.+.+....| ..++++. 
+ T Consensus 76 --~v~Eg~~~~~~~~~f~~i~~~~~k~~~~v~iS~ell~~--s~~~~~~~i~~~l~~a~~~~~d----~a~l~G~----g 143 (326) T protein:vir:42 76 --WIGEGDMKPITKGNMTSQTIAPHKIATIFVASAETVRA--NPANYLGTMRTKVATAFAMAFD----NAAINGT----D 143 (326) T ss_pred --EecCCccccccccceeEEEEeeEEEEEeehhhHHHHhc--CHHHHHHHHHHHHHHHHHHHHH----HHhhccc----C Confidence 11123333333445555444444444444444444443 2345555555555555544333 2233221 1 Q ss_pred cccccccccccccccccc-ceeecccccccccCCCCCChHH-HHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHH Q lcl|NC_015466. 142 WTFDVDGVASSPTAPASF-DPTNASNNDKLHWSDASSTPIE-DIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVG 219 (344) Q Consensus 142 ~~~~~~gv~~~~~~~~~~-~k~tl~~t~~~~Wsd~~SDPi~-di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~ 219 (344) .. ...+.. ..... +.....++ .......... ++...... ....+...+..+|+++.|.+|++ ++. T Consensus 144 s~-~p~gi~----~~~~~~~~~~~~~~----~~~~~~~~~~~~~~~~~~~-~~~~~~~~a~~v~n~~~~~~L~~---lkd 210 (326) T protein:vir:42 144 SP-FPTFLA----QTTKEVSLVDPDGT----GSNADLTVYDAVAVNALSL-LVNAGKKWTHTLLDDITEPILNG---AKD 210 (326) T ss_pred CC-cccccc----ccccccceeecccc----cccccchhHHHHHHHHHhh-hhhhccCccEEEEeHHHHHHHHH---hhc Confidence 10 001111 00001 11111111 1111111111 23333333 34556778889999999998873 322 Q ss_pred HhccCCCccccccCHHH---------HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCccccccccc Q lcl|NC_015466. 220 RIDRGQTSGAAKANLVT---------LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGY 290 (344) Q Consensus 220 ~i~~~~~~~~~~vt~~~---------la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~ 290 (344) .+ +.-+.+... ...++|+| |.+.+.. . .+...-+.|+..-+++... -|. T Consensus 211 ----~~--G~~l~~~~~~~~~~~~~~~~~l~G~p-v~~~~~~-----~--~~~~~~~~Gd~s~~~~~~~--------~~~ 268 (326) T protein:vir:42 211 ----KS--GRPLFIESTYTEENSPFRLGRIVARP-TILSDHV-----A--SGTVVGYQGDFRQLVWGQV--------GGL 268 (326) T ss_pred ----cC--CceeeccccccCccccccCceeeeee-EEEcCCC-----C--CCceEEEEeecceEEEEEe--------cce Confidence 11 111111110 12345555 2221110 0 0111111222111111100 112 Q ss_pred ceeeccccc-CCcCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
291 TFNWTGLVG-SGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 291 T~~~~~~~g-~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ++....... ........+.+..=......+|+.+.++-.+.-+++-..|+++.| T Consensus 269 ~v~~~~e~~~~~~~~~~~~~~~~~~~d~~~~r~~~~~d~~v~~~~a~~~l~~~~~ 323 (326) T protein:vir:42 269 SFDVTDQATLNLGTPQAPNFVSLWQHNLVAVRVEAEYAFHCNDKDAFVKLTNVDA 323 (326) T ss_pred EEEEeecceeeecccccccchhhhhcCcEEEEEEEEeccEEecccceEEEeeccc Confidence 222111100 000000000111112356778999999999999999888999888 No 56 >protein:vir:94673 Length: 419 # NCBI annotation: major capsid protein # Family: family:all:585 # MgeID: mge:1527 # MgeName: mu1/6 # Cross-refs: genbank:acc:YP_579208;genbank:gi:93007444;genbank:GeneID:5076792 Probab=92.16 E-value=0.013 Score=31.00 Aligned_cols=284 Identities=13% Similarity=0.088 Sum_probs=114.2 Q ss_pred CCCCCCCCccceeccc-ccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhccc----ccccccccCcccccce Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRP-LTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNR----DEMQERTPGTESAGGT 75 (344) Q Consensus 1 m~~~~~~~~~~~~dp~-LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~----~~~~~ra~g~~~~~~~ 75 (344) ++..........+-|. +..+-..-.. ...+-..++..+|+.....+|++........ ....-.+-|....... T Consensus 123 ~~~~~~~~~~~~~~p~~~~~~i~~~~~--~~~~i~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~ 200 (419) T protein:vir:94 123 APAGTITNPNVPHLPQLVPGIVPTTPD--LPLLVADLLDQQNADYNVLEYIRDTSGTAGAGSTWNKAAVVPEGTAKPQST 200 (419) T ss_pred cccccccCCcccccchhhhHHHHHHHh--hhhhhhhcceeeeccCCceeeeeeccccccccccCcccceecCCccccccc Confidence 1111111111111111 1111000000 0122334556666665555565532211000 0000011122222333 Q ss_pred ecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccc Q lcl|NC_015466. 76 YEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTA 155 (344) Q Consensus 76 ~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~ 155 (344) ..|+..++..+..+.-..++.+..+++. +++......+.+.+....| ..+++ +.+.. ...|... 
T Consensus 201 ~~~~~i~~~~~k~~~~~~is~ell~d~~---~l~~~i~~~la~a~~~~~d----~aii~----G~G~~-~p~Gi~~---- 264 (419) T protein:vir:94 201 LSFDTITTTLKTVAHWLPITRQAADDNS---QLMGYIQGRLTYGLRFLRD----RQLLN----GNGST-EMQGILT---- 264 (419) T ss_pred cceeeEEeeeeeEEEeehhhHHHHHhHH---HHHHHHHHHHHHHHHHHHH----HHHHh----ccCcc-cccceec---- Confidence 4455444444444444455555554331 2333333334444333332 22222 22211 1111111 Q ss_pred ccccceeecccccccccCC-CCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH Q lcl|NC_015466. 156 PASFDPTNASNNDKLHWSD-ASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL 234 (344) Q Consensus 156 ~~~~~k~tl~~t~~~~Wsd-~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~ 234 (344) ........... .+.. ...+.+.+|.+.+..+.. .+..|+..+|+++.|..|+. ++..- ++..-....+.. T Consensus 265 ~~~~~~~~~~~----~~~~~t~~~~~~~l~~~~~~~~~-~~~~~~~~v~n~~~~~~l~~---~k~~~-~~~~~~~~~~~~ 335 (419) T protein:vir:94 265 TPGIGTYQQPK----PTAPATDEPPLVDIRRAKTVAEI-AGFPPDGVVVHPQDWESIEL---DQAPG-SGVFRVIANVQG 335 (419) T ss_pred ccccccccccc----cccccccchhHHHHHHHHHhhhh-ccCCCCEEEEcHHHHHHHHH---HhhcC-CCceeecCCccc Confidence 01011111111 1222 345668889998888764 57889999999999988852 22110 111000011111 Q ss_pred HHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccc-eeecccccCCcCCcccccccCC Q lcl|NC_015466. 235 VTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYT-FNWTGLVGSGNEGMRIKRFYLD 313 (344) Q Consensus 235 ~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T-~~~~~~~g~~~~~~~~~~~~~~ 313 (344) ..-..++|+| |++.+.. ..+.++ +|+.+.+++ +.+.++ .++...+. T Consensus 336 ~~~~~l~G~p-V~~~~~~---------------~~~~~~---------~gd~~~~~~~~~~~~~--------~v~~~~~~ 382 (419) T protein:vir:94 336 EATPRIWGLN-VVSTVAI---------------AQGTAL---------VGGFRQGATLWSRQGI--------TVLMTDSH 382 (419) T ss_pred CCCcccccee-eEEcCCC---------------CCccEE---------EeeccceEEEEEecce--------EEEEeccc Confidence 1223567776 3332211 111111 223332222 222221 12221211 Q ss_pred ----CCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
314 ----AIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 314 ----~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ..+...+|+.+.++-.+.-+++-..++-+-| T Consensus 383 ~~~~~~~~~~~r~~~r~d~~v~~~~a~~~~~~~aa 417 (419) T protein:vir:94 383 ADFFTANTLVILAEFRANLAVYQPKAFVRVTFAAA 417 (419) T ss_pred cchhhcCcEEEEEEEeeccEEeccccEEEEEeccC Confidence 1466778899999999998888888777666 No 57 >protein:vir:94933 Length: 330 # NCBI annotation: putative phage structural protein # Family: family:all:1120 # MgeID: mge:1538 # MgeName: Xp15 # Cross-refs: genbank:acc:YP_239278;genbank:gi:66392060;genbank:GeneID:5076578 Probab=91.99 E-value=0.013 Score=30.86 Aligned_cols=299 Identities=13% Similarity=0.121 Sum_probs=111.0 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eeccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~~~ 79 (344) ||..+--+...-..-.|+.=.+..-....+++ .++|..++..-. +.|.++.- .+...-|.++....+. ...|. T Consensus 25 m~alTLaea~~l~~d~~~~~VIE~l~~~s~iL--~~lpf~~ve~~~---~~~~r~~~-lp~a~~r~~n~~~~~~~~~Tf~ 98 (330) T protein:vir:94 25 MPTVTLAESAKLSQDHLVSGLIETIVEVNPLY--EMMPFTEIEGNA---LAYNRENV-LGDVQFLAVGGTITAKNPATFT 98 (330) T ss_pred hhhhhhhHHhhcCchhhHHHHHHhhhccchHH--hhcccccccCCc---ceeeeeec-CCcceeeeccccccccCcceee Confidence 33221111100000000000000000000111 233544444333 33433322 1222223344333221 11121 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..+.+|..-+-...|+....+-..+.++.+....+...+.+..+ .+..++++.... +...|... .+... 
T Consensus 99 q~t~~l~~l~~~~~Vd~~iadl~g~~~d~~~~q~~~~ieal~~~----~e~~linGDs~~----~~F~GL~~---~~~~~ 167 (330) T protein:vir:94 99 KVTSELTTLIGDAEVNGLIQATRSDFMDQTSVQVASKAKSIGRQ----YQASMITGDGTG----NSFQGMMG---LVAAS 167 (330) T ss_pred eeeechhhhhhhHHHHHHHHHhcCCHHHHHHHHHHHHHHHHHHH----HHHHhhccCCCC----ccccchhh---cCCcc Confidence 22223333222333333333323333444433333333333322 234444443221 11122221 12223 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHH-- Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTL-- 237 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~l-- 237 (344) +.+..-++ .+.-=+.|++++.+.+.+. +-.|..++|++....+++ ++.....+... ...+.+.+ T Consensus 168 q~i~tg~~-------gg~~T~d~LDeLl~~v~~~-~g~~~~~l~n~a~~r~I~---a~~R~~~~~~v---~~~~~~~~G~ 233 (330) T protein:vir:94 168 QTISAGAN-------GGTLTFELLDQLLDLVKDK-DGQVDYLMSSFAMRRKYF---SLLRALGGAAI---GEVMTLPSGR 233 (330) T ss_pred cEEecCCC-------CCCCCHHHHHHHHHHhcCC-CCCCcEEEechhHHHHHH---HHHHhccCCCC---CCcccccCCC Confidence 33333111 1111147788888887544 447999999998887765 12111111110 01222222 Q ss_pred --HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccC-CC Q lcl|NC_015466. 238 --ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYL-DA 314 (344) Q Consensus 238 --a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~-~~ 314 (344) -.+-|+| |...+-.=....++..+..+.| ++.. +|+-.. ..+-.|....+..|..++...+ .. T Consensus 234 ~v~~~~GvP-i~~~d~ip~~~~~~~~~~ttsI------yav~-----~G~~~~--~qgV~Gl~~~g~~glsVr~~G~~~~ 299 (330) T protein:vir:94 234 QIPTYRGVP-WFVNDFIPSNMTQGTATNATAI------FAGT-----FDDGSN--KYGIAGLTARGSAGLRVQNVGAKEN 299 (330) T ss_pred EEeeeCCeE-EEecccccCCCCcccCCCceeE------EEEe-----eccccc--ccceEeecCCCCCcceeeeCCCccc Confidence 2233555 3332211111111111111111 1111 111100 0112233333444555555553 34 Q ss_pred CceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
315 IESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 315 ~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .....+||.|-+...+..+++.-.|+|+-= T Consensus 300 k~v~~~~v~~y~~~av~~~~a~~~L~~V~~ 329 (330) T protein:vir:94 300 ADETITRVKMYCGFANFSQLGLAAIKGLIP 329 (330) T ss_pred cceeeEEEEEeeeeEEechhheeeeccccC Confidence 455668999999888888888888877766 No 58 >protein:vir:97148 Length: 324 # NCBI annotation: ORF010 # Family: family:all:507 # MgeID: mge:1654 # MgeName: 85 # Cross-refs: genbank:acc:YP_239726;genbank:gi:66394880;genbank:GeneID:5130881 Probab=91.66 E-value=0.015 Score=30.61 Aligned_cols=284 Identities=11% Similarity=-0.045 Sum_probs=123.2 Q ss_pred CCCCCCCCcccee-cccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSDVHV-NRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~~~-dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) +..++.......+ ....+.|-..-++ ..+-..+++.+|+.....+++++....-. .-.+-|......+..|+ T Consensus 27 ~~~~~~~~~~~~iP~~~~~~ii~~~~~---~s~l~~~~~~~~~~~~~~~ip~~~~~~~a----~~v~Eg~~~~~~~~~f~ 99 (324) T protein:vir:97 27 DNVMMHEKKDGTLMNEFTTPILQEVME---NSKIMQLGKYEPMEGTEKKFTFWADKPGA----YWVGEGQKIETSKATWV 99 (324) T ss_pred ccccccCCCcceechhHHHHHHHHHHh---hcchhhhcceeeccCCceEEEEEecCcce----eEeccCcccccccccee Confidence 1111111111111 1111221111111 12334457788887766777776422111 11122333334444555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..++.++.-+.-..+.++..++. .++++....+.+.+.+....| ..++++. +......+... 
T Consensus 100 ~v~~~~~k~~~~~~is~ell~ds--~~~l~~~i~~~l~~aia~~~d----~a~l~G~----g~~~~~~gi~~-------- 161 (324) T protein:vir:97 100 NATMRAFKLGVILPVTKEFLNYT--YSQFFEEMKPMIAEAFYKKFD----EAGILNQ----GNNPFGKSIAQ-------- 161 (324) T ss_pred EEEEeeEEEEEeehhhHHHHhcc--hHHHHHHHHHHHHHHHHHHHH----HHhhccC----CCCccCccccc-------- Confidence 55444444333333444444433 345555555555555544333 2233221 11111111110 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLAD 239 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~ 239 (344) .........++.....+|.+....+. ..++.+...+|+++.|..|+. + +..+. ..+.....-.. T Consensus 162 ------~~~~~~~~~~~~~~~~~i~~~~~~l~-~~~~~~~~~v~n~~~~~~L~~---l----kd~~g--~~~~~~~~~~t 225 (324) T protein:vir:97 162 ------SIEKTNKVIKGDFTQDNIIDLEALLE-DDELEANAFISKTQNRSLLRK---I----VDPET--KERIYDRNSDT 225 (324) T ss_pred ------cccccceeccccCCHHHHHHHHHhhh-hccCCCCEEEEcHHHHHHHHH---h----hcCCC--ceeecCCCCcc Confidence 00001123345566788888887765 457889999999999988862 2 22211 11111111124 Q ss_pred HhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeeccccc----CCcCCcccccccCCCC Q lcl|NC_015466. 240 LFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVG----SGNEGMRIKRFYLDAI 315 (344) Q Consensus 240 ~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g----~~~~~~~~~~~~~~~~ 315 (344) ++|+| |.+..+ .....+ .-+.++..-+++... -|.+++...+.. ....+. .|.-=.. T Consensus 226 l~G~P-V~~~~~-----~~~~~~--~~~~gd~~~~~i~~~--------~~~~i~~~~~~~~~~~~~~~~~---~~~~f~~ 286 (324) T protein:vir:97 226 LDGLP-VVNLKS-----SNLKRG--ELITGDFDKLIYGIP--------QLIEYKIDETAQLSTVKNEDGT---PVNLFEQ 286 (324) T ss_pred cccee-eEeecC-----CCCCcc--eEEEEecccEEEEEe--------cCcEEEEeeccccccccccccc---chhhhhc Confidence 67877 333221 111111 112222111111110 123333222100 000000 0111123 Q ss_pred ceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
316 ESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ....+|+.+.++-.+.-+++=..|+.+.+ T Consensus 287 d~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 315 (324) T protein:vir:97 287 DMVALRATMHVALHIADDKAFAKLVPADK 315 (324) T ss_pred CcEEEEEEEEeccEEecccceEEEEeccC Confidence 56778899999999999999888888888 No 59 >protein:vir:95107 Length: 270 # NCBI annotation: ORF013 # Family: family:all:522 # MgeID: mge:1549 # MgeName: X2 # Cross-refs: genbank:acc:YP_240822;genbank:gi:66394683;genbank:GeneID:5133901 Probab=91.36 E-value=0.012 Score=31.05 Aligned_cols=260 Identities=13% Similarity=0.031 Sum_probs=121.8 Q ss_pred CCCCCCCCccceecc-cccceeeeeEcCcchhhhhhhCccccc----CCccceeeeechhhcccccccccccCcccccce Q lcl|NC_015466. 1 MPFTQPSRSDVHVNR-PLTNISIGYVQDASHFVAGQVFPQVSV----GKQSDAYFTYERGDFNRDEMQERTPGTESAGGT 75 (344) Q Consensus 1 m~~~~~~~~~~~~dp-~LT~iA~~Y~n~~~~~ia~~lfP~v~v----~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~ 75 (344) |+++.- ... +.| +++++...-.+. ...+.|.+.+ ..+.|...+|++=...-. ...-.-|...-+.+ T Consensus 1 Ma~T~~--~d~-I~Pev~~~~V~e~~~~-----~~~~~~~~~~d~~L~g~~G~ti~~P~~~~igd-ae~~~eg~~i~~~~ 71 (270) T protein:vir:95 1 MTQTKK--ANL-INPEVLANVVSAQMQN-----AIRFTPYAVTDDTLVGQPGDTITRPKYAYIGA-AEDLQEGVAMDTTQ 71 (270) T ss_pred CCceeh--hhh-cchHHHHHHHHHHHHh-----HHhhccccccccccCCCCCCEEEeeeecCCCc-cccccCCCccchhh Confidence 775432 233 445 677754332221 1122343332 223455555543111111 11111233333334 Q ss_pred ecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccc Q lcl|NC_015466. 76 YEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTA 155 (344) Q Consensus 76 ~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~ 155 (344) ...+......+..+..-.+.+. ....+.-||...+..++...+....+..+...+. +. 
T Consensus 72 lt~~~~~a~i~~~gk~~~itD~--a~~~~~~dp~~~~~~q~a~~~a~~~d~~li~~l~--------------~a------ 129 (270) T protein:vir:95 72 MSMTTTKVTVKETGKAVEVTQT--AIITNVNGTLQEASRQLAMSLADKVEIDYIAELN--------------KS------ 129 (270) T ss_pred cccchheeeeehhhCcceecHH--HHhhhccchHHHHHHHHHHHHHHHHHHHHHHHhc--------------cc------ Confidence 4444444455554443333333 3333344777777777777666555543322211 10 Q ss_pred ccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH Q lcl|NC_015466. 156 PASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV 235 (344) Q Consensus 156 ~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~ 235 (344) .|+...+.-..+|.++...+.+. +-.+++++|.++++..|++++.+ +..+++ . +.+.-- T Consensus 130 ---------------~~~~~~~~t~~~~~dA~~~lgd~-~~~~~~i~vhs~~~~~Lrk~~~~-~~~~~~---~-~~~~~G 188 (270) T protein:vir:95 130 ---------------KQTATVSADATGILDAIEVFNSE-NDEDYVLYVNPKDYNKLVKSLFK-VGGNVQ---D-RAISKG 188 (270) T ss_pred ---------------ccccccccCHHHHHHHHHHhccc-cCCCcEEEEcHHHHHHHHhhhcc-cccccc---c-chhccc Confidence 13333334456777777776555 67799999999999999987633 222221 1 222223 Q ss_pred HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCC Q lcl|NC_015466. 236 TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAI 315 (344) Q Consensus 236 ~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~ 315 (344) .+..++|++ |.|-+. ... .....|+ ..+.++ ++... +..+++.|+... T Consensus 189 ~ig~~~G~~-Viv~s~-----~~~---------~~~~~l~---~~gAi~------~~~~~--------~~~vEtdRd~~~ 236 (270) T protein:vir:95 189 DLVEIVGVS-DIVKSK-----RVS---------ENTAFLQ---RYGAME------IVNKK--------KPEAYTDFDILK 236 (270) T ss_pred ccceeccee-EEEeCC-----CCC---------ceeEEEE---ecccee------eeecC--------Cceeeeccchhh Confidence 455666765 322211 000 0112222 112222 11111 123567777778 Q ss_pred ceEEEeeccccceeeeccccchhhh--cccC Q lcl|NC_015466. 
316 ESDRIEIDMSYDQKKVAADLGYFFG--GIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~--~~va 344 (344) +...+....++--+++.+..=..++ -+-+ T Consensus 237 ~~d~i~~~~~y~v~~~~~skvv~~t~~~a~~ 267 (270) T protein:vir:95 237 RTHLLSTNYHYSVNLKDETGVVKVTFKPSGS 267 (270) T ss_pred cccEEEeeeEEEEEEEccceEEEEEecCCCC Confidence 8888888888766666654333322 1111 No 60 >protein:vir:10364 Length: 390 # NCBI annotation: head protein; major capsid subunit precursor # Family: family:all:585 # MgeID: mge:183 # MgeName: Xp10 # Cross-refs: genbank:acc:NP_858956;genbank:gi:32128421;genbank:GeneID:2648357 Probab=91.31 E-value=0.016 Score=30.36 Aligned_cols=270 Identities=9% Similarity=-0.038 Sum_probs=111.6 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) +.......-.+.+-..+..|-..-+.. . -|. .+++.+|+.....+++.+....-.-.. ..-|.........|+. T Consensus 114 ~~~~~~~~g~~~~~~~~~~ii~~~~~~-~-~l~-~~~~~~~~~~~~~~~~~~~~~~~~a~~---v~Eg~~~~~~~~~~~~ 187 (390) T protein:vir:10 114 STDAAGSAGALTTPNRLPGFITQPDAR-L-TVR-DLIGSGRTDSALIEYVQETGFVNNAAI---VAEGALKPESSLKFAK 187 (390) T ss_pred hcccccccccccchhHHHHHHHHHHhh-c-hhh-hhcceeeccCCceEEEEEecCCcceee---ecCCccccccccceeE Confidence 222222111111111111111111111 1 122 346777887777788876432111111 1223333444445555 Q ss_pred cccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASFD 160 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~~ 160 (344) ..+..+..+....+..+..++. .+++......+.+.+....+ ..++++. +......|.-. .. . 
T Consensus 188 i~~~~~k~~~~~~is~ell~d~---~~l~~~i~~~l~~~~~~~~~----~~il~G~----G~~~~p~Gi~~----~~--~ 250 (390) T protein:vir:10 188 KTDTTHVIAHTMKATRQILSDA---PQLASYMNNRLIRGLKVKED----AEILRGT----GANDGLLGLIP----QA--T 250 (390) T ss_pred EEEeeEEEEEeehhhHHHHHhH---HHHHHHHHHHHHHHHHHHHH----HHHhhcC----CCCcccccccc----cc--c Confidence 5555444444444555544432 13443333333333332222 2333221 11111111110 00 0 Q ss_pred eeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH---HHH Q lcl|NC_015466. 161 PTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL---VTL 237 (344) Q Consensus 161 k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~---~~l 237 (344) .... .-..++.+++.+|.+++..+. ..+..++.++|+++.|.+|++ ++ .++. .-+... ..- T Consensus 251 ~~~~------~~~~~~~~~~~~~~~~~~~l~-~~~~~~~~~v~n~~~~~~L~~---lk----d~~g--~~l~~~~~~~~~ 314 (390) T protein:vir:10 251 TYAA------PTTIAGATRVDQLRLAMLQAS-LAEYPASGIVINPIDWAAIEL---AK----DANN--QYLIGNARGTLT 314 (390) T ss_pred cccc------cccccccchHHHHHHHHHhhc-cccCCCCEEEEcHHHHHHHHH---hh----cCCC--ceeecCCcCcCC Confidence 0000 112245577888888887765 457888999999999988872 22 2111 011110 001 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccC---C Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYL---D 313 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~---~ 313 (344) ..++|+| |++... +..+.+ + +|+.+.+|.. ...++ .++..++ - T Consensus 315 ~~l~G~p-v~~~~~---------------~p~~~~-~--------~gdf~~~~~~~~~~~~--------~i~~~~~~~~~ 361 (390) T protein:vir:10 315 PTLWGLP-VVATQA---------------MAPGEF-L--------VGAFDLAAQIFDQWDA--------RVEIGYVNDDF 361 (390) T ss_pred ceeccee-eEEcCC---------------CCCCcE-E--------EEeccceEEEEEecce--------EEEEeeccccc Confidence 2356776 322111 111111 1 1122222221 11111 1111111 1 Q ss_pred CCceEEEeeccccceeeeccccchhhhcc Q lcl|NC_015466. 
314 AIESDRIEIDMSYDQKKVAADLGYFFGGI 342 (344) Q Consensus 314 ~~~~~~vr~~~~~~~~v~~~~~g~l~~~~ 342 (344) ......+|+...++-.+.-+.+-..++=+ T Consensus 362 ~~~~~~~r~~~r~d~~v~~~~a~~~~~~a 390 (390) T protein:vir:10 362 QRNMVTVLAEERLALVVYRPEALISGSFA 390 (390) T ss_pred ccCcEEEEEEEeeccEEeccccEEEEEeC Confidence 23556777888888888877766555444 No 61 >protein:vir:41 Length: 299 # NCBI annotation: major capsid protein # Family: family:all:507 # MgeID: mge:2 # MgeName: A118 # Cross-refs: genbank:acc:NP_463467;swissprot:trembl:q9t1b7;genbank:gi:16798789;uniprot:Q9T1B7;genbank:GeneID:922353 Probab=91.22 E-value=0.017 Score=30.30 Aligned_cols=288 Identities=8% Similarity=-0.009 Sum_probs=125.3 Q ss_pred CCCCCCCCccceeccccc-ceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLT-NISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT-~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) |..++.......|-+.+. .|-..-++ ..+-..+++.+|+.....++++...... . -..-|......+..|+ T Consensus 6 ~~~~~~~~~~~~iP~~~~~~ii~~~~~---~s~l~~~~~~~~~~~~~~~~~~~~~~~a-~----~v~E~~~~~~~~~~f~ 77 (299) T protein:vir:41 6 DTTTMQSAKTGSIPINISEQIITGVKN---GSAAMKLAKAVPMTKPEEEFTFMSGVGA-F----WVDEAERIQTSKPTFT 77 (299) T ss_pred CcccccCCCceecchhHHHHHHHHHHh---cchhhhhceeeecCCCcEEEEEEcCCce-e----eeecCcccccccccee Confidence 333333332222322222 21111111 1334455677787766666665432211 1 1122333333444555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ...+..+.-+...++.++..++. .++++......+.+.+....| ..++++. +... ..|.- ... 
T Consensus 78 ~v~l~~~k~~~~~~is~ell~ds--~~~~~~~i~~~l~~a~~~~~d----~a~l~G~----g~~~-~~gil----~~~-- 140 (299) T protein:vir:41 78 KAKMRSKKMGVIIPTTKENLNYS--VTNFFSLMQAEIVEAFYKKFD----QAVFTGV----ESPY-NWNIL----KSA-- 140 (299) T ss_pred EEEEeeEEEEEeehhhHHHHhcC--HHHHHHHHHHHHHHHHHHHHH----HHHhhcc----cCcc-ccccc----ccc-- Confidence 55444444444444555544432 344555444555555444333 3333322 1100 00110 000 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLAD 239 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~ 239 (344) .... ....++.+.+.||.+....+. ..++.++..+|+++.|.+|+. ++.. .+...-.+ .++ ..... T Consensus 141 ---~~~~----~~~~~~~~~~~~l~~~~~~l~-~~~~~~~~~v~n~~~~~~L~~---lkd~-~G~~l~~~-~~~-~~~~~ 206 (299) T protein:vir:41 141 ---TDAS----NLVEETANKYDDLNEAIGLIE-AEDLEPNGIATIRKQRVKYRS---TKDG-NGMPIFNT-ATS-NGVDD 206 (299) T ss_pred ---cccc----eeeccccccHHHHHHHHHhhh-cccCCcCEEEEcHHHHHHHHH---hhcc-CCceeecC-CcC-CCCce Confidence 0000 122345678899999998865 457899999999999999873 3221 11100000 011 01124 Q ss_pred HhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeeccccc----CCcCCcccccccCCCC Q lcl|NC_015466. 240 LFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVG----SGNEGMRIKRFYLDAI 315 (344) Q Consensus 240 ~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g----~~~~~~~~~~~~~~~~ 315 (344) +||+| |.+.+.. ........-+.++..-+++... -+.+++...+.. ....+... .--.. T Consensus 207 l~G~P-V~~~~~~-----~~~~~~~~~~~gdfs~~~i~~~--------~~~~i~~~~~~~~~~~~~~~~~~~---~~~~~ 269 (299) T protein:vir:41 207 VLGLP-IAYTPKY-----TFGDKDISELVGDWNQAYYGIL--------RGVEYEILTEATLTTVADETGKPL---NLAER 269 (299) T ss_pred eccee-eEEeccc-----CCCCCceEEEEEecccEEEEEe--------cCcEEEEeecccccccccccccch---hhhhc Confidence 67776 3332211 1111111111122111111000 012222221100 00001100 11123 Q ss_pred ceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
316 ESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +...+|+.+.++-.+.-+.+-..++...| T Consensus 270 ~~~~~r~~~~~d~~v~~~~A~~~l~~~aa 298 (299) T protein:vir:41 270 DMAAIKATFEVGFMVVKDEAFSAVQPKAG 298 (299) T ss_pred CcEEEEEEEEeccEEecccceEEEEeccC Confidence 45678999999999999999999988888 No 62 >protein:vir:4339 Length: 395 # NCBI annotation: major head protein # Family: family:all:585 # MgeID: mge:93 # MgeName: D3 # Cross-refs: genbank:acc:NP_061502;genbank:gi:9635591;genbank:GeneID:1262860 Probab=90.64 E-value=0.02 Score=29.93 Aligned_cols=275 Identities=11% Similarity=-0.011 Sum_probs=112.1 Q ss_pred CCCCCCCCccceeccc-ccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRP-LTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~~~dp~-LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) +..+.. .....+-|. .+.|-...++. ..--.+++.+|+.....+++......-.-..+. -|.........|+ T Consensus 114 ~~~~~~-~~g~~vp~~~~~~ii~~~~~~---~~l~~l~~~~~~~~~~~~~~~~~~~~~~a~~v~---E~~~~~~~~~~~~ 186 (395) T protein:vir:43 114 ITSIDG-SGGALVAPDRRPGVVAAPQRR---LTIRDLVAPGTTESNSVEYVRETGFVNNAAPVS---EGTQKPYSDLTFE 186 (395) T ss_pred hcccCC-CCccccchhhHHHHHHHHHhh---hhHHhhccceecCCCceEEEEEecCCCceeeec---CCcccccccccee Confidence 111111 111112221 12221111111 222344677787766677776532211111111 2333333344555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ...+.++..+....++.+..+.. .+++....+.+.+.+....| ..++++. +......|+-.. ... 
T Consensus 187 ~i~~~~~k~~~~~~is~ell~d~---~~l~~~v~~~la~a~~~~~d----~~~l~G~----g~~~~~~Gi~~~----~~~ 251 (395) T protein:vir:43 187 LENAPVRTIAHLFKASRQILDDA---SALQSYIDARARYGLMLVEE----CQLLYGN----GTGANLHGIIPQ----AQA 251 (395) T ss_pred EEEEeeeeEEEeehhhHHHHHhH---HHHHHHHHHHHHHHHHHHHH----HHHHhcc----CCCCcccccccc----ccc Confidence 55555555444455555555432 12333333333333332222 2333321 111111121110 001 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccC---HHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKAN---LVT 236 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt---~~~ 236 (344) .....+ ....+.+.+.+|.+.+..+.. .+..+..++|+++.|.+|+. ++. .+ +.-+.. ... T Consensus 252 ~~~~~~------~~~~~~~~~~~i~~~~~~~~~-~~~~~~~~vmn~~~~~~l~~---lkd----~~--G~~i~~~~~~~~ 315 (395) T protein:vir:43 252 YAPPSG------VVVTAEQRIDRIRLAILQAQL-AEFPASGIVLNPIDWALIEL---NKD----AE--NRYIIGSPQNGT 315 (395) T ss_pred cccccc------cccccchhHHHHHHHHHhhcc-ccCCCcEEEEcHHHHHHHHH---hhc----cC--CceeccccccCC Confidence 111111 222455678888888877654 46788899999999998862 222 11 111111 011 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccCC-- Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYLD-- 313 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~~-- 313 (344) -..+||+| |++.+. +..+.+++ |+.+.+++. .+.++ ...+.++... T Consensus 316 ~~~l~G~p-Vv~~~~---------------~~~~~~~~---------gd~~~~~~~~~~~~~------~i~~~~~~~~~f 364 (395) T protein:vir:43 316 TPTLWRLP-VVETQA---------------ITQDEFLT---------GAFSLGAQIFDRMDI------EVLVSTENDKDF 364 (395) T ss_pred Cceeccee-eEEcCC---------------CCCCcEEE---------EeccceEEEEEecce------EEEEeccccchh Confidence 13456765 322111 11112211 122222221 11111 0001111111 Q ss_pred CCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
314 AIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 314 ~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ..+...+|+...++-.+.-+++=..++=+.| T Consensus 365 ~~~~~~~r~~~r~d~~v~~~~a~~~~~~taa 395 (395) T protein:vir:43 365 ENNMVTIRAEERLAFAVYRPEAFVTGSLTAS 395 (395) T ss_pred hcCcEEEEEEEeeccEEecccceEEEEeccC Confidence 2456677888888888887777555544444 No 63 >protein:vir:2344 Length: 397 # NCBI annotation: gp14 # Family: family:all:507 # MgeID: mge:51 # MgeName: Bxb1 # Cross-refs: genbank:acc:NP_075281;genbank:gi:12657868;genbank:GeneID:920118 Probab=89.65 E-value=0.019 Score=29.96 Aligned_cols=288 Identities=9% Similarity=-0.035 Sum_probs=110.0 Q ss_pred CCCCCCCC-ccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSR-SDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~-~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) |..+.... ..+.+....+.|-..-++ ..+-..+++.+++.....+++++..+.... . ..-|......+..|+ T Consensus 10 ~~~~~t~~~~g~l~~~~~~~ii~~l~~---~s~i~~l~~~~~~~~~~~~ip~~~~~~~a~-w---v~Eg~~~~~s~~~f~ 82 (397) T protein:vir:23 10 IAQTKDTMFTGYLDPVQAKDYFAEAEK---TSIVQRVAQKIPMGATGIVIPHWTGDVSAQ-W---IGEGDMKPITKGNMT 82 (397) T ss_pred HhhccCCCCccccchhHHHHHHHHHHh---ccchhhhcceeeccCCceEEEEEcCCcceE-E---ecCCcccccccccee Confidence 33222211 112221122221111111 123345678888887777777764332211 1 112333333444555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..++..+..+-..++.++..++. .++++....+.+.+.+....| ..++++. +......+. 
T Consensus 83 ~v~l~~~k~~~~v~iS~ell~ds--~~~l~~~i~~~l~~aia~~~d----~a~l~G~----gt~~~~~~~---------- 142 (397) T protein:vir:23 83 KRDVHPAKIATIFVASAETVRAN--PANYLGTMRTKVATAIAMAFD----NAALHGT----NAPSAFQGY---------- 142 (397) T ss_pred EEEEeeEEEEEeehhhHHHHhcc--hHHHHHHHHHHHHHHHHHHHH----HHHhhcc----cCCcccccc---------- Confidence 55544444444444555544433 355555545555555544333 2333221 110000000 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcccccc---CH-H Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKA---NL-V 235 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~v---t~-~ 235 (344) ....+. .....+.....++.+....+.. .+..++..+|+++.+.+|++ ++.. .+...-.+... +. - T Consensus 143 --~~~~~~---~~~~~~~~~~~~~~~~~~~l~~-~~~~~a~~vmn~~~~~~L~~---lkd~-~G~~i~~~~~~~~~~~~~ 212 (397) T protein:vir:23 143 --LDQSNK---TQSISPNAYQGLGVSGLTKLVT-DGKKWTHTLLDDTVEPVLNG---SVDA-NGRPLFVESTYESLTTPF 212 (397) T ss_pred --cccccc---eeeecccchhHHHHHHHHhhhh-cccCCCEEEEcHHHHHHHHH---hhcc-CCceeecccccccccccc Confidence 000000 0111223334455555555544 46788999999999988873 2221 00000000000 00 0 Q ss_pred HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccc----cCCcCCccccccc Q lcl|NC_015466. 236 TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLV----GSGNEGMRIKRFY 311 (344) Q Consensus 236 ~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~----g~~~~~~~~~~~~ 311 (344) .-..++|+| |.+.+.. . .+...-+.++..-+++... -|.+++...+. +....+. .+. T Consensus 213 ~~~tl~G~P-v~~s~~~-----~--~g~~~~~~gDfs~~~i~~~--------~~i~i~~~~e~~~~~~~~~~~~---~~~ 273 (397) T protein:vir:23 213 REGRILGRP-TILSDHV-----A--EGDVVGYAGDFSQIIWGQV--------GGLSFDVTDQATLNLGSQESPN---FVS 273 (397) T ss_pred cCceeeeee-EEEeCCC-----C--CCceEEEEeecceEEEEEE--------eceEEEEeeeeeeeeccccccc---eee Confidence 012345655 2222110 0 0111111122111111100 11222221110 0000000 000 Q ss_pred CCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
312 LDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 312 ~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) -=......+|+.+.++-.+.-+++-..++.... T Consensus 274 lf~~d~v~~ra~~r~d~~v~~~~a~~~~~~~~~ 306 (397) T protein:vir:23 274 LWQHNLVAVRVEAEYGLLINDVNAFVKLTFDPV 306 (397) T ss_pred eeeccceeEEEEeeeccceecccceEEEeeccc Confidence 111234556777777777777766655554433 No 64 >protein:vir:81227 Length: 413 # NCBI annotation: gp6, major capsid protein # Family: family:all:585 # MgeID: mge:1893 # MgeName: BFK20 # Cross-refs: genbank:acc:YP_001456736;genbank:gi:157168379;hssp:P49861;interpro:IPR006444;uniprot:Q9MBJ9;genbank:GeneID:5580350 Probab=88.93 E-value=0.029 Score=28.99 Aligned_cols=278 Identities=8% Similarity=-0.012 Sum_probs=108.9 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccccccee-ccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTY-EIG 79 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~-~~~ 79 (344) +...........+-+.+.+--+.-... ..+-..+++.+|+.....+|+......-......-..-|........ .|+ T Consensus 118 ~~~~~~~~~~~~vp~~~~~~ii~~~~~--~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~a~~v~Eg~~~~~~~~~~f~ 195 (413) T protein:vir:81 118 STATLTDEFQGGYGTTWNRNIIYRRRE--KLVVADLMDNLTMTNTTIKYLMEKANRVVEGGFKTVAEGGKKPYMRFADFD 195 (413) T ss_pred hhcccccccccccchhhHHHHHHHHhh--hhhHHhhcceeeccCCceeEEEeccccccccccceecCcccccccCcccce Confidence 111111111111111111100000000 11223456777887777777764321111000000111222222221 344 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..++..+..+.-..+.++..++.. .++......+.+.+....| ..++++ .+......|+.. .. 
T Consensus 196 ~i~~~~~k~~~~~~iS~ell~ds~---~l~~~i~~~la~~~~~~~d----~~~l~G----~G~~~~~~Gi~~----~~-- 258 (413) T protein:vir:81 196 IVTESLSKIAGLTKITDEMIEDYD---FLVSYINARLLEELAIEEE----RQLLLG----DGTGNNLTGLLK----RD-- 258 (413) T ss_pred eeEeeeeeEEEeehhhHHHHHHHH---HHHHHHHHHHHHHHHHHHH----HHHhcc----CCCCCccccccc----cc-- Confidence 444433333333344444444331 1333333333333333222 223322 111111112111 00 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH----- Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL----- 234 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~----- 234 (344) ...+. ....+.+.+.+|...+..+....+++|+.++|++..|.+|+ +++. .+. .-+... T Consensus 259 ~~~~~-------~~~~~~~~~~~i~~~~~~~~~~~~~~~~~~vmn~~~~~~l~---~lkd----~~G--~~l~~~~~~~~ 322 (413) T protein:vir:81 259 GIQTL-------AVSNKDELADSIYKAMTNISLATPFQADALVINPLDYQELR---LAKD----ANG--QYYGGGVFQGQ 322 (413) T ss_pred ccccc-------cccccchhHHHHHHHHHHhhhhccCCCcEEEEcHHHHHHHH---Hhhc----cCC--ceecccccccc Confidence 01111 11123456777778877777788999999999999999886 2221 110 000000 Q ss_pred ------HHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCccc Q lcl|NC_015466. 235 ------VTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRI 307 (344) Q Consensus 235 ------~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~ 307 (344) .--..+||+| |++.+.. ..+.++ +|+.+.+|.. ...++ ...+ T Consensus 323 ~~~~~~~~~~~l~G~p-v~~s~~~---------------~~~~~~---------~gd~~~~~~~~~~~~~------~v~~ 371 (413) T protein:vir:81 323 YGSGGIMLDPAPWGLR-TVQSQVV---------------PVGKPV---------VGAFRSAASVLRKGGV------RIDS 371 (413) T ss_pred ccccccccCceeccee-eEEcCCC---------------CcccEE---------EEecccEEEEEEecce------EEEE Confidence 0012355665 3222111 011111 1222222221 11111 0001 Q ss_pred ccccC--CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
308 KRFYL--DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 308 ~~~~~--~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .++.. -......+|+.+.++-.+.-+++-..++-+-| T Consensus 372 ~~~~~~~~~~~~~~~r~~~r~d~~~~~~~a~~~l~~~~~ 410 (413) T protein:vir:81 372 TNTNVDDFENNLITVRAEERVGLMVTFPEAIVQLDVAEV 410 (413) T ss_pred eccccchhhcCcEEEEEEEeeccEEecccceEEEEecCC Confidence 11111 12355678888888888888887777665444 No 65 >protein:vir:94070 Length: 339 # NCBI annotation: putative structural protein # Family: family:all:1653 # MgeID: mge:1493 # MgeName: OP2 # Cross-refs: genbank:acc:YP_453625;genbank:gi:84662661;genbank:GeneID:5142580 Probab=88.60 E-value=0.031 Score=28.83 Aligned_cols=279 Identities=10% Similarity=0.047 Sum_probs=101.1 Q ss_pred CCCCCCCCccceecccccce-eeeeEcCcchhhhhhhCcccccCCccceeeeechhh----cccccccccccCcccccce Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNI-SIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGD----FNRDEMQERTPGTESAGGT 75 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~i-A~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~----~~~~~~~~ra~g~~~~~~~ 75 (344) +|.|.+..+.++.--.|+-| ..=|+-.-.++.++.|||...++....+..+|..-+ ...+.+ +.+..... T Consensus 43 ~~~~~~~~~~~i~a~~~~~i~~~vy~~~~~~~~~~~l~pv~t~g~w~~~t~~y~~~e~~G~a~~ygd-----~ad~Pl~~ 117 (339) T protein:vir:94 43 TPTLQTTANAGIPAWMTTFVDRRVIDIQLAPMAAAKIFPEVKKGDWTTTYGVFIIAEPVGQVATYSD-----WSANGMSK 117 (339) T ss_pred ccccccccccchhhhhhhhhchhheeecccccchhhhcccccCCCCcccEEEEeeeecccceEEccc-----ccCCCccc Confidence 44444433333211122222 112433334588999999987765444444543211 111110 11111111 Q ss_pred ecccccccccccccccccccHHHHH-hccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccc Q lcl|NC_015466. 76 YEIGNDTYFARTRAYHRDVPEQVRA-NADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPT 154 (344) Q Consensus 76 ~~~~~~~~~~~~~~l~~~v~~~~~~-~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~ 154 (344) +............+....+...+.+ .+....++..+..+-..+. .| +.+|.-.+-.... .+. .+.. 
T Consensus 118 ~~v~~~~~~v~~~~~g~~y~~~E~~~A~~~g~~l~~~Ka~aA~~a----l~-----~~~N~i~~~Gd~~---~~~-~GLl 184 (339) T protein:vir:94 118 ANVNFESRQNYRYQTWTEYGDLEMATYGEAGIDYVARQEISASLV----MA-----KFANSSYLLGVAG---IAN-YGLM 184 (339) T ss_pred ccceeeEEeEEEEEEEEeecHHHHHHHHhhCCChHHHHHHHHHHH----HH-----HhhceEEeeeecc---cce-EEEE Confidence 1111111122222222223333222 2223334333221111111 11 1122211111000 011 1122 Q ss_pred cccccceeecccccccccCCCCCCh-HHHHHHHHHHHHHhcC-----CCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc Q lcl|NC_015466. 155 APASFDPTNASNNDKLHWSDASSTP-IEDIRQGKRYVLEETG-----FEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG 228 (344) Q Consensus 155 ~~~~~~k~tl~~t~~~~Wsd~~SDP-i~di~~~~~~i~~~~G-----~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~ 228 (344) +.+.-...+..+ .+|...+.+. +.||.+...++...+| -.|.+++|....+..|.+ .+. T Consensus 185 N~P~l~~~v~~s---~~Wa~kT~~eI~~Di~~~~~~l~~~s~g~~~~~~~~~L~LP~~~~~~L~~----------~n~-- 249 (339) T protein:vir:94 185 NDPSLPAPVAAT---VNWATAAPEDIANDVVAMVGRLISQSGGLITGQERMVMALAPSALNNVNR----------TNN-- 249 (339) T ss_pred eCCCccccccCC---CCcccCCHHHHHHHHHHHHHHHHHhcCCeeeeccCcEEEecHHHHHhccc----------CCc-- Confidence 222211111112 3698776555 8999999999988886 257799999999987752 111 Q ss_pred ccccC-HHHHHHHhCCCeEEEEEEE-EeccccCCCCccceeCCCceEEEEec----CCCcccccccccceeecccccCCc Q lcl|NC_015466. 229 AAKAN-LVTLADLFEVDKVLVMKAV-RNTAKKGQTASHSFIGGKHALLSYAP----ATPGIMTPSAGYTFNWTGLVGSGN 302 (344) Q Consensus 229 ~~~vt-~~~la~~~gl~~I~v~~a~-yn~~~~~~~~~~~~iw~~~~~l~~~~----~~~~~~~~s~G~T~~~~~~~g~~~ 302 (344) .+ +| .+.|++-+ |.+.+-.+. +.++ ++. ...+++.. +...+.-|. .++.+..... T Consensus 250 ~~-~Tvl~~lk~n~--pnl~i~~~~el~~a----~g~-------~~~~~~~~~~~~~~~~~~~p~-----~~~~lpvq~~ 310 (339) T protein:vir:94 250 FG-LSAGAKIAQTY--PNIQFVAVPEFDTA----SGR-------LVQLWVPEVNGQPTGEVAFAE-----KLRSHSIERY 310 (339) T ss_pred CC-ccHHHHHHHhc--CCcEEEEccccccC----CCc-------eEEEEEEeccCCcceEEEcch-----hhhccccEEc Confidence 12 34 45666654 333332211 1111 111 12222211 011111110 0010000000 Q ss_pred CCcccccccCCCCceEEEeeccccceeeeccccch Q lcl|NC_015466. 
303 EGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGY 337 (344) Q Consensus 303 ~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~ 337 (344) .......+...-.|..++| |.-++.-.|. T Consensus 311 ~~~~~v~~~~rt~Gv~i~~------P~ai~~~~GI 339 (339) T protein:vir:94 311 STTTRQKHSGATFGAVIYQ------PWAVTQELGV 339 (339) T ss_pred CceEEecceeeeeeEEEEc------cceeeeeecC Confidence 1112222222222222222 2222222232 No 66 >protein:vir:9759 Length: 303 # NCBI annotation: putative structural protein # Family: family:all:966 # MgeID: mge:175 # MgeName: 315.3 # Cross-refs: genbank:acc:NP_795521;genbank:gi:28876283;genbank:GeneID:1257824 Probab=87.83 E-value=0.036 Score=28.49 Aligned_cols=292 Identities=10% Similarity=-0.040 Sum_probs=123.4 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) |...+.+ .+.+.+.+.+--+..-.+ ..+-..+++.+|+.....+++++..+.-.. . .+-|......+..|+. T Consensus 1 m~t~t~g--g~liP~~~~~~ii~~l~~--~s~i~~l~~~~~~~~~~~~ip~~~~~~~a~-w---v~E~~~~~~s~~~f~~ 72 (303) T protein:vir:97 1 MGTETSK--ASLFDKHLVSDLINKVKG--HSSLAKLSSQKPIPFNGSKEFTFTLDSDID-V---VAENGKKTHGGLSLEP 72 (303) T ss_pred CcccCCC--CeEcchhHHHHHHHHHHh--hchhhhhcceeecCCCceEEEEEecCcceE-E---eecCccccccccceee Confidence 7744333 344555554322333222 234556678888887777888764322111 1 1123333333334443 Q ss_pred cccccccccccccccH--HHHHh-ccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPE--QVRAN-ADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPA 157 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~--~~~~~-a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~ 157 (344) . .++.+.+...++. +...+ ....+++++...+.+.+.+....| ..++++.-.. .+..... 
.+ T Consensus 73 v--~l~~~kl~~~~~iS~ell~~~~d~~~~l~~~i~~~la~a~~~~ld----~a~l~G~~~~-------~g~~~~~--~~ 137 (303) T protein:vir:97 73 V--TIVPIKVEYGARLSDEFLYATEEEKIDILKAFNEGFAKKLARGID----LMAMHGINPR-------TKKASDV--IG 137 (303) T ss_pred E--EeeeEEEEEeehhhHHHhhcCccchHHHHHHHHHHHHHHHHHHHH----hhhhcccccC-------Ccccccc--cc Confidence 3 3344444444443 33322 233444555444444444443333 3333321000 0000000 00 Q ss_pred ccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH-- Q lcl|NC_015466. 158 SFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV-- 235 (344) Q Consensus 158 ~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~-- 235 (344) ......+++.. .....+.++..||.+....+.. .+..|+..+|+++.+.+|+. ++. .+. ..+..++ T Consensus 138 ~~~~~~~~~~~--~~~~~~~~~~~~i~~~~~~~~~-~~~~~~~~vmn~~~~~~L~~---lkd----~~g--~~~~~~~~~ 205 (303) T protein:vir:97 138 TNHFDSKVTQV--VKFTESEDADANIEAAVNLIQG-AEGVVTGLAMDTEFSTALAK---VTN----GEM--GPKMYPELA 205 (303) T ss_pred ccccccccccc--cccccccchHHHHHHHHHHHhh-cCCCccEEEEcHHHHHHHHH---hhc----cCC--CeEEecCcc Confidence 00101111110 1222455788999999887754 58999999999999988862 222 111 1111111 Q ss_pred ---HHHHHhCCCeEEEEEEEEeccccCC-CCccceeCCCce--EEEEecCCCcccccccccceeecccccCCcCCccccc Q lcl|NC_015466. 236 ---TLADLFEVDKVLVMKAVRNTAKKGQ-TASHSFIGGKHA--LLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKR 309 (344) Q Consensus 236 ---~la~~~gl~~I~v~~a~yn~~~~~~-~~~~~~iw~~~~--~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~ 309 (344) .--.++|+| |.+.+..= ...+. .....-+.|+.. +.+... -+.+++.... +.. .+..+.. T Consensus 206 ~~~~~~~l~G~P-v~~s~~v~--~~~~~~~~~~~~~~Gdf~~~~~~~~~---------~~~~~~~~~~-~~~-d~~~~~~ 271 (303) T protein:vir:97 206 WGANPDSINGLK-SSVNTTVG--AGADEAESKDLVIIGDFESMFKWGYA---------KQIPMEIIKY-GDP-DNSGKDL 271 (303) T ss_pred CCCCCceeccee-eEEecccC--CccccCCCccEEEEeeccccEEEEEe---------cCcEEEEeec-cCC-CCcchhh Confidence 012477877 43322210 00000 011111223221 111110 1122222210 000 0000111 Q ss_pred ccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
310 FYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 310 ~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) + ......+|+.+.++-.+.-+++=..|+++=- T Consensus 272 ~---~~n~~~~r~~~r~~~~v~~p~af~~l~~~~~ 303 (303) T protein:vir:97 272 K---GYNQIYLRAEAYIGWGILDAKSFARVTKGEV 303 (303) T ss_pred h---hcCcEEEEEEEEeccEeecccceEEeeCCCC Confidence 1 1234567888888888887776666666544 No 67 >protein:vir:100135 Length: 418 # NCBI annotation: gp5 # Family: family:all:585 # MgeID: mge:1639 # MgeName: phi1026b # Cross-refs: genbank:acc:NP_945035;genbank:gi:38707895;genbank:GeneID:2744182 Probab=87.14 E-value=0.041 Score=28.20 Aligned_cols=273 Identities=12% Similarity=0.023 Sum_probs=108.2 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeech-hhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYER-GDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k-~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) +.....+...++.....+.|-...+.. .+-..+++.+|+.....+++.... ..... . ..-|.........|+ T Consensus 136 ~~~~~~~~g~lvp~~~~~~ii~~~~~~---~~l~~~~~~~~~~~~~~~~~~~~~~~~~a~-~---v~E~~~~~~~~~~f~ 208 (418) T protein:vir:10 136 VGSGVSGSNSLVVADRQAGIIAPPQRK---MTIRDLLMPGQTSSSSIEYTVETGFTNNAA-A---VAEGAQKPTSDLKFN 208 (418) T ss_pred ccCCCCCCccccchhHHHHHHHHHhhh---hhHHhhcceeeccCCceeEEEEecCCCcee-e---eccCcccccccccee Confidence 111111111122222222222212111 223345677788766666766422 11110 0 112333333334454 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) .....++..+...+++.+..+.. .+++....+.+.+.+....| ..++++ . |.+..+.+..+. 
T Consensus 209 ~v~~~~~k~~~~~~is~ell~ds---~~l~~~i~~~l~~a~~~~~d----~a~l~G----~-------g~~~~p~Gi~~~ 270 (418) T protein:vir:10 209 LKNQPVRTIAHLFKASRQILDDA---PALQSYIDGRARYGLQLTEE----GQILKG----D-------GTGANILGILPQ 270 (418) T ss_pred eEEEeeeeEEEeehhhHHHHHhH---HHHHHHHHHHHHHHHHHHHH----HHHhcc----C-------CCCccccccccc Confidence 44444444333344555544432 13444333334444333222 222322 1 111111111000 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccC---HHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKAN---LVT 236 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt---~~~ 236 (344) ++......+..+.+.+.+|.+++..+. ..+..++.++|++..|..|+. ++ ..+ +.-+.. ... T Consensus 271 -----~~~~~~~~~~~~~~~~~~i~~~~~~~~-~~~~~~~~~v~n~~~~~~L~~---lk----d~~--G~~i~~~~~~~~ 335 (418) T protein:vir:10 271 -----ASAFMPSITLANATPIDKIRLALLQAV-LAEFPATGIVLNPIDWASIEL---TK----DSQ--GRYIVGNPVNGT 335 (418) T ss_pred -----cccccccccccccccHHHHHHHHHhhc-cccCCCCEEEEcHHHHHHHHH---hh----cCC--CceeccccccCC Confidence 011111234456677889988887764 457788899999999988862 22 211 111111 011 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccCC-- Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYLD-- 313 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~~-- 313 (344) -..++|+| |++... +..+.++ +|+.+.++.. .+.++ ...+.++... T Consensus 336 ~~~l~G~p-V~~~~~---------------~p~~~~~---------~gd~s~~~~~~~~~~~------~i~~~~~~~~~f 384 (418) T protein:vir:10 336 TPRLWNLP-VVETQA---------------MTANEFL---------VGAFSMAAQIFDRMEI------EVLLSTENVDDF 384 (418) T ss_pred Cceeccee-eEEcCC---------------CCCCcEE---------EeeccceEEEEEecce------EEEEecccchhh Confidence 23456665 322111 1111111 1222222221 11111 0001111111 Q ss_pred CCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
314 AIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 314 ~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ......+|+...++-.+.-+++-.+++-.-+ T Consensus 385 ~~~~~~~r~~~~~d~~~~~~~a~~~~~~~~~ 415 (418) T protein:vir:10 385 EKNMVSIRAEERLALAVYRPESFVTGALVEQ 415 (418) T ss_pred hcCceEEEEEEeeccEEecccceEEEEeccC Confidence 1345566666666666666666555444433 No 68 >protein:vir:80376 Length: 435 # NCBI annotation: gp6, major capsid head protein # Family: family:all:21 # MgeID: mge:1881 # MgeName: phi644-2 # Cross-refs: genbank:acc:YP_001111085;genbank:gi:134288639;genbank:GeneID:4960624 Probab=86.83 E-value=0.043 Score=28.09 Aligned_cols=297 Identities=10% Similarity=0.002 Sum_probs=112.1 Q ss_pred CCCCCCCCcccee-cccccceeeeeEcCcchhhhhhh-CcccccCCccceeeeechhhcccccccccccCcccccceecc Q lcl|NC_015466. 1 MPFTQPSRSDVHV-NRPLTNISIGYVQDASHFVAGQV-FPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~-dp~LT~iA~~Y~n~~~~~ia~~l-fP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~ 78 (344) +..+.+....+.| +...+.|-..-++. .+-..+ +-.+|+.....+++++....-.. . ..-|.........| T Consensus 132 ~~~~~~~~gg~lvP~~~~~~ii~~l~~~---~~i~~~~~~~v~~~~~~~~~p~~~~~~~a~-~---v~E~~~~~~~~~~f 204 (435) T protein:vir:80 132 LNTLSPGAGGVLVPENLSSEVIELLRPK---SVVRKLGARTLPLSNGNITIPRLKGGAIVG-Y---IGADTDIPTTQQQF 204 (435) T ss_pred hcccCCCCCccccchhHHHHHHHHHhhh---chhhhccceeeecCCCceEEEEEeCCccee-e---eccCccccccccce Confidence 1111111111111 11111111101111 112223 22345555556777664322111 0 11222233333445 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..++.....+....+..+..+++.-.+++++...+.+.+.+.+..|. .++++ .+......|.-... .. 
T Consensus 205 ~~i~~~~~k~~~~~~is~ell~ds~~~~~l~~~i~~~l~~a~~~~~d~----a~l~G----~G~~~~p~Gi~~~~---~~ 273 (435) T protein:vir:80 205 DDLKLTAKKMAALVPIANDLIKYAGVNPNVDQIVVGDLTAAIGAREDK----AFIRD----DGTANTPKGLRFWA---LP 273 (435) T ss_pred eeEEEeeEEEEEeehhhHHHHHhhcccHHHHHHHHHHHHHHHHHHHHH----Hhhcc----CCCCCcccceeecc---cc Confidence 554444444333444445555544333445554444555544443332 23332 11111111111000 00 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhc-CCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHH Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEET-GFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTL 237 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~-G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~l 237 (344) .+..+.+. .....++..|+.++...+.... ++.+...+|++..|.+|+. ++..+ + .-+.+...- T Consensus 274 ~~~~~~~~------~~~~~~~~~d~~~~~~~~~~~~~~~~~~~~vmn~~~~~~L~~-------lkd~~-G-~~l~~~~~~ 338 (435) T protein:vir:80 274 GNVITASD------GSTLQKIETDLGKAILALENADANLTQPGWIMAPRTFRFLEG-------LRDGN-G-NKVYPELAN 338 (435) T ss_pred cceeeccc------ccchhhHHHHHHHHHHHhhccccccccCEEEEcHHHHHHHHh-------hhccC-C-ceeccCCCC Confidence 11111111 1122355667878777665443 4567889999999988852 22221 1 111111111 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccce-eCCCceEEEEecCCCcccccccccceeeccccc-CCcCCcccccccCCCC Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSF-IGGKHALLSYAPATPGIMTPSAGYTFNWTGLVG-SGNEGMRIKRFYLDAI 315 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~-iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g-~~~~~~~~~~~~~~~~ 315 (344) ..++|+| |++.+..= ...+..+.... +.++.--++. ++ .-|+++....+-+ ....+..+..|. . T Consensus 339 ~~l~G~p-v~~~~~~p--~~~~~~~~~~~i~~gd~s~~~i-------~~-~~~~~i~~~~~~~~~~~~~~~~~~f~---~ 404 (435) T protein:vir:80 339 GMLKGYP-VGKTTQVP--INLGEAGKESEIYFTDFGDVFI-------GE-EETLEIDYSKEATYKDADGHMVSAFQ---R 404 (435) T ss_pred CeEeeee-eEEecccc--ccccCCCCcceEEEEEcccEEE-------Ee-ecceEEEEeccccccccccchhhhhh---c Confidence 2466776 33332220 11111111111 1121100000 00 0112222221100 000111111122 2 Q ss_pred ceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
316 ESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ....+|+.+.++-.+.-+++-..|+++-= T Consensus 405 n~~~~r~~~r~d~~~~~~~a~~~l~~~~~ 433 (435) T protein:vir:80 405 DQTLIRVIAKNDFGPRHVESIAVLSGVAW 433 (435) T ss_pred CcceeeeeeeeCcEeecccceEEEeccCC Confidence 34678888888888888888777765543 No 69 >protein:vir:4600 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:101 # MgeName: PVL # Cross-refs: genbank:acc:NP_058445;genbank:gi:9635171;genbank:GeneID:1262708 Probab=86.55 E-value=0.045 Score=27.98 Aligned_cols=281 Identities=10% Similarity=0.062 Sum_probs=110.1 Q ss_pred CCCCC-CCCccceecccccceeeeeEcCcc-hhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eec Q lcl|NC_015466. 1 MPFTQ-PSRSDVHVNRPLTNISIGYVQDAS-HFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYE 77 (344) Q Consensus 1 m~~~~-~~~~~~~~dp~LT~iA~~Y~n~~~-~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~ 77 (344) +.... +......+- +.++..+..... ...-..++..+|+....++++.......... .-..-|...... ... T Consensus 120 ~~~~~~t~~g~~~iP---~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~v~Eg~~~~~~~~~~ 194 (415) T protein:vir:46 120 QGGSLKTDSGFVVIP---EEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAAL--EKVEELEENPELAVKP 194 (415) T ss_pred hhccccccCCccccc---HHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEEecCCcce--eecccccccccccccc Confidence 11111 111111111 111111111000 1122233456666666666665421111100 011122222222 123 Q ss_pred ccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccc Q lcl|NC_015466. 78 IGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPA 157 (344) Q Consensus 78 ~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~ 157 (344) |+...+..+..+.-..+.++..++ +.++++......+.+.+....| ..++++.-.+..... 
+ T Consensus 195 ~~~v~~~~~k~~~~~~iS~ell~d--s~~~l~~~i~~~l~~~i~~~~d----~~il~g~g~g~~~~~---~--------- 256 (415) T protein:vir:46 195 FFQLAYDINTHRGYFRISREAIED--AKVNVLQELKLWMARTIAATRN----KAIIDVITKGSTGST---S--------- 256 (415) T ss_pred eeeEEeeeeeeEeeehhhHHHHhh--chHHHHHHHHHHHHHHHHHHHH----HHHhhccccCCcccc---c--------- Confidence 444444444333333444444443 2345555544555555444333 223322111100000 0 Q ss_pred ccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHH Q lcl|NC_015466. 158 SFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTL 237 (344) Q Consensus 158 ~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~l 237 (344) .........+...+.+.+.+|.+.+..+... ++.++..+|+++.|.+|+. ++.. .+...-.+ .++-..- T Consensus 257 -----~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~-~~~~~~~v~n~~~~~~L~~---lkd~-~G~~i~~~-~~~~~~~ 325 (415) T protein:vir:46 257 -----SGFEKEGKKLEVKKAKSLDDIKDAINLNVKP-NYEHNVAIVSQTMFAKLDK---MKDK-LGNYLIQP-DVKEKTQ 325 (415) T ss_pred -----cccccccceeccccccchHHHHHHHHhhhhh-ccCCCEEEEcHHHHHHHHH---hhcc-CCCeeecc-CcCCCCC Confidence 0000001124446678888888888877654 6789999999999998862 3221 00000000 0111112 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccc-eeecccccCCcCCcccccccCCCCc Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYT-FNWTGLVGSGNEGMRIKRFYLDAIE 316 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T-~~~~~~~g~~~~~~~~~~~~~~~~~ 316 (344) ..++|+| |++.+..- .+..+...-+++ +.+.+|. +...++ .++. ...... T Consensus 326 ~~l~G~p-V~~~~~~~----~~~~~~~~~~~g---------------d~~~~~~~~~~~~~--------~v~~-~~~~~~ 376 (415) T protein:vir:46 326 QRLLGAK-IEILPDEV----LGQKGNNTLIIG---------------NLKDAIVLFDRSQY--------QASW-TDYMHF 376 (415) T ss_pred cccccee-eEEecccc----ccCCCccEEEEE---------------ehhccEEEEeecce--------EEEe-eccccC Confidence 3567877 43332211 011111111222 2221222 111111 1110 111122 Q ss_pred eEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
317 SDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 317 ~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ...+|+...++-.+.-+++-.+++-.-+ T Consensus 377 ~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 404 (415) T protein:vir:46 377 GECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred ceEEEEEEEeccEEeccccEEEEEeecc Confidence 3446777777888887777776654433 No 70 >protein:vir:4700 Length: 415 # NCBI annotation: phi PVL ORF 7 homologue # Family: family:all:21 # MgeID: mge:102 # MgeName: phiPV83 # Cross-refs: genbank:acc:NP_061632;genbank:gi:9635719;genbank:GeneID:1262976 Probab=86.55 E-value=0.045 Score=27.98 Aligned_cols=281 Identities=10% Similarity=0.062 Sum_probs=110.1 Q ss_pred CCCCC-CCCccceecccccceeeeeEcCcc-hhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eec Q lcl|NC_015466. 1 MPFTQ-PSRSDVHVNRPLTNISIGYVQDAS-HFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYE 77 (344) Q Consensus 1 m~~~~-~~~~~~~~dp~LT~iA~~Y~n~~~-~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~ 77 (344) +.... +......+- +.++..+..... ...-..++..+|+....++++.......... .-..-|...... ... T Consensus 120 ~~~~~~t~~g~~~iP---~~~~~~ii~~~~~~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~v~Eg~~~~~~~~~~ 194 (415) T protein:vir:47 120 QGGSLKTDSGFVVIP---EEIVTDILKLKEVEFNLDKYVTVKRVTNGSGKYPVVRQSEVAAL--EKVEELEENPELAVKP 194 (415) T ss_pred hhccccccCCccccc---HHHHHHHHHHHHhhhhhhhhcceeeccCCceeEEEEEecCCcce--eecccccccccccccc Confidence 11111 111111111 111111111000 1122233456666666666665421111100 011122222222 123 Q ss_pred ccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccc Q lcl|NC_015466. 78 IGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPA 157 (344) Q Consensus 78 ~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~ 157 (344) |+...+..+..+.-..+.++..++ +.++++......+.+.+....| ..++++.-.+..... 
+ T Consensus 195 ~~~v~~~~~k~~~~~~iS~ell~d--s~~~l~~~i~~~l~~~i~~~~d----~~il~g~g~g~~~~~---~--------- 256 (415) T protein:vir:47 195 FFQLAYDINTHRGYFRISREAIED--AKVNVLQELKLWMARTIAATRN----KAIIDVITKGSTGST---S--------- 256 (415) T ss_pred eeeEEeeeeeeEeeehhhHHHHhh--chHHHHHHHHHHHHHHHHHHHH----HHHhhccccCCcccc---c--------- Confidence 444444444333333444444443 2345555544555555444333 223322111100000 0 Q ss_pred ccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHH Q lcl|NC_015466. 158 SFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTL 237 (344) Q Consensus 158 ~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~l 237 (344) .........+...+.+.+.+|.+.+..+... ++.++..+|+++.|.+|+. ++.. .+...-.+ .++-..- T Consensus 257 -----~~~~~~~~~~~~~~~~~~~~i~~~~~~~~~~-~~~~~~~v~n~~~~~~L~~---lkd~-~G~~i~~~-~~~~~~~ 325 (415) T protein:vir:47 257 -----SGFEKEGKKLEVKKAKSLDDIKDAINLNVKP-NYEHNVAIVSQTMFAKLDK---MKDK-LGNYLIQP-DVKEKTQ 325 (415) T ss_pred -----cccccccceeccccccchHHHHHHHHhhhhh-ccCCCEEEEcHHHHHHHHH---hhcc-CCCeeecc-CcCCCCC Confidence 0000001124446678888888888877654 6789999999999998862 3221 00000000 0111112 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccc-eeecccccCCcCCcccccccCCCCc Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYT-FNWTGLVGSGNEGMRIKRFYLDAIE 316 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T-~~~~~~~g~~~~~~~~~~~~~~~~~ 316 (344) ..++|+| |++.+..- .+..+...-+++ +.+.+|. +...++ .++. ...... T Consensus 326 ~~l~G~p-V~~~~~~~----~~~~~~~~~~~g---------------d~~~~~~~~~~~~~--------~v~~-~~~~~~ 376 (415) T protein:vir:47 326 QRLLGAK-IEILPDEV----LGQKGNNTLIIG---------------NLKDAIVLFDRSQY--------QASW-TDYMHF 376 (415) T ss_pred cccccee-eEEecccc----ccCCCccEEEEE---------------ehhccEEEEeecce--------EEEe-eccccC Confidence 3567877 43332211 011111111222 2221222 111111 1110 111122 Q ss_pred eEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
317 SDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 317 ~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ...+|+...++-.+.-+++-.+++-.-+ T Consensus 377 ~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 404 (415) T protein:vir:47 377 GECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred ceEEEEEEEeccEEeccccEEEEEeecc Confidence 3446777777888887777776654433 No 71 >protein:vir:80068 Length: 301 # NCBI annotation: gp8 # Family: family:all:463 # MgeID: mge:1876 # MgeName: B054 # Cross-refs: genbank:acc:YP_001468712;genbank:gi:157325292;genbank:GeneID:5601759 Probab=85.94 E-value=0.049 Score=27.76 Aligned_cols=287 Identities=10% Similarity=0.057 Sum_probs=99.0 Q ss_pred CCCCCCCCccc------eecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccc--- Q lcl|NC_015466. 1 MPFTQPSRSDV------HVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTES--- 71 (344) Q Consensus 1 m~~~~~~~~~~------~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~--- 71 (344) |=+-.. .+| .+||.+-..+ ..++.++.+||.............|...+..-.. ...+... T Consensus 1 ~~~~~~--g~f~~~~l~~id~~v~e~~------~~~l~~r~l~~v~~~~~~~~~~~~~~~~~~~G~~---~~~~~~~~di 69 (301) T protein:vir:80 1 MQGKIT--ATIEARDLQAIDNVIYEPK------QEELTARSVFPQKFDVNEGAESYSFDVMTRSGAA---KIIANGADDL 69 (301) T ss_pred CCcccc--chhhHHHHHHHHHHHHHhh------hhhhhhhhhcccccCCCCceEEEEEeeeccceeE---EEecCccccc Confidence 221111 122 2444444433 1247788999875333344444444322211000 0011100 Q ss_pred ccceecccccccccccccccccccHHHHHhc-cCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccc Q lcl|NC_015466. 72 AGGTYEIGNDTYFARTRAYHRDVPEQVRANA-DNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVA 150 (344) Q Consensus 72 ~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a-~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~ 150 (344) ..+.+.++.....+..-+.+-.+-..+.+.+ ....+++.+........+ +...-+.+|.+- . 
.....|.- T Consensus 70 p~~~~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aa~~~~----~~~~n~~~f~G~----~-~~g~~GLl 140 (301) T protein:vir:80 70 PLVDVDMVRKSVPIYSIGIGLSYTIQDLRAARMQGTTVDAAKATTVRRAI----AEKENSIAFRGE----K-KYAIKGAF 140 (301) T ss_pred ccccccceeEEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHH----HHhhceEEeeec----c-cccceeee Confidence 1111222222112222111112222222222 234444443332222222 111122222221 1 11111111 Q ss_pred cccccccccceeecccccccccCCCCCCh-HHHHHHHHHHHHHhcC--CCcceEEeCHHHHHHHhcCHHHHHHhccCCCc Q lcl|NC_015466. 151 SSPTAPASFDPTNASNNDKLHWSDASSTP-IEDIRQGKRYVLEETG--FEPNVLTLGKAVYDALVDHPDIVGRIDRGQTS 227 (344) Q Consensus 151 ~~~~~~~~~~k~tl~~t~~~~Wsd~~SDP-i~di~~~~~~i~~~~G--~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~ 227 (344) ..+.-....+..+.++++ .+|.+.+.|- +.||.+...++...++ ..|++++|+++.+..|.+ .++. + + T Consensus 141 N~p~~~~~~~~~~~~~~~-~~w~~~t~~ei~~di~~~~~~l~~~s~g~~~p~~L~L~p~~~~~L~~-----~~~~--~-~ 211 (301) T protein:vir:80 141 EATGIQIDVSPTTGVGNV-SKWEKKTAEQIIDEIGEAHTKITVLPGYGTASLKLCLPPKQFELINK-----KRYS--N-E 211 (301) T ss_pred cCCCcccccccCcccccc-cccccCCHHHHHHHHHHHHHHHHHhcCceecccEEEecHHHHHhhhh-----cccc--C-C Confidence 111111111112222222 3687766554 8899999988866543 389999999999988851 1111 1 1 Q ss_pred cccccCHHHHHHHhCCCeEEEEEE-EEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcc Q lcl|NC_015466. 228 GAAKANLVTLADLFEVDKVLVMKA-VRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMR 306 (344) Q Consensus 228 ~~~~vt~~~la~~~gl~~I~v~~a-~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~ 306 (344) .+.--.++|++-+.--+| ..+ .+..+ + + -+.+.+++|... ++..+--++--|+..+. ....... 
T Consensus 212 -~~~tvl~~l~~~~~~~~I--~~~p~L~~~--g--~-----~g~~~~v~~~~~-~d~~~~~v~~~~~~~~~--e~~~~~~ 276 (301) T protein:vir:80 212 -DSRSVLKVLQDNAWFSAI--VRVPDLAGM--G--T-----AGSDSFAVIHDS-NETAELIIPMDITRHPE--EYSFPRT 276 (301) T ss_pred -CCeeHHHHHHHHcCcceE--EEcceeccC--C--C-----CcccEEEEEecC-CcEEEEEecCceeeecc--eecCcee Confidence 122224556543321112 111 11111 1 1 123455555432 21111111111211110 0000111 Q ss_pred cccccCCCCceEEEeeccccceeeeccccchhhhcc Q lcl|NC_015466. 307 IKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGI 342 (344) Q Consensus 307 ~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~ 342 (344) ...+...-.+..+. -|.+-+.+.+. T Consensus 277 ~~~~~~r~~Gv~i~-----------~P~ai~~~~GI 301 (301) T protein:vir:80 277 KVPFEERTAGVVVR-----------FPAAIVRVDGI 301 (301) T ss_pred EeeeeeeeEEEEEE-----------ccceEEEEecC Confidence 11111111122222 23333333333 No 72 >protein:vir:5255 Length: 304 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:117 # MgeName: Aaphi23 # Cross-refs: genbank:acc:NP_852760;genbank:gi:31544035;uniprot:Q7Y5U0;genbank:GeneID:2753552 Probab=83.84 E-value=0.054 Score=27.55 Aligned_cols=289 Identities=11% Similarity=0.019 Sum_probs=100.9 Q ss_pred CCCCCCCCccceecccccceeee-eEcCcchhhhhhhCcccc---cCCccceeeeech-hhcccccccccccCcccccce Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIG-YVQDASHFVAGQVFPQVS---VGKQSDAYFTYER-GDFNRDEMQERTPGTESAGGT 75 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~-Y~n~~~~~ia~~lfP~v~---v~~~~~~~~~~~k-~~~~~~~~~~ra~g~~~~~~~ 75 (344) |+-. +|-+. .|+.|-.- |...-.++.+.++||.-. ...+++.|..|+. +.......... ..+...+. 
T Consensus 1 ~~~l-----afl~~-qL~~id~~vye~~~~~~~~~~lipv~t~~~~~~~~~~~~~~d~~G~a~~~~i~~~--a~dip~vd 72 (304) T protein:vir:52 1 MSLL-----AYVKN-GLTAVSKDIAETKYPEIVFPQFVYVDQQTAVGITEKLHYGADEHGSLDDGLITVG--TSTLDQVE 72 (304) T ss_pred CchH-----HHHHH-HHHHHhhhhhccccccchhhhhccccCCCCcccceEEEeeeeccCcccccccCCc--CCccceee Confidence 4422 23221 33332211 333333588999999643 3333444444431 22211000000 01122333 Q ss_pred ecccccccccccccccccccHHHHHhcc-CCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccc Q lcl|NC_015466. 76 YEIGNDTYFARTRAYHRDVPEQVRANAD-NPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPT 154 (344) Q Consensus 76 ~~~~~~~~~~~~~~l~~~v~~~~~~~a~-~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~ 154 (344) +.++.....+..-+..-.+..++.+.+. ...+++.+..+-. .+..|..+-+..+ ++.... .|+ .+.. T Consensus 73 ~~~~~~~~~i~~~~~~~~y~~~El~~a~~~g~~l~~~ka~aa----~~a~~~~~n~v~~----~Gd~~~---~g~-~Gll 140 (304) T protein:vir:52 73 VGFTPTRSYIVPWAKSVTWTKPELEQGKLLGLALNTAKIMAL----NKNAQQTLQKVAF----LGHAKD---SRL-TGLL 140 (304) T ss_pred cccceeEEEEEEEeeeeeecHHHHHHHHHhCCCcHHHHHHHH----HHHHHhhhceEEE----Eeeccc---cce-EEEE Confidence 4444443344444444444444333332 2334443222111 1112211112211 221100 011 1112 Q ss_pred cccccceeecccc-cccccCCCCC-ChHHHHHHHHHHHHHhcC--CCcceEEeCHHHHHHHhcCHHHHHHhccCCCcccc Q lcl|NC_015466. 155 APASFDPTNASNN-DKLHWSDASS-TPIEDIRQGKRYVLEETG--FEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAA 230 (344) Q Consensus 155 ~~~~~~k~tl~~t-~~~~Wsd~~S-DPi~di~~~~~~i~~~~G--~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~ 230 (344) +.+.-...+.+++ ...+|-+.+. ..+.||.+...++...+| ..|++++|....+..|.. .+.. +. 
+ T Consensus 141 N~p~v~~~~~~~~~a~~~w~~~T~~eI~~di~~~~~~i~~~s~~~~~p~tl~Lpp~~~~~l~~-----~~~~--~~-~-- 210 (304) T protein:vir:52 141 NNKSVEVYAIKGAAQNTKVQAMDFDKAVAFFKEIFLKGMEKTKRIEAPNTFAIDSLDLAHLAL-----VQRA--NT-D-- 210 (304) T ss_pred eCCCcceeeecCCccCCccccCCHHHHHHHHHHHHHHHHhccCceecCceEEeCHHHHHHHhh-----ccCC--CC-C-- Confidence 2222222222211 1135877654 477899999999998888 589999999999988841 1111 11 1 Q ss_pred ccC-HHHHHHHh----CCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCc Q lcl|NC_015466. 231 KAN-LVTLADLF----EVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGM 305 (344) Q Consensus 231 ~vt-~~~la~~~----gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~ 305 (344) .| .+.|.+-. |.+ |-|-.........|.. +.+.+++|..+...+.-+. -.-|+.-+.. .+..-. T Consensus 211 -~Tvl~~l~~n~~~~~g~~-l~I~~v~~~~~~~g~~-------g~~r~vvY~~d~~~~~~~v-P~p~~~l~~q-~~~~~~ 279 (304) T protein:vir:52 211 -TTALEFLTKHLSAAAGRQ-VAIKALPSNYGTRVTD-------GKTRAMVYVNSKEHVIFDV-PMSPTVLDAQ-PKGLLA 279 (304) T ss_pred -chHHHHHHHhcccccCCc-ceEEEecccccccCCC-------CceEEEEEecChhheEEec-Cccccccchh-hcCCce Confidence 12 34444432 111 2221111111111211 1222233322111110000 0001100000 000000 Q ss_pred ccccccCCCCceEEEeeccccceeeecccc Q lcl|NC_015466. 306 RIKRFYLDAIESDRIEIDMSYDQKKVAADL 335 (344) Q Consensus 306 ~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~ 335 (344) +...|....++.++.+-...+ --|- T Consensus 280 ~~vp~~~r~gGv~v~~P~a~~-----y~D~ 304 (304) T protein:vir:52 280 FESGLRMAFGGVTFMEPDSAL-----YVDY 304 (304) T ss_pred EEecceeeeeeEEEEccceee-----eecC Confidence 111122222233322221111 0000 No 73 >protein:vir:2504 Length: 305 # NCBI annotation: major capsid subunit gp9 # Family: family:all:507 # MgeID: mge:53 # MgeName: TM4 # Cross-refs: genbank:acc:NP_569745;genbank:gi:18496895;genbank:GeneID:932268 Probab=81.93 E-value=0.081 Score=26.56 Aligned_cols=292 Identities=10% Similarity=0.025 Sum_probs=115.5 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccc--cccCcccccceecc Q lcl|NC_015466. 
1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQE--RTPGTESAGGTYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~--ra~g~~~~~~~~~~ 78 (344) |+.+++......+-+.+.+--+..-.. ..+-..+++.+|+.....++++.....-.. .+.+ ............+| T Consensus 1 ma~~t~~~gg~liP~~~~~~Ii~~~~~--~s~l~~l~~~~~~~~~~~~~p~~~~~~~a~-wv~E~~~~~~~~~~~s~~~f 77 (305) T protein:vir:25 1 MADISRAEVASLIQEAYSDTLLAAAKQ--GSTVLSAFQNVNMGTKTTHLPVLATLPEAD-WVGESATDPKGVKPTSKVTW 77 (305) T ss_pred CCCccCCccceecCHHHHHHHHHHHHh--hchhhhhcceeeccCCcEEEEEEeCCcceE-Eeecccccccccccccccce Confidence 998888776665554442211111111 122344567888876667777654321110 0111 00111111112233 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) ....+..+..+-...+..+..++ +.++.+..-.+.+.+.+....| ..+++ +.+.......... T Consensus 78 ~~i~~~~~k~~~~~~is~ell~d--s~~~~~~~i~~~l~~~~a~~~d----~a~~~----G~g~~~~~~~~~~------- 140 (305) T protein:vir:25 78 ANRTLVAEEIAVIIPVHENVIDD--ATVAVLTEVAELGGQAIGKKLD----QAVIF----GTDKPASWVSPAL------- 140 (305) T ss_pred eeEEeeeEEEEEeehhhHHHHhc--chHHHHHHHHHHHHHHHHHHHh----hhhee----ccCCCCCcccccc------- Confidence 33333322222223333444432 2233443333333333332222 22332 2111100000000 Q ss_pred cceeeccccccccc--CCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHH Q lcl|NC_015466. 
159 FDPTNASNNDKLHW--SDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVT 236 (344) Q Consensus 159 ~~k~tl~~t~~~~W--sd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~ 236 (344) .+.....+.....+ .+...|.+.++......+ ...+..+|..+|++..|..|++ + +..+ +.-+..++ T Consensus 141 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-~~~~~~~~~~v~~~~~~~~l~~---l----kd~~--G~~i~~~~- 209 (305) T protein:vir:25 141 IPAAVTAGQAVEVVGGVANESDIVGATNRAAKAV-ASAGWAPDTLLSSLALRYEVAN---I----RDAN--GNPVFRDD- 209 (305) T ss_pred ccccccccccccccccchhhhHHHHHHHHHHHhh-hhcccccceeEecHHHHHHHHH---h----hccC--CceeecCC- Confidence 00000101000011 123456666666666555 3457888999999999988852 2 2221 11222222 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeeccc--ccCCcCCcccccccCCC Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGL--VGSGNEGMRIKRFYLDA 314 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~--~g~~~~~~~~~~~~~~~ 314 (344) .++|+|-+ +.+..- .......-+.++.--+.+... -|.+++.... +... .. .+.-=. T Consensus 210 --~l~G~Pv~-~~~~~~-----~~~~~~~~~~gd~s~~~i~~~--------~~~~i~~~~~~~~~~~--~~---~~~~~~ 268 (305) T protein:vir:25 210 --SFAGFRTF-FNRNGA-----WDADAAIEVIADSSRVKIGVR--------QDITVKFLDQATLGTG--EN---QINLAE 268 (305) T ss_pred --cccccceE-EcCccC-----CCCCccEEEEEecceEEEEEe--------cCeEEEEeeeeeeecC--Cc---eeeeee Confidence 46788833 322210 000111111122111110000 0122211111 1110 00 011112 Q ss_pred CceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
315 IESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 315 ~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .....+|+...++-.+.-+.+-..++++-+ T Consensus 269 ~~~~~~R~~~r~~~~v~~p~a~v~~~~~~~ 298 (305) T protein:vir:25 269 RDMVALRLKARFAYVLGVSATAQGANKTPV 298 (305) T ss_pred cCcEEEEEEEeecceeeCcccEEEEccccc Confidence 245567777777777777777777777622 No 74 >protein:vir:9410 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:167 # MgeName: phi 13 # Cross-refs: genbank:acc:NP_803388;genbank:gi:29028700;genbank:GeneID:1258136 Probab=81.03 E-value=0.089 Score=26.33 Aligned_cols=276 Identities=13% Similarity=0.102 Sum_probs=106.5 Q ss_pred CC-CCCC--CCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-ee Q lcl|NC_015466. 1 MP-FTQP--SRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TY 76 (344) Q Consensus 1 m~-~~~~--~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~ 76 (344) +. .... +....+.......|-...++ ..+-..++..+||....++++.......... .-..-|...... .. T Consensus 119 ~~~~~~~~~~g~~~iP~~~~~~ii~~~~~---~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~--~~v~Eg~~~~~~~~~ 193 (415) T protein:vir:94 119 IQGGSLKTDSGFVVIPEEIVTDILKLKEV---EFNLDKYVTVKRVTNGSGKYPVVRQSEVAAL--EKVEELEENPELAVK 193 (415) T ss_pred hhhhccccccccccCcHHHHHHHHHHHHh---hhhhhhhcceeeccCCceeEEEEeecCCccc--eeccccccccccccc Confidence 11 0000 00001111111111111111 1223334555666655666554321110000 001112222211 12 Q ss_pred cccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccc Q lcl|NC_015466. 77 EIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAP 156 (344) Q Consensus 77 ~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~ 156 (344) .|+..++..+..+.-..+..+..+++ .++++......+.+.+....| ..++++.-.+...... 
T Consensus 194 ~~~~i~~~~~k~~~~~~is~ell~ds--~~~~~~~i~~~l~~~~~~~~~----~~il~g~g~g~~~~~~----------- 256 (415) T protein:vir:94 194 PFFQLAYDINTHRGYFRISREAIEDA--KVNVLQELKLWMARTIAATRN----KAIIDVITKGSTGSTS----------- 256 (415) T ss_pred cceeeEeeheeeeeechhhHHHHhhc--hHHHHHHHHHHHHHHHHHHHH----HHHhhccccCcccccc----------- Confidence 34444444444333334444444433 244444444444444443332 2223221111110000 Q ss_pred cccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH-- Q lcl|NC_015466. 157 ASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL-- 234 (344) Q Consensus 157 ~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~-- 234 (344) .........++..+...+.+|.+.+..+. ..++.++..+|+++.|.+|+. + +..+. . -+... T Consensus 257 ------~~~~~~~~~~~~~~~~~~~~i~~~~~~~~-~~~~~~~~~vmn~~~~~~l~~---l----kd~~G-~-~l~~~~~ 320 (415) T protein:vir:94 257 ------SGFEKEGKKLEVKKAKSLDDIKDAINLNV-KPNYEHNVAIVSQTMFAKLDK---M----KDKLG-N-YLIQPDV 320 (415) T ss_pred ------ccccccccccccccccchHHHHHHHHhhh-hhccCCCEEEEcHHHHHHHHH---h----hccCC-C-eeeccCc Confidence 00000011244456677888888888765 457889999999999998863 2 22211 1 11111 Q ss_pred --HHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCC--ceEEEEecCCCcccccccccceeecccccCCcCCcccccc Q lcl|NC_015466. 235 --VTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGK--HALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRF 310 (344) Q Consensus 235 --~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~--~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~ 310 (344) ..-..++|+| |++....- .+..++..-+.++ +.++++ + .-|.++.++. T Consensus 321 ~~~~~~~l~G~p-V~~~~~~~----~~~~~~~~i~~gd~~~~~~~~--------~-~~~~~v~~~~-------------- 372 (415) T protein:vir:94 321 KEKTQQRLLGAK-IEILPDEV----LGQKGNNTLIIGNLKDAIVLF--------D-RSQYQASWTD-------------- 372 (415) T ss_pred CCCCCceeccee-eEEecccc----cCCCCccEEEEEehhccEEEE--------e-ecceEEEEec-------------- Confidence 1123466776 43332211 0111111111111 001110 0 0112222221 Q ss_pred cCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
311 YLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 311 ~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .......+|+...++-.+.-+++-..++-.-+ T Consensus 373 --~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 404 (415) T protein:vir:94 373 --YMHFGECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred --cccCceEEEEEEEeccEEeccccEEEEEEecc Confidence 11223456777777777777777777654433 No 75 >protein:vir:8102 Length: 543 # NCBI annotation: gp6 # Family: family:all:21 # MgeID: mge:152 # MgeName: Che9c # Cross-refs: genbank:acc:NP_817683;genbank:gi:29566114;genbank:GeneID:1259308 Probab=80.39 E-value=0.095 Score=26.18 Aligned_cols=287 Identities=8% Similarity=-0.062 Sum_probs=107.8 Q ss_pred CC--CCCCCCccceeccccccee-eeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceec Q lcl|NC_015466. 1 MP--FTQPSRSDVHVNRPLTNIS-IGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYE 77 (344) Q Consensus 1 m~--~~~~~~~~~~~dp~LT~iA-~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~ 77 (344) ++ .+......++.......+- ...+.. .+-..+...+++. -...+++...+.-. .-.+-|......... T Consensus 249 ~~~~~t~~~gg~lip~~~~~~ii~~~~~~~---~~l~~~~~~~~~~-g~~~~~~~~~~~~a----~~v~Eg~~~~~~~~~ 320 (543) T protein:vir:81 249 RAMGLTKADGGYLVPFQLDPTVIITSNGSL---NDIRRFARQVVAT-GDVWHGVSSAAVQW----SWDAEFEEVSDDSPE 320 (543) T ss_pred hhcccccccCcccCchhhhhHHHHHHHhhh---chhhhhcccccCC-cceEEEEecCCcce----eecccCccccccccc Confidence 11 1111111111112222211 111111 1122222222221 12223332211110 111223333333445 Q ss_pred ccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccc Q lcl|NC_015466. 78 IGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPA 157 (344) Q Consensus 78 ~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~ 157 (344) |+..++..+..+--.+++.+...+. +++.......+.+.+.... ...++++. +......|+.... 
T Consensus 321 ~~~i~~~~~k~~~~~~is~ell~d~---~~~~~~i~~~l~~~~~~~~----d~ail~G~----Gt~~~p~Gi~~~~---- 385 (543) T protein:vir:81 321 FGQPEIPVKKAQGFVPISIEALQDE---ANVTETVALLFAEGKDELE----AVTLTTGT----GQGNQPTGIVTAL---- 385 (543) T ss_pred cceeeeeeeeeEeeehhhHHHHhcc---HHHHHHHHHHHHHHHHHHH----HHHHhccC----CCCcccccchhhc---- Confidence 5555555444444444555544322 3455444444444444332 23334332 1111112221100 Q ss_pred ccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCc-ceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH-- Q lcl|NC_015466. 158 SFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEP-NVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL-- 234 (344) Q Consensus 158 ~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~P-n~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~-- 234 (344) ...... ..+...++.++.|+.+.+..+.. ..++ ...+|++++|..|+. ++ .++. . -+... T Consensus 386 ~~~~~~------~~~~~~~~~~~~~~~~~~~~l~~--~~~~~~~~v~n~~~~~~l~~---lk----d~~G-~-~l~~~~~ 448 (543) T protein:vir:81 386 AGTAAE------IAPVTAETFALADVYAVYEQLAA--RHRRQGAWLANNLIYNKIRQ---FD----TQGG-A-GLWTTIG 448 (543) T ss_pred cccccc------ccccccccccHHHHHHHHHhhhc--cccCCcEEEEcHHHHHHHHH---hh----cCCC-c-eeccCcC Confidence 000000 12334567788888888877653 3444 468999999998873 22 2110 0 01100 Q ss_pred -HHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccccCC Q lcl|NC_015466. 235 -VTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLD 313 (344) Q Consensus 235 -~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~ 313 (344) ..-..++|+| |++.+..-.. ...-+..++..++|.+-+.-+.-..-|.+..++.. .... ..- T Consensus 449 ~g~~~~l~G~p-v~~~~~~~~~-------~~~~~~~~~~~i~~gd~~~~~i~~~~~~~i~~~~~--------~~~~-~~~ 511 (543) T protein:vir:81 449 NGEPSQLLGRP-VGEAEAMDAN-------WNTSASADNFVLLYGNFQNYVIADRIGMTVEFIPH--------LFGT-NRR 511 (543) T ss_pred CCCCcccccee-eEEecccccc-------ccccccCCcceEEEeeccceeEEeecccEEEEecc--------cccc-chh Confidence 0112356776 4333322100 00111122233332211100000000112211100 0000 111 Q ss_pred CCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
314 AIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 314 ~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ..+...+++...++-.+.-+++-.+++-+.| T Consensus 512 ~~~~~~~~~~~r~d~~v~~~~A~~~l~~~~~ 542 (543) T protein:vir:81 512 PNGSRGWFAYYRMGADVVNPNAFRLLNVETA 542 (543) T ss_pred hcCceEEEEEEeeccEeecccceEEEEeccc Confidence 2345567778888888888888777777777 No 76 >protein:vir:105004 Length: 392 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:1490 # MgeName: W Beta # Cross-refs: genbank:acc:YP_459969;genbank:gi:85701384;genbank:GeneID:3882145 Probab=79.90 E-value=0.1 Score=26.07 Aligned_cols=272 Identities=11% Similarity=-0.001 Sum_probs=104.2 Q ss_pred CCCCCCCCccceeccccc-ceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLT-NISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT-~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~~ 78 (344) |...+.....+.|-+.+. .|-..-++. ..-..++..++|....+++..+........ .-.+-|...... ...| T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~---s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a--~~v~E~~~~~~~~~~~~ 180 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSF---DALEQYVTVEPVRTRSGSRVLEKNSDMIPF--AEITEMGEIPETDNPKF 180 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhh---hhhhhhceeeeccCCceeEEEEeecCCccc--eeecccccccccccccc Confidence 443333333333322222 221111111 112234566677666666544311110000 001112222211 1233 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..+...+..+.-..+.++..++ +.++++......+.+.+.+..+..+ +++. |.. 
T Consensus 181 ~~v~l~~~k~~~~~~iS~ell~d--s~~~l~~~i~~~l~~~i~~~~d~~~----~~g~-----------g~~-------- 235 (392) T protein:vir:10 181 SNVQYAVKDRAGILPLSRSLLQD--SDQNILKYVTKWLGKKSKVTRNVLI----LGVI-----------EKL-------- 235 (392) T ss_pred eeEEeeeeeEEEeehhhHHHHhh--hHHHHHHHHHHHHHHHHHHHHHHHH----hhcc-----------ccc-------- Confidence 33333333322233333444433 2345555444555555443333221 1110 000 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc--ccccCHHH Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG--AAKANLVT 236 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~--~~~vt~~~ 236 (344) ...+...+.+|.+++....+.........+|+++.|.+|+. + +.++... ...++... T Consensus 236 --------------~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~---l----kd~~G~~l~~~~~~~~~ 294 (392) T protein:vir:10 236 --------------TKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDK---L----KDKDGKYILQSDPTQKN 294 (392) T ss_pred --------------cccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHH---h----hccCCCeEeecCccCCc Confidence 00122334566666655454433334558999999998863 2 2221100 00111112 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccCC-- Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYLD-- 313 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~~-- 313 (344) -..++|+|.|++.+...-... ....+...++ +|+.+-+|.. .+.++- ..+.++... T Consensus 295 ~~tllG~~~v~~~~~~~~~~~--------~~~~~~~~~~-------~gdfs~~~~i~~~~~~~------~~~~~~~~~~f 353 (392) T protein:vir:10 295 KKLFAGTNPVVVVSNRFLKSK--------GTTAKKAPLI-------IGDLKEAIVLFKREDME------LASTDVGGKAF 353 (392) T ss_pred cccccCcccEEEecccccCCC--------cccCCceEEE-------EEehhceEEEEeecceE------EEEeccccchh Confidence 234678876654322211110 1111222222 1222222221 112110 001111100 Q ss_pred CCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
314 AIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 314 ~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ......+|+...++-.+.-+++-..++-..+ T Consensus 354 ~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 384 (392) T protein:vir:10 354 TRNTLDLRAIQRDDVQMWDNEAAVYGEIDLS 384 (392) T ss_pred hcCceEEEEEEeeccEEecccceEEEEeccc Confidence 1234557777777777877777777655544 No 77 >protein:vir:102082 Length: 392 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1503 # MgeName: Fah # Cross-refs: genbank:acc:YP_512315;genbank:gi:89152484;genbank:GeneID:3953075 Probab=79.90 E-value=0.1 Score=26.07 Aligned_cols=272 Identities=11% Similarity=-0.001 Sum_probs=104.2 Q ss_pred CCCCCCCCccceeccccc-ceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLT-NISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT-~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~~ 78 (344) |...+.....+.|-+.+. .|-..-++. ..-..++..++|....+++..+........ .-.+-|...... ...| T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~---s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a--~~v~E~~~~~~~~~~~~ 180 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSF---DALEQYVTVEPVRTRSGSRVLEKNSDMIPF--AEITEMGEIPETDNPKF 180 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhh---hhhhhhceeeeccCCceeEEEEeecCCccc--eeecccccccccccccc Confidence 443333333333322222 221111111 112234566677666666544311110000 001112222211 1233 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..+...+..+.-..+.++..++ +.++++......+.+.+.+..+..+ +++. |.. 
T Consensus 181 ~~v~l~~~k~~~~~~iS~ell~d--s~~~l~~~i~~~l~~~i~~~~d~~~----~~g~-----------g~~-------- 235 (392) T protein:vir:10 181 SNVQYAVKDRAGILPLSRSLLQD--SDQNILKYVTKWLGKKSKVTRNVLI----LGVI-----------EKL-------- 235 (392) T ss_pred eeEEeeeeeEEEeehhhHHHHhh--hHHHHHHHHHHHHHHHHHHHHHHHH----hhcc-----------ccc-------- Confidence 33333333322233333444433 2345555444555555443333221 1110 000 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc--ccccCHHH Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG--AAKANLVT 236 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~--~~~vt~~~ 236 (344) ...+...+.+|.+++....+.........+|+++.|.+|+. + +.++... ...++... T Consensus 236 --------------~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~---l----kd~~G~~l~~~~~~~~~ 294 (392) T protein:vir:10 236 --------------TKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDK---L----KDKDGKYILQSDPTQKN 294 (392) T ss_pred --------------cccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHH---h----hccCCCeEeecCccCCc Confidence 00122334566666655454433334558999999998863 2 2221100 00111112 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccCC-- Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYLD-- 313 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~~-- 313 (344) -..++|+|.|++.+...-... ....+...++ +|+.+-+|.. .+.++- ..+.++... T Consensus 295 ~~tllG~~~v~~~~~~~~~~~--------~~~~~~~~~~-------~gdfs~~~~i~~~~~~~------~~~~~~~~~~f 353 (392) T protein:vir:10 295 KKLFAGTNPVVVVSNRFLKSK--------GTTAKKAPLI-------IGDLKEAIVLFKREDME------LASTDVGGKAF 353 (392) T ss_pred cccccCcccEEEecccccCCC--------cccCCceEEE-------EEehhceEEEEeecceE------EEEeccccchh Confidence 234678876654322211110 1111222222 1222222221 112110 001111100 Q ss_pred CCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
314 AIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 314 ~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ......+|+...++-.+.-+++-..++-..+ T Consensus 354 ~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 384 (392) T protein:vir:10 354 TRNTLDLRAIQRDDVQMWDNEAAVYGEIDLS 384 (392) T ss_pred hcCceEEEEEEeeccEEecccceEEEEeccc Confidence 1234557777777777877777777655544 No 78 >protein:vir:107593 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1491 # MgeName: Gamma # Cross-refs: genbank:acc:YP_338188;genbank:gi:77020144;genbank:GeneID:3703724 Probab=79.90 E-value=0.1 Score=26.07 Aligned_cols=272 Identities=11% Similarity=-0.001 Sum_probs=104.2 Q ss_pred CCCCCCCCccceeccccc-ceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLT-NISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT-~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~~ 78 (344) |...+.....+.|-+.+. .|-..-++. ..-..++..++|....+++..+........ .-.+-|...... ...| T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~---s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a--~~v~E~~~~~~~~~~~~ 180 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSF---DALEQYVTVEPVRTRSGSRVLEKNSDMIPF--AEITEMGEIPETDNPKF 180 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhh---hhhhhhceeeeccCCceeEEEEeecCCccc--eeecccccccccccccc Confidence 443333333333322222 221111111 112234566677666666544311110000 001112222211 1233 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..+...+..+.-..+.++..++ +.++++......+.+.+.+..+..+ +++. |.. 
T Consensus 181 ~~v~l~~~k~~~~~~iS~ell~d--s~~~l~~~i~~~l~~~i~~~~d~~~----~~g~-----------g~~-------- 235 (392) T protein:vir:10 181 SNVQYAVKDRAGILPLSRSLLQD--SDQNILKYVTKWLGKKSKVTRNVLI----LGVI-----------EKL-------- 235 (392) T ss_pred eeEEeeeeeEEEeehhhHHHHhh--hHHHHHHHHHHHHHHHHHHHHHHHH----hhcc-----------ccc-------- Confidence 33333333322233333444433 2345555444555555443333221 1110 000 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc--ccccCHHH Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG--AAKANLVT 236 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~--~~~vt~~~ 236 (344) ...+...+.+|.+++....+.........+|+++.|.+|+. + +.++... ...++... T Consensus 236 --------------~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~---l----kd~~G~~l~~~~~~~~~ 294 (392) T protein:vir:10 236 --------------TKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDK---L----KDKDGKYILQSDPTQKN 294 (392) T ss_pred --------------cccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHH---h----hccCCCeEeecCccCCc Confidence 00122334566666655454433334558999999998863 2 2221100 00111112 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccCC-- Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYLD-- 313 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~~-- 313 (344) -..++|+|.|++.+...-... ....+...++ +|+.+-+|.. .+.++- ..+.++... T Consensus 295 ~~tllG~~~v~~~~~~~~~~~--------~~~~~~~~~~-------~gdfs~~~~i~~~~~~~------~~~~~~~~~~f 353 (392) T protein:vir:10 295 KKLFAGTNPVVVVSNRFLKSK--------GTTAKKAPLI-------IGDLKEAIVLFKREDME------LASTDVGGKAF 353 (392) T ss_pred cccccCcccEEEecccccCCC--------cccCCceEEE-------EEehhceEEEEeecceE------EEEeccccchh Confidence 234678876654322211110 1111222222 1222222221 112110 001111100 Q ss_pred CCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
314 AIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 314 ~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ......+|+...++-.+.-+++-..++-..+ T Consensus 354 ~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 384 (392) T protein:vir:10 354 TRNTLDLRAIQRDDVQMWDNEAAVYGEIDLS 384 (392) T ss_pred hcCceEEEEEEeeccEEecccceEEEEeccc Confidence 1234557777777777877777777655544 No 79 >protein:vir:102873 Length: 392 # NCBI annotation: major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1492 # MgeName: Cherry # Cross-refs: genbank:acc:YP_338137;genbank:gi:77020198;genbank:GeneID:3703782 Probab=79.90 E-value=0.1 Score=26.07 Aligned_cols=272 Identities=11% Similarity=-0.001 Sum_probs=104.2 Q ss_pred CCCCCCCCccceeccccc-ceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLT-NISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT-~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~~ 78 (344) |...+.....+.|-+.+. .|-..-++. ..-..++..++|....+++..+........ .-.+-|...... ...| T Consensus 106 ~~~~t~~~gg~~vP~~~~~~ii~~~~~~---s~l~~~~~~~~~~~~~~~~~~~~~~~~~~a--~~v~E~~~~~~~~~~~~ 180 (392) T protein:vir:10 106 MSGLTGEDGGLVIPQDIQTQINELARSF---DALEQYVTVEPVRTRSGSRVLEKNSDMIPF--AEITEMGEIPETDNPKF 180 (392) T ss_pred ccccccCCCceecchhHHHHHHHHHHhh---hhhhhhceeeeccCCceeEEEEeecCCccc--eeecccccccccccccc Confidence 443333333333322222 221111111 112234566677666666544311110000 001112222211 1233 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..+...+..+.-..+.++..++ +.++++......+.+.+.+..+..+ +++. |.. 
T Consensus 181 ~~v~l~~~k~~~~~~iS~ell~d--s~~~l~~~i~~~l~~~i~~~~d~~~----~~g~-----------g~~-------- 235 (392) T protein:vir:10 181 SNVQYAVKDRAGILPLSRSLLQD--SDQNILKYVTKWLGKKSKVTRNVLI----LGVI-----------EKL-------- 235 (392) T ss_pred eeEEeeeeeEEEeehhhHHHHhh--hHHHHHHHHHHHHHHHHHHHHHHHH----hhcc-----------ccc-------- Confidence 33333333322233333444433 2345555444555555443333221 1110 000 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc--ccccCHHH Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG--AAKANLVT 236 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~--~~~vt~~~ 236 (344) ...+...+.+|.+++....+.........+|+++.|.+|+. + +.++... ...++... T Consensus 236 --------------~~~~~~~~d~i~~~~~~~l~~~~~~~a~~vm~~~~~~~L~~---l----kd~~G~~l~~~~~~~~~ 294 (392) T protein:vir:10 236 --------------TKQAIKSLDDIKDVLNVKLDPAISPNAILLTNQDGFNYLDK---L----KDKDGKYILQSDPTQKN 294 (392) T ss_pred --------------cccCccCHHHHHHHHHHhhhhhhccCCEEEEcHHHHHHHHH---h----hccCCCeEeecCccCCc Confidence 00122334566666655454433334558999999998863 2 2221100 00111112 Q ss_pred HHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccCC-- Q lcl|NC_015466. 237 LADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYLD-- 313 (344) Q Consensus 237 la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~~-- 313 (344) -..++|+|.|++.+...-... ....+...++ +|+.+-+|.. .+.++- ..+.++... T Consensus 295 ~~tllG~~~v~~~~~~~~~~~--------~~~~~~~~~~-------~gdfs~~~~i~~~~~~~------~~~~~~~~~~f 353 (392) T protein:vir:10 295 KKLFAGTNPVVVVSNRFLKSK--------GTTAKKAPLI-------IGDLKEAIVLFKREDME------LASTDVGGKAF 353 (392) T ss_pred cccccCcccEEEecccccCCC--------cccCCceEEE-------EEehhceEEEEeecceE------EEEeccccchh Confidence 234678876654322211110 1111222222 1222222221 112110 001111100 Q ss_pred CCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
314 AIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 314 ~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ......+|+...++-.+.-+++-..++-..+ T Consensus 354 ~~~~~~~r~~~r~d~~v~~~~a~~~l~~~~~ 384 (392) T protein:vir:10 354 TRNTLDLRAIQRDDVQMWDNEAAVYGEIDLS 384 (392) T ss_pred hcCceEEEEEEeeccEEecccceEEEEeccc Confidence 1234557777777777877777777655544 No 80 >protein:vir:102119 Length: 404 # NCBI annotation: phage major capsid protein, HK97 family # Family: family:all:21 # MgeID: mge:1641 # MgeName: phiSM101 # Cross-refs: genbank:acc:YP_699941;genbank:gi:110804052;genbank:GeneID:4206662 Probab=79.51 E-value=0.1 Score=25.98 Aligned_cols=276 Identities=11% Similarity=0.035 Sum_probs=108.1 Q ss_pred CCCCCCCCccceec-ccccceeeeeEcCcchhhhhhhCcccccCCccceeee--echhhcccccccccccCccccc--ce Q lcl|NC_015466. 1 MPFTQPSRSDVHVN-RPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFT--YERGDFNRDEMQERTPGTESAG--GT 75 (344) Q Consensus 1 m~~~~~~~~~~~~d-p~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~--~~k~~~~~~~~~~ra~g~~~~~--~~ 75 (344) |...+.....+.|- .+.+.|-..-++. ..--.+++.+|+....+++.. .....-.. -...|..... .+ T Consensus 110 ~~~~~~~~gg~~vP~~~~~~ii~~~~~~---~~l~~l~~~~~~~~~~g~~~~~~~~~~~~~~----~v~e~~~~~~~~~~ 182 (404) T protein:vir:10 110 ISENIDEDGGYAVPEDIQTKINTRLKDT---TDLYNMVDYEPVFTRSGSRTYEKRSKQKPMK----PLSENQQIPTNGDN 182 (404) T ss_pred hccccCCCCceeechhHHHHHHHHHhhh---hhHhhhhceeeccCCccceEEEEecCCccee----eccccccccccccc Confidence 32222222222221 1112211111111 122334566777666665432 21111000 0111211111 11 Q ss_pred ecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccc Q lcl|NC_015466. 76 YEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTA 155 (344) Q Consensus 76 ~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~ 155 (344) ..|+..+...+..+.-..+.++..+++ .++++......+.+.+....| ..++++. +......|.. . 
T Consensus 183 ~~f~~i~~~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~la~~~~~~~~----~~il~G~----g~~~~~~gi~----~ 248 (404) T protein:vir:10 183 GKLERFNFKLKDLADFMSIPNDLLKFA--DKSLEDWIINWFVDKVRITRN----AEILYGA----GGDEHATGIM----T 248 (404) T ss_pred cceeeeEeeheeeEeeehhhHHHHhhc--HHHHHHHHHHHHHHHHHHHHH----HHHhhcC----CCCCccccee----e Confidence 233333333333333333444444432 233444333444444433222 2333221 1111111111 0 Q ss_pred ccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcc-eEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH Q lcl|NC_015466. 156 PASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPN-VLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL 234 (344) Q Consensus 156 ~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn-~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~ 234 (344) ....+ .....+++.+.++.+.+...... ++.+| +.+|+++.|.+|+. ++.++. .-+..+ T Consensus 249 ~~~~~----------~~~~~~~~~~~~~~~~~~~~l~~-~~~~~~~~v~n~~~~~~L~~-------lkd~~G--~~l~~~ 308 (404) T protein:vir:10 249 ANKFK----------KITLPKSPALKDFKKCKNVELLN-VFKATSSWIVNQDGFNYLDS-------LEDKTG--RPYLQP 308 (404) T ss_pred ccccc----------eeeccccccHHHHHHHHHhhhhc-cccCCCEEEEcHHHHHHHHH-------hhccCC--ceeecc Confidence 00111 12234556788888887765554 45555 57999999998873 222221 112211 Q ss_pred H----HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCccccccccccee-ecccccCCcCCcccc- Q lcl|NC_015466. 235 V----TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFN-WTGLVGSGNEGMRIK- 308 (344) Q Consensus 235 ~----~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~-~~~~~g~~~~~~~~~- 308 (344) + .-..++|.| |++..... ..+..+ +..++ +|+.+.++... +.++ .++ T Consensus 309 ~~~~~~~~~l~G~P-V~~~~~~~---~~~~~~-------~~~~~--------~gd~s~~~~~~~~~~~--------~i~~ 361 (404) T protein:vir:10 309 DPKDPTQYRFLGLP-VIELPNDL---LLSTES-------AIPVL--------LGDTKEAYKYVSDGAY--------ELAT 361 (404) T ss_pred CcCCCCCcccccee-eEEecccc---cCCCCC-------ccEEE--------EEeccccEEEEEecce--------EEEE Confidence 1 112456766 32211100 000011 11111 12222222221 1111 111 Q ss_pred ---cccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
309 ---RFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 309 ---~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .|..-......+|+...++-.+.-+++-..++=+.| T Consensus 362 ~~~~~~~~~~~~~~~~~~~r~d~~v~~~~a~~~~~~~~a 400 (404) T protein:vir:10 362 TNIGAGAFETNTTKARIIMRIDGNVKDSEALLIAEIPVE 400 (404) T ss_pred eccccchhhcCceEEEEEEeeccEEecccceEEEEeecc Confidence 112122466778888888888888888888877777 No 81 >protein:vir:1433 Length: 435 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:30 # MgeName: phiE125 # Cross-refs: genbank:acc:NP_536362;genbank:gi:17975167;genbank:GeneID:929171 Probab=77.41 E-value=0.13 Score=25.53 Aligned_cols=296 Identities=9% Similarity=-0.001 Sum_probs=111.2 Q ss_pred CCCCCCCCcccee-cccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceeccc Q lcl|NC_015466. 1 MPFTQPSRSDVHV-NRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIG 79 (344) Q Consensus 1 m~~~~~~~~~~~~-dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~ 79 (344) |....+....+.| ..+.+.|-..-+.. ..|.......+|+.....+++++....-.. . ..-|.........|. T Consensus 132 ~~~~t~~~gg~~vP~~~~~~ii~~l~~~--~~i~~~~~~~~~~~~~~~~~p~~~~~~~a~-~---v~E~~~~~~~~~~f~ 205 (435) T protein:vir:14 132 LNTLSPGAGGVLVPENLSSEVIELLRPK--SVVRKLGARTLPLSNGNITIPRLKGGAIVG-Y---IGADTDIPTTQQQFD 205 (435) T ss_pred cccCCcCCCccccchhHHHHHHHHHhhh--chhhhhcceeeecCCCceEEEEEeCCccee-e---eccCcccccccccee Confidence 2212222111222 11112221111111 112221123345554456666654321110 1 112333333344555 Q ss_pred ccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccccc Q lcl|NC_015466. 80 NDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASF 159 (344) Q Consensus 80 ~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~ 159 (344) ..++.....+...++..+..+++.-.++++......+.+.|....|. .++++ .+......|.-. .... 
T Consensus 206 ~i~~~~~k~~~~~~iS~ell~ds~~~~~l~~~i~~~l~~ai~~~~d~----a~l~G----~G~~~~p~Gi~~----~~~~ 273 (435) T protein:vir:14 206 DLKLTAKKMAALVPIANDLIKYAGVNPNVDQIVVGDLTAAIGAREDK----AFIRD----DGTANTPKGLRF----WALP 273 (435) T ss_pred EEEeeeEEEEEeehhhHHHHHhhccCHHHHHHHHHHHHHHHHHHHHH----Hhhcc----CCCCccccceee----cccc Confidence 55555444444444555555554323334444444444444433332 22222 111111111110 0000 Q ss_pred ceeecccccccccCCCCCChHHHHHHHHHHHHHhc-CCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHH Q lcl|NC_015466. 160 DPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEET-GFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLA 238 (344) Q Consensus 160 ~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~-G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la 238 (344) ..+ .+.+ ......+...++.+....+.... ++.+...+|++..|.+|+. + +.++. .-+.+...=. T Consensus 274 ~~~-~~~~----~~~~~~~~~~~~~~l~~~~~~~~~~~~~~~~v~n~~~~~~L~~---l----kd~~G--~~l~~~~~~g 339 (435) T protein:vir:14 274 SNV-ITAS----DASTLQKIETDLGKVILALENADANLTQPGWIMAPRTFRFLEG---L----RDGNG--NKVYPELANG 339 (435) T ss_pred cce-eccc----cccchhhHHHHHHHHHHHhhhccccccCCEEEEcHHHHHHHHH---h----hccCC--ceeccCCCCC Confidence 000 0111 11123345567777766665443 5567889999999988862 2 22211 1111111112 Q ss_pred HHhCCCeEEEEEEEEeccccCCCCcc-ceeCCCc--eEEEEecCCCcccccccccceeeccccc-CCcCCcccccccCCC Q lcl|NC_015466. 239 DLFEVDKVLVMKAVRNTAKKGQTASH-SFIGGKH--ALLSYAPATPGIMTPSAGYTFNWTGLVG-SGNEGMRIKRFYLDA 314 (344) Q Consensus 239 ~~~gl~~I~v~~a~yn~~~~~~~~~~-~~iw~~~--~~l~~~~~~~~~~~~s~G~T~~~~~~~g-~~~~~~~~~~~~~~~ 314 (344) .++|+| |++.+..= ...+..+.. .-+.++. +++. . .-++++..+..-+ ....+..+..|. T Consensus 340 ~l~G~P-v~~~~~~p--~~~~~~~~~~~i~~gd~s~~~i~-~---------~~~~~~~~~~~~~~~~~~~~~~~~f~--- 403 (435) T protein:vir:14 340 MLKGYP-VGKTTQVP--INLGETGKESEIYFTDFGDVFIG-E---------EETLEIDYSKEATYKDADGHMVSAFQ--- 403 (435) T ss_pred eeecce-eEeecccc--ccccCCCccceEEEeecccEEEE-E---------ecccEEEEeccccccccccchhhhhh--- Confidence 466777 43322210 000111111 1111211 1111 0 0112222221100 000111111111 Q ss_pred CceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
315 IESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 315 ~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .....+|+.+.++-.+.-|.+=..++++-. T Consensus 404 ~~~~~~r~~~r~d~~~~~~~a~~~l~~~~~ 433 (435) T protein:vir:14 404 RDQTLIRVIAKNDFGPRHVESIAVLAGVAW 433 (435) T ss_pred cChhheeeeeeeCceeecccceEEEecCCC Confidence 234578888888888888887666666555 No 82 >protein:vir:3643 Length: 336 # NCBI annotation: gp12 # Family: family:all:1653 # MgeID: mge:75 # MgeName: Bcep781 # Cross-refs: genbank:acc:NP_705638;genbank:gi:23752323;genbank:GeneID:955719 Probab=77.33 E-value=0.13 Score=25.52 Aligned_cols=282 Identities=9% Similarity=-0.027 Sum_probs=99.8 Q ss_pred CC--------CCCCCCccceeccccccee--eeeEcCcchhhhhhhCcccccCCccceeeeechhhc----ccccccccc Q lcl|NC_015466. 1 MP--------FTQPSRSDVHVNRPLTNIS--IGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDF----NRDEMQERT 66 (344) Q Consensus 1 m~--------~~~~~~~~~~~dp~LT~iA--~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~----~~~~~~~ra 66 (344) |. .+++...+ -+=..||++- .=|+-.-.++.++.|||...++.=..++.+|.-.+. ..+.+ T Consensus 31 ~~~da~d~~~~~~~~~~~-~~~~~l~~~i~p~~~~~~~~~~~~~~l~pv~t~g~W~~~~~~~~~~e~~G~a~~ygd---- 105 (336) T protein:vir:36 31 YAMDAADLSPHLSSTGSS-GIPNYLTTYVDPSVIDILVAPMKAAELVGESKKGDWTTLVAAFITAEPTTKVATYGD---- 105 (336) T ss_pred hhhhhhhccCccccCCCc-chHHHHHHhhccceEeeecchhhhhhhccccccCCccceeEEEeeeeceeeEEEeec---- Confidence 11 11111100 1112456533 223333335789999998665433334444432111 11100 Q ss_pred cCcccccceeccccccccccccccccccc-HHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccc Q lcl|NC_015466. 67 PGTESAGGTYEIGNDTYFARTRAYHRDVP-EQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFD 145 (344) Q Consensus 67 ~g~~~~~~~~~~~~~~~~~~~~~l~~~v~-~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~ 145 (344) +.+...+.+.....+....-.+..-.+. ++..+.+....++..+..+-..+.+ +...|. ++.-+.. 
T Consensus 106 -~~D~P~~d~~~~~~~~~v~~~~~g~~yg~~E~~~Aa~~~~~l~~~Ka~aA~~al---------e~~~N~-i~~~Gd~-- 172 (336) T protein:vir:36 106 -YSSDGDSGANINYPQRQSYFFQTWTRWGERELEMAGAGRVDLASELNYSSALGL---------AKFLNG-SYLFGVA-- 172 (336) T ss_pred -cCCCceeecccceeeeeEEEEEeeeeeCHHHHHHHHHhCCCcHHHHHHHHHHHH---------HHhhCc-EEEEecc-- Confidence 1111111211111222222222222222 2222222233333222111111111 112221 1111110 Q ss_pred ccccccccccccccc-eeecccccccccCCCCCC-hHHHHHHHHHHHHHhcC-----CCcceEEeCHHHHHHHhcCHHHH Q lcl|NC_015466. 146 VDGVASSPTAPASFD-PTNASNNDKLHWSDASST-PIEDIRQGKRYVLEETG-----FEPNVLTLGKAVYDALVDHPDIV 218 (344) Q Consensus 146 ~~gv~~~~~~~~~~~-k~tl~~t~~~~Wsd~~SD-Pi~di~~~~~~i~~~~G-----~~Pn~~v~~~~v~~~L~~h~~i~ 218 (344) ..+. .+.-+.++-. .++.++ .+|+..+.+ .+.||.+...++...++ -.|++++|....+..|.+ T Consensus 173 ~~~~-yGllNdP~l~a~~t~~t---~~~~~~t~~ei~~Di~~~~~~l~~qt~G~i~~~~~~tL~LP~~~~~~Ls~----- 243 (336) T protein:vir:36 173 GLEN-YGLINDPSLSAPITATT---PWSGSPAVEAVVNEVVALFQVLQTQSQGIITQEDVLRMGLPPTAMSDLSK----- 243 (336) T ss_pred ccce-EEEEecCCCccccccCC---CcccccCHHHHHHHHHHHHHHHHHhcCCeeeeccccEEEechHHHHhccC----- Confidence 0010 1111211111 122222 246666544 89999999999998886 369999999999888742 Q ss_pred HHhccCCCccccccC-HHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecC-CCcccccccccceeecc Q lcl|NC_015466. 219 GRIDRGQTSGAAKAN-LVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPA-TPGIMTPSAGYTFNWTG 296 (344) Q Consensus 219 ~~i~~~~~~~~~~vt-~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~-~~~~~~~s~G~T~~~~~ 296 (344) .+ ..+ +| .+.|++-| |+|.+-.+- ...+.+ ++.+.|++..- +.+..+ .+.+-.++. T Consensus 244 -----~n--~~g-~Tvl~~lk~n~--Pnl~i~t~p---El~~a~-------g~~~~l~~~~~~~~~t~~--~~~p~~~~~ 301 (336) T protein:vir:36 244 -----TN--QYG-LAAAAKLKDIF--PKLEFVTIP---EYDTAS-------GRLVQLWAPRVEGKDTAT--CGFTEKMRA 301 (336) T ss_pred -----CC--ccC-ccHHHHHHHhc--CccEEEEcc---ccccCC-------CceEEEEEEecCCCccee--eecchhhhc Confidence 11 112 33 45667653 444332222 111222 22344444321 111110 111211111 Q ss_pred cccCCcCCcccccccCCCCceEEEeeccccceeeeccccch Q lcl|NC_015466. 
297 LVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGY 337 (344) Q Consensus 297 ~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~ 337 (344) +............+...-.|..+.|-. -+..-.|. T Consensus 302 l~vq~~~~~~~v~~~~rt~Gv~i~~P~------ai~~~~GI 336 (336) T protein:vir:36 302 HSIERYSSYFRQKKSAGTWGAVIFRPF------AVAQMIGV 336 (336) T ss_pred cceeecCceeEeccccceeeeeeeccc------hheeeecC Confidence 111111112222222222222222222 22222222 No 83 >protein:vir:99576 Length: 388 # NCBI annotation: hypothetical protein # Family: family:all:1653 # MgeID: mge:1544 # MgeName: BcepF1 # Cross-refs: genbank:acc:YP_001039801;genbank:gi:126011051;genbank:GeneID:4818271 Probab=74.96 E-value=0.15 Score=25.07 Aligned_cols=290 Identities=13% Similarity=-0.010 Sum_probs=92.7 Q ss_pred CCCCCCCCccceecccccceeeee-EcCcchhhhhhhCcccccCCccceeeeechh----hcccccccccccCcccccce Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGY-VQDASHFVAGQVFPQVSVGKQSDAYFTYERG----DFNRDEMQERTPGTESAGGT 75 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y-~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~----~~~~~~~~~ra~g~~~~~~~ 75 (344) =|.++++ ..+ .-.-|+.+--++ +=--..+.++.|||...++.=..++.+|.-. ....+.+ +.+...+. T Consensus 73 ~~~t~~~-~gi-p~~~~~~~~p~~~~~~~~p~~~~~l~pv~t~g~W~~~~~~f~v~e~~G~A~~ygd-----~~D~Pl~d 145 (388) T protein:vir:99 73 APTTQAS-IPT-PIQFLQQWLPGFVKVLTSARKIDEILGVKTVGSWEDQEIVQGIVEPAGTAMEYGD-----LTNIPLSS 145 (388) T ss_pred cccccCc-ccH-HHHHhhhhccceeeeeechhhhhhhccccccCCccceeEEEeeeecceeEEEeec-----ccCCCcee Confidence 0111111 000 000111111111 1111136788999986654333344454321 1111110 01111111 Q ss_pred ecccccccccccccccccccHHHH-HhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccc-cccccccc Q lcl|NC_015466. 76 YEIGNDTYFARTRAYHRDVPEQVR-ANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFD-VDGVASSP 153 (344) Q Consensus 76 ~~~~~~~~~~~~~~l~~~v~~~~~-~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~-~~gv~~~~ 153 (344) +............+..-.+..++. +.+....++..+..+-..+. .|...-+..| |+ .... ..+. .+. 
T Consensus 146 ~~~~~~~r~v~~~~~g~~yg~~El~~A~~~g~~l~~~Ka~AA~~a----le~~~N~i~f----~G--~~g~~~~~~-yGl 214 (388) T protein:vir:99 146 WNVNFERRTIVRGEMGIQVGLLEEGRASAMRINSAEVKRQGAAVQ----LEIMRNAIGF----YG--WEGKNGNRT-FGF 214 (388) T ss_pred ccceeeeeeEEEEEeeeeecHHHHHHHHhhCCCcHHHHHHHHHHH----HHhhhceEEE----Ee--ecCCCccce-EEE Confidence 111111112222222222333222 22223334333221111111 1111111111 11 0000 0000 011 Q ss_pred ccccc-cceeeccc-ccccccCCCCCC-hHHHHHHHHHHHHHhcC------CCcceEEeCHHHHHHHhcCHHHHHHhccC Q lcl|NC_015466. 154 TAPAS-FDPTNASN-NDKLHWSDASST-PIEDIRQGKRYVLEETG------FEPNVLTLGKAVYDALVDHPDIVGRIDRG 224 (344) Q Consensus 154 ~~~~~-~~k~tl~~-t~~~~Wsd~~SD-Pi~di~~~~~~i~~~~G------~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~ 224 (344) -+.|+ ...+..++ +...+|.+++.+ .+.||.++...+...+| -.|.+++|....+..|.+ . T Consensus 215 lNdP~l~a~v~at~~~~~~~Wa~kT~~eI~~Di~~~~~~i~~qs~g~~~~~~~~~tL~LP~~~~~~Ls~----------~ 284 (388) T protein:vir:99 215 LNDPSLLPAIASTTPGGWVSGGANAFQGIVGDLRLMLITLRVQSEDNIDPEDVDITLVLPMNKVDMLSV----------V 284 (388) T ss_pred eeCCCcccccccccCCcCcccccCCHHHHHHHHHHHHHHHHHhcCCeeeecccceEEEechHHHHhccc----------c Confidence 11111 11122221 111358776543 58999999999988887 245589999999988842 1 Q ss_pred CCccccccC-HHHHHHHhCCCeEEEEEE-EEeccccCCCCccceeCCCceEEEEecCCCc--cccccccccee------e Q lcl|NC_015466. 225 QTSGAAKAN-LVTLADLFEVDKVLVMKA-VRNTAKKGQTASHSFIGGKHALLSYAPATPG--IMTPSAGYTFN------W 294 (344) Q Consensus 225 ~~~~~~~vt-~~~la~~~gl~~I~v~~a-~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~--~~~~s~G~T~~------~ 294 (344) +.. + +| .+.|++-| |+|.+-.+ -+..+.... +..++.++...-.+ ++-+.-+.|+. + T Consensus 285 n~~--g-~Tvl~~lk~n~--Pnl~i~t~pEl~~a~~tg--------g~~~~~~~~~~~~~~~~~~~~~~~t~~~~~p~~~ 351 (388) T protein:vir:99 285 TDL--G-ISVRDWLKQTY--PRVRVMSAPELQGGNPDD--------GKDIAYMFLDSVDTAVDGSTDGGDTWAQLVQSKF 351 (388) T ss_pred CcC--C-ccHHHHHHHhc--CCcEEEEecccccccccC--------CceeEEEEecccccccccCccCcceeEEeccccc Confidence 111 1 34 35666654 33433222 122221111 12233344322111 11111111211 1 Q ss_pred cccccCCcCCcccccccCCCCceEEEeeccccceeeeccccch Q lcl|NC_015466. 
295 TGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGY 337 (344) Q Consensus 295 ~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~ 337 (344) +.+-...........+...-.|..+.| |.-+..-.|. T Consensus 352 ~~l~vq~~~~~~~~~~~~rt~Gv~ir~------P~Ai~~~~GI 388 (388) T protein:vir:99 352 VTLGVEKRVKNYVEAYSNATAGVMLKR------PWAVVRLIGL 388 (388) T ss_pred ccccceecCceeEeccccceeeeEEec------cchhheeccC Confidence 111111111111112122222222222 2222222222 No 84 >protein:vir:79642 Length: 329 # NCBI annotation: HsbB # Family: family:all:463 # MgeID: mge:1872 # MgeName: TLS # Cross-refs: genbank:acc:YP_001285525;genbank:gi:148734508;genbank:GeneID:5220000 Probab=73.70 E-value=0.17 Score=24.84 Aligned_cols=289 Identities=11% Similarity=0.011 Sum_probs=100.2 Q ss_pred CCCCCCCCc---cceecccccceeee-eEcCcchhhhhhhCccc---ccCCccceeeeech-hhcccccccccccCcccc Q lcl|NC_015466. 1 MPFTQPSRS---DVHVNRPLTNISIG-YVQDASHFVAGQVFPQV---SVGKQSDAYFTYER-GDFNRDEMQERTPGTESA 72 (344) Q Consensus 1 m~~~~~~~~---~~~~dp~LT~iA~~-Y~n~~~~~ia~~lfP~v---~v~~~~~~~~~~~k-~~~~~~~~~~ra~g~~~~ 72 (344) |+.+..... .|.+ ..|+.|-.. |.-.-.++.+.++||.. +-..+++.|..|+. +......+-. .... T Consensus 26 ~~~~~~~~~~~~~f~~-~ql~~id~~v~e~~~~~l~~~~~i~i~~~~~~~~~~~t~~~~~~~G~a~~~~d~~----~dip 100 (329) T protein:vir:79 26 LRGAKNDASDMGIWTS-QELHKIKAQAYEKEYPAGSALRVFPVTSELSDTDKTFEYQTFDKVGHAKIIADYT----DDLS 100 (329) T ss_pred cccceeccchhhHHHH-HHHHHHHHHHHhhhhcccchhhhcccccCCCCceeEEEeeeeecceeeeeecCcc----cccc Confidence 444443221 2211 111211111 21122358899999964 33344444444432 1111000000 0001 Q ss_pred cceecccccccccccccccccccHHHHHhc-cCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccc Q lcl|NC_015466. 73 GGTYEIGNDTYFARTRAYHRDVPEQVRANA-DNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVAS 151 (344) Q Consensus 73 ~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a-~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~ 151 (344) .+.+.+......+...+.+-.+-..+.+.+ ....+++.+........ .|...-+.+|.+ ... .++ . 
T Consensus 101 ~vd~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~----~~~~~n~i~f~G----~~~----~g~-~ 167 (329) T protein:vir:79 101 TVDALMTSEFGKVFRLGNAFLISIDEIKAGQRTGKSLSTRKANAAQNA----HDQLVNHLVFKG----SKP----HKI-I 167 (329) T ss_pred eeecccceeEEEEEEEEEEEEecHHHHHHHHHhCCChHHHHHHHHHHH----HHHhhccEEEee----ccc----ccc-e Confidence 112222221122222222222222222222 23344443322222111 111111222222 110 011 1 Q ss_pred ccccccccceeecccccccccCCCCCC-hHHHHHHHHHHHHHhcC--CCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc Q lcl|NC_015466. 152 SPTAPASFDPTNASNNDKLHWSDASST-PIEDIRQGKRYVLEETG--FEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG 228 (344) Q Consensus 152 ~~~~~~~~~k~tl~~t~~~~Wsd~~SD-Pi~di~~~~~~i~~~~G--~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~ 228 (344) +.-+.+.-......+....+|...+.+ .+.||.+...++...++ ..|++++|+.+.+..|.+ +. .+. T Consensus 168 GLlN~p~v~~~~~~~~~~~~w~~kt~~ei~~di~~~~~~l~~~s~g~~~p~~L~Lpp~~~~~L~~------~~--~~~-- 237 (329) T protein:vir:79 168 SVFEHPNLTTINSAGWNNAAGTGKKPETAQDELEQAIEKIETLTNGQHRANMILIPPSMRKVLMV------RM--PET-- 237 (329) T ss_pred eeecCCCccccccCCCCCccccccCHHHHHHHHHHHHHHHHHhcCceecccEEEecHHHHHHhhc------cc--CCC-- Confidence 111111111111111112257665443 47899999888887765 479999999999988742 11 111 Q ss_pred ccccCHHHHHHHhCCCeEEEEE-EEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCccc Q lcl|NC_015466. 229 AAKANLVTLADLFEVDKVLVMK-AVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRI 307 (344) Q Consensus 229 ~~~vt~~~la~~~gl~~I~v~~-a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~ 307 (344) +.--.++|++.+ +.+.|-. ..+..+. .. +.+.+++|......+ +--++--++.-. -........ T Consensus 238 -~~tvl~~lk~~~--~~l~I~~~~el~~ag--~~-------g~~~~v~y~~~~~~~-~~~vp~~~~~l~--~q~~~~~~~ 302 (329) T protein:vir:79 238 -TMSYLDYFKQQN--GGITIESISELEDID--GA-------GTKAALVYEKDPMNM-SIEIPEAFNMLT--AQPKDLHFK 302 (329) T ss_pred -CccHHHHHHHhC--CCcEEEEcccccccC--CC-------CceEEEEEecCCceE-EEecCcceeeee--ceecCceEE Confidence 223356677654 2222211 1122211 11 122334443221111 101111111111 001111122 Q ss_pred ccccCCCCceEEEeecccc--ceeeec Q lcl|NC_015466. 
308 KRFYLDAIESDRIEIDMSY--DQKKVA 332 (344) Q Consensus 308 ~~~~~~~~~~~~vr~~~~~--~~~v~~ 332 (344) ..|.....+..+.|-...+ +=.++| T Consensus 303 v~~~~r~~Gv~i~~P~ai~~~dGI~~~ 329 (329) T protein:vir:79 303 VPCTSKCTGLTIYRPLTLVLIKGLVVG 329 (329) T ss_pred EceeeeEEEEEEECcceeeeeeeeeeC Confidence 2333333333333322222 122222 No 85 >protein:vir:79987 Length: 415 # NCBI annotation: head protein # Family: family:all:21 # MgeID: mge:1875 # MgeName: tp310-3 # Cross-refs: genbank:acc:YP_001430002;genbank:gi:156604057;genbank:GeneID:5525447 Probab=72.94 E-value=0.18 Score=24.71 Aligned_cols=277 Identities=12% Similarity=0.040 Sum_probs=106.9 Q ss_pred CCCC--CCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eec Q lcl|NC_015466. 1 MPFT--QPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYE 77 (344) Q Consensus 1 m~~~--~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~ 77 (344) +... ..+...++.......|-..-++. ..-..++..+||....++++........... -..-|...... ... T Consensus 120 ~~~~~~~~~gg~~iP~~~~~~ii~~~~~~---~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~v~E~~~~~~~~~~~ 194 (415) T protein:vir:79 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVE---FNLDKYVTVKRVTNGSGKYPVVRQSEVAALE--KVEELEENPELAVKP 194 (415) T ss_pred hhccccccccccccchHHHHHHHHHHHhh---hhhhhheeeeeccCCceeEEEEeecCCccce--eeccccccCcccccc Confidence 1111 11111111111111111000110 1112234455665555565443211100000 01112222211 123 Q ss_pred ccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccc Q lcl|NC_015466. 78 IGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPA 157 (344) Q Consensus 78 ~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~ 157 (344) |+..+...+..+.-..+..+..+++ .++++......+.+.+....| ..++++.-.+....... . 
T Consensus 195 ~~~v~~~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~l~~~~~~~~~----~~il~g~g~g~~~~~~~----------~ 258 (415) T protein:vir:79 195 FFQLAYDINTHRGYFRISREAIEDA--KVNVLQELKLWMARTIAATRN----KAIIDVITKGSTGSTSS----------G 258 (415) T ss_pred eeeEEeeeeeeEeeehhhHHHHhhc--hHHHHHHHHHHHHHHHHHHHH----HHHhhccccCccccccc----------c Confidence 4444444444443344445544432 345555444555555443333 22332221111100000 0 Q ss_pred ccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH-- Q lcl|NC_015466. 158 SFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV-- 235 (344) Q Consensus 158 ~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~-- 235 (344) +.......+..+...+.+|.+.+..+.. .++.++..+|+++.|.+|+. + +..+. . -+..++ T Consensus 259 -------~~~~~~~~~~~~~~~~~~i~~~~~~~~~-~~~~~~~~v~n~~~~~~l~~---l----kd~~G-~-~l~~~~~~ 321 (415) T protein:vir:79 259 -------FEKEGKKLEVKKAKSLDDIKDAINLNVK-PNYEHNVAIVSQTMFAKLDK---M----KDKLG-N-YLIQPDVK 321 (415) T ss_pred -------ccccccccccccccchhHHHHHHHhhhh-hccCCCEEEEcHHHHHHHHH---h----hccCC-c-eeeccCcC Confidence 0000012344566778888888877654 57889999999999998863 2 22221 1 111111 Q ss_pred --HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccc-eeecccccCCcCCcccccccC Q lcl|NC_015466. 236 --TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYT-FNWTGLVGSGNEGMRIKRFYL 312 (344) Q Consensus 236 --~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T-~~~~~~~g~~~~~~~~~~~~~ 312 (344) .-..++|.| |++.+..- .+..++ ..+++ |+.+-+|+ +...++ .++..+ T Consensus 322 ~~~~~~l~G~p-V~~~~~~~----~~~~~~--------~~~~~-------Gd~~~~~~~~~~~~~--------~v~~~~- 372 (415) T protein:vir:79 322 EKTQQRLLGAK-IEILPDEV----LGQKGN--------NTLII-------GNLKDAIVLFDRSQY--------QASWTD- 372 (415) T ss_pred CCCCceeccee-eEEecccc----cCCCCc--------cEEEE-------EehhccEEEEeecce--------EEEEec- Confidence 113456766 33322111 011111 11111 11111121 111111 111101 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .......+|+...++-.+.-+++-.+++-.-+ T Consensus 373 ~~~~~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 404 (415) T protein:vir:79 373 YMHFGECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred cccCceEEEEEEEeccEEeccccEEEEEEecc Confidence 11223346777778888888887777765544 No 86 >protein:vir:81100 Length: 415 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:1891 # MgeName: tp310-1 # Cross-refs: genbank:acc:YP_001429874;genbank:gi:156603927;genbank:GeneID:5525320 Probab=72.94 E-value=0.18 Score=24.71 Aligned_cols=277 Identities=12% Similarity=0.040 Sum_probs=106.9 Q ss_pred CCCC--CCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eec Q lcl|NC_015466. 1 MPFT--QPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYE 77 (344) Q Consensus 1 m~~~--~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~ 77 (344) +... ..+...++.......|-..-++. ..-..++..+||....++++........... -..-|...... ... T Consensus 120 ~~~~~~~~~gg~~iP~~~~~~ii~~~~~~---~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~v~E~~~~~~~~~~~ 194 (415) T protein:vir:81 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVE---FNLDKYVTVKRVTNGSGKYPVVRQSEVAALE--KVEELEENPELAVKP 194 (415) T ss_pred hhccccccccccccchHHHHHHHHHHHhh---hhhhhheeeeeccCCceeEEEEeecCCccce--eeccccccCcccccc Confidence 1111 11111111111111111000110 1112234455665555565443211100000 01112222211 123 Q ss_pred ccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccc Q lcl|NC_015466. 78 IGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPA 157 (344) Q Consensus 78 ~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~ 157 (344) |+..+...+..+.-..+..+..+++ .++++......+.+.+....| ..++++.-.+....... . 
T Consensus 195 ~~~v~~~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~l~~~~~~~~~----~~il~g~g~g~~~~~~~----------~ 258 (415) T protein:vir:81 195 FFQLAYDINTHRGYFRISREAIEDA--KVNVLQELKLWMARTIAATRN----KAIIDVITKGSTGSTSS----------G 258 (415) T ss_pred eeeEEeeeeeeEeeehhhHHHHhhc--hHHHHHHHHHHHHHHHHHHHH----HHHhhccccCccccccc----------c Confidence 4444444444443344445544432 345555444555555443333 22332221111100000 0 Q ss_pred ccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH-- Q lcl|NC_015466. 158 SFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV-- 235 (344) Q Consensus 158 ~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~-- 235 (344) +.......+..+...+.+|.+.+..+.. .++.++..+|+++.|.+|+. + +..+. . -+..++ T Consensus 259 -------~~~~~~~~~~~~~~~~~~i~~~~~~~~~-~~~~~~~~v~n~~~~~~l~~---l----kd~~G-~-~l~~~~~~ 321 (415) T protein:vir:81 259 -------FEKEGKKLEVKKAKSLDDIKDAINLNVK-PNYEHNVAIVSQTMFAKLDK---M----KDKLG-N-YLIQPDVK 321 (415) T ss_pred -------ccccccccccccccchhHHHHHHHhhhh-hccCCCEEEEcHHHHHHHHH---h----hccCC-c-eeeccCcC Confidence 0000012344566778888888877654 57889999999999998863 2 22221 1 111111 Q ss_pred --HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccc-eeecccccCCcCCcccccccC Q lcl|NC_015466. 236 --TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYT-FNWTGLVGSGNEGMRIKRFYL 312 (344) Q Consensus 236 --~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T-~~~~~~~g~~~~~~~~~~~~~ 312 (344) .-..++|.| |++.+..- .+..++ ..+++ |+.+-+|+ +...++ .++..+ T Consensus 322 ~~~~~~l~G~p-V~~~~~~~----~~~~~~--------~~~~~-------Gd~~~~~~~~~~~~~--------~v~~~~- 372 (415) T protein:vir:81 322 EKTQQRLLGAK-IEILPDEV----LGQKGN--------NTLII-------GNLKDAIVLFDRSQY--------QASWTD- 372 (415) T ss_pred CCCCceeccee-eEEecccc----cCCCCc--------cEEEE-------EehhccEEEEeecce--------EEEEec- Confidence 113456766 33322111 011111 11111 11111121 111111 111101 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .......+|+...++-.+.-+++-.+++-.-+ T Consensus 373 ~~~~~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 404 (415) T protein:vir:81 373 YMHFGECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred cccCceEEEEEEEeccEEeccccEEEEEEecc Confidence 11223346777778888888887777765544 No 87 >protein:vir:98339 Length: 415 # NCBI annotation: putative capsid protein # Family: family:all:21 # MgeID: mge:1581 # MgeName: phiPVL(108) # Cross-refs: genbank:acc:YP_918931;genbank:gi:119443693;genbank:GeneID:4594501 Probab=72.94 E-value=0.18 Score=24.71 Aligned_cols=277 Identities=12% Similarity=0.040 Sum_probs=106.9 Q ss_pred CCCC--CCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eec Q lcl|NC_015466. 1 MPFT--QPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYE 77 (344) Q Consensus 1 m~~~--~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~ 77 (344) +... ..+...++.......|-..-++. ..-..++..+||....++++........... -..-|...... ... T Consensus 120 ~~~~~~~~~gg~~iP~~~~~~ii~~~~~~---~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~--~v~E~~~~~~~~~~~ 194 (415) T protein:vir:98 120 QGGSLKTDSGFVVIPEEIVTDILKLKEVE---FNLDKYVTVKRVTNGSGKYPVVRQSEVAALE--KVEELEENPELAVKP 194 (415) T ss_pred hhccccccccccccchHHHHHHHHHHHhh---hhhhhheeeeeccCCceeEEEEeecCCccce--eeccccccCcccccc Confidence 1111 11111111111111111000110 1112234455665555565443211100000 01112222211 123 Q ss_pred ccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccc Q lcl|NC_015466. 78 IGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPA 157 (344) Q Consensus 78 ~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~ 157 (344) |+..+...+..+.-..+..+..+++ .++++......+.+.+....| ..++++.-.+....... . 
T Consensus 195 ~~~v~~~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~l~~~~~~~~~----~~il~g~g~g~~~~~~~----------~ 258 (415) T protein:vir:98 195 FFQLAYDINTHRGYFRISREAIEDA--KVNVLQELKLWMARTIAATRN----KAIIDVITKGSTGSTSS----------G 258 (415) T ss_pred eeeEEeeeeeeEeeehhhHHHHhhc--hHHHHHHHHHHHHHHHHHHHH----HHHhhccccCccccccc----------c Confidence 4444444444443344445544432 345555444555555443333 22332221111100000 0 Q ss_pred ccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH-- Q lcl|NC_015466. 158 SFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV-- 235 (344) Q Consensus 158 ~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~-- 235 (344) +.......+..+...+.+|.+.+..+.. .++.++..+|+++.|.+|+. + +..+. . -+..++ T Consensus 259 -------~~~~~~~~~~~~~~~~~~i~~~~~~~~~-~~~~~~~~v~n~~~~~~l~~---l----kd~~G-~-~l~~~~~~ 321 (415) T protein:vir:98 259 -------FEKEGKKLEVKKAKSLDDIKDAINLNVK-PNYEHNVAIVSQTMFAKLDK---M----KDKLG-N-YLIQPDVK 321 (415) T ss_pred -------ccccccccccccccchhHHHHHHHhhhh-hccCCCEEEEcHHHHHHHHH---h----hccCC-c-eeeccCcC Confidence 0000012344566778888888877654 57889999999999998863 2 22221 1 111111 Q ss_pred --HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccc-eeecccccCCcCCcccccccC Q lcl|NC_015466. 236 --TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYT-FNWTGLVGSGNEGMRIKRFYL 312 (344) Q Consensus 236 --~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T-~~~~~~~g~~~~~~~~~~~~~ 312 (344) .-..++|.| |++.+..- .+..++ ..+++ |+.+-+|+ +...++ .++..+ T Consensus 322 ~~~~~~l~G~p-V~~~~~~~----~~~~~~--------~~~~~-------Gd~~~~~~~~~~~~~--------~v~~~~- 372 (415) T protein:vir:98 322 EKTQQRLLGAK-IEILPDEV----LGQKGN--------NTLII-------GNLKDAIVLFDRSQY--------QASWTD- 372 (415) T ss_pred CCCCceeccee-eEEecccc----cCCCCc--------cEEEE-------EehhccEEEEeecce--------EEEEec- Confidence 113456766 33322111 011111 11111 11111121 111111 111101 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .......+|+...++-.+.-+++-.+++-.-+ T Consensus 373 ~~~~~~~~~~~~r~d~~v~~~~a~~~~~~~~~ 404 (415) T protein:vir:98 373 YMHFGECLMIAVRQDCRILDYKSAIVIEYDDS 404 (415) T ss_pred cccCceEEEEEEEeccEEeccccEEEEEEecc Confidence 11223346777778888888887777765544 No 88 >protein:vir:4456 Length: 401 # NCBI annotation: Major capsid protein precursor # Family: family:all:21 # MgeID: mge:96 # MgeName: ST64B # Cross-refs: genbank:acc:NP_700379;genbank:gi:23505451;genbank:GeneID:955658 Probab=71.90 E-value=0.19 Score=24.54 Aligned_cols=287 Identities=10% Similarity=-0.017 Sum_probs=112.2 Q ss_pred CCCCCCCCcccee-cccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccce-ecc Q lcl|NC_015466. 1 MPFTQPSRSDVHV-NRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGT-YEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~-dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~-~~~ 78 (344) |.......--+.| ......|-..-+. ..+-..+++.+|+.....++++...+.-... +.+ |....... .+| T Consensus 107 ~~~~~~~~GG~~iP~~~~~~ii~~~~~---~~~l~~~~~~~~~~~~~~~~~~~~~~~~a~w-v~E---~~~~~~~~~~~~ 179 (401) T protein:vir:44 107 LQVGTDEDGGYAVPEELDRSILSLLKD---EVVMRQEATVITVGGSDYKKLVNLGGTASGW-VGE---TDTRSQTATSRL 179 (401) T ss_pred hhcCCCCCCceeccHhHHHHHHHHHHh---hhhhhhhceeeecCCCceEEEEecCCcccee-ecc---ccccCccccccc Confidence 4433322211111 1111111111111 1223445667777777677776433221111 111 11111111 233 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..++..+..+.-.++..+..++ +.++++....+.+.+.+.... ...++++. +. +...|+-........ 
T Consensus 180 ~~v~~~~~k~~~~~~iS~ell~d--s~~~l~~~i~~~la~ai~~~~----~~~~l~G~----G~-~~p~Gil~~~~~~~~ 248 (401) T protein:vir:44 180 GLIEPFMGEIYGNPQATQKMLDD--AFFNVEAWINSELATEFAEQE----EIAFTTGD----GT-KKPKGFLAYESTEES 248 (401) T ss_pred eeeeeehhheeeehhhhHHHHhc--chHHHHHHHHHHHHHHHHHHH----HhhhhccC----CC-Cccceeecccccccc Confidence 33333333333333344444443 344555555555555554322 22333221 11 111111110000000 Q ss_pred cceeecccc-cccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH--- Q lcl|NC_015466. 159 FDPTNASNN-DKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL--- 234 (344) Q Consensus 159 ~~k~tl~~t-~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~--- 234 (344) .+....++ .......++.--+.+|.+.+..+... .....+.+|+++.|.+|+. ++ ..+ +.-+..+ T Consensus 249 -~~~~~~~~~~~~~t~~~~~~~~d~i~~~~~~l~~~-~~~~a~~v~n~~~~~~L~~---lk----d~~--G~~l~~~~~~ 317 (401) T protein:vir:44 249 -DKARAFGKLQHIVSGEATAVTADAIIKLIYTLRKA-HRTGAKFMMNNNSLFAIRL---LK----DTE--GNYLWRPGLE 317 (401) T ss_pred -ccccccccccccccccccccCHHHHHHHHHhcchh-hhcCCEEEEcHHHHHHHHH---hh----ccC--CceeecCCcC Confidence 00000000 00001111222245555555554322 2334478999999988872 22 221 1112111 Q ss_pred -HHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccC Q lcl|NC_015466. 235 -VTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYL 312 (344) Q Consensus 235 -~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~ 312 (344) ..-..++|+| |++.+..- .... +++.+++ |+.+.+|.+ .+.+. . ...+.|. T Consensus 318 ~g~~~~l~G~P-Vv~~~~~p-----~~~~------~~~~i~~--------Gd~~~~~~i~~~~~~-----~-~~~~~~~- 370 (401) T protein:vir:44 318 LGQPSSLAGYG-IAENEQMP-----DIAA------DAKAIAF--------GNFKRGYTIVDRIGT-----R-ILRDPYT- 370 (401) T ss_pred CCCCceeccee-eEEecCcC-----CccC------CccEEEE--------eehhccEEEEEecce-----E-Eeeeccc- Confidence 1223467877 43332210 0000 1112221 222333332 22211 0 0112222 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ..+...+|+...++-.++-+++..+++-+.| T Consensus 371 -~~~~v~~~a~~r~d~~~~~~~a~~~l~~~aa 401 (401) T protein:vir:44 371 -NKPFVGFYTTKRTGGMLVDSQAIKLLKIAAA 401 (401) T ss_pred -cCCcEEEEEEEEeccEEecccceEEEEeecC Confidence 2466778899999999999999999999999 No 89 >protein:vir:78090 Length: 302 # NCBI annotation: Cps # Family: family:all:701 # MgeID: mge:1844 # MgeName: P35 # Cross-refs: genbank:acc:YP_001468790;genbank:gi:157325371;genbank:GeneID:5601852 Probab=71.51 E-value=0.14 Score=25.33 Aligned_cols=288 Identities=10% Similarity=0.031 Sum_probs=107.1 Q ss_pred CCCCCCCCccc--eecccccceeeeeEcCcchhhhhhhCcc-c-ccCCccceeeeechh----hcccccccccccCcccc Q lcl|NC_015466. 1 MPFTQPSRSDV--HVNRPLTNISIGYVQDASHFVAGQVFPQ-V-SVGKQSDAYFTYERG----DFNRDEMQERTPGTESA 72 (344) Q Consensus 1 m~~~~~~~~~~--~~dp~LT~iA~~Y~n~~~~~ia~~lfP~-v-~v~~~~~~~~~~~k~----~~~~~~~~~ra~g~~~~ 72 (344) |||+..-...| .+|.++..=+ +... |.+ -|. + -.+..+.|+++..-. ....+ =.|.-|.. T Consensus 1 Mantl~ya~~~~~~Ld~~~~~~~--~t~~---l~~---~~~~v~~~Gak~vkIp~is~~~~~TsGl~d--y~R~~g~~-- 68 (302) T protein:vir:78 1 MANSLALAQIYQDNIDKAIAVNS--KSAF---LEA---NPNNVQYNGGNTIKIADISFGSGTTGDLKA--YNRSTGFT-- 68 (302) T ss_pred CCchhHHHHHHHHHHHHHHHhhh--ceee---ccc---CCceEEEecCcEEEEEEEEeeccccccccc--cccccCcc-- Confidence 77655322222 1222222111 1000 000 000 0 012233444443210 00111 11222322 Q ss_pred cceeccccccccc-ccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHH-HHHHHHHhhhhhhcccccccccccc Q lcl|NC_015466. 73 GGTYEIGNDTYFA-RTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINRE-VNWAAAYFTAGAPGDTWTFDVDGVA 150 (344) Q Consensus 73 ~~~~~~~~~~~~~-~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E-~~~a~~~~~~~~~~~~~~~~~~gv~ 150 (344) ...++.+.+++.+ .+++..-.++..+.++.+.....-.-.-+...+++.=..+ .|++.+...+.... 
T Consensus 69 ~g~v~~~~et~tlt~DR~~~f~vD~mDvdETn~~~~~ani~~ef~r~~vvPEiDayrfskla~~a~~~~----------- 137 (302) T protein:vir:78 69 QGSVTLAWSDYTLDYDLAQSFQIDAMDVDETKNLATVGNVLSEYQRTKIVPAIDKYRFTKLANDGTGVG----------- 137 (302) T ss_pred ccceeeeeeeEEeeeccceeeeccccchhhhhhhhHHHHHHHHHHHhhhcchhhHHHHHHHHHhhhccC----------- Confidence 2233344444443 2344444444333333321111000000111111211122 24444443332111 Q ss_pred cccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcccc Q lcl|NC_015466. 151 SSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAA 230 (344) Q Consensus 151 ~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~ 230 (344) +....+.. +....+.+.+|++.++.+.+.. +-+|.+++.+..+|++.+.+.+.+.....+ .+ T Consensus 138 ---------~~~~~~~~-----~~t~~nvl~~i~~~~~~~~e~~---~~vl~vtp~~~~~Lk~a~~~~~~~~~~~~~-~~ 199 (302) T protein:vir:78 138 ---------GVIDLSKP-----DASAQALMGDIATAMELVDDSN---QLILVTSPTTLAGLLNTALIRESKNTQVLR-RG 199 (302) T ss_pred ---------cccccccc-----chhHHHHHHHHHHHHHHhhccC---CeEEEEChHHHHHHhcchhhccceeccccc-cc Confidence 00000000 0124578899999999998863 678999999999999988887665432211 12 Q ss_pred ccCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcccccc Q lcl|NC_015466. 231 KANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRF 310 (344) Q Consensus 231 ~vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~ 310 (344) .+ .-.+.++=|++-|.|-..+..+++.-.+|-...--++++=++.+++.+-+.-..+-... +..+ T Consensus 200 ~i-~~~V~~lDgv~Ii~VPs~r~~t~~~f~~G~~~~~~ak~INfiiv~~~a~ia~~K~~~~~--------------if~P 264 (302) T protein:vir:78 200 EV-DTKITFIQDVEVLQVPSEYLYDKVAPKVGVPDYTGAKKIPYMIFKRDAPTGIVKTDKVR--------------VFEP 264 (302) T ss_pred cc-cceeeeecccEEEEchhhhcccceeccCCccccCCccceeEEEECCCeeeeeeeeeeeE--------------eeCC Confidence 22 22244555666666655554443322222111111233333333333322111111111 1122 Q ss_pred cCCC-CceEEEeeccccceeeeccccchh---hhcccC Q lcl|NC_015466. 
311 YLDA-IESDRIEIDMSYDQKKVAADLGYF---FGGIVA 344 (344) Q Consensus 311 ~~~~-~~~~~vr~~~~~~~~v~~~~~g~l---~~~~va 344 (344) .... +..|.+....-++--|.-.-.-.+ +..+|| T Consensus 265 ~~~~~gd~~l~~~R~Y~D~fV~~nk~~gI~~~~~~~~~ 302 (302) T protein:vir:78 265 DTNQSADAYKVDLRLYHDLIVPKNQRPGIIKASFGTIA 302 (302) T ss_pred CCCCCcceeeeeeeeEeeeeeeccccCeEEEeeccccC Confidence 2221 123444443333333332221111 233444 No 90 >protein:vir:3158 Length: 321 # NCBI annotation: capsid protein gpE # Family: family:all:1377 # ACLAME annotation(s): phi:0000161 - phage head/capsid # MgeID: mge:316 # MgeName: PhiCh1 # Cross-refs: genbank:acc:NP_665929;genbank:gi:22091115;genbank:GeneID:951342 Probab=71.44 E-value=0.2 Score=24.47 Aligned_cols=286 Identities=10% Similarity=-0.017 Sum_probs=101.8 Q ss_pred CCCCCCCC--ccce--ecccccceeeeeEcCcc---hhhh-----h---hhCcccccCCccceeeeechh--hccccccc Q lcl|NC_015466. 1 MPFTQPSR--SDVH--VNRPLTNISIGYVQDAS---HFVA-----G---QVFPQVSVGKQSDAYFTYERG--DFNRDEMQ 63 (344) Q Consensus 1 m~~~~~~~--~~~~--~dp~LT~iA~~Y~n~~~---~~ia-----~---~lfP~v~v~~~~~~~~~~~k~--~~~~~~~~ 63 (344) ||.=.-++ +.+. -.-..+....||.-++. +++- . .+...++|....++.+..+-+ ...+ . T Consensus 1 ~~~k~~~~~l~~~~~~~~~~~~~~~~g~~v~~~~~~~l~~~i~e~s~~l~~i~v~~v~~~~~~i~~~~~~~~~~~~---~ 77 (321) T protein:vir:31 1 MASRTINNDLSRITEKNALTVDDLDAGGTLPDPLWDEFWTDMIEETPLLDAIRTETVGAKKTRIPTLNIGERHRRP---Q 77 (321) T ss_pred CchHHHHHHHHHHHHhccccccccCCcceeCHHHHHHHHHHHHHhhhhhhhceeeeccCcceeeeeeccCCccccc---c Confidence 32111000 0000 00001122223322211 1110 0 112335666666666654321 1111 0 Q ss_pred ccccCcccccceecccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccc-c Q lcl|NC_015466. 64 ERTPGTESAGGTYEIGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDT-W 142 (344) Q Consensus 64 ~ra~g~~~~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~-~ 142 (344) ... .....+....++..+|.|+.......++.+..++....++.++.....+.+.+.+..+ ...+++-..... . 
T Consensus 78 ~e~-~~~~~~~~~~~~~~~~~~~k~~~~~~it~e~L~d~a~~~d~e~~i~~~ia~~~a~~~~----~~~~nGd~~~~~~~ 152 (321) T protein:vir:31 78 DEG-EWNENESDVSTGTIDISTEKATVAWDLPREVVQENPEGEALADRILNLMTDAWSADVE----DLAANGDEDAEDSF 152 (321) T ss_pred ccc-ccccccccceeeeeeeeeEEEEeehhccHHHHHhhhcchhHHHHHHHHHHHHHHHHHH----hheeeccccCCCcc Confidence 011 1112233456777788888888777788877765443456666555555554444333 333333211000 0 Q ss_pred cccccccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcc-eEEeCHHHHHHHhcCHHHHHHh Q lcl|NC_015466. 143 TFDVDGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPN-VLTLGKAVYDALVDHPDIVGRI 221 (344) Q Consensus 143 ~~~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn-~~v~~~~v~~~L~~h~~i~~~i 221 (344) .....|. -.....+..+.. +..+ .-...+|.+....|...---.|+ +.+|+++.+.+++ +.+ T Consensus 153 ~~~n~G~----l~~a~~~~~~~~------~~~~-~~~~d~l~~l~~~l~~~yr~~~~~v~im~~~~~~~~~------~~l 215 (321) T protein:vir:31 153 ENQNDGF----ITVAEGDVETID------AADD-ILDNDLVIRTIAGLDSKYRARMNPALIVSEDQLLSYH------YTL 215 (321) T ss_pred cccchhh----hhhhcccccccc------cccc-ccCHHHHHHHHHhccHhHhcCCCeEEEechHHHHHHH------HHH Confidence 0001111 111111111111 1111 12233444444444332222356 5689999876554 333 Q ss_pred ccCCCcc-ccccCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccC Q lcl|NC_015466. 222 DRGQTSG-AAKANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGS 300 (344) Q Consensus 222 ~~~~~~~-~~~vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~ 300 (344) +..+... ...++-..-..++|+|-+. ...++++.+++. ++.+-+||..... T Consensus 216 ~~~~~~~~~~~l~~~~~~tl~G~pvv~-----------------~~~mP~~~il~t-----~~~nl~~~~~~~~------ 267 (321) T protein:vir:31 216 TDRDTPLGDNVIMGEADVNPFSFPIIG-----------------SGLWPDDKAMFT-----DPQNLIYALYRDL------ 267 (321) T ss_pred hcCCCccccchhhccccccccceeEEE-----------------cCCCCCCcEEEe-----ccccEEEEEeecc------ Confidence 3322211 1112222223466766331 112344444432 1222233322110 Q ss_pred CcCCcccccccCCC---CceEEEe--eccccceeeeccccchhhhcccC Q lcl|NC_015466. 
301 GNEGMRIKRFYLDA---IESDRIE--IDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 301 ~~~~~~~~~~~~~~---~~~~~vr--~~~~~~~~v~~~~~g~l~~~~va 344 (344) .++.+.+.. .+...++ .....+-+|--.++..+++|.-= T Consensus 268 -----~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ve~~~a~a~~~~i~~ 311 (321) T protein:vir:31 268 -----EIDVLTESDKVSERDLHARYFMRGDDDFAIENTEAVVLAEGLGD 311 (321) T ss_pred -----EEEEeecCccccccceeeEeeeeeecceeEeccccEEEEecCCc Confidence 111111111 1111111 11123333334445555554211 No 91 >protein:vir:78558 Length: 336 # NCBI annotation: major capsid protein # Family: family:all:1653 # MgeID: mge:1854 # MgeName: BcepNY3 # Cross-refs: genbank:acc:YP_001294848;genbank:gi:149882911;genbank:GeneID:5291029 Probab=65.86 E-value=0.28 Score=23.64 Aligned_cols=285 Identities=9% Similarity=-0.024 Sum_probs=97.7 Q ss_pred CC--------CCCCCCccceeccccccee--eeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcc Q lcl|NC_015466. 1 MP--------FTQPSRSDVHVNRPLTNIS--IGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTE 70 (344) Q Consensus 1 m~--------~~~~~~~~~~~dp~LT~iA--~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~ 70 (344) |. .+++...+ -+-..||++- .=|+---.++-++.|||...++.=..++.+|.-.+..-.. ..-+=+.+ T Consensus 31 ~a~da~d~~~~~~t~~~~-g~~~~l~~~i~p~~~~~~~~~~~~~~l~~v~t~g~W~~~~~~~~~~e~~G~a-~~ygd~~D 108 (336) T protein:vir:78 31 YAMDAADLSPHLSSTGSS-GIPNYLTTYVDPSVIDILVAPMKAAELVGESKKGDWTTLVAAFITAEPTTTV-ATYGDYSS 108 (336) T ss_pred HHHhhhhhccccccCCCc-chHHHHHHhcccceeeehhhhhhhhhhcccccCCCccccEEEEeeeecceee-EEeecccC Confidence 11 01111100 1122455433 1133322357799999986654333344455321100000 00000111 Q ss_pred cccceecccccccccccccccccccHHHH-HhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccc Q lcl|NC_015466. 71 SAGGTYEIGNDTYFARTRAYHRDVPEQVR-ANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGV 149 (344) Q Consensus 71 ~~~~~~~~~~~~~~~~~~~l~~~v~~~~~-~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv 149 (344) ...+.+............+..-.+...+. +.+....++..+..+-..+. .+..+|. ++.-+.. ..+. 
T Consensus 109 ~P~vd~~~~~~~~~v~~~~~g~~yg~~El~~A~~~g~~l~~~Ka~aA~~a---------le~~~N~-~~~~Gd~--~~~~ 176 (336) T protein:vir:78 109 DGDSGTNINYPQRQSYFFQTWTRWGERELEMAGAGRVDLASELNYSSALG---------LAKFLNG-SYLFGVA--GLEN 176 (336) T ss_pred CCeeecceeeEEEEEEEEEeeeeecHHHHHHHHHhCCCcHHHHHHHHHHH---------HHHhhCe-EEEEecc--ccce Confidence 11112222222222222222222222222 22222233222211111111 1112221 1111110 0111 Q ss_pred ccccccccccc-eeecccccccccCCCCCC-hHHHHHHHHHHHHHhcC-----CCcceEEeCHHHHHHHhcCHHHHHHhc Q lcl|NC_015466. 150 ASSPTAPASFD-PTNASNNDKLHWSDASST-PIEDIRQGKRYVLEETG-----FEPNVLTLGKAVYDALVDHPDIVGRID 222 (344) Q Consensus 150 ~~~~~~~~~~~-k~tl~~t~~~~Wsd~~SD-Pi~di~~~~~~i~~~~G-----~~Pn~~v~~~~v~~~L~~h~~i~~~i~ 222 (344) .+..+.+.-. .++.++ .+|...+.+ .+.||.+...++...++ -.|.+++|....+..|.+ T Consensus 177 -~GllN~P~l~a~~t~~~---~~w~~~T~~~I~~Di~~~~~~l~~qt~g~~~~~~~~tL~Lp~~~~~~L~~--------- 243 (336) T protein:vir:78 177 -YGLINDPSLSAPITATT---PWSGSPAVEAVVNEVVTLFQVLQTQSQGIITQEAVLHMGLPPTAMSDLSK--------- 243 (336) T ss_pred -EEEEeCCCCCcccccCc---CcccccCHHHHHHHHHHHHHHHHHhcCCeeeeccceEEEechHHHHhccC--------- Confidence 1111222111 122222 357777755 99999999999988886 247799999999988852 Q ss_pred cCCCccccccC-HHHHHHHhCCCeEEEEEEE-EeccccCCCCccceeCCCceEEEEecC-CCcccccccccceeeccccc Q lcl|NC_015466. 223 RGQTSGAAKAN-LVTLADLFEVDKVLVMKAV-RNTAKKGQTASHSFIGGKHALLSYAPA-TPGIMTPSAGYTFNWTGLVG 299 (344) Q Consensus 223 ~~~~~~~~~vt-~~~la~~~gl~~I~v~~a~-yn~~~~~~~~~~~~iw~~~~~l~~~~~-~~~~~~~s~G~T~~~~~~~g 299 (344) .+. .+ +| .+.|++-| |+|.+-.+. +..+ + ++.+.+++..- +.+.. ....+-.++.+.. T Consensus 244 -~n~--~g-~tv~~~lk~n~--Pnl~i~t~pel~~A----g-------g~~~~~~~~~~~~~~t~--~~~~p~~f~~lpv 304 (336) T protein:vir:78 244 -TNQ--YG-LSAAAKLKEIF--PKLEFVTIPEYDTA----S-------GRLVQLWAPRVEGKDTA--TCGFTEKMRAHSI 304 (336) T ss_pred -CCc--cC-ccHHHHHHHhc--CccEEEEccccccc----C-------cceEEEEEeeccCCcce--eeecchhhhccce Confidence 111 12 33 35666653 444332221 1111 1 12222332210 00100 0111111111111 Q ss_pred CCcCCcccccccCCCCceEEEeeccccceeeeccccch Q lcl|NC_015466. 
300 SGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGY 337 (344) Q Consensus 300 ~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~ 337 (344) ..........+...-.|..+.|-... ..-.|. T Consensus 305 q~~~~~~~v~~~~rt~Gv~i~~P~ai------~~~~GI 336 (336) T protein:vir:78 305 ERYSSYFRQKKSAGTWGAVIFRPFAV------AQMIGV 336 (336) T ss_pred eecCceeEeccccceeeeeeeccchh------eeeccC Confidence 11111222222222222222222222 222222 No 92 >protein:vir:107687 Length: 319 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1518 # MgeName: T1 # Cross-refs: genbank:acc:YP_003898;genbank:gi:45686314;genbank:GeneID:2773027 Probab=64.82 E-value=0.29 Score=23.50 Aligned_cols=280 Identities=11% Similarity=0.056 Sum_probs=95.0 Q ss_pred CCCCCCCCcc----------------------ceecccccceee-eeEcCcchhhhhhhCccc---ccCCccceeeeech Q lcl|NC_015466. 1 MPFTQPSRSD----------------------VHVNRPLTNISI-GYVQDASHFVAGQVFPQV---SVGKQSDAYFTYER 54 (344) Q Consensus 1 m~~~~~~~~~----------------------~~~dp~LT~iA~-~Y~n~~~~~ia~~lfP~v---~v~~~~~~~~~~~k 54 (344) |.+|..+... +-.-..|+.|-. -|.-.-.++.+.++||.. +...+++.|..|+. T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~~~~da~~~~g~~~~~ql~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~ 80 (319) T protein:vir:10 1 MTTKKFDEADKSNVEMYLIQAGVKQDAAATMGIWTAQELHRIKSQSYEEDYPVGSALRVFPVTTELSPTDKTFEYMTFDK 80 (319) T ss_pred CCCcchhHHhhHHHHHHHhhccchhhhhhhhhhHHHHHHHHHHHHHHhhhhcceechhhcccccCCCCceEEEEeeeecc Confidence 3333221110 001111222221 122222357888999864 33344444444432 Q ss_pred -hhcccccccccccCccc---ccceecccccccccccccccccccHHHHHhc-cCCCCHHHHHHHHHHHHHhhhHHHHHH Q lcl|NC_015466. 55 -GDFNRDEMQERTPGTES---AGGTYEIGNDTYFARTRAYHRDVPEQVRANA-DNPISLDREATIFVTQKGLINREVNWA 129 (344) Q Consensus 55 -~~~~~~~~~~ra~g~~~---~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a-~~~~~~~~~a~~~~~~~i~l~~E~~~a 129 (344) +.... ++... 
..+.+.++.....+...+..-.+-..+.+.+ ....+++.+........+ +...- T Consensus 81 ~G~a~~-------~~d~~~dip~v~~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~----~~~~n 149 (319) T protein:vir:10 81 VGTAQI-------IADYTDDLPLVDALGTSEFGKVFRLGNAYLISIDEIKAGQATGRPLSTRKASACQLAH----DQLVN 149 (319) T ss_pred ccceee-------ecCccccccceeccceeeEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHH----HHhhc Confidence 11110 11111 1111222221112222222222222222222 233444433322221111 11111 Q ss_pred HHHhhhhhhcccccccccccccccccccccceeecccccccccCC-CCC---ChHHHHHHHHHHHHHhcC--CCcceEEe Q lcl|NC_015466. 130 AAYFTAGAPGDTWTFDVDGVASSPTAPASFDPTNASNNDKLHWSD-ASS---TPIEDIRQGKRYVLEETG--FEPNVLTL 203 (344) Q Consensus 130 ~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd-~~S---DPi~di~~~~~~i~~~~G--~~Pn~~v~ 203 (344) +.+|.+ .. . .++ .+.-+.+.-...+.+ +|++ ++. ..+.||.+...++...++ ..|++++| T Consensus 150 ~i~f~G----~~-~---~g~-~GLlN~p~~~~~~~~-----~~~~~~t~t~~~i~~di~~~~~~l~~~s~g~~~p~~L~L 215 (319) T protein:vir:10 150 RLVFKG----SA-P---HKI-VSVFNHPNITKITSG-----KWIDVSTMKPETAEAELTQAIETIETITRGQHRATNILI 215 (319) T ss_pred eEEEee----cc-c---ccc-eeEEeCCCceeeecC-----CCCCccccCHHHHHHHHHHHHHHHHHhcCceeeceEEEe Confidence 122211 11 0 011 111111211122221 2433 333 457889888888876543 38999999 Q ss_pred CHHHHHHHhcCHHHHHHhccCCCccccccCHHHHHHHh-CCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCc Q lcl|NC_015466. 204 GKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLADLF-EVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPG 282 (344) Q Consensus 204 ~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la~~~-gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~ 282 (344) +++.+..|.+ +. .+ .+.--.++|++.+ ++. | +....+..+. .. +.+.+++|...... 
T Consensus 216 ~p~~~~~L~~------~~--~~---~~~t~l~~lk~~~~~l~-I-~~~pel~~ag--~~-------g~~~~v~y~~~~~~ 273 (319) T protein:vir:10 216 PPSMRKVLAI------RM--PE---TTMSYLDYFKSQNSGIE-I-DSIAELEDID--GA-------GTKGVLVYEKNPMN 273 (319) T ss_pred cHHHHHhhhc------cc--CC---CCeeHHHHHHHhcCCce-E-EEeeeecccC--CC-------cceEEEEEecCCce Confidence 9999988842 11 11 1233456777754 222 1 1111222211 11 22344445432111 Q ss_pred ccccccccceeecccccCCcCCcccccccCCCCceEEEeeccccc-eeeeccccchhhhcc Q lcl|NC_015466. 283 IMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYD-QKKVAADLGYFFGGI 342 (344) Q Consensus 283 ~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~-~~v~~~~~g~l~~~~ 342 (344) + +--.+--|+.-+ -+.....+.+.+...+- -.|--|.+-+.+.+. T Consensus 274 ~-~~~v~~~~~~~~--------------~e~~~l~~~~~~~~r~~Gv~i~~P~ai~~~dGI 319 (319) T protein:vir:10 274 M-SIEIPEAFNMLP--------------AQPKDLHFKVPCTSKCTGLTIYRPMTIVLITGV 319 (319) T ss_pred E-EEecCcceeeee--------------eeecCceEEEeeeeeeEEEEEEccceeEeeecC Confidence 1 111111111110 11111222222222211 112223333333333 No 93 >protein:vir:1084 Length: 437 # NCBI annotation: capsid protein # Family: family:all:21 # MgeID: mge:21 # MgeName: bIL309 # Cross-refs: genbank:acc:NP_076738;genbank:gi:13095848;genbank:GeneID:920418 Probab=60.31 E-value=0.38 Score=22.92 Aligned_cols=263 Identities=11% Similarity=0.042 Sum_probs=89.4 Q ss_pred CCCCCCCCcccee-cccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccccc-ceecc Q lcl|NC_015466. 1 MPFTQPSRSDVHV-NRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAG-GTYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~-dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~-~~~~~ 78 (344) +.........+.| ....+.|- ..+... -+ ..++..+++....++++.............+ +..... 
....| T Consensus 156 ~~~~~~~~~g~lvp~~~~~~i~-~~~~~~--~l-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~e---~~~~~e~~~~~~ 228 (437) T protein:vir:10 156 VTGIALKDGKVIIPETILTPEK-EVHQFP--RL-GSLVRTESVTTTTGKLPIFNNSTDLLTAHTE---YGQTTKNATPVI 228 (437) T ss_pred hhhcccccccccchHHHHHHHH-Hhhhhh--hh-hhcceeEeeccCceeeEEeeccccccccccc---cccccccccccc Confidence 1111111111111 11111111 111100 01 1122334555555555544221111111111 111111 11233 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..++.....+.-.+++.+..++. .+++.......+.+.+....+ ..++++. ..+. T Consensus 229 ~~v~~~~~k~~~~~~is~ell~ds--~~~~~~~i~~~l~~~~~~~~~----~~i~~g~---------g~~~--------- 284 (437) T protein:vir:10 229 TPILWDLKTYTGGYVFSQELISDS--SYDWQAELQSRLIELRDNTDD----SLIITAL---------TDGI--------- 284 (437) T ss_pred eeeeeehhheeeehhhhHHHHhhh--HHHHHHHHHHHHHHHHHHHHH----HHHhhhh---------cccc--------- Confidence 333333332222333444433322 223333322333333322211 1222211 0000 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcc-eEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH--- Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPN-VLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL--- 234 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn-~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~--- 234 (344) . ....++...+|.+++..-. ..++++| +.+|++++|.+|+. ++.++. 
.-+..+ T Consensus 285 --~-----------~~~~~~~~~~~~~~~~~~l-~~~~~~~~~~~~~~~~~~~l~~-------lkd~~g--~~~~~~~~~ 341 (437) T protein:vir:10 285 --K-----------KTTSTYLLGDLKKVLNVTL-KPQDSAAASIVMSQSAYNLFDM-------ATDAMG--RPLLQPNVT 341 (437) T ss_pred --c-----------ccccccchhhHHHHHHhhh-hhhhhcCCEEEEcHHHHHHHHH-------hhccCC--CeeeccCcc Confidence 0 0012334445555554322 2345566 57999999988863 222211 111111 Q ss_pred -HHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCc--eEEEEecCCCcccccccccceeecccccCCcCCccccccc Q lcl|NC_015466. 235 -VTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKH--ALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFY 311 (344) Q Consensus 235 -~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~--~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~ 311 (344) ..-..+||.| |++.+.... ..+..+...-++|+. .++++.. -|.++.++. T Consensus 342 ~~~~~~l~G~p-v~~~~~~~~--~~~~~~~~~~~~gd~~~~~~~~~r---------~~~~~~~~~--------------- 394 (437) T protein:vir:10 342 AATGYTLLGKT-VVIVDDKLF--PSASAGDVNIVVAPLKKAVINFKL---------TEITGQFQD--------------- 394 (437) T ss_pred CCCCcccccce-eEEeccccc--CCcCCCceEEEEeeccccEEEEee---------eceEEEEec--------------- Confidence 1123567877 333221110 011111111122221 1111100 122222211 Q ss_pred CCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 312 LDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 312 ~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ........+++.+.++-.++-+++..+|+.-+. 
T Consensus 395 ~~~~~~~~~~~~~r~d~~~~~~~a~~~l~~~~~ 427 (437) T protein:vir:10 395 TYDIWYKQLGIFLRQNVVQASKDLIVNLTGKLK 427 (437) T ss_pred ccccccceeeEEEEEccEEecccceEEEEeecc Confidence 111223455666677778888888887763322 No 94 >protein:vir:3845 Length: 395 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:322 # MgeName: phi adh # Cross-refs: genbank:acc:NP_050151;swissprot:trembl:q9t1f6;genbank:gi:9633043;uniprot:Q9T1F6;genbank:GeneID:1262163 Probab=59.29 E-value=0.4 Score=22.79 Aligned_cols=271 Identities=9% Similarity=0.040 Sum_probs=96.3 Q ss_pred CCCCC--CCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eec Q lcl|NC_015466. 1 MPFTQ--PSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYE 77 (344) Q Consensus 1 m~~~~--~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~ 77 (344) |.... ++...+.+-+.+.+--+..... ..+-..++..+|+....+++............ .-..-|...... ..+ T Consensus 105 ~~~~~~~~~~gg~~vP~~~~~~ii~~~~~--~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~a-~~v~E~~~~~~~~~~~ 181 (395) T protein:vir:38 105 VTSGTTGTGNAGLTIPEDIQLQIRTLTRS--FTSLESLANVENVTTSHGSRVYEKLADITPLK-DLDDESALIGDNDDPE 181 (395) T ss_pred HhhccCccCCCceecchhHhhHHHHHHHh--hcchhhhcceeeccCCcceEEEEeeccCCccc-cccccccccccccccc Confidence 22111 1111111211111100111000 11233345556666665655432111110000 001112121111 123 Q ss_pred ccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccccccc Q lcl|NC_015466. 78 IGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPA 157 (344) Q Consensus 78 ~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~ 157 (344) |....+.++..+.-.++..+..++ +.++++......+.+.+....| ..++++. |... 
T Consensus 182 f~~v~~~~~k~~~~~~iS~ell~d--s~~~l~~~i~~~la~~~~~~~~----~~il~g~-----------g~~~------ 238 (395) T protein:vir:38 182 LTVVKYLIHRYAGITTVTNTLLKD--TVDNIIQWLVNWAAKKDVVTRN----AKILEVM-----------GKAP------ 238 (395) T ss_pred eeeEEeeeeeeEeehhhHHHHHhh--hHHHHHHHHHHHHHHHHHHHHH----HHHhhcc-----------cccc------ Confidence 333333333322223333444433 2334444444444444443332 2222211 1000 Q ss_pred ccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHH Q lcl|NC_015466. 158 SFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTL 237 (344) Q Consensus 158 ~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~l 237 (344) + . .+.....+|.+..............+.+|+++.|.+|+. ++.. .++.--...++...- T Consensus 239 --~---~----------~~~~~~~~i~~~~~~~l~~~~~~~a~~v~n~~~~~~L~~---lkd~--~G~~l~~~~~~~~~~ 298 (395) T protein:vir:38 239 --K---K----------PTISQFDNIKDLENNTLDPAIESTSSFITNQSGYNILSK---VKDA--DGRYLMQPDVTSPDK 298 (395) T ss_pred --c---c----------cccccHHHHHHHHHHhhhhhhcCCCEEEEcHHHHHHHHH---hhcc--CCceeeccCcCCCCc Confidence 0 0 011122344555544444433444568999999999863 2221 111000001111122 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccc-eeecccccCCcCCcccccccC---- Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYT-FNWTGLVGSGNEGMRIKRFYL---- 312 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T-~~~~~~~g~~~~~~~~~~~~~---- 312 (344) ..++|.| |.+.+..... +..+...-+++ +.+.++. +.+.+. .++..+. T Consensus 299 ~~l~G~p-V~~~~~~~~~---~~~~~~~i~~g---------------d~~~~~~i~~~~~~--------~i~~~~~~~~~ 351 (395) T protein:vir:38 299 YLIDGKP-VIRIADKWLP---DVSGSHPLYFG---------------DLKQGITLFDRQQM--------QIDTTNVGAGS 351 (395) T ss_pred ceeccce-eEEecccccC---cCCCcceEEEE---------------eccccEEEEEecce--------EEEEeccccch Confidence 3456766 3333221111 00111111222 2111111 111111 1111111 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
313 DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) -......+|+...++-.+.-+++-..++-..+ T Consensus 352 ~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~ 383 (395) T protein:vir:38 352 FEHDTTKLRFIDRFDVQLIDDGAFAAASFKTV 383 (395) T ss_pred hhcCceEEEEEEeeccEEecccceEEEEeecc Confidence 12345667888888888888888887776555 No 95 >protein:vir:107732 Length: 379 # NCBI annotation: gp23 # Family: family:all:1653 # MgeID: mge:1520 # MgeName: BcepB1A # Cross-refs: genbank:acc:YP_024871;genbank:gi:48697513;genbank:GeneID:2948349 Probab=57.30 E-value=0.44 Score=22.55 Aligned_cols=286 Identities=13% Similarity=0.052 Sum_probs=91.4 Q ss_pred CCCC---C-CC-Cccce-eccccc--------ceeeeeEcCc------chhhhhhhCcccccCCccceeeeechhh---- Q lcl|NC_015466. 1 MPFT---Q-PS-RSDVH-VNRPLT--------NISIGYVQDA------SHFVAGQVFPQVSVGKQSDAYFTYERGD---- 56 (344) Q Consensus 1 m~~~---~-~~-~~~~~-~dp~LT--------~iA~~Y~n~~------~~~ia~~lfP~v~v~~~~~~~~~~~k~~---- 56 (344) |+++ | .. ..+.+ ..|.|+ ++-..|. +. ..+.++.|||...++.-..++.+|.-.+ T Consensus 49 ~~~~~~amd~~~~~~~~~~~~~l~~~~~~g~~~~l~~~~-p~~i~~~tap~~a~~l~pv~t~g~W~~~~~~~~v~e~~G~ 127 (379) T protein:vir:10 49 FELMQFAMDSNDIGPIPTPLSPLSPVSIPGLIQFLQNWL-PGHVRILTAVREADEFLGLSTVGQWDDEQIVQRVLEGLGT 127 (379) T ss_pred hhhhhhhhccccccccccccCccccccccchHHHHHhhc-chHHHHHhhhhhhhhhcccccCCCceeeeEEEeeeeeeee Confidence 1110 1 00 01000 011222 1111232 11 1356888888866554444445543211 Q ss_pred cccccccccccCcccccceecccccccccccccccccccH-HHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhh Q lcl|NC_015466. 57 FNRDEMQERTPGTESAGGTYEIGNDTYFARTRAYHRDVPE-QVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTA 135 (344) Q Consensus 57 ~~~~~~~~ra~g~~~~~~~~~~~~~~~~~~~~~l~~~v~~-~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~ 135 (344) ...+.+ +.....+.+............+..-.+-. +.++.+....++..+..+-. .+..| +.+|. 
T Consensus 128 A~~ygd-----~~d~pl~d~~~~~~~r~v~~~~~g~~yg~~El~~Aa~~g~~l~~~Ka~aA----~~ale-----~~~N~ 193 (379) T protein:vir:10 128 AQPYTD-----GGNMALMSWTPTFETRTVVRFEAGLQVAPLEEARSSRVQVSSADEKRAMV----GEALE-----VQRNR 193 (379) T ss_pred eEEecc-----ccCCCeeeeeeeeeeeeeEEEEEEEeecHHHHHHHHHhCCChHHHHHHHH----HHHHH-----Hhhce Confidence 111100 00111111111111111111111111212 22222222333332211111 11111 11221 Q ss_pred hh-hccc-cccccccccccccccccc----ceeecccccccccCCCCCC-hHHHHHHHHHHHHHhcCC------CcceEE Q lcl|NC_015466. 136 GA-PGDT-WTFDVDGVASSPTAPASF----DPTNASNNDKLHWSDASST-PIEDIRQGKRYVLEETGF------EPNVLT 202 (344) Q Consensus 136 ~~-~~~~-~~~~~~gv~~~~~~~~~~----~k~tl~~t~~~~Wsd~~SD-Pi~di~~~~~~i~~~~G~------~Pn~~v 202 (344) -. ++.+ ...... +..+.+.- ...+-+++ ..+|.+.+.+ .+.||.++...+...++- .|.+++ T Consensus 194 i~f~G~~d~~~~~y----GllNdP~l~a~~t~atg~~~-~t~Wa~kT~~eI~~Di~~~~~~l~~qs~g~~~~~~~~~tL~ 268 (379) T protein:vir:10 194 VAFYGYNDGSGRTF----GFLNDPNLPAYVAVPNGAGG-SPLWAQKTTLEIIADLRNGLTALQVQSMGRIKSNKTPITIG 268 (379) T ss_pred EEEEeecCCCcceE----EEEeCCCCcccccccCCccc-ccccccCCHHHHHHHHHHHHHHHHHhhCCeecccccceeEE Confidence 11 1100 000000 11111111 11122222 2369887644 589999999998877663 455999 Q ss_pred eCHHHHHHHhcCHHHHHHhccCCCccccccC-HHHHHHHhCCCeEEEEEEE-EeccccCCCCccceeCCCceEEEEecC- Q lcl|NC_015466. 203 LGKAVYDALVDHPDIVGRIDRGQTSGAAKAN-LVTLADLFEVDKVLVMKAV-RNTAKKGQTASHSFIGGKHALLSYAPA- 279 (344) Q Consensus 203 ~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt-~~~la~~~gl~~I~v~~a~-yn~~~~~~~~~~~~iw~~~~~l~~~~~- 279 (344) |....+..|.+ .+ ..+ +| .++|++-| |++.+-.+. +..+ .+ +.+.+++|.+. 
T Consensus 269 LP~~~~~~L~~----------~n--~~g-~Tvl~~lk~n~--Pnl~i~t~pEL~~a----gg------g~~~~~~~~~~~ 323 (379) T protein:vir:10 269 IPNAYENYITT----------PT--ELG-YSVAQYMRESY--PNVTFVSAPELNDA----NG------GSSAIYYYADAV 323 (379) T ss_pred ecHHHHHhhcc----------cc--ccC-ccHHHHHHHhc--CCcEEEEccccccc----CC------CccEEEEEeecc Confidence 99999988852 11 111 33 45666654 333322221 1221 11 11223333321 Q ss_pred C-Cccccc---ccccceeecccccCCcCCcccccccCCCCceEEEeeccccceeeeccccch Q lcl|NC_015466. 280 T-PGIMTP---SAGYTFNWTGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGY 337 (344) Q Consensus 280 ~-~~~~~~---s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~ 337 (344) . .+.+++ ....+-.++.+-........... ..+..++..||.-.. ++.-.|. T Consensus 324 ~~~~t~~~~~~~~~~p~k~~~l~ve~~~~~~~~~-~~~rt~Gv~ir~P~A-----i~~~~G~ 379 (379) T protein:vir:10 324 ENNGTDDGRTWLQVVPTKMFTLGVEKKIKGYAEG-YTNATAGAMLKRPFA-----TYRQTGA 379 (379) T ss_pred CCCccCCcceEEEecchhhhhccceecCceeEec-cccceeeeeeecchh-----hheecCC Confidence 1 111111 00011111111001111111112 222223333322222 2222222 No 96 >protein:vir:101557 Length: 336 # NCBI annotation: gp12 # Family: family:all:1653 # MgeID: mge:1477 # MgeName: Bcep43 # Cross-refs: genbank:acc:NP_958117;genbank:gi:41057663;genbank:GeneID:2716814 Probab=54.01 E-value=0.51 Score=22.17 Aligned_cols=282 Identities=10% Similarity=-0.022 Sum_probs=97.7 Q ss_pred CCCCCCCCccceeccccccee-ee-eEcCcchhhhhhhCcccccCCccceeeeechhhc----ccccccccccCcccccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNIS-IG-YVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDF----NRDEMQERTPGTESAGG 74 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA-~~-Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~----~~~~~~~ra~g~~~~~~ 74 (344) =|.+++...+- +-..||++. -+ |+-.-..+.++.|||...++.=..++.+|.-.+. 
..+.+ +.+...+ T Consensus 39 ~~~~~~~~~~~-i~~~l~~~i~p~~~~~~~~p~~a~~l~pv~t~g~W~~~~~~~~~~e~~G~a~~ygd-----~~D~P~~ 112 (336) T protein:vir:10 39 SPHLSSTGSSG-IPNYLTTYVDPAVIDILVAPMKAAELVGESKKGDWTTLVAAFITAEPTTKVATYGD-----YSSDGDS 112 (336) T ss_pred cCccccCCCch-hHHHHHhhcccceeeehhhhhhhhhhccccccCCccceeEEEeeeeceeeEEEeec-----cCCCcee Confidence 01111111111 112355533 12 2211124779999998665433334444432111 11100 1111111 Q ss_pred eecccccccccccccccccccH-HHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccccccccccc Q lcl|NC_015466. 75 TYEIGNDTYFARTRAYHRDVPE-QVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSP 153 (344) Q Consensus 75 ~~~~~~~~~~~~~~~l~~~v~~-~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~ 153 (344) .+.....+....-.+..-.+.. +.++.+....++..+..+-..+.+ +...|. ++.-+.. ..+. .+. T Consensus 113 d~~~~~~~~~v~~~~~g~~yg~~El~~A~~~g~~l~~~Ka~aA~~al---------e~~~N~-i~~~Gd~--~~~~-yGl 179 (336) T protein:vir:10 113 GANINYPQRQSYFFQTWTRWGERELEMAGAGRVDLASELNYSSALGL---------AKFLNG-SYLFGVA--GLEN-YGL 179 (336) T ss_pred ecccceeeeeEEEEEeeeeeCHHHHHHHHHhCCCcHHHHHHHHHHHH---------HHhhCc-EEEEecc--ccce-EEE Confidence 2111111222222222222222 222222233333322111111111 112221 1111110 0000 111 Q ss_pred ccccccc-eeecccccccccCCCCCC-hHHHHHHHHHHHHHhcC-----CCcceEEeCHHHHHHHhcCHHHHHHhccCCC Q lcl|NC_015466. 154 TAPASFD-PTNASNNDKLHWSDASST-PIEDIRQGKRYVLEETG-----FEPNVLTLGKAVYDALVDHPDIVGRIDRGQT 226 (344) Q Consensus 154 ~~~~~~~-k~tl~~t~~~~Wsd~~SD-Pi~di~~~~~~i~~~~G-----~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~ 226 (344) -+.++-. 
.++.++ .+|...+.+ .+.||.+....+...++ -.|++++|....+..|.+ .+ T Consensus 180 lN~P~l~a~~t~~t---~~~~~~t~eei~~Di~~~~~~l~~qs~G~i~~~~~~tL~LP~~~~~~Ls~----------~n- 245 (336) T protein:vir:10 180 INDPSLSAPITATT---PWSGSPAVEAVVNEVVALFQVLQTQSQGIITQEDVLRMGLPPTAMSDLSK----------TN- 245 (336) T ss_pred EeCCCCccccccCC---CcccccCHHHHHHHHHHHHHHHHHhcCCeecccCcceEEecHHHHHhccC----------CC- Confidence 1211111 122222 246666644 89999999999988664 469999999999888742 11 Q ss_pred ccccccC-HHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecC-CCcccccccccceeecccccCCcCC Q lcl|NC_015466. 227 SGAAKAN-LVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPA-TPGIMTPSAGYTFNWTGLVGSGNEG 304 (344) Q Consensus 227 ~~~~~vt-~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~-~~~~~~~s~G~T~~~~~~~g~~~~~ 304 (344) ..+ +| .+.|++-| |+|.+-.+- ...+.++ +.+.|++..- +.+..+ .+.+-.++.+....... T Consensus 246 -~~g-~Tvl~~lk~n~--Pnl~i~t~p---El~~a~G-------~~~~l~~~~~~~~~t~~--~~~p~~~~~l~vq~~~~ 309 (336) T protein:vir:10 246 -QYG-LAAAAKLKDIF--PKLEFVTIP---EYDTASG-------RLVQLWAPRVEGKDTAT--CGFTEKMRAHSIERYSS 309 (336) T ss_pred -ccC-ccHHHHHHHhc--CccEEEEcc---ccccCCC-------ceEEEEEEecCCCccee--eecchhhhccceeecCc Confidence 112 33 45677654 444332221 1112222 2344444321 111110 11121111111111111 Q ss_pred cccccccCCCCceEEEeeccccceeeeccccch Q lcl|NC_015466. 305 MRIKRFYLDAIESDRIEIDMSYDQKKVAADLGY 337 (344) Q Consensus 305 ~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~ 337 (344) .....+...-.|..+.|-. -+..-.|. 
T Consensus 310 ~~~v~~~~rt~Gv~i~~P~------ai~~~~GI 336 (336) T protein:vir:10 310 YFRQKKSAGTWGAVIFRPF------AVAQMIGV 336 (336) T ss_pred eeEeccccceeeeeeeccc------hheeeecC Confidence 2222222222222222222 22222222 No 97 >protein:vir:106734 Length: 336 # NCBI annotation: gp13 # Family: family:all:1653 # MgeID: mge:1599 # MgeName: Bcep1 # Cross-refs: genbank:acc:NP_944321;genbank:gi:38638620;genbank:GeneID:2657363 Probab=52.86 E-value=0.54 Score=22.04 Aligned_cols=280 Identities=10% Similarity=-0.006 Sum_probs=96.4 Q ss_pred CC--------CCCCCCccceeccccccee--eeeEcCcchhhhhhhCcccccCCccceeeeechhhc----ccccccccc Q lcl|NC_015466. 1 MP--------FTQPSRSDVHVNRPLTNIS--IGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDF----NRDEMQERT 66 (344) Q Consensus 1 m~--------~~~~~~~~~~~dp~LT~iA--~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~----~~~~~~~ra 66 (344) |. .+++...+ -+-..||++- .=|+---.++-++.|||...++.-..++.+|.-.+. ..+.+ T Consensus 31 ~a~da~d~~~~~~t~~~~-g~~~~l~~~i~p~~~~~~~~~~~~~~l~~v~t~g~w~~~~~~~~~~e~~G~a~~ygd---- 105 (336) T protein:vir:10 31 YAMDAADLSPHLSSTGSS-GIPNYLTTYVDPSVIDILVAPMKAAELVGESKKGDWTTLVAAFITAEPTTKVATYGD---- 105 (336) T ss_pred HHHhhhhhccccccCCCc-chHHHHHhhcCcceeeeeechhchhhhcccccCCCcceeeEEEEeeeeeeeEEEccc---- Confidence 11 01111100 1112344422 112222224678899998766555555555532111 11111 Q ss_pred cCcccccceecccccccccccccccccccHHHHH-hccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccccc Q lcl|NC_015466. 67 PGTESAGGTYEIGNDTYFARTRAYHRDVPEQVRA-NADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFD 145 (344) Q Consensus 67 ~g~~~~~~~~~~~~~~~~~~~~~l~~~v~~~~~~-~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~ 145 (344) +.....+.+..........-.+..-.+..++.+ .+....++..+..+-..+ ..+..+|. ++.-+.. 
T Consensus 106 -~~d~P~~d~~~~~~~~~v~~~~~g~~yg~~El~~A~~~g~~l~~~Ka~aA~~---------ale~~~N~-~~~~Gd~-- 172 (336) T protein:vir:10 106 -YSSDGDSGTNINYPQRQSYFFQTWTRWGERELEMAGAGRVDLASELNYSSAL---------GLAKFLNG-SYLFGVA-- 172 (336) T ss_pred -cCCCcceeeeeeeeeeeEEEEEEEEeeCHHHHHHHHHhCCCcHHHHHHHHHH---------HHHHhhCe-EEEEeec-- Confidence 001111111111111122222222222222222 122222222211111100 11112221 1111100 Q ss_pred ccccccccccccccc-eeecccccccccCCCCCC-hHHHHHHHHHHHHHhcC-----CCcceEEeCHHHHHHHhcCHHHH Q lcl|NC_015466. 146 VDGVASSPTAPASFD-PTNASNNDKLHWSDASST-PIEDIRQGKRYVLEETG-----FEPNVLTLGKAVYDALVDHPDIV 218 (344) Q Consensus 146 ~~gv~~~~~~~~~~~-k~tl~~t~~~~Wsd~~SD-Pi~di~~~~~~i~~~~G-----~~Pn~~v~~~~v~~~L~~h~~i~ 218 (344) ..+. .+.-+.+.-. .++.++ .+|...+.+ .+.||.+....+...++ -.|.+++|....+..|.+ T Consensus 173 ~~~~-~GllN~P~l~a~~t~~~---~~w~~~T~~eI~~Di~~~~~~l~~qt~g~i~~~~~~tL~Lp~~~~~~L~~----- 243 (336) T protein:vir:10 173 GLEN-YGLINDPSLSAPITATT---PWSGSPAVEAVVNEVVTLFQVLQTQSQGIITQEAVLHMGLPPTAMSDLSK----- 243 (336) T ss_pred ccce-EEEeecCCCCcccccCc---CcccccCHHHHHHHHHHHHHHHHHhcCCeeeeccceEEEechHHHHhccC----- Confidence 0011 1111222111 122222 257777755 99999999999988886 247799999999988852 Q ss_pred HHhccCCCccccccC-HHHHHHHhCCCeEEEEEEE-EeccccCCCCccceeCCCceEEEEecCC--Ccccccccccceee Q lcl|NC_015466. 219 GRIDRGQTSGAAKAN-LVTLADLFEVDKVLVMKAV-RNTAKKGQTASHSFIGGKHALLSYAPAT--PGIMTPSAGYTFNW 294 (344) Q Consensus 219 ~~i~~~~~~~~~~vt-~~~la~~~gl~~I~v~~a~-yn~~~~~~~~~~~~iw~~~~~l~~~~~~--~~~~~~s~G~T~~~ 294 (344) .+. .+ +| .+.|++- .|++.+-.+. +..+ + ++.+.+ +++.. .+.. ....+-.+ T Consensus 244 -----~n~--~g-~tv~~~lk~n--~Pnl~i~t~pel~~A----g-------g~~~~~-~~~~~~~~~t~--~~~~P~~f 299 (336) T protein:vir:10 244 -----TNQ--YG-LSAAAKLKEI--FPKLEFVTIPEYDTA----S-------GRLVQL-WAPRVEGKDTA--TCGFTEKM 299 (336) T ss_pred -----CCc--cC-ccHHHHHHHh--CCccEEEEccccccc----C-------CceEEE-EEecccCCcce--eeecChhh Confidence 111 12 33 3456654 3444332221 1111 1 122233 32211 0100 01111111 Q ss_pred cccccCCcCCcccccccCCCCceEEEeeccccceeeeccccch Q lcl|NC_015466. 
295 TGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGY 337 (344) Q Consensus 295 ~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~ 337 (344) +.+............+...-.|..+.|-.. +..-.|. T Consensus 300 ~~lpvq~~~~~~~v~~~~rt~Gv~i~rP~a------i~~~~GI 336 (336) T protein:vir:10 300 RAHSIERYSSYFRQKKSAGTWGAVIFRPFA------VAQMLGV 336 (336) T ss_pred hccceeecCceeEeccccceeeeeeeccch------heeeccC Confidence 111111111112222222222222222222 2222222 No 98 >protein:vir:1383 Length: 421 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:314 # MgeName: phi3626 # Cross-refs: genbank:acc:NP_612835;genbank:gi:20065969;genbank:GeneID:935826 Probab=50.16 E-value=0.62 Score=21.73 Aligned_cols=262 Identities=7% Similarity=-0.031 Sum_probs=94.7 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccceecccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYEIGN 80 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~~~~ 80 (344) +.....+ ..++.....+.|-..-++ ...-..++..+||....++++......... .....-|.........|+. T Consensus 116 ~~t~~~g-g~liP~~~~~~Ii~~~~~---~~~l~~l~~~~~~~~~~~~~~~~~~~~~~~--~~~~~E~~~~~~s~~~f~~ 189 (421) T protein:vir:13 116 IMSSTNN-GAVIPQEFVNEFEKLKEG---YPSLKEHCHVIPVNRNAGKMPVRAGASVDK--LANLAKDTELVKAMLKTQP 189 (421) T ss_pred ccccCCc-ceecchhhHHHHHHHHHh---hhhhhhhceeeeccCCceEEEEeecCCccc--eeeccccccccccccceeE Confidence 1111111 101111111221100011 112233456667777777777653322111 1111222222222344444 Q ss_pred cccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccccc Q lcl|NC_015466. 81 DTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPASFD 160 (344) Q Consensus 81 ~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~~~ 160 (344) .++.....+.-.++..+..+++ .++++......+.+.+.+.. +.... +...|. 
T Consensus 190 i~~~~~k~~~~v~iS~ell~ds--~~~l~~~i~~~la~~~~~~~---------~~~i~-----~~~~g~----------- 242 (421) T protein:vir:13 190 MAYDIDDYGLLAPIDNSLLEDS--EINFLEFVNEEFAEFAVNTE---------NAEIV-----KQAKAV----------- 242 (421) T ss_pred EEeeeeeeEeehhhhHHHHhhh--HHHHHHHHHHHHHHHHHHHh---------hhhHh-----hhhhhc----------- Confidence 4444444333334444444432 23333333333333322111 11111 000111 Q ss_pred eeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH---HHH Q lcl|NC_015466. 161 PTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL---VTL 237 (344) Q Consensus 161 k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~---~~l 237 (344) + +.++..-+.+|.+.+..+.. .+..+..++|+++.|..|+. ++ ..+ +.-+... ..- T Consensus 243 ---~--------~~~~~~~~d~i~~~~~~l~~-~~~~~a~~v~n~~~~~~l~~---lk----d~~--G~~i~~~~~~~~~ 301 (421) T protein:vir:13 243 ---L--------AEETINDYAGLVKTINSLVP-NARKRAIIVTNSDGRAYLDG---LM----DKQ--GRPLLKELSDGGD 301 (421) T ss_pred ---c--------ccccccchHHHHHHHHHhhh-hhcCCCEEEEcHHHHHHHHH---hh----cCC--CceeecCcCCCCC Confidence 0 01222334566677776654 46677899999999988862 22 211 1111111 111 Q ss_pred HHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccc-eeecccccCCcCCcccccccCC--C Q lcl|NC_015466. 238 ADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYT-FNWTGLVGSGNEGMRIKRFYLD--A 314 (344) Q Consensus 238 a~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T-~~~~~~~g~~~~~~~~~~~~~~--~ 314 (344) ..++|+| |++.+.... +..+....+.++ .+-+|. +.+.++ .++..++. . T Consensus 302 ~tl~G~p-V~~~~~~~~----~~~~~~~~~~gd---------------~~~~~~~~~~~~~--------~v~~~~~~~f~ 353 (421) T protein:vir:13 302 LVFKGRP-VIELEESIF----DVGDETKFIVSD---------------FKTLIKFMDRKQY--------LIDQSKEAGYT 353 (421) T ss_pred ceeccee-eEEeccccc----cCCCceEEEEEe---------------ccccEEEEEecce--------EEEeecccccc Confidence 3466777 333222111 111111111121 111111 111111 11111111 1 Q ss_pred CceEEEeeccccceeeecccc---------chhhhc--ccC Q lcl|NC_015466. 
315 IESDRIEIDMSYDQKKVAADL---------GYFFGG--IVA 344 (344) Q Consensus 315 ~~~~~vr~~~~~~~~v~~~~~---------g~l~~~--~va 344 (344) .....+|+...++-++.-+.+ |.|... +.+ T Consensus 354 ~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~a~v~~~~~~~ 394 (421) T protein:vir:13 354 KNETIARIIERFDVNSPLDKSSDAEKIRKFGVIVKLQEVLK 394 (421) T ss_pred cCeeEEEEEeeecceeecchhhheeeecccceeeccccccC Confidence 344556666666555544433 333333 222 No 99 >protein:vir:95763 Length: 297 # NCBI annotation: head protein # Family: family:all:507 # MgeID: mge:1578 # MgeName: SMP # Cross-refs: genbank:acc:YP_950590;genbank:gi:119953785;genbank:GeneID:5076833 Probab=47.24 E-value=0.71 Score=21.41 Aligned_cols=282 Identities=12% Similarity=0.022 Sum_probs=108.9 Q ss_pred CCCCCCCCccceeccccc-ceeeeeEcCcchhhhhhhCcccccCCccceee-eechhhcccccccccccCcccccceecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLT-NISIGYVQDASHFVAGQVFPQVSVGKQSDAYF-TYERGDFNRDEMQERTPGTESAGGTYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT-~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~-~~~k~~~~~~~~~~ra~g~~~~~~~~~~ 78 (344) |..+++......|-+.+. .|-..-++ ..+-..+++.+|+...+..++ +....... .-.+-|......+.+| T Consensus 9 ~~~~~t~~~~~lvP~~~~~~ii~~~~~---~s~l~~~~~~~~~~~~~~~~~~~~~~~~~a----~~v~Eg~~~~~~~~~f 81 (297) T protein:vir:95 9 ENVLVSQKKDGTLHKEFTDIIMKEVAQ---NSLVMQLGQYQEMEGEQEKTVYVQTDGISA----YWVNETEKIKTDKPEV 81 (297) T ss_pred ccccccCCCcceechhHHHHHHHHHHh---hchhhhhcceeecCCCccEEEEEEcCCcee----EEeecCccccccccce Confidence 333332222222211111 11111111 123445567777765544333 22111100 1111233333333445 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +......+..+....+..+..++. .++++....+.+.+.+....| ..++++ .+.... .+.. . . 
T Consensus 82 ~~v~l~~~k~~~~~~is~ell~ds--~~~l~~~i~~~la~ai~~~~d----~a~l~G----~g~~~~-~gi~----~--~ 144 (297) T protein:vir:95 82 VPVTLKAHKLGIILVTSREALNYT--WKKFFEDMKPQIVEAFYKKID----EAGLLG----HDTPFA-NSVA----K--A 144 (297) T ss_pred eEEEEeeEEEEEeehhhHHHHhcC--HHHHHHHHHHHHHHHHHHHHH----HHHhcc----cCCccc-cccc----c--c Confidence 544444444444444444444433 244555444445444443333 222222 111100 1100 0 0 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHH Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLA 238 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la 238 (344) ..+.. .+. ++..-+.||.+....+.. .+..++..+|+++.+.+|++ ++. .+ + .-+... .-. T Consensus 145 ~~~~~-------~~~-~~~~t~~~i~~~~~~l~~-~~~~~~~~v~~~~~~~~L~~---l~d----~~-G-~~i~~~-~~~ 205 (297) T protein:vir:95 145 AKDAN-------KVI-GGPINYDNILKLQDALYD-ADVEPNAFVSKIQNRSALRE---ARD----GN-K-VSIYDK-AAN 205 (297) T ss_pred ccccc-------eec-ccccCHHHHHHHHHHhhh-ccCCcCEEEEcHHHHHHHHH---hhc----cC-C-ceeecC-CCC Confidence 00000 111 223346788888888765 47889999999999998873 322 11 1 011111 112 Q ss_pred HHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeeccccc----CCcCCcccccccCCC Q lcl|NC_015466. 239 DLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVG----SGNEGMRIKRFYLDA 314 (344) Q Consensus 239 ~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g----~~~~~~~~~~~~~~~ 314 (344) .++|+|-+ +... .... .+ .-+.++.--+++... -|.+++...+.. ....+. .|+.=. T Consensus 206 ~l~G~Pv~-~~~~--~~~~---~~--~~~~gd~s~~~~~~~--------~~~~i~~~~~~~~~~~~~~~~~---~~~~~~ 266 (297) T protein:vir:95 206 TIDGITTV-DLKS--ARFE---KG--DLLAGDFDNLIYGVP--------YNITYKISEEGQISTITNADGT---PINLFE 266 (297) T ss_pred cccceeeE-eecC--CCCC---Cc--eEEEEecccEEEEEe--------cCeEEEEeeccccccccccCcc---chhhhh Confidence 35677632 2110 0000 01 112222111111111 012222111100 000010 111122 Q ss_pred CceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
315 IESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 315 ~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .....+|+.+.++-.+.-+++=..|+.+-= T Consensus 267 ~~~~~~r~~~~~d~~v~~~~a~~~l~~at~ 296 (297) T protein:vir:95 267 QEMIAIRATMDIAVMITKTDAFAKLTPAER 296 (297) T ss_pred cCcEEEEEEEEeccEeecccceEEEeecCC Confidence 355677888888877777766555543322 No 100 >protein:vir:80213 Length: 334 # NCBI annotation: capsid protein # Family: family:all:2806 # MgeID: mge:1879 # MgeName: LKA1 # Cross-refs: genbank:acc:YP_001522884;genbank:gi:158345177;genbank:GeneID:5687476 Probab=46.75 E-value=0.72 Score=21.35 Aligned_cols=294 Identities=11% Similarity=0.004 Sum_probs=104.2 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcc-------------hhhhhhh-CcccccCC-ccceeeeechhhccccccccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDAS-------------HFVAGQV-FPQVSVGK-QSDAYFTYERGDFNRDEMQER 65 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~-------------~~ia~~l-fP~v~v~~-~~~~~~~~~k~~~~~~~~~~r 65 (344) |++-..+. ||.-+.+=.+.+. .|--..+ .+.+.+.. .+++...|+.= -+.....+ T Consensus 1 m~~~~~~~--------~t~~~~~~~~~~~~l~le~~~geV~~af~~~s~~~~~~~~r~i~~G~s~~~~~i--G~~~~~~~ 70 (334) T protein:vir:80 1 MTYPAANT--------HTRPGWGGANSDVSLHIEEHLGLVDASFMYSSKFASWMNVRSLRGTNQLRVDRV--GASTIAGR 70 (334) T ss_pred CCCCcCCC--------ccccccccccchheehhhhhhhHHHHHHHHhhhhhccceeeeccccceEEEeee--cceeeeee Confidence 65432221 2222221111110 0111112 23333211 12333333310 01111223 Q ss_pred ccCcccccceeccccccccccccc-ccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccc Q lcl|NC_015466. 66 TPGTESAGGTYEIGNDTYFARTRA-YHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTF 144 (344) Q Consensus 66 a~g~~~~~~~~~~~~~~~~~~~~~-l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~ 144 (344) .+|.......+.. ....+..+. +--....++.+++.+.+|.+....+..-..+...-+..+...+..++........ 
T Consensus 71 ~~g~~l~~~~~~~--~~~~l~ID~~l~~~~~VddiD~~q~~~D~rse~~~~~G~aLA~~~D~~~~~~l~kaa~~~~~~~~ 148 (334) T protein:vir:80 71 KAGEELVVQKNVS--DKLNLTVDTVLYARHFFDKFDEWTSNLDVRKETAREDGIALARQYDQACIIQLQKCGDFLAPAHL 148 (334) T ss_pred cCCCCCCCCCccc--CceEEEEeeeeehhhhHhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhhhcccccc Confidence 3333332222222 122233322 2233445677777778887776665554444433344444444433332211111 Q ss_pred cccccccccccccccceeecccccccccCCCCCChHHHHHH---HHHHHHHhcC----CCcceEEeCHHHHHHHhcCHHH Q lcl|NC_015466. 145 DVDGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQ---GKRYVLEETG----FEPNVLTLGKAVYDALVDHPDI 217 (344) Q Consensus 145 ~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~---~~~~i~~~~G----~~Pn~~v~~~~v~~~L~~h~~i 217 (344) .....+ .-...+.++|+. .+...||..-+.+ +++.+.+..- ..+-.++++++.|.+|+.|+++ T Consensus 149 ~~~~~~------G~~~~~~~~g~~----~~~~~~~~~l~~a~~~a~~~L~e~dvp~~~~~~R~~vv~P~~y~~Ll~~~r~ 218 (334) T protein:vir:80 149 KPAFHD------GILLPSTISGLA----ADAAADADVLVAAHRQGVEAMVFRDLGDQLMSEGVTLLDPVIFSFLLEHDRL 218 (334) T ss_pred cccccC------Ccceeecccccc----cchhhhHHHHHHHHHHHHHHHHhcCCCCCcCCceEEEeChHHHHHHhccccc Confidence 110000 011233344432 2345566555444 3444333321 1236899999999999999999 Q ss_pred HHHhccCCCccccccCHHHHHHHhCCCeEEEEE-------------EEEeccccCCCCccceeCCCceEEEEecCCCccc Q lcl|NC_015466. 218 VGRIDRGQTSGAAKANLVTLADLFEVDKVLVMK-------------AVRNTAKKGQTASHSFIGGKHALLSYAPATPGIM 284 (344) Q Consensus 218 ~~~i~~~~~~~~~~vt~~~la~~~gl~~I~v~~-------------a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~ 284 (344) ..+-++.+.+. .....-.+..+.|++-+ ..- ..||. +. +++. ..+.+++.+.. 
++ T Consensus 219 ~n~d~~~s~~~-~~~~~g~i~~v~G~~V~-~Sn~~P~~~~t~~~~g~~~~~-~a---gd~t----~~~~~~~~~~A--l~ 286 (334) T protein:vir:80 219 MNVEFGAKEGG-NSFVGGRIAMLNGVRVV-ETPRFPQSAITANALGADFNV-TD---AEVR----RKMITFIPSMA--LI 286 (334) T ss_pred ccceecccccc-ccccceeEEEEeceEEE-eecCCCCcccccccccccccc-cc---cccc----ceEEEEEeCce--EE Confidence 88744322111 12223345555665422 100 01110 00 0000 11222221110 10 Q ss_pred ccccccceeecccccCCcCCcccccccCCCCceEEEeeccccceeeeccccchhh----hcc Q lcl|NC_015466. 285 TPSAGYTFNWTGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFF----GGI 342 (344) Q Consensus 285 ~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~----~~~ 342 (344) |-+.- ....+.|+++...+|++.....+=-.+.=|++--.+ +|. T Consensus 287 ------t~~~~--------~~~~e~~~~~~~~~d~i~~~~a~G~g~lRPeaa~vv~~~~~~~ 334 (334) T protein:vir:80 287 ------SAQVH--------PVSAQFWEEKKDFGHYLDTFQSYNIGQRRPDAVAVHDITVTNP 334 (334) T ss_pred ------EEEEe--------ecceeeeechhhHHHHHHHHHHcCCceeccceEEEEEEeeecC Confidence 00000 011223333333333332222221111111111000 000 No 101 >protein:vir:103285 Length: 296 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1605 # MgeName: JK06 # Cross-refs: genbank:acc:YP_277465;genbank:gi:71834107;genbank:GeneID:3562396 Probab=39.18 E-value=1 Score=20.51 Aligned_cols=281 Identities=11% Similarity=-0.015 Sum_probs=100.4 Q ss_pred CCCCCCCCc-cceecccccceeee-eEcCcchhhhhhhCcccccCCc---cceeeeech-hhcccccccccccCcccc-- Q lcl|NC_015466. 1 MPFTQPSRS-DVHVNRPLTNISIG-YVQDASHFVAGQVFPQVSVGKQ---SDAYFTYER-GDFNRDEMQERTPGTESA-- 72 (344) Q Consensus 1 m~~~~~~~~-~~~~dp~LT~iA~~-Y~n~~~~~ia~~lfP~v~v~~~---~~~~~~~~k-~~~~~~~~~~ra~g~~~~-- 72 (344) |-.-..... .| .-..|+.|-.- |.-.-.++.+.++||....... ++.|.+|+. +... ..+.... 
T Consensus 1 ~~~~~a~~~~~f-~~~ql~~id~~v~e~~~~~l~~~~~i~v~~~~~~~~~~~~~~~~~~~G~a~-------~~~~~~~di 72 (296) T protein:vir:10 1 MGVDKADAAGIW-TVKQLTASLNKAYETEYDQNSVVNLFPVSNEIPGYAKYFEYPVFDGVGIAQ-------IVADYTDDL 72 (296) T ss_pred CcccchhhhHHH-HHHHHHHHHHHHHhhhhcccccceecccccCCCCceeEEEeeeeeccCcee-------EeCCCcccc Confidence 332212111 22 11122222211 1111224788899986532223 333333321 1110 0111111 Q ss_pred -cceecccccccccccccccccccHHHHHhc-cCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccc Q lcl|NC_015466. 73 -GGTYEIGNDTYFARTRAYHRDVPEQVRANA-DNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVA 150 (344) Q Consensus 73 -~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a-~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~ 150 (344) .+.+..+.....+...+..-.+-..+.+.+ ....+++.+........+ +...-+.+|.+ ... .|+. T Consensus 73 p~v~~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~ka~aA~~~~----~~~~n~~~f~G----~~~----~g~~ 140 (296) T protein:vir:10 73 PLVDALATERQGKVFRFGNAFLISIDEIKVGQATGQSLSTRKQSLAFEAH----DKLLDKLVWSG----STA----HGIP 140 (296) T ss_pred ceeeccceeEEEEEEEEEeeeeecHHHHHHHHHhCCChHHHHHHHHHHHH----HHhhceEEEee----ccc----ccce Confidence 112222222222222222222223333333 223444443322222211 11111222211 110 0110 Q ss_pred cccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcC--CCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc Q lcl|NC_015466. 151 SSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETG--FEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG 228 (344) Q Consensus 151 ~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G--~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~ 228 (344) +.-+.+.-...+.++ .|++++ ..+.||.+....+...++ ..|++++|+++.+..|.+ ... +. 
T Consensus 141 -GLlN~p~v~~~~~~~----~W~~~t-~i~~Di~~~~~~l~~~s~g~~~p~~l~L~p~~~~~L~~------~~~--~~-- 204 (296) T protein:vir:10 141 -SVFDYPNINNVVSGG----SWSQPT-TAVSDITSLLDIIETSTNGQHRATHLLLPTTARRIMQN------LVP--GT-- 204 (296) T ss_pred -eEeecCCCccccccC----CccCHH-HHHHHHHHHHHHHHHhhCceecceeEEeCHHHHHHHhh------ccC--CC-- Confidence 111111112222222 499887 899999999998876543 689999999999987741 111 11 Q ss_pred ccccCHHHHHHHhCCCeEEEE-EEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCccc Q lcl|NC_015466. 229 AAKANLVTLADLFEVDKVLVM-KAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRI 307 (344) Q Consensus 229 ~~~vt~~~la~~~gl~~I~v~-~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~ 307 (344) +.--.++|++.+. .+.|- ...+..+..+ +.+.+++|......+.- -++--++. T Consensus 205 -~~t~l~~ik~~~~--~l~i~~~~~l~~a~~~---------g~~~~v~~~~~~~~~~~-~v~~~~~~------------- 258 (296) T protein:vir:10 205 -SVSYGEFFRQNNS--GVTVEFVQYLNDYNGT---------GTSAAIAYEKDPNNMAI-EIPEATNA------------- 258 (296) T ss_pred -CccHHHHHHHhcC--CceEEEeeeeccCCCC---------cceEEEEEEcCCceEEE-EcCcceee------------- Confidence 2333677777652 22221 1222222111 22333444322111110 01101110 Q ss_pred ccccCCCCceEEEeeccccc-eeeeccccchhhhcccC Q lcl|NC_015466. 
308 KRFYLDAIESDRIEIDMSYD-QKKVAADLGYFFGGIVA 344 (344) Q Consensus 308 ~~~~~~~~~~~~vr~~~~~~-~~v~~~~~g~l~~~~va 344 (344) -+.+.....+.+++.+.+- ..|--|.+-+.+.+..= T Consensus 259 -~~~e~~~l~~~~~~~~~~~Gv~i~~P~ai~~~dGI~~ 295 (296) T protein:vir:10 259 -LPAQPKDLHFKIPVTSKATGLIVYRPLTMAVMKGITF 295 (296) T ss_pred -ecccccCceEEEeeEeeEEEEEEECCceeEEEeeeec Confidence 1112222333333333332 12222222222211111 No 102 >protein:vir:81160 Length: 371 # NCBI annotation: major capsid protein # Family: family:all:21 # MgeID: mge:1892 # MgeName: Geobacillus virus E2 # Cross-refs: genbank:acc:YP_001285811;genbank:gi:148747732;genbank:GeneID:5247203 Probab=37.64 E-value=1.1 Score=20.34 Aligned_cols=271 Identities=9% Similarity=0.016 Sum_probs=105.3 Q ss_pred CCCCCCCCccceecccccc-eeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccccc-ceecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTN-ISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAG-GTYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~-iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~-~~~~~ 78 (344) |.........+.|-+.+.. |-..-++ ..+-..+++.+|+....+++........-. ..-.+-|..... ....| T Consensus 91 ~~~~t~~~gg~~vP~~~~~~ii~~~~~---~s~i~~~~~~~~~~~~~~~~~~~~~~~~~~--a~~v~Eg~~~~~~~~~~f 165 (371) T protein:vir:81 91 MSEGSNQDGGYTVPQDIQTRINELRES---KDALQNLITVEPVTTLSGSRVFKKRSQQTG--FVEVAEGAAIGEKATPQF 165 (371) T ss_pred hccCCCccCceeecHhHHHHHHHHHHh---hhhhhhhceeeeccCCceeEEEEeecCCcc--eeeeccccccccccccce Confidence 5444433333323222221 1101111 122333456677766666654432111000 111222333222 22445 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..+..++.-+.-.++..+..++. .++++......+.+.+.+..| ..++++. |... 
T Consensus 166 ~~i~~~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~l~~a~~~~~~----~~i~~g~-----------g~~~------- 221 (371) T protein:vir:81 166 TLLQYQVKKYAGFFRVTNELLNDS--TEAIVNTLVRWIGDESRVTRN----GLIINVL-----------NTKA------- 221 (371) T ss_pred eeEEeeeeEEEEeehhhHHHHhhh--hHHHHHHHHHHHHHHHHHHHH----HHHHhhc-----------cccc------- Confidence 555454444443444555544432 234444444444444433222 1122111 1000 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH---- Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL---- 234 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~---- 234 (344) .+| -..+.+|..................+|++..|.+|+. ++.++. .-+..+ T Consensus 222 -----~~~----------~~~~~~i~~~~~~~l~~~~~~~a~~vmn~~~~~~L~~-------lkd~~g--~~l~~~~~~~ 277 (371) T protein:vir:81 222 -----KTA----------IADLDGLKQIINVQLDPVFRSTSSVIVNQDAFNWLDT-------LKDQNG--QYLLQPSISS 277 (371) T ss_pred -----ccc----------cccHHHHHHHHHhhcchhhhcCCEEEEcHHHHHHHHH-------hhccCC--CeeeecccCC Confidence 000 1112233333333333333344589999999988872 222211 111111 Q ss_pred HHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCC-ceEEEEecCCCccccccccccee-ecccccCCcCCcccccccC Q lcl|NC_015466. 235 VTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGK-HALLSYAPATPGIMTPSAGYTFN-WTGLVGSGNEGMRIKRFYL 312 (344) Q Consensus 235 ~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~-~~~l~~~~~~~~~~~~s~G~T~~-~~~~~g~~~~~~~~~~~~~ 312 (344) ..-..++|.| |++.+.... +.... ..+..+ ..++ +|+-+.+++.. +.++- ..+..+.. T Consensus 278 ~~~~~l~G~p-V~~~~~~~~----~~~~~-~~~~~~~~~i~--------~Gd~~~~~~~~~~~~~~------i~~~~~~~ 337 (371) T protein:vir:81 278 PTGRQLLGLP-VVIVSNKVL----ANRVD-GGTGAQFAPII--------VGDLKEAVVMFDRQRTE------IMSSNVAM 337 (371) T ss_pred CCCceeccee-EEEeccccc----Ccccc-ccccCCcceEE--------EEehhceEEEEeecceE------EEEecccc Confidence 1113445665 333222110 00000 001111 1111 11212222221 11110 00001100 Q ss_pred --CCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
313 --DAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 313 --~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) =......+|+...++-.+.-+++-..++=+.| T Consensus 338 ~~f~~~~v~~~~~~r~d~~~~~~~a~~~~~~~~A 371 (371) T protein:vir:81 338 DAFETDATLWRAIERMDVKMRDDEAFVFGEVQLA 371 (371) T ss_pred chhhcCceEEEEEEeeccEEecccceEEEEEecC Confidence 11466788899999999999999999998888 No 103 >protein:vir:79928 Length: 393 # NCBI annotation: major head protein # Family: family:all:30335 # MgeID: mge:1874 # MgeName: 0305phi8-36 # Cross-refs: genbank:acc:YP_001429616;genbank:gi:156564106;genbank:GeneID:5525693 Probab=37.01 E-value=1.1 Score=20.27 Aligned_cols=293 Identities=9% Similarity=0.060 Sum_probs=114.1 Q ss_pred CCCCCCCCc------------cceecccccceeeeeEcCcchhhhhhhCccccc-CCccceeeeechhhccccccccccc Q lcl|NC_015466. 1 MPFTQPSRS------------DVHVNRPLTNISIGYVQDASHFVAGQVFPQVSV-GKQSDAYFTYERGDFNRDEMQERTP 67 (344) Q Consensus 1 m~~~~~~~~------------~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v-~~~~~~~~~~~k~~~~~~~~~~ra~ 67 (344) |..-.|... +.-|--+|++.-++--.+ -+|+-+||--+.. ..++-.++-++ . .+. ...+- T Consensus 59 m~G~~p~~eV~~~e~mtt~~a~IliP~vis~v~~Eaaep--l~~~~kl~qk~~L~~Grsm~F~~~g--~-~Ra--~~IgE 131 (393) T protein:vir:79 59 MEGETPTNEVNLREFMATPSAQILIPRVIVGTMREAAEP--LYIGTKMLQKIRLKSGQSMIFPSIG--I-MRA--YDVAE 131 (393) T ss_pred hcCCCchhheehhhhhcCCCcceechhhhhhhhhhcccc--hhHHHHHHHHHhhhcCcceeccchh--e-eee--ccccc Confidence 433333222 222222444443332221 2566666644322 11222222222 1 111 11223 Q ss_pred Ccccccceec-ccccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccc Q lcl|NC_015466. 68 GTESAGGTYE-IGNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDV 146 (344) Q Consensus 68 g~~~~~~~~~-~~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~ 146 (344) |........+ ++.+...++..-....+...+.....++.|.-...+....+.+.+.+|..+-...-..+. 
T Consensus 132 GgE~~~~sld~~T~dsv~~~~gK~G~~Ia~SqEmIsDSg~Dvin~~l~aA~RaMaRkKee~a~n~fk~~gh--------- 202 (393) T protein:vir:79 132 GQEIPEDSIDWQTHESPEIRVGKSGIRLRFTDEMISDSQWDLMSMMIKQAGRAMGRHKEQKAYHQFRSHGH--------- 202 (393) T ss_pred cccccccchhhhcCCceeEEechhhhhhhhHHHHhhcchHHHHHHHHHHHHHHHHhhhHHHHHhhhhcccc--------- Confidence 3333333333 334444444333332222222222334445444555555555555555444333211111 Q ss_pred cccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhc---c Q lcl|NC_015466. 147 DGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRID---R 223 (344) Q Consensus 147 ~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~---~ 223 (344) ...+++.+..+.-.+|-++.- -..+-=-++||.+-.-++. ..++.|++++|++-+|+.+-+|++.-. +. + T Consensus 203 ----tvfDa~st~t~ahptGr~~~~-~qNGTlSleDllDm~~av~-~~hyt~svi~MHPLAWnv~AKna~me~-~~~na~ 275 (393) T protein:vir:79 203 ----TVFDNYSTNKLAHTTGLDKNG-VQNDTFSAEDFLDLIIAVM-ANEYTPSDLMMHPLAWTVFAKNELMGS-LQANPY 275 (393) T ss_pred ----eeeeccccCccceeecCCccc-cccccccHHHHHHHHHHHh-cccCCcceEEEcCchhhhhhhhhhhcc-eeeccc Confidence 112345566666666643211 1122233577888776654 679999999999999999999966532 21 1 Q ss_pred CCCccccccCHHHHH-HHh-C-CC---eEEEEEE-EEeccccCCCCccce--eCCCceEEEEecCCCcccccccccceee Q lcl|NC_015466. 224 GQTSGAAKANLVTLA-DLF-E-VD---KVLVMKA-VRNTAKKGQTASHSF--IGGKHALLSYAPATPGIMTPSAGYTFNW 294 (344) Q Consensus 224 ~~~~~~~~vt~~~la-~~~-g-l~---~I~v~~a-~yn~~~~~~~~~~~~--iw~~~~~l~~~~~~~~~~~~s~G~T~~~ 294 (344) +|....+--|..++. +++ | +| +|.+.-- -|.++ ..-+++ +-.|++=++.+-+. 
T Consensus 276 gN~~~~~~~ts~algp~~i~~~~~~nlnv~~sPfvp~d~k----~~rFd~~~Vd~NnvgvlLV~D~-------------- 337 (393) T protein:vir:79 276 GNYPAKGAPSSMALGPDSIQGRLPFNFNVNLSPFIPLDKK----SRRFDVYAVDRNNVGVLLVRDD-------------- 337 (393) T ss_pred cccCccccchhhhhchhhhccccccceeEEEecccccccc----cceeeEEEeecCCceEEEEecC-------------- Confidence 233322233333331 222 1 11 2322211 12221 111111 22233322222111 Q ss_pred cccccCCcCCcccccccCCCCceEEEeeccccce-eeeccc---------------cchhhhcccC Q lcl|NC_015466. 295 TGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQ-KKVAAD---------------LGYFFGGIVA 344 (344) Q Consensus 295 ~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~-~v~~~~---------------~g~l~~~~va 344 (344) ..++.|++.-.+..-+...|.|-- ++-..- .-.||+|+-- T Consensus 338 ----------i~tdq~ddk~rdiq~iKl~ERYG~gvLn~gkaiavakNI~~~k~y~~P~~~~~~~~ 393 (393) T protein:vir:79 338 ----------LKTDQWDEKARGLQNIKMIERYGIGILNEGKAIAVAKNISMDKSYAEPMLIKNVGN 393 (393) T ss_pred ----------cceeccccccccceeeeeeeeeceeeeeCCceEEEEecceeecccccchhhhccCC Confidence 112223333333333333333322 111111 1123444333 No 104 >protein:vir:100884 Length: 389 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:1473 # MgeName: Lc-Nu # Cross-refs: genbank:acc:YP_358764;genbank:gi:78000028;genbank:GeneID:3726155 Probab=37.00 E-value=1.1 Score=20.27 Aligned_cols=267 Identities=12% Similarity=0.124 Sum_probs=99.4 Q ss_pred CCCCCCCCccceec-ccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccccc-ceecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVN-RPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAG-GTYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~d-p~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~-~~~~~ 78 (344) |...++....+.|- ...+.|-..-++ ...-..+++.+||....++|+.............+ ++.... 
....| T Consensus 109 ~~~~t~~~gg~~vP~~~~~~i~~~~~~---~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~E---~~~~~~~~~~~~ 182 (389) T protein:vir:10 109 TSKVTSTEAGVLIPEEIIYDPTAEVNS---VVDLSTLVTKTPVTTPKGTYPILKRATDRFSSVAE---LAENPKLAEPEF 182 (389) T ss_pred hcccccCCcceeehHHHHHHHHHHHHh---hhhHHhhcceeeccCCeeEEEEEecCCCccccccc---cccccccccccc Confidence 55444433333221 112222111111 12223446777777777777665322111111111 222211 12344 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +...+..+..+.-.++..+..++. .++++......+.+.+....+..+ +++. ..+. T Consensus 183 ~~i~~~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~la~~~~~~~~~~i----~~g~---------~~~~--------- 238 (389) T protein:vir:10 183 NKVDWSVATYRGAIPLSEEAIADS--AVDLTALVGQSIKEKSVNTYNAMI----APVL---------QSFT--------- 238 (389) T ss_pred eeeeeeheeeEeeehhhHHHHhhh--hHHHHHHHHHHHHHHHHHHHHHHH----hhhh---------cccc--------- Confidence 444444444443334444444332 233444333333333333222211 1110 0000 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccc--cCH-H Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAK--ANL-V 235 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~--vt~-~ 235 (344) ..+. =+..+.|-+.++.. ...+. +. -...+|+++.|..|+. ++.. .+...-.++. .+. . 
T Consensus 239 -----~~~~----~~~~~~d~l~~~~~---~~~~~-~~-~a~~~~n~~~~~~L~~---lkd~-~G~~i~~~~~~~~~~~~ 300 (389) T protein:vir:10 239 -----AKKT----TTDTLVDSLKHILN---VDLDP-AY-SRALVVTQSLFNTLDT---LKDK-NGRYLLHDASDSITDGT 300 (389) T ss_pred -----cccc----cccccHHHHHHHHH---hhhhh-hh-CcEEEecHHHHHHHHH---hhcc-CCCeeeecCcccccccc Confidence 0000 00123333433332 21221 22 3689999999998873 2211 0100000010 111 1 Q ss_pred HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCc--eEEEEecCCCcccccccccceeecccccCCcCCcccccccCC Q lcl|NC_015466. 236 TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKH--ALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLD 313 (344) Q Consensus 236 ~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~--~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~ 313 (344) .-..++|+| |++.+..+... ..+...-++|+. .++++.. -|.++.++.. ..| T Consensus 301 ~~~~l~G~p-V~~~~~~~~~~---~~~~~~~~~gd~~~~~~~~~~---------~~~~i~~~~~----------~~~--- 354 (389) T protein:vir:10 301 AKGTILGVP-VYVVGDTLLGS---LAGDQKAFVGDLKRGVLFTDR---------QQVTLAWEDS----------KIY--- 354 (389) T ss_pred cccccccce-eEEecccccCC---CCCceEEEEeeccccEEEEee---------cceEEEeecc----------ccc--- Confidence 113477888 43322221111 111222233321 1111100 1122322210 111 Q ss_pred CCceEEEeeccccceeeeccccchhhhc--ccC Q lcl|NC_015466. 314 AIESDRIEIDMSYDQKKVAADLGYFFGG--IVA 344 (344) Q Consensus 314 ~~~~~~vr~~~~~~~~v~~~~~g~l~~~--~va 344 (344) ...+|+.+..+-.+.-+++..+++- +.+ T Consensus 355 ---~~~~~~~~r~d~~~~~~~a~~~~~~~~~~~ 384 (389) T protein:vir:10 355 ---GKYLGAAFRFGVQKADSKAGYFVTNTDVPG 384 (389) T ss_pred ---cceEEEEEEeccEEecccceEEEEeeccCC Confidence 1346777778888888888887763 333 No 105 >protein:vir:100172 Length: 394 # NCBI annotation: putative major head protein # Family: family:all:21 # MgeID: mge:1524 # MgeName: phi AT3 # Cross-refs: genbank:acc:YP_025031;genbank:gi:48697264;genbank:GeneID:2948270 Probab=36.66 E-value=1.2 Score=20.23 Aligned_cols=263 Identities=11% Similarity=0.103 Sum_probs=97.3 Q ss_pred CCCCCCCCcccee-cccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccccc-ceecc Q lcl|NC_015466. 
1 MPFTQPSRSDVHV-NRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAG-GTYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~-dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~-~~~~~ 78 (344) +....+....+.| ....+.|-..-++ ..+-..+++.+||....++++.............+ +..... -..+| T Consensus 111 ~~~~t~~~gg~~vP~~~~~~ii~~~~~---~~~l~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~E---~~~~~~~~~~~~ 184 (394) T protein:vir:10 111 AGHVTSTEAGVLIPEEIIYDPTAEVNS---VVDLSTLVTKTPVTTPKGTYPILKRATDRFSSVAE---LAENPALAEPEF 184 (394) T ss_pred hcccccccCceeccHHHHHHHHHHHHh---hhhhhhhceeeeccCCceEEEEEecCCCccccccc---cccccccccccc Confidence 2222222222222 1122222111111 11223445667777667777654322111111111 111111 12344 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) ...++..+..+.-..+.++..+++ .++++......+.+.+....+.. ++++. ..+.. T Consensus 185 ~~v~l~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~la~~~~~~~~~~----il~g~---------g~~~~-------- 241 (394) T protein:vir:10 185 EQVDWSVSTYRGAIPLSEEAIADS--AVDLTSLVGQSINEKSVNTYNAM----IAPVL---------QSFTA-------- 241 (394) T ss_pred eeEEeeeeeeEeeehhHHHHHhhh--hHHHHHHHHHHHHHHHHHHHHHH----Hhhcc---------ccccc-------- Confidence 444444433333334444444433 23444444444444444333321 11111 00100 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH--- Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV--- 235 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~--- 235 (344) ... .+...+.+|.+........ .. -...+|+++.|..|+. + +..+. .-+.... 
T Consensus 242 ---~~~----------~~~~~~d~l~~~~~~~~~~-~~-~a~~vmn~~~~~~l~~---l----kd~~G--~~i~~~~~~~ 297 (394) T protein:vir:10 242 ---KAT----------TTDTLVDSLKHILNVDLDP-AY-SRALVVTQSLFNTLDT---L----KDKNG--RYLLHDASDS 297 (394) T ss_pred ---ccc----------cccccHHHHHHHHHhhhhh-hc-cCEEEecHHHHHHHHH---h----hccCC--Ceeeeccccc Confidence 000 1122233444444333333 22 3589999999999872 2 22211 0011110 Q ss_pred -----HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCccccc Q lcl|NC_015466. 236 -----TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKR 309 (344) Q Consensus 236 -----~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~ 309 (344) .-..++|+|-+.+ +..+-.. ..+...-+++ +.+-+|.. ...+. .++. T Consensus 298 ~~~~~~~~~L~G~PV~~~-~~~~~~~---~~~~~~i~~g---------------d~s~~~~~~~~~~~--------~v~~ 350 (394) T protein:vir:10 298 ITDGTAKGTVLGVPVYVV-GDALLGS---AAGDQKAFVG---------------DLKRGVLFADRQQV--------TLAW 350 (394) T ss_pred cccCCcccccccceeEEe-cccccCC---CCCceEEEEe---------------eccccEEEEeecce--------EEEE Confidence 0124678873322 2111110 0111111212 11111111 11110 1111 Q ss_pred ccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 310 FYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 310 ~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .+ .......+|+.+.++-.+.-+++-.+++-.-+ T Consensus 351 ~~-~~~~~~~~~~~~r~d~~~~~~~ai~~~~~~~~ 384 (394) T protein:vir:10 351 ED-SKIYGRYLGAAFRFGVKQADSNAGYFVTNTDA 384 (394) T ss_pred ec-ccccceeEEEEEEeccEEeccccEEEEEeecc Confidence 01 11122346777777888888888888764444 No 106 >protein:vir:102335 Length: 312 # NCBI annotation: putative capsid protein # Family: family:all:701 # MgeID: mge:1566 # MgeName: phi CD119 # Cross-refs: genbank:acc:YP_529560;genbank:gi:90592716;genbank:GeneID:3974467 Probab=36.18 E-value=1.2 Score=20.17 Aligned_cols=284 Identities=11% Similarity=-0.024 Sum_probs=92.8 Q ss_pred CCCCCCCCccc--eecccccceeeeeEcCcchhhhhhhCccc-ccCCccceeeeechhhcccccccccccCcccccceec Q lcl|NC_015466. 
1 MPFTQPSRSDV--HVNRPLTNISIGYVQDASHFVAGQVFPQV-SVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGTYE 77 (344) Q Consensus 1 m~~~~~~~~~~--~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v-~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~~~ 77 (344) |||+..-...| .+|..+..-+ +... +-+.. ..| -.+..++++++..-..+. +=.|.-|....+..++ T Consensus 1 Mantl~ya~~~~~~LD~~~~~~~--~s~~---l~~~~--~~v~~~ggktVkIp~i~~~gl~---DY~R~~g~~~~~g~v~ 70 (312) T protein:vir:10 1 MANTLAYGQVLQQGLDKQATQEL--LTGW---MDSNA--KQIKYEGGKEVKIGKLSTDGLG---DYSRGSANAYVGGDVK 70 (312) T ss_pred CCcchhHHHHHHHHHHHHHHhhh--cccc---ccCCC--ceEEEecCcEEEEEeeeccccc---ccccccCCcccccccc Confidence 77655333222 1122111111 0000 00000 001 022233444443322111 1112222222222334 Q ss_pred ccccccccc-cccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHH-HHHHHHHhhhhhhccccccccccccccccc Q lcl|NC_015466. 78 IGNDTYFAR-TRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINRE-VNWAAAYFTAGAPGDTWTFDVDGVASSPTA 155 (344) Q Consensus 78 ~~~~~~~~~-~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E-~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~ 155 (344) .+.+++.+. +++..-.++..+.++.+.....-.-.-+...+++.=..+ .|++.+...+..... +... T Consensus 71 ~~~et~tl~qDR~~~F~vD~mDvDETn~~~s~anv~~ef~r~~vvPEiDayrfskla~~a~~~~~--------~~~~--- 139 (312) T protein:vir:10 71 FEYETKTMTQDRGRKFTLDAMDVDETNFLVTATTVMGEFQRLKVIPEIDAYRLSRLATIAIGIKG--------DTNV--- 139 (312) T ss_pred ccceeEEeeecccceeeccccchhhHhhHHHHHHHHHHHHHhhhcchhhHHHHHHHHhhhhcccc--------cccc--- Confidence 444444432 233333343333333221111000000111111111111 244444433221110 0000 Q ss_pred ccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH Q lcl|NC_015466. 156 PASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV 235 (344) Q Consensus 156 ~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~ 235 (344) ....++ ..++.+..|.+.++.+++.....+-+|.+++.+..+|++ . ....+..... 
..+.+ .- T Consensus 140 ---~~~~~~----------T~~ni~~~i~~~~~~lde~~vp~~rvl~vTp~~~~lLk~-~-~~~~~~~~~~-~~~~i-~~ 202 (312) T protein:vir:10 140 ---EYSYSV----------NSSTIINKIKTGIKIIRENGYNGPLVCHLTYDSMFAIEE-K-VLEKLTAVTF-AQGGI-QT 202 (312) T ss_pred ---cccccc----------CHHHHHHHHHHHHHHHHHccCCCceEEEeChHHHHHHhh-h-hhceeccccc-cccee-ee Confidence 000011 235789999999999988643456779999999988874 2 2233322211 12222 22 Q ss_pred HHHHHhCCCeEEEEEEEEeccccCCCC---------ccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCcc Q lcl|NC_015466. 236 TLADLFEVDKVLVMKAVRNTAKKGQTA---------SHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMR 306 (344) Q Consensus 236 ~la~~~gl~~I~v~~a~yn~~~~~~~~---------~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~ 306 (344) .+.++=|++-|.|=..+.-+++.=.+| -..---+.++=++.+++.+-+.-..+-... T Consensus 203 ~V~~iDgv~Ii~VPs~r~~t~~~f~dG~t~~~~~gg~~~~~~ak~INfiiv~~~a~i~~~K~~~~~-------------- 268 (312) T protein:vir:10 203 QVPSIDGCALIKTPQNRMYSSILLNDGTTSNQTAGGYLKGTKALDTNFIIAPVDVPLAITKQDKMR-------------- 268 (312) T ss_pred eeeeecccEEEEchhhhccceeeeccCcccccccCceeecCcccccceEEeCCceeeceeeeeeee-------------- Confidence 445555666565544443222211111 001111222222222222211111111000 Q ss_pred cccccCC-CCceEEEeeccccceee-------------eccccc Q lcl|NC_015466. 307 IKRFYLD-AIESDRIEIDMSYDQKK-------------VAADLG 336 (344) Q Consensus 307 ~~~~~~~-~~~~~~vr~~~~~~~~v-------------~~~~~g 336 (344) +..+... 
...+|.+....-++--| .++..| T Consensus 269 if~P~~~~~~d~~~~~~R~Y~D~fv~~nk~~~Iyv~~k~a~~~~ 312 (312) T protein:vir:10 269 IFDPETNQTANAWSMDYRRYHDLWVTDNKANSVYANFKDAKPVG 312 (312) T ss_pred eeCCCCCCCcceeeeeeeeeeeeeeeccccCeEEEEeecccCCC Confidence 0011111 11122222222222111 222233 No 107 >protein:vir:10450 Length: 344 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:184 # MgeName: phiA1122 # Cross-refs: genbank:acc:NP_848297;genbank:gi:30387487;genbank:GeneID:1733971 Probab=34.45 E-value=1.3 Score=19.98 Aligned_cols=318 Identities=11% Similarity=0.034 Sum_probs=107.9 Q ss_pred CCCCCCCCcccee-cccccceeeeeEcCcchhh------------hhhh-CcccccCC-ccceeeeech-hhcccccccc Q lcl|NC_015466. 1 MPFTQPSRSDVHV-NRPLTNISIGYVQDASHFV------------AGQV-FPQVSVGK-QSDAYFTYER-GDFNRDEMQE 64 (344) Q Consensus 1 m~~~~~~~~~~~~-dp~LT~iA~~Y~n~~~~~i------------a~~l-fP~v~v~~-~~~~~~~~~k-~~~~~~~~~~ 64 (344) |+|+|.+..+=.- .|- .-|.-+...-|| -..+ .+.+.+.. .+++...|+. ++. .... T Consensus 1 ma~~~~~~~~n~~~~~~----~~~~~~~~al~ie~~~geV~~~f~~~s~~~~~~~~r~i~~g~s~~~~~iG~~---~~~~ 73 (344) T protein:vir:10 1 MANMTGGQQLGTNQGKD----VMAAGDKLALFLKVFGGEVLTAFARTSVTTSRHMVRSISSGKSAQFPVLGRT---QAAY 73 (344) T ss_pred CccccccccCCcccCCc----cCCccchhHHHHHHHHHHHHHHHHHHhhhcccceeeeecccceEEEEeecee---EEEe Confidence 9998765432211 110 000000000011 1111 24443211 1233333221 100 0111 Q ss_pred cccCcccccceeccccccccccccc-ccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhccccc Q lcl|NC_015466. 65 RTPGTESAGGTYEIGNDTYFARTRA-YHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWT 143 (344) Q Consensus 65 ra~g~~~~~~~~~~~~~~~~~~~~~-l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~ 143 (344) +.+|.+-.-..-........+..+. +--....++.+.+.+.+|++....+..-..+...-+..+...+..+...... 
T Consensus 74 ~~~G~~l~~t~~~~~~~e~~l~ID~~~y~~~~VdDiD~~q~~~D~r~~~~~~~G~aLA~~~D~~i~~~la~~a~~~~~-- 151 (344) T protein:vir:10 74 LAPGENLDDIRKDIKHTEKVITIDGLLTADVLIYDIEDAMNHYDVRSEYTSQLGESLAMAADGAVLAEIAGLCNVESQ-- 151 (344) T ss_pred eecCCCCCCCCCCcccceEEEEEcchhhhhhhhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHhhhccccc-- Confidence 1222221000000111111111111 1112234566677777887776665555444444444444443322221110 Q ss_pred ccccccccccccccccceeecccccccccCCCCC---ChHHHHHHHHHHHHHhcCCCc-ceEEeCHHHHHHHhcCHHHHH Q lcl|NC_015466. 144 FDVDGVASSPTAPASFDPTNASNNDKLHWSDASS---TPIEDIRQGKRYVLEETGFEP-NVLTLGKAVYDALVDHPDIVG 219 (344) Q Consensus 144 ~~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~S---DPi~di~~~~~~i~~~~G~~P-n~~v~~~~v~~~L~~h~~i~~ 219 (344) ....+...+....+..+. +...-+++.. ..+.-|.+++..+.++.-..- =.+|++++.+.+|+.|+++.. T Consensus 152 -----~~~~~~g~~~~~~~~~~~-~~~~~t~~~~~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~~~~~ 225 (344) T protein:vir:10 152 -----YNENITGLGTATVIETTQ-DKTTLTDQVALGKEIIAALTKARAALTKNYVPSSDRVFYCDPDSYSAILAALMPNA 225 (344) T ss_pred -----cccccccccccceeeccc-ccccccchhhhHHHHHHHHHHHHHHHhhcCCCccCCEEEeChHHHHHHhhcccccc Confidence 001111111111111110 0000111111 123446666666655543222 457789999999999998865 Q ss_pred HhccCCCccccccCHHHHHHHhCCCeEEEEEEEEeccccCCCCccc--eeCCCceEEEEecCCCcc---cccccccceee Q lcl|NC_015466. 220 RIDRGQTSGAAKANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHS--FIGGKHALLSYAPATPGI---MTPSAGYTFNW 294 (344) Q Consensus 220 ~i~~~~~~~~~~vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~--~iw~~~~~l~~~~~~~~~---~~~s~G~T~~~ 294 (344) .-+ .+.+ ..-.-.+..+.|++ |+...-. ..+..+... ..+.++.+-.. .+... ...+-|.-|-. 
T Consensus 226 ~~~-~~~~---~~~~G~V~~v~G~~-V~~Sn~l----p~~~~~~~~~~~tg~~~~~~~~--~~~~~~~~~s~~~~l~~h~ 294 (344) T protein:vir:10 226 ANY-AALI---DPEKGSIRNVMGFE-VVEVPHL----TAGGAGTSREGTTGQKHAFPAT--KSGNDKVAKDNVIGLFMHR 294 (344) T ss_pred ccc-cccc---ceeeeEEEEEeceE-EEecccc----ccccCCcccccccCccccccCC--cccceeeecceeEEEeech Confidence 432 2211 11112333444543 1111100 000000000 01111110000 00000 00000101111 Q ss_pred cccccCCcCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 295 TGLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 295 ~~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .+..-...-...++.++.+...++.|+....+=-.+.=|++.--++=.-- T Consensus 295 ~A~~~v~~~~~~~e~~r~~~~~~d~i~g~~~~G~~vlRPe~a~~v~~~~~ 344 (344) T protein:vir:10 295 SAVGTVKLRDLALERARRANFQADQIIAKYAMGHGGLRPEAAGAVVFKTK 344 (344) T ss_pred hhhhhhhhccceeecccchhHHHHHHHHHhhcccceecccceEEEEeecC Confidence 11101111223456777777777777766666555555554311100000 No 108 >protein:vir:94711 Length: 347 # NCBI annotation: capsid # Family: family:all:975 # MgeID: mge:1528 # MgeName: K1F # Cross-refs: genbank:acc:YP_338120;genbank:gi:77118198;genbank:GeneID:3707734 Probab=33.14 E-value=1.4 Score=19.82 Aligned_cols=307 Identities=10% Similarity=0.001 Sum_probs=118.5 Q ss_pred CCCCCCCCccceecccccceeeeeEcCc--c----hhhhhhh---------CcccccCC-ccceeeeech-hhccccccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDA--S----HFVAGQV---------FPQVSVGK-QSDAYFTYER-GDFNRDEMQ 63 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~--~----~~ia~~l---------fP~v~v~~-~~~~~~~~~k-~~~~~~~~~ 63 (344) |++.+++.. .|+-..|..+.+ . .|+++.. .+.+.+.. .+++...|+. ++ .... 
T Consensus 1 m~~~~~~~~-------~t~~g~~~~~~d~~al~ik~f~~eV~~~f~~~s~~~~~~~~r~i~~G~sv~i~~iG~---~tv~ 70 (347) T protein:vir:94 1 MANVPGQKI-------GTDQGKGKSSSDALALFLKVFAGEVLTAFTRRSVTADKHIVRTIQNGKSAQFPVMGR---TSGV 70 (347) T ss_pred CCCCCcccc-------ccccccCCccccHHHHHHHHHhHHHHHHHHHHHhhhcccccccccccceEEEecccc---eeee Confidence 876654222 233343333332 1 1222211 12222211 1233333321 11 0111 Q ss_pred ccccCcccccceecccccccccccccc-cccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccc Q lcl|NC_015466. 64 ERTPGTESAGGTYEIGNDTYFARTRAY-HRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTW 142 (344) Q Consensus 64 ~ra~g~~~~~~~~~~~~~~~~~~~~~l-~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~ 142 (344) .+.+|.+-.-..-+.......+..+.+ --....++.+...+.+|++....+.....+....+..++.++-......... T Consensus 71 ~~t~G~~l~~~~~~~~~~e~~itID~~~~~~~~VddiD~~q~~~D~~~~~~~~~g~aLa~~~D~~i~~~~~~~aa~~~~~ 150 (347) T protein:vir:94 71 YLAPGERLSDKRKGIKHTEKVITIDGLLTADVMIFDIEDAMNHYDVAGEYSNQLGEALAIAADGAVLAEMAILCNLPAAS 150 (347) T ss_pred eecCCCCcCCCCCCCCcceEEEEecchhhhhHHhhhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHHHhcccccc Confidence 122222210000011111112222221 1112344556666677777766666655555555555544332111100000 Q ss_pred cccccccccccccccccceeecccccccccCCCCCChHH-------HHHHHHHHHHHhcCCCcc---eEEeCHHHHHHHh Q lcl|NC_015466. 143 TFDVDGVASSPTAPASFDPTNASNNDKLHWSDASSTPIE-------DIRQGKRYVLEETGFEPN---VLTLGKAVYDALV 212 (344) Q Consensus 143 ~~~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~-------di~~~~~~i~~~~G~~Pn---~~v~~~~v~~~L~ 212 (344) +.........+.+....+ ..+.+|.. -|.+++..+.+. .+ |. 
.+|+++..+..|+ T Consensus 151 -------~~~~~g~~~~s~~~~~~~------~~~~~~~~~~~~~~~~i~~a~~~Lde~-~V-P~~~R~~vv~P~~~~~Ll 215 (347) T protein:vir:94 151 -------NENIAGLGTASVLEVGKK------ADLDTPAKLGEAIIGQLTIARAKLTSN-YV-PAGDRYFYTTPDNYSAIL 215 (347) T ss_pred -------ccccCCCcccceeecccc------ccccchhhhHHHHHHHHHHHHHHHhhc-CC-CCCCcEEEeCHHHHHHHh Confidence 000000011111111000 01233333 344444443333 23 54 8999999999999 Q ss_pred cCHHHHHHhccCCCccccccCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEE------EecCCCccccc Q lcl|NC_015466. 213 DHPDIVGRIDRGQTSGAAKANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLS------YAPATPGIMTP 286 (344) Q Consensus 213 ~h~~i~~~i~~~~~~~~~~vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~------~~~~~~~~~~~ 286 (344) .|+.+...-+.. . +....-.+..++|++-+ .. | .-...+.....-+.+...+ +.....+-... T Consensus 216 ~~~~~~~~~~~~-~---~~~~~G~Vg~i~G~~V~-~S----n--~lp~~~~t~~~~~~~~~~~aG~~~~~~~~~~~~~~~ 284 (347) T protein:vir:94 216 AALMPNAANYAA-L---IDPETGNIRNVMGFVVV-EV----P--HLVQGGAGETRGDDGITIASGQKHAFPATASSDVKV 284 (347) T ss_pred ccchhhhhhccc-c---ccccccceEEEeceEEE-ec----C--cccccccccccccCcceecCcccccccccchhhhcc Confidence 998877643221 1 11111244556665522 11 0 0000000000101110000 00000000000 Q ss_pred cc----ccceeecccccCC-cCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 287 SA----GYTFNWTGLVGSG-NEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 287 s~----G~T~~~~~~~g~~-~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .| |.-|...+. +.. 
.-...++.++.+...++.|+....+=-.+.=|++.-.|+=..| T Consensus 285 ~~~~~~~l~~h~~A~-~~v~~~~~~~e~~r~~~~~~d~i~~~~~~G~~~~rP~~a~~~~~~~A 346 (347) T protein:vir:94 285 TMDNVVGLFSHRSAV-GTVKLRDLALERDRDVDAQGDLIVGKYAMGHGGLRPEAAGALVFSPA 346 (347) T ss_pred cccceeEEEeehhhh-hhhhcccccccchhchhhHHHHhhhhhhhcCcccccceeEEEEecCC Confidence 00 000111110 000 1123567888889999999999998888888888777776677 No 109 >protein:vir:100247 Length: 425 # NCBI annotation: gp76 # Family: family:all:21 # MgeID: mge:1619 # MgeName: Bcep176 # Cross-refs: genbank:acc:YP_355412;genbank:gi:77864702;genbank:GeneID:3725969 Probab=32.96 E-value=1.4 Score=19.80 Aligned_cols=286 Identities=9% Similarity=-0.042 Sum_probs=111.4 Q ss_pred CCCCCCCCccceeccccc-ceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccce-ecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLT-NISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGT-YEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT-~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~-~~~ 78 (344) |........-+.|-+.+. .|- ..... ..+-..+++.+|+.....+++....+.-. ..+. -|....... ..| T Consensus 130 l~~~t~~~gG~lvP~~~~~~ii-~~~~~--~s~l~~l~~~~~~~~~~~~~~~~~~~~~a-~wv~---E~~~~~~~~~~~f 202 (425) T protein:vir:10 130 LNKGEDSEGGYLTPIEWDRTIT-NKLVL--ISPMRQLCRVQPVSKAGFSKLFNMGGTTS-GWVG---EASQRPQTNAATF 202 (425) T ss_pred hhcCcCCCCceeccHhHHHHHH-HHHHh--hhhhhhhceeeeccCCceEEEEEcCCcce-eeec---ccccccccccccc Confidence 433333222222322221 111 11111 11233456777777777777653222111 1111 122222222 234 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) ...++..+..+.-.++..+..+++ .++++....+.+.+.+....| ..++++ ++. +...|.-........ 
T Consensus 203 ~~v~~~~~k~~~~i~iS~ell~ds--~~~l~~~i~~~la~ai~~~~d----~~~l~G----~G~-~~p~Gil~~~~~~~~ 271 (425) T protein:vir:10 203 QPLSFASGEIYANPAATQQILDDA--EIDLESWLATEVQTEFAKQEG----KAFLAG----DGT-NKPNGLLTYIAGGAN 271 (425) T ss_pred ceeeeeheeeEeehHhHHHHHhcc--hhHHHHHHHHHHHHHHHHHHH----hhhhcc----cCC-CCcceeeeccccccc Confidence 444444443333334444444433 344555444445444433222 223322 111 111111111110000 Q ss_pred cceeeccccc-ccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcccccc----C Q lcl|NC_015466. 159 FDPTNASNND-KLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKA----N 233 (344) Q Consensus 159 ~~k~tl~~t~-~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~v----t 233 (344) ... ...+.. ...=..++..-..+|.+.+..+. ...+...+.+|+++.|.+|+. + +..+. . -+. + T Consensus 272 ~~~-~~~~~~~~~~~~~~~~~~~d~l~~l~~~l~-~~~~~~a~~vmn~~~~~~L~~---l----kD~~G-~-~l~~~~~~ 340 (425) T protein:vir:10 272 AAK-HPFGAIEVVNSGAAADITSDGIIDLVYDLP-SAFTGNARFAMNRNTQRQVRK---L----KDGQG-N-YLWQPSYV 340 (425) T ss_pred ccc-ccccccccccccccccccHHHHHHHHhhhh-hhhccCCEEEEchHHHHHHHH---h----hcCCC-c-eeeccCcc Confidence 000 000000 00000012223344555444433 223344578999999988862 2 22211 1 111 1 Q ss_pred HHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCC-ceEEEEecCCCcccccccccc-eeecccccCCcCCccccccc Q lcl|NC_015466. 234 LVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGK-HALLSYAPATPGIMTPSAGYT-FNWTGLVGSGNEGMRIKRFY 311 (344) Q Consensus 234 ~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~-~~~l~~~~~~~~~~~~s~G~T-~~~~~~~g~~~~~~~~~~~~ 311 (344) ...-..+||.| |++.+..- .+..+ +.+++ |+-+.+|. +.+.+. ....+.|. T Consensus 341 ~g~~~~l~G~P-V~~~~~~p------------~~~~~~~~i~~--------Gd~~~~~~i~~~~~~------~v~~d~~~ 393 (425) T protein:vir:10 341 AGQPATLAGYP-VTEVPDMP------------DVAANSTPILF--------GDFQQTYLIIDRIGV------RVLRDPYT 393 (425) T ss_pred CCCCceeccee-eEEecCcC------------CccCCccEEEE--------EehhccEEEEEecce------EEEecccc Confidence 11113466776 43322211 11111 12222 22222222 222211 01122333 Q ss_pred CCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
312 LDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 312 ~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) . .+..-+|....++-.+.-+++-.+++-+.| T Consensus 394 ~--~~~~~~~~~~r~d~~v~~~~A~~~l~~~as 424 (425) T protein:vir:10 394 A--KPYVLFYTTKRVGGGLLNPEPMRAMKVAAS 424 (425) T ss_pred c--CCcEEEEEEEEeccEeecccceEEEEeecc Confidence 2 355677888899999999999999999999 No 110 >protein:vir:104342 Length: 314 # NCBI annotation: hypothetical protein # Family: family:all:463 # MgeID: mge:1593 # MgeName: RTP # Cross-refs: genbank:acc:YP_398971;genbank:gi:81343955;genbank:GeneID:3778874 Probab=30.96 E-value=1.5 Score=19.56 Aligned_cols=283 Identities=10% Similarity=-0.051 Sum_probs=95.3 Q ss_pred CCCCCCCCc-cceecccccceeee-eEcCcchhhhhhhCcccc---cCCccceeeeech-hhcccccccccccCccc--- Q lcl|NC_015466. 1 MPFTQPSRS-DVHVNRPLTNISIG-YVQDASHFVAGQVFPQVS---VGKQSDAYFTYER-GDFNRDEMQERTPGTES--- 71 (344) Q Consensus 1 m~~~~~~~~-~~~~dp~LT~iA~~-Y~n~~~~~ia~~lfP~v~---v~~~~~~~~~~~k-~~~~~~~~~~ra~g~~~--- 71 (344) |=.+..... .|- -..|+.|-.- |.-.-.++.+.++||... -.-+++.|..|+. +... .++... T Consensus 19 ~~~~~~d~~~~fl-~~ql~~id~~v~e~~~~~~~~~~~i~v~~~~~~~~et~~~~~~e~~G~a~-------~~~d~~~di 90 (314) T protein:vir:10 19 MGVEKADAAGIWA-VSQLTAALNRAYEKEYAENSVVNIFPVTNEIPGHAKYFEYPEFDGVGIAQ-------IIADYSDDL 90 (314) T ss_pred hcccchhhhHHHH-HHHHHHHHHHHhhhhccccccceeeccccCCCCceeEEEeeeecccccee-------eeCCccccc Confidence 221221111 221 1122222111 211222478888888643 2223333333321 1111 111111 Q ss_pred ccceecccccccccccccccccccHHHHHhc-cCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccc Q lcl|NC_015466. 72 AGGTYEIGNDTYFARTRAYHRDVPEQVRANA-DNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVA 150 (344) Q Consensus 72 ~~~~~~~~~~~~~~~~~~l~~~v~~~~~~~a-~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~ 150 (344) ..+.+.++.....+...+..-.+-..+.+.+ ....+++.+...-....+ | +..|...+-.... 
.+ T Consensus 91 p~vd~~~~~~~~~i~~~~~~~~~~~~El~~a~~~g~~l~~~k~~aA~~~~----~-----~~~n~i~f~G~~~---~g-- 156 (314) T protein:vir:10 91 PLVDAFMTEKQGKVFRFGNAFLISTDEIKAGAATGQSLSARKQALAFEAH----D-----NLLDKLVWSGSAP---HG-- 156 (314) T ss_pred ceeecccceeEEEEEEEEeeEEecHHHHHHHHHhCCChHHHHHHHHHHHH----H-----HhhceEEEeeccc---cc-- Confidence 1122222222222333333333333333222 233344433222111111 1 1112211111000 01 Q ss_pred cccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcC--CCcceEEeCHHHHHHHhcCHHHHHHhccCCCcc Q lcl|NC_015466. 151 SSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETG--FEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSG 228 (344) Q Consensus 151 ~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G--~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~ 228 (344) .++.-++..++..+. ..+|+.+ ...+.||.+...++...++ ..|++++|+...+..|.+ +..++ T Consensus 157 --~~GLlN~p~v~~~~~-~~~WaT~-~ei~~Di~~~~~~l~~~s~g~~~p~~l~Lpp~~~~~L~~------~~~~~---- 222 (314) T protein:vir:10 157 --IVSVFDQPNINNVVA-TPNWSVP-QNAIDDVTAMIDAVESSTQGLHHVTDILLPASARRVMQG------LVPQT---- 222 (314) T ss_pred --ceeEeecCCCccccC-CCCcccH-HHHHHHHHHHHHHHHHhcCccccceeEEecHHHHHhhcc------cccCC---- Confidence 111111111122221 1259643 2559999999999987655 689999999998877641 11111 Q ss_pred ccccCHHHHHHHhCCCeEEEEEEE-EeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCccc Q lcl|NC_015466. 229 AAKANLVTLADLFEVDKVLVMKAV-RNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRI 307 (344) Q Consensus 229 ~~~vt~~~la~~~gl~~I~v~~a~-yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~ 307 (344) +.--.++|++-. +.+.|-.+. +..+. .. +.+.+++|......+.- -+..-++.-+ -........ T Consensus 223 -~~tvl~~l~~n~--~~l~I~~~~el~~ag--~~-------g~~~~v~y~~~~~~~~~-~vp~~~~~l~--~e~~~~~~~ 287 (314) T protein:vir:10 223 -NLSYGELFTRNN--PGLTIRFLQFLDNYD--GA-------GGKAALAFEKSPLNMSI-EIPEVTNVLP--AQPKDLHFR 287 (314) T ss_pred -CccHHHHHHHhC--CCcEEEEcccccccC--CC-------cceEEEEEecCCcEEEE-ecCccceeec--ceecCceEE Confidence 122245666542 222222211 22211 11 22333444322211110 0001111100 000011122 Q ss_pred ccccCCCCceEEEeeccccceeeeccccchhhh Q lcl|NC_015466. 
308 KRFYLDAIESDRIEIDMSYDQKKVAADLGYFFG 340 (344) Q Consensus 308 ~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~ 340 (344) ..+.....+.++.|-...+ .--|.=|. T Consensus 288 ~~~~~r~~Gv~i~~P~ai~------~~dGI~~~ 314 (314) T protein:vir:10 288 YPVTSKATGLIVYRPLTMA------VIKGITFA 314 (314) T ss_pred EcceeeeEEEEEECcceeE------eeeeeecC Confidence 2222322233333322111 11121121 No 111 >protein:vir:4856 Length: 293 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:106 # MgeName: DT1 # Cross-refs: genbank:acc:NP_049396;genbank:gi:9632424;genbank:GeneID:1258532 Probab=27.26 E-value=1.9 Score=19.10 Aligned_cols=272 Identities=10% Similarity=-0.044 Sum_probs=105.2 Q ss_pred CCCCCCCCccceecccc-cceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPL-TNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~L-T~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~~ 78 (344) |...+.....+.+-+.+ ..|-..-++. .+-..++..+|+....+++........ ...-.-.+-|...... ...| T Consensus 5 ~~~~t~~~gg~liP~~~~~~Ii~~~~~~---~~l~~~~~~~~~~~~~g~~~~~~~~~~-~~~a~~v~Eg~~~~~~~~~~~ 80 (293) T protein:vir:48 5 KTDHSGSDAGLTIPQDIRTAINTLVRQY---DSLQEYVNVENVTTLTGSRVYEKWTDI-TGLANIDDEAGKIADIDDPKL 80 (293) T ss_pred ecccccCcCceEechhHHHHHHHHHHhh---hhhhhhceeeeccCCcceEEEEeecCC-CcceeeecCCcccccccccce Confidence 54444433333332222 2221111111 122333455666655555332211100 0000011112222221 2344 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..+..++..+.-.++.++..++. .++++....+.+.+.+....+ ..++++. .. 
T Consensus 81 ~~i~l~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~la~~~~~~~~----~~i~~g~-------------~~------- 134 (293) T protein:vir:48 81 SLIKYTIKRYAGISTVTNSLLADS--AENILAWLSGWIAKKVVVTRN----KAILGVV-------------DK------- 134 (293) T ss_pred eEEEEeeeEEEEeehhhHHHHhhh--hHHHHHHHHHHHHHHHHHHHH----hHHhhcc-------------cc------- Confidence 444444444443344444444433 344554444444444332222 1111110 00 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHHHHH Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLVTLA 238 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~~la 238 (344) + -...+...+.||.+.+..+... .......+|++..|..|+. +++. .++.--...++...-. T Consensus 135 -------~-----~~~~~~~~~d~i~~~~~~l~~~-~~~~a~~vmn~~~~~~L~~---lkd~--~g~~l~~~~~~~~~~~ 196 (293) T protein:vir:48 135 -------L-----PTKPTLTKWDDIIDLEAKVDPA-IKQTSFFLTNTSGFTALKK---VKNA--LGDYLMERDVKSPTGY 196 (293) T ss_pred -------c-----cccccccCHHHHHHHHHhhhhh-hcCCCEEEEcHHHHHHHHH---hhcc--CCceEeecCcCCCCCc Confidence 0 0012333455677666666433 4556678999999988863 2211 1110000011111223 Q ss_pred HHhCCCeEEEEEEEEeccccCCCCccceeCCC---ceEEEEecCCCcccccccccceeecccccCCcCCcccccccCCCC Q lcl|NC_015466. 239 DLFEVDKVLVMKAVRNTAKKGQTASHSFIGGK---HALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYLDAI 315 (344) Q Consensus 239 ~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~---~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~~~~ 315 (344) .++|+| |.+.+..+... +..+...-++++ .+.++.. -|.++..... ..+.-.. T Consensus 197 ~l~G~P-v~~~~~~~~~~--~~~~~~~~~~gd~~~~~~~~~~----------~~~~i~~~~~-----------~~~~~~~ 252 (293) T protein:vir:48 197 SIAGFA-VKEISDRWLPN--ASSGVMPLYFGDLKQAVTLFDR----------QQMSLLSTNI-----------GGGAFET 252 (293) T ss_pred eeccee-eEEecccccCC--ccCCceEEEEEeccceEEEEEe----------cceEEEEecc-----------cchhhhc Confidence 567777 33322211110 011111122222 1111100 0112211110 0011124 Q ss_pred ceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
316 ESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 316 ~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) +...+|+.+.++-++.-+++-.+++-.-+ T Consensus 253 ~~~~~r~~~r~d~~~~~~~a~~~l~~~~~ 281 (293) T protein:vir:48 253 DTTKVRVIDRFDVVATDTEAFVPASFKAI 281 (293) T ss_pred CeEEEEEEEeeCcEEecccceEEEEeecc Confidence 56678888888888888888877774443 No 112 >protein:vir:94622 Length: 341 # NCBI annotation: PfWMP4_37 # Family: family:all:2203 # MgeID: mge:1525 # MgeName: Pf-WMP4 # Cross-refs: genbank:acc:YP_762667;genbank:gi:115304375;genbank:GeneID:5142322 Probab=26.93 E-value=1.9 Score=19.06 Aligned_cols=304 Identities=10% Similarity=-0.027 Sum_probs=100.0 Q ss_pred CCCCCCCC-------ccceecccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCccccc Q lcl|NC_015466. 1 MPFTQPSR-------SDVHVNRPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAG 73 (344) Q Consensus 1 m~~~~~~~-------~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~ 73 (344) |.|+..+. +.|+. .+.+...+.-.+... .+++. ....+..-..++...+..-- .+.......|..... T Consensus 3 ~~~~~~~~~~~t~~v~~fip-ei~s~~i~~~l~~~~-v~~~~-~~d~~~~~~~Gdtv~ip~~g--~~~~~d~~~~~~i~~ 77 (341) T protein:vir:94 3 LGNTITGPSINTQRGQQFIP-EQWLSEVQMFRKAKM-LDTSV-VKTWGAQVKKGDTFHVPRIS--ELGVEDKATDVPVGV 77 (341) T ss_pred chhhhccccccchhHHHHHH-HHHHHHHHHHHHhhc-chhhc-cccccccccCCceEEEeccC--cceeeeecCCCcccc Confidence 33333331 11210 000111111001000 01111 11111111113333332210 000111112222211 Q ss_pred ceeccccccccccc-ccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccc Q lcl|NC_015466. 74 GTYEIGNDTYFART-RAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASS 152 (344) Q Consensus 74 ~~~~~~~~~~~~~~-~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~ 152 (344) ..+.-+..+..+.. ......++ +.+.....+|+.....+.....+....|..++..+-.+....... ... 
T Consensus 78 ~~~~~~~~~itiD~~~~~~~~i~--d~d~~~~~~d~~~~~~~~~~~aLA~~~D~~i~~~~a~~~~~~~~~----~~~--- 148 (341) T protein:vir:94 78 QPVNDTDFVITVDTDRTTAVALD--DLLEIQASYDLRAPYLEAMGYALAKDMTGSILGLRAAVQNTASQN----VFS--- 148 (341) T ss_pred ccccCceEEEEEeeeeecceeec--hHHHHhhccchHHHHHHHHHHHHHHHHHHHHHHHhhhccccccCc----ccc--- Confidence 11221222222211 11122233 334445566777777666666665555555544432211110000 000 Q ss_pred cccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCC-CcceEEeCHHHHHHHhcCHHHHHHhccCCCccccc Q lcl|NC_015466. 153 PTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGF-EPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAK 231 (344) Q Consensus 153 ~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~-~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~ 231 (344) ....+. + .++...-...|.+++..+.++.-- ..=.++++++.+..|++++++.++-+.++ +. T Consensus 149 ----~~~~~~--t-------~~~~~~~~~~i~~a~~~Lde~~VP~~gR~lvv~P~~~~~Ll~~~~~~~~~~~g~----~~ 211 (341) T protein:vir:94 149 ----SSNGAI--T-------GNGQAFSFAVFLAARRLLLEADVPEEKIVLLISPGQESALFTIPQFISKDFINN----AP 211 (341) T ss_pred ----Cccccc--c-------CchhhhhHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHhhchhhhhhhcccc----ch Confidence 000000 0 011112235566666665554321 22468999999999999999987643321 22 Q ss_pred cCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCce-EEEEecCCC------cc--cccccccceeec----ccc Q lcl|NC_015466. 232 ANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHA-LLSYAPATP------GI--MTPSAGYTFNWT----GLV 298 (344) Q Consensus 232 vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~-~l~~~~~~~------~~--~~~s~G~T~~~~----~~~ 298 (344) +..-.+..++|++ |+...-. . ...+. .|.... ..+.....+ .+ ...-++.+..+. +.. 
T Consensus 212 l~~G~ig~i~G~~-V~~Sn~l--p---~~~~~---~~~~~~~~~~~~~~~~~i~~~~~~~~~~~~~~~~~gl~~~~~av~ 282 (341) T protein:vir:94 212 IAQGQIGSLMGVR-VIRTSLI--G---NNSAT---GWRNGAPTIAPAEATPGFTGSRYLPKQDSFTSLPATFTGNSRPVH 282 (341) T ss_pred hheeeeeeEeceE-EEEeccc--c---ccccc---cccccccceecccccccccccccccccccccccEEEEEEeccccc Confidence 3333455666755 2221110 0 00000 011000 000000000 00 000011111110 000 Q ss_pred -----------cCCcCCcccccccCCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 299 -----------GSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 299 -----------g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ...........-+......+.++....+=-.+.-|++...|.-.-+ T Consensus 283 ~~k~~~~~~~~~~~~~~~~~~~~~~~~~~~~~i~~~~~~G~~~lrp~~~v~~~~~~~ 339 (341) T protein:vir:94 283 TAVMCHMDWAAAVVSKAPRVTQSFENREQVWLMVGRQAYGARLYRPLHAVNIHTTGD 339 (341) T ss_pred ceeeecchhhhccccccccccccchhhhhhhhhhhhhhhcccccCcceeEEEecCcC Confidence 0000000111112222344555555555555555555544433222 No 113 >protein:vir:78739 Length: 332 # NCBI annotation: major capsid protein # Family: family:all:975 # MgeID: mge:1856 # MgeName: Syn5 # Cross-refs: genbank:acc:YP_001285448;genbank:gi:148724482;genbank:GeneID:5220210 Probab=25.58 E-value=2 Score=18.88 Aligned_cols=307 Identities=12% Similarity=0.048 Sum_probs=114.4 Q ss_pred CC---CCC-CCCcccee-cccc-cceee--eeEcCcc--hhhhhhhC-cccccCC-ccceeeeech-hhccccccccccc Q lcl|NC_015466. 1 MP---FTQ-PSRSDVHV-NRPL-TNISI--GYVQDAS--HFVAGQVF-PQVSVGK-QSDAYFTYER-GDFNRDEMQERTP 67 (344) Q Consensus 1 m~---~~~-~~~~~~~~-dp~L-T~iA~--~Y~n~~~--~~ia~~lf-P~v~v~~-~~~~~~~~~k-~~~~~~~~~~ra~ 67 (344) |. +|+ |++.+... .+.= ..+|+ .-..+.- .|--..+| +.+.+.. .+++...|+. ++. 
......+ T Consensus 1 ~~~~~~~~~~~~~~~~~~~~~~d~~~al~le~~~geV~~~f~~~s~~~~~~~~r~i~~G~tv~i~~ig~~---~~~~~~~ 77 (332) T protein:vir:78 1 MTTLSNFSLPNQANGGARNADYDVRYATALKLFSGEVFTAFNNASIFKGLVRSYDLRGGKSKQFMFTGKL---SAGYHTP 77 (332) T ss_pred CcccccccCCccccCCccccccccchhhhhhhhhhhHHHHHHHHhhhhhccccccccccceEEEEeccce---eEeeecC Confidence 32 222 22221100 0000 01111 0000100 01111122 3322111 1233333221 100 0111112 Q ss_pred Ccccccceeccccccccccccc-ccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccc Q lcl|NC_015466. 68 GTESAGGTYEIGNDTYFARTRA-YHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDV 146 (344) Q Consensus 68 g~~~~~~~~~~~~~~~~~~~~~-l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~ 146 (344) |..-.... ........+..+. +--....++.+.+.+.++++....+.....+....+..++.++..++....... T Consensus 78 g~~l~~~~-~~~~~~~~l~ID~~ky~~~~VddiD~~q~~~dl~~~~~~~~g~aLA~~~D~~i~~~l~~aa~~~~~~~--- 153 (332) T protein:vir:78 78 GTPIVGDA-GIKANEKTLVMDDLLVSSQFVYSLDEIFSQYSTRAEVSKQIGEALATHYDERIARVLAKASAEASPVT--- 153 (332) T ss_pred CCCCCCCC-CCCCceEEEEEehhhhhHHHHHhHHHHhcCcchHHHHHHHHHHHHHHHHHHHHHHHHHhhhcccCccc--- Confidence 22111100 0111111111111 111122345666677778877777776666666666666666655442211100 Q ss_pred cccccccccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcce-EEeCHHHHHHHhc--CHHHHHHhcc Q lcl|NC_015466. 147 DGVASSPTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNV-LTLGKAVYDALVD--HPDIVGRIDR 223 (344) Q Consensus 147 ~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~-~v~~~~v~~~L~~--h~~i~~~i~~ 223 (344) + .+....+.++++.. . ....-+.-|.++...+.++.-..-++ +|+++..|..|++ |+++.++-.. 
T Consensus 154 -~-------~~g~~~~~~~~~~~--~--~~~~~~~~i~~a~~~Lde~~VP~~gR~~vv~P~~y~~Ll~~~d~~~~n~~~~ 221 (332) T protein:vir:78 154 -G-------EPGGFHVNIGAGNT--N--DAQAIVDGFFEAAAVLDERSAPQEGRVAVLSPRQYYSLISSVDTNILNREIG 221 (332) T ss_pred -c-------cccccccccCCccc--c--CHHHHHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHHhhcCceeeeeecc Confidence 0 01111223333211 1 11224455667777766665444454 8889999999987 7777665433 Q ss_pred CCCccccccCHHHHHHHhCCCeEEEEEEEEecccc--------CCCCccceeCCCceEEEEecCCCcccccccccceeec Q lcl|NC_015466. 224 GQTSGAAKANLVTLADLFEVDKVLVMKAVRNTAKK--------GQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWT 295 (344) Q Consensus 224 ~~~~~~~~vt~~~la~~~gl~~I~v~~a~yn~~~~--------~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~ 295 (344) +.. + ....-..+.++.|++ |+...-.-+.+.. +....+.--..+.+.|++.+.. ++. . ..- T Consensus 222 ~~~-~-~~~~g~~i~~i~G~~-V~~Sn~lp~~~g~~~~~~~~~~~~n~~~~~~~~~~~~~~h~~a--~~~-----v-~~~ 290 (332) T protein:vir:78 222 NSQ-G-DMNSGKGLYSIAGIR-ILKSNNLAGLYGQDLSSAAVTGENNDYQVDASALAGLIFHREA--AGC-----I-QSV 290 (332) T ss_pred ccc-c-ceecceeeeEEeeeE-EEecCccccCcccccccccccccccccccccccceEEeecccc--eee-----e-eee Confidence 221 1 223323355555654 2111111111100 0000000011222233322111 110 0 000 Q ss_pred ccccCCcCCcccccccCCCCceEEEeeccccceeeeccccchhhhcc Q lcl|NC_015466. 
296 GLVGSGNEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFFGGI 342 (344) Q Consensus 296 ~~~g~~~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~ 342 (344) + ..-.+.+.++.+....++|+....+=-.+.=|++...|.-+ T Consensus 291 ~-----~~~~~t~~~~~~~~~~d~i~~~~~~G~~v~rPe~~v~l~~a 332 (332) T protein:vir:78 291 A-----PTIQTTSGDFNVQYQGDLIVGKLAMGCGSLRTSVAGSFQAA 332 (332) T ss_pred c-----cchhhhhcccchhhhHhhhhhhhhhcCceecccceEEEeeC Confidence 0 00012233455555667777777666667777766666666 No 114 >protein:vir:4997 Length: 397 # NCBI annotation: major head protein # Family: family:all:21 # MgeID: mge:109 # MgeName: Sfi21 # Cross-refs: genbank:acc:NP_049971;genbank:gi:9632943;genbank:GeneID:1262106 Probab=25.34 E-value=2.1 Score=18.85 Aligned_cols=268 Identities=10% Similarity=0.003 Sum_probs=99.9 Q ss_pred CCCCCCCCccceecccc-cceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccce-ecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPL-TNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGT-YEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~dp~L-T~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~-~~~ 78 (344) |.........+.|-..+ ..|-..-++ ...--.+...+|+....+++....... ......-.+-|....... ..| T Consensus 109 ~~~~t~~~gg~~iP~~~~~~ii~~~~~---~~~l~~~~~~~~~~~~~~~~~~~~~~~-~~~~a~~v~E~~~~~~~~~~~~ 184 (397) T protein:vir:49 109 KTDGSGSDAGLTIPQDIRTAINTLVRQ---FDSLQEYVNVENVTTLTGSRVYEKWAD-ITGLAKLDDEGGQIGQNDDPKL 184 (397) T ss_pred hhccCCccCcceecHHHHHHHHHHHHh---hhhHhhhcceeeccCCcceEEEEeecc-CCcceeeeccccccccccccce Confidence 44433333323221111 111100111 011222345556665555543321111 110000111122221111 233 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..+..++..+.-.++..+..+++ .++++......+.+.+....|. .++++ . |.. 
T Consensus 185 ~~v~~~~~k~~~~~~iS~ell~ds--~~~l~~~i~~~l~~~~~~~~d~----ail~G----~-------g~~-------- 239 (397) T protein:vir:49 185 SLIRYAIKRYAGISTVTNSLLADS--AENILAWLSGWIAKKVVVTRNK----AILEA----I-------GTL-------- 239 (397) T ss_pred eeeEeeeeeeEeehhhHHHHHhhh--hHHHHHHHHHHHHHHHHHHHHH----HHHhc----c-------ccc-------- Confidence 333333333333333444444432 3445544444454444433332 22221 1 110 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCHH--- Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANLV--- 235 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~~--- 235 (344) .++ .+...+.+|.+.+..+. ..+..+...+|+++.|.+|+. + +..+. .-+..++ T Consensus 240 ~~~-------------~~~~~~d~i~~~~~~l~-~~~~~~a~~v~n~~~~~~l~~---l----kd~~g--~~l~~~~~~~ 296 (397) T protein:vir:49 240 PNK-------------PTLAKWDDIIDLQAKVD-PAIKQTSLFLTNTSGFTALKK---V----KNAMG--DYLMERDVKS 296 (397) T ss_pred ccc-------------ccccCHHHHHHHHHhhh-hhhcCCCEEEEcHHHHHHHHH---h----hccCC--ceeecccccC Confidence 000 12223345666665554 347788999999999988863 2 22211 1111111 Q ss_pred -HHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccce-eecccccCCcCCcccccccC- Q lcl|NC_015466. 236 -TLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTF-NWTGLVGSGNEGMRIKRFYL- 312 (344) Q Consensus 236 -~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~-~~~~~~g~~~~~~~~~~~~~- 312 (344) .-..++|+|-+.+....... +..+....++++ .+.+|.+ .+.++ .....++.. T Consensus 297 g~~~~l~G~pV~~~~~~~~~~---~~~~~~~~~~gd---------------~~~~~~~~~~~~~------~i~~~~~~~~ 352 (397) T protein:vir:49 297 PTGYSIDGFVVKEISDRFLPN---GTGGAMPLYFGD---------------LKQAVTLFDRQHL------SLLSTNIGGG 352 (397) T ss_pred CCCceecceeeEEeccccccc---ccCCceeEEEee---------------ccceEEEEeeccc------EEEEeccccc Confidence 11245676632222211110 111112222222 1111221 11111 000011111 Q ss_pred -CCCceEEEeeccccceeeeccccchhhh-cccC Q lcl|NC_015466. 
313 -DAIESDRIEIDMSYDQKKVAADLGYFFG-GIVA 344 (344) Q Consensus 313 -~~~~~~~vr~~~~~~~~v~~~~~g~l~~-~~va 344 (344) -......+|+...++-.+.-+++-.+++ .++| T Consensus 353 ~~~~~~~~~~~~~r~d~~~~~~~a~~~~~~~~~~ 386 (397) T protein:vir:49 353 AFETDTTKVRVIDRFDVVSTDTEAFVPASFKAIA 386 (397) T ss_pred hhhcCeeeEEEEEeeccEEecccceEEEEecccc Confidence 1235566777777777777777776665 2222 No 115 >protein:vir:79712 Length: 285 # NCBI annotation: major capsid protein gp34 # Family: family:all:701 # MgeID: mge:1873 # MgeName: LL-H # Cross-refs: genbank:acc:YP_001285883;genbank:gi:148750840;genbank:GeneID:5220414 Probab=22.40 E-value=2.5 Score=18.45 Aligned_cols=273 Identities=12% Similarity=0.035 Sum_probs=102.3 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhhhhhhCccc-----ccCCccceeeeechhhcccccccccccCcccccce Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFVAGQVFPQV-----SVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGT 75 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~ia~~lfP~v-----~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~ 75 (344) |+.+....-.-.+|..+. +.. +.++.+-+.- ..+..+.++++.+-..... +=.|..|.... . T Consensus 1 Main~~~k~~~~ld~~~~-------~~~--~~~~l~~~~n~~~~~~~gak~VkIp~ist~~gl~--dY~R~~g~~~g--~ 67 (285) T protein:vir:79 1 MTVVLDSKDLARIDEEYK-------ADS--QVWSYLTGGNGVTQRFRGHNEVRINKLSGFVDAT--AYKRGQDNARK--T 67 (285) T ss_pred CcchhhHHHHHHHHHHHH-------Hhh--hhhhhcccCCcceeEecCCCEEEEeeeccccccc--ccccccCcccc--c Confidence 554432222112222221 110 1222222211 1123344555442111111 12233343222 3 Q ss_pred eccccccccc-ccccccccccHHHHHhccCCCCHHHHHHH-HHHHHHhhhHH-HHHHHHHhhhhhhcccccccccccccc Q lcl|NC_015466. 76 YEIGNDTYFA-RTRAYHRDVPEQVRANADNPISLDREATI-FVTQKGLINRE-VNWAAAYFTAGAPGDTWTFDVDGVASS 152 (344) Q Consensus 76 ~~~~~~~~~~-~~~~l~~~v~~~~~~~a~~~~~~~~~a~~-~~~~~i~l~~E-~~~a~~~~~~~~~~~~~~~~~~gv~~~ 152 (344) ++.+.+++.+ .+++..-.++..+.++ +.... -...+. ...+++.=..+ .|++.+...+.. 
T Consensus 68 v~~~~et~tl~~DR~~~f~iD~mDvdE-n~~~~-~~ni~~ef~~~~vvPEiDayrfskla~~a~~--------------- 130 (285) T protein:vir:79 68 ISVGKETVKLTHEDWFGYDLDQFDMDE-NGAYT-VENVVREHNKMITIPHRDKVAVQKLFDSAAK--------------- 130 (285) T ss_pred cceeeeEEEeeccccceecccccchhh-hhhhh-HHHHHHHHHhhhhcchhhHHHHHHHHhhccc--------------- Confidence 3333334432 2333333343333332 11000 000010 01111111111 233333321110 Q ss_pred cccccccceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCcccccc Q lcl|NC_015466. 153 PTAPASFDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKA 232 (344) Q Consensus 153 ~~~~~~~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~v 232 (344) .++.++ ..++.+..|++.+..+++.....+=+|.+++.++..|++.+.+.+.+.....-..+.+ T Consensus 131 ------~~~~~~----------T~~nv~~~i~~~~~~lde~~vp~~rvl~vTp~~~~~Lk~s~~~~r~~~~~~~~~~~~i 194 (285) T protein:vir:79 131 ------KATDSI----------TKDNALDAYDTAEAYMFDNEVPGGFVMFVSSAYYTALKQSAAVTRTFSTDGTMVINGI 194 (285) T ss_pred ------cccccc----------CHHHHHHHHHHHHHHHHHcCCCCceEEEEChHHHHHHHhhhhhheecccccceeccce Confidence 001111 2457899999999999887333556699999999999999998876543211111111 Q ss_pred CHHHHHHHhC-CCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCcccccccccceeecccccCCcCCccccccc Q lcl|NC_015466. 233 NLVTLADLFE-VDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFY 311 (344) Q Consensus 233 t~~~la~~~g-l~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~ 311 (344) .. .+.++=| ++-|.|...+..+. +.. +++=++.+++.+-..-..+-+.. +..+. T Consensus 195 ~~-~V~~lDg~v~ii~Vps~r~kt~--~~~--------k~Infiiv~~~a~i~~~K~~~~~--------------~f~P~ 249 (285) T protein:vir:79 195 DR-RVAQLDGGVPIVRVSSDRLKGL--GIT--------NHVNFILTPLSAIAPIVKYDSVS--------------VIDPS 249 (285) T ss_pred ee-eeccccceeEEEEcchhhccCc--Ccc--------hhccEEEecCceeccceeeeeeE--------------eECCC Confidence 11 2333334 56666655544221 111 22322323333222111111111 11111 Q ss_pred CC-CCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 
312 LD-AIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 312 ~~-~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) .. ...+|.+....-++--|.-.-.-.++.+.-| T Consensus 250 ~~~~~d~~~~~~R~Y~d~fv~~nk~~~Iy~~~~a 283 (285) T protein:vir:79 250 TDRSGNRWTIKGLSYYDAIVLDNAKKGIYVAATA 283 (285) T ss_pred CCCCcceeeeeeeeeeeeeehhhccceeeeeecc Confidence 11 2224444444444444433333333334444 No 116 >protein:vir:1541 Length: 347 # NCBI annotation: major capsid protein 10A # Family: family:all:975 # MgeID: mge:31 # MgeName: phiYeO3-12 # Cross-refs: genbank:acc:NP_052109;swissprot:trembl:q9t107;genbank:gi:9634035;uniprot:Q9T107;genbank:GeneID:1262383 Probab=21.75 E-value=2.6 Score=18.35 Aligned_cols=320 Identities=10% Similarity=0.048 Sum_probs=112.4 Q ss_pred CCCCCCCCccceecccccceeeeeEcCcchhh------------hhhh-CcccccCC-ccceeeeech-hhccccccccc Q lcl|NC_015466. 1 MPFTQPSRSDVHVNRPLTNISIGYVQDASHFV------------AGQV-FPQVSVGK-QSDAYFTYER-GDFNRDEMQER 65 (344) Q Consensus 1 m~~~~~~~~~~~~dp~LT~iA~~Y~n~~~~~i------------a~~l-fP~v~v~~-~~~~~~~~~k-~~~~~~~~~~r 65 (344) |++++.+.+- -.+|-..+ ++-+....|| --.+ .+.+.+.. .+++...|+. ++ . ....+ T Consensus 1 ma~~~~~~~~-~t~~~~~~---~~~~~~a~~ie~f~g~V~~~f~~~s~~~~~~~~~~~~~G~sv~i~~ig~-~--t~~~~ 73 (347) T protein:vir:15 1 MANIQGGQQI-GTNQGKGQ---SAADKLALFLKVFGGEVLTAFARTSVTMPRHMLRSIASGKSAQFPVIGR-T--KAAYL 73 (347) T ss_pred CCccccCCcc-ccccccCC---CcchHHHHHHHHHHHHHHHHHHHhhhhhhccccccccccceeEeeeccc-e--eeeee Confidence 9999886542 13332221 1111111111 1112 23333211 1233333321 11 0 01111 Q ss_pred ccCccc--ccceecccccccccccccc-cccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccc Q lcl|NC_015466. 66 TPGTES--AGGTYEIGNDTYFARTRAY-HRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTW 142 (344) Q Consensus 66 a~g~~~--~~~~~~~~~~~~~~~~~~l-~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~ 142 (344) .+|.+. .+.....++. .+..+.+ --....++.......+|++....+.....+....+..++..+-.+....... 
T Consensus 74 ~~g~~l~~~~~~~~~~e~--~ltID~~~~~~~~VddlD~~q~~~D~~~~~~~~~g~aLA~~~D~~i~~~l~~~~~~~~~~ 151 (347) T protein:vir:15 74 KPGENLDDKRKDIKHTEK--VIHIDGLLTADVLIYDIEDAMNHYDVRAEYTAQLGESLAMAADGAVLAELAGLVNLPDAS 151 (347) T ss_pred ccCCCCCCCCCCCccceE--EEEechhhhhhHHhhhHHHHhcCCcchHHHHHHHHHHHHHHHHHHHHHHHHHHhhccccc Confidence 222211 1111111111 1221111 1111234455556667776665555555554444444444332211110000 Q ss_pred cccccccccccccccccceeecccccccccCCCC---CChHHHHHHHHHHHHHhcCC-CcceEEeCHHHHHHHhcCHHHH Q lcl|NC_015466. 143 TFDVDGVASSPTAPASFDPTNASNNDKLHWSDAS---STPIEDIRQGKRYVLEETGF-EPNVLTLGKAVYDALVDHPDIV 218 (344) Q Consensus 143 ~~~~~gv~~~~~~~~~~~k~tl~~t~~~~Wsd~~---SDPi~di~~~~~~i~~~~G~-~Pn~~v~~~~v~~~L~~h~~i~ 218 (344) ..+.. .+.........+.++++ .+++. ...+.-|.++.+.+.++.-- ..-.+|+++..+..|++|+++. T Consensus 152 ---~~~~~-~~g~~~~~~~~~~~~~~---~~~~~~~~~~i~d~~~~a~~~Lde~~VP~~gR~~vv~P~~y~~LL~~~~~~ 224 (347) T protein:vir:15 152 ---NENIE-GLGKPTVLTLVKPTTGD---LTDPVELGKAIIAQLTIARASLTKNYVPAADRTFYTTPDNYSAILAALMPN 224 (347) T ss_pred ---ccccc-ccCcccccccccccccc---chhhhhHHHHHHHHHHHHHHHHhhcCCCccCCEEEeCHHHHHHHhcccccc Confidence 00000 00000011122222221 22221 12244455555555555432 3356888999999999999875 Q ss_pred HHhccCCCccccccCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCC--Ccccccccccceeecc Q lcl|NC_015466. 219 GRIDRGQTSGAAKANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPAT--PGIMTPSAGYTFNWTG 296 (344) Q Consensus 219 ~~i~~~~~~~~~~vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~--~~~~~~s~G~T~~~~~ 296 (344) .+=+ .+. +.+-.-.+..++|++-+ ...-.=..+ +.+....-..+.+..+-+-... 
..-..++.|.-|-..+ T Consensus 225 ~~d~-~~~---~~~~~G~Vg~i~G~~V~-~Sn~lp~~~--~t~~~~~~~~g~~~~~~~~~~~~~~~~f~~~~~l~~h~~A 297 (347) T protein:vir:15 225 AANY-QAL---IDHERGTIRNVMGFEVV-EVPHLTAGG--AGDTREDAPADQKHAFPATSSTTVKVALDNVVGLFQHRSA 297 (347) T ss_pred cccc-ccc---ccccceEEEEEeceEEE-ecccccccc--cccccccccccccccccccccceeeeccccceeeeeccce Confidence 4322 111 11212233445565422 110000000 0000000011111111000000 0000011111111111 Q ss_pred cccCC-cCCcccccccCCCCceEEEeeccccceeeeccccchhh-hcccC Q lcl|NC_015466. 297 LVGSG-NEGMRIKRFYLDAIESDRIEIDMSYDQKKVAADLGYFF-GGIVA 344 (344) Q Consensus 297 ~~g~~-~~~~~~~~~~~~~~~~~~vr~~~~~~~~v~~~~~g~l~-~~~va 344 (344) . |.- .....++.++.+...+++|+....+=-.+.=|++..-| ---|+ T Consensus 298 ~-g~v~~~~~~~e~~~~~~~~~d~i~~~~~~G~~vlrP~~av~~~~~~~~ 346 (347) T protein:vir:15 298 V-GTVKLKDLALERARRANYQADQIIAKYAMGHGGLRPEAAGAIVLPKVS 346 (347) T ss_pred e-eeeEeeceeeeecccchhhhhhhehhhhcCCceeccccEEEEecCCCC Confidence 0 000 01124566677777888888877776666666653322 11122 No 117 >protein:vir:4830 Length: 397 # NCBI annotation: MPL-7201 # Family: family:all:21 # MgeID: mge:105 # MgeName: 7201 # Cross-refs: genbank:acc:NP_038327;genbank:gi:9634653;genbank:GeneID:1262632 Probab=21.19 E-value=2.6 Score=18.27 Aligned_cols=269 Identities=11% Similarity=-0.027 Sum_probs=102.1 Q ss_pred CCCCCCCCccceec-ccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccc-eecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVN-RPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGG-TYEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~d-p~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~-~~~~ 78 (344) |........-+.+- ...+.|-...++. ..--.+++.+|+....++++.+....... ...-..-|...... 
...| T Consensus 109 ~~~~t~~~gg~~iP~~~~~~ii~~~~~~---~~l~~~~~~~~~~~~~~~~~~~~~~~~~~-~a~~v~E~~~~~~~~~~~~ 184 (397) T protein:vir:48 109 KTDASGSDAGLTIPQDIQTAIHTLVRQY---DSLQEYVNVENVTTLTGSRVYEKWADITG-LAKLDDEAGSIGTNDDPKL 184 (397) T ss_pred hhccCCccccccccHHHHHHHHHHHHHH---HHHHhhhceeeccCCcceEEEEeecCCCc-ceeeeccccccccccccce Confidence 33222222222221 1222221111111 12233456677776666665432211110 00011112222211 1234 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) +..+...+..+.-.++..+..++ +.++++......+.+.+....|. .++++. |... T Consensus 185 ~~v~~~~~k~~~~~~iS~ell~d--s~~~l~~~v~~~l~~~~~~~~d~----~il~G~-----------g~~~------- 240 (397) T protein:vir:48 185 YPIRYAIKRYAGISTVTNSLLAD--SAENILAWLSGWIAKKVVVTRNK----AILEAI-----------ATLP------- 240 (397) T ss_pred eeEEeeheeeeeehhhHHHHHhh--chHHHHHHHHHHHHHHHHHHHHH----HHhhcc-----------cccc------- Confidence 44444433333333444444443 33455555555555555443332 222211 1100 Q ss_pred cceeecccccccccCCCCCChHHHHHHHHHHHHHhcCCCcceEEeCHHHHHHHhcCHHHHHHhccCCCccccccCH---- Q lcl|NC_015466. 159 FDPTNASNNDKLHWSDASSTPIEDIRQGKRYVLEETGFEPNVLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKANL---- 234 (344) Q Consensus 159 ~~k~tl~~t~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~vt~---- 234 (344) + . .+.....+|.+....+.. .+..+...+|+++.|.+|+. ++..+ +..+... 
T Consensus 241 -~---~----------~~~~~~d~i~~~~~~l~~-~~~~~a~~v~n~~~~~~L~~-------lkd~~--G~~i~~~~~~~ 296 (397) T protein:vir:48 241 -T---K----------PTLTKWDDIIDLQAKVDP-AIKQTSFFLTNTSGFTALKK-------VKNAF--GDYLMERDVKS 296 (397) T ss_pred -c---c----------cccccHHHHHHHHHHhhh-hhcCCCEEEECHHHHHHHHH-------hhcCC--CceeeccCcCC Confidence 0 0 122334455555555543 46778899999999999873 33222 1112211 Q ss_pred HHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCc--eEEEEecCCCcccccccccceeecccccCCcCCcccccccC Q lcl|NC_015466. 235 VTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKH--ALLSYAPATPGIMTPSAGYTFNWTGLVGSGNEGMRIKRFYL 312 (344) Q Consensus 235 ~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~--~~l~~~~~~~~~~~~s~G~T~~~~~~~g~~~~~~~~~~~~~ 312 (344) ..-..++|.|-+.+........ ..+....++|+. .++++.. -|++++.... .+.. T Consensus 297 ~~~~~l~G~PV~~~~~~~~~~~---~~~~~~~~~gd~~~~~~~~~~---------~~~~i~~~~~-----------~~~~ 353 (397) T protein:vir:48 297 PTGYSIDGFAVKEVADRWLANA---SSGAMPLYFGDLKQAVTLFDR---------QQMSLLSTNI-----------GGGA 353 (397) T ss_pred CCCceeccceeEEecccccCCc---CCCceEEEEEeccceEEEEee---------cceEEEEecc-----------chhh Confidence 1223566776333322111111 111112222221 1111100 0111111100 0011 Q ss_pred CCCceEEEeeccccceeeeccccchhhhcc-cC Q lcl|NC_015466. 313 DAIESDRIEIDMSYDQKKVAADLGYFFGGI-VA 344 (344) Q Consensus 313 ~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~-va 344 (344) -......+|+...++-.+.-+++-..++=. ++ T Consensus 354 ~~~~~~~~r~~~r~d~~~~~~~a~~~~~~~~~~ 386 (397) T protein:vir:48 354 FETDTTKIRVIDRFDVVATDTESFVPASFKAIA 386 (397) T ss_pred hhcCceeEEEEeeeccEEecccceEEEEecccc Confidence 123445666667677666666655444421 11 No 118 >protein:vir:80180 Length: 381 # NCBI annotation: capsid protein # Family: family:all:2203 # MgeID: mge:1878 # MgeName: Pf-WMP3 # Cross-refs: genbank:acc:YP_001285797;genbank:gi:148747831;genbank:GeneID:5220456 Probab=20.81 E-value=2.7 Score=18.21 Aligned_cols=301 Identities=12% Similarity=-0.019 Sum_probs=94.6 Q ss_pred CCCCCC------------CCccceecccccceee-eeEcCcchhhhhhhCcccccCCccceeeeechhhccccccccccc Q lcl|NC_015466. 
1 MPFTQP------------SRSDVHVNRPLTNISI-GYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTP 67 (344) Q Consensus 1 m~~~~~------------~~~~~~~dp~LT~iA~-~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~ 67 (344) |++... +.++|+. .+.....+ .|.+. ++-..+.-+-.-....++...+++-- .+....... T Consensus 1 ~~~~~~~~~~~~~~~~~t~~~~fiP-ev~s~~v~~~l~~~---lv~~~l~~~~~~~~~~GdTV~ip~~g--~~~a~d~~~ 74 (381) T protein:vir:80 1 MATIQGTGGYKGSAVDLSNVQVFIP-EVWSSEVRMFRDQK---FAALEATKKIPFEGKKGDLIHIPNIS--RAAVYDKQP 74 (381) T ss_pred CceecccccccCcccchhhHHhhhh-HHHHHHHHHHHHHh---hhhhhccccccceeecCceEEeeccC--cceeeeecC Confidence 444332 2222321 01111111 11111 11111111101111223333333211 111111222 Q ss_pred Ccccccceecccccccccccccc-cccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccc Q lcl|NC_015466. 68 GTESAGGTYEIGNDTYFARTRAY-HRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDV 146 (344) Q Consensus 68 g~~~~~~~~~~~~~~~~~~~~~l-~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~ 146 (344) |.......+..+.. .+..+.+ .......+.+.....+|++....+.....+....+..+...+.............. T Consensus 75 g~~i~~~~~~~~~~--~itID~~~~~~~~Idd~D~~~~~~D~~~~~~~~~~~aLA~~~D~~i~~~~~~~~~~~~~~~~t~ 152 (381) T protein:vir:80 75 QTPVNLQARTDSEF--TFTVTKYKESSFMIEDIVNTQASYTLRQYYTKEAGYALARDMDNFALAHRAVINAFPSQRIYSY 152 (381) T ss_pred CCcccccccCCceE--EEEEeeeeecceeechHHHHhhccChHHHHHHHHHHHHHHHHHHHHHHHHhhcccccccccccc Confidence 22222212221222 2222111 12234445556667778888887777766655555444443322111111100000 Q ss_pred cccccccccccccceeecccccccc--cCCCCCChHHHHHHHHHHHHHhcCC-CcceEEeCHHHHHHHhcCHHHHHHhcc Q lcl|NC_015466. 147 DGVASSPTAPASFDPTNASNNDKLH--WSDASSTPIEDIRQGKRYVLEETGF-EPNVLTLGKAVYDALVDHPDIVGRIDR 223 (344) Q Consensus 147 ~gv~~~~~~~~~~~k~tl~~t~~~~--Wsd~~SDPi~di~~~~~~i~~~~G~-~Pn~~v~~~~v~~~L~~h~~i~~~i~~ 223 (344) .....+..... ...+...-+..|.+++..+.+..-. 
..-+++++++.+..|++++++.++-+ T Consensus 153 --------------~~~i~~~~~~~~~t~~~~~~t~~~i~~a~~~Lde~~VP~egR~lvv~P~~~~~Ll~~~~~~~ad~- 217 (381) T protein:vir:80 153 --------------DTTLGDGTVNAHLTGTPAPLTYAALLLAKQKLDEADVPQEGRIVMVSPAQYIDLLSINQFISVDF- 217 (381) T ss_pred --------------cccccccccccccccchhhHHHHHHHHHHHHHhhcCCCcCCcEEEeCHHHHHHHhhchhhhhhhh- Confidence 00000000000 1112233567788888877665321 22389999999999999999887642 Q ss_pred CCCccccccCHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCC--CcccccccccceeecccccCC Q lcl|NC_015466. 224 GQTSGAAKANLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPAT--PGIMTPSAGYTFNWTGLVGSG 301 (344) Q Consensus 224 ~~~~~~~~vt~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~--~~~~~~s~G~T~~~~~~~g~~ 301 (344) .+. ..+..-.+..+.|++ |+... .-. .+ .+. .+...+.++.. +.+....|+--|...+ T Consensus 218 ~~~---~~l~~G~Ig~i~G~~-Vv~Sn--~lp--~~-~~t------~~~~~agap~~~~~~~~~~~~~g~~s~~a----- 277 (381) T protein:vir:80 218 SQV---KPVTSGVVGTILGME-VIVTT--QIG--IN-SLT------GYVNGQGAPTQPTPGVLGSPYLPDQAGTA----- 277 (381) T ss_pred ccc---hhhhceeeeEEcceE-EEeec--ccc--cc-ccc------ceeeeccccccccccccccccccccccce----- Confidence 221 122222345566655 22211 100 00 000 01111111100 0000111111110000 Q ss_pred cCCcccccccCCC---CceEEEeeccccceeeeccccchh--hh----ccc---------------C Q lcl|NC_015466. 
302 NEGMRIKRFYLDA---IESDRIEIDMSYDQKKVAADLGYF--FG----GIV---------------A 344 (344) Q Consensus 302 ~~~~~~~~~~~~~---~~~~~vr~~~~~~~~v~~~~~g~l--~~----~~v---------------a 344 (344) .+...++.|+.-- ..+..+-.+...+.-.-.+.+|-+ ++ .|| + T Consensus 278 ~av~~~k~yd~~~~~~~~~~~~~~g~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 344 (381) T protein:vir:80 278 NVVNTGSASDLAVSLSYFGLPVFSGAGATAADGGQTLGSFGGANRWATAVVCHPDWLAVGVQQNVKS 344 (381) T ss_pred eeeeeeeeeceeeeeeeccceeeecceeeecCCCceeeeehhhhhhhhhcccccccccccceeEeec Confidence 0000111111100 000000000000011111111111 00 011 0 No 119 >protein:vir:485 Length: 407 # NCBI annotation: putative major capsid protein # Family: family:all:21 # MgeID: mge:11 # MgeName: P27 # Cross-refs: genbank:acc:NP_543092;swissprot:trembl:q8w627;genbank:gi:18249904;uniprot:Q8W627;genbank:GeneID:929693 Probab=20.54 E-value=2.8 Score=18.17 Aligned_cols=286 Identities=10% Similarity=0.011 Sum_probs=105.7 Q ss_pred CCCCCCCCccceec-ccccceeeeeEcCcchhhhhhhCcccccCCccceeeeechhhcccccccccccCcccccce-ecc Q lcl|NC_015466. 1 MPFTQPSRSDVHVN-RPLTNISIGYVQDASHFVAGQVFPQVSVGKQSDAYFTYERGDFNRDEMQERTPGTESAGGT-YEI 78 (344) Q Consensus 1 m~~~~~~~~~~~~d-p~LT~iA~~Y~n~~~~~ia~~lfP~v~v~~~~~~~~~~~k~~~~~~~~~~ra~g~~~~~~~-~~~ 78 (344) |...+....-+.|- ...+.|-..-++. .+-..++..+|+.....++++...+.-. ... +-|....... ..| T Consensus 106 ~~~~t~~~gG~~iP~~~~~~I~~~~~~~---~~l~~~~~~~~~~~~~~~~~~~~~~~~a-~~v---~E~~~~~~~~~~~f 178 (407) T protein:vir:48 106 LQVGNDEDGGYAIPEELDRTILTLLKDE---VVMRQEATVITLGGSDYKKLVNLGGTTS-GWV---GETDARPETATSKL 178 (407) T ss_pred hhcccCCCCcccccHhHHHHHHHHHHhh---hhhhhhceeeecCCCceEEEEecCCcce-eee---cccccccccccccc Confidence 22111111111110 0011111000110 1123345566776666666653221110 001 1111111111 234 Q ss_pred cccccccccccccccccHHHHHhccCCCCHHHHHHHHHHHHHhhhHHHHHHHHHhhhhhhcccccccccccccccccccc Q lcl|NC_015466. 
79 GNDTYFARTRAYHRDVPEQVRANADNPISLDREATIFVTQKGLINREVNWAAAYFTAGAPGDTWTFDVDGVASSPTAPAS 158 (344) Q Consensus 79 ~~~~~~~~~~~l~~~v~~~~~~~a~~~~~~~~~a~~~~~~~i~l~~E~~~a~~~~~~~~~~~~~~~~~~gv~~~~~~~~~ 158 (344) ...++..+..+.-.++..+..+ .+.++++....+.+.+.+....|. .++++ ++. +...|.-........ T Consensus 179 ~~i~~~~~k~~~~~~iS~ell~--ds~~~l~~~i~~~l~~~i~~~~~~----a~l~G----~G~-~~p~Gil~~~~~~~~ 247 (407) T protein:vir:48 179 GLIEPFMGEIYGNPQATQKMLD--DAFFNVEDWINSELALEFAEQEEI----AFTSG----DGS-KKPKGFLAYESTDED 247 (407) T ss_pred eeEEeeeeeeEeehhhHHHHHh--cchHHHHHHHHHHHHHHHHHHHHh----hhhcc----CCC-Cccceeeeccccccc Confidence 4333333333322333344333 333455555555555555433332 22222 111 111111110000000 Q ss_pred cceeecccc-cccccCCCCCChHHHHHHHHHHHHHhcCCCcc-eEEeCHHHHHHHhcCHHHHHHhccCCCcccccc---- Q lcl|NC_015466. 159 FDPTNASNN-DKLHWSDASSTPIEDIRQGKRYVLEETGFEPN-VLTLGKAVYDALVDHPDIVGRIDRGQTSGAAKA---- 232 (344) Q Consensus 159 ~~k~tl~~t-~~~~Wsd~~SDPi~di~~~~~~i~~~~G~~Pn-~~v~~~~v~~~L~~h~~i~~~i~~~~~~~~~~v---- 232 (344) ......++ .......++.--..||.+.+..+.. ..++| +.+|++..|..|+. + +..+. .-+. T Consensus 248 -~~~~~~~~~~~~~~~~~~~~~~d~i~~l~~~l~~--~~~~~a~~v~n~~~~~~L~~---l----kD~~G--r~l~~~~~ 315 (407) T protein:vir:48 248 -DKTRAFGKLQHIASGAASGVTADAIIKLIYTLRK--AHRSGAKFMMNNSSLFAIRL---L----KDNDG--NYLWRPGI 315 (407) T ss_pred -ccccccccccccccccccccChHHHHHHHHhhch--hhhcCCEEEEcHHHHHHHHH---h----hccCC--ceeeccCc Confidence 00000000 0000111111124555555555432 23444 67999999988863 2 22211 1111 Q ss_pred CHHHHHHHhCCCeEEEEEEEEeccccCCCCccceeCCCceEEEEecCCCccccccccccee-ecccccCCcCCccccccc Q lcl|NC_015466. 233 NLVTLADLFEVDKVLVMKAVRNTAKKGQTASHSFIGGKHALLSYAPATPGIMTPSAGYTFN-WTGLVGSGNEGMRIKRFY 311 (344) Q Consensus 233 t~~~la~~~gl~~I~v~~a~yn~~~~~~~~~~~~iw~~~~~l~~~~~~~~~~~~s~G~T~~-~~~~~g~~~~~~~~~~~~ 311 (344) +...-..++|.| |++.+.. .. +..+...++| |+-+.+|.+. +.+. ....+.|. 
T Consensus 316 ~~g~~~~l~G~P-V~~~~~~-----p~-------~~~~~~~i~~-------Gd~~~~~~i~~~~~~------~i~~d~~~ 369 (407) T protein:vir:48 316 ELGQPSSLAGYG-IVENEQM-----PD-------IAADAKAIAF-------GNFKRGYTIVDRIGT------RILRDPYT 369 (407) T ss_pred CCCCCceeccee-eEEecCc-----CC-------ccCCccEEEE-------EeccccEEEEEeece------EEEeeccc Confidence 111223466776 4332211 00 1111111221 2222233321 2221 00112232 Q ss_pred CCCCceEEEeeccccceeeeccccchhhhcccC Q lcl|NC_015466. 312 LDAIESDRIEIDMSYDQKKVAADLGYFFGGIVA 344 (344) Q Consensus 312 ~~~~~~~~vr~~~~~~~~v~~~~~g~l~~~~va 344 (344) ..+...+|+.+.++-.++-+++-.+++.+.| T Consensus 370 --~~~~~~~~~~~r~d~~v~~~~a~~~l~~~aa 400 (407) T protein:vir:48 370 --NKPFVGFYTTKRTGGMLVDSQAIKLMKIGAA 400 (407) T ss_pred --cCCcEEEEEEEEeccEEecccceEEEEeecc Confidence 2456778899999999999999999888887 Done!