2 Kmer-based genome size estimation
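Estimating the length of a whole sequence from short fragments can be abstracted as follows. Suppose there is a complete, continuous sequence G, from which fragments of length k are drawn at random; each such fragment is called a kmer. Once a certain coverage is reached, the sequence length G can be estimated from the number of kmers and their depth.
Assumption: the frequency distribution of kmer depth follows a Poisson distribution (Havlak and Chen et al., 2004).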
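[Figure: Kmer distribution curve compared with a simulated Poisson(7) distribution; x-axis: Depth, y-axis: Frequency]
For a Poisson distribution, the probability that the random variable X equals k is:
$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$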
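As a minimal sketch, the Poisson probability can be evaluated directly to check where its peak falls. The value λ = 7.4 and the depth range 0 to 46 are illustrative assumptions, not values taken from the data.

```python
import math


def poisson_pmf(depth, lam):
    """Probability that a Poisson(lam) variable equals `depth`."""
    return lam ** depth * math.exp(-lam) / math.factorial(depth)


lam = 7.4              # hypothetical expected kmer depth
depths = range(0, 47)  # depth range chosen for illustration

pmf = [poisson_pmf(d, lam) for d in depths]

# The depth with the highest probability is the peak of the curve.
peak_depth = max(depths, key=lambda d: poisson_pmf(d, lam))

print(peak_depth, math.floor(lam))
```

A non-integer λ is used on purpose: when λ is an integer, the depths λ - 1 and λ are equally probable and the peak is not unique.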
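The mean is λ, and the mode equals the mean rounded down (floor(λ)), so the depth at the peak can be taken as the expected kmer depth.
Assumption: if the selected kmers can traverse the entire genome, then according to the Lander-Waterman algorithm the genome size (G) satisfies the following formulas: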
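$$G = \frac{k_{num}}{k_{depth}} = \frac{b_{num}}{b_{depth}}, \qquad k_{num} = r_{num}\,(l - k + 1), \qquad b_{num} = r_{num}\,l$$
where $k_{num}$ is the number of kmers, $k_{depth}$ is the expected kmer depth, $b_{num}$ is the number of bases, $b_{depth}$ is the expected base depth, $r_{num}$ is the number of reads produced by sequencing, and $l$ is the average read length.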
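The following formulas can therefore be obtained:
$$k_{depth} = \frac{k_{num}}{G} = \frac{r_{num}\,(l - k + 1)}{G}$$
$$b_{depth} = \frac{b_{num}}{G} = \frac{r_{num}\,l}{G} = k_{depth} \cdot \frac{l}{l - k + 1}$$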
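From the formulas above, once the expected kmer depth is known, the expected base depth and the genome size can be calculated. Because the frequency distribution of kmer depth follows a Poisson distribution, the depth at the main peak of the kmer depth curve can be taken as the expected kmer depth, and the genome size can be estimated from it.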
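The estimate described above can be sketched in a few lines. The function name and the input values (about 10X of 100 bp reads analysed with k = 17, main peak at depth 8) are illustrative assumptions.

```python
def estimate_genome_size(kmer_num, peak_kmer_depth, read_len, k):
    """Estimate genome size and base depth from the main kmer peak.

    genome size = kmer_num / kmer_depth
    base depth  = kmer_depth * l / (l - k + 1)
    """
    genome_size = kmer_num / peak_kmer_depth
    base_depth = peak_kmer_depth * read_len / (read_len - k + 1)
    return genome_size, base_depth


# Illustrative inputs: total kmer count, peak depth, read length, kmer size.
genome_size, base_depth = estimate_genome_size(
    kmer_num=999_579_000, peak_kmer_depth=8, read_len=100, k=17
)
print(f"genome size ~ {genome_size:.0f} bp, base depth ~ {base_depth:.2f}X")
```

Because the peak depth is an integer read off the curve, the estimate moves in steps: with the same kmer count, a peak at 7 instead of 8 would raise the genome size estimate by about 14%.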
3 Factors affecting estimation accuracy
ÓÃÄâÄϽæ»ùÒò×飬Éú³É10X 100bp read¡£É趨ÈçÏÂͼÄÚ²ÎÊý£¬ÆäÖÐÉî¶ÈÓÐ10X£¬20X£»ÔÓºÏÂÊÓÐ0ºÍ1%£»´íÎóÂÊÓÐ0ºÍ0.003¡£Íê³ÉÈçÏÂͼ£º 30252015105016111621263136414610Xh0f010Xh0f0310Xh10f010Xh10f0320Xh10f03 1num kmer_num pkdepth genome_size used_base X node_num 10Xh0f0 306358 999579000 8 124947375 1189975000 9.5 101166945 10Xh10f0 2616526 999579000 7 142797000 1189975000 8.3 117532751 10Xh0f03 41062471 999579000 7 142797000 1189975000 8.3 142600356 10Xh10f03 43230395 999579000 6 166596500 1189975000 7.1 158454797
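Conclusion 1:
Both heterozygosity and sequencing errors raise the peak at depth 1, but their effects differ by an order of magnitude, with errors having the larger effect. Both also increase the number of unique kmers.
Conclusion 2:
At 10X, heterozygosity and errors each shift the main peak forward (toward lower depth), and together they shift it further. At 20X, the same heterozygosity rate does not shift the main peak. With small data volumes it is therefore not enough to account for the main-peak shift from the error rate alone: heterozygosity also affects the estimated position of the main peak, which leads to an overestimated genome size.
Conclusion 3:
At 10X, the peak of the depth-product curve also shifts forward when both heterozygosity and errors are present, but more slowly than the peak of the depth distribution curve.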
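From the analysis above, the error rate seriously affects the accuracy of genome size estimation by shifting the main-peak position, whereas heterozygosity by itself has little effect on the main-peak position. The error rate, heterozygosity and depth nevertheless act together to affect the accuracy of the estimate to varying degrees. The effect of each of the three is analysed below.
(1) Effect of sequencing depth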
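Define the genome size as $G$ and its observed value as $G'$. Let $d_k$ be the expected kmer depth and $d'_k$ the peak position actually observed on the kmer depth curve, with
$$d_k = d'_k + \Delta$$
Since $G = k_{num}/d_k$ and $G' = k_{num}/d'_k$, the relative deviation of the genome size is
$$\frac{G' - G}{G} = \frac{\Delta}{d'_k}$$
It follows that, for the same $\Delta$, the higher the expected depth, the smaller the deviation of the genome size estimate.
Leaving other factors aside, the $\Delta$ caused by rounding is independent of the expected depth, so a higher depth does not necessarily mean a smaller rounding error.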
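The rounding effect can be illustrated with a short sketch. It assumes, as a simplification, that the observed peak is exactly the floor of the expected kmer depth and that no other factor moves the peak; the function names are hypothetical, and l = 100 and k = 17 are example settings.

```python
import math


def expected_kmer_depth(base_depth, read_len=100, k=17):
    """Expected kmer depth from base depth: b_depth * (l - k + 1) / l."""
    return base_depth * (read_len - k + 1) / read_len


def rounding_bias(base_depth, read_len=100, k=17):
    """Relative genome size deviation Delta / d'_k caused by rounding alone.

    Assumption: the observed peak d'_k is floor(expected kmer depth).
    """
    d_k = expected_kmer_depth(base_depth, read_len, k)
    observed_peak = math.floor(d_k)
    delta = d_k - observed_peak
    return delta / observed_peak


for base_depth in (10, 15, 30, 40):
    print(f"{base_depth}X: bias = {rounding_bias(base_depth):.2%}")
```

Under these assumptions the bias at 40X is larger than at 30X, because the fractional part lost to rounding is larger (0.6 against 0.2) even though the depth is higher.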
ÓÃÄâÄϽæ»ùÒò×éÊý¾Ý£¨nogap£©£¬Éú³É100bp³¤µÄread£¬·Ö±ðÉú³É10X£¬15X£¬30X£¬40X¡£ÓÃ17kmer·ÖÎö£¬·Ö±ð»ñµÃkmerÉî¶È·Ö²¼ÇúÏߺÍÉî¶È³Ë»ýÇúÏß¡£
[Figure: depth effect on the kmer frequency curve; series: 10X, 15X, 25X, 30X, 40X]
[Figure: depth effect on the kmer product curve; series: 10X, 15X, 25X, 30X, 40X]
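
| depth | Kmer num | Kmer depth | Genome size | Used base | X | node_num | expect_X | error rate (%) |
|---|---|---|---|---|---|---|---|---|
| 10 | 999543216 | 8 | 124942902 | 1189932400 | 9 | 101167694 | 8.399652701 | 4.995659 |
| 15 | 1499305416 | 12 | 124942118 | 1784887400 | 14 | 101208919 | 12.59939999 | 4.995 |
| 25 | 2498857368 | 20 | 124942868 | 2974830200 | 23 | 101210177 | 20.9991261 | 4.995631 |
| 30 | 2998622424 | 25 | 119944896 | 3569788600 | 29 | 101210214 | 25.19889739 | 0.79559 |
| 40 | 3998165640 | 33 | 121156534 | 4759721000 | 39 | 101210241 | 33.5985501 | 1.813788 |

Conclusion:
As depth increases, the overall accuracy of the estimate improves, but this does not mean that a high-depth estimate is always more accurate than a low-depth one.
The peak of the product curve lies 1 higher than the peak of the depth distribution curve. Because the depth tends to be underestimated, it is recommended to estimate the genome size from the product-curve peak as well, so that the two estimates together indicate a range for the genome size.
(2) Effect of the error rate
Sequencing errors affect the kmer curve in two ways. First, the number of unique kmers (node number) increases, and can even exceed the genome size. Second, the frequency of kmers with depth 1 increases.
Starting from the frequency of depth-1 kmers: for kmers of length k, assume that one erroneous base produces on average $\alpha \cdot k$ error-specific kmers. The following equation then holds: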
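$$P'_1 = P_1 + \frac{n_b \cdot f \cdot \alpha \cdot k}{n_k} = \frac{n_1}{n_k}$$
where $P'_1$ is the observed frequency at depth 1, $P_1$ is the actual value in the absence of errors, $f$ is the error rate, $n_b$ is the total number of bases, $n_k$ is the total number of kmers, and $n_1$ is the number of kmers with depth 1.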
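Because the kmer depth frequency follows a Poisson distribution, $P(X = 1) = \lambda e^{-\lambda}$, where $\lambda$ is the expected peak depth of the kmers. $\lambda$ is usually below 100, and as $\lambda$ increases (for $\lambda > 1$) the probability at depth 1 decreases; at $\lambda = 10$ it is 0.045%. Sequencing errors, by contrast, often push the frequency at depth 1 to 40%, roughly 1000 times higher. The error-free term $P_1$ can therefore be neglected.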
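The figure quoted for λ = 10 can be checked with a short calculation. This is only a numerical check of the Poisson expression at depth 1; the 40% value is the one stated in the text, not a measurement.

```python
import math


def poisson_depth1_probability(lam):
    """P(X = 1) for a Poisson distribution with mean lam."""
    return lam * math.exp(-lam)


p1 = poisson_depth1_probability(10)
observed_with_errors = 0.40  # depth-1 frequency quoted in the text

print(f"P(X = 1) at lambda = 10: {p1:.4%}")
print(f"ratio to the error-driven frequency: {observed_with_errors / p1:.0f}")
```

The ratio comes out somewhat below 1000, consistent with the text's order-of-magnitude statement.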
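This gives the following expression for the error rate:
$$f = (P'_1 - P_1)\,\frac{n_k}{n_b \cdot \alpha \cdot k} \approx P'_1\,\frac{n_k}{n_b \cdot \alpha \cdot k} = \frac{n_1}{n_b \cdot \alpha \cdot k}$$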
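Here $\alpha$, the coefficient in the number of error-specific kmers produced per erroneous base, depends on the repeat characteristics of the sequence, edge effects and similar factors. It also depends on depth to some extent, but only weakly. For Arabidopsis its estimated value is about 0.5.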
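A minimal sketch of the error-rate estimate, assuming the depth-1 kmers are dominated by errors so that the error-free term can be dropped. The function name is hypothetical; the inputs are the 1num and used_base values of the 10Xh0f03 sample in section 3 (simulated error rate 0.003), with k = 17 and the approximate coefficient 0.5.

```python
def estimate_error_rate(n1, base_num, k, alpha=0.5):
    """Estimate the sequencing error rate as n1 / (n_b * alpha * k).

    n1       -- number of kmers with depth 1
    base_num -- total number of sequenced bases (n_b)
    k        -- kmer length
    alpha    -- coefficient for kmers created per erroneous base
                (about 0.5 for Arabidopsis according to the text)
    """
    return n1 / (base_num * alpha * k)


f = estimate_error_rate(n1=41_062_471, base_num=1_189_975_000, k=17)
print(f"estimated error rate: {f:.4f}")
```

With the coefficient fixed at 0.5 the estimate is of the same order as the simulated rate but not equal to it, which reflects that 0.5 is only an approximate value.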
ÓÃÄâÄϽæ»ùÒò×éÊý¾Ý£¨nogap£©£¬Éú³É100bp read£¬´íÎóÂÊΪ0£¬0.01£¬0.03£¬0.06ºÍ0.08µÄ20X read¡£ÓÃ17kmer·ÖÎö£¬·Ö±ð»ñµÃkmerÉî¶È·Ö²¼ÇúÏߺÍÉî¶È³Ë»ýÇúÏß¡£