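WEKA Experiment Report
I. Dataset
The experiment uses the breast cancer data samples provided by Dr. William H. Wolberg of the Wisconsin medical school. All data come from real clinical cases, and each case has 10 attributes. The first nine are test indicators, each recorded as an integer from 1 to 10, where 1 means the indicator is most normal and 10 means it is most abnormal. The tenth is the class attribute, which indicates whether the tumor is malignant. The tumor nature recorded in the dataset was determined by biopsy.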
Ö׿éºñ¶È Clump_Thickness integer [1,10]
ϸ°û´óСµÄ¾ùÔÈÐÔ Cell_Size_Uniformity integer [1,10] ϸ°ûÐÎ×´µÄ¾ùÔÈÐÔCell_Shape_Uniformity integer [1,10] ±ßÔµÕ³ÐÔ Marginal_Adhesion integer [1,10] µ¥ÉÏÆ¤Ï¸°ûµÄ´óС Single_Epi_Cell_Size integer [1,10] ÂãºË Bare_Nuclei integer [1,10] ·¦Î¶È¾É«Ìå Bland_Chromatin integer [1,10] Õý³£ºË Normal_Nucleoli integer [1,10] ÓÐË¿·ÖÁÑ Mitoses integer [1,10] Ö×ÁöÐÔÖÊ Class { benign, malignant}
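The dataset contains 669 instances.
This experiment applies three kinds of operations to the dataset: classification, clustering, and association rules. The aim is to become familiar with the WEKA software and to try to extract practical value from the data. In classification, the first nine attribute values are used to predict the nature of the tumor (benign or malignant). In clustering, the goal is to find the distinguishing features of the patients in each cluster (especially patients with malignant tumors), which can help in drawing up targeted treatment plans. The association-rule exploration looks for correlations between the values of different attributes.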
II. Classification
1. Data preprocessing
The wisconsin-breast-cancer dataset is split into two parts, used as trainset (469 instances) and testset (200 instances).
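The split described above can be sketched outside WEKA as follows. This is only a minimal illustration, not the procedure used in the experiment: the report does not say how the 469/200 partition was chosen, so the random shuffle, the seed, and the function name `split_dataset` are all assumptions, and the rows are placeholders rather than real instances.

```python
import random

def split_dataset(rows, n_train, seed=0):
    """Shuffle a copy of rows and cut it into (train, test).

    Hypothetical helper: the shuffle and the seed are assumptions,
    the report does not state how the split was made.
    """
    if not 0 <= n_train <= len(rows):
        raise ValueError("n_train must be between 0 and len(rows)")
    shuffled = list(rows)                  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder rows standing in for the 669 instances of the dataset.
rows = list(range(669))
train, test = split_dataset(rows, n_train=469)
print(len(train), len(test))  # 469 200
```

Fixing the seed keeps the split reproducible, so that the evaluation on testset can be repeated on exactly the same instances.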
2. Experimental procedure
The J48 classification tree was run on trainset. The result is as follows:
[Figure: J48 output on trainset]
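The result shows that the classification accuracy of the model reached 96%. The Confusion Matrix shows that 13 benign tumors were wrongly classified as malignant (4.5%) and 6 malignant tumors were wrongly classified as benign (3.1%).
The model was then applied to testset to check its prediction accuracy. The result is as follows:
[Figure: J48 evaluation output on testset]
The result shows that the prediction accuracy reached 99%. The Confusion Matrix shows that 2 benign tumors were wrongly classified as malignant (1.3%), while all malignant tumors were classified correctly.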
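The overall accuracies quoted above can be cross-checked from the error counts. The sketch below assumes that the training-set figure is computed over all 469 trainset instances and the test figure over all 200 testset instances; the helper name `accuracy` is hypothetical. The per-class percentages cannot be recomputed this way, because the report does not give the number of benign and malignant instances in each set.

```python
def accuracy(n_instances, n_errors):
    """Fraction of instances classified correctly."""
    return (n_instances - n_errors) / n_instances

# trainset: 13 benign->malignant errors plus 6 malignant->benign errors
train_acc = accuracy(469, 13 + 6)
# testset: 2 benign->malignant errors, no malignant->benign errors
test_acc = accuracy(200, 2 + 0)

print(f"{train_acc:.1%}")  # 95.9%
print(f"{test_acc:.1%}")   # 99.0%
```

Under these assumptions the counts agree with the reported 96% and 99% after rounding.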
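3. Analysis of results
First, the test shows that the decision tree generated by J48 from the training set predicts the nature of the tumor with an accuracy that stays at a high level, so the model can be used in clinical diagnosis. This is of considerable practical value for patients who cannot undergo a biopsy because of poor medical conditions, or whose lesion is located where a biopsy is difficult.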
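Second, the classification tree shows that the nature of the tumor is strongly associated with the attributes "uniformity of cell size" and "bare nuclei", whereas "uniformity of cell shape", "marginal adhesion", "single epithelial cell size", "bland chromatin", and "mitoses" have almost no reference value for deciding whether a tumor is benign or malignant. This suggests that in routine diagnosis, when medical conditions and the timing of treatment are limiting factors, the number of test indicators can be reduced appropriately and treatment can be started as early as possible on the basis of the prediction.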
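Finally, analysis of the Confusion Matrix shows that the model makes two kinds of error: diagnosing a benign tumor as malignant, and diagnosing a malignant tumor as benign. Both should be avoided as far as possible. The first kind of error may leave the patient depressed and uncooperative with treatment, and eventually cause the condition to worsen. The second kind may lead to a wrong treatment plan, and overly aggressive treatment may be counterproductive. It is hard to judge which of the two errors is more serious, but the results show that the probability of the second kind is low. On the test set in particular, every malignant tumor was classified correctly (100%).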
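III. Clustering
1. Data preprocessing
For a discrete (nominal) attribute, the clustering output shows only the mode, which does not help in understanding the numeric structure. The Class attribute is therefore converted from Nominal to Numeric, with 0 for benign and 1 for malignant. The closer the Class value of a cluster is to 1, the higher the proportion of malignant tumors in that cluster.
2. Experimental procedure
The SimpleKMeans algorithm was run with the parameters numClusters=5 and seed=50. The result is as follows:
[Figure: SimpleKMeans output]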
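3. Analysis of results
The clustering produces five clusters, and the mean of the Class attribute happens to be an integer in every one of them. This shows that all instances within a given cluster have the same tumor nature.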
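The reasoning above relies on the 0/1 encoding of Class: the mean of the encoded values in a cluster is the fraction of malignant instances, so a mean of exactly 0 or 1 means the cluster is pure. The following minimal sketch illustrates this with made-up label lists; the names `CLASS_CODE`, `malignant_fraction`, and `is_pure` are hypothetical, and only the size of the first list (17, as in cluster #0) comes from the report.

```python
# Encoding described in the preprocessing step: benign -> 0, malignant -> 1.
CLASS_CODE = {"benign": 0, "malignant": 1}

def malignant_fraction(labels):
    """Mean of the 0/1-encoded Class values of one cluster."""
    codes = [CLASS_CODE[label] for label in labels]
    return sum(codes) / len(codes)

def is_pure(labels):
    """A cluster is pure when its Class mean is exactly 0 or 1."""
    return malignant_fraction(labels) in (0.0, 1.0)

# Made-up clusters for illustration only.
pure_cluster = ["benign"] * 17
mixed_cluster = ["benign", "malignant", "malignant", "malignant"]

print(malignant_fraction(pure_cluster), is_pure(pure_cluster))    # 0.0 True
print(malignant_fraction(mixed_cluster), is_pure(mixed_cluster))  # 0.75 False
```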
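#0: The attributes in this cluster deviate from normal to a relatively high degree, yet the tumors are benign. Only 17 instances fall into this case, which indicates that it occurs rarely.
#1: Apart from "clump thickness", all indicators are quite normal and the tumors are benign. This case has 253 instances, even more than the "typical benign" cluster, which suggests that "clump thickness" has to deviate considerably from normal before it corresponds to a malignant tumor.
#2: This cluster can be called "typical benign": all of its attributes deviate very little from normal.
#3: Apart from "clump thickness" and "bare nuclei", the attribute values are not very high, yet the tumors are malignant. This case accounts for about half of all patients with malignant tumors.
#4: This cluster can be called "typical malignant": almost every attribute is highly abnormal. However, only about half of the patients with malignant tumors fall into this case.
The clustering results can help doctors draw up different treatment plans for the several possible disease patterns. In addition, studying the proportion of each cluster can help medical workers better understand how breast cancer symptoms are distributed.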
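IV. Association rules
1. Data preprocessing
To use the Apriori algorithm, the data type of the first nine attributes is changed to a discrete type. The NumericToNominal filter converts integer [1,10] into the nominal set {1,2,3,4,5,6,7,8,9,10}.
2. Experimental procedure
First, confidence was used as the metric, with the minimum accepted value set to 0.8. The result is as follows:
[Figure: Apriori output, confidence metric]
Next, lift was used as the metric, with the minimum accepted value set to 1.5. The result is as follows:
[Figure: Apriori output, lift metric]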
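For readers unfamiliar with the two metrics, the sketch below computes confidence and lift for a single rule using their standard definitions: confidence(A => B) = support(A and B) / support(A), and lift(A => B) = confidence(A => B) / support(B). It is an illustration only, not WEKA's implementation. The function names are hypothetical and the four transactions are made up, not taken from the dataset.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, lhs, rhs):
    """support(lhs and rhs) / support(lhs); lhs must occur at least once."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)

def lift(transactions, lhs, rhs):
    """confidence(lhs => rhs) / support(rhs); rhs must occur at least once."""
    return confidence(transactions, lhs, rhs) / support(transactions, rhs)

# Toy transactions, invented for illustration (one set of attribute=value items each).
transactions = [
    {"Mitoses=1", "Class=benign"},
    {"Mitoses=1", "Class=benign"},
    {"Mitoses=1", "Class=malignant"},
    {"Mitoses=3", "Class=malignant"},
]

rule_conf = confidence(transactions, {"Mitoses=1"}, {"Class=benign"})
rule_lift = lift(transactions, {"Mitoses=1"}, {"Class=benign"})

# Thresholds used in the experiment: confidence >= 0.8, lift >= 1.5.
print(round(rule_conf, 3), rule_conf >= 0.8)  # 0.667 False
print(round(rule_lift, 3), rule_lift >= 1.5)  # 1.333 False
```

On this toy data the rule would be rejected under both thresholds. A lift above 1 only says that the two sides occur together more often than they would if independent, which is why a minimum well above 1 is used to filter rules.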
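3. Analysis of results
It is easy to see that some of the association rules above have no value, which means there is still a lot of room for improvement in the data preprocessing stage and in the parameter settings of the association-rule algorithm.
These rules can be used for more than predicting the course of the tumor. Because the mined association rules are not simple causal relationships but multidimensional correlations, they can also support pathological research on breast cancer. Examples are the strong associations between "uniformity of cell size" and "mitoses", and between "marginal adhesion" and "mitoses".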