mirror of
https://github.com/sjwhitworth/golearn.git
synced 2025-04-30 13:48:57 +08:00

This patch adds:
* Gini index and information gain ratio as DecisionTree split options;
* handling for numeric Attributes (split point chosen naïvely on the basis of maximum entropy);
* a couple of additional utility functions in base/;
* a new dataset (see sources.txt) for testing.

Performance on Iris differs markedly without discretisation.
16 lines
165 B
CSV
Attribute1,Attribute2,Attribute3,Class
A,70,T,A
A,90,T,B
A,85,F,B
A,95,F,B
A,70,F,A
B,90,T,A
B,78,F,A
B,65,T,A
B,75,F,A
C,80,T,B
C,70,T,B
C,80,F,A
C,80,F,A
C,96,F,A