1
0
mirror of https://github.com/sjwhitworth/golearn.git synced 2025-04-28 13:48:56 +08:00

5 Commits

Author SHA1 Message Date
Richard Townsend
7ba57fe6df trees: Handling FloatAttributes.
This patch adds:

	* Gini index and information gain ratio as
           DecisionTree split options;
	* handling for numeric Attributes (split point
           chosen naïvely on the basis of maximum entropy);
	* A couple of additional utility functions in base/
	* A new dataset (see sources.txt) for testing.

Performance on Iris performs markedly without discretisation.
2014-10-26 17:40:38 +00:00
Richard Townsend
527c6476e1 Optimised version of KNN for Euclidean distances
This patch also:
   * Completes removal of the edf/ package
   * Corrects an erroneous print statement
   * Introduces two new CSV functions
      * ParseCSVToInstancesTemplated makes sure that
        reading a second CSV file maintains strict Attribute
        compatibility with an existing DenseInstances
      * ParseCSVToInstancesWithAttributeGroups gives more control
        over where Attributes end up in memory, important for
        gaining predictable control over the KNN optimisation
      * Decouples BinaryAttributeGroup from FixedAttributeGroup for
        better casting support
2014-09-30 23:10:22 +01:00
Jake Pyne
e5da0a8b04 Correct spelling 2014-09-28 01:34:21 +02:00
Richard Townsend
47341b2869 base: Cleaned up duplicate Attribute resolution functions 2014-08-03 15:17:20 +01:00
Richard Townsend
2bb7c2de75 base: merge from v2-instances 2014-08-03 15:16:38 +01:00