mirror of https://github.com/sjwhitworth/golearn.git synced 2025-04-28 13:48:56 +08:00

8 Commits

Author SHA1 Message Date
Stephen Whitworth
7ea42ac80b Merge pull request #101 from Sentimentron/arff-staging
ARFF import/export, CSV export, lossless serialisation
2014-11-21 13:53:43 +00:00
Richard Townsend
e30ff6580a ARFF import/export, CSV export, serialisation
* Only numeric and categorical ARFF attributes are currently supported.
* Only the dense version of the ARFF format is supported.
* The compressed format is a .tar.gz file, which should allow extensibility.
    * Attributes stored using JSON representations.
* Also offers smarter estimation of the precision of numeric Attributes.
* Also adds support for writing instances to CSV
2014-11-13 20:09:00 +00:00
Richard Townsend
8fe06e7332 Support for individual class weightings 2014-10-30 23:28:26 +00:00
Richard Townsend
7ba57fe6df trees: Handling FloatAttributes.
This patch adds:

	* Gini index and information gain ratio as
	  DecisionTree split options;
	* handling for numeric Attributes (split point
	  chosen naïvely on the basis of maximum entropy);
	* a couple of additional utility functions in base/;
	* a new dataset (see sources.txt) for testing.

Performance on Iris performs markedly without discretisation.
2014-10-26 17:40:38 +00:00
Richard Townsend
527c6476e1 Optimised version of KNN for Euclidean distances
This patch also:
   * Completes removal of the edf/ package
   * Corrects an erroneous print statement
   * Introduces two new CSV functions
      * ParseCSVToInstancesTemplated makes sure that
        reading a second CSV file maintains strict Attribute
        compatibility with an existing DenseInstances
      * ParseCSVToInstancesWithAttributeGroups gives more control
        over where Attributes end up in memory, important for
        gaining predictable control over the KNN optimisation
   * Decouples BinaryAttributeGroup from FixedAttributeGroup for
     better casting support
2014-09-30 23:10:22 +01:00
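
As background for the Euclidean KNN commit above, here is a minimal sketch of a nearest-neighbour search over rows of float64 values. It does not reproduce golearn's optimisation, which the message ties to how Attributes are laid out in memory; `squaredEuclidean` and `nearest` are hypothetical names. The one trick shown is a common one: comparing squared distances gives the same ordering as comparing true distances, so the square root can be skipped.

```go
package main

import (
	"fmt"
	"math"
)

// squaredEuclidean returns the squared Euclidean distance between a and b.
// It assumes both slices have the same length.
func squaredEuclidean(a, b []float64) float64 {
	d := 0.0
	for i := range a {
		diff := a[i] - b[i]
		d += diff * diff
	}
	return d
}

// nearest returns the index of the training row closest to query,
// or -1 if train is empty. Squared distances are compared directly
// because the square root is monotonic and does not change the order.
func nearest(train [][]float64, query []float64) int {
	best := -1
	bestDist := math.Inf(1)
	for i, row := range train {
		if d := squaredEuclidean(row, query); d < bestDist {
			best, bestDist = i, d
		}
	}
	return best
}

func main() {
	train := [][]float64{{0, 0}, {3, 4}, {1, 1}}
	query := []float64{0.9, 1.2}
	fmt.Println("nearest row index:", nearest(train, query))
}
```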
Jake Pyne
e5da0a8b04 Correct spelling 2014-09-28 01:34:21 +02:00
Richard Townsend
47341b2869 base: Cleaned up duplicate Attribute resolution functions 2014-08-03 15:17:20 +01:00
Richard Townsend
2bb7c2de75 base: merge from v2-instances 2014-08-03 15:16:38 +01:00