mirror of https://github.com/sjwhitworth/golearn.git synced 2025-05-01 22:18:10 +08:00

62 Commits

Author SHA1 Message Date
Richard Townsend
7ba57fe6df trees: Handling FloatAttributes.
This patch adds:

	* Gini index and information gain ratio as
	  DecisionTree split options;
	* handling for numeric Attributes (split point
	  chosen naïvely on the basis of maximum entropy);
	* a couple of additional utility functions in base/;
	* a new dataset (see sources.txt) for testing.

Performance on Iris improves markedly without discretisation.
2014-10-26 17:40:38 +00:00
Richard Townsend
527c6476e1 Optimised version of KNN for Euclidean distances
This patch also:
   * Completes removal of the edf/ package
   * Corrects an erroneous print statement
   * Introduces two new CSV functions
      * ParseCSVToInstancesTemplated makes sure that
        reading a second CSV file maintains strict Attribute
        compatibility with an existing DenseInstances
      * ParseCSVToInstancesWithAttributeGroups gives more control
        over where Attributes end up in memory, important for
        gaining predictable control over the KNN optimisation
   * Decouples BinaryAttributeGroup from FixedAttributeGroup for
     better casting support
2014-09-30 23:10:22 +01:00
Jake Pyne
e5da0a8b04 Correct spelling 2014-09-28 01:34:21 +02:00
Richard Townsend
49e5012b50 Removed base/edf 2014-09-19 23:03:44 +01:00
Amit Kumar Gupta
93838e30e3 Revert "Remove hardly-used logger"
This reverts commit 151df652cab334066826f53d0c7ad8b48c884cae.
2014-08-25 08:11:36 +00:00
Amit Kumar Gupta
4d93b9de89 Convert remaining tests to goconvey 2014-08-23 05:22:16 +00:00
Amit Kumar Gupta
9f67f73330 Convert some tests to goconvey, and improve assertions along the way 2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
1809a8b358 RandomForest returns error when fitting data with fewer features than the RandomForest plans to use
- BaseClassifier Predict and Fit methods return errors
- go fmt ./...

Conflicts:
	ensemble/randomforest.go
	ensemble/randomforest_test.go
	trees/tree_test.go
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
151df652ca Remove hardly-used logger 2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
529b3bcaa5 Avoid renaming packages on import 2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
f38fc713ef Remove unused untested ParseCSV function 2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
580913938f Trim the public interface of the edf package 2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
14aad31821 Consistently use (t *testing.T) instead of T or testEnv 2014-08-22 08:44:41 +00:00
Amit Kumar Gupta
695aec6eb6 Favor idiomatic t.Fatalf over panic for test failures 2014-08-22 08:07:55 +00:00
Amit Kumar Gupta
45545d6ebd Remove Println's from automated test suite since they aren't assertions 2014-08-22 07:58:01 +00:00
Amit Kumar Gupta
d835081de9 Favor idiomatic error return over panic when parsing non-existent CSV file 2014-08-22 07:27:16 +00:00
Amit Kumar Gupta
25f59e2d6b Remove unused private functions from base/util 2014-08-22 07:07:13 +00:00
Amit Kumar Gupta
688a82babb fix typo suceed -> succeed 2014-08-22 07:00:39 +00:00
Amit Kumar Gupta
94e5843bcf go fmt ./... 2014-08-22 06:55:20 +00:00
Richard Townsend
45f0be7607 Merge pull request #75 from Sentimentron/anonmap
edf: use make() to back EdfAnonMap
2014-08-21 16:36:12 +01:00
Richard Townsend
c59384bbed edf: use make() to back EdfAnonMap 2014-08-20 22:25:18 +01:00
Amit Kumar Gupta
525b4536c8 Delete tmpfile after edf test 2014-08-20 05:55:30 +00:00
Richard Townsend
c0d4140557 edf: Removed unnecessary Sync() 2014-08-11 14:38:57 +01:00
Stephen Whitworth
e64b2cea18 Changing mmap repo 2014-08-10 22:38:50 +01:00
Richard Townsend
f9c1e24e5b neural: stop-gap support for neural networks 2014-08-09 19:27:20 +01:00
Richard Townsend
29b3f06566 filters: FloatConvertFilter and BinaryConvertFilter tests 2014-08-09 15:42:25 +01:00
Richard Townsend
8196db1230 base: Added support for grouping and storing BinaryAttributes
* Pond was renamed to FixedAttributeGroup.
* AttributeGroup interface.
* BinaryAttributeGroup introduced.
2014-08-09 15:42:08 +01:00
Richard Townsend
bb1257563b base: deterministic Attribute order 2014-08-03 23:09:18 +01:00
Richard Townsend
c894d1e80d filters: revised binning_test to be more robust 2014-08-03 22:07:08 +01:00
Richard Townsend
d13654021f edf: corrections for Travis environment 2014-08-03 22:07:08 +01:00
Richard Townsend
882b780637 naive: tests pass 2014-08-03 15:17:42 +01:00
Richard Townsend
47341b2869 base: Cleaned up duplicate Attribute resolution functions 2014-08-03 15:17:20 +01:00
Richard Townsend
ff97065261 base: BinaryConvertFilter, Transform()
Transform now takes a new Attribute so BinaryConvertFilter
 can work correctly
2014-08-03 15:17:18 +01:00
Richard Townsend
c91140261d base: BinaryAttribute 2014-08-03 15:17:16 +01:00
Richard Townsend
2bb7c2de75 base: merge from v2-instances 2014-08-03 15:16:38 +01:00
albrow
132e3f4527 Create a new default logger and change some print statements to use the logger instead of fmt.Println. 2014-07-20 15:26:13 -04:00
Niclas Jern
2a3d80b475 Remove completely unnecessary fmt.Println :) 2014-07-19 16:31:11 +03:00
Niclas Jern
5d00d8942e Overeagerly replaced an Error() with Errorf(). 2014-07-18 16:15:19 +03:00
Niclas Jern
e060684a29 Passing parameters to Error() as if it was Errorf() 2014-07-18 14:04:59 +03:00
Niclas Jern
90106077cc base/instances.go:145:21: error strings should not end with punctuation 2014-07-18 13:59:00 +03:00
Niclas Jern
8f154559f1 receiver name Inst should be consistent with previous receiver name inst for Instances 2014-07-18 13:50:12 +03:00
Niclas Jern
627a5537d3 Comments should be of the form "<Struct> ..." or "<MethodName> ..." 2014-07-18 13:48:28 +03:00
Niclas Jern
4d7bc20a36 Should replace "val += 1" with "val++" 2014-07-18 13:25:18 +03:00
Richard Townsend
521844cbb2 base: handling spaces between entries 2014-07-02 15:48:14 +01:00
Remo Hertig
f77c1dcde0 use multiple return values instead of an array in InstancesTrainTestSplit 2014-06-06 21:33:17 +02:00
hpxro7
a54d473cdd Fixed incorrect instance shuffling algorithm. 2014-06-05 21:54:57 -07:00
Richard Townsend
1b0e2dce7c Correction to randomisation and train-test split 2014-05-18 11:23:32 +01:00
Richard Townsend
fdb67a4355 Initial work on decision trees
Random Forest occasionally has disastrous accuracy;
	 never seen that happen in WEKA
2014-05-14 14:00:22 +01:00
Richard Townsend
a2c67592df Adds Instances and Attributes type
* Refactors KNNClassifier to use them
* csv handling moved back into base due to a circular dependency
* Also adds the datasets used to test CSV handling
2014-05-13 22:08:11 +01:00
Stephen Whitworth
1ade0afca6 Refactored KNN to implement the estimator interface 2014-05-05 22:41:55 +01:00