Richard Townsend
7ba57fe6df
trees: Handling FloatAttributes.
...
This patch adds:
* Gini index and information gain ratio as
DecisionTree split options;
* handling for numeric Attributes (split point
chosen naïvely on the basis of maximum entropy);
* a couple of additional utility functions in base/;
* a new dataset (see sources.txt) for testing.
Performance on Iris improves markedly without discretisation.
2014-10-26 17:40:38 +00:00
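The split criteria named in the entry above can be illustrated with a small sketch. This is not the code from the patch: the `sample` type and the `gini`, `entropy` and `bestThreshold` functions are hypothetical names, and the threshold search assumes that "maximum entropy" means picking the candidate split with the largest entropy reduction (information gain), which the commit message does not confirm.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// gini returns the Gini impurity 1 - sum(p_i^2) for a set of class counts.
func gini(counts map[string]int) float64 {
	total := 0
	for _, c := range counts {
		total += c
	}
	if total == 0 {
		return 0
	}
	g := 1.0
	for _, c := range counts {
		p := float64(c) / float64(total)
		g -= p * p
	}
	return g
}

// entropy returns the Shannon entropy in bits for a set of class counts.
func entropy(counts map[string]int) float64 {
	total := 0
	for _, c := range counts {
		total += c
	}
	h := 0.0
	for _, c := range counts {
		if c == 0 {
			continue // an empty class contributes nothing
		}
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

// sample is a hypothetical (value, class) pair for one numeric attribute.
type sample struct {
	value float64
	class string
}

// bestThreshold tries the midpoint between each pair of adjacent distinct
// values and returns the one with the highest information gain.
func bestThreshold(samples []sample) (threshold, gain float64) {
	sorted := append([]sample(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].value < sorted[j].value })

	left := map[string]int{}
	right := map[string]int{}
	for _, s := range sorted {
		right[s.class]++
	}
	base := entropy(right)
	n := float64(len(sorted))

	for i := 0; i < len(sorted)-1; i++ {
		// Move sample i from the right partition to the left one.
		left[sorted[i].class]++
		right[sorted[i].class]--
		if sorted[i].value == sorted[i+1].value {
			continue // no threshold can separate equal values
		}
		nl := float64(i + 1)
		g := base - (nl/n)*entropy(left) - ((n-nl)/n)*entropy(right)
		if g > gain {
			gain = g
			threshold = (sorted[i].value + sorted[i+1].value) / 2
		}
	}
	return threshold, gain
}

func main() {
	data := []sample{{1, "a"}, {2, "a"}, {3, "b"}, {4, "b"}}
	t, g := bestThreshold(data)
	fmt.Println("threshold:", t, "gain:", g)
	fmt.Println("gini of a 2/2 class split:", gini(map[string]int{"a": 2, "b": 2}))
}
```

Splitting on a threshold like this keeps the numeric attribute as-is, which is one way to avoid the discretisation step mentioned in the commit message.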
Richard Townsend
527c6476e1
Optimised version of KNN for Euclidean distances
...
This patch also:
* Completes removal of the edf/ package
* Corrects an erroneous print statement
* Introduces two new CSV functions
  * ParseCSVToInstancesTemplated makes sure that
    reading a second CSV file maintains strict Attribute
    compatibility with an existing DenseInstances
  * ParseCSVToInstancesWithAttributeGroups gives more control
    over where Attributes end up in memory, important for
    gaining predictable control over the KNN optimisation
* Decouples BinaryAttributeGroup from FixedAttributeGroup for
better casting support
2014-09-30 23:10:22 +01:00
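The commit above does not describe how the optimised KNN works, only that the memory placement of Attributes matters to it. As a rough illustration of why layout can matter, the sketch below scans a row-major matrix held in a single flat slice and compares squared Euclidean distances, so no square root is needed to rank neighbours. The names `squaredEuclidean` and `nearestRow` are hypothetical and are not taken from the library.

```go
package main

import "fmt"

// squaredEuclidean returns the squared Euclidean distance between two
// vectors of equal length. Ranking by squared distance gives the same
// order as ranking by distance, so the square root can be skipped.
func squaredEuclidean(a, b []float64) float64 {
	sum := 0.0
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// nearestRow scans a row-major matrix stored in one flat slice and returns
// the index of the row closest to query, or -1 if there are no full rows.
func nearestRow(flat []float64, cols int, query []float64) int {
	best, bestDist := -1, 0.0
	for r := 0; (r+1)*cols <= len(flat); r++ {
		row := flat[r*cols : (r+1)*cols] // contiguous view, no copying
		d := squaredEuclidean(row, query)
		if best == -1 || d < bestDist {
			best, bestDist = r, d
		}
	}
	return best
}

func main() {
	// Three rows of two columns each.
	flat := []float64{0, 0, 3, 4, 1, 1}
	fmt.Println("nearest row:", nearestRow(flat, 2, []float64{1, 2}))
}
```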
Jake Pyne
e5da0a8b04
Correct spelling
2014-09-28 01:34:21 +02:00
Richard Townsend
49e5012b50
Removed base/edf
2014-09-19 23:03:44 +01:00
Amit Kumar Gupta
93838e30e3
Revert "Remove hardly-used logger"
...
This reverts commit 151df652cab334066826f53d0c7ad8b48c884cae.
2014-08-25 08:11:36 +00:00
Amit Kumar Gupta
4d93b9de89
Convert remaining tests to goconvey
2014-08-23 05:22:16 +00:00
Amit Kumar Gupta
9f67f73330
Convert some tests to goconvey, and improve assertions along the way
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
1809a8b358
RandomForest returns error when fitting data with fewer features than the RandomForest plans to use
...
- BaseClassifier Predict and Fit methods return errors
- go fmt ./...
Conflicts:
ensemble/randomforest.go
ensemble/randomforest_test.go
trees/tree_test.go
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
151df652ca
Remove hardly-used logger
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
529b3bcaa5
Avoid renaming packages on import
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
f38fc713ef
Remove unused untested ParseCSV function
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
580913938f
Trim the public interface of the edf package
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
14aad31821
Consistently use (t *testing.T) instead of T or testEnv
2014-08-22 08:44:41 +00:00
Amit Kumar Gupta
695aec6eb6
Favor idiomatic t.Fatalf over panic for test failures
2014-08-22 08:07:55 +00:00
Amit Kumar Gupta
45545d6ebd
Remove Println's from automated test suite since they aren't assertions
2014-08-22 07:58:01 +00:00
Amit Kumar Gupta
d835081de9
Favor idiomatic error return over panic when parsing non-existent CSV file
2014-08-22 07:27:16 +00:00
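The pattern behind the commit above, returning an error for a missing file rather than panicking, might look like the following sketch. `readCSV` is a hypothetical helper built on the standard library, not the parser from this repository.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
)

// readCSV is a hypothetical helper: it reports a missing or malformed file
// through its error return value and leaves the decision to the caller.
func readCSV(path string) ([][]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err // e.g. the file does not exist
	}
	defer f.Close()
	return csv.NewReader(f).ReadAll()
}

func main() {
	if _, err := readCSV("no-such-file.csv"); err != nil {
		fmt.Println("could not parse CSV:", err)
	}
}
```

The caller can then choose to retry, log, or abort, which a panic inside the parser would not allow without recover.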
Amit Kumar Gupta
25f59e2d6b
Remove unused private functions from base/util
2014-08-22 07:07:13 +00:00
Amit Kumar Gupta
688a82babb
fix typo suceed -> succeed
2014-08-22 07:00:39 +00:00
Amit Kumar Gupta
94e5843bcf
go fmt ./...
2014-08-22 06:55:20 +00:00
Richard Townsend
45f0be7607
Merge pull request #75 from Sentimentron/anonmap
...
edf: use make() to back EdfAnonMap
2014-08-21 16:36:12 +01:00
Richard Townsend
c59384bbed
edf: use make() to back EdfAnonMap
2014-08-20 22:25:18 +01:00
Amit Kumar Gupta
525b4536c8
Delete tmpfile after edf test
2014-08-20 05:55:30 +00:00
Richard Townsend
c0d4140557
edf: Removed unnecessary Sync()
2014-08-11 14:38:57 +01:00
Stephen Whitworth
e64b2cea18
Changing mmap repo
2014-08-10 22:38:50 +01:00
Richard Townsend
f9c1e24e5b
neural: stop-gap support for neural networks
2014-08-09 19:27:20 +01:00
Richard Townsend
29b3f06566
filters: FloatConvertFilter and BinaryConvertFilter tests
2014-08-09 15:42:25 +01:00
Richard Townsend
8196db1230
base: Added support for grouping and storing BinaryAttributes
...
* Pond was renamed to FixedAttributeGroup.
* AttributeGroup interface.
* BinaryAttributeGroup introduced.
2014-08-09 15:42:08 +01:00
Richard Townsend
bb1257563b
base: deterministic Attribute order
2014-08-03 23:09:18 +01:00
Richard Townsend
c894d1e80d
filters: revised binning_test to be more robust
2014-08-03 22:07:08 +01:00
Richard Townsend
d13654021f
edf: corrections for Travis environment
2014-08-03 22:07:08 +01:00
Richard Townsend
882b780637
naive: tests pass
2014-08-03 15:17:42 +01:00
Richard Townsend
47341b2869
base: Cleaned up duplicate Attribute resolution functions
2014-08-03 15:17:20 +01:00
Richard Townsend
ff97065261
base: BinaryConvertFilter, Transform()
...
Transform now takes a new Attribute so BinaryConvertFilter
can work correctly
2014-08-03 15:17:18 +01:00
Richard Townsend
c91140261d
base: BinaryAttribute
2014-08-03 15:17:16 +01:00
Richard Townsend
2bb7c2de75
base: merge from v2-instances
2014-08-03 15:16:38 +01:00
albrow
132e3f4527
Create a new default logger and change some print statements to use the logger instead of fmt.Println.
2014-07-20 15:26:13 -04:00
Niclas Jern
2a3d80b475
Remove completely unnecessary fmt.Println :)
2014-07-19 16:31:11 +03:00
Niclas Jern
5d00d8942e
Overeagerly replaced an Error() with Errorf().
2014-07-18 16:15:19 +03:00
Niclas Jern
e060684a29
Passing parameters to Error() as if it was Errorf()
2014-07-18 14:04:59 +03:00
Niclas Jern
90106077cc
base/instances.go:145:21: error strings should not end with punctuation
2014-07-18 13:59:00 +03:00
Niclas Jern
8f154559f1
receiver name Inst should be consistent with previous receiver name inst for Instances
2014-07-18 13:50:12 +03:00
Niclas Jern
627a5537d3
Comments should be of the form "<Struct> ..." or "<MethodName> ..."
2014-07-18 13:48:28 +03:00
Niclas Jern
4d7bc20a36
Should replace "val += 1" with "val++"
2014-07-18 13:25:18 +03:00
Richard Townsend
521844cbb2
base: handling spaces between entries
2014-07-02 15:48:14 +01:00
Remo Hertig
f77c1dcde0
use multiple return values instead of an array in InstancesTrainTestSplit
2014-06-06 21:33:17 +02:00
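The change above swaps an array result for Go's multiple return values. A minimal sketch of the idea, using a hypothetical `trainTestSplit` over plain row indices rather than the real InstancesTrainTestSplit signature, could be:

```go
package main

import "fmt"

// trainTestSplit is a hypothetical, deterministic split over row indices.
// testFraction is assumed to lie in [0, 1]. Returning two named values
// avoids the caller having to remember which array element is which.
func trainTestSplit(rows []int, testFraction float64) (train, test []int) {
	nTest := int(float64(len(rows)) * testFraction)
	cut := len(rows) - nTest
	return rows[:cut], rows[cut:]
}

func main() {
	rows := []int{0, 1, 2, 3, 4, 5, 6, 7}
	train, test := trainTestSplit(rows, 0.25)
	fmt.Println("train rows:", len(train), "test rows:", len(test))
}
```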
hpxro7
a54d473cdd
Fixed incorrect instance shuffling algorithm.
2014-06-05 21:54:57 -07:00
Richard Townsend
1b0e2dce7c
Correction to randomisation and train-test split
2014-05-18 11:23:32 +01:00
Richard Townsend
fdb67a4355
Initial work on decision trees
...
Random Forest has occasional disastrous accuracy:
never seen that happen in WEKA
2014-05-14 14:00:22 +01:00
Richard Townsend
a2c67592df
Adds Instances and Attributes types
...
* Refactors KNNClassifier to use them
* csv handling moved back into base due to a circular dependency
* Also adds the datasets used to test CSV handling
2014-05-13 22:08:11 +01:00
Stephen Whitworth
1ade0afca6
Refactored KNN to implement the estimator interface
2014-05-05 22:41:55 +01:00