mirror of https://github.com/sjwhitworth/golearn.git synced 2025-04-28 13:48:56 +08:00

83 Commits

Author SHA1 Message Date
Etienne Bruines
db086a864e Made versions of CSV-readers w/ io.ReadSeeker
Each method now contains a -FromReader counterpart
such that it'll allow use of those helper-methods
even when someone does not have his data in a physical
file. The original methods make use of those -FromReader
methods.

The reader is being reset (Seek(0, 0)) before every method-
specific read, to ensure it's reading from the start of the
reader.

Test cases are not yet touched, and I'm not sure they should be.
2017-09-29 09:48:33 +02:00
Linker Lin
9aa4ee64b5 Update util_attributes.go
replaced range by []
2017-02-14 15:08:12 +08:00
Richard Townsend
0f0b4d800b base: fix a failing test case 2016-09-29 11:33:36 +01:00
Richard Townsend
855df3a7fa Merge pull request #135 from Sentimentron/inline-training-data
Support the use of mat64.Dense as an instance type
2016-09-29 11:25:16 +01:00
Richard Townsend
7041fc33c7 base: correct handling of class attributes in ParseCSVToTemplatedInstances 2016-07-11 23:16:18 +01:00
Thatcher Peskens
de9a6246fd added String() function to Sentimentron's inline-training-data function 2016-07-06 18:00:30 -07:00
Richard Townsend
de01a2fd10 Merge pull request #138 from anzellai/fix/go-vet-complaints
Fix go vet complaints
2016-07-04 13:24:41 +01:00
Philip Gatt
f74483db53 Serialize ARFF to Writer in Addition to a File 2016-06-28 15:52:04 -07:00
Anzel Lai
481da97eca Fix go vet complaints 2016-06-14 00:56:47 +01:00
Richard Townsend
6f7326b6ff neural: check that the new dense instances type works... 2016-05-22 12:58:51 +01:00
Richard Townsend
590d7a8091 base: add a new instances type for mat64 2016-05-22 12:58:50 +01:00
Richard Townsend
986cd230f9 clustering: creates the package and implements DBSCAN
Verified against scikit-learn's implementation (gen_test.py)
2015-10-10 20:20:33 +01:00
Stephen Whitworth
092917dee9 Temporarily removing test 2015-01-27 13:24:40 +00:00
Stephen Whitworth
183c672cfe Hopefully, should build now. 2015-01-27 12:32:19 +00:00
Richard Townsend
a250e99644 base: correct some non-deterministic serialisation test behaviour 2015-01-15 22:45:05 +00:00
Stephen Whitworth
353cd38e7c Merge pull request #98 from Sentimentron/dense-staging
New DenseInstances conversion function
2014-11-21 13:53:52 +00:00
Stephen Whitworth
7ea42ac80b Merge pull request #101 from Sentimentron/arff-staging
ARFF import/export, CSV export, lossless serialisation
2014-11-21 13:53:43 +00:00
Richard Townsend
e30ff6580a ARFF import/export, CSV export, serialisation
* Only numeric and categorical ARFF attributes are currently supported.
* Only the dense version of the ARFF format is supported.
* Compressed format is .tar.gz file which should allow extensibility.
    * Attributes stored using JSON representations.
* Also offers smarter estimation of the precision of numeric Attributes.
* Also adds support for writing instances to CSV
2014-11-13 20:09:00 +00:00
Richard Townsend
8fe06e7332 Support for individual class weightings 2014-10-30 23:28:26 +00:00
Richard Townsend
6929052af0 base: conversion to DenseInstances via DenseCopyOf 2014-10-30 22:10:39 +00:00
Richard Townsend
1e888d2a97 base: More general version of equality 2014-10-30 22:02:38 +00:00
Richard Townsend
7ba57fe6df trees: Handling FloatAttributes.
This patch adds:

	* Gini index and information gain ratio as
           DecisionTree split options;
	* handling for numeric Attributes (split point
           chosen naïvely on the basis of maximum entropy);
	* A couple of additional utility functions in base/
	* A new dataset (see sources.txt) for testing.

Performance on Iris performs markedly without discretisation.
2014-10-26 17:40:38 +00:00
Richard Townsend
527c6476e1 Optimised version of KNN for Euclidean distances
This patch also:
   * Completes removal of the edf/ package
   * Corrects an erroneous print statement
   * Introduces two new CSV functions
      * ParseCSVToInstancesTemplated makes sure that
        reading a second CSV file maintains strict Attribute
        compatibility with an existing DenseInstances
      * ParseCSVToInstancesWithAttributeGroups gives more control
        over where Attributes end up in memory, important for
        gaining predictable control over the KNN optimisation
   * Decouples BinaryAttributeGroup from FixedAttributeGroup for
     better casting support
2014-09-30 23:10:22 +01:00
Jake Pyne
e5da0a8b04 Correct spelling 2014-09-28 01:34:21 +02:00
Richard Townsend
49e5012b50 Removed base/edf 2014-09-19 23:03:44 +01:00
Amit Kumar Gupta
93838e30e3 Revert "Remove hardly-used logger"
This reverts commit 151df652cab334066826f53d0c7ad8b48c884cae.
2014-08-25 08:11:36 +00:00
Amit Kumar Gupta
4d93b9de89 Convert remaining tests to goconvey 2014-08-23 05:22:16 +00:00
Amit Kumar Gupta
9f67f73330 Convert some tests to goconvey, and improve assertions along the way 2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
1809a8b358 RandomForest returns error when fitting data with fewer features than the RandomForest plans to use
- BaseClassifier Predict and Fit methods return errors
- go fmt ./...

Conflicts:
	ensemble/randomforest.go
	ensemble/randomforest_test.go
	trees/tree_test.go
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
151df652ca Remove hardly-used logger 2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
529b3bcaa5 Avoid renaming packages on import 2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
f38fc713ef Remove unused untested ParseCSV function 2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
580913938f Trim the public interface of the edf package 2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
14aad31821 Consistently use (t *testing.T) instead of T or testEnv 2014-08-22 08:44:41 +00:00
Amit Kumar Gupta
695aec6eb6 Favor idiomatic t.Fatalf over panic for test failures 2014-08-22 08:07:55 +00:00
Amit Kumar Gupta
45545d6ebd Remove Println's from automated test suite since they aren't assertions 2014-08-22 07:58:01 +00:00
Amit Kumar Gupta
d835081de9 Favor idiomatic error return over panic when parsing non-existent CSV file 2014-08-22 07:27:16 +00:00
Amit Kumar Gupta
25f59e2d6b Remove unused private functions from base/util 2014-08-22 07:07:13 +00:00
Amit Kumar Gupta
688a82babb fix typo suceed -> succeed 2014-08-22 07:00:39 +00:00
Amit Kumar Gupta
94e5843bcf go fmt ./... 2014-08-22 06:55:20 +00:00
Richard Townsend
45f0be7607 Merge pull request #75 from Sentimentron/anonmap
edf: use make() to back EdfAnonMap
2014-08-21 16:36:12 +01:00
Richard Townsend
c59384bbed edf: use make() to back EdfAnonMap 2014-08-20 22:25:18 +01:00
Amit Kumar Gupta
525b4536c8 Delete tmpfile after edf test 2014-08-20 05:55:30 +00:00
Richard Townsend
c0d4140557 edf: Removed unnecessary Sync() 2014-08-11 14:38:57 +01:00
Stephen Whitworth
e64b2cea18 Changing mmap repo 2014-08-10 22:38:50 +01:00
Richard Townsend
f9c1e24e5b neural: stop-gap support for neural networks 2014-08-09 19:27:20 +01:00
Richard Townsend
29b3f06566 filters: FloatConvertFilter and BinaryConvertFilter tests 2014-08-09 15:42:25 +01:00
Richard Townsend
8196db1230 base: Added support for grouping and storing BinaryAttributes
* Pond was renamed to FixedAttributeGroup.
* AttributeGroup interface.
* BinaryAttributeGroup introduced.
2014-08-09 15:42:08 +01:00
Richard Townsend
bb1257563b base: deterministic Attribute order 2014-08-03 23:09:18 +01:00
Richard Townsend
c894d1e80d filters: revised binning_test to be more robust 2014-08-03 22:07:08 +01:00