Stephen Whitworth
9c7049ba89
Merge pull request #90 from Sentimentron/cross-fold-staging
...
Cross-fold validation
2014-11-21 13:53:29 +00:00
Stephen Whitworth
aeb12bd0c7
Merge pull request #99 from Sentimentron/params-staging
...
MultiLinearSVC class weights
2014-11-21 13:53:00 +00:00
Richard Townsend
e30ff6580a
ARFF import/export, CSV export, serialisation
...
* Only numeric and categorical ARFF attributes are currently supported.
* Only the dense version of the ARFF format is supported.
* The compressed format is a .tar.gz file, which should allow extensibility.
* Attributes are stored using JSON representations.
* Also offers smarter estimation of the precision of numeric Attributes.
* Also adds support for writing instances to CSV.
2014-11-13 20:09:00 +00:00
Richard Townsend
ec846e62d2
Merge pull request #100 from pontusmelke/use-apply
...
Rewrote gradient descent to use Apply for scalar matrix multiplication.
2014-11-06 09:50:17 +00:00
Pontus Melke
6a04b01f3e
Rewrote gradient descent to use Apply for scalar matrix multiplication.
2014-11-06 08:22:18 +01:00
Richard Townsend
8fe06e7332
Support for individual class weightings
2014-10-30 23:28:26 +00:00
Richard Townsend
6929052af0
base: conversion to DenseInstances via DenseCopyOf
2014-10-30 22:10:39 +00:00
Richard Townsend
1e888d2a97
base: More general version of equality
2014-10-30 22:02:38 +00:00
Stephen Whitworth
056ccef9b6
Merge pull request #91 from Sentimentron/numeric-staging
...
trees: Handling FloatAttributes.
2014-10-27 07:56:19 +00:00
Richard Townsend
b2f5b2840d
Cross-fold validation
2014-10-26 17:48:48 +00:00
Richard Townsend
7ba57fe6df
trees: Handling FloatAttributes.
...
This patch adds:
* Gini index and information gain ratio as DecisionTree split options;
* handling for numeric Attributes (split point chosen naïvely on the basis of maximum entropy);
* a couple of additional utility functions in base/;
* a new dataset (see sources.txt) for testing.
Performance on Iris is markedly better without discretisation.
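The two split criteria named above can be illustrated with a minimal, standalone sketch. It is not the code from this patch: the function names (`entropy`, `giniImpurity`, `gainRatio`) and the use of plain class-count slices are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"math"
)

// sum adds up a slice of class counts.
func sum(counts []int) int {
	total := 0
	for _, c := range counts {
		total += c
	}
	return total
}

// entropy returns the Shannon entropy (in bits) of a class-count distribution.
func entropy(counts []int) float64 {
	total := sum(counts)
	if total == 0 {
		return 0
	}
	h := 0.0
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

// giniImpurity returns 1 - sum(p_i^2) for a class-count distribution.
func giniImpurity(counts []int) float64 {
	total := sum(counts)
	if total == 0 {
		return 0
	}
	g := 1.0
	for _, c := range counts {
		p := float64(c) / float64(total)
		g -= p * p
	}
	return g
}

// gainRatio divides the information gain of a split by its split
// information (the entropy of the child sizes). It returns 0 when the
// split information is 0, i.e. when the split does not partition the data.
func gainRatio(parent []int, children [][]int) float64 {
	total := sum(parent)
	if total == 0 {
		return 0
	}
	remainder := 0.0
	sizes := make([]int, len(children))
	for i, child := range children {
		sizes[i] = sum(child)
		remainder += float64(sizes[i]) / float64(total) * entropy(child)
	}
	splitInfo := entropy(sizes)
	if splitInfo == 0 {
		return 0
	}
	return (entropy(parent) - remainder) / splitInfo
}

func main() {
	parent := []int{5, 5}                // class counts before the split
	children := [][]int{{5, 0}, {0, 5}} // class counts in each branch
	fmt.Println("gini(parent):", giniImpurity(parent))
	fmt.Println("gain ratio:  ", gainRatio(parent, children))
}
```

A tree builder would evaluate one of these measures for each candidate split and keep the best-scoring one; how this patch picks candidate split points for numeric Attributes is not shown here.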
2014-10-26 17:40:38 +00:00
Stephen Whitworth
fcb96f1fad
Merge pull request #87 from Sentimentron/knn-opt-staging
...
Optimised version of KNN for Euclidean distances
2014-10-12 08:59:40 +01:00
Stephen Whitworth
b3497a9b80
Merge pull request #88 from Sentimentron/linearsvc-staging
...
Multi-class linear Support Vector Classifiers
2014-10-12 08:59:25 +01:00
Stephen Whitworth
b4733d7109
Merge pull request #89 from Sentimentron/go-1.1-fix
...
Travis: drop CI support for go 1.1
2014-10-10 10:39:18 +01:00
Richard Townsend
b33c95a770
Travis: drop CI support for go 1.1
2014-10-05 11:25:46 +01:00
Richard Townsend
981d43f1dd
Adds support for multi-class linear SVMs.
...
This patch:
* Adds a one-vs-all meta classifier into meta/
* Adds a LinearSVC (essentially the same as LogisticRegression but with different libsvm parameters) to linear_models/
* Adds a MultiLinearSVC into ensemble/ for predicting CategoricalAttribute classes with the LinearSVC
* Adds a new example dataset based on classifying article headlines.

The example dataset is drawn from WikiNews, and consists of an average, min and max Word2Vec representation of article headlines from three categories. The Word2Vec model was computed offline using gensim.
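The one-vs-all idea mentioned in this patch can be sketched as follows. This is an illustration only, not the API added to meta/ or ensemble/: the `BinaryScorer` interface, the `OneVsAll` type and the toy `centroidScorer` (standing in for a LinearSVC) are all hypothetical names.

```go
package main

import "fmt"

// BinaryScorer is a hypothetical interface for a two-class model that
// returns a larger score the more confident it is in the positive class.
type BinaryScorer interface {
	Fit(X [][]float64, y []bool)
	Score(x []float64) float64
}

// centroidScorer is a toy linear scorer (a stand-in for a LinearSVC):
// it projects onto the line joining the negative and positive means.
type centroidScorer struct {
	w []float64
	b float64
}

func (c *centroidScorer) Fit(X [][]float64, y []bool) {
	if len(X) == 0 {
		return
	}
	d := len(X[0])
	pos, neg := make([]float64, d), make([]float64, d)
	nPos, nNeg := 0.0, 0.0
	for i, row := range X {
		for j, v := range row {
			if y[i] {
				pos[j] += v
			} else {
				neg[j] += v
			}
		}
		if y[i] {
			nPos++
		} else {
			nNeg++
		}
	}
	c.w, c.b = make([]float64, d), 0
	for j := 0; j < d; j++ {
		if nPos > 0 {
			pos[j] /= nPos
		}
		if nNeg > 0 {
			neg[j] /= nNeg
		}
		c.w[j] = pos[j] - neg[j]
		c.b -= c.w[j] * (pos[j] + neg[j]) / 2 // zero score at the midpoint
	}
}

func (c *centroidScorer) Score(x []float64) float64 {
	s := c.b
	for j, wj := range c.w {
		s += wj * x[j]
	}
	return s
}

// OneVsAll trains one binary model per class (that class vs. the rest)
// and predicts the class whose model gives the highest score.
type OneVsAll struct {
	newScorer func() BinaryScorer
	classes   []string
	models    []BinaryScorer
}

func (o *OneVsAll) Fit(X [][]float64, labels []string) {
	o.classes, o.models = nil, nil
	seen := map[string]bool{}
	for _, l := range labels {
		if !seen[l] {
			seen[l] = true
			o.classes = append(o.classes, l)
		}
	}
	for _, cls := range o.classes {
		y := make([]bool, len(labels))
		for i, l := range labels {
			y[i] = l == cls
		}
		m := o.newScorer()
		m.Fit(X, y)
		o.models = append(o.models, m)
	}
}

func (o *OneVsAll) Predict(x []float64) string {
	best, bestScore := "", 0.0
	for i, m := range o.models {
		if s := m.Score(x); i == 0 || s > bestScore {
			best, bestScore = o.classes[i], s
		}
	}
	return best
}

func main() {
	X := [][]float64{{0, 0}, {0, 1}, {5, 5}, {5, 6}, {10, 0}, {10, 1}}
	labels := []string{"a", "a", "b", "b", "c", "c"}
	ova := &OneVsAll{newScorer: func() BinaryScorer { return &centroidScorer{} }}
	ova.Fit(X, labels)
	fmt.Println(ova.Predict([]float64{5, 5.5}))
}
```

Comparing raw scores across the per-class models assumes they are on comparable scales, which a real implementation may need to address.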
2014-10-05 11:15:41 +01:00
Stephen Whitworth
0e4d04af52
Merge pull request #83 from Lupino/master
...
move ext source to linear_models for easy build
2014-10-01 11:08:38 +01:00
Richard Townsend
527c6476e1
Optimised version of KNN for Euclidean distances
...
This patch also:
* Completes removal of the edf/ package
* Corrects an erroneous print statement
* Introduces two new CSV functions:
  * ParseCSVToInstancesTemplated makes sure that reading a second CSV file maintains strict Attribute compatibility with an existing DenseInstances
  * ParseCSVToInstancesWithAttributeGroups gives more control over where Attributes end up in memory, important for gaining predictable control over the KNN optimisation
* Decouples BinaryAttributeGroup from FixedAttributeGroup for better casting support
2014-09-30 23:10:22 +01:00
Stephen Whitworth
8f1bc62401
Merge pull request #85 from okjake/master
...
Small spelling correction
2014-09-28 09:47:22 +01:00
Jake Pyne
e5da0a8b04
Correct spelling
2014-09-28 01:34:21 +02:00
Ubasic
979c727124
move ext source to linear_models for static build
2014-09-25 16:14:53 +08:00
Stephen Whitworth
fc5eafd82b
Merge pull request #82 from Sentimentron/edf-remove
...
Edf remove
2014-09-25 07:32:27 +01:00
Richard Townsend
49e5012b50
Removed base/edf
2014-09-19 23:03:44 +01:00
Richard Townsend
91cfee0e0e
Adding a high-dimensional feature set
2014-09-19 23:03:35 +01:00
Stephen Whitworth
65674e2012
Merge pull request #78 from amitkgupta/master
...
Convert all tests to Goconvey, ensure all tests make assertions, plus miscellaneous cleanup
2014-08-25 12:52:53 +01:00
Amit Kumar Gupta
93838e30e3
Revert "Remove hardly-used logger"
...
This reverts commit 151df652cab334066826f53d0c7ad8b48c884cae.
2014-08-25 08:11:36 +00:00
Amit Kumar Gupta
25811f833b
Revert "Remove (mostly) unused C print functions in linear package"
...
This reverts commit f9e41dec2860090e00dd1e0fb74c1beb79197ae5.
2014-08-25 08:11:36 +00:00
Amit Kumar Gupta
f9e41dec28
Remove (mostly) unused C print functions in linear package
2014-08-23 05:27:40 +00:00
Amit Kumar Gupta
4d93b9de89
Convert remaining tests to goconvey
2014-08-23 05:22:16 +00:00
Amit Kumar Gupta
9f67f73330
Convert some tests to goconvey, and improve assertions along the way
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
1809a8b358
RandomForest returns error when fitting data with fewer features than the RandomForest plans to use
...
- BaseClassifier Predict and Fit methods return errors
- go fmt ./...
Conflicts:
ensemble/randomforest.go
ensemble/randomforest_test.go
trees/tree_test.go
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
151df652ca
Remove hardly-used logger
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
f14020f78c
Remove unused cross_validation package
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
529b3bcaa5
Avoid renaming packages on import
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
478b5055c7
Remove unused untested utility functions
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
63dcd653b0
NewLogisticRegression returns error instead of nil to indicate error
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
f38fc713ef
Remove unused untested ParseCSV function
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
580913938f
Trim the public interface of the edf package
2014-08-22 13:39:29 +00:00
Amit Kumar Gupta
947ee8380e
Return error instead of panicking when unable to get confusion matrix
2014-08-22 13:39:29 +00:00
Stephen Whitworth
c8bf178662
Merge pull request #76 from amitkgupta/master
...
Improve ConfusionMatrix GetSummary formatting, and several tiny improvements
2014-08-22 11:37:17 +01:00
Amit Kumar Gupta
14aad31821
Consistently use (t *testing.T) instead of T or testEnv
2014-08-22 08:44:41 +00:00
Amit Kumar Gupta
695aec6eb6
Favor idiomatic t.Fatalf over panic for test failures
2014-08-22 08:07:55 +00:00
Amit Kumar Gupta
45545d6ebd
Remove Println's from automated test suite since they aren't assertions
2014-08-22 07:58:01 +00:00
Amit Kumar Gupta
66ad866cb3
RandomForest panics when trying to fit data with too few features instead of just hanging forever
2014-08-22 07:39:14 +00:00
Amit Kumar Gupta
d835081de9
Favor idiomatic error return over panic when parsing non-existent CSV file
2014-08-22 07:27:16 +00:00
Amit Kumar Gupta
21bb2fc9fa
Remove redundant import renames
2014-08-22 07:21:24 +00:00
Amit Kumar Gupta
8f5a5f4962
Add headings and improve formatting of ConfusionMatrix GetSummary
2014-08-22 07:09:16 +00:00
Amit Kumar Gupta
25f59e2d6b
Remove unused private functions from base/util
2014-08-22 07:07:13 +00:00
Amit Kumar Gupta
b8e0a36f73
Remove unused untested private function from chimerge_funcs
2014-08-22 07:03:49 +00:00
Amit Kumar Gupta
688a82babb
fix typo suceed -> succeed
2014-08-22 07:00:39 +00:00