mirror of https://github.com/sjwhitworth/golearn.git synced 2025-05-01 22:18:10 +08:00

14 Commits

Author SHA1 Message Date
Ross Hendrickson
5c302a11ea Add average perceptron file
Rough out logic flow for an average perceptron

Fleshed out example data. Working on FixedDataGrid
support.

Don't use Binary; use Float

Update processData to use base helpers to read csv

Move class to end of feature list

Add test for processData

process data to instances

Create path fixed

Add test around Fit. First steps

Modified example, added tests, small fixes
2015-01-15 22:54:16 +00:00
Stephen Whitworth
353cd38e7c Merge pull request #98 from Sentimentron/dense-staging
New DenseInstances conversion function
2014-11-21 13:53:52 +00:00
Richard Townsend
e30ff6580a ARFF import/export, CSV export, serialisation
* Only numeric and categorical ARFF attributes are currently supported.
* Only the dense version of the ARFF format is supported.
* Compressed format is .tar.gz file which should allow extensibility.
    * Attributes stored using JSON representations.
* Also offers smarter estimation of the precision of numeric Attributes.
* Also adds support for writing instances to CSV
2014-11-13 20:09:00 +00:00
Richard Townsend
6929052af0 base: conversion to DenseInstances via DenseCopyOf 2014-10-30 22:10:39 +00:00
Richard Townsend
7ba57fe6df trees: Handling FloatAttributes.
This patch adds:

	* Gini index and information gain ratio as
	  DecisionTree split options;
	* handling for numeric Attributes (split point
	  chosen naïvely on the basis of maximum entropy);
	* a couple of additional utility functions in base/;
	* a new dataset (see sources.txt) for testing.

Performance on Iris improves markedly without discretisation.
2014-10-26 17:40:38 +00:00
Richard Townsend
981d43f1dd Adds support for multi-class linear SVMs.
This patch
  * Adds a one-vs-all meta classifier into meta/
  * Adds a LinearSVC (essentially the same as LogisticRegression
    but with different libsvm parameters) to linear_models/
  * Adds a MultiLinearSVC into ensemble/ for predicting
    CategoricalAttribute classes with the LinearSVC
  * Adds a new example dataset based on classifying article headlines.

The example dataset is drawn from WikiNews, and consists of an average,
min and max Word2Vec representation of article headlines from three
categories. The Word2Vec model was computed offline using gensim.
2014-10-05 11:15:41 +01:00
Richard Townsend
91cfee0e0e Adding a high-dimensional feature set 2014-09-19 23:03:35 +01:00
Richard Townsend
a9028b8174 examples: merge from v2-instances 2014-08-03 15:16:58 +01:00
Niclas Jern
3fd2c95bcd Added missing csv newlines. 2014-07-19 16:06:10 +03:00
Niclas Jern
7c4bb5c81c Added examples used in linear regression tests. 2014-07-19 15:59:55 +03:00
Richard Townsend
889fec4419 Examples for RandomForest, ID3 and Random trees 2014-05-19 12:42:03 +01:00
Richard Townsend
a2c67592df Adds Instances and Attributes type
* Refactors KNNClassifier to use them
* csv handling moved back into base due to a circular dependency
* Also adds the datasets used to test CSV handling
2014-05-13 22:08:11 +01:00
Stephen Whitworth
0e86db820e Added KNN Regressor 2014-01-05 00:23:31 +00:00
Stephen Whitworth
334c12385e Added example documentation 2014-01-04 19:31:33 +00:00