golearn

mirror of https://github.com/sjwhitworth/golearn.git synced 2025-04-26 13:49:14 +08:00

Author	SHA1	Message	Date
ss8651twtw	1e1b5f11fb	Format code	2018-06-16 22:14:18 +08:00
yenck	bf907556f5	testcase	2018-06-16 22:11:59 +08:00
yenck	80bc1ac6f8	some test for C0	2018-06-16 22:11:59 +08:00
yenck	30071eb8a4	some test for C9	2018-06-16 22:11:59 +08:00
Ilya Tocar	676f69a426	trees: speed-up training. Avoid quadratic loop in getNumericAttributeEntropy. We don't need to recalculate whole distribution for each split, just move changed values. Also use array of slices instead of map of maps of strings to avoid map overhead. For our case I see time reductions from 100+ hours to 50 minutes. I've added benchmark with synthetic data (iris.csv repeated 100 times) and it also shows a nice improvement: RandomForestFit-8: old time/op 117s ± 4%, new time/op 0s ± 1%, delta -99.61% (p=0.001 n=5+10). 0 is a rounding quirk of benchstat, it should be closer to 0.5s: RandomForestFit-8: time/op 460ms ± 1%.	2018-05-08 14:59:41 -05:00
Richard Townsend	58ae6f4d1b	trees: Try to fix premature write-after-Close issue	2018-01-28 16:35:55 +00:00
Richard Townsend	e2279995c1	Fixing all tests	2018-01-28 16:22:33 +00:00
Richard Townsend	ce78cd0406	Passes the tests	2018-01-27 18:56:01 +00:00
Richard Townsend	f722f2e59d	trees: implement serialization	2018-01-27 18:00:52 +00:00
Richard Townsend	e7fee0a2d1	Reformat, fix tests	2017-09-10 21:10:54 +01:00
Richard Townsend	fc110aab48	Fix bad import, reformat	2017-09-10 20:35:34 +01:00
Richard Townsend	aee475ca14	Fix the trees tests	2017-09-10 20:13:41 +01:00
Richard Townsend	e27215052b	ensemble: tests pass	2017-09-10 19:30:02 +01:00
Richard Townsend	768d2cd19f	meta: tests are almost passing	2017-09-10 16:59:05 +01:00
Richard Townsend	57e6054404	base: fix unmarshalling attributes, add JSON	2017-08-26 14:56:31 +01:00
Richard Townsend	e68361c162	Genericize for ensemble use	2017-08-08 12:37:57 +01:00
Richard Townsend	a90ef09781	Remove excessive logging	2017-08-08 12:29:00 +01:00
Richard Townsend	d23619eac2	OK, but with a lot of extra printing	2017-08-07 17:26:11 +01:00
meirwahnon	674de9cae3	change Probability order	2017-07-17 16:01:49 +03:00
meirwahnon	518c0d84c4	extren fields of ClassProba	2017-07-17 15:35:35 +03:00
meirwahnon	2b478a0513	fix to float precise	2017-07-17 15:01:08 +03:00
meirwahnon	f56fce1a43	support PredictProba	2017-07-17 14:48:38 +03:00
Ryan Schmukler	cf6192c81c	fix(id3): fix panic on SplitAttribute being nil	2016-06-28 14:36:48 -04:00
Richard Townsend	7ba57fe6df	trees: Handling FloatAttributes. This patch adds: Gini index and information gain ratio as DecisionTree split options; handling for numeric Attributes (split point chosen naïvely on the basis of maximum entropy); a couple of additional utility functions in base/; a new dataset (see sources.txt) for testing. Performance on Iris performs markedly without discretisation.	2014-10-26 17:40:38 +00:00
Amit Kumar Gupta	4d93b9de89	Convert remaining tests to goconvey	2014-08-23 05:22:16 +00:00
Amit Kumar Gupta	1809a8b358	RandomForest returns error when fitting data with fewer features than the RandomForest plans to use; BaseClassifier Predict and Fit methods return errors; go fmt ./... Conflicts: ensemble/randomforest.go, ensemble/randomforest_test.go, trees/tree_test.go	2014-08-22 13:39:29 +00:00
Amit Kumar Gupta	529b3bcaa5	Avoid renaming packages on import	2014-08-22 13:39:29 +00:00
Amit Kumar Gupta	947ee8380e	Return error instead of panicking when unable to get confusion matrix	2014-08-22 13:39:29 +00:00
Amit Kumar Gupta	14aad31821	Consistently use (t *testing.T) instead of T or testEnv	2014-08-22 08:44:41 +00:00
Amit Kumar Gupta	695aec6eb6	Favor idiomatic t.Fatalf over panic for test failures	2014-08-22 08:07:55 +00:00
Amit Kumar Gupta	45545d6ebd	Remove Println's from automated test suite since they aren't assertions	2014-08-22 07:58:01 +00:00
Amit Kumar Gupta	21bb2fc9fa	Remove redundant import renames	2014-08-22 07:21:24 +00:00
Richard Townsend	f9c1e24e5b	neural: stop-gap support for neural networks	2014-08-09 19:27:20 +01:00
Richard Townsend	47341b2869	base: Cleaned up duplicate Attribute resolution functions	2014-08-03 15:17:20 +01:00
Richard Townsend	c2d040af30	trees: merge from v2-instances	2014-08-03 15:17:13 +01:00
albrow	132e3f4527	Create a new default logger and change some print statements to use the logger instead of fmt.Println.	2014-07-20 15:26:13 -04:00
Niclas Jern	627a5537d3	Comments should be of the form "<Struct> ..." or "<MethodName> ..."	2014-07-18 13:48:28 +03:00
Niclas Jern	32f36f28c3	if block ends with a return statement -> drop this else and outdent its block	2014-07-18 13:20:46 +03:00
Remo Hertig	f77c1dcde0	use multiple return values instead of an array in InstancesTrainTestSplit	2014-06-06 21:33:17 +02:00
Richard Townsend	a6072ac9de	Package documentation	2014-05-19 12:59:11 +01:00
Richard Townsend	889fec4419	Examples for RandomForest, ID3 and Random trees	2014-05-19 12:42:03 +01:00
Richard Townsend	45ca6063f1	Not sure if this bagging version is better or not. More similar to "Attribute bagging: improving accuracy of classifier ensembles by using random feature subsets" (Bryll)	2014-05-18 11:49:35 +01:00
Richard Townsend	26660e1470	Corrected a problem with pruning, actual ID3 decision tree type. Going to modify Bagging to select attributes on its own	2014-05-17 21:45:26 +01:00
Richard Townsend	12ace9def5	Identified source of the low accuracy	2014-05-17 20:37:19 +01:00
Richard Townsend	13c0dc3eba	Reduced-error pruning	2014-05-17 18:06:01 +01:00
Richard Townsend	c516907b13	Passes all the tests	2014-05-17 17:35:10 +01:00
Richard Townsend	db3ac3c695	ID3 algorithm working	2014-05-17 17:28:51 +01:00
Richard Townsend	cf165695c8	ChiMerge seems to improve accuracy	2014-05-17 16:20:56 +01:00
Richard Townsend	fdb67a4355	Initial work on decision trees. Random Forest has occasional disastrous accuracy: never seen that happen in WEKA	2014-05-14 14:00:22 +01:00
Stephen Whitworth	1ade0afca6	Refactored KNN to implement the estimator interface	2014-05-05 22:41:55 +01:00

52 Commits