Data Resources

Open FormerBuckeye opened this issue 8 years ago • 1 comment

Hi guys,

I'd like to know which data files are used for training and which for testing.

Also, could you tell me what the data layout path is?

Thanks.

FormerBuckeye avatar Nov 14 '17 23:11 FormerBuckeye

KDD DataSet link

The layout uses the format given below. For each feature you decide whether it is Numeric (infinite range), Categoric (finite range, i.e. a set), ignored, or the label.

N 3 C 2 N C 4 N C 8 N 2 C 19 N L I, as described in the Java class DescribeTrees:

	 * Breaks the run length code for data layout
	 * N-Nominal/Number/Real
	 * C-Categorical/Alphabetical/Numerical
	 * I-Ignore Attribute
	 * L-Label - last of the fields (expected output)
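
To make the run-length code concrete, here is a minimal sketch of how such a layout string could be expanded into one symbol per column. It is not taken from DescribeTrees; the class name `LayoutSketch` and the method `expand` are hypothetical. It also assumes that a number applies to the symbol that follows it (so `3 C` means three categorical columns), which is only one possible reading of the format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Random-Forest repository.
class LayoutSketch {

    // Expands a run-length layout such as "N 3 C 2 N ..." into one symbol per column.
    // Assumption: a number is a repeat count for the symbol that FOLLOWS it;
    // a symbol without a preceding number stands for a single column.
    static List<Character> expand(String layout) {
        List<Character> columns = new ArrayList<>();
        int repeat = 1;
        for (String token : layout.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input produces a single empty token
            }
            if (token.matches("\\d+")) {
                repeat = Integer.parseInt(token);
            } else {
                char symbol = token.charAt(0);
                if (token.length() != 1 || "NCIL".indexOf(symbol) < 0) {
                    throw new IllegalArgumentException("Unknown layout symbol: " + token);
                }
                for (int i = 0; i < repeat; i++) {
                    columns.add(symbol);
                }
                repeat = 1; // the count is consumed by this symbol
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<Character> columns = expand("N 3 C 2 N C 4 N C 8 N 2 C 19 N L I");
        System.out.println(columns.size() + " columns: " + columns);
    }
}
```

Under this reading, the layout above expands to 43 columns: one numeric column, then three categorical ones, and so on, ending with the label column followed by one ignored column.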

ironmanMA avatar Nov 19 '17 01:11 ironmanMA