object Utilities
Set of utility functions
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
HVDM(n1: Array[Double], n2: Array[Double], nominal: Array[Int], sds: Array[Double], attrCounter: Array[Map[Double, Int]], attrClassesCounter: Array[Map[Double, Map[Any, Int]]]): Double
Compute the HVDM (Heterogeneous Value Difference Metric) distance between two nodes
- n1
first sample
- n2
second sample
- nominal
indicates which attributes of the instances are nominal
- sds
standard deviation of each attribute
- attrCounter
for each attribute, the number of occurrences of each value
- attrClassesCounter
for each attribute, the number of occurrences of each value within each output class c
- returns
HVDM distance between the two nodes
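The following is a minimal sketch of how an HVDM computation with this signature could look, based on the definition by Wilson and Martinez rather than on the library's source. It assumes that `nominal` holds the indices of the nominal attributes and that the normalized VDM variant with squared differences is used; the wrapper object `HvdmSketch` is hypothetical.

```scala
object HvdmSketch {
  // Hedged sketch, not the library's implementation.
  // Assumption: `nominal` contains the indices of the nominal attributes.
  def hvdm(n1: Array[Double], n2: Array[Double], nominal: Array[Int], sds: Array[Double],
           attrCounter: Array[Map[Double, Int]],
           attrClassesCounter: Array[Map[Double, Map[Any, Int]]]): Double = {
    val squared = n1.indices.map { a =>
      val d =
        if (nominal.contains(a)) {
          // Nominal attribute: compare the class-conditional frequencies of both values.
          val nx = attrCounter(a).getOrElse(n1(a), 0).toDouble
          val ny = attrCounter(a).getOrElse(n2(a), 0).toDouble
          val cx = attrClassesCounter(a).getOrElse(n1(a), Map.empty[Any, Int])
          val cy = attrClassesCounter(a).getOrElse(n2(a), Map.empty[Any, Int])
          val classes = (cx.keySet ++ cy.keySet).toSeq
          val sum = classes.map { c =>
            val px = if (nx == 0) 0.0 else cx.getOrElse(c, 0) / nx
            val py = if (ny == 0) 0.0 else cy.getOrElse(c, 0) / ny
            math.pow(px - py, 2)
          }.sum
          math.sqrt(sum)
        } else {
          // Numeric attribute: absolute difference scaled by four standard deviations.
          if (sds(a) == 0.0) 0.0 else math.abs(n1(a) - n2(a)) / (4 * sds(a))
        }
      d * d
    }
    math.sqrt(squared.sum)
  }
}
```

With only numeric attributes, the result reduces to a Euclidean distance over differences scaled by `4 * sds(a)`.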
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
boolToIndex(data: Array[Boolean]): Array[Int]
Return an array with the indices of the elements that are true
- data
boolean array to convert
- returns
indices
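As an illustration, a conversion with this signature could be written as below. This is a sketch of the described behaviour, not the library's code, and the wrapper object `BoolToIndexSketch` is hypothetical.

```scala
object BoolToIndexSketch {
  // Keep the position of every element that is true.
  def boolToIndex(data: Array[Boolean]): Array[Int] =
    data.zipWithIndex.collect { case (true, i) => i }
}
```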
-
def
buildInstances(data: Array[Array[Double]], classes: Array[Any], fileInfo: FileInfo): Instances
Build a Weka Instances object from custom data
- data
set of "instances"
- classes
class label of each instance
- fileInfo
additional information
- returns
Weka Instances object
-
def
chooseByProb(probs: Array[(Double, Int)], probSum: Double, rand: Random): Int
Choose an element at random according to the given probabilities
- probs
the probabilities, paired with the elements they belong to
- probSum
the sum of all probabilities
- rand
the random generator
- returns
the chosen element
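One common way to implement such a choice is roulette-wheel selection. The sketch below assumes that each pair in `probs` is (probability, element) and that `probSum` is the sum of the probabilities; the library may do this differently, and the wrapper object `ChooseByProbSketch` is hypothetical.

```scala
import scala.util.Random

object ChooseByProbSketch {
  // Assumption: each pair is (probability, element).
  def chooseByProb(probs: Array[(Double, Int)], probSum: Double, rand: Random): Int = {
    // Uniform point in [0, probSum).
    val threshold = rand.nextDouble() * probSum
    // Running totals of the probabilities, one per element.
    val cumulative = probs.scanLeft(0.0)((acc, p) => acc + p._1).tail
    val idx = cumulative.indexWhere(threshold < _)
    // Fall back to the last element if rounding left the threshold uncovered.
    if (idx >= 0) probs(idx)._2 else probs.last._2
  }
}
```

Elements with a larger share of `probSum` cover a larger part of the interval and are therefore chosen more often.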
-
def
clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate() @throws( ... )
-
def
confusionMatrix(originalLabels: Array[Any], predictedLabels: Array[Any], minorityClass: Any): (Int, Int, Int, Int)
Compute the number of true positives (tp), false positives (fp), true negatives (tn) and false negatives (fn)
- originalLabels
original labels
- predictedLabels
labels predicted by a classifier
- minorityClass
positive class
- returns
(tp, fp, tn, fn)
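The four counts can be sketched as follows, treating `minorityClass` as the positive class as stated above. This is an illustration of the definition rather than the library's code; the wrapper object `ConfusionMatrixSketch` is hypothetical.

```scala
object ConfusionMatrixSketch {
  def confusionMatrix(originalLabels: Array[Any], predictedLabels: Array[Any],
                      minorityClass: Any): (Int, Int, Int, Int) = {
    // Pair each true label with its prediction.
    val pairs = originalLabels.zip(predictedLabels)
    val tp = pairs.count { case (o, p) => o == minorityClass && p == minorityClass }
    val fp = pairs.count { case (o, p) => o != minorityClass && p == minorityClass }
    val tn = pairs.count { case (o, p) => o != minorityClass && p != minorityClass }
    val fn = pairs.count { case (o, p) => o == minorityClass && p != minorityClass }
    (tp, fp, tn, fn)
  }
}
```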
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
euclidean(x: Array[Double], y: Array[Double]): Double
Compute the Euclidean distance between two points
- x
first instance
- y
second instance
- returns
euclidean distance between the instances
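For reference, the Euclidean distance is the square root of the sum of squared coordinate differences. A minimal sketch, assuming both arrays have the same length (the wrapper object `EuclideanSketch` is hypothetical):

```scala
object EuclideanSketch {
  // Assumes x and y have the same number of attributes.
  def euclidean(x: Array[Double], y: Array[Double]): Double =
    math.sqrt(x.zip(y).map { case (a, b) => (a - b) * (a - b) }.sum)
}
```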
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
def
imbalancedRatio(counter: Map[Any, Int], minorityClass: Any): Double
Compute the imbalanced ratio (number of instances of all the classes except the minority one, divided by the number of instances of the minority class)
- counter
Map from each class to its number of elements
- minorityClass
indicates which is the minority class
- returns
the imbalanced ratio
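The ratio described above can be sketched directly from the class counts. The code assumes `minorityClass` is a key of `counter` with a non-zero count; the wrapper object `ImbalancedRatioSketch` is hypothetical.

```scala
object ImbalancedRatioSketch {
  // Assumes minorityClass is present in counter with a count greater than zero.
  def imbalancedRatio(counter: Map[Any, Int], minorityClass: Any): Double = {
    val minority = counter(minorityClass)
    val others = counter.values.sum - minority
    others.toDouble / minority
  }
}
```

For example, 90 instances of other classes against 10 minority instances give a ratio of 9.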
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
kFoldPrediction(data: Array[Array[Double]], labels: Array[Any], k: Int, nFolds: Int, which: String): Array[Any]
Split the data into nFolds folds and predict the labels of each test fold
- data
target data
- labels
labels associated to each point in data
- k
number of neighbours to consider
- nFolds
number of subsets to create
- which
"nearest" to use the nearest neighbours; any other value uses the farthest ones
- returns
the predicted labels with the lowest error
-
def
kFoldPredictionHVDM(data: Array[Array[Double]], labels: Array[Any], k: Int, nFolds: Int, nominal: Array[Int], sds: Array[Double], attrCounter: Array[Map[Double, Int]], attrClassesCounter: Array[Map[Double, Map[Any, Int]]], which: String): Array[Any]
Split the data into nFolds folds and predict the labels of each test fold, using the HVDM distance
- data
target data
- labels
labels associated to each point in data
- k
number of neighbours to consider
- nFolds
number of subsets to create
- nominal
indicates which attributes of the instances are nominal
- sds
standard deviation of each attribute
- attrCounter
for each attribute, the number of occurrences of each value
- attrClassesCounter
for each attribute, the number of occurrences of each value within each output class c
- which
"nearest" to use the nearest neighbours; any other value uses the farthest ones
- returns
the predicted labels with the lowest error
-
def
kMeans(data: Array[Array[Double]], nominal: Array[Int], numClusters: Int, restarts: Int, minDispersion: Double, maxIterations: Int, seed: Long): (Double, Array[Array[Double]], Map[Int, Array[Int]])
Run the core of the k-means algorithm
- data
data to be clustered
- nominal
array to know which attributes are nominal
- numClusters
number of clusters to be created
- restarts
number of times to relaunch the core
- minDispersion
stop if dispersion is lower than this value
- maxIterations
number of iterations to be done in KMeans core
- seed
seed to initialize the random object
- returns
(dispersion, centroids of the clusters, a map of the form: clusterID -> Array of elements in this cluster)
-
def
kNeighbors(data: Array[Array[Double]], node: Int, k: Int): Array[Int]
Find the k nearest neighbors of a node
- data
array of samples
- node
index of the sample whose neighbors are to be found
- k
number of neighbors
- returns
indices of the neighbors of node
-
def
kNeighbors(data: Array[Array[Double]], node: Array[Double], k: Int): Array[Int]
Find the k nearest neighbors of a node
- data
array of samples
- node
array with the attributes of the node
- k
number of neighbors
- returns
indices of the neighbors of node
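A brute-force search with this signature could look like the sketch below, which assumes the Euclidean distance and sorts every sample by its distance to `node`. If `node` is itself a row of `data`, this sketch returns it as its own nearest neighbor, which the library may handle differently. The wrapper object `KNeighborsSketch` is hypothetical.

```scala
object KNeighborsSketch {
  private def euclidean(x: Array[Double], y: Array[Double]): Double =
    math.sqrt(x.zip(y).map { case (a, b) => (a - b) * (a - b) }.sum)

  // Sort the sample indices by distance to node and keep the first k.
  def kNeighbors(data: Array[Array[Double]], node: Array[Double], k: Int): Array[Int] =
    data.indices.sortBy(i => euclidean(data(i), node)).take(k).toArray
}
```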
-
def
kNeighborsHVDM(data: Array[Array[Double]], node: Int, k: Int, nominal: Array[Int], sds: Array[Double], attrCounter: Array[Map[Double, Int]], attrClassesCounter: Array[Map[Double, Map[Any, Int]]]): Array[Int]
Find the k nearest neighbors of a node using the HVDM distance
- data
array of samples
- node
index of the sample whose neighbors are to be found
- k
number of neighbors
- nominal
indicates which attributes of the instances are nominal
- sds
standard deviation of each attribute
- attrCounter
for each attribute, the number of occurrences of each value
- attrClassesCounter
for each attribute, the number of occurrences of each value within each output class c
- returns
indices of the neighbors of node
-
def
kNeighborsHVDM(data: Array[Array[Double]], node: Array[Double], k: Int, nominal: Array[Int], sds: Array[Double], attrCounter: Array[Map[Double, Int]], attrClassesCounter: Array[Map[Double, Map[Any, Int]]]): Array[Int]
Find the k nearest neighbors of a node using the HVDM distance
- data
array of samples
- node
array with the attributes of the node
- k
number of neighbors
- nominal
indicates which attributes of the instances are nominal
- sds
standard deviation of each attribute
- attrCounter
for each attribute, the number of occurrences of each value
- attrClassesCounter
for each attribute, the number of occurrences of each value within each output class c
- returns
indices of the neighbors of node
-
def
minority(data: Array[Any]): Array[Int]
Calculate minority class
- returns
the elements in the minority class
-
def
mode(data: Array[Any]): Any
Compute the mode of an array
- data
array to compute the mode
- returns
the mode of the array
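The mode is the most frequent value. A minimal sketch, assuming a non-empty array and resolving ties arbitrarily (the library's tie-breaking is not documented here); the wrapper object `ModeSketch` is hypothetical.

```scala
object ModeSketch {
  // Group equal values and return the one with the largest group.
  def mode(data: Array[Any]): Any =
    data.groupBy(identity).maxBy { case (_, occurrences) => occurrences.length }._1
}
```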
-
def
nanoTimeToString(elapsedTime: Long): String
Convert nanoseconds to minutes, seconds and milliseconds
- elapsedTime
nanoseconds to be converted
- returns
String representing the conversion
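The conversion can be sketched with `java.util.concurrent.TimeUnit`. The output format used here (`1m 1s 500ms`) is an assumption for illustration, since the exact string produced by the library is not documented above; the wrapper object `NanoTimeSketch` is hypothetical.

```scala
import java.util.concurrent.TimeUnit

object NanoTimeSketch {
  // Assumed output format: "<minutes>m <seconds>s <milliseconds>ms".
  def nanoTimeToString(elapsedTime: Long): String = {
    val minutes = TimeUnit.NANOSECONDS.toMinutes(elapsedTime)
    val seconds = TimeUnit.NANOSECONDS.toSeconds(elapsedTime) % 60
    val millis = TimeUnit.NANOSECONDS.toMillis(elapsedTime) % 1000
    s"${minutes}m ${seconds}s ${millis}ms"
  }
}
```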
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
nnRule(neighbours: Array[Array[Double]], instance: Array[Double], id: Int, labels: Array[Any], k: Int, which: String): (Any, Array[Int], Array[Double])
Decide the label of an instance using the nearest-neighbour rule over k neighbours in the dataset
- neighbours
neighbours of the element
- instance
target instance
- id
id of the instance
- labels
labels associated to each point in data
- k
number of neighbours to consider
- which
"nearest" to use the nearest neighbours, "farthest" to use the farthest ones
- returns
the label assigned to the instance, the indices of the k selected neighbours and an array of doubles (presumably their distances to the instance)
-
def
nnRuleHVDM(neighbours: Array[Array[Double]], instance: Array[Double], id: Int, labels: Array[Any], k: Int, nominal: Array[Int], sds: Array[Double], attrCounter: Array[Map[Double, Int]], attrClassesCounter: Array[Map[Double, Map[Any, Int]]], which: String): (Any, Array[Int], Array[Double])
Decide the label of an instance using the nearest-neighbour rule over k neighbours in the dataset, with the HVDM distance
- neighbours
neighbours of the element
- instance
target instance
- id
id of the instance
- labels
labels associated to each point in data
- k
number of neighbours to consider
- nominal
indicates which attributes of the instances are nominal
- sds
standard deviation of each attribute
- attrCounter
for each attribute, the number of occurrences of each value
- attrClassesCounter
for each attribute, the number of occurrences of each value within each output class c
- which
"nearest" to use the nearest neighbours; any other value uses the farthest ones
- returns
the label assigned to the instance, the indices of the k selected neighbours and an array of doubles (presumably their distances to the instance)
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
def
occurrencesByValueAndClass(attribute: Array[Double], classes: Array[Any]): Map[Double, Map[Any, Int]]
For the attribute given by the array attribute, count how often each value x occurs together with each output class c in classes
- attribute
attribute to be used
- classes
classes present in the dataset
- returns
map of maps with the form: (value -> (class -> number of elements))
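The nested map can be sketched with two levels of grouping. The code below assumes that `classes(i)` is the label of the instance whose attribute value is `attribute(i)`, which is an interpretation of the parameter rather than something the documentation confirms; the wrapper object `OccurrencesSketch` is hypothetical.

```scala
object OccurrencesSketch {
  // Assumption: attribute and classes are aligned, one entry per instance.
  def occurrencesByValueAndClass(attribute: Array[Double],
                                 classes: Array[Any]): Map[Double, Map[Any, Int]] =
    attribute.zip(classes)
      .groupBy { case (value, _) => value }            // first level: attribute value
      .map { case (value, pairs) =>
        value -> pairs
          .groupBy { case (_, c) => c }                // second level: output class
          .map { case (c, group) => c -> group.length }
      }
}
```

Applying this to every attribute column would yield a structure of the same shape as the `attrClassesCounter` parameter used by the HVDM methods.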
-
def
processData(data: Data): (Array[Array[Double]], Array[Map[Double, Any]])
Convert a data object into a matrix of doubles, taking care of missing values and nominal columns. Missing values are replaced with the most frequent value for nominal variables and with the median for numeric variables. Nominal columns are converted to doubles.
- data
data to process
-
def
standardDeviation(xs: Array[Double]): Double
Compute the standard deviation of an array
- xs
array to be used
- returns
standard deviation of xs
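A minimal sketch of the computation follows. It uses the population formula (dividing by n); whether the library divides by n or by n - 1 is not stated above, so treat that choice as an assumption. The wrapper object `StdSketch` is hypothetical.

```scala
object StdSketch {
  // Assumption: population standard deviation (divide by n), non-empty input.
  def standardDeviation(xs: Array[Double]): Double = {
    val mean = xs.sum / xs.length
    math.sqrt(xs.map(x => math.pow(x - mean, 2)).sum / xs.length)
  }
}
```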
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
to2Decimals(data: Array[Array[Double]]): Array[Array[Any]]
Transform the numerical data to the nominal data
- data
array with the results
- returns
the result converted
-
def
toNominal(data: Array[Array[Double]], dict: Array[Map[Double, Any]]): Array[Array[Any]]
Transform numerical data to nominal data
- data
array with the results
- dict
dictionary with the keys to do the transformation
- returns
the result converted
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
def
toXData(d: Array[Array[Double]]): Array[Array[Any]]
Convert a double matrix to a matrix of Any
- d
data to be converted
- returns
matrix of Any
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
def
zeroOneDenormalization(data: Array[Array[Double]], max: Array[Double], min: Array[Double]): Array[Array[Double]]
Denormalize the data
- data
the data to denormalize
- max
the max value of the samples for each column
- min
the min value of the samples for each column
- returns
the data denormalized
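Assuming the data was scaled with the usual zero-one formula (x - min) / (max - min), denormalization is its inverse, x * (max - min) + min, applied column by column. The sketch below applies it to every column; the library may treat nominal columns differently. The wrapper object `DenormSketch` is hypothetical.

```scala
object DenormSketch {
  // Inverse of zero-one scaling, applied to every column (an assumption).
  def zeroOneDenormalization(data: Array[Array[Double]], max: Array[Double],
                             min: Array[Double]): Array[Array[Double]] =
    data.map { row =>
      row.indices.map(j => row(j) * (max(j) - min(j)) + min(j)).toArray
    }
}
```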
-
def
zeroOneNormalization(d: Data, x: Array[Array[Double]]): Array[Array[Double]]
Normalize the data as follows: for each column x, compute (x - min(x)) / (max(x) - min(x)). Only non-nominal columns are normalized.
- returns
normalized data
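The formula above can be sketched on a plain matrix. Because the `Data` type is not described here, this version takes a hypothetical `nominal` set of column indices instead of the `d` parameter; it also leaves constant columns untouched to avoid dividing by zero, which is a choice of this sketch. The wrapper object `NormSketch` is hypothetical.

```scala
object NormSketch {
  // Hypothetical signature: the documented method receives a Data object instead of `nominal`.
  def zeroOneNormalization(x: Array[Array[Double]], nominal: Set[Int]): Array[Array[Double]] = {
    val columns = x.transpose
    val normalized = columns.zipWithIndex.map { case (col, j) =>
      val (lo, hi) = (col.min, col.max)
      // Leave nominal and constant columns unchanged.
      if (nominal.contains(j) || hi == lo) col
      else col.map(v => (v - lo) / (hi - lo))
    }
    normalized.transpose
  }
}
```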
-
def
zeroOneToIndex(data: Array[Int]): Array[Int]
Return an array of the indices that are one
- data
zero/one array to convert
- returns
indices
-
object
Distance extends Enumeration
Enumeration to represent the possible distances
EUCLIDEAN: Euclidean distance
HVDM: distance proposed in "Improved Heterogeneous Distance Functions" by D. Randall Wilson and Tony R. Martinez