soul - soul.util.Utilities

final def !=(arg0: Any): Boolean

Definition Classes: AnyRef → Any

final def ##(): Int

Definition Classes: AnyRef → Any

final def ==(arg0: Any): Boolean

Definition Classes: AnyRef → Any

def HVDM(n1: Array[Double], n2: Array[Double], nominal: Array[Int], sds: Array[Double], attrCounter: Array[Map[Double, Int]], attrClassesCounter: Array[Map[Double, Map[Any, Int]]]): Double

Compute the HVDM distance between two nodes

n1: sample1
n2: sample2
nominal: indicate nominal attributes in the instances
sds: standard deviations
attrCounter: for each attribute, number of occurrences of each value
attrClassesCounter: for each attribute, number of occurrences of each value within each output class c
returns: HVDM distance of two nodes
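
For reference, the HVDM definition from Wilson and Martinez can be written as below. This is a sketch of the published formula, not a transcription of this library's code: the paper gives several normalizations for the nominal case and the one used here is not stated on this page. The mapping of symbols to parameters is an assumption: sigma_a to sds, N_{a,x} to attrCounter, N_{a,x,c} to attrClassesCounter.

```latex
\mathrm{HVDM}(x, y) = \sqrt{\sum_{a=1}^{m} d_a(x_a, y_a)^{2}}

d_a(x, y) =
\begin{cases}
1 & \text{if } x \text{ or } y \text{ is unknown} \\[4pt]
\sqrt{\sum_{c=1}^{C} \left| \dfrac{N_{a,x,c}}{N_{a,x}} - \dfrac{N_{a,y,c}}{N_{a,y}} \right|^{2}} & \text{if attribute } a \text{ is nominal} \\[8pt]
\dfrac{|x - y|}{4\sigma_a} & \text{if attribute } a \text{ is numeric}
\end{cases}
```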

final def asInstanceOf[T0]: T0

Definition Classes: Any

def boolToIndex(data: Array[Boolean]): Array[Int]

Return an array of the indices that are true

data: boolean array to convert
returns: indices
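
A minimal sketch of the behaviour described above, written as a hypothetical re-implementation (the object name BoolToIndexSketch is not part of the library):

```scala
object BoolToIndexSketch {
  // Hypothetical re-implementation for illustration; not the library's source.
  // Pair every flag with its position and keep the positions whose flag is true.
  def boolToIndex(data: Array[Boolean]): Array[Int] =
    data.zipWithIndex.collect { case (flag, i) if flag => i }
}
```

For example, Array(true, false, true, true) maps to the indices 0, 2 and 3.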

def buildInstances(data: Array[Array[Double]], classes: Array[Any], fileInfo: FileInfo): Instances

Build a weka Instances object for custom data

data: set of "instances"
classes: response of instances
fileInfo: additional information
returns: weka instances

def chooseByProb(probs: Array[(Double, Int)], probSum: Double, rand: Random): Int

Return an element chosen at random according to its probability

probs: the probabilities
probSum: the sum of all probabilities
rand: the random generator
returns: the chosen element
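
One common way to implement this kind of selection is roulette-wheel sampling. The sketch below assumes each pair in probs is (probability, element) and that probSum is the sum of the first components; both are assumptions, since the page does not spell them out. ChooseByProbSketch is a hypothetical name.

```scala
import scala.util.Random

object ChooseByProbSketch {
  // Hypothetical roulette-wheel selection; not the library's source.
  // Assumes probs holds (probability, element) pairs and probSum is their total.
  def chooseByProb(probs: Array[(Double, Int)], probSum: Double, rand: Random): Int = {
    require(probs.nonEmpty, "probs must not be empty")
    // Uniform draw in [0, probSum)
    val threshold = rand.nextDouble() * probSum
    var cumulative = 0.0
    var i = 0
    // Advance while the running total, including the current entry, does not
    // exceed the threshold; never move past the last entry.
    while (i < probs.length - 1 && cumulative + probs(i)._1 <= threshold) {
      cumulative += probs(i)._1
      i += 1
    }
    probs(i)._2
  }
}
```

An element with probability 0 is never returned unless it is the last entry and rounding error leaves the threshold unreached.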

def clone(): AnyRef

Attributes: protected[java.lang]
Definition Classes: AnyRef
Annotations: @native() @HotSpotIntrinsicCandidate() @throws( ... )

def confusionMatrix(originalLabels: Array[Any], predictedLabels: Array[Any], minorityClass: Any): (Int, Int, Int, Int)

Compute the number of true positives (tp), false positives (fp), true negatives (tn) and false negatives (fn)

originalLabels: original labels
predictedLabels: labels predicted by a classifier
minorityClass: positive class
returns: (tp, fp, tn, fn)
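
The four counts can be sketched as follows, treating minorityClass as the positive class as stated above. This is a hypothetical re-implementation (ConfusionMatrixSketch is not a library name) that only mirrors the documented order of the returned tuple.

```scala
object ConfusionMatrixSketch {
  // Hypothetical re-implementation for illustration; not the library's source.
  def confusionMatrix(originalLabels: Array[Any],
                      predictedLabels: Array[Any],
                      minorityClass: Any): (Int, Int, Int, Int) = {
    require(originalLabels.length == predictedLabels.length, "label arrays must have the same length")
    val pairs = originalLabels.zip(predictedLabels)
    // Positive = minorityClass, negative = any other class
    val tp = pairs.count { case (o, p) => o == minorityClass && p == minorityClass }
    val fp = pairs.count { case (o, p) => o != minorityClass && p == minorityClass }
    val tn = pairs.count { case (o, p) => o != minorityClass && p != minorityClass }
    val fn = pairs.count { case (o, p) => o == minorityClass && p != minorityClass }
    (tp, fp, tn, fn)
  }
}
```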

final def eq(arg0: AnyRef): Boolean

Definition Classes: AnyRef

def equals(arg0: Any): Boolean

Definition Classes: AnyRef → Any

def euclidean(x: Array[Double], y: Array[Double]): Double

Compute the Euclidean Distance between two points

x: first instance
y: second instance
returns: euclidean distance between the instances
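
As an illustration, the Euclidean distance is the square root of the sum of squared coordinate differences. The sketch below is a hypothetical re-implementation (EuclideanSketch is not a library name); the length check is an added assumption.

```scala
object EuclideanSketch {
  // Hypothetical re-implementation for illustration; not the library's source.
  def euclidean(x: Array[Double], y: Array[Double]): Double = {
    require(x.length == y.length, "both instances must have the same number of attributes")
    // Sum the squared differences coordinate by coordinate, then take the root
    math.sqrt(x.zip(y).map { case (a, b) => (a - b) * (a - b) }.sum)
  }
}
```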

final def getClass(): Class[_]

Definition Classes: AnyRef → Any
Annotations: @native() @HotSpotIntrinsicCandidate()

def hashCode(): Int

Definition Classes: AnyRef → Any
Annotations: @native() @HotSpotIntrinsicCandidate()

def imbalancedRatio(counter: Map[Any, Int], minorityClass: Any): Double

Compute the imbalanced ratio (number of instances of all the classes except the minority one divided by the number of instances of the minority class)

counter: map from each class to its number of elements
minorityClass: indicates which is the minority class
returns: the imbalanced ratio
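
The ratio described above can be sketched as follows. This is a hypothetical re-implementation (ImbalancedRatioSketch is not a library name) and it assumes minorityClass is present in counter with a non-zero count.

```scala
object ImbalancedRatioSketch {
  // Hypothetical re-implementation for illustration; not the library's source.
  // Assumes minorityClass is a key of counter and its count is greater than zero.
  def imbalancedRatio(counter: Map[Any, Int], minorityClass: Any): Double = {
    val minorityCount = counter(minorityClass)
    // Instances of every class except the minority one
    val othersCount = counter.values.sum - minorityCount
    othersCount.toDouble / minorityCount
  }
}
```

With 90 instances of one class and 10 of the minority class, the ratio is 9.0.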

final def isInstanceOf[T0]: Boolean

Definition Classes: Any

def kFoldPrediction(data: Array[Array[Double]], labels: Array[Any], k: Int, nFolds: Int, which: String): Array[Any]

Split the data into nFolds folds and predict the labels of the test folds

data: target data
labels: labels associated to each point in data
k: number of neighbours to consider
nFolds: number of subsets to create
which: "nearest" to return the nearest neighbours, otherwise, return the farthest ones
returns: the predicted labels with the lowest error

def kFoldPredictionHVDM(data: Array[Array[Double]], labels: Array[Any], k: Int, nFolds: Int, nominal: Array[Int], sds: Array[Double], attrCounter: Array[Map[Double, Int]], attrClassesCounter: Array[Map[Double, Map[Any, Int]]], which: String): Array[Any]

Split the data into nFolds folds and predict the labels of the test folds, using the HVDM distance

data: target data
labels: labels associated to each point in data
k: number of neighbours to consider
nFolds: number of subsets to create
nominal: indicate nominal attributes in the instances
sds: standard deviations
attrCounter: for each attribute, number of occurrences of each value
attrClassesCounter: for each attribute, number of occurrences of each value within each output class c
which: "nearest" to return the nearest neighbours, otherwise, return the farthest ones
returns: the predicted labels with the lowest error

def kMeans(data: Array[Array[Double]], nominal: Array[Int], numClusters: Int, restarts: Int, minDispersion: Double, maxIterations: Int, seed: Long): (Double, Array[Array[Double]], Map[Int, Array[Int]])

Compute KMeans core

data: data to be clustered
nominal: array to know which attributes are nominal
numClusters: number of clusters to be created
restarts: number of times to relaunch the core
minDispersion: stop if dispersion is lower than this value
maxIterations: number of iterations to be done in KMeans core
seed: seed to initialize the random object
returns: (dispersion, centroids of the clusters, a map of the form: clusterID -> Array of elements in this cluster)

def kNeighbors(data: Array[Array[Double]], node: Int, k: Int): Array[Int]

Compute kNN core

data: array of samples
node: index of the sample whose neighbors are to be found
k: number of neighbors
returns: indices of the neighbors of node

def kNeighbors(data: Array[Array[Double]], node: Array[Double], k: Int): Array[Int]

Compute kNN core

data: array of samples
node: array with the attributes of the node
k: number of neighbors
returns: indices of the neighbors of node
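
A brute-force version of the second overload can be sketched as below: compute the distance from node to every sample, sort, and keep the first k indices. This is a hypothetical re-implementation using the Euclidean distance (KNeighborsSketch is not a library name). How the index-based overload treats the node itself is not stated on this page and is not modelled here.

```scala
object KNeighborsSketch {
  // Euclidean distance helper, local to this sketch
  private def euclidean(x: Array[Double], y: Array[Double]): Double =
    math.sqrt(x.zip(y).map { case (a, b) => (a - b) * (a - b) }.sum)

  // Hypothetical brute-force k-nearest-neighbours search; not the library's source.
  def kNeighbors(data: Array[Array[Double]], node: Array[Double], k: Int): Array[Int] =
    data.indices
      .sortBy(i => euclidean(data(i), node)) // closest samples first
      .take(k)
      .toArray
}
```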

def kNeighborsHVDM(data: Array[Array[Double]], node: Int, k: Int, nominal: Array[Int], sds: Array[Double], attrCounter: Array[Map[Double, Int]], attrClassesCounter: Array[Map[Double, Map[Any, Int]]]): Array[Int]

Compute kNN core

data: array of samples
node: index of the sample whose neighbors are to be found
k: number of neighbors
nominal: indicate nominal attributes in the instances
sds: standard deviations
attrCounter: for each attribute, number of occurrences of each value
attrClassesCounter: for each attribute, number of occurrences of each value within each output class c
returns: indices of the neighbors of node

def kNeighborsHVDM(data: Array[Array[Double]], node: Array[Double], k: Int, nominal: Array[Int], sds: Array[Double], attrCounter: Array[Map[Double, Int]], attrClassesCounter: Array[Map[Double, Map[Any, Int]]]): Array[Int]

Compute kNN core

data: array of samples
node: array with the attributes of the node
k: number of neighbors
nominal: indicate nominal attributes in the instances
sds: standard deviations
attrCounter: for each attribute, number of occurrences of each value
attrClassesCounter: for each attribute, number of occurrences of each value within each output class c
returns: indices of the neighbors of node

def minority(data: Array[Any]): Array[Int]

Calculate minority class

returns: the indices of the elements in the minority class

def mode(data: Array[Any]): Any

Compute the mode of an array

data: array to compute the mode
returns: the mode of the array
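
The mode can be sketched by counting the occurrences of each value and taking the most frequent one. This is a hypothetical re-implementation (ModeSketch is not a library name); how the library breaks ties is not documented here, and this sketch returns an arbitrary one of the tied values.

```scala
object ModeSketch {
  // Hypothetical re-implementation for illustration; not the library's source.
  // Throws on an empty array; ties are resolved arbitrarily.
  def mode(data: Array[Any]): Any =
    data
      .groupBy(identity)                             // value -> all its occurrences
      .map { case (value, occ) => value -> occ.length } // value -> count
      .maxBy(_._2)                                   // entry with the highest count
      ._1
}
```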

def nanoTimeToString(elapsedTime: Long): String

Convert nanoseconds to minutes, seconds and milliseconds

elapsedTime: nanoseconds to be converted
returns: String representing the conversion

final def ne(arg0: AnyRef): Boolean

Definition Classes: AnyRef

def nnRule(neighbours: Array[Array[Double]], instance: Array[Double], id: Int, labels: Array[Any], k: Int, which: String): (Any, Array[Int], Array[Double])

Decide the label using the NNRule considering k neighbours of dataset

neighbours: neighbours of the element
instance: target instance
id: id of the instance
labels: labels associated to each point in data
k: number of neighbours to consider
which: if set to "nearest", return the nearest neighbours; if set to "farthest", return the farthest ones
returns: the label assigned to instance and the indices of the k neighbours selected according to which

def nnRuleHVDM(neighbours: Array[Array[Double]], instance: Array[Double], id: Int, labels: Array[Any], k: Int, nominal: Array[Int], sds: Array[Double], attrCounter: Array[Map[Double, Int]], attrClassesCounter: Array[Map[Double, Map[Any, Int]]], which: String): (Any, Array[Int], Array[Double])

Decide the label using the NNRule considering k neighbours of dataset

neighbours: neighbours of the element
instance: target instance
id: id of the instance
labels: labels associated to each point in data
k: number of neighbours to consider
nominal: indicate nominal attributes in the instances
sds: standard deviations
attrCounter: for each attribute, number of occurrences of each value
attrClassesCounter: for each attribute, number of occurrences of each value within each output class c
which: "nearest" to return the nearest neighbours, otherwise, return the farthest ones
returns: the label assigned to instance and the indices of the k neighbours selected according to which

final def notify(): Unit

Definition Classes: AnyRef
Annotations: @native() @HotSpotIntrinsicCandidate()

final def notifyAll(): Unit

Definition Classes: AnyRef
Annotations: @native() @HotSpotIntrinsicCandidate()

def occurrencesByValueAndClass(attribute: Array[Double], classes: Array[Any]): Map[Double, Map[Any, Int]]

Compute the number of occurrences for each value x for attribute represented by array attribute and output class c, for each class c in classes

attribute: attribute to be used
classes: classes present in the dataset
returns: map of maps with the form: (value -> (class -> number of elements))
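
The nested map can be sketched with two levels of grouping. The sketch assumes that classes(i) is the class label of the instance whose attribute value is attribute(i); the page does not confirm this reading, so treat it as an assumption. OccurrencesSketch is a hypothetical name.

```scala
object OccurrencesSketch {
  // Hypothetical re-implementation for illustration; not the library's source.
  // Assumes attribute(i) and classes(i) describe the same instance.
  def occurrencesByValueAndClass(attribute: Array[Double],
                                 classes: Array[Any]): Map[Double, Map[Any, Int]] =
    attribute.zip(classes)
      .groupBy(_._1) // attribute value -> (value, class) pairs
      .map { case (value, pairs) =>
        // For this value, count how many instances fall in each class
        value -> pairs.groupBy(_._2).map { case (c, occ) => c -> occ.length }
      }
}
```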

def processData(data: Data): (Array[Array[Double]], Array[Map[Double, Any]])

Convert a data object into a matrix of doubles, taking care of missing values and nominal columns. Missing data is replaced with the most frequent value for nominal variables and with the median for numeric variables. Nominal columns are converted to doubles.

data: data to process

def standardDeviation(xs: Array[Double]): Double

Compute the standard deviation for an array

xs: array to be used
returns: standard deviation of xs
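
A sketch of the computation follows. Whether the library divides by n (population) or n - 1 (sample) is not stated on this page; the version below divides by n, which is an assumption. StandardDeviationSketch is a hypothetical name.

```scala
object StandardDeviationSketch {
  // Hypothetical re-implementation for illustration; not the library's source.
  // Uses the population formula (divides by n); the library may differ.
  def standardDeviation(xs: Array[Double]): Double = {
    require(xs.nonEmpty, "xs must not be empty")
    val mean = xs.sum / xs.length
    val variance = xs.map(x => (x - mean) * (x - mean)).sum / xs.length
    math.sqrt(variance)
  }
}
```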

final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes: AnyRef

def to2Decimals(data: Array[Array[Double]]): Array[Array[Any]]

Transform the numerical data to the nominal data

data: array with the results
returns: the result converted

def toNominal(data: Array[Array[Double]], dict: Array[Map[Double, Any]]): Array[Array[Any]]

Transform numerical data to nominal data

data: array with the results
dict: dictionary with the keys to do the transformation
returns: the result converted

def toString(): String

Definition Classes: AnyRef → Any

def toXData(d: Array[Array[Double]]): Array[Array[Any]]

Convert a double matrix to a matrix of Any

d: data to be converted
returns: matrix of Any

final def wait(arg0: Long, arg1: Int): Unit

Definition Classes: AnyRef
Annotations: @throws( ... )

final def wait(arg0: Long): Unit

Definition Classes: AnyRef
Annotations: @native() @throws( ... )

final def wait(): Unit

Definition Classes: AnyRef
Annotations: @throws( ... )

def zeroOneDenormalization(data: Array[Array[Double]], max: Array[Double], min: Array[Double]): Array[Array[Double]]

Denormalize the data

data: the data to denormalize
max: the max value of the samples for each column
min: the min value of the samples for each column
returns: the data denormalized

def zeroOneNormalization(d: Data, x: Array[Array[Double]]): Array[Array[Double]]

Normalize the data as follows: each column x is mapped to (x - min(x)) / (max(x) - min(x)). Only non-nominal columns are normalized.

returns: normalized data
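
The min-max formula and its inverse can be sketched on a plain matrix as below. The library method takes a Data object and skips nominal columns; this hypothetical version (ZeroOneSketch, normalize and denormalize are not library names) treats every column as numeric and maps constant columns to 0.0, both of which are assumptions.

```scala
object ZeroOneSketch {
  // Hypothetical column-wise min-max scaling; not the library's source.
  // Returns the scaled matrix together with the per-column max and min.
  def normalize(x: Array[Array[Double]]): (Array[Array[Double]], Array[Double], Array[Double]) = {
    val columns = x.transpose
    val min = columns.map(_.min)
    val max = columns.map(_.max)
    val normalized = x.map { row =>
      row.indices.map { j =>
        val range = max(j) - min(j)
        // Assumption: a constant column is mapped to 0.0 to avoid dividing by zero
        if (range == 0.0) 0.0 else (row(j) - min(j)) / range
      }.toArray
    }
    (normalized, max, min)
  }

  // Inverse mapping: value * (max - min) + min, column by column
  def denormalize(data: Array[Array[Double]], max: Array[Double], min: Array[Double]): Array[Array[Double]] =
    data.map(row => row.indices.map(j => row(j) * (max(j) - min(j)) + min(j)).toArray)
}
```

Applying denormalize to the output of normalize with the returned max and min recovers the original values, up to floating-point rounding.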

def zeroOneToIndex(data: Array[Int]): Array[Int]

Return an array of the indices that are one

data: zero/one array to convert
returns: indices

object Distance extends Enumeration

Enumeration to represent the possible distances

EUCLIDEAN: Euclidean distance
HVDM: proposed in "Improved Heterogeneous Distance Functions" by D. Randall Wilson and Tony R. Martinez
