Package pycv :: Package cs :: Package ml :: Package cla

Package cla

An umbrella package containing all Classification packages.



Classes
  BinaryErrorStats
Error statistics for binary classification.
  BinaryClassifier
  CDataset
  Classifier
  CDataGenerator
  WeightedCDataset
A CDataset with weights attached to the samples.
  ScoredWCDataset
A WeightedCDataset with score s[i] attached to the sample i.
  Toy3BalancedDataGenerator
  ToyImbalancedDataGenerator
  Toy2BalancedDataGenerator
  ToyBalancedDataGenerator
  NBClassifier
  TLinearClassifier
Thresholded linear classifier.
  AdditiveClassifier
Additive Classifier
  Rejector
Rejector
  ScoringClassifier
Scoring Classifier
  Shifter
Shifter
  ScoringCDataset
A binary CDataset with score s[i] associated with sample i.
  Resetter
An AdditiveClassifier that resets the score to a predefined value.
  UAC
Univariate Additive Classifier. A classifier of the form: classify x as class y = sgn( s(x) + c sgn(x-b) ), where s(x) is a score function and (c,b) are unknown parameters.
Functions
  evaluate(bc, wcd, *args, **kwds)
Evaluate a BinaryClassifier using a WeightedCDataset.
  train_NBClassifier(cd)
Train an NBClassifier using a WeightedCDataset.
  train_LDA(classification_dataset, crit, param1)
Take a 2-class WeightedCDataset, then train a TLinearClassifier.
  project_LDA(stats2)
Project down to a line using the LDA projection.
  train_Shifter(scd, sc, criterion, param1)
Train a Shifter to succeed a ScoringClassifier with a different goal.
  sort_1d(cd)
Take a classification dataset with ishape=() and sort its samples in ascending order.
  histogram_1d(wcd, nbins, minValue=None, maxValue=None)
Compute a histogram from a (weighted) classification dataset with ishape=().
  thresh_1d(criterion, param1, wcd, sort_id=None)
Solve for a threshold-based 1D binary classifier.
  thresh_normal_1d(criterion, param1, stats)
Solve the class-conditional Gaussian-assumed classification with thresholding and goal.
Function Details

evaluate(bc, wcd, *args, **kwds)

Evaluate a BinaryClassifier using a WeightedCDataset.
Returns:
brs : BinaryErrorStats
statistics of the error rates

Parameters:

bc : BinaryClassifier
a binary classifier to evaluate

wcd : WeightedCDataset
the dataset used as the test set
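
The kind of statistics such an evaluation produces can be illustrated with a minimal, dependency-free sketch. It is not the pycv implementation: the function name, the plain-list inputs, the callable `classify`, and the dictionary result are all assumptions made for illustration, and the actual fields of BinaryErrorStats are not documented on this page. FAR and FRR are used here because they are the error rates referred to by the thresholding functions below.

```python
def evaluate_sketch(classify, neg, pos, w_neg, w_pos):
    """Weighted false-accept and false-reject rates of a binary classifier.

    Hypothetical stand-ins: `classify(x)` returns a positive number for the
    positive class; `neg`/`pos` hold the samples of each class and
    `w_neg`/`w_pos` their weights.
    """
    # A false accept is a negative sample classified as positive.
    false_accepts = sum(w for x, w in zip(neg, w_neg) if classify(x) > 0)
    # A false reject is a positive sample classified as negative.
    false_rejects = sum(w for x, w in zip(pos, w_pos) if classify(x) <= 0)
    return {"FAR": false_accepts / sum(w_neg), "FRR": false_rejects / sum(w_pos)}


# Usage: a fixed threshold at 0.5 on scalar samples.
stats = evaluate_sketch(lambda x: 1 if x > 0.5 else -1,
                        neg=[0.1, 0.7], pos=[0.4, 0.9],
                        w_neg=[3.0, 1.0], w_pos=[1.0, 1.0])
```
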

train_NBClassifier(cd)

Train an NBClassifier using a WeightedCDataset

Input:
cd: a WeightedCDataset
Output:
an NBClassifier

train_LDA(classification_dataset, crit, param1)

Take a 2-class WeightedCDataset, then train a TLinearClassifier.
The projection direction is LDA. The threshold is trained using one of three criteria:
    crit = 0: param1 is 'thelambda', then call gaussian.find_classification_threshold()
    crit = 1: param1 is 'minDR', then call gaussian.find_filtering_threshold()
    crit = 2: param1 is 'maxFAR', then call gaussian.find_filtering_threshold2()
    
Input:
    classification_dataset: a 2-class WeightedCDataset
    crit, param1: as mentioned above
Output:
    lc: a TLinearClassifier, with lc.err as the estimated 'error'

project_LDA(stats2)

Project down to a line using the LDA projection. Return the direction.

Input:
stats2: a Stats2 of J classes
Output:
w: the projection direction (vector) -- if input is a tensor, flatten it into a vector
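
For the two-class case, the LDA (Fisher) direction is commonly computed as w = Sw^{-1} (mean1 - mean0), where Sw is the within-class scatter. The sketch below shows that formula for two features only, so that it needs no linear-algebra library. It is an illustration under stated assumptions, not the pycv code: the function name and the mean/covariance arguments are hypothetical stand-ins for the contents of a Stats2 object, and the handling of J > 2 classes is not covered.

```python
def lda_direction_2d(mean0, cov0, mean1, cov1):
    """Two-class Fisher/LDA direction for 2-dimensional inputs.

    Hypothetical inputs: per-class means (length-2 sequences) and
    covariances (2x2 nested sequences).
    """
    # Within-class scatter Sw = cov0 + cov1, written out entry by entry.
    a = cov0[0][0] + cov1[0][0]
    b = cov0[0][1] + cov1[0][1]
    c = cov0[1][0] + cov1[1][0]
    d = cov0[1][1] + cov1[1][1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("within-class scatter is singular")
    dx, dy = mean1[0] - mean0[0], mean1[1] - mean0[1]
    # w = Sw^{-1} (mean1 - mean0), using the closed-form 2x2 inverse.
    return ((d * dx - b * dy) / det, (a * dy - c * dx) / det)


# Usage: equal diagonal covariances, so each coordinate is scaled independently.
w = lda_direction_2d((0.0, 0.0), [[1.0, 0.0], [0.0, 4.0]],
                     (2.0, 4.0), [[1.0, 0.0], [0.0, 4.0]])
```

Projecting a sample x onto the line is then the dot product of w and x, which reduces the problem to the 1D thresholding handled elsewhere in this package.
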

train_Shifter(scd, sc, criterion, param1)

Train a Shifter to succeed a ScoringClassifier with a different goal.
Parameters:
  • scd (ScoringCDataset) - a dataset of points and their current scores, obtained from
  • sc (ScoringClassifier) - the current ScoringClassifier
  • criterion (integer from 0 to 3) -
    0: minimize classification error with prior probabilities
    1: minimize classification error without prior probabilities
    2: minimize FAR while constraining FRR
    3: minimize FRR while constraining FAR
  • param1 (double) -
    a parameter representing
    lambda if criterion < 2
    maxFRR if criterion == 2
    maxFAR if criterion == 3
Returns:
shifter : Shifter
a Shifter succeeding the current ScoringClassifier 'sc'
shifter.err : double
resulting function value after thresholding
scd2 : ScoringCDataset
a new ScoringCDataset shifted (subtracted) from scd by shifter.thresh

sort_1d(cd)

Take a classification dataset with ishape=() and sort its samples in ascending order.
Parameters:
  • cd (CDataset) - a dataset of J classes and ishape ()
Returns:
sorted_id : array(shape=(cd.N,2),'int')
each tuple (j,index) represents an input value, which is identified by its class 'j' and its index 'index' in the class
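
The shape of the returned index can be sketched as follows. This is a minimal illustration rather than the library's code: the function name and the nested-list input are assumptions standing in for a CDataset with ishape=(), and the result is a plain list of tuples instead of an integer array.

```python
def sort_1d_sketch(classes):
    """Return (j, index) pairs ordered by ascending sample value.

    `classes` is a hypothetical stand-in for a CDataset with ishape=():
    classes[j][index] is the scalar value of sample `index` in class `j`.
    """
    pairs = [(j, i) for j, values in enumerate(classes) for i in range(len(values))]
    # sorted() is stable, so equal values keep their (class, index) order.
    return sorted(pairs, key=lambda p: classes[p[0]][p[1]])


# Usage: two classes with two and three scalar samples.
classes = [[0.9, 0.1], [0.5, 1.2, 0.3]]
sorted_id = sort_1d_sketch(classes)
# sorted_id is [(0, 1), (1, 2), (1, 0), (0, 0), (1, 1)]
```
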

histogram_1d(wcd, nbins, minValue=None, maxValue=None)

Compute a histogram from a (weighted) classification dataset with ishape=()
Parameters:
  • wcd (WeightedCDataset) - a (weighted) dataset of J classes and ishape ()
  • nbins (integer) - the number of bins
  • minValue (double) - minimum value of the view, default is the smallest value in the dataset
  • maxValue (double) - maximum value of the view, default is the largest value in the dataset
Returns:
hist : array(shape=(J,nbins), dtype='double')
J histograms of J classes
bin_interval : array(shape=(nbins,2),dtype='double')
array of nbins bin intervals
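
A per-class weighted histogram with the two return values described above can be sketched in plain Python. The function name and the nested-list inputs are assumptions for illustration. Two behaviours in particular are guesses rather than documented facts: samples outside [minValue, maxValue] are ignored, and a sample equal to the maximum is counted in the last bin.

```python
def histogram_1d_sketch(values, weights, nbins, min_value=None, max_value=None):
    """Weighted histogram per class over equally wide bins.

    Hypothetical inputs: values[j][i] is scalar sample i of class j and
    weights[j][i] is its weight.
    """
    flat = [v for cls in values for v in cls]
    lo = min(flat) if min_value is None else min_value
    hi = max(flat) if max_value is None else max_value
    if hi <= lo:
        raise ValueError("the view [min_value, max_value] must have positive width")
    width = (hi - lo) / nbins
    bin_interval = [(lo + b * width, lo + (b + 1) * width) for b in range(nbins)]
    hist = [[0.0] * nbins for _ in values]
    for j, cls in enumerate(values):
        for v, w in zip(cls, weights[j]):
            if v < lo or v > hi:
                continue  # assumption: samples outside the view are dropped
            # Clamp so that v == hi falls into the last bin.
            b = min(int((v - lo) / width), nbins - 1)
            hist[j][b] += w
    return hist, bin_interval


# Usage: two classes, two bins over the data range [0, 1].
hist, bin_interval = histogram_1d_sketch(
    [[0.0, 0.2, 0.9], [0.5, 1.0]],
    [[1.0, 1.0, 2.0], [0.5, 0.5]],
    nbins=2,
)
```
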

thresh_1d(criterion, param1, wcd, sort_id=None)

Solve for a threshold-based 1D binary classifier.

The function solves the following problem: given two sets of samples from two classes, a positive one and a negative one, a threshold-based classifier classifies a value x into the positive or the negative class: sign(x - theta). The optimal theta is chosen based on one of the following criteria:
 - Minimize the classification error: lambda * p(pos)*FRR + p(neg)*FAR
 - Minimize the error without prior: lambda * FRR + FAR
 - Minimize FAR with constraint FRR <= maxFRR
 - Minimize FRR with constraint FAR <= maxFAR

Parameters:
  • criterion (integer from 0 to 3) -
    0: minimize classification error with prior probabilities
    1: minimize classification error without prior probabilities
    2: minimize FAR while constraining FRR
    3: minimize FRR while constraining FAR
  • param1 (double) -
    a parameter representing
    lambda if criterion < 2
    maxFRR if criterion == 2
    maxFAR if criterion == 3
  • wcd (WeightedCDataset(J=2,ishape=()) (in cs.ml.cla package)) - a dataset of sample values of the two classes
  • sort_id (array) - the result of calling sort_1d(wcd), if sort_id is None, sort_1d(wcd) is called
Returns:
result : array(shape=(2,),dtype='d')
an argout array representing
 - result[0]: the threshold
 - result[1]: the optimized function value at that threshold
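
The four criteria can be made concrete with a brute-force sketch that tries every midpoint between consecutive sample values. It is an illustration under assumptions, not the pycv algorithm: the function name and list inputs are hypothetical, a value exactly equal to the threshold is treated as negative, the priors p(pos) and p(neg) are taken to be the classes' shares of the total weight, and the search is quadratic in the number of samples, whereas a sweep over presorted samples would avoid that cost.

```python
import math


def thresh_1d_sketch(criterion, param1, neg, pos, w_neg=None, w_pos=None):
    """Exhaustive search for the threshold theta of sign(x - theta).

    Hypothetical inputs: scalar samples of the negative and positive class
    with optional weights. Returns (theta, value).
    """
    w_neg = [1.0] * len(neg) if w_neg is None else w_neg
    w_pos = [1.0] * len(pos) if w_pos is None else w_pos
    tot_neg, tot_pos = sum(w_neg), sum(w_pos)
    p_pos = tot_pos / (tot_pos + tot_neg)  # assumption: priors from weights
    p_neg = 1.0 - p_pos

    # Candidate thresholds: below all samples, between neighbours, above all.
    xs = sorted(set(neg) | set(pos))
    candidates = [-math.inf] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [math.inf]

    best = None
    for theta in candidates:
        # FRR: weighted share of positives rejected (x <= theta).
        frr = sum(w for x, w in zip(pos, w_pos) if x <= theta) / tot_pos
        # FAR: weighted share of negatives accepted (x > theta).
        far = sum(w for x, w in zip(neg, w_neg) if x > theta) / tot_neg
        if criterion == 0:
            value, feasible = param1 * p_pos * frr + p_neg * far, True
        elif criterion == 1:
            value, feasible = param1 * frr + far, True
        elif criterion == 2:
            value, feasible = far, frr <= param1
        elif criterion == 3:
            value, feasible = frr, far <= param1
        else:
            raise ValueError("criterion must be an integer from 0 to 3")
        if feasible and (best is None or value < best[1]):
            best = (theta, value)
    return best


# Usage: criterion 1 with lambda = 1, then criterion 2 with maxFRR = 0.
neg, pos = [0.1, 0.4, 0.35], [0.8, 0.6, 0.3]
theta1, value1 = thresh_1d_sketch(1, 1.0, neg, pos)
theta2, value2 = thresh_1d_sketch(2, 0.0, neg, pos)
```

In this example the unconstrained criterion places the threshold between 0.4 and 0.6, accepting one false reject, while the constraint FRR <= 0 forces the threshold below the smallest positive sample.
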

thresh_normal_1d(criterion, param1, stats)

Solve the class-conditional Gaussian-assumed classification with thresholding and goal.

This function solves the following problem:
Given two normally distributed classes, a positive one and a negative one,
a threshold-based classifier classifies a value x into a positive or a negative
class: sign(x - \theta). The optimal \theta is chosen based on different criteria:
 - Minimize the classification error: \lambda * p(pos)*FRR + p(neg)*FAR
 - Minimize the error without prior: \lambda * FRR + FAR
 - Minimize FAR with constraint FRR <= maxFRR
 - Minimize FRR with constraint FAR <= maxFAR

:Parameters:
    criterion : integer from 0 to 3
        0: minimize classification error with prior probabilities
        1: minimize classification error without prior probabilities
        2: minimize FAR while constraining FRR
        3: minimize FRR while constraining FAR
    param1 : double
        a parameter representing
            \lambda if criterion < 2
            maxFRR if criterion == 2
            maxFAR if criterion == 3
    stats : Stats2e(J=2,d=1)
        REQUIREMENT: mean of class 0 <= mean of class 1

:Returns:
    result : array(shape=(2,),dtype='d')
        an argout array representing
        - result[0]: the threshold
        - result[1]: the optimized function value at that threshold
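
For the two constrained criteria, the Gaussian assumption gives the threshold in closed form: FRR grows and FAR shrinks as the threshold rises, so the optimum sits exactly where the constraint becomes tight. The sketch below shows this with the standard library's `statistics.NormalDist`. It is not the pycv implementation: the function name is hypothetical, the two `NormalDist` arguments stand in for the Stats2e input, class 0 is assumed to be the negative class, and criteria 0 and 1 are left out.

```python
from statistics import NormalDist


def thresh_normal_1d_sketch(criterion, param1, neg, pos):
    """Closed-form threshold for criteria 2 and 3 under Gaussian classes.

    Hypothetical inputs: `neg` and `pos` are NormalDist objects describing
    the negative and positive class. Returns (theta, value).
    """
    if neg.mean > pos.mean:
        raise ValueError("expected mean of negative class <= mean of positive class")
    # Note: inv_cdf requires a probability strictly between 0 and 1.
    if criterion == 2:
        # FRR(theta) = P(pos <= theta); pick the largest theta with FRR <= maxFRR.
        theta = pos.inv_cdf(param1)
        return theta, 1.0 - neg.cdf(theta)  # value is the resulting FAR
    if criterion == 3:
        # FAR(theta) = P(neg > theta); pick the smallest theta with FAR <= maxFAR.
        theta = neg.inv_cdf(1.0 - param1)
        return theta, pos.cdf(theta)  # value is the resulting FRR
    raise NotImplementedError("only criteria 2 and 3 are sketched here")


# Usage: unit-variance classes centred at 0 and 2, with a 5% constraint.
neg, pos = NormalDist(0.0, 1.0), NormalDist(2.0, 1.0)
theta_far, frr = thresh_normal_1d_sketch(3, 0.05, neg, pos)
theta_frr, far = thresh_normal_1d_sketch(2, 0.05, neg, pos)
```
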