Package pycv :: Package cs :: Package ml :: Package cla :: Package boost :: Module adaboost

Module adaboost

Functions
 
convert_scoring2weighted(scd)
Convert a binary ScoringCDataset into a WeightedCDataset.
 
_check_stop_2(scores, maxFAR, maxFRR)
 
_check_stop_3(scores, maxFAR, maxFRR)
 
train_OfflineDBC(scd, trainfunc, M, criterion=0, param1=1.0, skewness_balancing=1, preceeding_sc=None, extra_output=False)
Train an offline DiscreteBoostedClassifier...
 
train_DBC(classification_dataset, trainfunc, M, k=1.0, balancing=2, can_learn=True, polarity_balancing=1, previous=None)
Train a DiscreteBoostedClassifier...
 
train_AdaBoost(classification_dataset, trainfunc, M, can_learn=True, polarity_balancing=1)
Train a DiscreteBoostedClassifier using AdaBoost (Friedman et al.'s Discrete AdaBoost). Warning: this function is now obsolete; use train_DBC() instead.
 
train_VJ(classification_dataset, trainfunc, M, k, evenly=True, can_learn=True, polarity_balancing=1)
Train a DiscreteBoostedClassifier using Viola and Jones' asymmetric boost (NIPS'02). Warning: this function is now obsolete; use train_DBC() instead.
 
train_PC(classification_dataset, trainfunc, M, k, can_learn=True, polarity_balancing=1)
Train a DiscreteBoostedClassifier using our (Pham and Cham's) asymmetric boost (CVPR'07). Warning: this function is now obsolete; use train_DBC() instead.
 
train_PC_NIPS(classification_dataset, trainfunc, M, k)
Train a DiscreteBoostedClassifier using our (Pham and Cham's) asymmetric boost (NIPS'07 -- never submitted).

Function Details

convert_scoring2weighted(scd)


Convert a binary ScoringCDataset into a WeightedCDataset.

The formula is weight(x,y) = exp(-y * score(x,y)), where y is in {-1, +1}.

Parameters:
  • scd (ScoringCDataset) - scd must be a binary dataset
Returns:
wcd : WeightedCDataset
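
As a rough illustration of the weight formula above, the sketch below applies it to plain Python lists. The function name `scoring_to_weights` and the list-based inputs are assumptions made for this example; the actual ScoringCDataset and WeightedCDataset interfaces are not shown on this page.

```python
import math


def scoring_to_weights(scores, labels):
    """Return weight = exp(-y * score) for each (score, y) pair.

    Hypothetical helper: `scores` holds one real-valued score per sample
    and `labels` holds the matching class label y in {-1, +1}.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if any(y not in (-1, 1) for y in labels):
        raise ValueError("labels must be -1 or +1 (binary dataset)")
    return [math.exp(-y * s) for s, y in zip(scores, labels)]


# A confidently correct positive gets a small weight,
# a misclassified positive gets a weight above 1.
weights = scoring_to_weights([2.0, -0.5, 0.0], [1, 1, -1])
```

Samples whose score agrees with their label receive weights below 1, so a subsequent weak learner concentrates on the samples the current scores get wrong.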

train_OfflineDBC(scd, trainfunc, M, criterion=0, param1=1.0, skewness_balancing=1, preceeding_sc=None, extra_output=False)

Train an offline DiscreteBoostedClassifier

Criteria:
    criterion=0: 
        \arg \min_f (\lambda P(1) FRR(f) + P(0) FAR(f)) / (\lambda P(1) + P(0))
    criterion=1: 
        \arg \min_f (\lambda FRR(f) + FAR(f)) / (\lambda + 1)

:Parameters:
    scd : ScoringCDataset
        a binary ScoringCDataset
    trainfunc: a function that takes a WeightedCDataset as input
        and returns a BinaryClassifier as a weak classifier
    M : int
        the maximum number of weak classifiers
    criterion : int
        which criterion to minimize (0 or 1, see Criteria)
    param1 : double
        \lambda for the criterion
    skewness_balancing : int
        type of balancing among weak classifiers
            0 = no balancing at all, the original AdaBoost's method
            1 = asymmetric weight balancing, Viola-Jones (NIPS'02)
            2 = skewness balancing, Pham-Cham (CVPR'07) 
                (N/A if criterion=1)
    preceeding_sc : ScoringClassifier
        a classifier to precede this newly trained one;
        default is None
    extra_output : boolean
        if True then produce extra useful information
        
:Returns:
    dbc : DiscreteBoostedClassifier
        the newly trained DiscreteBoostedClassifier
    err : double (extra_output)
        training error, or training criterion function value
    scd2 : ScoringCDataset (extra_output)
        a new ScoringCDataset with scores augmented by this dbc
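
To make the two criteria concrete, the sketch below evaluates the quantity inside each \arg \min for a given classifier. It assumes P(1) and P(0) are the prior probabilities of the positive and negative class; the function name `criterion_value` and its argument names are hypothetical and are not part of this module.

```python
def criterion_value(frr, far, lam, criterion=0, p_pos=0.5, p_neg=0.5):
    """Value of the training criterion for one classifier.

    frr, far : false rejection / false acceptance rates of the classifier
    lam      : the \\lambda weight (param1 in train_OfflineDBC)
    p_pos    : assumed meaning of P(1), the prior of the positive class
    p_neg    : assumed meaning of P(0), the prior of the negative class
    """
    if criterion == 0:
        # Prior-weighted asymmetric error.
        return (lam * p_pos * frr + p_neg * far) / (lam * p_pos + p_neg)
    if criterion == 1:
        # Prior-free asymmetric error.
        return (lam * frr + far) / (lam + 1.0)
    raise ValueError("criterion must be 0 or 1")


# With equal priors the two criteria coincide.
v0 = criterion_value(0.1, 0.3, 3.0, criterion=0)
v1 = criterion_value(0.1, 0.3, 3.0, criterion=1)
```

Under these assumptions the two criteria differ only when the class priors are unequal, in which case criterion=0 discounts errors on the rarer class.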

train_DBC(classification_dataset, trainfunc, M, k=1.0, balancing=2, can_learn=True, polarity_balancing=1, previous=None)

Train a DiscreteBoostedClassifier

Input:
    classification_dataset: a WeightedCDataset of 2 classes
    trainfunc: a function that takes a WeightedCDataset as input
        and returns a BinaryClassifier as a weak classifier
    M: the maximum number of weak classifiers
    k: false negatives penalized k times more than false positives
    balancing: type of balancing among weak classifiers
        0 = no balancing at all, this is the original AdaBoost's method
        1 = asymmetric weight balancing, Viola-Jones (NIPS'02)
        2 = skewness balancing, Pham-Cham (CVPR'07)
    can_learn : boolean
        whether the resulting DiscreteBoostedClassifier can learn 
            incrementally
    polarity_balancing: use polarity balancing for online-learning?
        0 = no polarity balancing, same as Oza-Russell (ICSMC'05)
        1 = polarity balancing, Pham-Cham (CVPR'07)
    previous: previous additive classifier, default is None
Output:
    a DiscreteBoostedClassifier

train_AdaBoost(classification_dataset, trainfunc, M, can_learn=True, polarity_balancing=1)

Train a DiscreteBoostedClassifier using AdaBoost (Friedman et al.'s Discrete AdaBoost)

Warning:
    This function is now obsolete; use train_DBC() instead.

Input:
    classification_dataset: a WeightedCDataset of 2 classes
    trainfunc: a function that takes a WeightedCDataset as input
        and returns a BinaryClassifier
    M: the maximum number of stages
    can_learn : boolean
        whether the resulting DiscreteBoostedClassifier can learn 
            incrementally
    polarity_balancing: use polarity balancing for online-learning?
        0 = no polarity balancing, same as Oza-Russell (ICSMC'05)
        1 = polarity balancing, Pham-Cham (CVPR'07)
Output:
    a DiscreteBoostedClassifier
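
For readers unfamiliar with the underlying algorithm, here is a minimal, self-contained sketch of Discrete AdaBoost on one-dimensional data with threshold stumps as weak classifiers. It is not the pycv implementation: the names `train_stump`, `discrete_adaboost` and `predict` are hypothetical, and the stump learner merely stands in for whatever `trainfunc` would return.

```python
import math


def train_stump(xs, ys, ws):
    """Weak learner: pick the threshold/polarity stump with least weighted error."""
    best = None
    for t in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, ws)
                      if (polarity if x >= t else -polarity) != y)
            if best is None or err < best[0]:
                best = (err, t, polarity)
    err, t, polarity = best
    return (lambda x: polarity if x >= t else -polarity), err


def discrete_adaboost(xs, ys, M):
    """Train at most M weighted stumps; labels ys must be in {-1, +1}."""
    n = len(xs)
    ws = [1.0 / n] * n                      # start from uniform weights
    ensemble = []
    for _ in range(M):
        h, err = train_stump(xs, ys, ws)
        err = min(max(err, 1e-10), 1.0 - 1e-10)   # keep the log finite
        alpha = 0.5 * math.log((1.0 - err) / err)  # voting coefficient
        ensemble.append((alpha, h))
        if err <= 1e-10:                    # perfect weak classifier: stop
            break
        # Increase the weight of misclassified samples, then renormalize.
        ws = [w * math.exp(-alpha * y * h(x)) for x, y, w in zip(xs, ys, ws)]
        total = sum(ws)
        ws = [w / total for w in ws]
    return ensemble


def predict(ensemble, x):
    """Sign of the weighted vote of all weak classifiers."""
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1


# No single stump separates this pattern, but three boosted stumps do.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, -1, -1, 1, 1]
ensemble = discrete_adaboost(xs, ys, M=3)
```

The parameter M plays the same role here as in the functions on this page: it caps the number of weak classifiers in the final vote.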

train_VJ(classification_dataset, trainfunc, M, k, evenly=True, can_learn=True, polarity_balancing=1)

Train a DiscreteBoostedClassifier using Viola and Jones'
    asymmetric boost (NIPS'02)

Warning:
    This function is now obsolete; use train_DBC() instead.

Input:
    classification_dataset: a WeightedCDataset of 2 classes
    trainfunc: a function that takes a WeightedCDataset as input
        and returns a BinaryClassifier
    M: the maximum number of stages
    k: false negatives penalized k times more than false positives
    evenly: distribute lambda evenly among the weak classifiers
    can_learn : boolean
        whether the resulting DiscreteBoostedClassifier can learn 
            incrementally
    polarity_balancing: use polarity balancing for online-learning?
        0 = no polarity balancing, same as Oza-Russell (ICSMC'05)
        1 = polarity balancing, Pham-Cham (CVPR'07)
Output:
    a DiscreteBoostedClassifier
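
The effect of k and evenly can be sketched as follows, based on the published description of Viola and Jones' asymmetric AdaBoost, in which sample weights are multiplied by exp(y log sqrt(k)) either once up front or in M equal shares, one per round. Whether pycv applies the factor in exactly this way is an assumption; the function name `vj_asymmetric_reweight` is hypothetical.

```python
import math


def vj_asymmetric_reweight(weights, labels, k, M, evenly=True):
    """Apply one share of the asymmetric factor and renormalize.

    With evenly=True this is meant to be called once per boosting round,
    each call contributing 1/M of the total asymmetry. With evenly=False
    it is meant to be called a single time before the first round.
    """
    share = 1.0 / M if evenly else 1.0
    log_sqrt_k = math.log(math.sqrt(k))
    scaled = [w * math.exp(share * y * log_sqrt_k)
              for w, y in zip(weights, labels)]
    total = sum(scaled)
    return [w / total for w in scaled]


# After M evenly distributed rounds, positives outweigh negatives by a factor k.
labels = [1, 1, -1, -1]
weights = [0.25] * 4
for _ in range(5):
    weights = vj_asymmetric_reweight(weights, labels, k=4.0, M=5)
```

Spreading the factor over all rounds keeps the asymmetry from being absorbed entirely by the first weak classifier, which is the motivation Viola and Jones give for the even distribution.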

train_PC(classification_dataset, trainfunc, M, k, can_learn=True, polarity_balancing=1)

Train a DiscreteBoostedClassifier using our (Pham and Cham's)
    asymmetric boost (CVPR'07)

Warning:
    This function is now obsolete; use train_DBC() instead.

Input:
    classification_dataset: a WeightedCDataset of 2 classes
    trainfunc: a function that takes a WeightedCDataset as input
        and returns a BinaryClassifier
    M: the maximum number of stages
    k: false negatives penalized k times more than false positives
    can_learn : boolean
        whether the resulting DiscreteBoostedClassifier can learn 
            incrementally
    polarity_balancing: use polarity balancing for online-learning?
        0 = no polarity balancing, same as Oza-Russell (ICSMC'05)
        1 = polarity balancing, Pham-Cham (CVPR'07)
Output:
    a DiscreteBoostedClassifier
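
Since train_AdaBoost(), train_VJ() and train_PC() are all marked obsolete in favour of train_DBC(), the sketch below shows one plausible way to translate a legacy call into train_DBC() keyword arguments. The mapping is inferred only from the `balancing` descriptions on this page (0 = original AdaBoost, 1 = Viola-Jones, 2 = Pham-Cham) and has not been checked against the source; the helper name `dbc_kwargs_for` is hypothetical, and train_VJ's `evenly` flag has no documented counterpart, so it is not mapped.

```python
def dbc_kwargs_for(legacy_name, k=1.0, can_learn=True, polarity_balancing=1):
    """Build keyword arguments for train_DBC() from a legacy trainer name.

    Assumed correspondence (not confirmed by the source code):
        train_AdaBoost -> balancing=0, k=1.0
        train_VJ       -> balancing=1
        train_PC       -> balancing=2
    """
    balancing = {"train_AdaBoost": 0, "train_VJ": 1, "train_PC": 2}
    if legacy_name not in balancing:
        raise ValueError("unknown legacy trainer: %r" % (legacy_name,))
    if legacy_name == "train_AdaBoost":
        k = 1.0  # train_AdaBoost takes no k, so use the symmetric default
    return {
        "k": k,
        "balancing": balancing[legacy_name],
        "can_learn": can_learn,
        "polarity_balancing": polarity_balancing,
    }


# Intended use (requires pycv, shown as a comment only):
#     dbc = train_DBC(dataset, trainfunc, M, **dbc_kwargs_for("train_VJ", k=5.0))
kwargs = dbc_kwargs_for("train_VJ", k=5.0)
```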

train_PC_NIPS(classification_dataset, trainfunc, M, k)

Train a DiscreteBoostedClassifier using our (Pham and Cham's)
    asymmetric boost (NIPS'07 -- never submitted)

Goal:
    Assume D^+ is the distribution of x given x positive.
    Assume D^- is the distribution of x given x negative.
    We wish to find F_M(x) to minimize:
        k J^-(F_M) + J^+(F_M) (1)
    where
        J^+(F_M) = E_{D^+} [(F_M(x)-1)^2]
        J^-(F_M) = E_{D^-} [(F_M(x)+1)^2]

    Let:
        pi^+_m = 1 - E_{D^+} [F_m(x)]
        pi^-_m = 1 + E_{D^-} [F_m(x)]
        D^+_m a distribution such that p_{D^+_m}(x) = p_{D^+}(x) (1-F_m(x)) / pi^+_m
        D^-_m a distribution such that p_{D^-_m}(x) = p_{D^-}(x) (1+F_m(x)) / pi^-_m
        FRR_m(f) = E_{D^+_m}[f(x) == -1]
        FAR_m(f) = E_{D^-_m}[f(x) == +1]

    I proved that, for F_M = F_m + c f with f(x) in {-1,+1}:
        J^+(F_M) = J^+(F_m) + c^2 + 2 c pi^+_m (2FRR_m(f) - 1)
        J^-(F_M) = J^-(F_m) + c^2 + 2 c pi^-_m (2FAR_m(f) - 1)

    To minimize (1), we then need to:
        1) choose f minimizing: (weak classifier)
            epsilon(f) = k pi^-_m FAR_m(f) + pi^+_m FRR_m(f)
        2) choose c minimizing: (voting coefficient)
            (k J^- + J^+)(F_m) + (k+1) c^2 + 4c epsilon(f) - 2c(k pi^-_m + pi^+_m)
            which means:
                c* = \frac{k pi^-_m + pi^+_m - 2 epsilon(f)}{k+1}

    I also proved that: (not quite)
        |F_M(x)| <= \sum_{m=1}^M |c_m| <= 1 for all M
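
The derivation in the Goal section can be checked numerically on a small sample. The sketch below is not the pycv code: it replaces the expectations over D^+ and D^- with sample means, takes pi^+_m = 1 - E_{D^+}[F_m(x)] and pi^-_m = 1 + E_{D^-}[F_m(x)] (the normalizers that make D^+_m and D^-_m proper distributions), uses epsilon(f) = k pi^-_m FAR_m(f) + pi^+_m FRR_m(f), and assumes F_M = F_m + c f with f(x) in {-1,+1}. The function names and the toy data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def asymmetric_boost_step(F_pos, f_pos, F_neg, f_neg, k):
    """Return (c_star, epsilon) for one boosting step.

    F_pos, F_neg : current scores F_m(x) on positive / negative samples
    f_pos, f_neg : weak classifier outputs f(x) in {-1, +1} on those samples
    """
    pi_pos = 1.0 - mean(F_pos)
    pi_neg = 1.0 + mean(F_neg)
    # FRR_m: reweighted mass of positives that f rejects.
    frr = sum(1.0 - F for F, f in zip(F_pos, f_pos) if f == -1) / (len(F_pos) * pi_pos)
    # FAR_m: reweighted mass of negatives that f accepts.
    far = sum(1.0 + F for F, f in zip(F_neg, f_neg) if f == 1) / (len(F_neg) * pi_neg)
    epsilon = k * pi_neg * far + pi_pos * frr
    c_star = (k * pi_neg + pi_pos - 2.0 * epsilon) / (k + 1.0)
    return c_star, epsilon


def objective(F_pos, f_pos, F_neg, f_neg, k, c):
    """k J^-(F_m + c f) + J^+(F_m + c f), estimated by sample means."""
    j_pos = mean([(F + c * f - 1.0) ** 2 for F, f in zip(F_pos, f_pos)])
    j_neg = mean([(F + c * f + 1.0) ** 2 for F, f in zip(F_neg, f_neg)])
    return k * j_neg + j_pos


# Toy data: all scores lie strictly inside (-1, 1), so both pi values are positive.
F_pos, f_pos = [0.2, -0.1, 0.4], [1, -1, 1]
F_neg, f_neg = [-0.3, 0.1, -0.2], [-1, 1, -1]
c_star, epsilon = asymmetric_boost_step(F_pos, f_pos, F_neg, f_neg, k=2.0)
```

Because the objective is quadratic in c with leading coefficient k+1, c* is its unique minimizer, so evaluating `objective` slightly to either side of `c_star` should never give a smaller value.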


Input:
    classification_dataset: a WeightedCDataset of 2 classes
    trainfunc: a function that takes a WeightedCDataset as input
        and returns a BinaryClassifier
    M: the maximum number of stages
    k: false negative rate penalized k times more than false positive rate
Output:
    a DiscreteBoostedClassifier