A distribution that defines the probabilities over a number of discrete (integer-valued) class labels.
#include <discreteDistribution.hpp>
|
| discreteDistribution () |
| Default constructor. More...
|
|
| discreteDistribution (const int num_classes) |
| Constructor. More...
|
|
void | initialise (const int num_classes) |
| Initialise with a certain number of classes and reset probabilities to zero. More...
|
|
void | reset () |
| Reset function - return probabilities to zero. More...
|
|
float | pdf (const int x) const |
| Returns the probability of a particular label. More...
|
|
void | normalise () |
| Normalise the distribution to ensure it is valid. More...
|
|
void | printOut (std::ofstream &stream) const |
| Prints the defining parameters of the distribution to an output filestream. More...
|
|
void | readIn (std::ifstream &stream) |
| Reads the defining parameters of the distribution from a filestream. More...
|
|
void | raiseDistributionTemperature (const double T) |
| Smooth the distribution using the softmax function. More...
|
|
template<class TLabelIterator , class TIdIterator > |
void | fit (TLabelIterator first_label, TLabelIterator last_label, TIdIterator) |
| Fit the distribution to a set of labels. More...
|
|
template<class TId > |
float | pdf (const int x, const TId) const |
| Returns the probability of a particular label. More...
|
|
template<class TId > |
void | combineWith (const discreteDistribution &dist, const TId) |
| Combine this distribution with a second by summing the probability values, without normalisation. More...
|
|
|
int | n_classes |
| The number of discrete classes.
|
|
std::vector< float > | prob |
| Vector containing the probabilities of each class.
|
|
|
std::ofstream & | operator<< (std::ofstream &stream, const discreteDistribution &dist) |
| Allows the distribution to be written to a file via the streaming operator '<<'.
|
|
std::ifstream & | operator>> (std::ifstream &stream, discreteDistribution &dist) |
| Allows the distribution to be read from a file via the streaming operator '>>'.
|
|
A distribution that defines the probabilities over a number of discrete (integer-valued) class labels.
The discreteDistribution has the characteristics of both a node distribution and an output distribution, and is used as both the node distribution and the output distribution for the classifier.
canopy::discreteDistribution::discreteDistribution |
( |
| ) |
|
|
inline |
Default constructor.
Initialises with 0 classes
canopy::discreteDistribution::discreteDistribution |
( |
const int |
num_classes | ) |
|
|
inline |
Constructor.
Initialises with a given number of classes
- Parameters
-
num_classes | The number of discrete classes |
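template<class TId >
void canopy::discreteDistribution::combineWith |
( |
const discreteDistribution & |
dist, |
|
|
const TId |
|
|
) |
| |
Combine this distribution with a second by summing the probability values, without normalisation.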
This method is used by the randomForestBase methods to aggregate the distributions in several leaf nodes into one output distribution.
- Template Parameters
-
TId | The type of the IDs of the data points. The ID is unused but required for compatibility with randomForestBase. |
- Parameters
-
dist | The distribution that this distribution should be combined with. |
- | The second parameter is unused but required for compatibility with randomForestBase. |
template<class TLabelIterator , class TIdIterator >
void canopy::discreteDistribution::fit |
( |
TLabelIterator |
first_label, |
|
|
TLabelIterator |
last_label, |
|
|
TIdIterator |
|
|
) |
| |
Fit the distribution to a set of labels.
Fits the discrete distribution to the set of labels between first_label and last_label. Expects the labels to take values between 0 and N-1 inclusive, where N is the number of classes that the distribution has been initialised with. There are no checks to ensure this, so out-of-range labels must be filtered out by the caller.
- Template Parameters
-
TLabelIterator | The type of the iterator used to access the labels of the training data. Must be a forward iterator that dereferences to an integral type. |
TIdIterator | The type of the iterator used to access the IDs of the data points. The ID is unused but required for compatibility with randomForestBase. |
- Parameters
-
first_label | Iterator to the first label |
last_label | Iterator to the last label |
- | The third parameter is unused but required for compatibility with randomForestBase |
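As a rough illustration of what fitting to a set of labels amounts to, the sketch below counts how often each label occurs and divides by the total. It is a standalone approximation and not canopy's implementation: `fitLabels` and its `num_classes` argument are hypothetical names, the idea that fitting yields normalised relative frequencies is an assumption, `last_label` is treated as a past-the-end iterator in the usual STL style (also an assumption), and the range assertion is an addition that the real `fit()` does not perform.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for discreteDistribution::fit(); not the library code.
// Assumes that fitting means computing the relative frequency of each label.
template <class TLabelIterator>
std::vector<float> fitLabels(TLabelIterator first_label,
                             TLabelIterator last_label,
                             const int num_classes)
{
    std::vector<float> prob(num_classes, 0.0f);
    std::size_t count = 0;

    for (TLabelIterator it = first_label; it != last_label; ++it)
    {
        // Added safety check: labels must lie in [0, num_classes - 1].
        assert(*it >= 0 && *it < num_classes);
        prob[*it] += 1.0f;
        ++count;
    }

    // Turn the counts into relative frequencies (skipped for an empty range).
    if (count > 0)
    {
        for (float& p : prob)
            p /= static_cast<float>(count);
    }
    return prob;
}
```

Under these assumptions, the labels `{0, 1, 1, 2}` with three classes would give probabilities 0.25, 0.5 and 0.25.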
void canopy::discreteDistribution::initialise |
( |
const int |
num_classes | ) |
|
|
inline |
Initialise with a certain number of classes and reset probabilities to zero.
- Parameters
-
num_classes | The number of discrete classes |
void canopy::discreteDistribution::normalise |
( |
| ) |
|
|
inline |
Normalise the distribution to ensure it is valid.
This may be used after several combineWith() operations to ensure that the resulting distribution represents a valid probability distribution.
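The combine-then-normalise pattern can be sketched on plain vectors as follows. This is a minimal illustration rather than the library's code: `combine` and `normaliseInPlace` are hypothetical free functions standing in for `combineWith()` and `normalise()`, the size assertion is an addition, and leaving an all-zero vector untouched is an assumption about how the degenerate case might be handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for combineWith(): sum the class probabilities
// of `other` into `acc` without normalising.
void combine(std::vector<float>& acc, const std::vector<float>& other)
{
    // Added check: both distributions must cover the same classes.
    assert(acc.size() == other.size());
    for (std::size_t c = 0; c < acc.size(); ++c)
        acc[c] += other[c];
}

// Hypothetical stand-in for normalise(): rescale so the values sum to one.
void normaliseInPlace(std::vector<float>& prob)
{
    float sum = 0.0f;
    for (const float p : prob)
        sum += p;

    // Assumption: an all-zero vector is left unchanged to avoid dividing by zero.
    if (sum <= 0.0f)
        return;

    for (float& p : prob)
        p /= sum;
}
```

Starting from a blank three-class vector, combining `{0.5, 0.5, 0}` and `{1, 0, 0}` gives the unnormalised sums `{1.5, 0.5, 0}`, which normalise to `{0.75, 0.25, 0}`.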
float canopy::discreteDistribution::pdf |
( |
const int |
x | ) |
const |
|
inline |
Returns the probability of a particular label.
This overloaded version does not require the ID and is intended for use by user code.
- Parameters
-
x | The label for which the probability is sought |
template<class TId >
float canopy::discreteDistribution::pdf |
( |
const int |
x, |
|
|
const TId |
|
|
) |
| const |
Returns the probability of a particular label.
This is the version used by the randomForestBase methods.
- Template Parameters
-
TId | The type of the IDs of the data points. The ID is unused but required for compatibility with randomForestBase. |
- Parameters
-
x | The label for which the probability is sought |
- | The second parameter is unused but required for compatibility with randomForestBase. |
void canopy::discreteDistribution::printOut |
( |
std::ofstream & |
stream | ) |
const |
|
inline |
Prints the defining parameters of the distribution to an output filestream.
- Parameters
-
stream | The stream to which the parameters (the probability values for each class) are printed |
void canopy::discreteDistribution::raiseDistributionTemperature |
( |
const double |
T | ) |
|
|
inline |
Smooth the distribution using the softmax function.
This alters the probability distribution by replacing the probability of class \( i \) according to
\[ p_i \leftarrow \frac{e^{p_i / T}}{\sum_{j=1}^{N} e^{p_j / T}} \]
where \( N \) is the number of classes and \( T \) is a temperature parameter. This has the effect of regularising the distribution, reducing the certainty.
- Parameters
-
T | The temperature parameter. The higher the temperature, the more the certainty is reduced. T must be a strictly positive number, otherwise this function will have no effect. |
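The update rule above can be sketched on a plain vector of probabilities. This is an illustrative approximation only: `raiseTemperature` is a hypothetical free function, not the member function itself, and the use of double precision for the intermediate exponentials is a choice made for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for raiseDistributionTemperature(); not the library code.
// Replaces each p_i with exp(p_i / T) / sum_j exp(p_j / T).
void raiseTemperature(std::vector<float>& prob, const double T)
{
    // As documented, a non-positive temperature leaves the distribution unchanged.
    if (T <= 0.0 || prob.empty())
        return;

    std::vector<double> e(prob.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < prob.size(); ++i)
    {
        e[i] = std::exp(prob[i] / T);
        sum += e[i];
    }

    // exp() is strictly positive, so the sum of a non-empty set is positive.
    assert(sum > 0.0);

    for (std::size_t i = 0; i < prob.size(); ++i)
        prob[i] = static_cast<float>(e[i] / sum);
}
```

For a fully certain two-class distribution `{1, 0}`, a temperature of 1 yields roughly `{0.731, 0.269}`, and a temperature of 10 moves it closer to uniform at roughly `{0.525, 0.475}`.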
void canopy::discreteDistribution::readIn |
( |
std::ifstream & |
stream | ) |
|
|
inline |
Reads the defining parameters of the distribution from a filestream.
- Parameters
-
stream | The stream from which the parameters (probability values for each class) are to be read |
void canopy::discreteDistribution::reset |
( |
| ) |
|
|
inline |
Reset function - return probabilities to zero.
When the class serves as an output distribution, call this to obtain a blank distribution before combining it with new node distributions.