nltk.probability.LidstoneProbDist
- class nltk.probability.LidstoneProbDist
Bases: ProbDistI
The Lidstone estimate for the probability distribution of the experiment used to generate a frequency distribution. The “Lidstone estimate” is parameterized by a real number gamma, which typically ranges from 0 to 1. The Lidstone estimate approximates the probability of a sample with count c from an experiment with N outcomes and B bins as (c+gamma)/(N+B*gamma). This is equivalent to adding gamma to the count for each bin and taking the maximum likelihood estimate of the resulting frequency distribution.
- SUM_TO_ONE = False
True if the probabilities of the samples in this probability distribution will always sum to one.
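The formula in the class description can be checked by hand. The following is a minimal sketch in plain Python, not NLTK's implementation; the helper name `lidstone_estimate` and the toy counts are assumptions made for illustration.

```python
def lidstone_estimate(count, total, bins, gamma):
    """Lidstone estimate (c + gamma) / (N + B * gamma) for one sample.

    Hypothetical helper for illustration; it is not part of NLTK.
    """
    return (count + gamma) / (total + bins * gamma)


# Toy experiment: 4 outcomes spread over 3 possible bins, one of them unseen.
counts = {"a": 3, "b": 1, "c": 0}
total = sum(counts.values())  # N = 4
bins = len(counts)            # B = 3
gamma = 0.5

probs = {s: lidstone_estimate(c, total, bins, gamma) for s, c in counts.items()}
for sample, p in probs.items():
    print(sample, round(p, 4))
```

Summing the estimate over all B bins gives (N+B*gamma)/(N+B*gamma), so the probabilities sum to one only when B counts every value the experiment can produce. The unseen sample also receives a nonzero probability, which is the point of the smoothing.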
- __init__(freqdist, gamma, bins=None)
Use the Lidstone estimate to create a probability distribution for the experiment used to generate freqdist.
- Parameters
freqdist (FreqDist) – The frequency distribution that the probability estimates should be based on.
gamma (float) – A real number used to parameterize the estimate. The Lidstone estimate is equivalent to adding gamma to the count for each bin, and taking the maximum likelihood estimate of the resulting frequency distribution.
bins (int) – The number of sample values that can be generated by the experiment that is described by the probability distribution. This value must be correctly set for the probabilities of the sample values to sum to one. If bins is not specified, it defaults to freqdist.B().
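The effect of the bins parameter can be seen in a short sketch. It assumes NLTK is installed; the sample string, the gamma value of 0.5, and the variable names are illustrative choices, not taken from the documentation.

```python
from nltk.probability import FreqDist, LidstoneProbDist

# Illustrative data: counting the characters gives a=3, b=1, so N=4 and B=2.
fd = FreqDist("aaab")

# Explicit bins: assume the experiment can produce 3 distinct values.
dist_explicit = LidstoneProbDist(fd, 0.5, bins=3)
p_a = dist_explicit.prob("a")       # (3 + 0.5) / (4 + 3 * 0.5)
p_unseen = dist_explicit.prob("z")  # (0 + 0.5) / (4 + 3 * 0.5)

# Default bins: falls back to fd.B(), the number of distinct observed samples.
dist_default = LidstoneProbDist(fd, 0.5)
p_a_default = dist_default.prob("a")  # (3 + 0.5) / (4 + 2 * 0.5)

print(p_a, p_unseen, p_a_default)
```

With the default, only observed samples are counted as bins, so the observed samples alone already sum to one while unseen samples still receive gamma/(N+B*gamma) each. Passing the true number of possible values avoids that overcount.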
- freqdist()
Return the frequency distribution that this probability distribution is based on.
- Return type
FreqDist
- prob(sample)
Return the probability for a given sample. Probabilities are always real numbers in the range [0, 1].
- Parameters
sample (any) – The sample whose probability should be returned.
- Return type
float
- max()
Return the sample with the greatest probability. If two or more samples have the same probability, return one of them; which sample is returned is undefined.
- Return type
any
- samples()
Return a list of all samples that have nonzero probabilities. Use prob to find the probability of each sample.
- Return type
list
- discount()
Return the ratio by which counts are discounted on average: c*/c
- Return type
float
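The accessor methods can be exercised together in a brief sketch. It assumes NLTK is installed and uses the same illustrative data as a toy example; the variable names are hypothetical, and the value of discount() is printed rather than asserted here.

```python
from nltk.probability import FreqDist, LidstoneProbDist

# Illustrative data: a=3, b=1.
fd = FreqDist("aaab")
dist = LidstoneProbDist(fd, 0.5, bins=3)

best = dist.max()               # sample with the greatest probability
seen = sorted(dist.samples())   # sorted here only to get a stable order
ratio = dist.discount()         # average discount ratio, a float
same_fd = dist.freqdist() is fd # the underlying FreqDist is returned as-is

print(best, seen, ratio, same_fd)
```

Because "a" has the highest count, it also has the highest smoothed probability, so max() returns it; ties would be broken in an unspecified way, as noted above.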