nltk.probability.KneserNeyProbDist
- class nltk.probability.KneserNeyProbDist
Bases: ProbDistI
Kneser-Ney estimate of a probability distribution. This is a version of back-off that estimates how likely an n-gram is, given that its (n-1)-gram was seen in training. It extends the ProbDistI interface and requires a trigram FreqDist instance to train on. A discount value other than the default of 0.75 can optionally be specified.
- __init__(freqdist, bins=None, discount=0.75)
- Parameters
freqdist (FreqDist) – The trigram frequency distribution upon which to base the estimation
bins (int or float) – Included for compatibility with nltk.tag.hmm
discount (float (preferred, but can be set to int)) – The discount applied when retrieving counts of trigrams
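A minimal sketch of constructing the estimator, assuming a toy token list invented for illustration. The trigram FreqDist is built here with nltk.util.trigrams; the variable names are placeholders, not part of the documented API.

```python
from nltk.probability import FreqDist, KneserNeyProbDist
from nltk.util import trigrams

# Toy corpus, invented for illustration only.
tokens = "the cat sat on the mat and the cat ate the fish".split()

# Count every trigram (a 3-tuple of consecutive tokens) in the corpus.
trigram_counts = FreqDist(trigrams(tokens))

# Estimator with the default discount of 0.75.
kn_default = KneserNeyProbDist(trigram_counts)

# Estimator over the same counts with a custom discount.
kn_custom = KneserNeyProbDist(trigram_counts, discount=0.5)
```

The bins argument is left at its default here, since the parameter list describes it only as a compatibility hook.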
- prob(trigram)
Return the probability for a given sample. Probabilities are always real numbers in the range [0, 1].
- Parameters
trigram (any) – The sample (a trigram) whose probability should be returned.
- Return type
float
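As an illustration of prob, the sketch below queries one trigram that occurs in a toy corpus and one whose words never occur together. The corpus and variable names are assumptions made up for this example.

```python
from nltk.probability import FreqDist, KneserNeyProbDist
from nltk.util import trigrams

# Toy corpus, invented for illustration only.
tokens = "the cat sat on the mat and the cat ate the fish".split()
kn = KneserNeyProbDist(FreqDist(trigrams(tokens)))

# A trigram that appears in the training counts.
p_seen = kn.prob(("the", "cat", "sat"))

# A trigram whose bigram prefix never appears in the training counts.
p_unseen = kn.prob(("fish", "ate", "cat"))

print(p_seen, p_unseen)
```

Samples are passed as 3-tuples because that is the form in which nltk.util.trigrams produces them.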
- discount()
Return the value by which counts are discounted. By default set to 0.75.
- Return type
float
- set_discount(discount)
Set the value by which counts are discounted to the value of discount.
- Parameters
discount (float (preferred, but int possible)) – the new value to discount counts by
- Return type
None
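A short sketch of reading and changing the discount on an existing instance, again assuming a toy corpus invented for illustration.

```python
from nltk.probability import FreqDist, KneserNeyProbDist
from nltk.util import trigrams

# Toy corpus, invented for illustration only.
tokens = "the cat sat on the mat and the cat ate the fish".split()
kn = KneserNeyProbDist(FreqDist(trigrams(tokens)))

# Read the discount set at construction time (the default here).
before = kn.discount()

# Replace it with a new value; set_discount returns None.
result = kn.set_discount(0.5)
after = kn.discount()

print(before, after)
```

Setting the discount before the first call to prob keeps every probability in a session based on the same value.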
- samples()
Return a list of all samples that have nonzero probabilities. Use prob to find the probability of each sample.
- Return type
list
- max()
Return the sample with the greatest probability. If two or more samples have the same probability, return one of them; which sample is returned is undefined.
- Return type
any
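The sketch below lists the samples of a small distribution and asks for its most probable one. The corpus is invented for illustration, and wrapping samples() in list() is a defensive assumption in case the installed version returns a view rather than a list.

```python
from nltk.probability import FreqDist, KneserNeyProbDist
from nltk.util import trigrams

# Toy corpus in which the trigram ("a", "b", "c") occurs twice.
tokens = "a b c a b c d".split()
kn = KneserNeyProbDist(FreqDist(trigrams(tokens)))

# All trigrams with nonzero probability.
seen = list(kn.samples())

# The sample with the greatest probability.
best = kn.max()

# Pair each sample with its probability for inspection.
for trigram in seen:
    print(trigram, kn.prob(trigram))
```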
- SUM_TO_ONE = True
True if the probabilities of the samples in this probability distribution will always sum to one.