nltk.translate.chrf

nltk.translate.chrf(reference, hypothesis, min_len=1, max_len=6, beta=3.0, ignore_whitespace=True)

Calculates the sentence-level CHRF (character n-gram F-score) described in:

Maja Popović. 2015. CHRF: Character n-gram F-score for Automatic MT Evaluation. In Proceedings of the Tenth Workshop on Statistical Machine Translation. https://www.statmt.org/wmt15/pdf/WMT49.pdf
Maja Popović. 2016. CHRF Deconstructed: β Parameters and n-gram Weights. In Proceedings of the First Conference on Machine Translation. https://www.statmt.org/wmt16/pdf/W16-2341.pdf

This implementation of CHRF only supports a single reference at the moment.

For details not reported in the papers, consult Maja Popović’s original implementation: https://github.com/m-popovic/chrF

The code should output results equivalent to running CHRF++ with the following options: -nw 0 -b 3
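
For context, the papers cited above define the score as an F-measure over character n-gram precision (chrP) and recall (chrR), each averaged over the n-gram orders. Assuming the -b option sets β, the options above correspond to β = 3, matching this function's default beta=3.0. In LaTeX notation:

```latex
\mathrm{chrF}_{\beta} = (1 + \beta^{2}) \,
  \frac{\mathrm{chrP} \cdot \mathrm{chrR}}
       {\beta^{2} \cdot \mathrm{chrP} + \mathrm{chrR}}
```

With β = 1, precision and recall contribute equally; larger values shift the weight toward recall.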

An example from the original BLEU paper (https://www.aclweb.org/anthology/P02-1040.pdf):

>>> from nltk.translate.chrf_score import sentence_chrf
>>> ref1 = str('It is a guide to action that ensures that the military '
...            'will forever heed Party commands').split()
>>> hyp1 = str('It is a guide to action which ensures that the military '
...            'always obeys the commands of the party').split()
>>> hyp2 = str('It is to insure the troops forever hearing the activity '
...            'guidebook that party direct').split()
>>> sentence_chrf(ref1, hyp1) 
0.6349...
>>> sentence_chrf(ref1, hyp2) 
0.3330...

The infamous “the the the …” example:

>>> ref = 'the cat is on the mat'.split()
>>> hyp = 'the the the the the the the'.split()
>>> sentence_chrf(ref, hyp)  
0.1468...

The function also accepts plain strings as inputs instead of token lists, i.e. str instead of list(str):

>>> ref1 = str('It is a guide to action that ensures that the military '
...            'will forever heed Party commands')
>>> hyp1 = str('It is a guide to action which ensures that the military '
...            'always obeys the commands of the party')
>>> sentence_chrf(ref1, hyp1) 
0.6349...
>>> type(ref1) == type(hyp1) == str
True
>>> sentence_chrf(ref1.split(), hyp1.split()) 
0.6349...

To skip the unigrams and only use 2- to 3-grams:

>>> sentence_chrf(ref1, hyp1, min_len=2, max_len=3) 
0.6617...

Parameters

reference (list(str) / str) – a reference sentence
hypothesis (list(str) / str) – a hypothesis sentence
min_len (int) – the minimum order of n-gram this function should extract
max_len (int) – the maximum order of n-gram this function should extract
beta (float) – the weight that gives recall more importance than precision (beta=1 weights them equally)
ignore_whitespace (bool) – whether to ignore whitespace characters when scoring
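
To illustrate how these parameters interact, here is a minimal, self-contained sketch of a character n-gram F-score. It is not NLTK's implementation: the names char_ngrams and chrf_sketch are hypothetical, it accepts plain strings only, and it averages precision and recall over the n-gram orders before combining them, which is an assumption based on the formulation in the cited papers. Its scores are therefore not guaranteed to match sentence_chrf.

```python
from collections import Counter


def char_ngrams(text, n, ignore_whitespace=True):
    """Count the character n-grams of order n in a string (hypothetical helper)."""
    if ignore_whitespace:
        # Drop all whitespace so n-grams span word boundaries.
        text = "".join(text.split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def chrf_sketch(reference, hypothesis, min_len=1, max_len=6, beta=3.0,
                ignore_whitespace=True):
    """Simplified character n-gram F-score for one reference/hypothesis pair."""
    precisions, recalls = [], []
    for n in range(min_len, max_len + 1):
        ref_counts = char_ngrams(reference, n, ignore_whitespace)
        hyp_counts = char_ngrams(hypothesis, n, ignore_whitespace)
        if not ref_counts or not hyp_counts:
            # One side is too short to contain any n-gram of this order.
            continue
        # Clipped overlap: each n-gram counts at most as often as in both texts.
        overlap = sum((ref_counts & hyp_counts).values())
        precisions.append(overlap / sum(hyp_counts.values()))
        recalls.append(overlap / sum(ref_counts.values()))
    if not precisions:
        return 0.0
    precision = sum(precisions) / len(precisions)
    recall = sum(recalls) / len(recalls)
    if precision + recall == 0:
        return 0.0
    # F-beta: beta > 1 shifts the weight toward recall.
    return (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)


if __name__ == "__main__":
    ref = "the cat is on the mat"
    hyp = "the the the the the the the"
    print(chrf_sketch(ref, hyp))
    print(chrf_sketch(ref, hyp, min_len=2, max_len=3))
```

In this sketch, identical strings score 1.0 and strings that share no characters score 0.0. Narrowing min_len and max_len restricts which n-gram orders enter the average.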

Returns

The sentence-level CHRF score.

Return type

float
