seqeval is a Python framework for sequence labeling evaluation. seqeval can evaluate the performance of chunking tasks such as named-entity recognition, part-of-speech tagging, semantic role labeling and so on.
seqeval is well-tested against the Perl script conlleval, which can be used for measuring the performance of a system that has processed the CoNLL-2000 shared task data.
seqeval supports the following tagging formats: IOB1, IOB2, IOE1, IOE2, and IOBES,
and supports the following metrics:
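For illustration, here is the same sentence tagged in two of these schemes. The sentence, tokens, and entity choices are invented for this example; only the tag prefixes (`B-`, `I-`, `E-`, `S-`, `O`) carry meaning:

```python
# "John visited New York", with entities John = PER and New York = LOC.

# IOB2: every entity starts with B-; continuation tokens use I-.
iob2 = ['B-PER', 'O', 'B-LOC', 'I-LOC']

# IOBES: single-token entities use S-; the last token of a multi-token
# entity uses E- instead of I-.
iobes = ['S-PER', 'O', 'B-LOC', 'E-LOC']
```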
| Metric | Description |
|---|---|
| `accuracy_score(y_true, y_pred)` | Compute the accuracy. |
| `precision_score(y_true, y_pred)` | Compute the precision. |
| `recall_score(y_true, y_pred)` | Compute the recall. |
| `f1_score(y_true, y_pred)` | Compute the F1 score, also known as balanced F-score or F-measure. |
| `classification_report(y_true, y_pred, digits=2)` | Build a text report showing the main classification metrics. |
Behold, the power of seqeval:
```python
>>> from seqeval.metrics import accuracy_score
>>> from seqeval.metrics import classification_report
>>> from seqeval.metrics import f1_score
>>>
>>> y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O']]
>>> y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O']]
>>>
>>> f1_score(y_true, y_pred)
0.50
>>> accuracy_score(y_true, y_pred)
0.80
>>> classification_report(y_true, y_pred)
             precision    recall  f1-score   support

      MISC       0.00      0.00      0.00         1
       PER       1.00      1.00      1.00         1

avg / total       0.50      0.50      0.50         2
```
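These scores are computed at the entity level, not per token: a predicted entity counts as correct only when both its type and its exact span match the gold annotation, which is why the shifted MISC chunk scores zero while accuracy, a token-level measure, is still 0.80. Below is a minimal sketch of that entity-level scoring, assuming strict IOB2 tags (seqeval itself handles more formats and edge cases); `get_entities` and `entity_prf` here are illustrative helpers, not seqeval's API:

```python
def get_entities(tags):
    """Extract (type, start, end) spans from one IOB2 tag sequence."""
    entities, etype, start = [], None, None
    for i, tag in enumerate(tags + ['O']):  # sentinel 'O' closes a trailing entity
        if etype is not None and tag != 'I-' + etype:
            entities.append((etype, start, i - 1))
            etype = None
        if tag.startswith('B-'):
            etype, start = tag[2:], i
    return entities

def entity_prf(y_true, y_pred):
    """Entity-level precision, recall, and F1 over lists of tag sequences."""
    true = {(j,) + e for j, seq in enumerate(y_true) for e in get_entities(seq)}
    pred = {(j,) + e for j, seq in enumerate(y_pred) for e in get_entities(seq)}
    correct = len(true & pred)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(true) if true else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

y_true = [['O', 'O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O']]
y_pred = [['O', 'O', 'B-MISC', 'I-MISC', 'I-MISC', 'I-MISC', 'O'], ['B-PER', 'I-PER', 'O']]
print(entity_prf(y_true, y_pred))  # -> (0.5, 0.5, 0.5): PER matches, MISC does not
```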
To install seqeval, simply run:
```bash
$ pip install seqeval
```
The requirements are:

- numpy >= 1.14.0