Reviewing and QA Basics

The key ideas you need to know to get the most out of LightTag

Written by Tal Perry

This article introduces the ideas underlying our approach to review. It's more philosophical than most of our documentation, but we recommend reading it to get the most out of review.

Why Review

It's not enough to have labeled data; the labels need to be good. Metrics like inter-annotator agreement give you a first-order indication of the quality of the data. But if there is a problem, you need to diagnose it, fix it, and communicate the resolution to your team. LightTag makes that as easy as possible with review mode.

Do I Need To Review My Data?

If you're working with multiple annotators, and/or you want to check the quality of your models, then you should review your data to get a sense of its quality.

However, if you're just one person annotating, you probably don't need to review yourself.

Review Methods

LightTag provides 3 ways to review your data. You can define Review Jobs, which will set up a queue of items to be reviewed and dispatch them to reviewers. Word Level Review lets you see every occurrence of a word or phrase and how it was annotated and enables batch actions. Finally, for spot checking, you can review any item by selecting it from a list.

You can use more than one method and LightTag will ensure that the reviewed data is synchronized. For example, at the beginning of a project you might do spot checking to gather feedback. As the project nears completion you'll use Word Level Review to quickly handle common cases before having the rest of the data reviewed in a review job.

Core Principles

When we designed LightTag and its review feature, we settled on three guiding principles, which you should know.

There is only one truth

Regardless of how many people or models annotated a text, and how many different and conflicting annotations they gave, there is only one right answer.

We know that some things are subjective, but the model you're building doesn't, and it's critical that you give it a single, consistent answer to train on.

Don't repeat yourself

We want you to have high-quality labeled data, fast. To that end, it's our job to ensure you don't need to enter redundant information. So if you say that "X" is a Dog, we know that "X" is not a Cat or anything else.

Since we know that, we shouldn't (and don't) make you input that information.

Silence is also a statement

There is a difference between a document that was never seen by an annotator, and a document that was seen but didn't have any annotations applied. If a document was not seen, there is nothing to review. If a document was seen but not annotated, the annotator may have missed something.

How Is Review Data Organized?

The data to be reviewed is organized a little differently. So far, you've labeled data in Annotation Jobs, which consist of a Dataset and a Schema. In review, LightTag ignores the concept of a job and organizes the data by Dataset and Schema.

This is consistent with the core principles. Annotations on the same Dataset and Schema can come from multiple jobs, or can come from Jobs and from your own AI models that you've uploaded to LightTag.

Additionally, during review, you might add or change annotations. Those annotations weren't created in the context of a particular job; they were made during review.

To maintain the principles of one truth and not repeating yourself, LightTag will show annotations on a particular Dataset and Schema regardless of their source (a job, a model, etc.).

What Do I Get From Review?

When you're done reviewing, you get two things:

Gold Standard Output

Eventually you'll download your annotations from LightTag. When you do, you'll notice two fields: reviewed and correct.

{
  'reviewed': True,
  'correct': True,
  'definition_id': '31c859fa-3ca7-44c5-bcc1-cda9161627a0',
  'end': 30,
  'example_id': 'bc7cfc29-8ae9-4904-b9e9-996d92710ae6',
  'start': 22,
  'tag': 'GPE',
  'tag_id': 'd83d9faf-b2b5-4325-b84f-f2d37642e54e',
  'tagged_token_id': 'eedb9314-1f14-420d-9336-e9e1f94db163',
  'value': 'Lakeland',
  'annotated_by': [
    {'annotator': 'Melanie Williams',
     'annotator_id': 8,
     'timestamp': '2018-08-20T20:18:24.944+00:00'},
    {'annotator': 'Charlene Wells',
     'annotator_id': 9,
     'timestamp': '2018-08-22T19:32:34.786+00:00'},
    {'annotator': 'Renee Carter',
     'annotator_id': 12,
     'timestamp': '2018-08-22T20:34:18.518+00:00'},
  ],
}

Your gold standard data is anything with correct = True.
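
For example, here is a minimal sketch of pulling the gold standard out of a download. It assumes you've saved the downloaded annotations as a JSON list in a file called annotations.json; the file name and list format are assumptions for illustration, not part of LightTag itself.

import json

# Load the downloaded annotations. Assumes annotations.json holds a list of
# annotation objects shaped like the example above.
with open("annotations.json") as f:
    annotations = json.load(f)

# Gold standard: annotations a reviewer confirmed as correct.
gold_standard = [a for a in annotations if a.get("correct")]

# Reviewed but rejected: seen by a reviewer and marked incorrect.
rejected = [a for a in annotations if a.get("reviewed") and not a.get("correct")]

print(f"{len(gold_standard)} gold standard annotations")
print(f"{len(rejected)} reviewed and rejected annotations")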

Metrics

As you review data, LightTag will provide you with metrics on your annotators' and models' performance. These give you a one-stop shop to track the quality of your annotators and data.
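
LightTag computes these metrics for you in the app. As a rough sketch of what they capture, here is one way to derive a simple per-annotator accuracy yourself from the same assumed annotations.json download, using the reviewed, correct, and annotated_by fields shown above. Note that this only measures how often an annotator's submitted annotations were accepted; it says nothing about annotations they missed.

import json
from collections import defaultdict

with open("annotations.json") as f:
    annotations = json.load(f)

# Only reviewed annotations carry a verdict.
reviewed = [a for a in annotations if a.get("reviewed")]

# Tally, per annotator, how many of their reviewed annotations were marked correct.
scores = defaultdict(lambda: {"correct": 0, "total": 0})
for ann in reviewed:
    for contributor in ann.get("annotated_by", []):
        scores[contributor["annotator"]]["total"] += 1
        if ann.get("correct"):
            scores[contributor["annotator"]]["correct"] += 1

for name, s in sorted(scores.items()):
    accuracy = s["correct"] / s["total"]
    print(f"{name}: {accuracy:.0%} of reviewed annotations marked correct")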