RoSeq: Robust Sequence Labeling

Abstract:

In this paper, we investigate two issues in sequence labeling, label imbalance and noisy data, that are common in named entity recognition (NER) yet largely ignored in existing work. To address them, we propose a new method termed robust sequence labeling (RoSeq). Specifically, to handle label imbalance, we first incorporate label statistics into a novel conditional random field (CRF) loss. In addition, we design an auxiliary loss that augments the CRF loss by reducing the weight of the overwhelming number of easy tokens. To cope with noisy training data, we adopt an adversarial training strategy to improve model generalization. In experiments, RoSeq achieves state-of-the-art performance on CoNLL and English Twitter NER without using additional data: 88.07% on CoNLL-2002 Dutch, 87.33% on CoNLL-2002 Spanish, 52.94% on WNUT-2016 Twitter, and 43.03% on WNUT-2017 Twitter.

Published in: IEEE Transactions on Neural Networks and Learning Systems (Volume: 31, Issue: 7, July 2020)

Page(s): 2304 - 2314

Date of Publication: 08 May 2019

PubMed ID: 31071057

DOI: 10.1109/TNNLS.2019.2911236

I. Introduction

Existing multioutput learning mainly aims at determining multiple outputs for a given input. In many cases, the outputs have a structure that is helpful for training models, e.g., sequences, strings, trees, lattices, or graphs. To infer structured outputs from an observation sequence rather than from a single data point, label sequence learning, or sequence labeling, has been widely studied; here the output sequence has inherent interconnections and is not a simple concatenation of individual units. Sequence labeling is also an important step in most natural language processing (NLP) applications and has been applied to numerous real-world tasks, including but not limited to part-of-speech (POS) tagging [1], named entity recognition (NER) [2], [3], and speech recognition [4].
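To make the notion of "inherent interconnections" in an output sequence concrete, the following minimal Python sketch checks whether an NER tag sequence is internally consistent. It assumes the common BIO tagging scheme (B-X begins an entity of type X, I-X continues it, O marks a non-entity token), which the text above does not specify; the function name `is_valid_bio` and the example sentence are hypothetical and serve only as illustration.

```python
from typing import List


def is_valid_bio(tags: List[str]) -> bool:
    """Return True if every I-X tag continues a B-X or I-X of the same type.

    Assumes the BIO scheme; this is an illustrative check, not part of RoSeq.
    """
    prev = "O"  # treat the position before the sequence as a non-entity
    for tag in tags:
        if tag.startswith("I-"):
            entity_type = tag[2:]
            # An I-X tag is only legal directly after B-X or I-X.
            if prev not in (f"B-{entity_type}", f"I-{entity_type}"):
                return False
        prev = tag
    return True


# Hypothetical example sentence and two candidate label sequences.
tokens = ["Ada", "Lovelace", "visited", "London"]
valid = ["B-PER", "I-PER", "O", "B-LOC"]
invalid = ["B-PER", "I-LOC", "O", "B-LOC"]  # I-LOC cannot follow B-PER

print(is_valid_bio(valid))
print(is_valid_bio(invalid))
```

Under this scheme, the label of a token constrains the labels of its neighbors, which is why a sequence model that scores the whole label sequence can be preferable to classifying each token independently.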
