Line-a-line: A Tool for Annotating Word-Alignments
Institute for Language and Folklore.
Institute for Language and Folklore, Språkrådet. ORCID iD: 0000-0001-6573-4636
Institute for Language and Folklore, Språkrådet. ORCID iD: 0000-0001-6949-6380
Institute for Language and Folklore, Språkrådet.
2020 (English). In: Proceedings of the 13th Workshop on Building and Using Comparable Corpora / [ed] Reinhard Rapp, Pierre Zweigenbaum and Serge Sharoff, 2020, p. 1-5. Conference paper, Published paper (Refereed)
Abstract [en]

We here describe line-a-line, a web-based tool for manual annotation of word-alignments in sentence-aligned parallel corpora. The graphical user interface, which builds on a design template from the Jigsaw system for investigative analysis, displays the words from each sentence pair that is to be annotated as elements in two vertical lists. An alignment between two words is annotated by drag-and-drop, i.e. by dragging an element from the left-hand list and dropping it on an element in the right-hand list. The tool indicates that two words are aligned by lines that connect them and by highlighting associated words when the mouse is hovered over them. Line-a-line uses the efmaral library for producing pre-annotated alignments, on which the user can base the manual annotation. The tool is mainly planned to be used on moderately under-resourced languages, for which resources in the form of parallel corpora are scarce. The automatic word-alignment functionality therefore also incorporates information derived from non-parallel resources, in the form of pre-trained multilingual word embeddings from the MUSE library.
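
The annotation model described in the abstract can be illustrated with a minimal sketch. It is not taken from the line-a-line source code: the class name, the method names, and the "i-j" index-pair text format (often called Pharaoh format) are assumptions made for illustration only. The sketch loads a pre-annotated alignment for one sentence pair, lets the annotator add or remove links, and looks up the words to highlight for a given source word.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class SentencePairAlignment:
    """Hypothetical container: one sentence pair plus its word-alignment links."""
    source: List[str]
    target: List[str]
    # Each link is a (source index, target index) pair, both 0-based.
    links: Set[Tuple[int, int]] = field(default_factory=set)

    def add_link(self, i: int, j: int) -> None:
        # Models dropping source word i onto target word j.
        if not (0 <= i < len(self.source) and 0 <= j < len(self.target)):
            raise IndexError("link outside sentence bounds")
        self.links.add((i, j))

    def remove_link(self, i: int, j: int) -> None:
        # Removing a link that does not exist is a no-op.
        self.links.discard((i, j))

    def partners_of_source(self, i: int) -> List[str]:
        # Target words to highlight when source word i is hovered over.
        return [self.target[j] for (s, j) in sorted(self.links) if s == i]

    def to_pharaoh(self) -> str:
        # Serialise links as space-separated "i-j" tokens, sorted for stability.
        return " ".join(f"{i}-{j}" for i, j in sorted(self.links))


def from_pharaoh(source: List[str], target: List[str], line: str) -> SentencePairAlignment:
    """Build an alignment from a pre-annotated "i-j i-j ..." string (assumed format)."""
    pair = SentencePairAlignment(source, target)
    for token in line.split():
        i, j = token.split("-")
        pair.add_link(int(i), int(j))
    return pair
```

For example, `from_pharaoh(["the", "house"], ["huset"], "0-0 1-0")` yields a pair in which both English words are linked to the single Swedish word; a manual correction such as `remove_link(0, 0)` then leaves only the link between "house" and "huset".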

Place, publisher, year, edition, pages
2020. p. 1-5
National Category
Languages and Literature
Research subject
Language Technology
Identifiers
URN: urn:nbn:se:sprakochfolkminnen:diva-1812
OAI: oai:DiVA.org:sprakochfolkminnen-1812
DiVA, id: diva2:1508851
Conference
13th Workshop on Building and Using Comparable Corpora, LREC
Available from: 2020-12-10. Created: 2020-12-10. Last updated: 2023-12-01. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Line-a-line: A Tool for Annotating Word-Alignments

Authority records

Skeppstedt, Maria; Ahltorp, Magnus; Eriksson, Gunnar; Domeij, Rickard
