Requirements for Constructing a Tool for the Extraction of Phraseological Structures
Abstract
This contribution presents the design and development of a web-based prototype for the extraction and analysis of specialized argument structures in multilingual corpora. The tool encapsulates complex command-line scripts into a user-friendly interface, allowing researchers to load, parse, and index corpora, search for noun-verb-noun triples, and organize results into lexical clusters. By leveraging distributional semantics models like word2vec, the tool refines clusters by filtering irrelevant terms and enriching them with semantically related ones. The prototype supports cross-platform accessibility, ensures centralized server-side storage, and provides scalable functionality for future extensions. Currently in the testing phase, it addresses the limitations of previous tools by streamlining corpus analysis and making phraseological studies more accessible to academia.
Sánchez-Cárdenas B., Rienda P., Medina-Medina N., Ramisch C. (2026) "Requirements for Constructing a Tool for the Extraction of Phraseological Structures", Journal of Digital Terminology and Lexicography, 2(1), 19-35. DOI: 10.25430/pupj.jdtl.1773243626
Year of Publication: 2026
Journal: Journal of Digital Terminology and Lexicography
Volume: 2
Issue Number: 1
Start Page: 19
Last Page: 35
Date Published: 03/2026
ISSN Number: 3103-3601
Serial Article Number: 2
DOI: 10.25430/pupj.jdtl.1773243626
Section: Article