Cross Lingual Word Embeddings With Universal Concepts And Their Applications


Cross-Lingual Word Embeddings with Universal Concepts and Their Applications



Author: Pezhman Sheinidashtegol

language: en

Publisher:

Release Date: 2020







Enormous amounts of data are generated in many languages every day as a result of increasing global connectivity. This increases the demand for the ability to read and classify data regardless of language. Word embedding is a popular Natural Language Processing (NLP) technique that uses language modeling and feature learning to map words to vectors of real numbers. However, these models need a significant amount of annotated training data. Although the availability of labeled data is gradually increasing, most of it is available only in high-resource languages such as English. Researchers proficient in different sets of languages seek to address new problems with multilingual NLP applications. In this dissertation, I present multiple approaches to generating cross-lingual word embeddings (CWE) using universal concepts (UC) shared among languages to address the limitations of existing methods. My work consists of three approaches to building multilingual/bilingual word embeddings. The first approach includes two steps: pre-processing and processing. In the pre-processing step, we build a bilingual corpus containing both languages' knowledge in the form of sentences for the most frequent words in English and their translated pairs in the target language. In this step, knowledge of the source language is shared with the target language and vice versa by swapping one word per sentence with its corresponding translation. In the processing step, we use a monolingual embeddings estimator to generate the CWE. The second approach generates multilingual word embeddings using UCs. This approach consists of three parts. In part I, we introduce and build UCs using bilingual dictionaries and graph theory by defining words as nodes and translation pairs as edges. In part II, we explain the configuration used for word2vec to generate encoded-word embeddings. Finally, part III covers decoding the generated embeddings using UCs.
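As a rough illustration of parts I and III, the sketch below treats each (language, word) pair as a node and each dictionary translation pair as an edge, then groups connected words into one concept. Taking a UC to be a connected component is an assumption made here for illustration; the abstract does not state the exact grouping rule. The function names, the `UC_n` labels, and the toy dictionary are likewise hypothetical.

```python
from collections import defaultdict


def build_universal_concepts(translation_pairs):
    """Map each (lang, word) node to a concept label.

    Assumption: a universal concept is a connected component of the
    translation graph (words are nodes, translation pairs are edges).
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    # Union the two endpoints of every translation edge.
    for a, b in translation_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each component.
    groups = defaultdict(list)
    for node in parent:
        groups[find(node)].append(node)

    # Sort so that the hypothetical UC labels are deterministic.
    word_to_uc = {}
    for index, members in enumerate(sorted(sorted(g) for g in groups.values())):
        for node in members:
            word_to_uc[node] = f"UC_{index}"
    return word_to_uc


def encode(tokens, lang, word_to_uc):
    """Replace known words with their concept label before training."""
    return [word_to_uc.get((lang, token), token) for token in tokens]


def decode(uc_vectors, word_to_uc):
    """Give every word the vector learned for its concept label."""
    return {
        node: uc_vectors[uc]
        for node, uc in word_to_uc.items()
        if uc in uc_vectors
    }


# Toy bilingual dictionaries (illustrative data only).
pairs = [
    (("en", "dog"), ("es", "perro")),
    (("en", "dog"), ("fr", "chien")),
    (("en", "cat"), ("es", "gato")),
]
concepts = build_universal_concepts(pairs)
encoded = encode(["el", "perro"], "es", concepts)
```

In this toy setup, the encoded corpora of all languages would be fed to an embedding trainer such as word2vec, and `decode` would then copy each concept's vector back to the words that belong to it.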
The final approach uses the supervised method of the MUSE project, but with the model trained on our UCs. Finally, we applied our last two proposed methods to several practical NLP applications: document classification, cross-lingual sentiment analysis, and code-switching sentiment analysis. Our proposed methods outperform the state-of-the-art MUSE method on the majority of applications.
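The pre-processing step of the first approach, in which one word per sentence is swapped with its translation, could look roughly like the sketch below. The function name `swap_one_word`, the toy dictionary, and the choice to pick the swapped word at random among translatable words are assumptions for illustration; the abstract does not specify how the word is selected.

```python
import random


def swap_one_word(sentence, dictionary, rng):
    """Return the sentence with one translatable word replaced.

    `dictionary` maps lowercase source words to target-language words.
    Assumption: the word to swap is chosen at random among the words
    that have a dictionary entry; the sentence is returned unchanged
    if none does.
    """
    tokens = sentence.split()
    candidates = [i for i, tok in enumerate(tokens) if tok.lower() in dictionary]
    if not candidates:
        return sentence
    position = rng.choice(candidates)
    tokens[position] = dictionary[tokens[position].lower()]
    return " ".join(tokens)


# Toy English-to-Spanish dictionary (illustrative data only).
en_to_es = {"dog": "perro", "house": "casa"}
rng = random.Random(0)

sentences = ["the dog barks", "a quiet morning"]
mixed_corpus = [swap_one_word(s, en_to_es, rng) for s in sentences]
```

The resulting mixed-language sentences place translated words in the contexts of their source-language counterparts, so a monolingual embedding estimator trained on such a corpus sees both vocabularies in shared contexts.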

Cross-Lingual Word Embeddings



Author: Anders Søgaard

language: en

Publisher: Springer Nature

Release Date: 2022-05-31







The majority of natural language processing (NLP) is English language processing, and while there is good language technology support for (standard varieties of) English, support for Albanian, Burmese, or Cebuano, and for most other languages, remains limited. Being able to bridge this digital divide is important for scientific and democratic reasons but also represents an enormous growth potential. A key challenge in doing so is learning to align basic meaning-bearing units of different languages. In this book, the authors survey and discuss recent and historical work on supervised and unsupervised learning of such alignments. Specifically, the book focuses on so-called cross-lingual word embeddings. The survey is intended to be systematic, using consistent notation and putting the available methods into a comparable form, making it easy to compare wildly different approaches. In so doing, the authors establish previously unreported relations between these methods and are able to present a fast-growing literature in a very compact way. Furthermore, the authors discuss how best to evaluate cross-lingual word embedding methods and survey the resources available for students and researchers interested in this topic.

Smart Computing Techniques and Applications



Author: Suresh Chandra Satapathy

language: en

Publisher: Springer Nature

Release Date: 2021-07-07







This book presents the best selected papers from the 4th International Conference on Smart Computing and Informatics (SCI 2020), held at the Department of Computer Science and Engineering, Vasavi College of Engineering (Autonomous), Hyderabad, Telangana, India. It presents advanced and multi-disciplinary research towards the design of smart computing and informatics. The theme is broad, focusing on various innovation paradigms in system knowledge, intelligence and sustainability that may be applied to provide realistic solutions to varied problems in society, environment and industries. The scope also extends to the deployment of emerging computational and knowledge transfer approaches and to optimizing solutions in various disciplines of science, technology and health care.