---
_id: '10752'
abstract:
- lang: eng
  text: 'The digitalization of almost all aspects of our everyday lives has led to
    unprecedented amounts of data being freely available on the Internet. In particular
    social media platforms provide rich sources of user-generated data, though typically
    in unstructured form, and with high diversity, such as written in many different
    languages. Automatically identifying meaningful information in such big data resources
    and extracting it efficiently is one of the ongoing challenges of our time. A
    common step for this is sentiment analysis, which forms the foundation for tasks
    such as opinion mining or trend prediction. Unfortunately, publicly available
    tools for this task are almost exclusively available for English-language texts.
    Consequently, a large fraction of the Internet users, who do not communicate in
    English, are ignored in automatized studies, a phenomenon called rare-language
    discrimination. In this work we propose a technique to overcome this problem by
    a truly multi-lingual model, which can be trained automatically without linguistic
    knowledge or even the ability to read the many target languages. The main step
    is to combine self-annotation, specifically the use of emoticons as a proxy for
    labels, with multi-lingual sentence representations. To evaluate our method we
    curated several large datasets from data obtained via the free Twitter streaming
    API. The results show that our proposed multi-lingual training is able to achieve
    sentiment predictions at the same quality level for rare languages as for frequent
    ones, and in particular clearly better than what mono-lingual training achieves
    on the same data.'
article_processing_charge: No
author:
- first_name: Jasmin
  full_name: Lampert, Jasmin
  last_name: Lampert
- first_name: Christoph
  full_name: Lampert, Christoph
  id: 40C20FD2-F248-11E8-B48F-1D18A9856A87
  last_name: Lampert
  orcid: 0000-0002-4561-241X
citation:
  ama: 'Lampert J, Lampert C. Overcoming rare-language discrimination in multi-lingual
    sentiment analysis. In: <i>2021 IEEE International Conference on Big Data</i>.
    IEEE; 2022:5185-5192. doi:<a href="https://doi.org/10.1109/bigdata52589.2021.9672003">10.1109/bigdata52589.2021.9672003</a>'
  apa: 'Lampert, J., &#38; Lampert, C. (2022). Overcoming rare-language discrimination
    in multi-lingual sentiment analysis. In <i>2021 IEEE International Conference
    on Big Data</i> (pp. 5185–5192). Orlando, FL, United States: IEEE. <a href="https://doi.org/10.1109/bigdata52589.2021.9672003">https://doi.org/10.1109/bigdata52589.2021.9672003</a>'
  chicago: Lampert, Jasmin, and Christoph Lampert. “Overcoming Rare-Language Discrimination
    in Multi-Lingual Sentiment Analysis.” In <i>2021 IEEE International Conference
    on Big Data</i>, 5185–92. IEEE, 2022. <a href="https://doi.org/10.1109/bigdata52589.2021.9672003">https://doi.org/10.1109/bigdata52589.2021.9672003</a>.
  ieee: J. Lampert and C. Lampert, “Overcoming rare-language discrimination in multi-lingual
    sentiment analysis,” in <i>2021 IEEE International Conference on Big Data</i>,
    Orlando, FL, United States, 2022, pp. 5185–5192.
  ista: 'Lampert J, Lampert C. 2022. Overcoming rare-language discrimination in multi-lingual
    sentiment analysis. 2021 IEEE International Conference on Big Data. Big Data:
    International Conference on Big Data, 5185–5192.'
  mla: Lampert, Jasmin, and Christoph Lampert. “Overcoming Rare-Language Discrimination
    in Multi-Lingual Sentiment Analysis.” <i>2021 IEEE International Conference on
    Big Data</i>, IEEE, 2022, pp. 5185–92, doi:<a href="https://doi.org/10.1109/bigdata52589.2021.9672003">10.1109/bigdata52589.2021.9672003</a>.
  short: J. Lampert, C. Lampert, in:, 2021 IEEE International Conference on Big Data,
    IEEE, 2022, pp. 5185–5192.
conference:
  end_date: 2021-12-18
  location: Orlando, FL, United States
  name: 'Big Data: International Conference on Big Data'
  start_date: 2021-12-15
date_created: 2022-02-10T14:08:23Z
date_published: 2022-01-13T00:00:00Z
date_updated: 2023-08-02T14:27:50Z
day: '13'
department:
- _id: ChLa
doi: 10.1109/bigdata52589.2021.9672003
external_id:
  isi:
  - '000800559505036'
isi: 1
language:
- iso: eng
month: '01'
oa_version: None
page: 5185-5192
publication: 2021 IEEE International Conference on Big Data
publication_identifier:
  isbn:
  - '9781665439022'
publication_status: published
publisher: IEEE
quality_controlled: '1'
status: public
title: Overcoming rare-language discrimination in multi-lingual sentiment analysis
type: conference
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
year: '2022'
...
---
_id: '10828'
abstract:
- lang: eng
  text: Digital images enable quantitative analysis of material properties at micro
    and macro length scales, but choosing an appropriate resolution when acquiring
    the image is challenging. A high resolution means longer image acquisition and
    larger data requirements for a given sample, but if the resolution is too low,
    significant information may be lost. This paper studies the impact of changes
    in resolution on persistent homology, a tool from topological data analysis that
    provides a signature of structure in an image across all length scales. Given
    prior information about a function, the geometry of an object, or its density
    distribution at a given resolution, we provide methods to select the coarsest
    resolution yielding results within an acceptable tolerance. We present numerical
    case studies for an illustrative synthetic example and samples from porous materials
    where the theoretical bounds are unknown.
article_processing_charge: No
arxiv: 1
author:
- first_name: Teresa
  full_name: Heiss, Teresa
  id: 4879BB4E-F248-11E8-B48F-1D18A9856A87
  last_name: Heiss
  orcid: 0000-0002-1780-2689
- first_name: Sarah
  full_name: Tymochko, Sarah
  last_name: Tymochko
- first_name: Brittany
  full_name: Story, Brittany
  last_name: Story
- first_name: Adélie
  full_name: Garin, Adélie
  last_name: Garin
- first_name: Hoa
  full_name: Bui, Hoa
  last_name: Bui
- first_name: Bea
  full_name: Bleile, Bea
  last_name: Bleile
- first_name: Vanessa
  full_name: Robins, Vanessa
  last_name: Robins
citation:
  ama: 'Heiss T, Tymochko S, Story B, et al. The impact of changes in resolution on
    the persistent homology of images. In: <i>2021 IEEE International Conference on
    Big Data</i>. IEEE; 2022:3824-3834. doi:<a href="https://doi.org/10.1109/BigData52589.2021.9671483">10.1109/BigData52589.2021.9671483</a>'
  apa: 'Heiss, T., Tymochko, S., Story, B., Garin, A., Bui, H., Bleile, B., &#38;
    Robins, V. (2022). The impact of changes in resolution on the persistent homology
    of images. In <i>2021 IEEE International Conference on Big Data</i> (pp. 3824–3834).
    Orlando, FL, United States; Virtual: IEEE. <a href="https://doi.org/10.1109/BigData52589.2021.9671483">https://doi.org/10.1109/BigData52589.2021.9671483</a>'
  chicago: Heiss, Teresa, Sarah Tymochko, Brittany Story, Adélie Garin, Hoa Bui, Bea
    Bleile, and Vanessa Robins. “The Impact of Changes in Resolution on the Persistent
    Homology of Images.” In <i>2021 IEEE International Conference on Big Data</i>,
    3824–34. IEEE, 2022. <a href="https://doi.org/10.1109/BigData52589.2021.9671483">https://doi.org/10.1109/BigData52589.2021.9671483</a>.
  ieee: T. Heiss <i>et al.</i>, “The impact of changes in resolution on the persistent
    homology of images,” in <i>2021 IEEE International Conference on Big Data</i>,
    Orlando, FL, United States; Virtual, 2022, pp. 3824–3834.
  ista: 'Heiss T, Tymochko S, Story B, Garin A, Bui H, Bleile B, Robins V. 2022. The
    impact of changes in resolution on the persistent homology of images. 2021 IEEE
    International Conference on Big Data. Big Data: International Conference on Big
    Data, 3824–3834.'
  mla: Heiss, Teresa, et al. “The Impact of Changes in Resolution on the Persistent
    Homology of Images.” <i>2021 IEEE International Conference on Big Data</i>, IEEE,
    2022, pp. 3824–34, doi:<a href="https://doi.org/10.1109/BigData52589.2021.9671483">10.1109/BigData52589.2021.9671483</a>.
  short: T. Heiss, S. Tymochko, B. Story, A. Garin, H. Bui, B. Bleile, V. Robins,
    in:, 2021 IEEE International Conference on Big Data, IEEE, 2022, pp. 3824–3834.
conference:
  end_date: 2021-12-18
  location: Orlando, FL, United States; Virtual
  name: 'Big Data: International Conference on Big Data'
  start_date: 2021-12-15
date_created: 2022-03-06T23:01:53Z
date_published: 2022-01-13T00:00:00Z
date_updated: 2023-08-02T14:44:21Z
day: '13'
department:
- _id: HeEd
doi: 10.1109/BigData52589.2021.9671483
external_id:
  arxiv:
  - '2111.05663'
  isi:
  - '000800559503126'
isi: 1
language:
- iso: eng
main_file_link:
- open_access: '1'
  url: https://arxiv.org/abs/2111.05663
month: '01'
oa: 1
oa_version: Preprint
page: 3824-3834
publication: 2021 IEEE International Conference on Big Data
publication_identifier:
  isbn:
  - '9781665439022'
publication_status: published
publisher: IEEE
quality_controlled: '1'
scopus_import: '1'
status: public
title: The impact of changes in resolution on the persistent homology of images
type: conference
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
year: '2022'
...
