The goal of this work is to detect gender biases in the communication of users of subreddits on the platform Reddit. The analysis is carried out for eleven selected subreddits. In addition, k-means clustering is used in an attempt to identify different user types and to analyze gender biases in their communication as well. Based on the aggregated datasets, fastText word embedding models are trained to identify terms that show high semantic relatedness, measured as the cosine similarity of their word vectors, to selected feminine and masculine terms.
These terms are then analyzed for sentiment using the NRC-VAD Lexicon and tested for statistically significant differences. In addition, the Word Embedding Association Test (WEAT) is performed to check for subliminal associations. For the text corpus considered, the main observation is that women are frequently associated with adjectives relating to appearance, childbearing ability, or adaptability, including in relation to the family. In contrast, men are associated with and measured by adjectives that refer to their prestige, strengths and weaknesses, career, or physical characteristics.
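The two measures named in this abstract, cosine similarity between word vectors and the WEAT effect size, can be illustrated with a minimal sketch. This is not the thesis's code: the two-dimensional toy vectors, the word lists, and the function names are all hypothetical, and the use of the sample standard deviation in the denominator is an assumption (implementations differ on this point). In practice the vectors would come from the trained fastText models.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word, attrs_a, attrs_b, vec):
    """Mean similarity of `word` to attribute set A minus that to set B."""
    sim_a = mean(cosine(vec[word], vec[a]) for a in attrs_a)
    sim_b = mean(cosine(vec[word], vec[b]) for b in attrs_b)
    return sim_a - sim_b


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b, vec):
    """Difference of mean associations of X and Y, scaled by the
    sample standard deviation over all target words (an assumption)."""
    s_x = [association(x, attrs_a, attrs_b, vec) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b, vec) for y in targets_y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical toy vectors; real ones would be fastText word vectors.
toy_vectors = {
    "he": (0.0, 1.0), "man": (0.1, 0.9),
    "she": (1.0, 0.0), "woman": (0.9, 0.1),
    "career": (0.2, 0.8), "salary": (0.1, 0.7),
    "family": (0.8, 0.2), "home": (0.9, 0.3),
}

effect = weat_effect_size(
    ["career", "salary"], ["family", "home"],  # target sets X, Y
    ["he", "man"], ["she", "woman"],           # attribute sets A, B
    toy_vectors,
)
print(f"WEAT effect size on toy data: {effect:.2f}")
```

A positive value on this toy data means the first target set lies closer to the masculine attribute terms than the second target set does; it says nothing about the actual Reddit corpora.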
With the growing volume of scientific output, it is becoming more important to automate the extraction of knowledge from articles. This bachelor thesis describes an approach that does exactly this. Scientific articles are obtained from a database.
These articles are preprocessed into a set of training data, which is used to update an existing language model for the Python library spaCy. The model is trained to recognize different kinds of entities related to the rabies virus. The trained model is then applied to ten articles, and the extracted knowledge is used to extend the Open Research Knowledge Graph.
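The abstract does not say how the training data is annotated. As one possible illustration, the sketch below derives character-offset entity spans of the form (start, end, label) from a term dictionary, which is the general shape of entity annotation that spaCy's named entity training accepts. The dictionary-matching strategy, the labels VIRUS and HOST, the example sentence, and the function name are all hypothetical and are not taken from the thesis.

```python
def find_entity_spans(text, terms_by_label):
    """Return non-overlapping (start, end, label) character spans for
    every case-insensitive occurrence of the given terms.

    Assumes lowercasing does not change string length, which holds
    for ordinary English text.
    """
    lowered = text.lower()
    candidates = []
    for label, terms in terms_by_label.items():
        for term in terms:
            needle = term.lower()
            start = lowered.find(needle)
            while start != -1:
                candidates.append((start, start + len(needle), label))
                start = lowered.find(needle, start + len(needle))

    # Sort by start position, longest match first, then drop overlaps.
    candidates.sort(key=lambda span: (span[0], -(span[1] - span[0])))
    spans = []
    last_end = -1
    for start, end, label in candidates:
        if start >= last_end:
            spans.append((start, end, label))
            last_end = end
    return spans


# Hypothetical sentence and labels, for illustration only.
sentence = "Rabies virus is transmitted by bats and dogs."
terms = {"VIRUS": ["rabies virus"], "HOST": ["bats", "dogs"]}

spans = find_entity_spans(sentence, terms)
for start, end, label in spans:
    print(label, repr(sentence[start:end]))
```

Spans produced this way would still need manual review before being used as training data, since plain string matching ignores word boundaries and context.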
As a key part of human-computer interaction (HCI) and usability testing, the capturing and recording of key user interactions plays a central role in ensuring a reliable post-hoc analysis of collected interaction data, thus improving the odds of insightful HCI and usability testing cycles for use cases such as the evaluation of interactive information retrieval (IIR) systems. The practice of logging is therefore of significant importance for multiple fields of study such as IIR, HCI and, most recently, Living Lab approaches. Living Lab approaches represent a user-centered research methodology with a focus on user involvement, experimentation and extensive collaboration for the co-production of knowledge, and as such have a pressing need for robust and easy-to-use logging solutions.
With past logging solutions being either expensive, hard to use or error-prone, recent conferences have given rise to new logging solutions built on contemporary web technologies, which aim to improve the logging landscape within the research community. Over the course of this paper, two of these recent logging solutions, LogUI and Big Brother, are inspected for their key features and then evaluated as to whether they are suitable logging solutions for living lab and IIR environments. The results indicate that both logging solutions offer significant benefits for research using living lab and IIR approaches, with LogUI embracing many of the experimental paradigms that guide the living lab approach.