Institut für Informationswissenschaft der TH Köln
As scientific output grows, it is becoming more important to automate the extraction of knowledge from articles. This bachelor thesis describes an approach to doing exactly this. Scientific articles are obtained from a database.
These articles are preprocessed to produce a set of training data, which is used to update an existing language model for the Python library spaCy. The model is trained to recognize different kinds of entities related to the rabies virus. The trained model is then applied to ten articles, and the extracted knowledge is used to extend the Open Research Knowledge Graph.
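The preprocessing step can be illustrated with a minimal sketch. It assumes that training data is expressed in the character-offset form commonly used for spaCy entity training, `(text, {"entities": [(start, end, label)]})`. The term list, the labels `VIRUS` and `HOST`, and the helper `annotate` are hypothetical and chosen only for illustration; the abstract does not say which entity types or annotation method the thesis used.

```python
import re

# Hypothetical term list: labels and terms are illustrative assumptions,
# not the entity types used in the thesis.
TERMS = {
    "VIRUS": ["rabies virus", "lyssavirus"],
    "HOST": ["dog", "bat"],
}


def annotate(text, terms=TERMS):
    """Return (text, {"entities": [(start, end, label), ...]}).

    Spans are character offsets into `text` and never overlap.
    """
    candidates = []
    for label, words in terms.items():
        for word in words:
            pattern = r"\b" + re.escape(word) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                candidates.append((match.start(), match.end(), label))

    # Prefer longer matches, then drop any span overlapping one already kept.
    candidates.sort(key=lambda span: (-(span[1] - span[0]), span[0]))
    kept = []
    for start, end, label in candidates:
        if all(end <= s or start >= e for s, e, _ in kept):
            kept.append((start, end, label))
    return text, {"entities": sorted(kept)}


example = annotate("The rabies virus is often transmitted by a dog or a bat.")
```

Dictionary matching of this kind only bootstraps annotations; a real training set would normally be checked by hand before it is used to update a model.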
Research data that is put into long-term storage needs quality metadata attached so that it can be found in the future. Metadata facilitates the reuse of data by third parties and makes it citable in new research contexts and for new research questions. However, better tools are needed to help researchers add metadata and prepare their data for publication. These tools should integrate well into the scientists' existing research workflow, to allow metadata enrichment even while they are creating, gathering or collecting the data. In this thesis, an existing data publication tool from the project DARIAH-DE was connected to proven file synchronization software to allow researchers to prepare the data on their personal computers and mobile devices and make it ready for publication. The goal of this thesis was to find out whether the use of file synchronization software eases the data publication process for researchers.
Analysing the systematics of search engine autocompletion functions by means of data mining methods
(2017)
In the internet era, the information that can be found about politicians online can influence
events such as the results of elections. Research has shown that biased search rankings can
shift the voting preferences of undecided voters. This shows the importance of studying
online search behaviour, especially in the pre-election phase, when search results can
have a particular influence on the future political scene of a country.
This master's thesis aimed to study the behaviour of online search engines in the period before
the German federal election in 2017. The aim was to ascertain whether any pattern can be
found in the autosuggestions for searches related to politicians.
To gather data for this experiment, a crawler visited search engine web pages,
entered the first name and surname of a politician, and saved the query together with all
autosuggestions returned by the search engine. The autosuggestions were prepared for analysis and
divided into semantic groups with the help of clustering algorithms.
Different statistical methods, such as correlation analysis, regression analysis, and clustering,
were used to identify patterns in the data. The research showed that there are
no particularly strong patterns in the autosuggestions for searches related to politicians'
names. Only a moderate dependence was found between gender and personal topics:
autosuggestions containing personal information appeared more often
for female politicians.
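A dependence between gender and personal topics of the kind reported above could be measured roughly as follows. This is a sketch under stated assumptions, not the method of the thesis: the records are invented, the `PERSONAL` keyword set is a stand-in for the semantic groups that the thesis obtained by clustering, and the phi coefficient (the Pearson correlation of two binary variables) is only one possible form of correlation analysis.

```python
import math

# Hypothetical records (queried name, gender, autosuggestion), invented for
# illustration; they are not data from the thesis.
RECORDS = [
    ("Anna Beispiel", "f", "anna beispiel ehemann"),
    ("Anna Beispiel", "f", "anna beispiel partei"),
    ("Anna Beispiel", "f", "anna beispiel alter"),
    ("Max Muster", "m", "max muster partei"),
    ("Max Muster", "m", "max muster wahlkreis"),
    ("Max Muster", "m", "max muster ehefrau"),
]

# Assumed keyword set standing in for a "personal topics" semantic group.
PERSONAL = {"ehemann", "ehefrau", "alter", "kinder"}


def suggested_terms(name, suggestion):
    """Strip the queried name from a suggestion, leaving only the added terms."""
    rest = suggestion.lower()
    prefix = name.lower()
    if rest.startswith(prefix):
        rest = rest[len(prefix):]
    return rest.split()


def phi_coefficient(records):
    """Phi coefficient between gender (female or not) and personal topic (yes or no)."""
    # 2x2 contingency table:
    # a = female/personal, b = female/other, c = male/personal, d = male/other
    a = b = c = d = 0
    for name, gender, suggestion in records:
        personal = any(t in PERSONAL for t in suggested_terms(name, suggestion))
        if gender == "f":
            if personal:
                a += 1
            else:
                b += 1
        else:
            if personal:
                c += 1
            else:
                d += 1
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator if denominator else 0.0


phi = phi_coefficient(RECORDS)
```

A value near 0 indicates no association, and values towards 1 indicate that personal suggestions concentrate on female politicians. With real data, the personal/other flag would come from the cluster assignment rather than from a fixed keyword list.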
As technology advances, the services offered by libraries and the roles of
librarians continue to be reconsidered. This paper describes the traditional
model and development of liaison and embedded librarians, examines the
online visibility of liaison librarians and their services in an embedded sense,
especially regarding instruction, at selected Canadian academic libraries, and
takes a brief look at German libraries and their subject librarians.
The results show that, even where a library does not clearly emphasize a development
of liaison librarians towards embedment, the role at least evolves towards a
user-centered approach and a preference for stronger collaborations. The selected
libraries seek to broaden their scope of partnerships. How deeply this is
realized ultimately depends on the willingness of both parties and on their capacities.
Libraries have stated their flexibility in various ways and are ready to step in at
the point of need. The service closest to embedded services is information
literacy instruction, as its effectiveness requires a longer relationship in order to
flourish. Nevertheless, research support in the health sciences has clearly
become an integral part.