“A utility to map arbitrary text to the WLO/OEH topics vocabulary based on keyword matching.”
The only endpoint currently targeted in the Redaktionsumgebung (editorial environment)
Endpoint: POST /topics
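A minimal sketch of what keyword matching against a topics vocabulary could look like. This is not the service's actual implementation: the vocabulary entries, the function name match_topics and the scoring by hit count are all assumptions made for illustration.

```python
import re

# Hypothetical mini-vocabulary (topic label -> keywords); NOT the real WLO/OEH data.
VOCABULARY = {
    "Mathematik": {"kreissektor", "bruchrechnung", "geometrie"},
    "Physik": {"energie", "kraft", "optik"},
}


def match_topics(text, vocabulary=VOCABULARY):
    """Return (topic, hit_count) pairs for topics whose keywords occur in the text."""
    # Lowercase and split into word tokens so matching is case-insensitive.
    tokens = set(re.findall(r"\w+", text.lower()))
    hits = {topic: len(tokens & keywords) for topic, keywords in vocabulary.items()}
    matched = [(topic, count) for topic, count in hits.items() if count > 0]
    # Most hits first; ties are broken alphabetically by topic label.
    return sorted(matched, key=lambda item: (-item[1], item[0]))


print(match_topics("Geometrie und Energie am Kreissektor"))
```

Exact token matching like this misses inflected forms and compounds, which is one reason a real vocabulary mapper would likely need stemming or fuzzy matching on top.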
Uses the Python library langdetect to detect the language; it has been used in MetaQS as well.
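For reference, a small sketch of how langdetect is typically called. The wrapper name detect_language and the fallback to None are assumptions for illustration; how the service itself wraps the library is not documented here.

```python
def detect_language(text):
    """Return a language code such as 'de', or None if detection is unavailable or fails."""
    try:
        from langdetect import DetectorFactory, detect
        from langdetect.lang_detect_exception import LangDetectException
    except ImportError:
        return None  # langdetect is not installed in this environment

    # langdetect is probabilistic; fixing the seed makes results reproducible.
    DetectorFactory.seed = 0
    try:
        return detect(text)
    except LangDetectException:
        return None  # e.g. empty input or text without usable features


print(detect_language("Dies ist ein kurzer deutscher Beispielsatz über Mathematik."))
```

Very short inputs tend to be detected unreliably, so titles alone may not be enough text.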
Predicts the discipline of the content
Returns an integer corresponding to the predicted discipline
Detects duplicates only, based on pre-trained text
Uses the MinHash algorithm
Currently trained on the id, url and description of some data
- age and state of data unclear
Returns: "a list of scores and document ids relevant to the query document. Only the top ten items are retrieved, in descending order"
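A rough sketch of MinHash-based duplicate scoring with a top-ten cut-off, using only the standard library. It illustrates the general technique, not the service's code: the shingle size, the signature length, and the names shingles, minhash_signature and top_matches are assumptions.

```python
import hashlib

NUM_HASHES = 64  # signature length; an arbitrary choice for this sketch


def shingles(text, k=3):
    """Split text into a set of overlapping word k-grams."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(items, num_hashes=NUM_HASHES):
    """For each salted hash function, keep the minimum hash value over all items."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "big")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimate_similarity(sig_a, sig_b):
    """The fraction of matching positions approximates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def top_matches(query, corpus, limit=10):
    """Return (score, doc_id) pairs for the most similar documents, best first."""
    query_sig = minhash_signature(shingles(query))
    scored = [
        (estimate_similarity(query_sig, minhash_signature(shingles(text))), doc_id)
        for doc_id, text in corpus.items()
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:limit]


corpus = {
    "doc-a": "the quick brown fox jumps over the lazy dog",
    "doc-b": "completely unrelated text about school subjects and topics",
}
print(top_matches("the quick brown fox jumps over the lazy dog", corpus))
```

In a real setup the corpus signatures would be computed once and stored, which is presumably what "trained" refers to above.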
Model is pretrained, details unclear. Is it the same model as for classification?
Analyzes the OER object itself and yields categories, e.g., the categories to which title, description and keywords could belong
Needs training: for example, "Kreissektor" (circular sector) is connected to "Allgemeine Psychologie" (general psychology)
Various mappings, e.g., OER categories to Wikipedia
Some outdated links
No REST endpoint or similar