DISKOW EU Project
DISKOW Erasmus+ Funded Project
The DISKOW project brings together academic and industrial experts from four partner countries with extensive expertise in machine learning, software development and economics.
This Intellectual Output is the final IO of the DISKOW project and covers the activities carried out to assess the final prototype of the Job Knowledge Base (JKB) platform. The first chapter analyses online job vacancies in the labour market. The second chapter describes the JKB prototype as the final outcome of the project. The third and fourth chapters focus on the assessment activities. To assess the accuracy of the data extraction, a random sample of job postings was selected and the extracted data were manually compared with the original text published online. The results were analysed, and potential improvements that future users could achieve were identified. In addition, external stakeholders were approached and interviewed in order to evaluate the user experience with the JKB prototype. The answers obtained are analysed and summarized in chapter 4 of IO5.
The Final IO5 Report
The Job Knowledge Base (JKB) hosts job data collected from the web (as a result of IO3). The JKB is based on the KnowAge open source data platform (promoted by ENG) and benefits from the Job Knowledge Analysis Engine and the Visualization APIs provided as libraries in IO3. This IO was a software development activity and produced the complete and final software prototype of the project. The task ran from the beginning of the project to its end, with an alpha release at M12 and a fully fledged beta release at M18.
This IO covers the development of methods for extracting job-related information from websites. The goal of the tasks within this IO is to analyze the content of job postings, other job-related textual data and information about job seekers, to detect topics within them, and to feed the results into the JKB. This has multiple applications: (1) grouping similar job openings; (2) finding duplicate job openings; (3) automatically generating a hierarchical categorization of jobs; and (4) matching job seeker skills to job openings. In addition, this IO aimed at extending multiple visualization methods for the various prediction results and search tasks developed throughout the project. We used methods from information retrieval and recommender systems to evaluate how to present search results, post-retrieval clustering and computed taxonomies to the user. Based on user tests, we measured the usefulness of the systems and were thus able to compare the various algorithms with each other. This IO benefited from the studies done in IO1 and IO2, i.e. Modeling and Meta-Modeling of Job Knowledge for Labour Market and Identification and Analysis of the Data in the European Labour Market, respectively.
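As an illustration of application (2), finding duplicate job openings, the following is a minimal sketch that flags near-identical postings by the Jaccard similarity of their word sets. This summary does not state which similarity measure the project used, so the measure, the `find_duplicates` helper, the 0.8 threshold and the sample postings are all assumptions made for the example.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercase a posting and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_duplicates(postings, threshold=0.8):
    """Return (i, j, score) for every pair of postings scoring at or above threshold."""
    token_sets = [tokens(p) for p in postings]
    pairs = []
    for i, j in combinations(range(len(postings)), 2):
        score = jaccard(token_sets[i], token_sets[j])
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


# Hypothetical postings, invented for illustration only (not project data).
POSTINGS = [
    "Data Scientist wanted: Python, machine learning, SQL. Location: Berlin.",
    "Data scientist wanted - Python, SQL, machine learning (location Berlin)",
    "Warehouse operative, night shifts, forklift licence required.",
]

if __name__ == "__main__":
    for i, j, score in find_duplicates(POSTINGS):
        print(f"postings {i} and {j} look like duplicates (Jaccard = {score:.2f})")
```

Comparing every pair is quadratic in the number of postings, which is acceptable for a small sample; a corpus collected from many portals would more plausibly need an indexing or hashing step before pairwise comparison.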
The final IO3 Report
The IO2 report addresses two main tasks of the DISKOW project: identifying potential data sources and evaluating them against the schema proposed in IO1. The first part of the report therefore focuses on identifying and classifying online data sources. It classifies job portals into three categories (global, European, and local country-specific) and explores multiple existing portals. It then sets out to identify the most relevant and accessible data sources for developing the JKB. Subsequently, DISKOW project partners are contacting these job portals to obtain access to the data. The report also explores open data sources such as social media: it examines Twitter data as one potential source, discusses the merits and demerits of this open data source, and develops a framework for creating a job knowledge base from social media data. The next part of the report extends the findings of IO1. It seeks to evaluate and verify the schema prepared in IO1 in the context of the various potential data sources and, finally, suggests how the data sources identified in IO2 and the schema prepared in IO1 can be integrated into the subsequent intellectual outputs.
This IO covers the modeling and meta-modeling of the Job Knowledge (JK) as the theoretical and conceptual part of the DISKOW project. Accordingly, this report provides the terminology required for the project, describes the structure and ontology of the JK in detail, and demonstrates its hierarchical architecture. To provide a standard and extensible JK model that can be integrated into a wide variety of tools and software technologies, we adopted an existing ontology for job knowledge from schema.org and extended it with additional properties, following the best practices for linked open data. We showcase the application of the proposed JK model in the field of data science as an example. We envision that the outcome of this IO will facilitate the adaptation of the JK model to a wide variety of sectors and case studies.
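To make the idea of extending the schema.org ontology concrete, the sketch below builds a JSON-LD description of a data science job using the schema.org `JobPosting` type and adds one extra property under a separate prefix. The `jk` namespace URL, the `seniorityLevel` property, the `make_job_posting` helper and the sample values are hypothetical and serve only as an illustration; the additional properties actually defined in the project are not listed in this summary.

```python
import json

# Hypothetical extension namespace; the real DISKOW vocabulary is not given here.
JK_NAMESPACE = "https://example.org/jk#"


def make_job_posting(title, organization, locality, skills, extra=None):
    """Build a JSON-LD dict for a schema.org JobPosting with optional jk: extensions."""
    doc = {
        # Unprefixed terms resolve against schema.org; "jk:" terms use the extension.
        "@context": {"@vocab": "https://schema.org/", "jk": JK_NAMESPACE},
        "@type": "JobPosting",
        "title": title,
        "hiringOrganization": {"@type": "Organization", "name": organization},
        "jobLocation": {
            "@type": "Place",
            "address": {"@type": "PostalAddress", "addressLocality": locality},
        },
        "skills": list(skills),
    }
    # Extension properties are prefixed so they cannot collide with schema.org terms.
    for key, value in (extra or {}).items():
        doc[f"jk:{key}"] = value
    return doc


# Hypothetical example values, invented for illustration only.
EXAMPLE = make_job_posting(
    title="Data Scientist",
    organization="Example Analytics Ltd",
    locality="York",
    skills=["Python", "machine learning", "statistics"],
    extra={"seniorityLevel": "mid"},
)

if __name__ == "__main__":
    print(json.dumps(EXAMPLE, indent=2))
```

Keeping the extra properties in their own namespace means that tools which only understand schema.org can still read the standard fields and ignore the rest.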
Dr. Daniel Kudenko