What is our methodology? What are our data sources?
How we work in 5 simple steps:
I. Collect data
We collect data from hundreds of sources, mainly job boards, social networks, public salary surveys, and company sites that publish salaries. For hundreds of job profiles, we record attributes such as job title, years of experience, seniority level, job description, number of jobs posted, number of professionals available, and the compensation packages offered. Some examples of the data sources we use:
Job boards: Indeed, Monster
Salary sites: Buffer, Levels.fyi
Surveys: Ask A Manager Survey, Recruiter Salaries
Social networks: LinkedIn, Xing
II. Deduplicate and normalize data
We first remove duplicate records from the dataset, then normalize it so that it can be modelled.
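As a rough illustration of this step, the sketch below cleans a handful of records. The field names (title, city, salary) and the normalization rules (trimmed, lower-cased text and numeric salaries) are assumptions made for the example, not a description of our actual pipeline. It compares records in their normalized form so that formatting differences do not hide a duplicate.

```python
def normalize(record):
    """Return a copy with trimmed, lower-cased text fields and a numeric salary."""
    return {
        # Collapse repeated whitespace and ignore case in text fields.
        "title": " ".join(record["title"].lower().split()),
        "city": " ".join(record["city"].lower().split()),
        # Salaries may arrive as strings or numbers; store them as floats.
        "salary": float(record["salary"]),
    }


def deduplicate(records):
    """Keep the first occurrence of each normalized record, preserving order."""
    seen = set()
    unique = []
    for rec in map(normalize, records):
        key = (rec["title"], rec["city"], rec["salary"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical sample rows: the first two differ only in formatting.
raw = [
    {"title": "Data  Engineer", "city": "Berlin", "salary": "70000"},
    {"title": "data engineer ", "city": " berlin", "salary": 70000},
    {"title": "Recruiter", "city": "London", "salary": 45000},
]
clean = deduplicate(raw)
```

Here the first two rows collapse into one record, so clean holds two entries.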
III. Split dataset for training & testing
The cleaned dataset is then split into training and testing sets. The system learns from the training sets, and we use the testing sets to validate the various models we experiment with.
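A minimal sketch of such a split is shown below. The 80/20 ratio, the fixed seed, and the function name split_dataset are assumptions chosen for illustration; the actual proportions we use may differ.

```python
import random


def split_dataset(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (training, testing) lists."""
    shuffled = list(rows)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in data: 100 row identifiers instead of real salary records.
rows = list(range(100))
train, test = split_dataset(rows)
```

Shuffling before the split keeps both sets representative of the whole dataset, and no record appears in both sets.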
IV. Create decision tree models
We use regression models to map the relationships between cities, countries, roles, and seniority levels. This lets us predict salaries accurately even when the sample size is small. We validate each model to ensure its accuracy is greater than 85%.
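To make the validation check concrete, here is one possible way to compute an accuracy figure for salary predictions. The definition used, one minus the mean absolute percentage error (MAPE), is an assumption made for this sketch, as are the function name and the sample numbers.

```python
def accuracy_from_mape(actual, predicted):
    """Return 1 - MAPE, where MAPE is the mean absolute percentage error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    # Relative error of each prediction against the known salary.
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 1.0 - sum(errors) / len(errors)


# Hypothetical test-set salaries and the model's predictions for them.
actual = [50000.0, 80000.0, 100000.0]
predicted = [55000.0, 72000.0, 100000.0]

score = accuracy_from_mape(actual, predicted)
passes = score > 0.85  # the 85% threshold mentioned above
```

In this example two predictions are off by 10% and one is exact, so the score is about 0.93 and the model would pass the threshold.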
V. Predict salaries
Once the models are validated, the system makes salary predictions, which are then published on the platform.