OJD DAPS Salaries
This library provides a Python interface to the OJD DAPS Salaries model, which extracts and standardises salaries from job adverts. The model was developed using a large dataset of job adverts and their associated salaries, and is designed to be used in conjunction with online job adverts.
Installation
This library can be installed using pip. Ensure you have Python 3.10 or newer installed on your system before proceeding.
pip install git+https://github.com/nestauk/ojd_daps_salaries.git
Usage
Basic Usage
To use the model, import it and call the extract_salary function with a dictionary representing a job advert, containing the following variables:
raw_salary_currency: The currency of the salary (e.g. "GBP", "USD", "EUR")
raw_salary_rate: The rate of the salary (e.g. "year", "month", "day")
raw_min_salary: The minimum salary
raw_max_salary: The maximum salary
The model will return a dictionary, containing the raw salaries and the annualised salaries:
from ojd_daps_salaries import extract_salary
job_advert = {
    "raw_salary_currency": "GBP",
    "raw_salary_rate": "year",
    "raw_min_salary": 20000,
    "raw_max_salary": 30000,
}
salary = extract_salary(job_advert)
print(salary)
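The annualised values in the output put salaries quoted at different rates on a common yearly footing. The conversion factors the library uses are not stated in this README, so the following is only a minimal sketch of the idea, assuming 12 months and 260 working days per year. The `ASSUMED_PERIODS_PER_YEAR` table and the `annualise` helper are hypothetical and are not part of `ojd_daps_salaries`.

```python
# Illustrative only: these multipliers are assumptions, not the library's values.
ASSUMED_PERIODS_PER_YEAR = {
    "year": 1,
    "month": 12,  # assumed: 12 pay periods per year
    "day": 260,   # assumed: 5 working days x 52 weeks
}


def annualise(amount, rate):
    """Scale a salary quoted per `rate` to an approximate yearly figure."""
    try:
        return amount * ASSUMED_PERIODS_PER_YEAR[rate]
    except KeyError:
        raise ValueError(f"Unsupported salary rate: {rate!r}") from None


job_advert = {
    "raw_salary_currency": "GBP",
    "raw_salary_rate": "month",
    "raw_min_salary": 2000,
    "raw_max_salary": 2500,
}

annual_min = annualise(job_advert["raw_min_salary"], job_advert["raw_salary_rate"])
annual_max = annualise(job_advert["raw_max_salary"], job_advert["raw_salary_rate"])
print(annual_min, annual_max)  # 24000 30000
```

The figures returned by `extract_salary` may differ from this sketch if the model uses other conventions, for example a different number of working days.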
Working with Job Adverts
The model can also be applied to a dataframe of job adverts, provided it contains the relevant variables as columns:
import pandas as pd
from ojd_daps_salaries import extract_salary
# Load job adverts
job_adverts = pd.read_csv("job_adverts.csv")
# Extract salaries, passing each row to extract_salary as a dictionary
job_adverts["salary"] = job_adverts.apply(lambda row: extract_salary(row.to_dict()), axis=1)
print(job_adverts["salary"])
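Before applying the model to a whole dataset, it can help to check that each advert carries the four variables listed under Basic Usage. The following is a minimal sketch using plain dictionaries, so it runs without pandas. The `missing_salary_fields` helper is hypothetical and not part of the library, and both requiring all four fields and treating `None` as missing are assumptions.

```python
REQUIRED_FIELDS = (
    "raw_salary_currency",
    "raw_salary_rate",
    "raw_min_salary",
    "raw_max_salary",
)


def missing_salary_fields(job_advert):
    """Return the required salary fields that are absent or None in an advert."""
    return [field for field in REQUIRED_FIELDS if job_advert.get(field) is None]


adverts = [
    {"raw_salary_currency": "GBP", "raw_salary_rate": "year",
     "raw_min_salary": 20000, "raw_max_salary": 30000},
    {"raw_salary_currency": "GBP", "raw_salary_rate": "day",
     "raw_min_salary": 150},  # no maximum salary given
]

# Keep only adverts that are complete enough to pass to extract_salary.
complete = [advert for advert in adverts if not missing_salary_fields(advert)]
print(len(complete))  # 1
print(missing_salary_fields(adverts[1]))  # ['raw_max_salary']
```

Missing values loaded by pandas appear as NaN rather than `None`, so a dataframe version of this check would need `pd.isna` instead of the `is None` test.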
Methodology
For details of how we developed the model and how it performs, please refer to our documentation here.
License
This project is licensed under the MIT License - see the LICENSE file for details.