CLASSIFIER API
classify
API endpoint to classify a job title or single-line text
URL
https://www.janzz.jobs/japi/classifier/
ALLOWABLE METHODS
GET, POST
PARAMETERS
- q
  - the text to classify
- lang
  - format: ISO 639-1, 2-character language code, example: de, en, fr, …
  - default value: defined for each client configuration
  - effect: search in this language, output all data in this language
- detect_langs
  - format: ISO 639-1, 2-character language code, example: de, en, fr, …
  - default value: empty
  - multiple values: repeat parameter
  - effect: when the source language is not known, a combination of Python langdetect and the concept graph data is used to detect the input language. If this is used, the detected languages will be returned as search_lang in the output.
- want_codes
  - format: classification code, example: ISCO-08
  - multiple values: repeat parameter
- cls_data_ + classification
  - example: cls_data_ISCO-08
  - effect: provide known classification data relating to the input in order to boost search results.
  - multiple values: repeat parameter for each classification
- output
  - format: html or empty
  - default value: empty
  - effect: output as indented HTML if set to html, otherwise output as JSON
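The repeated-parameter convention and the cls_data_ prefix described above can be sketched with the Python standard library. This is a minimal sketch, not an official client: the helper name build_classify_url and its keyword arguments are hypothetical, and the endpoint path is assumed to be the one used in this documentation's sample request. The client parameter that appears in that sample is left out because it is not described in the parameter list.

```python
from urllib.parse import urlencode

# Assumption: path taken from the documentation's sample request.
BASE_URL = "https://www.janzz.jobs/japi/classifier/classify/"


def build_classify_url(q, lang=None, detect_langs=(), want_codes=(),
                       cls_data=None, output=None):
    """Return a GET URL for the classify endpoint (hypothetical helper)."""
    params = [("q", q)]
    if lang:
        params.append(("lang", lang))
    # Multi-valued parameters are sent by repeating the parameter name.
    params += [("detect_langs", code) for code in detect_langs]
    params += [("want_codes", code) for code in want_codes]
    # Known classification data: cls_data_<classification>=<value>.
    for classification, values in (cls_data or {}).items():
        params += [(f"cls_data_{classification}", value) for value in values]
    if output:
        params.append(("output", output))
    return BASE_URL + "?" + urlencode(params)


url = build_classify_url(
    "test engineer",
    lang="en",
    detect_langs=["en", "de"],
    want_codes=["ISCO-08"],
    cls_data={"JANZZ_Industries": ["201"]},
)
print(url)
```

Passing a list of (name, value) pairs to urlencode keeps the parameter order stable and allows the same name to appear more than once.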
REQUEST HEADERS
- X_TXID
  - Transaction ID. It is saved in the logs and returned in the response headers, so that specific entries can be found in the log files when a transaction ID is provided.
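Attaching a transaction ID to a request might look like the following. This is a sketch under assumptions: build_request is a hypothetical helper, the use of a random UUID as the ID is an illustrative choice (the accepted format is not specified here), and the request is only constructed, not sent.

```python
import uuid
from urllib.request import Request


def build_request(url, txid=None):
    """Return (request, txid) with the transaction ID set as X_TXID."""
    # Assumption: any unique string is acceptable as a transaction ID.
    txid = txid or uuid.uuid4().hex
    request = Request(url, headers={"X_TXID": txid})
    return request, txid


request, txid = build_request(
    "https://www.janzz.jobs/japi/classifier/classify/?q=test+engineer&lang=en",
    txid="demo-0001",
)
# urllib stores header names capitalized ("X_txid"); HTTP header names
# are case-insensitive, so the server still sees the same header.
print(request.get_header("X_txid"))
```

Keeping the returned txid alongside the request makes it possible to quote the same ID later when asking for a log lookup.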
SAMPLE REQUEST
https://www.janzz.jobs/japi/classifier/classify/?output=html&client=nav1&q=test+engineer&lang=en&want_codes=ISCO-08&cls_data_JANZZ_Industries=201&detect_langs=en&detect_langs=de
SAMPLE RESPONSE
{
  "runtime": "242ms",
  "result": {
    "cid": [
      {
        "id": 97173,
        "score": 1.0,
        "raw_score": 1668.9249955882344,
        "label": "Test Engineer",
        "cid_info": {
          "occupation_class": 3,
          "classifications": {
            "ISCO-08": [
              {
                "code": "2149",
                "descr": "... description",
                "label": "Engineering professionals not elsewhere classified"
              }
            ]
          },
          "cid": 97173
        },
        "boosts": {
          "score": 1335.1399964705877,
          "boost": 0.8,
          "old_score": 1668.9249955882344,
          "tags": [
            "JI-"
          ]
        }
      },
      ...
    ],
    "ISCO-08": [
      {
        "id": "2149",
        "score": 1.0,
        "raw_score": 1694.294045156798,
        "label": "Engineering professionals not elsewhere classified",
        "descr": "... description of ISCO-08 2149"
      },
      ...
    ],
    "skills": []
  },
  "search_lang": "en"
}
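Reading the top concepts out of a response of this shape could be sketched as below. The embedded JSON is a trimmed copy of the sample response with the elided entries removed, and top_concepts is a hypothetical helper, not part of the API.

```python
import json

# Trimmed copy of the sample response (elided "..." entries removed).
SAMPLE = json.loads("""
{
  "runtime": "242ms",
  "result": {
    "cid": [
      {
        "id": 97173,
        "score": 1.0,
        "label": "Test Engineer",
        "cid_info": {
          "occupation_class": 3,
          "classifications": {
            "ISCO-08": [
              {"code": "2149",
               "label": "Engineering professionals not elsewhere classified"}
            ]
          },
          "cid": 97173
        }
      }
    ],
    "skills": []
  },
  "search_lang": "en"
}
""")


def top_concepts(response):
    """Yield (id, label, score, {classification: [codes]}) per concept."""
    for concept in response["result"].get("cid", []):
        # cid_info depends on the client configuration, so it may be absent.
        classifications = concept.get("cid_info", {}).get("classifications", {})
        codes = {name: [entry["code"] for entry in entries]
                 for name, entries in classifications.items()}
        yield concept["id"], concept["label"], concept["score"], codes


rows = list(top_concepts(SAMPLE))
for row in rows:
    print(row)
```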
DESCRIPTION OF FIELDS
- runtime - how long it took to run the classification process
- cid - top concepts
  - id - concept ID
  - score - normalized score from 0 to 1.0
  - raw_score - raw score before normalization
  - boosts
    - input classifications provide positive boosts (> 1.0) to concepts that match the classification value and negative boosts (< 1.0) to ones that don’t.
    - score - score after the boost is applied
    - boost - factor that the raw score is multiplied by to get the new score
    - tags - list of boosts, configurable per client. For example, JI+ means JANZZ_industry matched, JI- means JANZZ_industry did not match.
  - cid_info - for each concept, depending on the client configuration, cid_info is provided. It contains occupation_class and all classifications from want_codes for that concept.
- classification - results for each classification in want_codes
  - these values usually correspond to the classification values of the top concepts; however, they may contain classification values which were not in the best_concepts. This happens when a concept is found that is not classified, but its siblings, parents, or other related concepts are classified.
- search_lang - detected input language when detect_langs is used.
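The boost arithmetic can be checked against the numbers in the sample response. The sketch below assumes that old_score in the boosts object is the score before the boost (in the sample it equals raw_score); apply_boost is a hypothetical helper.

```python
import math


def apply_boost(old_score, boost):
    """Score after the boost: the old score multiplied by the boost factor."""
    return old_score * boost


# Values copied from the boosts object in the sample response.
boosts = {
    "score": 1335.1399964705877,
    "boost": 0.8,
    "old_score": 1668.9249955882344,
    "tags": ["JI-"],
}

recomputed = apply_boost(boosts["old_score"], boosts["boost"])
# Compare with a tolerance rather than ==, since these are floats.
consistent = math.isclose(recomputed, boosts["score"], rel_tol=1e-9)
# A factor below 1.0 is a negative boost, which fits the JI- tag.
is_negative_boost = boosts["boost"] < 1.0
print(consistent, is_negative_boost)
```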
typeahead
API endpoint to use typeahead for classification labels and concept labels, using the JANZZ classifier, as an alternative to /concepts/ and /labels/
URL
https://www.janzz.jobs/japi/classifier/typeahead/
ALLOWABLE METHODS
GET
SAMPLE REQUEST
https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=10&num_cls_label_results=0&lang=en
https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=0&num_cls_label_results=20&want_codes=ISCO-08
https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=10&num_cls_label_results=0&want_codes=ISCO-08
https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=10&num_cls_label_results=0
PARAMETERS
- q
  - the user input in the typeahead
- want_codes
  - only return labels from concepts which are classified with this classification; also include the classification values for each returned concept.
  - default value: empty
  - multiple values: repeat parameter
- num_results
  - return N labels
- num_cls_label_results
  - return N classification labels which match the typeahead input. These will have a concept id (cid) of 0.
- output
  - format: html or empty
  - default value: empty
  - effect: output as indented HTML if set to html, otherwise output as JSON
SAMPLE RESPONSE
{
  "result": [
    {
      "label": "Animal producers",
      "classifications": {
        "ISCO-08": [
          {
            "code": "612",
            "descr": "... description of code",
            "label": "Animal producers"
          }
        ]
      },
      "cid": 0
    },
    {
      "label": "Producer",
      "classifications": {
        "ISCO-08": [
          {
            "code": "2654",
            "descr": "... description of code",
            "label": "Film, stage and related directors and producers"
          },
          {
            "code": "9999",
            "descr": "Too generic - not classified",
            "label": "Not classified"
          }
        ]
      },
      "cid": 55844
    },
    {
      "label": "Producer (Performing Arts)",
      "classifications": {
        "ISCO-08": [
          {
            "code": "2654",
            "descr": "... description of code",
            "label": "Film, stage and related directors and producers"
          }
        ]
      },
      "cid": 22596
    }
  ]
}
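Since classification labels are returned with a cid of 0, a client can separate them from concept labels with a simple filter. A minimal sketch, using a trimmed copy of the sample response; split_typeahead is a hypothetical helper.

```python
def split_typeahead(response):
    """Return (classification_labels, concept_labels) from a typeahead response."""
    results = response["result"]
    # Classification labels carry cid 0; concept labels carry a real concept ID.
    classification_labels = [r["label"] for r in results if r["cid"] == 0]
    concept_labels = [(r["cid"], r["label"]) for r in results if r["cid"] != 0]
    return classification_labels, concept_labels


# Trimmed copy of the sample response (classifications omitted).
sample = {
    "result": [
        {"label": "Animal producers", "cid": 0},
        {"label": "Producer", "cid": 55844},
        {"label": "Producer (Performing Arts)", "cid": 22596},
    ]
}

classification_labels, concept_labels = split_typeahead(sample)
print(classification_labels)
print(concept_labels)
```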