CLASSIFIER API
==============

classify
--------

API endpoint to classify a job title or single-line text

URL
***

::

    https://www.janzz.jobs/japi/classifier/

ALLOWABLE METHODS
*****************

GET, POST

PARAMETERS
**********

- q

  - the text to classify

- lang

  - format: ISO 639-1, 2 character language code, example: de, en, fr, ...
  - default value: defined for each client configuration
  - effect: search in this language, output all data in this language

- detect_langs

  - format: ISO 639-1, 2 character language code, example: de, en, fr, ...
  - default value: empty
  - multiple values: repeat parameter
  - effect: when the source language is not known, use a combination of
    Python langdetect and the concept graph data to detect the input
    language. When this parameter is used, the detected language is
    returned as ``search_lang`` in the output.

- want_codes

  - format: classification code, example: ISCO-08
  - multiple values: repeat parameter

- `cls_data_` + classification

  - example: cls_data_ISCO-08
  - effect: provide known classification data relating to the input in
    order to boost search results.
  - multiple values: repeat parameter for each classification

- output

  - format: html or empty
  - default value: empty
  - effect: output as indented html if set to html, otherwise output as json

REQUEST HEADERS
***************

- X_TXID

  - Transaction ID. It is saved in the logs and returned in the response
    headers, so that log entries for a specific transaction can be found
    when its transaction ID is provided.

SAMPLE REQUEST
**************

::

    https://www.janzz.jobs/japi/classifier/classify/?output=html&client=nav1&q=test+engineer&lang=en&want_codes=ISCO-08&cls_data_JANZZ_Industries=201&detect_langs=en&detect_langs=de

SAMPLE RESPONSE
***************

::

    {
        "runtime": "242ms",
        "result": {
            "cid": [
                {
                    "id": 97173,
                    "score": 1.0,
                    "raw_score": 1668.9249955882344,
                    "label": "Test Engineer",
                    "cid_info": {
                        "occupation_class": 3,
                        "classifications": {
                            "ISCO-08": [
                                {
                                    "code": "2149",
                                    "descr": "... description",
                                    "label": "Engineering professionals not elsewhere classified"
                                }
                            ]
                        },
                        "cid": 97173
                    },
                    "boosts": {
                        "score": 1335.1399964705877,
                        "boost": 0.8,
                        "old_score": 1668.9249955882344,
                        "tags": [
                            "JI-"
                        ]
                    }
                },
                ...
            ],
            "ISCO-08": [
                {
                    "id": "2149",
                    "score": 1.0,
                    "raw_score": 1694.294045156798,
                    "label": "Engineering professionals not elsewhere classified",
                    "descr": "... description of ISCO-08 2149"
                },
                ...
            ],
            "skills": []
        },
        "search_lang": "en"
    }

DESCRIPTION OF FIELDS
*********************

- **runtime** - how long it took to run the classification process
- **cid** - top concepts

  - id - concept ID
  - score - normalized score from 0 to 1.0
  - raw_score - raw score before normalization
  - boosts - input classifications provide positive boosts (> 1.0) to
    concepts that match the classification value and negative boosts
    (< 1.0) to ones that don't.

    - score - score after the boost is applied
    - boost - factor by which the raw score is multiplied to get the new score
    - old_score - raw score before the boost is applied
    - tags - list of boosts, configurable per client. For example, JI+
      means JANZZ_industry matched, JI- means JANZZ_industry did not match.

  - cid_info - for each concept, depending on the client configuration,
    cid_info is provided. It contains occupation_class and all
    classifications from want_codes for that concept.

- **classification** - results for each classification in want_codes

  - these values usually correspond to the classification values of the
    top concepts. However, they may contain classification values that
    were not among the top concepts. This happens when a concept is found
    that is not classified itself, but its siblings, parents or other
    related concepts are classified.

- **search_lang** - detected input language when detect_langs is used.
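
The parameters and response fields described above can be exercised with a short Python sketch. It is an illustration only, not part of the API: the helper names ``build_classify_url``, ``top_concepts`` and ``boosted_score`` are hypothetical, and the endpoint path is taken from the sample request. The sketch shows how multi-valued parameters are repeated, how a ``cls_data_`` parameter name is formed, and how ``boosts.score`` relates to ``old_score`` and ``boost``.

```python
from urllib.parse import urlencode

# Endpoint path as it appears in the sample request (assumption: same host for all clients).
CLASSIFY_URL = "https://www.janzz.jobs/japi/classifier/classify/"


def build_classify_url(q, lang=None, detect_langs=(), want_codes=(),
                       cls_data=None, output=None):
    """Build a GET URL for the classify endpoint (hypothetical helper)."""
    params = [("q", q)]
    if lang:
        params.append(("lang", lang))
    # Multi-valued parameters are sent by repeating the parameter name.
    params += [("detect_langs", code) for code in detect_langs]
    params += [("want_codes", name) for name in want_codes]
    # Known classification data: "cls_data_" + classification name.
    for name, values in (cls_data or {}).items():
        params += [("cls_data_" + name, value) for value in values]
    if output:
        params.append(("output", output))
    return CLASSIFY_URL + "?" + urlencode(params)


def top_concepts(response):
    """Return (id, label, score) for each concept listed under result.cid."""
    return [(c["id"], c["label"], c["score"]) for c in response["result"]["cid"]]


def boosted_score(boosts):
    """Recompute the boosted score as old_score multiplied by boost."""
    return boosts["old_score"] * boosts["boost"]


if __name__ == "__main__":
    print(build_classify_url(
        "test engineer",
        lang="en",
        detect_langs=["en", "de"],
        want_codes=["ISCO-08"],
        cls_data={"JANZZ_Industries": [201]},
    ))
```

Passing the parameters as a list of pairs rather than a dictionary keeps repeated names such as ``detect_langs`` in a predictable order. The ``X_TXID`` header is not set here; it would be added by whichever HTTP client sends the request.
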

typeahead
---------

API endpoint to use typeahead for classification labels and concept labels,
using the JANZZ classifier, as an alternative to /concepts/ and /labels/

URL
***

::

    https://www.janzz.jobs/japi/classifier/typeahead/

ALLOWABLE METHODS
*****************

GET

SAMPLE REQUEST
**************

``https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=10&num_cls_label_results=0&lang=en``

``https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=0&num_cls_label_results=20&want_codes=ISCO-08``

``https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=10&num_cls_label_results=0&want_codes=ISCO-08``

``https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=10&num_cls_label_results=0``

PARAMETERS
**********

- q

  - the user input in the typeahead

- want_codes

  - only return labels from concepts which are classified with this
    classification, and include the classification values for each
    returned concept.
  - default value: empty
  - multiple values: repeat parameter

- num_results

  - return N labels

- num_cls_label_results

  - return N classification labels which match the typeahead input.
    These will have a concept id (cid) of 0.

- output

  - format: html or empty
  - default value: empty
  - effect: output as indented html if set to html, otherwise output as json

SAMPLE RESPONSE
***************

::

    {
        "result": [
            {
                "label": "Animal producers",
                "classifications": {
                    "ISCO-08": [
                        {
                            "code": "612",
                            "descr": "... description of code",
                            "label": "Animal producers"
                        }
                    ]
                },
                "cid": 0
            },
            {
                "label": "Producer",
                "classifications": {
                    "ISCO-08": [
                        {
                            "code": "2654",
                            "descr": "... description of code",
                            "label": "Film, stage and related directors and producers"
                        },
                        {
                            "code": "9999",
                            "descr": "Too generic - not classified",
                            "label": "Not classified"
                        }
                    ]
                },
                "cid": 55844
            },
            {
                "label": "Producer (Performing Arts)",
                "classifications": {
                    "ISCO-08": [
                        {
                            "code": "2654",
                            "descr": "... description of code",
                            "label": "Film, stage and related directors and producers"
                        }
                    ]
                },
                "cid": 22596
            }
        ]
    }
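
Because classification labels are returned with a concept id (cid) of 0, a client usually needs to separate them from concept labels before display. The following Python sketch shows one way to do that, along with building the request URL. The helper names ``build_typeahead_url`` and ``split_typeahead_results`` are hypothetical and not part of the API.

```python
from urllib.parse import urlencode

# Endpoint URL as given in the URL section of this endpoint.
TYPEAHEAD_URL = "https://www.janzz.jobs/japi/classifier/typeahead/"


def build_typeahead_url(q, num_results=10, num_cls_label_results=0,
                        want_codes=(), output=None):
    """Build a GET URL for the typeahead endpoint (hypothetical helper)."""
    params = [
        ("q", q),
        ("num_results", num_results),
        ("num_cls_label_results", num_cls_label_results),
    ]
    # want_codes accepts multiple values by repeating the parameter.
    params += [("want_codes", name) for name in want_codes]
    if output:
        params.append(("output", output))
    return TYPEAHEAD_URL + "?" + urlencode(params)


def split_typeahead_results(response):
    """Split results into (concept entries, classification label entries).

    Entries with cid == 0 are classification labels; all others are concepts.
    """
    concepts, cls_labels = [], []
    for item in response.get("result", []):
        if item.get("cid") == 0:
            cls_labels.append(item)
        else:
            concepts.append(item)
    return concepts, cls_labels


if __name__ == "__main__":
    print(build_typeahead_url("prod", want_codes=["ISCO-08"]))
```

Applied to the sample response, "Animal producers" would fall into the classification label group, while "Producer" and "Producer (Performing Arts)" would be treated as concepts.
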