CLASSIFIER API

classify

API endpoint to classify a job title or single-line text

URL

https://www.janzz.jobs/japi/classifier/classify/

ALLOWABLE METHODS

GET, POST

PARAMETERS

  • q
    • the text to classify
  • lang
    • format: ISO 639-1, 2-character language code, example: de, en, fr, …
    • default value: defined for each client configuration
    • effect: search in this language, output all data in this language
  • detect_langs
    • format: ISO 639-1, 2-character language code, example: de, en, fr, …
    • default value: empty
    • multiple values: repeat parameter
    • effect: when the source language is not known, use a combination of the Python langdetect library and the concept graph data to detect the input language among the given candidates. If this is used, the detected language is returned as search_lang in the output.
  • want_codes
    • format: classification code, example: ISCO-08
    • multiple values: repeat parameter
  • cls_data_ + classification
    • example: cls_data_ISCO-08
    • effect: provide known classification data relating to the input in order to boost search results.
    • multiple values: repeat parameter for each classification
  • output
    • format: html or empty
    • default value: empty
    • effect: output as indented html if set to html, otherwise output as json
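
The "repeat parameter" convention above can be sketched in Python with the standard library. This is a minimal sketch, not an official client: the helper name build_classify_url and its argument names are hypothetical, and the base URL is the one used in the sample request in this document.

```python
from urllib.parse import urlencode

# Base URL as it appears in the sample request of this document.
CLASSIFY_URL = "https://www.janzz.jobs/japi/classifier/classify/"


def build_classify_url(q, lang=None, detect_langs=(), want_codes=(),
                       cls_data=None, output=None):
    """Build a GET URL for the classify endpoint (hypothetical helper)."""
    params = [("q", q)]
    if lang:
        params.append(("lang", lang))
    # Multi-valued parameters are sent by repeating the parameter name.
    params += [("detect_langs", code) for code in detect_langs]
    params += [("want_codes", code) for code in want_codes]
    # One cls_data_<classification> parameter per known classification,
    # e.g. {"JANZZ_Industries": "201"} becomes cls_data_JANZZ_Industries=201.
    for classification, value in (cls_data or {}).items():
        params.append((f"cls_data_{classification}", value))
    if output:
        params.append(("output", output))
    return CLASSIFY_URL + "?" + urlencode(params)


url = build_classify_url(
    "test engineer",
    lang="en",
    detect_langs=["en", "de"],
    want_codes=["ISCO-08"],
    cls_data={"JANZZ_Industries": "201"},
)
print(url)
```

Passing a list of pairs to urlencode, rather than a dict, is what allows the same parameter name to appear more than once.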

REQUEST HEADERS

  • X_TXID
    • Transaction ID. It is saved in the logs and returned in the response headers, so that the log entries for a specific transaction can be found when its ID is provided.
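
As an illustration, a request carrying the X_TXID header could be prepared as follows. This is a sketch under assumptions: the helper name make_request is hypothetical, and using a random UUID as the transaction ID is an assumption, since this document does not specify a format for the ID.

```python
import uuid
from urllib.request import Request


def make_request(url, txid=None):
    """Prepare a request that carries a transaction ID (hypothetical helper)."""
    # Assumption: any unique string is acceptable as a transaction ID.
    txid = txid or uuid.uuid4().hex
    # urllib stores header names in capitalized form ("X_txid");
    # HTTP header names are case-insensitive, so this is the same header.
    req = Request(url, headers={"X_TXID": txid})
    return req, txid


req, txid = make_request(
    "https://www.janzz.jobs/japi/classifier/classify/?q=test+engineer",
    txid="abc123",
)
print(req.get_header("X_txid"))
# The request would be sent with urllib.request.urlopen(req); keep txid
# so it can be quoted when asking for the matching log entries.
```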

SAMPLE REQUEST

https://www.janzz.jobs/japi/classifier/classify/?output=html&client=nav1&q=test+engineer&lang=en&want_codes=ISCO-08&cls_data_JANZZ_Industries=201&detect_langs=en&detect_langs=de

SAMPLE RESPONSE

{
    "runtime": "242ms",
    "result": {
        "cid": [
            {
                "id": 97173,
                "score": 1.0,
                "raw_score": 1668.9249955882344,
                "label": "Test Engineer",
                "cid_info": {
                    "occupation_class": 3,
                    "classifications": {
                        "ISCO-08": [
                            {
                                "code": "2149",
                                "descr": "... description",
                                "label": "Engineering professionals not elsewhere classified"
                            }
                        ]
                    },
                    "cid": 97173
                },
                "boosts": {
                  "score": 1335.1399964705877,
                  "boost": 0.8,
                  "old_score": 1668.9249955882344,
                  "tags": [
                      "JI-"
                  ]
                }
            },
            ...
        ],
        "ISCO-08": [
            {
                "id": "2149",
                "score": 1.0,
                "raw_score": 1694.294045156798,
                "label": "Engineering professionals not elsewhere classified",
                "descr": "... description of ISCO-08 2149"
            },
            ...
        ],
        "skills": []
    },
    "search_lang": "en"
}

DESCRIPTION OF FIELDS

  • runtime - how long it took to run the classification process
  • cid - top concepts
    • id - concept ID
    • score - normalized score from 0 to 1.0
    • raw_score - raw score before normalization
    • boosts
      • input classifications provide positive boosts (> 1.0) to concepts that match the classification value and negative boosts (< 1.0) to those that don’t.
      • score - score after the boost is applied
      • boost - factor by which the raw score is multiplied to get the new score
      • tags - list of boosts, configurable per client. For example, JI+ means JANZZ_industry matched, JI- means JANZZ_industry did not match.
    • cid_info - depending on the client configuration, cid_info is provided for each concept. It contains occupation_class and all classifications from want_codes for that concept.
  • classification - results for each classification in want_codes
    • these values usually correspond to the classification values of the top concepts. However, they may contain classification values that were not among the top concepts. This happens when a concept is found that is not classified itself, but its siblings, parents, or other related concepts are classified.
  • search_lang - detected input language when detect_langs is used.
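
The fields above can be read from a JSON response as sketched below. The data is a trimmed copy of the sample response in this document; the variable names are illustrative only. The check confirms that, in the sample, boosts.score equals boosts.old_score multiplied by boosts.boost.

```python
import json
import math

# Trimmed copy of the sample response shown in this document.
sample = json.loads("""
{
  "runtime": "242ms",
  "result": {
    "cid": [
      {
        "id": 97173,
        "score": 1.0,
        "raw_score": 1668.9249955882344,
        "label": "Test Engineer",
        "boosts": {
          "score": 1335.1399964705877,
          "boost": 0.8,
          "old_score": 1668.9249955882344,
          "tags": ["JI-"]
        }
      }
    ],
    "ISCO-08": [
      {"id": "2149", "score": 1.0, "raw_score": 1694.294045156798,
       "label": "Engineering professionals not elsewhere classified"}
    ],
    "skills": []
  },
  "search_lang": "en"
}
""")

top = sample["result"]["cid"][0]
boosts = top["boosts"]

# The boosted score is the pre-boost score multiplied by the boost factor.
boost_ok = math.isclose(boosts["old_score"] * boosts["boost"], boosts["score"])
# A factor below 1.0 is a negative boost; the JI- tag explains why.
penalized = boosts["boost"] < 1.0
# Classification results are keyed by the code name given in want_codes.
top_isco = sample["result"]["ISCO-08"][0]["id"]

print(top["label"], boost_ok, penalized, boosts["tags"], top_isco)
```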

typeahead

API endpoint to use typeahead for classification labels and concept labels, using the JANZZ classifier, as an alternative to /concepts/ and /labels/

URL

https://www.janzz.jobs/japi/classifier/typeahead/

ALLOWABLE METHODS

GET

SAMPLE REQUEST

https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=10&num_cls_label_results=0&lang=en
https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=0&num_cls_label_results=20&want_codes=ISCO-08
https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=10&num_cls_label_results=0&want_codes=ISCO-08
https://www.janzz.jobs/japi/classifier/typeahead/?q=prod&output=html&num_results=10&num_cls_label_results=0

PARAMETERS

  • q
    • the user input in the typeahead
  • want_codes
    • only return labels from concepts that are classified with this classification; also include the classification values for each returned concept.
    • default value: empty
    • multiple values: repeat parameter
  • num_results
    • return N labels
  • num_cls_label_results
    • return N classification labels which match the typeahead input. These will have a concept id (cid) of 0.
  • output
    • format: html or empty
    • default value: empty
    • effect: output as indented html if set to html, otherwise output as json
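
The interplay of num_results and num_cls_label_results can be sketched as follows. The helper name build_typeahead_url is hypothetical; the lang parameter is not listed above but appears in the sample requests, so its use here is an assumption.

```python
from urllib.parse import urlencode

TYPEAHEAD_URL = "https://www.janzz.jobs/japi/classifier/typeahead/"


def build_typeahead_url(q, num_results, num_cls_label_results,
                        want_codes=(), lang=None):
    """Build a GET URL for the typeahead endpoint (hypothetical helper)."""
    params = [
        ("q", q),
        ("num_results", num_results),
        ("num_cls_label_results", num_cls_label_results),
    ]
    # want_codes may be repeated, once per classification.
    params += [("want_codes", code) for code in want_codes]
    if lang:
        params.append(("lang", lang))
    return TYPEAHEAD_URL + "?" + urlencode(params)


# Concept labels only: ask for 10 labels and 0 classification labels.
concept_labels_url = build_typeahead_url("prod", 10, 0, lang="en")
# Classification labels only: 0 labels, 20 ISCO-08 classification labels.
cls_labels_url = build_typeahead_url("prod", 0, 20, want_codes=["ISCO-08"])

print(concept_labels_url)
print(cls_labels_url)
```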

SAMPLE RESPONSE

{
    "result": [
        {
          "label": "Animal producers",
          "classifications": {
              "ISCO-08": [
                  {
                      "code": "612",
                      "descr": "... description of code",
                      "label": "Animal producers"
                  }
              ]
          },
          "cid": 0
        },
        {
            "label": "Producer",
            "classifications": {
                "ISCO-08": [
                    {
                        "code": "2654",
                        "descr": "... description of code",
                        "label": "Film, stage and related directors and producers"
                    },
                    {
                        "code": "9999",
                        "descr": "Too generic - not classified",
                        "label": "Not classified"
                    }
                ]
            },
            "cid": 55844
        },
        {
            "label": "Producer (Performing Arts)",
            "classifications": {
                "ISCO-08": [
                    {
                        "code": "2654",
                        "descr": "... description of code",
                        "label": "Film, stage and related directors and producers"
                    }
                ]
            },
            "cid": 22596
        }
    ]
}
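
Since classification labels are returned with a cid of 0, a client can separate them from concept labels as sketched below. The data is a trimmed copy of the sample response (descriptions and classification labels omitted); the variable names are illustrative only.

```python
# Trimmed copy of the "result" list from the sample response.
results = [
    {"label": "Animal producers", "cid": 0,
     "classifications": {"ISCO-08": [{"code": "612"}]}},
    {"label": "Producer", "cid": 55844,
     "classifications": {"ISCO-08": [{"code": "2654"}, {"code": "9999"}]}},
    {"label": "Producer (Performing Arts)", "cid": 22596,
     "classifications": {"ISCO-08": [{"code": "2654"}]}},
]

# Entries with cid 0 are classification labels, not concepts.
cls_labels = [r["label"] for r in results if r["cid"] == 0]
concept_labels = [r["label"] for r in results if r["cid"] != 0]

# Map each concept ID to its ISCO-08 codes.
codes_by_cid = {
    r["cid"]: [c["code"] for c in r["classifications"].get("ISCO-08", [])]
    for r in results
    if r["cid"] != 0
}

print(cls_labels)
print(concept_labels)
print(codes_by_cid)
```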