Inconsistency in NBB data


#1

The JSON representation of data from the National Bank of Belgium (NBB) appears to be inconsistent with that of other providers with respect to dimensions_values_labels. Usually these labels are given as JSON objects, but in the NBB case they are JSON arrays. Code written to handle the first case fails on the second.

Below are snipped versions of IMF and NBB output to illustrate what I’m talking about. First IMF, which has the form that I’ve come to expect from dbnomics:

{
  "_meta" : {
    "args" : {
      [...]
      "series_ids" : [
        [
          "IMF",
          "BOP",
          "A.FR.BACK_BP6_USD"
        ]
      ]
    },
    "version" : "22.0.0"
  },
  "datasets" : {
    "IMF/BOP" : {
      [...]
      "code" : "BOP",
      "converted_at" : "2019-05-24T04:16:34Z",
      "description" : "Contains balance of payments ...",
      "dimensions_codes_order" : [
        "FREQ",
        "REF_AREA",
        "INDICATOR"
      ],
      "dimensions_labels" : {
        "FREQ" : "Frequency",
        "INDICATOR" : "Indicator",
        "REF_AREA" : "Reference Area"
      },
      "dimensions_values_labels" : {
        "FREQ" : {
          "A" : "Annual",
          "B" : "Bi-annual",
          "D" : "Daily",
          "M" : "Monthly",
          "Q" : "Quarterly",
          "W" : "Weekly"
        },
        "INDICATOR" : {
          "BACK_BP6_EUR" : "Net Lending ...",
          "BACK_BP6_USD" : "Net Lending ...",
  (and so on)

Now NBB: note that in this case the dimensions_values_labels are given as arrays:

{
  "_meta" : {
    "args" : {
      [...]
      "series_ids" : [
        [
          "NBB",
          "NADETP51N",
          "AN1.L.A"
        ]
      ]
    },
    "version" : "22.0.0"
  },
  "datasets" : {
    "NBB/NADETP51N" : {
      "code" : "NADETP51N",
      "converted_at" : "2018-12-18T14:21:29Z",
      "created_at" : "2018-12-18T14:21:29Z",
      "dimensions_codes_order" : [
        "NADETP51N_CATEGORY",
        "NADETP51N_PRICE",
        "FREQUENCY"
      ],
      "dimensions_labels" : {
        "FREQUENCY" : "Frequency",
        "NADETP51N_CATEGORY" : "Category",
        "NADETP51N_PRICE" : "Price type and unit"
      },
      "dimensions_values_labels" : {
        "FREQUENCY" : [
          [
            "A",
            "Annual"
          ]
        ],
        "NADETP51N_CATEGORY" : [
          [
            "TOT",
            "Fixed assets (AN.11000)"
          ],

#2

Hi @acottrell,

In fact, it’s a feature, not an inconsistency.

The dimensions_values_labels property can be either a dict or an array of (value, label) tuples; see the dataset.json schema.

This feature was introduced on July 9, 2018, in dbnomics data-model release 0.8.1 to

allow dimensions_values_labels (property of dataset.json) to contain ordered list of values when lexicography sort is not appropriate.
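
To illustrate why an ordered list can matter, here is a small Python sketch. The dimension values in it are hypothetical (they are not taken from any real dataset); the point is only that sorting codes lexicographically can lose the order the provider intended, while an array of (value, label) pairs keeps it.

```python
# Hypothetical dimension values, invented for illustration only.
# The provider lists them from smallest to largest.
size_pairs = [["S", "Small"], ["M", "Medium"], ["L", "Large"]]

# Order as given in the array form: the provider's intended order.
codes_in_provider_order = [code for code, _label in size_pairs]

# Order obtained by sorting the codes lexicographically, which is what a
# consumer might fall back on when the codes are plain object keys.
codes_sorted = sorted(codes_in_provider_order)

print(codes_in_provider_order)
print(codes_sorted)
```

In this sketch the sorted codes come out as L, M, S, which no longer runs from smallest to largest.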

See CHANGELOG.md

Most DBnomics dataset.json files have dimensions_values_labels as a dict (it was the only option a year ago), but some, as you noticed, use arrays.
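
Client code can accept both forms by normalizing them to a single shape before use. The following is a minimal Python sketch, not the official client code: the function name `normalize_values_labels` is made up for this example, and it assumes the JSON has already been parsed, so that objects arrive as dicts and arrays as lists. The sample values are the ones quoted earlier in this thread.

```python
from collections.abc import Mapping


def normalize_values_labels(values_labels):
    """Return a list of (code, label) tuples from either representation.

    Hypothetical helper: accepts a dict {code: label} or an array of
    [code, label] pairs, as allowed for dimensions_values_labels.
    """
    if isinstance(values_labels, Mapping):
        # Dict form: keep the order in which the parser delivered the keys.
        return list(values_labels.items())
    # Array form: each element is a two-item [code, label] pair.
    return [(code, label) for code, label in values_labels]


# Dict form, as in the IMF/BOP excerpt (shortened).
imf_freq = {"A": "Annual", "Q": "Quarterly"}

# Array form, as in the NBB/NADETP51N excerpt.
nbb_category = [["TOT", "Fixed assets (AN.11000)"]]

print(normalize_values_labels(imf_freq))
print(normalize_values_labels(nbb_category))
```

Downstream code can then iterate over (code, label) tuples without caring which form the dataset used.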

Regards


#3

OK, thank you for the explanation.