library(dataobservatory, quietly = TRUE)
data("small_population")
small_population_dataset <- dataset(x = small_population,
                                    dataset_code = "small_population_total",
                                    dataset_title = "Population of Small European Countries",
                                    freq = "A",
                                    unit = "NR",
                                    unit_name = "number")

The codebook S3 class (not yet fully documented, and without an independent constructor yet) records the statistical processing metadata of a dataset. It has a print method.

It contains a full codebook that follows the SDMX statistical metadata codelist standards. Furthermore, it records the session information of all processing steps, and adds the R packages or software code that generated the results to the descriptive metadata.

small_population_codebook <- codebook_dataset(small_population_dataset)
is.codebook(small_population_codebook)
#> [1] FALSE
small_population_codebook
#> # A tibble: 4 x 6
#>   dataset_code   var_name id    name   description          RelatedItem         
#>   <chr>          <chr>    <chr> <chr>  <chr>                <chr>               
#> 1 small_populat~ freq     A     Annual To be used for data~ "{\"RelatedItem\":[~
#> 2 small_populat~ method   A     Actua~ No method applied.   ""                  
#> 3 small_populat~ obs_sta~ A     Norma~ To be used as defau~ "{\"RelatedItem\":[~
#> 4 small_populat~ unit     NR    number <NA>                  <NA>
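
As a rough illustration of the kind of session information mentioned above, base R's utils::sessionInfo() exposes the R version, the platform and the attached packages. This is only a sketch and not the package's own mechanism: the processing_environment list and its element names are assumptions made for this example.

```r
# Capture the computational environment with base R.
si <- utils::sessionInfo()

# Hypothetical container for the recorded environment
# (the element names are illustrative, not the package's).
processing_environment <- list(
  r_version = si$R.version$version.string,
  platform  = si$platform,
  packages  = names(si$otherPkgs)  # NULL when no extra packages are attached
)

str(processing_environment)
```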

Frequency of Observation

For example, annual observations follow the SDMX Code List for Frequency 2.1 (CL_FREQ) definition, and they can also be translated to the ISO 8601 time metadata standard (returned in the iso8106 element of the output).

add_frequency("A", "list")
#> $id
#> [1] "A"
#> 
#> $name
#> [1] "Annual"
#> 
#> $description
#> [1] "To be used for data collected or disseminated every year"
#> 
#> $iso8106
#> [1] "P1Y"
#> 
#> $RelatedItem
#> [1] "{\"RelatedItem\":[\"SDMX Code List for Frequency\"],\"relatedItemType\":[\"Dataset\"],\"relationType\":[\"IsDocumentedBy\"],\"relatedItemIdentifier\":[\"{\\\"id\\\":[\\\" CL_FREQ\\\"],\\\"dataset_code\\\":{},\\\"URI\\\":[\\\"https://sdmx.org/?page_id=3215/\\\"],\\\"DOI\\\":{},\\\"Version\\\":[\\\"2.1\\\"],\\\"idAtSource\\\":{},\\\"Other\\\":{}}\"]}"
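
The translation from CL_FREQ codes to ISO 8601 durations can be sketched in base R as a simple lookup. This is an illustration only, not the package's implementation: freq_to_iso8601 and frequency_to_duration() are hypothetical names, and only a few common CL_FREQ codes are covered.

```r
# Hypothetical lookup table: SDMX CL_FREQ code -> ISO 8601 duration.
# Only a handful of common codes are included for illustration.
freq_to_iso8601 <- c(
  A = "P1Y",  # Annual
  S = "P6M",  # Half-yearly (semester)
  Q = "P3M",  # Quarterly
  M = "P1M",  # Monthly
  W = "P1W",  # Weekly
  D = "P1D"   # Daily
)

# Hypothetical helper: returns NA for codes missing from the lookup table.
frequency_to_duration <- function(freq) {
  unname(freq_to_iso8601[as.character(freq)])
}

frequency_to_duration(c("A", "Q", "X"))
```

The unknown code "X" yields NA instead of an error, so unsupported frequencies stay visible in the result.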

Observation Status

The add_obs() function implements the SDMX Code List for Observation Status 2.2 (CL_OBS_STATUS) definition.

Furthermore, for estimated and imputed values, the software code and the computational environment information can be recorded (with add_related_items()).

add_obs("E", "list")
#> $id
#> [1] "E"
#> 
#> $name
#> [1] "Estimated value"
#> 
#> $description
#> [1] "Observation obtained through an estimation methodology (e.g. to produce back-casts) or based on the use of a limited amount of data or ad hoc sampling and through additional calculations (e.g. to produce a value at an early stage of the production stage while not all data are available). It may also be used in case of experimental data (e.g. in the context of a pilot ahead of a full scale production process) or in case of data of (anticipated/assessed) low quality. If needed, additional information can be provided through free text using the COMMENT_OBS attribute at the observation level or at a higher level. This code is to be used when the estimation is done by a sender agency. When the imputation is carried out by a receiver agency in order to replace or fill gaps in reported data series, the flag to use is I “Value imputed by a receiving agency”."
#> 
#> $RelatedItem
#> [1] "{\"RelatedItem\":[\"SDMX Code List for Observation Status\"],\"relatedItemType\":[\"Dataset\"],\"relationType\":[\"IsDocumentedBy\"],\"relatedItemIdentifier\":[\"{\\\"id\\\":[\\\" CL_OBS_STATUS\\\"],\\\"dataset_code\\\":{},\\\"URI\\\":[\\\"https://sdmx.org/?sdmx_news=new-version-of-code-list-for-observation-status-version-2-2/\\\"],\\\"DOI\\\":{},\\\"Version\\\":[\\\"2.2\\\"],\\\"idAtSource\\\":{},\\\"Other\\\":{}}\"]}"

Unit Information

Eurostat alone uses hundreds of codelists, most of them for various kinds of measurement information. Currently, the package does not validate unit information. However, this is not a problem when data is imported from other statistical services, because they already apply appropriate unit information.

add_unit("M_EUR", "Million euros", 
         admin_format = 'list')
#> $id
#> [1] "M_EUR"
#> 
#> $name
#> [1] "Million euros"
#> 
#> $description
#> [1] NA
#> 
#> $RelatedItem
#> [1] NA
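
Because the package does not validate units, a user who needs such a check could write one. The following is a minimal sketch under stated assumptions: known_units is a deliberately tiny, hypothetical code list, and is_known_unit() is a hypothetical helper, neither of which is part of the package.

```r
# Hypothetical, deliberately tiny code list; a real one would be
# loaded from the statistical agency's published code lists.
known_units <- data.frame(
  id   = c("NR", "M_EUR"),
  name = c("number", "Million euros"),
  stringsAsFactors = FALSE
)

# Hypothetical validator: TRUE when both the code and its label
# match an entry of the reference code list.
is_known_unit <- function(unit, unit_name, codelist = known_units) {
  any(codelist$id == unit & codelist$name == unit_name)
}

is_known_unit("M_EUR", "Million euros")   # code and label match
is_known_unit("M_EUR", "Thousand euros")  # label does not match the code
```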

Consolidated Codebook

consolidated_codebook <- codebook()
is.codebook(consolidated_codebook)
#> [1] TRUE
print(consolidated_codebook)
#> Codebook information for Consolidated Coodbook For the dataobservatory R package 
#> # A tibble: 24 x 6
#>    dataset_code    var_name  id     name     description       RelatedItem      
#>    <chr>           <chr>     <chr>  <chr>    <chr>             <chr>            
#>  1 dataobservator~ method    A      Actual ~ No method applie~ ""               
#>  2 dataobservator~ method    approx Linear ~ Replacing each m~ "{\"RelatedItem\~
#>  3 dataobservator~ method    backc~ Backcas~ Backcasted with ~ "{\"RelatedItem\~
#>  4 dataobservator~ method    forec~ Forecas~ Forecasted with ~ "{\"RelatedItem\~
#>  5 dataobservator~ method    locf   Last Ob~ Replacing each m~ "{\"RelatedItem\~
#>  6 dataobservator~ method    nocb   Next Ob~ Replacing each m~ "{\"RelatedItem\~
#>  7 dataobservator~ method    O      Missing~ No method applie~ ""               
#>  8 dataobservator~ obs_stat~ _O     Other    To be used when ~ "{\"RelatedItem\~
#>  9 dataobservator~ obs_stat~ _U     Unspeci~ To be used when ~ "{\"RelatedItem\~
#> 10 dataobservator~ obs_stat~ _Z     Not app~ To be used when ~ "{\"RelatedItem\~
#> # ... with 14 more rows
#> Only the first 24 entries are printed.
#> There are further 29 entries in the codebook.