readabs

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

Overview

readabs helps you download, import, and tidy time series data from the Australian Bureau of Statistics (ABS) from within R. This saves you the time you would otherwise spend manually downloading and tidying spreadsheets, leaving more time for your analysis.

We’d welcome GitHub issues containing error reports or feature requests. Alternatively, you can email the package maintainer at mattcowgill at gmail dot com.

Installation

Install the latest CRAN version of readabs with:

install.packages("readabs")

You can install the development version of readabs from GitHub with:

# if you don't have devtools installed, first run:
# install.packages("devtools")
devtools::install_github("mattcowgill/readabs")

New in recent versions

In version 0.4.2 of the readabs package,

In 0.4.1,

In 0.4.0,

Usage

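There is one key function in readabs: read_abs(), which downloads, imports, and tidies time series data from the ABS website.

There are some other functions you may find useful, such as read_abs_local(), which imports and tidies ABS time series spreadsheets already saved on your disk, and separate_series(), which splits the series column into its components.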

Both read_abs() and read_abs_local() return a single tidy data frame (tibble) containing long data.
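
"Long" here means one row per series per date, rather than one column per series. The following is a minimal sketch of what that implies for day-to-day use. It builds a small stand-in data frame by hand instead of calling read_abs(), so it runs offline; the column names mirror those in read_abs() output, but the second series ID and its values are made up for illustration.

```r
# Toy stand-in for read_abs() output: one row per series per date.
# Column names follow read_abs() output; "HYPOTHETICAL1" and its values
# are invented for this sketch.
toy <- data.frame(
  date        = as.Date(c("1997-09-01", "1997-12-01", "1997-09-01", "1997-12-01")),
  series_id   = c("A2603039T", "A2603039T", "HYPOTHETICAL1", "HYPOTHETICAL1"),
  series_type = c("Original", "Original", "Trend", "Trend"),
  value       = c(67.4, 67.9, 67.5, 68.0),
  stringsAsFactors = FALSE
)

# Because the data is long, picking out one series is a row filter,
# not a column selection.
one_series <- toy[toy$series_id == "A2603039T", ]

# The most recent observation for that series.
latest <- one_series[one_series$date == max(one_series$date), ]
```

The same pattern should carry over to a real read_abs() result: filter on series_id (or series and series_type) to isolate the rows you need before plotting or summarising.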

Examples

To download all the time series data from an ABS catalogue number to your disk, and import the data to R as a single tidy data frame, use read_abs(). Here’s an example with the Wage Price Index, catalogue number 6345.0:

library(readabs)
#> Environment variable 'R_READABS_PATH' is unset. Downloaded files will be saved in a temporary directory.
#> You can set 'R_READABS_PATH' at any time. To set it for the rest of this session, use
#>  Sys.setenv(R_READABS_PATH = <path>)

all_wpi <- read_abs("6345.0")
#> Finding filenames for tables corresponding to ABS catalogue 6345.0
#> Attempting to download files from catalogue 6345.0, Wage Price Index, Australia
#> Extracting data from downloaded spreadsheets
#> Tidying data from imported ABS spreadsheets

str(all_wpi)
#> Classes 'tbl_df', 'tbl' and 'data.frame':    56276 obs. of  12 variables:
#>  $ table_no        : chr  "634501" "634501" "634501" "634501" ...
#>  $ sheet_no        : chr  "Data1" "Data1" "Data1" "Data1" ...
#>  $ table_title     : chr  "Table 1. Total Hourly Rates of Pay Excluding Bonuses: Sector, Original, Seasonally Adjusted and Trend" "Table 1. Total Hourly Rates of Pay Excluding Bonuses: Sector, Original, Seasonally Adjusted and Trend" "Table 1. Total Hourly Rates of Pay Excluding Bonuses: Sector, Original, Seasonally Adjusted and Trend" "Table 1. Total Hourly Rates of Pay Excluding Bonuses: Sector, Original, Seasonally Adjusted and Trend" ...
#>  $ date            : Date, format: "1997-09-01" "1997-12-01" ...
#>  $ series          : chr  "Quarterly Index ;  Total hourly rates of pay excluding bonuses ;  Australia ;  Private ;  All industries ;" "Quarterly Index ;  Total hourly rates of pay excluding bonuses ;  Australia ;  Private ;  All industries ;" "Quarterly Index ;  Total hourly rates of pay excluding bonuses ;  Australia ;  Private ;  All industries ;" "Quarterly Index ;  Total hourly rates of pay excluding bonuses ;  Australia ;  Private ;  All industries ;" ...
#>  $ value           : num  67.4 67.9 68.5 68.8 69.6 70 70.4 70.8 71.5 71.9 ...
#>  $ series_type     : chr  "Original" "Original" "Original" "Original" ...
#>  $ data_type       : chr  "INDEX" "INDEX" "INDEX" "INDEX" ...
#>  $ collection_month: chr  "3" "3" "3" "3" ...
#>  $ frequency       : chr  "Quarter" "Quarter" "Quarter" "Quarter" ...
#>  $ series_id       : chr  "A2603039T" "A2603039T" "A2603039T" "A2603039T" ...
#>  $ unit            : chr  "Index Numbers" "Index Numbers" "Index Numbers" "Index Numbers" ...
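
The startup message above notes that downloaded files go to a temporary directory unless R_READABS_PATH is set. A minimal sketch of setting it for the current session follows; the directory name readabs_data is hypothetical, and it is placed under tempdir() only so the sketch is safe to run as-is. In practice you would point it at a persistent folder of your own.

```r
# Hypothetical location; swap in a persistent directory of your own.
abs_dir <- file.path(tempdir(), "readabs_data")
dir.create(abs_dir, showWarnings = FALSE, recursive = TRUE)

# Set the variable for the rest of this session, as the startup message suggests.
Sys.setenv(R_READABS_PATH = abs_dir)

# Check the value that R now sees.
Sys.getenv("R_READABS_PATH")
```

Sys.setenv() only lasts for the current session. To keep the setting across sessions, one common approach is to add an R_READABS_PATH line to your .Renviron file.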

Maybe you only want a particular table? Here’s how you get a single table:


wpi_t1 <- read_abs("6345.0", tables = 1)
#> Finding filenames for tables corresponding to ABS catalogue 6345.0
#> Attempting to download files from catalogue 6345.0, Wage Price Index, Australia
#> Extracting data from downloaded spreadsheets
#> Tidying data from imported ABS spreadsheets

If you want multiple tables, but not the whole catalogue, that’s easy too:


wpi_t1_t5 <- read_abs("6345.0", tables = c("1", "5a"))
#> Finding filenames for tables corresponding to ABS catalogue 6345.0
#> Attempting to download files from catalogue 6345.0, Wage Price Index, Australia
#> Extracting data from downloaded spreadsheets
#> Tidying data from imported ABS spreadsheets

In many cases, the series column will contain multiple components, separated by ‘;’. The separate_series() function can help you wrangle this column.
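
To make the structure of the series column concrete, here is a base-R sketch that splits one series string, taken from the str() output above, into its components. It is only an illustration of the idea and is not how separate_series() is implemented; for real data, separate_series() is the package's tool for doing this across a whole data frame.

```r
# One value of the series column, copied from the Wage Price Index example.
series <- "Quarterly Index ;  Total hourly rates of pay excluding bonuses ;  Australia ;  Private ;  All industries ;"

# Split on the literal ';' separator and trim the surrounding whitespace.
parts <- trimws(strsplit(series, ";", fixed = TRUE)[[1]])

# Drop any empty pieces that a trailing separator might leave behind.
parts <- parts[nzchar(parts)]

print(parts)
```

Each element of parts corresponds to one component of the series description (measure, pay definition, region, sector, industry in this case), which is why splitting the column makes filtering much easier.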

For more examples, please see the readabs vignette (run browseVignettes("readabs")).

Mentioned in Awesome Official Statistics Software

We’re pleased to be included in a list of software that can be used to work with official statistics.

Package history

From version 0.3.0, readabs gained significant new functionality and the package changed substantially.

Pre-0.3.0 functions still work, but read_abs_data() is soft-deprecated. The behaviour of read_abs_metadata() has changed and the function is deprecated. The old version of readabs is available in the 0.2.9 branch on GitHub.