Title: | Common DfE R tasks |
---|---|
Description: | This package contains R functions to allow DfE analysts to re-use code for common analytical tasks that are undertaken across the Department. |
Authors: | Cam Race [aut, cre], Laura Selby [aut], Adam Robinson [aut], Jen Machin [ctb], Jake Tufts [ctb], Rich Bielby [ctb], Menna Zayed [ctb] |
Maintainer: | Cam Race <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.6.1.9000 |
Built: | 2025-01-14 05:24:13 UTC |
Source: | https://github.com/dfe-analytical-services/dfer |
Adds separating commas to big numbers. If a value is not numeric it will return the value unchanged and as a string.
comma_sep(number, nsmall = 0L)
number |
number to be comma separated |
nsmall |
minimum number of digits to the right of the decimal point |
string
comma_sep(100)
comma_sep(1000)
comma_sep(3567000)
A lookup of ONS geography country names and codes, as well as some custom DfE names and codes. This is used as the definitive list for the screening of open data before it is published by the DfE.
countries
A data frame with 10 rows and 2 columns:
Country name
Country code
curated by [email protected], ONS codes sourced from https://geoportal.statistics.gov.uk/search?q=countries%20names%20and%20codes
Creates a pre-populated project for DfE R
create_project( path, init_renv = TRUE, include_structure_for_pkg = FALSE, create_publication_proj = FALSE, include_github_gitignore, ... )
path |
Path of the new project |
init_renv |
Boolean; initiate renv in the project. Default is set to true. |
include_structure_for_pkg |
Boolean; Additional folder structure for package development. Default is set to false. |
create_publication_proj |
Boolean; Should the folder structure be for a publication project. Default is set to false. |
include_github_gitignore |
Boolean; Should a strict .gitignore file for GitHub be created. |
... |
Additional parameters, currently not used |
This function creates a new project with a custom folder structure. It sets up the R/ folder and template function scripts, initializes {testthat} and adds tests for the function scripts, builds the core project structure, creates a .gitignore file, creates a readme, and optionally initializes {renv}.
No return value; the project and its contents are created.
## Not run:
# Call the function to create a new project
dfeR::create_project(
  path = "C:/path/to/your/new/project",
  init_renv = TRUE,
  include_structure_for_pkg = FALSE,
  create_publication_proj = FALSE,
  include_github_gitignore = TRUE
)
## End(Not run)
Fetch a data frame of all Westminster Parliamentary Constituencies for a given year and country based on the dfeR::wd_pcon_lad_la_rgn_ctry file
fetch_pcons(year = "All", countries = "All")
year |
year to filter the locations to, default is "All", options of 2017, 2019, 2020, 2021, 2022, 2023, 2024 |
countries |
vector of desired countries to filter the locations to, default is "All", or can be a vector with options of "England", "Scotland", "Wales" or "Northern Ireland" |
data frame of unique location names and codes
# Using head() to show only the first few rows for examples
head(fetch_wards())
head(fetch_pcons())
head(fetch_pcons(2023))
head(fetch_pcons(countries = "Scotland"))
head(fetch_pcons(year = 2023, countries = c("England", "Wales")))
fetch_lads(2024, "Wales")
fetch_las(2022, "Northern Ireland")

# The following have no specific years available and return all values
fetch_regions()
fetch_countries()
Fetch countries
fetch_countries()
data frame of unique location names and codes
Other fetch_locations: fetch_lads(), fetch_las(), fetch_regions(), fetch_wards()
# Using head() to show only the first few rows for examples
head(fetch_wards())
head(fetch_pcons())
head(fetch_pcons(2023))
head(fetch_pcons(countries = "Scotland"))
head(fetch_pcons(year = 2023, countries = c("England", "Wales")))
fetch_lads(2024, "Wales")
fetch_las(2022, "Northern Ireland")

# The following have no specific years available and return all values
fetch_regions()
fetch_countries()
Fetch local authority districts
fetch_lads(year = "All", countries = "All")
year |
year to filter the locations to, default is "All", options of 2017, 2019, 2020, 2021, 2022, 2023, 2024 |
countries |
vector of desired countries to filter the locations to, default is "All", or can be a vector with options of "England", "Scotland", "Wales" or "Northern Ireland" |
data frame of unique location names and codes
Other fetch_locations: fetch_countries(), fetch_las(), fetch_regions(), fetch_wards()
# Using head() to show only the first few rows for examples
head(fetch_wards())
head(fetch_pcons())
head(fetch_pcons(2023))
head(fetch_pcons(countries = "Scotland"))
head(fetch_pcons(year = 2023, countries = c("England", "Wales")))
fetch_lads(2024, "Wales")
fetch_las(2022, "Northern Ireland")

# The following have no specific years available and return all values
fetch_regions()
fetch_countries()
Fetch local authorities
fetch_las(year = "All", countries = "All")
year |
year to filter the locations to, default is "All", options of 2017, 2019, 2020, 2021, 2022, 2023, 2024 |
countries |
vector of desired countries to filter the locations to, default is "All", or can be a vector with options of "England", "Scotland", "Wales" or "Northern Ireland" |
data frame of unique location names and codes
Other fetch_locations: fetch_countries(), fetch_lads(), fetch_regions(), fetch_wards()
# Using head() to show only the first few rows for examples
head(fetch_wards())
head(fetch_pcons())
head(fetch_pcons(2023))
head(fetch_pcons(countries = "Scotland"))
head(fetch_pcons(year = 2023, countries = c("England", "Wales")))
fetch_lads(2024, "Wales")
fetch_las(2022, "Northern Ireland")

# The following have no specific years available and return all values
fetch_regions()
fetch_countries()
Fetch regions
fetch_regions()
data frame of unique location names and codes
Other fetch_locations: fetch_countries(), fetch_lads(), fetch_las(), fetch_wards()
# Using head() to show only the first few rows for examples
head(fetch_wards())
head(fetch_pcons())
head(fetch_pcons(2023))
head(fetch_pcons(countries = "Scotland"))
head(fetch_pcons(year = 2023, countries = c("England", "Wales")))
fetch_lads(2024, "Wales")
fetch_las(2022, "Northern Ireland")

# The following have no specific years available and return all values
fetch_regions()
fetch_countries()
Fetch wards
fetch_wards(year = "All", countries = "All")
year |
year to filter the locations to, default is "All", options of 2017, 2019, 2020, 2021, 2022, 2023, 2024 |
countries |
vector of desired countries to filter the locations to, default is "All", or can be a vector with options of "England", "Scotland", "Wales" or "Northern Ireland" |
data frame of unique location names and codes
Other fetch_locations: fetch_countries(), fetch_lads(), fetch_las(), fetch_regions()
# Using head() to show only the first few rows for examples
head(fetch_wards())
head(fetch_pcons())
head(fetch_pcons(2023))
head(fetch_pcons(countries = "Scotland"))
head(fetch_pcons(year = 2023, countries = c("England", "Wales")))
fetch_lads(2024, "Wales")
fetch_las(2022, "Northern Ireland")

# The following have no specific years available and return all values
fetch_regions()
fetch_countries()
This function formats academic year variables for reporting purposes. It will convert an academic year input from 201516 format to 2015/16 format.
format_ay(year)
year |
Academic year |
It accepts both numerical and character arguments.
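The conversion described here can be illustrated in base R. This is a minimal sketch rather than dfeR's own implementation: `sketch_format_ay()` and `sketch_format_ay_reverse()` are hypothetical names, and validation is reduced to a single six-digit pattern check.

```r
# Hypothetical helper: insert "/" between the first four and last two digits.
sketch_format_ay <- function(year) {
  year <- as.character(year) # accept numeric or character input
  if (!all(grepl("^[0-9]{6}$", year))) {
    stop("year must be six digits, e.g. 201617")
  }
  sub("^([0-9]{4})([0-9]{2})$", "\\1/\\2", year)
}

# Hypothetical reverse helper: drop the "/" again.
sketch_format_ay_reverse <- function(year) {
  gsub("/", "", year, fixed = TRUE)
}

ay <- sketch_format_ay(201617)
ay_back <- sketch_format_ay_reverse("2016/17")
```

Here `ay` is "2016/17" and `ay_back` is "201617". The same pattern with "-" in place of "/" would mirror the financial year format.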
Character vector of formatted academic year
Other format: format_ay_reverse(), format_fy(), format_fy_reverse()
format_ay(201617)
format_ay("201617")
This function converts academic year variables back into 201617 format.
format_ay_reverse(year)
year |
Academic year |
It accepts character arguments.
Unformatted 6 digit year as string
Other format: format_ay(), format_fy(), format_fy_reverse()
format_ay_reverse("2016/17")
This function formats financial year variables for reporting purposes. It will convert a year input from 201516 format to 2015-16 format.
format_fy(year)
year |
Financial year |
It accepts both numerical and character arguments.
Character vector of formatted financial year
Other format: format_ay(), format_ay_reverse(), format_fy_reverse()
format_fy(201617)
format_fy("201617")
This function converts financial year variables back into 201617 format.
format_fy_reverse(year)
year |
Financial year |
It accepts character arguments.
Unformatted 6 digit year as string
Other format: format_ay(), format_ay_reverse(), format_fy()
format_fy_reverse("2016-17")
Potential names for geography and time columns in line with the ones used for the explore education statistics data screener.
geog_time_identifiers
A character vector with 38 potential column names in snake case format.
curated by [email protected]. Get guidance on time and geography data.
This function cleans a SQL script, ready for using within R in the DfE.
get_clean_sql(filepath, additional_settings = FALSE)
filepath |
path to a SQL script |
additional_settings |
TRUE or FALSE boolean for the addition of settings at the start of the SQL script |
Cleaned string containing SQL query
# This assumes you have already set up a database connection
# and that the filepath for the function exists
# For more details see the vignette on connecting to SQL

# Pull a cleaned version of the SQL file into R
if (file.exists("your_script.sql")) {
  sql_query <- get_clean_sql("your_script.sql")
}
Helper function that takes a data set id and parameters to query and parse data from the ONS Open Geography API. Technically uses a POST request rather than a GET request.
get_ons_api_data( data_id, query_params = list(where = "1=1", outFields = "*", outSR = "4326", f = "json"), batch_size = 200, verbose = TRUE )
data_id |
the id of the data set to query, can be found from the Open Geography Portal |
query_params |
query parameters to pass into the API, see the ESRI documentation for more information on query parameters - ESRI Query (Feature Service/Layer) |
batch_size |
the number of rows per query. This is 200 by default; if you hit errors, try lowering it. The API has a limit of 1000 to 2000 rows per query, and in truth the actual limit for our method is lower, as every ObjectId queried is pasted into the query URL. Every row included in the batch therefore increases the size of the URL, especially if those Ids go into the 1,000s or 10,000s, and risks hitting the limit. |
verbose |
TRUE or FALSE boolean. TRUE by default. FALSE will turn off the messages to the console that update on what the function is doing |
It does a pre-query to understand the ObjectIds for the query you want, and then does a query to retrieve those Ids directly in batches before then stacking the whole thing back together to work around the row limits for a single query.
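The batching pattern described here can be sketched in base R without touching the API. This is an illustration under stated assumptions, not the package's code: `fetch_batch()` is a hypothetical stand-in for a single API request, and `object_ids` pretends to be the result of the pre-query.

```r
# Hypothetical stand-in for one API call: returns one row per requested id.
fetch_batch <- function(ids) {
  data.frame(object_id = ids, name = paste0("area_", ids))
}

object_ids <- 1:450 # pretend result of the pre-query for ObjectIds
batch_size <- 200

# Split the ids into consecutive batches of at most batch_size
batches <- split(object_ids, ceiling(seq_along(object_ids) / batch_size))

# Query each batch, then stack the results back together
results <- do.call(rbind, lapply(batches, fetch_batch))
rownames(results) <- NULL
```

With 450 ids and a batch size of 200 this produces three batches (200, 200 and 50 ids), and `results` ends up with one row per id. A smaller `batch_size` means more requests but shorter query URLs.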
On the Open Geography Portal, find the data set you're interested in and then use the query explorer to find the information for the query.
This function has been mostly developed for ease of use for dfeR maintainers. If you're interested in getting data from the Open Geography Portal more widely, you should also look at the boundr package.
parsed data.frame of geographic names and codes
if (interactive()) {
  # Specify some parameters
  get_ons_api_data(
    data_id = "LAD23_RGN23_EN_LU",
    query_params = list(outFields = "column1, column2", outSR = "4326", f = "json")
  )

  # Just fetch everything
  get_ons_api_data(data_id = "LAD23_RGN23_EN_LU")
}
A lookup of ONS geography shorthands and their respective column names in line with DfE open data standards.
ons_geog_shorthands
A data frame with 7 rows and 3 columns:
ONS shorthands used in their lookup files
DfE names for geography name columns
DfE names for geography code columns
GOR (Government Office Region) was the predecessor to RGN.
curated by [email protected]
Converts a raw file size from bytes to a more readable format.
pretty_filesize(filesize)
filesize |
file size in bytes |
Designed to be used in conjunction with the file.size() function in base R.
Presents in kilobytes, megabytes or gigabytes.
Shows as bytes until 1 KB, then kilobytes up to 1 MB, then megabytes until 1GB, then it will show as gigabytes for anything larger.
Rounds the end result to 2 decimal places.
Using base 10 (decimal), so 1024 bytes is 1.024 KB.
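The unit thresholds described here can be sketched in base R. `sketch_filesize()` is a hypothetical helper written for illustration only; the exact strings that `pretty_filesize()` returns may differ, and this sketch handles a single value rather than a vector.

```r
# Hypothetical helper: pick a decimal (base 10) unit and round to 2 dp.
sketch_filesize <- function(bytes) {
  if (bytes < 1e3) {
    paste0(bytes, " bytes")
  } else if (bytes < 1e6) {
    paste0(round(bytes / 1e3, 2), " KB")
  } else if (bytes < 1e9) {
    paste0(round(bytes / 1e6, 2), " MB")
  } else {
    paste0(round(bytes / 1e9, 2), " GB")
  }
}

kb_example <- sketch_filesize(1024) # "1.02 KB"
```

Because the thresholds are powers of 1000 rather than 1024, a 1024 byte file shows as just over one kilobyte after rounding.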
string containing prettified file size
Other prettying: pretty_num(), pretty_num_table(), pretty_time_taken()
pretty_filesize(2)
pretty_filesize(549302)
pretty_filesize(9872948939)
pretty_filesize(1)
pretty_filesize(1000)
pretty_filesize(1000^2)
pretty_filesize(10^9)
Uses as.numeric() to force a numeric value and then formats prettily for easy presentation in console messages, reports, or dashboards.
This rounds to 0 decimal places by default, and adds in comma separators.
This is expected to be commonly used for adding the pound symbol or the percentage symbol, or for prefixing + or - based on the value.
If applying over multiple or unpredictable values and you want to preserve a non-numeric symbol such as "x" or "c" for data not available, use the ignore_na = TRUE argument to return those values unaffected.
If you want to customise what NA values are returned as, use the alt_na argument.
This function silences the warning around NAs being introduced by coercion.
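The coerce-then-format idea can be sketched with base R alone. This is an approximation for illustration, not the package's implementation: it uses `formatC()` for rounding and comma separators (whereas the see-also list suggests the package relies on `comma_sep()` and `round_five_up()`), and it mimics only the `ignore_na = TRUE` behaviour.

```r
values <- c("3998098008", "-123421421", "c", "x")

# Coerce to numeric, silencing the "NAs introduced by coercion" warning
numbers <- suppressWarnings(as.numeric(values))

# Round to 0 decimal places and add comma separators
formatted <- formatC(numbers, format = "f", digits = 0, big.mark = ",")

# Mimic ignore_na = TRUE: keep the original text where coercion failed
result <- ifelse(is.na(numbers), values, formatted)
```

After this runs, `result` holds "3,998,098,008", "-123,421,421", "c" and "x", so the suppression symbols pass through untouched while the numbers gain separators.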
pretty_num( value, prefix = "", gbp = FALSE, suffix = "", dp = 0, ignore_na = FALSE, alt_na = FALSE, nsmall = NULL )
value |
value to be prettified |
prefix |
prefix for the value, if "+/-" then it will automatically assign + or - based on the value |
gbp |
whether to add the pound symbol or not, defaults to not |
suffix |
suffix for the value, e.g. "%" |
dp |
number of decimal places to round to, 0 by default. |
ignore_na |
whether to skip function for strings that can't be converted and return original value |
alt_na |
alternative value to return in place of NA, e.g. "x" |
nsmall |
minimum number of digits to the right of the decimal point.
If NULL, the value of |
string featuring prettified value
comma_sep()
round_five_up()
as.numeric()
Other prettying: pretty_filesize(), pretty_num_table(), pretty_time_taken()
# On individual values
pretty_num(5789, gbp = TRUE)
pretty_num(564, prefix = "+/-")
pretty_num(567812343223, gbp = TRUE, prefix = "+/-")
pretty_num(11^9, gbp = TRUE, dp = 3)
pretty_num(-11^8, gbp = TRUE, dp = -1)
pretty_num(43.3, dp = 1, nsmall = 2)
pretty_num("56.089", suffix = "%")
pretty_num("x")
pretty_num("x", ignore_na = TRUE)
pretty_num("nope", alt_na = "x")

# Applied over an example vector
vector <- c(3998098008, -123421421, "c", "x")
pretty_num(vector)
pretty_num(vector, prefix = "+/-", gbp = TRUE)

# Return original values if NA
pretty_num(vector, ignore_na = TRUE)

# Return alternative value in place of NA
pretty_num(vector, alt_na = "z")
You can format number and character values in a data frame by passing arguments to dfeR::pretty_num(). Use parameters include_columns or exclude_columns to specify columns for formatting.
pretty_num_table(data, include_columns = NULL, exclude_columns = NULL, ...)
data |
A data frame containing the columns to be formatted. |
include_columns |
A character vector specifying which columns to format.
If |
exclude_columns |
A character vector specifying columns to exclude
from formatting.
If |
... |
Additional arguments passed to |
The function first checks if any columns are specified for inclusion via include_columns. If none are provided, it checks if columns are specified for exclusion via exclude_columns. If neither is specified, all columns in the data frame are formatted.
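The order of precedence described here can be sketched as a small column-selection helper. `sketch_columns_to_format()` is a hypothetical name used only for illustration; it shows the selection logic and does no formatting itself.

```r
# Hypothetical helper: decide which columns would be formatted.
sketch_columns_to_format <- function(data,
                                     include_columns = NULL,
                                     exclude_columns = NULL) {
  if (!is.null(include_columns)) {
    # Inclusion is checked first
    include_columns
  } else if (!is.null(exclude_columns)) {
    # Otherwise drop the excluded columns
    setdiff(names(data), exclude_columns)
  } else {
    # Neither supplied: every column is formatted
    names(data)
  }
}

df <- data.frame(a = 1:2, b = 3:4, c = c("A", "B"))
all_cols <- sketch_columns_to_format(df)
only_a <- sketch_columns_to_format(df, include_columns = "a")
not_b <- sketch_columns_to_format(df, exclude_columns = "b")
```

In this sketch, supplying both arguments means `include_columns` takes priority, which is one reading of "first checks"; the package may handle that combination differently.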
A data frame with columns formatted using dfeR::pretty_num().
Other prettying: pretty_filesize(), pretty_num(), pretty_time_taken()
# Example data frame
df <- data.frame(
  a = c(1.234, 5.678, 9.1011),
  b = c(10.1112, 20.1314, 30.1516),
  c = c("A", "B", "C")
)

# Apply formatting to all columns
pretty_num_table(df, dp = 2)

# Apply formatting to only selected columns
pretty_num_table(df, include_columns = c("a"), dp = 2)

# Apply formatting to all columns except specified ones
pretty_num_table(df, exclude_columns = c("b"), dp = 2)

# Apply formatting to all columns except specified ones and
# provide alternative value for NAs
pretty_num_table(df, alt_na = "[z]", exclude_columns = c("b"), dp = 2)
Converts a start and end value to a readable time format.
pretty_time_taken(start_time, end_time)
start_time |
start time readable by as.POSIXct |
end_time |
end time readable by as.POSIXct |
Designed to be used with Sys.time() when tracking start and end times.
Shows as seconds up until 119 seconds, then minutes until 119 minutes, then hours for anything larger.
Input start and end times must be convertible to POSIXct format.
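The elapsed-time calculation and unit thresholds can be sketched in base R. `sketch_time_unit()` is a hypothetical helper for illustration; it only picks the unit and does not reproduce the wording of the string that `pretty_time_taken()` returns.

```r
# Convert both inputs to POSIXct and take the difference in seconds
start_time <- as.POSIXct("2024-03-23 07:05:53", tz = "GMT")
end_time <- as.POSIXct("2024-03-23 12:09:56", tz = "GMT")
elapsed_secs <- as.numeric(difftime(end_time, start_time, units = "secs"))

# Hypothetical helper: seconds up to 119 s, minutes up to 119 min, then hours
sketch_time_unit <- function(seconds) {
  if (seconds < 120) {
    "seconds"
  } else if (seconds < 120 * 60) {
    "minutes"
  } else {
    "hours"
  }
}

unit <- sketch_time_unit(elapsed_secs)
```

The two timestamps are 5 hours, 4 minutes and 3 seconds apart (18,243 seconds), so the sketch selects "hours".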
string containing prettified elapsed time
comma_sep()
round_five_up()
as.POSIXct()
Other prettying: pretty_filesize(), pretty_num(), pretty_num_table()
pretty_time_taken(
  "2024-03-23 07:05:53 GMT",
  "2024-03-23 12:09:56 GMT"
)

# Track the start and end time of a process
start <- Sys.time()
Sys.sleep(0.1)
end <- Sys.time()

# Use this function to present it prettily
pretty_time_taken(start, end)
A lookup of ONS geography region names and codes for England. In their lookups Northern Ireland, Scotland and Wales are regions.
regions
A data frame with 16 rows and 2 columns:
Region name
Region code
Also includes the inner and outer London county split, as DfE frequently publishes those as regions, as well as some custom DfE names and codes. This is used as the definitive list for the screening of open data before it is published by the DfE.
curated by [email protected], ONS codes sourced from https://geoportal.statistics.gov.uk/search?q=NAC_RGN
Round any number to a specified number of places, with 5's being rounded up.
round_five_up(number, dp = 0)
number |
number to be rounded |
dp |
number of decimal places to round to, default is 0 |
Rounds to 0 decimal places by default.
You can use a negative value for the decimal places. For example, -1 would round to the nearest 10, -2 would round to the nearest 100, and so on.
This is an alternative to round() in base R, which uses banker's rounding (halves are rounded to the nearest even digit). For more information see the round() documentation.
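The difference from base R's `round()` can be seen with a small sketch. `sketch_round_half_up()` is a hypothetical helper, not the package's implementation, and like any floating point approach it can be affected by how a value is stored in binary when `dp` is above 0.

```r
# Hypothetical helper: always round halves away from zero.
sketch_round_half_up <- function(number, dp = 0) {
  scaled <- abs(number) * 10^dp
  sign(number) * floor(scaled + 0.5) / 10^dp
}

base_half <- round(2.5) # base R rounds the half to the even digit: 2
five_up_half <- sketch_round_half_up(2.5) # 3
```

The sketch scales the absolute value, adds 0.5 and takes the floor, then restores the sign, so -2.5 becomes -3 rather than -2.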
Rounded number
# No dp set
round_five_up(2485.85)

# With dp set
round_five_up(2485.85, 2)
round_five_up(2485.85, 1)
round_five_up(2485.85, 0)
round_five_up(2485.85, -1)
round_five_up(2485.85, -2)
Quick extension of the message() function, aimed at use within other functions, that makes it easy to add a global verbose TRUE / FALSE argument to toggle messages on or off.
toggle_message(..., verbose)
... |
any message you would normally pass into |
verbose |
logical, usually a variable passed from the function you are using this within |
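A wrapper of this kind can be sketched in a few lines of base R. `sketch_toggle_message()` is a hypothetical name and this is an assumed shape for such a helper, not the package's source; the `tryCatch()` calls exist only to capture what would otherwise be printed.

```r
# Hypothetical helper: forward to message() only when verbose is TRUE.
sketch_toggle_message <- function(..., verbose) {
  if (verbose) {
    message(...)
  }
  invisible(NULL)
}

# Capture the message text instead of printing it
shown <- tryCatch(
  {
    sketch_toggle_message("I have ", 5, " fingers", verbose = TRUE)
    NA_character_
  },
  message = function(m) conditionMessage(m)
)

# With verbose = FALSE no message condition is raised
hidden <- tryCatch(
  {
    sketch_toggle_message("Not shown", verbose = FALSE)
    NA_character_
  },
  message = function(m) conditionMessage(m)
)
```

Here `shown` contains the message text (with the trailing newline that `message()` appends) and `hidden` stays `NA` because nothing was emitted.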
# Usually used in a function
my_function <- function(count_fingers, verbose) {
  toggle_message("I have ", count_fingers, " fingers", verbose = verbose)
  fingers_thumbs <- count_fingers + 2
  toggle_message("I have ", fingers_thumbs, " digits", verbose = verbose)
}

my_function(5, verbose = FALSE)
my_function(5, verbose = TRUE)

# Can be used in isolation
toggle_message("I want the world to read this!", verbose = TRUE)
toggle_message("I ain't gonna show this message!", verbose = FALSE)

count_fingers <- 5
toggle_message("I have ", count_fingers, " fingers", verbose = TRUE)
A lookup showing the hierarchy of ward to Westminster parliamentary constituency to local authority district to local authority to region to country for years 2017, 2019, 2020, 2021, 2022, 2023 and 2024.
wd_pcon_lad_la_rgn_ctry
A data frame with 24,629 rows and 14 columns:
First year in the lookups that we see this location
Last year in the lookups that we see this location
Ward name
Parliamentary constituency name
Local authority district name
Local authority name
Region name
Country name
9 digit ward code
9 digit westminster constituency code
9 digit local authority district code
9 digit local authority code
9 digit region code
9 digit country code
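Storing the first and last year for each location makes it possible to filter the lookup to a single year. The sketch below uses a toy data frame: the column names `first_year`, `last_year`, `pcon_name` and `country_name`, and all of the rows, are assumptions made for illustration and are not taken from the real data set.

```r
# Toy lookup with hypothetical column names and made-up rows
lookup <- data.frame(
  first_year = c(2017, 2017, 2023),
  last_year = c(2024, 2021, 2024),
  pcon_name = c("Constituency A", "Constituency B", "Constituency C"),
  country_name = c("England", "Wales", "England")
)

year <- 2022

# Keep locations whose first-to-last year range covers the chosen year
in_year <- lookup[lookup$first_year <= year & lookup$last_year >= year, ]

# Then narrow to a country and keep the unique names
england <- unique(
  in_year[in_year$country_name == "England", c("pcon_name", "country_name")]
)
```

Only "Constituency A" spans 2022 in this toy data, so it is the single row left in both `in_year` and `england`.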
Changes we've made to the original lookup:
The original lookup from ONS uses the Upper Tier Local Authority. We update this so that, where there is a metropolitan local authority, we use the local authority district as the local authority, to match how DfE publish data for local authorities.
We noticed that in the 2017 version the Glasgow East constituency had a code of S1400030 instead of the usual S14000030. We've assumed this was an error and have changed it in our data so that Glasgow East is S14000030 in 2017.
We have joined on regions using the Ward to LAD to County to Region file.
We have joined on countries based on the E / N / S / W at the start of codes.
Scotland had no published regions in 2017, so given the rest of the years have Scotland as the region, we've forced that in for 2017 too to complete the data set.
https://geoportal.statistics.gov.uk/search?tags=lup_wd_pcon_lad_utla and https://geoportal.statistics.gov.uk/search?q=lup_wd_lad_cty_rgn_gor_ctry
Replaces NA values in tables except for ones in time and geography columns that must be included in DfE official statistics.
Get more guidance on Open Data Standards.
z_replace(data, replacement_alt = NULL, exclude_columns = NULL)
data |
name of the table that you want to replace NA values in |
replacement_alt |
optional - if you want the NA replacement value to be different to "z" |
exclude_columns |
optional - additional columns to exclude from
NA replacement.
Column names that match ones found in |
Names of geography and time columns that are used in this function can be found in dfeR::geog_time_identifiers.
table with "z" or an alternate replacement value instead of NA values for columns that are not for time or geography.
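The selective replacement can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: `identifier_columns` is a hand-written stand-in for the names held in dfeR::geog_time_identifiers, and the data frame is a cut-down toy example.

```r
toy <- data.frame(
  time_period = c(2022, 2022),
  region_name = c(NA, "North East"),
  mystery_count = c(42, NA)
)

# Assumed stand-in for the time and geography column names
identifier_columns <- c("time_period", "region_name")

# Only columns outside the identifier list have their NAs replaced
target_columns <- setdiff(names(toy), identifier_columns)

toy[target_columns] <- lapply(toy[target_columns], function(column) {
  column <- as.character(column) # "z" forces the column to character
  column[is.na(column)] <- "z"
  column
})
```

After this runs, `mystery_count` holds "42" and "z", while the `NA` in `region_name` is left alone because it sits in an identifier column.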
# Create a table for the example
df <- data.frame(
  time_period = c(2022, 2022, 2022),
  time_identifier = c("Calendar year", "Calendar year", "Calendar year"),
  geographic_level = c("National", "Regional", "Regional"),
  country_code = c("E92000001", "E92000001", "E92000001"),
  country_name = c("England", "England", "England"),
  region_code = c(NA, "E12000001", "E12000002"),
  region_name = c(NA, "North East", "North West"),
  mystery_count = c(42, 25, NA)
)

z_replace(df)

# Use a different replacement value
z_replace(df, replacement_alt = "c")