galah 1.5.2

CRAN release: 2023-03-10

Minor release to resolve issues on CRAN, and a few recent bugs.

Bug fixes

  • Prevent error when providing a tibble as input to search_taxa() (e.g., to resolve homonyms, #168)
  • Better error message when email address is required, but not given (#179)
  • Add an informative message when users call galah_select() while atlas = GBIF (which is not supported; #181)
  • Ensure DOIs are added to downloads when requested (#182)
  • Improve tests to avoid flagging issues on CRAN when one or more atlases are down (#184)
  • Resolve problem where some queries were replaced by ... in galah_filter() (#186)

galah 1.5.1

CRAN release: 2023-01-13

Mask function names from other packages

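An experimental feature of version 1.5.1 is the ability to call functions from other packages (#161) as synonyms for galah_ functions. These include filter() and group_by() (synonyms for galah_filter() and galah_group_by()) and count() (a synonym for atlas_counts()).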

These are implemented as S3 methods for objects of class data_request, which are created by galah_call(). Hence new function names only work when piped after galah_call().
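As a minimal sketch of the masked syntax, the query below pipes dplyr-style verbs after galah_call(). It assumes that dplyr is attached to supply the filter(), group_by() and count() generics, that the atlas is reachable over the network, and that year and basisOfRecord are valid fields; these choices are illustrative rather than taken from the release notes.

```r
library(galah)
library(dplyr)  # assumption: supplies the filter(), group_by() and count() generics

# The masked names reach galah's methods only because galah_call()
# returns an object of class data_request.
galah_call() |>
  filter(year >= 2020) |>     # synonym for galah_filter()
  group_by(basisOfRecord) |>  # synonym for galah_group_by()
  count()                     # synonym for atlas_counts()
```

Called on an ordinary data frame, the same verbs keep their usual dplyr behaviour.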

Experimental support for GBIF queries

The Global Biodiversity Information Facility (GBIF) is the umbrella organisation to which all other atlases supply data. Hence it is logical to be able to query GBIF and its “nodes” (i.e. the living atlases) via a common API. Supported functions are:

  • search_taxa and galah_identify for name matching
  • show_all(fields) and show_all(assertions)
  • show_all() calls that give ‘collections’ information are limited to 20 records by default, as GBIF datasets are often huge. search_all() is generally more reliable
  • show_values() for any GBIF field
  • galah_filter and galah_group_by (and therefore filter and group_by(), see above), but NOT galah_select.
  • atlas_counts() (and therefore count(), see above)
  • atlas_occurrences() & atlas_species(); both are implemented via the ‘downloads’ system, meaning that queries can be larger, but may be slow

The current implementation is experimental and back-end changes are expected in future. Users who require a more stable implementation should use the {rgbif} package.
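A hedged sketch of a GBIF count query using only the supported functions listed above. The taxon name and the year field are illustrative assumptions, and the call needs network access; because the GBIF back end is described as experimental, behaviour may differ between versions.

```r
library(galah)

# Point galah at GBIF instead of the default atlas.
galah_config(atlas = "GBIF")

# Count records for one taxon in a single year.
# galah_select() is deliberately absent: it is not supported for GBIF.
galah_call() |>
  galah_identify("Mammalia") |>
  galah_filter(year == 2020) |>
  atlas_counts()

# Switch back to the Australian atlas afterwards.
galah_config(atlas = "ALA")
```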

Minor improvements

  • galah_config() gains a print function, and now uses fuzzy matching for the atlas field to match to region, organisation or acronym (as defined by show_all(atlases)). An example use case is to match to organisations via acronyms, e.g. galah_config(atlas = "ALA").
  • Improved support for data from Spain via gbif.es (name-matching, lists, spatial)
  • Swapped provider for data from France; formerly gbif.fr, now OpenObs, as per advice from maintainers
  • Reading data from disk now uses readr::read_csv in place of utils::read.csv for improved speed
  • show_all (and associated sub-functions) gain a limit argument, set to NULL (i.e. no limit) by default
  • galah no longer imports data.table, since the only function previously used from that package (rbindlist) is duplicated by dplyr::bind_rows
  • Help files are now built without markdown for improved speed (mainly while building)
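The fuzzy matching in galah_config() and the new limit argument can be sketched as follows. The specific strings passed to atlas and the limit of 10 are illustrative assumptions; show_all(fields) needs network access.

```r
library(galah)

# List the atlases that galah knows about, with their region,
# organisation and acronym.
show_all(atlases)

# Select an atlas by acronym or by region; both are matched against
# the table returned by show_all(atlases).
galah_config(atlas = "ALA")
galah_config(atlas = "Australia")

# Return only the first 10 rows instead of every available field.
show_all(fields, limit = 10)
```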

Bug fixes

  • New function url_paginate() to handle cases where pagination is needed, but total data length is unknown (e.g. show_all_lists(), #170).
  • galah_select(group = "assertions") is always enacted properly by atlas_occurrences, and won’t lead to overly long urls (#137). When called without any other field names, recordID is added to avoid triggering the ‘default’ set of columns.
  • atlas_species works again after some minor changes to the API, but requires a registered email to function

galah 1.5.0

CRAN release: 2022-10-27

Expanded support for querying other International Living Atlases

  • Support for complex queries to 10 Living Atlases, including France, Guatemala and Sweden. Complex queries can be constructed using galah_call(), filtered with galah_ functions, and downloaded with atlas_ functions. Previously, this functionality was only possible with queries to the ALA (#126)

collect_media()

Updates to galah_geolocate()

show_all(), search_all() & show_values(), search_values()

Minor improvements

  • Apply data quality profiles in a pipe with the galah_apply_profile() function (#130)
  • Improved internal consistency of galah_ functions (#133)
  • galah_geolocate() no longer depends on archived {wellknown} package (#141)
  • Added support for queries to exclude/include missing values (e.g. galah_filter(species != "") or galah_filter(species == "")) (#143)
  • Re-download a previously-minted DOI with collect_doi() (#140)
  • More checks to ensure galah fails gracefully when an API fails (#157)
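A brief sketch of the missing-value filters mentioned above, assuming network access to the default atlas. Counting each query separately is just one illustrative way to compare the two.

```r
library(galah)

# Records where the species field is populated.
galah_call() |>
  galah_filter(species != "") |>
  atlas_counts()

# Records where the species field is missing.
galah_call() |>
  galah_filter(species == "") |>
  atlas_counts()
```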

Bug fixes

galah 1.4.0

CRAN release: 2022-01-24

Revamped syntax

  • ala_ functions are renamed to use the prefix atlas_ (i.e., atlas_occurrences, atlas_counts, atlas_species, atlas_media, atlas_taxonomy, atlas_citation). This change reflects their functionality with international atlases (#103)
  • select_taxa is replaced by 3 functions: galah_identify, search_taxa and search_identifiers. galah_identify is used when building data queries, whereas search_taxa and search_identifiers are now exclusively used to search for taxonomic information. Syntax changes are intended to reflect their usage and expected output (#112, #122)
  • select_ functions are renamed to use the prefix galah_. Specifically, galah_filter, galah_select and galah_geolocate replace select_filters, select_columns and select_locations. These syntax changes reflect a move towards consistency with dplyr naming and functionality (#101, #108)
  • find_ functions that provide a listing of all possible values are renamed to show_all_ (i.e., show_all_profiles, show_all_ranks, show_all_atlases, show_all_cached_files, show_all_fields, show_all_reasons). find_ functions that require an input and return specific results are renamed to search_ (i.e., search_field_values, search_profile_attributes) (#112, #113)

galah_group_by

galah_down_to

Pipe queries using galah_call

  • Build data queries using piping syntax (i.e., |>, %>%) by first using galah_call(), narrowing queries with galah_ functions and finishing queries with an atlas_ function (#60, #120).
  • S3 methods are now implemented for functions to allow piping (#40)
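The three-stage pattern described above (start with galah_call(), narrow with galah_ functions, finish with an atlas_ function) might look like the sketch below. The taxon name and field names are illustrative assumptions, and the query needs network access.

```r
library(galah)

galah_call() |>                  # 1. start an empty data request
  galah_identify("Litoria") |>   # 2. narrow by taxon ...
  galah_filter(year >= 2020) |>  #    ... and by field values
  galah_group_by(year) |>        #    ... and group the counts
  atlas_counts()                 # 3. finish by sending the query
```

Because each step takes the request as its first argument and returns it, the same chain can be written with %>% by attaching magrittr.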

Minor improvements

  • Improved error messages using {glue} and {rlang} (#117)
  • Revamped syntax functions return output as tibbles (#110, #118)
  • Pass vectors to galah_filter (#91, #92)
  • Cache valid fields for faster field look up (#73, #116)
  • New vignettes for updated syntax (#104, #105), plus improvements to previous vignettes.
  • Updated R Markdown-style documentation and added warnings for deprecated functions (#113, #121)

Bug fixes

  • galah no longer returns error when ALA system is down and/or API fails (#102, #119)
  • search_taxa returns correct IDs for search terms with parentheses (#96)
  • search_taxa returns best-fit taxonomic result when ranks are specified in data.frame or tibble (#115)

galah 1.3.1

CRAN release: 2021-08-21

search_taxonomy() renamed to ala_taxonomy()

  • bug fix: ala_taxonomy no longer fails for nodes ranked as informal or unranked (#86)
  • this function now returns a tree built using the data.tree package
  • change in function name required for greater consistency with other data-providing functions in galah

Vignettes

  • vignettes are now pre-compiled to avoid failing on CRAN (#85)
  • expanded vignette on navigating taxonomic information (#42)

galah 1.3.0

CRAN release: 2021-08-06

galah_config()

search_taxonomy()

  • search_taxonomy() provides a means to search for taxonomic names and check that the results are ‘correct’ (e.g., not ambiguous or homonymous) before proceeding to download data via ala_occurrences(), ala_species() or ala_counts() (#64, #75)
  • search_taxonomy() returns information on the author and authority of taxonomic names (#79)
  • search_taxonomy() consistently orders column names, including in correct taxonomic order by rank (#81)

Caching helper functions

Minor improvements

  • Cache files are saved in RDS format, making query attributes easier to find, including data DOI, search url (#55, #32, #28)
  • ala_media() caches media metadata if galah_config(caching = TRUE)
  • search_fields() allows the user to pass a qid as an argument (#59)
  • Users can now optionally skip filter and count validation checks to spatial and biocache web services by setting galah_config(run_checks = FALSE). This avoids slowing down downloads when many requests are made in quick succession via galah_filter() or ala_occurrences() (#61, #80)
  • ala_counts(), select_columns() and search_fields() now use match.arg to approximate strings through fuzzy matching (#66)
  • Better handling of cache errors and improved error messages (#70)
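The two configuration switches mentioned in this list can be set as sketched below. Setting both together is an illustrative choice, not a recommendation from the release notes.

```r
library(galah)

# Cache results (including media metadata) so that repeated
# identical calls can reuse files on disk.
galah_config(caching = TRUE)

# Skip the filter and count validation checks, which is useful when
# sending many requests in quick succession.
galah_config(run_checks = FALSE)

# Restore validation once the batch of requests has finished.
galah_config(run_checks = TRUE)
```

Without the checks, a misspelt field name may not be caught until the atlas responds, so re-enabling them for interactive work is a sensible default.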

Bug fixes

  • select_columns(group = 'assertions') now sends qa = includeall to ALA web service API to return all assertion columns (#48)
  • ala_occurrences() returns data DOI when ala_occurrences(mint_doi = TRUE) and re-downloads data when called multiple times (#56)
  • ala_occurrences() no longer converts field names with all-CAPS to camelCase (#62)

galah 1.2.0

CRAN release: 2021-07-02

Living Atlases

  • ala_config() allows users to specify an international Atlas to download data from (#21)

Minor improvements

  • ala_media() includes the file path to the downloaded media in the returned metadata (#22)
  • Data returned from ala_occurrences() contains the search_url used to download records; this takes the user to the website search page (#32)
  • ala_species() provides a more helpful error if no species are found (#39)
  • Data quality filters are created using the specific web service argument, rather than constructing filters from the attributes (#37)
  • select_taxa() has an optional all_ranks argument to return intermediate rank information (#35)

Bug fixes

  • R > 4.0.0 is now required (#43, #45)
  • select_taxa() behaves as expected when character strings of 32 or 36 characters are provided (#23)
  • Caching functionality for ala_occurrences() uses the columns as expected (#30)
  • galah_filter() negates assertion filters when required, fixing the issue of assertion values being ignored (#27)
  • select_taxa() no longer throws an error when queries of more than one term have a differing number of columns in the return value (#41)
  • ala_counts() returns data.frame with consistent column classes when a group_by parameter is called multiple times and ala_config(caching = TRUE) (#47)
  • ala_ functions fail gracefully if a non-id character string is passed (#49)

galah 1.1.0

CRAN release: 2021-05-05

Downloading media

  • ala_media() now takes the same select_ arguments as other ala_ functions (#18)
  • Filtering by media metadata e.g. licence type is possible (#19)
  • search_fields now has media as a type argument option
  • Performance improvement in download times (#13)
  • Progress bar displayed for downloads when verbose == TRUE (#8)
  • All media download types are supported

select_ functions

  • galah_location auto-detects the type of argument provided and so takes a single argument, query, in place of sf and wkt (#17)
  • select_taxa auto-detects the type of argument provided and so takes a single argument, query, in place of term and term_type (#16)

Minor improvements

  • Provide more useful error message for empty occurrence download (#7)
  • ala_counts uses the group_by field name as the returned data.frame column name (#6)
  • ala_occurrences sends sourceId parameter to ALA (#5)
  • search_fields provides a more helpful error for invalid types (#11)

galah 1.0.0

CRAN release: 2021-04-06

First version of galah, built on earlier functionality from the ALA4R package.