An alternative to using collect() at the end of a query pipe is to call a function with the atlas_ prefix. The two approaches are functionally equivalent, but atlas_ functions differ in two ways:

  • They accept filter, select, etc. as arguments as well as within a pipe, but only when using the galah_ forms of those functions (e.g. galah_filter()).

  • atlas_ functions do not require you to specify the method or type arguments to galah_call(), as they are more specific in what data are being requested.
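
The two calling styles described above can be sketched as follows. This is a minimal illustration, assuming the galah package is installed and an internet connection is available. The function names counts_via_pipe and counts_via_arguments are hypothetical wrappers, used here so that no request is sent to the atlas until one of them is called.

```r
# Pipe form: start a query with galah_call(), add a filter,
# then finish with atlas_counts() instead of collect().
# (counts_via_pipe is an illustrative name, not part of galah.)
counts_via_pipe <- function() {
  galah::galah_call() |>
    galah::galah_filter(year == 2015) |>
    galah::atlas_counts()
}

# Argument form: pass the galah_ helper directly to the atlas_ function.
# No method or type needs to be specified, because atlas_counts()
# already determines what kind of data is requested.
counts_via_arguments <- function() {
  galah::atlas_counts(filter = galah::galah_filter(year == 2015))
}
```

Calling either wrapper should request the same record count for 2015; which form to use is largely a matter of style.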

Usage

atlas_occurrences(
  request = NULL,
  identify = NULL,
  filter = NULL,
  geolocate = NULL,
  data_profile = NULL,
  select = NULL,
  mint_doi = FALSE,
  doi = NULL,
  file = NULL
)

atlas_counts(
  request = NULL,
  identify = NULL,
  filter = NULL,
  geolocate = NULL,
  data_profile = NULL,
  group_by = NULL,
  limit = NULL,
  type = c("occurrences", "species")
)

atlas_species(
  request = NULL,
  identify = NULL,
  filter = NULL,
  geolocate = NULL,
  data_profile = NULL
)

atlas_media(
  request = NULL,
  identify = NULL,
  filter = NULL,
  select = NULL,
  geolocate = NULL,
  data_profile = NULL
)

atlas_taxonomy(
  request = NULL,
  identify = NULL,
  filter = NULL,
  constrain_ids = NULL
)

Arguments

request

optional data_request object: generated by a call to galah_call().

identify

tibble: generated by a call to galah_identify().

filter

tibble: generated by a call to galah_filter().

geolocate

string: generated by a call to galah_geolocate().

data_profile

string: generated by a call to galah_apply_profile().

select

tibble: generated by a call to galah_select().

mint_doi

logical: by default no DOI will be generated. Set to TRUE if you intend to use the data in a publication or similar.

doi

string: (Optional) DOI to download. If provided, this overrides all other arguments. Only available for the ALA.

file

string: (Optional) file name. If not given, the name defaults to data with the date and time appended. The file path (directory) is always given by galah_config()$package$directory.

group_by

tibble: generated by a call to galah_group_by().

limit

numeric: maximum number of categories to return, defaulting to 100. If limit is NULL, all results are returned. For some categories this will take a while.

type

string: one of "occurrences" or "species". Defaults to "occurrences", which returns the number of records that match the selected criteria; alternatively returns the number of species. Formerly accepted arguments ("records" or "species") are deprecated but remain functional.

constrain_ids

string: Optional string to limit which taxon_concept_id values are returned. This is useful for restricting taxonomy to particular authoritative sources. Default is "biodiversity.org.au" for Australia, which is the infix common to National Species List IDs; use NULL to suppress source filtering. Regular expressions are supported.
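
To illustrate what matching on an infix means for constrain_ids, the sketch below applies the default pattern to two identifiers using base R. Both identifiers are made up for illustration, and the use of grepl() is an assumption chosen to show regular-expression matching in general; it is not a description of how galah applies the constraint internally.

```r
# Two made-up identifiers: one contains the default infix, one does not
ids <- c(
  "https://id.biodiversity.org.au/taxon/example/1",  # hypothetical ID with the infix
  "https://example.org/taxon/2"                      # hypothetical ID from another source
)

# The default value of constrain_ids, treated as a regular expression
pattern <- "biodiversity.org.au"

# Keep only identifiers that contain the pattern
keep <- grepl(pattern, ids)
matched_ids <- ids[keep]
print(matched_ids)
```

Only the first identifier is retained, which mirrors the idea of restricting results to a single authoritative source.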

Value

An object of class tbl_df and data.frame (aka a tibble). For atlas_occurrences() and atlas_species(), this will have columns specified by select(). For atlas_counts(), it will have columns specified by group_by().

Details

Unless care is taken, some queries can be very large. In most cases a large query will simply take a long time to process, but if the number of requested records is >50 million, the call will not return any data. Users can test whether this threshold will be reached by first calling atlas_counts() with the same arguments that they intend to pass to atlas_occurrences(). When requesting a large number of records, it may also help to show a progress bar by setting verbose = TRUE in galah_config(), or to use compute() to run the call and then retrieve the result later with collect().
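
The count-before-download check described above might be sketched as follows. The names max_records, within_download_limit() and download_reptiles_2015() are illustrative and not part of galah. The sketch also assumes that atlas_counts(), called without group_by, returns a one-row tibble with a column named count, and that an email has already been registered with galah_config().

```r
# Threshold taken from the text above: requests for more than
# 50 million records do not return any data.
max_records <- 50e6

# Hypothetical helper: TRUE when a single, non-missing count
# is at or below the download limit.
within_download_limit <- function(n_records, limit = max_records) {
  is.numeric(n_records) &&
    length(n_records) == 1 &&
    !is.na(n_records) &&
    n_records <= limit
}

# Hypothetical wrapper: build the query once, count first,
# and download only if the count is under the limit.
download_reptiles_2015 <- function() {
  query <- galah::galah_call() |>
    galah::galah_identify("Reptilia") |>
    galah::galah_filter(year == 2015)

  n <- galah::atlas_counts(request = query)$count  # assumes a `count` column
  if (!within_download_limit(n)) {
    stop("Query matches ", n, " records; narrow the filters before downloading.")
  }
  galah::atlas_occurrences(request = query)
}
```

Reusing the same query object for both calls ensures that the count reflects exactly what would be downloaded.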

Examples

if (FALSE) { # \dontrun{
# Best practice is to first calculate the number of records
galah_call() |>
  filter(year == 2015) |>
  atlas_counts()

# Download occurrence records for a specific taxon
galah_config(email = "your_email_here") # login required for downloads
galah_call() |>
  identify("Reptilia") |>
  atlas_occurrences()

# Download occurrence records in a year range
galah_call() |>
  identify("Litoria") |>
  filter(year >= 2010 & year <= 2020) |>
  atlas_occurrences()
  
# Download occurrence records in a WKT-specified area
polygon <- "POLYGON((146.24960 -34.05930,
                     146.37045 -34.05930,
                     146.37045 -34.15254,
                     146.24960 -34.15254,
                     146.24960 -34.05930))"
galah_call() |> 
  identify("Reptilia") |>
  filter(year >= 2010, year <= 2020) |>
  st_crop(polygon) |>
  atlas_occurrences()
  
# Get a list of species within genus "Heleioporus"
# (every row is a species with associated taxonomic data)
galah_call() |>
  galah_identify("Heleioporus") |>
  atlas_species()

# Download Regent Honeyeater records with multimedia attached
# Note this returns one row per multimedia file, NOT one per occurrence
galah_call() |>
  identify("Regent Honeyeater") |>
  filter(year == 2011) |>
  atlas_media()

# Get a taxonomic tree of *Chordata* down to the class level
galah_call() |> 
  galah_identify("chordata") |>
  galah_filter(rank == class) |>
  atlas_taxonomy()
} # }