
GBIF and its partner nodes store content in hundreds of different fields, and users often require thousands or millions of records at a time. To reduce download time and limit the complexity of the resulting tibble, it is sensible to restrict the fields returned by atlas_occurrences(). This function allows easy selection of individual fields, or of commonly requested groups of columns, using syntax shared with dplyr::select().

The full list of available fields can be viewed with show_all(fields). Note that select() and galah_select() are supported for all atlases that allow downloads, with the exception of GBIF, for which all columns are returned.
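As a rough sketch of how the available fields might be inspected before choosing columns, the snippet below lists all fields and then narrows the list by keyword. It assumes the galah package is installed and an atlas is reachable over the network; the search term "date" and the use of search_all() are illustrative choices, not part of the original text.

```r
library(galah)

# List every field exposed by the currently configured atlas.
# This queries the atlas, so it needs a network connection.
all_fields <- show_all(fields)
head(all_fields)

# Assumption: search_all() is used here to narrow the list to fields
# matching an illustrative keyword ("date").
date_fields <- search_all(fields, "date")
date_fields
```

Field names found this way can then be passed unquoted to galah_select() or select().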


Usage

galah_select(..., group)

# S3 method for data_request
select(.data, ..., group)



Arguments

...
zero or more individual column names to include


group
string: (optional) name of one or more column groups to include. Valid options are "basic", "event", "media" and "assertions"


.data
An object of class data_request, created using galah_call()


Value

A tibble specifying the name and type of each column to include in the call to atlas_counts() or atlas_occurrences().


Details

Using group = "basic" returns the following columns:

  • decimalLatitude

  • decimalLongitude

  • eventDate

  • scientificName

  • taxonConceptID

  • recordID

  • dataResourceName

  • occurrenceStatus

Using group = "event" returns the following columns:

  • eventRemarks

  • eventTime

  • eventID

  • eventDate

  • samplingEffort

  • samplingProtocol

Using group = "media" returns the following columns:

  • multimedia

  • multimediaLicence

  • images

  • videos

  • sounds

Using group = "assertions" returns all quality assertion-related columns. The list of assertions is shown by show_all_assertions().
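To illustrate how column groups might be combined, here is a minimal sketch that asks for both the "basic" and "media" groups in one request. It assumes galah is installed and that the request is built without downloading anything; the object name media_request is hypothetical.

```r
library(galah)

# Describe a request that asks for two column groups at once.
# `group` accepts one or more group names, so a character vector is used.
# Assumption: nothing is downloaded here; the request is only described.
media_request <- galah_call() |>
  galah_select(group = c("basic", "media"))

# Inspect the class of the object passed along the pipe.
class(media_request)
```

Piping the request on to atlas_occurrences() would then perform the download with only those columns.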

See also

search_taxa(), galah_filter() and galah_geolocate() for other ways to restrict the information returned by atlas_occurrences() and related functions; atlas_counts() for how to get counts by levels of variables returned by galah_select(); show_all(fields) to list available fields.


Examples

if (FALSE) {
# Download occurrence records of *Perameles*,
# Only return scientificName and eventDate columns
galah_config(email = "")  # replace "" with your registered email address
galah_call() |>
  galah_identify("perameles") |>
  galah_select(scientificName, eventDate) |>
  atlas_occurrences()

# Only return the "basic" group of columns and the basisOfRecord column
galah_call() |>
  galah_identify("perameles") |>
  galah_select(basisOfRecord, group = "basic") |>
  atlas_occurrences()

# When used in a pipe, `galah_select()` and `select()` are synonymous.
# Hence the previous example can be rewritten as:
request_data() |>
  identify("perameles") |>
  select(basisOfRecord, group = "basic") |>
  collect()
}