GBIF nodes store content in hundreds of different fields, and users often
require thousands or millions of records at a time. To reduce the time taken
to download data, and to limit the complexity of the resulting tibble, it is
sensible to restrict the fields returned by atlas_occurrences(). This function
allows easy selection of fields, or commonly-requested groups of columns,
following syntax shared with dplyr::select().
The full list of available fields can be viewed with show_all(fields). Note
that select() and galah_select() are supported for all atlases that allow
downloads, with the exception of GBIF, for which all columns are returned.
Usage
galah_select(..., group)
# S3 method for data_request
select(.data, ..., group)
Arguments
- ...
zero or more individual column names to include
- group
string: (optional) name of one or more column groups to include. Valid options are "basic", "event", "taxonomy", "media" and "assertions".
- .data
An object of class data_request, created using galah_call()
Value
A tibble specifying the name and type of each column to include in the
call to atlas_counts() or atlas_occurrences().
Details
Using group = "basic" returns the following columns:
decimalLatitude
decimalLongitude
eventDate
scientificName
taxonConceptID
recordID
dataResourceName
occurrenceStatus
Using group = "event" returns the following columns:
eventRemarks
eventTime
eventID
eventDate
samplingEffort
samplingProtocol
Using group = "media" returns the following columns:
multimedia
multimediaLicence
images
videos
sounds
Using group = "taxonomy" returns higher taxonomic information for a given
query. It is the only group that is accepted by atlas_species() as well
as atlas_occurrences().
Using group = "assertions" returns all quality assertion-related
columns. The list of assertions is shown by show_all_assertions().
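The groups described above can be combined with each other and with individual fields. The following is a minimal sketch rather than a definitive recipe: it assumes galah is installed, that an email registered with the atlas has been supplied via galah_config(), and that group accepts a character vector of group names, as the argument description ("one or more column groups") suggests. The wrapper name download_perameles() is hypothetical and exists only so that nothing is downloaded until it is called.

```r
library(galah)

# Hypothetical helper: wraps the request so no download starts until called.
download_perameles <- function() {
  galah_call() |>
    galah_identify("perameles") |>
    # basisOfRecord is an individual field; "basic" and "event" are groups
    galah_select(basisOfRecord, group = c("basic", "event")) |>
    atlas_occurrences()
}

# Requires network access and a registered email, for example:
# galah_config(email = "your-email@email.com")
# occurrences <- download_perameles()
# colnames(occurrences)
```

Inspecting colnames() on the result is a quick way to confirm which columns the chosen groups actually returned for a given atlas.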
For atlas_occurrences(), arguments passed to ... should be valid field
names, which you can check using show_all(fields). For atlas_species(),
they should be one or more of:
- counts
to include counts of occurrences per species.
- synonyms
to include any synonymous names.
- lists
to include authoritative lists that each species is included on.
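As a sketch of the atlas_species() case described above, the requests below ask for the "taxonomy" group and, separately, for occurrence counts per species. This assumes galah is installed and configured with a registered email. The wrapper names are hypothetical, and whether counts is accepted may depend on the installed galah version and the atlas being queried.

```r
library(galah)

# Hypothetical helpers; nothing is requested until they are called.

# Species list with higher taxonomic information.
species_with_taxonomy <- function() {
  galah_call() |>
    galah_identify("perameles") |>
    galah_select(group = "taxonomy") |>
    atlas_species()
}

# Species list with a count of occurrences per species (assumed to be
# supported by the installed version, per the list above).
species_with_counts <- function() {
  galah_call() |>
    galah_identify("perameles") |>
    galah_select(counts) |>
    atlas_species()
}

# Requires network access, for example:
# galah_config(email = "your-email@email.com")
# species_with_taxonomy()
```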
See also
search_taxa(), galah_filter() and galah_geolocate() for other ways to
restrict the information returned by atlas_occurrences() and related
functions; atlas_counts() for how to get counts by levels of variables
returned by galah_select; show_all(fields) to list available fields.
Examples
if (FALSE) {
# Download occurrence records of *Perameles*,
# Only return scientificName and eventDate columns
galah_config(email = "your-email@email.com")
galah_call() |>
  galah_identify("perameles") |>
  galah_select(scientificName, eventDate) |>
  atlas_occurrences()

# Only return the "basic" group of columns and the basisOfRecord column
galah_call() |>
  galah_identify("perameles") |>
  galah_select(basisOfRecord, group = "basic") |>
  atlas_occurrences()

# When used in a pipe, `galah_select()` and `select()` are synonymous.
# Hence the previous example can be rewritten as:
request_data() |>
  identify("perameles") |>
  select(basisOfRecord, group = "basic") |>
  collect()
}