Look up information
Martin Westgate & Dax Kellie
2024-04-09
Source:vignettes/look_up_information.Rmd
galah supports two functions to look up information: show_all() and search_all(). The first argument to both functions is the type of information that you wish to look up. For example, to see what fields are available to filter a query by, use:
show_all(fields)
## # A tibble: 646 × 3
## id description type
## <chr> <chr> <chr>
## 1 _nest_parent_ <NA> fields
## 2 _nest_path_ <NA> fields
## 3 _root_ <NA> fields
## 4 abcdTypeStatus <NA> fields
## 5 acceptedNameUsage Accepted name fields
## 6 acceptedNameUsageID Accepted name fields
## 7 accessRights Access rights fields
## 8 annotationsDoi <NA> fields
## 9 annotationsUid Referenced by publication fields
## 10 assertionUserId Assertions by user fields
## # ℹ 636 more rows
And to search for a specific field:
search_all(fields, "australian states")
## # A tibble: 2 × 3
## id description type
## <chr> <chr> <chr>
## 1 cl2013 ASGS Australian States and Territories fields
## 2 cl22 Australian States and Territories fields
Here is a list of information types that can be used with show_all() and search_all():
Information type | Description | Sub-functions |
---|---|---|
Configuration | ||
atlases | Show what living atlases are available | show_all_atlases(), search_atlases() |
apis | Show what APIs & functions are available for each atlas | show_all_apis(), search_apis() |
reasons | Show what values are acceptable as ‘download reasons’ for a specified atlas | show_all_reasons(), search_reasons() |
Taxonomy | ||
taxa | Search for one or more taxonomic names | search_taxa() |
identifiers | Take a universal identifier and return taxonomic information | search_identifiers() |
ranks | Show valid taxonomic ranks (e.g. Kingdom, Class, Order, etc.) | show_all_ranks(), search_ranks() |
Filters | ||
fields | Show fields that are stored in an atlas | show_all_fields(), search_fields() |
assertions | Show results of data quality checks run by each atlas | show_all_assertions(), search_assertions() |
Group filters | ||
profiles | Show what data quality profiles are available | show_all_profiles(), search_profiles() |
lists | Show what species lists are available | show_all_lists(), search_lists() |
Data providers | ||
providers | Show which institutions have provided data | show_all_providers(), search_providers() |
collections | Show the specific collections within those institutions | show_all_collections(), search_collections() |
datasets | Show the data groupings within those collections | show_all_datasets(), search_datasets() |
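
Each row of the table pairs an information type with its subfunctions. As a minimal sketch (assuming galah is installed and the default atlas is reachable over the network), the generic call and the matching subfunction are expected to return the same information. The object names below are illustrative only:

```r
library(galah)

# Generic form: the information type is the first argument
fields_generic <- show_all(fields)

# Subfunction form listed in the "Sub-functions" column of the table
fields_direct <- show_all_fields()

# Both calls should describe the same set of fields
dim(fields_generic)
dim(fields_direct)
```

Which form to use is mostly a matter of taste: the generic form keeps code uniform across information types, while the subfunctions are easier to find with autocomplete.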
show_all_ subfunctions
While show_all() is useful for a variety of cases, you can still call the underlying subfunctions if you prefer. Functions with the prefix show_all_ do exactly that; they show all the possible values of the category specified.
show_all_atlases()
## # A tibble: 11 × 4
## region institution acronym url
## <chr> <chr> <chr> <chr>
## 1 Australia Atlas of Living Australia ALA https://www.ala.org.au
## 2 Austria Biodiversitäts-Atlas Österreich BAO https://biodiversityatlas.at
## 3 Brazil Sistemas de Informações sobre a Biodiversidade Brasileira SiBBr https://sibbr.gov.br
## 4 Estonia eElurikkus <NA> https://elurikkus.ee
## 5 France Portail français d'accès aux données d'observation sur les espèces OpenObs https://openobs.mnhn.fr
## 6 Global Global Biodiversity Information Facility GBIF https://gbif.org
## 7 Guatemala Sistema Nacional de Información sobre Diversidad Biológica de Guatemala SNIBgt https://snib.conap.gob.gt
## 8 Portugal GBIF Portugal GBIF.pt https://www.gbif.pt
## 9 Spain GBIF Spain GBIF.es https://www.gbif.es
## 10 Sweden Swedish Biodiversity Data Infrastructure SBDI https://biodiversitydata.se
## 11 United Kingdom National Biodiversity Network NBN https://nbn.org.uk
show_all_reasons()
## # A tibble: 13 × 2
## id name
## <int> <chr>
## 1 1 biosecurity management/planning
## 2 11 citizen science
## 3 5 collection management
## 4 0 conservation management/planning
## 5 7 ecological research
## 6 3 education
## 7 2 environmental assessment
## 8 12 restoration/remediation
## 9 4 scientific research
## 10 8 systematic research/taxonomy
## 11 13 species modelling
## 12 6 other
## 13 10 testing
search_ subfunctions
You can also call subfunctions that use the search_ prefix to look up information. search_ subfunctions differ from show_all_ subfunctions in that they require a query to work, and they are useful for finding detailed information that can’t be summarised across the whole atlas.
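
As a brief sketch of that difference, the calls below pass a query string to two of the search_ subfunctions listed in the table. This assumes galah is installed and the atlas is reachable; the query strings and object names are chosen only for illustration:

```r
library(galah)

# Download reasons matching a query string
research_reasons <- search_reasons("research")

# Data quality profiles matching a query string
ala_profiles <- search_profiles("ALA")

research_reasons
ala_profiles
```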
search_taxa() is an especially useful function in galah. It lets you search for a single taxon or multiple taxa by name.
search_taxa("reptilia")
## # A tibble: 1 × 9
## search_term scientific_name taxon_concept_id rank match_type kingdom phylum class issues
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 reptilia REPTILIA https://biodiversity.org.au/afd/taxa/682e1228-5b3c-45ff-833b-550efd40c399 class exactMatch Animalia Chordata Reptilia noIssue
search_taxa("reptilia", "aves", "mammalia", "pisces")
## # A tibble: 1 × 9
## search_term scientific_name taxon_concept_id rank match_type kingdom phylum class issues
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 reptilia REPTILIA https://biodiversity.org.au/afd/taxa/682e1228-5b3c-45ff-833b-550efd40c399 class exactMatch Animalia Chordata Reptilia noIssue
Alternatively, search_identifiers() is the partner function to search_taxa(). If we already know a taxonomic identifier, we can search for the taxon to which that identifier belongs.
search_identifiers("urn:lsid:biodiversity.org.au:afd.taxon:682e1228-5b3c-45ff-833b-550efd40c399")
## # A tibble: 1 × 15
## search_term success scientific_name taxon_concept_id rank rank_id lft rgt match_type kingdom kingdom_id phylum phylum_id class class_id
## <chr> <lgl> <chr> <chr> <chr> <int> <int> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 urn:lsid:biodiversity.org.au:afd.taxon:6… TRUE REPTILIA https://biodive… class 3000 33626 36658 taxonIdMa… Animal… https://b… Chord… https://… Rept… https:/…
show_values() & search_values()
Once a desired field is found, you can use show_values() to understand the information contained within that field. For example, we can show the values contained in the field basisOfRecord.
search_all(fields, "basisOfRecord") |> show_values()
## ! Search returned 2 matched fields.
## • Showing values for 'basisOfRecord'.
## # A tibble: 9 × 1
## basisOfRecord
## <chr>
## 1 HUMAN_OBSERVATION
## 2 PRESERVED_SPECIMEN
## 3 OBSERVATION
## 4 OCCURRENCE
## 5 MACHINE_OBSERVATION
## 6 MATERIAL_SAMPLE
## 7 LIVING_SPECIMEN
## 8 MATERIAL_CITATION
## 9 FOSSIL_SPECIMEN
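
search_values() is the searching counterpart to show_values(): rather than listing every value in a field, it returns only the values that match a query. A hedged sketch, assuming the search_values(df, query) form and a live connection to the atlas; the query string "specimen" and the object name are illustrative only:

```r
library(galah)

# Find the field first, then keep only the values that match the query
specimen_values <- search_fields("basisOfRecord") |>
  search_values("specimen")

specimen_values
```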
Use this information to pass meaningful queries to galah_filter().
galah_call() |>
  galah_filter(basisOfRecord == "LIVING_SPECIMEN") |>
  atlas_counts()
## # A tibble: 1 × 1
## count
## <int>
## 1 126135
This works for other types of query, such as data profiles:
search_all(profiles, "ALA") |>
  show_values() |>
  head()
## • Showing values for 'ALA'.
## # A tibble: 6 × 5
## id enabled description filter displayOrder
## <int> <lgl> <chr> <chr> <int>
## 1 94 TRUE "Exclude all records where spatial validity is \"false\"" "-spa… 1
## 2 96 TRUE "Exclude all records with an assertion that the scientific name provided does not match any of the names lists used by the ALA. For a … "-ass… 1
## 3 97 TRUE "Exclude all records with an assertion that the scientific name provided is not structured as a valid scientific name. Also catches ran… "-ass… 2
## 4 98 TRUE "Exclude all records with an assertion that the name and classification supplied can't be used to choose between 2 homonyms" "-ass… 3
## 5 99 TRUE "Exclude all records with an assertion that kingdom provided doesn't match a known kingdom e.g. Animalia, Plantae" "-ass… 4
## 6 100 TRUE "Exclude all records with an assertion that the scientific name provided in the record does not match the expected taxonomic scope of t… "-ass… 5