Choosing an Atlas
The GBIF (Global Biodiversity Information Facility) network consists of a series of ‘node’ organisations that collate biodiversity data from their own countries, with GBIF acting as an umbrella organisation that stores data from all nodes. Several nodes have their own APIs, often built on the ‘living atlas’ codebase developed by the ALA (Atlas of Living Australia).
At present, galah supports the following atlases:
Australia
Austria
Brazil
France
GBIF
Guatemala
Spain
Sweden
Set Organisation
Set which atlas you want to use by changing the atlas argument in galah.galah_config(). The atlas argument accepts a region name that selects the corresponding atlas; all available atlases are listed by galah.show_all(atlases=True). Once a value is provided, galah’s server configuration is automatically updated to point at your selected atlas. The default atlas is Australia.
If you intend to download records, you may need to register a user profile with the relevant atlas first.
>>> import galah
>>> galah.galah_config(atlas="Spain", email="your-email-here")
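When atlas names come from user input or a configuration file, it can be convenient to normalise them before passing them to galah.galah_config(). The sketch below is plain Python and not part of galah: SUPPORTED_ATLASES is copied by hand from the list at the top of this page, and resolve_atlas is a hypothetical helper name. The authoritative list is still the one returned by galah.show_all(atlases=True).

```python
# Hypothetical helper, not part of galah. The names below are copied by hand
# from the list of supported atlases at the top of this page.
SUPPORTED_ATLASES = (
    "Australia", "Austria", "Brazil", "France",
    "GBIF", "Guatemala", "Spain", "Sweden",
)


def resolve_atlas(name: str) -> str:
    """Return the canonical spelling of an atlas name.

    Matching ignores case and surrounding whitespace; an unknown name
    raises ValueError listing the accepted values.
    """
    wanted = name.strip().casefold()
    for atlas in SUPPORTED_ATLASES:
        if atlas.casefold() == wanted:
            return atlas
    raise ValueError(
        f"Unknown atlas {name!r}; expected one of: {', '.join(SUPPORTED_ATLASES)}"
    )
```

The result can then be passed straight on, for example galah.galah_config(atlas=resolve_atlas(" spain "), email="your-email-here").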
Look up Information
You can use the same look-up functions to find useful information about the Atlas you have set. Available information may vary for each Living Atlas.
>>> galah.galah_config(atlas="Australia")
>>> galah.show_all(datasets=True)
name uri uid
0 ALA Taxonomy List for Species Missing from Conservation Lists part B 27_11_2023 https://collections.ala.org.au/ws/dataResource/dr23929 dr23929
1 "A" Flora EPBC https://collections.ala.org.au/ws/dataResource/dr24170 dr24170
2 "H to O" flora EPBC https://collections.ala.org.au/ws/dataResource/dr24172 dr24172
3 "P to Z" flora EPBC https://collections.ala.org.au/ws/dataResource/dr24173 dr24173
4 (Appendix 2) Stratigraphic distribution of key and potential stratigraphic calcareous nannofossil markers in the upper Campanian–Maastrichtian of ODP Hole 122-762C https://collections.ala.org.au/ws/dataResource/dr30595 dr30595
... ... ... ...
10663 Zooplankton samples from Heron net trawls along the 110E meridian, eastern Indian Ocean, RV Investigator voyage IN2019_V03 (2019) https://collections.ala.org.au/ws/dataResource/dr23120 dr23120
10664 Zooplankton sampling in the coastal waters of south eastern Tasmania, Australia (2009-2015) https://collections.ala.org.au/ws/dataResource/dr15943 dr15943
10665 Zoos Victoria Moth Tracker https://collections.ala.org.au/ws/dataResource/dr22371 dr22371
10666 Zosteria fulvipubescens https://collections.ala.org.au/ws/dataResource/dr27880 dr27880
10667 Zosteria fulvipubescens https://collections.ala.org.au/ws/dataResource/dr27881 dr27881
[10668 rows x 3 columns]
>>> galah.show_all(fields=True)
id description type link
0 abcdTypeStatus ABCD field in use by herbaria field NaN
1 acceptedNameUsage http://rs.tdwg.org/dwc/terms/acceptedNameUsage field NaN
2 acceptedNameUsageID http://rs.tdwg.org/dwc/terms/acceptedNameUsageID field NaN
3 accessRights NaN field NaN
4 annotationsDoi NaN field NaN
... ... ... ... ...
1098 multimediaLicence Media filter field media
1099 images Media filter field media
1100 videos Media filter field media
1101 sounds Media filter field media
1102 qid Reference to pre-generated query other
[1103 rows x 4 columns]
>>> galah.search_all(datasets="year")
name uri uid
0 Elgin Road 3 year observations https://collections.ala.org.au/ws/dataResource/dr661 dr661
1 com plants greater than 50 years https://collections.ala.org.au/ws/dataResource/dr21699 dr21699
2 CoM animal species greater than 50 years.csv https://collections.ala.org.au/ws/dataResource/dr21448 dr21448
3 10 year trend of levels of organochlorine pollutants in Antarctic seabirds 2003/04 https://collections.ala.org.au/ws/dataResource/dr16247 dr16247
4 Ocean Sampling Day (OSD) 2014: AUTHORITY-RAW amplicon and metagenome sequencing study from the June solstice in the year 2014 https://collections.ala.org.au/ws/dataResource/dr30174 dr30174
5 Coccolithophore assemblages of a 9,000 year old marine sediment core from a climate hotspot in Tasmania, southeast Australia (2018) https://collections.ala.org.au/ws/dataResource/dr23184 dr23184
6 Year-round at-sea movements of fairy prions (Pachyptila turtur) from Kanowna Island, Bass Strait, south-eastern Australia (2017-2020) https://collections.ala.org.au/ws/dataResource/dr23233 dr23233
7 Jellyfish Database Initiative: Global records on gelatinous zooplankton for the past 200 years, collected from global sources and literature, subset of records from Australian and adjacent seas. (1907-2011) https://collections.ala.org.au/ws/dataResource/dr29550 dr29550
>>> galah.search_taxa(taxa="Heleioporus")
scientificName scientificNameAuthorship taxonConceptID rank kingdom phylum order family genus issues
0 Heleioporus Gray, 1841 https://biodiversity.org.au/afd/taxa/b63103c4-28f7-44a5-b8d7-df459eeff2d3 genus Animalia Chordata Anura Limnodynastidae Heleioporus noIssue
Download data
You can build queries as you normally would in galah. For taxonomic queries, use galah.search_taxa() to make sure your searches return the correct taxonomic data.
>>> galah.galah_config(atlas="Australia")
>>> # Returns no data due to misspelling
>>> galah.search_taxa(taxa="vlps")
Empty DataFrame
Columns: []
Index: []
>>> # Returns data
>>> galah.search_taxa(taxa="Vulpes vulpes")
scientificName scientificNameAuthorship taxonConceptID rank kingdom phylum order family genus species vernacularName issues
0 Vulpes vulpes Linnaeus, 1758 https://biodiversity.org.au/afd/taxa/2869ce8a-8212-46c2-8327-dfb7fabb8296 species Animalia Chordata Carnivora Canidae Vulpes Vulpes vulpes Fox noIssue
>>> galah.atlas_counts(taxa="Vulpes vulpes", filters="year>2010")
totalRecords
0 108555
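As the "vlps" example shows, a misspelt name produces an empty result rather than an error, so a script may want to check the lookup before running a count or a download. The following is a minimal sketch, assuming only that the lookup result supports len() (as a pandas DataFrame does); require_match is a hypothetical helper, not a galah function.

```python
def require_match(result, query: str):
    """Return `result` unchanged, or raise if the taxonomic lookup found nothing.

    `result` can be any object with a length, for example the DataFrame
    returned by a taxonomic search; len() of a DataFrame is its row count.
    """
    if len(result) == 0:
        raise LookupError(f"No taxon matched {query!r}; check the spelling.")
    return result
```

For example, require_match(galah.search_taxa(taxa="vlps"), "vlps") would raise before any download is attempted, while a successful lookup is passed through untouched.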
Download species occurrence records from the atlas you have set with galah.atlas_occurrences().
>>> galah.atlas_occurrences(taxa="Vulpes vulpes", filters="year>2010", fields=["taxon_name", "year"])
scientificName year BASIS_OF_RECORD_INVALID CONTINENT_COORDINATE_MISMATCH COORDINATE_ROUNDED COORDINATE_UNCERTAINTY_METERS_INVALID COUNTRY_COORDINATE_MISMATCH COUNTRY_DERIVED_FROM_COORDINATES FIRST_OF_MONTH FIRST_OF_YEAR GEODETIC_DATUM_ASSUMED_WGS84 ID_PRE_OCCURRENCE INDIVIDUAL_COUNT_INVALID LOCATION_NOT_SUPPLIED MISSING_GEODETICDATUM MISSING_GEOREFERENCEDBY MISSING_GEOREFERENCEPROTOCOL MISSING_GEOREFERENCESOURCES MISSING_GEOREFERENCEVERIFICATIONSTATUS MISSING_GEOREFERENCE_DATE MISSING_TAXONRANK MODIFIED_DATE_INVALID MULTIMEDIA_DATE_INVALID OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT RECORDED_DATE_INVALID STATE_COORDINATE_MISMATCH TAXON_MATCH_FUZZY TAXON_MATCH_HIGHERRANK UNCERTAINTY_IN_PRECISION
0 Vulpes vulpes 2014 False False True False False False False False False False False False False True False True True True False False False False False False False False False
1 Vulpes vulpes 2011 False False True False False False False False False False False False False True True True True True False False False False False False False False False
2 Vulpes vulpes 2013 False False True False False False False False False False False False False True False True True True False False False False False False False False False
3 Vulpes vulpes 2016 True False True True False True False False True False False False True True True True True True True False False False False False False False False
4 Vulpes vulpes 2023 False False True False False False False False False False False False False True False True True True False False False False False False False False False
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
108550 Vulpes vulpes 2024 True False False True False True False False True False False False True True True True True True True False False False False False False False False
108551 Vulpes vulpes 2017 False False True False False False False False False False True False False True False True True True False False False False False False False False False
108552 Vulpes vulpes vulpes 2018 False False False True False True False False True False True False True True True True True True True False False False False False False False False
108553 Vulpes vulpes vulpes 2019 False False False False False False False False False False False False False True False True True True False False False False False False False False False
108554 Vulpes vulpes vulpes 2019 False False True True False True False False True False False False True True True True True True True False False False False False False False False
[108555 rows x 29 columns]
Complex queries with multiple Atlases
It is also possible to create more complex queries that return data from multiple Living Atlases. As an example, calling galah.galah_config() inside a loop lets us switch atlases one at a time and collect the total number of records in each Living Atlas in a single table.
>>> import galah
>>> import pandas as pd
>>> atlases = ["Australia","Austria","Brazil","France","GBIF","Spain"]
>>> counts_dict = {"Atlas": [], "Total Records": []}
>>> for atlas in atlases:
...     galah.galah_config(atlas=atlas)  # switch galah to this atlas
...     counts_dict["Atlas"].append(atlas)
...     counts_dict["Total Records"].append(galah.atlas_counts()["totalRecords"][0])
>>> pd.DataFrame(counts_dict)
       Atlas  Total Records
0  Australia      150349915
1    Austria        9173597
2     Brazil       38014118
3     France      160752880
4       GBIF     3148536544
5      Spain       59801061
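In a loop like the one above, a single atlas that is unreachable or misnamed stops the whole run. One possible variation, sketched here under stated assumptions, records the failure and moves on. The function collect_counts and its configure and count parameters are hypothetical names introduced for illustration; the caller supplies the galah calls, so the loop itself has no dependency on galah or pandas.

```python
from typing import Callable, Dict, Iterable, List


def collect_counts(
    atlases: Iterable[str],
    configure: Callable[[str], None],
    count: Callable[[], int],
) -> List[Dict]:
    """Switch to each atlas in turn and record its total record count.

    `configure` selects an atlas and `count` returns the record total for
    the currently selected atlas; both are supplied by the caller. If either
    call fails, the row stores None as the count plus the error message, and
    the loop continues with the next atlas.
    """
    rows = []
    for atlas in atlases:
        try:
            configure(atlas)
            total = int(count())  # also converts NumPy integers to plain int
            rows.append({"Atlas": atlas, "Total Records": total, "Error": None})
        except Exception as exc:  # deliberately broad: one bad atlas should not stop the rest
            rows.append({"Atlas": atlas, "Total Records": None, "Error": str(exc)})
    return rows
```

With galah installed, this could be called as collect_counts(atlases, configure=lambda a: galah.galah_config(atlas=a), count=lambda: galah.atlas_counts()["totalRecords"][0]), and pd.DataFrame() applied to the returned rows gives a table like the one above with an extra Error column.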