Choosing an Atlas

The GBIF network consists of a series of ‘node’ organisations that collate biodiversity data from their own countries, with GBIF acting as an umbrella organisation that stores data from all nodes. Several nodes have their own APIs, often built from the ‘living atlas’ codebase developed by the Atlas of Living Australia (ALA).

At present, galah supports the following atlases:

  • Australia

  • Austria

  • Brazil

  • France

  • GBIF

  • Guatemala

  • Spain

  • Sweden

Set Organisation

Set which atlas you want to use by changing the atlas argument in galah.galah_config(). The atlas argument accepts a region name that selects a given atlas; the available atlases are listed by galah.show_all(atlases=True). Once a value is provided, galah automatically updates its server configuration to the selected atlas. The default atlas is Australia.

If you intend to download records, you may need to register a user profile with the relevant atlas first.

>>> import galah
>>> galah.galah_config(atlas="Spain", email="your-email-here")
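
Because the atlas argument is a plain string, a typo is easy to make. The following is a minimal sketch, not part of galah, of checking a name against a list of known atlases before passing it on. The helper match_atlas and the constant ATLASES_ON_THIS_PAGE are hypothetical names introduced for illustration, and the list is simply copied from this page; galah.show_all(atlases=True) remains the authoritative source.

```python
import difflib


def match_atlas(name, available):
    """Return the entry of `available` matching `name`, ignoring case and outer spaces."""
    lookup = {entry.casefold(): entry for entry in available}
    key = name.strip().casefold()
    if key in lookup:
        return lookup[key]
    # Offer near-misses so that a typo is easy to spot.
    suggestions = difflib.get_close_matches(key, list(lookup), n=3)
    hint = ", ".join(lookup[s] for s in suggestions) or "no close match"
    raise ValueError(f"Unknown atlas {name!r}; did you mean: {hint}?")


# Assumption: names copied from the list on this page; they may be out of date.
ATLASES_ON_THIS_PAGE = [
    "Australia", "Austria", "Brazil", "France",
    "GBIF", "Guatemala", "Spain", "Sweden",
]

atlas = match_atlas(" spain ", ATLASES_ON_THIS_PAGE)
print(atlas)
# The validated name could then be passed on, for example:
# galah.galah_config(atlas=atlas, email="your-email-here")
```

The galah call is left as a comment so that the sketch runs without the package installed or a network connection.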

Look up Information

You can use the same look-up functions to find useful information about the Atlas you have set. Available information may vary for each Living Atlas.

>>> galah.galah_config(atlas="Australia")
>>> galah.show_all(datasets=True)
                                                                                                                                                                      name                                                     uri      uid
0                                                                                          ALA Taxonomy List for Species Missing from Conservation Lists part B 27_11_2023  https://collections.ala.org.au/ws/dataResource/dr23929  dr23929
1                                                                                                                                                           "A" Flora EPBC  https://collections.ala.org.au/ws/dataResource/dr24170  dr24170
2                                                                                                                                                      "H to O" flora EPBC  https://collections.ala.org.au/ws/dataResource/dr24172  dr24172
3                                                                                                                                                      "P to Z" flora EPBC  https://collections.ala.org.au/ws/dataResource/dr24173  dr24173
4      (Appendix 2) Stratigraphic distribution of key and potential stratigraphic calcareous nannofossil markers in the upper Campanian–Maastrichtian of ODP Hole 122-762C  https://collections.ala.org.au/ws/dataResource/dr29981  dr29981
...                                                                                                                                                                    ...                                                     ...      ...
10141                                    Zooplankton samples from Heron net trawls along the 110E meridian, eastern Indian Ocean, RV Investigator voyage IN2019_V03 (2019)  https://collections.ala.org.au/ws/dataResource/dr23120  dr23120
10142                                                                          Zooplankton sampling in the coastal waters of south eastern Tasmania, Australia (2009-2015)  https://collections.ala.org.au/ws/dataResource/dr15943  dr15943
10143                                                                                                                                           Zoos Victoria Moth Tracker  https://collections.ala.org.au/ws/dataResource/dr22371  dr22371
10144                                                                                                                                              Zosteria fulvipubescens  https://collections.ala.org.au/ws/dataResource/dr27880  dr27880
10145                                                                                                                                              Zosteria fulvipubescens  https://collections.ala.org.au/ws/dataResource/dr27881  dr27881

[10146 rows x 3 columns]
>>> galah.show_all(fields=True)
                     id                                     description   type link
0         _nest_parent_                                             NaN  field  NaN
1           _nest_path_                                             NaN  field  NaN
2                _root_                                             NaN  field  NaN
3        abcdTypeStatus                   ABCD field in use by herbaria  field  NaN
4     acceptedNameUsage  http://rs.tdwg.org/dwc/terms/acceptedNameUsage  field  NaN
...                 ...                                             ...    ...  ...
1105  multimediaLicence                              Media filter field  media     
1106             images                              Media filter field  media     
1107             videos                              Media filter field  media     
1108             sounds                              Media filter field  media     
1109                qid                Reference to pre-generated query  other     

[1110 rows x 4 columns]
>>> galah.search_all(datasets="year")
                                                                                                                                                                                                             name                                                     uri      uid
0                                                                                                                                                                                  Elgin Road 3 year observations    https://collections.ala.org.au/ws/dataResource/dr661    dr661
1                                                                                                                                                                                com plants greater than 50 years  https://collections.ala.org.au/ws/dataResource/dr21699  dr21699
2                                                                                                                                                                    CoM animal species greater than 50 years.csv  https://collections.ala.org.au/ws/dataResource/dr21448  dr21448
3                                                                                                                              10 year trend of levels of organochlorine pollutants in Antarctic seabirds 2003/04  https://collections.ala.org.au/ws/dataResource/dr16247  dr16247
4                                                                                   Ocean Sampling Day (OSD) 2014: AUTHORITY-RAW amplicon and metagenome sequencing study from the June solstice in the year 2014  https://collections.ala.org.au/ws/dataResource/dr30174  dr30174
5                                                                                   Ocean Sampling Day (OSD) 2014: AUTHORITY-RAW amplicon and metagenome sequencing study from the June solstice in the year 2014  https://collections.ala.org.au/ws/dataResource/dr30212  dr30212
6                                                                             Coccolithophore assemblages of a 9,000 year old marine sediment core from a climate hotspot in Tasmania, southeast Australia (2018)  https://collections.ala.org.au/ws/dataResource/dr23184  dr23184
7                                                                          Year-round at-sea movements of fairy prions (Pachyptila turtur)  from Kanowna Island, Bass Strait, south-eastern Australia (2017-2020)  https://collections.ala.org.au/ws/dataResource/dr23233  dr23233
8  Jellyfish Database Initiative: Global records on gelatinous zooplankton for the past 200 years, collected from global sources and literature, subset of records from Australian and adjacent seas. (1907-2011)  https://collections.ala.org.au/ws/dataResource/dr29550  dr29550
>>> galah.search_taxa(taxa="Heleioporus")
  scientificName scientificNameAuthorship                                                             taxonConceptID   rank   kingdom    phylum  order           family        genus   issues
0    Heleioporus               Gray, 1841  https://biodiversity.org.au/afd/taxa/b63103c4-28f7-44a5-b8d7-df459eeff2d3  genus  Animalia  Chordata  Anura  Limnodynastidae  Heleioporus  noIssue

Download data

You can build queries as you normally would in galah. For taxonomic queries, use galah.search_taxa() to make sure your searches are returning the correct taxonomic data.

>>> galah.galah_config(atlas="Australia")
>>> # Returns no data due to misspelling
>>> galah.search_taxa(taxa="vlps")
Empty DataFrame
Columns: []
Index: []
>>> # Returns data
>>> galah.search_taxa(taxa="Vulpes vulpes")
  scientificName scientificNameAuthorship                                                             taxonConceptID     rank   kingdom    phylum      order   family   genus        species vernacularName   issues
0  Vulpes vulpes           Linnaeus, 1758  https://biodiversity.org.au/afd/taxa/2869ce8a-8212-46c2-8327-dfb7fabb8296  species  Animalia  Chordata  Carnivora  Canidae  Vulpes  Vulpes vulpes            Fox  noIssue
>>> galah.atlas_counts(taxa="Vulpes vulpes", filters="year>2010")
   totalRecords
0        105576
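
In a script, the empty result for a misspelled name is easy to overlook. The sketch below shows one way to fail early instead. It assumes only that the look-up result supports len(), as a pandas DataFrame does; require_taxon_match is a hypothetical helper, not a galah function, and a plain list stands in for the real result so the example runs offline.

```python
def require_taxon_match(result, query):
    """Raise if a taxon look-up came back empty; otherwise return it unchanged."""
    # len() gives the row count of a pandas DataFrame, so the empty
    # frame returned for "vlps" has length 0.
    if len(result) == 0:
        raise LookupError(f"No taxon matched {query!r}; check the spelling.")
    return result


# Stand-in for a look-up result (assumption: the real object is a DataFrame).
fake_result = [{"scientificName": "Vulpes vulpes", "rank": "species"}]
require_taxon_match(fake_result, "Vulpes vulpes")  # passes silently

try:
    require_taxon_match([], "vlps")
except LookupError as err:
    print(err)
```

With galah, the same check could wrap the real call, for example require_taxon_match(galah.search_taxa(taxa=name), name), before the name is handed to galah.atlas_counts().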

Download species occurrence records from the currently configured atlas with galah.atlas_occurrences():

>>> galah.atlas_occurrences(taxa="Vulpes vulpes", filters="year>2010", fields=["taxon_name", "year"])
              scientificName  year
0              Vulpes vulpes  2014
1              Vulpes vulpes  2016
2              Vulpes vulpes  2021
3              Vulpes vulpes  2014
4              Vulpes vulpes  2022
...                      ...   ...
105571         Vulpes vulpes  2017
105572         Vulpes vulpes  2016
105573  Vulpes vulpes vulpes  2018
105574  Vulpes vulpes vulpes  2019
105575  Vulpes vulpes vulpes  2019

[105576 rows x 2 columns]

Complex queries with multiple Atlases

It is also possible to create more complex queries that return data from multiple Living Atlases. As an example, setting the atlas within a loop with galah_config() returns the total number of records held by each Living Atlas in one table.

>>> import galah
>>> import pandas as pd
>>> atlases = ["Australia", "Austria", "Brazil", "France", "GBIF", "Portugal", "Spain", "Sweden", "United Kingdom"]
>>> counts_dict = {"Atlas": [], "Total Records": []}
>>> for atlas in atlases:
...     galah.galah_config(atlas=atlas)
...     counts_dict["Atlas"].append(atlas)
...     counts_dict["Total Records"].append(galah.atlas_counts()["totalRecords"][0])
...
>>> pd.DataFrame(counts_dict)
            Atlas  Total Records
0       Australia      150121628
1         Austria        9173597
2          Brazil       37089463
3          France      160368606
4            GBIF     3125039095
5        Portugal       16043865
6           Spain       59764520
7          Sweden      160190282
8  United Kingdom      299207653