Dandelion of species

Minkó Mihály
Published in Data Gardening
Sep 23, 2021 · 2 min read


In a recent project we got our hands on the list of species that currently inhabit the ELTE Botanical Garden. The list consists of around 7000 different types of plants, which makes it a great source of data to visualize.

The data part — collecting and cleaning up

When our lab started to collaborate with the Botanical Garden, we got an initial set of species in a spreadsheet.

We wanted to clean the data a little and enrich it. The best way to do that seemed to be connecting to the GBIF API and pulling the related information from there. GBIF stands for Global Biodiversity Information Facility; it pulls data on different species from several sources and organizes the information in a consumable way. We built a Google Sheets Apps Script connector that did the work for us, so we ended up with a nice, clean data set.
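To give a feel for what such a lookup involves, here is a minimal sketch in Python rather than Apps Script. It is not our actual connector: the function names and the choice of fields are assumptions made for illustration, and it relies on the public GBIF species match endpoint (`/v1/species/match`), which takes a `name` parameter and returns a JSON record with fields such as `canonicalName`, `family`, `genus` and `matchType`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public GBIF name-matching endpoint (assumed to be the relevant one here).
GBIF_MATCH_URL = "https://api.gbif.org/v1/species/match"


def build_match_url(name):
    """Build the request URL for one scientific name."""
    return f"{GBIF_MATCH_URL}?{urlencode({'name': name})}"


def extract_taxonomy(match):
    """Keep only the fields we would want back in the spreadsheet.

    Missing fields become None, so unmatched names are easy to spot.
    """
    return {
        "canonical_name": match.get("canonicalName"),
        "family": match.get("family"),
        "genus": match.get("genus"),
        "match_type": match.get("matchType"),
    }


def lookup_species(name):
    """Query GBIF for a name and return the reduced record (needs network)."""
    with urlopen(build_match_url(name), timeout=10) as response:
        return extract_taxonomy(json.load(response))
```

Calling `lookup_species("Taraxacum officinale")` for every row of the sheet would fill in the family and genus columns. Rows whose `match_type` comes back as something other than an exact match are good candidates for a manual check.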

Dandelion of species

That gave us a starting point for visualizing the data. A network was a suitable choice, so we pulled the data into Gephi and created some simple visualizations of the data set. As a first step we drew the species-family relationship, which, as it turned out, looked pretty much like a dandelion.
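One way to get from a cleaned table to something Gephi can open is an edge list with `Source` and `Target` columns, which Gephi's CSV importer understands. The sketch below assumes each record is a dictionary with `species` and `family` keys; those key names and the function names are illustrative, not the layout of our real sheet.

```python
import csv
import io


def species_family_edges(records):
    """Return unique (species, family) pairs, skipping incomplete rows."""
    edges = set()
    for record in records:
        species = (record.get("species") or "").strip()
        family = (record.get("family") or "").strip()
        if species and family:
            edges.add((species, family))
    return sorted(edges)


def edges_to_csv_text(edges):
    """Serialize edges as CSV text with the Source/Target header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()
```

Because every species points at exactly one family, each family becomes a hub with its species radiating from it, which is where the dandelion shape comes from.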

This type of representation also helps to spot weird things in the data that would otherwise remain hidden. For example, in this scenario a species cannot be shared by two families, since each species can belong to only one family. So every connection between families should be treated as misrecorded data that needs to be fixed.
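The same check can be done without looking at the picture. As a rough sketch, assuming the data is available as (species, family) pairs, the hypothetical helper below groups families by species and reports every species recorded under more than one family.

```python
from collections import defaultdict


def find_multi_family_species(edges):
    """Map each species listed under several families to those families."""
    families_by_species = defaultdict(set)
    for species, family in edges:
        families_by_species[species].add(family)
    return {
        species: sorted(families)
        for species, families in families_by_species.items()
        if len(families) > 1
    }


# Made-up sample rows: the same species entered under two family names.
sample_edges = [
    ("Taraxacum officinale", "Asteraceae"),
    ("Bellis perennis", "Asteraceae"),
    ("Taraxacum officinale", "Compositae"),
]
suspicious = find_multi_family_species(sample_edges)
```

Every key in `suspicious` corresponds to one of the bridges between families in the network, so the list doubles as a to-do list for fixing the records.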

The final result is a beautiful and strange image representing the more than 7000 species you can see at the ELTE Botanical Garden. So grab your shoes and visit them!

Part of the full picture
