Standardizing data for the EMLN package
Six main tables form the basis for the development of the standard multilayer class:
- Description
- References
- Interactions
- Layers
- Nodes
- State_nodes
Each of these tables was a CSV file in the data set, and we describe each of them below. Because the data sets were different from each other, our general approach was to use attribute and value fields instead of the common wide format. Hence, instead of:
layer | name | latitude | longitude |
---|---|---|---|
1 | patch 1 | 50.4218 | -101.046 |
2 | patch 2 | 50.7 | -101.2 |
We used:
layer | attribute | value |
---|---|---|
1 | name | patch 1 |
1 | latitude | 50.4218 |
1 | longitude | -101.046 |
2 | name | patch 2 |
2 | latitude | 50.7 |
2 | longitude | -101.2 |
This is because not all networks have these particular attributes. We applied the same approach to all the other tables.
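The long format above can be pivoted back into one record per layer when a wide view is more convenient. The following is a minimal sketch, not part of the EMLN package: the helper name `long_to_wide` is hypothetical, and the rows are assumed to have already been read from the CSV as dictionaries of strings.

```python
from collections import defaultdict


def long_to_wide(rows, id_field):
    """Pivot attribute/value rows into one dict per entity, keyed by its ID.

    Hypothetical helper for illustration; not an EMLN function.
    """
    wide = defaultdict(dict)
    for row in rows:
        wide[row[id_field]][row["attribute"]] = row["value"]
    return dict(wide)


# The long-format layer rows from the table above (all values are strings).
layer_rows = [
    {"layer": "1", "attribute": "name", "value": "patch 1"},
    {"layer": "1", "attribute": "latitude", "value": "50.4218"},
    {"layer": "1", "attribute": "longitude", "value": "-101.046"},
    {"layer": "2", "attribute": "name", "value": "patch 2"},
    {"layer": "2", "attribute": "latitude", "value": "50.7"},
    {"layer": "2", "attribute": "longitude", "value": "-101.2"},
]

layers = long_to_wide(layer_rows, "layer")
print(layers["1"])
```

A layer that lacks an attribute simply has no such key in its dictionary, which is the flexibility the long format is meant to provide.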
The Nodes table contains all information related to the physical nodes in the network and their attributes (if there are any).
This table always contains the following fields:
A unique node ID (node_id) is assigned to each physical node within the network.
A generic name for the node (node_name), either a code that was used in the original data or a species, family, or other taxon name.
(*) Not all node files contain this attribute, but it is present in most of them.
The node type attribute (type) distinguishes between different types of nodes (e.g. pollinator node type).
Taxa verification was performed using the R package taxize and its classification function. The nodes' scientific names at different taxonomic levels were verified against the NCBI database.
Nodes that could not be found in the database, either because of misspellings or because the taxon is absent from the NCBI database, were assigned a FALSE value in the taxa_verified attribute. In contrast, a TRUE value indicates that the taxon is present in the NCBI database, regardless of how many IDs are associated with it.
For verified taxa (i.e. taxa_verified == TRUE), the verification_level attribute indicates the taxonomic level at which the verification was performed.
node_id | attribute | value |
---|---|---|
1 | node_name | richardson_spermophile_(ground_squirrel) |
1 | type | taxon |
1 | taxonomy_name | Urocitellus richardsonii |
1 | taxonomy_rank | species |
1 | taxa_verified | TRUE |
1 | verification_level | species |
2 | node_name | coyote |
2 | type | taxon |
2 | taxonomy_name | Canis latrans |
2 | taxonomy_rank | species |
2 | taxa_verified | TRUE |
2 | verification_level | species |
3 | node_name | red-tailed_hawk |
3 | type | taxon |
3 | taxonomy_name | Buteo jamaicensis |
3 | taxonomy_rank | species |
3 | taxa_verified | TRUE |
3 | verification_level | species |
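As an illustration of how the node attributes might be consumed, the sketch below reads a long-format nodes table and separates nodes by their taxa_verified value. It assumes a comma-separated file with the columns `node_id`, `attribute`, and `value`; the function names `read_nodes` and `split_by_verification` are hypothetical and not part of the package.

```python
import csv
import io
from collections import defaultdict

# A shortened excerpt of the nodes table above, written as CSV text.
NODES_CSV = """node_id,attribute,value
1,node_name,richardson_spermophile_(ground_squirrel)
1,taxonomy_name,Urocitellus richardsonii
1,taxa_verified,TRUE
2,node_name,coyote
2,taxonomy_name,Canis latrans
2,taxa_verified,TRUE
"""


def read_nodes(text):
    """Group long-format rows into {node_id: {attribute: value}}."""
    nodes = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        nodes[int(row["node_id"])][row["attribute"]] = row["value"]
    return dict(nodes)


def split_by_verification(nodes):
    """Return (verified_ids, unverified_ids) based on taxa_verified."""
    verified, unverified = [], []
    for node_id, attrs in sorted(nodes.items()):
        is_verified = attrs.get("taxa_verified") == "TRUE"
        (verified if is_verified else unverified).append(node_id)
    return verified, unverified


nodes = read_nodes(NODES_CSV)
verified, unverified = split_by_verification(nodes)
print(verified, unverified)
```

Nodes without a taxa_verified attribute are treated as unverified here, which is a choice made for this sketch only.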
When the same physical node has different attributes in different layers (e.g., abundance can change over time), the information is stored in the State_nodes table. The attributes included in this file vary depending on the network, and most networks do not have this information. The fields needed to identify and differentiate between state nodes within the multilayer network are layer_id and node_id:
layer_id | node_id | attribute | value |
---|---|---|---|
1 | 1 | node_name | Microtus arvalis |
1 | 1 | sample_size | 1345 |
2 | 1 | node_name | Microtus arvalis |
2 | 1 | sample_size | 11 |
3 | 1 | node_name | Microtus arvalis |
3 | 1 | sample_size | 150 |
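Because a state node is identified by the combination of layer and node, a natural lookup key is the pair `(layer_id, node_id)`. The sketch below shows this under the assumption that the rows have been loaded as tuples; `state_node_attribute` is a hypothetical helper name.

```python
# Rows from the state nodes table above: (layer_id, node_id, attribute, value).
state_rows = [
    (1, 1, "node_name", "Microtus arvalis"),
    (1, 1, "sample_size", "1345"),
    (2, 1, "node_name", "Microtus arvalis"),
    (2, 1, "sample_size", "11"),
    (3, 1, "node_name", "Microtus arvalis"),
    (3, 1, "sample_size", "150"),
]


def state_node_attribute(rows, attribute):
    """Map (layer_id, node_id) to the value of one attribute."""
    return {
        (layer_id, node_id): value
        for layer_id, node_id, attr, value in rows
        if attr == attribute
    }


# Values are stored as text in the long format, so convert them explicitly.
sample_sizes = {
    key: int(value)
    for key, value in state_node_attribute(state_rows, "sample_size").items()
}
print(sample_sizes[(2, 1)])
```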
The Layers table contains information about the layers that make up a multilayer network. Some layer attributes are always present, while others are specific to certain types of networks.
There is a unique layer ID assigned to each layer.
The layer can be categorized as one of the following types:
- environment
- time
- space
- perturbation
- interaction
The longitude and latitude attributes give the coordinates of a layer. In spatial networks, these coordinates are what distinguish the layers from one another.
The location of the study, including the country or region, is one of the attributes that may be present, particularly in spatial networks where it can vary between different layers.
The layer may have a generic name attribute that can be connected to the edge list instead of the layer ID.
If a network consists of directed layers, then the directed attribute will be present and have a 'TRUE' value.
layer | attribute | value |
---|---|---|
1 | type | environment |
1 | name | The biotic interactions of the prairie community |
1 | location | Aspen Parkland, North America |
1 | latitude | 50.4218 |
1 | longitude | -101.046 |
1 | date | 01/01/1928 |
1 | directed | TRUE |
2 | type | environment |
2 | name | The biotic interactions of the aspen community |
2 | location | Aspen Parkland, North America |
2 | latitude | 50.4218 |
2 | longitude | -101.046 |
2 | date | 01/01/1928 |
2 | directed | TRUE |
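Since every value in the long format is stored as text, numeric and logical layer attributes need converting before use. The sketch below is one possible approach; `coerce_layer` is a hypothetical helper. The date is left as a string because the table does not say whether 01/01/1928 is day-first or month-first.

```python
def coerce_layer(attrs):
    """Return a copy of a layer's attributes with basic type conversion."""
    typed = dict(attrs)
    for key in ("latitude", "longitude"):
        if key in typed:
            typed[key] = float(typed[key])
    if "directed" in typed:
        typed["directed"] = typed["directed"].strip().upper() == "TRUE"
    return typed


# Attributes of layer 1 from the table above, as read from the CSV.
layer_1 = {
    "type": "environment",
    "name": "The biotic interactions of the prairie community",
    "location": "Aspen Parkland, North America",
    "latitude": "50.4218",
    "longitude": "-101.046",
    "date": "01/01/1928",
    "directed": "TRUE",
}

typed_layer = coerce_layer(layer_1)
print(typed_layer["latitude"], typed_layer["directed"])
```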
The interactions between nodes of multilayer networks are represented by a commonly used extended edge list. This CSV file is organized in a long format where each interaction is assigned a unique ID and can contain additional attributes that are specific to the particular network.
The interactions list always contains the following attributes:
Each interaction is identified by a unique interaction ID.
node_from refers to the starting node of the edge.
layer_from represents the layer of the starting node.
node_to refers to the ending node of the edge.
layer_to represents the layer of the ending node.
There are different types of interaction:
- frugivory
- pollination
- predation
- herbivory
- trophic
- host-parasite
- detritivore
- scavenger
- negative
- positive
- seed-dispersal
- interlayer
- competition
- anemone-fish
- plant-ant
- parasitism
The weight of the interaction.
The interactions file could contain other attributes for each interaction, such as 'method' - the method by which the interaction was measured.
interaction_id | attribute | value |
---|---|---|
1 | node_from | red-tailed_hawk |
1 | layer_from | 1 |
1 | node_to | vole_(microtus) |
1 | layer_to | 1 |
1 | weight | 1 |
1 | type | predation |
1 | method | field observation and gut content |
2 | node_from | weasel |
2 | layer_from | 1 |
2 | node_to | vole_(microtus) |
2 | layer_to | 1 |
2 | weight | 1 |
2 | type | predation |
2 | method | field observation and gut content |
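To show how the long-format interactions relate to an extended edge list, the sketch below regroups the rows by interaction ID into one edge record each. The `Edge` type and the `build_edges` and `is_interlayer` helpers are hypothetical. Treating an edge as interlayer when its two layers differ is an assumption of this sketch, not a rule stated by the package.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    node_from: str
    layer_from: int
    node_to: str
    layer_to: int
    weight: float
    type: str


def build_edges(rows):
    """Turn (interaction_id, attribute, value) rows into Edge records."""
    grouped = defaultdict(dict)
    for interaction_id, attribute, value in rows:
        grouped[interaction_id][attribute] = value
    edges = []
    for interaction_id in sorted(grouped):
        a = grouped[interaction_id]
        edges.append(
            Edge(
                a["node_from"], int(a["layer_from"]),
                a["node_to"], int(a["layer_to"]),
                float(a["weight"]), a["type"],
            )
        )
    return edges


def is_interlayer(edge):
    """Assumption: an edge is interlayer when its two layers differ."""
    return edge.layer_from != edge.layer_to


# Rows taken from the interactions table above (method omitted for brevity).
rows = [
    (1, "node_from", "red-tailed_hawk"), (1, "layer_from", "1"),
    (1, "node_to", "vole_(microtus)"), (1, "layer_to", "1"),
    (1, "weight", "1"), (1, "type", "predation"),
    (2, "node_from", "weasel"), (2, "layer_from", "1"),
    (2, "node_to", "vole_(microtus)"), (2, "layer_to", "1"),
    (2, "weight", "1"), (2, "type", "predation"),
]

edges = build_edges(rows)
print(edges[0])
```

Extra attributes such as method are ignored by `build_edges`; they could be kept in a separate dictionary per interaction if needed.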
The Description file contains a general description of the network, which helps the user determine whether this is the network they want to work with.
This file will always contain the following information:
The developer who collected the data.
A brief description of the source article will be provided if the data was obtained from one.
The source of the raw data:
- Interaction Web DataBase
- Web of Life
- Web: from searching for relevant articles online
- Mangal
Note: if the data is collected from Mangal, there will be a row specifying the mangal code used.
URL or web address where the data was sourced from.
There are two different network attributes in the package. One is the ecological network type (e.g. Pollination network), while the other is the multilayer network type (e.g. Spatial network). For example, a network can be classified as a food web with a temporal multilayer dimension.
Ecological network types:
- Pollination: describes the interactions between plants and pollinators. The interactions in a pollination network are typically mutualistic. An example of a pollination interaction is the relationship between bees and flowers.
- Seed-Dispersal: describes the movement of seeds from one location to another, typically through the action of an animal. An example of a seed-dispersal interaction is the relationship between a frugivore and a fruit-bearing plant.
- Plant-Ant: describes the relationships between certain plant species and ant colonies, where both organisms benefit from the interaction in different ways.
- Plant-Herbivore: describes the complex relationships between plants and the animals that consume them.
- Host-Parasite: describes the relationships between hosts and parasites, in which the parasite relies on the host for survival and reproduction, while the host may suffer negative effects as a result of the parasite's presence.
- Food-Web: describes the feeding relationships among species, and how energy and nutrients move from one organism to another.
- Anemone-Fish: describes a symbiotic relationship between certain species of anemones and clownfish, in which both species benefit from the presence of the other.
- Multiples: the network encompasses several ecological interactions throughout its layers (e.g. a network that contains host-parasite + food-web interactions).
Multilayer network types:
- Spatial: the layers of a multilayer network are distinguished from each other based on their spatial coordinates.
- Temporal: the layers of a multilayer network represent different time points of the network.
- Perturbation: the layers are differentiated based on whether or not they are experiencing a perturbation (e.g. the effect of interventions like invasive species).
- Environment: each layer represents a different aspect of the environment.
- Multiplex: each layer represents a different type of interaction between the same set of nodes.
The state_nodes attribute records whether the network has a "state_nodes" CSV file, that is, whether there are attributes associated with state nodes. A FALSE value means that no attributes are associated with the state nodes, while a TRUE value means that there are.
attribute | value |
---|---|
data_entry | ofir_segev |
multilayer_network_type | Environment |
description | NA |
source | mangal |
data_url | https://mangal.io/doc/api/ |
mangal_code | 87 |
ecological_network_type | Food-Web |
state_nodes | FALSE |
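The description can be used to decide which tables to expect for a network before loading them. The sketch below is illustrative only: `expected_tables` and `parse_flag` are hypothetical helpers, and the table names are those listed at the top of this document rather than actual file names.

```python
# The five tables that are not conditional on the state_nodes flag.
BASE_TABLES = ["Description", "References", "Interactions", "Layers", "Nodes"]


def parse_flag(value):
    """Interpret the text values TRUE/FALSE used in the description table."""
    return value.strip().upper() == "TRUE"


def expected_tables(description):
    """List the tables a network should provide, given its description."""
    tables = list(BASE_TABLES)
    if parse_flag(description.get("state_nodes", "FALSE")):
        tables.append("State_nodes")
    return tables


# The description table above, as an attribute -> value mapping.
description = {
    "data_entry": "ofir_segev",
    "multilayer_network_type": "Environment",
    "description": "NA",
    "source": "mangal",
    "data_url": "https://mangal.io/doc/api/",
    "mangal_code": "87",
    "ecological_network_type": "Food-Web",
    "state_nodes": "FALSE",
}

print(expected_tables(description))
```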
The References file contains the information required to find the article where the data is presented, including the article's DOI, author, and year (for data that was extracted from an article). It may also include paper or data URLs. The file has multiple rows when the data is taken from multiple articles.
The DOI of the article that the data was taken from. If the article doesn't have a DOI, the value is NA.
The name/s of the author/s of the article the data was sourced from.
The year of data publication.
The web address that provides access to the article.
The website where the data is hosted.
doi | author | year | paper_url |
---|---|---|---|
10.2307/1948658 | ralph w. dexter | 1947-01-01 | https://doi.org/10.2307%2F1948658 |
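In the example row, the paper_url is the DOI appended to `https://doi.org/` with the slash percent-encoded. The sketch below reproduces that link from the doi field using the standard library; `doi_to_url` is a hypothetical helper, and returning `None` for NA values is a choice made for this sketch.

```python
from urllib.parse import quote


def doi_to_url(doi):
    """Build a doi.org link from a DOI, or return None when the DOI is NA."""
    if doi is None:
        return None
    doi = doi.strip()
    if not doi or doi.upper() == "NA":
        return None
    # safe="" also encodes "/", matching the paper_url in the table above.
    return "https://doi.org/" + quote(doi, safe="")


print(doi_to_url("10.2307/1948658"))
```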