From GeoJSON to Network Graph: Analyzing World Country Borders in Python | by Amanda Iglesias Moreno | Oct, 2023


Using NetworkX for Graph-Based Country Border Analysis

Photo by Maksim Shutov on Unsplash

Python offers a wide range of libraries that let us tackle problems in many research areas quickly and easily. Geospatial data analysis and graph theory are two areas where Python provides a powerful set of libraries. In this article, we will conduct a simple analysis of world borders, specifically exploring which countries share borders with others. We’ll begin with data from a GeoJSON file containing polygons for all countries worldwide. The ultimate goal is to create a graph representing the various borders using NetworkX and use this graph to perform several analyses.

GeoJSON files represent geographical areas and are widely used in geographical analysis and visualization. The first stage of our analysis involves reading the nations.geojson file and converting it into a GeoDataFrame using GeoPandas. This file was sourced from a GitHub repository and contains polygons representing the different countries of the world.
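As a minimal sketch of this loading step, the snippet below builds a GeoDataFrame from a tiny in-memory GeoJSON feature collection. The two square "countries" and their names and codes are hypothetical stand-ins so that the example runs without the real file; only the property names (ADMIN, ISO_A3, ISO_A2) follow the columns of the file used in this article.

```python
import geopandas as gpd

# Hypothetical stand-in for the real file: two square "countries" that use
# the same property names (ADMIN, ISO_A3, ISO_A2) as the article's dataset.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"ADMIN": "Westland", "ISO_A3": "WLD", "ISO_A2": "WL"},
            "geometry": {
                "type": "MultiPolygon",
                "coordinates": [[[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"ADMIN": "Eastland", "ISO_A3": "ELD", "ISO_A2": "EL"},
            "geometry": {
                "type": "MultiPolygon",
                "coordinates": [[[[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]]]],
            },
        },
    ],
}

# Build the GeoDataFrame from the in-memory features (WGS84 lon/lat assumed).
countries = gpd.GeoDataFrame.from_features(
    feature_collection["features"], crs="EPSG:4326"
)

# With the actual file on disk, the equivalent call would be:
# countries = gpd.read_file("nations.geojson")

print(countries[["ADMIN", "ISO_A3", "ISO_A2"]])
print(countries.geometry.geom_type.tolist())
```

Each feature becomes one row, its properties become columns, and its shape lands in the geometry column.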

GeoDataFrame with Complete Country Information (Image created by the author)

As shown above, the GeoDataFrame contains the following columns:

  1. ADMIN: Represents the administrative name of the geographical area, such as the country or region name.
  2. ISO_A3: Stands for the ISO 3166-1 alpha-3 country code, a three-letter code uniquely identifying countries.
  3. ISO_A2: Denotes the ISO 3166-1 alpha-2 country code, a two-letter code also used for country identification.
  4. geometry: This column contains the geometrical information that defines the shape of the geographical area, represented as MULTIPOLYGON data.

You can visualize all the multipolygons that make up the GeoDataFrame using the plot method, as demonstrated below.

Visual Representation of the GeoDataFrame (Image created by the author)

The multipolygons within the geometry column belong to the class shapely.geometry.multipolygon.MultiPolygon. These objects have various attributes, one of which is centroid. The centroid attribute provides the geometric center of the MULTIPOLYGON and returns a POINT that represents this center.

We can therefore use this POINT to extract the latitude and longitude of each MULTIPOLYGON and store the results in two columns within the GeoDataFrame. We perform this calculation because we will later use these latitude and longitude values to place the nodes of the graph at their real geographic positions.
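The centroid step can be sketched as follows. The two shapes and their names are hypothetical toy data, and no coordinate reference system is set on the frame to keep the example small; with real longitude/latitude data, GeoPandas may warn that centroids computed in a geographic CRS are approximate.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, box

# Hypothetical toy data: one single-part and one two-part multipolygon.
countries = gpd.GeoDataFrame(
    {"ADMIN": ["Squareland", "Twin Isles"]},
    geometry=[
        MultiPolygon([box(0, 0, 2, 2)]),
        MultiPolygon([box(0, 0, 1, 1), box(3, 0, 4, 1)]),
    ],
)

# centroid returns one POINT per row; x is the longitude, y is the latitude.
centroids = countries.geometry.centroid
countries["longitude"] = centroids.x
countries["latitude"] = centroids.y

print(countries[["ADMIN", "latitude", "longitude"]])
```

Note that for a multipolygon made of separate parts, such as the second row, the centroid can fall in the gap between the parts rather than inside any of them.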

Now it’s time to build the graph that will represent the borders between the different countries of the world. In this graph, the nodes represent countries, while the edges indicate the existence of a border between them. If there is a border between two nodes, the graph has an edge connecting them; otherwise, there is no edge.

The function create_country_network processes the information within the GeoDataFrame and constructs a Graph representing country borders.

First, the function iterates through each row of the GeoDataFrame, where each row corresponds to a different country. It then creates a node for the country, adding latitude and longitude as attributes of the node.

If the geometry is not valid, the function repairs it using the buffer(0) method. This method essentially fixes invalid geometries by applying a buffer operation with a distance of zero, which resolves problems such as self-intersections or other geometric irregularities in the multipolygon representation.
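To illustrate the repair, here is a small sketch on a deliberately invalid shape. The "bowtie" polygon is a hypothetical example, not a geometry taken from the GeoJSON file; its ring touches itself at one point, which makes it invalid.

```python
from shapely.geometry import Polygon

# A "bowtie" ring that touches itself at (1, 1), which makes it invalid.
bowtie = Polygon([(0, 0), (0, 2), (1, 1), (2, 2), (2, 0), (1, 1), (0, 0)])
print(bowtie.is_valid)

# Apply the guard described above: repair only when the geometry is invalid.
repaired = bowtie if bowtie.is_valid else bowtie.buffer(0)
print(repaired.is_valid, repaired.geom_type)
```

Checking is_valid first leaves geometries that are already valid untouched.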

After creating the nodes, the next step is to populate the network with the relevant edges. To do this, we iterate through the different countries, and if the polygons representing two countries intersect, we take it to mean that they share a common border and create an edge between their nodes.
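The steps above can be sketched as follows. This is one possible implementation of create_country_network written from the description, not necessarily the original code; it assumes the GeoDataFrame has ADMIN, latitude, longitude, and geometry columns, and the three square "countries" are hypothetical toy data.

```python
from itertools import combinations

import geopandas as gpd
import networkx as nx
from shapely.geometry import MultiPolygon, box


def create_country_network(countries):
    """Sketch: one node per row, one edge per pair of intersecting geometries."""
    G = nx.Graph()
    geometries = {}
    for _, row in countries.iterrows():
        geometry = row["geometry"]
        if not geometry.is_valid:
            geometry = geometry.buffer(0)  # repair before testing intersections
        geometries[row["ADMIN"]] = geometry
        G.add_node(
            row["ADMIN"], latitude=row["latitude"], longitude=row["longitude"]
        )

    # Compare every unordered pair of countries exactly once.
    for (name_a, geom_a), (name_b, geom_b) in combinations(geometries.items(), 2):
        if geom_a.intersects(geom_b):
            G.add_edge(name_a, name_b)
    return G


# Hypothetical toy data: A and B share an edge, C is isolated.
countries = gpd.GeoDataFrame(
    {"ADMIN": ["A", "B", "C"]},
    geometry=[
        MultiPolygon([box(0, 0, 1, 1)]),
        MultiPolygon([box(1, 0, 2, 1)]),
        MultiPolygon([box(5, 5, 6, 6)]),
    ],
)
countries["longitude"] = countries.geometry.centroid.x
countries["latitude"] = countries.geometry.centroid.y

G = create_country_network(countries)
print(sorted(G.edges()))
```

The pairwise comparison grows quadratically with the number of countries, which is acceptable for a few hundred polygons. Keep in mind that intersects is also true when two shapes touch at a single point, so the test is slightly more permissive than "share a border segment".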

The next step involves visualizing the created network, where nodes represent countries worldwide and edges signify the presence of borders between them.

The function plot_country_network_on_map is responsible for processing the nodes and edges of the graph G and displaying them on a map.

Network of Country Borders (Image created by the author)

The positions of the nodes on the graph are determined by the latitude and longitude coordinates of the countries. Additionally, a map has been placed in the background to provide clearer context for the network. This map was generated using the boundary attribute of the GeoDataFrame, which provides the geometrical boundaries of the represented countries.
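A plotting function along these lines might look like the sketch below. It is an assumption about how plot_country_network_on_map could be written, not the original implementation; the figure size, colors, and the two-square toy data are illustrative choices, and the Agg backend is selected only so that the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import geopandas as gpd
import matplotlib.pyplot as plt
import networkx as nx
from shapely.geometry import box


def plot_country_network_on_map(G, countries):
    """Sketch: draw country outlines, then overlay the graph at lon/lat positions."""
    fig, ax = plt.subplots(figsize=(10, 6))
    # Background map: only the outlines of the polygons.
    countries.boundary.plot(ax=ax, linewidth=0.5, color="gray")
    # Node positions come from the attributes stored on each node.
    positions = {
        node: (data["longitude"], data["latitude"])
        for node, data in G.nodes(data=True)
    }
    nx.draw_networkx(G, pos=positions, ax=ax, node_size=30, with_labels=False)
    return fig, ax


# Hypothetical toy data: two adjacent squares and the matching two-node graph.
countries = gpd.GeoDataFrame(
    {"ADMIN": ["A", "B"]}, geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)]
)
G = nx.Graph()
G.add_node("A", longitude=0.5, latitude=0.5)
G.add_node("B", longitude=1.5, latitude=0.5)
G.add_edge("A", "B")

fig, ax = plot_country_network_on_map(G, countries)
# fig.savefig("country_network.png")  # uncomment to write the image to disk
```

Because matplotlib places x before y, each position is given as (longitude, latitude).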

It’s important to note one detail: in the GeoJSON file used, some islands are treated as independent countries even though they administratively belong to another country. This is why you may see numerous points in maritime areas. Keep in mind that the graph depends on the information available in the GeoJSON file from which it was generated. If we were to use a different file, the resulting graph would be different.

The country border network we’ve created can quickly help us answer several questions. Below, we outline three insights that can easily be derived by processing the information provided by the network. However, there are many other questions that this network can help us answer.

Insight 1: Examining the Borders of a Chosen Country

In this section, we will visually assess the neighbors of a specific country.

The plot_country_borders function allows quick visualization of the borders of a specific country. This function generates a subgraph containing the country provided as input and its neighboring countries. It then visualizes these countries, making it easy to see which ones border the chosen country. In this instance, the chosen country is Mexico, but we can easily change the input to visualize any other country.

Network of Country Borders in Mexico (Image created by the author)

As you can see in the generated image, Mexico shares its border with three countries: the United States, Belize, and Guatemala.
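The subgraph step behind this plot can be sketched as below. The helper name neighbor_subgraph is hypothetical, and the graph is a small hand-entered excerpt of the border network rather than the one generated from the GeoJSON file; the plotting part is left out.

```python
import networkx as nx

# Hand-entered excerpt of the border graph (an assumption for illustration;
# the article's graph is generated from the GeoJSON file).
G = nx.Graph()
G.add_edges_from(
    [
        ("Mexico", "United States"),
        ("Mexico", "Guatemala"),
        ("Mexico", "Belize"),
        ("Guatemala", "Belize"),
        ("United States", "Canada"),
        ("Guatemala", "Honduras"),
    ]
)


def neighbor_subgraph(G, country):
    """Hypothetical helper: the country plus every node directly connected to it."""
    nodes = [country, *G.neighbors(country)]
    return G.subgraph(nodes)


mexico = neighbor_subgraph(G, "Mexico")
print(sorted(mexico.nodes()))
print(sorted(mexico.edges()))
```

The subgraph also keeps the edges between the neighbors themselves, such as the one between Guatemala and Belize.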

Insight 2: Top 10 Countries with the Most Borders

In this section, we will analyze which countries have the greatest number of neighboring countries and display the results on the screen. To achieve this, we have implemented the calculate_top_border_countries function. This function counts the neighbors of each node in the network and displays only the ten countries with the most neighbors.
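Since the number of neighbors of a node is its degree, the ranking can be sketched in a few lines. The body below is an assumed implementation of calculate_top_border_countries, and the graph is a hand-entered excerpt, so the counts apply to this excerpt only and not to the full network.

```python
import networkx as nx

# Hand-entered excerpt of the border graph (an assumption for illustration).
G = nx.Graph()
G.add_edges_from(
    [
        ("Mexico", "United States"),
        ("Mexico", "Guatemala"),
        ("Mexico", "Belize"),
        ("Guatemala", "Belize"),
        ("United States", "Canada"),
        ("Guatemala", "Honduras"),
        ("Guatemala", "El Salvador"),
        ("Honduras", "El Salvador"),
        ("Honduras", "Nicaragua"),
    ]
)


def calculate_top_border_countries(G, top_n=10):
    """Sketch: rank nodes by degree, i.e. by their number of neighbors."""
    ranked = sorted(G.degree, key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


top_countries = calculate_top_border_countries(G, top_n=3)
for country, borders in top_countries:
    print(f"{country}: {borders}")
```

Countries with the same number of neighbors keep the order in which they were added to the graph, because Python's sort is stable.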

Top 10 Countries with the Most Borders (Image created by the author)

We must reiterate that the results depend on the initial GeoJSON file. In this case, the Siachen Glacier is coded as a separate country, which is why it appears as sharing a border with China.

Insight 3: Exploring the Shortest Country-to-Country Routes

We conclude with a route analysis. In this case, we will evaluate the minimum number of borders one must cross when traveling from an origin country to a destination country.

The find_shortest_path_between_countries function calculates the shortest path between an origin country and a destination country. However, it’s important to note that this function provides only one of the possible shortest paths. This limitation arises from its use of the shortest_path function from NetworkX, which returns a single shortest path even when several exist.

To obtain every shortest path between two countries rather than just one, NetworkX offers all_shortest_paths, which could replace shortest_path inside find_shortest_path_between_countries. A different option, all_simple_paths, enumerates every simple path between the two nodes regardless of length, so it also returns routes longer than the minimum. Which one fits depends on the specific requirements of the analysis.

We used the function to find the shortest path between Spain and Poland, and the analysis revealed that the minimum number of border crossings required to travel from Spain to Poland is 3.

Finding the Optimal Route from Spain to Poland (Image created by the author)
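The route analysis can be sketched as follows. The body of find_shortest_path_between_countries is an assumed implementation built on shortest_path, and the graph is a hand-entered excerpt of European borders rather than the full network. The second part shows all_shortest_paths for a pair of countries with more than one shortest route.

```python
import networkx as nx

# Hand-entered excerpt of European borders (an assumption for illustration).
G = nx.Graph()
G.add_edges_from(
    [
        ("Spain", "France"),
        ("France", "Germany"),
        ("France", "Switzerland"),
        ("Germany", "Switzerland"),
        ("Germany", "Poland"),
        ("Germany", "Austria"),
        ("Switzerland", "Austria"),
    ]
)


def find_shortest_path_between_countries(G, origin, destination):
    """Sketch: one shortest path and the number of borders crossed along it."""
    path = nx.shortest_path(G, source=origin, target=destination)
    return path, len(path) - 1


path, crossings = find_shortest_path_between_countries(G, "Spain", "Poland")
print(path, crossings)

# Every route tied for the minimum length, not just one of them.
routes_to_austria = list(nx.all_shortest_paths(G, source="Spain", target="Austria"))
for route in routes_to_austria:
    print(route)
```

If no route exists between two nodes, for example when one of them is an island with no edges, shortest_path raises nx.NetworkXNoPath, so a caller may want to catch that exception.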

Python offers a wealth of libraries spanning many domains of knowledge, which can be integrated into any data science project. In this article, we used libraries dedicated to geometric data analysis and graph analysis to create a graph representing the world’s borders. We then demonstrated use cases in which this graph quickly answers geographical questions.

Thanks for reading.

Amanda Iglesias
