A Deep Dive into Space Missions: Data Analysis and Visualization

Wuraolaifeoluwa
6 min read · Mar 20, 2021
Image: EurAsian Times

The possibility of exploring space has long excited me. Growing up, I was always intrigued by the nature of the objects seen in the night sky.

The concept of space is fundamental to understanding the physical universe. Scientists explore space to investigate the universe beyond Earth’s atmosphere, and they use the information they acquire to increase our knowledge of the cosmos and to benefit humanity. This exploration is usually carried out by means of crewed and uncrewed spacecraft.

A space launch is the earliest part of a flight that reaches space. It involves liftoff, when a rocket or other space launch vehicle leaves the ground, a floating ship, or a midair aircraft at the start of a flight.

In the latter half of the 20th century, rockets were developed that were powerful enough to overcome the force of gravity to reach orbital velocities, paving the way for space exploration to become a reality. We have been venturing into space since October 4, 1957, when the Union of Soviet Socialist Republics (U.S.S.R.) launched Sputnik, the first artificial satellite to orbit Earth.

In this article, we will explore data on all the space missions since the beginning of the Space Race (1957), using data analysis packages such as Pandas and NumPy and visualizing the data with Matplotlib and Seaborn.

Exploratory Data Analysis

Data Source

The Space Race dataset used in this analysis was obtained from Kaggle, and it includes all the space missions since the beginning of the Space Race (1957).

Data loading

The first step is to import the libraries needed for analysis and visualization. Then we read the data, which is stored in CSV format.

Importing necessary libraries and data
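Since the original code screenshot is not reproduced here, the following is a minimal sketch of the loading step. The rows are made up to mimic the columns described in this article, and the real file name and path are not stated, so an in-memory CSV stands in for the Kaggle download.

```python
import io

import pandas as pd

# Made-up rows that mimic the columns listed in this article.
# They are illustrative only and are NOT taken from the Kaggle file.
SAMPLE_CSV = '''Company Name,Location,Datum,Detail,Status Rocket,Rocket,Status Mission
Company A,"Pad 1, Example Cosmodrome, Kazakhstan","Fri Oct 04, 1957 19:28 UTC",Vehicle X | Payload 1,StatusRetired,,Success
Company B,"Pad A, Example Space Center, Florida, USA","Wed Dec 25, 2019 10:00 UTC",Vehicle Y | Payload 2,StatusActive,50.0,Success
Company B,"Pad A, Example Space Center, Florida, USA","Fri Aug 07, 2020 05:12 UTC",Vehicle Y | Payload 3,StatusActive,50.0,Failure
'''

# With the real download, pass the path of the CSV file to pd.read_csv instead.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)   # (rows, columns)
print(df.head())  # first rows (up to 5)
```

Quoting the Location and Datum fields matters because both contain commas.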

To view the features of the dataset, I used .head() to grab the first 5 rows. Below is an overview of what the dataset looks like.

First 5 rows from datasets

The dataset has 7 useful columns:

  • Company Name — The name of the company that undertook the space mission
  • Location — The location of the launch
  • Datum — The date and time of the launch
  • Detail — The Name of the Launch Vehicles
  • Status Rocket — The current status of the rocket
  • Rocket — The cost (in $ millions) of the mission
  • Status Mission — The status, Success or Failure of the Launch

To get a statistical overview of the data, I used .describe().

Statistical overview of the datasets
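As a rough illustration of what .describe() reports, here is a small sketch on a made-up frame (the values are hypothetical). By default only numeric columns are summarized; passing include="all" also covers the text columns, and .isna().sum() shows how many values are missing per column.

```python
import pandas as pd

# Hypothetical sample; the cost column ("Rocket") has one missing value.
df = pd.DataFrame({
    "Company Name": ["Company A", "Company B", "Company B", "Company C"],
    "Status Mission": ["Success", "Success", "Failure", "Success"],
    "Rocket": [None, 50.0, 50.0, 29.75],
})

numeric_summary = df.describe()            # count, mean, std, min, quartiles, max
full_summary = df.describe(include="all")  # adds unique, top, freq for text columns
missing_per_column = df.isna().sum()       # number of missing values per column

print(numeric_summary)
print(full_summary)
print(missing_per_column)
```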

Feature Engineering and Data Cleaning

The Pandas library was used for different data analyses ranging from data sorting, data processing, dropping irrelevant rows and columns, to filling the missing values in their respective columns.

Data cleaning was done by filling in the missing values with the most frequent value in each column and dropping some redundant columns. Feature engineering was used to convert the date column from the object datatype to DateTime and to create more features from it. For example, the name of the country was extracted from the location, and the month and year were extracted from Datum.
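The cleaning and feature engineering steps described above can be sketched roughly as follows. This is not the original notebook code: the sample rows, the name of the redundant column ("Unnamed: 0"), and the Datum format string are all assumptions.

```python
import pandas as pd

# Hypothetical sample rows; "Unnamed: 0" stands in for a redundant column.
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3],
    "Location": [
        "Pad 1, Example Cosmodrome, Kazakhstan",
        "Pad 2, Example Launch Site, Russia",
        "Pad A, Example Space Center, Florida, USA",
        "Pad A, Example Space Center, Florida, USA",
    ],
    "Datum": [
        "Fri Oct 04, 1957 19:28 UTC",
        "Mon Jun 15, 1970 12:00 UTC",
        "Wed Dec 25, 2019 10:00 UTC",
        "Fri Aug 07, 2020 05:12 UTC",
    ],
    "Rocket": [50.0, None, 50.0, 29.75],
})

# Drop the redundant column.
df = df.drop(columns=["Unnamed: 0"])

# Fill missing costs with the most frequent value (the mode).
most_frequent_cost = df["Rocket"].mode().iloc[0]
df["Rocket"] = df["Rocket"].fillna(most_frequent_cost)

# Convert Datum from text to datetime. The format is an assumption; rows
# that do not match it (for example, dates without a time) need extra care.
df["Datum"] = pd.to_datetime(df["Datum"], format="%a %b %d, %Y %H:%M UTC")

# Derive new features: year, month, and country (last comma-separated token).
df["Year"] = df["Datum"].dt.year
df["Month"] = df["Datum"].dt.month
df["Country"] = df["Location"].str.split(",").str[-1].str.strip()

print(df[["Rocket", "Year", "Month", "Country"]])
```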

Data Visualization

The Python library Matplotlib was used to plot histograms, bar plots, and other charts. Seaborn was then used to create more attractive and informative statistical graphics. Now let’s dive into the visualization of each of the data columns.

First, let’s analyze how many missions were launched successfully and how many were failures, using a count plot and a pie chart.

Plot of Mission status
Chart of Mission status
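The two charts above could be produced along these lines. This is a sketch with made-up status values that uses plain Matplotlib; Seaborn's countplot would be an alternative for the first panel.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical mission statuses (not the real distribution).
status = pd.Series(
    ["Success"] * 8 + ["Failure"] + ["Partial Failure"],
    name="Status Mission",
)

counts = status.value_counts()                    # missions per status
percent = (counts / counts.sum() * 100).round(1)  # share of each status in %

fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: count plot drawn as a bar chart.
ax_bar.bar(list(counts.index), counts.values)
ax_bar.set_xlabel("Status Mission")
ax_bar.set_ylabel("Number of missions")

# Right panel: pie chart with percentage labels.
ax_pie.pie(counts.values, labels=list(counts.index), autopct="%1.1f%%")
ax_pie.set_title("Mission status")

fig.tight_layout()
print(percent)
```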

Observation

This shows that most missions were successful (89.7%), followed by failures and then partial failures. The least common mission status is prelaunch failure, at 0.1%.

According to physicist Russell Borogove, a success figure of around 94% reflects a long history of rocket development. More recently, launching agencies have refined their designs and processes to achieve high reliability. For example, Atlas II through Atlas V have had only one partial failure in 120 launches since 1991.

Next, let’s analyze the status of the rockets.

Chart of Rocket status

Observation

The chart shows that few rockets are active compared with retired ones: the majority of rockets (81.7%) are retired.

Next, I analyze the missions by the country that launched them.

Analysis of country that launched a space mission
Plot of country that launched a space mission

Observation

From the dataset, 22 distinct country values were extracted from the launch locations. Russia has the most launches (1395), followed by the USA (1344) and Kazakhstan (701). The values with just 1 launch are Pacific Missile Range Facility, Yellow Sea, and Shahrud Missile Test Site, which are launch locations rather than countries.

Next is an analysis based on the company that undertook the space mission. To avoid a cluttered plot, I analyzed the 7 most popular companies.

Chart of Top 7 companies that undertook the space mission
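One way to arrive at per-country counts is to take the last comma-separated part of each location, as sketched below on made-up locations. Some locations end in a launch site or a sea rather than a country, so a small remapping table can help; the SITE_TO_COUNTRY mapping here is a hypothetical example and should be checked before use.

```python
import pandas as pd

# Hypothetical launch locations in "pad, site, country" form.
locations = pd.Series([
    "Pad 1, Example Cosmodrome, Kazakhstan",
    "Pad 2, Example Cosmodrome, Kazakhstan",
    "Pad A, Example Space Center, Florida, USA",
    "Example Barge, Yellow Sea",
])

# The last comma-separated token is treated as the country.
country = locations.str.split(",").str[-1].str.strip()
raw_counts = country.value_counts()

# Assumed mapping for tokens that are not countries (verify before use).
SITE_TO_COUNTRY = {"Yellow Sea": "China"}
fixed_counts = country.replace(SITE_TO_COUNTRY).value_counts()

print(raw_counts)
print(fixed_counts)
```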

Observation

From the dataset, 56 companies are involved in space missions. The plot gets cluttered if every company is plotted, so for visualization I analyzed the top 7 companies. RVSN USSR has the highest share at 33.1%, followed by Arianespace (31.9%) and General Dynamics (16.6%); the US Air Force has the lowest share, at 1.8%.

Classifying the dataset by success and failure of the mission for the most popular companies.
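Selecting the top companies and computing their shares can be sketched like this, using made-up company names and a smaller TOP_N so the example stays short. Note that a share can be taken relative to the top group only or relative to all launches; which of the two the percentages above use is not stated, so both are shown.

```python
import pandas as pd

# Hypothetical launches: one entry per mission, labelled by company.
companies = pd.Series(
    ["Company A"] * 6 + ["Company B"] * 3 + ["Company C"] * 1
)

TOP_N = 2  # the article uses 7; 2 keeps this example small

top = companies.value_counts().head(TOP_N)  # most active companies first

# Share within the top group only (sums to 100%).
share_within_top = (top / top.sum() * 100).round(1)

# Share relative to every launch in the data.
share_of_all = (top / len(companies) * 100).round(1)

print(share_within_top)
print(share_of_all)
```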

Relationship between company and mission status

Observation

This shows that the highest number of successful space missions was recorded by RVSN USSR and the lowest by the US Air Force. I believe this is because RVSN USSR has the most recorded launches of these companies and the US Air Force has the fewest.

Also, in the plot above, every company has far more successful missions than failed ones.
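The relationship between company and mission status can be tabulated with a cross-tabulation, sketched below on made-up data. The example also shows why a raw count of successes and a success rate are different measures: the company with the most successes is not necessarily the one with the highest rate.

```python
import pandas as pd

# Hypothetical missions: Company A launches more often than Company B.
df = pd.DataFrame({
    "Company Name": ["Company A"] * 5 + ["Company B"] * 2,
    "Status Mission": ["Success"] * 4 + ["Failure"] + ["Success"] * 2,
})

# Rows are companies, columns are mission statuses, cells are counts.
table = pd.crosstab(df["Company Name"], df["Status Mission"])

# Success rate per company, in percent of that company's missions.
success_rate = (table["Success"] / table.sum(axis=1) * 100).round(1)

print(table)
print(success_rate)
```

Here Company A has more successes (4 versus 2) but a lower success rate (80% versus 100%).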

Analyze yearly and monthly distribution of space missions (launch)

Yearly distribution of space launches
Monthly distribution of space launches

Observation

From the plot, the mid-1960s to late 1970s and 2016 through 2020 show high numbers of space launches per year, while the late 1950s and early 2020s show lower numbers.

The monthly plot shows that December has the highest number of space launches, with over 400 in total, followed by June with 402 launches and April with 383. January has the fewest, with 268 launches.
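The yearly and monthly counts behind these plots can be computed roughly as follows. The dates are made up and the format string is an assumption about how Datum is written.

```python
import pandas as pd

# Hypothetical launch timestamps in the assumed Datum format.
datum = pd.Series([
    "Fri Oct 04, 1957 19:28 UTC",
    "Mon Jun 15, 1970 12:00 UTC",
    "Tue Dec 01, 1970 08:30 UTC",
    "Wed Dec 25, 2019 10:00 UTC",
    "Fri Aug 07, 2020 05:12 UTC",
])

dates = pd.to_datetime(datum, format="%a %b %d, %Y %H:%M UTC")

# Launches per calendar year, in chronological order.
per_year = dates.dt.year.value_counts().sort_index()

# Launches per calendar month, summed over all years.
per_month = dates.dt.month_name().value_counts()

print(per_year)
print(per_month)
```

The monthly counts are totals across all years, which is how the figures in the observation above should be read.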

Summary

From the dataset of all the space missions since the beginning of the Space Race (1957), the majority of the missions were successful, and most launches were carried out by Russia and the USA. RVSN USSR and Arianespace are the companies with the largest shares of missions. Lastly, from my analysis, most space missions were launched from the mid ’60s through the late ’70s.

For the complete exploratory and explanatory analysis, click here to view all the code on GitHub. Thanks for reading!
