Below is the table of contents. To jump to a section, click its header.
The scope of this project is to drill down into terrorist events around the world from 1970 through 2015.
The primary objectives are
The idea behind the project is to find out how terrorism has developed in the Western world and whether we need to build tall walls to protect ourselves against future threats. We chose our topic to be more globally oriented, because
You can find more general information about the project in the video below.
SocialData2017 from polakowo on Vimeo.
The dataset is very comprehensive and contains a lot of terrorism-related information. We downloaded the entire Global Terrorism Database (GTD), available from the GTD homepage. It contains 156,772 terrorist attacks x 137 features and takes 142.3 MB of disk space. It is worth mentioning that it is almost completely encoded (strings and long numbers are mapped to short numbers). To decode the dataset we looked at the codebook available here. After exploring the codebook we found some columns to be redundant or not relevant, and removed them. See the corresponding notebook Cleaning Data for further details on our approach.
We ended up working with 23 columns, which contain the quantitative and qualitative information of main interest. After the decoding, cleaning, filtering, and encoding steps, we got 156,772 rows x 23 columns, or equivalently 26.8 MB of disk space.
You can download the cleaned dataset from this link to try it out yourself!
Below you will find some basic information on the columns we used in the charts.
The first table contains numeric data:
Column name | Type | Min | Max | NaN | Description |
---|---|---|---|---|---|
year | int | 1970 | 2015 | None | Year |
nkilled | int | 0 | 1500 | None | Total Number of Fatalities |
nkilledter | int | 0 | 500 | None | Number of Perpetrator Fatalities |
nwounded | int | 0 | 5500 | None | Total Number of Injured |
nwoundedter | int | 0 | 200 | None | Number of Perpetrators Injured |
lat | float | -53.1546 | 74.6336 | None | Latitude (of city) |
lon | float | -176.176 | 179.367 | None | Longitude (of city) |
The second table contains categorical data:
Column name | Unique | Top | NaN | Description |
---|---|---|---|---|
region | 12 | Middle East & North Africa | None | Region |
country | 204 | Iraq | None | Country |
weapontype | 12 | Explosives/Bombs/Dynamite | 7.59% | Weapon Type |
attacktype | 9 | Bombing/Explosion | 3.38% | Attack Type |
targettype | 22 | Private Citizens & Property | 2.46% | Target/Victim Type |
gname | 3216 | Unknown | 46.41% | Perpetrator Group Name |
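The summary columns of the categorical table (Unique, Top, NaN) can be reproduced with a few lines of Python. This is a minimal sketch rather than the project's own code: the function name `summarize_categorical` and the sample values are hypothetical, and missing values are assumed to be stored as `None`.

```python
from collections import Counter

def summarize_categorical(values):
    """Return (unique count, most frequent value, share of missing values in %)."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    top = counts.most_common(1)[0][0] if counts else None
    nan_pct = 100.0 * (len(values) - len(present)) / len(values) if values else 0.0
    return len(counts), top, round(nan_pct, 2)

# Hypothetical sample of an attack-type column with one missing entry.
sample = ["Bombing/Explosion", "Armed Assault", "Bombing/Explosion", None]
unique, top, nan_pct = summarize_categorical(sample)
print(unique, top, nan_pct)
```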
Hint: Every categorical feature is encoded with integers to save a lot of space. Therefore, we introduce a global JSON dict called strings.json containing a map of all integers to their corresponding strings. The decoding takes place after the data has been successfully loaded to the front end, so none of the charts has to take care of it.
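The decoding step could look roughly like the sketch below. It is written in Python for illustration only, since the real decoding runs in the front end, and the layout of strings.json shown here (one sub-map per column, keyed by the integer code) is an assumption, as are the example codes and the helper name `decode_rows`.

```python
import json

# Hypothetical excerpt of strings.json: one sub-map per categorical column,
# keyed by the integer code (JSON object keys are always strings).
STRINGS_JSON = """
{
  "attacktype": {"1": "Assassination", "3": "Bombing/Explosion"},
  "region": {"10": "Middle East & North Africa"}
}
"""

def decode_rows(rows, strings):
    """Return copies of the rows with integer codes replaced by their labels."""
    decoded = []
    for row in rows:
        out = dict(row)
        for column, mapping in strings.items():
            code = out.get(column)
            if code is not None:
                # Fall back to the raw code when the map has no entry for it.
                out[column] = mapping.get(str(code), code)
        decoded.append(out)
    return decoded

strings = json.loads(STRINGS_JSON)
rows = [
    {"year": 2014, "region": 10, "attacktype": 3, "nkilled": 2},
    {"year": 1975, "region": 10, "attacktype": None, "nkilled": 0},
]
decoded = decode_rows(rows, strings)
print(decoded[0]["attacktype"])
```

Decoding once, right after loading, keeps the transferred data small while every chart works with readable labels.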
You may have already noticed that the number of columns is less than 23. We limited the amount of information to deliver it quickly and to make the charts more responsive (= less laggy). Below you will find the optional columns we skipped.
Column name | Description |
---|---|
state | Province / Administrative Region / State |
city | City |
extended | Extended Incident? |
multiple | Part of Multiple Incident? |
success | Successful Attack? |
suicide | Suicide Attack? |
nter | Number of Perpetrators |
claimed | Claim of Responsibility? |
property | Property Damage |
propertyextent | Extent of Property Damage |
Python and Plotly) capturing those columns as well.
Charts are vital in the presentation of data. They are used in both exploratory and descriptive analysis. As most aggregations are time-intensive, we outsource them to the back end. We perform every major task in two steps:
- iPython to process the data, e.g., apply filters, perform aggregations, etc.
- d3.js, which facilitates generation and manipulation of web documents with data, for construction of beautiful interactive data visualizations.

This histogram aims at exploring temporal patterns of terrorism from 1970 through 2015.
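For a yearly histogram, the back-end step of this two-step split could be sketched as follows. This is an illustrative, standard-library version rather than the notebook code; the function name `attacks_per_year` and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict
import json

def attacks_per_year(rows, group_by):
    """Count attacks per year, split by the values of one categorical column."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[group_by]][row["year"]] += 1
    # Plain nested dicts serialize directly to JSON for the front end.
    return {group: dict(sorted(years.items())) for group, years in counts.items()}

# Hypothetical, already decoded sample rows.
rows = [
    {"year": 1970, "region": "Western Europe"},
    {"year": 1970, "region": "Western Europe"},
    {"year": 2015, "region": "Western Europe"},
    {"year": 2015, "region": "South Asia"},
]
result = attacks_per_year(rows, "region")
print(json.dumps(result))
```

The front end then only has to draw the precomputed counts instead of scanning all 156,772 rows in the browser.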
The main question we address is
How has terrorism developed over time from the perspective of geographical units, types, or terrorist groups?
We're interested in the temporal aspect of terrorist development, which touches many interesting attributes:
We implemented the following cool features:
The second chart is a scatterplot, which encodes 3 numeric attributes
The main question we address is
What are
- the most lethal weapon types,
- the most effective attack types, and
- the most vulnerable target types

in the selected geographical unit?
We implemented the following cool features:
The scatterplot above has one big issue: we can display up to ~30 circles before we run out of space. But what if we want to compare countries? Even on a rectangular map with the Mercator projection we would need some kind of zoom. To tackle the problem we decided on another, more difficult but also more interesting solution: map the countries onto a virtual globe and let the user rotate it!
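A rotatable globe is commonly drawn with an orthographic projection, where rotating the globe amounts to changing the center point. The sketch below shows the standard formulas in Python; it assumes an orthographic view, which the page does not state explicitly, and the function name `orthographic` is hypothetical.

```python
from math import cos, radians, sin

def orthographic(lat, lon, center_lat=0.0, center_lon=0.0, radius=1.0):
    """Project (lat, lon) in degrees onto a globe viewed from above the center point.

    Returns (x, y), or None when the point lies on the far side of the globe.
    """
    phi, lam = radians(lat), radians(lon)
    phi0, lam0 = radians(center_lat), radians(center_lon)
    # Cosine of the angular distance between the point and the view center.
    cos_c = sin(phi0) * sin(phi) + cos(phi0) * cos(phi) * cos(lam - lam0)
    if cos_c < 0:
        return None  # hidden behind the horizon
    x = radius * cos(phi) * sin(lam - lam0)
    y = radius * (cos(phi0) * sin(phi) - sin(phi0) * cos(phi) * cos(lam - lam0))
    return x, y

print(orthographic(0.0, 0.0))    # center of the visible disc
print(orthographic(0.0, 180.0))  # opposite side of the globe, not visible
```

Rotating the globe by 180 degrees (`center_lon=180.0`) brings the second point to the center of the view.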
The main question we address is
How do the countries compare with each other in terms of terrorism?
We implemented the following cool features:
K-Means is the first pattern recognition algorithm we use for the analysis. Using K-Means, we can partition terrorist attacks into groups (at least that is the idea) to see how terrorism is distributed geographically.
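The idea of K-Means can be sketched in plain Python using Lloyd's algorithm. This is a toy illustration under stated assumptions, not the project's implementation: it treats latitude and longitude as flat Euclidean coordinates, and the helper names and sample points are hypothetical.

```python
import random

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: (point[0] - centroids[i][0]) ** 2
        + (point[1] - centroids[i][1]) ** 2,
    )

def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm on (lat, lon) pairs; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]  # an empty cluster keeps its old centroid
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical attack locations: three near Baghdad, three near Bogota.
points = [(33.3, 44.4), (33.5, 44.0), (34.0, 45.0),
          (4.7, -74.1), (4.5, -74.5), (5.0, -73.9)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Euclidean distance on raw coordinates distorts distances near the poles and across the 180° meridian, so this is only a rough approximation on a global dataset.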
The main question we address is
Do terrorist attacks form geographical groups? Are there visual patterns to find?
We implemented the following cool features:
The second pattern recognition algorithm is k-Nearest Neighbors (kNN), which can be used for both classification and regression. Using kNN, we can classify any point on the globe based on its k nearest neighbors.
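A minimal kNN classifier for points on the globe might look like the sketch below. It assumes great-circle (haversine) distance and a simple majority vote, neither of which is confirmed by the page; the function names and the sample attacks are hypothetical.

```python
from collections import Counter
from math import asin, cos, radians, sin, sqrt

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def knn_classify(query, attacks, k=3):
    """Majority vote over the labels of the k attacks closest to the query point."""
    neighbors = sorted(attacks, key=lambda a: haversine_km(query, a[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical labeled attacks: ((lat, lon), attack type).
attacks = [
    ((33.3, 44.4), "Bombing/Explosion"),
    ((33.5, 44.0), "Bombing/Explosion"),
    ((34.0, 45.0), "Armed Assault"),
    ((4.7, -74.1), "Assassination"),
    ((4.5, -74.5), "Assassination"),
]
print(knn_classify((33.4, 44.3), attacks, k=3))  # prints "Bombing/Explosion"
```

Sorting all attacks for every query is fine for a toy example; for the full dataset a spatial index would be the more practical choice.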
The main question we address is
If we knew that a terrorist attack was going to happen somewhere on the globe, what type would it most likely be?
We implemented the following cool features:
Some optional information was skipped to reduce the size of the webpage, so you are welcome to continue reading in the Explainer Notebook. You may also be interested in trying things out: clone the repository, download and import the data, and enjoy your analysis.