Data Visualization Case Study: World Happiness Report

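Table of contents
  1. Background
  2. Content
  3. Dataset columns
  4. Countries with the highest and lowest GDP per capita
  5. The 10 countries with the highest and lowest happiness scores
  6. 2D scatter plot of happiness score vs. GDP
  7. 2D scatter plot of life expectancy vs. GDP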

Background

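The World Happiness Report is a landmark survey of the state of global happiness. The report continues to gain global recognition as governments, organizations, and civil society increasingly use happiness indicators to inform their decisions. Leading experts across fields (economics, psychology, survey analysis, national statistics, health, public policy, and more) describe how measurements of happiness can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national differences in happiness.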

Content

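The happiness scores and rankings use data from the Gallup World Poll. The columns that follow the happiness score estimate how much each of six factors (economic production, social support, life expectancy, freedom, absence of corruption, and generosity) contributes to making life evaluations in each country higher than in Dystopia, a hypothetical country whose values equal the world's lowest national averages for each of the six factors. These columns have no effect on the total score reported for each country, but they do explain why some countries rank higher than others.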

Dataset: World Happiness Report

import matplotlib as mpl

mpl.rcParams['font.sans-serif'] = ['FangSong']  # default font; FangSong can render Chinese text in labels
mpl.rcParams['axes.unicode_minus'] = False  # keep the minus sign '-' from rendering as a box in saved figures

# import the libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older Matplotlib versions

import warnings
warnings.filterwarnings('ignore')

Dataset columns

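  • Country name: name of the country
  • Regional indicator: world region the country belongs to
  • Ladder score: happiness score
  • Standard error of ladder score: standard error of the happiness score
  • upperwhisker: upper bound of the confidence interval around the happiness score
  • lowerwhisker: lower bound of the confidence interval around the happiness score
  • Logged GDP per capita: logarithm of GDP per capita
  • Social support: whether people feel they have someone to count on in times of trouble
  • Healthy life expectancy: healthy life expectancy, in years
  • Freedom to make life choices: satisfaction with the freedom to make one's own life choices
  • Generosity: generosity (charitable giving, adjusted for GDP per capita)
  • Perceptions of corruption: perceived level of corruption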
  • Ladder score in Dystopia
  • Explained by: Log GDP per capita
  • Explained by: Social support
  • Explained by: Healthy life expectancy
  • Explained by: Freedom to make life choices
  • Explained by: Generosity
  • Explained by: Perceptions of corruption
  • Dystopia + residual
A                 B
Social support    Healthy life expectancy
GDP per capita    Happiness score
Generosity        Freedom to make life choices
Region
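The `upperwhisker` and `lowerwhisker` columns look like the bounds of a 95% confidence interval, that is, the ladder score plus or minus about 1.96 standard errors. The sketch below checks that reading against the first five rows of the 2021 file, with values copied from this article. The multiplier 1.96 is an assumption, not something the dataset states, and the helper name `whisker_gaps` is made up for illustration.

```python
# First five rows of the 2021 file, copied from this article:
# (country, ladder score, standard error, lowerwhisker, upperwhisker)
rows = [
    ("Finland",     7.842, 0.032, 7.780, 7.904),
    ("Denmark",     7.620, 0.035, 7.552, 7.687),
    ("Switzerland", 7.571, 0.036, 7.500, 7.643),
    ("Iceland",     7.554, 0.059, 7.438, 7.670),
    ("Netherlands", 7.464, 0.027, 7.410, 7.518),
]

Z_95 = 1.96  # assumption: the whiskers are a 95% normal-approximation interval


def whisker_gaps(rows, z=Z_95):
    """Largest absolute gap between score +/- z*se and the published whiskers."""
    worst = 0.0
    for _, score, se, low, high in rows:
        worst = max(worst, abs(score - z * se - low), abs(score + z * se - high))
    return worst


max_gap = whisker_gaps(rows)
print(f"largest gap between computed and published bounds: {max_gap:.4f}")
```

If the assumption holds, the gap should stay within what rounding to three decimals can explain (a few thousandths at most).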
#load the dataset
df = pd.read_csv('world-happiness-report-2021.csv')
df.head().transpose()
                                            0               1               2               3               4
Country name                                Finland         Denmark         Switzerland     Iceland         Netherlands
Regional indicator                          Western Europe  Western Europe  Western Europe  Western Europe  Western Europe
Ladder score                                7.842           7.62            7.571           7.554           7.464
Standard error of ladder score              0.032           0.035           0.036           0.059           0.027
upperwhisker                                7.904           7.687           7.643           7.67            7.518
lowerwhisker                                7.78            7.552           7.5             7.438           7.41
Logged GDP per capita                       10.775          10.933          11.117          10.878          10.932
Social support                              0.954           0.954           0.942           0.983           0.942
Healthy life expectancy                     72.0            72.7            74.4            73.0            72.4
Freedom to make life choices                0.949           0.946           0.919           0.955           0.913
Generosity                                  -0.098          0.03            0.025           0.16            0.175
Perceptions of corruption                   0.186           0.179           0.292           0.673           0.338
Ladder score in Dystopia                    2.43            2.43            2.43            2.43            2.43
Explained by: Log GDP per capita            1.446           1.502           1.566           1.482           1.501
Explained by: Social support                1.106           1.108           1.079           1.172           1.079
Explained by: Healthy life expectancy       0.741           0.763           0.816           0.772           0.753
Explained by: Freedom to make life choices  0.691           0.686           0.653           0.698           0.647
Explained by: Generosity                    0.124           0.208           0.204           0.293           0.302
Explained by: Perceptions of corruption     0.481           0.485           0.413           0.17            0.384
Dystopia + residual                         3.253           2.868           2.839           2.967           2.798
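In the five rows shown above, the six "Explained by" columns plus "Dystopia + residual" appear to add up to the ladder score. That fits the description of these columns as a breakdown of the score rather than inputs to it. The following sketch checks this on the values printed above; it demonstrates the arithmetic on five rows only and does not claim the same for the whole file.

```python
import pandas as pd

# Values copied from the head() output above (first five rows of the 2021 file).
sample = pd.DataFrame(
    {
        "Country name": ["Finland", "Denmark", "Switzerland", "Iceland", "Netherlands"],
        "Ladder score": [7.842, 7.620, 7.571, 7.554, 7.464],
        "Explained by: Log GDP per capita": [1.446, 1.502, 1.566, 1.482, 1.501],
        "Explained by: Social support": [1.106, 1.108, 1.079, 1.172, 1.079],
        "Explained by: Healthy life expectancy": [0.741, 0.763, 0.816, 0.772, 0.753],
        "Explained by: Freedom to make life choices": [0.691, 0.686, 0.653, 0.698, 0.647],
        "Explained by: Generosity": [0.124, 0.208, 0.204, 0.293, 0.302],
        "Explained by: Perceptions of corruption": [0.481, 0.485, 0.413, 0.170, 0.384],
        "Dystopia + residual": [3.253, 2.868, 2.839, 2.967, 2.798],
    }
)

# The seven components: six "Explained by" columns plus the Dystopia residual.
parts = [c for c in sample.columns if c.startswith("Explained by:")]
parts.append("Dystopia + residual")

sample["Reconstructed"] = sample[parts].sum(axis=1)
sample["Gap"] = (sample["Reconstructed"] - sample["Ladder score"]).abs()
print(sample[["Country name", "Ladder score", "Reconstructed", "Gap"]])
```

The same two lines that build `Reconstructed` and `Gap` should work on the full `df`, since the column names are identical.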
# df[["Country name", "Freedom to make life choices"]].sort_values(by="Freedom to make life choices", ascending=False).values
df.shape
(149, 20)
# df.info
df.describe().transpose()
                                            count  mean       std           min     25%     50%     75%     max
Ladder score                                149.0  5.532839   1.073924e+00  2.523   4.852   5.534   6.255   7.842
Standard error of ladder score              149.0  0.058752   2.200120e-02  0.026   0.043   0.054   0.070   0.173
upperwhisker                                149.0  5.648007   1.054330e+00  2.596   4.991   5.625   6.344   7.904
lowerwhisker                                149.0  5.417631   1.094879e+00  2.449   4.706   5.413   6.128   7.780
Logged GDP per capita                       149.0  9.432208   1.158601e+00  6.635   8.541   9.569   10.421  11.647
Social support                              149.0  0.814745   1.148890e-01  0.463   0.750   0.832   0.905   0.983
Healthy life expectancy                     149.0  64.992799  6.762043e+00  48.478  59.802  66.603  69.600  76.953
Freedom to make life choices                149.0  0.791597   1.133318e-01  0.382   0.718   0.804   0.877   0.970
Generosity                                  149.0  -0.015134  1.506567e-01  -0.288  -0.126  -0.036  0.079   0.542
Perceptions of corruption                   149.0  0.727450   1.792263e-01  0.082   0.667   0.781   0.845   0.939
Ladder score in Dystopia                    149.0  2.430000   5.347044e-15  2.430   2.430   2.430   2.430   2.430
Explained by: Log GDP per capita            149.0  0.977161   4.047399e-01  0.000   0.666   1.025   1.323   1.751
Explained by: Social support                149.0  0.793315   2.588713e-01  0.000   0.647   0.832   0.996   1.172
Explained by: Healthy life expectancy       149.0  0.520161   2.130191e-01  0.000   0.357   0.571   0.665   0.897
Explained by: Freedom to make life choices  149.0  0.498711   1.378884e-01  0.000   0.409   0.514   0.603   0.716
Explained by: Generosity                    149.0  0.178047   9.827033e-02  0.000   0.105   0.164   0.239   0.541
Explained by: Perceptions of corruption     149.0  0.135141   1.143614e-01  0.000   0.060   0.101   0.174   0.547
Dystopia + residual                         149.0  2.430329   5.376452e-01  0.648   2.138   2.509   2.794   3.482
df.duplicated().sum()
0
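A duplicate count of 0 says nothing about missing values, so the two checks are often run together. The sketch below bundles both into one small helper. The name `quality_report` and the toy frame are hypothetical and only illustrate the mechanics; on the real data the call would be `quality_report(df)`.

```python
import pandas as pd


def quality_report(frame: pd.DataFrame) -> pd.Series:
    """Summarise row count, fully duplicated rows, and missing cells."""
    return pd.Series(
        {
            "rows": len(frame),
            "duplicate_rows": int(frame.duplicated().sum()),
            "missing_values": int(frame.isna().sum().sum()),
        }
    )


# Hypothetical toy frame: one repeated row and one missing score.
toy = pd.DataFrame(
    {
        "Country name": ["A", "B", "B", "C"],
        "Ladder score": [7.0, 6.5, 6.5, None],
    }
)

report = quality_report(toy)
print(report)
```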
plt.figure(figsize=[18,9])
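# correlate the numeric columns only; the country and region columns are text
sns.heatmap(df.select_dtypes('number').corr(), annot=True, cmap="Blues");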


[Figure: correlation heatmap of the numeric columns]
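A heatmap of every pairwise correlation can be hard to scan when the question is simply which columns move with the happiness score. One possible approach is to pull out the single column of interest and sort it by absolute value. The helper name `correlations_with` is made up for this sketch, and the demo uses only the five rows printed earlier in this article, far too few for meaningful coefficients, so it shows the mechanics only.

```python
import pandas as pd

# Five rows copied from this article; enough to show the mechanics, not to draw conclusions.
sample = pd.DataFrame(
    {
        "Country name": ["Finland", "Denmark", "Switzerland", "Iceland", "Netherlands"],
        "Ladder score": [7.842, 7.620, 7.571, 7.554, 7.464],
        "Logged GDP per capita": [10.775, 10.933, 11.117, 10.878, 10.932],
        "Social support": [0.954, 0.954, 0.942, 0.983, 0.942],
        "Healthy life expectancy": [72.0, 72.7, 74.4, 73.0, 72.4],
    }
)


def correlations_with(frame: pd.DataFrame, target: str) -> pd.Series:
    """Pearson correlation of every numeric column with `target`, strongest first."""
    numeric = frame.select_dtypes("number")       # text columns are skipped
    corr = numeric.corr()[target].drop(target)    # drop the trivial self-correlation
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


ranked = correlations_with(sample, "Ladder score")
print(ranked)
```

On the full dataset the call would be `correlations_with(df, "Ladder score")`. Constant columns such as "Ladder score in Dystopia" yield NaN, because a column with no variance has no defined correlation.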

Countries with the highest and lowest GDP per capita

gdp = df[['Logged GDP per capita','Country name']].sort_values('Logged GDP per capita', ascending=False)
gdp[:10]
     Logged GDP per capita  Country name
7    11.647                 Luxembourg
31   11.488                 Singapore
14   11.342                 Ireland
2    11.117                 Switzerland
24   11.085                 United Arab Emirates
5    11.053                 Norway
18   11.023                 United States
76   11.000                 Hong Kong S.A.R. of China
1    10.933                 Denmark
4    10.932                 Netherlands
gdp[-10:]
     Logged GDP per capita  Country name
142  7.477                  Haiti
137  7.434                  Sierra Leone
134  7.396                  Madagascar
127  7.364                  Chad
135  7.362                  Togo
119  7.288                  Liberia
114  7.158                  Mozambique
95   7.098                  Niger
143  6.958                  Malawi
139  6.635                  Burundi
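Because the column is logged, a gap that looks small in the tables above corresponds to a large ratio in actual GDP per capita. The sketch below converts a few of the printed values back to levels. It assumes the column is a natural logarithm, which the dataset does not state here, though the resulting magnitudes look plausible under that reading.

```python
import math

# Logged GDP per capita values copied from the two tables above.
logged_gdp = {
    "Luxembourg": 11.647,
    "Singapore": 11.488,
    "Malawi": 6.958,
    "Burundi": 6.635,
}


def to_level(log_value: float) -> float:
    """Undo the log transform (assumption: natural log)."""
    return math.exp(log_value)


for country, value in logged_gdp.items():
    print(f"{country:<12} {value:>7.3f} -> {to_level(value):>10,.0f}")

# A difference of d in log units is a ratio of exp(d) in levels.
ratio = to_level(logged_gdp["Luxembourg"] - logged_gdp["Burundi"])
print(f"Luxembourg / Burundi: roughly {ratio:.0f}x")
```

Under this assumption, the roughly 5-point spread between the top and bottom of the table corresponds to a gap of about two orders of magnitude in GDP per capita.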

The 10 countries with the highest and lowest happiness scores

# top 10 countries by ladder score
ladder_score = df[['Ladder score',"Standard error of ladder score",'Country name']].sort_values("Ladder score", ascending=False)
ladder_score[:10]
     Ladder score  Standard error of ladder score  Country name
0    7.842         0.032                           Finland
1    7.620         0.035                           Denmark
2    7.571         0.036                           Switzerland
3    7.554         0.059                           Iceland
4    7.464         0.027                           Netherlands
5    7.392         0.035                           Norway
6    7.363         0.036                           Sweden
7    7.324         0.037                           Luxembourg
8    7.277         0.040                           New Zealand
9    7.268         0.036                           Austria
# bottom 10 countries by ladder score
ladder_score[-10:]
     Ladder score  Standard error of ladder score  Country name
139  3.775         0.107                           Burundi
140  3.658         0.070                           Yemen
141  3.623         0.071                           Tanzania
142  3.615         0.173                           Haiti
143  3.600         0.092                           Malawi
144  3.512         0.120                           Lesotho
145  3.467         0.074                           Botswana
146  3.415         0.068                           Rwanda
147  3.145         0.058                           Zimbabwe
148  2.523         0.038                           Afghanistan
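The standard error column makes it possible to ask how firm the ranking is. As a rough sketch, the code below builds an approximate 95% interval (score plus or minus 1.96 standard errors, an assumed convention) for each of the top 10 countries listed above and reports which adjacent ranks have overlapping intervals. Overlap is only an informal hint that two ranks may not be clearly separated, not a formal significance test, and the helper names are made up for illustration.

```python
# (country, ladder score, standard error) copied from the top-10 table above.
top10 = [
    ("Finland", 7.842, 0.032),
    ("Denmark", 7.620, 0.035),
    ("Switzerland", 7.571, 0.036),
    ("Iceland", 7.554, 0.059),
    ("Netherlands", 7.464, 0.027),
    ("Norway", 7.392, 0.035),
    ("Sweden", 7.363, 0.036),
    ("Luxembourg", 7.324, 0.037),
    ("New Zealand", 7.277, 0.040),
    ("Austria", 7.268, 0.036),
]

Z_95 = 1.96  # assumption: normal-approximation 95% interval


def interval(score, se, z=Z_95):
    """Approximate confidence interval as (low, high)."""
    return score - z * se, score + z * se


def overlapping_neighbours(ranked):
    """Pairs of adjacent ranks whose intervals overlap (input sorted best first)."""
    pairs = []
    for (name_a, score_a, se_a), (name_b, score_b, se_b) in zip(ranked, ranked[1:]):
        low_a, _ = interval(score_a, se_a)
        _, high_b = interval(score_b, se_b)
        if high_b >= low_a:
            pairs.append((name_a, name_b))
    return pairs


pairs = overlapping_neighbours(top10)
for higher, lower in pairs:
    print(f"{higher} and {lower}: intervals overlap")
```

Under these assumptions Finland stands apart from Denmark, while every other adjacent pair in the top 10 overlaps.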


2D scatter plot of happiness score vs. GDP

* https://stackoverflow.com/questions/13872533/plot-different-dataframes-in-the-same-figure

  • Shows a strong positive correlation between the happiness score and GDP per capita
fig1, ax1 = plt.subplots()
# fig1.set_size_inches(10, 10)
df.plot(y='Ladder score', x="Logged GDP per capita", kind="scatter", ax=ax1)
plt.title("Happiness score vs. GDP per capita")

# optional: label every point with its country name
# for cn, xs, ys in zip(df['Country name'], df['Logged GDP per capita'], df['Ladder score']):
#     plt.text(xs, ys, cn)
Text(0.5, 1.0, 'Happiness score vs. GDP per capita')


[Figure: scatter plot of happiness score against logged GDP per capita]
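The visual impression of a strong positive correlation can be backed by a number. The sketch below computes Pearson's r and a least-squares line for the ten countries whose logged GDP and ladder score both appear in the tables of this article. That sample consists only of countries near the top or bottom of the rankings, so r will overstate the relationship compared with the full 149-country dataset; treat it as an illustration of the method, not a result.

```python
import numpy as np

# (country, Logged GDP per capita, Ladder score), copied from the tables in this article.
points = [
    ("Finland", 10.775, 7.842),
    ("Denmark", 10.933, 7.620),
    ("Switzerland", 11.117, 7.571),
    ("Iceland", 10.878, 7.554),
    ("Netherlands", 10.932, 7.464),
    ("Norway", 11.053, 7.392),
    ("Luxembourg", 11.647, 7.324),
    ("Haiti", 7.477, 3.615),
    ("Malawi", 6.958, 3.600),
    ("Burundi", 6.635, 3.775),
]

gdp = np.array([p[1] for p in points])
score = np.array([p[2] for p in points])

# Pearson correlation coefficient between the two columns.
r = np.corrcoef(gdp, score)[0, 1]

# Least-squares straight line: score ~ slope * gdp + intercept.
slope, intercept = np.polyfit(gdp, score, deg=1)

print(f"Pearson r = {r:.3f}")
print(f"fitted line: score = {slope:.3f} * logged_gdp + {intercept:.3f}")
```

On the full data the same calls would take `df["Logged GDP per capita"]` and `df["Ladder score"]` as inputs.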

2D scatter plot of life expectancy vs. GDP

  • Shows a strong positive correlation between healthy life expectancy and GDP per capita
df.plot(x="Logged GDP per capita", y="Healthy life expectancy", kind="scatter")
plt.title("Healthy life expectancy vs. GDP per capita")

plt.ylabel('Healthy life expectancy', fontsize=10, fontweight='bold')
plt.xlabel('Logged GDP per capita', fontsize=10, fontweight='bold')
Text(0.5, 0, 'Logged GDP per capita')


[Figure: scatter plot of healthy life expectancy against logged GDP per capita]

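# recent Matplotlib releases no longer accept keyword arguments in Figure.gca(); add_subplot creates the 3D axes
threedee = plt.figure(figsize=(16, 12)).add_subplot(projection='3d')
threedee.scatter(df["Logged GDP per capita"], df["Healthy life expectancy"], df['Ladder score'], c=df["Logged GDP per capita"])
threedee.set_xlabel('Logged GDP per capita', fontsize=20)
threedee.set_ylabel('Healthy life expectancy', fontsize=20)
threedee.set_zlabel('Happiness score', fontsize=20)
plt.title("GDP per capita, life expectancy, and happiness score (3D)", fontsize=30)
plt.show()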


[Figure: 3D scatter plot of logged GDP per capita, healthy life expectancy, and happiness score]