2021-06-26

Data Visualization Case Study: The World Happiness Report

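Contents

1. Background
2. Content
3. Dataset columns
4. Countries with the highest and lowest GDP per capita
5. The 10 countries with the highest and lowest happiness scores
6. Scatter plot of happiness score versus GDP
7. Scatter plot of life expectancy versus GDP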

Background

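The World Happiness Report is a landmark survey of the state of global happiness. The report continues to gain global recognition as governments, organizations, and civil society increasingly use happiness indicators to inform their decisions. Leading experts across fields (economics, psychology, survey analysis, national statistics, health, public policy, and more) describe how measurements of well-being can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national variations in happiness.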

Content

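The happiness scores and rankings use data from the Gallup World Poll. The columns that follow the happiness score estimate how much each of six factors (economic production, social support, life expectancy, freedom, absence of corruption, and generosity) contributes to making life evaluations in each country higher than in Dystopia, a hypothetical benchmark country with the lowest values for each factor. These columns have no effect on the total score reported for each country, but they do explain why some countries rank higher than others.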

Dataset: World Happiness Report

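import matplotlib as mpl

mpl.rcParams['font.sans-serif'] = ['FangSong']  # default font; FangSong provides the glyphs needed for Chinese text
mpl.rcParams['axes.unicode_minus'] = False  # keep the minus sign '-' from rendering as a box in saved figures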
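The font setting above only helps if the named font is installed. A minimal sketch of a fallback, assuming a hypothetical helper `pick_font` and a hypothetical candidate list (neither is part of the original notebook), could look like this:

```python
import matplotlib as mpl
from matplotlib import font_manager


def pick_font(candidates, fallback="DejaVu Sans"):
    """Return the first candidate font that matplotlib knows about, else the fallback."""
    # fontManager.ttflist holds one entry per font file matplotlib has discovered.
    installed = {font.name for font in font_manager.fontManager.ttflist}
    for name in candidates:
        if name in installed:
            return name
    return fallback


# Hypothetical preference order; adjust it to the fonts available on your system.
chosen = pick_font(["FangSong", "SimHei", "Noto Sans CJK SC"])
mpl.rcParams["font.sans-serif"] = [chosen]
mpl.rcParams["axes.unicode_minus"] = False
print("Using font:", chosen)
```

If none of the candidates is installed, the helper falls back to DejaVu Sans, which ships with matplotlib but does not cover Chinese characters.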

#importing the libraries
import pandas as pd 
import numpy as np 
import seaborn as sns 
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

import warnings
warnings.filterwarnings('ignore')

Dataset columns

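Country name: name of the country
Regional indicator: region
Ladder score: happiness score
Standard error of ladder score: standard error of the happiness score
upperwhisker: upper bound of the confidence interval for the happiness score
lowerwhisker: lower bound of the confidence interval for the happiness score
Logged GDP per capita: logarithm of GDP per capita
Social support: social support
Healthy life expectancy: healthy life expectancy
Freedom to make life choices: freedom to make life decisions
Generosity: generosity
Perceptions of corruption: perceived level of corruption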
Ladder score in Dystopia
Explained by: Log GDP per capita
Explained by: Social support
Explained by: Healthy life expectancy
Explained by: Freedom to make life choices
Explained by: Generosity
Explained by: Perceptions of corruption
Dystopia + residual
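The two whisker columns look like interval bounds around the ladder score. One way to test that reading is sketched below, under the assumption that they form a 95% normal-approximation interval (score ± 1.96 × standard error). The 1.96 multiplier is an assumption, not something stated in the dataset; the three rows are copied from the `df.head()` output in this article.

```python
import pandas as pd

# Three rows copied from the df.head() output in this article.
sample = pd.DataFrame({
    "Country name": ["Finland", "Denmark", "Switzerland"],
    "Ladder score": [7.842, 7.620, 7.571],
    "Standard error of ladder score": [0.032, 0.035, 0.036],
    "upperwhisker": [7.904, 7.687, 7.643],
    "lowerwhisker": [7.780, 7.552, 7.500],
})

Z_95 = 1.96  # assumption: the whiskers are a 95% normal-approximation interval
margin = Z_95 * sample["Standard error of ladder score"]

# Absolute difference between the reconstructed bounds and the published ones.
upper_gap = (sample["Ladder score"] + margin - sample["upperwhisker"]).abs()
lower_gap = (sample["Ladder score"] - margin - sample["lowerwhisker"]).abs()

print(pd.DataFrame({
    "Country name": sample["Country name"],
    "upper_gap": upper_gap,
    "lower_gap": lower_gap,
}))
```

For these rows the gaps stay within a few thousandths, which is about what rounding the standard error to three decimals would produce.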

A	B
Social support	Healthy life expectancy
GDP per capita	Happiness score
Generosity	Freedom to make life choices
Region


#load the dataset
df = pd.read_csv('world-happiness-report-2021.csv')
df.head().transpose()

	0	1	2	3	4
Country name	Finland	Denmark	Switzerland	Iceland	Netherlands
Regional indicator	Western Europe	Western Europe	Western Europe	Western Europe	Western Europe
Ladder score	7.842	7.62	7.571	7.554	7.464
Standard error of ladder score	0.032	0.035	0.036	0.059	0.027
upperwhisker	7.904	7.687	7.643	7.67	7.518
lowerwhisker	7.78	7.552	7.5	7.438	7.41
Logged GDP per capita	10.775	10.933	11.117	10.878	10.932
Social support	0.954	0.954	0.942	0.983	0.942
Healthy life expectancy	72.0	72.7	74.4	73.0	72.4
Freedom to make life choices	0.949	0.946	0.919	0.955	0.913
Generosity	-0.098	0.03	0.025	0.16	0.175
Perceptions of corruption	0.186	0.179	0.292	0.673	0.338
Ladder score in Dystopia	2.43	2.43	2.43	2.43	2.43
Explained by: Log GDP per capita	1.446	1.502	1.566	1.482	1.501
Explained by: Social support	1.106	1.108	1.079	1.172	1.079
Explained by: Healthy life expectancy	0.741	0.763	0.816	0.772	0.753
Explained by: Freedom to make life choices	0.691	0.686	0.653	0.698	0.647
Explained by: Generosity	0.124	0.208	0.204	0.293	0.302
Explained by: Perceptions of corruption	0.481	0.485	0.413	0.17	0.384
Dystopia + residual	3.253	2.868	2.839	2.967	2.798
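The rows above also show how the "Explained by" columns relate to the ladder score: in these rows, the six contributions plus "Dystopia + residual" add up to the score. A small check, using three rows copied from the output above (the names `explained_cols`, `reconstructed`, and `max_gap` are only illustrative):

```python
import pandas as pd

explained_cols = [
    "Explained by: Log GDP per capita",
    "Explained by: Social support",
    "Explained by: Healthy life expectancy",
    "Explained by: Freedom to make life choices",
    "Explained by: Generosity",
    "Explained by: Perceptions of corruption",
]

# Values copied from the transposed df.head() output above.
sample = pd.DataFrame(
    [
        ["Finland", 7.842, 1.446, 1.106, 0.741, 0.691, 0.124, 0.481, 3.253],
        ["Denmark", 7.620, 1.502, 1.108, 0.763, 0.686, 0.208, 0.485, 2.868],
        ["Switzerland", 7.571, 1.566, 1.079, 0.816, 0.653, 0.204, 0.413, 2.839],
    ],
    columns=["Country name", "Ladder score", *explained_cols, "Dystopia + residual"],
)

# Sum the six factor contributions and add the Dystopia baseline plus residual.
reconstructed = sample[explained_cols].sum(axis=1) + sample["Dystopia + residual"]
max_gap = (reconstructed - sample["Ladder score"]).abs().max()
print("Largest difference from the published ladder score:", round(max_gap, 3))
```

On the full dataset the same comparison can be run on `df` directly; small differences are expected because the published values are rounded to three decimals.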

# df[["Country name", "Freedom to make life choices"]].sort_values(by="Freedom to make life choices", ascending=False).values

df.shape

(149, 20)

# df.info

df.describe().transpose()

	count	mean	std	min	25%	50%	75%	max
Ladder score	149.0	5.532839	1.073924e+00	2.523	4.852	5.534	6.255	7.842
Standard error of ladder score	149.0	0.058752	2.200120e-02	0.026	0.043	0.054	0.070	0.173
upperwhisker	149.0	5.648007	1.054330e+00	2.596	4.991	5.625	6.344	7.904
lowerwhisker	149.0	5.417631	1.094879e+00	2.449	4.706	5.413	6.128	7.780
Logged GDP per capita	149.0	9.432208	1.158601e+00	6.635	8.541	9.569	10.421	11.647
Social support	149.0	0.814745	1.148890e-01	0.463	0.750	0.832	0.905	0.983
Healthy life expectancy	149.0	64.992799	6.762043e+00	48.478	59.802	66.603	69.600	76.953
Freedom to make life choices	149.0	0.791597	1.133318e-01	0.382	0.718	0.804	0.877	0.970
Generosity	149.0	-0.015134	1.506567e-01	-0.288	-0.126	-0.036	0.079	0.542
Perceptions of corruption	149.0	0.727450	1.792263e-01	0.082	0.667	0.781	0.845	0.939
Ladder score in Dystopia	149.0	2.430000	5.347044e-15	2.430	2.430	2.430	2.430	2.430
Explained by: Log GDP per capita	149.0	0.977161	4.047399e-01	0.000	0.666	1.025	1.323	1.751
Explained by: Social support	149.0	0.793315	2.588713e-01	0.000	0.647	0.832	0.996	1.172
Explained by: Healthy life expectancy	149.0	0.520161	2.130191e-01	0.000	0.357	0.571	0.665	0.897
Explained by: Freedom to make life choices	149.0	0.498711	1.378884e-01	0.000	0.409	0.514	0.603	0.716
Explained by: Generosity	149.0	0.178047	9.827033e-02	0.000	0.105	0.164	0.239	0.541
Explained by: Perceptions of corruption	149.0	0.135141	1.143614e-01	0.000	0.060	0.101	0.174	0.547
Dystopia + residual	149.0	2.430329	5.376452e-01	0.648	2.138	2.509	2.794	3.482

df.duplicated().sum()

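plt.figure(figsize=[18, 9])
# Restrict to numeric columns: recent pandas versions raise an error if corr() sees string columns
sns.heatmap(df.select_dtypes(include="number").corr(), annot=True, cmap="Blues");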

[Figure: correlation heatmap of the numeric columns]
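Two details are worth handling before computing the correlation matrix: text columns such as the country name cannot be correlated, and a column with a single repeated value (like "Ladder score in Dystopia") has zero variance, so its correlations are undefined. A minimal sketch, assuming a hypothetical helper `numeric_corr` and five rows copied from the `df.head()` output:

```python
import pandas as pd


def numeric_corr(frame):
    """Pearson correlation matrix of the numeric, non-constant columns."""
    numeric = frame.select_dtypes(include="number")
    # Drop columns with a single distinct value; their correlation is undefined.
    varying = numeric.loc[:, numeric.nunique() > 1]
    return varying.corr()


# Five rows copied from the df.head() output; far too few for a meaningful result.
sample = pd.DataFrame({
    "Country name": ["Finland", "Denmark", "Switzerland", "Iceland", "Netherlands"],
    "Ladder score": [7.842, 7.620, 7.571, 7.554, 7.464],
    "Logged GDP per capita": [10.775, 10.933, 11.117, 10.878, 10.932],
    "Healthy life expectancy": [72.0, 72.7, 74.4, 73.0, 72.4],
    "Ladder score in Dystopia": [2.43] * 5,
})

corr = numeric_corr(sample)
print(corr.round(2))
```

With the full dataset, `numeric_corr(df)` could be passed to `sns.heatmap` in place of `df.corr()`.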

Countries with the highest and lowest GDP per capita

# Sort by logged GDP per capita, highest first
gdp = df[['Logged GDP per capita', 'Country name']].sort_values('Logged GDP per capita', ascending=False)
gdp[:10]

	Logged GDP per capita	Country name
7	11.647	Luxembourg
31	11.488	Singapore
14	11.342	Ireland
2	11.117	Switzerland
24	11.085	United Arab Emirates
5	11.053	Norway
18	11.023	United States
76	11.000	Hong Kong S.A.R. of China
1	10.933	Denmark
4	10.932	Netherlands

gdp[-10:]

	Logged GDP per capita	Country name
142	7.477	Haiti
137	7.434	Sierra Leone
134	7.396	Madagascar
127	7.364	Chad
135	7.362	Togo
119	7.288	Liberia
114	7.158	Mozambique
95	7.098	Niger
143	6.958	Malawi
139	6.635	Burundi
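The same top and bottom lists can be produced without sorting the whole frame, and the logged values can be turned back into approximate currency amounts. The sketch below assumes the column holds a natural logarithm, which the dataset listing here does not state; the six rows are copied from the two tables above.

```python
import numpy as np
import pandas as pd

# Rows copied from the top-10 and bottom-10 tables above.
sample = pd.DataFrame({
    "Country name": ["Luxembourg", "Singapore", "Ireland", "Niger", "Malawi", "Burundi"],
    "Logged GDP per capita": [11.647, 11.488, 11.342, 7.098, 6.958, 6.635],
})

# Assumption: natural log, so exp() recovers GDP per capita in the original units.
sample["GDP per capita (approx.)"] = np.exp(sample["Logged GDP per capita"])

top = sample.nlargest(3, "Logged GDP per capita")      # highest first
bottom = sample.nsmallest(3, "Logged GDP per capita")  # lowest first

print(top)
print(bottom)
```

Under that assumption, a gap of about 5 in the logged column corresponds to roughly a 150-fold difference in GDP per capita, which is easy to miss when reading the logged values alone.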

The 10 countries with the highest and lowest happiness scores


# top 10 countries by ladder score
ladder_score = df[['Ladder score',"Standard error of ladder score",'Country name']].sort_values("Ladder score", ascending=False)
ladder_score[:10]

	Ladder score	Standard error of ladder score	Country name
0	7.842	0.032	Finland
1	7.620	0.035	Denmark
2	7.571	0.036	Switzerland
3	7.554	0.059	Iceland
4	7.464	0.027	Netherlands
5	7.392	0.035	Norway
6	7.363	0.036	Sweden
7	7.324	0.037	Luxembourg
8	7.277	0.040	New Zealand
9	7.268	0.036	Austria

# bottom 10 countries by ladder score
ladder_score[-10:]

	Ladder score	Standard error of ladder score	Country name
139	3.775	0.107	Burundi
140	3.658	0.070	Yemen
141	3.623	0.071	Tanzania
142	3.615	0.173	Haiti
143	3.600	0.092	Malawi
144	3.512	0.120	Lesotho
145	3.467	0.074	Botswana
146	3.415	0.068	Rwanda
147	3.145	0.058	Zimbabwe
148	2.523	0.038	Afghanistan
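Because every score comes with an uncertainty estimate, neighbouring ranks are not always clearly separated. A rough sketch of one way to check this, using the `upperwhisker` and `lowerwhisker` values of the five top-ranked countries from the `df.head()` output (the column name `clear_of_next` is only illustrative, and non-overlapping whiskers are a heuristic rather than a formal significance test):

```python
import pandas as pd

# Five top-ranked rows copied from the df.head() output in this article.
sample = pd.DataFrame({
    "Country name": ["Finland", "Denmark", "Switzerland", "Iceland", "Netherlands"],
    "Ladder score": [7.842, 7.620, 7.571, 7.554, 7.464],
    "upperwhisker": [7.904, 7.687, 7.643, 7.670, 7.518],
    "lowerwhisker": [7.780, 7.552, 7.500, 7.438, 7.410],
})

ranked = sample.sort_values("Ladder score", ascending=False).reset_index(drop=True)

# Upper whisker of the country ranked immediately below each row (NaN for the last row).
next_upper = ranked["upperwhisker"].shift(-1)

# True when a country's whole interval lies above the next country's interval.
ranked["clear_of_next"] = ranked["lowerwhisker"] > next_upper

print(ranked[["Country name", "Ladder score", "clear_of_next"]])
```

In this sample only Finland's interval lies entirely above that of the next country; the intervals for ranks two through five overlap.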


Scatter plot of happiness score versus GDP

* https://stackoverflow.com/questions/13872533/plot-different-dataframes-in-the-same-figure

The plot shows a strong positive correlation between the happiness score and logged GDP per capita.

fig1, ax1 = plt.subplots()
# fig1.set_size_inches(10, 10)
df.plot(y='Ladder score', x="Logged GDP per capita", kind="scatter", ax=ax1)
plt.title("Happiness score vs. GDP per capita")

# Optional: label every point with its country name
# for cn, xs, ys in zip(df['Country name'], df['Logged GDP per capita'], df['Ladder score']):
#     plt.text(xs, ys, cn)

Text(0.5, 1.0, 'Happiness score vs. GDP per capita')

[Figure: scatter plot of ladder score versus logged GDP per capita]
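The strength of the relationship can also be expressed as a number. The sketch below computes the Pearson correlation coefficient with `Series.corr` on ten countries whose GDP and ladder score both appear in the tables of this article. Since those tables list only the top and bottom of each ranking, this sample overstates the correlation; run the same line on the full `df` for a representative value.

```python
import pandas as pd

# Countries whose logged GDP and ladder score both appear in the tables of this article.
sample = pd.DataFrame({
    "Country name": [
        "Finland", "Denmark", "Switzerland", "Iceland", "Netherlands",
        "Norway", "Luxembourg", "Haiti", "Malawi", "Burundi",
    ],
    "Logged GDP per capita": [
        10.775, 10.933, 11.117, 10.878, 10.932,
        11.053, 11.647, 7.477, 6.958, 6.635,
    ],
    "Ladder score": [
        7.842, 7.620, 7.571, 7.554, 7.464,
        7.392, 7.324, 3.615, 3.600, 3.775,
    ],
})

# Series.corr defaults to the Pearson correlation coefficient.
r = sample["Logged GDP per capita"].corr(sample["Ladder score"])
print("Pearson r on the sample:", round(r, 3))
```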

Scatter plot of life expectancy versus GDP

The plot shows a strong positive correlation between healthy life expectancy and logged GDP per capita.

df.plot(x="Logged GDP per capita", y="Healthy life expectancy", kind="scatter")
plt.title("Scatter plot of estimated life expectancy vs. GDP per capita")

plt.ylabel('Life expectancy', fontsize=10, fontweight='bold')
plt.xlabel('GDP per capita', fontsize=10, fontweight='bold')

Text(0.5, 0, 'GDP per capita')

[Figure: scatter plot of healthy life expectancy versus logged GDP per capita]
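A fitted straight line is a simple way to summarise the trend in a scatter plot like this one. The sketch below uses `np.polyfit` with degree 1 on the five rows from the `df.head()` output; five high-income countries are far too few to estimate the real slope, so treat the numbers as a demonstration of the call only.

```python
import numpy as np
import pandas as pd

# Five rows copied from the df.head() output in this article.
sample = pd.DataFrame({
    "Country name": ["Finland", "Denmark", "Switzerland", "Iceland", "Netherlands"],
    "Logged GDP per capita": [10.775, 10.933, 11.117, 10.878, 10.932],
    "Healthy life expectancy": [72.0, 72.7, 74.4, 73.0, 72.4],
})

x = sample["Logged GDP per capita"]
y = sample["Healthy life expectancy"]

# Least-squares fit of y = slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)
fitted = slope * x + intercept

print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
```

To overlay the line on the scatter plot, pass the sorted x values and the matching fitted values to `plt.plot` after calling `df.plot(...)`.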

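# 3D scatter of GDP, life expectancy, and happiness score, colored by GDP.
# add_subplot(projection='3d') replaces gca(projection='3d'), which newer matplotlib versions no longer accept.
threedee = plt.figure(figsize=(16, 12)).add_subplot(projection='3d')
threedee.scatter(df["Logged GDP per capita"], df["Healthy life expectancy"], df['Ladder score'], c=df["Logged GDP per capita"])
threedee.set_xlabel('GDP per capita', fontsize=20)
threedee.set_ylabel('Life expectancy', fontsize=20)
threedee.set_zlabel('Happiness score', fontsize=20)
plt.title("GDP per capita, life expectancy, and happiness score in 3D", fontsize=30)
plt.show()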

[Figure: 3D scatter plot of logged GDP per capita, healthy life expectancy, and ladder score]
