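This is a bank dataset from Kaggle. By studying it we can predict whether a client will subscribe to a term deposit (the target y). The dataset contains 20 features.

1. Analysis framework

2. Loading and cleaning the data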

# Import the required packages
import numpy as np
import pandas as pd
# Load the data
data = pd.read_csv('./1bank-additional-full.csv')
# Check the number of rows and columns
data.shape

Output:

Only the nr.employed column has missing data. Take a look at it:

data['nr.employed'].value_counts()

The only value present is 5191.0, and only 7763 rows have it, so the column is treated as anomalous and dropped outright.

# Drop the nr.employed column
data.drop('nr.employed', axis=1, inplace=True)
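
One quick way to confirm which columns contain missing values is to count nulls per column. The sketch below uses a tiny made-up frame (the name `toy` and its values are illustrative assumptions, not rows from the real file) and then drops every column that has gaps.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the bank data; the values are made up for illustration.
toy = pd.DataFrame({
    'age': [30, 41, 52, 28],
    'nr.employed': [5191.0, np.nan, np.nan, 5191.0],
})

# Count missing values per column and keep only the columns that have any.
missing = toy.isnull().sum()
missing = missing[missing > 0]
print(missing)

# Drop every column that contains missing values.
cleaned = toy.drop(columns=missing.index)
print(cleaned.columns.tolist())
```

On the real data, the same `isnull().sum()` call on `data` would show whether nr.employed is really the only incomplete column before deciding to drop it.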

3. Exploratory data analysis

3.1 Distribution of clients by age

The bank's clients are concentrated mainly in the 23-60 age range, and the 29-39 band has more clients than any other.

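import matplotlib.pyplot as plt
import seaborn as sns
# SimHei is a CJK-capable font; it only matters if labels contain Chinese characters
plt.rcParams['font.sans-serif'] = 'SimHei'
plt.figure(figsize=(20, 8), dpi=256)
sns.countplot(x='age', data=data)
plt.title("Number of clients by age")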
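
To put numbers on the age bands rather than reading them off the chart, the ages can be binned with `pd.cut`. This is a minimal sketch on made-up ages; the bin edges and labels are assumptions chosen to match the bands mentioned in the text, not something taken from the original analysis.

```python
import pandas as pd

# Made-up ages for illustration only.
ages = pd.Series([19, 25, 31, 35, 38, 44, 59, 63, 72])

# Right-closed bins: (0, 22], (22, 28], (28, 39], (39, 60], (60, 120]
bins = [0, 22, 28, 39, 60, 120]
labels = ['<=22', '23-28', '29-39', '40-60', '>60']
age_band = pd.cut(ages, bins=bins, labels=labels)

# Count clients per band, keeping the bands in their natural order.
counts = age_band.value_counts().sort_index()
print(counts)
```

Applied to the real data, `pd.cut(data['age'], ...)` would give the per-band counts behind the claim that the 29-39 band is the largest.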

3.2 Distributions of some other features

plt.figure(figsize=(18, 16), dpi=512)
plt.subplot(221)
sns.countplot(x='contact', data=data)
plt.title("Distribution of contact")

plt.subplot(222)
sns.countplot(x='day_of_week', data=data)
plt.title("Distribution of day_of_week")

plt.subplot(223)
sns.countplot(x='default', data=data)
plt.title("Distribution of default")

plt.subplot(224)
sns.countplot(x='education', data=data)
plt.xticks(rotation=70)
plt.title("Distribution of education")

plt.savefig('./1.png')

plt.figure(figsize=(18, 16), dpi=512)
plt.subplot(221)
sns.countplot(x='housing', data=data)
plt.title("Distribution of housing")

plt.subplot(222)
sns.countplot(x='job', data=data)
plt.xticks(rotation=70)
plt.title("Distribution of job")

plt.subplot(223)
sns.countplot(x='loan', data=data)
plt.title("Distribution of loan")

plt.subplot(224)
sns.countplot(x='marital', data=data)
plt.xticks(rotation=70)
plt.title("Distribution of marital")

plt.savefig('./2.png')

plt.figure(figsize=(18, 8), dpi=512)
plt.subplot(221)
sns.countplot(x='month', data=data)
plt.xticks(rotation=30)

plt.subplot(222)
sns.countplot(x='poutcome', data=data)
plt.xticks(rotation=30)
plt.savefig('./3.png')

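3.3 Correlation between features

plt.figure(figsize=(10, 8), dpi=256)
plt.rcParams['axes.unicode_minus'] = False
# Correlation is only defined for numeric columns (numeric_only needs pandas >= 1.5)
sns.heatmap(data.corr(numeric_only=True), annot=True)
plt.savefig('./4.png')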

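4. Feature normalization

4.1 Converting the categorical feature values to integer labels

# Encode the categorical features
from sklearn.preprocessing import LabelEncoder
features = ['contact', 'day_of_week', 'default', 'education', 'housing',
            'job', 'loan', 'marital', 'month', 'poutcome']

le_x = LabelEncoder()
for feature in features:
    # The encoder is refit on each column, so only the last mapping is retained
    data[feature] = le_x.fit_transform(data[feature])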
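
To make the encoding concrete: `LabelEncoder` sorts the distinct values of a column and assigns them the integers 0, 1, 2, and so on. The sketch below, on a made-up frame, keeps one encoder per column in a dictionary (the `encoders` dict is an illustrative assumption, not part of the original code) so that each mapping can still be inspected or reversed later.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up categorical data for illustration only.
toy = pd.DataFrame({
    'marital': ['married', 'single', 'divorced', 'single'],
    'contact': ['cellular', 'telephone', 'cellular', 'cellular'],
})

# Keep a separate fitted encoder per column so the mappings are not lost.
encoders = {}
for col in ['marital', 'contact']:
    enc = LabelEncoder()
    toy[col] = enc.fit_transform(toy[col])
    encoders[col] = enc

print(toy)
# classes_ lists the original values; position i is the value encoded as i.
print(list(encoders['marital'].classes_))
# inverse_transform maps integer codes back to the original strings.
print(list(encoders['marital'].inverse_transform([0, 1, 2])))
```

Note that the resulting integers follow alphabetical order, which carries no real meaning for features such as job or marital, while distance-based models such as SVC and K-nearest neighbors will still treat them as ordinary numbers.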

4.2 Converting the target y to 0 and 1

def parse_y(x):
    # 'no' becomes 0; every other value becomes 1
    if x == 'no':
        return 0
    else:
        return 1
data['y'] = data['y'].apply(parse_y)
data['y'] = data['y'].astype(int)
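
An equivalent and somewhat more compact way to do the same conversion is `Series.map` with an explicit dictionary. This is a sketch on a made-up series, not a replacement the original author used.

```python
import pandas as pd

# Made-up target values for illustration only.
y_raw = pd.Series(['no', 'yes', 'no', 'no', 'yes'])

# Explicit mapping: any label other than 'no'/'yes' would become NaN.
y = y_raw.map({'no': 0, 'yes': 1})
print(y.tolist())

# The mean of a 0/1 target is the share of positive cases.
print(y.mean())
```

One difference in behavior: `parse_y` silently turns every value other than 'no' into 1, whereas the dictionary mapping yields NaN for unexpected labels, which makes dirty data easier to spot.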

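4.3 Standardizing the data

# Standardize the features to zero mean and unit variance
# Split the data into training and test sets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
ss = StandardScaler()
# All columns except the last are features; the last column is the target y
train_x, test_x, train_y, test_y = train_test_split(data.iloc[:, :-1],
                                                    data['y'],
                                                    test_size=0.3)
# Fit the scaler on the training set only, then apply it to the test set
train_x = ss.fit_transform(train_x)
test_x = ss.transform(test_x)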
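
The following sketch illustrates what the split and scaling steps do, using synthetic data instead of the bank file. The `stratify=y` and `random_state=42` arguments are assumptions added here for reproducibility and to keep the class ratio equal in both sets; the original code does not use them.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic features and an imbalanced 0/1 target, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
y = np.array([0] * 180 + [1] * 20)

# stratify=y keeps the share of positives roughly equal in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Fit the scaler on the training set only, then reuse it on the test set,
# so no information from the test set leaks into preprocessing.
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

print(X_train_s.mean(axis=0).round(6), X_train_s.std(axis=0).round(6))
print(y_train.mean(), y_test.mean())
```

After scaling, each training column has mean 0 and standard deviation 1; the test columns are close to that but not exact, because they are scaled with the training statistics.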

5. Model training

5.1 AdaBoost classifier

from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
ada = AdaBoostClassifier()
ada.fit(train_x, train_y)
predict_y = ada.predict(test_x)
print("Accuracy:", accuracy_score(test_y, predict_y))

5.2 SVC classifier

from sklearn.svm import SVC
svc = SVC()
svc.fit(train_x, train_y)
predict_y = svc.predict(test_x)
print("Accuracy:", accuracy_score(test_y, predict_y))

5.3 K-nearest neighbors classifier

from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier()
knn.fit(train_x, train_y)
predict_y = knn.predict(test_x)
print("Accuracy:", accuracy_score(test_y, predict_y))

5.4 Decision tree classifier

from sklearn.tree import DecisionTreeClassifier
dtc = DecisionTreeClassifier()
dtc.fit(train_x, train_y)
predict_y = dtc.predict(test_x)
print("Accuracy:", accuracy_score(test_y, predict_y))
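
Since the four models are trained and scored in exactly the same way, the repeated steps can be folded into one loop. The sketch below does this on synthetic data from `make_classification`; the data, the `models` dictionary, and the fixed `random_state` values are assumptions for illustration, so the printed accuracies say nothing about the bank dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared features and target.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Same four classifiers as in the article, with default settings.
models = {
    'AdaBoost': AdaBoostClassifier(random_state=0),
    'SVC': SVC(),
    'KNN': KNeighborsClassifier(),
    'DecisionTree': DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training set and score it on the test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Print the models from most to least accurate.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f'{name}: {acc:.3f}')
```

Keeping the models in a dictionary also makes it easy to reuse the fitted objects later, for example when drawing the ROC curves.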

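6. Model evaluation

6.1 AdaBoost classifier

from sklearn.metrics import roc_curve
from sklearn.metrics import auc
plt.figure(figsize=(8, 6))
# ROC curve computed from the predicted class labels
fpr1, tpr1, thresholds1 = roc_curve(test_y, ada.predict(test_x))
plt.stackplot(fpr1, tpr1, colors=['steelblue'], alpha=0.5, edgecolor='black')
plt.plot(fpr1, tpr1, linewidth=2, color='black')
# Diagonal reference line for a random classifier
plt.plot([0, 1], [0, 1], ls='-', color='red')
# Print the area under the curve on the plot
plt.text(0.5, 0.4, auc(fpr1, tpr1))
plt.title('ROC curve of the AdaBoost classifier')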

6.2 SVC classifier

plt.figure(figsize=(8, 6))
fpr2, tpr2, thresholds2 = roc_curve(test_y, svc.predict(test_x))
plt.stackplot(fpr2, tpr2, alpha=0.5)
plt.plot(fpr2, tpr2, linewidth=2, color='black')
plt.plot([0, 1], [0, 1], ls='-', color='red')
plt.text(0.5, 0.4, auc(fpr2, tpr2))
plt.title('ROC curve of the SVC classifier')

6.3 K-nearest neighbors classifier

plt.figure(figsize=(8, 6))
fpr3, tpr3, thresholds3 = roc_curve(test_y, knn.predict(test_x))
plt.stackplot(fpr3, tpr3, alpha=0.5)
plt.plot(fpr3, tpr3, linewidth=2, color='black')
plt.plot([0, 1], [0, 1], ls='-', color='red')
plt.text(0.5, 0.4, auc(fpr3, tpr3))
plt.title('ROC curve of the K-nearest neighbors classifier')

6.4 Decision tree classifier

plt.figure(figsize=(8, 6))
fpr4, tpr4, thresholds4 = roc_curve(test_y, dtc.predict(test_x))
plt.stackplot(fpr4, tpr4, alpha=0.5)
plt.plot(fpr4, tpr4, linewidth=2, color='black')
plt.plot([0, 1], [0, 1], ls='-', color='red')
plt.text(0.5, 0.4, auc(fpr4, tpr4))
plt.title('ROC curve of the decision tree classifier')
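
One caveat about the curves above: `roc_curve` is given the hard 0/1 predictions from `predict`, so each curve has at most three points. Passing continuous scores instead (`predict_proba(...)[:, 1]`, or `decision_function` for a default `SVC`, which has no `predict_proba` unless `probability=True` is set) lets the curve trace every threshold. The sketch below contrasts the two on synthetic data; the data and the `max_depth=4` tree are assumptions for illustration, not the article's setup.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the prepared features and target.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)

# ROC from hard labels: a single operating point plus the two corners.
fpr_l, tpr_l, _ = roc_curve(y_test, clf.predict(X_test))
auc_labels = auc(fpr_l, tpr_l)

# ROC from predicted probabilities of the positive class: one point per threshold.
scores = clf.predict_proba(X_test)[:, 1]
fpr_s, tpr_s, _ = roc_curve(y_test, scores)
auc_scores = auc(fpr_s, tpr_s)

print('points from labels:', len(fpr_l), 'AUC:', round(auc_labels, 3))
print('points from scores:', len(fpr_s), 'AUC:', round(auc_scores, 3))
```

The score-based AUC is the quantity usually reported as ROC AUC; the label-based value only summarizes the single default threshold.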

Reviewing editor: Li Qian


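Original title: A case study in predicting customer behavior with Python algorithms!

Source: WeChat ID DBDevs, WeChat official account 数据分析与开发 (Data Analysis and Development). Please credit the source when reposting.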
