import
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
train_test_split(X, y, test_size, random_state)
seed = 123
#X <-1)
#y <-2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=seed)
Exclude numeric data (select the non-numeric columns)
col_cate=X_train.select_dtypes(exclude='number').columns
col_cate
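Replace Yes with 1, No with 0, and any other value with -1
# applymap returns a new DataFrame, so assign the result back
X_train[col_cate] = X_train[col_cate].applymap(lambda x: 1 if x=='Yes' else (0 if x=='No' else -1))
X_test[col_cate] = X_test[col_cate].applymap(lambda x: 1 if x=='Yes' else (0 if x=='No' else -1))
* Apply the same mapping to both X_train and X_test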
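The mapping rule above can also be written as a named function, which is easier to check in isolation than an inline lambda. This is only a minimal sketch: `encode_yes_no` and the sample values are hypothetical names and data made up for illustration, not part of the original notebook.

```python
# Hypothetical helper: a named version of the lambda used above.
def encode_yes_no(value):
    """Return 1 for 'Yes', 0 for 'No', and -1 for anything else."""
    if value == 'Yes':
        return 1
    if value == 'No':
        return 0
    return -1


# Made-up values standing in for one categorical column.
sample = ['Yes', 'No', 'Unknown', None]
encoded = [encode_yes_no(v) for v in sample]
print(encoded)  # [1, 0, -1, -1]
```

The comparison is case-sensitive, so a value such as 'yes' falls into the -1 bucket. A function like this could be passed to applymap in place of the lambda.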
MinMax normalization
scaler_mm = MinMaxScaler()
X_train_scaler=scaler_mm.fit_transform(X_train)
X_test_scaler=scaler_mm.transform(X_test)
https://deepinsight.tistory.com/165
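To see why the scaler is fitted on the training data only and then reused on the test data, here is a rough pure-Python sketch of the min-max formula (x - min) / (max - min). The function names `fit_min_max` and `transform_min_max` and the numbers are assumptions for illustration; this is not how MinMaxScaler is implemented internally.

```python
def fit_min_max(column):
    """Learn the minimum and maximum from the training column only."""
    return min(column), max(column)


def transform_min_max(column, lo, hi):
    """Scale values with previously learned statistics."""
    span = hi - lo
    if span == 0:
        # Simplification for this sketch: a constant column maps to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / span for v in column]


# Made-up values for a single feature.
train_col = [10.0, 20.0, 30.0]
test_col = [15.0, 40.0]

lo, hi = fit_min_max(train_col)                       # statistics come from train only
train_scaled = transform_min_max(train_col, lo, hi)   # [0.0, 0.5, 1.0]
test_scaled = transform_min_max(test_col, lo, hi)     # [0.25, 1.5]
print(train_scaled, test_scaled)
```

The test value 40.0 lies above the training maximum, so it scales to 1.5. Scaled test values are therefore not guaranteed to stay within [0, 1].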
LogisticRegression
model_lr = LogisticRegression(random_state = seed)
model_lr.fit(X = X_train_scaler,
y = y_train)
predict
y_pred=model_lr.predict(X_test_scaler)
f1
# the built-in round works whether f1_score returns a NumPy or a plain Python float
round(f1_score(y_test, y_pred, pos_label='Yes'), 2)
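As a sanity check on what the score measures, the F1 value for a chosen positive label can be computed by hand from true positives, false positives, and false negatives. The sketch below uses a hypothetical helper `f1_for_label` and made-up labels; it is meant only to illustrate the formula F1 = 2 * precision * recall / (precision + recall), not to replace f1_score.

```python
def f1_for_label(y_true, y_pred, pos_label):
    """Compute F1 for one label from paired true/predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    if tp == 0:
        # No true positives: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 2 true positives, 1 false positive, 1 false negative.
toy_true = ['Yes', 'No', 'Yes', 'Yes', 'No']
toy_pred = ['Yes', 'Yes', 'No', 'Yes', 'No']

score = f1_for_label(toy_true, toy_pred, pos_label='Yes')
print(round(score, 2))  # 0.67
```

Here precision and recall are both 2/3, so the F1 score is also 2/3.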
confusion_matrix
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_pred)
classification_report
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))