Logistic Regression

huiyu 2023. 4. 14. 06:48

import

from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, confusion_matrix

train_test_split(X, y, test_size, random_state)

seed = 123

# X: feature data (must be defined before this point)
# y: target labels (must be defined before this point)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=seed)
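As a quick check of what the split above produces, here is a minimal sketch on made-up data. The arrays `X_demo` and `y_demo` are hypothetical and only stand in for the real `X` and `y`, which this post does not define.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples, 2 features, binary labels
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.array([0, 1] * 5)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=123
)

# With 10 rows and test_size=0.3, 3 rows go to the test set and 7 to the training set
print(X_tr.shape, X_te.shape)

# Reusing the same random_state reproduces the same split
X_tr2, X_te2, _, _ = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=123
)
print(np.array_equal(X_te, X_te2))
```

Fixing `random_state` is what makes the later scores comparable between runs.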

Select the non-numeric (categorical) columns by excluding numeric dtypes

col_cate=X_train.select_dtypes(exclude='number').columns
col_cate

Replace Yes with 1, No with 0, and any other value with -1

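X_train[col_cate] = X_train[col_cate].applymap(lambda x: 1 if x=='Yes' else (0 if x=='No' else -1))
X_test[col_cate] = X_test[col_cate].applymap(lambda x: 1 if x=='Yes' else (0 if x=='No' else -1))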

* Apply the same mapping to both X_train and X_test
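The Yes/No mapping can be tried out on a small frame first. This is only a sketch: `df_demo`, its column names, and the helper `encode_yes_no` are made up for illustration. It uses `Series.map` column by column instead of `applymap`, since pandas 2.1 deprecates `DataFrame.applymap` in favor of `DataFrame.map`.

```python
import pandas as pd

def encode_yes_no(value):
    """Return 1 for 'Yes', 0 for 'No', and -1 for anything else."""
    if value == 'Yes':
        return 1
    if value == 'No':
        return 0
    return -1

# Hypothetical toy frame; the column names are invented for this example
df_demo = pd.DataFrame({
    'age': [25, 40, 31],
    'smoker': ['Yes', 'No', 'Unknown'],
    'member': ['No', 'Yes', 'Yes'],
})

# Pick the non-numeric columns, as in the step above
cols = df_demo.select_dtypes(exclude='number').columns

# Map each categorical column element-wise and assign the result back
df_demo[cols] = df_demo[cols].apply(lambda col: col.map(encode_yes_no))
print(df_demo)
```

The key point is the assignment back into the frame: calling the mapping without storing the result leaves the original columns unchanged.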

MinMax normalization

scaler_mm = MinMaxScaler()
X_train_scaler=scaler_mm.fit_transform(X_train)
X_test_scaler=scaler_mm.transform(X_test)
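Why `fit_transform` on the training set but only `transform` on the test set: the scaler should learn its min and max from the training data alone and then reuse them. A small sketch with made-up numbers (the arrays `train` and `test` are hypothetical):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

train = np.array([[0.0], [5.0], [10.0]])
test = np.array([[2.5], [12.0]])

scaler = MinMaxScaler()
train_scaled = scaler.fit_transform(train)  # learns min=0 and max=10 from train
test_scaled = scaler.transform(test)        # reuses the min/max learned from train

print(scaler.data_min_, scaler.data_max_)
print(train_scaled.ravel())  # training values land inside [0, 1]
print(test_scaled.ravel())   # 12.0 maps to about 1.2, outside [0, 1]
```

A test value outside the training range ends up outside [0, 1]. That is expected, and it is the price of not leaking test-set statistics into the preprocessing.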

https://deepinsight.tistory.com/165

[scikit-learn] What is the difference between transform() and fit_transform()?


LogisticRegression

model_lr = LogisticRegression(random_state = seed)
model_lr.fit(X = X_train_scaler,
             y = y_train)

predict

y_pred=model_lr.predict(X_test_scaler)
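To see what `fit` and `predict` return when the target holds string labels, here is a minimal sketch on invented data. `X_demo`, `y_demo`, and `clf` are hypothetical names, and `predict_proba` is shown only as an extra that the steps above do not use.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical, already-scaled single feature with string labels
X_demo = np.array([[0.0], [0.1], [0.2], [0.8], [0.9], [1.0]])
y_demo = np.array(['No', 'No', 'No', 'Yes', 'Yes', 'Yes'])

clf = LogisticRegression(random_state=123)
clf.fit(X_demo, y_demo)

X_new = np.array([[0.0], [1.0]])
labels = clf.predict(X_new)       # hard class labels, same type as y_demo
probs = clf.predict_proba(X_new)  # one column per class, ordered as clf.classes_

print(clf.classes_)
print(labels)
print(probs.round(3))
```

Because the labels are the strings 'Yes' and 'No', the predictions are strings too, which is why the F1 score needs an explicit `pos_label`.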

round(f1_score(y_test, y_pred, pos_label='Yes'), 2)
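The F1 score for the 'Yes' class can be checked by hand on a tiny example. The label lists below are made up. With 2 true positives, 1 false positive, and 1 false negative, precision and recall are both 2/3, so F1 is also 2/3.

```python
from sklearn.metrics import f1_score

# Hypothetical labels, chosen so the counts are easy to verify by hand
y_true = ['Yes', 'No', 'Yes', 'Yes', 'No']
y_hat = ['Yes', 'No', 'No', 'Yes', 'Yes']

# Counts for the positive class 'Yes'
tp = sum(t == 'Yes' and p == 'Yes' for t, p in zip(y_true, y_hat))  # 2
fp = sum(t == 'No' and p == 'Yes' for t, p in zip(y_true, y_hat))   # 1
fn = sum(t == 'Yes' and p == 'No' for t, p in zip(y_true, y_hat))   # 1

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1_manual = 2 * precision * recall / (precision + recall)

# pos_label tells scikit-learn which string counts as the positive class
f1_sklearn = f1_score(y_true, y_hat, pos_label='Yes')

print(round(f1_manual, 2), round(f1_sklearn, 2))  # 0.67 0.67
```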

confusion_matrix

confusion_matrix(y_test, y_pred)
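Reading the confusion matrix is easier with the label order spelled out. In this sketch the label lists are invented, and `labels=['No', 'Yes']` is passed explicitly so that rows are the true classes and columns the predicted classes, in that order.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration
y_true = ['Yes', 'No', 'Yes', 'Yes', 'No']
y_hat = ['Yes', 'No', 'No', 'Yes', 'Yes']

# Rows: true class, columns: predicted class, both ordered as ['No', 'Yes']
cm = confusion_matrix(y_true, y_hat, labels=['No', 'Yes'])
print(cm)

# For a binary problem the flattened matrix unpacks in this order
tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)  # 1 1 1 2
```

Without the `labels` argument, scikit-learn sorts the labels it finds, which for 'No' and 'Yes' gives the same order.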

classification_report

from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
