Python RandomForestRegressor

huiyu 2023. 4. 13. 06:36

1. import RandomForestRegressor

from sklearn.ensemble import RandomForestRegressor

2. Create the model

model = RandomForestRegressor()

3. Train the model: fit

model.fit(x_train, y_train)
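
The post does not show where x_train, y_train, x_test and y_test come from. A minimal sketch of one common way to obtain them, assuming a synthetic dataset from make_regression in place of the real data (the sample count, feature count, split ratio and random_state values are all illustrative assumptions):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression data standing in for the real dataset (assumption).
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)

# Hold out 25% of the rows for testing.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# random_state is fixed only so that the sketch is reproducible.
model = RandomForestRegressor(random_state=0)
model.fit(x_train, y_train)
```

With 200 rows and test_size=0.25, 150 rows go to training and 50 to testing.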

4. Evaluate the model

print(model.score(x_train, y_train))
print(model.score(x_test, y_test))
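
For a regressor, score returns the coefficient of determination (R^2), not accuracy. The sketch below checks that against sklearn.metrics.r2_score, again assuming synthetic data in place of the post's dataset:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data and split (assumptions, not the post's dataset).
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(x_train, y_train)

# score() on a regressor is R^2 of the predictions against the true targets.
train_r2 = model.score(x_train, y_train)
test_r2 = model.score(x_test, y_test)

# The same value computed explicitly from the predictions.
manual_test_r2 = r2_score(y_test, model.predict(x_test))

print(train_r2, test_r2, manual_test_r2)
```

A training score that is much higher than the test score is a typical sign of overfitting, which is why the post prints both.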

5. Predict with the model

y_predict = model.predict(x_test)
print(y_predict[0])
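
predict returns one value per input row. For a single-target random forest regressor, that value is the average of the individual trees' predictions. A small sketch that verifies this, assuming synthetic data and an arbitrary n_estimators=20:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic data and split (assumptions, not the post's dataset).
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=20, random_state=0)
model.fit(x_train, y_train)

# One prediction per test row.
y_predict = model.predict(x_test)

# Average the predictions of the fitted trees stored in estimators_.
tree_mean = np.mean([tree.predict(x_test) for tree in model.estimators_], axis=0)

print(y_predict[0], tree_mean[0])
```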

6. Check feature importances

model.feature_importances_

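-> feature_importances_: shows how much each feature contributed to the node splits in the decision trees. For a regressor this is the reduction in impurity (prediction error) achieved by splits on that feature, averaged over all trees, rather than how well the feature separates classes.
- Returns a normalized ndarray: every value lies between 0 and 1, and the values sum to 1.
- 0 means the feature was never used for a split.
- 1 means that feature alone accounts for all of the impurity reduction.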
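
The normalization can be checked directly. A minimal sketch, assuming synthetic data with four features in place of the post's dataset:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data with four features (an assumption for illustration).
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X, y)

importances = model.feature_importances_

# One importance per feature, each in [0, 1], summing to 1.
print(importances)
print(importances.sum())

# Column position of the most important feature.
top_index = int(np.argmax(importances))
print(top_index)
```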

7. Sort the features by importance in descending order and print the feature with the largest importance.

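import pandas as pd

# Build a Series that maps each feature name to its feature_importances_ value
col_features = ['feature1', 'feature2', 'feature3', 'feature4']
importances = pd.Series(index=col_features, data=model.feature_importances_)

# Sort the importances in descending order
print(importances.sort_values(ascending=False))

# Print the name of the feature with the largest feature_importances_ value
print(importances.idxmax())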
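
When the training data is a pandas DataFrame, the feature names can be taken from its columns instead of being typed by hand. The sketch below assumes synthetic data, and the column names feature1 to feature4 are hypothetical placeholders that mirror the ones used in the post:

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data wrapped in a DataFrame with placeholder column names.
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)
x_train = pd.DataFrame(X, columns=['feature1', 'feature2', 'feature3', 'feature4'])

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(x_train, y)

# Pair each column name with its importance and sort in descending order.
importances = pd.Series(data=model.feature_importances_, index=x_train.columns)
sorted_importances = importances.sort_values(ascending=False)
print(sorted_importances)

# After sorting, the first index label is the most important feature.
top_feature = sorted_importances.index[0]
print(top_feature)
```

Reading the names from x_train.columns keeps the labels in the same order as the columns the model was trained on.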
