Regression (Personal Project)
Project Regression
Regression analysis
Install and load the carData package, then list the datasets it contains.
# require() already loads the package when it is installed,
# so only the "not installed" branch needs to do anything
if (!require(carData)) {
  install.packages("carData")
  library(carData)
}
## Loading required package: carData
# car provides vif(), which is used later for the multicollinearity check
if (!require(car)) {
  install.packages("car")
  library(car)
}
## Loading required package: car
data(package="carData")
Load the Prestige dataset from the carData package and check its codebook.
data("Prestige")
help("Prestige")
## starting httpd help server ... done
Take a quick look at the Prestige dataset, inspect its structure, and compute simple descriptive statistics.
head(Prestige)
##                     education income women prestige census type
## gov.administrators      13.11  12351 11.16     68.8   1113 prof
## general.managers        12.26  25879  4.02     69.1   1130 prof
## accountants             12.77   9271 15.70     63.4   1171 prof
## purchasing.officers     11.42   8865  9.11     56.8   1175 prof
## chemists                14.62   8403 11.68     73.5   2111 prof
## physicists              15.64  11030  5.13     77.6   2113 prof
str(Prestige)
## 'data.frame': 102 obs. of 6 variables:
## $ education: num 13.1 12.3 12.8 11.4 14.6 ...
## $ income : int 12351 25879 9271 8865 8403 11030 8258 14163 11377 11023 ...
## $ women : num 11.16 4.02 15.7 9.11 11.68 ...
## $ prestige : num 68.8 69.1 63.4 56.8 73.5 77.6 72.6 78.1 73.1 68.8 ...
## $ census : int 1113 1130 1171 1175 2111 2113 2133 2141 2143 2153 ...
## $ type : Factor w/ 3 levels "bc","prof","wc": 2 2 2 2 2 2 2 2 2 2 ...
summary(Prestige)
##    education          income          women           prestige
##  Min.   : 6.380   Min.   :  611   Min.   : 0.000   Min.   :14.80
##  1st Qu.: 8.445   1st Qu.: 4106   1st Qu.: 3.592   1st Qu.:35.23
##  Median :10.540   Median : 5930   Median :13.600   Median :43.60
##  Mean   :10.738   Mean   : 6798   Mean   :28.979   Mean   :46.83
##  3rd Qu.:12.648   3rd Qu.: 8187   3rd Qu.:52.203   3rd Qu.:59.27
##  Max.   :15.970   Max.   :25879   Max.   :97.510   Max.   :87.20
##      census       type
##  Min.   :1113   bc  :44
##  1st Qu.:3120   prof:31
##  Median :5135   wc  :23
##  Mean   :5402   NA's: 4
##  3rd Qu.:8312
##  Max.   :9517
raw<-na.omit(Prestige)
summary(raw)
##    education          income          women           prestige
##  Min.   : 6.380   Min.   : 1656   Min.   : 0.000   Min.   :17.30
##  1st Qu.: 8.445   1st Qu.: 4250   1st Qu.: 3.268   1st Qu.:35.38
##  Median :10.605   Median : 6036   Median :14.475   Median :43.60
##  Mean   :10.795   Mean   : 6939   Mean   :28.986   Mean   :47.33
##  3rd Qu.:12.755   3rd Qu.: 8226   3rd Qu.:52.203   3rd Qu.:59.90
##  Max.   :15.970   Max.   :25879   Max.   :97.510   Max.   :87.20
##      census       type
##  Min.   :1113   bc  :44
##  1st Qu.:3116   prof:31
##  Median :5132   wc  :23
##  Mean   :5400
##  3rd Qu.:8328
##  Max.   :9517
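The na.omit() call above drops every row that has an NA in any column. In Prestige that is the 4 occupations with a missing type, which takes the data from 102 to 98 rows. A minimal sketch of the same behaviour on a toy data frame (the toy values are made up for illustration and are not from Prestige):

```r
# Toy data frame (hypothetical values) with one missing factor level
toy <- data.frame(
  income = c(100, 200, 300, 400),
  type   = factor(c("bc", NA, "wc", "prof"))
)

# TRUE for rows that have no NA in any column
complete <- complete.cases(toy)

# na.omit() keeps exactly those rows
cleaned <- na.omit(toy)

nrow(toy)        # rows before
nrow(cleaned)    # rows after
sum(!complete)   # rows dropped
```

Because the rows are removed before modelling, every later result describes only the occupations that have a recorded type.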
Fit a multiple regression model.
fit <- lm(income ~ education + women + prestige + census + type, data = raw)
summary(fit)
##
## Call:
## lm(formula = income ~ education + women + prestige + census +
## type, data = raw)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7752.4  -954.6  -331.2   742.6 14301.3
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)    7.32053 3037.27048   0.002  0.99808
## education    131.18372  288.74961   0.454  0.65068
## women        -53.23480    9.83107  -5.415 4.96e-07 ***
## prestige     139.20912   36.40239   3.824  0.00024 ***
## census         0.04209    0.23568   0.179  0.85865
## typeprof     509.15150 1798.87914   0.283  0.77779
## typewc       347.99010 1173.89384   0.296  0.76757
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2633 on 91 degrees of freedom
## Multiple R-squared: 0.6363, Adjusted R-squared: 0.6123
## F-statistic: 26.54 on 6 and 91 DF, p-value: < 2.2e-16
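The last three lines of the summary are tied together by simple formulas. As a rough check, the adjusted R-squared and the F-statistic can be recomputed from the printed R-squared and degrees of freedom. The sketch below uses only the rounded numbers copied from the output above, so small rounding differences are expected:

```r
# Values copied from the summary(fit) printout
r2       <- 0.6363   # Multiple R-squared
df_resid <- 91       # residual degrees of freedom
p        <- 6        # number of slope coefficients (numerator DF of the F test)
n        <- df_resid + p + 1   # rows used in the fit

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid

# Overall F-statistic: explained variance per predictor over residual variance
f_stat <- (r2 / p) / ((1 - r2) / df_resid)

n
round(adj_r2, 4)
round(f_stat, 2)
```

Only women and prestige have small p-values here, which is what motivates the reduced model in the next step.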
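Run an additional multiple regression that excludes the non-significant variables.
# Note: fit is already the full model and no scope is given, so forward
# selection has nothing to add and returns fit unchanged (the output is
# identical to summary(fit)).
fit.fw <- step(fit, direction = "forward", trace = FALSE)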
summary(fit.fw)
##
## Call:
## lm(formula = income ~ education + women + prestige + census +
## type, data = raw)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7752.4  -954.6  -331.2   742.6 14301.3
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)    7.32053 3037.27048   0.002  0.99808
## education    131.18372  288.74961   0.454  0.65068
## women        -53.23480    9.83107  -5.415 4.96e-07 ***
## prestige     139.20912   36.40239   3.824  0.00024 ***
## census         0.04209    0.23568   0.179  0.85865
## typeprof     509.15150 1798.87914   0.283  0.77779
## typewc       347.99010 1173.89384   0.296  0.76757
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2633 on 91 degrees of freedom
## Multiple R-squared: 0.6363, Adjusted R-squared: 0.6123
## F-statistic: 26.54 on 6 and 91 DF, p-value: < 2.2e-16
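The forward result matches the full model because step() can only add terms that are in its scope, and starting from the full model there is nothing left to add. Forward selection is normally started from an intercept-only model with the full formula as the upper scope. A minimal sketch of that setup, assuming the built-in mtcars data as a stand-in so the block runs without carData (the variable choice is illustrative and is not the post's analysis):

```r
# Stand-in data: built-in mtcars, NOT the Prestige data
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
null <- lm(mpg ~ 1, data = mtcars)

# Forward from the full model with no scope: nothing can be added
fw_from_full <- step(full, direction = "forward", trace = FALSE)

# Forward from the intercept-only model, allowed to grow up to the full formula
fw_from_null <- step(
  null,
  scope     = list(lower = formula(null), upper = formula(full)),
  direction = "forward",
  trace     = FALSE
)

formula(fw_from_full)   # same terms as the full model
formula(fw_from_null)   # terms picked one at a time by AIC
```

Applied to the post's data, the same pattern would start from lm(income ~ 1, data = raw) with formula(fit) as the upper scope. The result is not shown here because it was not run in the original post.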
fit.bw<-step(fit,direction ="backward",trace = FALSE )
summary(fit.bw)
##
## Call:
## lm(formula = income ~ women + prestige, data = raw)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7678.4 -1050.9  -310.1   839.6 14114.3
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   653.17     838.09   0.779    0.438
## women         -50.50       8.42  -5.997 3.61e-08 ***
## prestige      163.74      15.46  10.593  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2587 on 95 degrees of freedom
## Multiple R-squared: 0.6334, Adjusted R-squared: 0.6257
## F-statistic: 82.08 on 2 and 95 DF, p-value: < 2.2e-16
fit.cor<-lm(income~women+prestige,data=raw)
summary(fit.cor)
##
## Call:
## lm(formula = income ~ women + prestige, data = raw)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -7678.4 -1050.9  -310.1   839.6 14114.3
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   653.17     838.09   0.779    0.438
## women         -50.50       8.42  -5.997 3.61e-08 ***
## prestige      163.74      15.46  10.593  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2587 on 95 degrees of freedom
## Multiple R-squared: 0.6334, Adjusted R-squared: 0.6257
## F-statistic: 82.08 on 2 and 95 DF, p-value: < 2.2e-16
Display all six regression diagnostic plots together on a multi-panel canvas.
par(mfrow=c(2,3))
plot(fit.cor,which=c(1:6))
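The six panels requested with which = 1:6 are, in order: residuals vs fitted, normal Q-Q, scale-location, Cook's distance, residuals vs leverage, and Cook's distance vs leverage. The quantities behind them can also be pulled out directly, which helps when a point in a plot needs to be identified. A small sketch, again assuming mtcars as stand-in data rather than the post's fit.cor:

```r
# Stand-in model on built-in data (not the post's fit.cor)
m <- lm(mpg ~ wt + hp, data = mtcars)

fitted_vals <- fitted(m)          # x-axis of panels 1 and 3
raw_resid   <- residuals(m)       # y-axis of panel 1
std_resid   <- rstandard(m)       # panels 2, 3 and 5
leverage    <- hatvalues(m)       # x-axis of panels 5 and 6
cooks       <- cooks.distance(m)  # panels 4 and 6

# The three observations with the largest Cook's distance
head(sort(cooks, decreasing = TRUE), 3)
```

Leverage values always sum to the number of estimated coefficients, which gives a quick sanity check on the extraction.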
Diagnose multicollinearity.
vif(fit.cor)
## women prestige
## 1.01228 1.01228
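For numeric predictors, the VIF of a variable is 1 / (1 - R²), where R² comes from regressing that variable on the other predictors. With only two predictors, as here, both VIFs are equal, and 1.01228 corresponds to an R² of about 0.012 between women and prestige, so collinearity is negligible. The sketch below computes VIFs by hand in base R, assuming mtcars as stand-in data with three numeric predictors (not the post's model):

```r
# Stand-in model on built-in data (not the post's fit.cor)
m <- lm(mpg ~ wt + hp + qsec, data = mtcars)
X <- model.matrix(m)[, -1]   # predictor columns without the intercept

# VIF by definition: regress each predictor on the remaining ones
vif_manual <- sapply(colnames(X), function(v) {
  others <- setdiff(colnames(X), v)
  r2 <- summary(lm(X[, v] ~ X[, others]))$r.squared
  1 / (1 - r2)
})

# Equivalent shortcut: diagonal of the inverse correlation matrix
vif_cor <- diag(solve(cor(X)))

round(vif_manual, 3)
round(vif_cor, 3)
```

Common rules of thumb treat values above roughly 5 or 10 as a sign of problematic collinearity; a value near 1 means the predictor is almost uncorrelated with the others.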
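Use the fitted model to predict a new value (average years of education 12, occupational prestige score 33, percentage of women in the occupation 22).
# education is not a predictor in fit.cor, so predict() ignores that column
new <- data.frame(education = 12, prestige = 33, women = 22)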
predict(fit.cor,new,interval = "none")
## 1
## 4945.629
predict(fit.cor,new,interval = "confidence")
## fit lwr upr
## 1 4945.629 4247.515 5643.743
predict(fit.cor,new,interval = "prediction")
## fit lwr upr
## 1 4945.629 -236.9247 10128.18
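The three calls return the same point estimate and differ only in the interval. The confidence interval covers the mean income of occupations with these values. The prediction interval covers a single new occupation, so it also includes the residual variance and is much wider. As a rough check, both can be reproduced from the numbers already printed above. The sketch uses rounded values copied from the output, so it matches only approximately:

```r
# Rounded coefficients copied from summary(fit.cor)
b0         <- 653.17
b_women    <- -50.50
b_prestige <- 163.74

# Point prediction for women = 22, prestige = 33
pred <- b0 + b_women * 22 + b_prestige * 33

# Values copied from the confidence-interval output and the model summary
fit_val <- 4945.629
ci_lwr  <- 4247.515
ci_upr  <- 5643.743
sigma   <- 2587      # residual standard error
df      <- 95        # residual degrees of freedom

t_crit  <- qt(0.975, df)
se_fit  <- (ci_upr - ci_lwr) / 2 / t_crit   # standard error of the fitted mean
se_pred <- sqrt(se_fit^2 + sigma^2)         # adds the residual variance

pred_int <- fit_val + c(-1, 1) * t_crit * se_pred

pred       # close to the printed fit of 4945.629
pred_int   # close to the printed prediction interval
```

The negative lower bound of the prediction interval is a consequence of the linear model having no floor at zero; it is not a meaningful income.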