Manipulating data frames with built-in functions

2019. 5. 22. 10:56

exam = read.csv("Data/csv_exam.csv")

exam

exam[2, ]  # print only a specific row

# Conditional row extraction

exam$class==1

exam[exam$class==1 , ]

exam[exam$math >= 80 , ]

exam[exam$english >= 70 & exam$class == 2 , ]

exam[exam$english < 90 | exam$science < 50 , ]
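The row filters above all rely on the same mechanism: a comparison such as exam$class == 1 returns one TRUE/FALSE per row, and the rows marked TRUE are kept. A minimal sketch of this, assuming a small made-up data frame (`toy` and its values are invented for illustration and are not the contents of csv_exam.csv):

```r
# Hypothetical stand-in for csv_exam.csv; the values are invented.
toy <- data.frame(
  id      = 1:4,
  class   = c(1, 1, 2, 2),
  math    = c(50, 85, 90, 30),
  english = c(98, 60, 75, 95)
)

# A comparison returns one logical value per row.
is_class1 <- toy$class == 1
is_class1

# A logical vector in the row position keeps only the TRUE rows.
toy[is_class1, ]

# & needs both conditions to hold, | needs at least one.
both   <- toy[toy$math >= 80 & toy$class == 2, ]
either <- toy[toy$math >= 80 | toy$english >= 90, ]
both
either
```

Leaving the column position empty after the comma means "all columns", which is why every call ends in `, ]`.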

# Column extraction

exam[ , 1]

exam[ , "id"]

# Extracting several columns

exam[ , c("class", "math")]  # multiple column names must be combined with c()
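One detail worth checking when extracting columns with brackets: a single column comes back as a plain vector, while several columns come back as a data frame. The sketch below shows this on an invented data frame (`toy` is hypothetical), together with `drop = FALSE`, which keeps a single column as a data frame:

```r
# Hypothetical data; values are invented for illustration.
toy <- data.frame(id = 1:3, class = c(1, 1, 2), math = c(50, 85, 90))

one_col  <- toy[, "math"]                # single column: plain vector
two_cols <- toy[, c("class", "math")]    # several columns: data frame
kept_df  <- toy[, "math", drop = FALSE]  # single column kept as a data frame

class(one_col)
class(two_cols)
class(kept_df)
```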

# Extracting values by row and column (with conditions)

exam[1,3]

exam[5,"math"]

exam[exam$id == 10,"math"]

exam[exam$math >= 80 , c("id", "english")]
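The last call combines a row condition with a column selection. As a point of comparison, base R's `subset()` expresses the same request without repeating the data frame name. This is an alternative to the bracket form used in the post, shown on an invented data frame (`toy` is hypothetical):

```r
# Hypothetical data; values are invented for illustration.
toy <- data.frame(
  id      = 1:4,
  math    = c(50, 85, 90, 30),
  english = c(98, 60, 75, 95)
)

# Bracket form: row condition before the comma, column names after it.
by_bracket <- toy[toy$math >= 80, c("id", "english")]

# subset(): the same rows and columns, using bare column names.
by_subset <- subset(toy, math >= 80, select = c(id, english))

by_bracket
by_subset
```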

# dplyr

library(dplyr)

exam %>%
  filter(math >= 50 & english >= 80) %>%  # use bare column names inside dplyr verbs, not exam$math
  mutate(tot = (math + english + science) / 3) %>%
  group_by(class) %>%
  summarise(myMean = mean(tot))  # mean() can be used for the per-group average

# Equivalent: summarise(avg = sum(tot) / n())

# Built-in (base R) functions

exam$tot = (exam$math + exam$english + exam$science) / 3

# How to read a formula: group by the variable to the right of the tilde (~), then apply the given function to the variable on the left.

aggregate(tot ~ class, data = exam[exam$math >= 50 & exam$english >= 80, ], FUN = mean)
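To make the formula reading concrete, here is a small worked sketch on invented numbers (`toy` is a hypothetical data frame, not the exam data). It also shows `tapply()`, another base R function that should give the same group means, as a cross-check:

```r
# Hypothetical data; values are invented for illustration.
toy <- data.frame(
  class = c(1, 1, 2, 2),
  tot   = c(60, 80, 70, 90)
)

# tot ~ class: group by class (right side), apply mean to tot (left side).
by_class <- aggregate(tot ~ class, data = toy, FUN = mean)
by_class

# tapply() computes the same group means, returned as a named array.
group_means <- tapply(toy$tot, toy$class, mean)
group_means
```

`aggregate()` returns a data frame with one row per group, which is the shape closest to what `group_by()` plus `summarise()` produces.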

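### Keep group (aggregate) functions and element-wise functions clearly apart.

# mpg is assumed here to be the example dataset shipped with ggplot2.
mpg %>%
  filter(class %in% c("compact", "suv")) %>%
  group_by(class) %>%
  summarise(meanTot = mean((cty + hwy) / 2))  # (cty + hwy) / 2 is computed per row; mean() is computed per group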
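The distinction between the two kinds of function can be seen without any package. In this sketch the `cty` and `hwy` vectors are invented values, not rows taken from mpg: arithmetic on vectors returns one result per element, while `mean()` collapses the vector to a single value.

```r
# Hypothetical mileage values; invented for illustration.
cty <- c(18, 21, 20)
hwy <- c(29, 29, 31)

# Element-wise: one result per row.
row_avg <- (cty + hwy) / 2
row_avg
length(row_avg)

# Aggregate: the whole vector collapses to one value.
overall <- mean(row_avg)
overall
length(overall)
```

Inside `summarise()` the outermost function has to be of the second kind, so that each group ends up as a single row.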
