<Reshaping data frames>
We sometimes need to reorganize a data frame into a different shape before we can apply a function or a statistical calculation to it.
<Stack>
> dat1=data.frame(A=runif(20,0,0.7),B=runif(20,0,0.9),C=runif(20,0.8))
> dat1
A B C
1 0.60730714 0.34485044 0.8083856
2 0.43592037 0.68032657 0.8526101
3 0.41984612 0.89238618 0.9468320
4 0.48211414 0.06180126 0.9420313
5 0.62057361 0.44256735 0.9672691
6 0.65831211 0.25447632 0.8672970
7 0.42467509 0.05187750 0.8411565
8 0.18902238 0.05686683 0.9973599
9 0.19892914 0.30698761 0.8395065
10 0.01555739 0.55490717 0.9686474
11 0.01591176 0.34292293 0.8488479
12 0.37174231 0.61580689 0.8873868
13 0.10231748 0.60909110 0.9373522
14 0.61326807 0.15069578 0.9506504
15 0.07295151 0.44691793 0.8699553
16 0.40845801 0.41075289 0.9789578
17 0.32814461 0.73066725 0.9236493
18 0.60540334 0.27622376 0.8842698
19 0.01816268 0.71314950 0.9413320
20 0.36442848 0.80881771 0.8699921
> stack(dat1)
values ind
1 0.60730714 A
2 0.43592037 A
3 0.41984612 A
4 0.48211414 A
5 0.62057361 A
6 0.65831211 A
7 0.42467509 A
8 0.18902238 A
9 0.19892914 A
10 0.01555739 A
11 0.01591176 A
12 0.37174231 A
13 0.10231748 A
14 0.61326807 A
15 0.07295151 A
16 0.40845801 A
17 0.32814461 A
18 0.60540334 A
19 0.01816268 A
20 0.36442848 A
21 0.34485044 B
22 0.68032657 B
23 0.89238618 B
24 0.06180126 B
25 0.44256735 B
26 0.25447632 B
27 0.05187750 B
28 0.05686683 B
29 0.30698761 B
30 0.55490717 B
31 0.34292293 B
32 0.61580689 B
33 0.60909110 B
34 0.15069578 B
35 0.44691793 B
36 0.41075289 B
37 0.73066725 B
38 0.27622376 B
39 0.71314950 B
40 0.80881771 B
41 0.80838561 C
42 0.85261014 C
43 0.94683201 C
44 0.94203126 C
45 0.96726905 C
46 0.86729699 C
47 0.84115650 C
48 0.99735987 C
49 0.83950651 C
50 0.96864743 C
51 0.84884789 C
52 0.88738682 C
53 0.93735217 C
54 0.95065040 C
55 0.86995530 C
56 0.97895783 C
57 0.92364927 C
58 0.88426984 C
59 0.94133196 C
60 0.86999215 C
> stack(dat1,select=c("A","C"))
values ind
1 0.60730714 A
2 0.43592037 A
3 0.41984612 A
4 0.48211414 A
5 0.62057361 A
6 0.65831211 A
7 0.42467509 A
8 0.18902238 A
9 0.19892914 A
10 0.01555739 A
11 0.01591176 A
12 0.37174231 A
13 0.10231748 A
14 0.61326807 A
15 0.07295151 A
16 0.40845801 A
17 0.32814461 A
18 0.60540334 A
19 0.01816268 A
20 0.36442848 A
21 0.80838561 C
22 0.85261014 C
23 0.94683201 C
24 0.94203126 C
25 0.96726905 C
26 0.86729699 C
27 0.84115650 C
28 0.99735987 C
29 0.83950651 C
30 0.96864743 C
31 0.84884789 C
32 0.88738682 C
33 0.93735217 C
34 0.95065040 C
35 0.86995530 C
36 0.97895783 C
37 0.92364927 C
38 0.88426984 C
39 0.94133196 C
40 0.86999215 C
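The two stack() calls can be tried on a smaller frame. This is only a minimal sketch: the data frame `small`, the seed, and the result names `long_all` and `long_ac` are assumptions made for illustration and are not part of the original example.

```r
# Hypothetical 3-row data frame; set.seed() makes the random draws reproducible.
set.seed(1)
small <- data.frame(A = runif(3, 0, 0.7), B = runif(3, 0, 0.9), C = runif(3, 0.8))

long_all <- stack(small)                        # stacks all three columns
long_ac  <- stack(small, select = c("A", "C"))  # stacks only columns A and C

dim(long_all)        # one row per original cell, two columns (values, ind)
levels(long_ac$ind)  # ind records which column each value came from
```

The `ind` column is a factor, so it can be used directly as a grouping variable in later calculations.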
<Unstack>
> dat2=stack(dat1)
> cbind(dat2, d=runif(20,0,0.3))
values ind d
1 0.60730714 A 0.026263619
2 0.43592037 A 0.248322658
3 0.41984612 A 0.219391077
4 0.48211414 A 0.191770833
5 0.62057361 A 0.280993604
6 0.65831211 A 0.079237235
7 0.42467509 A 0.241292099
8 0.18902238 A 0.120036501
9 0.19892914 A 0.009090804
10 0.01555739 A 0.292907454
11 0.01591176 A 0.008498633
12 0.37174231 A 0.238023847
13 0.10231748 A 0.134750554
14 0.61326807 A 0.067378258
15 0.07295151 A 0.264869956
16 0.40845801 A 0.149131450
17 0.32814461 A 0.071351884
18 0.60540334 A 0.162132894
19 0.01816268 A 0.004526606
20 0.36442848 A 0.226815647
21 0.34485044 B 0.026263619
22 0.68032657 B 0.248322658
23 0.89238618 B 0.219391077
24 0.06180126 B 0.191770833
25 0.44256735 B 0.280993604
26 0.25447632 B 0.079237235
27 0.05187750 B 0.241292099
28 0.05686683 B 0.120036501
29 0.30698761 B 0.009090804
30 0.55490717 B 0.292907454
31 0.34292293 B 0.008498633
32 0.61580689 B 0.238023847
33 0.60909110 B 0.134750554
34 0.15069578 B 0.067378258
35 0.44691793 B 0.264869956
36 0.41075289 B 0.149131450
37 0.73066725 B 0.071351884
38 0.27622376 B 0.162132894
39 0.71314950 B 0.004526606
40 0.80881771 B 0.226815647
41 0.80838561 C 0.026263619
42 0.85261014 C 0.248322658
43 0.94683201 C 0.219391077
44 0.94203126 C 0.191770833
45 0.96726905 C 0.280993604
46 0.86729699 C 0.079237235
47 0.84115650 C 0.241292099
48 0.99735987 C 0.120036501
49 0.83950651 C 0.009090804
50 0.96864743 C 0.292907454
51 0.84884789 C 0.008498633
52 0.88738682 C 0.238023847
53 0.93735217 C 0.134750554
54 0.95065040 C 0.067378258
55 0.86995530 C 0.264869956
56 0.97895783 C 0.149131450
57 0.92364927 C 0.071351884
58 0.88426984 C 0.162132894
59 0.94133196 C 0.004526606
60 0.86999215 C 0.226815647
> unstack(cbind(dat2,d=runif(20,0,0.3)),values~ind)
A B C
1 0.60730714 0.34485044 0.8083856
2 0.43592037 0.68032657 0.8526101
3 0.41984612 0.89238618 0.9468320
4 0.48211414 0.06180126 0.9420313
5 0.62057361 0.44256735 0.9672691
6 0.65831211 0.25447632 0.8672970
7 0.42467509 0.05187750 0.8411565
8 0.18902238 0.05686683 0.9973599
9 0.19892914 0.30698761 0.8395065
10 0.01555739 0.55490717 0.9686474
11 0.01591176 0.34292293 0.8488479
12 0.37174231 0.61580689 0.8873868
13 0.10231748 0.60909110 0.9373522
14 0.61326807 0.15069578 0.9506504
15 0.07295151 0.44691793 0.8699553
16 0.40845801 0.41075289 0.9789578
17 0.32814461 0.73066725 0.9236493
18 0.60540334 0.27622376 0.8842698
19 0.01816268 0.71314950 0.9413320
20 0.36442848 0.80881771 0.8699921
> unstack(cbind(dat2, d=runif(20,0,0.3)),d~ind)
A B C
1 0.083997591 0.083997591 0.083997591
2 0.089283519 0.089283519 0.089283519
3 0.284200176 0.284200176 0.284200176
4 0.152690073 0.152690073 0.152690073
5 0.105063938 0.105063938 0.105063938
6 0.008653101 0.008653101 0.008653101
7 0.143053737 0.143053737 0.143053737
8 0.131534291 0.131534291 0.131534291
9 0.177895177 0.177895177 0.177895177
10 0.033823705 0.033823705 0.033823705
11 0.163880339 0.163880339 0.163880339
12 0.119386484 0.119386484 0.119386484
13 0.290694124 0.290694124 0.290694124
14 0.054644807 0.054644807 0.054644807
15 0.248584340 0.248584340 0.248584340
16 0.181152571 0.181152571 0.181152571
17 0.054012267 0.054012267 0.054012267
18 0.074821208 0.074821208 0.074821208
19 0.119740172 0.119740172 0.119740172
20 0.109013125 0.109013125 0.109013125
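The sketch below suggests why unstacking by `values` gives back the original columns, while unstacking by `d` gives three identical columns. The added vector is shorter than the stacked data frame, so cbind() recycles it once per group. The names `small`, `stacked`, `restored`, `with_d`, and `by_d`, and the fixed values 10, 20, 30, are assumptions for illustration.

```r
# Hypothetical 3-row data frame, stacked into 9 rows.
set.seed(1)
small   <- data.frame(A = runif(3), B = runif(3), C = runif(3))
stacked <- stack(small)

# values ~ ind splits the values column back into one column per level of ind.
restored <- unstack(stacked, values ~ ind)
names(restored)                     # column names come from the levels of ind
all.equal(restored$A, small$A)      # column A survives the round trip

# A length-3 vector is recycled three times over the 9 rows,
# so every group (A, B, C) receives the same three values of d.
with_d <- cbind(stacked, d = c(10, 20, 30))
by_d   <- unstack(with_d, d ~ ind)
by_d
```

Note that every runif() call in the transcript above draws fresh numbers, which is why the `d` values differ between the cbind() output and the last unstack() output.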
<Reshape>
The reshape function converts a data frame from wide format to long format and vice versa.
> ?reshape
This function reshapes a data frame between 'wide' format with repeated measurements in separate columns of the same record and 'long' format with the repeated measurements in separate records.
data: data frame
varying: names of sets of variables in the wide format that correspond to single variables in long format.
v.names: names of variables in the long format that correspond to multiple variables in the wide format.
timevar: the variable in long format that differentiates multiple records from the same group or individual.
idvar: names of one or more variables in long format that identify multiple records from the same group/individual.
> wide=data.frame(name=c("hyojung","coldrice","miscent"),mid1=c(80,90,30),mid2=c(50,20,70),mid3=c(70,20,90),class=c(1,2,1))
> wide
name mid1 mid2 mid3 class
1 hyojung 80 50 70 1
2 coldrice 90 20 20 2
3 miscent 30 70 90 1
> long=reshape(data=wide, varying=list(c("mid1","mid2","mid3")),v.names="score", timevar="mid", idvar="name",direction="long")
> long
name class mid score
hyojung.1 hyojung 1 1 80
coldrice.1 coldrice 2 1 90
miscent.1 miscent 1 1 30
hyojung.2 hyojung 1 2 50
coldrice.2 coldrice 2 2 20
miscent.2 miscent 1 2 70
hyojung.3 hyojung 1 3 70
coldrice.3 coldrice 2 3 20
miscent.3 miscent 1 3 90
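As an alternative sketch, reshape() can also guess the long-format variable name and the time values from the column names when `varying` is given as a plain character vector. With `sep=""` the name "mid1" is split into the variable "mid" and the time value 1. The `wide` data frame is re-created here so that the block runs on its own, and the name `long_guess` is an assumption for illustration.

```r
wide <- data.frame(name  = c("hyojung", "coldrice", "miscent"),
                   mid1  = c(80, 90, 30),
                   mid2  = c(50, 20, 70),
                   mid3  = c(70, 20, 90),
                   class = c(1, 2, 1))

# No v.names or timevar given: reshape() derives them from "mid1".."mid3".
long_guess <- reshape(wide, varying = c("mid1", "mid2", "mid3"), sep = "",
                      idvar = "name", direction = "long")

names(long_guess)  # the scores land in a column "mid", the occasion in "time"
long_guess
```

Naming `v.names` and `timevar` explicitly, as in the call above, gives more control over the resulting column names. The guessing form is shorter when the wide column names already follow a regular pattern.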
> wide2=reshape(data=long,varying=list(c("mid1","mid2","mid3")),v.names="score",timevar="mid",idvar="name",direction="wide")
> wide2
name class mid1 mid2 mid3
hyojung.1 hyojung 1 80 50 70
coldrice.1 coldrice 2 90 20 20
miscent.1 miscent 1 30 70 90
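A quick way to check the round trip is to compare `wide2` with the original `wide` column by column. The sketch below repeats the calls from this section so that it is self-contained. The comparison step is an addition and not part of the original post.

```r
wide <- data.frame(name  = c("hyojung", "coldrice", "miscent"),
                   mid1  = c(80, 90, 30),
                   mid2  = c(50, 20, 70),
                   mid3  = c(70, 20, 90),
                   class = c(1, 2, 1))

long <- reshape(data = wide, varying = list(c("mid1", "mid2", "mid3")),
                v.names = "score", timevar = "mid", idvar = "name",
                direction = "long")

wide2 <- reshape(data = long, varying = list(c("mid1", "mid2", "mid3")),
                 v.names = "score", timevar = "mid", idvar = "name",
                 direction = "wide")

# Column order and row names differ from the original, so compare by name.
setdiff(names(wide), names(wide2))   # empty if no column was lost
all.equal(wide2$mid2, wide$mid2)     # scores for the second exam match
```

The only visible differences are the position of the `class` column and the row names, which keep the `name.time` labels from the long format.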