(R) Reorganizing data frames - stack, unstack, and reshape functions

by jangpiano 2020. 11. 7.

<Reshaping data frames> 

We sometimes need to reorganize a data frame into a different layout before passing it to a function or running a statistical calculation.


<Stack>


> dat1=data.frame(A=runif(20,0,0.7),B=runif(20,0,0.9),C=runif(20,0.8))

> dat1

            A          B         C

1  0.60730714 0.34485044 0.8083856

2  0.43592037 0.68032657 0.8526101

3  0.41984612 0.89238618 0.9468320

4  0.48211414 0.06180126 0.9420313

5  0.62057361 0.44256735 0.9672691

6  0.65831211 0.25447632 0.8672970

7  0.42467509 0.05187750 0.8411565

8  0.18902238 0.05686683 0.9973599

9  0.19892914 0.30698761 0.8395065

10 0.01555739 0.55490717 0.9686474

11 0.01591176 0.34292293 0.8488479

12 0.37174231 0.61580689 0.8873868

13 0.10231748 0.60909110 0.9373522

14 0.61326807 0.15069578 0.9506504

15 0.07295151 0.44691793 0.8699553

16 0.40845801 0.41075289 0.9789578

17 0.32814461 0.73066725 0.9236493

18 0.60540334 0.27622376 0.8842698

19 0.01816268 0.71314950 0.9413320

20 0.36442848 0.80881771 0.8699921



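stack() concatenates the columns of a data frame into a single values column and records the source column name in ind. The select argument restricts stacking to the named columns. The first output below comes from stack(dat1) and the second from the select call.

> stack(dat1)

> stack(dat1,select=c("A","C"))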


       values ind

1  0.60730714   A

2  0.43592037   A

3  0.41984612   A

4  0.48211414   A

5  0.62057361   A

6  0.65831211   A

7  0.42467509   A

8  0.18902238   A

9  0.19892914   A

10 0.01555739   A

11 0.01591176   A

12 0.37174231   A

13 0.10231748   A

14 0.61326807   A

15 0.07295151   A

16 0.40845801   A

17 0.32814461   A

18 0.60540334   A

19 0.01816268   A

20 0.36442848   A

21 0.34485044   B

22 0.68032657   B

23 0.89238618   B

24 0.06180126   B

25 0.44256735   B

26 0.25447632   B

27 0.05187750   B

28 0.05686683   B

29 0.30698761   B

30 0.55490717   B

31 0.34292293   B

32 0.61580689   B

33 0.60909110   B

34 0.15069578   B

35 0.44691793   B

36 0.41075289   B

37 0.73066725   B

38 0.27622376   B

39 0.71314950   B

40 0.80881771   B

41 0.80838561   C

42 0.85261014   C

43 0.94683201   C

44 0.94203126   C

45 0.96726905   C

46 0.86729699   C

47 0.84115650   C

48 0.99735987   C

49 0.83950651   C

50 0.96864743   C

51 0.84884789   C

52 0.88738682   C

53 0.93735217   C

54 0.95065040   C

55 0.86995530   C

56 0.97895783   C

57 0.92364927   C

58 0.88426984   C

59 0.94133196   C

60 0.86999215   C

       values ind

1  0.60730714   A

2  0.43592037   A

3  0.41984612   A

4  0.48211414   A

5  0.62057361   A

6  0.65831211   A

7  0.42467509   A

8  0.18902238   A

9  0.19892914   A

10 0.01555739   A

11 0.01591176   A

12 0.37174231   A

13 0.10231748   A

14 0.61326807   A

15 0.07295151   A

16 0.40845801   A

17 0.32814461   A

18 0.60540334   A

19 0.01816268   A

20 0.36442848   A

21 0.80838561   C

22 0.85261014   C

23 0.94683201   C

24 0.94203126   C

25 0.96726905   C

26 0.86729699   C

27 0.84115650   C

28 0.99735987   C

29 0.83950651   C

30 0.96864743   C

31 0.84884789   C

32 0.88738682   C

33 0.93735217   C

34 0.95065040   C

35 0.86995530   C

36 0.97895783   C

37 0.92364927   C

38 0.88426984   C

39 0.94133196   C

40 0.86999215   C 
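
The 60-row output above is long to read, so here is a smaller sketch of the same two calls. The data frame `small` and the `set.seed()` call are additions for illustration only; they are not part of the original example.

```r
# Small data frame so the result is easy to inspect.
# set.seed() is an addition that makes the random draws repeatable.
set.seed(1)
small <- data.frame(A = runif(3), B = runif(3), C = runif(3))

# Stack every column: 3 columns x 3 rows become 9 rows.
long_all <- stack(small)

# Stack only columns A and C: 2 columns x 3 rows become 6 rows.
long_ac <- stack(small, select = c("A", "C"))

names(long_all)                     # "values" "ind"
nrow(long_all)                      # 9
unique(as.character(long_ac$ind))   # "A" "C"
```

The ind column is a factor, which makes the stacked result convenient as input to grouped functions such as tapply() or boxplot().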


<Unstack>

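unstack() reverses stack(): the formula values~ind splits the values column by the levels of ind. Here dat2 is the stacked data frame from the previous section.

> dat2=stack(dat1)

> cbind(dat2, d=runif(20,0,0.3))

> unstack(cbind(dat2,d=runif(20,0,0.3)),values~ind)

> unstack(cbind(dat2, d=runif(20,0,0.3)),d~ind)

The three outputs below correspond to these three commands in order. The 20 values of d are recycled across the 60 rows, so with d~ind the columns A, B and C are identical. Each command calls runif again, which is why d differs between the first and third outputs.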

        values ind           d

1  0.60730714   A 0.026263619

2  0.43592037   A 0.248322658

3  0.41984612   A 0.219391077

4  0.48211414   A 0.191770833

5  0.62057361   A 0.280993604

6  0.65831211   A 0.079237235

7  0.42467509   A 0.241292099

8  0.18902238   A 0.120036501

9  0.19892914   A 0.009090804

10 0.01555739   A 0.292907454

11 0.01591176   A 0.008498633

12 0.37174231   A 0.238023847

13 0.10231748   A 0.134750554

14 0.61326807   A 0.067378258

15 0.07295151   A 0.264869956

16 0.40845801   A 0.149131450

17 0.32814461   A 0.071351884

18 0.60540334   A 0.162132894

19 0.01816268   A 0.004526606

20 0.36442848   A 0.226815647

21 0.34485044   B 0.026263619

22 0.68032657   B 0.248322658

23 0.89238618   B 0.219391077

24 0.06180126   B 0.191770833

25 0.44256735   B 0.280993604

26 0.25447632   B 0.079237235

27 0.05187750   B 0.241292099

28 0.05686683   B 0.120036501

29 0.30698761   B 0.009090804

30 0.55490717   B 0.292907454

31 0.34292293   B 0.008498633

32 0.61580689   B 0.238023847

33 0.60909110   B 0.134750554

34 0.15069578   B 0.067378258

35 0.44691793   B 0.264869956

36 0.41075289   B 0.149131450

37 0.73066725   B 0.071351884

38 0.27622376   B 0.162132894

39 0.71314950   B 0.004526606

40 0.80881771   B 0.226815647

41 0.80838561   C 0.026263619

42 0.85261014   C 0.248322658

43 0.94683201   C 0.219391077

44 0.94203126   C 0.191770833

45 0.96726905   C 0.280993604

46 0.86729699   C 0.079237235

47 0.84115650   C 0.241292099

48 0.99735987   C 0.120036501

49 0.83950651   C 0.009090804

50 0.96864743   C 0.292907454

51 0.84884789   C 0.008498633

52 0.88738682   C 0.238023847

53 0.93735217   C 0.134750554

54 0.95065040   C 0.067378258

55 0.86995530   C 0.264869956

56 0.97895783   C 0.149131450

57 0.92364927   C 0.071351884

58 0.88426984   C 0.162132894

59 0.94133196   C 0.004526606

60 0.86999215   C 0.226815647


            A          B         C

1  0.60730714 0.34485044 0.8083856

2  0.43592037 0.68032657 0.8526101

3  0.41984612 0.89238618 0.9468320

4  0.48211414 0.06180126 0.9420313

5  0.62057361 0.44256735 0.9672691

6  0.65831211 0.25447632 0.8672970

7  0.42467509 0.05187750 0.8411565

8  0.18902238 0.05686683 0.9973599

9  0.19892914 0.30698761 0.8395065

10 0.01555739 0.55490717 0.9686474

11 0.01591176 0.34292293 0.8488479

12 0.37174231 0.61580689 0.8873868

13 0.10231748 0.60909110 0.9373522

14 0.61326807 0.15069578 0.9506504

15 0.07295151 0.44691793 0.8699553

16 0.40845801 0.41075289 0.9789578

17 0.32814461 0.73066725 0.9236493

18 0.60540334 0.27622376 0.8842698

19 0.01816268 0.71314950 0.9413320

20 0.36442848 0.80881771 0.8699921 


             A           B           C

1  0.083997591 0.083997591 0.083997591

2  0.089283519 0.089283519 0.089283519

3  0.284200176 0.284200176 0.284200176

4  0.152690073 0.152690073 0.152690073

5  0.105063938 0.105063938 0.105063938

6  0.008653101 0.008653101 0.008653101

7  0.143053737 0.143053737 0.143053737

8  0.131534291 0.131534291 0.131534291

9  0.177895177 0.177895177 0.177895177

10 0.033823705 0.033823705 0.033823705

11 0.163880339 0.163880339 0.163880339

12 0.119386484 0.119386484 0.119386484

13 0.290694124 0.290694124 0.290694124

14 0.054644807 0.054644807 0.054644807

15 0.248584340 0.248584340 0.248584340

16 0.181152571 0.181152571 0.181152571

17 0.054012267 0.054012267 0.054012267

18 0.074821208 0.074821208 0.074821208

19 0.119740172 0.119740172 0.119740172

20 0.109013125 0.109013125 0.109013125 
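
To check the two behaviours seen in the outputs above (unstack() undoing stack(), and a short vector being recycled across groups), a minimal sketch on a smaller data frame may help. The names `small`, `stacked`, `back`, and `by_d` are hypothetical, and `set.seed()` is added only for repeatability.

```r
set.seed(1)
small   <- data.frame(A = runif(3), B = runif(3), C = runif(3))
stacked <- stack(small)

# values ~ ind splits the values column by the levels of ind,
# which recovers the original wide layout.
back <- unstack(stacked, values ~ ind)
all.equal(back, small)

# A length-3 vector added to a 9-row data frame is recycled three times,
# so every group (A, B, C) receives the same three numbers.
d      <- runif(3, 0, 0.3)
with_d <- cbind(stacked, d = d)
by_d   <- unstack(with_d, d ~ ind)
all.equal(by_d$A, by_d$B)
```

Storing d in a variable first, rather than calling runif() inside each command, keeps the values the same across calls.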


<Reshape>


The reshape function converts a data frame from wide to long format and vice versa.


> ?reshape

This function reshapes a data frame between 'wide' format with repeated measurements in separate columns of the same record and 'long' format with the repeated measurements in separate records. 


data: data frame

varying: names of sets of variables in the wide format that correspond to single variables in long format. 

v.names: names of variables in the long format that correspond to multiple variables in the wide format. 

timevar: the variable in long format that differentiates multiple records from the same group or individual. 

idvar: names of one or more variables in long format that identify multiple records from the same group/individual. 
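
As a rough illustration of how the arguments listed above fit together, here is a small wide-to-long sketch. The data frame `wide_bp` and its column names are hypothetical, and `times` is an additional reshape() argument, not covered in the list above, that sets the values written into the timevar column.

```r
# Hypothetical wide data: one row per subject, one column per visit.
wide_bp <- data.frame(id    = c("s1", "s2"),
                      bp_v1 = c(120, 135),
                      bp_v2 = c(118, 130))

long_bp <- reshape(data      = wide_bp,
                   varying   = list(c("bp_v1", "bp_v2")), # wide columns to fold together
                   v.names   = "bp",     # name of the single folded column
                   timevar   = "visit",  # records which wide column a row came from
                   times     = 1:2,      # values written into 'visit'
                   idvar     = "id",     # identifies rows belonging to the same subject
                   direction = "long")

long_bp
```

Each subject now appears once per visit, with the measurement in bp and the visit number in visit.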


> wide=data.frame(name=c("hyojung","coldrice","miscent"),mid1=c(80,90,30),mid2=c(50,20,70),mid3=c(70,20,90),class=c(1,2,1))

> wide

      name mid1 mid2 mid3 class

1  hyojung   80   50   70     1

2 coldrice   90   20   20     2

3  miscent   30   70   90     1


> long=reshape(data=wide, varying=list(c("mid1","mid2","mid3")),v.names="score", timevar="mid", idvar="name",direction="long")

> long

               name class mid score

hyojung.1   hyojung     1   1    80

coldrice.1 coldrice     2   1    90

miscent.1   miscent     1   1    30

hyojung.2   hyojung     1   2    50

coldrice.2 coldrice     2   2    20

miscent.2   miscent     1   2    70

hyojung.3   hyojung     1   3    70

coldrice.3 coldrice     2   3    20

miscent.3   miscent     1   3    90


> wide2=reshape(data=long,varying=list(c("mid1","mid2","mid3")),v.names="score",timevar="mid",idvar="name",direction="wide")

> wide2

               name class mid1 mid2 mid3

hyojung.1   hyojung     1   80   50   70

coldrice.1 coldrice     2   90   20   20

miscent.1   miscent     1   30   70   90
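
One way to confirm that the round trip preserves the data is sketched below. It rebuilds the wide and long data frames from this section and relies on the fact that a data frame produced by reshape() carries an attribute describing how it was created, so the reverse call needs only direction. The name `restored` is a hypothetical helper.

```r
wide <- data.frame(name  = c("hyojung", "coldrice", "miscent"),
                   mid1  = c(80, 90, 30),
                   mid2  = c(50, 20, 70),
                   mid3  = c(70, 20, 90),
                   class = c(1, 2, 1))

long <- reshape(data = wide, varying = list(c("mid1", "mid2", "mid3")),
                v.names = "score", timevar = "mid", idvar = "name",
                direction = "long")

# Reverse the reshape using the information stored on 'long'.
wide2 <- reshape(long, direction = "wide")

# Put the columns back in the original order and drop the generated
# row names before comparing with the original.
restored <- wide2[, names(wide)]
rownames(restored) <- NULL
all.equal(restored, wide, check.attributes = FALSE)
```

The comparison ignores attributes because the reshaped result keeps extra bookkeeping information that the original data frame does not have.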

