(R) Sorting vectors and data frames / sort(), order(), xtfrm()

by jangpiano 2020. 11. 8.

<sorting a vector>

 

> x = c(1:3,9,7,4)

> sort(x)

[1] 1 2 3 4 7 9

> sort(x, decreasing = TRUE)

[1] 9 7 4 3 2 1
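
Since the data frame examples that follow rely on order() rather than sort(), here is a minimal sketch of how the two relate. order() returns the positions that would sort the vector, so indexing with those positions should give the same result as sort(). The name idx is only used for illustration.

```r
x <- c(1:3, 9, 7, 4)

# order() returns positions, not values
idx <- order(x)
idx
# [1] 1 2 3 6 5 4

# indexing x by those positions reproduces sort(x)
x[idx]
# [1] 1 2 3 4 7 9

identical(x[idx], sort(x))
# [1] TRUE
```

Because order() gives positions, the same index vector can be reused to rearrange the rows of a data frame.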


<sorting a data frame>


<sorting a data frame by the alphabetical order of row names>

> library(MASS)   # the Animals data set ships with the MASS package

> head(Animals)

                    body brain

Mountain beaver     1.35   8.1

Cow               465.00 423.0

Grey wolf          36.33 119.5

Goat               27.66 115.0

Guinea pig          1.04   5.5

Dipliodocus     11700.00  50.0


> ani<-Animals


> order1=order(row.names(ani))


> order1

 [1] 15  7 26 11 24  2  6  8 12  4 19 13  3  5  9 14 23 18 27  1 20 28 10 21 25 17

[27] 22 16


> ani_2<-ani[order1,]


> ani_2

                      body  brain

African elephant  6654.000 5712.0

Asian elephant    2547.000 4603.0

Brachiosaurus    87000.000  154.5

Cat                  3.300   25.6

Chimpanzee          52.160  440.0

Cow                465.000  423.0

Dipliodocus      11700.000   50.0

Donkey             187.100  419.0

Giraffe            529.000  680.0

Goat                27.660  115.0

Golden hamster       0.120    1.0

Gorilla            207.000  406.0

Grey wolf           36.330  119.5

Guinea pig           1.040    5.5

Horse              521.000  655.0

Human               62.000 1320.0

Jaguar             100.000  157.0

Kangaroo            35.000   56.0

Mole                 0.122    3.0

Mountain beaver      1.350    8.1

Mouse                0.023    0.4

Pig                192.000  180.0

Potar monkey        10.000  115.0

Rabbit               2.500   12.1

Rat                  0.280    1.9

Rhesus monkey        6.800  179.0

Sheep               55.500  175.0

Triceratops       9400.000   70.0
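
If the MASS package is not at hand, the same steps can be tried on a tiny data frame. The sketch below uses a hypothetical data frame called pets (three rows copied from the output above, deliberately out of order); the names pets, idx and pets_sorted are assumptions made for this illustration only.

```r
# Hypothetical toy data frame with unsorted row names
pets <- data.frame(
  body  = c(27.66, 3.3, 465),
  brain = c(115, 25.6, 423),
  row.names = c("Goat", "Cat", "Cow")
)

# positions of the row names in alphabetical order
idx <- order(row.names(pets))
idx
# [1] 2 3 1

# the comma matters: idx selects rows, the empty slot after it keeps all columns
pets_sorted <- pets[idx, ]
row.names(pets_sorted)
# [1] "Cat"  "Cow"  "Goat"
```

Each row moves as a whole, so the body and brain values stay attached to the right animal.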

  



<sorting a data frame by more than one variable>


*First, order the rows by the 'body' variable, then break any ties using the 'brain' variable. Without a minus sign, order() sorts in ascending order.


> order2=order(ani$body,ani$brain)

> order2

 [1] 20 19 27 25  5  1 21 11 17 10  4 18  3 24 22 14 23  8 28 13  2  9 12  7 15 16

[27]  6 26

> ani[order2,]

                      body  brain

Mouse                0.023    0.4

Golden hamster       0.120    1.0

Mole                 0.122    3.0

Rat                  0.280    1.9

Guinea pig           1.040    5.5

Mountain beaver      1.350    8.1

Rabbit               2.500   12.1

Cat                  3.300   25.6

Rhesus monkey        6.800  179.0

Potar monkey        10.000  115.0

Goat                27.660  115.0

Kangaroo            35.000   56.0

Grey wolf           36.330  119.5

Chimpanzee          52.160  440.0

Sheep               55.500  175.0

Human               62.000 1320.0

Jaguar             100.000  157.0

Donkey             187.100  419.0

Pig                192.000  180.0

Gorilla            207.000  406.0

Cow                465.000  423.0

Horse              521.000  655.0

Giraffe            529.000  680.0

Asian elephant    2547.000 4603.0

African elephant  6654.000 5712.0

Triceratops       9400.000   70.0

Dipliodocus      11700.000   50.0

Brachiosaurus    87000.000  154.5


*Every 'body' value in 'ani' is different, so the second variable never gets a chance to break a tie. For this example I will therefore use another data frame, named usj. First, order the rows by CONT, then by INTG in decreasing order. The minus sign in front of a numeric variable means that it is sorted in descending rather than ascending order. As a result, we can see that INTG is listed in descending order among the rows that share the same value of CONT (6.2, 6.5).


> usj<-USJudgeRatings

> order3=order(usj$CONT,-usj$INTG)


> head(usj[order3,])

               CONT INTG DMNR DILG CFMG DECI PREP FAMI ORAL WRIT PHYS RTEN

AARONSON,L.H.   5.7  7.9  7.7  7.3  7.1  7.4  7.1  7.1  7.1  7.0  8.3  7.8

BURNS,E.B.      6.2  8.8  8.7  8.5  7.9  8.0  8.1  8.0  8.0  8.0  8.6  8.6

MISSAL,H.M.     6.2  8.3  8.1  7.7  7.4  7.3  7.3  7.3  7.2  7.3  7.8  7.6

STAPLETON,J.F.  6.5  8.2  7.7  7.8  7.6  7.7  7.7  7.7  7.5  7.6  8.5  7.7

HADDEN,W.L.JR.  6.5  8.1  8.0  8.0  7.9  8.0  7.9  7.8  7.8  7.8  8.4  8.0

DEVITA,H.J.     6.5  8.0  7.6  7.2  7.0  7.1  6.9  7.0  7.0  7.1  6.9  7.2
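
The tie-breaking behaviour is easier to follow on a very small example. The sketch below uses a hypothetical data frame df (not taken from USJudgeRatings) in which the first key, grp, has ties, and the second key, score, is negated so that it is sorted in descending order within each group.

```r
# Hypothetical toy data: 'grp' has ties, 'score' breaks them
df <- data.frame(
  grp   = c(2, 1, 2, 1),
  score = c(5, 7, 9, 3),
  row.names = c("a", "b", "c", "d")
)

# ascending grp, then descending score within each grp
idx <- order(df$grp, -df$score)
idx
# [1] 2 4 3 1

df[idx, ]
#   grp score
# b   1     7
# d   1     3
# c   2     9
# a   2     5
```

Note that the minus sign only works here because score is numeric.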

 


<xtfrm function> 


The xtfrm function turns a character vector or a factor into a numeric vector that sorts in the same order as the original. For a character vector, each number is the position of that element in alphabetical order.


- vectors of characters 


> xtfrm(c("A", "B", "e", "d"))

[1] 1 2 4 3


> order(xtfrm(row.names(ani)))

 [1] 15  7 26 11 24  2  6  8 12  4 19 13  3  5  9 14 23 18 27  1 20 28 10 21 25 17

[27] 22 16

> order(row.names(ani))

 [1] 15  7 26 11 24  2  6  8 12  4 19 13  3  5  9 14 23 18 27  1 20 28 10 21 25 17

[27] 22 16


*Because the xtfrm function turns the character vector into numbers that keep its alphabetical order, both calls above print the same result.

So when does the xtfrm function come into its own? When you need to sort a character vector in descending order. As shown below, order(-row.names(ani)) only produces an error, because the unary minus is not defined for character vectors. In this case we need the xtfrm function, which makes it possible to sort character and factor vectors in descending order.


> order(-row.names(ani))

Error in -row.names(ani) : invalid argument to unary operator


> order(-xtfrm(row.names(ani)))

 [1] 16 22 17 25 21 10 28 20  1 27 18 23 14  9  5  3 13 19  4 12  8  6  2 24 11 26

[27]  7 15
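
One situation where -xtfrm() seems especially handy is mixing directions, for example a character key in descending order together with a numeric key in ascending order. The sketch below assumes a hypothetical data frame df and vector nm that are not part of the original post. It also checks that, for a single key without ties, order(..., decreasing = TRUE) gives the same positions.

```r
# Hypothetical toy data
df <- data.frame(
  name  = c("Cat", "Goat", "Cat", "Cow"),
  value = c(2, 3, 1, 5),
  stringsAsFactors = FALSE
)

# name descending, value ascending within the same name
idx <- order(-xtfrm(df$name), df$value)
idx
# [1] 2 4 3 1

df$name[idx]
# [1] "Goat" "Cow"  "Cat"  "Cat"

# for a single key with no ties, decreasing = TRUE is an alternative to -xtfrm()
nm <- c("Goat", "Cat", "Cow")
identical(order(nm, decreasing = TRUE), order(-xtfrm(nm)))
# [1] TRUE
```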



- vectors of factors 


> as.factor(c("A","B","e","d"))   # convert the character vector to a factor to show that xtfrm also works on factors

[1] A B e d

Levels: A B d e

> xtfrm(as.factor(c("A", "B", "e", "d")))

[1] 1 2 4 3
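
In the example above the factor levels happen to be alphabetical, so the result matches the character case. For a factor, xtfrm() returns the integer level codes, so with hand-set levels the numbers follow the level order instead. The sketch below assumes a hypothetical factor called size to show the difference.

```r
# Hypothetical factor whose level order is set by hand, not alphabetically
size <- factor(c("mid", "low", "high", "low"), levels = c("low", "mid", "high"))

# integer codes follow the level order: low = 1, mid = 2, high = 3
xtfrm(size)
# [1] 2 1 3 1

# as characters, the ranks follow alphabetical order: high < low < mid
xtfrm(as.character(size))
# [1] 4 2 1 2

# descending by level order
as.character(size[order(-xtfrm(size))])
# [1] "high" "mid"  "low"  "low"
```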


<A useful argument of the order function - na.last>

The order function has a useful argument, na.last, that controls how missing values (NA) are handled. You can drop the NAs from the ordering, or push them to one end so that they do not get in the way when you interpret the data.


na.last = NA    -----> the missing values are removed from the result.

na.last = TRUE  -----> the missing values are placed at the end (the default).

na.last = FALSE -----> the missing values are placed at the beginning.
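
The three settings can be compared on a short vector. This is only a small sketch; the vector v is made up for the illustration.

```r
v <- c(3, NA, 1, 2)

# default (na.last = TRUE): the position of the NA comes last
order(v)
# [1] 3 4 1 2

# the position of the NA comes first
order(v, na.last = FALSE)
# [1] 2 3 4 1

# the position of the NA is dropped, so the result is one element shorter
order(v, na.last = NA)
# [1] 3 4 1

v[order(v, na.last = NA)]
# [1] 1 2 3
```

When the index is used to reorder a data frame, na.last = NA therefore also drops the rows whose sort key is NA.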

