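R is a programming language and an integrated environment for statistical computing and graphics. It is a GNU project whose syntax is very similar to the S language (S-Plus), which was developed at Bell Laboratories by John Chambers and colleagues. R provides a wide range of statistical tools, including linear and nonlinear modelling, statistical tests, time series analysis, classification, and clustering.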
R also has CRAN, a network of FTP and web servers around the world that store identical, up-to-date versions of code and documentation for R. (At the time of writing it hosts more than 4,435 packages, nearly all of them maintained by statisticians.)
1. Using R
Expressions
Logical Values
Variables
Functions
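The four topics above can be illustrated in a few lines. This is only a minimal sketch: the names `result`, `is_small`, `x`, and `square` are made up for illustration and do not come from the original slides.

```r
# Expression: R evaluates it and returns the value
result <- 1 + 2 * 3        # 7

# Logical value: a comparison yields TRUE or FALSE
is_small <- result < 10    # TRUE

# Variable: assign with <- (or =)
x <- 42

# Function: a hypothetical helper defined only for this example
square <- function(n) n * n
squared <- square(x)       # 1764

print(c(result, squared))
```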
repeat print
> rep("123", times = 3)
[1] "123" "123" "123"
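rep() also accepts a vector and an `each` argument. A small sketch of the difference between `times` and `each`, using a made-up vector `letters_ab`:

```r
letters_ab <- c("a", "b")

# times repeats the whole vector
whole <- rep(letters_ab, times = 2)     # "a" "b" "a" "b"

# each repeats every element before moving on to the next
each_one <- rep(letters_ab, each = 2)   # "a" "a" "b" "b"

print(whole)
print(each_one)
```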
Help & Example(functionname)*
> help(sum)
> example(min)
> list.files()
 [1] "Applications"    "ASUS"            "Desktop"
 [4] "Documents"       "Downloads"       "Dropbox"
 [7] "Google 雲端硬碟" "Library"         "Movies"
[10] "Music"           "NTU Space"       "Pictures"
[13] "Public"          "tmp"
> read.csv("file.csv")
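Since `file.csv` is not included here, the following sketch writes a tiny made-up table to a temporary file and reads it back. The column names and values are invented for illustration only.

```r
# Write a small, made-up table to a temporary CSV file
path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "alice,90", "bob,85"), path)

# read.csv() returns a data frame; by default the first line
# is used as the column names
scores <- read.csv(path)
print(scores)

# Remove the temporary file again
unlink(path)
```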
na.rm
Whether to remove NA values before computing (default = FALSE)
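A short sketch of the effect of `na.rm`, using a made-up vector `values`:

```r
values <- c(1, NA, 3)

# With the default na.rm = FALSE, a single NA makes the result NA
with_na <- sum(values)                    # NA

# With na.rm = TRUE, missing values are dropped first
without_na <- sum(values, na.rm = TRUE)   # 4
avg <- mean(values, na.rm = TRUE)         # 2

print(c(with_na, without_na, avg))
```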
2. Vectors
Mixing types: all elements are automatically converted to strings
> c("a", TRUE, 1)
[1] "a"    "TRUE" "1"
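The conversion can be checked with class(). The sketch below also shows, as an additional case not in the original notes, that mixing only logical and numeric values yields numbers rather than strings.

```r
# A character element forces every element to character
mixed <- c("a", TRUE, 1)
mixed_class <- class(mixed)     # "character"

# Without any strings, logical values become numbers (TRUE -> 1)
nums <- c(TRUE, 1, 2.5)
nums_class <- class(nums)       # "numeric"

print(mixed_class)
print(nums_class)
```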
> 5:9
[1] 5 6 7 8 9
> seq(5, 9, 0.5)
[1] 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0
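seq() can also be driven by the number of points instead of the step size, and it can count downwards. A brief sketch with made-up variable names:

```r
# Ask for 5 evenly spaced values between 5 and 9
five_points <- seq(5, 9, length.out = 5)   # 5 6 7 8 9

# A sequence counts down when the start is larger than the end
downward <- seq(9, 5)                      # 9 8 7 6 5

# Name the step argument explicitly with by
quarters <- seq(0, 1, by = 0.25)           # 0.00 0.25 0.50 0.75 1.00

print(five_points)
print(downward)
print(quarters)
```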
> a = c("a", TRUE, 1)
> a[3]
[1] "1"
> a[-3]
[1] "a"    "TRUE"
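Indexing also works with several positions at once. A minimal sketch on a made-up vector `v` (note that R indices start at 1):

```r
v <- c("w", "x", "y", "z")

picked <- v[c(1, 3)]      # "w" "y"
ranged <- v[2:4]          # "x" "y" "z"
dropped <- v[-c(1, 2)]    # "y" "z" (negative indices exclude)

# Assigning to an index replaces that element
v[3] <- "Y"
print(v)
```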
> b = 5:7
> names(b) <- c("first", "second", "third")
> b
 first second  third
     5      6      7
> b["first"]
first
    5
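Names can be used for assignment and for selecting several elements as well. A small sketch that rebuilds the same vector `b`; the replacement value 60 is arbitrary:

```r
b <- 5:7
names(b) <- c("first", "second", "third")

# Replace an element by name
b["second"] <- 60L

# Select several elements by name
ends <- b[c("first", "third")]

# unname() drops the names and leaves the plain values
plain <- unname(b)    # 5 60 7

print(b)
print(ends)
```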
> barplot(b)
> x <- seq(1, 20, 0.1)
> y <- sin(x)
> plot(x, y)
> a = c(1, 2, 3)
> a + 1
[1] 2 3 4
> a == c(1, 2, 5)
[1]  TRUE  TRUE FALSE
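Arithmetic and comparisons on vectors work element by element. The sketch below extends the examples above with a second made-up vector `b2` and shows how to summarise a logical result:

```r
a <- c(1, 2, 3)
b2 <- c(1, 2, 5)

added <- a + b2       # 2 4 8
doubled <- a * 2      # 2 4 6

same <- a == b2       # TRUE TRUE FALSE
all_same <- all(same) # FALSE: not every element matches
n_same <- sum(same)   # 2: TRUE counts as 1

print(added)
print(n_same)
```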
3. Matrices
> matrix(0, 3, 4)
     [,1] [,2] [,3] [,4]
[1,]    0    0    0    0
[2,]    0    0    0    0
[3,]    0    0    0    0
> matrix(1:12, 3, 4)
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
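As the output shows, matrix() fills column by column by default. A short sketch of the `byrow` argument, which fills row by row instead (the name `m` is made up):

```r
# Fill a 3 x 4 matrix row by row instead of column by column
m <- matrix(1:12, nrow = 3, ncol = 4, byrow = TRUE)
print(m)

first_row <- m[1, ]   # 1 2 3 4
shape <- dim(m)       # 3 4
```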
> a <- 1:12
> dim(a) <- c(3, 4)
> a
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> a
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> a[2, 3]
[1] 8
> a[, 3]
[1] 7 8 9
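The same bracket syntax selects whole rows and ranges of columns. A minimal sketch on a copy of the matrix above, here called `m2` to avoid clashing with `a`:

```r
m2 <- matrix(1:12, 3, 4)

row2 <- m2[2, ]        # 2 5 8 11 (leave the column index empty)
block <- m2[, 2:4]     # columns 2 to 4, still a matrix

# Assigning to an index changes a single cell
m2[1, 1] <- 0L
print(m2)
```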
> contour(a)
> persp(a)
> persp(a, expand = 0.2)
> image(volcano)
4. Summary Statistics
> a <- 1:12
> mean(a)
[1] 6.5
> median(a)
[1] 6.5
> sd(a)
[1] 3.605551
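A few related functions, sketched on the same data. The standard deviation is the square root of the variance, which is why sd(a) above equals sqrt(13).

```r
a <- 1:12

variance <- var(a)     # 13
spread <- range(a)     # 1 12 (minimum and maximum)

# summary() reports min, quartiles, median, mean and max in one call
print(summary(a))
print(sqrt(variance))
```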
5. Factors
> a = c('gold', 'silver', 'gems', 'gold', 'gems')
> type = factor(a)
> type
[1] gold   silver gems   gold   gems
Levels: gems gold silver
> as.integer(type)
[1] 2 3 1 2 1
> levels(type)
[1] "gems"   "gold"   "silver"
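Factors are mostly useful for grouping. The sketch below counts each level with table() and averages a second vector per level with tapply(); the `weights` values are made up purely for illustration.

```r
a <- c("gold", "silver", "gems", "gold", "gems")
type <- factor(a)

# Count how often each level occurs
counts <- table(type)          # gems 2, gold 2, silver 1

# Hypothetical weights, one per item, averaged within each level
weights <- c(300, 200, 100, 250, 150)
avg_weight <- tapply(weights, type, mean)   # gems 125, gold 275, silver 200

print(counts)
print(avg_weight)
```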
6. Data Frames
data.frame()
Presents related data structures (for example, vectors of the same length) as a table
> a = c(1:3)
> b = c(2:4)
> c = c(3:5)
> f = data.frame(a, b, c)
> f
  a b c
1 1 2 3
2 2 3 4
3 3 4 5
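Once built, a data frame can be inspected and extended. A small sketch that recreates the same table as `df` and adds a made-up `total` column:

```r
df <- data.frame(a = 1:3, b = 2:4, c = 3:5)

shape <- dim(df)        # 3 rows, 3 columns
cols <- names(df)       # "a" "b" "c"

# Add a new column computed from the existing ones
df$total <- df$a + df$b + df$c    # 6 9 12
print(df)
```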
> f
  a b c
1 1 2 3
2 2 3 4
3 3 4 5
> f[2]
  b
1 2
2 3
3 4
> f[[2]]
[1] 2 3 4
> f[["b"]]
or
> f$b
[1] 2 3 4
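Rows can be selected with the same `[row, column]` syntax used for matrices. A minimal sketch, again on a copy of the table named `df2`:

```r
df2 <- data.frame(a = 1:3, b = 2:4, c = 3:5)

# A single row, returned as a one-row data frame
second <- df2[2, ]

# Rows where a condition on a column holds
big_b <- df2[df2$b > 2, ]

# Combine a row filter with a column name to get a plain vector
big_c <- df2[df2$b > 2, "c"]    # 4 5

print(second)
print(big_b)
```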
7. Real-World Data
8. Next
slide: Ruby & R
Reference