Try R learning notes


Outline
  1. Using R
  2. Vectors
  3. Matrices
  4. Summary Statistics
  5. Factors
  6. Data Frames
  7. Real-World Data
  8. Next
  9. Reference
  • codeschool
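  • R is a programming language and an integrated environment for statistical computing and graphics. It is a GNU project, similar to the S language, which was developed at Bell Laboratories by John Chambers and colleagues. Its syntax closely resembles that of S (S-Plus), and it provides a wide range of statistical tools, including linear and nonlinear modelling, statistical tests, time series analysis, classification, and clustering.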

    R has CRAN, a network of FTP and web servers around the world that store identical, up-to-date versions of code and documentation for R. (At the time of writing it hosts more than 4435 packages, almost all of them maintained by statisticians.)

    1. Using R

    • Expressions
    • Logical Values
    • Variables
    • Functions
    • rep: repeat a value
    > rep("123", times=3)
    [1] "123" "123" "123"
    • help(functionname) and example(functionname)
    > help(sum)
    > example(min)
    • Files
    > list.files()
    [1] "Applications" "ASUS" "Desktop"
    [4] "Documents" "Downloads" "Dropbox"
    [7] "Google 雲端硬碟" "Library" "Movies"
    [10] "Music" "NTU Space" "Pictures"
    [13] "Public" "tmp"
    > read.csv("file.csv") # read csv file
    • Run R script
    > source("file.R")
    • na.rm: whether to remove NA values before computing (default = FALSE)
    > sum(a, na.rm = TRUE)
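The effect of na.rm is easiest to see side by side. The following is a minimal sketch; the vector `a` holds made-up values and is not data from the course.

```r
# A numeric vector with one missing value (hypothetical data).
a <- c(1, 3, NA, 7)

total_default <- sum(a)                 # NA: the missing value propagates
total_clean   <- sum(a, na.rm = TRUE)   # 11: NA is dropped before summing
mean_clean    <- mean(a, na.rm = TRUE)  # 11 / 3: averaged over the 3 known values

print(total_default)
print(total_clean)
print(mean_clean)
```

Most summary functions (mean, median, sd, min, max) accept the same argument.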

    2. Vectors

    When types are mixed in one vector, every element is automatically coerced to a common type (here, character strings).

    • c(): combine values into a vector
    > c("a", TRUE, 1)
    [1] "a" "TRUE" "1"
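Coercion does not always end in strings; it stops at the most general type present. A small sketch of that ordering (logical, then numeric, then character), with variable names chosen only for illustration:

```r
# c() promotes every element to the most general type in the call.
lgl_num <- c(TRUE, 1)       # logical is promoted to numeric
num_chr <- c(1, "a")        # numeric is promoted to character
all_lgl <- c(TRUE, FALSE)   # nothing to promote

print(class(lgl_num))   # "numeric"
print(lgl_num)          # 1 1
print(class(num_chr))   # "character"
print(class(all_lgl))   # "logical"
```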
    • Sequence vectors
    > 5:9
    [1] 5 6 7 8 9
    > seq(5,9,0.5) # step of 0.5
    [1] 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0
    • [V] Access: vector elements by index
    > a = c("a", TRUE, 1)
    > a[3]
    [1] "1"
    > a[-3] # excluding the third element
    [1] "a" "TRUE"
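Square brackets accept more than a single position. The sketch below uses a hypothetical numeric vector `v` to show ranges, several positions at once, negative indices, and logical conditions.

```r
v <- c(10, 20, 30, 40, 50)

first_two <- v[1:2]        # positions 1 and 2 (R indexes from 1): 10 20
picked    <- v[c(1, 5)]    # several positions at once: 10 50
without   <- v[-c(1, 2)]   # drop positions 1 and 2: 30 40 50
over_25   <- v[v > 25]     # keep elements where the condition is TRUE: 30 40 50

print(first_two)
print(picked)
print(without)
print(over_25)
```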
    • names(v): name the elements
    > b = 5:7
    > names(b) <- c("first", "second", "third")
    > b
     first second  third
         5      6      7
    > b["first"]
    first
        5
    • Plotting
    > barplot(v) # bar chart of a numeric vector v
    > x <- seq(1,20,0.1)
    > y <- sin(x)
    > plot(x, y) # scatter plot
    • Arithmetic
    > a = c(1,2,3)
    > a+1
    [1] 2 3 4
    • Comparison
    > a == c(1,2,5)
    [1] TRUE TRUE FALSE
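Both arithmetic and comparison work element by element, and the logical result can be summarised further. A minimal sketch, reusing the vector from above; the helper variable names are only illustrative:

```r
a <- c(1, 2, 3)

doubled <- a * 2              # the scalar is applied to every element: 2 4 6
summed  <- a + c(10, 20, 30)  # element by element: 11 22 33
same    <- a == c(1, 2, 5)    # TRUE TRUE FALSE

n_equal   <- sum(same)        # TRUE counts as 1, so this is 2
all_equal <- all(same)        # FALSE, because the third pair differs
where_off <- which(!same)     # 3, the position that differs

print(doubled)
print(summed)
print(n_equal)
print(all_equal)
print(where_off)
```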

    3. Matrices

    • Give a fill value and the dimensions
    > matrix(0,3,4)
         [,1] [,2] [,3] [,4]
    [1,]    0    0    0    0
    [2,]    0    0    0    0
    [3,]    0    0    0    0
    > matrix(1:12,3,4) # filled one value at a time, column by column
         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    2    5    8   11
    [3,]    3    6    9   12
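matrix() fills column by column by default. If the data is laid out row by row, the byrow argument changes the fill order. A short sketch with a smaller 2 x 3 matrix:

```r
by_col <- matrix(1:6, nrow = 2, ncol = 3)                # default: fill down each column
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)  # fill across each row

print(by_col)   # first row: 1 3 5
print(by_row)   # first row: 1 2 3
```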
    • dim: set the dimensions afterwards
    > a <- 1:12
    > dim(a) <- c(3,4)
    > a
         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    2    5    8   11
    [3,]    3    6    9   12
    • [M] Access: matrix elements by row and column
    > a
         [,1] [,2] [,3] [,4]
    [1,]    1    4    7   10
    [2,]    2    5    8   11
    [3,]    3    6    9   12
    > a[2,3]
    [1] 8
    > a[ ,3]
    [1] 7 8 9
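The same bracket syntax covers whole rows, sub-matrices, and assignment. A minimal sketch on the same 3 x 4 matrix; `drop = FALSE` is a standard argument of `[` that keeps the result a matrix:

```r
a <- matrix(1:12, 3, 4)

row2  <- a[2, ]                # whole second row: 2 5 8 11
block <- a[1:2, 3:4]           # rows 1-2, columns 3-4, still a matrix
col3  <- a[, 3, drop = FALSE]  # third column kept as a 3 x 1 matrix

a[1, 1] <- 100                 # assignment uses the same indexing

print(row2)
print(block)
print(dim(col3))
print(a[1, 1])
```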
    • Plotting
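    > contour(a) # contour map
    > persp(a) # 3D perspective plot
    > persp(a,expand=0.2) # expand scales the z axis; 0.2 flattens the surface
    > image(volcano) # heat map of the built-in volcano dataset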

    4. Summary Statistics

    • mean(), median(), and sd()
    > a <- 1:12
    > mean(a)
    [1] 6.5
    > median(a)
    [1] 6.5
    > sd(a) # Standard Deviation
    [1] 3.605551
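To see what these functions compute, the values can be reproduced by hand. This is only a sketch of the textbook formulas (sample standard deviation with n - 1 in the denominator), not a replacement for the built-ins:

```r
a <- 1:12
n <- length(a)

mean_by_hand <- sum(a) / n                           # 6.5
var_by_hand  <- sum((a - mean_by_hand)^2) / (n - 1)  # 13
sd_by_hand   <- sqrt(var_by_hand)                    # sqrt(13), about 3.605551

print(mean_by_hand)
print(sd_by_hand)
```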

    5. Factors

    > a = c('gold', 'silver', 'gems', 'gold', 'gems')
    > type = factor(a)
    > type
    [1] gold   silver gems   gold   gems
    Levels: gems gold silver
    > as.integer(type)
    [1] 2 3 1 2 1
    > levels(type)
    [1] "gems" "gold" "silver"
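Two common follow-ups are counting the members of each level and choosing the level order explicitly instead of the default alphabetical one. A minimal sketch on the same data; the custom order is an arbitrary choice for illustration:

```r
a <- c("gold", "silver", "gems", "gold", "gems")
type <- factor(a)

counts <- table(type)   # gems 2, gold 2, silver 1

# Levels given explicitly, so the integer codes follow this order.
ordered_type <- factor(a, levels = c("silver", "gold", "gems"))
codes <- as.integer(ordered_type)   # 2 1 3 2 3

print(counts)
print(levels(ordered_type))
print(codes)
```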
    • Plotting
    (Skipped for now; see Try R 5.2.)

    6. Data Frames

    • data.frame(): present data structures of the same length as a table
    > a = 1:3
    > b = 2:4
    > c = 3:5
    > f = data.frame(a, b, c)
    > f
      a b c
    1 1 2 3
    2 2 3 4
    3 3 4 5
    • Access
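    > f
      a b c
    1 1 2 3
    2 2 3 4
    3 3 4 5
    > f[2] # single brackets return a one-column data frame
      b
    1 2
    2 3
    3 4
    > f[[2]] # double brackets return the column as a vector
    [1] 2 3 4
    > f[["b"]]
    [1] 2 3 4
    > f$b # same as f[["b"]]
    [1] 2 3 4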
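Rows can be selected with the same two-index syntax used for matrices. The sketch below rebuilds the same data frame and adds a hypothetical derived column `d` for illustration:

```r
f <- data.frame(a = 1:3, b = 2:4, c = 3:5)

row2 <- f[2, ]        # second row, returned as a one-row data frame
cell <- f[2, "c"]     # row 2 of column c: 4
big  <- f[f$b > 2, ]  # rows where b is greater than 2 (rows 2 and 3)

f$d <- f$a + f$b      # new column computed from existing ones: 3 5 7

print(row2)
print(cell)
print(big)
print(f)
```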
    • merge
    (Skipped for now; see Try R 6.4.)
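These notes skip the merge exercise, so the following is not the course's example. It is only a rough sketch of what merge() does, using two made-up data frames (`prices` and `counts`) that share a `type` column.

```r
# Hypothetical tables; the values are invented for illustration.
prices <- data.frame(type = c("gold", "silver", "gems"),
                     price = c(100, 50, 200))
counts <- data.frame(type = c("gems", "gold", "copper"),
                     count = c(2, 3, 1))

# By default merge() joins on the shared column name ("type")
# and keeps only rows present in both tables.
both <- merge(prices, counts)

# all = TRUE keeps unmatched rows too and fills the gaps with NA.
all_rows <- merge(prices, counts, all = TRUE)

print(both)
print(all_rows)
```

Here `both` has two rows (gems and gold), while `all_rows` has four, with NA where one table had no matching entry.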

    7. Real-World Data

    8. Next

    slide: Ruby & R

    Reference

    codeschool