Ｑ＆Ａ (初級者コース)/18 のバックアップ差分(No.14)

追加された行はこの色です。
削除された行はこの色です。
#author("2020-05-18T12:09:38+09:00","","")
#author("2020-07-14T14:21:13+09:00;2020-05-18T12:09:38+09:00","","")
COLOR(green){SIZE(20){初心者のための R および RjpWiki に関する質問コーナー}} 

COLOR(red){SIZE(16){以下の 4 つの「●項目」のどれかをクリック！！}}~
----

COLOR(blue){● [[新規投稿欄:>#sinki]]} --- その前に，「[[投稿における注意事項]]」を読んでから！&br;
COLOR(blue){● [[このページの目次>#mokuji]]} --- 質問への回答・コメントの参照&br;
COLOR(blue){● [[最新のスレッド>#saisin]]} --- 最も最近に書き込まれた「親」記事（最新のコメントが付いた記事ではありません）&br;
COLOR(blue){● [[Ｑ＆Ａ (初級者コース) 過去の目次]]} --- 今までの 17 個の書庫の参照&br;

----

&aname(mokuji);COLOR(green){●● 目次 ●●}

参照は，個々の項目をクリック（質問への回答・コメントの参照も，個々の項目をクリック）~
#contents

----

&aname(sinki);

COLOR(green){●● 新規投稿 ●●}

まずは，「[[投稿における注意事項]]」を読んで，それに従ってください。~
COLOR(red){これに従わない場合には，回答が得られにくい場合があります。}　

#article
**R studio でグラフ化した時に日本語文字化けが起こる [#bacaed2a]
>[[りょう]] (2020-06-30 (火) 21:38:04)~
~
Rによる多変量解析と言う本でRを動かしているのですが、グラフ化すると日本語の部分が□□□のようになります。~
よくわからないのが、"plot", "plotmeans"などでは日本語表示されるのですが、xyplot, histogramだと文字化けすることです。以下がその時のコードです。~
~
>jhk <- read.csv('人事評価結果.csv')~
head(jhk, 3)~
|ID|性別|部署|年代|協調性|技能|知識|ストレス|~
|1|M|A部|中堅|70|65|71|53|~
|2|F|B部|熟練|45|62|70|46|~
・・・・・・~
par(family = "HiraKakuProN-W3") #一応これ試してみました~
library(lattice, gplots)~
histogram(~ストレス, data = jhk, breaks = 20, type='count') #文字化けする~
xyplot(知識~技能|年代+部署, data=jhk) #文字化けする~
plotmeans(協調性~性別, data=jhk, p=0.95, ylim=c(49, 54)) #文字化けしない~
~
・macOS Catalina バージョン10.15.5~
・R studio version 4.0.2　です。この前に使用していたバージョン3.6.3でも同様の状態でした。~
・CSVファイルはUTF-8で保存されています。~
・par(family = "HiraKakuProN-W3") なども試しました。~
~
GoogleやCRAN, マニュアルなどで既に調べたのですがわからず、困っているのでご教授いただけると幸いです。~

//
- ごめんなさい。コードの部分を青くしたり、表形式にしたつもりができていません汗 -- [[りょう]] &new{2020-06-30 (火) 21:39:55};
- お門違いの場所にアップされ，誰も返事をしない，もう見ていないかもしれないが一応書いておこう。&br;histogram, xyplot は lattice のものなので，日本語フォント指定が旨くいかない（なんか方法はあるはずだが，ネットをあさっても...）。&br;回避法としては&br;「画面に描かず PDF ファイルに保存する」とちゃんと描かれるっす。
 pdf("test.pdf") # ひつようなら他のパラメータの設定も
 histogram(~ストレス, data = jhk, breaks = 20, type='count') #文字化しない
 xyplot(知識~技能|年代+部署, data=jhk) #文字化けしない
 plotmeans(協調性~性別, data=jhk, p=0.95, ylim=c(49, 54)) #文字化けしない
 dev.off()
どんなもんだ？ --  &new{2020-07-14 (火) 14:04:24};

#comment

**ggplot [#bc6d7d37]
>[[ぼう]] (2020-05-14 (木) 12:25:32)~
~
R4.00でggplotを使おうと思っていますが、エラーが出てしまいます。~
~
次のコマンドを実行すると、下のような結果が出てきてしまいます。
 > install.packages("ggplot2", dependencies = TRUE)
 > library(ggplot2)
 > str(iris)
 'data.frame':   150 obs. of  5 variables:~
  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
  > ggplot(data=iris,aes(x=Sepal.Length,y=Sepal.Width))~
  ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width)) でエラー: 
    関数 "ggplot" を見つけることができませんでした 

//
- ggplot2 がインストールされていない（OS  はなんですか。Mac では R4.0 なら，コンパイルが必要なインストールが延々と実行されますが，それがちゃんと出来ているか。Windows ならそうでもないのかな）~
と，思ったが，Windows でもインストールはコンパイルが必要なようですね。で，それでも，ちゃんとインストールできましたが。~
あなたの場合，コンパイルに失敗していると言うことはありませんか。コンパイルが可能な環境が出来ていないのかな。普通は，そんなことないはずなのだけど。 --  &new{2020-05-15 (金) 21:38:52};
- ありがとうございます。Win10を使っています。library(ggplot2)でエラーが出ていないのに、ggplotを実行するとエラーが出てしまうので、原因が分かりません。。。 -- [[ぼう]] &new{2020-05-18 (月) 11:28:18};
- ありがとうございます。Win10を使っています。library(ggplot2)でエラーが出ていないのに、ggplotを実行するとエラーが出てしまうので、原因が分かりません。。。 -- [[ぼう]] &new{2020-05-18 (月) 12:09:38};

#comment

**ccf [#y347335b]
>[[show]] (2020-03-10 (火) 12:58:28)~
~
ccf(rw,rw2)と入力した場合、どちらを基準にしてlagを取っているのでしょうか？~

//
- ? ccf -- "The lag k value returned by ccf(x, y) estimates the correlation between x[t+k] and y[t]." --  &new{2020-03-10 (火) 17:55:37};
- rw2を基準としているということでしょうか。 -- [[show]] &new{2020-03-11 (水) 12:34:59};
- what? rw2? what do you mean?--  &new{2020-03-11 (水) 22:56:56};
- ccf(rw,rw2)の場合、rw２を基準にしてるということでしょうか？ -- [[show]] &new{2020-03-13 (金) 08:33:16};
- x[t+k] and y[t] --  &new{2020-03-13 (金) 10:12:16};
- なぜ自分で確かめもせず，不毛な質問を続けるのかなあ？~
以下のようなテストデータを作ってみる。ベクトルの長さ n = 10
 > set.seed(12345)
 > n <- 10
 > x <- rnorm(n)
 > y <- rnorm(n)
ラグ lag = 2，で ccf してみる
 > print(ccf(x, y, lag, plot=FALSE), digits=7)
 
 Autocorrelations of series ‘X’, by lag
 
         -2         -1          0          1          2 
  0.1416173  0.5327757 -0.2716881  0.2595251 -0.1681542 
自分で相互相関係数を計算するときに必要なもの。
 > mean.x <- mean(x) # x の平均（全部使う）
 > mean.y <- mean(y) # y の平均（全部使う）
 > ss.x <- var(x)*(n-1) # x の変動（全部使う）
 > ss.y <- var(y)*(n-1) # y の変動（全部使う）
相互相関係数を計算する。lag = 2~
使用するデータベクトル~
x[(lag+1):n]-mean.x つまり，lag+1 から n まで，長さ n-lag。平均値 mean.x を引いておく。~
y[1:(n-lag)]-mean.y つまり，1 から n-lag まで，長さ n-lag。平均値 mean.y を引いておく。~
両者掛け合わせて合計し（部分の共変動だ） sum((x[(lag+1):n]-mean.x) * (y[1:(n-lag)]-mean.y) ~
x，y 両方の全体の変動の積の平方根 sqrt(ss.x * ss.y) で割る
 > sum((x[(lag+1):n]-mean.x) * (y[1:(n-lag)]-mean.y) / sqrt(ss.x * ss.y)
 [1] -0.1681542
上の ccf の lag = 2 の下に書かれている数値 -0.1681542 と一致しているでしょ。~
つまり，ヘルプに書いている 「x[t+k] and y[t]」というのはそいういうこと，です。よ。~
x[(lag+1):n] と y[1:(n-lag)] どちらを基準にしていますか？~
そんなこと，どっちだっていいじゃないか。 --  &new{2020-03-13 (金) 11:11:38};

#comment

**R のアップデート [#rac7c1ae]
>[[私は誰でしょう？]] (2020-03-01 (日) 21:20:52)~
~
今まで，最新の R にするために（新しもの好きなので）~
あ，見ての通り，Mac ユーザです。~
~
curl https://mac.r-project.org/el-capitan/R-3.6-branch/R-3.6-branch-el-capitan-sa-x86_64.tar.gz | sudo tar fvxz - -C /~
~
~
をやっていたのですが，2 ヶ月ほど前から，実行結果が思わしくなくなっています。~
つまるところ，件の URL はアクセス不可ということのようで，アップデートは出来ません。~
~
念の為~
http://mac.r-project.org/el-capitan/R-3.6-branch/R-3.6-branch-el-capitan-signed.pkg~
を使えばアップデートは出来ます。~

//
- サイトの方のWAFとかでブロックされているのか, 単なる設定ミスかはわかりませんが当方でも同じ症状です.
 curl -O http://mac.r-project.org/el-capitan/R-3.6-branch/R-3.6-branch-el-capitan-signed.pkg
 sudo installer -pkg ./R-3.6-branch-el-capitan-signed.pkg -target /
とかでお茶を濁しては... -- [[何処の誰かは知らないけれど]] &new{2020-03-02 (月) 16:03:50};
- 有難うございます。取りあえずは，それで，しのいでみます。 --  &new{2020-03-02 (月) 22:12:06};

#comment

**連番でファイルを読み込みたいときにファイルに欠番があるときの処理 [#me7fd7f2]
>[[Rビギナー]] (2020-02-26 (水) 10:12:45)~
~
お世話になります．~
下記のように1～3まで連番のファイルを読み込みたいのですが，例えば2番のファイルが無い場合にそのまま1，3のファイルを読み込んでプログラムが止まらないようにするにはどうしたらいいでしょうか．~
try関数で可能かと思いやってみたのですが，だめでした．~
ご教示ください．~
よろしくお願いいたします．~
~
＝＝ここから~
  point <- 1:3 # for文を回す数を設定
  for (i in point) {
    mypath <- file.path(getwd(), paste0(i,"t2-1.csv")) 
    df <- assign(paste0("data",i), read.csv(mypath, header = T))
    df <- data.frame(df)
    df
    head(df)
    df$X <- strptime(df$X,"%Y/%m/%d %H:%M'%S")
    head(df)
    df$X <- as.POSIXlt(df$X,"%Y/%m/%d %H:%M:%S",tz="GMT0")
    head(df)
    #https://qiita.com/tktz/items/733c37b1d6102ae52120
    library(xts)
    df$X <- align.time(df$X-10*60/2,10*60)
    head(df)
    date = paste0("date",i) 
    Temp = paste0("Temp",i) 
    RH = paste0("RH",i) 
    colnames(df) <- c(date,Temp,RH)
    head(df)
    mypath <- file.path(getwd(), paste0(i,"t1.csv")) 
    write.csv(df,mypath)
  }

//
- insert 1 sentence. --  &new{2020-02-26 (水) 10:46:10};
     mypath <- file.path(getwd(), paste0(i,"t2-1.csv")) 
     if (! file.exists(mypath)) next
- or --  &new{2020-02-26 (水) 11:09:20};
 files = list.files(getwd(), pattern="[0-9]*t2-1.csv")
 for (f in files) {
   # describe what you want to do
 }
- お二人の方，ありがとうございました．おかげさまでうまくいきました！ -- [[Rビギナー]] &new{2020-02-26 (水) 14:57:32};
- お二人の方，ありがとうございました．おかげさまでうまくいきました！ -- [[Rビギナー]] &new{2020-02-26 (水) 15:47:15};

#comment

**時系列 [#h477736a]
>[[マリン]] (2020-02-23 (日) 10:13:04)~
~
２つの時系列のデータ（約８０個）の相関（どのくらい類似性があるか）を~
見たいのですが、方法を教えてください。~

//
- まずはググってみて，それからにしなさい --  &new{2020-02-23 (日) 21:09:21};
- ？ -- [[マリン]] &new{2020-02-24 (月) 07:45:37};

#comment

**固定効果を用いた回帰線 [#s967c99e]
> (2020-02-14 (金) 14:34:30)~
~
Book1というcsvファイルのデータを用いてGLMMによる解析を行ったのですが、固定効果の値を用いた回帰線がうまく引けません。~
csvファイルおよび解析を行ったスクリプトをBook1GLMMというファイル名でアップロードしましたので、ご教授いただけますと幸いです。~
よろしくお願いいたします。~

//
- do! just shown as  article '2020-01-30 (木) 12:37:42' the same program isn't it? --  &new{2020-02-14 (金) 17:11:14};
 data <- read.csv("Book1.csv")
 data
 # update.packages
 # install.packages("lme4")
 library(lme4)
 par(family = "HiraKakuProN-W3")
 
 model1 <- glmer(POMN ~ log(siltclayN) + (1 | Place), family = Gamma(link = "log"), data = data)
 summary(model1)
 place = c("HA", "TN", "OG", "YN", "KG", "WA", "AS") # データフレームではこちらを使っているのでこれでなきゃだめ
 place2 = c("半田山", "田野", "小川", "与那", "上賀茂", "西粟倉", "足寄") # 凡例用には別のものを用意する
 pch = c(25, 18, 17, 16, 15, 9, 8)
 col = c("black", "red", "cadetblue", "magenta", "brown", "coral", "darkorchid")
 plot(data$siltclayN, data$POMN, pch = rep(pch, each = 5), col = rep(col, each = 5), cex = 3, cex.axis = 2)
 legend("bottomright", legend = place2, pch = pch, col = col, cex = 2)
 xx <- seq(min(data$siltclayN), max(data$siltclayN), length = 100)
 # yy <- exp(1.1136 + 0.4651 * xx) # そもそも間違い
 # lines(xx, yy, lwd = 2)
 for (i in seq_along(place)) { # 7 本の曲線を引くんでしょ？
   newdata = data.frame(siltclayN=xx, Place=rep(place[i], 100))
   yy = predict(model1, newdata=newdata, type="response")
   lines(xx, yy, col=col[i])
 }
- 7本の曲線は引けました。勉強不足でお恥ずかしいのですが、固定効果の推定値を用いて一本だけ回帰線を引くことはできないのでしょうか？ --  &new{2020-02-14 (金) 18:08:12};
-  quit --  &new{2020-02-14 (金) 21:04:44};

#comment

**ガンマ分布 [#f464b927]
> (2020-02-06 (木) 21:14:35)~
~
GLMMなどで、目的変数に用いるデータが連続値で0以上であればガンマ分布を用いると統計の教科書等には書いてあるのですが、連続値で0以上(データの最小値は0)のデータでガンマ分布を用いて解析しようとすると~
「non-positive values not allowed for the 'gamma' family」というエラーが出ます。~
また、同じデータで線形混合効果モデルなら結果は出るのですが、「boundary (singular) fit: see ?isSingular」という警告？が出るせいなのかランダム効果を考慮した回帰線が一本しか出ません。~
解決策などありましたら、お教えいただけますと幸いです。~

//
- show your data and your script(program). --  &new{2020-02-06 (木) 22:07:07};
- model1<-glmer(A~log(B)+(1|Place),family=Gamma(link="log"),data=d) In this script was error 「non-positive values not allowed for the 'gamma' family」.  --  &new{2020-02-06 (木) 22:26:52};
- model2<-lmer(A~B+(1|Place),data=d) In this script was 「boundary (singular) fit: see ?isSingular」 --  &new{2020-02-06 (木) 22:28:59};
- The data was uploaded. It is data named data.csv. --  &new{2020-02-06 (木) 22:31:00};
- 'non-positive values not allowed for the 'gamma' family' means 0 is not non-positive, isn't it. 
 > library(lme4)
 > d = read.csv("data.csv")
 > model1 <- glmer(A ~ log(B) + (1|Place),f amily=Gamma(link="log"),data=d)
  eval(family$initialize, rho) でエラー: 
   non-positive values not allowed for the 'gamma' family
using d2
 > d2 = d[d$A > 0, ]
 > model1-2 <- glmer(A~log(B) + (1|Place), family=Gamma(link="log"), data=d2)
 > model1-2
no error happend.&br;
can you understand?&br;you must *respect* 'error message'. it warns you are wrong, and he says the reason 'why'.&br;if your data contains 'zero's, it may not be a gamma distribution!!!!! --  &new{2020-02-06 (木) 23:37:25};
- I will try it once. One more thing I want to ask. --  &new{2020-02-07 (金) 00:00:48};
- Now I uploaded another data. When the script of model2 is executed with this data, the message "boundary (singular) fit: see? IsSingular" appears, and only one regression line considering random effects is drawn. What is the cause? --  &new{2020-02-07 (金) 00:03:40};
- you! just , imput '? IsSingular'(without last '?') in console!!!.  message says so, ah. --  &new{2020-02-07 (金) 00:11:48};
 model2<-lmer(NitMAOMrecover~MAOMCN+(1|Place),data=example)
 summary(model2)
 place=c("HA","TN","OG","YN","KG","WA","AS")
 pch=c(25,18,17,16,15,9,8)
 col=c("black","red","cadetblue","magenta","brown","coral","darkorchid")
 plot(example$B,example$A,pch=rep(pch,each=5),col=rep(col,each=5),cex=3,cex.axis=2)
 legend("bottomright",legend=place,pch=pch,col=col,cex=1.5)
 xx<-seq(min(example$B),max(example$B),length=1000)
 for(i in seq_along(place)){
     yy<-predict(model2,newdata=data.frame(B=xx,Place=rep(place[i],1000)),type="response")
     lines(xx,yy,col=col[i],lwd=2)}
- Excuse me. I want to show the script to draw the regression line on the site, but I don't know how to do it. --  &new{2020-02-07 (金) 00:29:15};
- they said. "その前に，「[[投稿における注意事項]]」を読んでから！" and "投稿する前にまず &heart; [[投稿文書の書式:http://www.okadajp.org/RWiki/?%E6%95%B4%E5%BD%A2%E3%83%AB%E3%83%BC%E3%83%AB]] &heart; を読んでください" --  &new{2020-02-07 (金) 08:20:14};
- surprisingly, it's a reckless and messy program.~
Four variables "Sample,Place,A,B" are included in your "example.csv".~
but, your wrote "NitMAOMrecover~MAOMCN+(1|Place),data=example".~
more over, you wrote variables "A", "B" as "example$B" or "example$A"~
a data-frame in lmer function is different from a "new" data-frame in predict function. ~
you have to study a basis of R. --  &new{2020-02-07 (金) 08:23:45};

#comment

**GLMMについて [#tc8ba31b]
>[[ra]] (2020-01-27 (月) 11:58:45)~
~
R初心者です。~
GLMMで解析を行った後、散布図に回帰線を引きたいのですが、predict関数を読み込ませようとすると（オブジェクト"Place"がありません）というエラーが出ます（Placeはランダム効果です）。このエラーを解消し、回帰線を引くにはどうすれば良いでしょうか？~
入力したコマンドを載せておきます。~
よろしくお願いいたします。~
~
model1<-glmer(POMN~log(siltclayN)+(1|Place),family=Gamma(link="log"),data=data)~
plot(data$siltclayN,data$POMN,pch=c(25,18,17,16,15,9,8),cex=3,cex.axis=2)~
legend("bottomright",legend=c("HA","TN","OG","YN","KG","WA","AS"),pch=c(25,18,17,16,15,9,8),cex=2)~
xx<-seq(min(data$siltclayN),max(data$siltclayN),length=100)~
yy<-predict(model1,newdata=data.frame(siltclayN=xx),type="response")~
lines(xx,yy,lwd=2)~

//
- new data set has to include 'siltclayN' *and* 'Place' --  &new{2020-01-27 (月) 13:47:56};
- Sorry,how do I create that data set? --  &new{2020-01-27 (月) 16:26:31};
- just, create --  &new{2020-01-27 (月) 22:11:04};
- 　
 POMN <- c(data$POMN)
 siltclayN <- c(data$siltclayN + Place)
 dataframe <- data.frame(POMN, siltclayN)
 model1 <- glmer(POMN~log(siltclayN) + (1|Place1), family=Gamma(link="log"), data=dataframe)
 plot(dataframe$siltclayN, dataframe$POMN, pch=c(25,18,17,16,15,9,8), cex=3, cex.axis=2)
 legend("bottomright", legend=c("HA","TN","OG","YN","KG","WA","AS"), pch=c(25,18,17,16,15,9,8),cex=2)
 xx <- seq(min(dataframe$siltclayN), max(dataframe$siltclayN), length=100)
 yy <- predict(model1, newdata=data.frame(siltclayN=xx), type="response")
　 --  &new{2020-01-27 (月) 23:14:23};
- What's wrong? --  &new{2020-01-27 (月) 23:16:01};
- newdata=data.frame(siltclayN=xx, Place = ???????) --  &new{2020-01-28 (火) 10:50:09};
- In this command and Place=xx, an error has occurred. Where should I improve? --  &new{2020-01-28 (火) 14:42:23};
- Place is factor isn't it? 'Place = xx' may be invalid!&br;
I'll show you an e xample.
 > df = data.frame(x=c(3,2,1,3,5), f=factor(c(1,2,2,3,3)), y=c(2,4,7,5,6))
 > fit = lm(y ~ x + f, data=df) # f may not be a factor
 > predict(fit) # this prediction uses data-frame df
   1   2   3   4   5 
 2.0 5.4 5.6 5.7 5.3 
if you predict with another data-frame, the data-frame must contain 'x' and 'f'
following newdata contains 'x' and 'f'. first 5 elements are same in df(df$x, df$f), last 3 elements is exactly new data
 > newdata = data.frame(x=c(3,2,1,3,5,  2,3,5), f=factor(c(1,2,2,3,3,  1,2,2)))
 > predict(fit, newdata)
   1   2   3   4   5     6   7   8 
 2.0 5.4 5.6 5.7 5.3   2.2 5.2 4.8 # first five values are identical to previous prediction. last 3 values are new prediction.
do you understand?? --  &new{2020-01-28 (火) 16:28:06};
- Place is a factor. In that case, is Place=c(data$Place) ok? Or should all numerical data and factor data be written in c of x,f and y?  --  &new{2020-01-29 (水) 10:40:07};
- I ran it with reference to your example, how can I draw a prediction line from here? --  &new{2020-01-29 (水) 11:37:47};
- if new data-frame is declared as 'newdata=data.frame(siltclayN=xx, Place=pp), pp[i] corresponds to xx[i].&br;examine above example carefullly. predict function must be called as 'predict(fit, newdata=data.frame(x=c(2,3,5), f=c(1,2,2)))' --  &new{2020-01-29 (水) 13:34:36};
- 'prediction line'?? no, no, prediction *prediction plane*. there are *two* !! independent  variables(x, f) and one dependent variable(y), so, it's a 3 dimensional data(Actually, if f is a factor and has more than 2 levels, the dimension is more than 3!!).  your data is 3 or more dimensinal, too.&br;you know, in case of simple regression analysis y =  a*x+b you have one independent variable x, and one dependent variable y, it's a *two* dimensional data, and you can draw two dimensional scatter plot or regression *line*. &br;if you have 3 independent variable model, no one can draw predicted hyperplane(you can't imagine 4 dimensional space, can you??.&br;however, if you draw POMN vs. siltclayN by Place, you can draw 2 dimensional scatter or regression line. --  &new{2020-01-29 (水) 13:41:39};
- your model is POMN = α * log(siltclayN[i]) + β * Place[i]. if you want to predict POMN for new data set, you must prepare new data pair  siltclayN[j] and Place[j], j = 1, 2, 3, ... --  &new{2020-01-29 (水) 14:02:09};
- your data-frame is curious, isn't it?&br;
what do you mean
 siltclayN <- c(data$siltclayN + Place)
 dataframe <- data.frame(POMN, siltclayN)
if Place is a factor variable you cant add to data$siltclayN. 'data$siltclayN + Place' cause an error.  --  &new{2020-01-29 (水) 14:47:36};
- upload your data is best way to get solution --  &new{2020-01-29 (水) 14:54:46};
- Did you fall into a deadlock? --  &new{2020-01-29 (水) 22:32:29};
- yes... I don't know how to draw a regression line considering random effects. --  &new{2020-01-30 (木) 10:34:08};
- Can you tell me what to improve specifically? --  &new{2020-01-30 (木) 10:36:21};
- upload your data is best way to get solution --  &new{2020-01-30 (木) 10:46:16};
- Upload now. It is a file called Book1. --  &new{2020-01-30 (木) 10:49:42};
- alright, I was able to upload. --  &new{2020-01-30 (木) 11:31:58};
- Your program is strange overall. for example, pch=c(25,18,17,16,15,9,8) must be pch=rep(c(25,18,17,16,15,9,8), 5). examine each point's coordinates.&br;
you may want a result like this. --  &new{2020-01-30 (木) 12:37:42};
 library(lme4)
 
 data = read.csv("Book1.csv")
 model1 <- glmer(POMN ~ log(siltclayN) + (1 | Place), 
 	family = Gamma(link = "log"), data = data)
 summary(model1)
 place = c("HA", "TN", "OG", "YN", "KG", "WA", "AS")
 pch = c(25, 18, 17, 16, 15, 9, 8)
 col = c("black", "red", "cadetblue", "magenta", "brown", "coral", "darkorchid")
 plot(data$siltclayN, data$POMN, pch = rep(pch, each = 5), 
 	col = rep(col, each = 5)) # , cex=3, cex.axis=2)
 legend("bottomright", legend = place, pch = pch, col = col) # ,cex=2)
 xx <- seq(min(data$siltclayN), max(data$siltclayN), length = 100)
 for (i in seq_along(place)) {
 	yy <- predict(model1,
 		newdata = data.frame(siltclayN = xx, Place = rep(place[i], 100)),
 		type = "response")
 	lines(xx, yy, col = col[i])
 }
&ref(glmer.png);
- I wanted to make this diagram. It was very helpful! I didn't have enough knowledge. --  &new{2020-01-30 (木) 13:48:53};
- I'm really thankful to you! --  &new{2020-01-30 (木) 13:50:02};
- I want to draw a regression line with the estimated fixed effects of model1, but I cannot do it well. --  &new{2020-02-13 (木) 21:49:52};

#comment

**文字列からバイト数を指定して切り出す [#dcc3e621]
> (2020-01-26 (日) 22:45:01)~
~
半角・全角が混在する文字列からバイト数を指定して、部分文字列を抽出したいのですが、どのような関数がありますでしょうか。~
例えば~
x <- "あaいbうc"~
という文字列から3-5byte目を抽出したいです。~
（"aい"という部分文字列を抽出したいです。）~
~
substringは文字数を指定する関数のため、byte単位で抽出が不可能なようです。~
~
お手数ですがご教示お願いいたします。~

//
- ? substr / If an input element has declared "bytes" encoding (see Encoding, the subsetting is done in units of bytes not characters. --  &new{2020-01-27 (月) 11:46:10};
- そもそも3-5byte目が「aい」というは正しいですか？ -- [[S]] &new{2020-01-27 (月) 16:29:39};
 > x <- "あaいbうc"
 > charToRaw(x)
  [1] e3 81 82 61 e3 81 84 62 e3 81 86 63
「あ」は3バイト文字なので途中で切られて、3-5byte目は「82 61 e3」です。これは「あ」の3バイト目、1バイト文字の「a」、「い」の1バイト目です。~
全角半角の字数をバイト数で何とかしようとするのは無理があ流のではないでしょうか。
- it's just he wants! --  &new{2020-01-27 (月) 22:13:13};
- euc-jp uses 2 bytes for 1 japanese character
 > x <- "あaいbうc" # UTF-8: あ == \xe3\x81\x82
 > xx = iconv(x, "utf-8", "euc-jp")
 > xx
 [1] "\\xa4\\xa2a\\xa4\\xa4b\\xa4\\xa6c" # EUC-JP: あ == \xa4\xa2
 > Encoding(xx) = "bytes"
 > iconv(substr(xx, 3, 5), "euc-jp", "utf-8")
 [1] "aい"
cp932 uses 2 bytes for 1 japanese character, too
 > x <- "あaいbうc"
 > yy = iconv(x, "utf-8", "cp932")
 > yy
 [1] "\x82\xa0a\x82\xa2b\x82\xa4c" # CP932: あ == \x82\xa0
 > Encoding(yy) = "bytes"
 > iconv(substr(yy, 3, 5), "cp932", "utf-8")
 [1] "aい"
do you understand? --  &new{2020-01-28 (火) 11:38:51};
- 文字コードを変換してsubstrで切り出すということですね。ご回答ありがとうござました。 --  &new{2020-01-30 (木) 00:44:16};

#comment

**apply系関数でfunに線形回帰lmが適用可能な関数 [#v28d972a]
>[[TY]] (2020-01-07 (火) 15:59:04)~
~
シミュレーションで同じ関数を繰り返し(10^5~10^6回程度)実行する場面を想定しています．~
例えば，以下のsample_dataでREP毎にlm(y~x)を実行したいです．~
for文で実行することはできるのですが，一般にapply系関数の方が実行が速いと聞き，~
探しています．apply系関数に関する知識は，tapply関数にmean関数を適用できる，~
　tapply(sample_data$y, sample_data$REP,mean)~
程度のレベルです．~
~
【sample_data】~
      y    x REP
1   133  3.9   1~
2    81 -2.3   1~
3   104 -1.2   1~
4    89  0.7   1~
5   142  4.0   1~
6    74 -2.9   1~
7   140  3.9   1~
8   170  4.4   1~
9   128  1.6   1~
10  107  1.2   1~
11   49 -4.3   2~
12   79 -2.9   2~
13   77 -3.2   2~
14  125  1.8   2~
15   82 -1.1   2~
16  132  2.6   2~
17   99 -0.0   2~
18  140  2.1   2~
19  140  4.9   2~
20   88 -1.1   2~

//
- 「REP毎にlm(y~x)を実行したい」のか，同じような 10^5 ほどのデータを分析したいのか。教えてあげてもいいけど，そのへんどうなのよ。~
つまり，rep が 1~10^5 くらいの種類があって，それぞれについて x,y の組のデータ（サンプルサイズが 10 くらい）があるんだけど，それぞれについて lm で分析してータセットごとの結果が欲しいのか？&br;結果を待っているより，やってみた方が速いよ（ワラ）。反応があるという保証がない限り，回答しない。&br;それにしても，質問の仕方がへた。  --  &new{2020-01-10 (金) 22:10:35};
- ありがとうございます．出直してきます． -- [[TY]] &new{2020-01-13 (月) 10:59:17};
- 必要なのは，そういう対応ではないと思うが。質問にも答えないし，誠意がない。というか，本当に回答が必要なのか？ --  &new{2020-01-16 (木) 21:03:50};

#comment