New features and changes in R 2.2.0 (released 2005.10.06)
o graphics::xy.coords, xyz.coords and n2mfrow have been moved to
the grDevices name space (so that they are available to grid as well).
graphics::boxplot.stats, contourLines, nclass.* and chull have
been moved to the grDevices name space. The C code underlying
chull() has been moved to package grDevices.
o split(x, f), split<-() and unsplit() now by default split by all
levels of a factor f, even when some are empty.
Use split(x, f, drop = TRUE) if you want the old behavior of
dropping empty levels. split() and split<-() are S3 generic
functions with new arguments 'drop' and '...' and all methods now
should have 'drop' and '...' arguments as well.
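A minimal sketch of the new default, using a made-up factor with one empty level (the data and variable names are illustrative only):

```r
# A factor with three levels, one of which ("c") has no observations.
f <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))
x <- c(10, 20, 30)

by_all_levels  <- split(x, f)               # keeps the empty level "c"
by_used_levels <- split(x, f, drop = TRUE)  # drops it (the old behavior)

names(by_all_levels)
names(by_used_levels)
length(by_all_levels[["c"]])                # the empty group has length 0
```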
o anova.mlm() now handles the single-model case.
o attach() now prints an information message when objects are
masked on the search path by or from a newly attached database.
o New functions cdplot() and spineplot() for conditional density
plots and spine plots or spinograms. Spine plots are now used
instead of bar plots for x-y scatterplots where y is a factor.
o The nonparametric variants of cor.test() now behave better in
the presence of ties. The "spearman" method uses the asymptotic
approximation in that case, and the "kendall" method likewise,
but adds a correction for ties (this is not necessary in the
Spearman case).
o density() is now an S3 generic where density.default() {former
density()} has new argument 'weights' for specifying observation
masses different than the default 1/N -- based on a suggestion and
code from Adrian Baddeley.
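As a rough illustration of the 'weights' argument, the sketch below compares equal masses with hypothetical unequal masses. The data, the weights and the fixed bandwidth are assumptions chosen only to make the effect visible.

```r
x <- c(1, 2, 3, 10)
w <- c(0.4, 0.3, 0.2, 0.1)   # hypothetical observation masses, summing to 1

# A fixed numeric bandwidth keeps both estimates on the same grid.
d_equal    <- density(x, bw = 1)               # default masses 1/N
d_weighted <- density(x, bw = 1, weights = w)

# Height of each estimate at the grid point closest to x = 10.
i <- which.min(abs(d_equal$x - 10))
c(equal = d_equal$y[i], weighted = d_weighted$y[i])
```

The observation at 10 carries mass 0.1 instead of 0.25 in the weighted fit, so the weighted estimate should be lower there.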
o format.default() now has a 'width' argument, and 'justify' can
now centre character strings.
format.default() has a new argument 'na.encode' to control
whether character NAs are encoded (TRUE by default), and a new
argument 'scientific' to control the use of fixed/scientific
notation for real/complex numbers.
How format() works on a list is now documented, and it now uses
its arguments in the same way as when processing an atomic vector.
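The format.default() arguments mentioned in this item can be tried as follows. This is a small sketch with arbitrary input values, not an exhaustive description of the function.

```r
# Centre a string in a field of width 6.
centred <- format("ab", width = 6, justify = "centre")

# Leave character NAs as NA instead of encoding them as the string "NA".
kept_na <- format(c("a", NA), na.encode = FALSE)

# Force scientific or fixed notation for a real number.
sci   <- format(123456789, scientific = TRUE)
fixed <- format(1e-10, scientific = FALSE)

centred
kept_na
sci
fixed
```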
o format.info() now has a 'digits' argument, and is documented
to work for all atomic vectors (it used to work for all but
raw vectors).
o There is a new function gregexpr() which generalizes regexpr()
to search for all matches in each of the input strings (not
just the first match).
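The difference from regexpr() can be seen in a short sketch (the input strings are arbitrary):

```r
s <- c("banana", "apple")

first_only  <- regexpr("a", s)    # position of the first match per string
all_matches <- gregexpr("a", s)   # list with all match positions per string

as.integer(first_only)            # 2 1
as.integer(all_matches[[1]])      # 2 4 6
as.integer(all_matches[[2]])      # 1
```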
o labels() now has a method for "dist" objects (replacing that
for names() which was withdrawn in 2.1.0).
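For example, with a small matrix whose row names are made up for illustration:

```r
m <- matrix(1:6, nrow = 3, dimnames = list(c("a", "b", "c"), NULL))
d <- dist(m)

labels(d)   # the labels of the objects the distances refer to
```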
o library() now explicitly checks for the existence of
directories in 'lib.loc': this avoids some warning messages.
o loadNamespace(keep.source=) now applies only to that namespace
and not others it might load to satisfy imports: this is now
consistent with library().
o max.col() has a new argument for non-random behavior in the
case of ties.
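The item above does not name the new argument. In current R documentation it is called 'ties.method', which is what this sketch assumes; the matrix is arbitrary.

```r
m <- rbind(c(1, 3, 3),
           c(2, 2, 1))

first_max <- max.col(m, ties.method = "first")  # leftmost maximum per row
last_max  <- max.col(m, ties.method = "last")   # rightmost maximum per row

first_max   # 2 1
last_max    # 3 2
```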
o memory.profile() now uses the type names returned by typeof()
and no longer has two unlabelled entries.
o The default mosaicplot() method by default draws grey boxes.
o New algorithm "port" (the nl2sol algorithm available in the
Port library on netlib) added to the nls() function in the
'stats' package.
o object.size() now supports more types, including external
pointers and weak references.
o options() now returns its result in alphabetical order, and is
documented more comprehensively and accurately. (Now all
options used in base R are documented, including
platform-specific ones.)
Some options are now set in the package which makes use of
them (grDevices, stats or utils) if not already set when the
package is loaded.
o New option("OutDec") to set the decimal point for output conversions.
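A brief sketch of the effect on output conversion, restoring the previous setting afterwards:

```r
old <- options(OutDec = ",")   # use a comma as the decimal point for output
formatted <- format(1.5)
options(old)                   # restore the previous setting

formatted                      # "1,5"
```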
o New option("add.smooth") to add smoothers to a plot, currently
only used by plot.lm().
o plot.lm() has two new plots (for 'which' = 5 or 6), plotting
residuals or Cook's distances versus (transformed) leverages - unless
these are constant. Further, the new argument 'add.smooth' adds a
loess smoother to the point plots by default, and 'qqline = TRUE'
adds a qqline() to the normal plot.
The default for 'sub.caption' has been improved for long calls.
o read.table() now passes 'allowEscapes' to scan().
o sample(x, size, prob, replace = TRUE) now uses a faster
algorithm if there are many reasonably probable values. (This
does mean the results will be different from earlier versions
of R.) The speedup is modest unless 'x' is very large _and_
'prob' is very diffuse so that thousands of distinct values
will be generated with an appreciable frequency.
o scatter.smooth() now works a bit more like other plotting
functions (e.g., accepts a data frame for argument 'x').
Improvements suggested by Kevin Wright.
o signif() on complex numbers now rounds jointly to give the
requested number of digits in the larger component, not
independently for each component.
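To illustrate joint rounding, the sketch below uses an arbitrary complex number whose imaginary part is much smaller than its real part:

```r
z <- signif(123.456 + 0.00123456i, 3)

# Three significant digits in the larger (real) component means rounding
# both components to zero decimal places, so the imaginary part vanishes.
Re(z)
Im(z)
```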
o New generic function simulate() in the 'stats' package with
methods for some classes of fitted models.
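One possible use, assuming a linear model fitted to the built-in 'cars' data set (the model is chosen here purely for illustration):

```r
fit  <- lm(dist ~ speed, data = cars)

# Draw three simulated response vectors from the fitted model.
sims <- simulate(fit, nsim = 3, seed = 1)

dim(sims)   # one row per observation, one column per simulation
```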
o smooth.spline() has a new argument 'keep.data' which makes it
possible to provide residuals() and fitted() methods for smoothing splines.
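A small sketch of what this enables, using simulated data (the data and seed are assumptions made for the example):

```r
set.seed(42)
x <- 1:20
y <- sin(x / 3) + rnorm(20, sd = 0.1)

# Keeping the data lets residuals() and fitted() work on the fit.
fit <- smooth.spline(x, y, keep.data = TRUE)
res <- residuals(fit)
fv  <- fitted(fit)

length(res)
length(fv)
```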
o Attempting source(file, chdir=TRUE) with a URL or connection
for 'file' now gives a warning and ignores 'chdir'.
o source() closes its input file after parsing it rather than
after executing the commands, as used to happen prior to
2.1.0. (This is probably only significant on Windows where
the file is locked for a much shorter time.)
o split(), split<-(), unsplit() now have a new argument 'drop = FALSE',
by default not dropping empty levels; this is *not* back compatible.
o sprintf() now supports asterisk `*' width or precision
specification (but not both) as well as `*1$' to `*99$'. Also the
handling of `%' as conversion specification terminator is now
left to the system and doesn't affect following specifications.
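The asterisk forms take the width or precision from an extra argument. A minimal sketch with arbitrary values:

```r
# Width taken from the argument 5: the number is right-justified in 5 columns.
padded <- sprintf("%*d", 5, 42L)

# Precision taken from the argument 2: two digits after the decimal point.
rounded <- sprintf("%.*f", 2, pi)

padded    # "   42"
rounded   # "3.14"
```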
o The plot method for stl() now allows the colour of the range
bars to be set (default unchanged at "light gray").
o Added tclServiceMode() function to the tcltk package to allow
updating to be suspended.
o terms.formula() no longer allows '.' in a formula unless there
is a (non-empty) 'data' argument or 'allowDotAsName = TRUE' is
supplied. We have found several cases where 'data' had not
been passed down to terms() and so '.' was interpreted as a
single variable leading to incorrect results.
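The following sketch shows the two cases with a made-up data frame: '.' is expanded when 'data' is supplied, and an error is signalled when it is not.

```r
df <- data.frame(y = 1:3, a = 4:6, b = 7:9)

# With 'data', '.' expands to all columns not already in the formula.
expanded <- attr(terms(y ~ ., data = df), "term.labels")

# Without 'data' (and without allowDotAsName = TRUE) this is an error.
no_data <- tryCatch(terms(y ~ .), error = function(e) e)

expanded                    # "a" "b"
inherits(no_data, "error")
```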
o New functions trans3d(), the 3D -> 2D utility from persp()'s
example, and extendrange(), both in package 'grDevices'.
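As a quick illustration of extendrange() only (trans3d() needs a viewing matrix from persp() and is not shown), the range below is widened by 10% of its length on each side:

```r
r <- extendrange(c(0, 10), f = 0.1)
r   # -1 11
```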
o [dqp]wilcox and wilcox.test work better with one very large sample
size and an extreme first argument.
o The specification of the substitutions done when processing
Renviron files is more liberal: see ?Startup. It now
accepts forms like R_LIBS=${HOME}/Rlibrary:${WORKGRP}/R/lib .
o Added recommendation that packages have an overview man page
<pkg>-package.Rd, and the promptPackage() function to create a
skeleton version.
o Replacement indexing of a data frame by a logical matrix index
containing NAs is allowed in a few more cases, in particular
always when the replacement value has length one.
o Complex arithmetic is now done by C99 complex types where
supported. This is likely to boost performance, but is
subject to the accuracy with which it has been implemented.
o The printing of complex numbers has changed, handling numbers
as a whole rather than in two parts. So both real and
imaginary parts are shown to the same accuracy, with the
'digits' parameter referring to the accuracy of the larger
component, and both components are shown in fixed or
scientific notation (unless one is entirely zero when it is
always shown in fixed notation).
o LDFLAGS now defaults to -L/usr/local/lib64 on most Linux
64-bit OSes (but not ia64). The use of lib/lib64 can be
overridden by the new variable LIBnn.
o We now test for wctrans_t, as apparently some broken OSes have
wctrans but not wctrans_t (which is required by the relevant
standards).
o Any external BLAS found is now tested to see if the complex
routine zdotu works correctly: this provides a compatibility
test of compiler return conventions.
o Installation without NLS is now cleaner, and does not install
any message catalogues.
o The (not-recommended) options --with-system-zlib,
--with-system-bzlib and --with-system-pcre now have 'system' in
the name.
o If a Java runtime environment is detected at configure time
its library path is appended to LD_LIBRARY_PATH or equivalent.
New Java-related variables JAVA_HOME (path to JRE/JDK), JAVA_PROG
(path to Java interpreter), JAVA_LD_PATH (Java library path)
and JAVA_LIBS (flags to link against JNI) are made available
in Makeconf.
o Ei-ji Nakama contributed a patch for FPU control with the
Intel compilers on ix86 Linux.
o The encoding for a package's 00Index.html is chosen from the
Encoding: field (if any) of the DESCRIPTION file and from the
\encoding{} fields of any Rd files with non-ASCII titles.
If there are conflicts, first-found wins with a warning.
o R_HOME/doc/html/packages.html is now remade by R not Perl code.
This may result in small changes in layout and a change in
encoding (to UTF-8 where supported).
o The return value of new.packages() is now updated for any
packages which may be installed.
o available.packages() will read a compressed PACKAGES.gz file in
preference to PACKAGES if available on the repository: this
will reduce considerably the download time on a dialup connection.
The downloaded information about a repository is cached for the
current R session.
o The information about library trees found by
installed.packages() is cached for the current session, and
updated only if the modification date of the top-level
directory has been changed.
o A data index is now installed for a package with a 'data' dir
but no 'man' dir (even though it will have undocumented data objects).
o contrib.url path for type="mac.binary" has changed from
bin/macosx/<version> to bin/macosx/<arch>/contrib/<version>
where <arch> corresponds to R.version$arch
o checkFF() used by R CMD check has since R 2.0.0 not reported
missing PACKAGE arguments when testing installed packages with
namespaces. It now
- treats installed and source packages in the same way.
- reports missing arguments unless they are in a function in
the namespace with a useDynLib declaration (as the
appropriate DLL for such calls can be searched for).
o codoc() allows help files named pkg_name-defunct.Rd to have
undocumented arguments (and not just base-defunct.Rd).
o C function massdist() {called from density()} has new argument
'xmass' (= weights).
o Raw vectors passed to .C() are now passed as unsigned char *
rather than as SEXPs. (Wish of Keith Frost, PR#7853)
o The search for symbols in a .C/.Call/... call without a
package argument now searches for an enclosing namespace and
so finds functions defined within functions in a namespace.
o R_max_col() has new (5th) argument '*ties_meth' allowing
non-random behavior in the case of ties.
o The header files have been rationalized: the BLAS routine
LSAME is now declared in BLAS.h not Linpack.h, Applic.h no
longer duplicates routines from Linpack.h, and Applic.h is
divided into API and non-API sections.
o memory.c has been instrumented so that Valgrind can track R's
internal memory management. To use this, configure using
--with-valgrind-instrumentation=level
where level is 1 or 2. Both levels will find more bugs with
gctorture(TRUE). Level 2 makes Valgrind run extremely slowly.
o Some support for raw vectors has been added to Rdefines.h.
o R_BaseEnv has been added, to refer to the base environment.
This is currently equal to R_NilValue, but it will change in
a future release.
o %/% has been adjusted to make x == (x %% y) + y * ( x %/% y )
more likely in cases when extended-precision registers were
interfering.
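The identity can be checked on a few arbitrary values, including negative operands:

```r
x <- c(7, -7, 7.5)
y <- c(2, 2, -2)

# Reconstruct x from its remainder and integer quotient.
rebuilt <- (x %% y) + y * (x %/% y)

all.equal(x, rebuilt)
```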
o Operations on POSIXct objects (such as seq(), max() and
subsetting) try harder to preserve time zones and warn if
inconsistent time zones are used.
o as.function.default() no longer asks for a bug report when
given an invalid body. (PR#1880, PR#7535, PR#7702)
o Hershey fonts and grid output (and therefore lattice output)
now rescale correctly in fit-to-window resizing on a Windows
graphics device. Line widths also scale now.
o The X11() device now hints the window manager so that decorations
appear reliably under e.g. the GNOME WM (contributed
by Ei-ji Nakama).
o Subsetting a matrix or an array as a vector used to attempt to
use the row names to name the result, even though the
array might be longer than the row names. Now this is done only
for 1D arrays, where it is done in all cases, even with matrix
indexing. (Tidies up after the fix to PR#937.)
o Constants in mathlib are declared 'const static double' to
avoid performance issues with the Intel Itanium compiler.
o capabilities() used partial matching but was not documented
to: it no longer does so.
o kernel(1,0) printed wrongly; kernel(<name-string>, *) now returns
a named kernel in all cases; plot(kernel(.),..) is more flexible.
o installed.packages() and download.packages() now always
return a matrix as documented, possibly with 0 rows (rather than
a 0-length character vector or NULL).
o Arithmetic operations on data frames no longer coerce the
names to syntactically valid names.
o Units are now properly recycled in grid layouts
when 'widths' or 'heights' are shorter than the number of
columns or rows (PR#8014).
o spline()/splinefun()'s C code had a memory access buglet which
never led to incorrect results. (PR#8030)
o sum() was promoting logical arguments to double not integer
(as min() and other members of its group do).
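After this fix, summing a logical vector gives an integer result, as a short check shows:

```r
flags <- c(TRUE, FALSE, TRUE, TRUE)
total <- sum(flags)

total           # 3
typeof(total)   # "integer"
```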
o loess() had a bug causing it to occasionally miscalculate
standard errors (PR#7956). Reported by Benjamin Tyner, fixed
by Berwin Turlach.
o library(keep.source=) was ignored if the package had a
namespace (the setting of options("keep.source.pkgs") was
always used).
o hist.POSIXct() and hist.Date() now respect par("xaxt").
o The 'vfont' argument was not supported correctly in title(),
mtext(), and axis(). The 'vfont' argument is superseded by
the par(family=) approach introduced in 2.0.0. This bug-fix
just updates the warning messages and documentation to
properly reflect the new order of things.
o The C-level function PrintGenericVector could overflow if
asked to print a length-1 character vector of several thousand
characters. This could happen when printing a list matrix,
and was fatal up to 2.1.1 and silently truncated in 2.1.1 patched.
o What happened for proc.time() and system.time() on
(Unix-alike) systems which do not support timing was
incorrectly documented. (They both exist but throw an error.)
Further, system.time() would give an error in its on.exit
expression.
o weighted.residuals() now does sensible things for glm() fits:
in particular it now agrees with an lm() fit for a Gaussian glm()
fit. (PR#7961).
o The 'lm' and 'glm' methods for add1() took the weights and
offset from the original fit, and so gave errors in the
(dubious) usage where the upper scope resulted in a smaller
number of cases to fit (e.g. by omitting missing values in new
variables). (PR#8049)
o Setting new levels on a factor dropped all existing
attributes, including class "ordered".
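A minimal sketch of the fixed behavior, using a made-up ordered factor:

```r
sizes <- factor(c("lo", "hi", "lo"), levels = c("lo", "hi"), ordered = TRUE)

# Renaming the levels should keep the factor ordered.
levels(sizes) <- c("low", "high")

is.ordered(sizes)
as.character(sizes)
```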
o format.default(justify="none") now by default converts NA
character strings, as the other values always did.
o format.info() often gave a different field width from format()
for character vectors (e.g. including missing values or
non-printable characters).
o axis() now ensures that if 'labels' are supplied as character
strings or expressions then 'at' is also supplied (since the
calculated value for 'at' can change under resizing).
o Fixed segfault when PostScript font loading fails, e.g., when
R is unable to find afm files (reported by Ivo Welch).
o terms.formula() got confused if the 'data' argument was a list with
non-syntactic names.
o prompt() and hence package.skeleton() now produce *.Rd files that
give no errors (but warnings) when not edited, much more often.
o promptClass() and promptMethods() now also escape "%" e.g. in '%*%'
and the latter gives a message about the file written.
o wilcox.test() now warns when conf.level is set higher than
achievable, preventing errors (PR#3666) and incorrect answers
with extremely small sample sizes.
o The default (protection pointer) stack size (the default for
'--max-ppsize') has been increased from 10000 to 50000 in order to
match the increased default options("expressions") (in R 2.1.0).
o The R front-end was expecting --gui=tk not Tk as documented,
and rejecting --gui=X11.
o Rdconv -t latex protected only the first << and >> in a chunk
against conversion to guillemets.
o callNextMethod() and callGeneric() have fixes related to
handling arguments.
o ls.diag() now works for fits with missing data. (PR#8139)
o window.default() had an incorrect tolerance and so sometimes
created too short a series if 'start' or 'end' were zero.
o Some (fairly pointless) cases of reshape left a
temporary id variable in the result (PR#8152)
o R CMD build used 'tar xhf' which is invalid on FreeBSD systems
(and followed tar chf, so there could be no symbolic links in
the tarball).
o Subassignment of length zero vectors to NULL gave garbage
answers. (PR#8157)
o Automatic coercion of raw vectors to lists was missing, so for a
list (or data frame) z, z[["a"]] <- raw_vector did not work
and now does. This also affected DF$a <- raw_vector for a
data frame DF.
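The two assignments described above can be sketched as follows (the list, data frame and raw values are illustrative):

```r
raw_vector <- as.raw(c(1, 255))

# Assigning a raw vector into a list element.
z <- list(n = 1)
z[["a"]] <- raw_vector

# Assigning a raw vector as a data frame column.
DF <- data.frame(id = 1:2)
DF$a <- raw_vector

is.raw(z[["a"]])
is.raw(DF$a)
```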
o The internal code for commandArgs() was missing PROTECTs.
o The width for strwrap() was used as one less than specified.
o R CMD INSTALL was not cleaning up after an unsuccessful
install of a non-bundle which was not already installed.