Re: Required documentation (was: Documenting the C API)

Radford Neal

So proper documentation for functions could look something like this:
- short description
- extended description
- parameters (names, types, assumptions about them)
- return value (types, expected values)
- errors/longjmp (conditions under which this happens)
- gc behavior (can cause gc, under what conditions, implicitly protected arguments)
This seems mostly reasonable to me.
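
As a rough illustration of the template above, here is how such a comment might look on a small function. This is only a sketch: toy_mean is a made-up function, not part of the R API, and the field names simply mirror the list above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * toy_mean (hypothetical, for illustration only)
 *
 * Short description:
 *   Arithmetic mean of an array of doubles.
 *
 * Extended description:
 *   Sums the n elements of x in index order and divides by n.
 *   No attempt is made to compensate for rounding error.
 *
 * Parameters:
 *   x - const double *; must point to at least n readable elements.
 *       May be NULL only when n is 0.
 *   n - size_t; number of elements to average.
 *
 * Return value:
 *   double; the mean of the elements, or 0.0 when n is 0.
 *
 * Errors/longjmp:
 *   Never signals an error and never longjmps.
 *
 * GC behavior:
 *   Never triggers a garbage collection (it allocates nothing).
 */
double toy_mean(const double *x, size_t n)
{
    /* Check the documented assumption about the parameters. */
    assert(x != NULL || n == 0);

    if (n == 0)
        return 0.0;

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double) n;
}
```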

Regarding the last point, however, we should be documenting not
whether a function *currently* has the possibility of calling the gc,
but whether or not we are *committed* to it never calling the gc, even
in future implementations. For example, I think we should make this
commitment to never do a gc for REAL. I think this should be a
yes/no commitment, not one conditional on arguments or other
circumstances. Leaving objects unprotected on the grounds that a
function that may cause a gc won't actually do so, because of the type
of an argument or whatnot, is far too fragile a way of writing code,
for a very minor performance gain.

Also, we should simply state that the arguments of *all* API functions
are automatically protected if the function does a gc. The computational
savings from not doing this are minor compared to the reliability and
convenience advantages of doing it. For example, coerceVector should
protect its SEXP argument (as is the case in pqR).
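
To make the callee-protects-its-arguments rule concrete, here is a minimal toy model in plain C. It is a sketch under loud assumptions: ToyObj, toy_protect, toy_unprotect, toy_gc, toy_alloc and toy_double are all invented for illustration and are not R's collector or API (the real API uses PROTECT and UNPROTECT on SEXP values). The toy allocator collects on every allocation, which is the worst case a caller has to plan for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_OBJECTS 16
#define TOY_MAX_PROTECT 16

/* Hypothetical heap object: a single double plus a liveness flag. */
typedef struct {
    bool in_use;
    double value;
} ToyObj;

static ToyObj toy_heap[TOY_MAX_OBJECTS];
static ToyObj *toy_protect_stack[TOY_MAX_PROTECT];
static size_t toy_protect_top = 0;

/* Push an object on the protect stack so toy_gc will keep it. */
void toy_protect(ToyObj *obj)
{
    assert(toy_protect_top < TOY_MAX_PROTECT);
    toy_protect_stack[toy_protect_top++] = obj;
}

/* Pop the n most recently protected objects. */
void toy_unprotect(size_t n)
{
    assert(n <= toy_protect_top);
    toy_protect_top -= n;
}

/* Toy collector: frees every object that is not on the protect stack. */
void toy_gc(void)
{
    for (size_t i = 0; i < TOY_MAX_OBJECTS; i++) {
        bool keep = false;
        for (size_t j = 0; j < toy_protect_top; j++) {
            if (toy_protect_stack[j] == &toy_heap[i])
                keep = true;
        }
        if (!keep)
            toy_heap[i].in_use = false;
    }
}

/* Toy allocator: always collects first, then reuses the first free slot.
 * Returns NULL if every slot is protected and in use. */
ToyObj *toy_alloc(double value)
{
    toy_gc();
    for (size_t i = 0; i < TOY_MAX_OBJECTS; i++) {
        if (!toy_heap[i].in_use) {
            toy_heap[i].in_use = true;
            toy_heap[i].value = value;
            return &toy_heap[i];
        }
    }
    return NULL;
}

/* Hypothetical API function that follows the proposed rule: it protects
 * its own argument across the allocation, so the caller need not.
 * Returns a new object holding twice the value of x. */
ToyObj *toy_double(ToyObj *x)
{
    toy_protect(x);
    ToyObj *result = toy_alloc(0.0);   /* collects; x survives */
    if (result != NULL)
        result->value = 2.0 * x->value;
    toy_unprotect(1);
    return result;
}
```

In this model a caller can write toy_double(toy_alloc(21.0)) without protecting the argument. If toy_double skipped the toy_protect call, the collection inside toy_alloc would free x and hand the same slot back as the result, silently overwriting the input.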

The tricky part, I think, is how much detail the description
contains. For Rf_findVarInFrame3, e.g., should the behavior
wrt. environments with the "UserDefinedDatabase" class be
documented? Is that public API, or just private API for a specific
package? What if the user accidentally puts this class on an

In general, we need to include all information that is needed. Not
all such information will necessarily be in the description of a
particular function, however. Some of it is general information about
the API that belongs elsewhere.

The issue regarding user databases is one of whether or not this
feature, implemented in the R interpreter but currently undocumented,
should or should not be retained. If it is retained, it should of
course be documented.

Functions and other definitions should also be grouped into
categories, with appropriate documentation for each category.
This is one way of providing higher level documentation, but I think
it's better to start with a high-level view of what's needed and
proceed downwards than to start by thinking in terms of documentation
on each function and then try to build higher-level stuff on that.

Should we consider tests to be part of the documentation?
There should be examples, and it would be nice if they were automatically
converted to tests, as for examples in R help, but there should be
additional tests that I think should not be regarded as part of the
documentation. I'm not sure whether or not such tests should be
something that this working group will tackle.
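
One possible shape for examples that double as tests, sketched in plain C: each documented example is stored next to the result the documentation claims, and a small runner checks them all. Everything here (toy_clamp, DocExample, run_doc_examples) is hypothetical and only meant to show the idea, not an existing R facility.

```c
#include <stddef.h>

/* Hypothetical documented function: restrict value to the range [lo, hi]. */
int toy_clamp(int value, int lo, int hi)
{
    if (value < lo)
        return lo;
    if (value > hi)
        return hi;
    return value;
}

/* One documentation example, kept in executable form. */
typedef struct {
    const char *text;   /* the example as it would appear in the docs */
    int (*run)(void);   /* the same example as code */
    int expected;       /* the result the docs claim */
} DocExample;

static int example_below(void)  { return toy_clamp(-5, 0, 10); }
static int example_inside(void) { return toy_clamp(7, 0, 10); }
static int example_above(void)  { return toy_clamp(99, 0, 10); }

static const DocExample toy_clamp_examples[] = {
    { "toy_clamp(-5, 0, 10) gives 0",  example_below,  0 },
    { "toy_clamp(7, 0, 10) gives 7",   example_inside, 7 },
    { "toy_clamp(99, 0, 10) gives 10", example_above,  10 },
};

/* Runs every example and returns how many produced a result that
 * differs from the documented one. */
size_t run_doc_examples(const DocExample *examples, size_t count)
{
    size_t failures = 0;
    for (size_t i = 0; i < count; i++) {
        if (examples[i].run() != examples[i].expected)
            failures++;
    }
    return failures;
}
```

Additional tests that are not meant for readers would live outside such a table, which keeps the documentation examples short.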

