13.5 接口文档

Previous13.4 更高层次的注释增强直觉 Next13.6 实现注释：是什么和为什么，而不是怎么做

Last updated 2 years ago

13.5 接口文档

注释的一个最重要的作用是定义抽象。回顾，抽象是一个实体的简化视图，它保留了基本信息，但省略了可以安全忽略的细节。代码不适合用来描述抽象；它太低层了，而且它包括了一些不应该在抽象中看到的实现细节。描述抽象的唯一方法是用注释。如果你想让代码呈现良好的抽象，你必须用注释来记录这些抽象。

记录抽象的第一步是将接口注释（interface comments）与实现注释（implementation comments）分开。接口注释提供了人们在使用一个类或方法时需要知道的信息；它们定义了抽象。实现注释描述了一个类或方法如何在内部工作，以便实现抽象。将这两种注释分开是很重要的，这样接口的用户就不会被暴露在实现细节中。此外，这两种形式最好是不同的。如果接口注释也必须描述实现，那么这个类或方法就是浅的。这意味着写注释的行为可以提供关于设计质量的线索；第15章将重新讨论这个想法。

一个类的接口注释提供了对该类所提供的抽象的高层次描述，比如下面这样：

/**
 * This class implements a simple server-side interface to the HTTP 
 * protocol: by using this class, an application can receive HTTP 
 * requests, process them, and return responses. Each instance of 
 * this class corresponds to a particular socket used to receive 
 * requests. The current implementation is single-threaded and 
 * processes one request at a time.
 */ 
public class Http {...}

该注释描述了类的整体功能，没有任何实现细节，甚至没有特定方法的细节。它还描述了类的每个实例代表什么。最后，注释描述了该类的限制（它不支持来自多个线程的并发访问），这对于考虑是否使用它的开发人员可能很重要。

方法的接口注释既包括用于抽象的更高层次信息，也包括用于精确性的更低层次细节：

注释通常以一两句话开头，描述调用者所感知的方法的行为；这是更高层次的抽象。
注释必须描述每个参数和返回值（如果有的话）。这些注释必须非常精确，并且必须描述对参数值的任何约束以及参数之间的依赖关系。
如果该方法有任何副作用，必须在接口注释中加以记录。副作用是该方法影响系统未来行为的任何后果，但不是结果的一部分。例如，如果该方法向内部数据结构添加了一个值，该值可以通过将来的方法调用来检索，这是一个副作用；向文件系统进行写入也是一个副作用。
方法的接口注释必须描述可以从该方法发出的任何异常。
如果在调用方法之前必须满足任何先决条件，则必须描述这些条件（也许必须先调用其他方法；对于二分搜索方法，必须对正在搜索的列表进行排序）。尽量减少先决条件是个好主意，但任何剩余的条件都必须被记录下来。

下面是一个从Buffer对象中复制数据的方法的接口注释：

/** 
* Copy a range of bytes from a buffer to an external location. 
* 
* \param offset 
*         Index within the buffer of the first byte to copy.
* \param length 
*         Number of bytes to copy.
* \param dest 
*         Where to copy the bytes: must have room for at least 
*         length bytes.
* 
* \return 
*         The return value is the actual number of bytes copied, 
*         which may be less than length if the requested range of 
*         bytes extends past the end of the buffer. 0 is returned 
*         if there is no overlap between the requested range and 
*         the actual buffer.
*/ 
uint32_t 
Buffer::copy(uint32_t offset, uint32_t length, void* dest) 
...

作为一个扩展的例子，让我们考虑一个叫做IndexLookup的类，它是一个分布式存储系统的一部分。该存储系统拥有一个表的集合，每个表都包含许多对象。此外，每个表可以有一个或多个索引；每个索引根据对象的一个特定字段提供对表中对象的有效访问。例如，一个索引可以用来根据对象的name字段来查找，另一个索引可以用来根据对象的age字段来查找。通过这些索引，应用程序可以快速提取所有具有特定name的对象，或者所有具有特定范围的age的对象。

IndexLookup类提供了一个方便的接口来执行索引查询。下面是一个如何在应用程序中使用它的示例：

query = new IndexLookup(table, index, key1, key2); 
while (true) {
    object = query.getNext(); 
    if (object == NULL) { 
        break; 
    } 
    ... process object ...
}

应用程序首先构造一个IndexLookup类型的对象，提供参数来选择一个表、一个索引和索引中的一个范围（例如，如果索引是基于一个age字段，key1和key2可能被指定为21和65，以选择所有年龄在这些值之间的对象）。然后，应用程序反复调用getNext方法。每次调用都会返回一个在所需范围内的对象；一旦所有的匹配对象都被返回，getNext就会返回NULL。由于存储系统是分布式的，这个类的实现有些复杂。一个表中的对象可能分布在多个服务器上，每个索引也可能分布在不同的服务器上；IndexLookup类中的代码必须首先与所有相关的索引服务器通信，以收集范围内对象的信息，然后必须与实际存储对象的服务器通信，以检索其值。

现在让我们考虑一下这个类的接口注释中需要包括哪些信息。对于下面给出的每一条信息，问问你自己，为了使用这个类，开发者是否需要知道这些信息（我对这些问题的回答在本章的最后）。

IndexLookup类向持有索引和对象的服务器发送的信息的格式。
用来确定某个对象是否属于所需范围的比较函数（是用整数、浮点数还是字符串进行比较？）
用于在服务器上存储索引的数据结构。
IndexLookup是否向不同的服务器并发地发出多个请求。
处理服务器崩溃的机制。

下面是IndexLookup类的接口注释的原始版本；摘录的内容还包括该类定义中的几行，注释中提到了这些内容：

/*
 * This class implements the client side framework for index range 
 * lookups. It manages a single LookupIndexKeys RPC and multiple 
 * IndexedRead RPCs. Client side just includes "IndexLookup.h" in 
 * its header to use IndexLookup class. Several parameters can be set 
 * in the config below:
 * - The number of concurrent indexedRead RPCs 
 * - The max number of PKHashes a indexedRead RPC can hold at a time 
 * - The size of the active PKHashes 
 * 
 * To use IndexLookup, the client creates an object of this class by 
 * providing all necessary information. After construction of 
 * IndexLookup, client can call getNext() function to move to next 
 * available object. If getNext() returns NULL, it means we reached 
 * the last object. Client can use getKey, getKeyLength, getValue, 
 * and getValueLength to get object data of current object.
 */ 
 class IndexLookup {
     ... 
   private:
     /// Max number of concurrent indexedRead RPCs
     static const uint8_t NUM_READ_RPC = 10;
     /// Max number of PKHashes that can be sent in one
     /// indexedRead RPC
     static const uint32_t MAX_PKHASHES_PERRPC = 256;
     /// Max number of PKHashes that activeHashes can
     /// hold at once. 
     static const size_t MAX_NUM_PK = (1 << LG_BUFFER_SIZE);
 }

在进一步阅读之前，看看你是否能找出这个注释中的问题。以下是我发现的问题：

第一段的大部分内容是关于实现的，而不是关于接口的。举个例子，用户不需要知道用于与服务器通信的特定远程过程调用的名称。第一段后半部分提到的配置参数都是私有变量，只与类的维护者有关，与用户无关。所有这些实现信息都应该从注释中省略。
该注释还包括一些明显的东西。例如，没有必要告诉用户需要包含IndexLookup.h：任何写C++代码的人都能猜到这是必要的。此外，“通过提供所有必要的信息（by providing all necessary information）”这段文字没有说什么，所以可以省略。为这个类提供一个较短的注释就足够了（也是更好的）。

/*
 * This class is used by client applications to make range queries 
 * using indexes. Each instance represents a single range query.
 * 
 * To start a range query, a client creates an instance of this 
 * class. The client can then call getNext() to retrieve the objects 
 * in the desired range. For each object returned by getNext(), the 
 * caller can invoke getKey(), getKeyLength(), getValue(), and 
 * getValueLength() to get information about that object.
 */

这条注释的最后一段并不是严格必要的，因为它主要是重复了个别方法的评论信息。然而，在类的文档中举例说明它的方法是如何一起工作的，特别是对于那些使用模式不明显的深类来说，这可能是有帮助的。请注意，新的注释并没有提到getNext的NULL返回值。这个注释并不打算记录每个方法的每一个细节；它只是提供高层次的信息来帮助读者理解这些方法是如何一起工作的，以及每个方法何时可能被调用。关于细节，读者可以参考各个方法的接口注释。本注释也没有提到服务器崩溃；那是因为服务器崩溃对这个类的用户来说是不可见的（系统会自动恢复）。

危险信号：实现文档污染了接口

当接口文档（例如方法的文档）描述了使用所记录的事物所不需要的实现细节时，就会出现这种危险信号。

现在考虑以下代码，它显示了IndexLookup中isReady方法的第一版文档。

/** 
* Check if the next object is RESULT_READY. This function is 
* implemented in a DCFT module, each execution of isReady() tries 
* to make small progress, and getNext() invokes isReady() in a 
* while loop, until isReady() returns true.
* 
* isReady() is implemented in a rule-based approach. We check 
* different rules by following a particular order, and perform 
* certain actions if some rule is satisfied.
* 
* \return 
*         True means the next Object is available. Otherwise, return 
*         false.
*/
bool IndexLookup::isReady() { ... }

再一次，本文档的大部分内容，例如对 DCFT 的引用和整个第二段，都与实现有关，因此不属于这里；这是接口注释中最常见的错误之一。一些实现文档是有用的，但它应该放在方法内部，与接口文档明确分开。此外，文档的第一句话含糊其辞（RESULT_READY 是什么意思？），一些重要的信息被遗漏。最后，这里没有必要描述 getNext 的实现。这是一个更好的注释版本：

/*
* Indicates whether an indexed read has made enough progress for 
* getNext to return immediately without blocking. In addition, this 
* method does most of the real work for indexed reads, so it must 
* be invoked (either directly, or indirectly by calling getNext) in 
* order for the indexed read to make progress.
* 
* \return 
*         True means that the next invocation of getNext will not block 
*         (at least one object is available to return, or the end of the 
*         lookup has been reached); false means getNext may block.
*/

这个版本的注释提供了关于“准备好”的更精确的信息，同时提供了一个重要的信息，即如果要推进索引检索，最终必须调用这个方法。

Previous13.4 更高层次的注释增强直觉 Next13.6 实现注释：是什么和为什么，而不是怎么做

Last updated 2 years ago

一个类的接口注释提供了对该类所提供的抽象的高层次描述，比如下面这样：

/**
 * This class implements a simple server-side interface to the HTTP 
 * protocol: by using this class, an application can receive HTTP 
 * requests, process them, and return responses. Each instance of 
 * this class corresponds to a particular socket used to receive 
 * requests. The current implementation is single-threaded and 
 * processes one request at a time.
 */ 
public class Http {...}

方法的接口注释既包括用于抽象的更高层次信息，也包括用于精确性的更低层次细节：

注释通常以一两句话开头，描述调用者所感知的方法的行为；这是更高层次的抽象。
注释必须描述每个参数和返回值（如果有的话）。这些注释必须非常精确，并且必须描述对参数值的任何约束以及参数之间的依赖关系。
如果该方法有任何副作用，必须在接口注释中加以记录。副作用是该方法影响系统未来行为的任何后果，但不是结果的一部分。例如，如果该方法向内部数据结构添加了一个值，该值可以通过将来的方法调用来检索，这是一个副作用；向文件系统进行写入也是一个副作用。
方法的接口注释必须描述可以从该方法发出的任何异常。
如果在调用方法之前必须满足任何先决条件，则必须描述这些条件（也许必须先调用其他方法；对于二分搜索方法，必须对正在搜索的列表进行排序）。尽量减少先决条件是个好主意，但任何剩余的条件都必须被记录下来。

下面是一个从Buffer对象中复制数据的方法的接口注释：

/** 
* Copy a range of bytes from a buffer to an external location. 
* 
* \param offset 
*         Index within the buffer of the first byte to copy.
* \param length 
*         Number of bytes to copy.
* \param dest 
*         Where to copy the bytes: must have room for at least 
*         length bytes.
* 
* \return 
*         The return value is the actual number of bytes copied, 
*         which may be less than length if the requested range of 
*         bytes extends past the end of the buffer. 0 is returned 
*         if there is no overlap between the requested range and 
*         the actual buffer.
*/ 
uint32_t 
Buffer::copy(uint32_t offset, uint32_t length, void* dest) 
...

这个注释的语法（如：\return）遵循Doxygen的惯例，这是一个从C/C++代码中提取注释并将其编译成网页的程序。该注释的目的是提供开发人员在调用该方法时需要的所有信息，包括如何处理特殊情况（注意该方法是如何遵循的建议，将任何与范围规范相关的错误定义为不存在的）。开发者不应该为了调用该方法而需要阅读该方法的主体，接口注释没有提供关于该方法如何实现的信息，例如它是如何扫描其内部数据结构以找到所需数据的。

IndexLookup类提供了一个方便的接口来执行索引查询。下面是一个如何在应用程序中使用它的示例：

query = new IndexLookup(table, index, key1, key2); 
while (true) {
    object = query.getNext(); 
    if (object == NULL) { 
        break; 
    } 
    ... process object ...
}

IndexLookup类向持有索引和对象的服务器发送的信息的格式。
用来确定某个对象是否属于所需范围的比较函数（是用整数、浮点数还是字符串进行比较？）
用于在服务器上存储索引的数据结构。
IndexLookup是否向不同的服务器并发地发出多个请求。
处理服务器崩溃的机制。

下面是IndexLookup类的接口注释的原始版本；摘录的内容还包括该类定义中的几行，注释中提到了这些内容：

/*
 * This class implements the client side framework for index range 
 * lookups. It manages a single LookupIndexKeys RPC and multiple 
 * IndexedRead RPCs. Client side just includes "IndexLookup.h" in 
 * its header to use IndexLookup class. Several parameters can be set 
 * in the config below:
 * - The number of concurrent indexedRead RPCs 
 * - The max number of PKHashes a indexedRead RPC can hold at a time 
 * - The size of the active PKHashes 
 * 
 * To use IndexLookup, the client creates an object of this class by 
 * providing all necessary information. After construction of 
 * IndexLookup, client can call getNext() function to move to next 
 * available object. If getNext() returns NULL, it means we reached 
 * the last object. Client can use getKey, getKeyLength, getValue, 
 * and getValueLength to get object data of current object.
 */ 
 class IndexLookup {
     ... 
   private:
     /// Max number of concurrent indexedRead RPCs
     static const uint8_t NUM_READ_RPC = 10;
     /// Max number of PKHashes that can be sent in one
     /// indexedRead RPC
     static const uint32_t MAX_PKHASHES_PERRPC = 256;
     /// Max number of PKHashes that activeHashes can
     /// hold at once. 
     static const size_t MAX_NUM_PK = (1 << LG_BUFFER_SIZE);
 }

在进一步阅读之前，看看你是否能找出这个注释中的问题。以下是我发现的问题：

第一段的大部分内容是关于实现的，而不是关于接口的。举个例子，用户不需要知道用于与服务器通信的特定远程过程调用的名称。第一段后半部分提到的配置参数都是私有变量，只与类的维护者有关，与用户无关。所有这些实现信息都应该从注释中省略。
该注释还包括一些明显的东西。例如，没有必要告诉用户需要包含IndexLookup.h：任何写C++代码的人都能猜到这是必要的。此外，“通过提供所有必要的信息（by providing all necessary information）”这段文字没有说什么，所以可以省略。为这个类提供一个较短的注释就足够了（也是更好的）。

/*
 * This class is used by client applications to make range queries 
 * using indexes. Each instance represents a single range query.
 * 
 * To start a range query, a client creates an instance of this 
 * class. The client can then call getNext() to retrieve the objects 
 * in the desired range. For each object returned by getNext(), the 
 * caller can invoke getKey(), getKeyLength(), getValue(), and 
 * getValueLength() to get information about that object.
 */

危险信号：实现文档污染了接口

当接口文档（例如方法的文档）描述了使用所记录的事物所不需要的实现细节时，就会出现这种危险信号。

现在考虑以下代码，它显示了IndexLookup中isReady方法的第一版文档。

/** 
* Check if the next object is RESULT_READY. This function is 
* implemented in a DCFT module, each execution of isReady() tries 
* to make small progress, and getNext() invokes isReady() in a 
* while loop, until isReady() returns true.
* 
* isReady() is implemented in a rule-based approach. We check 
* different rules by following a particular order, and perform 
* certain actions if some rule is satisfied.
* 
* \return 
*         True means the next Object is available. Otherwise, return 
*         false.
*/
bool IndexLookup::isReady() { ... }

/*
* Indicates whether an indexed read has made enough progress for 
* getNext to return immediately without blocking. In addition, this 
* method does most of the real work for indexed reads, so it must 
* be invoked (either directly, or indirectly by calling getNext) in 
* order for the indexed read to make progress.
* 
* \return 
*         True means that the next invocation of getNext will not block 
*         (at least one object is available to return, or the end of the 
*         lookup has been reached); false means getNext may block.
*/