Title: Thresholds for Detection                                   Page 1 of

Date Prepared: October 17, 1995 April 3, 1996
Priority: Routine
Document Affected: Design (1.52)
Paragraphs Affected:
References:

Change Required:
---------------
Modify RetrieveDocuments() for DocumentCollectionIndex [DCI] and AddQuery() for QueryCollectionIndex [QCI] to accept an additional float argument indicating the user's "relevance threshold" for retrieved or matched Documents. The threshold will be represented by a real number in the range [0.0, 1.0]. A threshold of 0.0 indicates that all matching Documents will be returned.

Specific Recommendation:
-----------------------
(add text) Section 1.2

Each component has a value, which may be....
  * a float (a real number)

(modify operation) Section 7.1.4 Document and Query Indexes

DocumentCollectionIndex
....
RetrieveDocuments(sequence of DocumentCollectionIndex, RetrievalQuery, NumberToRetrieve : integer, Threshold : float): Collection or nil

  Returns a Collection of Documents (of maximal length NumberToRetrieve) which are most closely related to the DetectionNeed from which the RetrievalQuery is derived. All returned Documents will have a relevance rating above the user-specified Threshold. The threshold will be a value in the range [0.0, 1.0]. A threshold of 0.0 indicates that all matching Documents, up to a maximum of NumberToRetrieve, will be returned. If no Documents match the RetrievalQuery argument, nil will be returned.

(modify operation) Section 7.1.4 Document and Query Indexes

QueryCollectionIndex
....
AddQuery(QueryCollectionIndex, RoutingQuery, Threshold : float)

  Adds RoutingQuery to the QueryCollectionIndex. If a RoutingQuery in the QueryCollectionIndex has a DetectionNeed component matching the DetectionNeed component of the RoutingQuery argument, the existing RoutingQuery is replaced by the RoutingQuery argument.
  The Threshold argument controls which Documents the RoutingQuery matches during RetrieveQueries() operations. The RoutingQuery will match a Document only if the Document's relevance rating is above the specified Threshold. A threshold of 0.0 indicates that the DetectionNeed from which the RoutingQuery was derived will be returned from the RetrieveQueries() operation for all matching Documents.

Reason for Proposed Change:
--------------------------
In the current TIPSTER Architecture, a user can query a DocumentCollectionIndex (DCI) to receive a Collection of Documents relevant to the user's information need (DetectionNeed). The user controls the retrieval of Documents only by specifying the maximum number of Documents to be returned. However, a user may wish to receive only Documents whose relevance is above a certain "relevance threshold".

The same limitation applies to the QueryCollectionIndex. A user can use the AddQuery() operation to add a RoutingQuery to a QueryCollectionIndex (QCI). Once added to the QCI, the RoutingQuery is used to determine whether an incoming Document matches the user's information need. Again, the user has no way of limiting the range of Documents which may match that information need. A user may wish to have the RoutingQuery match a Document only if the Document's relevance to the query is above a certain "relevance threshold".

Adding a relevance threshold to the operations RetrieveDocuments() for DocumentCollectionIndex [DCI] and AddQuery() for QueryCollectionIndex [QCI] offers users more flexibility. This type of filtering can currently be done outside the scope of the architecture, but doing so prevents plug and play of Detection systems. Adding the threshold to the TIPSTER API enables plug and play of Detection modules.
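
The threshold behavior described above can be illustrated with a short Python sketch. It is not part of the TIPSTER specification and is only one possible reading of it. The names ScoredDocument, retrieve_documents, and routing_query_matches are hypothetical, relevance ratings are assumed to be already computed and normalized to [0.0, 1.0], and a "matching" Document is assumed to be one with a rating greater than 0.0.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoredDocument:
    """Hypothetical stand-in for a Document together with its relevance rating."""
    doc_id: str
    relevance: float  # assumed to be normalized to [0.0, 1.0]


def retrieve_documents(scored: List[ScoredDocument],
                       number_to_retrieve: int,
                       threshold: float) -> Optional[List[ScoredDocument]]:
    """Sketch of RetrieveDocuments() with the proposed Threshold argument."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0.0, 1.0]")
    # Keep only Documents rated strictly above the threshold. With a
    # threshold of 0.0 this keeps every Document with a nonzero rating,
    # which is the assumed meaning of "all matching Documents".
    matching = [d for d in scored if d.relevance > threshold]
    if not matching:
        return None  # corresponds to "nil" in the operation description
    # Most relevant first, truncated to NumberToRetrieve.
    matching.sort(key=lambda d: d.relevance, reverse=True)
    return matching[:number_to_retrieve]


def routing_query_matches(relevance: float, threshold: float) -> bool:
    """Sketch of the per-Document test applied for a RoutingQuery that was
    added with AddQuery(..., Threshold)."""
    return relevance > threshold
```

In this reading both operations apply the same strict comparison, so a Document rated exactly at the threshold is not returned. How each Detection system maps its internal scores onto [0.0, 1.0] is left to that system.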
The "relevance threshold" shall be general so that all TIPSTER Detection systems can interpret the threshold in a manner appropriate to their system. The threshold will be represented by a float on a scale of [0.0, 1.0]. A threshold of 0.0 indicates that all matching Documents will be returned.

Applications Affected:

Change Requested By:
  Organization: University of Massachusetts
  Name: Kathleen S. DiBella
  Phone Number: (413)545-9781
  Date: