Coverage report: /home/ellis/comp/core/lib/nlp/section.lisp
Kind | Covered | All | % |
expression | 43 | 46 | 93.5 |
branch | 1 | 2 | 50.0 |
(in-package :nlp/section)

(defun extract-sections (text &key (epsilon 0.5))
  "Extract the sections from a string of text. Epsilon refers to the
distance between two points for them to be considered related."
  (labels ((average-distance (point points)
                             :key (lambda (i) (distance (vector-data i)
    (let ((collection (make-instance 'document-collection)))
      (loop for sentence in (sentence-tokenize text)
            do (add-document collection
                             (make-instance 'document-cluster
                                            :string-contents sentence)))
      (tf-vectorize-documents collection)
      (loop for document in (documents collection)
            with cluster-index = 0
            for cluster = (get-cluster cluster-index (documents collection))
            do (if (and cluster (>= epsilon (average-distance document cluster)))
                   (setf (cluster document) cluster-index)
                   (setf (cluster document) (incf cluster-index))))