vita nouva / diary
"The Rose Garden by Carl Aagaard"
13/01/2026

"Diary Entry - January 13, 2026"

02:58

I would never have guessed that reading about a president's private life would help me understand the nonidentity problem. Also, why are Thomas Jefferson's private affairs so well documented? #Philosophy

23:44

I was recently inspired by the Clarice Lispector Twitter bot to create something similar for Kierkegaard. For a non-anthology author like him (and like Clarice too, in fact), it was hard to imagine how this would work: how do you cut out sentences at random so that they still carry some context, neither too complete nor too ambiguous? The raw materials were three source files: a 594KB EPUB of Either/Or, an 825KB EPUB of personal journals and papers, and a 507KB text file of selected writings. Together, they contained over a million words. Not every sentence in a philosophy book is quotable. Chapter headings, footnote markers, translator notes, and incomplete fragments vastly outnumber the gems. A naive random selection would produce mostly garbage: "See vol. II, pp. 234-256" or "Continued from previous section".

So I designed a relatively simple filter:

private val philosophicalKeywords = Set(
  "soul", "despair", "anxiety", "freedom", "faith",
  "existence", "dread", "passion", "eternity", "infinite",
  "spirit", "silence", "god", "death", "suffering",
  // 180+ more terms...
)

def score(text: String): Int = {
  // Lowercase once so keyword matching is case-insensitive
  val lower = text.toLowerCase
  val keywordCount = philosophicalKeywords.count(kw => lower.contains(kw))
  val keywordScore = math.min(25, keywordCount * 4)  // capped at 25

  val lengthScore = text.length match {
    case l if l <= 120 => 25  // ideal length
    case l if l <= 150 => 22
    case l if l <= 200 => 15
    case _ => 5
  }

  // rhetoricalScore and structureScore are not shown in this excerpt
  keywordScore + lengthScore + rhetoricalScore + structureScore
}

On top of that sits a kind of intelligent quote scoring (shall I call it IQ?):

  • Keywords (180+ terms like "despair", "anxiety", "faith", "existence")
  • Rhetorical patterns (semicolons, em-dashes, contemplative ellipses)
  • Length optimization (60-150 characters score highest)
  • Aphoristic bonus (short quotes with multiple keywords get extra points)
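
To make the rhetorical and aphoristic parts a bit more concrete, here is a minimal sketch of how they might look. The object name, the weights (5 points per mark, capped at 15; a 10-point bonus), and the 100-character cutoff are all assumptions made for illustration, not the values used in the actual project.

```scala
object IqSketch {
  // Hypothetical weights: 5 points per rhetorical mark, capped at 15.
  def rhetoricalScore(text: String): Int = {
    val semicolons = text.count(_ == ';')
    val emDashes   = text.count(_ == '\u2014')  // em-dash character
    // Matches three literal dots or the single ellipsis character
    val ellipses   = "\\.\\.\\.|\u2026".r.findAllIn(text).size
    math.min(15, (semicolons + emDashes + ellipses) * 5)
  }

  // Hypothetical bonus: short text (<= 100 chars) with two or more keyword hits.
  def aphoristicBonus(text: String, keywords: Set[String]): Int = {
    val lower = text.toLowerCase
    val hits  = keywords.count(kw => lower.contains(kw))
    if (text.length <= 100 && hits >= 2) 10 else 0
  }
}
```

Under these assumed weights, a short line that mentions both "faith" and "passion" would pick up the bonus, while a bibliographic fragment with no keywords would get nothing.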

Shorter quotes with words like "existence" bubble to the top. After running this on all three source files, I got 15,455 quotes sorted by quality. Top score: 81. They're available in the repository if you'd like to have a look: https://github.com/larrasket/kierkegaard #Programming
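
The final ranking step could be sketched roughly like this. The names `Ranking`, `Scored`, and `rank` are hypothetical, and the scoring function is passed in as a parameter so the sketch stays self-contained; dropping duplicates first is also an assumption, not something the project necessarily does.

```scala
object Ranking {
  // A candidate sentence paired with its score
  final case class Scored(text: String, score: Int)

  // Drop exact duplicates, score each candidate, and sort best first.
  // sortBy is stable, so ties keep their original order.
  def rank(candidates: Seq[String], score: String => Int): Seq[Scored] =
    candidates.distinct
      .map(t => Scored(t, score(t)))
      .sortBy(s => -s.score)
}
```

Calling `Ranking.rank(sentences, score)` with the `score` function from earlier would give a list with the strongest candidates at the front.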

c. lr0 2025