superstruct-tech · ts-superstruct · May 6, 2025
diff --git a/docs/plans/03-talk-history.md b/docs/plans/03-talk-history.md
@@ -0,0 +1,55 @@
+# Talk History Query Feature – Planning Doc
+
+## Overview
+
+This feature lets users ask the AI in Action bot about previous talks, e.g., "@bot has there been a talk about A2A before?" or "@bot what talks have been about agents?". The bot should reply in the same format it uses for upcoming speakers.
+
+## High-Level Architecture
+
+- **Data Source:** All talks (past and future) are stored in the `ScheduledSpeaker` MongoDB collection.
+- **Query Logic:** When a user asks about past talks, the bot fetches all talks from the database, filters for those with a `scheduledDate` in the past, and uses the LLM to evaluate which talks match the user's query (by topic).
+- **Modular Components:**
+  - **Database Access Module:** Handles fetching all scheduled talks.
+  - **Talk History Module:** Filters past talks and delegates matching to the LLM.
+  - **LLM Matching Module:** Uses the LLM to determine which talks are relevant to the query.
+  - **Discord Intent Handler:** Detects user queries about past talks and formats responses.
+
+## Implementation Steps
+
+1. **Design and implement a `talkHistory` module** (e.g., `lib/talkHistory.js`):
+    - Fetch all talks from `ScheduledSpeaker`.
+    - Filter for past talks.
+    - Expose a function to query past talks by topic (delegating matching to the LLM).
+2. **Update Discord intent handling** (in `lib/discord/index.js`):
+    - Add logic to detect queries about past talks using LLM intent detection.
+    - Call the `talkHistory` module and format the response.
+3. **Integrate LLM for topic matching** (e.g., via `lib/llm/index.js`):
+    - Pass the user's query and the list of past talk topics to the LLM.
+    - Receive and process the LLM's response to identify relevant talks.
+4. **Testing:**
+    - Write unit tests for `talkHistory` (mocking DB and LLM).
+    - Write integration tests for Discord intent handling (mocking LLM and DB as needed).
+    - Add/extend tests for LLM matching logic.
+
+## Files to Update or Add
+
+- **New:** `lib/talkHistory.js` (core logic for querying/filtering past talks)
+- **Update:**
+  - `lib/discord/index.js` (handle new user queries)
+  - `models/scheduledSpeaker.js` (no changes needed; listed because `talkHistory` reads from it)
+- **New/Update:** Test files:
+  - `test/talkHistory.test.js` (unit tests for the new module)
+
+## Modularity
+
+- **Database logic** is isolated in the `talkHistory` module, so the DB can be mocked in tests.
+- **LLM logic** is abstracted, so it can be stubbed/mocked in tests.
+- **Discord intent handling** is separated from business logic, supporting independent testing.
+
+## Tests
+
+- Unit tests for `talkHistory`:
+  - Fetching and filtering past talks.
+  - Delegating topic matching to the LLM.
+
+---
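The filter-then-match flow described in the plan can be sketched as follows. This is a minimal illustration, not the PR's implementation: `selectPastTalks` and `queryPastTalks` are hypothetical names, and `matcher` is an injected function standing in for the LLM call so the flow can run without a database or model.

```javascript
// Hypothetical sketch of the plan's flow: keep only past talks, then
// delegate topic matching to an injected matcher (a stand-in for the LLM).

// Return the talks whose scheduledDate is strictly before `now`.
function selectPastTalks (talks, now = new Date()) {
  return talks.filter(talk => talk.scheduledDate < now)
}

// Ask the matcher for 0-based indices into the past talks and map them back
// to talk objects, dropping any index that points at nothing.
async function queryPastTalks (talks, query, matcher, now = new Date()) {
  const pastTalks = selectPastTalks(talks, now)
  if (pastTalks.length === 0) return []
  const indices = await matcher(query, pastTalks.map(talk => talk.topic))
  return indices.map(i => pastTalks[i]).filter(Boolean)
}
```

Passing `now` and `matcher` as parameters is a design choice made for this sketch: it keeps both the clock and the LLM out of the unit under test.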
diff --git a/lib/discord/index.js b/lib/discord/index.js
@@ -15,6 +15,7 @@ const {
   getUpcomingSchedule,
   cancelSpeaker,
 } = require('../schedulingLogic')
+const talkHistoryService = require('../talkHistory')
 
 const { token, guildId } = require('../../config').discord
 
@@ -103,6 +104,36 @@ function createClient() {
       return
     }
 
+    // Check for talk history queries
+    const talkHistorySystemMessage = `You are an assistant helping determine if a user's message is asking about past talks.
+Respond with ONLY 'talk_history' if the message is asking about past talks or talk history, or 'other' if it's about something else.`
+
+    try {
+      const intentResponse = await completion({
+        systemMessage: talkHistorySystemMessage,
+        prompt: message.content,
+      })
+      const intent = intentResponse?.trim().toLowerCase()
+
+      if (intent === 'talk_history') {
+        const matchingTalks = await talkHistoryService.queryTalks(message.content)
+
+        if (matchingTalks.length === 0) {
+          await message.reply('I couldn\'t find any past talks matching your query.')
+          return
+        }
+
+        const response = matchingTalks
+          .map(talk => talkHistoryService.formatTalk(talk))
+          .join('\n\n')
+
+        await message.reply(`I found some past talks that might interest you! Here they are:\n\n${response}`)
+        return
+      }
+    } catch (error) {
+      console.error('Error processing talk history query:', error)
+    }
+
     // --- Check if message is in an active sign-up thread ---
     const signupInfo = activeSignups[message.channel.id]
     if (
@@ -561,7 +592,7 @@ function createClient() {
       try {
         // Outer try for intent detection + handling
         const intentSystemMessage =
-          "You are an assistant classifying user intent in a Discord message where the bot was mentioned. Possible intents are 'sign_up', 'view_schedule', 'cancel_talk', or 'other'.\n- Classify as 'sign_up' ONLY if the user explicitly asks to sign up, volunteer, present, or talk.\n- Classify as 'view_schedule' ONLY if the user explicitly asks to see the schedule, upcoming talks, or who is speaking.\n- Classify as 'cancel_talk' ONLY if the user explicitly asks to cancel, withdraw, or back out of their scheduled talk.\n- Otherwise, classify as 'other'. This includes simple replies, acknowledgements, questions not related to the above, or unclear requests.\nRespond with ONLY the intent name ('sign_up', 'view_schedule', 'cancel_talk', 'other')."
+          "You are an assistant classifying user intent in a Discord message where the bot was mentioned. Possible intents are 'sign_up', 'view_schedule', 'cancel_talk', 'talk_history', or 'other'.\n- Classify as 'sign_up' ONLY if the user explicitly asks to sign up, volunteer, present, or talk.\n- Classify as 'view_schedule' ONLY if the user explicitly asks to see the schedule, upcoming talks, or who is speaking.\n- Classify as 'cancel_talk' ONLY if the user explicitly asks to cancel, withdraw, or back out of their scheduled talk.\n- Classify as 'talk_history' ONLY if the user asks about past talks, previous presentations, or historical talks.\n- Otherwise, classify as 'other'. This includes simple replies, acknowledgements, questions not related to the above, or unclear requests.\nRespond with ONLY the intent name ('sign_up', 'view_schedule', 'cancel_talk', 'talk_history', 'other')."
 
         console.log(`Sending to LLM: "${userMessageContent}"`)
         const intentResponse = await completion({
@@ -758,6 +789,31 @@ function createClient() {
             }
           }
         }
+        // --- Handle 'talk_history' Intent ---
+        else if (detectedIntent === 'talk_history') {
+          console.log('Talk history intent detected.')
+          try {
+            const matchingTalks = await talkHistoryService.queryTalks(userMessageContent)
+
+            if (matchingTalks.length === 0) {
+              await message.reply('I couldn\'t find any past talks matching your query.')
+              return
+            }
+
+            const response = matchingTalks
+              .map(talk => talkHistoryService.formatTalk(talk))
+              .join('\n\n')
+
+            await message.reply(`I found some past talks that might interest you! Here they are:\n\n${response}`)
+          } catch (error) {
+            console.error('Error processing talk history query:', error)
+            try {
+              await message.reply('Sorry, I encountered an error while searching for past talks. Please try again later.')
+            } catch (replyError) {
+              console.error('Failed to send talk history error reply:', replyError)
+            }
+          }
+        }
         // --- Handle 'other' or Unrecognized Intent ---
         else {
           // Handles 'other' or any unrecognized intent from LLM

diff --git a/lib/talkHistory.js b/lib/talkHistory.js
@@ -0,0 +1,80 @@
+const ScheduledSpeaker = require('../models/scheduledSpeaker')
+const completion = (...args) => require('./llm').completion(...args) // looked up per call so tests can stub llm.completion
+
+/**
+ * Get all talks from the database
+ * @returns {Promise<Array>} Array of all talks
+ */
+async function getAllTalks () {
+  return await ScheduledSpeaker.find().sort({ scheduledDate: -1 })
+}
+
+/**
+ * Query talks based on a topic using LLM for matching
+ * @param {string} query - The user's query about talks
+ * @returns {Promise<Array>} Array of matching talks
+ */
+async function queryTalks (query) {
+  const talks = (await getAllTalks()).filter(talk => talk.scheduledDate < new Date())
+
+  if (talks.length === 0) {
+    return []
+  }
+
+  // Prepare context for LLM with all fields from the model
+  const talksContext = talks.map(talk => ({
+    discordUserId: talk.discordUserId,
+    discordUsername: talk.discordUsername,
+    topic: talk.topic,
+    scheduledDate: talk.scheduledDate,
+    bookingTimestamp: talk.bookingTimestamp,
+    threadId: talk.threadId
+  }))
+
+  // Use LLM to determine which talks match the query
+  const prompt = `You are an assistant helping match user queries to relevant talks. Your primary focus should be on matching the topic of the talks.
+
+Given the following list of talks and a user query, identify which talks are relevant to the query by focusing on the topic field. Consider semantic similarity and related concepts.
+
+User Query: "${query}"
+
+Talks:
+${JSON.stringify(talksContext, null, 2)}
+
+Return a JSON array of indices (0-based) of the talks that match the query. If no talks match, return an empty array.
+Focus on matching the topic field, but also consider the context of the query.`
+
+  const llmResponse = await completion({
+    systemMessage: 'You are an assistant helping match user queries to relevant talks. Return ONLY a JSON array of indices. Focus on matching the topic field.',
+    prompt
+  })
+
+  try {
+    // Extract JSON from markdown code block if present
+    const jsonMatch = llmResponse.match(/```(?:json)?\s*(\[[\s\S]*?\])\s*```/) || [null, llmResponse]
+    const jsonStr = jsonMatch[1].trim()
+    const matchingIndices = JSON.parse(jsonStr)
+    return matchingIndices.map(index => talks[index]).filter(Boolean)
+  } catch (error) {
+    console.error('Error parsing LLM response:', error)
+    return []
+  }
+}
+
+/**
+ * Format a talk for display
+ * @param {Object} talk - The talk object
+ * @returns {string} Formatted talk information
+ */
+function formatTalk (talk) {
+  return `**${talk.topic}**\n` +
+         `Speaker: ${talk.discordUsername}\n` +
+         `Date: ${talk.scheduledDate.toLocaleDateString()}\n` +
+         `Booked on: ${talk.bookingTimestamp.toLocaleDateString()}\n`
+}
+
+module.exports = {
+  getAllTalks,
+  queryTalks,
+  formatTalk
+}
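Because `queryTalks` relies on the LLM returning a JSON array of indices, the parsing step is a natural place for defensive checks. The helper below is a hedged sketch only: `parseMatchingIndices` is a hypothetical name that does not appear in this PR, and it assumes the indices form a flat array of numbers.

```javascript
// Hypothetical helper: pull a flat array of valid talk indices out of raw
// LLM text. Returns [] whenever the text cannot be interpreted safely.
function parseMatchingIndices (llmResponse, talkCount) {
  if (typeof llmResponse !== 'string') return []

  // Take the first bracketed span; this also covers an array wrapped in a
  // markdown code fence, since the array itself still sits between [ and ].
  const bracketed = llmResponse.match(/\[[\s\S]*?\]/)
  if (bracketed === null) return []

  let parsed
  try {
    parsed = JSON.parse(bracketed[0])
  } catch (error) {
    return []
  }
  if (!Array.isArray(parsed)) return []

  // Keep only integers that point at an existing talk, without duplicates.
  const valid = parsed.filter(i => Number.isInteger(i) && i >= 0 && i < talkCount)
  return [...new Set(valid)]
}
```

With a helper like this, out-of-range, duplicate, or non-numeric entries never reach the formatter as missing talks.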
diff --git a/test/talkHistory.test.js b/test/talkHistory.test.js
@@ -0,0 +1,128 @@
+const test = require('tape')
+const mongoose = require('../lib/mongo')
+const ScheduledSpeaker = require('../models/scheduledSpeaker')
+const { getAllTalks, queryTalks, formatTalk } = require('../lib/talkHistory')
+const llm = require('../lib/llm')
+
+// Helper function to create a test talk
+function createTestTalk (overrides = {}) {
+  const defaultTalk = {
+    discordUserId: 'test-user',
+    discordUsername: 'TestUser',
+    topic: 'Test Topic',
+    scheduledDate: new Date(),
+    bookingTimestamp: new Date(),
+    threadId: 'test-thread'
+  }
+  return { ...defaultTalk, ...overrides }
+}
+
+test('talkHistory - getAllTalks - empty database', async (t) => {
+  await ScheduledSpeaker.deleteMany({}) // Clean slate
+  const talks = await getAllTalks()
+  t.deepEqual(talks, [], 'should return empty array when no talks exist')
+  t.end()
+})
+
+test('talkHistory - getAllTalks - multiple talks', async (t) => {
+  await ScheduledSpeaker.deleteMany({}) // Clean slate
+
+  // Create test talks with different dates
+  const talk1 = createTestTalk({
+    topic: 'First Talk',
+    scheduledDate: new Date('2024-01-01')
+  })
+  const talk2 = createTestTalk({
+    topic: 'Second Talk',
+    scheduledDate: new Date('2024-02-01')
+  })
+  const talk3 = createTestTalk({
+    topic: 'Third Talk',
+    scheduledDate: new Date('2024-03-01')
+  })
+
+  await ScheduledSpeaker.insertMany([talk1, talk2, talk3])
+
+  const talks = await getAllTalks()
+
+  t.equal(talks.length, 3, 'should return all talks')
+  t.equal(talks[0].topic, 'Third Talk', 'should be sorted by date descending')
+  t.equal(talks[1].topic, 'Second Talk', 'should be sorted by date descending')
+  t.equal(talks[2].topic, 'First Talk', 'should be sorted by date descending')
+
+  await ScheduledSpeaker.deleteMany({}) // Cleanup
+  t.end()
+})
+
+test('talkHistory - queryTalks - no talks', async (t) => {
+  await ScheduledSpeaker.deleteMany({}) // Clean slate
+  const results = await queryTalks('any query')
+  t.deepEqual(results, [], 'should return empty array when no talks exist')
+  t.end()
+})
+
+test('talkHistory - queryTalks - matching talks', async (t) => {
+  await ScheduledSpeaker.deleteMany({}) // Clean slate
+
+  // Create test talks with different topics
+  const talk1 = createTestTalk({
+    topic: 'Introduction to Machine Learning',
+    scheduledDate: new Date('2024-01-01')
+  })
+  const talk2 = createTestTalk({
+    topic: 'Deep Learning Basics',
+    scheduledDate: new Date('2024-02-01')
+  })
+  const talk3 = createTestTalk({
+    topic: 'Web Development with React',
+    scheduledDate: new Date('2024-03-01')
+  })
+
+  await ScheduledSpeaker.insertMany([talk1, talk2, talk3])
+
+  // Save original completion function
+  const originalCompletion = llm.completion
+
+  // Mock completion function
+  llm.completion = async () => '```json\n[0, 1]\n```'
+
+  const results = await queryTalks('machine learning')
+
+  t.equal(results.length, 2, 'should return exactly two matching results')
+  t.ok(results.some(talk => talk.topic === 'Introduction to Machine Learning'), 'should include Introduction to Machine Learning')
+  t.ok(results.some(talk => talk.topic === 'Deep Learning Basics'), 'should include Deep Learning Basics')
+
+  // Verify each result has the expected structure
+  results.forEach(talk => {
+    t.ok(talk.topic, 'each result should have a topic')
+    t.ok(talk.discordUsername, 'each result should have a username')
+    t.ok(talk.scheduledDate instanceof Date, 'each result should have a date')
+  })
+
+  // Restore original completion function
+  llm.completion = originalCompletion
+
+  await ScheduledSpeaker.deleteMany({}) // Cleanup
+  t.end()
+})
+
+test('talkHistory - formatTalk', (t) => {
+  const talk = createTestTalk({
+    topic: 'Test Topic',
+    discordUsername: 'TestUser',
+    scheduledDate: new Date('2024-01-01'),
+    bookingTimestamp: new Date('2024-01-01')
+  })
+
+  const formatted = formatTalk(talk)
+
+  t.ok(formatted.includes('Test Topic'), 'should include topic')
+  t.ok(formatted.includes('TestUser'), 'should include username')
+  t.ok(formatted.includes('**'), 'should format topic in bold')
+
+  // Check date formatting more robustly
+  const dateRegex = /\d{1,4}[\/.-]\d{1,2}[\/.-]\d{1,4}/ // toLocaleDateString is locale-dependent, e.g. 1/1/2024, 01.01.2024, 2024-01-01
+  t.ok(dateRegex.test(formatted), 'should include properly formatted dates')
+
+  t.end()
+})
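The `queryTalks` test swaps `llm.completion` at runtime. Whether a stub installed that way is actually used depends on how the module under test holds the function. The sketch below illustrates the difference with a plain object standing in for `lib/llm`; the object and the names `captured`, `lazy`, and `demo` are assumptions made for illustration.

```javascript
// Stand-in for lib/llm (hypothetical; the real module is not shown here).
const llm = { completion: async () => 'real' }

// A destructured binding is captured once, the way
// `const { completion } = require('./llm')` captures it at load time.
const { completion: captured } = llm

// A wrapper that looks the property up on every call.
const lazy = (...args) => llm.completion(...args)

// A test later replaces the property on the module object.
llm.completion = async () => 'stub'

// The captured binding still calls the original; the wrapper sees the stub.
async function demo () {
  return [await captured(), await lazy()]
}
```

Here `demo()` resolves to `['real', 'stub']`, which is why a stub assigned to the module object only reaches code that reads the property at call time.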