search-strategy

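Query decomposition and multi-source search orchestration. Breaks natural language questions into targeted searches per source, translates queries into source-specific syntax, ranks results by relevance, and handles ambiguity and fallback strategies.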

SKILL.md
---
name: search-strategy
description: Query decomposition and multi-source search orchestration. Breaks natural language questions into targeted searches per source, translates queries into source-specific syntax, ranks results by relevance, and handles ambiguity and fallback strategies.
---

Search Strategy

If you see unfamiliar placeholders or need to check which tools are connected, see CONNECTORS.md.

The core intelligence behind enterprise search. Transforms a single natural language question into parallel, source-specific searches and produces ranked, deduplicated results.

The Goal

Turn this:

code
"What did we decide about the API migration timeline?"

Into targeted searches across every connected source:

code
~~chat:  "API migration timeline decision" (semantic) + "API migration" in:#engineering after:2025-01-01
~~knowledge base: semantic search "API migration timeline decision"
~~project tracker:  text search "API migration" in relevant workspace

Then synthesize the results into a single coherent answer.

Query Decomposition

Step 1: Identify Query Type

Classify the user's question to determine search strategy:

| Query Type | Example | Strategy |
|---|---|---|
| Decision | "What did we decide about X?" | Prioritize conversations (Google Chat, Gmail), look for conclusion signals |
| Status | "What's the status of Project Y?" | Prioritize recent activity, task trackers, status updates |
| Document | "Where's the spec for Z?" | Prioritize Drive, wiki, shared docs |
| Person | "Who's working on X?" | Search task assignments, message authors, doc collaborators |
| Factual | "What's our policy on X?" | Prioritize wiki, official docs, then confirmatory conversations |
| Temporal | "When did X happen?" | Search with broad date range, look for timestamps |
| Exploratory | "What do we know about X?" | Broad search across all sources, synthesize |
| Brand | "What do we know about [brand]?" | Search brand directory, CRM, editorial archives, marketplace |
| Job market | "Job postings for [company]" | Search jobs dataset, employer CRM records, API integrations |
| Editorial | "Articles about [topic]" | Search editorial archives, GraphQL API, Drive |
| Catalog | "Products from [brand]" | Search marketplace dataset, product database, GraphQL API |
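
A rough sketch of how this classification could be bootstrapped with cue words. The cue patterns, their order, and the `classify_query` name are assumptions made for illustration, not part of the skill. Brand queries are not separated from exploratory ones here, because telling them apart needs an entity lookup rather than cue words.

```python
import re

# Hypothetical cue patterns per query type. Order matters: the first match wins.
QUERY_TYPE_CUES = [
    ("decision", r"\b(decide|decided|decision|agreed?)\b"),
    ("status", r"\b(status|progress)\b"),
    ("document", r"\b(where|spec|docs?|document)\b"),
    ("person", r"\bwho\b"),
    ("factual", r"\b(policy|policies|guidelines?)\b"),
    ("temporal", r"\bwhen\b"),
    ("job_market", r"\b(jobs?|postings?)\b"),
    ("editorial", r"\b(articles?|coverage)\b"),
    ("catalog", r"\b(products?|catalog)\b"),
]


def classify_query(question: str) -> str:
    """Return the first query type whose cue pattern matches the question."""
    text = question.lower()
    for query_type, pattern in QUERY_TYPE_CUES:
        if re.search(pattern, text):
            return query_type
    # Nothing matched: fall back to a broad search across all sources.
    return "exploratory"


print(classify_query("What did we decide about X?"))  # decision
print(classify_query("What do we know about X?"))     # exploratory
```

A model-based classifier would likely handle paraphrases better; the cue list only shows the shape of the mapping from question to strategy.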

Step 2: Extract Search Components

From the query, extract:

  • Keywords: Core terms that must appear in results
  • Entities: People, projects, teams, tools, brands, markets (use memory system if available)
  • Intent signals: Decision words, status words, temporal markers
  • Constraints: Time ranges, source hints, author filters, market/region filters
  • Negations: Things to exclude
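
The extraction step could be sketched as follows. The `SearchComponents` container, the stopword and intent word lists, and the `-term` negation convention are all assumptions for illustration. Entity resolution is left out because it depends on the memory system.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SearchComponents:
    keywords: list[str] = field(default_factory=list)
    intent_signals: list[str] = field(default_factory=list)
    constraints: dict[str, str] = field(default_factory=dict)
    negations: list[str] = field(default_factory=list)


# Hypothetical word lists; a real deployment would tune these.
STOPWORDS = {"what", "did", "we", "about", "the", "a", "an", "of", "for",
             "is", "on", "to", "our", "do"}
INTENT_WORDS = {"decide", "decided", "decision", "status", "when", "who"}

FILTER_RE = re.compile(r"\b(from|in|after|before|type):(\S+)")
NEGATION_RE = re.compile(r"(?:^|\s)-(\w+)")  # assumes "-term" marks an exclusion


def extract_components(query: str) -> SearchComponents:
    comps = SearchComponents()
    # Pull out key:value constraints first so they are not treated as keywords.
    for key, value in FILTER_RE.findall(query):
        comps.constraints[key] = value
    rest = FILTER_RE.sub(" ", query)
    comps.negations = NEGATION_RE.findall(rest)
    rest = NEGATION_RE.sub(" ", rest)
    # Whatever remains is either an intent signal, a stopword, or a keyword.
    for token in re.findall(r"[A-Za-z0-9]+", rest.lower()):
        if token in INTENT_WORDS:
            comps.intent_signals.append(token)
        elif token not in STOPWORDS:
            comps.keywords.append(token)
    return comps


c = extract_components("What did we decide about the API migration after:2025-01-01 -legacy")
print(c.keywords)     # ['api', 'migration']
print(c.constraints)  # {'after': '2025-01-01'}
```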

Step 3: Generate Sub-Queries Per Source

For each available source, create one or more targeted queries:

Prefer semantic search for:

  • Conceptual questions ("What do we think about...")
  • Questions where exact keywords are unknown
  • Exploratory queries

Prefer keyword search for:

  • Known terms, project names, acronyms, brand names
  • Exact phrases the user quoted
  • Filter-heavy queries (from:, in:, after:)

Generate multiple query variants when the topic might be referred to differently:

code
User: "Kubernetes setup"
Queries: "Kubernetes", "k8s", "cluster", "container orchestration"
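
One way to produce such variants is a small alias table. This is a minimal sketch: the `ALIASES` table and the `expand_variants` helper are hypothetical, and the single entry simply mirrors the example above.

```python
# Hypothetical alias table; in practice it might come from a glossary or memory system.
ALIASES = {
    "kubernetes": ["k8s", "cluster", "container orchestration"],
}


def expand_variants(topic: str) -> list[str]:
    """Return the known term plus its aliases, or the topic itself if nothing is known."""
    variants: list[str] = []
    for word in topic.split():
        aliases = ALIASES.get(word.lower())
        if aliases is not None:
            variants.append(word)
            variants.extend(aliases)
    if not variants:
        variants.append(topic)
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(variants))


print(expand_variants("Kubernetes setup"))
# ['Kubernetes', 'k8s', 'cluster', 'container orchestration']
```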

Source-Specific Query Translation

~~chat (Google Chat)

Semantic search (natural language questions):

code
query: "What is the status of project aurora?"

Keyword search:

code
query: "project aurora status update"
query: "aurora in:#engineering after:2025-01-15"
query: "from:<@UserID> aurora"

Filter mapping:

| Enterprise filter | Google Chat syntax |
|---|---|
| `from:sarah` | `from:sarah` or `from:<@USERID>` |
| `in:engineering` | `in:engineering` |
| `after:2025-01-01` | `after:2025-01-01` |
| `before:2025-02-01` | `before:2025-02-01` |
| `type:thread` | `is:thread` |
| `type:file` | `has:file` |
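
The mapping above can be applied mechanically. The sketch below assumes filters arrive as a plain dict; `to_chat_query` is a hypothetical helper, not a connector API, and it only covers the rows listed in the table.

```python
# Filters that keep the same operator name in the chat syntax.
PASS_THROUGH = ("from", "in", "after", "before")
# Enterprise "type" values that map to a different chat operator.
TYPE_OPERATORS = {"thread": "is:thread", "file": "has:file"}


def to_chat_query(terms: str, filters: dict[str, str]) -> str:
    """Render free-text terms plus enterprise filters as one chat search string."""
    parts = [terms] if terms else []
    for key, value in filters.items():
        if key in PASS_THROUGH:
            parts.append(f"{key}:{value}")
        elif key == "type" and value in TYPE_OPERATORS:
            parts.append(TYPE_OPERATORS[value])
        else:
            # Fail loudly rather than silently dropping a constraint.
            raise ValueError(f"unsupported filter {key}:{value}")
    return " ".join(parts)


print(to_chat_query("aurora", {"in": "engineering", "after": "2025-01-15"}))
# aurora in:engineering after:2025-01-15
```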

~~knowledge base (Wiki / GitHub)

Semantic search — Use for conceptual queries:

code
descriptive_query: "API migration timeline and decision rationale"

Keyword search — Use for exact terms:

code
query: "API migration"
query: "\"API migration timeline\""  (exact phrase)

~~code repositories (GitHub)

File search:

code
path:src/components filename:Button

Code search:

code
language:typescript "interface Product"
repo:fashionunited/api "GraphQL schema"

README/doc search:

code
path:README.md OR path:docs "deployment"

Repository-specific searches for FashionUnited:

| Query Intent | Repository | Search Pattern |
|---|---|---|
| API endpoints | api | `path:src filename:resolver OR schema` |
| UI components | frontend | `path:src/components` |
| Data pipelines | integrations | `path:feeds OR path:sync` |
| Product schema | product-database | `path:models OR schema` |
| Deploy procedures | deploy | `path:docs OR README` |
| Company policies | about | `path:handbook OR policies` |

~~project tracker (GitHub Issues/Projects)

Task search:

code
text: "API migration"
is:issue is:open
assignee:username
label:priority-high

Filter mapping:

| Enterprise filter | GitHub syntax |
|---|---|
| `from:sarah` | `author:sarah` or `assignee:sarah` |
| `after:2025-01-01` | `created:>2025-01-01` |
| `type:milestone` | `milestone:"Milestone Name"` |
| `status:open` | `is:open` |
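
A sketch of the same translation for the issue tracker, under the assumption that the caller decides whether `from:` should mean the author or the assignee. `to_github_query` and its `person_role` parameter are hypothetical names. The milestone row is left out because it needs a milestone name that the enterprise filter does not carry.

```python
def to_github_query(text: str, filters: dict[str, str], person_role: str = "author") -> str:
    """Render text plus enterprise filters as a GitHub issue search string."""
    if person_role not in ("author", "assignee"):
        raise ValueError("person_role must be 'author' or 'assignee'")
    parts = [text] if text else []
    for key, value in filters.items():
        if key == "from":
            # The table allows either qualifier; the caller picks one.
            parts.append(f"{person_role}:{value}")
        elif key == "after":
            parts.append(f"created:>{value}")
        elif key == "status" and value == "open":
            parts.append("is:open")
        else:
            raise ValueError(f"unsupported filter {key}:{value}")
    return " ".join(parts)


print(to_github_query("API migration", {"from": "sarah", "after": "2025-01-01", "status": "open"}))
# API migration author:sarah created:>2025-01-01 is:open
```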

~~data warehouse (BigQuery)

Dataset-specific queries for FashionUnited:

| Query Type | Dataset | Contents |
|---|---|---|
| Editorial content | editorial | Articles, publication dates, authors |
| Job market | jobs | Job postings, employer data, market trends |
| Marketplace | marketplace | Product listings, brand catalogs |
| Traffic/analytics | analytics | Page views, engagement, traffic sources |
| Ad performance | advertising | Campaign metrics, impression data |

Query patterns:

sql
-- Editorial archives
SELECT * FROM editorial.articles WHERE title LIKE '%sustainable fashion%'

-- Job posting history
SELECT * FROM jobs.postings WHERE company = 'Brand Name' AND posted_date > '2024-01-01'

-- Marketplace catalog
SELECT * FROM marketplace.products WHERE brand = 'Brand Name'

~~CRM (Vtiger)

Record search:

code
module: Accounts, Contacts, Opportunities, Invoices
search: "Brand Name" OR "Company Name"

Filter mapping:

| Enterprise filter | Vtiger query |
|---|---|
| `type:account` | `module=Accounts` |
| `type:contact` | `module=Contacts` |
| `status:active` | `accountstatus=Active` |
| `owner:username` | `assigned_user_id=username` |

Fashion Industry Query Patterns

Brand Lookup

When the query mentions a brand name, search across multiple sources:

code
User: "What do we know about Gucci?"

1. ~~CRM: Search Accounts for "Gucci" (customer relationship, billing)
2. BigQuery/editorial: Search articles mentioning "Gucci" (news coverage)
3. BigQuery/marketplace: Search products from "Gucci" (catalog presence)
4. BigQuery/jobs: Search job postings from "Gucci" (employer data)
5. GraphQL API: Search brand directory for "Gucci" (official profile)

Job Market Queries

When the query is about jobs or employers:

code
User: "Design jobs in London this month"

1. BigQuery/jobs: Query postings WHERE category='design' AND location='London' AND posted_date > 30 days ago
2. ~~CRM: Check employer accounts in London with active subscriptions
3. GraphQL API: Real-time job search for design roles in London

Editorial Archive Queries

When the query is about news, articles, or content:

code
User: "Coverage of Paris Fashion Week 2025"

1. BigQuery/editorial: Query articles WHERE topic='Paris Fashion Week' AND year=2025
2. GraphQL API: Full-text search for "Paris Fashion Week 2025"
3. ~~cloud storage: Check Drive for editorial calendars, photo archives
4. ~~chat: Search #editorial channel for PFW discussions

Marketplace/Catalog Queries

When the query is about products or catalog:

code
User: "Products from Zara in the women's category"

1. BigQuery/marketplace: Query products WHERE brand='Zara' AND category='women'
2. GitHub/product-database: Check product schema and feed specs
3. GraphQL API: Real-time product search
4. ~~CRM: Check Zara account status for catalog integration

Result Ranking

Relevance Scoring

Score each result on these factors (weighted by query type):

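| Factor | Weight (Decision) | Weight (Status) | Weight (Document) | Weight (Factual) | Weight (Brand) |
|---|---|---|---|---|---|
| Keyword match | 0.3 | 0.2 | 0.4 | 0.3 | 0.4 |
| Freshness | 0.3 | 0.4 | 0.2 | 0.1 | 0.2 |
| Authority | 0.2 | 0.1 | 0.3 | 0.4 | 0.3 |
| Completeness | 0.2 | 0.3 | 0.1 | 0.2 | 0.1 |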
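
As a worked illustration, the weights can be combined into a single score with a weighted sum. The sketch assumes each factor has already been normalized to the range 0 to 1; how those per-factor values are computed is not specified here, and `relevance_score` is a hypothetical helper.

```python
# Weights copied from the table above, one dict per query type.
WEIGHTS = {
    "decision": {"keyword_match": 0.3, "freshness": 0.3, "authority": 0.2, "completeness": 0.2},
    "status":   {"keyword_match": 0.2, "freshness": 0.4, "authority": 0.1, "completeness": 0.3},
    "document": {"keyword_match": 0.4, "freshness": 0.2, "authority": 0.3, "completeness": 0.1},
    "factual":  {"keyword_match": 0.3, "freshness": 0.1, "authority": 0.4, "completeness": 0.2},
    "brand":    {"keyword_match": 0.4, "freshness": 0.2, "authority": 0.3, "completeness": 0.1},
}


def relevance_score(factors: dict[str, float], query_type: str) -> float:
    """Weighted sum of factor values (each assumed to lie in [0, 1])."""
    weights = WEIGHTS[query_type]
    # A missing factor counts as 0 rather than raising.
    return sum(weight * factors.get(name, 0.0) for name, weight in weights.items())


result = {"keyword_match": 1.0, "freshness": 0.5, "authority": 0.0, "completeness": 1.0}
print(round(relevance_score(result, "decision"), 2))  # 0.65
print(round(relevance_score(result, "status"), 2))    # 0.7
```

The same result scores differently per query type: here the fresh but low-authority result does slightly better for a status question than for a decision question.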

Authority Hierarchy

Which source counts as most authoritative depends on the query type:

For factual/policy questions:

code
Wiki/Official docs > Shared documents > Email announcements > Chat messages

For "what happened" / decision questions:

code
Meeting notes > Thread conclusions > Email confirmations > Chat messages

For status questions:

code
Task tracker > Recent chat > Status docs > Email updates

For brand/entity questions (FashionUnited specific):

code
CRM (official relationship) > Brand directory > Editorial coverage > Marketplace data > Chat mentions

For job market questions:

code
BigQuery jobs dataset > CRM employer records > API integrations > Email correspondence
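
These orderings could feed the authority factor of the relevance score. The sketch below assumes evenly spaced scores by rank, which is one possible choice among many; the source labels and `authority_score` are hypothetical names.

```python
# Source orderings from the hierarchies above, most authoritative first.
AUTHORITY_ORDER = {
    "factual":    ["wiki", "shared_doc", "email", "chat"],
    "decision":   ["meeting_notes", "thread_conclusion", "email", "chat"],
    "status":     ["task_tracker", "chat", "status_doc", "email"],
    "brand":      ["crm", "brand_directory", "editorial", "marketplace", "chat"],
    "job_market": ["bigquery_jobs", "crm", "api", "email"],
}


def authority_score(source: str, query_type: str) -> float:
    """Map a source's rank to a score in (0, 1]; unlisted sources score 0."""
    order = AUTHORITY_ORDER[query_type]
    if source not in order:
        return 0.0
    position = order.index(source)
    # Assumption: linear spacing, so the top source scores 1.0.
    return 1.0 - position / len(order)


print(authority_score("wiki", "factual"))  # 1.0
print(authority_score("chat", "factual"))  # 0.25
print(authority_score("chat", "status"))   # 0.75
```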

Handling Ambiguity

When a query is ambiguous, prefer asking one focused clarifying question over guessing:

code
Ambiguous: "search for the migration"
→ "I found references to a few migrations. Are you looking for:
   1. The database migration (Project Phoenix)
   2. The cloud migration (AWS → GCP)
   3. The email migration (Exchange → O365)"

Only ask for clarification when:

  • There are genuinely distinct interpretations that would produce very different results
  • The ambiguity would significantly affect which sources to search

Do NOT ask for clarification when:

  • The query is clear enough to produce useful results
  • Minor ambiguity can be resolved by returning results from multiple interpretations

Fallback Strategies

When a source is unavailable, rate limited, or returns no results:

  1. Source unavailable: Skip it, search remaining sources, note the gap
  2. No results from a source: Try broader query terms, remove date filters, try alternate keywords
  3. All sources return nothing: Suggest query modifications to the user
  4. Rate limited: Note the limitation, return results from other sources, suggest retrying later
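
The four cases could be turned into a small planning step. In this sketch the `Outcome` enum, the `FallbackPlan` container, and `plan_fallback` are hypothetical names, and "all sources return nothing" is interpreted as "no source came back with results".

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    NO_RESULTS = "no_results"
    UNAVAILABLE = "unavailable"
    RATE_LIMITED = "rate_limited"


@dataclass
class FallbackPlan:
    retry_broader: list[str] = field(default_factory=list)  # sources to re-query
    notes: list[str] = field(default_factory=list)          # gaps to report to the user
    suggest_query_changes: bool = False


def plan_fallback(outcomes: dict[str, Outcome]) -> FallbackPlan:
    plan = FallbackPlan()
    for source, outcome in outcomes.items():
        if outcome is Outcome.UNAVAILABLE:
            plan.notes.append(f"{source}: unavailable, skipped")
        elif outcome is Outcome.NO_RESULTS:
            plan.retry_broader.append(source)
        elif outcome is Outcome.RATE_LIMITED:
            plan.notes.append(f"{source}: rate limited, retry later")
    # Assumption: suggest rewording only when no source produced results.
    plan.suggest_query_changes = not any(o is Outcome.OK for o in outcomes.values())
    return plan


plan = plan_fallback({"chat": Outcome.OK, "wiki": Outcome.NO_RESULTS, "crm": Outcome.UNAVAILABLE})
print(plan.retry_broader)  # ['wiki']
print(plan.notes)          # ['crm: unavailable, skipped']
```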

Query Broadening

If initial queries return too few results:

code
Original: "PostgreSQL migration Q2 timeline decision"
Broader:  "PostgreSQL migration"
Broader:  "database migration"
Broadest: "migration"

Remove constraints in this order:

  1. Date filters (search all time)
  2. Source/location filters
  3. Less important keywords
  4. Everything else, keeping only the core entity/topic terms
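
The removal order can be sketched as a generator that yields progressively broader queries. The `Query` structure is an assumption: it presumes the decomposition step has already separated core terms from secondary keywords and ordered the secondary keywords from most to least important.

```python
from dataclasses import dataclass, replace
from typing import Iterator


@dataclass(frozen=True)
class Query:
    core_terms: tuple[str, ...]
    extra_terms: tuple[str, ...] = ()     # most important first (assumption)
    source_filters: tuple[str, ...] = ()
    date_filters: tuple[str, ...] = ()

    def render(self) -> str:
        return " ".join(self.core_terms + self.extra_terms
                        + self.source_filters + self.date_filters)


def broaden(query: Query) -> Iterator[Query]:
    """Yield broader versions of the query, loosening one constraint at a time."""
    if query.date_filters:
        query = replace(query, date_filters=())
        yield query
    if query.source_filters:
        query = replace(query, source_filters=())
        yield query
    # Drop secondary keywords from the least important end until only core terms remain.
    while query.extra_terms:
        query = replace(query, extra_terms=query.extra_terms[:-1])
        yield query


q = Query(("PostgreSQL", "migration"), ("timeline", "decision"),
          ("in:#engineering",), ("after:2025-01-01",))
for step in broaden(q):
    print(step.render())
# PostgreSQL migration timeline decision in:#engineering
# PostgreSQL migration timeline decision
# PostgreSQL migration timeline
# PostgreSQL migration
```

A caller would typically stop at the first step that returns enough results instead of running every variant.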

Parallel Execution

Always execute searches across sources in parallel, never sequentially. The total search time should be roughly equal to the slowest single source, not the sum of all sources.

code
[User query]
     ↓ decompose
[Google Chat query] [Gmail query] [Google Drive query] [GitHub query] [BigQuery query] [Vtiger query]
     ↓            ↓            ↓              ↓            ↓            ↓
  (parallel execution)
     ↓
[Merge + Rank + Deduplicate]
     ↓
[Synthesized answer]
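
A minimal sketch of the fan-out using `asyncio.gather`. The `search_source` coroutine is a stand-in that only sleeps; real connector calls would replace it, and the source names and delays are made up for the demonstration.

```python
import asyncio
import time


async def search_source(name: str, delay: float) -> tuple[str, list[str]]:
    """Stand-in for a real connector call; sleeps to simulate network latency."""
    await asyncio.sleep(delay)
    return name, [f"{name} result"]


async def search_all(sources: dict[str, float]) -> dict[str, list[str]]:
    """Query every source concurrently and collect whatever comes back."""
    tasks = [search_source(name, delay) for name, delay in sources.items()]
    # return_exceptions=True keeps one failing source from cancelling the rest.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    merged: dict[str, list[str]] = {}
    for item in results:
        if isinstance(item, BaseException):
            continue  # skip the failed source; a fuller version would note the gap
        name, hits = item
        merged[name] = hits
    return merged


start = time.perf_counter()
hits = asyncio.run(search_all({"chat": 0.2, "drive": 0.2, "github": 0.2}))
elapsed = time.perf_counter() - start
print(sorted(hits))  # ['chat', 'drive', 'github']
print(f"elapsed: {elapsed:.2f}s")  # close to the slowest source, not the 0.6s sum
```

Merging, ranking, and deduplication would run on `hits` once all sources have returned.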