Pinch of Salt

System Workflow: How raw feeds become verifiable history

flowchart TD
    A(["RSS Feeds: BBC · NYT · Al Jazeera"]) -->|feedparser| B(["Raw Articles"])
    B -->|TF-IDF vectorization| C(["Feature Vectors"])
    C -->|DBSCAN clustering| D{{"Cluster? similarity > 0.3"}}
    D -- "Multi-source" --> E(["LLM Extraction: Llama 3.2"])
    D -. "Single article" .-> F(["Standalone Article"])
    E --> G(["Verified Facts + Classification"])
    G -->|geography · category| H[("SQLite DB")]
    F -.-> H
    H -->|parent / child links| I(["DAG Builder"])
    I --> J(["Knowledge Graph: vis-network"])
    H --> K(["Dashboard: Filterable UI"])
    style A fill:#312e81,stroke:#818cf8,stroke-width:2px,color:#e0e7ff
    style B fill:#1e1b4b,stroke:#6366f1,stroke-width:2px,color:#c7d2fe
    style C fill:#1e1b4b,stroke:#6366f1,stroke-width:2px,color:#c7d2fe
    style D fill:#3b0764,stroke:#c084fc,stroke-width:2px,color:#f3e8ff
    style E fill:#312e81,stroke:#818cf8,stroke-width:2px,color:#e0e7ff
    style F fill:#1e293b,stroke:#475569,stroke-width:1px,color:#94a3b8
    style G fill:#064e3b,stroke:#34d399,stroke-width:2px,color:#d1fae5
    style H fill:#1e1b4b,stroke:#a78bfa,stroke-width:2px,color:#ddd6fe
    style I fill:#312e81,stroke:#818cf8,stroke-width:2px,color:#e0e7ff
    style J fill:#1e1b4b,stroke:#6366f1,stroke-width:2px,color:#c7d2fe
    style K fill:#1e1b4b,stroke:#6366f1,stroke-width:2px,color:#c7d2fe
    linkStyle default stroke:#818cf8,stroke-width:2px
    linkStyle 4 stroke:#475569,stroke-width:1px,stroke-dasharray:6
    linkStyle 7 stroke:#475569,stroke-width:1px,stroke-dasharray:6

1. RSS Ingestion

Fetches live feeds from global sources using feedparser. Collects titles, descriptions, and publication dates.
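A minimal sketch of this step, assuming the feeds are plain RSS URLs. The `FEED_URLS` list, the `RawArticle` type, and both helper names are hypothetical; only `feedparser.parse` and the entry fields `title`, `summary`, and `published` come from feedparser itself.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping, Optional

# Hypothetical placeholders; the real feed URLs are not listed in this page.
FEED_URLS = [
    "https://example.com/feed-a.xml",
    "https://example.com/feed-b.xml",
]


@dataclass
class RawArticle:
    source: str
    title: str
    description: str
    published: Optional[str]


def to_article(source: str, entry: Mapping) -> RawArticle:
    """Keep only the fields named above: title, description, publication date."""
    return RawArticle(
        source=source,
        title=(entry.get("title") or "").strip(),
        # feedparser exposes the RSS <description> element as "summary".
        description=(entry.get("summary") or "").strip(),
        published=entry.get("published"),
    )


def fetch_articles(urls: Iterable[str]) -> List[RawArticle]:
    """Download and parse each feed, returning one RawArticle per entry."""
    # Imported here so to_article() stays usable without the dependency.
    import feedparser  # third-party: pip install feedparser

    articles: List[RawArticle] = []
    for url in urls:
        parsed = feedparser.parse(url)
        articles.extend(to_article(url, entry) for entry in parsed.entries)
    return articles
```

Calling `fetch_articles(FEED_URLS)` would need network access and real feed URLs; `to_article` can be exercised on its own with any dict-like entry.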

2. Semantic Clustering

Vectorizes article text with TF-IDF and groups similar articles using DBSCAN, treating two articles as neighbors when their cosine similarity exceeds 0.3.
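One way this step could look with scikit-learn, offered as a sketch and not as the project's actual code. DBSCAN takes a distance radius, not a similarity, so the 0.3 similarity threshold is translated here into a cosine distance of 0.7. The `min_samples=2` setting, the English stop-word list, and the function name are assumptions.

```python
from typing import List

from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import TfidfVectorizer

SIMILARITY_THRESHOLD = 0.3  # value stated in the text above


def cluster_articles(texts: List[str], min_samples: int = 2) -> List[int]:
    """Return one cluster label per text; -1 marks an unclustered article."""
    vectors = TfidfVectorizer(stop_words="english").fit_transform(texts)
    # cosine distance = 1 - cosine similarity, so similarity > 0.3
    # corresponds to distance < 0.7. DBSCAN also counts distance == eps
    # as a neighbor, so the exact boundary case differs slightly.
    model = DBSCAN(
        eps=1.0 - SIMILARITY_THRESHOLD,
        metric="cosine",
        min_samples=min_samples,  # assumed: two articles are enough for a cluster
    )
    return model.fit_predict(vectors).tolist()


if __name__ == "__main__":
    headlines = [
        "Earthquake strikes Turkey killing dozens",
        "Dozens killed as earthquake strikes Turkey",
        "Stock markets rally after rate cut",
    ]
    print(cluster_articles(headlines))
```

With these three headlines, the first two should share a label and the third should come back as -1, which matches the "Single article" branch in the flowchart.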

3. LLM Extraction

Passes each multi-source cluster to Llama 3.2 to synthesize a title, extract verified facts, and classify the event by geography and category.
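The page does not say how the model is called or what it returns, so the following is only a sketch of the surrounding plumbing. The prompt wording, the JSON keys, and the `generate` callable (standing in for whatever client talks to Llama 3.2) are all hypothetical. Reading "verified" as "stated by more than one source" is also an assumption.

```python
import json
from typing import Callable, Dict, List

# Hypothetical prompt; the real wording is not given in the text.
PROMPT_TEMPLATE = """You are given several news articles about the same event.
Return only a JSON object with these keys:
  "title": one synthesized headline,
  "facts": a list of facts stated by more than one source,
  "geography": the main region or country,
  "category": one topic label.

Articles:
{articles}
"""

REQUIRED_KEYS = ("title", "facts", "geography", "category")


def build_prompt(articles: List[Dict[str, str]]) -> str:
    """Render one cluster of articles into a single prompt string."""
    blocks = [f"[{a['source']}] {a['title']}\n{a['description']}" for a in articles]
    return PROMPT_TEMPLATE.format(articles="\n\n".join(blocks))


def parse_response(raw: str) -> Dict:
    """Pull the JSON object out of the model output and check its shape."""
    # Models often wrap JSON in extra prose, so take the outermost {...} span.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in model output")
    data = json.loads(raw[start:end + 1])
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if not isinstance(data["facts"], list):
        raise ValueError("'facts' must be a list")
    return data


def extract_event(articles: List[Dict[str, str]],
                  generate: Callable[[str], str]) -> Dict:
    """Run one cluster through the model; `generate` maps prompt -> raw text."""
    return parse_response(generate(build_prompt(articles)))
```

Keeping the model call behind a plain callable means the parsing and validation can be tested with a stub, and the client can be swapped without touching the rest of the pipeline.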

4. DAG Generation

Links events chronologically (parent → child) and stores them in SQLite. The Knowledge Graph renders this as a directed, hierarchical timeline.
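As an illustration of parent/child linking in SQLite, the sketch below chains events within the same category by publication time. The table layout, column names, and the "previous event in the same category is the parent" rule are assumptions; the page only states that links are chronological and stored in SQLite.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical schema for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    category  TEXT NOT NULL,
    published TEXT NOT NULL  -- ISO 8601, so string order equals time order
);
CREATE TABLE IF NOT EXISTS links (
    parent_id INTEGER NOT NULL REFERENCES events(id),
    child_id  INTEGER NOT NULL REFERENCES events(id),
    PRIMARY KEY (parent_id, child_id)
);
"""


def build_links(conn: sqlite3.Connection) -> List[Tuple[int, int]]:
    """Link each event to the next later event in the same category."""
    rows = conn.execute(
        "SELECT id, category FROM events ORDER BY category, published, id"
    ).fetchall()
    links = []
    for (prev_id, prev_cat), (cur_id, cur_cat) in zip(rows, rows[1:]):
        if prev_cat == cur_cat:
            # Edges always run from earlier to later, so no cycle can form.
            links.append((prev_id, cur_id))
    conn.executemany("INSERT OR IGNORE INTO links VALUES (?, ?)", links)
    conn.commit()
    return links


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?, ?)",
        [
            (1, "Talks resume", "politics", "2024-01-02T09:00:00"),
            (2, "Talks stall", "politics", "2024-01-01T09:00:00"),
            (3, "Cup final", "sport", "2024-01-03T09:00:00"),
        ],
    )
    print(build_links(conn))
```

A graph library such as vis-network could then read the `links` rows as directed edges; because every edge points forward in time, a hierarchical layout has a well-defined top-to-bottom order.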
