Building an Automatic Article Summarization Tool (RSS → Python → OpenAI → Notion) Step by Step | GateOfAI

📘 Building an Automatic Article Summarization Tool (RSS → Python → OpenAI → Notion)

Category: AI Lessons — Level: Beginner/Intermediate — Duration: 45–60 minutes

💡 Idea

We will create a tool that monitors any RSS feed, collects new links, extracts the article text, summarizes it with the OpenAI API, and automatically saves the summary to a Notion database. You can run it manually or schedule it via Zapier.

🧰 Requirements

  • An active OpenAI account with an API key.
  • A Notion workspace and a database to save the summaries.
  • Python 3.10+ installed on your machine.
  • An RSS link from a news source/blog.
  • (Optional) A Zapier account to trigger the code when a new RSS item arrives.

🔧 Step 1: Understanding RSS and Choosing a Source

RSS is a standard format for syndicating updated content (title/summary/link). Many websites expose an RSS link directly or behind the orange RSS icon. We will use this link as the tool's input.

🗂️ Step 2: Setting Up a Database in Notion

  1. Create a Database in Notion titled something like: AI Summaries.
  2. Add the following properties:
    • Title (Page Title) — Type: Title.
    • URL — Type: URL.
    • Source — Type: Text.
    • Summary — Type: Text (the script below also writes the full summary into the page body).
    • Published — Date/Time (optional).
  3. Create an Integration from Notion Developers, grant it access to the database, and keep your NOTION_TOKEN and DATABASE_ID. Refer to the API creation documentation for more precise adjustments if needed.
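Before running the full script, it is worth confirming that your integration can actually see the database. A minimal sketch, using the real Notion `/databases/{id}/query` endpoint (the helper names here are ours, not from the Notion SDK):

```python
import requests

def notion_headers(token: str) -> dict:
    # Every Notion API request needs the token, an API version, and a JSON content type
    return {
        "Authorization": f"Bearer {token}",
        "Notion-Version": "2022-06-28",
        "Content-Type": "application/json",
    }

def check_database_access(token: str, database_id: str) -> bool:
    # Query a single page from the database: a 200 response means the
    # integration was granted access; 404 usually means it was not shared.
    resp = requests.post(
        f"https://api.notion.com/v1/databases/{database_id}/query",
        headers=notion_headers(token),
        json={"page_size": 1},
        timeout=30,
    )
    return resp.status_code == 200

# Usage (with your real values):
# check_database_access(NOTION_TOKEN, DATABASE_ID)
```

If this returns False, open the database in Notion, click the ••• menu, and add your integration under Connections before continuing.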

🔐 Step 3: Saving Keys as Environment Variables

# macOS/Linux
export OPENAI_API_KEY="insert_key_here"
export NOTION_TOKEN="insert_notion_token"
export NOTION_DATABASE_ID="insert_database_id"

📦 Step 4: Installing Packages

pip install feedparser trafilatura requests python-dotenv

feedparser reads the RSS feed, trafilatura extracts the article text from HTML, requests sends the calls to the Notion API, and python-dotenv (optional) loads your keys from a local .env file.

🤖 Step 5: Complete Python Code

This code executes: Read RSS → Download Article Text → Summarize via OpenAI → Create Page in Notion.

import os, time, feedparser, requests
from datetime import datetime
import trafilatura

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
NOTION_TOKEN = os.environ["NOTION_TOKEN"]
NOTION_DB = os.environ["NOTION_DATABASE_ID"]

RSS_URL = "https://example.com/feed"  # Replace with your RSS feed

# 1) Read RSS
feed = feedparser.parse(RSS_URL)
items = feed.entries[:5]  # Process the 5 most recent items

def fetch_article_text(url: str) -> str:
    downloaded = trafilatura.fetch_url(url)  # download timeouts are set via trafilatura's config, not an argument
    return trafilatura.extract(downloaded) or ""

def summarize(text: str, title: str) -> str:
    # Using OpenAI's Chat API
    import json, urllib.request
    endpoint = "https://api.openai.com/v1/chat/completions"
    payload = {
      "model": "gpt-4.1-mini",
      "messages": [
        {"role": "system", "content": "You are an assistant that summarizes Arabic texts accurately and concisely."},
        {"role": "user", "content": f"Summarize the following article in 6 clear points highlighting the main idea, while preserving proper names and titles:\nTitle: {title}\nText:\n{text[:12000]}"}  # Size limit
      ],
      "temperature": 0.2
    }
    req = urllib.request.Request(endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {OPENAI_API_KEY}",
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=60) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    return data["choices"][0]["message"]["content"].strip()

def create_notion_page(title: str, url: str, source: str, summary: str, published_iso: str | None = None):
    notion_endpoint = "https://api.notion.com/v1/pages"
    headers = {
        "Authorization": f"Bearer {NOTION_TOKEN}",
        "Notion-Version": "2022-06-28",
        "Content-Type": "application/json"
    }
    body = {
      "parent": {"database_id": NOTION_DB},
      "properties": {
        "Title": {"title": [{"text": {"content": title}}]},
        "URL": {"url": url},
        "Source": {"rich_text": [{"text": {"content": source}}]},
      },
      "children": [
        {"object":"block","type":"heading_2","heading_2":{"rich_text":[{"type":"text","text":{"content":"Summary"}}]}},
        {"object":"block","type":"paragraph","paragraph":{"rich_text":[{"type":"text","text":{"content": summary}}]}}
      ]
    }
    if published_iso:
        body["properties"]["Published"] = {"date": {"start": published_iso}}
    resp = requests.post(notion_endpoint, headers=headers, json=body, timeout=60)
    resp.raise_for_status()
    return resp.json()

for e in items:
    title = e.get("title", "No Title")
    link = e.get("link", "")
    published = None
    if e.get("published_parsed"):
        published = datetime(*e.published_parsed[:6]).isoformat()

    article_text = fetch_article_text(link)
    if not article_text:
        print("Could not extract text:", link)
        continue

    summary = summarize(article_text, title)
    create_notion_page(title, link, feed.feed.get("title","RSS Source"), summary, published)
    print("Added to Notion:", title)
    time.sleep(2)  # Small delay to respect rate limits

🧪 Step 6: Testing

  1. Replace RSS_URL with a real feed link (e.g., a tech blog feed).
  2. Save the code from Step 5 as app.py and run it: python app.py.
  3. Open the Notion database and check for new pages appearing in it.

⏱️ Optional Step: Automatic Running via Zapier

  1. Create a Zap with the RSS by Zapier trigger to detect new items.
  2. Add a Webhooks by Zapier (POST) step to call your hosted script (or Cloud Function) and pass the article data as JSON.
  3. Alternatively, you can directly call Notion from Zap using the Notion: Create Page step and pass the ready summary if you do the summarization within Zap via the OpenAI API.
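On the receiving side of the Webhooks by Zapier step, your hosted script gets the article data as a JSON body. A small sketch of the parsing logic; the field names `title` and `link` are assumptions that depend on how you map the RSS item in your Zap:

```python
import json

# Hypothetical payload shape: the keys depend on the field mapping you
# configure in the Webhooks by Zapier (POST) step.
def parse_zap_payload(raw: bytes) -> tuple[str, str]:
    data = json.loads(raw)
    return data.get("title", "No Title"), data.get("link", "")

sample = b'{"title": "New Post", "link": "https://example.com/post"}'
print(parse_zap_payload(sample))  # ('New Post', 'https://example.com/post')
```

The returned title and link can then be fed straight into `fetch_article_text`, `summarize`, and `create_notion_page` from Step 5.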

🔒 Security and Compliance Notes

  • Store API keys as environment variables or secrets on the server—do not include them in the repository.
  • Review the rate limits for the OpenAI API before running large batches.
  • If using Zapier, make sure to rotate secret tokens regularly and keep an eye on security updates.
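For the rate-limit point above, a generic retry helper with exponential backoff is a common pattern; this is a sketch of our own, not part of any library, and in real use you would catch the specific HTTP 429 error type rather than all exceptions:

```python
import random
import time

def with_backoff(fn, retries=5, base=1.0):
    # Retry fn with exponential backoff plus random jitter; re-raise
    # after the final attempt so genuine failures still surface.
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(base * (2 ** attempt) + random.uniform(0, 0.5))

# Usage with the Step 5 functions:
# summary = with_backoff(lambda: summarize(article_text, title))
```

Wrapping both the OpenAI and Notion calls this way lets the script ride out transient rate limiting instead of crashing mid-batch.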

🚀 Development Ideas

  • Automatically add topic tagging via a classification model.
  • Send the summary as a weekly email.
  • Aggregate multiple RSS sources while removing duplicates.

💬 Do you have a question about this lesson? Leave it in the comments and we will assist you step by step.


