📘 Building an Automatic Article Summarization Tool (RSS → Python → OpenAI → Notion)
💡 Idea
We will create a tool that monitors any RSS feed, collects new links, extracts the article text, summarizes it using the OpenAI API, and then automatically saves the summary to a Notion database. You can run it manually or trigger it via Zapier.
🧰 Requirements
- An active OpenAI account with an API key.
- A Notion workspace and a database to save the summaries.
- Python 3.10+ installed on your machine.
- An RSS link from a news source/blog.
- (Optional) A Zapier account to trigger the code when a new RSS item arrives.
🔧 Step 1: Understanding RSS and Choosing a Source
RSS is a standard format for syndicating updated content (title, summary, link). Many websites publish their RSS link directly or behind the orange RSS icon. We will use this link as the tool's input.
🗂️ Step 2: Setting Up a Database in Notion
- Create a Database in Notion titled something like: AI Summaries.
- Add the following properties:
- Title (Page Title) — Type: Title.
- URL — Type: URL.
- Source — Short text.
- Summary — Long text.
- Published — Date/Time (optional).
- Create an Integration from Notion Developers, grant it access to the database, and keep your `NOTION_TOKEN` and `DATABASE_ID`. Refer to the Notion API documentation for more precise setup details if needed.
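The property names you chose above must match the request body sent to Notion exactly. As a sketch, this pure helper builds the JSON body the "Create Page" endpoint expects for that schema (the values passed in are placeholders):

```python
def build_notion_body(database_id: str, title: str, url: str, source: str) -> dict:
    """Build the Create Page request body for the database schema from Step 2.

    Property names ("Title", "URL", "Source") must match the database exactly.
    """
    return {
        "parent": {"database_id": database_id},
        "properties": {
            "Title": {"title": [{"text": {"content": title}}]},
            "URL": {"url": url},
            "Source": {"rich_text": [{"text": {"content": source}}]},
        },
    }

body = build_notion_body("db123", "My Article", "https://example.com/a", "Example Blog")
print(body["properties"]["Title"]["title"][0]["text"]["content"])  # → My Article
```

If a property name here differs from your database (even by case), Notion rejects the request, so this is a useful shape to check before wiring up the API call.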
🔐 Step 3: Saving Keys as Environment Variables
# macOS/Linux
export OPENAI_API_KEY="insert_key_here"
export NOTION_TOKEN="insert_notion_token"
export NOTION_DATABASE_ID="insert_database_id"
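In the script, it helps to fail loudly when a key is missing rather than crash later mid-run. A minimal sketch (the variable name used in the demo is made up so the snippet runs anywhere):

```python
import os

def require_env(name: str) -> str:
    """Read a required environment variable, failing loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value

# Demo with a made-up variable so the sketch runs without real keys:
os.environ["DEMO_SUMMARIZER_KEY"] = "demo-value"
print(require_env("DEMO_SUMMARIZER_KEY"))  # → demo-value
```

In the real script you would call `require_env("OPENAI_API_KEY")` and so on at startup.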
📦 Step 4: Installing Packages
```bash
pip install feedparser trafilatura requests python-dotenv
```

`feedparser` reads RSS, `trafilatura` extracts article text from HTML, `requests` sends requests to the Notion API, and `python-dotenv` (optional) loads keys from a `.env` file.
🤖 Step 5: Complete Python Code
This code executes: Read RSS → Download Article Text → Summarize via OpenAI → Create Page in Notion.
```python
import json
import os
import time
import urllib.request
from datetime import datetime

import feedparser
import requests
import trafilatura

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
NOTION_TOKEN = os.environ["NOTION_TOKEN"]
NOTION_DB = os.environ["NOTION_DATABASE_ID"]

RSS_URL = "https://example.com/feed"  # Replace with your RSS feed

# 1) Read RSS
feed = feedparser.parse(RSS_URL)
items = feed.entries[:5]  # Try the last 5 items


def fetch_article_text(url: str) -> str:
    # Timeouts are configured via trafilatura's settings, not a keyword argument
    downloaded = trafilatura.fetch_url(url)
    return trafilatura.extract(downloaded) or ""


def summarize(text: str, title: str) -> str:
    # Using OpenAI's Chat Completions API
    endpoint = "https://api.openai.com/v1/chat/completions"
    payload = {
        "model": "gpt-4.1-mini",
        "messages": [
            {"role": "system", "content": "You are an assistant that summarizes Arabic texts accurately and concisely."},
            {"role": "user", "content": f"Summarize the following article in 6 clear points highlighting the main idea, while preserving proper names and titles:\nTitle: {title}\nText:\n{text[:12000]}"},  # Size limit
        ],
        "temperature": 0.2,
    }
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {OPENAI_API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    return data["choices"][0]["message"]["content"].strip()


def create_notion_page(title: str, url: str, source: str, summary: str, published_iso: str | None = None):
    notion_endpoint = "https://api.notion.com/v1/pages"
    headers = {
        "Authorization": f"Bearer {NOTION_TOKEN}",
        "Notion-Version": "2022-06-28",
        "Content-Type": "application/json",
    }
    body = {
        "parent": {"database_id": NOTION_DB},
        "properties": {
            "Title": {"title": [{"text": {"content": title}}]},
            "URL": {"url": url},
            "Source": {"rich_text": [{"text": {"content": source}}]},
        },
        "children": [
            {"object": "block", "type": "heading_2", "heading_2": {"rich_text": [{"type": "text", "text": {"content": "Summary"}}]}},
            {"object": "block", "type": "paragraph", "paragraph": {"rich_text": [{"type": "text", "text": {"content": summary}}]}},
        ],
    }
    if published_iso:
        body["properties"]["Published"] = {"date": {"start": published_iso}}
    resp = requests.post(notion_endpoint, headers=headers, json=body, timeout=60)
    resp.raise_for_status()
    return resp.json()


for e in items:
    title = e.get("title", "No Title")
    link = e.get("link", "")
    published = None
    if e.get("published_parsed"):
        published = datetime(*e.published_parsed[:6]).isoformat()

    article_text = fetch_article_text(link)
    if not article_text:
        print("Could not extract text:", link)
        continue

    summary = summarize(article_text, title)
    create_notion_page(title, link, feed.feed.get("title", "RSS Source"), summary, published)
    print("Added to Notion:", title)
    time.sleep(2)  # Small delay to respect rate limits
```
🧪 Step 6: Testing
- Replace `RSS_URL` with a real feed link (e.g., a tech blog feed).
- Run the script: `python app.py`.
- Open the Notion database and check for new pages appearing in it.
⏱️ Optional Step: Automatic Running via Zapier
- Create a Zap with the RSS by Zapier trigger to detect new items.
- Add a Webhooks by Zapier (POST) step to call your hosted script (or Cloud Function) and pass the article data as JSON.
- Alternatively, you can directly call Notion from Zap using the Notion: Create Page step and pass the ready summary if you do the summarization within Zap via the OpenAI API.
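If you go the webhook route, your hosted script receives the article data as JSON. As a sketch, here is a pure function that parses such a payload; the field names (`title`, `link`, `published`) are assumptions, so configure the Zap's webhook step to send whatever names your script expects:

```python
import json

def parse_zap_payload(raw: str) -> dict:
    """Parse the JSON body a 'Webhooks by Zapier' POST step might send.

    Field names here are assumptions — match them to your Zap's configuration.
    """
    data = json.loads(raw)
    return {
        "title": data.get("title", "No Title"),
        "link": data.get("link", ""),
        "published": data.get("published"),  # may be absent → None
    }

# Example payload as Zapier might deliver it:
sample = '{"title": "Hello RSS", "link": "https://example.com/hello"}'
item = parse_zap_payload(sample)
print(item["title"], item["link"])  # → Hello RSS https://example.com/hello
```

From here the handler would call the same `fetch_article_text` → `summarize` → `create_notion_page` pipeline as in Step 5.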
🔒 Security and Compliance Notes
- Store API keys as environment variables or secrets on the server—do not include them in the repository.
- Review the rate limits for the OpenAI API before running large batches.
- If using Zapier, make sure to rotate secret tokens regularly and keep an eye on security updates.
🚀 Development Ideas
- Automatically add topic tagging via a classification model.
- Send the summary as a weekly email.
- Aggregate multiple RSS sources while removing duplicates.
📚 Resources
- OpenAI API – Chat API Reference
- OpenAI – Examples for Crafting Requests
- Notion API – Create Page
- Notion API – Working with Page Content
- Trafilatura – Extracting Text from Web Pages
- Trafilatura – Usage Guide with Python
- Zapier – Getting Started with Webhooks
- Zapier – Webhooks Integrations
- Microsoft Docs – RSS Connector Definition
💬 Do you have a question about this lesson? Leave it in the comments and we will assist you step by step.
Meta Description: Learn how to build an automatic article summarization tool that integrates RSS, Python, OpenAI, and Notion in a step-by-step guide suitable for beginners and intermediates.
Keywords: RSS, Python, OpenAI, Notion, automatic summarization, API integration, tech tutorial.