python-substack

Markdown-to-post styling is not supported.

Open a1111198 opened this issue 9 months ago • 0 comments

This solution adds Markdown support to your Substack newsletter workflow. It includes a parser for inline Markdown formatting (supporting bold and italic text) and a function that processes a Markdown file to create a structured post draft.

If you want to check that it works, this gist shows it in action: [Check if it works](https://gist.github.com/Duartemartins/98afb33ca5eae545e909df41be45f39e).


Overview

  • Inline Markdown Parsing:
    The parse_inline function scans a text string for inline Markdown patterns (like bold and italic) and converts them into tokens with corresponding formatting marks.

  • Post Draft Creation:
    The publish_newsletter_to_substack function reads a Markdown file, splits its content into blocks (headings, images, paragraphs, or bullet lists), and converts these into a structured Post object. It then creates and pre-publishes a draft on Substack.

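For the inline parsing step, it helps to see the token shape the parser is expected to produce. The sketch below is an alternative written with `re.finditer` and named groups, not the `parse_inline` function from this issue. `tokenize_inline` is a hypothetical name, and the token layout (`content` plus optional `marks`) follows the description above.

```python
import re

# Bold is tried first so that "**x**" is not read as two italic markers.
INLINE = re.compile(r'\*\*(?P<bold>.+?)\*\*|\*(?P<italic>[^*]+?)\*')

def tokenize_inline(text):
    """Split text into plain, strong, and em tokens."""
    tokens, pos = [], 0
    for m in INLINE.finditer(text):
        if m.start() > pos:
            # Plain text between the previous match and this one.
            tokens.append({"content": text[pos:m.start()]})
        if m.group("bold") is not None:
            tokens.append({"content": m.group("bold"),
                           "marks": [{"type": "strong"}]})
        else:
            tokens.append({"content": m.group("italic"),
                           "marks": [{"type": "em"}]})
        pos = m.end()
    if pos < len(text):
        # Trailing plain text after the last match.
        tokens.append({"content": text[pos:]})
    return tokens

tokens = tokenize_inline("Hello **world** and *you*!")
```

Here `tokens` holds five entries: plain `"Hello "`, strong `"world"`, plain `" and "`, em `"you"`, and plain `"!"`. Unmatched markers such as a lone `**` stay as plain text instead of turning into empty tokens.
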

Code Implementation

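import re

from substack.post import Post  # python-substack's Post builder

# Note: retry_on_502 is a helper that is not shown in this snippet; it must be
# defined or imported before publish_newsletter_to_substack is called.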

def parse_inline(text):
    """
    Convert inline Markdown in a text string into a list of tokens
    for use in the post content.

    Supported formatting:
      - **Bold**: Text wrapped in double asterisks.
      - *Italic*: Text wrapped in single asterisks.
    """
    tokens = []
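    # Pattern matches either **bold** or *italic* text. Bold is listed first so
    # it wins over italic, and both require at least one character of content.
    pattern = r'(\*\*.+?\*\*|\*[^*]+?\*)'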
    parts = re.split(pattern, text)
    
    for part in parts:
        if not part:
            continue
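        # The length checks keep a stray "*" or "**" from becoming an empty token.
        if len(part) > 4 and part.startswith("**") and part.endswith("**"):
            content = part[2:-2]
            tokens.append({"content": content, "marks": [{"type": "strong"}]})
        elif (len(part) > 2 and not part.startswith("**")
              and part.startswith("*") and part.endswith("*")):
            content = part[1:-1]
            tokens.append({"content": content, "marks": [{"type": "em"}]})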
        else:
            tokens.append({"content": part})
    
    return tokens

def publish_newsletter_to_substack(api):
    """
    Reads a Markdown file, converts its content into a structured Post,
    and creates a draft post on Substack.

    Workflow:
      1. Retrieve the user profile to extract the user ID.
      2. Read and parse the Markdown file.
      3. Convert Markdown blocks into post elements:
         - Headings (lines starting with '#' characters).
         - Images (using Markdown image syntax: ![Alt](URL)).
         - Paragraphs and bullet lists with inline Markdown formatting.
      4. Create and update the draft post.
    """
    # Retrieve user profile to extract user ID.
    profile = retry_on_502(lambda: api.get_user_profile())
    user_id = profile.get("id")
    if not user_id:
        raise ValueError("Could not get user ID from profile")
    
    # Create a Post instance.
    post = Post(
        title="FIX TITLE",  # Replace with the desired title.
        subtitle="",        # Optionally customize the subtitle.
        user_id=user_id
    )
    
    # Read the Markdown file.
    with open("blog_0_2025-03-28T19-46-34-098Z.md", 'r', encoding='utf-8') as f:
        md_content = f.read()
    
    # Split content into blocks separated by double newlines.
    blocks = md_content.split("\n\n")
    
    for block in blocks:
        block = block.strip()
        if not block:
            continue
        
        # Process headings (lines starting with '#' characters).
        if block.startswith("#"):
            level = len(block) - len(block.lstrip('#'))
            heading_text = block.lstrip('#').strip()
            post.heading(content=heading_text, level=level)
        
        # Process images using Markdown image syntax: ![Alt](URL)
        elif block.startswith("!"):
            m = re.match(r'!\[.*?\]\((.*?)\)', block)
            if m:
                image_url = m.group(1)
                # Adjust image URL if it starts with a slash.
                image_url = image_url[1:] if image_url.startswith('/') else image_url
                image = api.get_image(image_url)
                post.add({"type": "captionedImage", "src": image.get("url")})
        
        # Process paragraphs or bullet lists.
        else:
            if "\n" in block:
                # Process each line separately.
                for line in block.split("\n"):
                    line = line.strip()
                    if not line:
                        continue
                    # Remove a leading bullet marker ("* " or "- "), but leave
                    # inline markers such as **bold** at the start of a line intact.
                    if line.startswith(("* ", "- ")):
                        line = line[2:].strip()
                    tokens = parse_inline(line)
                    post.add({"type": "paragraph", "content": tokens})
            else:
                tokens = parse_inline(block)
                post.add({"type": "paragraph", "content": tokens})
    
    # Create and update the draft post.
    draft = api.post_draft(post.get_draft())
    api.put_draft(draft.get("id"), draft_section_id=post.draft_section_id)
    api.prepublish_draft(draft.get("id"))
    
    print("Newsletter draft created successfully!")
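
The code above calls `retry_on_502` but does not define it. A minimal sketch of what such a helper might look like follows. It is a guess at the intent, not the author's implementation: the `retries`, `delay`, and `sleep` parameters are hypothetical, and it assumes the client raises an exception that either carries a `status_code` attribute or mentions 502 in its message.

```python
import time

def retry_on_502(func, retries=3, delay=1.0, sleep=time.sleep):
    """Call func(); retry when the raised exception looks like an HTTP 502.

    retries is the total number of attempts and should be at least 1.
    """
    for attempt in range(retries):
        try:
            return func()
        except Exception as exc:
            # Assumption: the HTTP status is exposed as `status_code` on the
            # exception. The message check is a loose fallback heuristic.
            is_502 = (getattr(exc, "status_code", None) == 502
                      or "502" in str(exc))
            if not is_502 or attempt == retries - 1:
                raise  # not retryable, or out of attempts
            sleep(delay * (attempt + 1))  # linear backoff between attempts
```

It is called the same way as in the function above, for example `retry_on_502(lambda: api.get_user_profile())`. Passing `sleep` as a parameter is only there so the backoff can be skipped in tests.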

a1111198 · Apr 01 '25 12:04