Formatting Google Docs Document (Headings (h1, h2, h3), Bolding, etc.)

workflowsy · May 16, 2024, 8:04pm

Hi there,

Is it possible to format the text that the n8n Google Docs node inserts into a Google Doc? For example, in Make I’m able to provide the document as HTML, and it converts something like an <h1> tag to a Heading 1 in Google Docs. That does not appear to be possible in n8n, so I wanted to understand whether there is any other way to apply headings, bold, italics, underline, etc. within n8n so that the generated document is properly formatted automatically.

Let me know if anyone has solved this without falling back to the native Google Docs API in place of the n8n Google Docs node.

liam · May 16, 2024, 11:16pm

That doesn’t seem to be possible in the current state of the node.

I would suggest adding a feature request. That is definitely a big thing that is missing, and it is not exactly straightforward to implement.

workflowsy · May 17, 2024, 9:03pm

Ahh @liam, you reached the same conclusion I did. There have been some feature requests for similar things for a few years, but nothing has happened yet. Maybe I’ll add another feature request and see if it can pick up some momentum.

In the interim, I did create my own solution, albeit a somewhat hacky one: an AWS Lambda function, called from n8n, that uses the Google Docs API. I’ve included the code below for others who may be hitting the same roadblock.

You basically have the n8n Google Docs node write tags (e.g. <h1>) at the beginning of any line that you want converted to a native Google Docs formatting element. The function then finds all of those tags, applies the supported formatting, and removes the tags from the document, all in a single batch edit. It’s pretty quick and effective, but again, I wish it were included out of the box with the n8n node.
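
To make the tag convention concrete, here is a minimal sketch of how stacked tags could be peeled off the start of a line, independent of the Google Docs API. The helper name `split_leading_tags` and the reduced tag list are assumptions for illustration only; the Lambda code further down uses its own, larger table.

```python
# Illustrative subset of tags; both names here are hypothetical.
KNOWN_TAGS = ("<h1>", "<h2>", "<h3>", "<b>", "<i>", "<u>")

def split_leading_tags(line):
    """Return (tags, remaining_text) for tags stacked at the start of a line."""
    tags = []
    pos = 0
    while True:
        for tag in KNOWN_TAGS:
            if line.startswith(tag, pos):
                tags.append(tag)
                pos += len(tag)
                break
        else:
            # No known tag at the current position: stop scanning.
            return tags, line[pos:]

print(split_leading_tags("<h1><b>Quarterly Report"))
print(split_leading_tags("Plain paragraph with <b> in the middle"))
```

Only tags at the very start of the line are recognized, so a tag in the middle of a sentence is left untouched.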

import boto3
import json
import re
from google.oauth2 import service_account
from googleapiclient.discovery import build

# AWS Secrets Manager secret name
SECRET_NAME = "prod/xxxxxxxxxxxxxxx"
REGION_NAME = "us-east-1"

# Tags to look for and corresponding styles
TAG_STYLES = {
    "<h1>": "HEADING_1",
    "<h2>": "HEADING_2",
    "<h3>": "HEADING_3",
    "<h4>": "HEADING_4",
    "<b>": "bold",
    "<i>": "italic",
    "<u>": "underline",
    "<tab>": "indent",
    "<center>": "center",
    "<left>": "left",
    "<right>": "right"
}

# Function to get the secret from AWS Secrets Manager
def get_secret():
    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=REGION_NAME
    )

    # Fetch the secret; any boto3 error propagates to the caller unchanged.
    get_secret_value_response = client.get_secret_value(
        SecretId=SECRET_NAME
    )

    # Secrets Manager returns the already-decrypted value in 'SecretString'.
    secret = get_secret_value_response['SecretString']
    return json.loads(secret)

# Function to extract the document ID from the Google Docs URL
def extract_document_id(url):
    match = re.search(r'/d/([a-zA-Z0-9_-]+)', url)
    if match:
        return match.group(1)
    else:
        raise ValueError("Invalid Google Docs URL")

# Lambda handler function
def lambda_handler(event, context):
    # Get the document URL from the event
    document_url = event['document_url']
    
    # Extract the document ID from the URL
    document_id = extract_document_id(document_url)

    # Get Google service account credentials from AWS Secrets Manager
    secret_content = get_secret()
    credentials = service_account.Credentials.from_service_account_info(secret_content)

    # Build the service
    service = build('docs', 'v1', credentials=credentials)

    # Retrieve the current content of the document
    document = service.documents().get(documentId=document_id).execute()
    content = document.get('body').get('content')

    # Create a list to hold all requests
    requests = []

    # Function to generate style update requests
    def generate_requests(start_index, end_index, paragraph_styles, text_styles):
        request = []
        if paragraph_styles and start_index < end_index:
            request.append({
                'updateParagraphStyle': {
                    'range': {
                        'startIndex': start_index,
                        'endIndex': end_index,
                    },
                    'paragraphStyle': paragraph_styles,
                    'fields': ','.join(paragraph_styles.keys()),
                }
            })
        if text_styles and start_index < end_index:
            request.append({
                'updateTextStyle': {
                    'range': {
                        'startIndex': start_index,
                        'endIndex': end_index,
                    },
                    'textStyle': text_styles,
                    'fields': ','.join(text_styles.keys()),
                }
            })
        return request

    # Track the cumulative offset caused by removing tags
    cumulative_offset = 0

    # Iterate through the document content to find and process tags
    for element in content:
        if 'paragraph' in element:
            for run in element.get('paragraph').get('elements'):
                text_run = run.get('textRun')
                if text_run:
                    content = text_run.get('content')
                    start_index = run.get('startIndex') - cumulative_offset
                    paragraph_styles = {}
                    text_styles = {}
                    tag_length = 0

                    # Initialize an empty set for the applied styles
                    applied_paragraph_styles = set()
                    applied_text_styles = set()

                    # Iterate over each tag and apply the corresponding styles
                    while True:
                        tag_found = False
                        for tag, style in TAG_STYLES.items():
                            if content.startswith(tag, tag_length):
                                tag_length += len(tag)
                                tag_found = True
                                if "HEADING" in style:
                                    paragraph_styles['namedStyleType'] = style
                                    applied_paragraph_styles.add(tag)
                                elif style == "indent":
                                    paragraph_styles['indentFirstLine'] = {
                                        'magnitude': 18,  # This is 18 points; adjust as needed
                                        'unit': 'PT'
                                    }
                                    applied_paragraph_styles.add(tag)
                                elif style in ["center", "left", "right"]:
                                    alignment_map = {
                                        "center": "CENTER",
                                        "left": "START",
                                        "right": "END"
                                    }
                                    paragraph_styles['alignment'] = alignment_map[style]
                                    applied_paragraph_styles.add(tag)
                                else:
                                    if style == "bold":
                                        text_styles['bold'] = True
                                    elif style == "italic":
                                        text_styles['italic'] = True
                                    elif style == "underline":
                                        text_styles['underline'] = True
                                    applied_text_styles.add(tag)
                        if not tag_found:
                            break

                    if applied_paragraph_styles or applied_text_styles:
                        actual_text = content[tag_length:]
                        tag_end_index = start_index + tag_length
                        end_index = tag_end_index + len(actual_text)

                        # Clamp end_index so the style never extends beyond the current text run
                        run_end_index = run.get('endIndex') - cumulative_offset
                        end_index = min(end_index, run_end_index)

                        # Generate and add the style update request
                        requests.extend(generate_requests(tag_end_index, end_index, paragraph_styles, text_styles))
                        
                        # Generate and add the tag removal request
                        if tag_length > 0:
                            requests.append({
                                'deleteContentRange': {
                                    'range': {
                                        'startIndex': start_index,
                                        'endIndex': tag_end_index,
                                    }
                                }
                            })
                            # Update the cumulative offset
                            cumulative_offset += tag_length

    # Execute the batch update
    if requests:
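        service.documents().batchUpdate(
            documentId=document_id, body={'requests': requests}).execute()
        # Each tagged segment yields one or two style requests plus one delete,
        # so report the request count rather than guessing the segment count.
        print(f"Sent {len(requests)} update requests.")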
    else:
        print("No tagged text found in the document.")

    return {
        'statusCode': 200,
        'body': json.dumps('Document updated successfully')
    }
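
The `cumulative_offset` bookkeeping in the handler above is easy to get wrong, so here is a small dry run of the idea on a plain Python string. It assumes the requests in a batch are applied in order, so each deletion sees indexes already shifted by the earlier ones. The function names are hypothetical, and the string is a simplification: real Docs indexes are taken from the API response rather than counted from zero.

```python
def apply_deletes(text, deletes):
    """Apply (start, end) deletions one after another, in the given order."""
    for start, end in deletes:
        text = text[:start] + text[end:]
    return text

def plan_tag_deletes(tag_spans):
    """Shift each original (start, end) span left by the characters already removed."""
    offset = 0
    planned = []
    for start, end in tag_spans:
        planned.append((start - offset, end - offset))
        offset += end - start
    return planned

doc = "<h1>Title\n<b>Body\n"
# Tag positions measured in the original, unmodified text.
tag_spans = [(0, 4), (10, 13)]

print(repr(apply_deletes(doc, plan_tag_deletes(tag_spans))))  # 'Title\nBody\n'
print(repr(apply_deletes(doc, tag_spans)))                    # 'Title\n<b>B\n'
```

The second call uses the original positions without any adjustment and deletes the wrong characters, which is the failure the offset is there to prevent.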

Lambda Example Payload

{
    "document_url": "https://docs.google.com/document/d/your-document-id"
}
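
For testing the function outside n8n, a sketch like the following could send that payload with boto3. The function name `format-google-doc` and the helper `format_document` are placeholders, not names from the code above; substitute whatever you called your Lambda.

```python
import json

def format_document(lambda_client, document_url, function_name="format-google-doc"):
    """Invoke the formatting Lambda synchronously and return its decoded response."""
    response = lambda_client.invoke(
        FunctionName=function_name,  # hypothetical name; use your own function's name
        InvocationType="RequestResponse",
        Payload=json.dumps({"document_url": document_url}).encode("utf-8"),
    )
    # The response payload is a stream containing the handler's JSON return value.
    return json.loads(response["Payload"].read())

# Usage (requires boto3 and AWS credentials allowed to invoke the function):
# import boto3
# client = boto3.client("lambda", region_name="us-east-1")
# print(format_document(client, "https://docs.google.com/document/d/your-document-id"))
```

Passing the client in as an argument keeps the helper easy to exercise with a stub before any AWS setup exists.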

Even with this approach, you need some understanding of how to create an AWS Lambda layer, store a secret in AWS, set up service account credentials in Google Cloud, and so on, so it’s not the most intuitive solution.

Anyway, I hope this at least helps someone and if anyone has any questions feel free to reach out. I’d be happy to help!

system · May 25, 2024, 7:51pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.