---
title: "Automating a Complete Language Version with Python and the WordPress REST API: WPML, ACF, and Yoast SEO"
description: "Setting up a new language version on a multilingual WordPress site — with WPML, ACF Custom Blocks, and Yoast SEO metadata — is a manual nightmare. This post shows how a Python script automates the entire import via the REST API: from static HTML files to fully linked translations in WordPress."
date: 2026-03-09
modified: 2026-05-09
author: "Eberhard Lauth"
url: "https://netzkundig.com/en/blog/python-rest-api-importer-wpml-acf-yoast/"
featured_image: "https://netzkundig.com/wp-content/uploads/2026/03/html-to-wpml-import.png"
categories:
  - name: "AI"
    url: "https://netzkundig.com/en/thema/ai/"
language: "en-US"
---

# Automating a Complete Language Version with Python and the WordPress REST API: WPML, ACF, and Yoast SEO

A client runs an extensive multilingual WordPress website: English is the primary language, and additional language versions already exist. For a new target market, a complete new language version was needed: several dozen pages with a complex structure built from ACF Custom Blocks, fully cross-linked, with curated Yoast SEO metadata.

The translations were already in hand — as static HTML files, delivered by an external provider. Now this content had to make its way into WordPress: as Gutenberg blocks, as WPML-linked translations of the original English pages, with SEO metadata correctly carried over.

Doing this by hand would have taken hours, would have been error-prone, and would not have been reproducible. So I wrote a Python-based importer.

## The starting point in detail

The website was running on:

- **WordPress** with the Gutenberg editor
- **WPML** for multilingual support (`/en/`, `/[new-language]/`)
- **ACF** (Advanced Custom Fields) for several Custom Block types
- **Yoast SEO** for meta optimization
- **Custom block plugins** — including an Anchor-Navigation block, an interactive Internal-Link block, and a Geolocation block

The source data: HTML files, stored locally. The HTML originated from WordPress — which made the conversion considerably easier, though far from trivial.

## Why Python + REST API instead of WP-CLI or a plugin?

The obvious alternatives ruled themselves out quickly:

**WP-CLI** would have required direct server access and isn’t a good fit for complex WPML linking logic. **XML import via plugin** (e.g. WP All Import) can’t reliably handle ACF Custom Blocks, especially when field keys are hardcoded. **Manual import** wasn’t a serious option given the number of pages.

A Python script using the REST API was the cleanest solution: locally testable, dry-run capable, fully reproducible, and free of production risk until the final run.

## Architecture of the importer

```
language_importer_v2/
├── config.json          # API credentials, base URLs, language codes
├── importer.py          # main script
├── converter.py         # HTML → Gutenberg blocks
├── seo_extractor.py     # Yoast metadata from HTML <head>
├── wpml_linker.py       # WPML translation linking via REST API
└── README.md

```
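The later snippets refer to `WP_BASE_URL` and `WP_AUTH` without showing where they come from. A minimal sketch of loading them from `config.json` could look like this. The key names (`base_url`, `username`, `app_password`, `target_language`) are assumptions for illustration, not the project's actual schema, and authenticating with a WordPress application password over HTTP Basic auth is likewise assumed.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("base_url", "username", "app_password", "target_language")


def load_config(path="config.json"):
    """Read importer settings from a JSON file (key names are hypothetical)."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing keys: {', '.join(missing)}")
    # Strip a trailing slash so f"{base_url}/wp-json/..." never doubles it.
    config["base_url"] = config["base_url"].rstrip("/")
    return config


def make_auth(config):
    """Return a (user, password) tuple as accepted by requests' auth= parameter."""
    return (config["username"], config["app_password"])
```

With this in place, `WP_BASE_URL = config["base_url"]` and `WP_AUTH = make_auth(config)` would be set once at startup.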

The flow in broad strokes:

1. Read the HTML file
2. Generate Gutenberg markup (block-by-block conversion)
3. Extract SEO metadata from the `<head>`
4. Determine the slug of the English source page (automatic matching)
5. Create the new WordPress page via REST API
6. Write Yoast meta via REST API
7. Establish the WPML link to the English source page

## HTML to Gutenberg: the conversion problem

The HTML originated from WordPress but no longer contained any Gutenberg comments — those are stripped out at render time. The converter therefore had to reconstruct the original block structure from the rendered HTML.

For standard blocks (paragraphs, headings, lists, images) this is straightforward:

```
def convert_paragraph(element):
    text = element.decode_contents()
    return f'<!-- wp:paragraph -->\n<p>{text}</p>\n<!-- /wp:paragraph -->'

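def convert_heading(element):
    level = int(element.name[1])  # h2 → 2
    # Rendered WordPress HTML usually carries wp-block-heading already;
    # emit it exactly once and avoid a trailing space when no other class exists.
    extra = [c for c in element.get("class", []) if c != "wp-block-heading"]
    class_attr = " ".join(["wp-block-heading", *extra])
    text = element.decode_contents()
    return (
        f'<!-- wp:heading {{"level":{level}}} -->\n'
        f'<h{level} class="{class_attr}">{text}</h{level}>\n'
        f'<!-- /wp:heading -->'
    )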

```

Nested structures were more complex, lists in particular: since WordPress 6.1, each `wp:list-item` is expected as a standalone block inside `wp:list`:

```
def convert_list(element):
    ordered = element.name == "ol"
    tag = "ol" if ordered else "ul"
    # Only direct <li> children; nested lists stay inside their parent item.
    items_markup = "\n".join(
        f'<!-- wp:list-item --><li>{li.decode_contents()}</li><!-- /wp:list-item -->'
        for li in element.find_all("li", recursive=False)
    )
    return (
        f'<!-- wp:list {{"ordered":{str(ordered).lower()}}} -->\n'
        f'<{tag} class="wp-block-list">\n{items_markup}\n</{tag}>\n'
        f'<!-- /wp:list -->'
    )

```
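Converters like the ones above have to be selected per element. One possible way to do that is a dispatch table keyed by tag name. This is a sketch only: the function name `convert_elements` and the idea of returning the unhandled tag names are assumptions, not the importer's actual code.

```python
def convert_elements(elements, converters, fallback=None):
    """Convert top-level elements to block markup via a tag-name dispatch table.

    elements:   iterable of objects with a .name attribute (BeautifulSoup tags
                in the importer; anything duck-typed works for this sketch)
    converters: dict mapping tag name -> function(element) -> markup string
    fallback:   optional function used for tags without a converter
    """
    blocks, unhandled = [], []
    for element in elements:
        convert = converters.get(element.name, fallback)
        if convert is None:
            # Remember what was skipped so a dry run can report it.
            unhandled.append(element.name)
            continue
        blocks.append(convert(element))
    # Top-level blocks are separated by a blank line in the post content.
    return "\n\n".join(blocks), unhandled
```

A table such as `{"p": convert_paragraph, "h2": convert_heading, "ul": convert_list, "ol": convert_list}` would plug straight into this; the list of unhandled tags is useful for spotting block types the converter does not cover yet.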

## ACF Custom Blocks: the field-key problem

This was the real source of complexity in the project. The website used several custom block types registered via ACF Pro. Gutenberg doesn’t store ACF block attributes by readable field names but by internal **field keys**, unique IDs that were generated when the field was first created in ACF.

These keys can’t be derived automatically from either the HTML or the REST API. They had to be read once from the database (or via `acf_get_field()` in code) and then hardcoded into the importer:

```
ACF_FIELD_KEYS = {
    "anchorlink": {
        "destination": "field_5ce787531418a",
    },
    "internallink": {
        "destination": "field_5cfa2b3e72c53",
    },
    "slider": {
        "post_id": "field_5ced17feae48d",
    },
    "ip_location": {
        "condition":  "field_6107d54dda6da",
        "countries":  "field_6107d5eeda6da",
    },
}

```

An ACF block in the Gutenberg markup then looks like this:

```
def convert_acf_internallink(destination_post_id):
    key = ACF_FIELD_KEYS["internallink"]["destination"]
    return (
        f'<!-- wp:acf/internallink {{"name":"acf/internallink","data":'
        f'{{"{key}":"{destination_post_id}"}}}} /-->'
    )

```

Important: ACF self-closing blocks end with `/-->`, not with a separate closing comment.
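Hand-escaping braces in f-strings gets error-prone once a block carries more than one field. As a sketch, the same markup can be produced with `json.dumps`. The helper name `acf_block` is hypothetical; the escaping of `--` mirrors what WordPress's own block serializer does so that a value can never terminate the HTML comment early.

```python
import json


def acf_block(block_name, data):
    """Serialize a self-closing ACF block comment for Gutenberg.

    block_name: name without the "acf/" prefix, e.g. "internallink"
    data:       mapping of ACF field key -> value
    """
    attrs = {"name": f"acf/{block_name}", "data": data}
    # Compact separators match the whitespace-free JSON of the hand-built markup.
    payload = json.dumps(attrs, separators=(",", ":"), ensure_ascii=False)
    # "--" inside a string value would end the HTML comment; escape it as JSON unicode.
    payload = payload.replace("--", "\\u002d\\u002d")
    return f"<!-- wp:acf/{block_name} {payload} /-->"
```

Calling `acf_block("internallink", {ACF_FIELD_KEYS["internallink"]["destination"]: "1234"})` yields the same string as `convert_acf_internallink("1234")`.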

For the Internal-Link block, an `INTERNAL_LINK_MAP` also had to be maintained — a mapping from local HTML href targets to WordPress post IDs of the English pages, since the new language versions didn’t yet exist at link time:

```
INTERNAL_LINK_MAP = {
    "/en/product/": 1234,
    "/en/about/":   5678,
    # ...
}

```
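Looking up an href in this map is more robust if the href is normalized first, since source HTML may contain absolute URLs, missing trailing slashes, query strings, or fragments. A sketch under those assumptions (the function name `resolve_internal_link` and the sample entries are illustrative):

```python
from urllib.parse import urlsplit

INTERNAL_LINK_MAP = {
    "/en/product/": 1234,
    "/en/about/": 5678,
}


def resolve_internal_link(href, link_map=INTERNAL_LINK_MAP):
    """Return the mapped post ID for an href, or None if it is not in the map."""
    path = urlsplit(href).path  # drops scheme, host, query string and fragment
    if not path:
        return None  # pure fragment or empty href
    if not path.endswith("/"):
        path += "/"
    return link_map.get(path)
```

Returning `None` leaves it to the caller to decide whether to keep a plain `<a>` link or to log the miss during a dry run.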

## Slug matching: automatic instead of manual

For the importer to find the matching English source page for each new translation, automatic slug matching was needed. The local HTML file names followed a consistent naming convention — `index.html`, `product.html`, `about-us.html` — and could be mapped directly to English WordPress slugs.

The importer queries the English page via REST API and stores its post ID for the later WPML link:

```
import requests

def get_english_post_id(slug):
    response = requests.get(
        f"{WP_BASE_URL}/wp-json/wp/v2/pages",
        params={"slug": slug, "lang": "en"},
        auth=WP_AUTH,
        timeout=30,
    )
    # Fail loudly on 401/404/5xx instead of indexing into an error body.
    response.raise_for_status()
    data = response.json()
    if not data:
        raise ValueError(f"No English page found for slug '{slug}'.")
    return data[0]["id"]

```
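The file-name-to-slug step that feeds `get_english_post_id` can be sketched as follows. How `index.html` maps to a slug depends on the site, so the `home_slug` parameter and its default value are assumptions.

```python
from pathlib import Path


def slug_from_filename(path, home_slug="home"):
    """Derive the English page slug from a local HTML file name.

    home_slug is a placeholder for whatever slug the front page actually uses.
    """
    stem = Path(path).stem.lower()
    return home_slug if stem == "index" else stem
```

For example, `slug_from_filename("export/about-us.html")` gives `"about-us"`, which is then passed to the REST lookup.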

## WPML translation linking via REST API

WPML offers two REST endpoints for programmatically linking translations. In practice, neither worked reliably on its own: depending on the WPML version and plugin configuration, only one of them would respond. The importer therefore tries both in turn:

```
import requests

def link_wpml_translation(new_post_id, source_post_id, language_code):
    """Try both endpoints in turn; return True as soon as one accepts the link."""
    endpoints = [
        f"{WP_BASE_URL}/wp-json/wpml/v1/translation",
        f"{WP_BASE_URL}/wp-json/wpml/v1/posts/connect",
    ]
    payload = {
        "post_id":        new_post_id,
        "source_post_id": source_post_id,
        "language_code":  language_code,
    }
    for endpoint in endpoints:
        try:
            response = requests.post(endpoint, json=payload, auth=WP_AUTH, timeout=30)
        except requests.RequestException:
            continue  # network error: try the next endpoint
        if response.ok:  # any 2xx status, not only 200
            print(f"  ✓ WPML link set via {endpoint}")
            return True
    print(f"  ⚠ WPML link failed, please link manually (post {new_post_id})")
    return False

```

The fallback hint in the log was important: in some cases the linking had to be done manually in the WordPress backend afterward. The script flagged these cases clearly.
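Beyond the per-page log line, the failed links can be collected and printed as one list at the end of a run. The following is a sketch of that idea; the class `LinkReport` and its methods are hypothetical and not part of the original importer.

```python
from dataclasses import dataclass, field


@dataclass
class LinkReport:
    """Collects translation links that could not be set automatically."""
    failed: list = field(default_factory=list)

    def record_failure(self, new_post_id, source_post_id, slug):
        self.failed.append((new_post_id, source_post_id, slug))

    def summary(self):
        if not self.failed:
            return "All WPML links set."
        lines = [f"{len(self.failed)} page(s) need manual WPML linking:"]
        lines += [
            f"  - {slug}: post {new_id} -> source {src_id}"
            for new_id, src_id, slug in self.failed
        ]
        return "\n".join(lines)
```

The importer would call `record_failure(...)` whenever the linking step reports failure and print `summary()` once after the last page.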

## Extracting Yoast SEO metadata from the HTML `<head>`

The static HTML files contained complete Yoast output in the `<head>` — `<title>`, meta description, Open Graph tags, Twitter cards, canonical URL, and focus keyword. This data was to be carried over 1:1.

The extractor reads all relevant tags using BeautifulSoup:

```
from bs4 import BeautifulSoup

def extract_yoast_meta(html_content):
    soup = BeautifulSoup(html_content, "html.parser")
    head = soup.head

    return {
        "_yoast_wpseo_title":              _get_meta(head, "title"),
        "_yoast_wpseo_metadesc":           _get_meta(head, "description"),
        "_yoast_wpseo_opengraph-title":    _get_og(head, "og:title"),
        "_yoast_wpseo_opengraph-description": _get_og(head, "og:description"),
        "_yoast_wpseo_opengraph-image":    _get_og(head, "og:image"),
        "_yoast_wpseo_canonical":          _get_canonical(head),
        "_yoast_wpseo_focuskw":            _get_yoast_focuskw(head),
        "_yoast_wpseo_twitter-title":      _get_meta(head, "twitter:title"),
        "_yoast_wpseo_twitter-description":_get_meta(head, "twitter:description"),
        "_yoast_wpseo_twitter-image":      _get_meta(head, "twitter:image"),
    }

```
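The helper functions `_get_meta`, `_get_og`, and `_get_canonical` are not shown here. As a rough illustration of what they have to look at, this is a dependency-free sketch that uses Python's built-in `html.parser` instead of BeautifulSoup; the names `HeadMetaParser` and `parse_head` are hypothetical. The key detail is that description and Twitter tags use the `name` attribute, Open Graph tags use `property`, and the canonical URL lives in a `<link rel="canonical">` element.

```python
from html.parser import HTMLParser


class HeadMetaParser(HTMLParser):
    """Collects <title>, <meta> and canonical values from an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.meta = {}         # keyed by the tag's name= or property= value
        self.canonical = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" and self.title is None:
            # Only the first <title> counts; later ones (e.g. inside SVG) are ignored.
            self._in_title = True
            self.title = ""
        elif tag == "meta":
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content") is not None:
                self.meta.setdefault(key, attrs["content"])
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_head(html_content):
    """Parse a full HTML string and return the populated parser."""
    parser = HeadMetaParser()
    parser.feed(html_content)
    parser.close()
    return parser
```

After `parsed = parse_head(html)`, values such as `parsed.meta.get("og:title")` or `parsed.canonical` can be mapped onto the `_yoast_wpseo_*` keys shown above.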

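Writing happens at page-creation time, directly in the `meta` field of the REST API request. This only works if the relevant `_yoast_wpseo_*` keys are registered for the REST API with `show_in_rest` (for example via `register_post_meta()` in a small site plugin): WordPress ignores unregistered meta keys in the request, and a standard Yoast install exposes its SEO data only as read-only output through the `yoast_head` and `yoast_head_json` fields. It is worth verifying on a test page that the values actually arrive.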
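As a sketch of what the create request might carry: `title`, `slug`, `status`, `content`, `template`, and `meta` are standard fields of the WordPress pages endpoint, while the function name `build_page_payload` and the choice to create pages as drafts are assumptions. Meta keys are only accepted if they are registered for the REST API on the WordPress side.

```python
def build_page_payload(title, slug, content, yoast_meta, template=None, status="draft"):
    """Assemble the JSON body for POST /wp-json/wp/v2/pages."""
    payload = {
        "title": title,
        "slug": slug,
        "status": status,
        "content": content,
        # Skip keys the extractor could not find instead of sending nulls.
        "meta": {key: value for key, value in yoast_meta.items() if value},
    }
    if template:
        # Page template file name, e.g. as read from the English source page.
        payload["template"] = template
    return payload
```

The resulting dict can be passed as `json=payload` to `requests.post`, and printed instead of sent when running in dry-run mode.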

## Dry-run and offline mode

Two modes considerably accelerated development and testing:

**`--dry-run`**: The script simulates every step but writes nothing to WordPress. Ideal for an initial pass to surface issues in conversion or slug matching.

**`--gutenberg-only`**: Generates the Gutenberg markup and saves it locally as an `.html` file, alongside an accompanying `_seo.json` file with the extracted Yoast metadata — without any REST API connection. Very useful for manual review before the live import:

```
output/
├── about-us.html          # Gutenberg markup for manual review
├── about-us_seo.json      # Yoast metadata as JSON
├── product.html
├── product_seo.json
└── ...

```
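A sketch of how the two flags and the output layout could be wired up with `argparse`. The function names and the `--output-dir` flag are hypothetical; only `--dry-run`, `--gutenberg-only`, and the `<slug>.html` / `<slug>_seo.json` naming come from the description above.

```python
import argparse
import json
from pathlib import Path


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Import a language version")
    parser.add_argument("--dry-run", action="store_true",
                        help="simulate every step, write nothing to WordPress")
    parser.add_argument("--gutenberg-only", action="store_true",
                        help="only write markup and SEO JSON to disk")
    parser.add_argument("--output-dir", default="output")  # hypothetical flag
    return parser.parse_args(argv)


def write_review_files(slug, markup, seo_meta, output_dir):
    """Save Gutenberg markup and Yoast metadata side by side for review."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    markup_path = out / f"{slug}.html"
    seo_path = out / f"{slug}_seo.json"
    markup_path.write_text(markup, encoding="utf-8")
    seo_path.write_text(
        json.dumps(seo_meta, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return markup_path, seo_path
```

In the main loop, a check such as `if args.gutenberg_only: write_review_files(...); continue` would keep that mode free of any network access.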

## What still had to be done by hand

Not everything could be automated. The **Slider block** references a custom post type for slider content. This had to be created manually in the new language first, before its post ID could be entered as a constant in the importer. In those cases the script generated a placeholder:

```
<!-- wp:acf/slider {"name":"acf/slider","data":{"field_5ced17feae48d":"TODO_SLIDER_ID"}} /-->

```

These spots were touched up after the import via search in the WordPress backend — an acceptable trade-off compared to building complex precondition logic into the script.
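In addition to the backend search, the importer could list the affected pages at the end of a run. A sketch, assuming placeholders always follow a `TODO_<NAME>` pattern in upper case; only `TODO_SLIDER_ID` appears in the project, so the general pattern and the function name are assumptions.

```python
import re

# Matches markers such as TODO_SLIDER_ID inside generated block markup.
PLACEHOLDER_RE = re.compile(r"TODO_[A-Z0-9_]+")


def find_placeholders(markup):
    """Return the distinct TODO placeholders left in a block markup string."""
    return sorted(set(PLACEHOLDER_RE.findall(markup)))
```

Running this over each generated page in `--gutenberg-only` mode gives a checklist of pages that still need a manual ID.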

## Lessons learned from the project

- **ACF field keys aren’t optional.** There’s no clean way to derive them dynamically from the HTML. Reading them once and fixing them in code is the pragmatic solution — and it works reliably.
- **The WPML REST API is fragile.** The fallback to two endpoints and the explicit logging of failed links saved the project from several silent errors.
- **Always dry-run first.** The Gutenberg-only mode surfaced three conversion errors during the first complete pass that would have been hard to debug in a live import.
- **HTML from WordPress is a good starting point.** When the source HTML was rendered by WordPress, classes, IDs, and structures are predictable — which makes the conversion logic considerably easier than for arbitrary HTML.
- **Inherit the page template from the source page.** In the final step the importer automatically picked up the page template of the English source page (exposed as the `template` field in the REST API), making sure new pages didn’t end up on the default template.

---

## Conclusion

For multilingual WordPress projects with ACF and WPML where content has to be transferred in bulk from external sources, a Python importer using the REST API proved to be a robust solution. It’s testable and reproducible, and its explicitness forces a thorough understanding of your own data structure, especially when it comes to ACF field keys and WPML linking logic.

The code can be ported to other projects with little effort, as long as the ACF field keys and slug conventions are adapted.

Need help migrating or integrating a complex, large-scale multilingual WordPress site?   
[Get in touch for a no-obligation conversation](#kontakt).
