
v6+ HTML post processing (PageCrawler)

Intro

The HTML of articles and boards is rendered by a NodeJS app. Some edge cases require additional processing of this HTML. XalokNext does this in the Wf\Bundle\CmsBaseBundle\ContentCrawler\PageCrawler service. The service has a $crawler property, an instance of Symfony's DomCrawler, that can be used to query the HTML currently being processed.

Depending on the type of content being rendered, the service calls one of these methods:

  • processArticleContent
  • processBoardContent
  • processBoardGroupContent
  • processListingContent

Each of these methods receives the Entity\Page object whose HTML is currently being processed.

Additionally, the processCommonContent method is invoked for all content types.

Note: all attributes whose names start with wf- are removed by the JS code that renders the HTML. If you need to do post-processing based on attribute names, use a different prefix. XalokNext, for example, uses the wfattr- prefix for these post-processing attribute names.
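
To illustrate the prefix rule above, here is a minimal sketch that lists the attributes a post-processing step could act on. It uses PHP's built-in DOM extension rather than the PageCrawler service so that it runs on its own. The function name findPrefixedAttributes and the wfattr-lazy attribute are hypothetical and are not part of XalokNext.

```php
<?php
declare(strict_types=1);

/**
 * List every attribute whose name starts with the given prefix.
 * Hypothetical helper for illustration; not part of XalokNext.
 *
 * @return array<int, array{element: string, attribute: string, value: string}>
 */
function findPrefixedAttributes(string $html, string $prefix = 'wfattr-'): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $found = [];
    // Select attribute nodes whose name begins with the prefix.
    foreach ($xpath->query('//@*[starts-with(name(), "' . $prefix . '")]') as $attr) {
        $found[] = [
            'element' => $attr->ownerElement->nodeName,
            'attribute' => $attr->nodeName,
            'value' => $attr->nodeValue,
        ];
    }

    return $found;
}

// "wfattr-lazy" is a made-up attribute; "wf-id" stands for one the JS renderer would strip.
$matches = findPrefixedAttributes('<div wfattr-lazy="1"><p wf-id="3" class="x">text</p></div>');
```

Inside PageCrawler, the same XPath expression could presumably be passed to the $crawler property through Symfony's Crawler::filterXPath().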

XalokNext post processing

Some common operations are included in the base implementation of the Wf\Bundle\CmsBaseBundle\ContentCrawler\PageCrawler service:

wfattr-transparent

Due to the internals of the XalokNext editor, mixing module definitions with static HTML in the same container has undesired effects: the static HTML always ends up at the top of the container and the modules come after it, even if the twig template defines them in a different order. To get moduleA + staticHTML + moduleB, moduleA and moduleB must each be wrapped in their own container. This extra container can break the design, or SEO requirements may call for its removal, since it serves no purpose other than working around XalokNext's limitation.

For such cases you can mark the container (or any other element) with the wfattr-transparent attribute. The PageCrawler removes the marked element and keeps its children:

html
<!-- template -->
<div class="container">
    <div wfattr-transparent>
        <div class="module-a">moduleA</div>
    </div>
    <!-- static HTML -->
    <div wfattr-transparent>
        <div class="module-b">moduleB</div>
    </div>
</div>

<!-- result -->
<div class="container">
    <div class="module-a">moduleA</div>
    <!-- static HTML -->
    <div class="module-b">moduleB</div>
</div>
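
The unwrapping shown above can be sketched as follows. This is not the PageCrawler implementation: it is a standalone approximation built on PHP's DOM extension, and the function name unwrapTransparent and the sketch-root wrapper id are assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Remove every element carrying the given attribute while keeping its children in place.
 * Hypothetical helper for illustration; not part of XalokNext.
 */
function unwrapTransparent(string $html, string $attribute = 'wfattr-transparent'): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a known root so it can be serialized back without <html>/<body>.
    $doc->loadHTML('<div id="sketch-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Copy the matches into an array first, because the tree is modified inside the loop.
    $marked = iterator_to_array($xpath->query('//*[@' . $attribute . ']'));
    foreach ($marked as $node) {
        $parent = $node->parentNode;
        // Move each child in front of the marked element, preserving order.
        while ($node->firstChild !== null) {
            $parent->insertBefore($node->firstChild, $node);
        }
        $parent->removeChild($node);
    }

    // Serialize only the contents of the wrapper element.
    $root = $xpath->query('//*[@id="sketch-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }

    return $out;
}

$template = '<div class="container">'
    . '<div wfattr-transparent><div class="module-a">moduleA</div></div>'
    . '<!-- static HTML -->'
    . '<div wfattr-transparent><div class="module-b">moduleB</div></div>'
    . '</div>';

$result = unwrapTransparent($template);
```

Nested marked elements are handled as well, because each one is unwrapped relative to whatever its parent is at the time it is processed.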