
Crawling Client-Side Rendered Webpages For Better SEO

Updated: Dec 13, 2022



Introduction

Many web applications need SEO integration. In particular, a URL shared on social networks or in messengers should be displayed with a social sharing image, title, and description.


Additionally, the application's pages must be indexed by Google Search for efficient marketing.

This article explains the technical side of how search engines and social platforms retrieve information from websites and display shared URLs. In particular, it explains how to make dynamic web pages that are rendered entirely on the client side work with SEO.


Additionally, it presents step-by-step instructions for web applications hosted on Microsoft IIS.


How Does SEO Retrieve Information From A URL?

First of all, let's define what exactly the goal is. Here are the main points:

  • Be able to share URLs of the web application so that each shared link is displayed as a rich preview with an image, title, and description.

  • Have every available URL in the application indexed by the Google Search engine so that it appears in Google search results.

With the goal defined, let's look at how social networks retrieve the required information from shared URLs.

When a URL is shared, the messenger (e.g., WhatsApp, Slack) or social network (e.g., Facebook, LinkedIn) sends an HTTP GET request to that URL and reads special meta tags from the head section of the returned HTML document.

Here is the list of the most popular meta tags for SEO:


<meta property="og:title" content="The Title Of Your Website"/>
<meta property="og:url" content="URL of your website"/>
<meta property="og:description" content="Some description of your web page that gives a brief explanation of the content"/>
<meta property="og:image" content="Here you can put the URL of an image you want to be shown when your page is linked to."/>

So, to make a page shareable on social networks, it is enough to add these meta tags to the head section of your HTML document.
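To make the reading step concrete, here is a minimal Python sketch of how a link-preview bot might pull Open Graph tags out of a fetched HTML document. The class and function names are hypothetical, the sample page is invented for illustration, and accepting both the `property` and `name` attributes is an assumption of this sketch; real platforms use their own parsers.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects og:* meta tags, roughly as a link-preview bot might."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Assumption: accept either property="og:..." or name="og:...".
        prop = attributes.get("property") or attributes.get("name") or ""
        if prop.startswith("og:") and attributes.get("content") is not None:
            # Keep the first occurrence of each property.
            self.tags.setdefault(prop, attributes["content"])


def read_open_graph(html_text):
    """Returns a dict such as {"og:title": "..."} for the given HTML text."""
    parser = OpenGraphParser()
    parser.feed(html_text)
    return parser.tags


# Invented sample document, used only to demonstrate the parser.
SAMPLE_PAGE = """
<html>
<head>
  <meta property="og:title" content="Intro to SEO"/>
  <meta property="og:description" content="A short course."/>
  <meta property="og:image" content="https://example.com/cover.png"/>
</head>
<body></body>
</html>
"""

if __name__ == "__main__":
    for name, value in read_open_graph(SAMPLE_PAGE).items():
        print(name, "=", value)
```

A page whose head contains no such tags yields an empty dictionary, which is what a preview bot effectively sees when the tags are only added later by JavaScript.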


How To Fill Meta Tags On Dynamically Generated Pages?

To make meta tags discoverable by crawlers, the tags must be present in the response to the HTTP GET request.

This means the meta tags must appear in plain form in the returned HTML, without any JavaScript having to run. You can check this by sending a GET request to the page with a tool such as Postman.
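The same check can be scripted. The following Python sketch fetches a page without executing any JavaScript and reports which Open Graph tags are missing from the raw HTML. The function names, the user-agent string, and the choice of "required" tags are assumptions made for illustration, and the regular expression is a rough heuristic, not a full HTML parser.

```python
import re
import urllib.request

# Assumption: these three tags are treated as the minimum for a link preview.
REQUIRED_TAGS = ("og:title", "og:description", "og:image")


def fetch_raw_html(url, timeout=10):
    """Returns the HTML exactly as the server sends it; no JavaScript runs."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "meta-tag-check/0.1"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def missing_tags(raw_html):
    """Lists the required Open Graph properties absent from the raw HTML."""
    missing = []
    for tag in REQUIRED_TAGS:
        pattern = (
            r'<meta[^>]+(?:property|name)=["\']' + re.escape(tag) + r'["\']'
        )
        if not re.search(pattern, raw_html, flags=re.IGNORECASE):
            missing.append(tag)
    return missing


# A typical client-side rendered shell: the head has no Open Graph tags yet.
CSR_SHELL = (
    "<html><head><title>App</title></head>"
    '<body><div id="root"></div><script src="/bundle.js"></script></body></html>'
)

if __name__ == "__main__":
    print(missing_tags(CSR_SHELL))
```

Calling `missing_tags(fetch_raw_html(url))` on one of your own pages shows what a crawler that does not run JavaScript would find.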

Server-Side Rendered Pages

For dynamically generated pages, such as products, courses, etc., this is easy if the website is rendered on the server side. In this case, the web server responds to the GET request with an already rendered HTML document.

To render a website on the server side, you can use technologies that work that way by design, such as ASP.NET, or integrate a JavaScript framework such as NextJS, NuxtJS, or GatsbyJS into the application.

Usually, with server-side rendered websites, the initial request generates the full HTML on the back end; afterwards, the application can fetch data in response to user actions and update the HTML DOM with it.

The server-side rendering approach has some pros and cons presented below:


Server-Side Pros

  • Search engines can crawl the site for better SEO.

  • The initial page load is faster.

  • Great for static sites.

Server-Side Cons

  • Frequent server requests.

  • Overall slow page rendering.

  • Full page reloads.

  • Non-rich site interactions.

Client-Side Rendered Pages

Nowadays, many websites are more than just sites: they are feature-rich applications executed in the browser. Many frameworks render web applications on the client side, for example, ReactJS, VueJS, and Angular. The advantages and disadvantages of the client-side rendering approach are presented below:

Client-Side Pros

  • Rich site interactions.

  • Fast website rendering after the initial load.

  • Great for web applications.

  • Robust selection of JavaScript libraries.

Client-Side Cons

  • Low SEO if not implemented correctly.

  • The initial load might require more time.

  • In most cases, requires an external library.

SEO Crawling Problem With Client-Side Rendered Applications

Let's imagine we have a big application, for example, an online learning platform, rendered on the client side. Specifically, we have an API as a back-end service and a web application implemented in ReactJS.

Now, we want the application's dynamic pages to be crawled by search engines such as Google, and we want links to our courses to show a rich preview with an image, title, and description when shared.

One possible solution is to integrate a server-side rendering JavaScript framework, such as NextJS, into the application so that the meta tags are rendered on the server side. However, this could be an expensive and heavy refactoring that would significantly impact the SDLC pipeline and the application's development plans.

So, what can we do?

Here are the main points of a plan that can be implemented in any environment. Step-by-step instructions for web applications hosted on Microsoft IIS follow later in this article.

  1. Add an endpoint to your API service that returns a simple HTML document with the SEO meta tags filled in and a redirect command in the body that points to the required URL on the website.

  2. Configure a reverse proxy on your web server for the URL you'll share on social networks.

  3. The reverse proxy forwards requests for that URL to the API endpoint implemented in step 1 and returns its response.

  4. Add a sitemap.xml file to your web application and update it whenever a new dynamic URL is created.

  5. Submit the URL of the sitemap.xml file to the Google Search engine.

That's it. Now your application, which is still rendered on the client side, will be crawled by search engines, and when you share the dynamic URL of a product, social networks will render it properly as a rich preview with an image, title, and description.
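Step 1 of the plan can be sketched independently of any particular back end. The Python function below builds the kind of HTML document such an endpoint might return: Open Graph tags for crawlers plus a JavaScript redirect for human visitors. The function name, its parameters, and the example values are hypothetical; this is an illustration of the idea under those assumptions, not a prescribed implementation.

```python
import json
from html import escape


def build_share_page(title, description, image_url, target_url):
    """Builds an HTML page with Open Graph tags and a client-side redirect."""
    # Escape every value that is placed inside an HTML attribute or element.
    t = escape(title, quote=True)
    d = escape(description, quote=True)
    i = escape(image_url, quote=True)
    u = escape(target_url, quote=True)
    # json.dumps yields a quoted JavaScript string literal; "<" is replaced
    # so the value can never close the surrounding <script> element.
    redirect_literal = json.dumps(target_url).replace("<", "\\u003c")
    return f"""<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta property="og:title" content="{t}">
    <meta property="og:type" content="website">
    <meta property="og:url" content="{u}">
    <meta property="og:description" content="{d}">
    <meta property="og:image" content="{i}">
    <title>{t}</title>
</head>
<body>
    <script>
        window.location.href = {redirect_literal};
    </script>
</body>
</html>"""


if __name__ == "__main__":
    print(build_share_page(
        "Intro to SEO",
        "A short course.",
        "https://example.com/cover.png",
        "https://example.com/products/42",
    ))
```

Crawlers that do not run JavaScript read the meta tags and stop there, while a browser executes the script and lands on the real client-side rendered page.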


Step-By-Step Instructions For An IIS-Hosted Application

Here are the detailed instructions for an IIS-hosted application.

Let's imagine we have an application implemented with ReactJS and an API that returns all the data required by our website. We have some products that we would like search engines to crawl. Here is the URL structure of the products shown in our web application: https://{MyApplicationDomain}/products/{productID}. And this is the URL of our API service: https://api.{MyApplicationDomain}. Let's go through all the steps.

1. Add a new endpoint (/seo/products/{productId}) to the API service that returns the meta tags together with a redirect link:


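// Requires: using System.Net; (WebUtility) and using Microsoft.AspNetCore.Mvc;
[HttpGet("seo/products/{pId}")]
public IActionResult GetProductDetailsSEO(Guid pId)
{
    // getProductDetails is the application's own lookup; it is expected to
    // return an object with Title, Description, and ImageURL.
    var product = getProductDetails(pId);

    // Replace {MyApplicationDomain} with the real domain of the web application.
    string productUrl = "https://{MyApplicationDomain}/products/" + pId;

    // HTML-encode every value that is placed into the document.
    string title = WebUtility.HtmlEncode(product.Title);
    string description = WebUtility.HtmlEncode(product.Description);
    string imageUrl = WebUtility.HtmlEncode(product.ImageURL);

    string content = $@"<!DOCTYPE html>
<html lang=""en"">
<head>
    <meta charset=""UTF-8"">
    <meta property=""og:title"" content=""{title}"">
    <meta property=""og:type"" content=""website"">
    <meta property=""og:url"" content=""{productUrl}"">
    <meta property=""og:description"" content=""{description}"">
    <meta property=""og:image"" content=""{imageUrl}"">
    <title>{title}</title>
</head>
<body>
    <script>
        // Human visitors are redirected to the real product page.
        window.location.href = ""{productUrl}"";
    </script>
</body>
</html>";

    return new ContentResult
    {
        Content = content,
        ContentType = "text/html",
    };
}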

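2. Install the URL Rewrite module on your IIS if it is not already installed. Because the rule in the next step rewrites requests to a different host (the API), IIS also needs the Application Request Routing (ARR) module with its proxy feature enabled.

3. With URL Rewrite, configure an inbound rule for the URL you want to share on social networks and have crawled, so that requests to it are sent to the API endpoint implemented in step 1.

After adding this rule, your web.config file should look like the example below: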



<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <rewrite>
            <rules>
                <clear />
                <rule name="SEORule" stopProcessing="true">
                    <match url="(.*)share/products/(.*)" />
                    <conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
                    <action type="Rewrite" url="https://api.{MyApplicationDomain}/seo/products/{R:2}" />
                </rule>
                <rule name="ReactRouter Routes" enabled="true" stopProcessing="true">
                    <match url=".*" />
                    <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
                        <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
                    </conditions>
                    <action type="Rewrite" url="index.html" />
                </rule>
            </rules>
        </rewrite>
    </system.webServer>
</configuration>
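To see what the SEORule pattern captures, here is a small Python sketch that mimics the match and the {R:2} back-reference. It is only an approximation: Python's regex engine is not the one IIS uses, the function name is hypothetical, and api.example.com stands in for the real API host. URL Rewrite matches against the request path without the leading slash, which the sample inputs reflect.

```python
import re

# Same pattern as the <match url="..."> attribute of the SEORule.
SHARE_PATTERN = re.compile(r"(.*)share/products/(.*)")

# Assumption: api.example.com is a placeholder for the real API host.
API_TEMPLATE = "https://api.example.com/seo/products/{product_id}"


def rewrite_target(path):
    """Returns the API URL a share path would be rewritten to, or None."""
    match = SHARE_PATTERN.match(path)
    if match is None:
        # The SEORule does not apply; the next rule would handle the request.
        return None
    # group(2) plays the role of the {R:2} back-reference in the rule.
    return API_TEMPLATE.format(product_id=match.group(2))


if __name__ == "__main__":
    for sample in ("share/products/42", "products/42"):
        print(sample, "->", rewrite_target(sample))
```

Only paths containing share/products/ reach the API; every other path falls through to the ReactRouter rule and is served by index.html.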

4. To prevent IIS error 500.52, add extra configuration to URL Rewrite. You can do that through the IIS Manager GUI or by editing your web.config file directly. Here is how the result should look:


<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <rewrite>
            <!-- Both server variables set below must first be added to the
                 allowed server variables list of URL Rewrite. -->
            <rules>
                <clear />
                <rule name="SEORule" stopProcessing="true">
                    <match url="(.*)share/products/(.*)" />
                    <conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
                    <action type="Rewrite" url="https://api.{MyApplicationDomain}/seo/products/{R:2}" />
                </rule>
                <rule name="ReactRouter Routes" enabled="true" stopProcessing="true">
                    <match url=".*" />
                    <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
                        <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
                    </conditions>
                    <action type="Rewrite" url="index.html" />
                    <serverVariables>
                        <set name="HTTP_X_ORIGINAL_ACCEPT_ENCODING" value="{HTTP_ACCEPT_ENCODING}" />
                        <set name="HTTP_ACCEPT_ENCODING" value="" />
                    </serverVariables>
                </rule>
            </rules>
            <outboundRules>
                <rule name="For 500.52 fix" preCondition="AcceptEncodingRestore">
                    <match serverVariable="HTTP_ACCEPT_ENCODING" pattern="^(.*)" />
                    <action type="Rewrite" value="{HTTP_X_ORIGINAL_ACCEPT_ENCODING}" />
                </rule>
                <preConditions>
                    <preCondition name="ResponseIsHtml1">
                        <add input="{RESPONSE_CONTENT_TYPE}" pattern="^text/html" />
                    </preCondition>
                    <preCondition name="AcceptEncodingRestore">
                        <add input="{HTTP_X_ORIGINAL_ACCEPT_ENCODING}" pattern=".*" />
                    </preCondition>
                </preConditions>
            </outboundRules>
        </rewrite>
    </system.webServer>
</configuration>

5. Add a sitemap.xml file to your web application, and have the API add a new URL to this file whenever a new product is created. The XML file for the sitemap should look like this:


<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" generatedBy="MyApp">  
  <url>
    <loc>https://{MyApplicationDomain}/share/products/{productID}</loc>
    <lastmod>2022-12-07</lastmod>
    <priority>1</priority>
  </url>
</urlset>
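Updating the sitemap on product creation can be sketched as follows. This Python example appends a new url entry to an existing sitemap document unless the URL is already listed. The function name, its parameters, and the example.com URLs are assumptions for illustration; a real API would also need to read and write the file safely when several requests arrive at once.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Serialize sitemap elements without a namespace prefix.
ET.register_namespace("", SITEMAP_NS)


def add_url(sitemap_xml, loc, lastmod=None, priority="1"):
    """Returns the sitemap text with loc added, unless it is already listed."""
    root = ET.fromstring(sitemap_xml.encode("utf-8"))
    existing = {element.text for element in root.iter(f"{{{SITEMAP_NS}}}loc")}
    if loc not in existing:
        url = ET.SubElement(root, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        stamp = (lastmod or date.today()).isoformat()
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = stamp
        ET.SubElement(url, f"{{{SITEMAP_NS}}}priority").text = priority
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + body


# Invented starting sitemap with a single product URL.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/share/products/1</loc>
    <lastmod>2022-12-07</lastmod>
    <priority>1</priority>
  </url>
</urlset>"""

if __name__ == "__main__":
    print(add_url(SAMPLE_SITEMAP, "https://example.com/share/products/2"))
```

Checking for an existing loc first keeps the file free of duplicates if the creation event fires more than once for the same product.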

That's it. Now, when you share a product link, it will be shown with the associated title, description, and image. Additionally, thanks to the sitemap.xml file, search engine crawlers will know all the URLs that you want to be indexed and will index them.
